Journal articles on the topic 'Multifaceted Rasch measurement'

The 22 journal articles listed below are top sources for research on the topic 'Multifaceted Rasch measurement.' Abstracts are included whenever they are available in the metadata.

1. Edwards, Kinsey E., Andrew S. Edwards, and Brian C. Wesolowski. "Validation of a String Performance Rubric Using the Multifaceted Rasch Measurement Model." Bulletin of the Council for Research in Music Education, no. 215 (2018): 7. http://dx.doi.org/10.5406/bulcouresmusedu.215.0007.

2. Hsieh, Mingchuan. "An application of Multifaceted Rasch measurement in the Yes/No Angoff standard setting procedure." Language Testing 30, no. 4 (March 21, 2013): 491–512. http://dx.doi.org/10.1177/0265532213476259.

3. Park, Kang-Hyun, Ickpyo Hong, and Ji-Hyuk Park. "Development and Validation of the Yonsei Lifestyle Profile-Satisfaction (YLP-S) Using the Rasch Measurement Model." INQUIRY: The Journal of Health Care Organization, Provision, and Financing 58 (January 2021): 004695802110176. http://dx.doi.org/10.1177/00469580211017639.

Abstract:
Lifestyle plays an important role in determining health and vitality among older adults. However, there is limited evidence regarding lifestyle assessment. This study examined the psychometric properties of the Yonsei Lifestyle Profile-Satisfaction (YLP-S). The participants in the study included 156 older adults. Rasch analysis was used to test unidimensionality, fit statistics, and the precision of the YLP-S. The YLP-S demonstrated a unidimensional measurement construct, and 18 items fit the Rasch model. The YLP-S illustrated reasonable precision (person strata = 5.37). Only 4 items showed differential item functioning by sex or age groups. The findings indicate that the YLP-S demonstrated sound internal validity and can be used by health professionals to measure the multifaceted lifestyle of older adults.
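
For context on the precision figure above: in the Rasch literature, person strata are conventionally derived from the person separation ratio G, the ratio of the true spread of person measures to their average measurement error. A sketch of the standard formula (the symbols follow that convention, not the article itself):

\[ G = \frac{\mathrm{SD}_{\mathrm{true}}}{\mathrm{RMSE}}, \qquad \text{strata} = \frac{4G + 1}{3} \]

On that formula, a reported value of 5.37 strata corresponds to a separation ratio of roughly 3.8, i.e., about five statistically distinct levels of person ability.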

4. Wind, Stefanie A., and Eli Jones. "Not Just Generalizability: A Case for Multifaceted Latent Trait Models in Teacher Observation Systems." Educational Researcher 48, no. 8 (September 12, 2019): 521–33. http://dx.doi.org/10.3102/0013189x19874084.

Abstract:
Teacher evaluation systems often include classroom observations in which raters use rating scales to evaluate teachers’ effectiveness. Recently, researchers have promoted the use of multifaceted approaches to investigating reliability using Generalizability theory, instead of rater reliability statistics. Generalizability theory allows analysts to quantify the contribution of multiple sources of variance (e.g., raters and tasks) to measurement error. We used data from a teacher evaluation system to illustrate another multifaceted approach that provides additional indicators of the quality of observational systems. We show how analysts can use Many-Facet Rasch models to identify and control for differences in rater severity, identify idiosyncratic ratings associated with various facets, and evaluate rating scale functioning. We discuss implications for research and practice in teacher evaluation.
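
As a reference point for the model family discussed above (the notation follows common many-facet Rasch conventions and is not drawn from the article itself), a Many-Facet Rasch model for observations scored in categories k = 0, ..., m by rater j on item i for teacher n can be written as

\[ \ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k \]

where \theta_n is the teacher's latent effectiveness, \delta_i the item (dimension) difficulty, \lambda_j the rater's severity, and \tau_k the threshold between adjacent rating scale categories. Estimating \lambda_j on the same logit scale as \theta_n is what allows analysts to identify and control for differences in rater severity.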

5. Edwards, Andrew S., Kinsey E. Edwards, and Brian C. Wesolowski. "The psychometric evaluation of a wind band performance rubric using the Multifaceted Rasch Partial Credit Measurement Model." Research Studies in Music Education 41, no. 3 (April 24, 2019): 343–67. http://dx.doi.org/10.1177/1321103x18773103.

Abstract:
The purpose of this study was to develop a valid and reliable rubric to be used for the evaluation of large ensemble wind band performances. The guiding questions for this study were: (a) what are the psychometric qualities (i.e., reliability and validity) of the scale developed to assess wind band ensemble performance at the high school level? (b) how do the items fit the model and vary in difficulty? (c) how does the structure of the rating scale vary across individual items? and (d) how can the rating scale be transferred into an informative rubric? The primary data analysis tool used in this study was the Multifaceted Rasch Partial Credit Measurement Model. Music content experts (N = 20) were solicited to evaluate 40 wind band performances, each evaluator listening to four. A 4-point Likert-type rating scale (e.g., Strongly Agree, Agree, Disagree, and Strongly Disagree) was used to evaluate each recorded performance. Results indicated good model-data fit and resulted in a final rubric containing 24 items ranging from two to four performance categories. Implications for classroom teaching and consequential validity are discussed.

6. Wesolowski, Brian C., Ross M. Amend, Thomas S. Barnstead, Andrew S. Edwards, Matthew Everhart, Quentin R. Goins, Robert J. Grogan, et al. "The Development of a Secondary-Level Solo Wind Instrument Performance Rubric Using the Multifaceted Rasch Partial Credit Measurement Model." Journal of Research in Music Education 65, no. 1 (March 1, 2017): 95–119. http://dx.doi.org/10.1177/0022429417694873.

Abstract:
The purpose of this study was to describe the development of a valid and reliable rubric to assess secondary-level solo instrumental music performance based on principles of invariant measurement. The research questions that guided this study included (1) What is the psychometric quality (i.e., validity, reliability, and precision) of a scale developed to assess secondary-level solo music performance? (2) Do the proposed items fit the measurement model, and if so, how do the items vary in difficulty? and (3) How does the structure of the rating scale vary across individual items? The psychometric considerations in this study included calibrations of items, persons, raters, school level, musical instrument, and rating scale structure using the Multifaceted Rasch Partial Credit Measurement Model. A 13-member cohort of music content experts participated as raters in this study. A total of 75 video performances of secondary-level solo and ensemble performances were evaluated. The result was the development of the Music Performance Rubric for Secondary-Level Instrumental Solos (MPR-2L-INSTSOLO), a 30-item rubric consisting of rating scale categories ranging from two to four performance criteria. Implications for consequential validity, rater training, standard setting, and benchmarking are discussed.

7. Ahmadi Shirazi, Masoumeh. "For a Greater Good: Bias Analysis in Writing Assessment." SAGE Open 9, no. 1 (January 2019): 215824401882237. http://dx.doi.org/10.1177/2158244018822377.

Abstract:
Threats to construct validity should be reduced to a minimum; sources of bias, namely raters, items, and tests, as well as gender, age, race, language background, culture, and socio-economic status, need to be spotted and removed. This study investigates raters’ experience, language background, and the choice of essay prompt as potential sources of bias. Eight raters, four native English speakers and four Persian L1 speakers of English as a Foreign Language (EFL), scored 40 essays on one general and one field-specific topic. The raters assessed these essays based on Test of English as a Foreign Language (TOEFL) holistic and International English Language Testing System (IELTS) analytic band scores. Multifaceted Rasch Measurement (MFRM) was run to find extant biases. Although no statistically significant biases were found, several interesting results emerged illustrating the influence of construct-irrelevant factors such as raters’ experience, L1, and educational background. Further research is warranted to investigate these factors as potential sources of rater bias.

8. Wesolowski, Brian C. "Exploring rater cognition: A typology of raters in the context of music performance assessment." Psychology of Music 45, no. 3 (September 16, 2016): 375–99. http://dx.doi.org/10.1177/0305735616665004.

Abstract:
This study sought to investigate rater cognition by exploring rater types based upon differential severity and leniency associated with rating scale items, rating scale category functioning, and dimensions of music performance assessment. The purpose of this study was to empirically identify typologies of operational raters based upon systematic differential severity indices in the context of large ensemble music performance assessment. A rater cognition information-processing model was explored based upon two frameworks: a framework for scoring and a framework for audition. Rater scoring behavior was examined using a framework for scoring, where raters’ mental processes compare auditory images to the scoring criteria used to generate a scoring decision. The scoring decisions were evaluated using the Multifaceted Rasch Partial Credit Measurement Model. A rater typology was then examined under the framework of audition, where similar schemata were defined through raters’ clustering of differential severity indices related to items and compared across performance dimensions. The results revealed three distinct rater types: (a) the syntactical rater; (b) the expressive rater; and (c) the mental representation rater. Implications for fairness and precision in the assessment process are discussed, as well as considerations for the validity of scoring processes.

9. Révész, Andrea. "Task Complexity, Focus on Form, and Second Language Development." Studies in Second Language Acquisition 31, no. 3 (September 2009): 437–70. http://dx.doi.org/10.1017/s0272263109090366.

Abstract:
Tasks have received increased attention in SLA research for the past decade, as has the role of focus on form. However, few empirical studies have investigated the relationship among tasks, focus-on-form techniques, and second language (L2) learning outcomes. To help address this gap, the present study examined how the task variable +/− contextual support combined with the focus-on-form technique known as recasting affects L2 morphosyntactic development. The participants were 90 adult learners of English as a foreign language, randomly assigned to one of five groups: four comparison groups and a control group. The comparison groups differed as to (a) whether they received recasts while describing photos and (b) whether they could see the photos while describing them. The control group only participated in the testing sessions. A pretest-posttest-delayed posttest design was employed to detect any improvement in participants’ ability to use the linguistic target, which was the past progressive form. Results from multifaceted Rasch measurement yielded two main findings. First, learners who received recasts but did not view photos outperformed learners who received recasts while viewing photos. Second, the group that viewed photos but did not receive recasts achieved greater L2 gains than the group who neither viewed photos nor received recasts.

10. Han, Chao. "Investigating rater severity/leniency in interpreter performance testing." Interpreting. International Journal of Research and Practice in Interpreting 17, no. 2 (September 3, 2015): 255–83. http://dx.doi.org/10.1075/intp.17.2.05han.

Abstract:
Rater-mediated performance assessment (RMPA) is a critical component of interpreter certification testing systems worldwide. Given the acknowledged rater variability in RMPA and the high-stakes nature of certification testing, it is crucial to ensure rater reliability in interpreter certification performance testing (ICPT). However, a review of current ICPT practice indicates that rigorous research on rater reliability is lacking. Against this background, the present study reports on the use of multifaceted Rasch measurement (MFRM) to identify the degree of severity/leniency in different raters’ assessments of simultaneous interpretations (SIs) by 32 interpreters in an experimental setting. Nine raters specifically trained for the purpose were asked to evaluate four English-to-Chinese SIs by each of the interpreters, using three 8-point rating scales (information content, fluency, expression). The source texts differed in speed and in the speaker’s accent (native vs non-native). Rater-generated scores were then subjected to MFRM analysis, using the FACETS program. The following general trends emerged: (1) homogeneity statistics showed that not all raters were equally severe overall; and (2) bias analyses showed that a relatively large proportion of the raters had significantly biased interactions with the interpreters and the assessment criteria. Implications for practical rating arrangements in ICPT, and for rater training, are discussed.
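
The bias analyses referred to above can be thought of as testing rater-by-examinee interaction terms added to the main-effects model; a sketch in common many-facet Rasch notation (the symbols are illustrative, not taken from the article):

\[ \ln\!\left(\frac{P_{njck}}{P_{njc(k-1)}}\right) = \theta_n - \lambda_j - \delta_c - \tau_k - \phi_{nj} \]

where \theta_n is interpreter ability, \lambda_j rater severity, \delta_c the difficulty of assessment criterion c, \tau_k a category threshold, and \phi_{nj} the bias term: a statistically significant \phi_{nj} flags a rater whose severity toward a particular interpreter deviates from what the main effects predict.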

11. Wesolowski, Brian C., Stefanie A. Wind, and George Engelhard. "Examining Rater Precision in Music Performance Assessment." Music Perception 33, no. 5 (June 1, 2016): 662–78. http://dx.doi.org/10.1525/mp.2016.33.5.662.

Abstract:
The use of raters as a methodological tool to detect significant differences in performances and as a means to evaluate music performance achievement is a solidly defended practice in music psychology, education, and performance science research. However, psychometric concerns exist regarding raters’ precision in the use of task-specific scoring criteria. A methodology for managing rater quality in rater-mediated assessment practices has not been systematically developed in the field of music. The purpose of this study was to examine rater precision through the analysis of rating scale category structure across a set of raters and items within the context of large-group music performance assessment using a Multifaceted Rasch Partial Credit (MFR-PC) Measurement Model. Allowing for separate parameterization of the rating scale for each rater can more clearly detect variability in rater judgment and improve model-data fit, thereby enhancing objectivity, fairness, and precision of rating quality in the music assessment process. Expert judges (N = 23) rated a set of four recordings by middle school, high school, collegiate, and professional jazz big bands. A single common expert rater evaluated all 24 jazz ensemble performances. The data suggest that raters significantly vary in severity, items significantly vary in difficulty, and rating scale category structure significantly varies across raters. Implications for the improvement and management of rater quality in music performance assessment are provided.
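
The distinguishing feature of the partial credit formulation described above is that category thresholds are estimated separately for each rater. In common notation (a sketch, not the article's own equations), the single threshold set \tau_k of the rating scale model is replaced by rater-specific thresholds \tau_{jk}:

\[ \ln\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_{jk} \]

so each rater j carries not only a severity \lambda_j but also an individually parameterized rating scale structure, which is how variability in category use across raters becomes visible.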

12. Boone, William J., J. Scott Townsend, and John R. Staver. "Utilizing Multifaceted Rasch Measurement Through FACETS to Evaluate Science Education Data Sets Composed of Judges, Respondents, and Rating Scale Items: An Exemplar Utilizing the Elementary Science Teaching Analysis Matrix Instrument." Science Education 100, no. 2 (November 25, 2015): 221–38. http://dx.doi.org/10.1002/sce.21210.

13. Musselwhite, Dorothy J., and Brian C. Wesolowski. "Evaluating the Psychometric Qualities of a Rating Scale to Assess Pre-Service Teachers’ Lesson Plan Development in the Context of a Secondary-Level Music Performance Classroom." Journal of Research in Music Education 66, no. 3 (August 28, 2018): 338–58. http://dx.doi.org/10.1177/0022429418793645.

Abstract:
The purpose of this study was to evaluate the psychometric quality (i.e., validity and reliability) of a rating scale to assess pre-service teachers’ lesson plan development in the context of secondary-level music performance classrooms. The research questions that guided this study include: (1) What items demonstrate acceptable model fit for the construct of lesson plan development in the context of a secondary-level music performance classroom? (2) How does the structure of the rating scale vary across items? and (3) Does differential severity emerge for academic administrators or music education content specialists across items? Using multiple teacher effectiveness frameworks, the lesson plans in this study were evaluated using a 4-point Likert-type rating scale (e.g., strongly agree, agree, disagree, strongly disagree) consisting of five domains: (a) instructional planning, (b) instructional delivery, (c) differentiated instruction, (d) assessment uses, and (e) assessment strategies. Secondary-level school administrators (n = 8) and music education content specialists (n = 8) rated 32 lesson plans using a balanced incomplete assessment network. The multifaceted Rasch measurement partial credit model was used in this study. Results suggest higher rater severity among administrators than music specialists. Of the 68 potential pairwise interactions examined in the study, 5 (7.4%) of those were found to be statistically significant, which indicates that 5 raters demonstrated differential severity across at least one lesson plan. Implications for student teacher preparation, teacher effectiveness, and the validity of measures are discussed.

14. Wesolowski, Brian C., and Stefanie A. Wind. "Investigating rater accuracy in the context of secondary-level solo instrumental music performance." Musicae Scientiae 23, no. 2 (June 13, 2017): 157–76. http://dx.doi.org/10.1177/1029864917713805.

Abstract:
In any performance-based musical assessment context, construct-irrelevant variability attributed to raters is a cause of concern when constructing a validity argument. Therefore, evidence of rater quality is a necessary criterion for psychometrically sound (i.e., valid, reliable, and fair) rater-mediated music performance assessments. Rater accuracy is a type of rater quality index that measures the distance between raters’ operational ratings and an expert’s criterion ratings on a set of benchmark, exemplar, or anchor musical performances. The purpose of this study was to examine the quality of ratings in the context of a secondary-level solo music performance assessment using a Multifaceted Rasch Rater Accuracy (MFR-RA) measurement model. This study was guided by the following research questions: (a) overall, how accurate were the rater judgments in the assessment context? (b) how accurate were the rater judgments across each of the items of the rubric? and (c) how accurate were the rater judgments across each of the domains of the rubric? Results indicated that accuracy scores generally matched the expectations of the MFR-RA model, with rater locations higher than the average student performance, item, and domain locations, indicating that the student performances, items, and domains were relatively easy to rate accurately for the sample of raters examined in this study. Overall, rater accuracy ranged from 0.54 logits (SE = 0.05) for the most accurate rater to 0.24 logits (SE = 0.04) for the least accurate rater. Difficulty of rater accuracy across items indicated a range of 0.91 logits (SE = 0.08) to -1.83 logits (SE = 0.17). Difficulty of rater accuracy across domains ranged from 0.25 logits (SE = 0.08) to -0.68 logits (SE = 0.17). Implications for the improvement of music performance assessments with specific regard to rater training are discussed.

15. O’Grady, Stefan. "The impact of pre-task planning on speaking test performance for English-medium university admission." Language Testing 36, no. 4 (March 2019): 505–26. http://dx.doi.org/10.1177/0265532219826604.

Abstract:
This study investigated the impact of different lengths of pre-task planning time on performance in a test of second language speaking ability for university admission. In the study, 47 Turkish-speaking learners of English took a test of English language speaking ability. The participants were divided into two groups according to their language proficiency, which was estimated through a paper-based English placement test. They each completed four monologue tasks: two picture-based narrative tasks and two description tasks. In a balanced design, each test taker was allowed a different length of planning time before responding to each of the four tasks. The four planning conditions were 30 seconds, 1 minute, 5 minutes, and 10 minutes. Trained raters awarded scores to the test takers using an analytic rating scale and a context-specific, binary-choice rating scale designed specifically for the study. The rater scores were analysed using multifaceted Rasch measurement. The impact of pre-task planning on test scores was found to be influenced by four variables: the rating scale; the task type that test takers completed; the length of planning time provided; and the test takers’ levels of proficiency in the second language. Increases in scores were larger on the picture-based narrative tasks than on the two description tasks. The results also revealed a relationship between proficiency and pre-task planning, whereby statistical significance was only reached for the increases in the scores of the lowest-level test takers. Regarding the amount of planning time, the 5-minute planning condition led to the largest overall increases in scores. The research findings offer contributions to the study of pre-task planning and will be of particular interest to institutions seeking to assess the speaking ability of prospective students in English-medium educational environments.

16. Coniam, David. "An Investigation into the Effect of Raw Scores in Determining Grades in a Public Examination of Writing." JALT Journal 30, no. 1 (May 1, 2008): 69. http://dx.doi.org/10.37546/jaltjj30.1-4.

Abstract:
This article examines the effect on test takers’ grades of assigning them either directly from raters’ raw scores or from measures obtained through multifaceted Rasch measurement (MFRM). Using data from the Hong Kong 2005 public examination of writing, the current study examines how test takers’ grades differ by comparing the results of grades from “lenient” raters against those of “severe” raters under the two systems for assigning grades: raw band scores and MFRM-derived scores. Examination of the results of a pair of raters indicates that the use of raw scores may produce widely different results from those obtained via MFRM, with test takers potentially disadvantaged by being rated by a severe rather than a lenient rater. In the Hong Kong English language public examination system from 2007 onwards, band scales will be used extensively, as indeed they already are in many Asian countries. The article therefore concludes with a call for consideration to be given to how test takers’ final grades may be derived from raw scores.

17. Eckes, Thomas. "Beurteilerübereinstimmung und Beurteilerstrenge" [Rater Agreement and Rater Severity]. Diagnostica 50, no. 2 (April 2004): 65–77. http://dx.doi.org/10.1026/0012-1924.50.2.65.

Abstract:
Performance ratings are subject to a number of rater errors that can considerably reduce their accuracy and validity. A particularly critical rater error is the tendency toward severity or leniency. This paper presents many-facet Rasch measurement (Linacre, 1989; Linacre & Wright, 2002), an item response model that yields a severity (or leniency) measure for each rater and places these severity measures in a common frame of reference together with the ability measures of the examinees and the difficulty measures of the tasks or rating criteria. The model also permits performance measures that are corrected for rater severity. Using this approach, ratings from the Test Deutsch als Fremdsprache (TestDaF) are analyzed: the written performances of 1,359 examinees were each rated on 3 criteria by 2 of a total of 29 raters. The rater group proved to be highly heterogeneous, so that a severity correction of the ratings is warranted. Finally, various implications of the many-facet Rasch model for the evaluation of performance ratings are discussed.
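
To make the common frame of reference concrete, here is a minimal, self-contained sketch of the idea: simulate a dichotomous three-facet design and recover rater severities by joint maximum likelihood. All names and numbers are hypothetical; this is illustrative Python, not the FACETS program and not the TestDaF data.

# A minimal illustrative sketch of many-facet Rasch measurement (hypothetical
# code and simulated data; not the FACETS program, not the TestDaF data).
# Model: logit P(x_nij = 1) = theta_n - delta_i - lambda_j, with persons n,
# criteria i, and raters j; lambda_j is rater severity.
import numpy as np

rng = np.random.default_rng(0)
N, I, J = 200, 10, 5                      # persons, criteria, raters

theta = rng.normal(0.0, 1.0, N)           # simulated person abilities
delta = rng.normal(0.0, 1.0, I)           # simulated criterion difficulties
lam = rng.normal(0.0, 0.5, J)             # simulated rater severities

logits = theta[:, None, None] - delta[None, :, None] - lam[None, None, :]
x = (rng.random((N, I, J)) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

# Joint maximum-likelihood estimation by gradient ascent; the criterion and
# rater facets are re-centered each step to fix the origin of the logit scale.
t_hat, d_hat, l_hat = np.zeros(N), np.zeros(I), np.zeros(J)
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(t_hat[:, None, None]
                              - d_hat[None, :, None]
                              - l_hat[None, None, :])))
    resid = x - p                         # observed minus expected
    t_hat += 0.5 * resid.mean(axis=(1, 2))
    d_hat -= 0.5 * resid.mean(axis=(0, 2))
    l_hat -= 0.5 * resid.mean(axis=(0, 1))
    d_hat -= d_hat.mean()
    l_hat -= l_hat.mean()

print("estimated rater severities:", np.round(l_hat, 2))
print("simulated rater severities:", np.round(lam - lam.mean(), 2))

Because the recovered severities sit on the same logit scale as the person abilities and criterion difficulties, a severity-corrected performance measure follows directly: each examinee can be scored against the average rater rather than the particular raters encountered.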

18. Han, Chao, and Xiaoqi Shang. "An item-based, Rasch-calibrated approach to assessing translation quality." Target. International Journal of Translation Studies, September 15, 2022. http://dx.doi.org/10.1075/target.20052.han.

Abstract:
Item-based scoring has been advocated as a psychometrically robust approach to translation quality assessment, outperforming traditional neo-hermeneutic and error analysis methods. The past decade has witnessed a succession of item-based scoring methods being developed and trialed, ranging from calibration of dichotomous items to preselected item evaluation. Despite this progress, these methods seem to be undermined by several limitations, such as the inability to accommodate the multifaceted reality of translation quality assessment and inconsistent item calibration procedures. Against this background, we conducted a methodological exploration, utilizing what we call an item-based, Rasch-calibrated method, to measure translation quality. This new method, built on the sophisticated psychometric model of many-facet Rasch measurement, inherits the item concept from its predecessors, but addresses previous limitations. In this article, we demonstrate its operationalization and provide an initial body of empirical evidence supporting its reliability, validity, and utility, as well as discuss its potential applications.

19. Bijani, Houman, Bahareh Hashempour, Khaled Ahmed Abdel-Al Ibrahim, Salim Said Bani Orabah, and Tahereh Heydarnejad. "Investigating the effect of classroom-based feedback on speaking assessment: a multifaceted Rasch analysis." Language Testing in Asia 12, no. 1 (October 7, 2022). http://dx.doi.org/10.1186/s40468-022-00176-3.

Abstract:
Due to subjectivity in oral assessment, much attention has been paid to obtaining a satisfactory measure of consistency among raters. However, the process of obtaining more consistency might not result in valid decisions. One matter that is at the core of both reliability and validity in oral assessment is rater training. Recently, multifaceted Rasch measurement (MFRM) has been adopted to address the problem of rater bias and inconsistency in scoring; however, no research has incorporated the facets of test takers’ ability, raters’ severity, task difficulty, group expertise, scale criterion category, and test version together, along with their two-way interactions. Moreover, little research has investigated how long the effects of rater training last. Consequently, this study explored the influence of a training program and feedback by having 20 raters score the oral production of 300 test takers in three phases. The results indicated that training can lead to greater interrater reliability and to reduced severity/leniency and bias. However, it does not bring raters into total unanimity; rather, it makes them more self-consistent. Even though rater training might result in higher internal consistency among raters, it cannot simply eradicate individual differences related to raters’ characteristics. That is, experienced raters, due to their idiosyncratic characteristics, did not benefit as much as inexperienced ones. This study also showed that the effects of training might not endure over the long term; ongoing training throughout the rating period is therefore required to let raters regain consistency.

20. Fischl, Caroline, Camilla Malinowsky, and Ingeborg Nilsson. "Measurement of older adults’ performance in digital technology-mediated occupations and management of digital technology." British Journal of Occupational Therapy, August 2, 2020, 030802262093797. http://dx.doi.org/10.1177/0308022620937971.

Abstract:
Introduction: Supporting older adults’ digital engagement requires an understanding of how occupational performance and technology use are related, as well as having a range of methods that can assist occupational therapists while observing occupational performance and management of technology. The study objectives were to investigate how older adults’ ability to perform digital technology-mediated occupations and ability to manage digital technology could be measured and to examine the association between these two abilities. Method: Twenty-five older adults were observed performing digital technology-mediated occupations and managing digital technologies, and were scored on two instruments: the Assessment of Computer-Related Skills and the Management of Everyday Technology Assessment. FACETS was used to generate respective multifaceted Rasch measurement models for scores on the instruments. The Spearman correlation test was used to investigate the correlation between person ability measures from the respective Rasch models of the instruments. Results: The results include item, occupation, and technology difficulty estimates, as well as person ability measures that could illustrate older adults’ ability to perform occupations and to manage technology. There is also a strong positive correlation between these abilities. Conclusion: Insight into an older person’s ability to manage technology can provide information about his or her ability to perform digital technology-mediated occupations and vice versa.

21. Han, Chao. "Using analytic rating scales to assess English/Chinese bi-directional interpretation: A longitudinal Rasch analysis of scale utility and rater behavior." Linguistica Antverpiensia, New Series – Themes in Translation Studies 16 (January 29, 2018). http://dx.doi.org/10.52034/lanstts.v16i0.429.

Abstract:
Descriptor-based analytic rating scales have been increasingly used to assess interpretation quality. However, little empirical evidence is available to unequivocally support the effectiveness of rating scales and rater reliability. This longitudinal study thus attempts to shed light on scale utility and rater behavior in English/Chinese interpretation performance assessment, using multifaceted Rasch measurement. Specifically, the study focuses on criterion/scale difficulty, scale effectiveness, and rater severity/leniency and self-consistency, across the two interpreting directions and over three time points. Research results are discussed, highlighting the utility of analytic rating scales and the variability of rater behavior in interpretation assessment. The results also have implications for developing reliable, valid, and practical instruments to assess interpretation quality.

22. Li, Jiuliang, and Qian Wang. "Development and validation of a rating scale for summarization as an integrated task." Asian-Pacific Journal of Second and Foreign Language Education 6, no. 1 (July 1, 2021). http://dx.doi.org/10.1186/s40862-021-00113-6.

Abstract:
Summary writing is essential for academic success and has attracted renewed interest in academic research and large-scale language testing. However, less attention has been paid to the development and evaluation of scoring scales for summary writing. This study reports on the validation of a summary rubric that represents a practical approach to scale development with limited resources. Participants were 83 students and three raters. Diagnostic evaluation of the scale components and categories was based on raters’ perceptions of their use and on the scores of students’ summaries, which were analyzed using multifaceted Rasch measurement (MFRM). Correlation analysis revealed significant relationships among the scoring components, but the coefficients among some of the components were overly high. MFRM analysis provided evidence in support of the usefulness of the scoring rubric, but also suggested the need for refinement of the components and categories. According to the raters, the rubric was ambiguous in addressing some crucial text features. This study has implications for summarization task design, and for scoring scale development and validation in particular.