The Study of Cultural Consensus Model Efficiency on Analysis of Differential Rater Functioning: A Simulation Study
Abstract
This study aimed to: 1) examine the parameter-estimation efficiency of the cultural consensus model, and 2) identify the factors affecting the model's estimation efficiency. The study was conducted under simulated scenarios using Markov chain Monte Carlo (MCMC) estimation.
The study found that: 1) the MC-GCM model could efficiently recover the model parameters, with a statistically significant Pearson correlation between the true and estimated parameter values; and 2) a MANOVA on the number of raters, the number of items, and differential rater functioning showed that differential rater functioning had a significant effect on both the MSE and the correlation between the true and estimated values.
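The two recovery criteria reported above, MSE and the Pearson correlation between true and estimated parameter values, can be sketched as follows. This is a minimal illustration with simulated placeholder values, not the study's actual data or the MC-GCM estimation procedure:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical "true" rater-competence parameters and noisy estimates,
# standing in for values an MCMC fit of the model might return.
true_theta = rng.uniform(0.5, 1.0, size=30)
estimated_theta = true_theta + rng.normal(0.0, 0.05, size=30)

# Mean squared error between true and estimated parameter values.
mse = np.mean((true_theta - estimated_theta) ** 2)

# Pearson correlation between true and estimated values; values near 1
# indicate good parameter recovery.
r = np.corrcoef(true_theta, estimated_theta)[0, 1]

print(f"MSE = {mse:.4f}, r = {r:.3f}")
```

In a full recovery study these two statistics would be computed for each simulated condition (e.g., each combination of rater count, item count, and level of differential rater functioning) and then compared across conditions, as in the MANOVA described above.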
Article Details
The content and information contained in articles published in the Journal of Educational Measurement Mahasarakham University represent the opinions of, and are the direct responsibility of, the authors. The editorial board of the journal is not necessarily in agreement with, or responsible for, any of the content.
The articles, data, content, images, etc. that have been published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. If any individual or organization wishes to reproduce or perform any actions involving the entirety or any part of the content, they must obtain written permission from the Journal of Educational Measurement Mahasarakham University.
References
Batchelder, W. H., & Anders, R. (2012). Cultural Consensus Theory: Comparing Different Concepts of Cultural Truth. Journal of Mathematical Psychology, 56, 316-332.
Engelhard, G. Jr., Wind, S. A., Kobrin, J. L., & Chajewski, M. (2013). Differential Item and Person Functioning in Large-Scale Writing Assessments within the Context of the SAT. Research report. College Board.
Farrokhi et al. (2012). A Many-Facet Rasch Measurement of Differential Rater Severity/Leniency in Three Types of Assessment. JALT Journal, 34(1), 79-102.
Muckle, T. J., & Karabatsos, G. (2009). Hierarchical Generalized Linear Models for the Analysis of Judge Ratings. Journal of Educational Measurement, 46(2), 198-219.
Myford, C. M., & Wolfe, E. W. (2009). Monitoring Rater Performance over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use. Journal of Educational Measurement, 46(4), 371-389.
Patz, R. J., Junker, B. W., Johnson, M. S., & Mariano, L. T. (2002). The Hierarchical Rater Model for Rated Test Items and its Application to Large-Scale Educational Assessment Data. Journal of Educational and Behavioral Statistics, 27(4), 341-384.
Romney, A. K., Weller, S. C., & Batchelder, W. H. (1986). Culture as consensus: A theory of culture and informant accuracy. American anthropologist, 88(2), 313-338.
Schaefer, E. (2008). Rater Bias Patterns in an EFL Writing Assessment. Language Testing, 28(4), 465-493.
Wesolowski, B. C., Wind, S. A., & Engelhard, G. Jr. (2015). Rater Fairness in Music Performance Assessment: Evaluating Model-Data Fit and Differential Rater Functioning. Musicae Scientiae, 19(2), 147-170.
Yan, X. (2014). An Examination of Rater Performance on a Local Oral English Proficiency Test: A Mixed-Methods Approach. Language Testing, 31(4), 501-527.