An Accuracy Investigation of Concurrent Calibration for the Assessment of Examinees' Growth Ability in Mixed-Format Tests with Different Test Lengths, Item Difficulties, and Scoring Categories
Abstract
The main purpose of this study was to investigate the accuracy of concurrent
calibration for assessing examinees' growth ability (θ2 - θ1) in mixed-format tests
consisting of multiple-choice (MC) items, modeled with a dichotomous response model,
and constructed-response (CR) items, modeled with a polytomous response model.
To this end, the 3PL/GPCM model combination was used to simulate item response
data for 1,000 examinees while manipulating four factors: growth ability, test length
(the numbers of MC and CR items, i.e., 30:10, 24:8, and 15:5), item difficulty, and the
number of scoring categories of the mixed-format test. In total, there were 243
conditions (9 x 3 x 3 x 3) across the four factors. The accuracy of concurrent calibration
was evaluated using the bias (BIAS) and the root mean square error (RMSE) of
the estimated growth ability (θ2 - θ1).
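The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the omission of a scaling constant (D = 1.7) in the logistic functions, and the use of placeholder indices for the nine growth-ability levels (which the abstract does not list) are all assumptions made for illustration. The sketch shows the usual 3PL and GPCM response probabilities, the common definitions of BIAS and RMSE, and how the 243 conditions arise from crossing the four factors.

```python
import itertools
import numpy as np


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response to a dichotomous (MC) item."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))


def p_gpcm(theta, a, steps):
    """GPCM category probabilities for one polytomous (CR) item.

    `steps` holds the step difficulties; an item with m steps has
    m + 1 score categories (0, 1, ..., m).
    """
    # Cumulative sums of a * (theta - step), with 0 for category 0.
    z = np.concatenate(([0.0], np.cumsum(a * (theta - np.asarray(steps)))))
    z -= z.max()  # subtract the maximum for numerical stability
    ez = np.exp(z)
    return ez / ez.sum()


def bias(estimated, true):
    """Mean signed difference between estimated and true values."""
    return float(np.mean(np.asarray(estimated) - np.asarray(true)))


def rmse(estimated, true):
    """Root mean square error between estimated and true values."""
    diff = np.asarray(estimated) - np.asarray(true)
    return float(np.sqrt(np.mean(diff ** 2)))


# Factor levels. The growth-ability levels are placeholders (assumption);
# the other levels are taken from the abstract.
growth_levels = range(9)
test_lengths = [(30, 10), (24, 8), (15, 5)]  # (MC items, CR items)
difficulties = [-0.50, 0.00, 0.50]
score_categories = [3, 4, 5]

conditions = list(
    itertools.product(growth_levels, test_lengths, difficulties, score_categories)
)
```

In a full study, `bias` and `rmse` would be applied to the estimated and true growth values (θ2 - θ1) within each of the crossed conditions.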
The results of the research were as follows:
1. Pearson's correlation coefficients between the estimated ability and the true
ability were high and negative, and statistically significant at the .01 level.
2. For all conditions, the accuracy of concurrent calibration when the standard
deviation of growth ability was 0.80 or 1.00 was higher than when the standard
deviation was 1.20, but the difference was not statistically significant.
2.1 When the item difficulties and scoring categories were fixed, the accuracy of
concurrent calibration of the 24:8 mixed-format test was statistically significantly higher, at
the .05 level, than that of the 15:5 and the 30:10 mixed-format tests.
2.2 When the test lengths and scoring categories were fixed, the accuracy of
concurrent calibration of the 1st mixed-format test with a difficulty of 0.00 was statistically
significantly higher, at the .05 level, than that of the 1st mixed-format test with a
difficulty of -0.50 or 0.50.
2.3 When the test lengths and item difficulties were fixed, the accuracy of concurrent calibration of the mixed-format test with three-category CR items (0/1/2) was statistically significantly higher, at the .05 level, than that of the mixed-format tests with four-category CR items (0/1/2/3) and with five-category CR items (0/1/2/3/4).
Article Details
The content and information contained in the published article in the Journal of Educational Measurement Mahasarakham University represent the opinions and responsibilities of the authors directly. The editorial board of the journal is not necessarily in agreement with or responsible for any of the content.
The articles, data, content, images, etc. that have been published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. If any individual or organization wishes to reproduce or perform any actions involving the entirety or any part of the content, they must obtain written permission from the Journal of Educational Measurement Mahasarakham University.