A Comparison of the Quality of Measurement and Evaluation Models under the Study Plans of Sukhothai Thammathirat Open University

Vinita Kaewkua
Sarhistthep Sukkaew

Abstract

This research aimed to: (1) compare the total scores obtained from examinations and activity-based assessments in Study Plans A1 and A2 with those in Plans A1 and A3; (2) compare the differences in the proportion of students who passed the examination based on the grading results of Plans A1 and A2, and of Plans A1 and A3; and (3) compare the concordance between grading results based on raw scores (before equating) and grading results based on equated scores in Plans A1 and A2, and in Plans A1 and A3. The research consisted of three phases. Phase 1 involved equating the total scores from examinations and activity-based assessments. Phase 2 involved comparing the differences in the proportion of students who passed the examination using the Mann–Whitney U test. Phase 3 involved comparing the consistency of grading results before and after score equating using the Kappa coefficient. The research findings were as follows: (1) there was no difference in the equated scores between Plans A1 and A2 and between Plans A1 and A3; (2) there was no difference in the proportion of students who passed the examination under Plans A1 and A2 and under Plans A1 and A3; and (3) the comparison of grading concordance before and after score equating showed strong agreement: the Kappa value was 0.893 for Plans A1 and A2 and 0.830 for Plans A1 and A3.
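The Kappa coefficient used in Phase 3 measures agreement between two categorical classifications (here, letter grades before and after score equating) beyond what chance alone would produce. A minimal sketch of that computation is shown below; the grade lists are hypothetical illustrations, not the study's data, and this is not the authors' code.

```python
# Illustrative sketch: Cohen's kappa for agreement between two gradings
# (e.g., grades assigned from raw scores vs. from equated scores).
from collections import Counter

def cohens_kappa(grades_a, grades_b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    assert len(grades_a) == len(grades_b)
    n = len(grades_a)
    labels = set(grades_a) | set(grades_b)
    # Observed agreement: fraction of cases where the two gradings match.
    p_o = sum(a == b for a, b in zip(grades_a, grades_b)) / n
    # Expected chance agreement, assuming the two gradings are independent.
    count_a = Counter(grades_a)
    count_b = Counter(grades_b)
    p_e = sum((count_a[lab] / n) * (count_b[lab] / n) for lab in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical grades for ten students before and after equating.
before = ["A", "B", "B", "C", "A", "A", "B", "C", "B", "A"]
after  = ["A", "B", "B", "C", "A", "B", "B", "C", "B", "A"]
print(round(cohens_kappa(before, after), 3))  # → 0.844
```

A kappa near 1 indicates near-perfect concordance; values above roughly 0.80, such as the 0.893 and 0.830 reported in the abstract, are conventionally read as strong agreement.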

Article Details

Section
Research Article

References

Angoff, W. H. (1984). Scales, norms, and equivalent scores. Educational Testing Service.

Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3–24. https://doi.org/10.1177/0146621602026001001

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Routledge.

McHugh, M. L. (2012). Interrater reliability: The kappa statistic. Biochemia Medica, 22(3), 276–282. https://doi.org/10.11613/BM.2012.031

Meng, Y. (2012). Comparison of kernel equating and item response theory equating methods [Doctoral dissertation, University of Massachusetts]. https://search.proquest.com/docview/1033227222

Petersen, N. S., Marco, G. L., & Stewart, E. E. (1982). A test of the adequacy of linear score equating models. In P. W. Holland & D. B. Rubin (Eds.), Test equating. Academic Press.

Sun, T., & Kim, S. Y. (2021). Evaluating six approaches to handling zero-frequency scores under equipercentile equating. Measurement: Interdisciplinary Research and Perspectives, 19(4), 213–235.

Chanchusakul, S., Paiwittayasiritham, C., Phonphanthin, Y., & Suphanophap, P. (2017). Comparing the quality of English language test score equating between the equipercentile method, straight-line method, and regression equation method. Veridian E-Journal, Silpakorn University: Humanities, Social Sciences and Arts, 10(2), 2444–2455. (in Thai)

Chutinantakul, S., Wongnam, P., & Panhun, S. (2018). Results of score calibration using the kernel method and IRT method under different conditions. Journal of Education, Sukhothai Thammathirat Open University, 11(1), 294–306. (in Thai)

Juimoungsri, S., & Wijitwanna, S. (2019). Factors affecting test scores of STOU students who take walk-in examination via computer [Research report]. Institute of Research and Development, Sukhothai Thammathirat Open University. (in Thai)

Kanchanawasi, S. (1998). Score comparison between tests (test equating). Textbook and Academic Document Center, Faculty of Education, Chulalongkorn University. (in Thai)

Pasunon, P. (2015). Assessment of confidence between evaluators using the Kappa coefficient statistic. Journal of Applied Liberal Arts, King Mongkut's University of Technology North Bangkok, 8(1), 2–20. (in Thai)

Sukhothai Thammathirat Open University. (n.d.). Teaching management. https://www.stou.ac.th/main/StouPlan.html (in Thai)