Design of a Scoring Method for Open-Ended Tests to Assess Mathematical Proficiency Levels through Digital Technology

Apinya Fiothong; Putcharee Junpeng; Prapawadee Suwannatrai; Samruan Chinjunthuk; Chaiwat Tawarungruang

Published: Jun 16, 2022

Keywords:

open-ended question scoring, design research, multidimensional item response model, mathematical proficiency level

Apinya Fiothong

Faculty of Education, Khon Kaen University

Putcharee Junpeng

Faculty of Education, Khon Kaen University

Prapawadee Suwannatrai

Demonstration School of Khon Kaen University, Secondary Division (Modindang)

Samruan Chinjunthuk

Demonstration School of Khon Kaen University, Secondary Division (Modindang)

Chaiwat Tawarungruang

Faculty of Public Health, Khon Kaen University

Abstract

The study aimed to (1) analyze students' multidimensional response patterns to determine cut scores for assessing mathematical proficiency levels on the topic of Measurement and Geometry, and (2) design and assess the quality of an open-ended question scoring method for assessing mathematical proficiency levels through digital technology. A design research approach was applied. The sample consisted of 528 grade 7 students. The research instrument was an open-ended test on Measurement and Geometry, delivered through the diagnostic tools of an online testing system, "eMAT-Testing." The collected data were analyzed with the multidimensional random coefficients multinomial logit (MRCML) model.
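For orientation, the general form of the MRCML model, as given by Adams, Wilson, and Wang (1997), can be written as below. This is a sketch of the standard formulation only; the abstract does not report the specific design and scoring matrices used, so the symbols are generic rather than the authors' own specification.

```latex
% General MRCML response probability (Adams, Wilson, & Wang, 1997).
% \theta : vector of latent abilities (here, presumably the two dimensions
%          mathematical processes and conceptual structures)
% \xi    : vector of item parameters
% b_{ik} : scoring vector for category k of item i
% a_{ik} : design vector for category k of item i
% K_i    : number of response categories of item i
\[
P(X_{ik} = 1;\, \mathbf{A}, \mathbf{B}, \boldsymbol{\xi} \mid \boldsymbol{\theta})
  = \frac{\exp\!\left(\mathbf{b}_{ik}^{\top}\boldsymbol{\theta}
          + \mathbf{a}_{ik}^{\top}\boldsymbol{\xi}\right)}
         {\sum_{j=1}^{K_i} \exp\!\left(\mathbf{b}_{ij}^{\top}\boldsymbol{\theta}
          + \mathbf{a}_{ij}^{\top}\boldsymbol{\xi}\right)}
\]
```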
The results were as follows:
1. Cut scores for the mathematical proficiency levels were determined by defining criterion zones on the Wright map. The mathematical processes dimension had five levels separated by four cut scores, from lowest to highest: -2.30, -0.43, 0.78, and 1.15. Similarly, the conceptual structures dimension had five levels separated by four cut scores: -2.76, 0.11, 0.46, and 1.16. These cut scores can be used to determine proficiency ranges, scale scores, and raw scores as criteria for assessing mathematical proficiency in each dimension.
2. The design of the open-ended question scoring through digital technology comprised five parts: (1) input, (2) process, (3) processing, (4) output, and (5) assessment reporting. Expert evaluation of its quality through standards-based assessment and heuristic assessment showed that (1) in the standards-based assessment, all three aspects (accuracy, utility, and feasibility) were rated at the highest level, and (2) in the heuristic assessment, the overall system was rated at the highest level of suitability; visibility of system status received the highest rating, while aesthetic and minimalist design received the lowest.
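To illustrate how the reported cut scores partition the ability scale into five levels per dimension, a minimal Python sketch follows. The cut scores are taken from the results above; the function name, the dictionary keys, the numbering of levels from 1 (lowest) to 5 (highest), and the rule that an estimate exactly on a cut score falls into the higher level are all assumptions made for illustration and are not described in the abstract.

```python
from bisect import bisect_right

# Cut scores (logits) reported in the abstract, lowest to highest.
# The dictionary keys are hypothetical labels for the two dimensions.
CUT_SCORES = {
    "mathematical_processes": [-2.30, -0.43, 0.78, 1.15],
    "conceptual_structures": [-2.76, 0.11, 0.46, 1.16],
}


def proficiency_level(theta: float, dimension: str) -> int:
    """Map an ability estimate (in logits) to a level from 1 to 5.

    Assumption: an estimate equal to a cut score is assigned to the
    higher of the two adjacent levels.
    """
    cuts = CUT_SCORES[dimension]
    # bisect_right counts how many cut scores are <= theta,
    # so adding 1 gives the level number.
    return bisect_right(cuts, theta) + 1


if __name__ == "__main__":
    for dim in CUT_SCORES:
        print(dim, proficiency_level(0.0, dim))
```

The same ability estimate can land in different levels across the two dimensions, because each dimension has its own set of cut scores.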

Issue

Vol. 28 No. 1: January - June 2022

Section

Research Article

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

The content and information contained in the published article in the Journal of Educational Measurement Mahasarakham University represent the opinions and responsibilities of the authors directly. The editorial board of the journal is not necessarily in agreement with or responsible for any of the content.

The articles, data, content, images, etc. that have been published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. If any individual or organization wishes to reproduce or perform any actions involving the entirety or any part of the content, they must obtain written permission from the Journal of Educational Measurement Mahasarakham University.

References

AERA, APA, & NCME. (2014). Standards for Educational and Psychological Testing (6th ed.). American Educational Research Association.

Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1-23.

Berggren, S. J., Rama, T., & Øvrelid, L. (2019). Regression or classification? Automated Essay Scoring for Norwegian. https://www.aclweb.org/anthology/W19-4409.pdf

Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 80(2), 139-148.

DeMars, C. (2010). Item Response Theory: Understanding Statistics Measurement. Oxford University Press.

European Language Resources Association (ELRA). (2020). Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020). Tokyo Metropolitan University.

Junpeng, P., Krotha, J., Chanayota, K., Tang, K. N., & Wilson, M. (2019). Constructing Progress Maps of Digital Technology for Diagnosing Mathematical Proficiency. Journal of Education and Learning, 8(6), 90-102.

Junpeng, P., Marwiang, M., Chinjunthuk, S., Suwannatrai, P., Chanayota, K., Pongboriboon, K., Tang, K. N., & Wilson, M. (2020b). Validation of a digital tool for diagnosing mathematical proficiency. International Journal of Evaluation and Research in Education (IJERE), 9(3), 665-674.

Koyama, Kiyuna, Kobayashi, Arai, and Komachi. (2020). Proceedings of the 12th conference on language resources and evaluation (LREC 2020). France.

Nielsen, J. (1992). Finding Usability Problems through Heuristic Evaluation. Paper presented at the ACM CHI'92, Monterey, CA.

Rodrigues & Araújo. (2012, April). Automatic assessment of short free text answers. https://www.researchgate.net/profile/Fatima_Rodrigues3/publication/234023013_Automatic_Assessment_of_Short_Free_Text_Answers/links/552d8aa90cf2e089a3ad78af/Automatic-Assessment-of-Short-Free-Text-Answers.pdf

Wang, J., & Brown, M. S. (2007). Automated Essay Scoring Versus Human Scoring: A Comparative Study. Journal of Technology, Learning, and Assessment (2). http://www.jtla.org

Wilson, M. (2005). Constructing measures: An item response modeling approach. Routledge.

Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Mesa Press.

Wu, Adams, Wilson, and Haldane. (2007). ACER ConQuest version 2.0. ACER Press.

Aungkaseraneekul, S. (2012). Automated Thai-language essay scoring [Unpublished master's thesis]. Kasetsart University. (in Thai)

Chinjunthuk, S., & Junpeng, P. (2020). Assessment Guidelines for Student's Personalized Mathematical Proficiency Development. Journal of Educational Measurement, Mahasarakham University, 26(1), 47-64. (in Thai)

Jaihuek, S., & Mungsing, S. (2020). Scoring Thai Language Subjective Answer Automatic System by Semantic. Information Technology Journal, 16(1), 15-23. (in Thai)

Junpeng, P., Marwiang, M., Chinjunthuk, S., Suwannatrai, P., Krotha, J., Chanayota, K., Tawarungruang, C., Thuanman, J., Tang, K. N., & Wilson, M. (2020a). Developing Students' Mathematical Proficiency Level Diagnostic Tools through Information Technology in Assessment for Learning Report. The Thailand Research Fund and Khon Kaen University. (in Thai)

Suksiri, W., & Worain, C. (2016). Investigating Tentative Cut Scores for Science Learning Area on the Ordinary National Educational Test Scores using the Construct Mapping Method: An Analysis for Further Judgments. National Institute of Educational Testing Service (Public Organization). (in Thai)

The Institute for the Promotion of Teaching Science and Technology (IPST). (2020). PISA 2021 with assessment mathematical literacy. https://pisathailand.ipst.ac.th/issue-2020-53 (in Thai)

Wongwanit, S. (2020). Design Research in Education (1st ed.). Chulalongkorn University Press. (in Thai)
