Designing Automated Scoring Approach to Diagnose Scientific Competency Misconceptions through Machine Learning
Abstract
Automated scoring designed to diagnose misconceptions in scientific competencies through machine learning is a scoring method for a mixed-format diagnostic test that assesses scientific competency levels with a scoring machine in a single step. This innovation can report results and provide real-time feedback to learners. To design this prototypical innovation, the objectives of the present study were defined as follows: (1) to analyze students’ response patterns for the design of the prototypical innovation, including identifying the types and levels of misconceptions in scientific competencies, defining cut scores based on response patterns in the scientific competency diagnostic test, and determining the variables for a predictive equation based on students' background data; and (2) to design automated scoring for diagnosing misconceptions in scientific competencies through machine learning. The study drew on secondary data from a sample of 847 grade 7 students. The data were analyzed with a multidimensional random coefficients multinomial logit model (MRCMLM), and cut scores were defined on the Wright map. The results are reported below.
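For reference, the MRCMLM mentioned above follows the standard form given by Adams, Wilson and Wang (1997), cited in the reference list below; the notation here is that standard form rather than anything reproduced from the article itself. The probability of a response in category k of item i depends on the vector of latent abilities θ, the item parameters ξ, the scoring vector b_ik and the design vector a_ik:

$$
P(X_{ik}=1;\,\mathbf{A},\mathbf{B},\boldsymbol{\xi}\mid\boldsymbol{\theta})
=\frac{\exp\!\big(\mathbf{b}_{ik}^{\prime}\boldsymbol{\theta}+\mathbf{a}_{ik}^{\prime}\boldsymbol{\xi}\big)}
{\sum_{k'=1}^{K_i}\exp\!\big(\mathbf{b}_{ik'}^{\prime}\boldsymbol{\theta}+\mathbf{a}_{ik'}^{\prime}\boldsymbol{\xi}\big)}
$$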
(1) Five types of misconceptions in scientific competencies were identified. Regarding the cut scores of the three scientific competencies, the competency to explain phenomena scientifically had cut scores of -0.49, 0.27 and 0.83; the competency to evaluate and design scientific enquiry had cut scores of -0.09, 0.54 and 1.72; and the competency to interpret data and use evidence scientifically had cut scores of -0.78, 0.47 and 2.98. In addition, the important independent variables were found to be: (1) the total GPA of the previous semester; (2) the GPA in the science subject of the previous semester; and (3) the number of hours per week of self-study in science.
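As an illustrative sketch only, not the authors' implementation, the reported cut scores could be applied to a student's estimated ability on each competency to place the student at one of four levels; the function name, dictionary keys and level numbering below are assumptions for illustration.

```python
# Illustrative sketch: assign a competency level from an ability estimate (in logits)
# using the cut scores reported in the abstract. Level labels are hypothetical.
CUT_SCORES = {
    "explain_phenomena_scientifically": (-0.49, 0.27, 0.83),
    "evaluate_and_design_scientific_enquiry": (-0.09, 0.54, 1.72),
    "interpret_data_and_use_evidence_scientifically": (-0.78, 0.47, 2.98),
}

def competency_level(competency: str, theta: float) -> int:
    """Return a level from 1 to 4: each cut score passed moves the student up one level."""
    cuts = CUT_SCORES[competency]
    return 1 + sum(theta >= cut for cut in cuts)

# Example: theta = 0.60 on "explain phenomena scientifically" lies between the
# second and third cut scores, so the student is placed at level 3.
print(competency_level("explain_phenomena_scientifically", 0.60))  # 3
```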
(2) The scoring designed to diagnose scientific competencies on the subjective test items used identified keywords and converted answers, synthesised from the response patterns in the secondary data, into choices so that scoring, processing of scientific competency levels, and reporting of misconception levels could be carried out automatically and in real time. The evaluation by five experts further showed that the design of the automated scoring met five standards, namely utility, feasibility, propriety, accuracy and accountability.
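A minimal sketch of the keyword-based conversion described above, assuming a simple substring match; the keyword lists, score values and function name are hypothetical and are not taken from the article.

```python
# Illustrative sketch (not the authors' engine): convert an open-ended answer into a
# scoreable "choice" by matching keywords synthesised from students' response patterns.
# The keyword lists and score values here are hypothetical placeholders.
KEYWORD_RUBRIC = [
    # (score, keywords that must all appear in the answer), highest score first
    (2, ["evaporation", "temperature"]),   # complete scientific explanation
    (1, ["evaporation"]),                  # partial explanation
]

def score_open_ended(answer: str) -> int:
    """Return the highest score whose keyword set is fully present in the answer."""
    text = answer.lower()
    for score, keywords in KEYWORD_RUBRIC:
        if all(keyword in text for keyword in keywords):
            return score
    return 0  # no keyword pattern matched: candidate response for misconception review

print(score_open_ended("The water disappears because of evaporation at a higher temperature."))  # 2
```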
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The content and information contained in articles published in the Journal of Educational Measurement Mahasarakham University are the opinions and sole responsibility of the authors. The editorial board of the journal does not necessarily agree with, and is not responsible for, any of the content.
The articles, data, content, images, etc. that have been published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. If any individual or organization wishes to reproduce or perform any actions involving the entirety or any part of the content, they must obtain written permission from the Journal of Educational Measurement Mahasarakham University.
References
Adams, R. J., Wilson, M., & Wang, W. C. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1-23.
Desstya, A., Prasetyo, Z. K., Suyanta, Susila, I., & Irwanto. (2019). Developing an Instrument to Detect Science Misconception of an Elementary School Teacher. International Journal of Instruction, 12(3), 201-218.
Intasoi, S., Junpeng, P., Tang, K. N., Ketchatturat, J., Zhang, Y., & Wilson, M. (2020). Developing an assessment framework of multidimensional scientific competencies. International Journal of Evaluation and Research in Education, 9(4), 963-970.
Junpeng, P., Krotha, J., Chanayota, K., Tang, K. N., & Wilson, M. (2019). Constructing Progress Maps of Digital Technology for Diagnosing Mathematical Proficiency. Journal of Education and Learning, 8(6), 90-102.
Junpeng, P., Marwiang, M., Chiajunthuk, S., Suwannatrai, P., Chanayota, K., Pongboriboon, K., Tang, K. N., & Wilson, M. (2020). Validation of a digital tool for diagnosing mathematical proficiency. International Journal of Evaluation and Research in Education, 9(3), 665-674.
Lee, H. S., Gweon, G. H., Lord, T., Paessel, N., Pallant, A., & Pryputniewicz, S. (2021). Machine Learning-Enabled Automated Feedback: Supporting Students’ Revision of Scientific Arguments Based on Data Drawn from Simulation. Journal of Science Education and Technology, 30(2), 168-192.
Lintean, M., Rus, V., & Azevedo, R. (2012). Automatic Detection of Student Mental Models Based on Natural Language Student Input during Metacognitive Skill Training. International Journal of Artificial Intelligence in Education, 21(3), 169-190.
Maestrales, S., Zhai, X., Touitou, I., Baker, Q., Schneider, B., & Krajcik, J. (2021). Using Machine Learning to Score Multi-Dimensional Assessments of Chemistry and Physics. Journal of Science Education and Technology, 30(2), 239-254.
National Research Council. (1997). Science teaching reconsidered: A handbook. National Academies Press.
Organisation for Economic Co-operation and Development. (2019). PISA 2018 Assessment and Analytical Framework. PISA, OECD Publishing.
Reeves, T. (2006). Design research from a technology perspective. In J. van den Akker, K. Gravemeijer, S. McKenney & N. Nieveen (Eds.), Educational design research (pp. 52–66). Routledge.
Varasunun, P. (2011). The Program Evaluation Standards (3rd Ed.) in 2010. Journal of Research Methodology, 24(2), 273-278.
Wilson, M. (2005). Constructing Measures: an item response modeling approach. Lawrence Erlbaum Associates.
Wright, B.D., & Stone, M. H. (1979). Best Test Design: Rasch Measurement. Mesa Press.
Wu, M. L., Adams, R. J., Wilson, M., & Haldane, S. A. (2007). ConQuest Version 2: Generalised Item Response Modelling Software. ACER Press.
Yarbrough, D. B., Shulha, L. M., Hopson, R. K., & Caruthers, F. A. (2010). The Program Evaluation Standards: A guide for evaluators and evaluation users (3rd Ed.). Sage.
Anuntasawat, S. (2017). Development of a Misconception Diagnostic System Using a Three-Tier Diagnostic Test with Computer-Based Reflective Feedback for Tenth Grade Students [Doctoral dissertation]. Chulalongkorn University. (in Thai)
Chianchana, C. (2009). Multidimensional Analysis. Journal of Education Khon Kaen University, 32(4), 13-22. (in Thai)
Insawat, M. (2016). A comparison of the quality of three-tier diagnostic test of mathematical misconception using different levels of confidence [Master’s thesis]. Chulalongkorn University. (in Thai)
Institute for the Promotion of Teaching Science and Technology. (2017). Teacher Guide Basic Science Volume 1. Bangkok: Office of the Welfare Promotion Commission for Teachers and Educational Personnel Press. (in Thai)
Intasoi, S. (2020). Developing a digital tool for assessing multidimensional scientific competencies of the seventh-grade students [Master’s thesis]. Khon Kaen University. (in Thai)
Junpeng, P. (2018). Applications of the multidimensional item response theory for research (4th Ed.). Khon Kaen University Press. (in Thai)
Kamtet, W. (2017). Scientific Misconception: Type and Assessment Tool. STOU Education Journal, 10(2), 54-64. (in Thai)
Khawsuk, J., & Srihaset, K. (2016). The Development of a Mathematics Diagnostic Test for Polynomials at the Matthayom Sueksa One Level. Veridian E-Journal, Silpakorn University, 9(3), 1206-1220. (in Thai)
Kladchuen, R., & Sanrach, J. (2018). An Efficiency Comparison of Algorithms and Feature Selection Methods to Predict the Learning Achievement of Vocational Students. Research Journal Rajamangala University of Technology Thanyaburi, 17(1), 1-10. (in Thai)
Rawanprakhon, P., & Tayraukham, S. (2017). The Development of Multidimensional Essay Test. Journal of Community Development Research (Humanities and Social Sciences), 10(4), 13-23. (in Thai)
Sudjai, M., & Mungsing, S. (2021). Development of an Automated Subjective Answer Scoring System with Full-Text Search Technique and PHP Text Comparison Function. Journal of Information Science and Technology, 11(1), 8-17. (in Thai)
Wongphummueang, L., & Piyakun, A. (2021). Development of Multidimensional Test of Collaborative Problem-Solving Skills for Lower Secondary School Students. Journal of Educational Measurement Mahasarakham University, 27(1), 232-243. (in Thai)
Wongwanit, S. (2020). Design Research in Education. Chulalongkorn University press. (in Thai)
Boonjun, S. (2019). A Comparison of Differential Item Functioning for Mixed-Format Tests by Three Different Methods [Doctoral dissertation]. Burapha University. (in Thai)
Fiothong, A., Junpeng, P., Suwannatrai, P., Chinjunthuk, S., & Tawarungruang, C. (2022). Designing Open-Ended Question Scoring for Assessment of Students' Mathematical Proficiency Levels through Digital Technology. Journal of Educational Measurement Mahasarakham University, 28(1), 346-362. (in Thai)