Applying Generative Artificial Intelligence in Educational Research Instrument Development: A Methodological Framework
Abstract
The development of educational research instruments is critical to the quality, credibility, and interpretability of research findings. However, many researchers continue to face challenges in constructing instruments that are both theoretically grounded and methodologically sound. At the same time, recent advances in Generative Artificial Intelligence (GenAI), particularly tools based on Large Language Models (LLMs) and Natural Language Processing (NLP), have created new opportunities to support construct definition, indicator specification, preliminary item drafting, and early-stage language review. Nevertheless, the application of these technologies to educational research instrument development still lacks a clearly articulated methodological framework grounded in educational measurement and evaluation principles.
This conceptual paper proposes a methodological framework for applying Generative Artificial Intelligence to educational research instrument development. The framework integrates four core components: educational measurement and evaluation foundations, instrument development processes, the role of Artificial Intelligence as a methodological support mechanism, and Human-in-the-Loop decision-making. Within this framework, AI is positioned as a methodological assistant rather than a substitute for scholarly judgment. It may support construct clarification, indicator development, preliminary item generation, and early-stage quality review, while responsibility for decisions concerning validity, reliability, measurement precision, measurement fairness, and ethical appropriateness remains with researchers and domain experts.
The proposed framework offers a structured, transparent, and theoretically grounded approach to integrating AI into instrument development. It further emphasizes that the academic value of AI-assisted instrument development depends not merely on procedural efficiency, but on its alignment with sound measurement and evaluation principles and responsible human oversight. This paper thus provides a preliminary methodological contribution for educational researchers, research instructors, and graduate students, while also offering a foundation for future empirical validation of AI-assisted instrument development practices in education.
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The content and information in articles published in the Journal of Educational Measurement Mahasarakham University reflect the opinions of the authors, who bear sole responsibility for them. The editorial board does not necessarily agree with, and is not responsible for, any of the content.
Articles, data, content, images, and other materials published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. Any individual or organization wishing to reproduce or otherwise use all or any part of this content must obtain written permission from the Journal of Educational Measurement Mahasarakham University.