Evaluation of the Psychometric Properties of Sukhothai Thammathirat Open University’s English Proficiency Test
Abstract
The objectives of this research were to 1) analyze the quality of Sukhothai Thammathirat Open University's computer-based English proficiency test (STOU-EPT) on three item response theory (IRT) parameters: the discrimination parameter (a), the item difficulty parameter (b), and the guessing parameter (c); and 2) test the STOU-EPT for differential item functioning (DIF). The sample comprised 1,272 STOU-EPT examinees, selected by purposive sampling under the criterion that complete secondary data were available on gender, dwelling place, and experience in taking the STOU-EPT. The research instrument was the 100-item STOU-EPT, consisting of 25 listening items, 35 grammar items, and 40 reading items. The data were analyzed in RStudio (the model and analyses are sketched after the findings). The study's findings were as follows:
1. The quality of the STOU-EPT was as follows: 1) for listening items, the a-parameter ranged from -9.62 to 4.5, with 22 items (88%) passing the criterion a > 0; the b-parameter ranged from -1.80 to 8.19, with 22 items (88%) passing the criterion -2 ≤ b ≤ +2; and the c-parameter ranged from 0 to 0.61; 2) for grammar items, the a-parameter ranged from -3.40 to 7.65, with 31 items (88.57%) passing the criterion a > 0; the b-parameter ranged from -4.21 to 3.06, with 22 items (62.86%) passing the criterion -2 ≤ b ≤ +2; and the c-parameter ranged from 0 to 0.44; and 3) for reading items, the a-parameter ranged from -8.75 to 7.87, with 36 items (90%) passing the criterion a > 0; the b-parameter ranged from -2.02 to 23.75, with 35 items (87.50%) passing the criterion -2 ≤ b ≤ +2; and the c-parameter ranged from 0 to 0.63.
2. DIF in the STOU-EPT was tested with logistic regression (LR), the simultaneous item bias test (SIBTEST), and the Mantel-Haenszel (MH) procedure, counting an item as exhibiting DIF when it was flagged by at least two of the three methods. The results revealed that 1) listening items showed DIF by gender (female = focal group), dwelling place (Bangkok and surrounding areas = focal group), and exam-taking experience (more than one attempt = focal group), with 3 items (12%) for each grouping; 2) grammar items showed DIF by gender, dwelling place, and exam-taking experience in 3, 5, and 1 items, respectively (8.57%, 14.29%, and 2.86%); and 3) reading items showed DIF by gender, dwelling place, and exam-taking experience in 1, 3, and 0 items, respectively (2.50%, 7.50%, and 0%).
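For reference, the a, b, and c values reported above come from the three-parameter logistic (3PL) IRT model, under which the probability that an examinee of ability $\theta$ answers item $i$ correctly is

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}$$

Here $a_i$ governs how sharply the item separates low- and high-ability examinees, $b_i$ is the ability level at which the probability is halfway between $c_i$ and 1, and $c_i$ is the lower asymptote attributable to guessing.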
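A minimal sketch of the item-calibration step follows, assuming the mirt package in R and a hypothetical 0/1 response matrix named responses; the abstract names RStudio but not a specific package, so the package choice and object names are assumptions:

library(mirt)

# Fit a unidimensional 3PL model to the dichotomous item responses
fit <- mirt(responses, model = 1, itemtype = "3PL")

# Extract item parameters on the IRT metric: a (discrimination),
# b (difficulty), g (guessing, the c-parameter reported above)
pars <- coef(fit, IRTpars = TRUE, simplify = TRUE)$items

# Screen items against the abstract's criteria
sum(pars[, "a"] > 0)                       # items passing a > 0
sum(pars[, "b"] >= -2 & pars[, "b"] <= 2)  # items passing -2 <= b <= +2
range(pars[, "g"])                         # range of the guessing parameter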
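Likewise, a sketch of the DIF screening, assuming the difR package, which implements all three procedures named above; the grouping vector gender and the focal label "female" are hypothetical placeholders for the three groupings examined (gender, dwelling place, exam-taking experience):

library(difR)

# Run the three DIF procedures on the same items and grouping
lr  <- difLogistic(Data = responses, group = gender, focal.name = "female")
sib <- difSIBTEST(Data = responses, group = gender, focal.name = "female")
mh  <- difMH(Data = responses, group = gender, focal.name = "female")

# Flag an item as DIF when at least two of the three methods agree;
# $DIFitems holds the indices of flagged items in each result
# (this assumes each method flags at least one item)
counts <- table(c(lr$DIFitems, sib$DIFitems, mh$DIFitems))
as.integer(names(counts)[counts >= 2])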