Development of Parallel Tests in Mathematics for Students in Grade 2 in Schools under the Bangkok Metropolitan Administration through an Application of Two-Parameter Item Response Theory

Savina Boonsen
Chuthaphon Masantiah
Panida Panidvadana

Abstract

The purposes of this research were to 1) develop parallel tests in mathematics for grade 2 in the strand of measurement, and 2) verify the parallelism of those parallel tests. The sample consisted of 800 grade 2 students in the 2019 academic year from 13 schools in Nong Chok District under the Bangkok Metropolitan Administration, selected by stratified random sampling. The research instruments were 4 objective tests, on length (1A, 1B) and weight (2A, 2B), each comprising 20 multiple-choice items. The data were analyzed for content validity, reliability, item difficulty, discrimination, item information function, and test information function under the two-parameter item response theory model, and for the parallelism of the items and tests using the root mean square deviation of item information (RMSDIF) and the root mean square deviation of test information (RMSDTIF), respectively.
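For reference, the item difficulty (b), discrimination (a), and information functions referred to above are the standard quantities of the two-parameter logistic (2PL) model. The sketch below states the usual formulas; the scaling constant D (commonly 1 or 1.7) is an assumption, since the abstract does not specify which convention was used.

\[
P_i(\theta) = \frac{1}{1 + e^{-D a_i (\theta - b_i)}}, \qquad
I_i(\theta) = D^2 a_i^2\, P_i(\theta)\bigl(1 - P_i(\theta)\bigr), \qquad
I(\theta) = \sum_{i=1}^{n} I_i(\theta),
\]

where \(a_i\) is the discrimination and \(b_i\) the difficulty of item \(i\), \(I_i(\theta)\) is the item information function, and \(I(\theta)\) is the test information function of a 20-item form.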
The research findings revealed that:
1) All 4 parallel tests in mathematics had indices of item-objective congruence (IOC), the index of content validity, ranging from 0.67 to 1.00, and reliabilities of 0.711, 0.652, 0.718, and 0.635 respectively. The test information function reached its maximum at proficiency levels between -0.80 and -0.60 for the 1A and 1B tests and between -0.60 and -0.40 for the 2A and 2B tests. The item difficulty (b) was between -2.98 and 0.57, the discrimination (a) was between 0.22 and 1.13, and the item information function across all 4 tests was between 0.013 and 0.320.
2) The parallelism analysis of the items and tests on length (1A, 1B) revealed that the RMSDIF ranged from 0.008 to 0.167 and the RMSDTIF was 0.484, while the parallelism analysis of the items and tests on weight (2A, 2B) revealed that the RMSDIF ranged from 0.020 to 0.110 and the RMSDTIF was 0.492.
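The abstract does not spell out how RMSDIF and RMSDTIF are computed. The minimal Python sketch below assumes they are root-mean-square differences between the item information functions of paired items and between the test information functions of the two forms, evaluated over a grid of proficiency values; the function names, the theta grid, and the example parameters are illustrative, not the authors' actual computation.

import numpy as np

D = 1.7  # assumed logistic scaling constant; the abstract does not state the convention

def item_information(theta, a, b):
    # 2PL item information I_i(theta) for discrimination a and difficulty b
    p = 1.0 / (1.0 + np.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * p * (1.0 - p)

def rmsdif(theta, a1, b1, a2, b2):
    # Root mean square deviation of item information between two paired items
    diff = item_information(theta, a1, b1) - item_information(theta, a2, b2)
    return np.sqrt(np.mean(diff ** 2))

def rmsdtif(theta, params_form_a, params_form_b):
    # Root mean square deviation of test information between two forms;
    # params_form_* is a list of (a, b) tuples, one per item
    tif_a = sum(item_information(theta, a, b) for a, b in params_form_a)
    tif_b = sum(item_information(theta, a, b) for a, b in params_form_b)
    return np.sqrt(np.mean((tif_a - tif_b) ** 2))

# Illustrative use with made-up parameters lying inside the ranges reported above
theta = np.linspace(-4, 4, 81)  # proficiency grid; the grid choice is an assumption
print(rmsdif(theta, 0.85, -1.20, 0.78, -1.05))
print(rmsdtif(theta,
              [(0.85, -1.20), (0.60, -0.40)],
              [(0.78, -1.05), (0.66, -0.55)]))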

Article Details

Section
Research Article
