Inter-rater Reliability of Alignment between Science Items and Indices
Keywords:
alignment, inter-rater reliability, Fleiss’ kappa statistic, intra-class correlation
Abstract
This research aimed to 1) examine the inter-rater reliability of judgments of alignment between science items and indices of science subjects in junior secondary school education, and 2) evaluate that alignment. The sample comprised 1,089 science test items used in junior secondary schools, selected through a multi-stage random sampling procedure. Twenty expert panelists rated the alignment. Inter-rater reliability was analyzed with Fleiss’ kappa statistic and the intra-class correlation (ICC), and mean scores of alignment between science items and indices were calculated. The findings reveal that 1) for the ratings of cognitive complexity, inter-rater reliability was good, as shown by Fleiss’ kappa (Kf = 0.510); 2) for the ratings of alignment between science test items and indices, inter-rater reliability was excellent, as shown by the intra-class correlation (ICC = 0.954, p < .001); and 3) 92.93 percent of items aligned with the specified indices, with mean scores of 3.20-4.00.
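For readers unfamiliar with the agreement coefficient named above, the following minimal Python sketch shows how Fleiss' kappa can be computed from a table of rating counts. The function name, the input layout (one row per item, one column per category), and the toy ratings are illustrative assumptions. This is not the authors' analysis code and it does not reproduce the reported value.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical layout)."""
    if not counts:
        raise ValueError("at least one rated item is required")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_categories = len(counts[0])

    # Observed agreement: proportion of agreeing rater pairs, averaged over items.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall category proportions.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in p_category)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy ratings for illustration only (4 items, 3 raters, 3 categories);
# these are NOT data from the study.
toy_counts = [
    [3, 0, 0],
    [0, 3, 0],
    [1, 2, 0],
    [0, 1, 2],
]
print(round(fleiss_kappa(toy_counts), 3))
```

In practice, an established statistics package is preferable to a hand-written routine, especially when confidence intervals or significance tests are also needed.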