Organizing an Item Bank Classified by Content and Item Difficulty Level Using Item Response Theory

Surachai Raksombat; Piyathip Pradujphom; Kanok Panthong

Published: Dec 30, 2020

Keywords: Item Bank, Difficulty Levels, Item Response Theory

Surachai Raksombat

Research and Statistics in Cognitive Science, College of Research Methodology and Cognitive Science, Burapha University

Piyathip Pradujphom

College of Research Methodology and Cognitive Science, Burapha University

Kanok Panthong

College of Research Methodology and Cognitive Science, Burapha University

Abstract

This study aimed to analyze the quality of grade 7 mathematics test items using the three-parameter model of item response theory, and to organize a grade 7 item bank classified by content and difficulty level. The research methodology comprised four steps: 1) analyzing item quality using the three-parameter item response theory model; 2) classifying items into difficulty levels according to the item difficulty parameter; 3) assessing construct validity and behavioral measures; and 4) organizing the item bank, classified by content and difficulty level.
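
For reference, the three-parameter logistic model is commonly written as below, where θ is the examinee's ability and a_i, b_i, and c_i are the discrimination, difficulty, and guessing parameters of item i. The abstract does not give the formula or say whether a scaling constant (often D = 1.7) was applied, so this form without the constant is an assumption rather than the authors' stated parameterization.

```latex
P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp\bigl(-a_i(\theta - b_i)\bigr)}
```

At θ = b_i the probability of a correct response is c_i + (1 - c_i)/2, halfway between the guessing floor and 1.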

The results showed that:

1. The items retained after analysis with the three-parameter item response theory model had item difficulty parameters ranging from -1.19 to 2.49 (Mean = 1.20, SD = 0.81), item discrimination parameters ranging from 0.51 to 2.42 (Mean = 1.15, SD = 0.40), and item guessing parameters ranging from 0.08 to 0.30 (Mean = 0.23, SD = 0.03).

2. The retained items were divided into three difficulty levels: easy, moderate, and difficult. At the easy level, the item difficulty parameters ranged from -1.19 to 0.50; at the moderate level, from 0.51 to 1.50; and at the difficult level, from 1.51 to 2.49.

3. The retained items were assessed for construct validity (Mean = 4.69, SD = 0.12) and behavioral measures (Mean = 4.76, SD = 0.10); overall, consistency was at the highest level.

4. The grade 7 mathematics item bank, classified by content and difficulty level, contained 345 retained items out of the 480 items used in the research. The items fell into three learning strands: number and algebra, with 238 items (32 easy, 105 moderate, and 101 difficult); measurement and geometry, with 80 items (20 easy, 32 moderate, and 28 difficult); and statistics and probability, with 27 items (6 easy, 10 moderate, and 11 difficult).
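
The classification and tallying described in results 1, 2, and 4 can be illustrated with a minimal Python sketch. It is not the authors' software: the function names, the treatment of the cut points (values up to 0.50 are easy, values up to 1.50 are moderate, assuming difficulty parameters rounded to two decimals), and the omission of a scaling constant are all assumptions made for illustration. Only the counts in `REPORTED_COUNTS` are taken from the abstract.

```python
import math

def p_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.
    Assumes the logistic form without a scaling constant D."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def difficulty_level(b):
    """Map a difficulty parameter to a level, using the ranges
    reported in the abstract (cut-point handling is an assumption)."""
    if b <= 0.50:
        return "easy"
    if b <= 1.50:
        return "moderate"
    return "difficult"

# Item counts per strand and level, as reported in the abstract.
REPORTED_COUNTS = {
    "number and algebra": {"easy": 32, "moderate": 105, "difficult": 101},
    "measurement and geometry": {"easy": 20, "moderate": 32, "difficult": 28},
    "statistics and probability": {"easy": 6, "moderate": 10, "difficult": 11},
}

def strand_totals(counts):
    """Total number of items in each strand."""
    return {strand: sum(levels.values()) for strand, levels in counts.items()}

def level_totals(counts):
    """Total number of items at each difficulty level across strands."""
    totals = {}
    for levels in counts.values():
        for level, n in levels.items():
            totals[level] = totals.get(level, 0) + n
    return totals

totals_by_strand = strand_totals(REPORTED_COUNTS)
totals_by_level = level_totals(REPORTED_COUNTS)
bank_size = sum(totals_by_strand.values())
```

With the reported counts, the strand totals are 238, 80, and 27, which sum to the 345 items stated in the abstract.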

Issue

Vol. 26 No. 2: July - December 2020

Section

Research Article

The content and information contained in the published article in the Journal of Educational Measurement Mahasarakham University represent the opinions and responsibilities of the authors directly. The editorial board of the journal is not necessarily in agreement with or responsible for any of the content.

The articles, data, content, images, etc. that have been published in the Journal of Educational Measurement Mahasarakham University are copyrighted by the journal. If any individual or organization wishes to reproduce or perform any actions involving the entirety or any part of the content, they must obtain written permission from the Journal of Educational Measurement Mahasarakham University.
