selected topics in psychometrics - cas · 2020-01-07 · selected topics in psychometrics l10:...
TRANSCRIPT
Selected topics in psychometrics
L10: Computerized Adaptive Testing
Patrıcia MartinkovaNMST570, January 7, 2020
Department of Statistical Modelling
Institute of Computer Science, Czech Academy of Sciences
Institute for Research and Development of Education
Faculty of Education, Charles University, Prague
Outline
1. Introduction
2. Computerized adaptive tests
3. Simulations and examples
4. Conclusion
Introduction Computerized adaptive tests Simulations and examples Conclusion
Motivation Types of tests
Motivation
• Student abilities may differ remarkably
• Some students may become bored by too easy items
• Some students may become stressed by too hard items
• Selection of proper items may save time
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 1/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
Motivation Types of tests
Different types of tests
• Linear tests
• most popular
• test takers take all items
• Linear Computer-based tests (CBT)
• linear test administered on computer
• allow for taking track of response time for each item
• Computerized adaptive tests (CAT)
• gets estimate after each item
• guides selection of subsequent items
• Multistage test (MST)
• compromise between Linear and CAT,
• similar to CAT, but by groups of items
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 2/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
CAT flow chart
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 3/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Computerized adaptive tests: components
1. Initialization
• administer initial item(s)
• estimate student ability
2. Testing cycle
• Select item and gather response
• Re-estimate student ability
• Check termination criteria
3. Output the final ability estimate / final classification
Further possibilities:
• Exposure rate control
• Content balancingPatrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 4/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Item Bank
• Based on test specifications
• Sufficient number of items in each content category
• Item quality reviewed by content experts
• Newly written items pretested
• Reviewed items calibrated with CTT or IRT (parameters a, b, c , d)
• Qualified items selected to item bank
• Item bank re-evaluated for size, specifications, content balance
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 5/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Item properties described by IRT model
• Selected IRT model
• Rasch, 1PL, 2PL, 3PL, 4PL, (G)PCM, GRM, NRM,...
• Functional form (here for 2PL IRT model)
P(Yij = 1|θi , aj , bj) =eaj (θi−bj )
1 + eaj (θi−bj )
aj discrimination, bj difficulty
• Item parameter estimates
• JML, CML, MML, Bayesian methods
• Item Characteristic Curve (ICC)
• Item Information Curve (IIC)
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 6/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Initialization
• Initial estimate of student ability
• Mean ability (θ0 = 0)
• Based on a single item
• Based on a pretest
• Item selection for initial estimate of ability / pretest
• Item(s) with mean difficulty (b close to 0)
• Item(s) with difficulty closest to ability estimated by pretest
• Random item(s)
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 7/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Ability Estimation
• Maximum likelihood (MLE)
• Bayes modal estimator, maximal a posteriori estimator (MAP)
• Expected a posteriori estimator (EAP)
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 8/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Item Selection
• Sequential (non-adaptive)
• Urry’s criterion: difficulty closest to current ability estimate
• Fisher Information
• maximum Fisher information (MFI)
• maximum likelihood weighted information criterion (MLWI)
• maximum posterior weighted criterion (MPWI)
• Kullback-Leibler information criteria
• Maximum expected information criterion (MEI)
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 9/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Other Issues
• Exposure rate control
• randomesque
• 5-4-3-2-1 method
• etc.
• Content balancing
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 10/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT flowchart Item Bank Starting Continuing Stopping
Stopping rule
• Test length reached
• Test time reached
• Precision criterion on ability estimate
• Classification criterion: threshold met specification for ability confidence
interval
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 11/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
Post-hoc analysis of real data CAT software
Post-hoc analysis of real data
• Having real data from linear test
• Checking how would the results be in CAT design
• Comparison of different settings
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 12/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
Post-hoc analysis of real data CAT software
Simulation study
• Simulating ability
• Setting item parameters (from real test or based on a model)
• Comparison of different settings
• Correlation with true ability, bias, length,...
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 13/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
Post-hoc analysis of real data CAT software
CAT Examples and software
• catIRT
• catR + Concerto
• mirtCAT
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 14/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT conclusions Course conclusions
Conclusion
• CAT is an up-do-date testing method!
• Good precision using much less items
• Increased security
• Faster score reporting
• Less stress and less boredom for students
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 15/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT conclusions Course conclusions
Selected topics in psychometrics
• Introduction
• Reliability and measurement error
• Validity
• Traditional item analysis
• Regression models for item description
• Item response theory models for binary data
• Item response theory models for ordinal and nominal items
• Differential item functioning (2 lessons)
• Computerized adaptive testing and other topics in psychometrics.
Final project assignment
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 16/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT conclusions Course conclusions
Course goals
1. Name the main topics studied in psychometrics
2. Describe and apply strategies to find proofs about validity
3. Describe and apply strategies to find proofs about reliability
4. Describe and apply traditional methods to describe item functioning
(difficulty, discrimination, distractor analysis)
5. Explain how logistic regression may be used to describe item properties,
apply regression models on real data and interpret results
6. Formulate IRT models for binary, polytomous and nominal items (Rasch,
1-4PL IRT, GRM, GPCM, RSM, NRM), apply them on real data and
interpret results
7. Explain concept of differential item functioning, describe and apply some
DIF detection methods (Mantel-Haenszel test, logistic regression,
IRT-based Lord’s test, Raju’s test etc.)
8. Explain process of computerized adaptive testing, prepare adaptive test
9. Describe process of assessment development and validationPatrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 17/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT conclusions Course conclusions
Final project
• Introduction (2 pts)
• Methods (2 pts)
• Results (12 pts)
• A consise descriptive analysis (2 pts)
• Proofs of test reliability (2 pts)
• Proofs of test validity (2 pts)
• Item analysis using traditional methods or regression models (2 pts)
• Item response theory model (2 pts)
• Differential item functioning (2 pts)
• Discussion (2 pts)
• References (2 pts)
• Supplemental Materials (4 pts)
• Bonus (own dataset (5 pts), CAT (5 pts), simulation (5pts)
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 18/19
Introduction Computerized adaptive tests Simulations and examples Conclusion
CAT conclusions Course conclusions
Seminar in Psychometrics NMST571
Seminar covers selected topics in psychometrics in more depth.
Tentative schedule:
• Tuesdays, 3:40-5:10pm
Credit requirements:
• Actively present (2 absences tolerated)
• Presentation during one of the sessions (approx. 30 min presentation plus
15 min discussion).
Patrıcia Martinkova NMST570, L10: Computerized Adaptive Testing 19/19
References
• Magis D, Yan D, von Davier AA. (2017). Computerized Adaptive and
Multistage Testing with R. Springer.
• Magis D, & Raichle G. (2012). Random Generation of Response Patterns under
Computerized Adaptive Testing with the R Package catR. Journal of Statistical
Software, 48, pp. 1-31. https://www.jstatsoft.org/article/view/v048i08
• Chalmers P. (2016). Generating Adaptive and Non-Adaptive Test Interfaces for
Multidimensional Item Response Theory Applications. Journal of Statistical
Software, 71, pp. 1-38. https://www.jstatsoft.org/article/view/v071i05
• Chalmers P. (2017-08-11). Wiki mirtCAT
https://github.com/philchalmers/mirtCAT/wiki