ABCs of IRT
November 18, 2010
Diane M. Talley, MA
Stephen B. Johnson, PhD
James A. Penny, PhD
Psychometrics as Science and Art
2010 ICE Educational Conference
• IRT and Classical
• Concepts of IRT
  • A logit
  • The abc's
• Benefits
  • Pre-equating
  • Immediate scoring
• Population invariance
• Assumptions
• Implications
The right tools for the job
• Data
• Program
• Tool
Classical versus IRT model
Classical versus IRT
Classical Model | IRT Model
Traditional | Modern
Requires less strict adherence to assumptions | Requires stricter adherence to assumptions
Sample dependent | Population invariant
Statistics (p – diff, p-biserial – disc) | Probability-based statistics (b-diff, a-disc, c-guessing)
Simple scoring model (raw score) | Scoring is more complex
What’s a logit?
[Figure: diagram relating ability, the performance standard, and probability]
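A logit is the natural log of the odds of a correct response: logit(p) = ln(p / (1 - p)). A probability of .50 corresponds to 0 logits, roughly .73 to +1 logit, and roughly .27 to -1 logit, which is what lets candidate ability, item difficulty, and the performance standard all sit on the same interval scale.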
b (difficulty)
[Figure "Paint by Numbers Leonardo": item characteristic curves, P(u = 1 | THETA) against THETA, for several items that differ in difficulty b]
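Under the one-parameter (Rasch) model that such a plot illustrates, the probability of a correct response is P(u = 1 | theta) = exp(theta - b) / (1 + exp(theta - b)). A candidate whose theta equals the item's difficulty b has a 50% chance of answering correctly, and the whole curve shifts to the right as b increases.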
a (discrimination) and b
[Figure "Paint by Numbers Leonardo": item characteristic curves, P(u = 1 | THETA) against THETA, for three items that differ in discrimination a as well as difficulty b]
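The two-parameter model adds the discrimination a, which controls the slope of the curve: P(u = 1 | theta) = 1 / (1 + exp(-a(theta - b))). Items with larger a separate candidates sharply near their difficulty b; items with small a produce flat curves that distinguish little.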
a, b, and c (guessing)
[Figure "Paint by Numbers Leonardo": item characteristic curves, P(u = 1 | THETA) against THETA, for three items that differ in a, b, and the lower asymptote c]
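Putting the three parameters together gives the three-parameter logistic (3PL) model, P(u = 1 | theta) = c + (1 - c) / (1 + exp(-a(theta - b))). The Python sketch below is only an illustration with made-up parameter values, not code from the presentation; setting c = 0 recovers the 2PL, and also fixing a = 1 recovers the Rasch form.

import math

def p_correct(theta, a=1.0, b=0.0, c=0.0):
    """3PL probability of a correct response, P(u = 1 | theta).
    a = discrimination, b = difficulty, c = pseudo-guessing (lower asymptote).
    c = 0 gives the 2PL; c = 0 and a = 1 gives the Rasch/1PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Illustrative item: moderate discrimination, average difficulty, four-option multiple choice
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(p_correct(theta, a=1.2, b=0.0, c=0.25), 2))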
Fit statistics
[Figures: "Comparison of Infit and Outfit" by item order; "Outfit Mean Square Plot" and "Infit Mean Square Plot", mean square (MSQ) against item order for about 30 items]
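Both fit statistics compare observed responses with the probabilities the model predicts. Outfit is the unweighted mean of squared standardized residuals across candidates, so it reacts to surprising responses far from an item's difficulty (lucky guesses, careless slips); infit weights each squared residual by its information, p(1 - p), and so emphasizes misfit close to the item's difficulty. Values near 1.0 indicate adequate fit. The sketch below is a generic Rasch-model illustration, not the analysis code behind the plots.

import numpy as np

def rasch_p(thetas, b):
    """Rasch probability of a correct response for each candidate."""
    return 1.0 / (1.0 + np.exp(-(np.asarray(thetas) - b)))

def infit_outfit(responses, thetas, b):
    """Infit and outfit mean squares for one item with difficulty b."""
    p = rasch_p(thetas, b)
    w = p * (1.0 - p)                             # information carried by each response
    z2 = (np.asarray(responses) - p) ** 2 / w     # squared standardized residuals
    return np.sum(w * z2) / np.sum(w), z2.mean()  # (infit, outfit)

# Made-up example: five candidates, item of difficulty 0.5
print(infit_outfit([1, 0, 1, 1, 0], [-1.0, -0.5, 0.0, 1.0, 2.0], 0.5))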
Population Invariance
Classical Difficulty Values
         High Performing | Low Performing
Item 3   .92             | .70
Item 2   .80             | .60
Item 1   .50             | .15

IRT Difficulty Values
         High Performing | Low Performing
Item 3   -.75            | -.75
Item 2   0.00            | 0.00
Item 1   1.50            | 1.50
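A tiny simulation with made-up numbers, offered only to illustrate the pattern in the tables above: the classical p-value moves with the ability of the group that happens to take the item, while the IRT difficulty used to generate the responses stays fixed on the theta scale.

import numpy as np

rng = np.random.default_rng(0)
b = 1.5                                          # illustrative item difficulty on the theta scale

def rasch_p(thetas):
    return 1.0 / (1.0 + np.exp(-(thetas - b)))

groups = {"high performing": rng.normal(1.0, 1.0, 5000),
          "low performing": rng.normal(-1.0, 1.0, 5000)}
for name, thetas in groups.items():
    responses = rng.random(thetas.size) < rasch_p(thetas)
    print(name, "classical p-value:", round(responses.mean(), 2), "| IRT b:", b)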
IRT Pre-Equating
• What does it mean?
• Why would you want to do it?
• What does it mean for building item banks and forms?
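One concrete reading, sketched below under the assumption of a Rasch-calibrated item bank (an illustration, not the presenters' procedure): because every bank item already carries a difficulty on a common scale, the raw-score-to-theta conversion table for a newly assembled form can be produced before anyone sits for the form, which is what makes immediate, on-demand scoring possible.

import numpy as np

def raw_to_theta_table(item_bs):
    """Pre-equated raw-score-to-theta table for a form built from calibrated Rasch items."""
    grid = np.linspace(-4, 4, 801)                                  # theta grid
    p = 1.0 / (1.0 + np.exp(-(grid[:, None] - np.asarray(item_bs)[None, :])))
    tcc = p.sum(axis=1)                                             # expected raw score at each theta
    return {raw: round(float(np.interp(raw, tcc, grid)), 2)
            for raw in range(1, len(item_bs))}                      # zero and perfect scores excluded

# Illustrative five-item form drawn from a calibrated bank
print(raw_to_theta_table([-1.0, -0.5, 0.0, 0.5, 1.0]))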
Test Information Function (TIF)
[Figure: "Comparison of Test Information Functions", information against theta for Form A and Form B]
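The height of each curve is the sum of the item information functions for that form. Under the Rasch model an item contributes p(1 - p) at a given theta, so information peaks where a form's difficulties are concentrated, and 1 / sqrt(information) is the conditional standard error of measurement. The sketch below uses invented item difficulties, not the forms actually plotted.

import numpy as np

def test_information(theta, item_bs):
    """Test information at theta for a Rasch-calibrated form."""
    p = 1.0 / (1.0 + np.exp(-(theta - np.asarray(item_bs))))
    return float(np.sum(p * (1.0 - p)))          # item information p(1 - p), summed over items

form_a = [-0.25, 0.0, 0.0, 0.25, 0.5]            # items clustered near the standard
form_b = [-2.0, -1.0, 0.0, 1.0, 2.0]             # items spread across the scale
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(test_information(theta, form_a), 2), round(test_information(theta, form_b), 2))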
Assumptions
• Unidimensionality
• Local Independence
Implications
• Item writing
  • Leave those scored items alone!
  • Focused item writing targeting the performance standard
• Assembly
  • Items selected for a form should be around the standard
• Testing and Reporting
  • Field test items for pre-equating/on-demand scoring
  • Form assignment
  • Scoring
  • Recalibration
• Harder to explain to stakeholders
Does IRT make sense for you?
• What is the size and maturity of your program and item bank?
  • Do you like to tinker with items?
• Do your program requirements change frequently?
• How experienced/capable are your item writers?
• How do you score candidates?
  • IRT or number correct?
  • Do you hold scores or do immediate scoring?
• Can you afford a psychometrician?
Questions?
Diane M. Talley [email protected]
James A. Penny [email protected]
Stephen B. Johnson [email protected]
919.572.6880
www.castleworldwide.com