Post on 21-Dec-2015
Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach
Mark D. Reckase
Tianli Li
Michigan State University
The Problem
State curriculum frameworks often change from one grade to the next, reflecting the addition of new instructional content.
For example, algebra may be introduced as an instructional goal at grade 7, while at grade 6 it is not an important component of the curriculum.
Tests at the two grades reflect the instructional content, so the 6th grade test does not include algebra and the 7th grade test does.
How can the score scales of these tests be linked?
Research Questions
What do changes on the linked score scale mean, when the scale is produced using the usual unidimensional IRT models?
Can multidimensional IRT be used to form vertical scales? If so, how do the results compare to the unidimensional results?
The Approach
State testing data were analyzed using multidimensional IRT to develop a realistic model for the test data at two grade levels.
The results of the real data analyses were idealized to create the specifications for simulating the tests at two grade levels.
Data with known structure were simulated to determine how the unidimensional and multidimensional procedures function.
The Simulated Data Design
Grade 6 – two major constructs
  Arithmetic
  Problem Solving
Grade 7 – three major constructs
  Arithmetic
  Problem Solving
  Algebra
Simulated Test Structure
Test Level   Algebra   Arithmetic   Problem Solving   Total
Grade 6      0         17 (4)       23 (6)            40 (10)
Grade 7      11 (0)    11 (4)       18 (6)            40 (10)
Note: The numbers in parentheses are the common items between the two forms of the tests.
Mean Vectors at each Grade Level
Class Level   Algebra        Arithmetic   Problem Solving
Grade 6       -1.5 (-1.50)   .5 (.51)     -.2 (-.21)
Grade 7       0 (.03)        .7 (.73)     0 (.01)
Note: Values in parentheses are the observed means from the simulated data
Covariance Matrices
Covariance Matrix for Grade 6
                  Algebra     Arithmetic   Problem Solving
Algebra           .25 (.25)   0 (.00)      0 (.00)
Arithmetic        0 (.00)     .8 (.84)     .7 (.76)
Problem Solving   0 (.00)     .7 (.76)     1.2 (1.29)
Covariance Matrix for Grade 7
                  Algebra     Arithmetic   Problem Solving
Algebra           1 (1.05)    .4 (.42)     .6 (.64)
Arithmetic        .4 (.42)    .6 (.60)     .3 (.32)
Problem Solving   .6 (.64)    .3 (.32)     1 (1.02)
Note: Values in parentheses are estimated from the simulated data.
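As a rough illustration of how ability vectors with this structure could be drawn, the sketch below samples from a multivariate normal distribution for each grade. It assumes the dimension order (algebra, arithmetic, problem solving), reads the arithmetic and problem solving means from the column order of the mean-vector slide, and uses 2000 students per grade as in the later slides. The function name `simulate_abilities` and the seeds are hypothetical, and the slides do not say which software generated the original data.

```python
import numpy as np

# Generating means, ordered (algebra, arithmetic, problem solving).
# The arithmetic / problem solving assignment follows the slide's column order (an assumption).
MEANS = {
    6: np.array([-1.5, 0.5, -0.2]),
    7: np.array([0.0, 0.7, 0.0]),
}

# Generating covariance matrices from the slides.
COVS = {
    6: np.array([[0.25, 0.0, 0.0],
                 [0.0, 0.8, 0.7],
                 [0.0, 0.7, 1.2]]),
    7: np.array([[1.0, 0.4, 0.6],
                 [0.4, 0.6, 0.3],
                 [0.6, 0.3, 1.0]]),
}


def simulate_abilities(grade, n_students=2000, seed=0):
    """Draw an (n_students x 3) matrix of ability vectors for one grade."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(MEANS[grade], COVS[grade], size=n_students)


if __name__ == "__main__":
    for grade in (6, 7):
        theta = simulate_abilities(grade, seed=grade)
        print("Grade", grade, "observed means:", theta.mean(axis=0).round(2))
        print(np.cov(theta, rowvar=False).round(2))
```

The observed means and covariances from a run like this should land near the generating values, in the same way the parenthesized values on the slides do.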
Orientation of Items
[Figure: three-dimensional plot of item orientations]
Effect Size Built into Data
Algebra   Arithmetic   Problem Solving
1.9       .26          .21
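The slides do not state how these effect sizes were computed. The values appear consistent with a standardized mean difference that divides by the root of the average of the two grade variances; that formula is an inference from the numbers, not something the source confirms. Under that assumption, and using the observed algebra means and variances, the arithmetic works out as follows:

```latex
d = \frac{\bar{\theta}_{7} - \bar{\theta}_{6}}{\sqrt{\left(s_{6}^{2} + s_{7}^{2}\right)/2}},
\qquad
d_{\text{algebra}} = \frac{0.03 - (-1.50)}{\sqrt{(0.25 + 1.05)/2}}
                   = \frac{1.53}{0.806} \approx 1.90
```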
Unidimensional Basis for Comparison
Imagine that the full set of 70 items from both test levels is administered to the students at both grade levels.
The matrix of 2000 + 2000 students from the two grades by 70 items can be analyzed with unidimensional models to serve as a basis for comparison for the vertical scaling result.
Analyze the matrix using the 2PL and Rasch models.
2PL Solution
[Figure: three-dimensional plot for the 2PL solution]
Rasch Model Solution
[Figure: three-dimensional plot for the Rasch model solution]
Vertical Scaling Analysis
Common-item concurrent calibration with BILOG-MG
Off-grade items coded as not reached
Both 2PL and Rasch models used for analysis
Determine effect size of difference in mean of two grade levels
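The data layout for a concurrent calibration can be sketched as a single students-by-items matrix in which each grade's off-grade items are marked as not reached. The column layout, the function name `stack_for_concurrent`, and the use of NaN as the not-reached marker are all assumptions made for illustration; the slides do not show the actual BILOG-MG input coding.

```python
import numpy as np

N_UNIQUE, N_COMMON = 30, 10          # 30 unique + 10 common = 40 items per test level
N_TOTAL = 2 * N_UNIQUE + N_COMMON    # 70 distinct items across the two levels
NOT_REACHED = np.nan                 # placeholder marker (assumption, not BILOG-MG's code)


def stack_for_concurrent(resp_g6, resp_g7):
    """Stack two (students x 40) response matrices into one (students x 70) matrix.

    Hypothetical layout of the result: columns 0-29 are grade 6 only,
    columns 30-39 are the common items, columns 40-69 are grade 7 only.
    Inputs are assumed to be ordered to match: grade 6 lists its unique
    items first, grade 7 lists the common items first.
    """
    n6, n7 = len(resp_g6), len(resp_g7)
    stacked = np.full((n6 + n7, N_TOTAL), NOT_REACHED)
    stacked[:n6, :N_UNIQUE + N_COMMON] = resp_g6   # grade 6 fills columns 0-39
    stacked[n6:, N_UNIQUE:] = resp_g7              # grade 7 fills columns 30-69
    return stacked


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g6 = rng.integers(0, 2, size=(2000, 40))   # random 0/1 responses, for shape only
    g7 = rng.integers(0, 2, size=(2000, 40))
    data = stack_for_concurrent(g6, g7)
    print(data.shape, int(np.isnan(data).sum()))
```

Only the ten common-item columns are fully observed for both grades, which is what ties the two scales together in this design.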
Vertically Scaled Effect Sizes
                    2PL Model    Rasch Model   2PL Model     Rasch Model
                    70 Items     70 Items      Concurrent    Concurrent
Mean (SD) Grade 6   -.54 (.78)   -.42 (.93)    -.22 (1.16)   -.14 (1.06)
Mean (SD) Grade 7   .56 (1.13)   .45 (1.15)    .26 (1.20)    .21 (1.38)
Effect Size         1.13         .83           .41           .28
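The effect sizes in this table can be checked against the reported means and standard deviations. The sketch below assumes the effect size is the grade 7 minus grade 6 mean difference divided by the square root of the average of the two variances; the slides do not state the formula, so `effect_size` is an illustrative helper rather than the authors' procedure.

```python
from math import sqrt


def effect_size(mean_lo, sd_lo, mean_hi, sd_hi):
    """Standardized mean difference with an equal-weight pooled SD (assumed formula)."""
    pooled_sd = sqrt((sd_lo ** 2 + sd_hi ** 2) / 2)
    return (mean_hi - mean_lo) / pooled_sd


# (grade 6 mean, SD), (grade 7 mean, SD), reported effect size, all taken from the table.
REPORTED = {
    "2PL, 70 items": ((-0.54, 0.78), (0.56, 1.13), 1.13),
    "Rasch, 70 items": ((-0.42, 0.93), (0.45, 1.15), 0.83),
    "2PL, concurrent": ((-0.22, 1.16), (0.26, 1.20), 0.41),
    "Rasch, concurrent": ((-0.14, 1.06), (0.21, 1.38), 0.28),
}

if __name__ == "__main__":
    for label, (g6, g7, reported) in REPORTED.items():
        d = effect_size(*g6, *g7)
        print(f"{label}: computed {d:.2f}, reported {reported:.2f}")
```

Under this assumed formula the computed values agree with the four reported effect sizes to within rounding.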
Vertically Scaled Effect Sizes
Linked effect size is smaller than full data effect size.
Rasch effect size is less than 2PL effect size.
Full data set effect size is less than modeled effect size.
Alternative Linking Method
Common-item, separate calibration
Common item parameter relationship was poor
[Figure: scatter plot of common-item b-parameters, Grade x (horizontal axis) against Grade x + 1 (vertical axis)]
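The slides do not name the linking transformation used with the separate calibrations. As one common choice, the mean/sigma method is sketched below, together with the correlation between the two sets of common-item b-parameters as a simple check on how well they agree. The function name `mean_sigma_link` and the b-values in the demo are hypothetical and are not the study's estimates.

```python
import numpy as np


def mean_sigma_link(b_lower, b_upper):
    """Return (A, B, r) for common-item b-parameters from two separate calibrations.

    A and B are the mean/sigma constants such that b_upper is approximated by
    A * b_lower + B; r is the Pearson correlation between the two sets.
    """
    b_lower = np.asarray(b_lower, dtype=float)
    b_upper = np.asarray(b_upper, dtype=float)
    A = b_upper.std(ddof=1) / b_lower.std(ddof=1)
    B = b_upper.mean() - A * b_lower.mean()
    r = np.corrcoef(b_lower, b_upper)[0, 1]
    return A, B, r


if __name__ == "__main__":
    # Hypothetical b-parameters for ten common items (illustration only).
    b_grade_x = [-1.2, -0.8, -0.5, -0.3, 0.0, 0.1, 0.3, 0.4, 0.6, 0.9]
    b_grade_x1 = [-0.4, -1.5, 0.2, -0.9, -0.2, -1.1, 0.5, -0.6, -0.1, 0.3]
    A, B, r = mean_sigma_link(b_grade_x, b_grade_x1)
    print(f"A = {A:.2f}, B = {B:.2f}, r = {r:.2f}")
```

A low correlation between the two sets of estimates is the kind of poor relationship that would make any linear linking of this sort hard to trust.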
MIRT Analysis
Full data analysis with TESTFACT
Three-dimensional analysis
Determine effect size for each dimension
Correlate each estimated θ with the generating θs to determine the meaning of the results.
MIRT Effect Sizes
                    θ1           θ2           θ3
Mean (SD) Total     .01 (.95)    -.01 (.90)   .05 (.72)
Mean (SD) Grade 6   -.57 (.54)   .16 (.99)    .03 (.74)
Mean (SD) Grade 7   .60 (.90)    -.19 (.77)   .06 (.69)
Effect Size         1.56         -.40         .05
Correlation between True and Estimated Values
          Est θ1   Est θ2   Est θ3
True θ1   .92      -.08     .02
True θ2   .47      .50      -.18
True θ3   .46      .80      -.03
Interpretation of MIRT Solution
Results are difficult to interpret because of the default procedures in TESTFACT.
Solution needs to be rotated to have axes align with content dimensions.
Current solution shows that θ1 is related to algebra and shows the big algebra effect.
θ2 is a combination of arithmetic and problem solving with the emphasis on problem solving. Most likely it has the sign of the a-parameters reversed.
Concurrent MIRT Analysis
Use concurrent calibration of data from the two grade levels.
Three-dimensional solution
No rotation
Determine effect sizes and correlations with true values.
Concurrent MIRT Calibration
                    θ1           θ2           θ3
Mean (SD) Total     .06 (.75)    -.09 (.57)   -.38 (1.01)
Mean (SD) Grade 6   -.02 (.87)   -.29 (.56)   .18 (.64)
Mean (SD) Grade 7   .14 (.59)    .10 (.50)    -.94 (.99)
Effect Size         .22          .74          -1.34
Concurrent MIRT Calibration
          Est θ1   Est θ2   Est θ3
True θ1   .16      .57      -.87
True θ2   .54      .02      -.40
True θ3   .77      -.05     -.43
Concurrent MIRT Calibration
Scale on Dimension 3 is reversed and it has a large effect size (algebra).
Dimension 1 is most related to arithmetic and problem solving with a moderate effect size.
Dimension 2 is moderately related to algebra and has a large effect size.
The overall result gives a reasonable estimate of effects, but the dimensions need to be rotated to match the constructs.
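The reading of the correlation table above can be made mechanical with a small helper that, for each estimated dimension, finds the generating dimension with the largest absolute correlation and flags a reversed scale when that correlation is negative. This is only a reading aid and not the rotation the slides call for. It assumes the true dimensions are ordered algebra, arithmetic, problem solving, as in the design tables; `describe_dimensions` is a hypothetical name.

```python
import numpy as np

# Rows: true θ1-θ3, assumed to be algebra, arithmetic, problem solving.
# Columns: estimated θ1-θ3 from the concurrent MIRT calibration.
CORR = np.array([[0.16, 0.57, -0.87],
                 [0.54, 0.02, -0.40],
                 [0.77, -0.05, -0.43]])
TRUE_LABELS = ["algebra", "arithmetic", "problem solving"]


def describe_dimensions(corr, labels):
    """For each estimated dimension, return (closest true construct, scale reversed?)."""
    summary = []
    for j in range(corr.shape[1]):
        i = int(np.argmax(np.abs(corr[:, j])))     # strongest true-dimension match
        summary.append((labels[i], bool(corr[i, j] < 0)))
    return summary


if __name__ == "__main__":
    for j, (label, is_reversed) in enumerate(describe_dimensions(CORR, TRUE_LABELS), start=1):
        note = " (scale reversed)" if is_reversed else ""
        print(f"Estimated θ{j}: closest to {label}{note}")
```

A single best match per dimension hides mixtures such as dimension 1, which also correlates .54 with arithmetic, so the full table is still needed when interpreting the solution.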
Conclusions
Unidimensional linking of the two test levels underestimates the effect size.
Rasch model gives a smaller effect size than the two parameter logistic model.
MIRT solution shows promise.
Need to determine how to rotate solution to match constructs.
TESTFACT has problems converging on estimates because of mismatch between assumptions and reality.