Borrowing the Strength of Unidimensional Scaling to Produce Multidimensional Educational Effectiveness Profiles
Presentation at the 12th Annual Maryland Assessment Conference
College Park, MD
October 18, 2012
Joseph A. Martineau and Ji Zeng
Michigan Department of Education
Background
- Prior research shows that using unidimensional measures of multidimensional achievement constructs can distort value-added
  - Martineau, J. A. (2006). Distorting Value Added: The Use of Longitudinal, Vertically Scaled Student Achievement Data for Value-Added Accountability. Journal of Educational and Behavioral Statistics, 31(1), 35-62.
  - Construct-irrelevant variance can become considerable in value-added measures when a construct is multidimensional but is modeled in value-added as unidimensional.
  - A common misunderstanding is that if the multiple constructs are highly correlated, value-added should not be distorted.
  - The correct understanding is that if value-added on the multiple constructs is highly correlated, value-added should not be distorted.
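The distinction between correlated constructs and correlated value-added can be illustrated with a toy simulation. This is only a sketch under assumed values (a 0.9 correlation between dimensions, independent school effects, school mean gain as a crude value-added estimate); none of these numbers or modeling choices come from the cited study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools, n_per_school = 200, 50
school = np.repeat(np.arange(n_schools), n_per_school)

# Pre-test ability on two dimensions, correlated at 0.9 (assumed value).
cov = np.array([[1.0, 0.9], [0.9, 1.0]])
pre = rng.multivariate_normal([0.0, 0.0], cov, size=school.size)

# School effects are drawn independently per dimension (assumed), so a school
# can be effective on one dimension without being effective on the other.
effect = rng.normal(0.0, 0.2, size=(n_schools, 2))
post = pre + effect[school] + rng.normal(0.0, 0.2, size=pre.shape)

# Crude value-added estimate: each school's mean gain on each dimension.
gain = post - pre
va = np.column_stack([
    np.bincount(school, weights=gain[:, d]) / n_per_school for d in range(2)
])

score_corr = np.corrcoef(post[:, 0], post[:, 1])[0, 1]
va_corr = np.corrcoef(va[:, 0], va[:, 1])[0, 1]
print(f"post-test score correlation: {score_corr:.2f}")
print(f"value-added correlation:     {va_corr:.2f}")
```

In this setup the post-test scores on the two dimensions are highly correlated while the school-level value-added estimates are close to uncorrelated, which is the situation in which a unidimensional value-added measure would be expected to mislead.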
Background
- Prior research shows that the choice of dimension/domain within a construct changes value-added significantly
  - Lockwood, J. R., et al. (2007). The Sensitivity of Value-Added Teacher Effect Estimates to Different Mathematics Achievement Measures. Journal of Educational Measurement, 44(1), 47-67.
  - Depending on choices made in value-added modeling, the correlation between teacher value-added on Procedures and Problem Solving ranged from 0.01 to 0.46.
  - These surprisingly low correlations in value-added indicate that, at least in this situation, one needs to be concerned about modeling value-added in both dimensions rather than unidimensionally.
  - This is the only work I am aware of to date that has inspected inter-construct value-added correlations.
Background
- Prior research shows that commonly used factor analytic techniques underestimate the number of dimensions in a multidimensional construct
  - Zeng, J. (2010). Development of a Hybrid Method for Dimensionality Identification Incorporating an Angle-Based Approach. Unpublished doctoral dissertation, University of Michigan.
  - Common dimensionality identification procedures make the unwarranted assumption that all shared variance among indicator variables arises because the indicator variables measure the same construct (shared variance can also arise because the indicator variables are influenced by a common exogenous variable).
  - Because of this unwarranted assumption, commonly used dimensionality identification techniques underestimate the number of dimensions in a data set.
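A small simulation can make the underestimation argument concrete. The sketch below assumes two distinct constructs that both depend strongly on one exogenous variable, and it uses the eigenvalue-greater-than-one rule as a stand-in for "commonly used" techniques. The loadings, the strength of the exogenous influence, and the choice of rule are illustrative assumptions, not the procedures studied in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# One exogenous variable influences two otherwise distinct constructs.
z = rng.normal(size=n)
w = 0.95  # assumed strength of the exogenous influence
construct_a = w * z + np.sqrt(1 - w**2) * rng.normal(size=n)
construct_b = w * z + np.sqrt(1 - w**2) * rng.normal(size=n)

def make_items(construct, k=5, loading=0.7):
    """Generate k standardized indicators of one construct (hypothetical loadings)."""
    noise = rng.normal(size=(construct.size, k))
    return loading * construct[:, None] + np.sqrt(1 - loading**2) * noise

# Ten indicators: five per construct, so the true number of dimensions is 2.
items = np.hstack([make_items(construct_a), make_items(construct_b)])

# Eigenvalues of the item correlation matrix, largest first.
eigenvalues = np.linalg.eigvalsh(np.corrcoef(items, rowvar=False))[::-1]
n_retained = int(np.sum(eigenvalues > 1.0))
print("eigenvalues:", np.round(eigenvalues[:3], 2))
print("dimensions retained by the eigenvalue > 1 rule:", n_retained)
```

Here the rule retains a single dimension even though the indicators were generated from two constructs, because the variance shared through the exogenous variable is attributed to one common factor.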
Background
- Scaling constructs as multidimensional is a difficult task
  - Multidimensional Item Response Theory (MIRT) is time-consuming and costly to run
  - Replicating MIRT analyses can be challenging (there are multiple subjective decision points along the way)
  - Identifying the number of dimensions in MIRT can be challenging
  - Once the number of dimensions is identified, identifying which items load on which dimensions in MIRT can also be challenging
  - The factor analysis techniques underlying MIRT are techniques for data reduction, not dimension identification
Background
- Short of resolving the considerable difficulties in analytically identifying dimensions within a construct (and replicating such analyses), can another approach be used?
- Propose using/trusting content experts’ identifications of dimensions within constructs (e.g., the divisions agreed upon by the writers of content standards) as the best currently available identification of dimensions, for example…
  - Within English language proficiency, producing reading, writing, listening, and speaking scales.
  - Within Mathematics, producing number & operations, algebra, geometry, measurement, and data analysis/statistics scales.
Background
- However, separately scaling each dimension can also be difficult and costly compared to running a traditional unidimensional IRT calibration
  - Confirmatory MIRT
  - Bi-factor IRT model
  - Separate unidimensional calibration and year-to-year equating of each dimension score
- Another option:
  - Unidimensionally calibrate the total score
  - Unidimensionally equate the total score from year to year
  - Use (fixed) item parameters from the unidimensional calibration to create the multiple dimension scores as specified by content experts
- Use of this method needs to be investigated
  - Practical necessity for Smarter Balanced Assessment Consortium
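The fixed-parameter option can be sketched as follows: item parameters are taken as given from a unidimensional calibration, and each domain score is estimated from only that domain's items. The item parameters, domain assignments, and the use of EAP scoring on a quadrature grid are all illustrative assumptions here; the presentation does not state which scoring method or software settings were used.

```python
import numpy as np

def eap_score(responses, a, b, c, n_quad=61):
    """EAP ability estimate under a 3-PL model with item parameters held fixed."""
    theta = np.linspace(-4.0, 4.0, n_quad)
    prior = np.exp(-0.5 * theta**2)  # standard normal prior, unnormalized
    # Probability of a correct response at each quadrature point (rows) per item (columns).
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta[:, None] - b)))
    likelihood = np.prod(np.where(responses == 1, p, 1.0 - p), axis=1)
    posterior = likelihood * prior
    return float(np.sum(theta * posterior) / np.sum(posterior))

# Hypothetical parameters from a single unidimensional calibration of six items.
a = np.array([1.0, 1.2, 0.8, 1.5, 1.1, 0.9])      # discrimination
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])    # difficulty
c = np.array([0.2, 0.2, 0.2, 0.2, 0.2, 0.2])      # lower asymptote
domain = np.array(["reading"] * 3 + ["writing"] * 3)  # expert-assigned domains

responses = np.array([1, 1, 1, 0, 0, 0])  # one student's scored responses

scores = {"total": eap_score(responses, a, b, c)}
for name in ("reading", "writing"):
    keep = domain == name
    # Same fixed parameters, but only the items belonging to this domain.
    scores[name] = eap_score(responses[keep], a[keep], b[keep], c[keep])
print(scores)
```

Because every domain score reuses the parameters from one calibration, only the total score needs to be calibrated and equated from year to year, which is the practical appeal described above.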
Purpose
- Investigate the feasibility and validity of relying on unidimensional total score calibration as a basis for creating multidimensional profile scores…
  - For reporting multidimensional student achievement scores
  - For reporting multidimensional value-added measures
- Investigate the impact of separate versus fixed calibration of multidimensional achievement scores in terms of impact on…
  - Student achievement scores
  - Value-added scores
- …as compared to the impact of other common decisions in scaling, outcome selection, and value-added modeling
Methods
- Decisions Modeled in the Analyses
  - Psychometric decisions
    - Choice of psychometric model
      - 1-PL vs. 3-PL
      - PCM vs. GPCM
    - Estimation of sub-scores
      - Separate calibration for each dimension vs. fixed calibration based on unidimensional parameters
  - Choice of outcome metric
    - Which sub-score is modeled
  - Value-added modeling decisions
    - Inclusion of demographics in models
    - Number of pre-test covariates (for covariate adjustment models)
Methods
- Outcomes
  - Correlations in student achievement metrics compared across each psychometric choice and outcome choice
  - Correlations in value-added modeling compared across each choice
  - Classification consistency in value-added compared across each choice for
    - Three-category classification decisions
      - Based on confidence intervals around point estimates, placing programs/schools into three categories: (1) above average, (2) statistically indistinguishable from the average, and (3) below average
    - Four-category classification decisions
      - Based on sorting programs’/schools’ point estimates into quartiles, representing arbitrary cut points for classification
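The two classification rules and the consistency measure can be sketched in a few lines. The estimates, standard errors, the 95% interval, and the assumption that the average value-added is zero are all hypothetical; the presentation does not give these details.

```python
import numpy as np

def three_category(est, se, z=1.96):
    """+1 above average, 0 indistinguishable from average, -1 below (average assumed 0)."""
    est, se = np.asarray(est, float), np.asarray(se, float)
    lower, upper = est - z * se, est + z * se
    return np.where(lower > 0, 1, np.where(upper < 0, -1, 0))

def four_category(est):
    """Quartile membership (0 to 3) based on point estimates only."""
    est = np.asarray(est, float)
    cuts = np.quantile(est, [0.25, 0.50, 0.75])
    return np.searchsorted(cuts, est, side="right")

def consistency(a, b):
    """Proportion of units placed in the same category under two modeling choices."""
    return float(np.mean(np.asarray(a) == np.asarray(b)))

# Hypothetical value-added estimates for eight programs under two modeling choices.
est_a = [-0.50, -0.20, -0.05, 0.02, 0.10, 0.30, 0.45, 0.60]
est_b = [-0.48, -0.22, -0.04, 0.11, 0.04, 0.28, 0.47, 0.58]
se = [0.10] * 8

three = consistency(three_category(est_a, se), three_category(est_b, se))
four = consistency(four_category(est_a), four_category(est_b))
print(three, four)
```

In this made-up example two programs near the middle of the distribution trade places, which changes their quartiles but leaves both statistically indistinguishable from the average, so the four-category consistency drops while the three-category consistency does not.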
Methods
- Data
  - Michigan English Language Proficiency Assessment (ELPA)
  - Level III (Grades 3-5)
  - 3,391 students, each with 3 measurement occasions (10,173 total scores)
  - Measures
    - Total
    - Reading (domain)
    - Writing (domain)
    - Listening (domain)
    - Speaking (domain)
- Calibrated the ELPA as a unidimensional measure using both 1-PL/Partial Credit Model and 3-PL/Generalized Partial Credit Model
- Created domain scores both from fixed parameters from unidimensional calibration and in separate calibrations for each domain
Methods
- Data
  - Michigan Educational Assessment Program (MEAP) Mathematics
  - Grades 7 and 8 (not on a vertical scale)
  - Over 110,000 students per grade
  - Measures
    - Total (using items from the two domains)
    - Number & Operations (domain)
    - Algebra (domain)
- Calibrated the MEAP Math tests as unidimensional measures using both 1-PL and 3-PL models
- Created domain scores both from fixed parameters from unidimensional calibration and in separate calibrations for each domain
Methods
- Value-added modeling the ELPA
  - 3-level HLM nesting test occasion within student within English language learner program to obtain program value-added
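For readers less familiar with the notation, one common way to write a 3-level growth model of this kind is shown below. This is an assumed, generic specification for illustration only: the presentation does not state the time coding, which effects were random, or whether program value-added was read from the intercept or the growth residual (here it is taken to be $u_{10j}$).

```latex
% Level 1: occasion t within student i within program j
Y_{tij} = \pi_{0ij} + \pi_{1ij}\,\mathrm{time}_{tij} + e_{tij}
% Level 2: students within programs
\pi_{0ij} = \beta_{00j} + r_{0ij}, \qquad \pi_{1ij} = \beta_{10j} + r_{1ij}
% Level 3: programs
\beta_{00j} = \gamma_{000} + u_{00j}, \qquad \beta_{10j} = \gamma_{100} + u_{10j}
```

Under this assumed form, a program with $u_{10j} > 0$ shows faster than average growth in the modeled domain score.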
Methods
- Value-added modeling the ELPA
  - VAMs were run in a fully-crossed design with…
    - All outcomes (R, W, L, S)
    - PCM- and GPCM-calibrated outcomes
    - Fixed and separately calibrated outcomes
    - With and without demographics in the VAMs
  - 32 real-data applications across design factors
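The count of 32 follows from crossing the four design factors (4 × 2 × 2 × 2). A minimal sketch of enumerating such a design is given below; the dictionary keys are hypothetical labels chosen for this example.

```python
from itertools import product

# Design factors for the ELPA value-added models (labels are illustrative).
elpa_factors = {
    "outcome": ["Reading", "Writing", "Listening", "Speaking"],
    "model": ["PCM", "GPCM"],
    "calibration": ["fixed", "separate"],
    "demographics": [False, True],
}

# Every combination of factor levels is one value-added model to run.
elpa_design = [
    dict(zip(elpa_factors, combo)) for combo in product(*elpa_factors.values())
]
print(len(elpa_design))
print(elpa_design[0])
```

The MEAP design described later has five two-level factors (2 × 2 × 2 × 2 × 2), which also yields 32 applications.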
Methods
- Value-added modeling MEAP mathematics
  - 2-level HLM covarying grade-8 outcomes on grade-7 outcomes with students nested within schools
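A generic covariate adjustment model of this type can be written as below. It is an assumed form for illustration: the presentation does not give the exact equations, and the demographic vector $\mathbf{x}_{ij}$ and the treatment of slopes as fixed are assumptions.

```latex
% Level 1: student i within school j
Y^{(8)}_{ij} = \beta_{0j} + \beta_{1}\,Y^{(7)}_{ij} + \boldsymbol{\beta}_{2}^{\top}\mathbf{x}_{ij} + r_{ij}
% Level 2: schools
\beta_{0j} = \gamma_{00} + u_{0j}
```

In this sketch the school value-added is the residual $u_{0j}$. The term $\boldsymbol{\beta}_{2}^{\top}\mathbf{x}_{ij}$ is dropped for models without demographics, and a second grade-7 domain score would enter as an additional covariate in the two pre-test version.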
Methods
- Value-added modeling MEAP mathematics
  - VAMs were run in a fully-crossed design with…
    - Both outcomes (algebra and number & operations)
    - 1-PL and 3-PL calibrated outcomes
    - Fixed and separately calibrated outcomes
    - With and without demographics
    - With either one or two pre-test covariates
  - 32 real-data applications across design factors
Results
ELPA
Results: ELPA Student-Level Outcomes
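Correlations across fixed vs. separate calibrations

| Model choice | Content Area | Correlation |
|---|---|---|
| PCM | Reading | 0.997 |
| PCM | Writing | 0.995 |
| PCM | Listening | 0.997 |
| PCM | Speaking | 1.000 |
| GPCM | Reading | 0.997 |
| GPCM | Writing | 0.997 |
| GPCM | Listening | 0.994 |
| GPCM | Speaking | 1.000 |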
Results: ELPA Student-Level Outcomes
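Correlations across model choice (PCM vs. GPCM)

| Calibration choice | Content Area | Correlation |
|---|---|---|
| Fixed | Reading | 0.972 |
| Fixed | Writing | 0.983 |
| Fixed | Listening | 0.967 |
| Fixed | Speaking | 0.982 |
| Separate | Reading | 0.978 |
| Separate | Writing | 0.983 |
| Separate | Listening | 0.977 |
| Separate | Speaking | 0.982 |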
Results: ELPA Student-Level Outcomes
Correlations across content areas

| Model choice | Calibration choice | Reading-Writing | Reading-Listening | Reading-Speaking | Writing-Listening | Writing-Speaking | Listening-Speaking |
|---|---|---|---|---|---|---|---|
| PCM | Fixed | 0.636 | 0.627 | 0.371 | 0.537 | 0.385 | 0.368 |
| PCM | Separate | 0.622 | 0.626 | 0.373 | 0.519 | 0.375 | 0.365 |
| GPCM | Fixed | 0.655 | 0.662 | 0.402 | 0.559 | 0.407 | 0.405 |
| GPCM | Separate | 0.639 | 0.648 | 0.395 | 0.543 | 0.400 | 0.394 |

- Low to moderate inter-dimension correlations
- However, Rasch dimensionality analysis from WINSTEPS identified the total score as a unidimensional score
Results: ELPA Program District-Level Value-Added Outcomes
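Impact of fixed versus separate calibration

| Metric | Content Area | No Demos, PCM | No Demos, GPCM | Demos, PCM | Demos, GPCM |
|---|---|---|---|---|---|
| Correlations | Reading | 1.000 | 0.987 | 1.000 | 0.992 |
| Correlations | Writing | 1.000 | 0.997 | 1.000 | 0.997 |
| Correlations | Listening | 1.000 | 0.987 | 1.000 | 0.987 |
| Correlations | Speaking | 1.000 | 1.000 | 1.000 | 1.000 |
| 3-Category Consistency | Reading | 0.996 | 0.996 | 1.000 | 0.991 |
| 3-Category Consistency | Writing | 1.000 | 0.996 | 1.000 | 0.991 |
| 3-Category Consistency | Listening | 1.000 | 1.000 | 1.000 | 0.996 |
| 3-Category Consistency | Speaking | 1.000 | 1.000 | 1.000 | 1.000 |
| 4-Category Consistency | Reading | 0.982 | 0.875 | 0.982 | 0.902 |
| 4-Category Consistency | Writing | 0.973 | 0.946 | 0.982 | 0.946 |
| 4-Category Consistency | Listening | 0.991 | 0.897 | 0.991 | 0.906 |
| 4-Category Consistency | Speaking | 1.000 | 1.000 | 1.000 | 1.000 |

| Metric | Min | Max | Mean | SD |
|---|---|---|---|---|
| Correlations | 0.987 | 1.000 | 0.997 | 0.005 |
| 3-Category Consistency | 0.991 | 1.000 | 0.998 | 0.003 |
| 4-Category Consistency | 0.875 | 1.000 | 0.961 | 0.043 |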
Results: ELPA Program District-Level Value-Added Outcomes
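Correlations between Listening and Reading VA

Min = 0.228, Max = 0.397, Mean = 0.322, SD = 0.037

Rows are Listening value-added; columns are Reading value-added.

| Listening \ Reading | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.371 | 0.371 | 0.301 | 0.327 | 0.303 | 0.302 | 0.228 | 0.245 |
| No Demos PCM Separate | 0.372 | 0.371 | 0.303 | 0.328 | 0.304 | 0.303 | 0.230 | 0.247 |
| No Demos GPCM Fixed | 0.360 | 0.361 | 0.387 | 0.392 | 0.301 | 0.302 | 0.316 | 0.321 |
| No Demos GPCM Separate | 0.376 | 0.377 | 0.389 | 0.397 | 0.327 | 0.328 | 0.320 | 0.329 |
| Demos PCM Fixed | 0.330 | 0.330 | 0.292 | 0.308 | 0.318 | 0.317 | 0.261 | 0.275 |
| Demos PCM Separate | 0.329 | 0.330 | 0.294 | 0.309 | 0.318 | 0.318 | 0.263 | 0.277 |
| Demos GPCM Fixed | 0.304 | 0.305 | 0.341 | 0.342 | 0.307 | 0.309 | 0.329 | 0.332 |
| Demos GPCM Separate | 0.328 | 0.329 | 0.346 | 0.350 | 0.333 | 0.335 | 0.332 | 0.339 |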
Results: ELPA Program District-Level Value-Added Outcomes
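Correlations between Listening and Writing VA

Min = 0.342, Max = 0.420, Mean = 0.373, SD = 0.019

Rows are Listening value-added; columns are Writing value-added.

| Listening \ Writing | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.358 | 0.359 | 0.369 | 0.366 | 0.342 | 0.343 | 0.353 | 0.353 |
| No Demos PCM Separate | 0.359 | 0.360 | 0.370 | 0.367 | 0.343 | 0.344 | 0.354 | 0.354 |
| No Demos GPCM Fixed | 0.403 | 0.403 | 0.420 | 0.412 | 0.385 | 0.385 | 0.401 | 0.396 |
| No Demos GPCM Separate | 0.368 | 0.368 | 0.383 | 0.376 | 0.354 | 0.355 | 0.370 | 0.364 |
| Demos PCM Fixed | 0.362 | 0.362 | 0.373 | 0.371 | 0.361 | 0.362 | 0.372 | 0.371 |
| Demos PCM Separate | 0.363 | 0.364 | 0.374 | 0.372 | 0.362 | 0.363 | 0.374 | 0.372 |
| Demos GPCM Fixed | 0.395 | 0.395 | 0.410 | 0.405 | 0.397 | 0.397 | 0.412 | 0.407 |
| Demos GPCM Separate | 0.364 | 0.364 | 0.378 | 0.373 | 0.365 | 0.365 | 0.379 | 0.374 |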
Results: ELPA Program District-Level Value-Added Outcomes
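Correlations between Listening and Speaking VA

Min = -0.005, Max = 0.108, Mean = 0.046, SD = 0.035

Rows are Listening value-added; columns are Speaking value-added.

| Listening \ Speaking | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.002 | 0.002 | 0.026 | 0.026 | 0.005 | 0.005 | 0.032 | 0.032 |
| No Demos PCM Separate | 0.004 | 0.004 | 0.028 | 0.028 | 0.007 | 0.007 | 0.033 | 0.033 |
| No Demos GPCM Fixed | 0.068 | 0.068 | 0.102 | 0.102 | 0.081 | 0.081 | 0.108 | 0.108 |
| No Demos GPCM Separate | 0.051 | 0.051 | 0.080 | 0.080 | 0.061 | 0.061 | 0.086 | 0.086 |
| Demos PCM Fixed | -0.005 | -0.005 | 0.025 | 0.025 | 0.001 | 0.001 | 0.028 | 0.028 |
| Demos PCM Separate | -0.004 | -0.004 | 0.027 | 0.027 | 0.002 | 0.002 | 0.029 | 0.029 |
| Demos GPCM Fixed | 0.065 | 0.065 | 0.097 | 0.097 | 0.075 | 0.075 | 0.101 | 0.101 |
| Demos GPCM Separate | 0.047 | 0.047 | 0.076 | 0.076 | 0.056 | 0.056 | 0.080 | 0.080 |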
Results: ELPA Program District-Level Value-Added Outcomes
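Correlations between Reading and Writing VA

Min = 0.335, Max = 0.491, Mean = 0.412, SD = 0.047

Rows are Reading value-added; columns are Writing value-added.

| Reading \ Writing | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.389 | 0.390 | 0.393 | 0.386 | 0.335 | 0.336 | 0.341 | 0.338 |
| No Demos PCM Separate | 0.392 | 0.393 | 0.396 | 0.389 | 0.338 | 0.339 | 0.344 | 0.341 |
| No Demos GPCM Fixed | 0.466 | 0.464 | 0.480 | 0.466 | 0.442 | 0.440 | 0.455 | 0.443 |
| No Demos GPCM Separate | 0.455 | 0.454 | 0.468 | 0.455 | 0.420 | 0.419 | 0.432 | 0.422 |
| Demos PCM Fixed | 0.365 | 0.365 | 0.370 | 0.365 | 0.374 | 0.374 | 0.379 | 0.372 |
| Demos PCM Separate | 0.369 | 0.369 | 0.374 | 0.369 | 0.379 | 0.379 | 0.384 | 0.377 |
| Demos GPCM Fixed | 0.453 | 0.450 | 0.465 | 0.454 | 0.478 | 0.476 | 0.491 | 0.477 |
| Demos GPCM Separate | 0.440 | 0.438 | 0.452 | 0.442 | 0.464 | 0.462 | 0.476 | 0.461 |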
Results: ELPA Program District-Level Value-Added Outcomes
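Correlations between Reading and Speaking VA

Min = 0.121, Max = 0.205, Mean = 0.151, SD = 0.026

Rows are Reading value-added; columns are Speaking value-added.

| Reading \ Speaking | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.121 | 0.121 | 0.132 | 0.132 | 0.131 | 0.131 | 0.136 | 0.136 |
| No Demos PCM Separate | 0.122 | 0.122 | 0.134 | 0.134 | 0.132 | 0.132 | 0.138 | 0.138 |
| No Demos GPCM Fixed | 0.129 | 0.129 | 0.174 | 0.174 | 0.152 | 0.152 | 0.179 | 0.179 |
| No Demos GPCM Separate | 0.134 | 0.134 | 0.172 | 0.172 | 0.154 | 0.154 | 0.177 | 0.177 |
| Demos PCM Fixed | 0.122 | 0.122 | 0.136 | 0.136 | 0.125 | 0.125 | 0.134 | 0.134 |
| Demos PCM Separate | 0.125 | 0.125 | 0.139 | 0.139 | 0.128 | 0.128 | 0.138 | 0.138 |
| Demos GPCM Fixed | 0.163 | 0.163 | 0.205 | 0.205 | 0.171 | 0.171 | 0.203 | 0.203 |
| Demos GPCM Separate | 0.162 | 0.162 | 0.199 | 0.199 | 0.168 | 0.168 | 0.197 | 0.197 |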
Results: ELPA Program District-Level Value-Added Outcomes
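Correlations between Speaking and Writing VA

Min = 0.150, Max = 0.246, Mean = 0.199, SD = 0.029

Rows are Speaking value-added; columns are Writing value-added.

| Speaking \ Writing | No Demos PCM Fixed | No Demos PCM Separate | No Demos GPCM Fixed | No Demos GPCM Separate | Demos PCM Fixed | Demos PCM Separate | Demos GPCM Fixed | Demos GPCM Separate |
|---|---|---|---|---|---|---|---|---|
| No Demos PCM Fixed | 0.151 | 0.150 | 0.169 | 0.180 | 0.158 | 0.157 | 0.180 | 0.189 |
| No Demos PCM Separate | 0.151 | 0.150 | 0.169 | 0.180 | 0.158 | 0.157 | 0.180 | 0.189 |
| No Demos GPCM Fixed | 0.207 | 0.205 | 0.225 | 0.236 | 0.209 | 0.208 | 0.231 | 0.240 |
| No Demos GPCM Separate | 0.207 | 0.205 | 0.225 | 0.236 | 0.209 | 0.208 | 0.231 | 0.240 |
| Demos PCM Fixed | 0.173 | 0.172 | 0.192 | 0.202 | 0.167 | 0.165 | 0.189 | 0.197 |
| Demos PCM Separate | 0.173 | 0.172 | 0.192 | 0.202 | 0.167 | 0.165 | 0.189 | 0.197 |
| Demos GPCM Fixed | 0.216 | 0.215 | 0.235 | 0.246 | 0.212 | 0.210 | 0.233 | 0.243 |
| Demos GPCM Separate | 0.216 | 0.215 | 0.235 | 0.246 | 0.212 | 0.210 | 0.233 | 0.243 |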
Results: ELPA Program District-Level Value-Added Outcomes
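Impact of choice of psychometric model

| Metric | Content Area | No Demos, Fixed | No Demos, Sep | Demos, Fixed | Demos, Sep |
|---|---|---|---|---|---|
| Correlations | Reading | 0.837 | 0.900 | 0.834 | 0.887 |
| Correlations | Writing | 0.988 | 0.987 | 0.988 | 0.986 |
| Correlations | Listening | 0.929 | 0.945 | 0.942 | 0.955 |
| Correlations | Speaking | 0.975 | 0.975 | 0.980 | 0.980 |
| 3-Category Consistency | Reading | 0.973 | 0.982 | 0.978 | 0.987 |
| 3-Category Consistency | Writing | 0.996 | 0.991 | 0.996 | 0.996 |
| 3-Category Consistency | Listening | 0.987 | 0.987 | 0.982 | 0.987 |
| 3-Category Consistency | Speaking | 0.964 | 0.964 | 0.969 | 0.969 |
| 4-Category Consistency | Reading | 0.567 | 0.634 | 0.580 | 0.634 |
| 4-Category Consistency | Writing | 0.902 | 0.866 | 0.920 | 0.893 |
| 4-Category Consistency | Listening | 0.728 | 0.728 | 0.768 | 0.754 |
| 4-Category Consistency | Speaking | 0.795 | 0.795 | 0.839 | 0.839 |

| Metric | Min | Max | Mean | SD |
|---|---|---|---|---|
| Correlations | 0.834 | 0.988 | 0.943 | 0.052 |
| 3-Category Consistency | 0.964 | 0.996 | 0.982 | 0.011 |
| 4-Category Consistency | 0.567 | 0.920 | 0.765 | 0.113 |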
Results: ELPA Program District-Level Value-Added Outcomes
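Impact of Including/Not Including Demographics

| Metric | Content Area | PCM, Fixed | PCM, Sep | GPCM, Fixed | GPCM, Sep |
|---|---|---|---|---|---|
| Correlations | Reading | 0.915 | 0.915 | 0.931 | 0.922 |
| Correlations | Writing | 0.978 | 0.978 | 0.979 | 0.982 |
| Correlations | Listening | 0.982 | 0.982 | 0.980 | 0.981 |
| Correlations | Speaking | 0.993 | 0.993 | 0.997 | 0.997 |
| 3-Category Consistency | Reading | 0.991 | 0.987 | 0.987 | 0.982 |
| 3-Category Consistency | Writing | 0.987 | 0.987 | 0.987 | 0.973 |
| 3-Category Consistency | Listening | 0.991 | 0.991 | 0.987 | 0.982 |
| 3-Category Consistency | Speaking | 0.991 | 0.991 | 0.996 | 0.996 |
| 4-Category Consistency | Reading | 0.808 | 0.817 | 0.750 | 0.741 |
| 4-Category Consistency | Writing | 0.830 | 0.821 | 0.848 | 0.839 |
| 4-Category Consistency | Listening | 0.924 | 0.911 | 0.911 | 0.915 |
| 4-Category Consistency | Speaking | 0.902 | 0.902 | 0.911 | 0.911 |

| Metric | Min | Max | Mean | SD |
|---|---|---|---|---|
| Correlations | 0.915 | 0.997 | 0.969 | 0.030 |
| 3-Category Consistency | 0.973 | 0.996 | 0.988 | 0.006 |
| 4-Category Consistency | 0.741 | 0.924 | 0.859 | 0.060 |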
Results
MEAP Mathematics
Results: MEAP Math Student-Level Outcomes
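Correlations among variables based on psychometric decisions

Grade 7 above diagonal/Grade 8 below

| | Algebra 3-PL Fixed | Algebra 3-PL Sep | Algebra 1-PL Fixed | Algebra 1-PL Sep | Number & Operations 3-PL Fixed | Number & Operations 3-PL Sep | Number & Operations 1-PL Fixed | Number & Operations 1-PL Sep |
|---|---|---|---|---|---|---|---|---|
| Algebra 3-PL Fixed | - | 1.000 | 0.943 | 0.941 | 0.775 | 0.775 | 0.775 | 0.743 |
| Algebra 3-PL Sep | 1.000 | - | 0.943 | 0.941 | 0.775 | 0.775 | 0.775 | 0.742 |
| Algebra 1-PL Fixed | 0.900 | 0.901 | - | 0.996 | 0.748 | 0.748 | 0.748 | 0.751 |
| Algebra 1-PL Sep | 0.891 | 0.893 | 0.984 | - | 0.746 | 0.745 | 0.746 | 0.748 |
| Number & Operations 3-PL Fixed | 0.684 | 0.685 | 0.677 | 0.666 | - | 1.000 | 1.000 | 0.941 |
| Number & Operations 3-PL Sep | 0.684 | 0.685 | 0.676 | 0.665 | 1.000 | - | 1.000 | 0.941 |
| Number & Operations 1-PL Fixed | 0.670 | 0.671 | 0.691 | 0.682 | 0.936 | 0.935 | - | 0.941 |
| Number & Operations 1-PL Sep | 0.667 | 0.668 | 0.688 | 0.679 | 0.935 | 0.934 | 0.998 | - |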
Results: MEAP Math Student-Level Outcomes
Very high correlations based on fixed versus separate calibrations
Results: MEAP Math Student-Level Outcomes
Not as high correlations based on 1-PL versus 3-PL calibrations
Results: MEAP Math Student-Level Outcomes
Moderate to high correlations across dimensions
Results: MEAP Mathematics School-Level Value-Added Outcomes
Impact of fixed versus separate calibration

| Metric | Content Area | 1 pre-test, No Demos, 1-PL | 1 pre-test, No Demos, 3-PL | 1 pre-test, Demos, 1-PL | 1 pre-test, Demos, 3-PL | 2 pre-tests, No Demos, 1-PL | 2 pre-tests, No Demos, 3-PL | 2 pre-tests, Demos, 1-PL | 2 pre-tests, Demos, 3-PL |
|---|---|---|---|---|---|---|---|---|---|
| Correlations | Algebra | 1.000 | 0.995 | 1.000 | 0.992 | 1.000 | 0.985 | 1.000 | 0.985 |
| Correlations | Number & Operations | 1.000 | 0.977 | 1.000 | 0.956 | 1.000 | 0.988 | 1.000 | 0.983 |
| 3-Category Consistency | Algebra | 0.989 | 0.968 | 0.987 | 0.973 | 0.987 | 0.935 | 0.989 | 0.960 |
| 3-Category Consistency | Number & Operations | 0.989 | 0.923 | 0.994 | 0.935 | 0.990 | 0.946 | 0.989 | 0.966 |
| 4-Category Consistency | Algebra | 0.995 | 0.926 | 0.993 | 0.883 | 0.992 | 0.856 | 0.986 | 0.848 |
| 4-Category Consistency | Number & Operations | 0.989 | 0.827 | 0.984 | 0.712 | 0.993 | 0.875 | 0.983 | 0.817 |
Results: MEAP Mathematics School-Level Value-Added Outcomes
Impact of choice of outcome (Algebra vs. Number)

| Metric | Multidimensional Calibration Type | 1 pre-test, No Demos, 1-PL | 1 pre-test, No Demos, 3-PL | 1 pre-test, Demos, 1-PL | 1 pre-test, Demos, 3-PL | 2 pre-tests, No Demos, 1-PL | 2 pre-tests, No Demos, 3-PL | 2 pre-tests, Demos, 1-PL | 2 pre-tests, Demos, 3-PL |
|---|---|---|---|---|---|---|---|---|---|
| Correlations | Fixed Parameter | 0.548 | 0.608 | 0.361 | 0.391 | 0.652 | 0.697 | 0.574 | 0.609 |
| Correlations | Separate | 0.549 | 0.649 | 0.366 | 0.436 | 0.653 | 0.711 | 0.576 | 0.614 |
| 3-Category Consistency | Fixed Parameter | 0.637 | 0.667 | 0.649 | 0.703 | 0.703 | 0.751 | 0.716 | 0.774 |
| 3-Category Consistency | Separate | 0.637 | 0.691 | 0.650 | 0.726 | 0.705 | 0.749 | 0.713 | 0.784 |
| 4-Category Consistency | Fixed Parameter | 0.399 | 0.424 | 0.322 | 0.337 | 0.447 | 0.475 | 0.404 | 0.412 |
| 4-Category Consistency | Separate | 0.397 | 0.429 | 0.322 | 0.350 | 0.444 | 0.484 | 0.405 | 0.436 |
Results: MEAP Mathematics School-Level Value-Added Outcomes
Impact of choice of psychometric model

| Metric | Multidimensional Calibration Type | 1 pre-test, No Demos, Alg | 1 pre-test, No Demos, Num | 1 pre-test, Demos, Alg | 1 pre-test, Demos, Num | 2 pre-tests, No Demos, Alg | 2 pre-tests, No Demos, Num | 2 pre-tests, Demos, Alg | 2 pre-tests, Demos, Num |
|---|---|---|---|---|---|---|---|---|---|
| Correlations | Fixed Parameter | 0.939 | 0.963 | 0.883 | 0.934 | 0.918 | 0.961 | 0.925 | 0.962 |
| Correlations | Separate | 0.938 | 0.962 | 0.876 | 0.937 | 0.925 | 0.962 | 0.873 | 0.938 |
| 3-Category Consistency | Fixed Parameter | 0.890 | 0.901 | 0.851 | 0.912 | 0.867 | 0.921 | 0.837 | 0.915 |
| 3-Category Consistency | Separate | 0.886 | 0.907 | 0.841 | 0.918 | 0.876 | 0.918 | 0.839 | 0.915 |
| 4-Category Consistency | Fixed Parameter | 0.732 | 0.763 | 0.611 | 0.673 | 0.679 | 0.773 | 0.602 | 0.677 |
| 4-Category Consistency | Separate | 0.717 | 0.775 | 0.604 | 0.685 | 0.701 | 0.770 | 0.610 | 0.670 |
Results: MEAP Mathematics School-Level Value-Added Outcomes
Impact of Including/Not Including Demographics

| Metric | Multidimensional Calibration Type | 1 pre-test, 1-PL, Alg | 1 pre-test, 1-PL, Num | 1 pre-test, 3-PL, Alg | 1 pre-test, 3-PL, Num | 2 pre-tests, 1-PL, Alg | 2 pre-tests, 1-PL, Num | 2 pre-tests, 3-PL, Alg | 2 pre-tests, 3-PL, Num |
|---|---|---|---|---|---|---|---|---|---|
| Correlations | Fixed Parameter | 0.964 | 0.815 | 0.813 | 0.717 | 0.984 | 0.822 | 0.895 | 0.775 |
| Correlations | Separate | 0.962 | 0.819 | 0.806 | 0.780 | 0.983 | 0.825 | 0.877 | 0.793 |
| 3-Category Consistency | Fixed Parameter | 0.880 | 0.772 | 0.771 | 0.713 | 0.928 | 0.774 | 0.841 | 0.771 |
| 3-Category Consistency | Separate | 0.875 | 0.767 | 0.774 | 0.724 | 0.927 | 0.775 | 0.831 | 0.756 |
| 4-Category Consistency | Fixed Parameter | 0.775 | 0.551 | 0.572 | 0.464 | 0.864 | 0.557 | 0.646 | 0.508 |
| 4-Category Consistency | Separate | 0.774 | 0.556 | 0.544 | 0.522 | 0.858 | 0.552 | 0.635 | 0.547 |
Results: MEAP Mathematics School-Level Value-Added Outcomes
Impact of covarying on one vs. two pre-test scores

| Metric | Multidimensional Calibration Type | No Demographics, 1-PL, Alg | No Demographics, 1-PL, Num | No Demographics, 3-PL, Alg | No Demographics, 3-PL, Num | Includes Demographics, 1-PL, Alg | Includes Demographics, 1-PL, Num | Includes Demographics, 3-PL, Alg | Includes Demographics, 3-PL, Num |
|---|---|---|---|---|---|---|---|---|---|
| Correlations | Fixed Parameter | 0.937 | 0.965 | 0.923 | 0.964 | 0.941 | 0.947 | 0.930 | 0.951 |
| Correlations | Separate | 0.937 | 0.965 | 0.937 | 0.962 | 0.941 | 0.948 | 0.941 | 0.942 |
| 3-Category Consistency | Fixed Parameter | 0.855 | 0.884 | 0.851 | 0.889 | 0.889 | 0.918 | 0.872 | 0.744 |
| 3-Category Consistency | Separate | 0.859 | 0.889 | 0.878 | 0.883 | 0.885 | 0.922 | 0.885 | 0.755 |
| 4-Category Consistency | Fixed Parameter | 0.734 | 0.764 | 0.696 | 0.753 | 0.715 | 0.687 | 0.704 | 0.713 |
| 4-Category Consistency | Separate | 0.729 | 0.768 | 0.727 | 0.754 | 0.716 | 0.693 | 0.714 | 0.698 |
Conclusions
- Practically important impacts on value-added metrics and value-added classifications
  - Choice of psychometric model
  - Including/not including demographics
  - Including/not including multiple pre-test values
- Prohibitive impacts on value-added metrics and value-added classifications
  - Choice of outcome (i.e., domain within construct)
- Practically negligible impacts on value-added metrics and value-added classifications
  - Separate versus fixed calibrations of domains within construct
Conclusions, continued…
- Need to pay attention to modeling domains within constructs if constructs can reasonably be considered multidimensional
  - Of the common psychometric and statistical modeling decisions one can make, the choice of which subscore to use as an outcome is the most influential
  - Because subscores give different profiles of both student achievement and program/school value-added, each subscore should be modeled to the degree possible
- 4-category (i.e., quartile) classifications on value-added are appreciably impacted by every psychometric and statistical modeling choice evaluated here, but 3-category classifications are not
  - Discourage more than three categories
  - RTTT requires at least four categories
Conclusions, continued…
- The 3- vs. 4-category distinction is actually a proxy for
  - Statistical decision categories (3 categories)
  - Arbitrary cut point categories (4 categories)
- Can leverage unidimensional calibrations of multidimensional achievement scales to create multidimensional profiles of value-added
  - Except where using four categories of classification
Limitations
- Inductive reasoning
  - Results are likely to hold in similar circumstances
  - Still will need to investigate feasibility of using fixed parameters from unidimensional calibration for specific circumstances if those circumstances are high stakes
  - This is a proof of concept
- PCM and GPCM models were run using different software (WINSTEPS vs. PARSCALE)
Contact Information
Joseph A. Martineau, Ph.D.
Executive Director
Bureau of Assessment & Accountability
Michigan Department of Education

Ji Zeng, Ph.D.
Psychometrician
Bureau of Assessment & Accountability
Michigan Department of Education