Where Do We Go from Here?
Next Generation Assessment Systems for Achievement and
Accountability
Presentation at the National Conference on Large Scale Assessment
Gregory J. Cizek, University of North Carolina at Chapel Hill
June 2015
Purposes of Assessment: Four Principles
1) All assessments are designed, on purpose, to best fulfill one purpose
2) Some assessments can be made to fulfill multiple purposes, but they are not likely to serve those multiple purposes equally well
3) To demand that they do so is unrealistic
4) Diverse purposes likely require multiple, targeted assessments
What types of test(s) would be most important to have in an assessment system?

Design: Annual large-scale, state-mandated student achievement test (Summative)
Purpose(s): Public and system accountability; fiduciary monitoring of overall achievement in a state; policy evaluation, funding decisions, program & instructional planning at aggregate levels
Options: Matrix, cluster, random, or other sampling strategy of students; random, systematic, purposeful, or census sampling of content standards

Design: Annual large-scale, state-mandated student achievement test (Summative)
Purpose(s): Above, plus reporting to students and parents of overall mastery of grade/subject content standards; local and teacher instructional planning; student accountability (e.g., graduation, promotion)
Options: Census sampling of students; random, systematic, purposeful, or census sampling of content standards

Design: Fine-grained, narrowly-focused assessment of student learning (Diagnostic)
Purpose(s): Individually-diagnostic, educator actionable, tailoring of instruction
Options: On demand, individual or group administration, state or locally developed, aggregated (or not), secure (or not)

Design: Instructionally-embedded assessment for learning (Formative)
Purpose(s): Instructional planning for educator and student; student responsibility for goals, monitoring progress, self-evaluation, meta-cognition
Options: As needed, individual or classroom administration, state or locally developed, not aggregated, highly insecure
Why all the controversy about accountability systems?
1) “It’s the accountability, stupid.”
2) The personnel evaluation system in education is broken.
3) Federal shenanigans
4) Teacher preparation requirements: An Epic Fail
Can accountability results improve achievement?
Point of diminishing returns on large-scale, every-pupil, state- (or federal-) mandated student achievement tests.
Can accountability results improve achievement?
Best candidates for improving achievement:
* REAL formative assessment
* ANY content standards that change what happens in classrooms
Construct Clarity
“A consensus definition of what constitutes college and career readiness does not exist.” (Hao & Wyse, 2015, p. 3)
Construct Clarity (cont’d)
The score at which a student has a 75% probability of getting a score of X or better on a different test [or a grade of Y or better in a different course].
“Intelligence as a measurable capacity must at the start be defined as the capacity to do well on an intelligence test. Intelligence is what the tests test.” (Boring, 1923, p. 35)
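To see how mechanical the 75%-probability definition above is, here is a minimal sketch of how such a cut score could be computed. It assumes a simple logistic model linking the score on one test to the probability of success on another; the model form, the function names, and the coefficient values are all hypothetical illustrations, not anything taken from the presentation or from an operational testing program.

```python
import math


def logit(p: float) -> float:
    """Return the log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def success_probability(score: float, intercept: float, slope: float) -> float:
    """Assumed logistic link: P(success | score) = 1 / (1 + exp(-(a + b * score)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


def readiness_cut_score(intercept: float, slope: float,
                        target_prob: float = 0.75) -> float:
    """Solve a + b * score = logit(target_prob) for score."""
    if slope <= 0:
        raise ValueError("slope must be positive for the cut score to be meaningful")
    return (logit(target_prob) - intercept) / slope


# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = -10.0
SLOPE = 0.05

cut = readiness_cut_score(INTERCEPT, SLOPE)
prob_at_cut = success_probability(cut, INTERCEPT, SLOPE)
print(f"cut score: {cut:.1f}, probability of success at the cut: {prob_at_cut:.2f}")
```

Note that the calculation only locates a score relative to another score or grade. It says nothing about what "readiness" is, which is the construct-clarity problem the two quotations point to.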
Clarity of Purpose: Example 1
“How can we extend the content standards and develop a modified assessment for the most severely cognitively disabled students so that their scores have equivalent meaning as those from the general assessment?”