Consequential Validity by Suzanne Lane Presentation for CCSSO SCASS-ASR Accountability Systems and Reporting January 20 2005



Consequential Validity

by Suzanne Lane

Presentation for CCSSO SCASS-ASR Accountability Systems and Reporting

January 20, 2005


NCLB Goals

• Better, more demanding instruction

• Challenging content standards

• Same educational opportunities for all students

• All students reach same level of achievement


Standards-Based Accountability and Assessment Systems

• Goal: improve instruction and student learning

• Assessments: grounded in theories and models of student cognition and student learning

• Impact: instruction and student learning are improving in meaningful ways


Overview of Presentation

• Need for consequential validity evidence

• Summary of research on consequential validity

• Factors affecting the impact of standards-based reform

• Measuring the impact of standards-based reform


Standards and Assessments Peer Review Guidelines

• Has the State ascertained that the decisions based on the results of its assessments are consistent with the purposes for which the assessments were designed?

• Has the State ascertained whether the assessments produce intended and unintended consequences?

(U.S. Department of Education, 2004)


Standards for Educational and Psychological Testing and Consequential Validity

• 13.1 When educational testing programs are mandated by school, district, state or other authorities, the ways in which test results are intended to be used should be clearly described. It is the responsibility of those who mandate the use of tests to monitor their impact and to identify and minimize potential negative consequences. Consequences resulting from the uses of the test, both intended and unintended, should also be examined by the test user.

Efforts should be made to document the provision of instruction in tested content and skills

(AERA, APA, NCME, 1999)


Need for Consequential Validity Evidence

• Success of a reform policy is typically evaluated in terms of assessment scores

• Assessment scores in accountability systems can increase in the first few years without actual student learning with respect to the larger construct domain (Linn et al., 1990)

• Examination of the impact of assessments on instruction and student learning provides validity evidence addressing the effectiveness of standards-based accountability systems

• Such studies allow a deeper understanding of whether improved performance on assessments reflects meaningful improvements in student achievement and learning


Success of a reform policy in terms of assessment scores

• Compared NAEP results in high-stakes states against the national average for NAEP scores (Amrein & Berliner, 2002)

Concluded that states that introduced consequences to their statewide tests did not show any particular gains in their statewide NAEP scores

• Using a comparison group, found that states that attached consequences outperformed the comparison group of states for each grade (Rosenshine, 2003; Braun, 2004)

• Strong positive associations between gains and accountability index for grade 8 but weaker for grade 4 (Carnoy & Loeb, 2003)


Consequential Validity Research

• Many studies have reported a narrowing of curriculum (KY - Koretz et al., 1996; CT - Chudowsky et al., 1997)

• Interviews and classroom artifacts showed that the KY and NC assessment systems led to new instructional strategies, but the depth and complexity of the content covered in instruction did not change in any essential way (McDonnell & Choisser, 1997)

• Teacher-reported professional activities and attitudes toward the state assessment, especially the scoring rubric, were related to changes in instructional practices (KS - Pomplun, 1997)


• 2/3 of on-grade teachers reported that standards and state assessment extended-response items promoted better instruction and student learning (WA - Stecher et al., 2000)

• Teachers used reform-oriented strategies in a meaningful way: mathematics and writing scoring rubrics were used in instruction in a way that reinforced meaningful learning (WA - Borko et al., 2001)


Few studies have examined the relationship between changes in instruction and improved performance on the assessments

KIRIS (Stecher et al., 1998)

• Few consistent findings across subject areas and grades

• Positive relationship between standards-based practices in writing instruction and direct writing assessment at the middle school level

• E.g., more 7th-grade teachers in high-gain than in low-gain schools reported integration of writing with other subjects and increased emphasis on the writing process


QUASAR (Stein & Lane, 1996)

Examined the relationship between the presence of reform features of math instruction and student performance on a mathematics performance assessment

Cognitive Demands of Instructional Task

• As represented by instructional material

• As set up by the teacher in the classroom

• As implemented by the students in the classroom


• Greatest gains on performance assessment when instructional tasks engaged students in high levels of cognitive processing

• Moderate gains when instructional tasks began as cognitively demanding but were implemented so that students were not engaged in high levels of reasoning and problem solving

• Relatively small gains when instructional tasks were procedurally based and could be solved with a single, easily accessible strategy


Maryland - MSPAP (Lane et al.; Stone & Lane, 2003)

• MSPAP Study – majority of the teachers indicated that MSPAP had a positive impact on their instruction

• Teacher-reported reform-oriented instruction-related variables explained performance differences across schools in reading, writing, math and science

i.e., schools in which teachers reported that their instruction over the years reflected more reform-oriented problem types and learning outcomes similar to those assessed on MSPAP had higher levels of school performance on MSPAP


• Teacher-reported reform-oriented instruction-related variables explained differences in MSPAP school performance gains in reading and writing

i.e., increased reported use of reform-oriented tasks in writing and reading and a focus on the reading and writing learning outcomes in instruction were associated with greater rates of change in MSPAP school performance over a 5-year period

• Teacher-perceived impact of MSPAP on instructional practices explained differences in MSPAP school performance gains in math and science


• MSPAP study examined contextual variables such as SES

• Percent free or reduced lunch, which served as a proxy for SES, was significantly related to school performance in all content areas – schools with a higher percent tended to perform more poorly on MSPAP

• No significant relationship between percent free or reduced lunch and growth in school level performance


Factors Affecting the Impact of Standards-Based Reform on Improving Instruction and Student Learning

• Quality of Content Standards

• Assessments aligned to standards

• Variability in defining proficiency

• Classification Error


Content Standards

• Carefully crafted

• Coherent across grade levels and across content areas, reflecting a developmental progression

• Accessible to teachers

• Teachers need to have a shared understanding of the standards

• Cognitive complexity of the standards

• Implications for instructional practices


Content Standards

Mathematics (NCTM, 2000)

• Develop & evaluate mathematical arguments & proofs

• Organize mathematical thinking through communication

• Create & use representations to organize, record, & communicate mathematical ideas

Science (NRC, 1996)

Abilities necessary to do scientific inquiry:

• Identify questions that can be answered through scientific investigations

• Design & conduct a scientific investigation

• Use appropriate tools & techniques to gather, analyze & interpret data

• Develop descriptions, explanations, predictions, & models using evidence


Alignment of Content Standards and Assessment

Depth-of-knowledge consistency between standards and assessment indicates alignment if what is elicited from students on the assessment is as demanding cognitively as what students are expected to know and do as stated in the standards (Webb, 2002, p. 5).

Levels: Recall, Skill or Concept, Strategic Thinking, Extended Thinking


Alignment (Webb, 2002 – 4 states)

• Math: Over 50% of objectives under the standards required a more complex depth-of-knowledge than the corresponding items

• Reading: 3 of 5 state/grade combinations had 66% of the objectives requiring a more complex depth-of-knowledge than the corresponding items

• Most objectives judged to have a depth-of-knowledge level of 2 or 3

• Most multiple-choice items judged to have a depth-of-knowledge of 1 or 2
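The depth-of-knowledge comparison described above can be sketched as a simple tally. The following is a minimal Python illustration, not Webb's procedure as published: the objectives, the item ratings, the names `objective_dok` and `item_dok`, and the 50% cutoff are all assumptions chosen for the example.

```python
# Depth-of-knowledge levels: 1 = Recall, 2 = Skill or Concept,
# 3 = Strategic Thinking, 4 = Extended Thinking.

# Hypothetical ratings for illustration only (not data from Webb, 2002).
objective_dok = {"obj_A": 2, "obj_B": 3, "obj_C": 3}
item_dok = {
    "obj_A": [1, 2, 2, 2],
    "obj_B": [1, 2, 2, 3],
    "obj_C": [3, 3, 2],
}

THRESHOLD = 0.5  # assumed cutoff for calling an objective "consistent"


def share_at_or_above(objective_level, item_levels):
    """Fraction of items rated at or above the objective's DOK level."""
    hits = sum(1 for level in item_levels if level >= objective_level)
    return hits / len(item_levels)


def dok_consistency(objectives, items, threshold=THRESHOLD):
    """Map each objective to (share of items at or above, meets threshold)."""
    result = {}
    for name, level in objectives.items():
        share = share_at_or_above(level, items[name])
        result[name] = (share, share >= threshold)
    return result


consistency = dok_consistency(objective_dok, item_dok)
for name, (share, ok) in consistency.items():
    print(f"{name}: {share:.0%} of items at or above objective level, consistent={ok}")
```

In this made-up data, `obj_B` is the kind of case the slide describes: the objective sits at level 3 while most of its items are rated at level 1 or 2.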


Alignment

• Inconsistency of the cognitive level of content standards across subject matters within states

• Inconsistency of the cognitive level of content standards across states

• Assessments tend to measure lower cognitive levels than reflected in content standards

• Implications for the quality and level of instruction


Proficiency Levels

Grade 8 AYP Starting Points for 42 States on the State Assessment and Performance on NAEP (Linn, 2003)

State            % At or Above Proficient Starting Point on State Assessment    % At or Above Proficient on NAEP
Arizona          7                                                              21
North Carolina   74.6                                                           30


Proficiency Levels

Grade 8 AYP Starting Points for 42 States on the State Assessment (Linn, 2003, 2004)

Most stringent (AZ): 7.0%

Most lenient (CO): 74.6%

75th percentile: 56.5%

Median: 39.4%

25th percentile: 23.6%


Differences in Reading Proficiency Rates from Start (2001-02) to End (2013-14) of the 12-Year Timeline in Six States (Kim and Sunderman, 2004)

[Figure: Intermediate Goals. Y-axis: % At or Above Proficient (0 to 100). X-axis: 2001-02, 2004-05, 2007-08, 2010-11, 2013-14. Series: Virginia, Georgia, Arizona, Illinois, New York, California.]


Variability in Starting Points and Effect on Rates of Improvement

• At starting point, less than 15% proficient in CA and only 25% proficient in NY

• 60% were proficient in GA and VA

• Relatively consistent rate of growth during the first few years across the 6 states

• Considerable variability in rate of growth during the remaining years
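To make the effect of the starting point concrete, the arithmetic can be sketched as follows. This is a rough illustration that assumes a straight-line path to 100% proficient over the 12-year timeline; the states' actual intermediate goals were not necessarily linear, and the CA value is an upper bound because the slide reports only "less than 15%".

```python
# Starting points (% proficient in reading, 2001-02) as stated on the slide.
# CA is reported as "less than 15%", so 15 is used here as an upper bound.
starting_points = {"CA": 15.0, "NY": 25.0, "GA": 60.0, "VA": 60.0}

TARGET = 100.0  # all students proficient by 2013-14
YEARS = 12      # length of the timeline, 2001-02 to 2013-14


def required_annual_gain(start, target=TARGET, years=YEARS):
    """Percentage points per year needed under an assumed straight-line path."""
    return (target - start) / years


gains = {state: required_annual_gain(p) for state, p in starting_points.items()}

for state, gain in gains.items():
    print(f"{state}: about {gain:.1f} percentage points per year")
```

Under this simplifying assumption, a state starting near 15% needs more than twice the yearly gain of a state starting at 60%.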


Variability in Proficiency Levels

• Meaningfulness of the term proficiency

• Need for coherency of performance standards across grades and across content areas

• Impact on instructional practices


Classification Errors

• Concern with false negatives, i.e., misclassifying passing students as non-passing and proficient students as not proficient

• Impact on grade promotion, retention, and high school graduation

• Need to consider measurement error and form confidence bands

• Need to provide validity evidence for the psychometric models used (Stone, Weissman, & Lane, in press)
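One common way to form a confidence band around an observed score uses the classical test theory standard error of measurement, SEM = SD × sqrt(1 − reliability). The sketch below is a minimal illustration under that assumption; the standard deviation, reliability, cut score, and student scores are hypothetical, and an operational program might instead use conditional standard errors from its own psychometric model.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_band(score, sem, z=1.96):
    """Approximate band around an observed score (z=1.96 for roughly 95%)."""
    return score - z * sem, score + z * sem


def band_contains_cut(score, cut, sem, z=1.96):
    """True when the cut score falls inside the band, so the
    proficient / not proficient decision is uncertain."""
    low, high = confidence_band(score, sem, z)
    return low <= cut <= high


# Hypothetical values for illustration only.
SD, RELIABILITY, CUT = 40.0, 0.91, 400.0
sem = standard_error_of_measurement(SD, RELIABILITY)  # about 12 scale points

for observed in (392.0, 350.0):
    low, high = confidence_band(observed, sem)
    flag = band_contains_cut(observed, CUT, sem)
    print(f"score {observed:.0f}: band {low:.1f} to {high:.1f}, cut inside band: {flag}")
```

A student whose band straddles the cut score is a candidate false negative, which is why a single observed score is weak evidence for promotion or graduation decisions.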


1P classification (rows) by 3P classification (columns); row % in parentheses

1P \ 3P     Adv.          Prof.         Basic         B.Basic       Row Totals
Adv.        2757 (91%)    261 (9%)                                  3018
Prof.       42 (1%)       2992 (98%)    12 (1%)                     3046
Basic                     573 (23%)     1826 (74%)    60 (2%)       2459
B.Basic                                 421 (8%)      4676 (92%)    5097

~.90 overall agreement

• 1P model underestimates proficiency in two categories

• 1P model overestimates proficiency for one category
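The overall agreement figure can be checked from the cell counts on this slide. The sketch below is a minimal Python illustration; the placement of the off-diagonal cells is inferred from the reading order of the slide and the row totals, so it should be treated as an assumption rather than a confirmed layout.

```python
LEVELS = ["Adv.", "Prof.", "Basic", "B.Basic"]  # ordered highest to lowest

# Rows: 1P classification; columns: 3P classification (same order as LEVELS).
# Cell placement is inferred from the slide; cells not shown are taken as 0.
counts = [
    [2757, 261, 0, 0],
    [42, 2992, 12, 0],
    [0, 573, 1826, 60],
    [0, 0, 421, 4676],
]

n = len(LEVELS)
row_totals = [sum(row) for row in counts]
total = sum(row_totals)

# Students placed in the same level by both models (the diagonal).
agree = sum(counts[i][i] for i in range(n))
overall_agreement = agree / total

# Entries left of the diagonal: 3P places the student in a higher level than 1P.
higher_under_3p = sum(counts[i][j] for i in range(n) for j in range(i))
# Entries right of the diagonal: 3P places the student in a lower level than 1P.
lower_under_3p = sum(counts[i][j] for i in range(n) for j in range(i + 1, n))

print(f"overall agreement: {overall_agreement:.3f}")
print(f"higher under 3P: {higher_under_3p}, lower under 3P: {lower_under_3p}")
```

The row sums reproduce the row totals printed on the slide, and the diagonal share rounds to the reported value of about .90.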


Measuring the Impact of Standards-Based Reform on Instructional Practice

• Information about instructional practice provides evidence to evaluate the validity of test performance and performance gains as well as to evaluate the effectiveness of reform programs

“While policy makers and reformers at all levels of the system are crucial if these reforms are to be enacted locally, teachers are the key agents when it comes to changing classroom practice. They are the final policy brokers” (Spillane, 1999)


Intended Impact of Assessment and Accountability Program

• Motivation and effort to adapt the curriculum and instruction to content standards

• Beliefs about the content standards and the assessment

• Student motivation to learn and put forth their best effort on the assessment

• Professional development support

• Instructional practice – content and strategies

• Classroom assessments – content and format

• Use and nature of test preparation activities

• Improved student learning


Unintended Impact of Assessment and Accountability Program

• Narrowing of curriculum and instruction

• Use of test preparation materials closely linked to the assessment without making changes to instruction

• Decreased confidence and motivation to learn and perform well on the assessment

• Differential performance gains for subgroups of students


Methods

• Surveys

- cost effective

- increases generalizability

- limited in capturing complex instructional practices

- lack of shared understanding of terminology

- self-report bias

• Observations/Case Studies

- captures complexities of instructional practices

- costly, time consuming, labor intensive

- lacks generalizability


• Classroom Artifacts

- Captures instructional practices - more direct evidence of whether the cognitive demands placed on students in the classroom reflect the cognitive demands placed on students by the standards and assessments

- increases generalizability

- time-consuming for the analysis


Consequential Validity Evidence: Familiarity, Beliefs, Morale, Effort

• Familiarity with the content standards and assessment

• Beliefs and attitudes toward the assessment

• Principal, teacher, student morale

• Teacher and student effort

Surveys


Consequential Validity Evidence: Professional Development

• Nature of professional development support

• Amount of professional development support

Surveys and artifacts


Consequential Validity Evidence: Instruction and Curriculum

• Extent to which instruction and classroom assessments reflect the state standards and assessment

- content

- cognitive demands

- reform oriented

Classroom Artifacts - instruction, classroom assessments, scoring rubrics, test preparation materials

Surveys


Conceptual Modeling of the Relationship Between Score Gains and School, Principal, Teacher, and Student Variables

• Evaluation of consequential evidence involves examining variation in school performance in terms of contextual and evidential variables.

• To model the change process and examine agents of change, growth models have been advocated (e.g., Rogosa & Willett, 1985; Willett & Sayer, 1994)


Between School Model (Lane & Stone, 2002)

Contextual Variables

• % free or reduced lunch

• % minority students

• Funding per student

• Stability

Evidential Variables

• Motivation and Effort

• Professional Dev

• Classroom Instruction

• Student Motivation

Change in Evidential Var

• Motivation and Effort

• Professional Dev

• Classroom Instruction

• Student Motivation

Latent variables: Initial Performance Level (Intercept), Rate of Change (Slope)

Scores: Year 1 School Score, Year 2 School Score, Year 3 School Score, Year 4 Class Score


Between School Model

• Two latent variables are used to describe school performance on the assessment

– Initial performance

– Rate of Change

• Degree of variability in these latent variables reflects the degree of variability between the schools

• Contextual and evidential factors can be introduced to explain any variability in these factors
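The idea of an initial level and a rate of change per school can be illustrated with a deliberately simplified two-step sketch: fit a straight line to each school's yearly scores, then look at how much the intercepts and slopes vary between schools. This is not the latent variable growth model itself, which estimates both quantities jointly with measurement error; the school names and scores below are hypothetical.

```python
import statistics


def ols_line(times, scores):
    """Least-squares intercept and slope for one school's score series."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Hypothetical school mean scores for Years 1-4. Time is coded 0..3 so the
# intercept corresponds to the Year 1 (initial) level.
TIMES = [0, 1, 2, 3]
school_scores = {
    "school_A": [500.0, 504.0, 508.0, 512.0],
    "school_B": [520.0, 521.0, 522.0, 523.0],
    "school_C": [480.0, 486.0, 492.0, 498.0],
}

growth = {name: ols_line(TIMES, ys) for name, ys in school_scores.items()}

# Between-school variability in initial level and in rate of change.
intercept_var = statistics.variance(g[0] for g in growth.values())
slope_var = statistics.variance(g[1] for g in growth.values())

for name, (intercept, slope) in growth.items():
    print(f"{name}: initial level {intercept:.1f}, rate of change {slope:.1f} per year")
print(f"variance of intercepts: {intercept_var:.1f}, variance of slopes: {slope_var:.2f}")
```

Contextual or evidential variables would then be used to explain why some schools have higher intercepts or steeper slopes than others.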


Effects that may be evaluated

• relevant contextual variables on school level initial performance

• relevant evidential variables at Year 1 on school level initial performance and rates of change

• changes in relevant evidential variables (Year 1-4) on school-level initial performance and rates of change


Within School Model

Contextual Variables (Class Level)

• Teacher Education

• Teacher Experience

Evidential Variables (Class Level)

• Motivation and Effort

• Professional Dev

• Classroom Instruction

• Student Motivation

Change in Evidential Var (Class Level)

• Motivation and Effort

• Professional Dev

• Classroom Instruction

• Student Motivation

Latent variables: Initial Performance Level (Intercept), Rate of Change (Slope)

Scores: 2001/02 Class Score, 2002/03 Class Score, 2003/04 Class Score, 2004/05 Class Score


Within School Model

• Examines variability in relevant evidential variables at the class level within schools

• Effects that may be evaluated

– Contextual variables on class-level initial performance and rates of change

– Evidential variables at Year 1 on class-level initial performance and rates of change

– Changes in evidential variables (Year 1-4) on class-level initial performance and rates of change


Multilevel Model with 3 Levels

• Level 1: Changes in performance at the class level

• Level 2: Variability within schools

• Level 3: Variability between schools

(Bryk & Raudenbush, 1992; Muthen, 1995)
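One way to write out the three levels is sketched below in generic hierarchical linear model notation. The symbols are assumptions introduced for illustration and do not come from the slides: $t$ indexes year, $c$ class, and $s$ school; $X_{cs}$ stands for a class-level variable (contextual or evidential) and $W_{s}$ for a school-level variable. For simplicity the class-level effects $\beta_{01}$ and $\beta_{11}$ are treated as fixed across schools.

```latex
\begin{align*}
% Level 1: change in performance over time for class c in school s
Y_{tcs} &= \pi_{0cs} + \pi_{1cs}\,\mathrm{time}_{t} + e_{tcs} \\
% Level 2: variability among classes within schools
\pi_{0cs} &= \beta_{00s} + \beta_{01} X_{cs} + r_{0cs} \\
\pi_{1cs} &= \beta_{10s} + \beta_{11} X_{cs} + r_{1cs} \\
% Level 3: variability between schools
\beta_{00s} &= \gamma_{000} + \gamma_{001} W_{s} + u_{00s} \\
\beta_{10s} &= \gamma_{100} + \gamma_{101} W_{s} + u_{10s}
\end{align*}
```

Here $\pi_{0cs}$ and $\pi_{1cs}$ play the roles of initial performance (intercept) and rate of change (slope) at the class level, and the variances of the $r$ and $u$ terms describe within-school and between-school variability respectively.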


Case Studies

• Case studies can be used to supplement large-scale studies

• Provide richer, contextualized information on instructional practices, classroom assessment, and professional development

• Can focus on effective programs and illustrate what factors contribute to quality instruction and student learning as a result of the standards-based reform


Conclusion

• Information on teachers’ instruction and classroom assessment practices is pivotal in understanding the success or failure of accountability systems and reform efforts

• Need to better understand the extent to which performance gains on assessments reflect improved instruction and student learning rather than more superficial interventions such as narrow test preparation activities