66 Canal Center Plaza, Suite 700, Alexandria, VA 22314-1578 | Phone: 703.549.3611 | www.humrro.org
Evaluating Assessments for Fairness
The First Implementation of the High Quality Assessment Program’s Accessibility Criteria
Presentation at the National Conference on Student Assessment
Philadelphia, PA
June 21, 2016
Presenting with:
HQAP Research Partners
Developed and published the
content alignment methodology
Implemented methodology
(grades 5 & 8)
Implemented methodology
(high school and accessibility)
Funders supporting the study
Objectives
Participants will learn about:
● HQAP process for accessibility
● Lessons learned in implementing the
process
● Areas of further investigation needed
High Quality Assessment Program (HQAP) Overview
● Purpose
– Evaluate quality, content, accessibility against criteria
● Process
– Methodology developed by the Center for Assessment using CCSSO’s
“Criteria for Procuring and Evaluating High Quality Assessments”
● Uses
– Inform educators, parents, policymakers and other state and local officials
of the strengths and weaknesses of assessments
● Assessments Sampled
– ACT Aspire, MCAS, PARCC, and Smarter Balanced
Our Focus
● Research question
– Are the assessments accessible to all students, including ELs and SWDs?
● Criterion
– Providing accessibility to all students, including ELs and SWDs
Correspondence: APA Fairness Clusters (Standards)
● Test design, development, administration, and scoring procedures to minimize barriers to valid score interpretations for widest range of individuals and relevant subgroups
● Validity of test score interpretations for intended uses for the intended population
● Accommodations to remove construct-irrelevant barriers and support interpretations of scores for their intended uses
● Safeguards against inappropriate score interpretations for intended uses
Note: The shaded diamonds are color codes for the clusters.
Correspondence: HQAP Accessibility Evidence Descriptors
The HQAP program has:
● Defined the construct, appropriate standardization, and important threats to validity that are addressed through UD
● A comprehensive set of coherent procedures to develop its items in terms of accessibility
● Procedures to develop and construct its test forms while considering accessibility and to support valid score inferences
● Appropriate accommodations/access features that address needs of the vast majority of the students
● Documentation on accommodations including a rationale for how each supports valid score interpretations, when they may be used, and instructions for administration
● Considered validity and available accommodations/access features that specifically address the needs of ELs and/or SWDs
Accessibility Review Panels and Design
● Experts in
– Assessment
– Accessibility
– Universal Design for Learning
– ELs
– SWDs
– ELA
– Math
● Strategically assigned to programs
● Evaluated documentation and exemplars
HQAP Accessibility Study Components
● “Light Touch” Review (2-day workshop)
● Generalizability (Document) Review: Accessibility and Assessment
Frameworks, Blueprints, etc.
– Evidence of rationales, research, best practices, design principles,
review processes
● Exemplar Review: Operational or Sample Items
– Evidence of implementation
● Aggregation of information to develop summary statements
Example of Scoring Criteria
● Scoring Components – Must have 3 of the following for ELs and SWDs to meet requirement
– Defines construct to be assessed with sufficient clarity that program and others can distinguish construct-irrelevant from construct-relevant variance
– Provides rationale for construct definition that incorporates available research
– Has defined threats to validity relevant to assessment program that might require accommodations and/or access features, including those relevant to ELs and SWDs
– Has process in place to improve its conception and support of validity regarding accessibility and accommodations
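The "3 of the following" rule above can be sketched as a simple threshold check. This is only an illustration, not the HQAP scoring tool: the component keys, the group labels, and the reading that at least 3 of the 4 components must be evidenced separately for ELs and for SWDs are all assumptions made for the example.

```python
from typing import Dict

# Hypothetical short keys for the four scoring components listed above.
COMPONENTS = (
    "construct_defined",
    "rationale_uses_research",
    "validity_threats_defined",
    "improvement_process",
)
REQUIRED_COUNT = 3          # "Must have 3 of the following"
GROUPS = ("EL", "SWD")      # assumed: the rule is applied to each group


def meets_requirement(evidence: Dict[str, Dict[str, bool]]) -> bool:
    """Return True only if every group has at least REQUIRED_COUNT components."""
    for group in GROUPS:
        group_evidence = evidence.get(group, {})
        satisfied = sum(1 for name in COMPONENTS if group_evidence.get(name, False))
        if satisfied < REQUIRED_COUNT:
            return False
    return True


# Illustrative (made-up) reviewer judgments for one program.
ratings = {
    "EL": {
        "construct_defined": True,
        "rationale_uses_research": True,
        "validity_threats_defined": True,
        "improvement_process": False,
    },
    "SWD": {
        "construct_defined": True,
        "rationale_uses_research": True,
        "validity_threats_defined": False,
        "improvement_process": False,
    },
}

print(meets_requirement(ratings))  # False: SWD has only 2 of the 4 components
```

Under this reading, a program with strong evidence for one group still does not meet the requirement if the other group falls short of three components.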
A Provider’s Perspective
Note: In this session, PARCC Inc. is characterized as a “vendor” or “provider”. PARCC Inc.
is the consortium’s project management partner and facilitates states’ work with vendors.
PARCC Inc. project management led the effort to provide materials for the HQAP project.
Speaker:
Francine Markowitz
Gathering Documentation
● Reading Evidence Tables
● Writing Evidence Tables
● ELA Practice Items
● Item Guidelines
● PARCC Accessibility Features and Accommodations Manual
● Form Specifications
● Performance Level Descriptors
● Task Models
● Cognitive Complexity One Pager
Providing Realistic Exemplars for Review
● Provided 26 exemplar items from accommodated forms to illustrate
range of accessibility features
● Item selection process: PARCC Inc. staff worked with state experts
in PARCC’s AAF working group to select items that showed how the
accessibility and accommodation features interacted
● Accommodations included TTS items, ASL videos, and screen
reader items
Exemplar Accessibility Features
● Answer Masking*
● Bookmark
● Color Contrast (Background/Font Color)*
● Eliminate Answer Choices
● Highlight Tool
● Line Reader Tool
● Magnification
● Pop-up Glossary
● Notepad
● Writing Tools (on writing items)
*Available to all participating students but must be identified in advance
Exemplar of an Accommodated ASL Item
● http://parcc.pearson.com/practice-tests/english/
Requested Information to Support Exemplars
● Content standard/construct addressed
● What the accommodation/accessibility feature is; how it differs from
non-accommodated version
● Instructions for administration/use
● Conditions under which feature is available
● Process by which feature is approved to be used by student
● Why it is fair in relation to focal construct and intended score
interpretation
● How it relates to assessment program’s documentation on fairness and
item specifications
● Any other salient aspect about exemplar that reviewers need to know
Review Process: Form and Function
Picture credits: https://en.wikipedia.org/wiki/Blind_men_and_an_elephant & https://en.wikipedia.org/wiki/If_You_Give_a_Mouse_a_Cookie
Form
● The quantity of material made a “light touch” review challenging in the time allotted
● Refining the organizational template to add more structure could
help reviewers prioritize tasks
● Collectively calibrating to a single assessment, although time intensive,
was critical in providing a common language among reviewers
● Peer collaboration was beneficial
Function
● Evidence often showed that UDL principles were incorporated
into the design of items, but it was unclear whether they drove design decisions
● Providers showed evidence of accessibility features much farther
along the UDL continuum than their legacy assessment counterparts
Evaluation Criteria
● Framework to guide what UDL looks like within new on-line platform
was beneficial
● As with reviewers, providers might find this process a useful road map
for focusing their work on UDL features
First Implementation Results
Narrative results
• Light touch review
• Rating criteria were specific and required a lot to “meet”
requirement
• Scoring considerations
PARCC Narrative Results: Strengths Identified
● Program incorporates accessibility features that are available to all
students and offers several test administration considerations for any
student
● Sensitivity to design of item types that reflect individual needs of
students with disabilities
● Strong research base and inclusion of existing research on ELs
● Wide range of accommodations for SWDs and ELs
● Valid and appropriate accommodations based on current research
PARCC Narrative Results: Areas for Improvement
● Research needed to determine whether accessibility features and
accommodations alter constructs being measured
● Clearer documentation may be needed regarding how PARCC
administers multiple features simultaneously and implications of how
multiple accessibility features impact student performance
PARCC: Usefulness and Value of the Findings
● What was gained?
– A valiant attempt to study accessibility features on assessments; PARCC
appreciates/supports effort to shed light on this important topic
● Anything that could be changed/improved?
– Guidance for providing exemplars
– More specific, actionable feedback
● What does PARCC plan to do with the feedback gained
from the study?
– Shared with State Leads and PARCC working groups
– Will determine whether adjustments should be made to PARCC policies
or research agenda
Reviewer: Usefulness and Value of Findings
● Evaluation process can serve as framework for collecting evidence
from vendors on how UDL was used within test development process
● Criteria can be built into Request for Proposals (RFPs) to allow users
to more readily recognize accessibility and UDL principles across
assessments
● With some adjustments, framework can provide means for test
producers and accompanying vendors to package offerings to users
What Did We Learn?
● UDL needs to be addressed from onset to drive test design
● Computer-administered tests had more features
● No one list of what is required in terms of accommodations and accessibility features
– State policies
– Different definitions and different implementation rules for similar terms
– Changing environment
– Reliance on “what is commonly done” and expert judgement
– Research needs to catch up
• Which features are the most effective?
• Which are the easiest to provide? Do they level the playing field for students using them?
What Do We Still Need to Know?
● Features (UDL, Accessibility, and Accommodations)
– Which features do not interfere with item construct (and thus
should be available for anyone)? How do we know?
– What barriers can we reduce or remove if we plan for
features and UDL from beginning?
How Should/Could Criteria Be Used?
● Provide foundation that could be considered when
– Developing new tests
– Evaluating tests
– Writing RFPs