
66 Canal Center Plaza, Suite 700, Alexandria, VA 22314-1578 | Phone: 703.549.3611 | www.humrro.org

Evaluating Assessments for Fairness

The First Implementation of the High Quality Assessment Program’s Accessibility Criteria

Presentation at the National Conference on Student Assessment

Philadelphia, PA

June 21, 2016


Presenters

Hillary R. Michaels

Audrey Lesondak


Deborah Taub

Francine Markowitz

HQAP Research Partners


● Developed and published the content alignment methodology

● Implemented the methodology (grades 5 & 8)

● Implemented the methodology (high school and accessibility)

● Funders supporting the study

Objectives

Participants will learn about:

● HQAP process for accessibility

● Lessons learned in implementing the process

● Areas where further investigation is needed


High Quality Assessment Program (HQAP) Overview

● Purpose

– Evaluate quality, content, and accessibility against criteria

● Process

– Methodology developed by the Center for Assessment using CCSSO’s “Criteria for Procuring and Evaluating High Quality Assessments”

● Uses

– Inform educators, parents, policymakers, and other state and local officials of the strengths and weaknesses of assessments

● Assessments Sampled

– ACT Aspire, MCAS, PARCC, and Smarter Balanced


Our Focus

● Research question – Are the assessments accessible to all students, including English learners (ELs) and students with disabilities (SWDs)?

● Criterion – Providing accessibility to all students, including ELs and SWDs


Correspondence: APA Fairness Clusters (Standards)

● Test design, development, administration, and scoring procedures to minimize barriers to valid score interpretations for widest range of individuals and relevant subgroups

● Validity of test score interpretations for intended uses for the intended population

● Accommodations to remove construct-irrelevant barriers and support interpretations of scores for their intended uses

● Safeguards against inappropriate score interpretations for intended uses

Note: On the original slide, shaded diamonds color-code the clusters.

Correspondence: HQAP Accessibility Evidence Descriptors

The HQAP program has:

● Defined the construct, appropriate standardization, and important threats to validity that are addressed through universal design (UD)

● A comprehensive set of coherent procedures to develop its items in terms of accessibility

● Procedures to develop and construct its test forms while considering accessibility and to support valid score inferences

● Appropriate accommodations/access features that address the needs of the vast majority of students

● Documentation on accommodations, including a rationale for how each supports valid score interpretations, when they may be used, and instructions for administration

● Considered validity and available accommodations/access features that specifically address the needs of ELs and/or SWDs

Accessibility Review Panels and Design

● Experts in:

– Assessment

– Accessibility

– Universal Design for Learning (UDL)

– ELs

– SWDs

– ELA

– Math

● Strategically assigned to programs

● Evaluated documentation and exemplars


HQAP Accessibility Study Components

● “Light Touch” Review (2-day workshop)

● Generalizability (Document) Review: Accessibility and Assessment Frameworks, Blueprints, etc.

– Evidence of rationales, research, best practices, design principles, and review processes

● Exemplar Review: Operational or Sample Items

– Evidence of implementation

● Aggregation of information to develop summary statements

Facilitator’s Comments and Material Overview

Speaker:

Debbie Taub

Example of Portion of Worksheet


Example of Scoring Criteria

● Scoring Components – Must have 3 of the following 4 for ELs and SWDs to meet the requirement:

– Defines the construct to be assessed with sufficient clarity that the program and others can distinguish construct-irrelevant from construct-relevant variance

– Provides a rationale for the construct definition that incorporates available research

– Has defined threats to validity relevant to the assessment program that might require accommodations and/or access features, including those relevant to ELs and SWDs

– Has a process in place to improve its conception and support of validity regarding accessibility and accommodations

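To make the decision rule concrete, here is a minimal sketch in Python of the 3-of-4 scoring logic described above. The component names and the meets_requirement helper are hypothetical illustrations, not part of the actual HQAP scoring materials.

# A minimal sketch of the 3-of-4 scoring rule described above.
# Component names and this helper are illustrative only; they are not
# taken from the actual HQAP scoring materials.

COMPONENTS = [
    "defines_construct_clearly",
    "rationale_incorporates_research",
    "defines_threats_to_validity",
    "process_to_improve_validity_support",
]

def meets_requirement(ratings, minimum=3):
    """Return True if at least `minimum` of the scoring components are met."""
    return sum(bool(ratings.get(c)) for c in COMPONENTS) >= minimum

# Example: a program judged to satisfy three of the four components.
ratings = {
    "defines_construct_clearly": True,
    "rationale_incorporates_research": True,
    "defines_threats_to_validity": True,
    "process_to_improve_validity_support": False,
}
print(meets_requirement(ratings))  # prints: True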

Example of Group Form Scoring and Comments


Example of Final Scoring Summary


A Provider’s Perspective

Note: In this session, PARCC Inc. is characterized as a “vendor” or “provider.” PARCC Inc. is the consortium’s project management partner and facilitates states’ work with vendors. PARCC Inc. project management led the effort to provide materials for the HQAP project.

Speaker:

Francine Markowitz

Gathering Documentation

● Reading Evidence Tables

● Writing Evidence Tables

● ELA Practice Items

● Item Guidelines

● PARCC Accessibility Features and Accommodations Manual

● Form Specifications

● Performance Level Descriptors

● Task Models

● Cognitive Complexity One-Pager

Providing Realistic Exemplars for Review

● Provided 26 exemplar items from accommodated forms to illustrate the range of accessibility features

● Item selection process – PARCC Inc. staff worked with state experts in PARCC’s AAF working group to select items that showed how the accessibility and accommodation features interacted

● Accommodations included: text-to-speech (TTS) items, ASL videos, and screen reader items


Exemplar Accessibility Features

● Answer Masking*

● Bookmark

● Color Contrast (Background/Font Color)*

● Eliminate Answer Choices

● Highlight Tool

● Line Reader Tool

● Magnification

● Pop-up Glossary

● Notepad

● Writing Tools (on writing items)

*Available to all participating students but must be identified in advance

Exemplar of an Accommodated ASL Item

● http://parcc.pearson.com/practice-tests/english/


Requested Information to Support Exemplars

● Content standard/construct addressed

● What the accommodation/accessibility feature is and how it differs from the non-accommodated version

● Instructions for administration/use

● Conditions under which the feature is available

● Process by which the feature is approved for use by a student

● Why it is fair in relation to the focal construct and intended score interpretation

● How it relates to the assessment program’s documentation on fairness and item specifications

● Any other salient aspect of the exemplar that reviewers need to know


A Reviewer’s Lens

Speaker:

Audrey Lesondak

Review Process: Form and Function


Picture credits: https://en.wikipedia.org/wiki/Blind_men_and_an_elephant & https://en.wikipedia.org/wiki/If_You_Give_a_Mouse_a_Cookie

Form

● The quantity of material made the “light touch” review challenging in the time allotted

● A refined organizational template with more structure could help reviewers prioritize tasks

● Collectively calibrating to a single assessment, although time intensive, was critical to establishing a common language among reviewers

● Peer collaboration was beneficial


Function

● Evidence often showed that UDL principles were incorporated into the design of items, but it was unclear whether UDL drove design decisions

● Providers showed evidence of accessibility features much farther along the UDL continuum than their legacy assessment counterparts


Evaluation Criteria

● A framework to guide what UDL looks like within the new online platform was beneficial

● As with reviewers, providers might find this process a useful road map for focusing their work on UDL features


Results & Potential Uses

Speakers:

Hillary Michaels

Francine Markowitz

Audrey Lesondak

Debbie Taub

First Implementation Results

Narrative results:

● Light touch review

● Rating criteria were specific and demanded substantial evidence to “meet” the requirement

● Scoring considerations


PARCC Narrative Results: Strengths Identified

● Program incorporates accessibility features that are available to all students and offers several test administration considerations for any student

● Sensitivity to the design of item types that reflect the individual needs of students with disabilities

● Strong research base and inclusion of existing research on ELs

● Wide range of accommodations for SWDs and ELs

● Valid and appropriate accommodations based on current research


PARCC Narrative Results: Areas for Improvement

● Research is needed to determine whether accessibility features and accommodations alter the constructs being measured

● Clearer documentation may be needed regarding how PARCC administers multiple features simultaneously and the implications of multiple accessibility features for student performance


PARCC: Usefulness and Value of the Findings

● What was gained?

– A valiant attempt to study accessibility features on assessments; PARCC appreciates/supports the effort to shed light on this important topic

● Anything that could be changed/improved?

– Guidance for providing exemplars

– More specific, actionable feedback

● What does PARCC plan to do with the feedback gained from the study?

– Shared with State Leads and PARCC working groups

– Will determine whether adjustments should be made to PARCC policies or the research agenda


Reviewer: Usefulness and Value of Findings

● The evaluation process can serve as a framework for collecting evidence from vendors on how UDL was used within the test development process

● Criteria can be built into Requests for Proposals (RFPs) to allow users to more readily recognize accessibility and UDL principles across assessments

● With some adjustments, the framework can provide a means for test producers and accompanying vendors to package offerings for users


What Do We Do with this Information?

● What are next steps?

● What needs to happen?


What Did We Learn?

● UDL needs to be addressed from the onset to drive test design

● Computer-administered tests had more features

● There is no single list of what is required in terms of accommodations and accessibility features

– State policies

– Different definitions and different implementation rules for similar terms

– Changing environment

– Reliance on “what is commonly done” and expert judgment

– Research needs to catch up

• Which features are the most effective?

• Which are the easiest to provide? Do they level the playing field for students using them?


What Do We Still Need to Know?

● Features (UDL, Accessibility, and Accommodations)

– Which features do not interfere with the item construct (and thus should be available to anyone)? How do we know?

– What barriers can we reduce or remove if we plan for features and UDL from the beginning?


How Should/Could Criteria Be Used?

● Provide a foundation that could be considered when:

– Developing new tests

– Evaluating tests

– Writing RFPs


Thank you