annual technical report - oh.portal.cambiumast.com...chapter 9 describes the procedures used to...

Annual Technical Report

Ohio’s State Tests in English Language

Arts, Mathematics, Science, and Social

Studies

2017–2018 School Year

September 2018

Ohio Department of Education i American Institutes for Research

OHIO STATEWIDE ASSESSMENT

OHIO’S STATE TESTS (OST)

ELA GRADES 3 THROUGH 8, HIGH SCHOOL ELA I, AND ELA II

MATHEMATICS GRADES 3 THROUGH 8, HIGH SCHOOL ALGEBRA, GEOMETRY,

INTEGRATED MATHEMATICS I, AND INTEGRATED MATHEMATICS II

SCIENCE GRADE 5, GRADE 8, BIOLOGY, AND PHYSICAL SCIENCE

SOCIAL STUDIES AMERICAN HISTORY AND AMERICAN GOVERNMENT

2017–2018 ANNUAL TECHNICAL REPORT

SEPTEMBER 2018

Prepared by American Institutes for Research (AIR) in collaboration with the Ohio Department of Education (ODE)

Ohio Department of Education i American Institutes for Research

TABLE OF CONTENTS

1. Introduction: The Validity of OST Test Score Interpretations ................................................................................ 5 1.1 Overview ......................................................................................................................................................... 5 1.2 Validity Evidence ............................................................................................................................................. 6

1.2.1 Evidence Based on Test Content ..................................................................................................... 10 1.2.2 Evidence for Performance Standard Interpretation of Test Scores ................................................ 12 1.2.3 Evidence Based on Internal Structure ............................................................................................. 15 1.2.4 Measurement Invariance Across Subgroups ................................................................................... 18 1.2.5 Test Integrity Forensics ................................................................................................................... 20 1.2.6 Summary of Validity of Test Score Interpretations ......................................................................... 23

2. Background of Ohio Computer-Based Assessments............................................................................................ 24 2.1 Background of ELA and Mathematics Assessments ...................................................................................... 24 2.2 Background of Science and Social Studies Assessments ............................................................................... 24 2.3 OST Test Design ............................................................................................................................................. 25

3. Summary of Fall 2017 Operational Test Adminstration ...................................................................................... 29 3.1 Student Population and Participation ........................................................................................................... 29 3.2 Summary of Overall Student Performance for Fall 2017 .............................................................................. 30 3.3 Student Performance by Subgroup for Fall 2017 .......................................................................................... 31 3.4 Reliabity for Fall 2017 .................................................................................................................................... 34

3.4.1 Internal Consistency ........................................................................................................................ 34 3.4.2 Standard Error of Measurement ..................................................................................................... 35 3.4.3 Student Classification Reliability ..................................................................................................... 40 3.4.4 Classification Accuracy .................................................................................................................... 40 3.4.5 Classification Consistency ............................................................................................................... 41 3.4.6 Classification Accuracy and Consistency Estimates ........................................................................ 41 3.4.7 Reliability for Subgroups in the Population .................................................................................... 42 3.4.8 Reliability for Subscales................................................................................................................... 44 3.4.9 Subscale Intercorrelation ................................................................................................................ 46

4. Summary of Spring 2018 Operational Test Adminstration .................................................................................. 49 4.1 Student Population and Participation ........................................................................................................... 50 4.2 Summary of Overall Student Performance for Spring 2018 .......................................................................... 51 4.3 Student Performance by Subgroup for Spring 2018 ..................................................................................... 53 4.4 Classical Item Analysis ................................................................................................................................... 58 4.5 Item Response Theory Analysis ..................................................................................................................... 59 4.6 Reliability for Spring 2018 ............................................................................................................................. 63

4.6.1 Internal Consistency ........................................................................................................................ 63 4.6.2 Standard Error of Measurement ..................................................................................................... 65 4.6.3 Student Classification Reliability ..................................................................................................... 70 4.6.4 Classification Accuracy .................................................................................................................... 71 4.6.5 Classification Consistency ............................................................................................................... 71 4.6.6 Classification Accuracy and Consistency Estimates ........................................................................ 72 4.6.7 Reliability for Subgroups in the Population .................................................................................... 73 4.6.8 Reliability for Subscales................................................................................................................... 77

4.7 Subscale Intercorrelations ............................................................................................................................. 80 4.8 Rater Agreement ........................................................................................................................................... 85

5. Item Development and Test Construction........................................................................................................... 86 5.1 Item Development Process ........................................................................................................................... 86 5.2 Machine-Scored Constructed-Response Item Development Tools............................................................... 87 5.3 Item Types ..................................................................................................................................................... 87 5.4 Item Review ................................................................................................................................................... 88

6. Field Testing ......................................................................................................................................................... 91

Ohio Department of Education ii American Institutes for Research

6.1 Item Statistics ................................................................................................................................................ 92 6.1.1 Classical Statistics ............................................................................................................................ 92 6.1.2 IRT Statistics .................................................................................................................................... 93 6.1.3 Analysis of Differential Item Functioning ........................................................................................ 93

6.2 Data Review Summary .................................................................................................................................. 95 6.3 Test Construction .......................................................................................................................................... 95

6.3.1 Operational Form Construction ...................................................................................................... 96 6.3.2 Assembling Test Forms ................................................................................................................... 97 6.3.3 Embedded Field-Test Slots .............................................................................................................. 98

7. Test Administration ........................................................................................................................................... 100 7.1 Eligibility ...................................................................................................................................................... 100 7.2 Administration Procedures ......................................................................................................................... 100 7.3 Accomodations ............................................................................................................................................ 102 7.4 Test Security ................................................................................................................................................ 109

8. Reporting and Interpreting OST Scores ............................................................................................................. 112 8.1 Appropriate Uses for Scores and Reports ................................................................................................... 112 8.2 Reports Provided ......................................................................................................................................... 113

8.2.1 Online Reporting System for Educators ........................................................................................ 114 8.3 Interpretation of Scores .............................................................................................................................. 121

8.3.1 Scale Scores ................................................................................................................................... 121 8.3.2 Performance Standards ................................................................................................................ 123 8.3.3 Performance-Level Descriptors ..................................................................................................... 123

9. Performance Standards ..................................................................................................................................... 124 9.1 Standard Setting Procedures ....................................................................................................................... 124

9.1.1 Performance-Level Descriptors ..................................................................................................... 125 9.2 Recommended Performance Standards ..................................................................................................... 126 9.3 OST Transformations and Rounding Rules .................................................................................................. 132

9.3.1 Rules for Transforming the Within-Grade Theta to the OST Scale ............................................... 132 9.3.2 OST Rounding Rules ...................................................................................................................... 133 9.3.3 Rules for Overall Performance Level Classification ....................................................................... 133 9.3.4 OST Subscale Performance Classification ..................................................................................... 134

10. Scaling And Equating ......................................................................................................................................... 135 10.1 Item Response Theory Procedures ............................................................................................................. 135

10.1.1 Calibration of OST Item Banks....................................................................................................... 135 10.1.2 Estimating Student Ability Using Maximum Likelihood Estimation .............................................. 136

10.2 OST Reporting Scale (Scale Scores) ............................................................................................................. 138 10.3 Equating Paper-Pencil and Online Test Scores ............................................................................................ 140

11. Constructed-Response Scoring .......................................................................................................................... 142 11.1 Machine-Scoring .......................................................................................................................................... 142

11.1.1 Explicit Rubrics .............................................................................................................................. 142 11.1.2 Essay Autoscoring ......................................................................................................................... 142

11.2 Handscoring ................................................................................................................................................. 144 11.2.1 Rangefinding ................................................................................................................................. 144 11.2.2 Developing Training Materials After Rangefinding ....................................................................... 145 11.2.3 Scoring Guides with Anchor Responses ........................................................................................ 145 11.2.4 Training Sets .................................................................................................................................. 145 11.2.5 Operational Training and Qualifying Materials ............................................................................. 145 11.2.6 Handscoring Procedures ............................................................................................................... 146 11.2.7 Training of Scorers ........................................................................................................................ 147 11.2.8 Monitoring and Maintaining Quality Control ................................................................................ 147 11.2.9 Handling Unusual Responses and Disturbing Responses .............................................................. 148

12. Quality Control Procedures ............................................................................................................................... 149 12.1 Quality Assurance in Test Construction ...................................................................................................... 149

Ohio Department of Education iii American Institutes for Research

12.2 Quality Assurance in Test Production ......................................................................................................... 151 12.2.1 Production of Content .................................................................................................................. 151 12.2.2 Web Approval of Content During Development ........................................................................... 152 12.2.3 Approval of Final Forms ................................................................................................................ 152 12.2.4 Packaging ...................................................................................................................................... 152 12.2.5 Platform Review ............................................................................................................................ 152 12.2.6 User Acceptance Testing and Final Review ................................................................................... 153 12.2.7 Functionality and Configuration ................................................................................................... 153

12.3 Quality Assurance in Document Processing ................................................................................................ 154 12.3.1 Scanning Accuracy ......................................................................................................................... 154 12.3.2 Quality Assurance in Editing and Data Input................................................................................. 154

12.4 Quality Assurance in Data Preparation ....................................................................................................... 155 12.5 Quality Assurance in Test Form Equating .................................................................................................... 156 12.6 Quality Assurance in Scoring and Reporting ............................................................................................... 156

12.6.1 Quality Assurance in Handscoring ................................................................................................ 156 12.6.2 Quality Assurance for Score Reporting ......................................................................................... 159 12.6.3 Quality Assurance for Test Scoring ............................................................................................... 161 12.6.4 Reporting ...................................................................................................................................... 164

Ohio Department of Education iv American Institutes for Research

APPENDICES

Appendix A Global Model Fit ..................................................................................................................................... A-1 Appendix B Test Integrity Forensics Report............................................................................................................... B-1 Appendix C Number of Students Participating by Test Mode ................................................................................... C-1 Appendix D Test Score Frequency Distributions ....................................................................................................... D-1 Appendix E Operational Bank Parameters ................................................................................................................. E-1 Appendix F Ability Measures at Raw Score Cuts ........................................................................................................ F-1 Appendix G Raw-to-Scale Score Conversion Tables .................................................................................................. G-1 Appendix H Rater Agreement Rates ......................................................................................................................... H-1 Appendix I Test Characteristics Curve Graphs ............................................................................................................. I-1 Appendix J Test Administrator User Guide ................................................................................................................. J-1 Appendix K Ohio Accessibility Manual .......................................................................................................................K-1 Appendix L ELA Writing Prompts Scoring Rubric Summary ........................................................................................ L-1

Ohio Department of Education 5 American Institutes for Research

1. INTRODUCTION: THE VALIDITY OF OST TEST SCORE INTERPRETATIONS

1.1 OVERVIEW

The purpose of this technical report is to document the evidence supporting the claims made for how Ohio’s State Tests (OST) scores may be interpreted. Evidence for the validity of test score interpretations is central to claims that OST test scores can be used to evaluate the effectiveness with which Ohio districts and schools teach students Ohio’s Learning Standards and whether individual students have achieved those standards by the end of each school year. Thus, the report begins with a review of validity evidence evaluated to date. Evidence for the validity of test score interpretations is expected to accrue over time, and this section will be expanded as further evidence is gained.

Chapter 2 of the report describes the design and development of OST assessments, including Ohio’s Learning Standards which define the content domain to be assessed by OST; the development of test specifications, including blueprints, that ensure the breadth and depth of the content domain is adequately sampled by the assessments; as well as test development procedures that ensure alignment of test forms with the blueprint specifications.

Chapters 3 and 4 present results of the fall 2017 and spring 2018 OST test administrations, respectively. The fall administration is limited and includes only the grade 3 ELA assessment and the high school end-of-course (EOC) tests in ELA, mathematics, science, and social studies. The full OST assessment system administered in spring includes end-of-year assessments in ELA and mathematics for grades 3–8, as well as end-of-year assessments in science at grades 5 and 8. The spring assessments also include the high school EOC tests in ELA (grade 10 and grade 11), mathematics (Algebra I, Geometry, Integrated Mathematics I, and Integrated Mathematics II), science (Physical Science and Biology), and social studies (American Government and American History). These chapters provide summaries of the test-taking student population and their performance on the assessments. In addition, these chapters describe administration-specific evidence for the reliability of OST assessments, including internal consistency reliability, standard errors of measurement, and the reliability of performance level classifications.

The remaining chapters document technical details of the test development, administration, scoring, and reporting activities. Chapter 5 describes the item development process and especially the sequence of reviews that each item must pass through before being eligible for OST test administration. This chapter also describes the procedures for constructing test forms from items successfully passing through the review process.

Chapter 6 documents the test administration procedures, including eligibility of participation in OST assessments, testing conditions, including accessibility tools and accommodations, systems security for assessments administered online, as well as test security procedures for all test administrations.

A description of the score reporting system and the interpretation of test scores is provided in Chapter 7. Chapter 8 describes the procedures that ODE used to identify and adopt performance standards for OST assessments, and Chapter 9 describes the procedures used to scale and equate OST assessments for scoring and reporting.

Chapter 10 describes the procedures for scoring constructed-response items, both machine-scored and handscored items, and provides summary rater agreement results. Finally, Chapter 11 provides an overview of the quality assurance processes described throughout that are used to ensure that all test development, administration, scoring, and reporting activities are conducted with fidelity to the developed procedures.


1.2 VALIDITY EVIDENCE

Validity refers to the degree to which test score interpretations are supported by evidence and speaks directly to the legitimate uses of test scores. Establishing the validity of test score interpretations is thus the most fundamental component of test design and evaluation. The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014) provide a framework for evaluating whether claims based on test score interpretations are supported by evidence. Within this framework, the standards describe the range of evidence that may be brought to bear to support the validity of test score interpretations.

The types of evidence required to support the validity of test score interpretations depend centrally on the claims made for how test scores may be interpreted. Moreover, the standards prove that validity is an attribute not of tests but rather of test score interpretations. Some test score interpretations may be supported by validity evidence while others are not. The validity of the test itself is not considered, rather the validity of the intended interpretation and use of test scores is evaluated.

OST assessments are designed to measure the degree to which students have achieved the academic learning standards defined by Ohio’s Learning Standards. The evidence presented here focuses on the validity of test score and performance level interpretations about student achievement of Ohio’s Learning Standards. There are a number of intended uses for Ohio’s State Test (OST scores, including school accountability, feedback about student and class performance, measurement of student growth over time, evaluation of performance gaps between groups, evaluation of teacher performance, and diagnosis of individual student strengths and weaknesses. Each intended use requires claims to be made about the interpretation of test scores, and the strength of those claims rests on the validity evidence supporting those claims. Some validity evidence will be central to all of the claims, especially evidence for the alignment of test items and administrations to Ohio’s Learning Standards. Other evidence may target more specific claims, such as evidence for measurement of student growth or evaluation of teacher performance. Evaluation of validity evidence should therefore be made with respect to the claim that it is purported to support.

Determining whether the test measures the intended construct is central to evaluating the validity of test score interpretations. Such an evaluation in turn requires a clear definition of the measurement construct. For OST assessments, the definition of the measurement construct is provided by Ohio’s Learning Standards.

Ohio’s Learning Standards specify what students should know and be able to do by the end of each grade level, or by end-of-course for high school, in order to graduate ready for post-secondary education or entry into the workforce.1 Ohio first adopted learning standards in 2001, recognizing that learning standards would continue to be revised over time. The Ohio State Board of Education adopted Ohio’s Learning Standards for ELA and mathematics in 2010 as part of a multi-state effort. In 2010, the Ohio State Board of Education also adopted new, more rigorous, science and social studies standards. Ohio’s Learning Standards for ELA, mathematics, science, and social studies describe the educational targets for students in each subject area.

Because directly measuring student achievement against each benchmark in Ohio’s Learning Standards would result in an impractically long test, each test administration is designed to measure a representative sample of the content domain defined by Ohio’s Learning Standards.2 To ensure that each student is assessed on the intended breadth and depth of Ohio’s Learning Standards, item selection is guided by a set of test specifications, or blueprints, that indicate

1 Standard 1.1 – The test developer should clearly set forth how test scores are intended to be interpreted and consequently used. The population(s) for which a test is intended should be delimited clearly, and the construct or constructs that the test is intended to assess should be described clearly. 2 Standard 4.0 – Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population.


the number of items that should be sampled from each content strand, standard, and benchmark.3 Thus, the test blueprints represent a policy statement about the relative importance of content strands and standards in addition to meeting important measurement goals (e.g., sufficient items to report strand performance levels reliably). Because the test blueprint determines how student achievement of Ohio’s Learning Standards is evaluated, alignment of test blueprints with the content standards is critical. ODE has published the OST test blueprints that specify the distribution of items across reporting categories.

The principles of universal design of assessments provide guidelines for test design to minimize the impact of construct-irrelevant factors in assessing student achievement.4 Universal design removes barriers to access for the widest range of students possible. Seven principles of universal design are applied in the process of test development (Thompson, Johnstone, & Thurlow, 2002):

• Inclusive assessment population

• Precisely defined constructs

• Accessible, non-biased items

• Amenable to accommodations

• Simple, clear, and intuitive instructions and procedures

• Maximum readability and comprehensibility

• Maximum legibility

Test development specialists receive extensive training on the principles of universal design and apply these principles in the development of all test materials, including items and accompanying stimuli. In the review process, adherence to the principles of universal design is verified.

In addition, the OST test delivery system provides a range of accessibility tools and accommodations for reducing construct-irrelevant barriers to accessing test content for virtually all students. 5 The range of accommodations provided in the online testing environment far exceed the typical accommodations made available in paper-based test administrations, which were typically limited to large print, Braille, and English and foreign language audio translations. Exhibit 1.2.1 lists the accommodations and accessibility supports currently available for OST assessments.

3 Standard 4.1 – Test specifications should describe the purpose(s) of the test, the definition of the construct or domain measured, the intended examinee population, and interpretations for intended uses. The specifications should include a rationale supporting the interpretations and uses of test results for the intended purpose(s). 4 Standard 3.0 – All steps in the testing process, including test design, validation, development, administration, and scoring procedures, should be designed in such a manner as to minimize construct-irrelevant variance and to promote valid score interpretations for the intended uses for all examinees in the intended population. 5 Standard 3.1 – Those responsible for test development, revision, and administration should design all steps of the testing process to promote valid score interpretations for intended score uses for the widest possible range of individuals and relevant subgroups in the intended population. Standard 3.2 – Test developers are responsible for developing tests that measure the intended construct and for minimizing the potential for tests to be affected by construct-irrelevant characteristics, such as linguistic, communicative, cognitive, cultural, physical, or other characteristics. Standard 12.3 – Those responsible for the development and use of educational assessments should design all relevant steps of the testing process to promote access to the construct for all individuals and subgroups for whom the assessment is intended.


Exhibit 1.2.1: Accommodations and Accessibility Supports

Accessibility Feature Description

Text-to-Speech—Directions, Passages,

Items

Computer reads text and graphics aloud on directions, passages, and items. What is read and how it is read is configurable.

Text-to-Speech—Graphic Description

Computer reads graphics and tables aloud.

Magnification Interface

Student can zoom in and zoom out on the entire page. This capability persists throughout the test.

Magnifier Student can magnify a selected portion of an item.

Variable Font Size The number of levels (generally, five levels) and rate of increase (generally, 1.25x the previous level) are configurable.

Refreshable Braille/ Tactile With External

Embosser Printer

Items can be rendered to desktop embossers that can integrate Braille and tactile graphics. The items are simultaneously rendered on a reader-accessible screen, and the student can navigate to response spaces to provide answers.

Reverse Contrast Background is black, while text is white.

Administrator- Selectable Variable

Font and Background Colors

Any foreground and background color can be supported.

Color Overlay Any color can be laid on the screen. This persists throughout the test.

Increased White Space

This is the streamlined interface.

Sign Language—Directions, Passages,

Items

This capability consists of recorded videos using American Sign Language. Experts on hearing impairment do not recommend avatars because they do not translate well to American Sign Language.

Translations Versions are available in alternate languages.

Keyword Translation This enables translators to associate keyword translations.

Glossaries and Dictionaries

These enable content developers to associate additional content with words or phrases. The content can comprise multiple types, and the content shown to a student can be controlled by his or her personal profile.

Alternate Language Glossaries and

Dictionaries

These enable content developers to associate alternate-language content with words or phrases. The content can comprise multiple types, and the content shown to a student can be controlled by his or her personal profile.



Administrator- Selectable Assistive Devices Integration

Our system has a standard interface and a streamlined interface. Most assistive devices can work with the former, and an even wider group works with the latter. If the use of the device requires relaxation of certain security features (e.g., if suppression of pop-up windows interferes with on-screen keyboards), the system can be configured to allow the test administrator to select a more permissive mode.

Line Reader This feature allows a student to track the line he or she is reading.

Masking Students can mask extraneous information on the screen.

Speech-to-Text Speech is converted to text and then saved in the database. (Available through compatibility with third-party assistive technology.)

Auditory Calming This enables music or white noise to be played in the background. (Available through third-party software.)

Administrator-Selectable Zoom

Default font size can be set in advance through a file upload or user interface or at the time of testing by the test administrator. Student can zoom in or zoom out at any time.

Administrator-Selectable Large

Print Font

Default font size can be set in advance through a file upload or user interface or at the time of testing by the test administrator. Student can zoom in or zoom out at any time.

Administrator-Selectable Screen-

Reader

The system supports an integrated screen reader that can be configured to provide a variety of support levels, each selectable by the test administrator.

Additional Time

AIR’s system currently does not impose a time limit on the test. It is up to the proctor to stop a student’s test or stop the entire session. However, if there are unforeseen events, such as a fire alarm, that trigger need for additional testing time, AIR’s system can enable a grace period extension (GPE) for a single test opportunity or for multiple test opportunities.

Segment Breaks

AIR’s system has the capability of adding test segments within a test. A test segment is made up of multiple item groups and creates a logical break between segments within a test. For example, a segment break might separate a calculator from a non-calculator segment of a test.

Recorded Audio AIR’s system efficiently delivers recorded audio. We are able to deliver voice-audio using only about 10 Kbps of bandwidth.

Secure Print Facility

A visual accessibility feature, the secure print facility allows the secure printing of items or passages. A student requests that a passage or item be printed; the request is then encrypted and sent securely to the proctor; the proctor approves the request before it is sent to the printer. In addition, this feature also allows for the delivery of real-time paper-pencil tests, including large print tests.



Test Pauses and Restarts

An attention accessibility feature, test pauses and restarts allow the test to be paused at any time and restarted and taken over many days. So that security is not compromised, visibility on past items is not allowed when the test has been paused longer than a specified period of time.

Writing Checklist An attention accessibility feature generally used for essay items, the writing checklist enables a student to check off writing guidelines from a checklist.

Review Test Students can review the test before ending it.

Area Boundaries An agility accessibility feature, area boundaries for mouse-clicking multiple-choice options allow students to click anywhere on the selected-response text or button.

Language Any language that is necessary can be supported.

Help Section A reference feature, the Help Section explains how the system and its tools work.

Performance Report A reference feature, a performance report is available at the end of the test for the student.

1.2.1 EVIDENCE BASED ON TEST CONTENT

Because OST assessments are designed to measure student progress toward achievement of Ohio’s Learning Standards, the validity of OST test score interpretations critically depend on the degree to which test content is aligned with expectations for student learning specified in the academic standards.6

Alignment of Ohio’s Learning Standards is achieved through a rigorous test development process that proceeds from the learning standards and refers back to those standards in a highly iterative test development process that includes ODE, test developers, and educator and stakeholder committees.

In addition to ensuring that test items are aligned with their intended learning standards, each assessment is intended to measure a representative sample of the knowledge and skills identified in the standards. Test blueprints specify the range with which each of the content strands and standards will be covered in each test administration.7 Thus, the test blueprints represent a policy document specifying the relative importance of content strands and standards in addition to meeting important measurement goals (e.g., sufficient items to report strand performance levels reliably). Because the test blueprint determines how student achievement of Ohio’s Learning Standards is evaluated, alignment of test blueprints with the learning standards is critical.

With the desired alignment of test blueprints to Ohio’s Learning Standards, alignment of test forms to the learning standards becomes a mechanical, although sometimes difficult, task of developing test forms that meet the

6 Standard 12.4 – When a test is used as an indicator of achievement in an instructional domain or with respect to specified content standards, evidence of the extent to which the test samples the range of knowledge and elicits the processes reflected in the target domain should be provided. Both the tested and the target domains should be described in sufficient detail for their relationship to be evaluated. The analyses should make explicit those aspects of the target domain that the test represents, as well as those aspects that the test fails to represent. 7 Standard 4.1 – Test specifications should describe the purpose(s) of the test, the definition of the construct or domain measured, the intended examinee population, and interpretations for intended uses. The specifications should include a rationale supporting the interpretations and uses of test results for the intended purpose(s).


blueprints. Developing test forms is difficult because test blueprints can be highly complex, specifying not only the range of items and points for each strand and standard, but also cross-cutting criteria such as distribution across item types, writing genre, and so on. Also, in addition to meeting complex blueprint requirements, test developers must work to meet psychometric goals so that alternate test forms measure equivalently across the range of ability.

Following a standard item-review process, item reviews proceed initially through a series of internal reviews before items are eligible for review by ODE content experts. Most of AIR’s content staff members, who are responsible for conducting internal reviews, are former classroom teachers who hold degrees in education and/or their respective content areas. Each item passes through four internal review steps before it is eligible for review by ODE. Those steps include the following:

• Preliminary Review, in which the item is reviewed by a group of AIR content area experts

• Content Review 1, in which the item is reviewed by an AIR content specialist

• Editorial Review, in which a copyeditor checks the item for correct grammar/usage

• Senior Content Review, in which the item is reviewed by the lead content expert.

At every stage of the item-review process, beginning with preliminary review, AIR’s test developers analyze each item to ensure the following:

• The item is well-aligned with the intended learning standard.

• The item conforms to the item specifications for the target being assessed.

• The item is based on a quality idea (i.e., it assesses something worthwhile in a reasonable way).

• The vocabulary used in the item is appropriate for the intended grade/age and subject matter, and takes

into consideration language accessibility, bias, and sensitivity.

• The item content is accurate and straightforward.

• Any accompanying graphic and stimulus materials are actually necessary to answer the question.

• The item stem is clear, concise, and succinct, meaning that it contains enough information to know what is

being asked, it is stated positively (and does not rely on negatives such as no, not, none, or never unless

absolutely necessary), and it ends with a question.

• For selected-response items, the set of response options are succinct; parallel in structure, grammar,

length, and content; sufficiently distinct from one another; and all plausible, all non-keyed response options

are unambiguously incorrect.

• There is no obvious or subtle cluing within the item.

• The score points for constructed-response items are clearly defined.

• For machine-scored constructed-response (MSCR) items, that item responses yield the intended score

points and rationales based on the rubric.

• For handscored constructed-response items, the scoring rubric clearly explains what characterizes

responses at each possible level of achievement.

In addition, rubric-scored items, both machine-scored and handscored, are validated following field-test administration. Machine-scored items go through a rubric validation process wherein samples of student responses are reviewed, along with resulting scores, to ensure that rubrics are enacted as intended. This process is described in Section 10.1. Handscored items go through a range-finding process prior to scoring where samples of item responses are used to create scorer-training materials and ensure that the scoring rubric is appropriate, as described in Section 10.2.

Based on their review of each item, the test developer may have accepted the item and classification as written, revised the item, or rejected the item outright.


Items passing through the internal review process were sent to ODE for their review. At this stage, items may have been further revised based on any edits or changes requested by ODE, or rejected outright. Items passing through the ODE review level then had to pass through two stakeholder reviews in which committees of Ohio educators and stakeholders review each item’s accuracy, alignment to the intended standard and DOK level, as well as item fairness and language sensitivity. Thus, all items considered for inclusion in the OST item pools were initially reviewed by the following committees.

• A content advisory committee checked to ensure that each item is

o aligned to Ohio’s Learning Standards;

o appropriate for the grade level;

o accurate; and

o presented online in a way that is clear and appropriate.

• A fairness and sensitivity committee checked to ensure that each item and any associated stimulus

materials were free from bias, sensitive issues, controversial language, stereotyping, and statements that

reflect negatively on race, ethnicity, gender, culture, region, disability, or other social and economic

conditions and characteristics.

Items successfully passing through this committee review process were then field tested to ensure that the items behaved as intended when administered to students. Despite conscientious item development, some items perform differently than expected when administered to students. Using the item statistics computed following field testing to review item performance is an important step in constructing equivalent operational test forms that support valid inferences.

Classical item analyses ensure that items function as intended with respect to the underlying scales. Classical item statistics are designed to evaluate the item difficulty and the relationship of each item to the overall scale (item discrimination) and to identify items that may exhibit a bias across subgroups (differential item functioning analyses).

Items flagged for review based on their statistical performance have to pass a three-stage review to be included in the final item pool from which operational forms are created. In the first stage of this review, a team of psychometricians review all flagged items to ensure that the data are accurate and properly analyzed, response keys are correct, and there are no other obvious problems with the items.

ODE then reconvened their content review and fairness and sensitivity committees to re-evaluate flagged field-test items in the context of each item’s statistical performance. Based on their review of each item’s performance, the content review and fairness and sensitivity committees could recommend that flagged items be rejected or deem the item eligible for inclusion in operational test administrations.

1.2.2 EVIDENCE FOR PERFORMANCE STANDARD INTERPRETATION OF TEST SCORES

Alignment of test content to Ohio’s Learning Standards ensures that test scores can serve as valid indicators of the degree to which students have achieved the learning expectations detailed in the standards. However, the interpretation of the OST test scores rests fundamentally on how test scores relate to performance standards, which define the extent to which students have achieved the expectations defined in the standards. OST test scores are reported with respect to five proficiency levels, indicating the degree to which Ohio students have achieved the learning expectations defined by Ohio’s Learning Standards. The cut score establishing the Proficient level of performance is the most critical, since it indicates that students are meeting grade-level expectations for achievement of Ohio’s Learning Standards and that they are prepared to benefit from instruction at the next grade level. The Accelerated level is also of critical importance because performance at this level is intended to indicate


that students are on track to pursue post-secondary education. Procedures used to adopt performance standards for OST assessments are therefore central to the validity of test score interpretations.8

Following the first operational administration of the science and social studies assessments in spring 2015, a standard-setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards for reporting student achievement of Ohio’s Learning Standards. In December 2015, a standard-setting workshop was conducted to recommend the performance standards in ELA and mathematics. For each of the standard-setting workshops, a technical report was produced that describes the standardized and rigorous procedures that Ohio educators, serving as standard-setting panelists, followed to recommend performance standards. The workshops employed the Bookmark procedure, a widely used method in which standard-setting panelists use their expert knowledge of the academic content standards and student achievement to map the performance level descriptors adopted by the Ohio State Board of Education onto an ordered-item booklet based on the first operational test forms administered to students in spring 2015 for science and social studies, and in fall 2015 and spring 2016 for ELA and mathematics.9

Panelists were also provided with contextual information to help inform their primarily content-driven performance standard recommendations. Panelists recommending performance standards were provided with the approximate location of performance standards from other statewide assessment systems, including the Partnership for Assessments of Readiness for College and Careers (PARCC) and Smarter Balanced. Panelists recommending performance standard for the grades 3–8 summative assessments were provided with the approximate location of relevant National Assessment of Educational Progress (NAEP) performance standards at grades 4 and 8, as well as interpolated NAEP standards for grade 6. High school end-of-course panelists were also informed of the approximate location of the ACT college-ready cut scores for appropriate subject area assessments. Panelists were asked to consider these benchmark locations when making their content-based performance standard recommendations. When panelists are able to use benchmark information to locate performance standards that converge across assessment systems, validity of test score interpretations is bolstered.

In addition, panelists were provided with feedback about the vertical articulation of their recommended performance standards so that they could view how the locations of their recommended performance standards for each grade-level assessment sat in relation to the performance standard recommendations at the other grade levels. This approach allowed panelists to view their performance standard recommendations as a coherent system of performance standards, which further reinforces the interpretation of test scores as indicating not only achievement of current grade-level standards, but also preparedness to benefit from instruction in the subsequent grade level.

Based on the recommended performance standards, Exhibits 1.2.2.1 show the estimated percentage of students meeting or exceeding the OST assessments’ proficient and accelerated performance standards for each of the ELA and mathematics assessments, while Exhibit 1.2.2.2 shows the percentage of students expected to meet or exceed the OST assessments’ proficient and accelerated standards for the science and social studies assessments. Exhibit 1.2.2.1 and 1.2.2.2 also show the approximate percentage of Ohio students that would be expected to meet the relevant ACT college-ready standard, and the percentage of Ohio students meeting the NAEP proficient standards at grades 4 and 8, and as interpolated at grade 6. Exhibit 1.2.2.1 also presents the estimated percent of Ohio students meeting the PARCC and Smarter Balanced proficient standard. Exhibit 1.2.2.2 provides the estimated percentage of students nationally scoring at or above the highly proficient standard for TIMSS. As the exhibits indicate, the recommended OST performance standards are quite consistent with relevant ACT college-ready standards, and the NAEP and Smarter Balanced proficient benchmarks. Moreover, because the performance standards were vertically articulated, the proficiency rates across grade levels are generally consistent.

8 Standard 4.22 – Test developers should specify the procedures used to interpret test scores and, when appropriate, the normative or standardization samples or the criterion used. 9 Standard 1.18 – When it is asserted that a certain level of test performance predicts adequate or inadequate criterion performance, information about the levels of criterion performance associated with given levels of test scores should be provided.


Exhibit 1.2.2.1: Percentage of Students Meeting OST and Benchmark Proficient Standards — ELA and

Mathematics

Grade / Course OST

Proficient OST

Accelerated ACT College

Ready NAEP

Proficient PARCC

Proficient SBAC Meets

ELA

Grade 3 56 36 -- -- -- 52

Grade 4 54 33 -- 37 71 55

Grade 5 57 36 -- -- 66 58

Grade 6 58 37 -- 37 69 54

Grade 7 55 32 -- -- 68 57

Grade 8 55 28 -- 36 64 57

ELA I 53 24 -- -- 71 63

ELA II 52 25 37 -- 72 63

Mathematics

Grade 3 66 36 -- -- 66 58

Grade 4 65 37 -- 45 65 54

Grade 5 65 34 -- -- 65 49

Grade 6 62 31 -- 40 63 45

Grade 7 61 36 -- -- 63 49

Grade 8 63 32 -- 35 53 50

Algebra I 58 36 -- -- 65 47

Geometry 59 38 31 -- -- 48

Integrated Math I 58 36 -- -- 58 47

Integrated Math II 56 36 32 -- -- 45

Exhibit 1.2.2.2: Percentage of Students Meeting OST and Benchmark Proficient Standards — Science and Social

Studies

Grade / Course OST

Proficient OST

Accelerated ACT College

Ready NAEP

Proficient National

TIMSS High

Science

Grade 5 62 38 -- -- 47

Grade 8 60 37 -- 38 40

Physical Science 63 22 26 -- --

Biology 60 27 26 -- --

Social Studies

Grade 4 70 29 -- 37 --

Grade 6 57 36 -- 38 --

American History 71 35 37 -- --

American Government 67 18 37 -- --


1.2.3 EVIDENCE BASED ON INTERNAL STRUCTURE

Ohio’s State Tests represent a structural model of student achievement in grade-level and course-specific content areas. Within each subject area (e.g., ELA), items are designed to measure a single content strand (e.g., Reading Information, Reading Literature, Writing). Content strands within each subject area are, in turn, indicators of achievement in the subject area. The form of the second-order confirmatory factor analyses is illustrated in Exhibit 1.2.3.1 with each item as an indicator of an academic content strand. Because items are never pure indicators of an underlying factor, each item also includes an error component. Similarly, each academic content strand serves as an indicator of achievement in a subject area. As at the item level, the content strands include an error term indicating that the content strands are not pure indicators of overall achievement in the subject area. The paths from the content strands to the items represent the first-order factor loadings or the degree to which items are correlated with the underlying academic content strand construct. Similarly, the paths from subject area achievement to the content strands represent the second-order factor loading, indicating the degree to which academic content strand constructs are correlated with the underlying construct of subject area achievement.

Exhibit 1.2.3.1: Second-Order Structural Model for OST Assessments

Confirmatory factor analysis was used to evaluate the fit of this structural model to student response data from OST assessments’ test administrations.10 For each of test forms administered in spring 2018, we examined the goodness-of-fit between the structural model and the operational test data. Goodness-of-fit is typically indexed by a χ2 statistic, with good model fit indicated by a non-significant χ2 statistic. The χ2 statistic is sensitive to sample size, however; even well-fitting models will demonstrate highly significant χ2 statistics given a very large number of students. Therefore, fit indices, such as the Comparative Fit Index (CFI; Bentler, 1990), the Tucker-Lewis Index (Tucker & Lewis, 1973), and the Root Mean Square of Approximation (RMSEA) were also used to evaluate model fit. Exhibit 1.2.3.2 illustrates the guidelines for evaluating goodness-of-fit.

10 Standard 1.13 – If the rationale for a test score interpretation for a given use depends on premises about the relationships among test items or among parts of the test, evidence concerning the internal structure of the test should be provided.


Exhibit 1.2.3.2: Guidelines for Evaluating Goodness-of-Fit

Goodness-of-Fit Index Indication of Good Fit

CFI 0 ≥ 0.95

TLI 0 ≥ 0.95

RMSEA 0 ≤ 0.05

In addition to testing the fit of the hypothesized OST second-order confirmatory factor analysis model, we examined the degree to which the second-order model improved fit over the more general one-factor model (i.e., first-order model) of academic achievement in each subject area. Because the one-factor, general achievement model was nested within the second-order model, a simple likelihood ratio test was used to determine whether the added information provided by the structure of the OST assessments’ frameworks improved model fit over a general achievement model. Results indicating improved model fit for the second-order factor model provide support for the interpretation of learning standard performance at the strand level above that provided by the overall subject area score.11

ELA Content Model

We began by evaluating the fit of the first-order, general achievement model in which all items are indicators of a common subject area factor. This model importantly evaluates the assumption of unidimensionality of the subject area assessments, and provides a baseline for evaluating the improvement of fit for the more differentiated second-order (i.e., strand) model. The goodness-of-fit statistics for the first-order, general achievement models in ELA are shown in Exhibit 1.2.3.3. All of the statistics indicate the general achievement factor model fit the data well. This pattern was true across all grades. The CFI and TLI values were all greater than 0.93, and the RMSEA values were at or below 0.06, indicating reasonable fit for the base model. The goodness-of-fit statistics for the hypothesized OST second-order models in ELA are shown in Exhibit 1.2.3.3. All of the statistics indicate the second-order models posited by OST assessments fit the data well. This pattern was true across all grades. The CFI and TLI values for the second-order models were all equal to or greater than 0.95, with RMSEA values well below the 0.05 threshold used to indicate good fit.

The results of the comparison between the hypothesized OST model and the more general achievement model are presented in Exhibit 1.2.3.3. We note that model fit for first-order models of general achievement are reasonably high and provide evidence for the unidimensionality of the subject area assessments. The purpose of these analyses is to determine whether the posited second-order reporting model adds information beyond that provided by the first-order model. The chi-square difference test shows that across grade levels, the strand-based, second-order model showed significantly better fit than the general achievement first-order model. The χ2

Diff p-values were less than 0.001 across all grade levels. The improved fit derives primarily from the differentiation between the reading and writing assessments. While the evidence supports a unified ELA construct, reading and writing are sufficiently independent to warrant differentiated reporting.

Exhibit 1.2.3.3: Goodness-of-Fit for the OST First-Order Model and Second-Order Model and Difference in Fit

between Two Competing Models — ELA

Grade / Course

Goodness-of-Fit Difference in Fit between First- and Second-Order Models First-Order Models Second-Order Models

CFI TLI RMSEA CFI TLI RMSEA χ2 df p value

Grade 3 0.97 0.96 0.06 1.00 1.00 0.02 42920.307 3 p < 0.001

Grade 4 0.98 0.98 0.04 1.00 1.00 0.02 30094.493 3 p < 0.001

11 Standard 1.14 – When interpretation of subscores, score differences, or profiles is suggested, the rationale and relevant evidence in support of such interpretation should be provided. Where composite scores are developed, the basis and rationale for arriving at the composites should be given.


Grade / Course



Grade 5 0.96 0.96 0.06 1.00 1.00 0.02 39872.871 3 p < 0.001

Grade 6 0.97 0.97 0.05 0.99 0.99 0.03 34166.295 3 p < 0.001

Grade 7 0.97 0.97 0.05 0.99 0.99 0.03 48889.907 3 p < 0.001

Grade 8 0.97 0.97 0.05 0.99 0.99 0.03 37699.965 3 p < 0.001

ELA I 0.98 0.98 0.05 0.99 0.99 0.03 35643.652 3 p < 0.001

ELA II 0.99 0.98 0.05 0.99 0.99 0.03 34637.670 3 p < 0.001

Mathematics Content Model

As with ELA, structural analyses of the mathematics assessments began with an evaluation of fit for the first-order, general achievement model in which all items are indicators of a common mathematics subject area factor. This model provides for an evaluation of the unidimensionality assumption of the subject area assessments, and provides a baseline for evaluating the improvement of fit for the more differentiated second-order model. The goodness-of-fit statistics for the general achievement models in mathematics are shown in Exhibit 1.2.3.4. Fit statistics indicate that the general achievement factor model fit the data well. This pattern was true across all grades. The CFI and TLI values were greater than 0.95, and the RMSEA values were below 0.05 for all grades, indicating adequate fit for the base model.

The goodness-of-fit statistics for the hypothesized OST second-order models in mathematics are shown in Exhibit 1.2.3.4. Fit statistics indicate the second-order models posited by OST assessments also fit the data well. This pattern was true across all grades. The CFI and TLI values for the second-order models were all equal to or greater than 0.95, with RMSEA values well below the 0.05 threshold used to indicate good fit.

The results of the comparison between the hypothesized OST model and the more general achievement model for mathematics tests are presented in Exhibit 1.2.3.4. The chi-square difference test shows that across grade levels, the strand-based, second-order model showed significantly better fit than the general achievement first-order model. The χ2

Diff p-values were less than 0.001 across all grade levels. We note that the magnitude of the χ2 Diff values

are much smaller than those observed for ELA, as are the differences between the first- and second-order model fit indices. This suggests that while differentiation among the latent traits may be supported, precision of observed subscale scores may not be sufficient to support differential comparisons between subscale scores. This appears especially relevant for grade 6 and the high school end-of-course tests which are focused more narrowly on a single content strand.


between Two Competing Models — Mathematics

Grade / Course



Grade 3 0.96 0.95 0.04 0.96 0.95 0.04 617.339 4 p < 0.001

Grade 4 0.97 0.97 0.03 0.97 0.97 0.03 2770.506 3 p < 0.001

Grade 5 0.97 0.97 0.03 0.98 0.98 0.03 7981.289 3 p < 0.001

Grade 6 0.98 0.98 0.03 0.98 0.98 0.03 40.451 4 p < 0.001

Grade 7 0.98 0.98 0.03 0.98 0.98 0.03 711.761 4 p < 0.001

Grade 8 0.97 0.97 0.03 0.97 0.97 0.03 323.786 4 p < 0.001

Algebra 0.98 0.98 0.03 0.98 0.98 0.03 2087.410 3 p < 0.001


Grade / Course



Geometry 0.97 0.97 0.03 0.97 0.97 0.03 341.206 4 p < 0.001

Int Math I 0.99 0.98 0.02 0.99 0.98 0.02 81.381 4 p < 0.001

Int Math II 0.97 0.97 0.03 0.97 0.97 0.03 244.299 4 p < 0.001

Science and Social Studies Content Models

Structural analyses of the science and social studies assessments also began with an evaluation of fit for the first-order, general achievement model in which all items are indicators of a common subject area factor. The goodness-of-fit statistics for the general achievement models in science and social studies are shown in Exhibit 1.2.3.5. All of the statistics indicate the general achievement factor model fit the data well. This pattern was true across all grades. The CFI and TLI values were all equal to or greater than 0.95, and the RMSEA values are all below 0.05, indicating good fit for the base model. The goodness-of-fit statistics for the hypothesized OST second-order models in science and social studies are shown in Exhibit 1.2.3.5. As with the general factor model, all of the fit statistics indicate the second-order models posited by OST assessments fit the data well. This pattern was true across all grades. The CFI and TLI values for the second-order models were all equal to or greater than 0.95, with RMSEA values well below the 0.05 threshold used to indicate good fit.

The results of the comparison between the hypothesized OST model and the more general achievement model for science and social studies tests are presented in Exhibit 1.2.3.5. The chi-square difference test shows that across grade levels, the strand-based second-order model showed significantly better fit than the general achievement first-order model. The χ2

Diff p-values were less than 0.001 across all grade levels. As observed with respect to the mathematics assessments, the magnitude of the χ2

Diff values are relatively smaller than those observed for ELA, as are the differences between the first- and second-order model fit indices. Thus, while differentiation among the latent traits may be supported theoretically, precision of observed subscale scores is likely not sufficient to support differential comparisons between subscale scores.


between Two Competing Models — Science and Social Studies

Grade / Course



Science

Grade 5 0.99 0.99 0.02 0.99 0.99 0.02 56.579 3 p < 0.001

Grade 8 0.95 0.95 0.03 0.96 0.95 0.03 4334.645 3 p < 0.001

Biology 0.98 0.98 0.02 0.98 0.98 0.02 8409.255 4 p < 0.001

Physical Science 0.41 0.38 0.05 0.42 0.39 0.05 58.596 4 p < 0.001

Social Studies

American Government 0.96 0.96 0.03 0.96 0.96 0.03 604.544 3 p < 0.001

American History 0.99 0.99 0.02 0.99 0.99 0.02 316.798 3 p < 0.001

1.2.4 MEASUREMENT INVARIANCE ACROSS SUBGROUPS

Measurement invariance occurs when the likelihood of a correct response conforms to the measurement model and is independent of group membership and the parameters of a measurement model are statistically equivalent across


groups. 12 The parameters of interest in measurement invariance testing are the factor loadings and intercepts/thresholds. Invariance in residual variances or scale factors can also be tested, but there is consensus that it is not necessary to demonstrate invariance across groups on these parameters. In general, measurement invariance testing can be conducted using a series of multiple-group confirmatory factor analysis (CFA) models, which impose identical parameters across groups. The measurement model parameters, including factor patterns (configural invariance), factor loadings (metric or weak invariance), latent intercepts/thresholds (scalar or strong invariance), and unique or residual factor variances (strict invariance), are tested across groups in that sequential order. When factor loadings and intercepts/thresholds are invariant across groups, scores on latent variables can be validly compared across the groups and the latent variables can be used in structural models that hypothesize relationships among latent variables (Millsap, 2011).

Items comprising the spring 2018 operational test administration were used to investigate measurement invariance across subgroups for all subjects. The full set of tables associated with these analyses is provided in Appendix A for each of the grade and subject area assessments.

The series “a” tables (e.g., A.1a, A.2a, etc., in Appendix A) present the global model fit indices for the measurement invariance tests for each assessment. Following the sequence of tests of measurement invariance (Millsap & Cham, 2012), we tested configural, metric, and scalar invariance models using χ2 difference test (at α ≤ 0.05) and the examination of significant differences of the Root Mean Square of Approximation (RMSEA, change in RMSEA ≤ 0.015; Chen, 2007) between the two nested invariance models. Measurement invariance was investigated across the following subgroups: gender (Model A); ethnicity including African American vs. White (Model B-1), Hispanic vs. White (Model B-2), Asian vs. White (Model B-3), American Indian vs. White (Model B-4), and Multi-ethnic vs. White (Model B-5); Individualized Education Program status (IEP; Model C); and Limited English Proficiency status (LEP; Model D). Invariance tests of subgroups were investigated separately for each grade and subject area test. Please note that multiple-group CFA for Physical Science and Biology could not be tested because of the very small number of focal groups in Physical Science and missing responses of certain items in Biology.

The null hypothesis of the χ2 difference test is that the more restricted invariance model (e.g., metric) fits the data equally as well as the less restricted invariance model (e.g., configural). Given that the sensitivity of the χ2 difference tests to sample size, we additionally examined significant differences on this test with an examination of the RMSEA. A small change in the RMSEA between the more restricted and less restricted invariance models supports retention of the more restricted invariance model (Chen, 2007).

The series “b” tables (e.g., A.1b, A.2b, etc.) show the model fit indices of scalar invariance models assuming same factor pattern plus identical factor loadings plus identical latent intercept/threshold across subgroups. Global model fit indices included the Comparative Fit Index (CFI; Bentler, 1990) and Root Mean Square of Approximation (RMSEA). CFI values ≥ 0.90 and RMSEA values ≤ 0.08 were used to evaluate acceptable model fit. The model fit indices of the scalar invariance models for all tests suggested acceptable fit to the data. For ELA, CFI ranged from 0.93 to 0.99 and RMSEA ranged from 0.01 to 0.06. For mathematics, CFI values ranged from 0.95 to 0.99 and RMSEA ranged from 0.03 to 0.04. For science, CFI values ranged from 0.90 to 0.97 and RMSEA ranged from 0.02 to 0.03. For social studies, CFI values ranged from 0.96 to 0.99 and RMSEA ranged from 0.01 to 0.03.

Although the χ2 difference test should ideally be nonsignificant, almost all χ2 difference tests were significant at α = 0.05 due to large sample sizes. An exception to this was observed for Model B-4 (American Indian vs. White), where the χ2 difference tests for most grades was nonsignificant or marginally significant at α = 0.05. In spite of significant χ2 difference tests for most models, we found that changes of the RMSEA between the two nested invariance models were very small (ranging from 0.000 to 0.005 across assessments in all grades and subjects), which indicates acceptable fit indices of the scalar model. Based on the similar magnitudes of the RMSEA (i.e., no material change across all tested models; Cheung & Rensvold, 2002) and the acceptable fit indices of the scalar invariance model to the data, OST spring 2018 test scores have the same measurement structure across gender, ethnicity (African

12 Standard 3.15 – Test developers and publishers who claim that a test can be used with examinees from specific subgroups are responsible for providing the necessary information to support appropriate test score interpretations for their intended uses for individuals from these subgroups.


American vs. White, Hispanics vs. White, Asian vs. White, American Indian vs. White, and Multi-Ethnic vs. White), individualized education program status, and limited English proficiency status.

1.2.5 TEST INTEGRITY FORENSICS

The validity of test score interpretation depends critically on the integrity of the test administrations on which those scores are based. Any irregularities in the administration of assessments can therefore cast doubt on the validity of the inferences based on those test scores. Multiple facets work to ensure that tests are administered properly which include clear test administration policies, effective test administrator training, and tools to identify possible irregularities in test administrations.13

For online administrations, quality assurance (QA) reports are generated during and after the testing windows. These are geared toward detection of possible cheating, aggregating unusual responses at the student level to detect possible group-level testing anomalies.

Online test administration allows Ohio’s testing contractor to track information that was not possible to track in the context of the paper-pencil tests. This information includes not only item responses but also item response changes, latencies between item responses and changes, number of revisits to an item or items, test start and end times, scores in each opportunity in the current year, scores in the previous year, and other selected information in the system (e.g., accommodations) as requested by the state. AIR’s Test Delivery System (TDS) captures all of this information.

Unlike with paper-based assessments where data analysis must await the close of the testing window and processing of answer documents, AIR’s TDS allows AIR psychometricians and state assessment staff to monitor testing anomalies throughout each testing window, following the first operational administration. Following the base year, the analyses used to detect the testing anomalies can be run at any time within the testing window. Evidence evaluated includes changes in test scores across administrations, item response times, and item response patterns using the person-fit index. The flagging criteria used for these analyses are configurable and can be changed by the user. Analyses are performed at student-level and summarized for each aggregate unit, including testing session, test administrator, and school.

Changes in Student Performance

Although not available for spring 2016, beginning in the 2016–2017 school year, for both online and paper-pencil tests, it became possible to examine score changes between test administrations using a regression model. For between-year comparisons, the scores between past and current years are compared, with the current-year score regressed on the test score from the previous year. Between-year comparisons are performed starting with the second year of the test administration.

A large score gain or loss between grades is detected by examining the residuals for outliers. The residuals are computed as observed value minus predicted value. To detect unusual residuals, we compute the studentized t residuals. An unusual increase or decrease in student scores between opportunities is flagged when studentized t residuals are greater than 3 or less than -3.

The number of students with a large score gain or loss is aggregated for a testing session, test administrator, and school. Unusual changes in an aggregate performance between administrations and/or years is flagged based on the average studentized t residuals in an aggregate unit (e.g., a testing session or a test administrator). For each aggregate unit, a critical t value is computed and flagged when t was greater than 3 or less than -3,

13 Standard 6.6 – Reasonable efforts should be made to ensure the integrity of test scores by eliminating

opportunities for students to attain scores by fraudulent or deceptive means.


𝑡 =𝐴𝑣𝑒𝑟𝑎𝑔𝑒 𝑟𝑒𝑠𝑖𝑑𝑢𝑎𝑙𝑠

√𝑠2

𝑛+

∑ 𝑣𝑎𝑟(𝑒𝑖)𝑛𝑖=1

𝑛2

,

where s = standard deviation of residuals in an aggregate unit; n = number of students in an aggregate unit (e.g., testing session or test administrator); and 𝑣𝑎𝑟(𝑒𝑖) = 𝜎2(1 − ℎ𝑖𝑖). The term 1 − ℎ𝑖𝑖 is the diagonal component of the variance-covariance matrix of the residuals. The QA report includes a list of the flagged aggregate units with the number of flagged students in the aggregate unit.

If the aggregate unit size is 1–5 students, the aggregate unit was flagged if the percentage of flagged students was greater than 50%. The aggregate unit size for the score change is based on the number of students included in the within- or between-year regression analyses in the aggregate unit. The number of flagged aggregate units and the percentage of flagged aggregate are presented in Table B.1 to B.4 in Appendix B.

Item Response Latency

The online environment also allows item response latency to be captured as the item page time (the length of time

that each item page is presented) in milliseconds. Discrete items appear one item on the screen at a time.

However, for stimulus-based items selected as part of an item group, all items associated with the stimulus are

selected and loaded as a group. For each student, the total time taken to complete the test is computed by

summing up the page time for all items and item groups.

An example of unusual item response time would be a test record for an individual who scores very well on the

test even though the average time spent for each item was far less than that required of students statewide. If

students already know the answers to the questions, the response time is much shorter than the response time for

those items where the student has no prior knowledge of the item content. Conversely, if a test administrator

helps students by “coaching” them to change their responses during the test, the testing time could be longer than

expected.

The average and the standard deviation of test-taking time are computed across all students for each opportunity.

Students and aggregate units were flagged if the test-taking time was greater than 3 or smaller than -3 standard

deviations of the state average. The state average and standard deviation were computed based on all students at

the time the analysis was performed. The QA report includes a list of the flagged aggregate units with the number

of flagged students in the aggregate unit. The number of flagged aggregate units and the percentage of flagged

aggregate are presented in Table B.1 to B.4 in Appendix B.

Inconsistent Item Response Pattern (Person Fit)

In Item Response Theory (IRT) models, person-fit measurement is used to identify students whose response patterns

are improbable given an IRT model. If a test has psychometric integrity, little irregularity will be seen in the item

responses of the individual who responds to the items fairly and honestly.

If a student has prior knowledge of some test items (or is provided answers during the exam), the student will

respond correctly to those items at a higher probability than indicated by his or her ability as estimated across all

items. In this case, the person-fit index will be large for the student. We note, however, that if a student has prior

knowledge of the entire test content, this will not be detected based on the person-fit index, although the item

response latency index might flag such a student.

The person-fit index is based on all item responses. An unlikely response to a single test question may not result in

a flagged person-fit index. Of course, not all unlikely patterns indicate cheating, as in the case of a student who is


able to guess a significant number of correct answers. Therefore, the evidence of person-fit index should be

evaluated along with other testing irregularities to determine possible testing irregularities. The number of flagged

students is summarized for every testing session and test administrator.

The person-fit index, zl , is computed using a standardized log-likelihood statistic. Following Drasgow, Levine, and

Williams (1985), Sotaridona, Pornell, and Vallejo (2003) define aberrant response patterns as a deviation from the

expected item score model. Snijders (2001) showed that the distribution of zl is asymptotically normal (i.e., with

an increasing number of administered items, i). Even at shorter test lengths of 8 or 15 items, the “asymptotic error

probabilities are quite reasonable for nominal Type I error probabilities of 0.10 and 0.05” (Snijders, 2001).

Sotaridona et al. (2003) report promising results of using zl for systematic flagging of aberrant response patterns.

Students with zl values greater than 3 or smaller than -3 are flagged. Aggregate units are flagged with t greater than

3 or smaller than -3 are flagged. The number of flagged aggregate units and the percentage of flagged aggregate are

presented in Table B.1 to B.4 in Appendix B.

𝑡 =𝐴𝑣𝑒𝑟𝑎𝑔𝑒 zl values

√(𝑠2 + 1) 𝑛⁄,

where s = standard deviation of zl values in an aggregate unit and n = number of students in an aggregate unit. The

QA report includes a list of the flagged aggregate units with the number of flagged students in the aggregate unit

(e.g., test session, test administrator, school). The number of flagged aggregate units and the percentage of flagged

aggregate are presented in Table B.1 to B.4 in Appendix B.

Item Response Similarity

The item response similarity was investigated using a cheating detection method, proposed in the paper titled

“Detecting excessive similarity in answers on multiple choice exams” (Wesolowsky, 2000). This study uses the

similarity of responses between a pair of students to estimate the probability of possible cheating. The

computational steps are as follows:

1. Based on assumptions and probability theory (pp. 911–912), ˆjip is estimated by solving the following two

equations

1/

1

(1 (1 ) )j ja a

ji i

q

jiij

p r

pc

q

for ja , and from ˆ

ja and ir to obtain ˆ ˆ1/

ˆ (1 (1 ) )j ja a

ji ip r , where ir is the proportion of the analysis unit

(e.g., school) that answered correctly on item i, jc is the proportion of items answered correctly by student j;


2. tiw is the probability that, conditional on the answer being wrong, distractor t is chosen on question i. For now,

this is estimated by the proportion of students who choose option t over students who choose wrong options on

this item;

3. Using estimates from steps 1 and 2 to estimate ˆjk and 2ˆ

jk , hence, jkZ ;

4. Based on jkZ and significant level to decide if the students j and k have significant probability to copy each other.

In order to investigate the probability of false positive of the estimating procedure, the procedure is applied to

estimate the probability of cheating for each pair within each aggregate unit (school/session), and two Bonferroni

adjustments are used, one of which is based on (n – 1), and the other of which is based on (n(n – 1)/2), where n is

the number of students within the aggregate unit (school/session).

Aggregate units are flagged with two different methods: aggressive method and conservative method. The

aggressive method uses an alpha=0.05 and Bonferroni adjustment factor (n – 1) to flag test sessions and schools.

The more conservative method uses alpha=0.01 and Bonferroni adjustment factor (n(n – 1)/2) to flag suspect test

sessions and schools.

Bonferroni adjustment with factor (n – 1) is used if we know the seating of the students and the possible cheating

can only happen between the front and back student pair. If no seating chart is available, the factor (n(n – 1)/2) is

usually used. Based on simulation studies, the results based on (n(n – 1)/2) provide a good safety buffer against the

false positive, that we see only a slight chance of false positive. As for the alpha level, it seems that using alpha=0.01

is preferred, so only extreme pairs that worth investigation will be flagged. The number of flagged aggregate units

and the percentage of flagged aggregate by the two different methods are presented in Table B.5 to B.8 in Appendix

B.

1.2.6 SUMMARY OF VALIDITY OF TEST SCORE INTERPRETATIONS

Evidence for the validity of test score interpretations is strengthened as evidence supporting test score

interpretations accrues. In this sense, the process of seeking and evaluating evidence for the validity of test score

interpretation is ongoing. Nevertheless, there currently exists sufficient evidence to support the principle claims for

the test scores, including that OST test scores indicate the degree to which students have achieved Ohio’s Learning

Standards at each grade level, and that students scoring at the proficient level or higher demonstrate levels of

achievement consistent with extrapolations of national benchmarks indicating that they are on track to graduate

and are ready for post-secondary education or entry into the workforce. These claims are supported by evidence of

a test development process that ensures alignment of test content to Ohio’s Learning Standards, a standard-setting

process that yielded performance standards consistent with those of rigorous, national benchmarks, and evidence

that the structural model described by the new standards and implemented in Ohio’s State Tests is sound.


2. BACKGROUND OF OHIO COMPUTER-BASED ASSESSMENTS

2.1 BACKGROUND OF ELA AND MATHEMATICS ASSESSMENTS

The Ohio State Board of Education adopted the Common Core State Standards (CCSS) in both English Language Arts

(ELA) and mathematics as Ohio’s Learning Standards. Ohio’s Learning Standards are designed to help ensure that

students are college and career ready by the end of high school. Ohio’s Learning Standards in ELA and mathematics

were fully implemented in classrooms and assessed starting in the 2014–2015 school year. Beginning in the 2015–

2016 school year, ODE began administering Ohio’s State Tests (OST) to assess student proficiency in ELA and

mathematics at grades 3–8, and following completion of high school coursework in ELA I, ELA II, Algebra, and

Geometry (or alternatively, following coursework in Integrated Mathematics I and Integrated Mathematics II).

The first operational administration of OST assessments in ELA and mathematics took place in fall 2015, with

administration of grade 3 ELA and high school end-of-course (EOC) assessments in ELA and mathematics. The first

operational forms of OST assessments in ELA and mathematics were constructed using items from the AIRCore item

bank. The AIRCore items were developed to be aligned to the CCSS and had been previously administered as part of

statewide assessments in Arizona, Florida, Utah, and/or Oregon. Following administration in one or more of the

statewide assessment systems and completion of the item-review process, AIRCore items were calibrated using

Rasch and Masters’ Partial Credit models, and linked to a common AIRCore scale. In December 2015, a standard-

setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards

for reporting student achievement of Ohio’s Learning Standards in ELA and mathematics.14

The full system of grade-level and end-of-course assessments in ELA and mathematics were administered in spring

2016. Following the close of the testing window, the item pools for grade-level summative and high school EOC

assessments were calibrated. The Rasch model and Masters’ (1982) partial credit model, an extension of the one

parameter Rasch model that allows for graded responses, were used to estimate item parameters for OST

assessments. The OST assessments’ scale for each of the ELA and mathematics assessments was established by

centering on spring 2016 operational form. The performance standards previously set on the AIRCore scale were

shifted to the new OST scale by the linking constants obtained through the mean-mean equating method. In

subsequent years, pre-equated bank item parameter estimates will be applied directly for final scoring and reporting,

a strategy that allows for more rapid reporting of tests administered online.

2.2 BACKGROUND OF SCIENCE AND SOCIAL STUDIES ASSESSMENTS

Ohio adopted new academic learning standards in social studies and science in 2010 and 2011, respectively. Ohio’s

Learning Standards in science and social studies are designed to ensure that students across grades are receiving the

instruction they need to become scientifically literate and civic-minded citizens equipped with knowledge and skills

for the 21st century workforce and able to successfully transition to higher education. In spring 2015, ODE

administered the OST for the first time in social studies and science to assess proficiency with respect to Ohio’s

Learning Standards. OST assessment assesses science achievement in grades 5 and 8, and following instruction in

Physical Science and Biology in high school. Social studies achievement is assessed following high school coursework

14 Standard 7.1 – The rationale for a test, recommended uses of the test, support for such uses, and information that assists in score interpretation should be documented. When particular misuses of a test can be reasonably anticipated, cautions against such misuses should be specified.


in American History and American Government. Prior to spring 2018, social studies achievement was also assessed

in grades 4 and 6.

The first operational administration of OST assessments in science and social studies took place in spring 2015. The

paper-based and online administrations occurred through the months of March and April. Following the close of the

testing window, the American Institutes for Research (AIR), under contract to ODE, convened eight panels of Ohio

educators to recommend performance standards on the assessments.

The Rasch model and Masters’ (1982) partial credit model were used to estimate item parameters for OST

assessments in science and social studies. Item pools for grade-level summative and end-of-course assessments

were calibrated following the first operational administration in spring 2015. In subsequent years, pre-equated bank

item parameter estimates will be applied directly for final scoring and reporting, a strategy that allows for more

rapid reporting of tests administered online.

2.3 OST TEST DESIGN

OST assessments are a series of fixed-form assessments that are intended to be administered online, although the

assessment is offered as a dual-mode (online and paper-based) assessment to accommodate schools that are not

ready to transition to the online testing environment.

Five types of machine-scored constructed-response (MSCR) items were included in OST assessments’ forms: graphic

response, natural language, equation response, hot text, and table input items. The graphic response item types

require students to place objects or move objects around in the answer space. A student can also plot points, draw

lines, and draw shapes. The natural language item types require students to type an English language answer. The

equation response items require students to enter a value or equation. The table input item types require students

to input numerical values into a table. Rubric validation for all operational test items was completed prior to test

construction and was based on the previous field-test administration of those items.

Each ELA assessment included one writing essay prompt that required an extended essay response. For the online

test administrations, a random sample of student responses to each writing task was selected for handscoring. These

responses were scored by two human raters on three distinct scoring dimensions or rubrics: Statement of

Purpose/Focus and Organization, Evidence/Elaboration, and Conventions/Editing, with any discrepancy adjudicated

in a resolution score. This sample of essay responses and writing scores was used to develop the statistical models

used for machine-scoring the remaining online essay responses. All essay responses captured from paper-pencil

tests were handscored.

Exhibits 2.3.1–2.3.16 provide the test blueprints that guide the construction of OST assessments’ test forms.

Exhibit 2.3.1: Point Range by Subscale — ELA

Grade / Course RL RI W

Grade 3 14–16 14–16 10

Grade 4 14–16 14–16 10

Grade 5 14–16 14–16 10

Grade 6 16–20 20–24 20

Grade 7 16–20 20–24 20

Grade 8 16–20 20–24 20


Grade / Course RL RI W

ELA I 14–18 22–26 20

ELA II 14–18 22–26 20

Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing

Exhibit 2.3.2: Point Range by Subscale — Grade 3 Mathematics

Grade FRA G MUD NO

Grade 3 11–13 11–13 12–16 11–13

Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; NO = Numbers & Operations


Grade FRA G MUD

Grade 4 17–21 11–13 17–21

Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division


Grade D FRA G

Grade 5 17-21 17-21 11-13

Note: D = Decimals; FRA = Fraction; G = Geometry


Grade EE GS NS RP

Grade 6 17–23 11–13 11–13 13–17

Note: EE = Expression and Equations; GS = Geometry and Statistics; NS = The Number System; RP = Ratios and Proportions


Grade G NS RP SP

Grade 7 11–13 15–19 12–16 12–15

Note: G = Geometry; NS = The Number System; RP = Ratios and Proportions; SP = Statistics and Probability



Grade EE F G NS

Grade 8 11–15 11–15 15–19 11–13

Note: EE = Expression and Equations; F = Functions; G = Geometry; NS = The Number System

Exhibit 2.3.8: Point Range by Subscale — Algebra

Course F NQEE S

Algebra 23–27 19–22 10–12

Note: F = Functions; NQEE = Number, Quantities, Equations and Expressions; S = Statistics

Exhibit 2.3.9: Point Range by Subscale — Geometry

Course CP C P ST

Geometry 19–21 10–13 10–12 13–19

Note: CP = Congruency & Proof; C = Circles; P = Probability; ST = Similarity & Trigonometry

Exhibit 2.3.10: Point Range by Subscale — Integrated Mathematics I

Course A G NQF S

Integrated Math I 13–15 11–13 18–21 10–12

Note: A = Algebra; G = Geometry; NQF = Number & Quantity/Functions; S = Statistics

Exhibit 2.3.11: Point Range by Subscale — Integrated Mathematics II

Course F G NQEE P

Integrated Math II 11–13 17–22 14–18 10–12

Note: F = Functions; G = Geometry; NQEE = Number, Quantities, Equations and Expressions; P = Probability

Exhibit 2.3.12 Point Range by Subscale — Grade 5 & 8 Science

Grade ES LS PS

Grade 5 15–17 19–21 19–21

Grade 8 21–23 16–18 16–18

Note: ES = Earth Science, LS = Life Science, PS = Physical Science


Exhibit 2.3.13 Point Range by Subscale — Biology

Course BS.A BS.B BS.C BS.D

Biology 13–15 13–15 13–15 13–15

Note: BS.A = Heredity; BS.B = Evolution; BS.C = Diversity and Interdependence of Life; BS.D = Cells

Exhibit 2.3.14: Point Range by Subscale — Physical Science

Course PS-HS.A PS-HS.B PS-HS.C PS-HS.D

Physical Science 15–17 15–17 15–17 7–9

Note: PS-HS.A = Study of Matter, PS-HS.B = Energy and Waves, PS-HS.C = Forces and Motion, PS-HS.D = The Universe

Exhibit 2.3.15: Point Range by Subscale — American Government

Course AGA AGB AGC

American Government 23–25 23–25 15–17

Note: AGA = Historic Documents, AGB = Principles and Structure, AGC = Ohio/Policy/ Economy

Exhibit 2.3.16: Point Range by Subscale — American History

Course AHA AHB AHC

American History 17–19 24–26 20–22

Note: AHA = Skills and Document, AHB = 1877-1945, AHC = 1945 - Present


3. SUMMARY OF FALL 2017 OPERATIONAL TEST ADMINSTRATION

The following tests were administered in fall 2017:

• Grade 3 ELA assessment

• High school end-of-course assessments

o ELA I and ELA II

o Algebra I, Geometry, Integrated Mathematics I, and Integrated Mathematics II

o Biology and Physical Science

o American Government and American History

The online testing window opened between October 23 and November 3 for grade 3 ELA.

The high school end-of-course tests are scheduled to be administered following completion of instruction in courses

targeted for assessment. Most of the courses are taught over an academic year, but some students receive

instruction in semester-based courses, necessitating a fall administration. The online and paper-based testing

window opened between December 4 and January 12 for high school EOC tests. All the 2017 fall tests were scored

using the pre-equated parameters. This section summarizes the operational test results for the fall 2017

administration of OST.

3.1 STUDENT POPULATION AND PARTICIPATION

Assessment data for operational analyses included Ohio public school students who met minimum attemptedness

requirements for scoring and reporting. The demographic composition of students taking the fall 2017 OST is

presented in Exhibit 3.1.1 by assessment and subgroup.15 The number of students participating in each assessment

by test mode is presented in Appendix C.

Exhibit 3.1.1: Number of Students Participating in Fall 2017 Assessments

Grade / Course

Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

Wh

ite

Mu

ltip

le

Eth

nic

itie

s

LEP

IEP

ELA

Grade 3 128,205 62,932 65,121 152 23,210 3,087 5,369 158 86,563 9,655 5,114 15,140

ELA I 42,039 18,136 23,125 778 14,195 764 2,214 59 21,782 2,772 2,849 9,989

ELA II 35,553 16,400 18,474 679 11,493 728 1,784 55 18,918 2,334 2,286 7,111

Mathematics

Algebra 47,491 22,303 24,208 980 14,711 573 2,608 75 26,193 3,049 2,011 9,353

Geometry 34,631 16,742 17,174 715 10,541 596 1,828 46 19,222 2,135 1,519 6,403

Integrated Math I 4,832 2,346 2,447 39 2,327 213 76 14 1,633 566 812 671

15 Standard 1.8 – The composition of any sample of students from which validity evidence is obtained should be described in as much detail as is practical and permissible, including major relevant socio-demographic and developmental characteristics.


Grade / Course

Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

Wh

ite

Mu

ltip

le

Eth

nic

itie

s

LEP

IEP

Integrated Math II 4,401 2,098 2,272 31 1,988 184 69 8 1,682 467 619 545

Science

Biology 24,837 12,013 12,508 316 8,845 392 1,255 39 12,577 1,625 1,470 4,333

Physical Science 899 436 445 18 448 6 39 327 66 41 198

Social Studies

American Government 32,330 15,655 16,262 413 7,161 752 1,317 47 21,145 1,775 1,104 4,583

American History 20,969 10,349 10,292 328 7,551 351 1,132 28 10,460 1,353 1,408 4,173

3.2 SUMMARY OF OVERALL STUDENT PERFORMANCE FOR FALL 2017

Exhibit 3.2.1 shows the statewide summary statistics for the fall 2017 OST administration. The results that include

the minimum and maximum observed scale scores, scale score mean, standard deviation, standard error of the

measure (SEM), and internal consistency reliability. Frequency distributions for each of the assessments are provided

in Appendix D.

Exhibit 3.2.1 Fall 2017 Operational Test Summary Statistics

Grade / Course N-count

Max Obtained

Scaled Score

Min Obtained

Scaled Score

Scaled Score Mean

Scaled Score Standard Deviation

Scaled Score SEM

Reliability

ELA

Grade 3 128,205 863 545 688.63 43.37 17.49 0.84

ELA I 42,039 800 606 681.63 23.55 8.89 0.86

ELA II 35,553 795 597 675.63 26.01 9.73 0.86

Mathematics

Algebra I 47,491 814 618 680.79 21.74 10.32 0.77

Geometry 34,631 810 604 669.11 25.16 12.98 0.73

Integrated Math I 4,832 811 618 677.9 24.21 10.47 0.81


Science

Biology 24,837 822 617 688.77 21.43 9.85 0.79

Physical Science 899 752 634 684.59 14.06 9.85 0.51

Social Studies

American Government 32,330 774 642 706.56 19.61 5.62 0.92

American History 20,969 800 619 693.96 19.93 7.65 0.85


Exhibit 3.2.2 shows the percentage of students classified in each of the performance levels for each the fall 2017

tests.

Exhibit 3.2.2: Fall 2017 Percentage of Students in Performance Levels

Grade / Course Number Tested

% Limited % Basic % Proficient % Accelerated % Advanced % At or Above

Proficient

ELA

Grade 3 128,205 33 29 15 15 8 38

ELA I 42,039 52 29 16 3 2 21

ELA II 35,553 55 30 14 3 1 17

Mathematics

Algebra I 47,491 51 34 13 3 1 16

Geometry 34,631 69 23 6 2 1 9

Integrated Math I 4,832 59 29 11 4 1 16

Integrated Math II 4,401 68 23 7 3 2 12

Science

Biology 24,837 43 35 16 2 5 23

Physical Science 899 47 41 14 1 0 15

Social Studies

American Government 32,330 14 26 41 13 7 60

American History 20,969 31 36 26 4 3 34

3.3 STUDENT PERFORMANCE BY SUBGROUP FOR FALL 2017

Exhibits 3.3.1–3.3.4 present the percentage of students in each grade and subject at each performance level, by

gender and ethnicity, including female, male, African American, Asian, Alaskan/Hawaiian Native Hispanic/Latino,

American Indian, White, and Multiple Ethnicities. Overall, the achievement gap between subgroups continues to

exist. For example, the white group outperforms the African American and Hispanic group.

Exhibit 3.3.1 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — ELA

Grade / Course

Performance Level

Percentage of Students in Each Grade and Subject at Each Performance Level

Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Grade 3

Limited 33 30 35 61 56 23 46 36 25 39 68 61

Basic 29 30 29 22 28 23 30 32 30 31 25 25

Proficient 15 16 15 11 8 16 12 12 18 14 5 8

Accelerated 15 15 14 6 6 21 9 15 18 12 2 5


Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Advanced 8 9 7 3 2 17 3 5 10 6 0 2

ELA I

Limited 52 48 56 51 65 50 55 56 43 57 70 72

Basic 29 31 28 32 27 28 30 24 32 27 23 23

Proficient 16 18 15 18 11 18 15 20 20 16 9 7

Accelerated 3 3 3 2 1 4 1 2 4 2 0 0

Advanced 2 2 1 0 0 3 1 2 3 1 0 0

ELA II

Limited 55 51 59 49 68 53 62 55 46 58 75 79

Basic 30 32 27 32 26 30 25 27 32 29 21 19

Proficient 14 15 12 18 8 13 13 16 17 12 5 5

Accelerated 3 3 2 2 1 4 2 2 4 2 0 0

Advanced 1 1 1 0 0 2 1 2 2 1 0 0

Exhibit 3.3.2 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity —

Mathematics

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Algebra

Limited 51 48 54 45 62 33 54 48 44 55 64 76

Basic 34 37 32 37 31 30 34 41 37 34 28 22

Proficient 13 14 12 14 9 19 12 11 16 11 8 4

Accelerated 3 2 3 4 1 10 2 1 3 2 1 0

Advanced 1 1 1 1 0 9 0 0 1 0 0 0

Geometry

Limited 69 69 69 65 82 46 73 63 62 74 75 89

Basic 23 23 22 25 17 25 21 24 26 20 19 11

Proficient 6 6 7 7 3 14 5 11 8 6 5 2

Accelerated 2 2 2 4 0 9 1 0 3 1 2 0

Advanced 1 1 1 0 0 7 0 2 1 0 0 0

Integrated Math I

Limited 59 56 61 74 66 60 59 86 48 61 73 80

Basic 29 31 27 21 27 26 29 7 30 31 23 18

Proficient 11 12 11 8 9 12 12 7 14 9 7 6


Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Accelerated 4 4 3 0 1 1 3 0 8 2 1 1

Advanced 1 1 1 0 0 2 0 0 2 1 0 0

Integrated Math II

Limited 68 68 68 71 77 63 59 63 57 78 82 90

Basic 23 23 23 19 22 18 26 25 24 21 17 12

Proficient 7 7 7 6 4 11 10 13 10 4 5 1

Accelerated 3 2 3 3 1 5 1 0 6 2 0 0

Advanced 2 2 2 0 0 5 3 0 4 0 0 0

Exhibit 3.3.3 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — Science

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Biology

Limited 43 41 44 45 55 40 49 46 33 43 59 62

Basic 35 37 34 38 37 35 36 44 34 38 35 32

Proficient 16 17 16 18 10 11 15 13 21 16 6 7

Accelerated 2 2 2 1 0 4 1 3 3 1 0 0

Advanced 5 5 5 1 1 11 2 3 8 4 0 1

Physical Science

Limited 47 48 47 56 54 33 41 44 33 44 61

Basic 41 42 41 39 40 50 46 41 50 49 36

Proficient 14 14 15 11 10 17 13 18 20 10 7

Accelerated 1 1 0 0 0 0 0 1 2 0 0

Advanced 0 0 0 0 0 0 0 0 0 0 0


Exhibit 3.3.4 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — Social

Studies

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

American Government

Limited 14 13 16 22 31 9 22 11 8 17 36 35

Basic 26 28 25 35 37 19 37 36 22 31 44 39

Proficient 41 42 40 38 29 38 35 34 45 39 20 23

Accelerated 13 12 13 5 4 19 5 19 16 10 1 3

Advanced 7 6 8 1 1 15 2 0 9 5 0 1

American History

Limited 31 28 35 28 41 27 36 32 24 34 47 48

Basic 36 40 33 31 39 32 39 29 34 36 38 38

Proficient 26 27 26 31 20 30 22 36 32 26 16 14

Accelerated 4 3 4 7 1 5 2 0 6 3 1 1

Advanced 3 2 4 5 1 7 1 4 6 3 1 1

3.4 RELIABITY FOR FALL 2017

Reliability refers to the consistency or precision of test scores and performance level classifications, and essentially

addresses the question of how likely would a student be to achieve the same score, or be classified in the same

performance level, across multiple administrations of equivalently constructed and administered test forms. As part

of each test administration, the reliability of test scores and performance classifications is evaluated from a variety

of perspectives. The reliability evidence of OST assessments in ELA, mathematics, science, and social studies is

demonstrated with respect to both classical and IRT indices of internal consistency of test scores, and decision

accuracy and consistency of performance level classifications.16

3.4.1 INTERNAL CONSISTENCY

Test score reliability is traditionally estimated using both classical and IRT approaches. While measurement error is

conditional on test information, it is nevertheless desirable to provide a single index of a test’s internal consistency

or reliability. Classical estimates of test reliability such as Cronbach’s alpha, provide an index of the internal

consistency reliability of the test, or the likelihood that a student would achieve the same score in an equivalently

16 Standard 2.2 – The evidence provided for the reliability/precision of the scores should be consistent with the domain of replications associated with the testing procedures, and with the intended interpretations for use of the test scores. Standard 2.3 – For each total score, subscore, or combination of scores that is to be interpreted, estimates of relevant indices of reliability/precision should be reported.


constructed test form. 17 Exhibit 3.4.1.1 shows the internal consistency estimates for each of the assessments.

Internal consistency estimates are around 0.8, typical of most similar length achievement tests. Internal consistency

estimate for Physical Science is, however, quite low and appears to be due to restriction of score range resulting

from the very high difficulty of test items.

Exhibit 3.4.1.1 Internal Consistency Reliabilities (Cronbach’s alpha) for Fall 2017 OST Scores

Grade / Course Internal Consistency

Reliability Variance

ELA

Grade 3 0.84 1882

ELA I 0.86 555

ELA II 0.86 677

Mathematics

Algebra 0.78 473

Geometry 0.73 633

Integrated Math I 0.81 586

Integrated Math II 0.80 813

Science

Biology 0.79 460

Physical Science NA NA

Social Studies

American Government 0.92 385

American History 0.85 397

NA: Not enough information to estimate reliably.

3.4.2 STANDARD ERROR OF MEASUREMENT

Because measurement error is conditional on test information, the precision of test scores varies with respect to the

information value of the test at each location along the ability distribution. Precision of individual test scores is

critically important to valid test score interpretation and is provided along with test scores as part of all student-

level reporting. Test scores are most precise in locations where test information is greatest. Because relatively little

test information is targeted to measurement of very low and high performing students, the precision of test scores

decreases near the tails of the ability distribution.

For OST assessments scored using MLE, the mathematical statement of the conditional standard error of

measurement (CSEM) for student i is:

𝐶𝑆𝐸𝑀(�̂�𝑖) = 1

√𝐼(�̂�𝑖)

where 𝐼(�̂�𝑖) is the Fisher information at the MLE and is calculated:

17 Standard 2.19


𝐼(�̂�) = −𝜕2𝑙(𝜃)

𝜕𝜃2 |𝜃=�̂�.

In general, the second derivative for the ith 1PL item is

𝜕2𝑙𝑜𝑔([𝑝(𝜃)]𝑧𝑖[𝑞(𝜃)]1−𝑧𝑖)

𝜕𝜃2= { −𝐷2

𝑞𝑖(𝜃) (𝑝𝑖3(𝜃))

𝑝𝑖2(𝜃)

if 𝑧𝑖 = 1

−𝐷2𝑞𝑖(𝜃)(𝑝𝑖(𝜃)) if 𝑧𝑖 = 0

The second derivative for the ith Master’s Partial Credit Model item is

𝜕2𝑙𝑜𝑔(𝑃(𝑧𝑖|𝜃))

𝜕𝜃2= 𝐷2

[∑ 𝑗Exp(∑ 𝐷(𝜃 − 𝑏𝑘𝑖))𝑗𝑘=1

𝑚𝑖𝑗=1 ]

2

[1 + ∑ exp ∑ 𝐷(𝜃 − 𝑏𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1 ]

2 − 𝐷2∑ 𝑗2Exp(∑ 𝐷(𝜃 − 𝑏𝑘𝑖))

𝑗𝑘=1

𝑚𝑖𝑗=1

1 + ∑ exp ∑ 𝐷(𝜃 − 𝑏𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1

Standard errors of the MLEs are transformed to be placed onto the reporting scale. This transformation is:

SEss = a∙SEθ

where SEθ is the standard error of the ability estimate on the scale; and a is the slope of the scaling constants

that transform to the reporting scale. For OST assessments, 𝑎 =(725−700)

(𝜃𝐴𝑐𝑐𝑒𝑙𝑒𝑟𝑎𝑡𝑒𝑑−𝜃𝑃𝑟𝑜𝑓𝑖𝑐𝑖𝑒𝑛𝑡).

The figures in Exhibit 3.4.2.1–3.4.2.3 present graphically the standard errors of measurement for the grade-level and

end-of-course assessments. Each figure also includes the location of the four OST performance standard cuts. As the

figures indicate, OST assessments’ standard errors are smallest near the middle of the ability distribution, and

especially near the Proficient and Accelerated performance standard. 18 Test scores near the tails of the ability

distribution have larger standard errors as expected. We note that the test precision for some assessments,

especially the elementary grade ELA tests, does not support the number of performance standards adopted for OST

assessments. Thus, the standard errors for scores within some performance levels is nearly the size of the

performance level. Nevertheless, classification consistency estimates of scores at or above each performance

standard are strong.

18 Standard 2.14 – When possible and appropriate, conditional standard errors of measurement should be reported at several score levels unless there is evidence that the standard error is constant across score levels. Where cut scores are specified for selection or classification, the standard errors of measurement should be reported in the vicinity of each cut score.


Exhibit 3.4.2.1: Overall Standard Error of Measurement for Fall 2017 ELA


Exhibit 3.4.2.2: Overall Standard Error of Measurement for Fall 2017 Mathematics


Exhibit 3.4.2.3: Overall Standard Error of Measurement for Fall 2017 Science and Social Studies


3.4.3 STUDENT CLASSIFICATION RELIABILITY

When student performance is reported in terms of performance categories, a reliability index is computed to

estimate the likelihood of consistent classification of students as specified in standard 2.15 in the Standards for

Educational and Psychological Testing (AERA, APA, NCME, 2014). 19 This index considers the consistency of

classifications for the percentage of students that would, hypothetically, be classified in the same category on an

alternate, equivalent form.

For a fixed-form test, the consistency of classifications is typically estimated on test scores based on a single test

form from a single test administration using the true-score distribution estimated by fitting a bivariate beta-binomial

model or a four-parameter beta model (Huynh, 1976; Livingston & Wingersky, 1979; Subkoviak, 1976; Livingston &

Lewis, 1995).

The classification index can be examined for classification accuracy and classification consistency. Classification

accuracy refers to the agreement between the classifications based on the form actually taken and the classifications

that would be made on the basis of the students’ true scores, if their true scores could somehow be known.

Classification consistency refers to the agreement between the classifications based on the form actually taken and

the classifications that would be made on the basis of an alternate, equivalently constructed test form—that is, the

percentages of students who would be consistently classified in the same performance levels on two equivalent test

administrations.

In reality, the student’s true ability is unknown, and students are not administered an alternate, equivalent form.

Therefore, classification accuracy and consistency are estimated based on students’ item scores and the item

parameters, and the assumed underlying latent ability distribution as described below. The true score is an expected

value of the test score with measurement error.

3.4.4 CLASSIFICATION ACCURACY

Instead of assuming a normal distribution, we can directly estimate the probability of consistent classification using

the likelihood function. The likelihood function of 𝜃 given a student’s item scores represents the likelihood of the

student’s ability at that theta value. Integrating the likelihood values over the range of theta at and above the cut

score (with proper normalization) represents the probability of the student’s latent ability or the true score being at

or above that cut point.

If a student’s estimated ability (theta) is below the cut score, the probability of at or above the cut score is an

estimate of the chance that this student is misclassified as below the cut score, and 1 minus that probability is the

estimate of the chance that the student is correctly classified as below the cut score. Using this logic, we can define

various classification probabilities.

In Exhibit 3.4.4.1, accurate classifications occur when the classification decision made on the basis of the

hypothetical true score agrees with the decision made on the basis of the form actually taken. Misclassifications,

false positives and false negatives, occur when student’s true score classifications are different from the student’s

observed scores (e.g., a student whose true score results in a classification as Proficient, but whose observed score

19 Standard 2.16 – When a test or combination of measures is used to make classification decisions, estimates should be provided of the percentage of students who would be classified in the same way on two replications of the procedure.


results in an incorrect classification as Partially Proficient). represents the expected numbers of students who

are truly above the cut score; represents the expected number of students falsely above the cut score;

represents the expected number of students truly below the cut score; and represents the number of students

falsely below the cut score.

Exhibit 3.4.4.1: Classification Accuracy

Classification on a Form Actually Taken

At or Above the Cut Score Below the Cut Score

Classification on True Score

At or Above the Cut Score

𝑁11 (Truly above the cut) 𝑁10 (False negative)

Below the Cut Score 𝑁01 (False positive) 𝑁00 (Truly below the cut)

3.4.5 CLASSIFICATION CONSISTENCY

As shown in Exhibit 3.4.5.1, consistent classification occurs when two forms agree on the classification of a student

as either at and above or below the performance standard, whereas inconsistent classification occurs when the two

decisions made on the basis of results from the two forms differ.

Exhibit 3.4.5.1: Classification Consistency

Classification on the 2nd Form Taken

Above the Cut Score Below the Cut Score

Classification on the 1st Form

Taken


N11 (Consistently above the cut)

N10 (Inconsistent)

Below the Cut Score

N01 (Inconsistent)

N00 (Consistently below the cut)

3.4.6 CLASSIFICATION ACCURACY AND CONSISTENCY ESTIMATES

Exhibit 3.4.6.1 presents the classification accuracy and consistency indices for fall 2017 administration of OST.

Accuracy classifications are slightly higher than the consistency classifications in almost all performance standards.

The consistency classification rate can be somewhat lower than the accuracy rate because consistency index

assumes two test scores, both of which include measurement error, while the accuracy index assumes only a single

test score plus the true score, which does not include measurement error. However, the accuracy index is lower

than the consistency rate in Geometry and Integrated Mathematics II, especially at the basic cut. This may indicate

the Geometry and Integrated Mathematics II tests are difficult especially for the low achieving students.

11N

01N 00N

10N


Exhibit 3.4.6.1: Decision Accuracy and Consistency Indices for Performance Standards

Grade / Course

Accuracy Consistency

Bas

ic

Pro

fici

en

t

Acc

ele

rate

d

Ad

van

ced

Bas

ic

Pro

fici

en

t

Acc

ele

rate

d

Ad

van

ced

ELA

Grade 3 0.88 0.90 0.92 0.96 0.84 0.86 0.89 0.94

ELA I 0.88 0.93 0.98 0.99 0.84 0.89 0.97 0.99

ELA II 0.89 0.94 0.98 0.99 0.85 0.91 0.98 0.99

Mathematics

Algebra 0.85 0.92 0.99 1.00 0.79 0.89 0.98 1.00

Geometry 0.66 0.79 0.95 0.99 0.77 0.81 0.96 0.99

Integrated Math I 0.87 0.94 0.99 1.00 0.83 0.91 0.98 1.00

Integrated Math II 0.69 0.84 0.96 0.99 0.78 0.84 0.96 0.99

Science

Biology 0.82 0.90 0.98 0.99 0.76 0.86 0.97 0.98

Physical Science 0.79 0.90 0.99 1.00 0.71 0.85 0.99 1.00

Social Studies

American Government 0.93 0.93 0.95 0.97 0.90 0.89 0.93 0.96

American History 0.87 0.90 0.98 0.99 0.83 0.85 0.97 0.98

3.4.7 RELIABILITY FOR SUBGROUPS IN THE POPULATION

Exhibits 3.4.7.1–3.4.7.3 show Cronbach’s alpha estimates of the internal consistency reliability of OST assessments for each of the subgroups: gender (females and males), ethnicity (African American, Asian, Hispanic/Latino, American Indian, White, and students reporting multiple ethnicities), as well as students’ Limited English Proficient (LEP) and Individualized Education Program (IEP) status. 20 Each of the ethnicity subgroups was composed of approximately equal numbers of males and females. As Exhibits 3.4.7.1–3.4.7.3 indicate, internal consistency reliabilities are generally consistent across subgroups, indicating that OST assessments measure an underlying achievement dimension that is in common across all subgroups. Where group reliabilities are attenuated, there is a corresponding decrease in test score variance for the subgroup, likely indicating that the attenuation of reliability is due to restriction of range in the subgroup.

20 Standard 2.11 – Test publishers should provide estimates of reliability/precision as soon as feasible for each relevant subgroup for which the test is recommended.


Exhibit 3.4.7.1 Internal Consistency Reliability by Subgroup Fall 2016 — Grade 3 and High School ELA

Assessments

Subgroup

Grade 3 ELA I ELA II

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 127,757 0.84 1882 41,904 0.86 555 35,466 0.86 677

Female 62,751 0.84 1865 18,094 0.86 547 16,371 0.86 636

Male 64,856 0.84 1887 23,036 0.85 554 18,418 0.86 700

Unknown Gender 150 0.78 1574 774 0.84 489 677 0.85 632

African American 23,117 0.75 1354 14,167 0.82 465 11,482 0.81 539

Asian 3,083 0.87 2360 764 0.87 604 728 0.88 751

Hispanic/Latino 5,356 0.79 1523 2,207 0.83 482 1,778 0.84 620

American Indian/Alaskan 158 0.83 1866 59 0.87 612 55 0.87 678

White 86,241 0.84 1793 21,694 0.87 563 18,860 0.87 701

Multi-Ethnic 9,640 0.82 1744 2,762 0.86 571 2,324 0.85 634

LEP 5,079 0.65 1036 2,843 0.79 416 2,282 0.77 466

IEP 14,830 0.76 1467 9,906 0.78 393 7,058 0.76 456

Exhibit 3.4.7.2: Internal Consistency Reliability by Subgroup Fall 2017 — High School Mathematics Assessments

Subgroup

Algebra Geometry Integrated Math I Integrated Math II

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 47,341 0.78 473 34,514 0.73 633 4,822 0.81 586 4,389 0.80 813

Female 22,249 0.76 440 16,700 0.71 583 2,342 0.81 564 2,092 0.79 736

Male 24,121 0.78 500 17,113 0.75 681 2,441 0.82 607 2,266 0.82 887

Unknown Gender 971 0.79 487 701 0.75 653 39 0.72 448 31 0.74 618

African American 14,667 0.69 373 10,499 0.56 427 2,326 0.74 429 1,988 0.65 487

Asian 571 0.92 1249 593 0.90 1367 213 0.83 641 184 0.88 1384

Hispanic/Latino 2,599 0.74 410 1,826 0.67 529 76 0.77 469 69 0.83 884

American Indian/Alaskan 74 0.67 311 46 0.70 465 14 0.61 296 8 0.81 1007

White 26,117 0.79 486 19,168 0.77 683 1,625 0.86 761 1,672 0.86 1086

Multi-Ethnic 3,032 0.75 429 2,120 0.69 562 565 0.79 540 466 0.71 568

LEP 2,005 0.71 402 1,515 0.69 585 812 0.72 415 619 0.67 552

IEP 9,283 0.61 313 6,346 0.50 412 664 0.70 426 534 0.59 465


Exhibit 3.4.7.3. Internal Consistency Reliability by Subgroup Fall 2017 — High School Social Studies and Science

Assessments

Subgroup

Biology Physical Science American Goverment American History

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 24,772 0.79 460 897 0.51 197 32,259 0.92 385 20,912 0.85 397

Female 11,992 0.78 430 435 0.48 185 15,625 0.91 355 10,329 0.83 326

Male 12,467 0.80 492 444 0.53 206 16,221 0.92 415 10,258 0.87 467

Unknown Gender 313 0.65 284 18 0.62 288 413 0.87 233 325 0.87 447

African American 8,825 0.56 238 448 0.43 181 7,154 0.87 248 7,534 0.77 266

Asian 391 0.87 793 6 0.09 97 749 0.92 463 348 0.88 472

Hispanic/Latino 1,249 0.67 301 39 0.05 94 1,313 0.89 272 1,130 0.80 305

American Indian/Alaskan 39 0.73 401 0 NA NA 47 0.91 335 28 0.85 379

White 12,544 0.84 559 325 0.58 220 21,092 0.91 364 10,431 0.88 452

Multi-Ethnic 1,620 0.77 437 66 0.53 183 1,771 0.91 353 1,348 0.84 375

LEP 1,467 0.46 199 41 -0.04 88 1,100 0.80 168 1,404 0.75 250

IEP 4,292 0.54 239 196 0.24 136 4,541 0.85 223 4,145 0.75 263

3.4.8 RELIABILITY FOR SUBSCALES

Internal consistency reliability estimates associated with the subscales for the fall 2017 operational forms are presented in Exhibits 3.4.8.1–3.4.8.4. As indicated in the Exhibits, subscale reliabilities are generally moderate in magnitude, as expected for subscales of the length observed in OST. The very low subscale reliabilities in ELA I, ELA II writing, and Integrated Mathematics II probability are due to the skewed subscore distribution. For example, in writing tests, 70% students got raw score 0 and 1. In probability subscale, 57% students got raw score 0 and 1.

Exhibit 3.4.8.1: Subscale Reliabilities — Fall 2017 ELA

Grade Reading

Informational Text

Reading

Literary Text Writing

Grade 3 0.68 0.68 0.81

ELA I 0.70 0.58 0.73

ELA II 0.71 0.61 0.71


Exhibit 3.4.8.2: Subscale Reliabilities — Fall 2017 Mathematics

Algebra

Functions Modeling and Reasoning Number, Quantities,

Equations and Expressions

Statistics

0.58 0.67 0.54 0.41

Geometry

Congruence & Proof

Circles Modeling and

Reasoning Probability

Similarity & Trigonometry

0.44 NA 0.52 0.20 0.33

Integrated Mathematics I

Algebra Geometry Modeling and

Reasoning Number &

Quantity/Functions Statistics

0.48 0.25 0.69 0.58 0.46

Integrated Mathematics II

Functions Geometry Modeling and

Reasoning

Number, Quantities,


Probability

0.19 0.52 0.56 0.55 0.24

NA: Negative reliability due to large SEM and small variance of scale scores.

Exhibit 3.4.8.3: Subscale Reliabilities — Fall 2017 Biology and Physical Science

Biology

Heredity Evolution Diversity and

Interdependence of Life

Cells

0.31 0.41 0.48 0.49

Physical Science

Study of Matter Energy and Waves Forces and Motion The Universe

0.01 0.11 0.14 -0.09



Exhibit 3.4.8.4: Subscale Reliabilities — Fall 2017 Social Studies

American Government

Historic Documents Principles and Structure Ohio/Policy/Economy

0.82 0.83 0.69

American History

Skills and Documents 1877-1945 1945-Present

0.66 0.72 0.59

3.4.9 SUBSCALE INTERCORRELATION

The observed correlations among reporting category scores are presented in Exhibits 3.4.9.1–3.4.9.9.

Exhibit 3.4.9.1 Subscale Intercorrelations — Fall 2017 ELA

Grade / Course

Subscale Observed Correlation

RI RL

Grade 3 RL 0.66

W 0.45 0.44

ELA I RL 0.58

W 0.55 0.50

ELA II RL 0.60

W 0.55 0.50 Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing

Exhibit 3.4.9.2 Subscale Intercorrelations — Fall 2017 Algebra

Grade Subscale Observed Correlation

F MR NQEE

Algebra

MR 0.77

NQEE 0.57 0.69

S 0.50 0.79 0.48

Note: F = Functions; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; S = Statistics

Exhibit 3.4.9.3 Subscale Intercorrelations — Fall 2017 Geometry


CP C MR P

Geometry C 0.40

MR Reasoning

0.70 0.60



CP C MR P

P 0.44 0.36 0.73

ST 0.52 0.46 0.63 0.48

Note: CP = Congruency & Proof; C = Circles; MR = Model Reasoning; P = Probability; ST = Similarity & Trigonometry

Exhibit 3.4.9.4 Subscale Intercorrelations — Fall 2017 Integrated Mathematics I


A G MR NQF

Integrated Math I

G 0.48

MR Reasoning

0.72 0.61

NQF 0.59 0.50 0.77

S 0.52 0.45 0.81 0.55 Note: A = Algebra; G = Geometry; MR = Model Reasoning; NQF = Number & Quantity/Functions; S = Statistics

Exhibit 3.4.9.5 Subscale Intercorrelations — Fall 2017 Integrated Mathematics II

Grade Subscale

Observed Correlation

F G MR NQEE

Integrated Math II

G 0.47

MR Reasoning

0.65 0.69

NQEE 0.50 0.52 0.63

P 0.46 0.50 0.75 0.49 Note: F = Functions; G = Geometry; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; P = Probability

Exhibit 3.4.9.6 Subscale Intercorrelations — Fall 2017 Biology

Grade Subscale Observed Correlations

BS-A BS-B BS-C

Biology

BS-B 0.45

BS-C 0.49 0.50

BS-D 0.48 0.47 0.51 Note: BS-A = Heredity; BS-B = Evolution; BS-C = Diversity and Interdependence of Life; BS-D = Cells


Exhibit 3.4.9.7 Subscale Intercorrelations Fall 2016 — Physical Science

Grade Subscale Observed Correlations

PS-A PS-B PS-C

Physical Science

PS-B 0.23

PS-C 0.26 0.17

PS-D 0.18 0.15 0.12 Note: PS-A = Study of Matter; PS-B = Energy & Waves; PS-C = Forces & Motions; PS-D = The Universe

Exhibit 3.4.9.8 Subscale Intercorrelations and Reliability Estimates — Fall 2017 American Government

Grade Subscale

Observed Correlations

AGA AGB

American Government AGB 0.78

AGC 0.71 0.73 Note: AGA = Historic Documents; AGB = Principles & Structures; AGC = Ohio/Policy/Economy

Exhibit 3.4.9.9 Subscale Intercorrelations and Reliability Estimates — Fall 2017 American History

Grade Subscale

Observed Correlations

AHA AHB

American History AHB 0.67

AHC 0.58 0.62 Note: AHA = Skills & Documents; AHB = 1877-1945; AHC = 1945–Present


4. SUMMARY OF SPRING 2018 OPERATIONAL TEST ADMINSTRATION

The following OST assessments were administered in spring 2018:

• ELA in grades 3–8, and high school EOC assessments in ELA I and ELA II

• Mathematics in grades 3–8 and high school EOC assessments in Algebra I, Geometry (or alternatively

Integrated Mathematics I and Integrated Mathematics II)

• Science in grades 5, grade 8, and high school EOC assessments in Biology and Physical Science (the Physical

Science assessment is being phased out; only a small number of students who had previously failed the

exam participated in the assessment)

• Social Studies high school EOC assessments in American Government and American History

The third operational administration of the full system of OST assessments in ELA and mathematics took place in spring 2018. The ELA testing window occurred from March 26 through April 27 and the mathematics, science, and social studies testing window occurred from April 2 through May 11. Item parameters for all the ELA assessments were freely calibrated following the spring administration. The mean-mean equating procedure was used to link the spring 2017 OST ELA item parameters to the OST assessments’ scale which was established following spring 2016 administration.

The mathematics, science, and social studies tests were scored using the pre-equated parameters calibrated following the spring 2016 administration of Ohio’s State Tests in those subject areas.

This section summarizes the operational test results for the spring 2018 administration of Ohio’s State Tests. Detailed descriptions of procedures for item and test development, test administration, scaling, equating, and scoring are presented in subsequent sections.


4.1 STUDENT POPULATION AND PARTICIPATION

Assessment data for operational analyses included Ohio public school students who met minimum attemptedness requirements for scoring and reporting. The demographic composition of students taking OST assessments is presented in Exhibits 4.1.1–4.1.3 by assessment and subgroup.21 The number of students participating in each assessment by test mode is presented in Appendix C.

Exhibit 4.1.1: Number of Students Participating in Spring 2018 — ELA Online and Paper-Pencil

Exhibit 4.1.2: Number of Students Participating in Spring 2018 — Mathematics Online and Paper-Pencil

21 Standard 1.8 – The composition of any sample of students from which validity evidence is obtained should be described in as much detail as is practical and permissible, including major relevant socio-demographic and developmental characteristics.

Grade / Course

Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Grade 3 126,540 62,181 64,198 161 22,913 3,039 5,335 157 85,292 9,640 5,239 16,500

Grade 4 126,494 61,924 64,397 173 20,774 3,121 5,225 164 87,972 9,106 4,067 17,602

Grade 5 127,957 62,630 65,145 182 21,365 3,129 5,071 188 89,186 8,876 3,565 17,804

Grade 6 126,408 61,828 64,393 187 20,559 3,019 5,096 174 89,046 8,375 3,248 17,204

Grade 7 124,315 60,896 63,166 253 18,947 3,068 4,652 160 89,417 7,919 2,951 16,465

Grade 8 125,288 61,084 64,012 192 19,194 2,985 4,538 158 90,302 7,920 2,980 16,599

ELA I 149,393 71,718 77,090 585 27,610 3,487 5,948 204 102,439 9,385 5,613 22,036

ELA II 139,973 68,577 70,824 572 23,745 3,367 5,309 192 98,736 8,353 4,524 19,191

Grade / Course

Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Grade 3 127,422 62,613 64,625 184 23,023 3,027 5,370 157 86,015 9,661 5,244 16,649

Grade 4 125,922 61,718 64,023 181 20,781 3,035 5,227 161 87,515 9,069 4,061 17,605

Grade 5 126,613 62,075 64,340 198 21,321 2,932 5,061 191 88,167 8,799 3,565 17,784

Grade 6 124,820 61,145 63,461 214 20,540 2,814 5,027 174 87,824 8,296 3,258 17,252

Grade 7 119,692 58,780 60,652 260 18,692 2,618 4,603 154 85,806 7,663 2,939 16,415

Grade 8 97,465 47,014 50,266 185 16,670 1,946 4,060 129 67,923 6,597 2,834 16,011

Algebra 144,489 69,793 74,065 631 25,030 3,124 5,833 206 101,511 8,494 4,044 20,429

Geometry 127,017 62,944 63,551 522 20,224 2,860 4,888 180 91,575 7,045 3,092 16,009

Int Math I 12,228 5,887 6,287 54 3,823 496 302 25 6,255 1,305 1,447 1,893

Int Math II 10,536 5,145 5,332 59 3,057 435 268 27 5,757 970 1,008 1,557


Exhibit 4.1.3: Number of Students Participating in Spring 2018 — Science and Social Studies Online and Paper-

Pencil

4.2 SUMMARY OF OVERALL STUDENT PERFORMANCE FOR SPRING 2018

The state summary results for the average scale scores, standard deviation, standard error measurement, minimum and maximum observed scale scores and reliability of the overall test are presented in Exhibit 4.2.1.

Exhibit 4.2.1: Spring 2018 Operational Test Summary Statistics

Operational Summary Statistics


Max Obtained

Scaled Score

Min Obtained

Scaled Score

Scale Score Mean

Scale Score Standard Deviation

Scale Score SEM

Reliability

ELA

Grade 3 126,540 863 545 710.88 48.41 17.78 0.87

Grade 4 126,494 846 549 715.11 43.83 16.33 0.86

Grade 5 127,957 848 552 719.40 46.20 16.54 0.87

Grade 6 126,408 851 555 707.69 41.83 12.99 0.90

Grade 7 124,315 833 568 710.20 40.01 11.94 0.91

Grade 8 125,288 805 586 700.32 30.45 9.62 0.90

ELA I 149,393 800 606 707.94 30.52 8.78 0.92

ELA II 139,973 808 597 703.19 30.96 9.27 0.91

Mathematics

Grade 3 127,422 818 587 719.56 47.92 12.86 0.93

Grade 4 125,922 835 605 728.85 49.05 12.73 0.93

Grade 5 126,613 804 624 711.09 39.20 9.90 0.94

Grade / Course O

vera

ll

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Science

Grade 5 127,869 62,614 65,059 196 21,265 3,139 5,082 188 89,184 8,868 3,570 17,766

Grade 8 126,202 61,719 64,281 202 19,151 3,001 4,590 155 91,211 7,905 2,988 16,558

Biology 135,480 67,025 68,031 424 21,995 3,230 4,884 186 97,025 7,895 3,888 17,665

Physical Science 484 245 235 4 196 4 20 1 216 40 25 94

Social Studies

American Government 87,077 43,010 43,653 414 14,369 1,568 2,930 122 62,876 5,021 2,158 10,474

American History 126,208 62,431 63,375 402 20,531 2,491 4,691 163 90,768 7,330 3,919 17,314


Operational Summary Statistics


Max Obtained

Scaled Score

Min Obtained

Scaled Score

Scale Score Mean

Scale Score Standard Deviation

Scale Score SEM

Reliability

Grade 6 124,820 790 616 708.74 37.46 9.29 0.94

Grade 7 119,692 806 605 708.49 40.74 10.55 0.93

Grade 8 97,465 774 633 701.91 27.78 7.49 0.93

Algebra 144,489 814 618 703.27 34.14 9.78 0.92

Geometry 127,017 810 604 693.30 41.93 11.43 0.93


Integrated Math II 10,536 813 594 684.48 37.95 11.79 0.90

Science

Grade 5 127,869 845 559 720.45 46.43 13.82 0.91

Grade 8 126,202 868 575 718.20 44.71 13.95 0.90

Biology 135,480 823 617 715.36 29.09 9.14 0.90

Physical Science 484 754 634 679.34 18.24 10.73 0.65

Social Studies

American Government 87,077 774 642 712.61 16.68 5.41 0.89

American History 126,208 800 619 716.79 26.15 7.62 0.92

The percentage of students in each performance level by grade and content area, as well as the percent of students at or above Proficient are presented in Exhibit 4.2.2.

Exhibit 4.2.2: Spring 2018 Percentage of Students in Performance Levels


% Limited

% Basic

% Proficient

% Accelerated

% Advanced

% At or

Above Proficient

ELA

Grade 3 126,540 20 20 18 18 24 60

Grade 4 126,494 16 18 22 22 22 66

Grade 5 127,957 13 17 19 28 22 70

Grade 6 126,408 16 24 24 21 15 59

Grade 7 124,315 15 21 24 21 18 64

Grade 8 125,288 27 19 31 15 8 54

ELA I 149,393 20 20 30 14 18 62

ELA II 139,973 20 22 33 15 10 59

Mathematics

Grade 3 127,422 23 10 20 19 27 67

Grade 4 125,922 19 9 18 26 29 73

Grade 5 126,613 27 10 26 18 19 63

Grade 6 124,820 25 16 23 17 19 59

Grade 7 119,692 27 14 22 23 15 59



% Limited

% Basic

% Proficient

% Accelerated

% Advanced

% At or

Above Proficient

Grade 8 97,465 31 15 33 14 7 54

Algebra 144,489 27 21 26 18 9 53

Geometry 127,017 40 17 19 16 9 44

Integrated Math I 12,228 45 17 21 13 6 40

Integrated Math II 10,536 48 23 14 11 5 30

Science

Grade 5 127,869 11 20 20 24 25 69

Grade 8 126,202 15 17 23 29 15 68

Biology 135,480 15 15 33 10 26 70

Physical Science 484 58 38 9 1 0 10

Social Studies

American Government 87,077 6 14 59 16 6 81

American History 126,208 11 17 33 17 22 72

4.3 STUDENT PERFORMANCE BY SUBGROUP FOR SPRING 2018

Exhibits 4.3.1–4.3.4 presents the percentage of students in each grade and subject at each performance level, by gender and ethnicity, including female, male, African American, Asian, Alaskan/Hawaiian Native, Hispanic/Latino, American Indian, White, and Multiple Ethnicities, other demographic information such as special education (SPED) and limited English proficiency (LEP). Performance of African American and Hispanic students lags considerably behind performance of White and Asian students and this performance gap continues to be a concern.

Exhibit 4.3.1: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 — ELA

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

IEP

LEP

Grade 3

Limited 20 17 22 29 38 13 29 19 14 23 48 48

Basic 20 19 21 30 26 15 26 20 18 23 25 29

Proficient 18 18 19 19 16 14 18 18 19 18 13 14

Accelerated 18 18 17 14 11 19 13 18 20 17 8 7

Advanced 24 27 21 9 8 39 13 24 29 19 6 3

Grade 4

Limited 16 14 18 31 32 11 24 20 11 20 47 49

Basic 18 18 19 25 27 11 23 14 16 21 24 27

Proficient 22 22 22 20 21 16 22 25 22 22 16 15

Accelerated 22 22 22 14 14 23 17 19 25 19 9 7

Advanced 22 24 19 10 7 40 13 22 26 17 4 2


Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

IEP

LEP

Grade 5

Limited 13 10 15 24 29 10 19 15 8 16 43 47

Basic 17 16 18 27 27 9 24 16 14 20 28 30

Proficient 19 19 20 22 21 13 21 22 19 21 15 15

Accelerated 28 29 27 16 17 28 24 30 31 25 10 7

Advanced 22 25 20 12 6 40 12 16 27 17 4 2

Grade 6

Limited 16 12 20 36 35 12 23 19 11 20 52 58

Basic 24 22 27 32 33 13 30 30 22 30 31 31

Proficient 24 24 23 19 19 19 23 24 25 23 11 9

Accelerated 21 23 19 10 9 23 15 21 24 17 4 3

Advanced 15 19 12 4 4 34 8 6 18 11 1 1

Grade 7

Limited 15 12 19 18 34 11 24 18 11 20 51 60

Basic 21 19 23 25 31 10 26 25 19 26 30 28

Proficient 24 24 24 27 20 17 24 23 25 23 13 9

Accelerated 21 23 20 22 10 23 16 19 24 18 5 3

Advanced 18 22 15 9 4 38 10 16 22 13 2 1

Grade 8

Limited 27 22 32 48 55 17 40 28 20 33 71 79

Basic 19 18 20 17 21 11 21 17 19 21 17 14

Proficient 31 33 29 24 18 27 26 37 34 29 10 6

Accelerated 15 17 13 9 5 22 9 13 18 12 2 1

Advanced 8 9 6 3 1 23 3 4 9 6 1 0

ELA I

Limited 20 16 24 49 42 20 33 23 13 25 52 66

Basic 20 18 21 26 29 11 23 24 17 23 30 24

Proficient 30 30 29 22 23 20 27 34 32 29 16 10

Accelerated 14 15 13 5 6 14 9 12 17 11 2 1

Advanced 18 22 14 2 4 36 9 10 22 13 1 0

ELA II

Limited 20 16 24 48 43 20 35 27 14 25 55 70

Basic 22 21 23 29 30 15 24 23 20 24 30 23

Proficient 33 35 31 16 22 26 29 32 36 32 14 9

Accelerated 15 17 14 4 5 18 9 13 18 12 2 0

Advanced 10 12 9 4 2 22 4 7 12 7 1 0


Exhibit 4.3.2: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 —

Mathematics

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

IEP

LEP

Grade 3

Limited 23 22 23 34 46 11 31 31 16 29 52 44

Basic 10 11 10 20 14 5 13 8 9 12 13 14

Proficient 20 21 20 29 20 14 24 21 21 22 17 22

Accelerated 19 20 19 9 12 18 17 17 22 18 10 12

Advanced 27 26 28 10 9 53 15 24 33 20 8 8

Grade 4

Limited 19 19 19 36 43 8 26 23 12 25 49 44

Basic 9 9 8 10 13 5 12 10 7 11 12 13

Proficient 18 19 17 25 20 11 21 21 18 21 17 20

Accelerated 26 26 25 17 16 22 24 24 28 24 14 15

Advanced 29 27 31 11 8 55 17 22 35 19 7 8

Grade 5

Limited 27 26 28 48 57 13 38 35 19 35 60 60

Basic 10 11 10 11 13 6 13 13 10 12 12 13

Proficient 26 28 25 21 20 18 28 26 28 27 18 18

Accelerated 18 18 18 13 7 20 13 12 21 15 6 5

Advanced 19 17 20 7 4 43 9 15 23 12 4 4

Grade 6

Limited 25 23 26 49 53 13 35 31 17 32 63 65

Basic 16 17 16 24 20 9 20 16 15 20 17 17

Proficient 23 24 22 17 17 16 23 25 25 23 13 12

Accelerated 17 18 17 6 7 17 13 18 20 13 5 4

Advanced 19 18 19 4 4 44 9 10 23 12 3 3

Grade 7

Limited 27 26 28 39 55 15 38 36 20 36 67 68

Basic 14 14 14 22 18 8 17 15 13 16 14 15

Proficient 22 22 21 22 17 17 22 19 23 22 11 11

Accelerated 23 24 22 15 9 26 16 21 27 17 6 5

Advanced 15 14 15 3 3 35 7 8 17 9 2 2

Grade 8

Limited 31 29 34 57 58 17 40 40 24 38 68 67

Basic 15 15 15 11 16 9 17 21 15 17 14 14

Proficient 33 34 31 25 20 25 30 24 36 31 14 15

Accelerated 14 15 13 6 5 20 10 11 17 10 3 3

Advanced 7 7 7 2 2 28 3 4 8 5 1 2

Algebra

Limited 27 24 29 55 54 12 43 30 19 32 65 65

Basic 21 21 21 27 26 11 24 26 20 24 22 22

Proficient 26 28 24 13 17 19 21 23 29 24 10 10


Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

IEP

LEP

Accelerated 18 20 17 5 5 25 10 17 22 14 3 3

Advanced 9 8 9 1 1 34 3 5 10 6 1 1

Geometry

Limited 40 38 41 79 73 19 58 47 31 49 83 78

Basic 17 18 16 15 15 10 17 20 18 17 10 12

Proficient 19 20 18 5 9 16 14 16 22 17 5 7

Accelerated 16 17 16 3 4 23 8 11 19 12 2 3

Advanced 9 8 10 1 1 32 3 7 10 6 1 1

Integrated Math I

Limited 45 41 49 76 69 52 40 60 28 58 80 84

Basic 17 17 16 17 19 13 15 24 15 18 12 12

Proficient 21 23 19 6 13 12 26 8 27 19 9 6

Accelerated 13 14 12 0 3 14 15 8 21 6 2 1

Advanced 6 6 6 2 0 10 4 0 10 3 0 0

Integrated Math II

Limited 48 47 49 85 74 47 49 56 32 59 82 85

Basic 23 24 22 10 21 22 22 30 24 25 15 15

Proficient 14 16 13 5 6 12 18 7 19 10 3 3

Accelerated 11 11 11 5 2 10 9 7 17 6 1 0

Advanced 5 5 6 0 0 10 5 0 8 3 1 0

Exhibit 4.3.3: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 —

Science

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Grade 5

Limited 11 11 11 26 30 7 17 12 6 14 36 38

Basic 20 22 19 28 34 11 29 23 17 25 32 37

Proficient 20 21 19 18 19 15 22 26 20 22 16 15

Accelerated 24 24 24 19 13 25 19 25 27 21 10 8

Advanced 25 22 27 9 5 42 13 15 30 18 6 3

Grade 8

Limited 15 14 17 30 39 8 23 17 9 20 45 51

Basic 17 18 16 25 28 10 24 21 15 22 27 27

Proficient 23 25 22 22 20 16 25 27 24 25 17 14


Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

Accelerated 29 30 29 18 11 33 21 28 34 24 9 7

Advanced 15 13 16 6 2 32 6 9 18 10 2 1

Biology

Limited 15 13 17 39 35 10 26 16 10 20 43 48

Basic 15 15 15 25 26 11 21 17 12 19 28 27

Proficient 33 36 31 29 30 23 32 41 34 34 24 21

Accelerated 10 11 10 4 4 10 7 9 12 8 3 2

Advanced 26 25 28 7 6 45 14 18 31 20 4 2

Physical Science

Limited 58 61 54 125 68 50 60 100 49 60 81 76

Basic 38 41 36 25 35 50 30 0 42 33 22 20

Proficient 9 7 11 25 6 0 5 0 12 10 5 0

Accelerated 1 0 2 0 1 0 5 0 1 0 0 4

Advanced 0 0 0 0 0 0 0 0 0 3 0 0

Exhibit 4.3.4: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 — Social

Studies

Grade / Course

Performance Level


Ove

rall

Fem

ale

Mal

e

Un

kno

wn

Afr

ican

Am

eri

can

Asi

an

His

pan

ic /

Lat

ino

Am

eri

can

Ind

ian

/

Ala

skan

Wh

ite

Mu

ltip

le E

thn

icit

ies

LEP

IEP

American

Government

Limited 6 5 6 19 13 6 12 10 3 8 21 25

Basic 14 14 15 26 27 13 23 19 11 18 37 37

Proficient 59 62 56 48 53 51 54 57 60 60 40 36

Accelerated 16 15 17 7 6 21 9 8 19 12 3 2

Advanced 6 5 7 1 1 10 2 6 7 3 1 0

American History

Limited 11 10 13 29 27 10 21 12 7 15 34 39

Basic 17 18 16 31 29 15 24 17 14 21 33 34

Proficient 33 36 30 28 30 28 32 31 34 34 25 22

Accelerated 17 17 17 7 9 17 12 17 19 15 5 3

Advanced 22 19 25 7 6 30 11 24 27 17 4 2


4.4 CLASSICAL ITEM ANALYSIS

Classical item statistics for multiple-choice (MC) and constructed-response (CR) items are calculated based on all student responses and used to monitor item behavior and investigate irregularities in item scoring throughout the testing window. Classical item analyses ensure that the items function as intended with respect to the underlying scales. AIR’s analysis program computed the required item and test statistics for each multiple-choice and constructed-response (CR) item to check the integrity of the item and to verify the appropriateness of the difficulty level of the item. Key statistics computed and examined include point biserial/polyserial correlations for item discrimination, biserial correlations for distractors for selected-response items, and proportion correct for item difficulty.

The point biserial/polyserial correlations indicate the extent to which each item differentiated between those students who possess the skills being measured and those who do not. In general, the higher the value, the better the item is able to differentiate between high- and low-achieving students. The point biserial/polyserial correlations are calculated as the correlation between the focal item score and the student’s IRT-based ability estimate. For polytomous items, the mean total number correct for student scoring within each of the possible score categories is also computed. Items with point biserial/polyserial correlations less than 0.25 are flagged and further reviewed by test development experts. For multiple-choice items, the point biserial correlation for each of the distractor response options is also computed.

The proportion correct score is the average number of available points achieved by students on the item. For dichotomous items, this is simply the proportion of students responding correctly. For polytomous items, dividing the average score on the item by the points possible produces a comparable index. The proportion correct score is commonly referred to as the p-value.

Exhibit 4.4.1 presents the average item p-values or proportion of total points and average point biserial/polyserial correlations for the operational test items. As indicated in Exhibit 4.4.1, the mean difficulty of ELA and social studies items is relatively consistent across grade-level assessments. However, average difficulty of mathematics and science items increases in general across grade levels and course assessments. The proportion of students responding to test items correctly in the end-of-course assessments in mathematics and science was relatively quite low. Mean point biserial correlations for the grade-level and end-of-course assessments are moderately high and generally consistent across assessments.

Exhibit 4.4.1: Average p-Value in Operational Test Administration

Grade / Course Average p-

Value p-Value SD

Average Point-Biserial

Point-Biserial SD

ELA

Grade 3 0.53 0.19 0.45 0.11

Grade 4 0.56 0.19 0.45 0.14

Grade 5 0.59 0.17 0.47 0.10

Grade 6 0.55 0.19 0.46 0.14

Grade 7 0.55 0.18 0.46 0.13

Grade 8 0.54 0.18 0.46 0.14

ELA I 0.54 0.19 0.46 0.16

ELA II 0.51 0.19 0.45 0.15

Mathematics

Grade 3 0.61 0.20 0.51 0.08

Grade 4 0.54 0.15 0.52 0.08

Grade 5 0.48 0.18 0.53 0.10


Grade / Course Average p-

Value p-Value SD

Average Point-Biserial

Point-Biserial SD

Grade 6 0.54 0.20 0.52 0.11

Grade 7 0.51 0.15 0.53 0.12

Grade 8 0.48 0.23 0.47 0.09

Algebra 0.43 0.17 0.48 0.12

Geometry 0.36 0.18 0.51 0.11

Integrated Math I 0.37 0.16 0.49 0.14

Integrated Math II 0.33 0.17 0.46 0.11

Science

Grade 5 0.59 0.21 0.45 0.09

Grade 8 0.49 0.23 0.42 0.09

Biology 0.47 0.14 0.44 0.12

Physical Science 0.21 0.16 0.27 0.11

Social Studies

American Government 0.55 0.18 0.43 0.09

American History 0.56 0.12 0.46 0.10

4.5 ITEM RESPONSE THEORY ANALYSIS

Calibration is the process by which the statistical relationship between student responses and the underlying measurement construct is estimated. Traditional item response models assume a single underlying trait and assume that items are independent given that underlying trait. In other words, the models assume that given the value of the underlying trait, knowing the response to one item provides no information about responses to other items. This basic simplifying assumption allows the likelihood function for these models to take the relatively simple form of a product over items for a single student:

𝐿(𝑍) = ∏ 𝑃(𝑧|𝜃)

𝑛

𝑗=1

,

where Z represents the vector of item responses, and θ represents a student’s true ability.

Traditional item response models differ only in the form of the function P(Z). The one-parameter model (also known as the Rasch model) is used to calibrate dichotomously scored OST items and takes the form

𝑃(𝑥𝑗 = 1|𝜃𝑘 , 𝑏𝑗) =1

1+𝑒(𝜃𝑘−𝑏𝑗)

= 𝑃𝑗1(𝜃𝑘).

The b parameter is often called the location or difficulty parameter—the greater the value of b, the greater the difficulty of the item. The one-parameter model assumes that the probability of a correct response approaches zero as proficiency (θk – bj) decreases toward negative infinity. In other words, the one-parameter model assumes that no guessing occurs. In addition, the one-parameter model assumes that all items are equally discriminating.

For items that have multiple, ordered response categories (i.e., partial credit items), OST items are calibrated using the Rasch family Masters’ (1982) partial credit model. Under Masters’ model, the probability of a response in category i for an item with mj categories can be written as


𝑃 (𝑥𝑗 = 𝑖|𝜃𝑘, 𝑏𝑗0 … 𝑏𝑗𝑚𝑗−1) =𝑒

∑ (𝜃𝑘−𝑏𝑗𝑣)𝑖𝑣=0

∑ 𝑒∑ (𝜃𝑘−𝑏𝑗𝑣)

𝑔𝑣=0

𝑚𝑗−1𝑔=0

.

Item banks for ELA and mathematics were freely calibrated following the close of the spring 2016 testing window, centering the mean item difficulty on the operational test form to establish the OST assessments’ scale. The linking constant necessary to bring the previously adopted performance standards onto the new OST scale was then computed. The procedures for calibration, equating, and scaling of tests are described in the Scaling and Equating section. Appendix E shows the operation item parameters for each test.

The tables in Appendix E provide Rasch and Masters’ partial credit model item parameter estimates for the spring 2018 operational test items. Since OST assessments are an online assessment system, bank item parameters were estimated based only on online responses to test items. Exhibits 4.5.1–4.5.4 present the mean and standard deviation of the Rasch item parameters by item type for each test for items administered online. Item types include traditional four-option multiple-choice (MC) items and machine-scored constructed-response (MSCR) items for which students’ constructed responses are scored electronically using explicit rubrics. MSCR includes natural-language items, grid items, and table items. In addition, there are technology enhanced (TE) items, hotspot (HS) items, and writing text (ER) items. The average Rasch difficulty is presented for each scoring dimension of the writing prompt administered at each grade. As illustrated in Exhibits 4.5.1–4.5.4, selected-response items are, on average, less difficult than the constructed-response item types.

Exhibit 4.5.1: Rasch Summary Statistics by Item Type — ELA

Grade / Course

MC TE HS Writing Prompt Average Rasch

N Avg

Rasch SD N

Avg Rasch

SD N Avg

Rasch SD Org

Ev / Elab

Conv

Grade 3 20 -0.41 0.74 7 0.96 0.63 3 1.19 1.30 1.92 1.96 -0.31

Grade 4 18 -0.39 0.80 8 0.85 0.95 3 1.09 0.95 1.60 1.67 -0.01

Grade 5 17 -0.31 0.76 10 0.45 1.14 3 0.21 1.70 1.19 1.19 -1.75

Grade 6 20 -0.50 0.90 13 0.64 0.70 6 -0.28 1.01 0.25 0.47 -1.57

Grade 7 20 -0.63 0.96 16 0.71 0.59 6 0.19 1.01 0.77 0.87 -1.07

Grade 8 23 -0.28 0.86 11 0.45 0.76 6 0.22 1.20 0.72 1.21 -1.27

ELA I 22 -0.41 0.86 12 0.71 0.80 6 0.32 1.36 1.04 1.33 -1.43

ELA II 24 -0.40 0.88 11 0.76 0.71 6 0.36 1.00 0.85 1.11 -0.89


Exhibit 4.5.2: Rasch Summary Statistics by Item Type — Mathematics

Grade / Course

MC MSCR TE

N Avg

Rasch SD N

Avg Rasch

SD N Avg

Rasch SD

Grade 3 9 -1.13 1.27 31 0.07 1.24 3 0.20 1.27

Grade 4 10 -0.19 0.79 30 0.09 1.06 8 0.06 0.73

Grade 5 7 -0.96 1.06 35 0.13 1.07 4 0.42 0.91

Grade 6 12 -0.35 1.40 28 0.01 1.26 6 0.69 1.37

Grade 7 12 -0.66 0.75 30 0.37 0.80 3 0.69 0.80

Grade 8 18 -0.94 0.93 29 0.60 1.39 3 1.03 0.83

Algebra 23 -0.34 0.64 22 0.45 1.18 2 -0.72 0.26

Geometry 14 -0.45 0.75 30 0.95 0.91 7 0.36 1.62

Int Math I 21 -0.38 0.59 19 0.52 1.14 7 -0.50 1.01

Int Math II 23 -0.13 0.64 26 0.80 1.24 3 1.83 0.75


Exhibit 4.5.3: Rasch Summary Statistics by Item Type — Science

Exhibit 4.5.4: Rasch Summary Statistics by Item Type — Social Studies

Item fit is evaluated via the mean square Infit and mean square Outfit statistics reported by WINSTEPS, which are based on weighted and unweighted standardized residuals for each item response, respectively. These residual statistics indicate the discrepancy between observed item responses and the predicted item responses based on the Rasch and Masters models. Both fit statistics have an expected value of 1. Values substantially greater than 1 indicate model underfit, while values substantially less than 1 indicate model overfit (Linacre, 2004). Items are flagged if Infit or Outfit values are less than 0.7 or greater than 1.3. Exhibit 4.5.5 summarizes the number of operational test items with Infit and Outfit statistics within the range of 0.7 to 1.3 and those items outside of that range. Appendix F shows OST assessments’ performance standards on the theta and scale score scale for current operational test administrations, and Appendix G provides the raw to scale score conversion tables. The operational field-test design of the spring 2018 ELA assessments makes it impossible to report the raw to scale score transformation. Therefore only mathematics, science, and social studies conversion tables are provided in Appendix G for the spring 2018 administration.

Exhibit 4.5.5: Summary of Item Fit Statistics

Grade / Course

Infit Outfit

Below 0.7

Between 0.7–1.3

Above 1.3

Below 0.7

Between 0.7–1.3

Above 1.3

ELA

Grade 3 2 28 0 3 26 1

Grade 4 4 24 1 5 23 1

Grade 5 2 27 1 3 24 3

Grade 6 4 35 0 7 27 5

Grade 7 4 36 2 5 31 6

Grade 8 6 31 3 8 28 4

ELA I 4 34 2 6 29 5

ELA II 6 32 3 9 28 4

Grade / Course

MC MSCR TE HS

N Avg

Rasch SD N

Avg Rasch

SD N Avg

Rasch SD N

Avg Rasch

SD

Grade 5 26 -0.73 0.72 9 0.67 1.04 13 0.95 1.22 - - -

Grade 8 25 -0.84 0.82 13 0.60 1.18 12 0.98 0.98 - - -

Biology 21 -0.19 0.67 15 0.16 0.75 9 0.20 0.60 - - -

Physical Science 18 -0.80 0.57 9 -0.21 0.91 13 1.16 0.57 4 0.31 0.43

Grade / Course

MC MSCR TE

N Avg

Rasch SD N

Avg Rasch

SD N Avg

Rasch SD

American Government 18 -0.35 0.91 5 0.15 0.72 21 0.31 0.69

American History 32 -0.28 0.51 5 -0.11 0.19 13 0.59 0.44


Grade / Course

Infit Outfit

Below 0.7

Between 0.7–1.3

Above 1.3

Below 0.7

Between 0.7–1.3

Above 1.3

Mathematics

Grade 3 0 41 2 0 39 4

Grade 4 0 44 4 1 42 5

Grade 5 0 44 2 3 36 7

Grade 6 1 43 2 4 32 10

Grade 7 0 41 3 3 32 9

Grade 8 1 48 1 1 41 8

Algebra 1 44 2 8 32 7

Geometry 0 48 3 10 32 9

Integrated Math I 0 42 5 12 24 11

Integrated Math II 5 45 2 13 27 12

Science

Grade 5 1 45 2 4 39 5

Grade 8 1 48 1 4 42 4

Biology 0 45 0 2 42 1

Physical Science 7 29 8 12 24 8

Social Studies

American History 0 45 5 1 43 6

American Government 2 40 2 4 34 6

4.6 RELIABILITY FOR SPRING 2018

Reliability refers to the consistency or precision of test scores and performance level classifications, and essentially addresses the question of how likely a student would be to achieve the same score, or be classified in the same performance level, across multiple administrations of equivalently constructed and administered test forms. As part of each test administration, the reliability of test scores and performance classifications is evaluated from a variety of perspectives. The reliability evidence of OST assessments in ELA and mathematics is demonstrated with respect to both classical and IRT indices of internal consistency of test scores, and decision accuracy and consistency of performance level classifications.22

4.6.1 INTERNAL CONSISTENCY

Test score reliability is traditionally estimated using both classical and IRT approaches. While measurement error is conditional on test information, it is nevertheless desirable to provide a single index of a test’s internal consistency

22 Standard 2.2 – The evidence provided for the reliability/precision of the scores should be consistent with the domain of replications associated with the testing procedures, and with the intended interpretations for use of the test scores. Standard 2.3 – For each total score, subscore, or combination of scores that is to be interpreted, estimates of relevant indices of reliability/precision should be reported.


or reliability. Classical estimates of test reliability such as Cronbach’s alpha, provide an index of the internal consistency reliability of the test, or the likelihood that a student would achieve the same score in an equivalently constructed test form. 23 Exhibit 4.6.1.1 shows the internal consistency estimates for each OST assessment in mathematics, science, and social studies. Internal consistency estimates are uniformly near 0.9, typical of most similar length achievement tests. Internal consistency reliability for the Physical Science assessments is quite low, likely due to truncation of range in the highly selected sample of students participating in the test administration. The Physical Science assessment is being phased out and only students who had previously taken the test and not met performance requirements participated.

Exhibit 4.6.1.1: Internal Consistency Reliabilities (Cronbach’s alpha) for OST Scores

Grade / Course Internal Consistency

Reliability Variance

ELA

Grade 3 0.87 2344

Grade 4 0.86 1920

Grade 5 0.87 2132

Grade 6 0.90 1747

Grade 7 0.91 1598

Grade 8 0.90 926

ELA I 0.92 931

ELA II 0.91 958

Mathematics

Grade 3 0.93 2292

Grade 4 0.93 2400

Grade 5 0.94 1536

Grade 6 0.94 1401

Grade 7 0.93 1658

Grade 8 0.93 770

Algebra 0.92 1165

Geometry 0.93 1756

Integrated Math I 0.91 1270

Integrated Math II 0.90 1439

Science

Grade 5 0.91 2154

Grade 8 0.90 1999

Biology 0.90 846

Physical Science NA NA

Social Studies

American Government 0.89 278

American History 0.91 683

NA: Not enough information to estimate reliably.

23 Standard 2.19


4.6.2 STANDARD ERROR OF MEASUREMENT

The figures in Exhibit 4.6.2.1–4.6.2.4 present graphically the standard errors of measurement for the grade-level and end-of-course assessments. Each figure also includes the location of the four OST performance standard cuts. As the figures indicate, OST assessments’ test scores are most precise near the middle of the ability distribution, and especially near the Proficient and Accelerated performance standard.24 Test scores near the tails of the ability distribution are somewhat less precise as expected. We note that the test precision for some assessments, especially the elementary grade ELA tests, does not support the number of performance standards adopted for OST assessments. Thus, the standard errors for scores within some performance levels is nearly the size of the performance level. Nevertheless, classification consistency estimates of scores at or above each performance standard are strong.

Exhibit 4.6.2.1: Overall Standard Error of Measurement for ELA

24 Standard 2.14 – When possible and appropriate, conditional standard errors of measurement should be reported at several score levels unless there is evidence that the standard error is constant across score levels. Where cut scores are specified for selection or classification, the standard errors of measurement should be reported in the vicinity of each cut score.


Exhibit 4.6.2.2: Overall Standard Error of Measurement for Mathematics


Exhibit 4.6.2.3: Overall Standard Error of Measurement for Science


Exhibit 4.6.2.4: Overall Standard Error of Measurement for Social Studies

4.6.3 STUDENT CLASSIFICATION RELIABILITY

When student performance is reported in terms of performance categories, a reliability index is computed to estimate the likelihood of consistent classification of students as specified in standard 2.15 in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014). 25 This index considers the consistency of classifications for the percentage of students that would, hypothetically, be classified in the same category on an alternate, equivalent form.

For a fixed-form test, the consistency of classifications is typically estimated on test scores based on a single test form from a single test administration using the true-score distribution, which is estimated by fitting a bivariate beta-binomial model or a four-parameter beta model (Huynh, 1976; Livingston & Wingersky, 1979; Subkoviak, 1976; Livingston & Lewis, 1995).

The classification index can be examined for classification accuracy and classification consistency. Classification accuracy refers to the agreement between the classifications based on the form actually taken and the classifications that would be made on the basis of the students’ true scores, if their true scores could somehow be known. Classification consistency refers to the agreement between the classifications based on the form actually taken and the classifications that would be made on the basis of an alternate, equivalently constructed test form—that is, the percentages of students who would be consistently classified in the same performance levels on two equivalent test administrations.

In reality, the student’s true ability is unknown, and students are not administered an alternate, equivalent form. Therefore, classification accuracy and consistency are estimated based on students’ item scores and the item parameters, and the assumed underlying latent ability distribution as described below. The true score is an expected value of the test score with measurement error.

25 Standard 2.16 – When a test or combination of measures is used to make classification decisions, estimates should be provided of the percentage of students who would be classified in the same way on two replications of the procedure.


4.6.4 CLASSIFICATION ACCURACY

Instead of assuming a normal distribution we can directly estimate the probability of consistent classification using the likelihood function. The likelihood function of 𝜃 given a student’s item scores represents the likelihood of the student’s ability at that theta value. Integrating the likelihood values over the range of theta at and above the cut score (with proper normalization) represents the probability of the student’s latent ability or the true score being at or above that cut point.

If a student’s estimated ability (theta) is below the cut score, the probability of at or above the cut score is an estimate of the chance that this student is misclassified as below the cut score, and 1 minus that probability is the estimate of the chance that the student is correctly classified as below the cut score. Using this logic, we can define various classification probabilities.

In Exhibit 4.6.4.1, accurate classifications occur when the classification decision made on the basis of the hypothetical true score agrees with the decision made on the basis of the form actually taken. Misclassifications, false positives and false negatives, occur when students’ true score classifications are different from students’ observed scores (e.g., a student whose true score results in a classification as Proficient, but whose observed score results in an incorrect classification as Partially Proficient). represents the expected numbers of students who

are truly above the cut score; represents the expected number of students falsely above the cut score;

represents the expected number of students truly below the cut score; and represents the number of students

falsely below the cut score.

Exhibit 4.6.4.1: Classification Accuracy

Classification on a Form Actually Taken

At or Above the Cut Score Below the Cut Score

Classification on True Score


𝑁11 (Truly above the cut) 𝑁10 (False negative)

Below the Cut Score 𝑁01 (False positive) 𝑁00 (Truly below the cut)

4.6.5 CLASSIFICATION CONSISTENCY

As shown in Exhibit 4.6.5.1, consistent classification occurs when two forms agree on the classification of a student as either at and above or below the performance standard, whereas inconsistent classification occurs when the two decisions made on the basis of results from the two forms differ.

Exhibit 4.6.5.1: Classification Consistency

Classification on the 2nd Form Taken

Above the Cut Score Below the Cut Score

Classification on the 1st Form

Taken


N11 (Consistently above the cut)

N10 (Inconsistent)

Below the Cut Score

N01 (Inconsistent)

N00 (Consistently below the cut)

11N

01N 00N

10N


4.6.6 CLASSIFICATION ACCURACY AND CONSISTENCY ESTIMATES

Exhibit 4.6.6.1 presents the classification accuracy and consistency indices for the spring 2018 administration of OST. Accuracy classifications are slightly higher than the consistency classifications in almost all performance standards. The consistency classification rate can be somewhat lower than the accuracy rate because the consistency index assumes two test scores, both of which include measurement error, while the accuracy index assumes only a single test score plus the true score, which does not include measurement error.

Exhibit 4.6.6.1: Decision Accuracy and Consistency Indexes for Performance Standards

Grade / Course

Accuracy Consistency

Bas

ic

Pro

fici

en

t

Acc

ele

rate

d

Ad

van

ced

Bas

ic

Pro

fici

en

t

Acc

ele

rate

d

Ad

van

ced

ELA

Grade 3 0.92 0.90 0.90 0.91 0.89 0.86 0.86 0.89

Grade 4 0.93 0.90 0.90 0.92 0.90 0.86 0.86 0.89

Grade 5 0.94 0.91 0.89 0.91 0.92 0.88 0.86 0.87

Grade 6 0.94 0.92 0.91 0.94 0.92 0.89 0.88 0.91

Grade 7 0.95 0.92 0.91 0.93 0.93 0.89 0.88 0.91

Grade 8 0.93 0.91 0.92 0.96 0.90 0.88 0.89 0.94

ELA I 0.94 0.93 0.93 0.94 0.91 0.90 0.90 0.92

ELA II 0.94 0.92 0.92 0.95 0.92 0.89 0.89 0.93

Mathematics

Grade 3 0.94 0.93 0.93 0.93 0.92 0.91 0.90 0.91

Grade 4 0.95 0.94 0.94 0.94 0.93 0.92 0.91 0.91

Grade 5 0.94 0.93 0.94 0.96 0.92 0.91 0.91 0.94

Grade 6 0.95 0.94 0.94 0.95 0.92 0.91 0.91 0.93

Grade 7 0.94 0.94 0.94 0.96 0.92 0.91 0.91 0.94

Grade 8 0.93 0.93 0.94 0.97 0.90 0.90 0.92 0.96

Algebra I 0.92 0.93 0.94 0.97 0.89 0.90 0.92 0.96

Geometry 0.83 0.85 0.88 0.92 0.88 0.89 0.92 0.95

Integrated Math I 0.92 0.94 0.96 0.98 0.89 0.92 0.94 0.97

Integrated Math II 0.80 0.85 0.92 0.96 0.84 0.88 0.93 0.97

Science

Grade 5 0.95 0.93 0.92 0.92 0.94 0.90 0.88 0.89

Grade 8 0.94 0.91 0.91 0.95 0.91 0.88 0.88 0.93

Biology 0.93 0.92 0.92 0.93 0.90 0.89 0.89 0.90

Physical Science 0.84 0.92 0.98 1.00 0.78 0.88 0.97 0.99

Social Studies

American Government 0.97 0.93 0.93 0.97 0.95 0.90 0.91 0.96

American History 0.95 0.94 0.93 0.93 0.93 0.91 0.90 0.91


4.6.7 RELIABILITY FOR SUBGROUPS IN THE POPULATION

Exhibits 4.6.7.1–4.6.7.7 show the Cronbach’s alpha estimates of internal consistency reliability for each of the subgroups: gender (females and males), ethnicity (African American, Asian, Hispanic/Latino, American Indian, White, and students reporting multiple ethnicities), as well as students’ Limited English Proficient (LEP) and Individualized Education Program (IEP) status.26 Each of the ethnicity subgroups was composed of approximately equal numbers of males and females. As the Exhibits indicate, internal consistency reliabilities are generally consistent across subgroups, indicating that OST assessments measure an underlying achievement dimension that is in common across all subgroups. Where group reliabilities are attenuated, there is a corresponding decrease in test score variance for the subgroup, likely indicating that the attenuation of reliability is due to restriction of range in the subgroup.

Exhibit 4.6.7.1: Internal Consistency Reliability by Subgroup — Grades 3–6 ELA Assessments

Subgroup

Grade 3 Grade 4 Grade 5 Grade 6

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 126,050 0.87 2344 125,854 0.86 1920 127,446 0.87 2132 125,943 0.90 1747

Female 61,977 0.86 2325 61,651 0.86 1919 62,418 0.87 2050 61,650 0.90 1662

Male 63,914 0.86 2323 64,033 0.86 1900 64,848 0.88 2172 64,106 0.90 1750

Unknown Gender 159 0.82 1802 170 0.85 1777 180 0.88 2289 187 0.89 1515


Asian 3,033 0.88 2746 3,115 0.88 2515 3,125 0.88 2683 3,011 0.92 2419

Hispanic/Latino 5,323 0.85 2148 5,201 0.86 1860 5,051 0.87 1971 5,088 0.90 1684


White 84,936 0.85 2117 87,542 0.85 1758 88,832 0.85 1891 88,720 0.89 1547

Multi-Ethnic 9,621 0.86 2280 9,060 0.86 1867 8,841 0.87 2078 8,349 0.90 1671

LEP 5,202 0.78 1581 4,044 0.79 1364 3,549 0.82 1527 3,231 0.85 1293

IEP 16,183 0.82 2003 17,185 0.82 1599 17,484 0.85 1728 16,888 0.86 1302

Note: LEP: Limited English Proficiency; IEP: Individualized Education Program

26 Standard 2.11 – Test publishers should provide estimates of reliability/precision as soon as feasible for each relevant subgroup for which the test is recommended.


Exhibit 4.6.7.2: Internal Consistency Reliability by Subgroup — Grades 7–HS ELA Assessments

Subgroup

Grade 7 Grade 8 ELA I ELA II

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 123,841 0.91 1598 124,880 0.90 926 148,951 0.92 931 139,572 0.91 958

Female 60,688 0.91 1514 60,910 0.89 873 71,575 0.91 893 68,420 0.90 880

Male 62,901 0.91 1620 63,779 0.90 941 76,796 0.92 929 70,584 0.91 1003

Unknown Gender 252 0.90 1459 191 0.89 891 580 0.88 667 568 0.89 893


Asian 3,063 0.92 2179 2,978 0.92 1222 3,482 0.94 1391 3,363 0.93 1373

Hispanic/Latino 4,643 0.91 1669 4,537 0.90 888 5,927 0.91 908 5,294 0.90 968


White 89,059 0.90 1397 89,992 0.89 815 102,174 0.91 807 98,465 0.90 823

Multi-Ethnic 7,901 0.91 1575 7,900 0.90 911 9,355 0.92 903 8,336 0.91 959

LEP 2,934 0.87 1322 2,966 0.82 595 5,593 0.83 509 4,511 0.80 598

IEP 16,162 0.87 1234 16,349 0.84 610 21,741 0.85 544 18,913 0.84 632


Exhibit 4.6.7.3: Internal Consistency Reliability by Subgroup — Grades 3–5 Mathematics Assessments

Subgroup

Grade 3 Grade 4 Grade 5

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N R

elia

bili

ty

Var

ian

ce

All Students 126,769 0.93 2292 125,282 0.93 2400 126,093 0.94 1536

Female 62,334 0.92 2139 61,445 0.93 2248 61,865 0.93 1424

Male 64,255 0.93 2439 63,659 0.93 2542 64,032 0.94 1643

Unknown Gender 180 0.92 1759 178 0.92 2010 196 0.92 1354


Asian 3,020 0.91 2390 3,030 0.92 2561 2,928 0.93 1785

Hispanic/Latino 5,351 0.92 1939 5,205 0.93 2113 5,042 0.92 1236


White 85,601 0.92 2030 87,091 0.92 2134 87,807 0.93 1372

Multi-Ethnic 9,636 0.93 2115 9,024 0.93 2204 8,761 0.93 1368

LEP 5,202 0.92 1814 4,034 0.91 1924 3,544 0.90 1171

IEP 16,243 0.92 2107 17,187 0.91 2017 17,457 0.90 1144



Exhibit 4.6.7.4: Internal Consistency Reliability by Subgroup — Grades 6–8 Mathematics Assessments

Subgroup

Grade 6 Grade 7 Grade 8

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 124,337 0.94 1401 119,191 0.93 1658 97,029 0.93 770

Female 60,962 0.94 1306 58,570 0.93 1565 46,827 0.92 727

Male 63,161 0.94 1491 60,362 0.93 1747 50,018 0.93 804

Unknown Gender 214 0.91 1016 259 0.91 1250 184 0.92 781


Asian 2,803 0.94 1730 2,613 0.93 1981 1,939 0.94 1041

Hispanic/Latino 5,019 0.93 1176 4,594 0.92 1438 4,059 0.92 682


White 87,489 0.93 1250 85,433 0.93 1505 67,596 0.92 692

Multi-Ethnic 8,267 0.93 1291 7,644 0.93 1558 6,572 0.92 715

LEP 3,235 0.90 970 2,915 0.88 1135 2,811 0.90 643

IEP 16,919 0.90 956 16,090 0.88 1119 15,726 0.89 574


Exhibit 4.6.7.5: Internal Consistency Reliability by Subgroup — High School Mathematics Assessments

Subgroup

Algebra Geometry Integrated Math I Integrated Math II

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 144,091 0.92 1165 126,729 0.93 1756 12,152 0.91 1270 10,490 0.90 1439

Female 69,666 0.91 1081 62,827 0.92 1606 5,860 0.92 1234 5,132 0.90 1337

Male 73,795 0.92 1239 63,385 0.93 1901 6,239 0.91 1291 5,301 0.91 1540

Unknown Gender 630 0.84 697 517 0.78 858 53 0.79 643 57 0.77 736


Asian 3,119 0.94 1674 2,857 0.95 2358 494 0.93 1661 434 0.93 1954

Hispanic/Latino 5,817 0.88 904 4,876 0.88 1310 300 0.91 1105 266 0.89 1234


White 101,262 0.92 1080 91,392 0.93 1644 6,206 0.93 1284 5,725 0.92 1494

Multi-Ethnic 8,470 0.91 1073 7,028 0.91 1606 1,298 0.88 979 967 0.87 1142

LEP 4,025 0.82 659 3,084 0.81 969 1,441 0.7 464 1,004 0.65 493

IEP 20,161 0.80 606 15,831 0.77 841 1,830 0.77 616 1,516 0.74 660



Exhibit 4.6.7.6: Internal Consistency Reliability by Subgroup — Science Assessments

Subgroup

Grade 5 Grade 8 Biology Physical Science

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 127,349 0.91 2154 125,778 0.90 1999 135,109 0.90 846 478 0.65 333

Female 62,404 0.91 2016 61,534 0.89 1787 66,887 0.89 758 243 0.60 293

Male 64,751 0.91 2272 64,043 0.91 2201 67,803 0.91 932 231 0.70 376

Unknown Gender 194 0.91 2103 201 0.89 1770 419 0.85 612 4 0.68 336

African American 21,166 0.89 1670 19,078 0.85 1348 21,928 0.83 536 195 0.61 330

Asian 3,135 0.90 2287 2,995 0.91 2398 3,225 0.91 1089 4 -0.12 95

Hispanic/Latino 5,063 0.91 1861 4,589 0.88 1634 4,870 0.88 734 20 0.74 484

American Indian/Alaskan 188 0.91 2056 155 0.89 1702 185 0.89 707 1 NA NA

White 88,823 0.90 1862 90,895 0.89 1789 96,774 0.90 785 211 0.66 303

Multi-Ethnic 8,831 0.91 2037 7,879 0.90 1865 7,872 0.89 789 40 0.67 331

LEP 3,550 0.88 1462 2,967 0.82 1206 3,878 0.74 372 25 0.53 258

IEP 17,438 0.90 1899 16,291 0.85 1360 17,412 0.80 469 91 0.48 262


Exhibit 4.6.7.7: Internal Consistency Reliability by Subgroup — Social Studies Assessments

Subgroup

American Government American History

N

Re

liab

ility

Var

ian

ce

N

Re

liab

ility

Var

ian

ce

All Students 86,861 0.89 278 125,839 0.91 683

Female 42,931 0.89 252 62,295 0.91 602

Male 43,521 0.90 302 63,145 0.92 758

Unknown Gender 409 0.89 267 399 0.89 526

African American 14,322 0.87 232 20,438 0.89 510

Asian 1,565 0.91 355 2,489 0.92 744

Hispanic/Latino 2,924 0.89 267 4,677 0.91 613

American Indian/Alaskan 121 0.91 338 163 0.91 648

White 62,730 0.89 255 90,536 0.91 632

Multi-Ethnic 5,009 0.89 256 7,308 0.91 652

LEP 2,149 0.84 200 3,905 0.83 345

IEP 10,331 0.84 199 17,060 0.87 444



4.6.8 RELIABILITY FOR SUBSCALES

The Cronbach’s alpha internal consistency reliability estimates associated with the subscales for the spring 2018 operational forms are presented in Exhibits 4.6.8.1–4.6.8.5. As indicated in the Exhibits, subscale reliabilities are generally moderate in magnitude, as expected for subscales of the length observed in OST. Subscale reliabilities for the Physical Science assessment are quite low and likely due to restriction of range resulting from very difficult test items.

Exhibit 4.6.8.1: Subscale Reliabilities — ELA

Grade / Course Reading

Informational Text

Reading Literary Text

Writing

Grade 3 0.71 0.72 0.52

Grade 4 0.71 0.66 0.63

Grade 5 0.74 0.71 0.59

Grade 6 0.77 0.70 0.81

Grade 7 0.80 0.74 0.82

Grade 8 0.76 0.74 0.80

ELA I 0.81 0.69 0.84

ELA II 0.77 0.71 0.83


Exhibit 4.6.8.2: Subscale Reliabilities — Mathematics

Grade 3

Fractions Geometry Multiplication &

Division Modeling & Reasoning

Numbers & Operations

0.68 0.74 0.80 0.90 0.73

Grade 4

Fractions Geometry Multiplication & Division Modeling & Reasoning

0.84 0.71 0.84 0.86

Grade 5

Decimals Fractions Geometry Modeling & Reasoning

0.85 0.82 0.77 0.87

Grade 6

Expressions and Equations

Geometry and Statistics

Modeling & Reasoning

The Number System

Ratios and Proportions

0.83 0.73 0.88 0.70 0.79

Grade 7

Geometry Modeling & Reasoning

The Number System

Ratios and Proportions

Statistics & Probability

0.68 0.88 0.80 0.78 0.75

Grade 8

Expressions and Equations

Functions Geometry Modeling & Reasoning

The Number System

0.77 0.69 0.80 0.85 0.74

Algebra I

Functions Modeling & Reasoning Number, Quantities,


Statistics

0.85 0.86 0.76 0.63

Geometry

Congruence & Proof

Circles Modeling & Reasoning

Similarity & Trigonometry

Probability

0.83 0.61 0.87 0.74 0.65

Integrated Mathematics I


Algebra Geometry Modeling & Reasoning

Number & Quantity/Functions

Statistics

0.64 0.64 0.88 0.80 0.63

Integrated Mathematics II

Functions Geometry Modeling & Reasoning

Number, Quantities,


Probability

0.66 0.75 0.81 0.65 0.63

Exhibit 4.6.8.3: Subscale Reliabilities — Grade 5 and Grade 8 Science

Grade Earth

Science Life

Science Physical Science

Grade 5 0.73 0.80 0.79

Grade 8 0.75 0.78 0.75

Exhibit 4.6.8.4: Subscale Reliabilities — Biology and Physical Science

Biology

Heredity Evolution Diversity and

Interdependence of Life

Cells

0.71 0.67 0.69 0.70

Physical Science

Study of Mater Energy and Waves Forces and Motion The Universe

NA NA 0.14 0.27


Exhibit 4.6.8.5: Subscale Reliabilities — Social Studies

American Government

Historic Documents Principles and Structure Ohio/Policy/Economy

0.78 0.77 0.67

American History

Skills and Documents 1877–1945 1945–Present

0.71 0.82 0.79


4.7 SUBSCALE INTERCORRELATIONS

The observed correlations among reporting category scores are presented in Exhibits 4.7.1–4.7.16.

Exhibit 4.7.1: Subscale Intercorrelations — ELA

Grade / Course

Subscale

Observed Correlation

RI RL

Grade 3 RL 0.69

W 0.50 0.48

Grade 4 RL 0.66

W 0.62 0.56

Grade 5 RL 0.70

W 0.56 0.55

Grade 6 RL 0.70

W 0.65 0.64

Grade 7 RL 0.71

W 0.64 0.63

Grade 8 RL 0.70

W 0.65 0.61

ELA I RL 0.71

W 0.72 0.66

ELA II RL 0.70

W 0.67 0.64

Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing

Exhibit 4.7.2: Subscale Intercorrelations — Grade 3 Mathematics


FRA G MUD MR

Grade 3

G 0.68

MUD 0.70 0.74

MR 0.83 0.84 0.92

NO 0.69 0.71 0.78 0.88

Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; MR = Model Reasoning; NO = Numbers & Operations




FRA G MUD

Grade 4

G 0.76

MUD 0.83 0.73

MR 0.91 0.82 0.91

Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; MR = Model Reasoning;



D FRA G

Grade 5

FRA 0.83

G 0.80 0.79

MR 0.88 0.94 0.88

Note: D = Decimals; FRA = Fraction; G = Geometry; MR = Model Reasoning;



EE GS MR NS

Grade 6

GS 0.79

MR 0.90 0.88

NS 0.78 0.73 0.83

RP 0.83 0.77 0.90 0.76

Note: EE = Expression and Equations; GS = Geometry and Statistics; MR = Model Reasoning; NS = The Number System; RP = Ratios and Proportions




G MR NS RP

Grade 7

MR 0.87

NS 0.75 0.90

RP 0.77 0.89 0.80

SP 0.73 0.86 0.75 0.77

Note: G = Geometry; MR = Model Reasoning; NS = The Number System; RP = Ratios and Proportions; SP = Statistics and Probability



EE F G MR

Grade 8

F 0.71

G 0.78 0.72

MR 0.92 0.80 0.87

NS 0.74 0.67 0.74 0.81

Note: EE = Expression and Equations; F = Functions; G = Geometry; MR = Model Reasoning; NS = The Number System

Exhibit 4.7.8: Subscale Intercorrelations — Algebra


F MR NQEE

Algebra

MR 0.92

NQEE 0.81 0.88

S 0.74 0.84 0.71

Note: F = Functions; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; S = Statistics


Exhibit 4.7.9: Subscale Intercorrelations — Geometry


CP C MR P

Geometry

C 0.77

MR 0.90 0.85

P 0.73 0.69 0.86

ST 0.81 0.77 0.88 0.72

Note: CP = Congruency & Proof; C = Circles; MR = Model Reasoning; P = Probability; ST = Similarity & Trigonometry

Exhibit 4.7.10: Subscale Intercorrelations — Integrated Mathematics I


A G MR NQF

IM I

G 0.74

MR 0.90 0.85

NQF 0.79 0.77 0.91

S 0.71 0.71 0.85 0.75

Note: A = Algebra; G = Geometry; MR = Model Reasoning; NQF = Number & Quantity/Functions; S = Statistics

Exhibit 4.7.11: Subscale Intercorrelations — Integrated Mathematics II


F G MR NQEE

IM II

G 0.73

MR 0.81 0.84

NQEE 0.66 0.70 0.79

P 0.67 0.73 0.86 0.64

Note: F = Functions; G = Geometry; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; P = Probability


Exhibit 4.7.12: Subscale Intercorrelations — Grade 5 and Grade 8 Science


ES LS

Grade 5 LS 0.72

PS 0.73 0.76

Grade 8 LS 0.73

PS 0.72 0.73

Note: ES = Earth Science; LS = Life Science; PS = Physical Science

Exhibit 4.7.13: Subscale Intercorrelations — Biology


BS-A BS-B BS-C

Biology

BS-B 0.66

BS-C 0.70 0.67

BS-D 0.69 0.66 0.67

Note: BS-A = Heredity; BS-B = Evolution; BS-C = Diversity and Interdependence of Life; BS-D = Cells

Exhibit 4.7.14: Subscale Intercorrelations — Physical Science


PS-A PS-B PS-C

Physical Science

PS-B 0.34

PS-C 0.35 0.38

PS-D 0.36 0.38 0.40

Note: PS-A = Study of Matter; PS-B = Energy & Waves; PS-C = Forces & Motions; PS-D = The Universe


Exhibit 4.7.15: Subscale Intercorrelations and Reliability Estimates — American Government


AGA AGB

American Government

AGB 0.71

AGC 0.66 0.73 Note: AGA = Historic Documents; AGB = Principles & Structures; AGC = Ohio/Policy/Economy

Exhibit 4.7.16: Subscale Intercorrelations and Reliability Estimates — American History


AHA AHB

American History

AHB 0.74

AHC 0.73 0.80 Note: AHA = Skills & Documents; AHB = 1877–1945; AHC = 1945–Present

4.8 RATER AGREEMENT

All essay responses for spring 2018 online and paper-pencil tests were handscored by Data Recognition Corporation (DRC). In addition, approximately 45% of handscored essay responses were routed for a second human reader. Appendix H.1 shows the rater agreement rates for each of the writing prompts administered on OST assessments. Exhibit 4.8.1 provides a summary of those results, showing the exact human rater agreement rate for dimension scores across grades. The rater agreement reports in Appendix H.1 show percentages of exact agreement (%EX), adjacent scores (%AD), and nonadjacent scores (%NA). The tables also provide score point distribution (from 0 to 2 for Conventions; from 0 to 4 for Purpose/Organization and Evidence/Elaboration) including the condition codes such as percent of Blank/No response (%B), percent of Unreadable responses (%U), percent of Foreign Language responses (%F), and percent of Off Topic responses (%T). Generally exact agreement rates ranged from 67%–86% (average of 76%), with little variability across the essay prompts.

Exhibit 4.8.1: Mean Exact Agreement Rates for Online Essay Responses — Spring 2018

Dimension Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8 ELA I ELA II

Conventions 82 79 81 80 82 85 79 86

Purpose/Organization 78 72 75 69 71 69 75 75

Evidence/Elaboration 79 74 73 68 69 67 77 75


5. ITEM DEVELOPMENT AND TEST CONSTRUCTION

OST assessments are rigorously examined in accordance to the guidelines provided in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014). The Elementary and Secondary Education Act (ESEA) legislation also describes the evidence based on these standards that is necessary to validate assessments for their intended purposes.

OST assessments were designed to measure student progress toward achievement of Ohio’s Learning Standards. Although the validity of OST test score interpretations are evaluated along several dimensions, as a content criterion-referenced system of tests, the meaning of test scores are critically evaluated by the degree to which test content was aligned with Ohio content standards.27

Alignment to learning standards is achieved through a rigorous test development process that proceeds from the learning standards and refers back to those standards in a highly iterative test development process that included ODE, test developers, and educator and stakeholder committees.

In spring 2014, an independent field test was conducted to develop the item pool for future test forms in science and social studies. The independent field test consisted of newly developed items, as well as items from Ohio’s previous statewide tests, the Ohio Achievement Assessment and the Ohio Graduation Test (OGT). Items from Ohio’s prior tests went through AIR and ODE review before being approved for use in the new OST tests.

Items administered in the first administration of OST assessments in ELA and mathematics went through a very similar development procedure. These AIRCore items were designed to align with the CCSS that Ohio adopted as Ohio’s Learning Standards in ELA and mathematics. All of the AIRCore items administered had also been previously field tested in embedded slots within statewide assessments, and passed through content and fairness data review prior to inclusion in the item pool from which the spring 2018 OST test forms were constructed.

5.1 ITEM DEVELOPMENT PROCESS 28

The content development process for Ohio’s State Tests is managed by AIR’s Item Authoring Tool (IAT), which acts as a content development and management tool, item bank, and publication system supporting both paper-based and online publication. This item development process leads items from inception, through a series of content, fairness, graphic, and other reviews to final publication. The system captures the outcomes and rationales at each review and maintains previous drafts of each item. The workflow management ensures that each item receives each review in the designated sequence, and that the review is conducted (or recorded in the case of committee review) by an authorized person. As items travel through Ohio’s extensive review process, every version of every item is archived, along with each comment received in any review. Reviewers have immediate access to all prior versions, providing version control throughout development.

IAT allows remote Internet access by item writers and reviewers while ensuring security with individualized passwords for all users, limited access for external users, and strong encryption of all information. IAT tracks item use on test forms or adaptive pools. After items are used, IAT stores the resulting statistics, including exposure statistics, classical item statistics, and statistics based on item response theory (IRT).

27 Standard 1.11 – When the rationale for test score interpretation for a given use rests in part on the appropriateness of test content, the procedures followed in specifying and generating test content should be described and justified with reference to the intended population to be tested and the construct the test is intended to measure or the domain it is intended to represent. If the definition of the content sampled incorporates criteria such as importance, frequency, or criticality, these criteria should also be clearly explained and justified. 28 Standard 4.7 – The procedures used to develop, review, and tryout items and to select items from the item pool should be documented.


OST assessments’ item development process is predicated on a high level of interaction between test developers at AIR and the ODE, as well as with Ohio educators and stakeholders. AIR’s IAT manages item content throughout the entire life cycle of an item, from inception, through series of agreed-upon item review levels culminating in operational pool approval. It also manages item content beyond the operational life of the item, including migration of items for use in practice tests or other training materials. IAT ensures that every item follows through the entire sequence of development and provides Ohio and AIR management on-demand reports of the content and status of the inventory of items. Each item is directed through a sequence of reviews (described in this section) and sign-offs before it is locked for field-test or operational administration.

The IAT is integrated with the item display engine used by the online test delivery system. This feature, combined with a “web approval” process, allows the display of online items to be “locked” well before test forms are constructed and ensures that only approved items are administered to Ohio students.29

5.2 MACHINE-SCORED CONSTRUCTED-RESPONSE ITEM DEVELOPMENT TOOLS

OST assessments include a number of machine-scored constructed-response (MSCR) items which leverage a complicated system that allows for a large variety of item types expecting varied student responses to be developed, and scored efficiently and economically.

MSCR item development tools put the power of both item and rubric creation into the hands of item writers, and allow reviewers to score possible responses to ensure that the rubric is enacted correctly. For example, when administered a graphic-response item, students can respond by drawing, moving, arranging, or selecting graphic regions. The scoring rubric allows for each answer to be scored using scoring logic created by the item writer. Test developers have flexibility in identifying features of student responses to score, which go beyond simple features (e.g., whether the correct object is put in the correct place) but can involve abstraction. For example, if a student is asked to design an experiment, the rubric can discern whether the objects representing the experimental variable actually vary across conditions or cover the range of inquiry, among other capabilities. These concepts are abstract and many different responses may reflect those abstract features. This ability enables machine rubrics to “justify” the partial credit assigned in terms of the skills that particular response features exemplify.

In addition, throughout the item development and review process, test developers can mimic the many different possible student responses, and review how the rubric is applied to those responses. Test developers can test the scoring rubric and make corrections to the scoring logic at each step.

When creating equation items, test developers have access to the Equation Editor tool. Student responses can be simple numeric responses or complex equations or even sets of equations. This tool allows for multiple answers and the development of multistep items. Test developers can customize the equation palette to show the appropriate functions. Just as the keypad is customizable, the answer spaces are as well. Additional answer spaces can be added as needed by the item writer. The scoring rubric allows for each answer to be scored using scoring logic created by the item writer.

Such tools are integrated into the IAT, providing test developers the power and flexibility to use technology to create sophisticated OST items.

5.3 ITEM TYPES

OST assessments include a wide variety of item types that are designed around a broad and growing variety of response mechanisms. In addition to selected-response items, which include traditional multiple-choice items and more advanced multi-select and two-part items, OST assessments utilize items with the following response mechanisms:

29 Standard 4.1


• Graphic Response, which includes any item to which students respond by drawing, moving, arranging,

or selecting graphic regions.

• Hot Text, in which students select or rearrange sentences or phrases in a passage.

• Word Builder, in which students respond by entering a single number or word.

• Proposition Response, in which students respond in one English language sentence or more, which may

be scored by our proposition-scoring engine, handscored, or a mixture of both.

OST items use technology to machine-score many such items that measure deeper knowledge and application of knowledge in a more open ended way. Most MSCR items remain accessible. If accessibility is sacrificed for some population, test development staff weigh the measurement benefit before deploying that item. For example, recognizing cells under a microscope is an inherently visual task. Our simulation items can measure this ability, but the task itself cannot readily be made accessible to students who are blind. In this case, the skill itself limits accessibility, not the construction of the item. This is very different than presenting a selected-response item as a spatial matching or drag-and-drop task.

The graphic-response mechanism supports most of the typical technology-enhanced item types, including sorting, matching, hot spot, and drag-and-drop. In addition, it supports items where students actually draw a machine-scorable response and respond by constructing complex, open-ended diagrams, as well as many other possibilities. Because they are uniformly derived from a single response mechanism, the manipulations and interactions are consistent across these technology-enhanced item types, diminishing a possible source of construct-irrelevant variance.

Hot-text items are effectively selected-response items, though in some cases the number of potential selections is quite large. These machine-scored items can have multiple correct answers and allow flexibility in scoring of student responses.

The equation response mechanism asks students to enter one or more equations using a palette of symbols. Test developers can specify which symbols are available on an item-by-item basis.

The availability of tools organized around response mechanisms creates a very flexible capability for test developers to create authentic, challenging tasks.

5.4 ITEM REVIEW

This section describes the multi-step item-review process that items travel through from inception, to several rounds of test developer, ODE, educator, and stakeholder review, to field testing and final review prior to inclusion on operational test forms. 30

The item-review procedures used to develop and review OST test items are designed to ensure item accuracy and alignment with the intent of Ohio’s Learning Standards. Following a standard item-review process, item reviews proceed initially through a series of internal reviews before items are eligible for review by ODE content experts. Most of AIR’s content staff members, who are responsible for conducting internal reviews, are former classroom teachers who hold degrees in education and/or their respective content areas. Each item passes through four internal review steps before it is eligible for review by ODE. Those steps include the following:

• Preliminary Review, in which a review is conducted by a group of AIR content area experts

• Content Review 1, in which a review is performed by an AIR content specialist

30 Standard 4.8 – The test review process should include empirical analyses and/or the use of expert judges to review items and scoring criteria. When expert judges are used, their qualifications, relevant experiences, and demographic characteristics should be documented, along with the instructions and training in the item review process that the judges receive.


• Editorial Review, in which a copyeditor checks the item for correct grammar/usage

• Senior Content Review, in which a review is conducted by the lead content expert

At every stage of the item-review process, beginning with preliminary review, AIR’s test developers analyze each item to ensure that the following are true.

• The item is well-aligned with the intended learning standard.

• The item conforms to the item specifications for the target being assessed.

• The item is based on a quality idea (i.e., it assesses something worthwhile in a reasonable way).

• The item is properly aligned to a depth of knowledge (DOK) level.

• The vocabulary used in the item is appropriate for the intended grade/age and subject matter, and takes

into consideration language accessibility, bias, and sensitivity.

• The item content is accurate and straightforward.

• Any accompanying graphic and stimulus materials are actually necessary to answer the question.

• The item stem is clear, concise, and succinct, meaning it contains enough information to know what is being

asked; is stated positively (and does not rely on negatives such as no, not, none, never, unless absolutely

necessary); and it ends with a question.

• For selected-response items, the set of response options is succinct; parallel in structure, grammar, length,

and content; sufficiently distinct from one another; and all plausible, but with only one correct option.

• There is no obvious or subtle cluing within the item.

• The score points for constructed-response items are clearly defined.

• For machine-scored constructed-response (MSCR) items, the items are scored as intended at each score

point in the rubric.

Based on their review of each item, test developers can accept the item and classification as written, revise the item, or reject the item outright.

Items passing through the internal review process are sent to ODE for their review. At this stage, items may be further revised based on any edits or changes requested by ODE, or rejected outright. Items passing through the ODE review level then have to pass through two stakeholder reviews in which committees of Ohio educators and stakeholders review each item’s accuracy, alignment to the intended standard, and DOK level, as well as item fairness and language sensitivity. Thus, all items considered for inclusion in the operational item pools were initially reviewed by:

• content advisory educator committee, which checked to ensure that each item is

o aligned to the learning standards;

o appropriate for the grade level;

o accurate; and

o presented online in a way that is clear and appropriate.

• A fairness and sensitivity educator committee, which checks to ensure that each item and any associated

stimulus materials are free from bias, sensitive issues, controversial language, stereotyping, and statements

that reflect negatively on race, ethnicity, gender, culture, region, disability, or other social and economic

conditions and characteristics.

Items successfully passing through this committee review process are then field tested to ensure that the items

behave as intended when administered to students. Despite conscientious item development, some items perform

differently than expected when administered to students. Using the item statistics gathered in field testing to

review item performance is an important step in constructing valid and equivalent operational test forms.


Classical item analyses ensure that items function as intended with respect to the underlying scales. Classical item

statistics are designed to evaluate the item difficulty and the relationship of each item to the overall scale (item

discrimination) and to identify items that may exhibit a bias across subgroups (differential item functioning

analyses).

Items flagged for review based on their statistical performance must pass a three-stage review to be included in

the final item pool from which operational forms were created. In the first stage of this review, a team of

psychometricians reviewed all flagged items to ensure that the data are accurate and properly analyzed, response

keys are correct and there are no other obvious problems with the items.

Content review and fairness and sensitivity committees are again convened to re-evaluate flagged field-test items

in the context of each item’s statistical performance. Based on their review of each item’s performance, the

content review and fairness and sensitivity committees can recommend that flagged items be rejected or deem

the item eligible for inclusion in operational test administrations.


6. FIELD TESTING

Items selected for operational use in the base year ELA and mathematics forms in 2015 were previously administered as part of statewide assessments in Arizona, Florida, Utah, and/or Oregon. The Ohio science and social studies test items are field tested prior to inclusion on operational test forms. Items selected for operational use in the base year form in 2015 were previously administered as part of an independent field test in spring 2014. Additionally, newly developed items were embedded in all Ohio’s 2018 tests (except for Physical Science) and field tested, expanding the base of items for building future test forms.

Embedding field-test items in operational assessments yields item parameter estimates that capture many of the contextual effects that contribute to simulating item difficulty in operational test administrations. A number of factors that may influence item difficulty estimates in the context of operational test administrations may be less relevant in stand-alone field-test contexts. For example, in a high-stakes test, such as a high school end-of-year (EOY) exam where test performance impacts graduation, students may be motivated to expend greater effort to achieve maximum performance. Conversely, the high-stakes assessments may also be more likely to elicit anxiety in some students, thus impairing their performance on the tests. Even when assessments are low stakes for students, schools often work to convey to students the importance of statewide assessments in ways that are likely not done for independent field tests. While the impact of contextual factors may not be great, embedded field testing ensures that many aspects of the operational testing context influencing item difficulty are incorporated into the resulting item parameter estimates.

Embedded field testing is especially useful in the context of a pre-equating model for scoring and reporting test results. Because the test administration context remains the same between the embedded field test (EFT) and subsequent operational test administration, item parameter estimates are more stable than they may be when obtained through stand-alone field testing.

A potential drawback of the EFT approach is the increased assessment burden placed on students and schools. For this reason, Ohio utilizes EFT designs for purposes of item bank maintenance. Ohio uses AIR’s online field-test engine, which, when combined with Ohio’s large student population, serves to greatly reduce the number of EFT slots necessary to replenish or grow the item banks for the Ohio assessments.

The field-test engine randomly samples field-test items for each individual test administration, essentially creating thousands of unique EFT forms. This sampling approach to embedding field-test items results in several important outcomes:31

• Reduction in the number of embedded field-test items that each student must respond to and more

efficient “spiraling” of items, which reduces clustering of item responses, resulting in more precise

parameter estimates

• More generalizable item statistics because they are not based on items appearing in a single position

• A more representative sample of respondents for each item

The embedded field-testing algorithm consists of two different algorithms—one for identifying which field-test items will be administered to which student (the distribution algorithm), and one for selecting the position on the test for each item administered the student (the positioning algorithm). When a student starts a test, the system randomly selects a pre-determined number of item groups (depending on whether items have shared stimulus, etc.), stopping when it has selected item groups containing at least the minimum number of field-test items designated for administration to each student. This structured randomization ensures that a) each item is seen by a representative

31 Standard 4.9 – When item or test form tryouts are conducted, the procedures used to select the sample(s) of students as well as the resulting characteristics of the sample(s) should be documented. The sample(s) should be as representative as possible of the population(s) for which the test is intended.


sample of Ohio students, and b) every item is as likely as every other item to appear in a class or school, minimizing clustering effects.

6.1 ITEM STATISTICS

Following the close of the testing window, AIR psychometrics staff works to analyze field-test data in preparation for item data review meetings and promotion of high quality test items to operational item pools.32 The item analyses include classical item statistics as well as the IRT item calibrations. Classical item statistics are used to evaluate the relationship of each item to the overall scale, evaluate the quality of the distractors, and identify items that may exhibit bias across subsgroups (DIF analyses). The IRT item analyses allow examination of the fit of items to the measurement model and provide the statistical foundation for operational form construction and test scoring and reporting. Items are flagged if analyses indicate resulting values are out of range; flagged items are reviewed by AIR and ODE psychometric and content staff for possible miskey or scoring errors; items that pass through AIR and ODE statistical reivew are then sent to item data review committess comprised of Ohio educators for a final external review.

6.1.1 CLASSICAL STATISTICS

Classical item analyses inspected whether the items functioned as intended with respect to the underlying scales. AIR’s analysis program computed the required item and test statistics for each dichotomous multiple-choice (MC) and polytomous constructed-response (CR) item to check the integrity of the item and to verify the appropriateness of the difficulty level of the item. Key statistics that are computed and examined include item difficulty, item discrimination, and distractor analysis.

Items that are either extremely difficult or extremely easy are flagged for review but not necessarily rejected if they align with the test specifications. For multiple-choice items, the proportion of students in the sample selecting the correct answer (p-values) is computed, as well as those selecting the incorrect responses. Multiple-choice items are flagged for reviews if the p-value was less than 0.30 or greater than 0.95. For constructed-response items, item difficulty is calculated both as the item’s mean score and as the average proportion of points gotten correct (analogous to p-value and indicating the ratio of an item’s mean score divided by the number of points possible). Constructed-response items are flagged for review if the the proportion of students assigned any score-point category is greater than 0.95. In addition, items are flagged if the average IRT-based ability estimate of students in a score-point category is lower than the average IRT-based ability estimate of students in the next lower score-point category (i.e., when students who receive 3 points score lower, on average, on the total test than students who received only 2 points on the item).

The item discrimination index indicates the extent to which each item differentiated between those students who possessed the skills being measured and those who do not. In general, the higher the value, the better the item was able to differentiate between high- and low-achieving students. The discrimination index for multiple-choice items is calculated as the correlation between the item score and the student’s IRT-based ability estimate. For constructed-response items, we computed the mean total number correct for student scoring within each of the possible score categories. Items were flagged for subsequent reviews if the point biserial correlation for the keyed (correct) response is less than 0.25.

32 Standard 4.10 – When a test developer evaluates the psychometric properties of items, the model used for that purpose (e.g., classical test theory, item response theory, or another model) should be documented. The sample used for estimating item properties should be described and should be of adequate size and diversity for the procedure. The process by which items are screened and the data used for screening, such as item difficulty, item discrimination, or differential item functioning (DIF) for major examinee groups, should also be documented. When model-based methods (e.g., IRT) are used to estimate item parameters in test development, the item response model, estimation procedures, and evidence of model fit should be documented.


Distractor analysis for the multiple-choice items is used to identify items that have marginal distractors or ambiguous correct responses that were overlooked by the Content Advisory Committee. In the distractor analysis, the correct response should be the most frequently selected option among high-scoring students. The discrimination value of the correct response should be substantial and positive, and the discrimination values for distractors should be lower and, generally, negative. The point biserial correlation for distractors is the correlation between the item score, treating the target distractor as the correct response, and the student’s IRT ability estimate, restricting the analysis to those students selecting either the target distractor or the keyed response. Items are flagged for subsequent reviews if the point biserial correlation for the distractor response is greater than zero. In addition, items are flagged if the proportion of students responding to a distractor exceeds the proportion selecting the keyed response.

6.1.2 IRT STATISTICS

AIR applied the Rasch and Masters’ Partial Credit Model to estimate the item response theory (IRT) model parameters for dichotomously and polytomously scored items, respectively. The WINSTEPS output showing the item statistics resulting from the free (unanchored) estimation of parameters for items in the operational tests are reviewed, as well as the WINSTEPS-generated item and persons maps. Item fit is evaluated via the mean square Infit and mean square Outfit statistics reported by WINSTEPS, which are based on weighted and unweighted standardized residuals for each item response, respectively. These residual statistics indicate the discrepancy between observed item responses and the predicted item responses based on the IRT model. Both fit statistics have an expected value of 1. Values substantially greater than 1 indicate model underfit, while values substantially less than 1 indicate model overfit (Linacre, 2004).

6.1.3 ANALYSIS OF DIFFERENTIAL ITEM FUNCTIONING

Differential item functioning (DIF) refers to items that appear to function differently across identifiable groups, typically across different demographic groups. Identifying DIF is important because sometimes it is a clue that an item contains a cultural or other bias. Not all items that exhibit DIF are biased; characteristics of the educational system may also lead to DIF. For example, if schools in low-income areas are less likely to offer geometry classes, students at those schools might perform more poorly on geometry items than would be expected, given their proficiency on other types of items. In this example, it is not the item that exhibits bias but the curriculum. However, DIF can indicate bias, so all field-tested items were evaluated for DIF, and all items exhibiting DIF were flagged for further examination by a Fairness and Sensitivity Committee. Committee members were asked to reexamine each flagged item, using the statistics as a guide, and to make a final decision about whether the item should be excluded from the pool of potential items given its performance during field testing.

AIR conducts DIF analysis on all field-tested items to detect potential item bias across major ethnic and gender groups. In Ohio, DIF is investigated among the following group comparisons (reference group/focal group):

• Male/Female

• White/Hispanic

• White/Black

• White/Multiple ethnicities selected

AIR uses a generalized Mantel-Haenszel (MH) procedure to evaluate DIF. The generalizations include (1) adaptation to polytomous items, and (2) improved variance estimators to render the test statistics valid under complex sample designs. Because students within a district, school, and classroom are more similar than would be expected in a simple random sample of students statewide, the information provided by students within a school is not independent, so that standard errors based on the assumption of simple random samples are underestimated. We compute design-consistent standard errors that reflect the clustered nature of educational systems. While clustering is mitigated through random administration of large numbers of embedded field-test items, design effects (Kish, 1967) in student samples are rarely reduced to the level of a simple random sample.


The ability distribution is divided into a configurable number of intervals to compute the Mantel-Haenszel (MH) chi-square DIF statistics. The analysis program computes the MH chi-square value, the log-odds ratio, the standard error

of the log-odds ratio, and the MH-delta statistic (∆̂𝑀𝐻) for the MC items; the MH chi-square, the standardized mean difference (SMD), and the standard error of the SMD for the CR items.

Items are classified into three categories (A, B, or C), ranging from no evidence of DIF to severe DIF according to the DIF classification convention illustrated in Exhibit 6.1.3.1. Items are also categorized as positive DIF (i.e., +A, +B, or +C), signifying that the item favors the focal group (e.g., African American/Black, Hispanic, or female), or negative DIF (i.e., –A, –B, or –C), signifying that the item favors the reference group (e.g., white or male). Items are flagged if their DIF statistics fall into the “C” category for any group. A DIF classification of “C” indicates that the item shows significant DIF and should be reviewed for potential content bias, differential validity, or other issues that may reduce item fairness.

Exhibit 6.1.3.1: DIF Classification Rules

Category Rule

Dichotomous Items

C 2MH is significant and 1|ˆ| MH.5

B 2MH is significant and 1|ˆ| MH.5

A 2MH is not significant.

Polytomous Items

C 2MH is significant and 25.||/|| SDSMD .

B 2MH is significant and 25.||/|| SDSMD .

A 2MH is not significant.


6.2 DATA REVIEW SUMMARY

Exhibit 6.2.1 provides a summary of items flagged for review and the number of items rejected following review.

Exhibit 6.2.1: Summary of Content Flagged Items During Field Testing — Spring 2018

Grade / Course Number of Field-

Test Items Number of

Flagged items Number of

Rejected Items

ELA

Grade 3 135 19 5

Grade 4 129 21 5

Grade 5 166 23 6

Grade 6 150 23 5

Grade 7 130 15 3

Grade 8 177 33 12

ELA I 167 28 10

ELA II 171 47 18

Mathematics

Grade 3 104 7 5

Grade 4 106 7 3

Grade 5 109 5 3

Grade 6 123 12 3

Grade 7 118 16 7

Grade 8 122 26 10

Algebra I 433 129 108

Geometry 450 101 98

Science

Grade 5 44 8 0

Grade 8 36 2 0

Biology 72 20 11

Social Studies

American Government 42 12 7

American History 40 8 3

6.3 TEST CONSTRUCTION

The process for constructing fixed-form operational tests began after field testing and committee review of items. Once the item pool was finalized, AIR content specialists began the process of constructing test forms. Test forms were constructed in two stages—first with intact operational forms, and second, specification of field-test item locations within the forms. Operational passages and items qualified for operational forms were those that met all of the criteria established by ODE in terms of content, fairness review, and data characteristics.


6.3.1 OPERATIONAL FORM CONSTRUCTION

Each OST form is built to match the detailed test blueprint, and to match the target distribution of item difficulty and test information. The blueprint describes the content to be covered, and the type of items that measure the constructs, and other content-relevant aspects of the test. The statistical targets ensure that students receive scores of similar precision, regardless of which form of the test they receive.33

AIR’s test developers used AIR’s FormBuilder software to help construct operational forms. FormBuilder interfaces

with AIR’s Item Authoring Tool (IAT) to extract test information and interactively create test characteristics curves

(TCCs), test information curves (TICs), and Standard Error of Measurement Curves (SEMCs) as test developers built

a test map, which provides the relevant information that the content specialists need to ensure that the test forms

are statistically parallel, in addition to ensuring that the test blueprint is met.

Immediately upon generation of a test form, the FormBuilder generates a blueprint match report to ensure that all elements of the test blueprint are satisfied. In addition, the FormBuilder produces a statistical summary of form characteristics to ensure consistency of test characteristics across test forms.

The summary report also flags items with low biserial correlations, as well as very easy and very difficult items. Although items in the operational pool have passed through data review, construction of fixed-form assessments allow another opportunity to ensure that poorly performing items are not included in operational test forms.

The FormBuilder also plots the distribution of item difficulties, both classical and IRT indices, to both flag extremely easy or difficult items and to ensure that the distribution of item difficulties was consistent across test forms and with the bank.

As test developers construct forms, FormBuilder-generated TCCs and SEMCs are plotted using a different color trace line for each prototype form. At this point, the test developer can see the test form difficulty relationship between the target and reference forms. Exhibit 6.3.1.1 shows a sample graph of TCC differences. There are several important things to note when examining TCC differences. First, differences in TCCs can occur at specific locations in the TCCs across a range of abilities. These differences reflect different emphases in test information across forms at these ability levels. If the difficulty and error structure for the target forms is sufficiently identical to the reference form, as in the sample TCC and SEM curves, then the item selection process concludes with newly created, multiple, parallel test forms. Once the goal of parallel forms is achieved, the information is entered into IAT, which tracks item usage and generates test maps (tables of data for the items on the form) for use in scoring, forms development, and other processes.

33 Standard 4.12 – Test developers should document the extent to which the content domain of a test represents the domain defined in the test specifications.


Exhibit 6.3.1.1: Test Characteristic Curve Differences

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

0.04

-4 -3 -2 -1 0 1 2 3 4

TC

C D

iffe

rence

s

Theta

TCC Differences

For the base year, test construction targets were based on psychometric characteristics of the bank. Subsequent to the base year, construction of OST test forms will seek to match the distribution of test information in the base year’s form. As illustrated in Exhibit 6.3.1.2, by evaluating test characteristics in reference to the likely location of important cut scores, test developers could ensure that test forms measure with precision in the locations where students were being classified into performance levels. Appendix I shows the test information, test information difference and the CSEM graphs that were used to evaluate the science and social studies test forms against their base year forms. The spring 2017 ELA and mathematics operational forms will be considered as base year forms for the future test development.

Exhibit 6.3.1.2: Test Information and Standard Errors Relative to Performance Standard Locations

6.3.2 ASSEMBLING TEST FORMS

The mechanical features of a test—arrangement, directions, and production—are just as important as the quality of the items. Many factors directly affect a student’s ability to demonstrate proficiency on the assessment, while others relate to the ability to score the assessment accurately and efficiently. Still others affect the inferences made from the test results.

When the test developer reviews a test form for content, in addition to making sure all the benchmark/indicator item requirements are met, he or she also ensures that the items on the form do not cue each other—that one item does not present material that indicates the answer to another item. This is important to ensure that a student’s


response on any particular test item is unaffected by, and is statistically independent of, a response to any other test item. This is called “local independence.” Independence is most commonly violated when there is a hint in one item about the answer to another item or when two items narrowly assess the same ability/content. In that case, a student’s true ability on the second item is not being assessed independent of the first item.

Test Developers begin the form construction process by first identifying the pool of items from which forms are built. This pool of items resides at a locked operational status in the Item Tracking System. Each item contains a historical record that clearly demonstrates it has survived the full review process from internal development through client, committee, and statistical data review.

Upon identifying and reviewing the eligible pool of items, a test developer then considers the limitations of the pool, if any. For example, there might be a shortage of high depth of knowledge items at a particular benchmark. The test developer will review and select from among these items first to ensure that the constraints of the blueprint are met.

Once the items and passages for the form are selected and matched against the blueprint, the test developer reviews the form for a variety of additional content considerations, including the following.

• The items are sequentially ordered.

• Each item of the same type is presented in a consistent manner.

• The listing of the options for the multiple-choice items is consistent.

• The answer options are lettered with A, B, C, and D.

• All graphics are consistently presented.

• All tables and charts have titles and are consistently formatted.

• The number of the answer choice letters is approximately equal across the form.

• The answer key was checked by the initial reviewer and one additional independent reviewer.

• All stimuli have items associated with them.

• The topics of items, passages, or stimuli are not too similar to one another.

• There are no errors in spelling, grammar, or accuracy of graphics.

• The wording, layout, and appearance of the item matches how the item was field-tested.

• There is gender and ethnic balance where perceivable in the passages or prompts.

• The passage sets do not start with or end with a constructed-response item.

• Each item and the form are checked against the appropriate style guide.

• The directions are consistent across items and are accurate.

• All copyrighted materials have up-to-date permissions agreements.

• Word counts are within documented ranges.

After completing the initial build of the form, the test developer hands it off to another content specialist, who conducts a final review of the criteria listed above. If the test specialist reviewer finds any issues, the form is sent back for revisions. If the form meets blueprint and complies with all specified criteria, the test developer sends it to the psychometric team for review. When the form is approved by the psychometric team, the test developer uploads the item list into IAT.

6.3.3 EMBEDDED FIELD-TEST SLOTS

Each operational test form contains designated slots for administration of items that do not contribute to students’ test scores.


For online test administrations, Ohio employs AIR’s field-test engine to administer test items. As described previously, the field-test algorithm randomly assigned both the field-test items/item groups and the field-test item positions, ensuring that

• a random sample of students were administered each item; and

• for any given item, the students were sampled with equal probability.

AIR’s field-test algorithm yields a representative, randomized sample of student responses for each item. The field-test algorithm also leads to randomization of item position and the context in which items appear.

For paper-pencil assessments, AIR staff constructed fixed EFT blocks. Selection of items for EFT slots were designed to ensure proportional representation of the field-test items. Items selected for paper-based EFT slots are submitted to ODE for review and approval regarding positioning and frequency within the paper-pencil forms.


7. TEST ADMINISTRATION

7.1 ELIGIBILITY

Ohio public school students in grades 3–8 participated in grade-level ELA and mathematics testing. Students enrolled in grades 5 and 8 were administered the grade-level science assessments.34 Beginning with the class of 2018, the high school end-of-course (EOC) tests will be part of the high school graduation requirements for all Ohio students. High school students take EOC assessments following coursework in ELA I and ELA II, Algebra I, Geometry (or alternatively Integrated Mathematics I and Integrated Mathematics II), American Government, American History, and Physical Science, and/or Biology as part of their course requirements when enrolled in eligible courses.

Students with significant cognitive disabilities who are eligible for the alternate assessment do not participate in OST assessments.

7.2 ADMINISTRATION PROCEDURES

Tests were administered in both an online format and a paper-based format. For administration of the online format tests, a secure browser, developed by AIR, was required to access the computer-based tests. The browser provided a secure environment for student testing by disabling the hot keys, copy and screenshot capabilities, and access to desktop functionalities, such as the Internet and email. Other measures that protect the integrity and security of the online test are presented in the “Test Security” section of this document.

Prior to the beginning of each test administration, AIR released guidance documents, user guides and video tutorials. The materials provided information on using the Test Delivery System (TDS), Online Reporting System (ORS), and Test Information Distribution Engine (TIDE). AIR posted these training materials on the publicly available online portal for OST.

Key personnel involved with OST administration include the District Test Coordinators (DTCs), School Test Coordinators (STCs), and test administrators (TAs) who proctor the test. Materials were developed and provided by AIR for each of these roles.

The Test Administrator User Guide (included as Appendix J)35 was designed to familiarize TAs with the Test Delivery System and contained tips and screenshots throughout. The guide provides enough how-to information to enable TAs to access and navigate the Test Delivery System. The user guide provides the following information:

• Steps to take prior to accessing the system and logging in

• Navigating the TA Interface application

• The Student Interface, used by students for computer-based testing

• Secure browsers and keyboard shortcut keys

The Spring 2015 OST Test Coordinator’s Manual provides information about policies and procedures for OST Test Coordinators. This manual is updated prior to each test administration and includes test administration policies and guidance for Test Coordinators before, during, and after the testing window.

34 Standard 7.2 – The population for whom a test is intended and specifications for the test should be documented. If normative data are provided, the procedures used to gather the data should be explained; the norming population should be described in terms of relevant demographic variables; and the year(s) in which the data were collected should be reported. 35 Supporting documents (e.g., test manuals, technical manuals, user’s guides, and supplemental material) should be made available to the appropriate people in a timely manner.


The Spring 2015 OST Directions for Administration Manual and Online Testing Checklist provides easy-to-follow instructions about administering tests, creating testing sessions, monitoring sessions, verifying student information, assigning test accommodations, and starting and pausing test sessions.36 Additional instructions for administering tests to students using Braille and large print accommodated test booklets are provided in the Supplemental Instructions Appendices of the Test Coordinator’s Manual and Directions for Administration Manual.

In addition to the guidance documents and manuals, AIR provided a TA Certification Course for personnel administering tests online.37 The course provided step-by-step instructions for starting a test session in the TA Interface, marking student test settings, approving students to test and monitoring a test session. All TAs were encouraged to complete the course to prepare for the online administration.

TAs who administered the computer-based OST assessments were also encouraged to conduct a training test session using OST assessments’ sample tests.

Personnel involved with OST test administration played an important role in ensuring the validity of the assessment by maintaining both standardized administration conditions and test security.

DTCs were responsible for coordinating testing at the district level. They ensured that the STCs in each school were appropriately trained and aware of policies and procedures, and that they were trained to use the reporting system.38

STCs were ultimately accountable for ensuring that testing was conducted in accordance with the test security and other policies and procedures established by ODE. STCs were primarily responsible for identifying and training TAs. They also created or approved testing schedules and procedures for the school. STCs worked with Technology Coordinators to ensure that the necessary secure browsers were installed and any other technical issues were resolved. During the testing window, Test Coordinators needed to monitor testing progress, ensure that all students participated as appropriate, and handle testing incidents as necessary.

TAs were responsible for reviewing necessary manuals and user guides to prepare the testing environment and ensuring that students did not have unapproved books, notes, or electronic devices out during testing. They were required to administer OST following the directions in the Directions for Administration Manual and the Online Testing Checklist.39 Any deviation in test administration was to have been reported by TAs to the STC, who reports it to the DTC. The DTC then reports it to ODE.

TAs also were responsible for ensuring that only resources that were allowed for specific tests were available and no additional resources were being used during the test. The Test Coordinator’s Manual and Directions for Administration Manual addressed allowable resources.

36 Standard 4.15 – The directions for test administration should be presented with sufficient clarity so that it is possible for others to replicate the administration conditions under which the data on reliability, validity, and (where appropriate) norms were obtained. Allowable variations in administration procedures should be clearly described. The process for reviewing requests for additional testing variations should also be documented. 37 Standard 6.1 – TAs should follow carefully the standardized procedures for administration and scoring specified by the test developer and any instructions from the test user. Standard 12.16 – Those responsible for educational testing programs should provide appropriate training, documentation, and oversight so that the individuals who administer and score the test(s) are proficient in the appropriate test administration and scoring procedures and understand the importance of adhering to the directions provided by the test developer. 38 Standard 12.16 – Those responsible for educational testing programs should provide appropriate training, documentation, and oversight so that the individuals who administer and score the test(s) are proficient in the appropriate test administration and scoring procedures and understand the importance of adhering to the directions provided by the test developer. 39 Standard 6.1 – TAs should follow carefully the standardized procedures for administration and scoring specified by the test developer and any instructions from the test user.


The STC and TAs worked together to determine the most appropriate testing option(s) and testing environment and the average time needed to complete each test. The appropriate protocols were established to maintain a quiet testing environment throughout the testing session. TAs also needed to ensure that adequate time was available to start computers, load secure browsers, and log in students for computer-based tests and pass out and collect test booklets and materials for paper-pencil tests.40

7.3 ACCOMODATIONS

Some students require testing accommodations. Accommodations are supports that are already familiar to the student because they are being used in the classroom to support instruction.

Four distinct groups of students may receive accommodations on Ohio’s State Tests:

• Students with disabilities who have an Individualized Education Program (IEP).

• Students with a Section 504 Plan who have physical or mental impairments that substantially limit one

or more major life activities, have records of such impairments, or are regarded as having such

impairments, but who do not qualify for special education services.

• Students who are English learners (ELs). (Guidelines for determining EL status can be found in the Ohio

Statewide Assessments Rules Book.) Students who have exited EL status may not receive EL

accommodations on Ohio’s State Tests.

• Students who are ELs with disabilities who have IEPs or Section 504 Plans are eligible for both

accommodations for students with disabilities and ELs. For additional guidance and information about

ELs with disabilities, access the About the Lau Resource Center for English Learners page of ODE’s

website.

For Ohio’s State Tests, accommodations are considered to be adjustments to the testing conditions, test format, or test administration that provide equitable access during assessments for students with disabilities and students who are ELs. The administration of the assessment should never be the first occasion in which an accommodation is introduced to the student.41

To the extent possible, accommodations should

• provide equitable access during instruction and assessments;

• mitigate the effects of a student’s disability;

• not reduce learning or performance expectations;

• not change the construct being assessed; and

• not compromise the integrity or validity of the inferences to be made from the assessment.

40 Standard 3.4 – Students should receive comparable treatment during the test administration and scoring process. Standard 4.5 – If the test developer indicates that the conditions of administration are permitted to vary from one student or group to another, permissible variation in conditions for administration should be identified. A rationale for permitting the different conditions and any requirements for permitting the different conditions should be documented. Standard 6.4 – The testing environment should furnish reasonable comfort with minimal distractions to avoid construct-irrelevant variance. 41 Standard 3.10 – When test accommodations are permitted, test developers and/or test users are responsible for documenting standard provisions for using the accommodation and for monitoring the appropriate implementation of the accommodation.

http://education.ohio.gov/Topics/Testing/Testing-Forms-Rules-and-Committees/Ohio-Statewide-Assessment-Program-Rules-Book

http://education.ohio.gov/Topics/Other-Resources/Limited-English-Proficiency/About-the-Lau-Resource-Center



Ohio’s Accessibility Manual described available accommodations and accessibility features for the spring 2018 administration in Appendix K. Exhibit 7.3.1 shows the available accommodations for Ohio’s State Tests.42

42 Standard 3.9 – Test developers and/or test users are responsible for developing and providing test accommodations, when appropriate and feasible, to remove construct-irrelevant barriers that otherwise would interfere with examinees’ ability to demonstrate their standing on the target constructs.


Exhibit 7.3.1: Accommodations

Accommodation Description

Additional assistive technology regularly used in instruction

Students may use a range of assistive technologies on Ohio’s State Tests including devices that are compatible with the AIR Student Testing Site, and those that are used externally (i.e., on a separate device).

For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix K.

Human read-aloud (on computer- based test)

A TA or monitor reads from the student’s computer screen to the student. For computer-based testing, most students should be able to use text-to-speech for a read-aloud. In some cases, a student’s disability may prohibit them from using the text-to-speech feature and require a human reader.

If testing in a small group, TAs should ensure that all students in the group have similar abilities so that the reader’s pace meets all student’s needs without being too slow or too fast for some students.

Refer to the TIDE User Guide for information about setting up groups for computer-based testing.

If a student needs this accommodation, then the person providing the accommodation must read the entire test to the student. It cannot be “as needed” or “on demand.”

Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.

Paper version of test instead of online

If a student’s class is taking Ohio’s State Tests in an online environment and a student is unable to use a computer due to the impact of his or her disability, it is allowable for the student to take the test on a paper-pencil form instead.

Situations that may require this accommodation include:

● A student with a disability who cannot participate in the online assessment due to a health-related disability, neurological disorder or other complex disability and/or cannot meet the demands of a computer-based test administration

● A student with an emotional, behavioral or other disability who is unable to maintain sufficient concentration to participate in a computer-based test administration, even with other accessibility features

● A student with a disability who requires assistive technology that is not compatible with the testing platform

● If a student takes a paper-based version of the test, the student must take both parts of the test on a paper-pencil form

● A student with a disability who requires assistive technology that is not compatible with the testing platform

If a student takes a paper-based version of the test, the student must take both parts of the test on a paper-pencil form.

http://oh.portal.airast.org/



Read-aloud on English language arts

“Read-aloud” as a general term is when a student is administered a test via text-to-speech, human read-aloud, screen reader or sign language interpreter.

The read-aloud accommodation for the ELA test is intended to provide access for a very small number of students to printed or written texts in the ELA tests. These students have print-related disabilities and otherwise would be unable to participate in Ohio’s State Tests because their disabilities severely limit or prevent them from decoding, thus accessing printed text.

This accommodation is not intended for students reading somewhat (only moderately) below grade level.

In making decisions on whether to provide a student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has

• a disability that severely limits or prevents him or her from accessing printed text, even after varied and repeated attempts to teach the student to do so (for example, the student is unable to decode printed text);

• blindness or a visual impairment and has not learned (or is

unable to use) Braille; or

• deafness or hearing loss and is severely limited or prevented from decoding text due to a documented history of early and prolonged language deprivation.

Before listing the accommodation in the student’s IEP or Section 504 Plan, teams/coordinators also should consider whether

• the student has access to printed text during routine instruction through a reader or other spoken-text audio format or sign language interpreter;

• the student’s inability to decode printed text or read Braille is documented in evaluation summaries from locally administered diagnostic assessments; or

• the student receives ongoing, intensive instruction and/or interventions in foundational reading skills to continue attaining the important college and career-ready skill of independent reading.

IEP teams and Section 504 Plan coordinators make decisions about who receives this accommodation. Schools should use a variety of sources as evidence (including state assessments, district assessments and one or more locally administered diagnostic assessments or other evaluation).

For students who receive this accommodation, no claims should be inferred regarding the student’s ability to demonstrate foundational reading skills.



Screen reader mode (English language arts) (formerly called enhanced accessibility mode or streamlined mode; not available 2015–2016 for grade 8 science, biology or physical science)

Screen reader mode is for students with visual impairments who use screen readers.


Sign language interpreter

Any student who is deaf or has hearing loss may have a sign language interpreter (American Sign Language, signed English, Cued Speech) for mathematics, science, and social studies.

For the purposes of statewide testing, sign language is considered a second language and should be treated the same as any other language from a translational standpoint. The test must be signed verbatim. The intent of the phrase “signed verbatim” does not mean a word-to-word translation, as this is not appropriate for any language translation. The expectation is that the interpreter should faithfully translate, to the greatest extent possible, all of the words on the test without changing or enhancing the meaning of the content, adding information or explaining concepts unknown to the student.


Text-to-speech for English language arts

The text-to-speech feature reads aloud the test to the student.

Student must use headphones if not tested in a one-on-one setting.


Text-to-speech tracking for English language arts

The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud.



Students may use a range of assistive technologies on Ohio’s State Tests, including devices that are compatible with the Student Testing Site and those that are used externally (i.e., on a separate device).

For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix K.

Answers transcribed by test administrator

The student records his or her answers directly on paper and the test administrator/monitor transcribes the responses verbatim into the Student Testing Site.

Braille note taker

A student who is blind or has visual impairments may use an electronic Braille note taker. For Ohio’s State Tests, grammar checker, Internet and stored file functionalities must be turned off.

The responses of a student who uses an electronic Braille note taker during Ohio’s State Tests must be transcribed exactly as entered in the electronic Braille note-taker. Only transcribed responses will be scored. Transcription



guidelines are available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).

Braille writer

A student who is blind or has visual impairments may use an electronic Braille writer. A TA must transcribe into the computer the student’s responses exactly as entered in the electronic Braille writer.

Only transcribed responses will be scored. Transcription guidelines are available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).

Calculation device or fact charts for non-calculator mathematics test part of test

The student uses a calculation device or fact chart (addition/subtraction/multiplication/division charts) on the non-calculator sections of the mathematics assessments.

The accommodation would be permitted on test sections for which calculators are not allowed for other students. IEP teams and Section 504 Plan coordinators should carefully review the following guidelines for identifying students to receive this accommodation.

This accommodation is for students with disabilities that severely limit or prevent their abilities to perform basic calculations (i.e., single-digit addition, subtraction, multiplication, or division).

In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has a disability that severely limits or prevents the student’s ability to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division), even after varied and repeated attempts to teach the student to do so.

Before listing the accommodation in the student’s IEP or Section 504 Plan, teams also should consider whether

● the student is unable to perform calculations without the use of a calculation device, arithmetic table or manipulative during routine instruction;

● the student’s inability to perform mathematical calculations is documented in evaluation summaries from locally administered diagnostic assessments; or

● the student receives ongoing, intensive instruction and/or interventions to learn to calculate without using a calculation device, in order to ensure that the student continues to learn basic calculation and fluency.

Specific calculation devices must match the Ohio’s State Tests calculator policy.

Mathematical tools (mathematics and physical science only) —allowable tools include:

• 100s chart

Student uses these tools and manipulatives to assist mathematical problem solving. These manipulatives allow the flexibility of grouping, representing or counting without numeric labels.

http://education.ohio.gov/Topics/Ohios-Learning-Standards/Mathematics



• Abacus and other specialized tools for students with visual impairments

• Base 10 blocks

• Counters and counting chips

• Cubes

• Square tiles

• Two-colored chips

A student with a visual impairment may need other mathematical tools such as a large print ruler, Braille ruler, tactile compass or Braille protractor.

ODE will review and revise this list annually as needed.

Scribe

The student dictates responses either verbally, using a speech-to text device, augmentative or assistive communication device (e.g., picture or word board), or by signing, gesturing, pointing, or eye gazing.

Grammar checker, Internet, and stored files functionalities must be turned off. Word prediction must also be turned off for students who do not receive this accommodation. The student must test in a separate setting.

In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has

● a physical disability that severely limits or prevents the student’s motor process of writing through keyboarding; or

● a disability that severely limits or prevents the student from expressing written language, even after varied and repeated attempts to teach the student to do so.

Before listing the accommodation in the student’s IEP or Section 504 Plan, teams/coordinators should also consider whether

● the student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments;

● the student routinely uses a scribe for written assignments; and

● the student receives ongoing, intensive instruction and/or interventions to learn written expression, as deemed appropriate by the IEP team or Section 504 Plan coordinator.

Student’s responses must be transcribed exactly as dictated.

Information about the scribing process is available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).

Specialized calculation device

A student uses a specific calculation device (for example, a large key, talking or other adapted calculator) on the calculator part of the mathematics assessments. If a talking calculator is used, the student must use headphones or test in a separate setting.



The student must qualify for the calculation device or fact charts on non-calculator mathematics test or part of test accommodation to use a specialized calculator in those tests.

Word prediction external device

The student uses an external word prediction device that provides a bank of frequently or recently used words on screen as a result of the student entering the first few letters of a word.

The student must be familiar with the use of the external device prior to assessment administration. The device cannot connect to the Internet or save information.

In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators are instructed to consider whether the student has

● a physical disability that severely limits or prevents the student from writing or keyboarding responses; or

● a disability that severely limits or prevents the student from recalling, processing and expressing written language, even after varied and repeated attempts to teach the student to do so.

Before listing the accommodation in the student’s IEP team and Section 504 Plan coordinators are instructed to consider whether

● the student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments; and

● the student receives ongoing, intensive instruction and/or intervention in language processing and writing, as deemed appropriate by the IEP team and Section 504 Plan coordinator.

7.4 TEST SECURITY

Maintaining a secure test environment is critical to ensure that scores represent what students know and are able to do. Because OST assessments were administered both as a paper-based and a computer-based assessment, test security procedures must guard against item exposure, cheating, or other security problems for all testing modes.

The test security procedures involve the following:

• Procedures to ensure security of test materials

• Procedures to investigate test irregularities

The Test Coordinator’s Manual provides detailed instructions on test security policies and procedures that are briefly described as follows—all test items, test materials, and student-level testing information are secure documents and must be appropriately handled. Secure handling protects the integrity, validity, and confidentiality of assessment questions, prompts, and student results. Any deviation in test administration must be reported to ensure the validity of the assessment results. Mishandling of test administration puts student information at risk and disadvantages the student. Failure to honor security severely jeopardizes district and state accountability requirements and the accuracy of student data.


The security of all test materials must be maintained before, during, and after test administration. Under no circumstances are students permitted to assist in preparing secure materials before testing or in organizing and returning materials after testing. After any administration, whether initial or make-up, secure materials (e.g., test booklets) must be returned immediately to the STC and placed in locked storage. Secure materials must never be left unsecured and must not remain in classrooms or be taken off the school’s campus overnight.43

It is unethical and shall be viewed as a violation of test security for any person to

• capture images of any part of the test via any electronic or photographic device;

• duplicate in any way any part of the test;

• examine, read, or review the content of any portion of the test;

• disclose or allow to be disclosed the content of any portion of the test before, during, or after test

administration;

• discuss any OST test item before, during, or after test administration;

• allow students’ access to any test content prior to testing;

• allow students to share information during the test administration;

• read any parts of the test to students except as indicated in the test administration directions or as part of

an accommodation;

• influence students’ responses by making any kind of gestures (for example, pointing to items, holding up

fingers to signify item numbers or answer options) while students are taking the test;

• instruct students to go back and reread/redo responses after they have finished their test since this

instruction may only be given before the students take the test;

• review students’ responses;

• read or review students’ scratch paper; or

• participate in, direct, aid, counsel, assist in, encourage, or fail to report any violations of these test

administration security procedures.

Additional security violations for paper-pencil testing include

• reading or reviewing any test booklet during or after testing,

• changing any student response in the student’s scorable document,

• erasing any student response in the student’s scorable document,

• erasing any stray marks in the student’s scorable document, and

• failing to return all test booklets and other test materials.

TA and proctors may not assist students in answering questions or reword or explain any test content. No test content may ever be discussed before, during, or after test administration.

All regular test booklets and special documents (large print and Braille) are secure documents and must be protected from loss, theft, and reproduction in any medium. A unique identification number and a bar code were printed on the front cover of all test booklets. Schools were expected to maintain test security by using the security numbers to account for all secure test materials before, during, and after test administration until the time they were returned to the contractor.

43 Standard 6.7 – Test users have the responsibility of protecting the security of test materials at all times. Standard 7.9 – If test security is critical to the interpretation of test scores, the documentation should explain the steps necessary to protect test materials and to prevent inappropriate exchange of information during the test administration session.


To access the computer-based tests, the AIR-developed secure Internet browser was required. The secure browser provides a secure environment for student testing by disabling the hot keys, copy and screenshot capabilities, and access to the desktop (Internet, email, and other files or programs installed on school machines). The secure browser did not display the IP address or other URL for the site. Users could not access other applications from within the secure browser, even if they knew the keystroke sequences. The “back” and “forward” browser options were not available, except as allowed in the testing environment as testing navigation tools. Students were not able to print from the secure browsers. During testing, the device was locked down, and students were required to “Pause” (to save the test for another session) or “Submit” (to indicate they were finished with the test). The secure browser was designed to ensure test security by prohibiting access to external applications or navigation away from the test. See the Test Administrator User Guide Appendix J for further details.44

Throughout the testing window, TAs were to report any test incidents (e.g., disruptive students, loss of Internet connectivity) to the STC immediately. A test incident could include testing that was interrupted for an extended period of time due to a local technical malfunction or severe weather. STCs notified DTCs of any test irregularities that were reported, and DTCs were to discuss test incidents with ODE.

44 Standard 6.16 – Transmission of individually identified test scores to authorized individuals or institutions should be done in a manner that protects the confidential nature of the scores and pertinent ancillary information. Standard 8.6 – Test data maintained or transmitted in data files, including all personally identifiable information (not just results), should be adequately protected from improper access, use, or disclosure, including by reasonable physical, technical, and administrative protections as appropriate to the particular data set and its risks, and in compliance with applicable legal requirements. Use of facsimile transmission, computer networks, data banks, or other electronic data-processing or transmittal systems should be restricted to situations in which confidentiality can be reasonably assured. Users should develop and/or follow policies, consistent with any legal requirements, for whether and how students may review and correct personal information.


8. REPORTING AND INTERPRETING OST SCORES

A set of score reports is provided for each administration that summarizes student performance in each grade and content area. Score reports provide data on the performance of individual students and on the aggregated performance of students at various levels—such as state, districts, schools, and teachers. The test data are based on all students who participated in OST assessments for the 2017–2018 school year.

The score reports include information describing student progress toward mastery of the state learning standards. OST provides individual student score reports that are mailed directly to families, detailing student performance on overall tests and subscores. In addition, Ohio offers detailed individual and aggregate level data to educators via AIR’s Online Reporting System (ORS), which provides score data for each OST assessment, both computer-based and paper-pencil. The ORS allows users to compare score data between individual students and the school, district, or overall state, and also provides information about performance on subscore categories.

8.1 APPROPRIATE USES FOR SCORES AND REPORTS

The state provides a variety of resources for helping parents and educators understand and apply student performance results to improve student learning and classroom instruction. All reporting systems for OST assessments, both paper-based and online, are designed with stakeholders, such as teachers, parents, and students (who are not technical measurement experts) in mind and ensure that test results are used in ways that support valid inferences about student achievement and contribute to student learning.45 For example, similar colors are used for groups of similar elements, such as performance levels, throughout the design. This design strategy guides the reader to compare like elements and avoid comparison of dissimilar elements.

Sample reports are available on the portal. The sections below provide additional guidance for interpreting results.

45 Standard 6.10 – When test score information is released, those responsible for testing programs should provide interpretations appropriate to the audience. The interpretations should describe in simple language what the test covers, what scores represent, the precision/reliability of the scores, and how scores are intended to be used. Standard 13.5 – Those responsible for the development and use of tests for evaluation or accountability purposes should take steps to promote accurate interpretations and appropriate uses for all groups for which results will be applied.

http://oh.portal.airast.org/ocba/wp-content/uploads/Understanding_State_Tests_Reports_Spring2015.pdf


8.2 REPORTS PROVIDED

FAMILY REPORTS

Ohio provides full-color individual student reports to families of all OST testers. Reports are designed to be useful to families, and include

• full color to aid readers’ interpretation of the data;

• scale scores and performance level descriptors;

• scoring category performance, including descriptions of what was assessed and what results mean for each

scoring category to guide parents and students in their understanding of student scores; and

• school, district, and state average scores for comparative purposes.


8.2.1 ONLINE REPORTING SYSTEM FOR EDUCATORS

OST results are reported using AIR’s Online Reporting System, which is designed to support educators as they evaluate the needs of their students and reflect on their own curricula and practice. Navigation in the system mirrors the instructional decision-making process, meaning the user can intuitively navigate in any of the three dimensions inherent in the data, helping the user answer three kinds of questions:

1) Who? The data can be displayed at levels of aggregation anywhere from the individual level for a specific

student up to the entire state. Demographic breakdowns are immediately available at any level of

aggregation.

2) What? The subject area data can be broken down into finer or coarser “chunks” of content. Navigating this

dimension allows the user to travel from subject to scoring category and back.

3) When? When data are available over time, the system allows the user to view a data trend over time or

toggle to a fixed point in time.

Each navigational step changes the reporting display, providing richer context when interpreting a class’s or individual student’s performance. While the system contains many reports, the interface design encourages users to think about the substantive, educational questions to which they need answers and access information from that perspective. In addition, while finding and interpreting data from multiple online assessments can easily become overwhelming, the ORS minimizes information overload for educators and administrators by organizing score information in a conceptual framework that helps users quickly locate the right level of data, evaluate its impact, and identify the concrete actions they can take to help students improve.

OST assessments’ online system produces the following online score reports: individual student reports and aggregate reports at the teacher, school, district, and state level.

OST assessments’ online score reports are structured hierarchically. Upon selecting “Home” on the Welcome page, a user is taken to the Home Page Dashboard, which displays for all grades and content areas the number of students tested and the percent of students passing by grade and content area. Users who have access to multiple districts or schools are first required to select a single district or school. Once an aggregate unit is selected in this instance, the summary table of student performance for the selected entity displays. For more detailed information for a subject and a grade, the user must select that subject and grade.

On each aggregate report, the summary report presents the results for the selected aggregate unit as well as the results for the state and the aggregate unit above the selected aggregate. For example, if a school is selected on the school report page, the summary results of the state and the district the school belongs to are provided above the school summary results so that the school performance can be compared with the district and the state. If a teacher is selected, the summary results for state, district, and school are provided above the summary results for the teacher.

For a more detailed overview of the Online Reporting System you can log in and select “Help” to view the ORS User Guide.


Exhibit 8.2.2.1 summarizes the types of online score reports available and the levels at which they can be viewed (e.g., student, roster, teacher, school, district).

Exhibit 8.2.2.1: OST Online Score Report Summary

Type of Report Page Level of Aggregation Description

Home Page Dashboard District, school, and

teacher

Summary of performance and participation

(NumberTested and Percent Passing) across grades

and subjects or course

Subject Detail

District

Average scale score, percent passing, and percent at

each performance level for a district and each school

within that district; ability to disaggregate data by

subgroup

School


each performance level for a school and each teacher

within that school; ability to disaggregate data by

subgroup

Teacher


each performance level for a teacher and each class

roster associated with that teacher; ability to

disaggregate data by subgroup

Scoring Category Detail District, school, teacher,

and roster

Performance on the scoring category for a subject and

a grade for all students and by subgroups; a relative

strength and weakness indicator is also reported for

each category

Student Roster School, teacher, roster

List of students with performance on overall subject

and scoring categories for a group of students

associated with a school, teacher, or roster.

Individual Student

Report Student

Student performance for a selected subject; report

includes performance on each scoring category, and

performance on the writing essay dimensions, if

applicable


SUBJECT DETAIL REPORTS

The screenshot above demonstrates the Subject Detail Reports at the district level. Aggregated subject reports show average performance for the state, districts, schools, teachers, and classes. Bar chart displays show the distribution of students’ performance levels. These reports provide users with rosters of schools, teachers, and classes, allowing for simple comparisons across smaller groups.

The Subject Detail Report page shows the following data:

• Student Count: Number of students who have completed who completed the selected test

• Average Scale Score: Average scale score of students who completed the selected test

• Percent Proficient: The percent of tested students reaching the proficient threshold on the selected test

Percent at Each Performance Level: The distribution of students across each of the four performance

levels


SCORING CATEGORY DETAIL REPORTS

The screenshot above shows the Scoring Category Detail Reports. Aggregated scoring category detail reports follow the layout of the subject detail reports, displaying the performance data for the state, districts, schools, teachers, and classes. These can be accessed by selecting the desired entity and choosing “Reporting Category” in the drop-down menu.


STUDENT ROSTER REPORTS

The screenshot above shows the Student Roster Reports, it provides users with performance data for a group of students associated with a teacher or a school, as defined in TIDE. The report includes each student’s unique state ID, overall subject score, and overall subject performance level. Using the exploration menu, a user can also view each student’s scoring category performance for the selected test.

The table that appears on the Student Roster Report page shows the following data:

• Scale score: The score of each student who completed the test

• Performance level: Represents levels of overall subject mastery

• Scoring Categories: Represents levels of scoring category mastery


INDIVIDUAL STUDENT REPORTS

The screenshot above shows the Individual Student Reports, it summarizes a student’s performance in an organized, easy-to-understand document that can be distributed to educators, parents, and students. The student’s performance is plotted against cut scores on a barrel chart that provides detailed explanations of each performance level. A student’s scoring category scores and comparison data for the state, district, and school are provided in separate tables. The report can be exported as a PDF document, and users can batch-print multiple students’ reports, allowing for electronic distribution of student reports.

The Individual Student Report page contains the following information:

• Barrel chart: Presents the student’s performance and where his or her performance lies on the OST

assessments’ scale. The following information is presented in the barrel chart:

o Scale score: The score the student received on the selected test.

o Performance-level descriptors (PLDs): PLDs define the content area knowledge, skills, and

processes that students at a performance level are expected to possess.

o Cut scores: The barrel chart shows the cut scores for each performance level for a particular

grade and subject.

• Student performance on scoring categories: Shows the student’s performance on each of the scoring

categories, including text descriptions of what was assessed in each category and what the student’s

results mean.


PARTICIPATION REPORTS

The screenshot above shows the Participation Report. To help schools manage their test schedule, allocating testing resources, and prioritize testing, the online reporting system offers participation reports for online testers. From the “Data Files and Participation Reports” drop-down, users can select “Plan and Manage Testing” to generate up-to-the-minute reports showing students’ test status. In addition, users can set testing schedules, monitor testing progress across schools, and track students’ participation based on their performance on previous tests.


8.3 INTERPRETATION OF SCORES

Ohio provides a variety of resources to help parents and educators understand and apply student performance results to improve student learning, including interpretive guides for navigating the online reporting system, and understanding paper family reports.46 This section describes many of the measures presented in the paper and online score reports.

8.3.1 SCALE SCORES

The student’s performance in each content area assessment is summarized in an overall test score referred to as a scale score. The number of items a student answers correctly and the difficulty of the items presented are used to statistically transform theta scores (student ability expressed in logits) to scale scores so that scores from different sets of items (test forms) can be meaningfully compared on a linear and invariant scale. The scale score is used to indicate how well students perform on each subject area assessment. Scale scores can indicate how much students know and are able to do. Scale scores can also be used to compare student performance across administrations for the same grade and content area so that, for example, an average scale score of 700 for grade 5 students in the 2016–2017 school year indicates the same level of achievement as an average scale score of 700 for grade 5 students in the 2017–2018 school year. Scale scores can also be expressed as integers to facilitate communication while theta scores are cumbersome due to the need to express several decimal places.

As presented in chapter 9, Scaling and Equating, ability estimates are truncated at ±3.5 logits on the theta scale prior to transformation to the OST assessments’ reporting scale. This truncation rule suppresses reporting extreme scale scores where the standard error of the estimate is very large. Overall scale scores for science and social studies are mapped into five performance levels using four performance standards (i.e., cut scores). The OST assessments’ scale score ranges can be found in Exhibit 8.3.1.1.

Exhibit 8.3.1.1: OST Scale Score Ranges

Assessment Limited Basic Proficient Accelerated Advanced

ELA

Grade 3 545–671 672–699 700–724 725–751 752–863

Grade 4 549–673 674–699 700–724 725–752 753–846

Grade 5 552–668 669–699 700–724 725–754 755–848

Grade 6 555–667 668–699 700–724 725–750 751–851

Grade 7 568–669 670–699 700–724 725–748 749–833

Grade 8 586–681 682–699 700–724 725–743 744–805

ELA I 606–682 683–699 700–724 725–738 739–800

ELA II 597–678 679–699 700–724 725–741 742–808

Mathematics

Grade 3 587–682 683–699 700–724 725–752 753–818

Grade 4 605–685 686–699 700–724 725–758 759–835

Grade 5 624–686 687–699 700–724 725–748 749–804

Grade 6 616–681 682–699 700–724 725–743 744–790

Grade 7 605–683 684–699 700–724 725–754 755–806

46 Standard 12.18 – In educational settings, score reports should be accompanied by a clear presentation of information on how to interpret the scores, including the degree of measurement error associated with each score or classification level, and by supplementary information related to group summary scores. In addition, dates of test administration and relevant norming studies should be included in score reports.


Assessment Limited Basic Proficient Accelerated Advanced

Grade 8 633–689 690–699 700–724 725–743 744–774

Algebra 618–681 682–699 700–724 725–753 754–814

Geometry 604–677 678–699 700–724 725–755 756–810

Integrated Math I 618–681 682–699 700–724 725–753 754–814

Integrated Math II 594–676 677–699 700–724 725–757 758–813

Science

Grade 5 559–663 664–699 700–724 725–752 753–845

Grade 8 575–673 674–699 700–724 725–765 766–868

Biology 617–684 685–699 700–724 725–734 735–823

Physical Science 634–683 684–699 700–724 725–748 749–815

Social Studies

American History 619–683 684–699 700–724 725–737 738–800

American Government 642–686 687–699 700–724 725–738 739–774


8.3.2 PERFORMANCE STANDARDS

Performance standards are the points (or cut scores) on the achievement scale that differentiate performance levels. Four performance standards are used to classify students into one of five proficiency levels. Performance standard cut scores were recommended by panels of Ohio educators following the first administration of OST in 2015, and subsequently adopted by the Ohio Board of Education. Panelists engaged in a rigorous, technically sound standard setting process that is summarized in the Performance Standards section of this technical manual, and documented in detail in the 2015 “Recommending Ohio Computer-Based Assessment Performance Standards” technical report, available from ODE.47

Performance levels represent levels of mastery with respect to Ohio’s Learning Standards for a particular subject and grade. Performance levels are labeled as Limited, Basic, Proficient, Accelerated, and Advanced in accordance with Ohio Revised Code. Performance level labels and performance level descriptors (PLDs) are developed to define and illustrate the level of achievement that characterizes students in each group.

Performance levels provide context for interpreting the meaning of scale scores. While scale scores indicate how much a student knows and is able to do, performance levels indicate how much students must know and be able to do to receive a Limited, Basic, Proficient, Accelerated, or Advanced label for a subject area assessment. Teachers can evaluate how their students are performing compared with other students in the school, LEA, and state in terms of the percentage of students in each performance level for the same grade and content area.

8.3.3 PERFORMANCE-LEVEL DESCRIPTORS

PLDs define the content area knowledge, skills, and processes that students at a performance level are expected to possess. The descriptions of Limited, Basic, Proficient, Accelerated, or Advanced performance are the public statements about what and how much Ohio educators want students to know and be able to do for each grade level and content area. The PLD development process includes rounds of review from test development experts within ODE, AIR, and Ohio educators and parents. The very detailed PLDs are summarized and included in score reports to provide context for the score and are designed to help parents understand what their students can and cannot do.

47 Standard 5.21 – When proposed score interpretations involve one or more cut scores, the rationale and procedures used for establishing cut scores should be documented clearly. Standard 7.4 – Test documentation should summarize test development procedures, including descriptions and the results of the statistical analyses that were used in the development of the test, evidence of the reliability/precision of scores and the validity of their recommended interpretations, and the methods for establishing performance cut scores.


9. PERFORMANCE STANDARDS

In the summer of 2015, following the first administration of OST assessments in science and social studies, AIR convened panels of Ohio educators to recommend performance standards on each of the science and social studies assessments. Details of the panels, procedures, and outcomes are documented in the “Recommending Ohio Computer-Based Assessment Performance Standards” technical report, which is available from ODE.

To comply with legislatively mandated reporting requirements, performance standards for ELA and mathematics were recommended prior to any test administrations. In December 2015, AIR convened panels of Ohio educators to recommend performance standards on each of the ELA and mathematics assessments based on an ordered-item booklet (OIB) that comprised AIRCore items that had been previously calibrated and equated based on administration in other statewide assessments. Details of the panels, procedures, and outcomes are documented in the “Recommending Performance Standards for Ohio’s State Tests”, which is available from ODE. This section briefly describes the procedures used by educators to recommend standards, and resulting performance standards.

9.1 STANDARD SETTING PROCEDURES

Student achievement on OST assessments are classified into five performance levels: Limited, Basic, Proficient, Accelerated, and Advanced. Interpretation of OST assessments’ scores rests fundamentally on how student ability estimates, indicated by test scores, relate to the performance standards that define the extent to which students have achieved the expectations defined in Ohio’s Learning Standards. OST test scores are reported with respect to five performance levels, demarcating the degree to which Ohio students have achieved the learning expectations defined by Ohio’s Learning Standards. The levels are defined in Ohio Revised Code 3301.0710(A)(2). The cut score establishing the Proficient level of performance is the most critical, since it indicates that students are meeting grade level expectations for achievement of Ohio’s Learning Standards and that they are prepared to benefit from instruction at the next grade level. Additionally, the Accelerated level is important, as it indicates that students are on track to pursue post-secondary education or enter the workforce. Procedures used to adopt performance standards for OST assessments are therefore central to the validity of test score interpretations.

Following the first operational administration of OST assessments in spring 2015, a standard-setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards for reporting student achievement of Ohio’s Learning Standards in science and social studies. Ohio educators, serving as standard setting panelists engaged in a standardized and rigorous process to recommend performance standards. The workshops employed the Bookmark procedure (Mitzel, Lewis, Patz, and Green, 2001 and Lewis, Mitzel, Green, 1996) a widely used method in which standard setting panelists use their expert knowledge of Ohio’s Learning Standards and student achievement to map the performance level descriptors adopted by the Ohio State Board of Education onto an OIB based on the first operational test form administered to students in spring 2015.

Similar procedures were adopted to recommend performance standards for Ohio’s State Tests in ELA and

mathematics, but with notable differences. The ELA and mathematics standard setting workshops were conducted

in December 2015, prior to test administration. The AIRCore items used to construct those initial test forms had

been previously field tested as part of other statewide assessments, with IRT parameters calibrated and linked to a

common scale based on those state test administrations. Further, because the ELA and mathematics assessments

had never been administered to Ohio students, impact of recommended cut scores had to be estimated based on

the performance of students who had been administered tests that could be linked to the AIRCore scale. Because

Washington has NAEP reading and mathematics scores that are very similar to those of Ohio, estimated impact of

recommended performance standards was projected from student performance on the Smarter Balanced

assessments administered in Washington State.

Thus, panelists in both workshops were provided with contextual information to help inform their primarily

content-driven performance standard recommendations. Panelists were charged with recommending


performance standards comparable to other important assessment systems, including multi-state consortia

(PARCC and Smarter Balanced) and national benchmarks such as NAEP. To facilitate comparisons of Ohio

performance standards with other important benchmark assessments, panelists were provided with the locations

of performance standards from these other assessments systems in their OIBs. Performance standard locations for

the following assessments were provided as part of panelists’ OIB review.

ELA and mathematics workshops:

• PARCC ELA and mathematics performance standards in grades 3–8 and end-of-course assessments

• Smarter Balanced ELA and mathematics performance standards in grades 3–8, since Smarter Balanced includes only a grade 11 assessment in high school

• NAEP performance standards in reading and mathematics in grades 4 and 8 (and interpolated for grade 6)

• ACT college-ready performance standard for reading and mathematics in grade 11

Science and social studies workshops:

• ACT college-ready performance standard for Physical Science, Biology, American Government, and American History

• NAEP reading performance standards for American Government and American History

• NAEP reading performance standards for social studies grade 4 and grade 6 (interpolated value for grade 6)

• TIMSS science assessment benchmarks for science grade 5 and grade 8 (interpolated value for grade 5)

Because the AIRCore items used to build OST assessments in ELA and mathematics can be linked to the reporting

scales of the Smarter Balanced assessments, the locations of the Smarter Balanced performance standards were

mapped directly to the OIBs. The location of performance standards for the PARCC, NAEP, and ACT assessments

were inferred through estimated impact rates (the expected percentage of students expected to meet or exceed

the performance level indicated by each page in the OIB).

In addition, following recommendation of performance standards in each of the panels, panelists were provided

with feedback about the vertical articulation of their recommended performance standards so that they could

view how the locations of their recommended performance standards for each of the grade-level and EOC

assessments sat in relation to the cut score recommendations for the other assessments. This approach allowed

panelists to view their cut score recommendations as a coherent system of performance standards, and further

reinforces the interpretation of test scores as indicating not only achievement of current grade-level standards but

also preparedness to benefit from instruction in the subsequent grade levels.

9.1.1 PERFORMANCE-LEVEL DESCRIPTORS

Student achievement on OST assessments is classified into five performance levels: Limited, Basic, Proficient,

Accelerated, and Advanced as prescribed by Ohio Revised Code 3301.0710(A)(2). Performance level descriptors

(PLDs) define the content area knowledge and skills that students at each performance level are expected to

demonstrate. The standard-setting panelists based their judgments about the location of the performance

standards on the PLDs as well as Ohio’s Learning Standards. OST assessments’ PLDs describe five levels of

achievement:


• Limited

• Basic

• Proficient

• Accelerated

• Advanced

Prior to convening the standard setting workshops, ODE, in consultation with AIR, drafted PLDs for each test that

describe the range of achievement encompassed by each performance level on the test. The PLDs were designed to

be clear, be concrete, and reflect Ohio’s expectations for proficiency based on Ohio’s Learning Standards. ODE

considered any need for clarification or revision that arose throughout the standard setting process prior to

publishing the final versions of the PLDs following the standard setting workshop. Ohio’s PLDS are available at

education.ohio.gov.

9.2 RECOMMENDED PERFORMANCE STANDARDS

Panelists were tasked with recommending five performance standards (Limited, Basic, Proficient, Accelerated and

Advanced) that resulted in four performance levels (Basic, Proficient, Accelerate and Advanced). The final

recommended performance standards for each OST assessment is provided in Exhibits 9.2.1–9.2.4, and include the

panelist-recommended OIB page numbers, theta value of the performance standard (in logit scale), as well as the

percentage of Ohio students classified as meeting or exceeding each standard. Following the standard-setting

workshop, panelist recommendations were submitted to Ohio’s State Board of Education; the Board formally

adopted the standards in spring 2015 for science and social studies and in winter 2015 for ELA and mathematics.

The estimated percentage of students at each performance level for each test is shown in Exhibit 9.2.5.

Exhibit 9.2.1 Final Recommended Performance Standards for OST Assessments — ELA

Test Performance

Level

Ordered-Item Booklet

Page Theta

Estimated Percentage of Students At or

Above Performance

Standard

Approximate Percentage of

Raw Score Points

Grade 3

Basic 8 -0.84 75 33

Proficient 14 -0.23 56 46

Accelerated 26 0.32 36 56

Advanced 36 0.92 17 67

Grade 4

Basic 4 -0.56 73 38

Proficient 17 0.06 54 50


Advanced 42 1.32 14 73

Grade 5

Basic 6 -0.74 78 38



Advanced 41 1.29 15 73

Grade 6

Basic 6 -0.88 80 31



http://education.ohio.gov/Topics/Ohios-Learning-Standards/Ohios-Learning-Standards


Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points

Advanced 48 1.09 18 71

Grade 7

Basic 5 -0.80 76 35



Advanced 49 1.29 14 71

Grade 8

Basic 9 -0.43 72 40



Advanced 56 1.55 13 79

ELA I

Basic 3 -0.71 71 35



Advanced 48 1.31 12 73

ELA II

Basic 6 -0.77 72 35



Advanced 53 1.30 11 76

Exhibit 9.2.2 Final Recommended Performance Standards for OST Assessments — Mathematics

Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points

Grade 3

Basic 17 -0.61 82 39



Advanced 51 1.53 11 75

Grade 4

Basic 5 -1.05 78 30



Advanced 49 1.19 9 70

Grade 5

Basic 7 -1.05 78 29



Advanced 55 1.35 10 76

Grade 6 Basic 8 -0.83 79 31


Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points



Advanced 60 1.65 13 81

Grade 7

Basic 3 -0.76 75 35



Advanced 60 1.74 11 81

Grade 8

Basic 12 -0.69 74 38



Advanced 62 2.00 13 83

Algebra

Basic 2 -1.21 72 27



Advanced 56 1.37 13 76

Geometry

Basic 11 -0.27 75 33



Advanced 52 2.37 15 80

Integrated Math I

Basic 4 -1.20 72 27



Advanced 56 1.37 13 78

Integrated Math II

Basic 7 -0.14 72 39



Advanced 53 2.46 13 81

Exhibit 9.2.3 Final Recommended Performance Standards for OST Assessments — Science

Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points

Grade 5

Basic 7 -0.92 88 30




Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points

Advanced 60 1.25 17 75

Grade 8

Basic 9 -1.14 82 25



Advanced 61 1.08 10 73

Physical Science

Basic 6 -1.56 87 20



Advanced 63 0.95 4 70

Biology

Basic 13 -1.19 79 21



Advanced 63 0.51 17 57

Exhibit 9.2.4 Final Recommended Performance Standards for OST Assessments — Social Studies

Test Performance

Level


Page Theta


Above Performance

Standard


Raw Score Points

Grade 4

Basic 8 -0.92 88 33



Advanced 62 1.58 5 81

Grade 6

Basic 15 -0.22 77 44



Advanced 60 1.71 13 83

American History

Basic 9 -0.98 88 31



Advanced 58 1.12 18 73

American Government

Basic 10 -1.11 90 27



Advanced 69 1.66 4 81


Exhibit 9.2.5 shows the estimated percentage of students classified at each performance level based on final panelist-recommended standards for each OST assessments in ELA and mathematics.

Exhibit 9.2.5 Estimated Percentage of Students Classified in Each OST Performance Level

Test Limited Basic Proficient Accelerated Advanced

ELA

Grade 3 25 20 19 19 17

Grade 4 27 19 20 19 14

Grade 5 22 21 21 21 15

Grade 6 20 22 21 19 18

Grade 7 24 22 22 18 14

Grade 8 28 17 26 16 13

ELA I 29 18 29 13 12

ELA II 28 20 27 14 11

Mathematics

Grade 3 18 16 29 25 11

Grade 4 22 13 28 29 9

Grade 5 22 14 31 24 10

Grade 6 21 17 31 18 13

Grade 7 25 14 25 25 11

Grade 8 26 11 30 20 13

Algebra 28 14 22 22 13

Geometry 25 16 21 23 15

Integrated Math I 28 14 22 22 13

Integrated Math II 28 17 20 23 13

Science

Grade 5 12 26 24 21 17

Grade 8 18 22 23 27 10

Physical Science 13 24 41 18 4

Biology 21 19 33 9 17

Social Studies

Grade 4 12 19 41 24 5

Grade 6 23 20 21 22 13

American History 12 18 36 17 18

American Government 10 23 49 14 4

As noted previously, the proficiency rates provided to standard-setting panelists were projected from Washington

State performance on the Smarter Balanced assessments. However, Smarter Balanced does not assess student

achievement in grades 9 and 10, and the grade 11 assessment is not a course-based test. Therefore, to estimate

the impacts for the Algebra I/IM I and Geometry/IM II scales, AIR psychometricians applied the vertical linking

constants in the underlying AIRCore scale to the grade 8 ability estimates to project student achievement in grades

9 and 10. Based on the Ohio results, however, it appeared that the vertical linking constants in the underlying scale

overestimated the growth rate between Algebra I and Geometry as observed in Ohio. Therefore, prior to the final

scoring of the spring 2016 tests, modified cut scores for Geometry/IM II were obtained by adjusting the vertical


linking constant to reflect the observed difference between Algebra I and Geometry. While the adjusted impact

rates are still lower than those projected, they do become consistent with rates observed for Algebra I/IM I. Exhibit

9.2.6 shows the percentage of students classified at each performance level in the spring 2016 administration of

the ELA and mathematics assessments based on final panelist-recommended standards.

Exhibit 9.2.6 Percentage of Students Classified in Each OST Performance Level — Spring 2016 ELA and

Mathematics

Test % Limited % Basic % Proficient % Accelerated % Advanced

ELA

Grade 3 14 20 14 21 20

Grade 4 20 18 18 21 18

Grade 5 20 19 19 22 19

Grade 6 21 15 16 23 15

Grade 7 25 21 16 17 21

Grade 8 20 24 12 11 24

ELA I 19 32 14 11 32

ELA II 25 28 15 11 28

Mathematics

Grade 3 22 11 19 20 28

Grade 4 21 10 18 28 23

Grade 5 23 14 27 18 17

Grade 6 26 18 25 14 18

Grade 7 30 15 23 20 12

Grade 8 32 15 34 14 4

Algebra 32 18 25 17 8

Geometry 23 28 27 17 5

Integrated Math I 35 19 22 17 7

Integrated Math II 40 25 19 12 5


Exhibit 9.2.7 shows the percentage of students classified at each performance level in the initial year of the test

administration, based on final panelist-recommended standards for the student population overall across grade

levels and courses for the science and social studies assessments.

Exhibit 9.2.7 Percentage of Students at Each Performance Level based on Final Recommended Performance

Standards — Spring 2016 Science and Social Studies

Test %Limited %Basic %Proficient %Accelerated %Advanced

Science

Grade 5 12 26 24 21 17

Grade 8 18 22 23 27 10

Physical Science 13 24 41 18 4

Biology 21 19 33 9 17

Social Studies

Grade 4 12 19 41 24 5

Grade 6 23 20 21 22 13

American History 12 18 36 17 18

American Government 10 23 49 14 4

9.3 OST TRANSFORMATIONS AND ROUNDING RULES

9.3.1 RULES FOR TRANSFORMING THE WITHIN-GRADE THETA TO THE OST SCALE

There are two milestone performance standards for OST assessments. The proficient performance standard

indicates that students have met expectations for achievement of Ohio’s Learning Standards in the relevant subject

area. The proficient standard is central to Ohio’s new end-of-course based graduation requirements. In OST

assessments’ system, the Accelerated performance standard corresponds to the level of achievement indicating

readiness for post-secondary education without remediation. To support effective communication of these two

milestones, OST will be reporting on a scale that fixes both the proficient and accelerated performance standards.

In addition, ODE wishes to ensure that users do not inadvertently seek to compare performance on OST assessments

with performance on the previous OAA and OGT assessments. To distinguish test scores on OST assessments from

previous assessment results, ODE will adopt 700 as the new proficient cut score. OST assessments will therefore be

transformed from the within-grade theta estimate of ability to the reporting scale using the following

transformation:

𝑆𝑙𝑜𝑝𝑒 =(725−700)25

(𝜃𝐴𝑐𝑐𝑒𝑙−𝜃𝑃𝑟𝑜𝑓) (1)

𝑂𝑆𝑇 𝑆𝑐𝑎𝑙𝑒 𝑆𝑐𝑜𝑟𝑒 = (𝜃 − 𝜃𝑃𝑟𝑜𝑓) ∗ 𝑆𝑙𝑜𝑝𝑒 + 700 (2)

where 700 is the scaled score representing the proficient level performance standard, and the slope is the spread of

the OST assessments’ scale derived from fixing both the proficient performance standard at 700 and accelerated

performance standard at 725. The 𝜃 represents any level of student ability based on the MLE. The 𝜃𝑃𝑟𝑜𝑓 and 𝜃𝐴𝑐𝑐𝑒𝑙

represent the proficient and accelerated cut scores, respectively, adopted by the State Board.


9.3.2 OST ROUNDING RULES

After transforming theta ability estimates to the OST assessments’ reporting scale, the observable scale scores

nearest each of the performance standard cut scores will be evaluated. If the observable scale score nearest the

performance standard is below the cut score, the scale score will be rounded up to be equal to the cut score. If the

observable scale score nearest the performance standard is above the cut score, no special rounding rules will be

applied. Thus, if the student’s scale score is SS0, and adding one raw score point results in a scale score of SS1, then

where SS0 < SScut < SS1, if the SScut —SS0 < SS1 —SScut, then round the student’s scale score to the cut score. Please

note that the scale score SS0, SS1 and SScut need to be rounded to the nearest integer before the comparison.

9.3.3 RULES FOR OVERALL PERFORMANCE LEVEL CLASSIFICATION

Overall scale scores for OST assessments are mapped into five performance levels. The performance level designations are: Level 1 (Limited), Level 2 (Basic), Level 3 (Proficient), Level 4 (Accelerated), and Level 5 (Advanced).

The within-grade performance standards upon which student achievement is classified are provided in Exhibits 9.3.3.1–9.3.3.4.

Exhibit 9.3.3.1: OST Performance Standards Thetas — ELA

OST ELA Basic Proficient Accelerated Advanced

Grade 3 -0.70 -0.09 0.46 1.06

Grade 4 -0.56 0.06 0.65 1.32

Grade 5 -0.74 0.00 0.59 1.29

Grade 6 -0.83 -0.07 0.52 1.14

Grade 7 -0.80 -0.01 0.65 1.29

Grade 8 -0.43 0.15 0.95 1.55

ELA I -0.71 -0.11 0.79 1.31

ELA II -0.77 -0.08 0.75 1.30

Exhibit 9.3.3.2: OST Performance Standards Thetas — Mathematics

OST Mathematics Basic Proficient Accelerated Advanced

Grade 3 -0.61 -0.08 0.68 1.53

Grade 4 -1.05 -0.61 0.15 1.19

Grade 5 -1.05 -0.54 0.43 1.35

Grade 6 -0.83 -0.12 0.89 1.65

Grade 7 -0.76 -0.19 0.68 1.74

Grade 8 -0.69 -0.18 1.06 2.00

Algebra -1.21 -0.57 0.32 1.37

Geometry -0.98 -0.24 0.61 1.66

Integrated Math I -1.20 -0.57 0.32 1.37

Integrated Math II -0.85 -0.11 0.69 1.75


Exhibit 9.3.3.3: OST Performance Standards Thetas — Science

OST Science Basic Proficient Accelerated Advanced

Grade 5 -0.91997 -0.04328 0.56923 1.24605

Grade 8 -1.13745 -0.50512 0.09217 1.07651

Physical Science -1.56235 -0.94268 0.02261 0.94759

Biology -1.18548 -0.67156 0.17575 0.50740

Exhibit 9.3.3.4: OST Performance Standards — Social Studies

OST Social Studies Basic Proficient Accelerated Advanced

Grade 4 -0.91623 -0.40271 0.57222 1.57510

Grade 6 -0.21536 0.36261 0.96849 1.70707

American History -0.97617 -0.36759 0.60310 1.12246

American Government -1.10964 -0.41063 0.91557 1.65763

9.3.4 OST SUBSCALE PERFORMANCE CLASSIFICATION

Subscale performance classifications are computed to classify student performance levels for each of the reporting category subscales with respect to the proficient performance standard.

For each subscale, a mid-range band is defined as extending one SEM below and above the proficient level performance standard. Where student subscale scores are more than one SEM below the proficient standard for the subscale, students are classified as scoring below the standard. Conversely, where student subscale scores are more than one SEM above the proficient standard for the subscale, students are classified as scoring above the standard. Students with subscale scores falling within the mid-range band are classified as scoring near the standard. The rules surrounding classification are described below:

• If 𝑆𝑆𝑜𝑏𝑠 < 𝑆𝑆𝑐𝑢𝑡 − 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as Below Proficient

• If 𝑆𝑆𝑜𝑏𝑠 > 𝑆𝑆𝑐𝑢𝑡 + 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as Above Proficient

• If 𝑆𝑆𝑐𝑢𝑡 − 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡 ≤ 𝑆𝑆𝑜𝑏𝑠 ≤ 𝑆𝑆𝑐𝑢𝑡 + 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as At/Near

Proficient

Where SSobs is the student’s subscale score, SEMcut is the conditional standard error of measurement associated with the proficient standard for the subscale, and SScut is the proficient cut score. Zero and perfect scores on the subscale are always assigned Below Proficient and Above Proficient, respectively. Please note SSobs, SScut, and SEMcut need to be rounded to the nearest integer before the comparison.


10. SCALING AND EQUATING

OST assessments are fixed-form, online assessments, with paper-pencil forms available for schools that are not ready to transition to the online testing environment. For the science and social studies assessments, where items in the online form that cannot be rendered for paper-based administration are replaced with items measuring the same standards and with similar difficulty. In addition to the common operational base form administered to all students participating in a given grade and subject area assessment, each student is also administered a set of embedded field-test items. In the online environment, the field-test distribution engine randomly selects field-test items for administration. Embedded items include newly developed field-test items that do not contribute toward the student’s overall operational score. The paper-pencil forms also include an embedded field-test block that is used to field-test online items rendered for paper-based administration.

The paper-pencil forms are constructed to be as similar as possible to the online forms, and for ELA and mathematics the same items are administered in both modes. There are, however, some online items on the science and social studies assessments for which paper equivalents cannot be rendered. In these instances, replacement items are identified which allow the paper-pencil form to also meet the blueprint.

10.1 ITEM RESPONSE THEORY PROCEDURES

OST assessments in science and social studies were administered for the first time in spring 2015. Following test administration, item response theory (IRT) procedures were used to calibrate item parameter estimates and create the new OST scales for scoring and reporting.48 OST end-of-course assessments in ELA and mathematics, as well as grade 3 ELA, were administered for the first time in December 2015, followed by a comprehensive administration of all ELA and mathematics assessments in spring 2016. This section describes the procedures for calibration of operational item parameters. All calibration procedures are independently applied by AIR.

Within each test, students are able to skip items in both the online and paper-based test platforms. While omitted items are scored as incorrect for purposes of ability estimation, all omitted responses are treated as not administered for purposes of IRT analysis. All students who respond to at least five items or achieve five scores are considered to have attempted a test. All attempted records are included in IRT analysis with the exclusion of student records that are invalidated by TAs.

10.1.1 CALIBRATION OF OST ITEM BANKS

WINSTEPS was used to estimate Rasch and Masters’ partial credit model item parameters for OST. WINSTEPS is publically available and thoroughly documented software from Mesa Press. WINSTEPS employs a joint maximum likelihood approach towards estimation (JMLE), which jointly estimates the person and item parameters. The Rasch model is fit to estimate student responses to dichotomous (0/1 point) items. Masters’ (1982) partial credit model, an extension of the one parameter Rasch model, allows for graded responses and is fit to estimate parameters for polytomous items.

In the base year of OST assessments in science and social studies, operational items for each test were freely calibrated, centering on the mean item difficulty of each operational test form, to establish the new OST reference scales for those assessments. Following the approval of final item parameter estimates for operational items,

48 Standard 4.10 – When a test developer evaluates the psychometric properties of items, the model used for that purpose (e.g., classical test theory, item response theory, or another model) should be documented. The sample used for estimating item properties should be described and should be of adequate size and diversity for the procedure. The process by which items are screened and the data used for screening, such as item difficulty, item discrimination, or differential item functioning (DIF) for major examinee groups, should also be documented. When model-based methods (e.g., IRT) are used to estimate item parameters in test development, the item response model, estimation procedures, and evidence of model fit should be documented.


parameter estimates for the operational items were anchored to their new OST bank values and parameter estimates for field-test and linking items were estimated under that constraint. This placed parameter estimates for all field-test and external linking items on the same OST scale defined by the operational item parameters.

Beginning with the fall 2015 administrations of OST EOC assessments, pre-equated item parameters were used to score student test records in science and social studies.

The first operational forms of OST assessments in ELA and mathematics were constructed using items in the AIRCore item bank. These items were developed to be aligned to the CCSS and had all been previously administered as part of statewide assessments in Arizona, Florida, Utah, and/or Oregon. Following administration in one or more of the statewide assessment systems and completion of the item review process, AIRCore items were calibrated using Rasch and Masters’ Partial Credit, and linked to a common scale. In December 2015, a standard setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards on the AIRCore scale for reporting student achievement of Ohio’s Learning Standards in ELA and mathematics. Because the sample of students administered the fall EOC tests is small and unrepresentative of the state population, and because the grade 3 ELA assessment was administered to grade 3 students early in the school year and before they could be expected to achieve grade 3 learning standards, the fall 2015 ELA and mathematics tests were scored using the AIRCore bank item parameter estimates.

The first operational administration of the full system of OST assessments in ELA and mathematics took place in spring 2016. Item parameters for all the ELA and mathematics assessments were freely calibrated following the spring administration. The OST assessments’ scale for each of the ELA and mathematics tests was established by centering the operational test form item difficulties to zero. The mean-mean equating procedure was used to link the spring 2016 OST item parameters to the AIRCore scale on which performance standards were recommended, allowing those performance standards to be placed onto the new OST scale.

Because the high school end-of-course tests in mathematics include Integrated Mathematics I and Integrated Mathematics II assessments that are constructed from items in both the Algebra I and Geometry item banks, it is not possible to maintain separate banks for each of the EOC mathematics assessments. Following the spring 2016 administration, the decision was made to adopt the standard-setting scale, which was common for all high school mathematics items, as the reference scale for the mathematics EOC assessments. Thus, the linking constants identified following the spring 2016 administration were applied to the spring 2016 item parameter estimates to place them back to the standard-setting scale.

10.1.2 ESTIMATING STUDENT ABILITY USING MAXIMUM LIKELIHOOD ESTIMATION

OST is scored using maximum likelihood estimation.49 As described previously, parameter estimates are calibrated using the Rasch model for dichotomously scored items and Masters’ partial credit model for polytomous items.

LIKELIHOOD FUNCTION

The likelihood function for generating the MLEs is based on a mixture of item types and can therefore be expressed as:

L(θ) = L(θ)MCL(θ)CR

49 Standard 5.0 – Test scores should be derived in a way that supports the interpretations of test scores for the proposed uses of tests. Test developers and users should document evidence of fairness, reliability, and validity of test scores for their proposed use. Standard 5.2 – The procedures for constructing scales used for reporting scores and the rationale for these procedures should be described clearly.


where:

L(θ)MC = ∏ [1

1 + exp [−D(θ − bi)]]

xi

[1 +1

1 + exp [−D(θ − bi)]]

1−xiN

i=1

L(θ)CR = ∏exp ∑ D(θ − δki)

xik=1

1 + ∑ exp ∑ D(θ − δki)j

k=1

mij=1

N

i=1

and where bi is the location parameter, xi is the observed response to the item, i indexes item and δki is the kth step for item i with m total categories.

We subsequently find arg max θ L(θ) as the student’s theta (i.e., MLE) given the set of items administered to the student.

DERIVATIVES

Finding the maximum of the likelihood requires an iterative method, such as Newton-Raphson iterations. Since the log-likelihood is a monotonic function of the likelihood, the following derivatives based on the log-likelihood function (with Rasch constraints) are used:

𝜕ln𝐿(𝜃)𝑀𝐶

𝜕𝜃= ∑ 𝑥𝑖 − [

1

1 + exp [−(𝜃 − 𝑏𝑖)]]

𝑁

𝑖=1

𝜕ln𝐿(𝜃)𝐶𝑅

𝜕𝜃= ∑ 𝑥𝑖 − [

∑ 𝑗𝑚𝑖𝑗=1 exp ∑ (𝜃 − 𝛿𝑘𝑖)

𝑥𝑖𝑘=1

1 + ∑ exp ∑ (𝜃 − 𝛿𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1

]

𝑁

𝑖=1

𝜕2ln𝐿(𝜃)𝑀𝐶

𝜕2𝜃𝜕𝜃2= − ∑ (1 − [

1

1 + exp [−(𝜃 − 𝑏𝑖)]])

𝑁

𝑖=1


𝜕2𝜃

= − ∑ (1 − [1

1 + exp [−(𝜃 − 𝑏𝑖)]])

𝑁

𝑖=1

[1

1 + exp [−(𝜃 − 𝑏𝑖)]]

𝜕2ln𝐿(𝜃)𝐶𝑅

𝜕2𝜃𝜕𝜃2= ∑ [

∑ 𝑗𝑚𝑖𝑗=1 exp ∑ (𝜃 − 𝛿𝑘𝑖)

𝑥𝑖𝑘=1

1 + ∑ exp ∑ (𝜃 − 𝛿𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1

]

2𝑁

𝑖=1


𝜕2𝜃

= ∑ [∑ 𝑗

𝑚𝑖𝑗=1 exp ∑ (𝜃 − 𝛿𝑘𝑖)

𝑥𝑖𝑘=1

1 + ∑ exp ∑ (𝜃 − 𝛿𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1

]

2𝑁

𝑖=1

− [∑ 𝑗2𝑚𝑖

𝑗=1 exp ∑ (𝜃 − 𝛿𝑘𝑖)𝑥𝑖𝑘=1

1 + ∑ exp ∑ (𝜃 − 𝛿𝑘𝑖)𝑗𝑘=1

𝑚𝑖𝑗=1

]

Hence, the estimated MLE is found via the following maximization routine:

𝜃𝑡+1 = 𝜃𝑡 −𝜕ln𝐿(𝜃𝑡)

𝜕𝜃𝑡

𝜕2ln𝐿(𝜃𝑡)

𝜕2𝜃𝑡

⁄

where

𝜕ln𝐿(𝜃)

𝜕𝜃=

𝜕ln𝐿(𝜃)𝑀𝐶

𝜕𝜃+

𝜕ln𝐿(𝜃)𝐶𝑅

𝜕𝜃


𝜕2ln𝐿(𝜃)

𝜕2𝜃𝜕𝜃2=


𝜕2𝜃𝜕𝜃2+


𝜕2𝜃𝜕𝜃2

𝜕2ln𝐿(𝜃)

𝜕2𝜃=


𝜕2𝜃+


𝜕2𝜃

and where θt denotes the estimated θ at iteration t.

ESTIMATING ZERO AND PERFECT SCORES

In the event of zero or perfect scores, a procedure recommended by Berkson (as cited in Linacre, 2004) is implemented to add (or subtract) 0.5 to (from) the zero (perfect) score prior to estimating student ability. Thus, for students responding incorrectly to all items in a scale or subscale, students will be assigned a test raw score of 0.5. Conversely, for students responding correctly to all items in a scale or subscale, 0.5 will be subtracted from the test raw score.

10.2 OST REPORTING SCALE (SCALE SCORES)

There are two milestone performance standards for OST assessments: proficient and accelerated. The proficient

performance standard indicates that students have met expectations for achievement of Ohio’s Learning Standards

in science and social studies. The proficient standard is central to Ohio’s new EOC-based graduation requirements

and accountability practices. In the OST assessment system, the accelerated performance standard corresponds to

the level of achievement indicating readiness for post-secondary education without remediation. To support

effective communication of these two milestones, OST assessments are reported on a scale that fixes both the

proficient and accelerated performance standards. In addition, ODE wishes to ensure that users do not inadvertently

seek to compare performance on OST assessments with performance on the previous OAA and OGT assessments.

To distinguish test scores on OST assessments from previous assessment results, ODE adopted 700 as the new

proficient cut score. OST assessments have therefore been transformed from the within-grade theta estimate of

ability to the reporting scale using the following transformation:50

𝑆𝑙𝑜𝑝𝑒 =725−700

(𝜃𝐴𝑐𝑐𝑒𝑙−𝜃𝑃𝑟𝑜𝑓) (1)

𝑂𝑆𝑇 𝑆𝑐𝑎𝑙𝑒 𝑆𝑐𝑜𝑟𝑒 = (𝜃 − 𝜃𝑃𝑟𝑜𝑓) ∗ 𝑆𝑙𝑜𝑝𝑒 + 700 (2)

where 700 is the scaled score representing the proficient level performance standard, and the slope is the spread of

the OST assessments’ scale derived from fixing both the proficient and accelerated performance standards. The 𝜃

represents any level of student ability based on the MLE. The 𝜃𝑃𝑟𝑜𝑓 and 𝜃𝐴𝑐𝑐𝑒𝑙 represent the proficient and

accelerated cut scores, respectively, adopted by the State Board.

Overall scale scores for OST have been mapped into five performance levels per grade/course. The performance

level designations are: Basic, Limited, Proficient, Accelerated, and Advanced. The performance level is evaluated

using the rounded scale score. Exhibit 10.2.1 shows the scale score ranges for the performance levels for each of the

assessments.

50 Standard 5.2 – The procedures for constructing scales used for reporting scores and the rationale for these procedures should be described clearly.


Exhibit 10.2.1: Scale Score Ranges for Performance Levels

Grade / Course Limited Basic Proficient Accelerated Advanced

ELA

Grade 3 545–671 672–699 700–724 725–751 752–863

Grade 4 549–673 674–699 700–724 725–752 753–846

Grade 5 552–668 669–699 700–724 725–754 755–848

Grade 6 555–667 668–699 700–724 725–750 751–851

Grade 7 568–669 670–699 700–724 725–748 749–833

Grade 8 586–681 682–699 700–724 725–743 744–805

ELA I 606–682 683–699 700–724 725–738 739–800

ELA II 597–678 679–699 700–724 725–741 742–808

Mathematics

Grade 3 587–681 682–699 700–724 725–752 753–818

Grade 4 605–684 685–699 700–724 725–758 759–835

Grade 5 624–686 687–699 700–724 725–747 748–804

Grade 6 616–681 682–699 700–724 725–743 744–790

Grade 7 605–683 684–699 700–724 725–754 755–806

Grade 8 633–689 690–699 700–724 725–743 744–774

Algebra 618–681 682–699 700–724 725–753 754–814

Geometry 604–677 678–699 700–724 725–755 756–810

Integrated Math I 618–681 682–699 700–724 725–753 754–814

Integrated Math II 594–676 677–699 700–724 725–757 758–813

Science

Grade 5 559–663 664–699 700–724 725–752 753–845

Grade 8 575–673 674–699 700–724 725–765 766–868

Biology 617–684 685–699 700–724 725–734 735–823

Physical Science 634–683 684–699 700–724 725–748 749–815

Social Studies

Grade 4 621–686 687–699 700–724 725–750 751–800

Grade 6 541–675 676–699 700–724 725–754 755–829

American History 619–683 684–699 700–724 725–737 738–800

American Government 642–686 687–699 700–724 725–738 739–774


10.3 EQUATING PAPER-PENCIL AND ONLINE TEST SCORES

Prior to reporting test scores for OST assessments, a mode comparability study was performed to evaluate differences in test performance attributable to the mode of test administration, and to identify the linking constants necessary to place item parameter estimates across modes on a common scale for test scoring and reporting.51

A matched samples design (Way, Davis, and Fitzpatrick, 2006) was used to investigate mode comparability. A covariate regression approach was implemented to construct equivalent groups of students taking OST assessments for both modes of test administration. The regression analysis identified for each student a predicted score on the paper-based OST assessment from previous year achievement, covarying demographic variables that included gender, ethnicity, Limited English Proficiency (LEP) status, and Individualized Education Program (IEP) in the development of the prediction equation. A nearest neighbor search procedure was then applied to the predicted OST scores to select the equivalent groups of students. This procedure resulted in the identification of two matched samples for each assessment to conduct the mode comparability study.

Independent calibration of common items between the matched samples indicated that while mean differences in item difficulty between the two modes were generally small, some items performed quite differently across modes with some items much easier when administered online while other items appeared to be more difficult for online students.

Equating constants were computed to place the matched sample paper-based item parameters on the online scale. Because ODE does not intend to maintain separate item banks for the online and paper-based assessments, we compared the performance of the matched online and paper-pencil samples scoring the paper-pencil tests using both the online item parameters as well as adjusted online item parameters, which applied the common item equating constant to the online item parameters. Application of the equating constant to produce adjusted online item parameters generally brought the ability estimates of the matched samples more in line with the expectation of equivalent achievement between the two samples. The mean of item difficulty parameters for online and paper-pencil tests and the mode linking constants between two modes using common items are presented in Exhibit 10.3.1.

Because the equating constants were based only on the common items between the online and paper-pencil assessments, we also evaluated the results of the common item equating with an equipercentile equating approach. Comparison of the common item and equipercentile equating approaches indicated general consistency between linked ability estimates between the two methods. Although there was some slight divergence between methods for some assessments for estimates of low ability students, convergence between the two methods supports use of the common item approach for identifying a linking constant to adjust for any mode differences.

Following presentation of mode comparability results, ODE’s technical advisory committee recommended that rather than apply mode correction constants following each test administration, Ohio should focus on moving toward a fully online assessment system as quickly as possible.52

51 Standard 5.13 – When claims of form-to-form score equivalence are based on equating procedures, detailed technical information should be provided on the method by which equating functions were established and on the accuracy of the equating functions. 52 Standard 5.23


Exhibit 10.3.1: Mode Linking Constants

Test Mean Item Difficulties Mode Linking Constant

% Taking Paper Online Paper-Pencil Theta score Scale Score

ELA

Grade 3 -0.03 0.10 -0.13 -5.91 33%

Grade 4 -0.33 -0.44 0.11 4.66 24%

Grade 5 -0.34 -0.39 0.04 1.69 21%

Grade 6 -0.18 -0.26 0.07 2.97 19%

Grade 7 -0.21 -0.18 -0.03 -1.19 17%

Grade 8 -0.11 -0.19 0.08 2.47 18%

ELA I -0.15 -0.24 0.09 2.57 18%

ELA II -0.05 -0.15 0.10 3.01 18%

Mathematics

Grade 3 -0.63 -0.51 -0.13 -4.28 27%

Grade 4 -0.22 -0.14 -0.08 -2.63 24%

Grade 5 -0.01 0.15 -0.16 -4.12 20%

Grade 6 -0.27 -0.22 -0.06 -1.49 19%

Grade 7 -0.08 -0.09 0.00 0.00 17%

Grade 8 0.08 0.03 0.05 1.01 18%

Algebra 0.76 0.74 0.02 0.56 17%

Geometry 1.16 1.08 0.07 2.06 17%


11. CONSTRUCTED-RESPONSE SCORING

The OST assessments utilize a variety of item types to assess students’ mastery of Ohio’s Learning Standards. AIR uses item scoring technology to machine-score student responses to most items, including traditional selected-response item types (including multiple-choice items), and machine-scored constructed-response (MSCR) items types. These item types are designed to capture and score a variety of response types, such as graphing, drawing or arranging graphic regions, selecting or rearranging sentences or phrases within passages, or entering equations or words, allowing OST items to assess a wide range of student knowledge and skills. In most cases, machine-scored constructed-response items that are developed for online administration are adapted for paper-based testing and responses are captured in a format that allows machine-scoring.

In addition, human raters score some constructed-response items. AIR subcontracts with Data Recognition Corp. (DRC) to fulfill all OST handscoring requirements. This section describes the process for configuring and validating machine rubrics and the process for handscoring, including rules, descriptions of scorer training and systems used, and mechanisms for ensuring reliability and validity of item scores.

11.1 MACHINE-SCORING

11.1.1 EXPLICIT RUBRICS

As part of the item development process for machine-scored item types other than multiple-choice, a rubric validation process is enacted to verify that rubrics are implemented as intended, and responses are scored correctly. This procedure is conducted following the initial administration of items, usually when the item is field tested, and allows test developers to review the intended performance of the rubric versus the rubric’s actual behavior. Students’ responses are reviewed by test development experts, along with resulting item scores, to ensure that the rubric is functioning as intended and awarding credit appropriately. If necessary, test developers can modify machine rubrics to address insufficiencies, and automatically rescore student responses for the item, repeating the process as necessary to finalize and approve the machine-scored rubrics. Test developers review a strategic sample of responses, including responses where high-achieving students scored poorly on the item, lower-achieving students scored well on the item, and randomly selected responses from the population.

11.1.2 ESSAY AUTOSCORING

As part of each OST ELA test administration, students were administered a writing task. In fall 2016, writing responses produced from online test administrations were machine-scored, and all writing tasks administered on paper-pencil tests were handscored following the procedures described in section 11.2. In spring 2017, each of the writing tasks were paired with a reading passage and being randomly administered to students as operational-field-test items. All of the writing tasks were handscored by DRC.

For AIRCore writing tasks that had previously been administered online in Florida field tests (grades 8–10) or Utah SAGE summative assessments (grade 3), ODE adopted the scoring models generated from student responses in those test administrations. Because the scoring models are based on semantic and syntactic features of the text that discriminate high versus low scoring essays as determined by human raters, the models are highly generalizable.

To develop the scoring models, AIR drew a random sample of 2,000 responses to each of the writing tasks for use in building the statistical scoring models. Those responses were double scored by human readers, and any discrepancies were routed for resolution scoring. The resolution of all discrepancies is essential to ensure that the


human-assigned scores used to develop the statistical scoring model are highly refined and thus limit to the extent possible human error in the assignment of dimension scores that would be captured in the scoring models.53

The random sample of 2,000 responses was divided into a model-building sample of 1,500 responses and a cross-validation sample of 500 responses. Model performance was evaluated on the cross-validation sample to ensure that model fit indices were not based on the model building sample, which may inflate fit indicators.

The statistical rubrics used to develop the scoring models measure a broad set of features, some of which may be item specific and “learned” from a training set. During training, these features are related to handscores through a statistical model. The resulting estimates complete a prediction equation that predicts how a human would score a response with the measured features. Statistical rubrics are, effectively, proxy measures. Although they can directly measure some aspects of writing conventions (e.g., use of passive voice, misspellings, run-on sentences), they do not make direct measures of argument structure or content relevance. Hence, although statistical rubrics often prove useful for scoring essays and even for providing some diagnostic feedback in writing, they do not develop a sufficiently specific model of the correct semantic structure to score many propositional items. Further, they cannot provide the explanatory or diagnostic information available from an explicit rubric. For example, the frequency of incorrect spellings may predict whether a response to a factual item is correct—higher-performing students may also have better spelling skills. Spelling may prove useful in predicting the handscore, but it is not the “reason” that the handscorer deducts points. The statistical rubrics therefore are not about explanation or reason but rather about a prediction of how a human would score the response.

As noted, the engine employs a “training set,” a set of essay responses scored with maximally valid dimension scores, which we obtain by having all responses double-scored by expert scorers and a thorough adjudication process for any discrepant scores. The quality of the human-assigned scores is critical to the identification of a valid model and final performance of the scoring engine. Approximately 1500 essay responses were selected at random from the set of scored essay responses to serve as the training set.

For each dimension in the rubric, the system estimates an appropriate statistical model relating the measures to the score assigned by humans. This model, along with its final parameter estimates, is used to generate a predicted or “proxy” score.

In addition to the training set, an independent random sample of responses is drawn for cross-validation of the identified scoring rubric. As with the training set, student responses in the cross-validation study are handscored, and agreement between human- and machine-assigned scores is examined. The cross-validation process ensures that the rubric generalizes across all responses and that the statistical model identified during training does not capitalize on peculiarities in the training set.

For each of the responses of the validation set, whether or not the predicted score matches the score of record is coded with a new binary variable (match = 1, no match = 0). That variable is predicted with a probit model that has three predictors: word count, the probability of the assigned score under the regression model for predicting the scores, and the Mahalanobis distance between the response and the average of the training set on all the features used in the regression model as predictors (document qualities and LSA dimensions). The predicted probability of a match is used as a confidence measure for the validation data (and also used during operational scoring).

Table 1 in Appendix L presents agreement indicators for the two initial human raters, and between the validated final human and statistical rubric score. 54 Indicators include percent Pearson’s correlation, exact agreement, a

53 Standard 4.19 – When automated algorithms are to be used to score complex examinee responses, characteristics of responses at each score level should be documented along with the theoretical and empirical bases for the use of the algorithms. 54 Standard 6.8 – Those responsible for test scoring should establish scoring protocols. Test scoring that involves human judgment should include rubrics, procedures, and criteria for scoring. When scoring of complex responses is done by computer, the accuracy of the algorithm and processes should be documented.


quadratic weighted kappa statistic, and the standardized mean difference (SMD) between the comparing scores. Although absolute values for evaluating statistics have been advanced (Condon, 2013; Higgins, 2013), the focus of these comparisons is degradation of agreement when moving from human-human agreement to machine-human agreement. Agreement between human raters is an indicator of how reliably the responses can be scored by human raters. Since the statistical rubrics attempt to reproduce human assigned scores, evaluation of machine-human agreement is with respect to observed human-human agreement. Neither human-assigned nor machine-assigned scores will be reliable when human-human agreement is poor. As seen in Table 1 in Appendix L, agreement rates between the machine-assigned scores and the validated final handscores are as high as or higher than agreement rates observed between independent human readers. Table 2 in Appendix L presents the intercorrelations among dimension scores.

To produce a better overall scoring, in spring 2018, the responses with the lowest 25% confidence index were

routed for human verification. In addition to the low confidence responses, all responses assigned the prompt

copy condition code were also sent for human scoring. Responses with low confidence scores were sent for an

independent human read; the reader was not aware of the model based score. For responses assigned the prompt

copy condition code, the reader was informed that the response was flagged for copying of the prompt text, and

asked to either confirm that there is insufficient independent student work to support a writing score, or

determine that there is sufficient independent student writing to support a score and to assign the appropriate

score.

Exhibit 11.1.2.1 shows that human raters returned Prompt Copy Match on about 75% verification cases or, in a

limited number of occasions, assigned zero scores. In the cases where human rater identified sufficient

independent student writing to earn a score, the scores were low, for example 20% returned score of 1 in

Organization and Elaboration.

Exhibit 11.1.2.1: Human Rater Judgments of Responses Assigned the Prompt Copy Condition Code

Distribution of Human Assigned Scores Prompt Copy 0 1 2 3 4

Convention 75% 1% 11% 13% NA NA

Elaboration 75% 1% 20% 4% 0 0

Organization 76% 1% 20% 3% 0 0

11.2 HANDSCORING

AIR subcontracted with DRC to fulfill all handscoring needs for Ohio’s State Tests. For items that were scored by human raters, each student response was scored by at least one reader (Reader 1). Ten percent of all paper-pencil form responses receive a second reading (Reader 2) for the purpose of monitoring and maintaining sufficient inter-rater reliability. The Reader 1 score was the score of record.

11.2.1 RANGEFINDING

For embedded field-test items, DRC’s Content Specialists and Scoring Directors prepare for range finding meetings by using DRC’s Image Handscoring System to access student field-test responses. They select a representative sampling for the full range of student performance. These responses are assembled into range finding sets for each item, and the range finding sets are duplicated for all range finding participants.


Range finding for each item begins with a discussion about the rubric with AIR, ODE, and the committee members. Once an understanding of the rubric has been established, participants score and discuss each response until a consensus is reached. Our facilitators move through each of the responses in the range finding set for that item until there are a sufficient number of responses to construct anchor and training sets. Only responses with a high level of agreement are used to train our scorers. DRC staff makes careful notes of scoring decisions for use in training the scorers.55

11.2.2 DEVELOPING TRAINING MATERIALS AFTER RANGEFINDING

Once range finding is complete, DRC uses the range finding responses to develop training materials for scoring field-test responses. DRC’s Content Specialists and Scoring Directors select anchor and training responses from the sets of range finding responses. Scoring notes generated during the range finding process remains with each response selected, either in the annotation (for anchor papers) or in the Scoring Director’s notes (for training papers). If requested, DRC submits copies of training materials to the state assessment staff for approval prior to their use. Any training material created by DRC can also be provided to ODE in PDF format for archival purposes.

11.2.3 SCORING GUIDES WITH ANCHOR RESPONSES

Each constructed-response item requires item-specific training materials, including a rubric comprised of the item-specific scoring guidelines and 2 to 4 annotated anchor responses to illustrate/exemplify each score point. Anchor papers are selected to illustrate particular scoring concepts. These responses help ensure that scorers are able to make accurate and consistent scoring decisions. All anchor papers are annotated to explain how they exemplify each score point. The anchor sets serve as the scorers’ constant reference.

11.2.4 TRAINING SETS

For each field-test constructed-response item, DRC develops one training set of ten student responses for two-point items and two training sets for four-point items. These training papers hone each scorer’s ability to discern the different score-point levels in an accurate and consistent manner. When reviewing training responses from the front of the scoring room, the Scoring Director uses the notes generated during range finding to ensure that scorers reach scoring decisions in a manner consistent with way the rubrics were applied during range finding.

11.2.5 OPERATIONAL TRAINING AND QUALIFYING MATERIALS

Prior to scoring operational items, DRC provides the field-test training materials (anchor and training sets) for each item selected for operational administration. DRC supplements these field-test training materials with one to two more training sets of 10 student responses and two to three qualifying sets of 10 student responses. These supplemental responses are drawn from exemplar responses generated during field-test scoring. The Scoring Director reviews all exemplar responses. These sets were sent to ODE for approval. The supplemental training and

55 Standard 4.20 – The process for selecting, training, qualifying, and monitoring scorers should be specified by the test developer. The training materials, such as the scoring rubrics and examples of students’ responses that illustrate the levels on the rubric score scale, and the procedures for training scorers should result in a degree of accuracy and agreement among scorers that allows the scores to be interpreted as originally intended by the test developer. Specifications should also describe processes for assessing scorer consistency and potential drift over time in raters’ scoring.


qualifying materials solidify scorers’ understanding of how the range finding and field-test responses were scored in order to ensure accurate and consistent scoring.56

Further quality control measures, including validity and recalibration sets, were implemented for operational items. Cycling responses with known scores into the scoring queue allows DRC to continuously monitor the performance of scores and intervene when necessary. Validity sets were sent to ODE for approval. Recalibration sets use live student responses and are archived with training materials.

11.2.6 HANDSCORING PROCEDURES

Pairs of scorers were seated in ergonomically adjustable chairs at long, rectangular tables. There were two imaging stations at each table. Each workstation included a large flat-screen monitor for clear image reproduction and easy viewing. Each scorer was assigned a unique ID number and password.

Team Leaders assisted the Scoring Directors with scorer training and monitoring. Teams consisted of approximately ten scorers.

The Scoring Director explained in detail the directions for use of the computerized handscoring system. All scorers followed along using the Imaging Handbook, created specifically for DRC scorers.

For each item, scorer training began with a room-wide presentation and discussion of the scoring guide (rubric and anchors) by the Scoring Director. Next, the scorers practiced by scoring the responses in the training set(s). Afterwards, the Scoring Director and/or Team Leaders led a thorough discussion of each set.

The student responses were routed to scorers by grade/course and item. Images of responses were sent to designated groups of scorers qualified to score that item. Only qualified scorers have access to student response images. The scorers read each response and entered the correct scores. After the scores were entered, a new response image appeared. Scorers could not tell if they were reading a response for the first or second time: all readings were, in effect, “blind.”

Ongoing quality control checks and procedures, as described later in this document, were employed to monitor and maintain the quality of the scoring sessions. If any unusual data were observed, DRC would investigate and resolve any issues.

Routing and scoring of images continued until all student responses received the prescribed number of readings (listed below). If the first two scores were equal or adjacent, the score from the first read (R1) was the score of record. If the first two scores (R1 and R2) were non-adjacent (e.g., a 0 and a 2), a resolution reading was performed by a Team Leader or Scoring Director. The third read (R3) was also independent and became the score of record.

All operational item responses were handscored with 10% independent second reads.

For previously field-tested items that were simply being recalibrated, scored a random sample of 1500 responses per item, with 10% second reads.

For new field-test items, scored a random sample of 1500 responses per item, with 100% independent second reads.

DRC’s Image Handscoring System allowed for on-demand retrieval of specified images (e.g., specific batch files, specific grades, specific students) should the need have arisen during or subsequent to the handscoring process.

56 Standard 6.8 – Those responsible for test scoring should establish scoring protocols. Test scoring that involves human judgment should include rubrics, procedures, and criteria for scoring. When scoring of complex responses is done by computer, the accuracy of the algorithm and processes should be documented.


11.2.7 TRAINING OF SCORERS

DRC provides team leaders who assist the Scoring Directors with scorer training and monitoring. Teams consist of approximately ten scorers.

Scorer training begins with a room-wide presentation and discussion of the scoring guide (rubric and anchor responses) by the Scoring Director. Next, the scorers practice by scoring the responses in the training set(s). Afterwards, the Scoring Director and/or Team Leaders lead a thorough discussion of each set.

Once the scorers have become familiar with the rubric and the anchor set and received feedback from the training set(s), they begin scoring.

11.2.8 MONITORING AND MAINTAINING QUALITY CONTROL

Each room of scorers were divided into teams. Each team has approximately ten scorers and are assigned a Team Leader. The Team Leaders conduct routine read-behinds for all scorers. During read-behinds, the Team Leaders review responses and check the scores given by their team members. If a Team Leader disagrees with a reader’s score, the Team Leader will correct the score.

The Team Leaders use these read-behind responses to provide scorers with ongoing feedback and training. DRC’s imaging system allows a Team Leader to determine read-behind rates (frequency of monitoring) for each scorer. If the scorer needs scoring guidelines clarification, or is scoring tentatively, DRC typically monitors one out of five readings. Scorers requiring less feedback receive less frequent read behinds. DRC’s imaging system randomly selects which images the Team Leader will read behind.

A number of handscoring quality control reports are run on a daily basis (or more often as needed). Throughout the handscoring process, the Scoring Directors meet with their Team Leaders each morning to review the reports generated from the previous day’s work. If scoring patterns are apparent, the Team Leaders address any issues on an individual basis.

One key handscoring quality control report is DRC’s Scoring Summary Report, which includes inter-rater reliability and score point distributions by individual and item, both on a daily and a cumulative basis. To monitor scorer reliability and maintain an acceptable level of scoring accuracy, DRC closely reviews daily reports. The reports include item-level data as well as individual scorer data, including scorer number, number of responses scored, individual score point distributions, and exact, adjacent, and non-adjacent agreement rates. DRC investigates any issues and resolve any problems those reports identify. DRC can provide ODE with a copy of the reports on a daily basis or at the end of the project, depending on ODE’s preference.

DRC also studies the inter-rater agreement. Appendix H shows an item summary report of the Inter-Rater Reliability for each test. For operational assessments, DRC strives for 80% exact agreement on 0-2 point items and 70% exact agreement on 0-4 point items. DRC monitors the agreement rates with this in mind, and investigates outliers. There are generally three different causes of lower inter-rater reliability; each cause triggers a different response.

One cause may be scorers misapplying the scoring criteria defined by the rubric and exemplified by the anchor responses. In this case, scorers are re-trained (generally using responses from Team Leader read-behinds for feedback), and, if necessary, scores are erased so that the responses can be redistributed and rescored.

A second, less common cause may be some ambiguity in the rubric or the training materials. If this is uncovered, DRC will work with AIR and ODE to update the rubric and/or the training materials and rescore the responses (if time permits and everyone agrees to this solution).


A third, infrequent cause may be an item that inherently leads to lower reliability. If this is uncovered, DRC will work with AIR and ODE to see if there may be a way to improve the reliability by modifying the rubric or training materials.

11.2.9 HANDLING UNUSUAL RESPONSES AND DISTURBING RESPONSES

Unusual or aberrant responses that cannot be assigned a score receive a nonscorable code following the rules in Exhibit 11.2.9.1 Note that all rubrics have a score point range that includes a score point of “0” that are applied for incorrect responses. This limits the types of responses that can be deemed nonscorable due to falling outside of the criteria defined by the rubric.

Exhibit 11.2.9.1: OST Non-Scorable Codes

Non-score Value

Meaning Definition/Examples/Notes

B Blank The response is completely blank (nothing on the entire response).

F Foreign

Language The response is written in a language other than English.

U Unreadable

The response is unreadable. For example, an online response is unreadable if it only contains repeated/random keystrokes (e.g., “yyyyyyyyyyy”; “av:aeoiahvb;e”; “hhrrttuuvv”). Operational paper-pencil test may be unreadable if it simply contains random letter or drawings or is indecipherable for other reasons.

T Off-Topic Student writes to a subject that is unrelated to the prompt. (This nonscore code is only used on the writing prompts.) DRC will score off topic responses for conventions.

If a scorer assigns a nonscorable code other than Blank to a response, DRC’s Image Handscoring System automatically forwards the response to the Scoring Director. The Scoring Director reviews the response and makes the final determination. During scoring, DRC contacts the designated ODE representative to obtain a ruling on responses that cannot be assigned a score based on our understanding at that point.

To handle possible alert papers (student responses indicating potential issues related to the student’s safety and/or well-being that may require attention at the local level), DRC’s imaging system gives scorers the ability to alert questionable student responses. An alerted image is routed to the Scoring Director who will print the response if he/she determines it to be alertable. Next, these alerts are reviewed by the Handscoring Project Advisor, who then sends copies of the responses to DRC’s Project Management Team. If they also conclude that the response warrants an alert, it is then sent to ODE. Please be assured that at no time during scoring do scorers have access to demographic information on any students participating in the assessment.


12. QUALITY CONTROL PROCEDURES

Quality assurance procedures are enforced through all stages of OST test development, administration, and scoring and reporting of results. This section describes quality assurance procedures associated with the following:

• Test construction

• Test production

• Answer document processing

• Data preparation

• Equating and scaling

• Scoring and reporting

Because quality assurance procedures pervade all aspects of test development, we note that discussion of quality assurance procedures is not limited to this section, but is also included in sections describing all phases of test development and implementation.

12.1 QUALITY ASSURANCE IN TEST CONSTRUCTION

Each form is built to exactly match the detailed test blueprint, and match the target distribution of item difficulty and test information. Together, these constitute the definition of the instrument. The blueprint describes the content to be covered, the depth of knowledge with which it will be covered, the type of items that will measure the constructs, and every other content-relevant aspect of the test. The statistical targets ensure that students will receive scores of similar precision, regardless of which form of the test they receive.

AIR’s test developers use the FormBuilder software to help construct operational forms. FormBuilder interfaces with AIR’s Item Authoring Tool (IAT) to extract test information and interactively creates test characteristics curves (TCCs), test information curves (TICs), and Standard Error of Measurement Curves (SEMCs) as test developers build a test map. This helps our content specialists ensure that the test forms are statistically parallel, in addition to ensuring content parallelism.

Immediately upon generation of a test form, the FormBuilder generates a blueprint match report to ensure that all elements of the test blueprint have been satisfied. In addition, the FormBuilder produces a statistical summary of form characteristics to ensure consistency of test characteristics across test forms. The summary report also flags items with low biserial correlations, as well as very easy and very difficult items. Although items in the operational pool have passed through data review, construction of fixed-form assessments allows another opportunity to ensure that poorly performing items are not included in operational test forms.

The FormBuilder also plots the distribution of item difficulties, both classical and IRT indices, to both flag extremely easy or difficult items and to ensure that the distribution of item difficulties is consistent across test forms. As test developers build forms, FormBuilder generates TCCs, TICs, and CSEMs for the reference (previously used) form and the target (new) form(s) on the screen. The TCCs and SEMCs are plotted using a different color trace line for each prototype form. Using FormBuilder, our content specialists select test items that match the blueprint and are of appropriate difficulty. Beginning with content considerations and supplementing those considerations with statistical considerations, AIR creates alternate, parallel test forms by comparing TCCs for the form that is being created with TCCs from previous forms. To the degree that the TCC for the total test is the same as for previous tests, the raw score required for meeting any performance standard will remain as close to the same as it was on previous forms.

When submitting test forms for review by ODE, AIR produces a form evaluation workbook that includes an evaluation summary checklist, as well as summary statistics and test characteristic graphs.

The mechanical features of a test—arrangement, directions and production—are just as important as the quality of the items. Many factors directly affect a student’s ability to demonstrate proficiency on the assessment, while


others relate to the ability to score the assessment accurately and efficiently. Still others affect the inferences made from the test results.

When the test developer is reviewing a test form for content, in addition to making sure all the benchmark/indicator item requirements are met, test developers must also make sure that the items on the form do not cue each other—that one item does not present material that indicates the answer to another item. This is important to ensure that a student’s response on any particular test item is unaffected by, and is statistically independent of, a response to any other test item. This is called “local independence.” Independence is most commonly violated when there is a hint in one item about the answer to another item. In that case, a student’s true ability on the second item is not being assessed.

Once the items and passages for the form have been selected and matched against the blueprint, the test developer reviews the form for a variety of additional content considerations, including the following:

• The items are sequentially ordered

• Each item of the same type is presented in a consistent manner

• The listing of the options for the multiple-choice items is consistent

• The answer options are lettered with A, B, C, and D

• All graphics are consistently presented

• All tables and charts have titles and are consistently formatted

• The number of the answer choice letters should be approximately equal across the form

• The answer key should be checked by the initial reviewer and one additional independent reviewer

• All stimuli have items associated with them

• The topics of items, passages, or stimuli are not too similar to one another

• There are no errors in spelling, grammar, or accuracy of graphics

• The wording, layout, and appearance of the item matches how the item was field-tested

• There is gender and ethnic balance

• The passage sets do not start with or end with a constructed-response item

• Each item and the form have been checked against the appropriate style guide

• The directions are consistent across items and are accurate

• All copyrighted materials have up-to-date permissions agreements

• Word counts are within documented ranges

After completing the initial build of the form, the test developer hands it off to another content specialist, who conducts a final review of the criteria listed above. If the test specialist reviewer finds any issues, the form is sent back for revisions. If the form meets blueprint and complies with all specified criteria, the test developer sends it to the psychometric team for review. When the psychometric team approves the form, the test developer uploads the item list into FormBuilder. After operational forms were defined in FormBuilder, all bookmaps (test maps), key files, and conversion tables were produced directly from FormBuilder to eliminate the possibility of human error in the construction of these important files. Bookmaps, key files, conversion tables, and other critical documents were generated directly from information maintained in IAT. The information stored in IAT is rigorously reviewed by multiple skilled reviewers, to protect against errors. Automated production of these critical files (such as key files) virtually eliminates opportunities for errors.

Bookmaps include any item attribute stored in IAT, so that in addition to form-level attributes such as test administration and item position, item attributes such as learning standard, benchmark, indicator, complexity, item release status, point value, weight, keyed response, and more are included in the bookmap. The bookmap feature in FormBuilder was customized to OST.

As a further layer of quality assurance for printed test booklets, both during the blueline production phase prior to printing and again following the final printing of all test forms, two AIR technical team staff members independently took all test forms. Responses to the test forms were compared to the answer keys for each form to


confirm the accuracy of scoring keys. In addition, the printed forms were compared against IAT and FormBuilder for content and item ordering to ensure that no changes to the form were introduced prior to printing.

12.2 QUALITY ASSURANCE IN TEST PRODUCTION

The production of computer-delivered assessments involves two distinct types of products, each of which follows an appropriate quality assurance process:

1. Content for online delivery shares some processes with paper-pencil versions, but also requires additional, unique steps.

2. Online test delivery software must deliver the content reliably (and, with the right tools, the accommodations, layouts, etc.).

OST assessments’ test delivery system also has a real-time quality-monitoring component built in. As students are administered assessments, data flow through the test delivery system’s Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.

QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem.

12.2.1 PRODUCTION OF CONTENT

While the online workflow requires some additional steps, it actually removes a substantial amount of work from the time critical path, reducing the likelihood of errors. Like a test book, an online system can deliver a sequence of items; however, the online system makes the layout of that sequence algorithmic. A paper-pencil form must await final forms construction before blackline proofs can show how the item will look in the booklet. Online, the appearance of the item screen can be known with certainty before the final test form is ever constructed. This characteristic of online forms enables us to lock down the final presentation of each item well before forms are constructed. In turn, this moves the final blueline review of items much earlier in the process, removing it from the critical path.

The production of computer-based tests includes five key steps:

1 Final content is previewed and approved in a process called web approval. Web approval packages the item exactly as it will be displayed to the student.

2 Forms are finalized using the process described in Section 6.3, and final forms are approved in our FormBuilder software.

3 Complete test packages are created with our test packager, which gathers the content, form information, display information, and relevant scoring and psychometric information from the item bank and packages it for deployment.

4 Forms are initially deployed to a test site where they undergo platform review, a process during which we ensure that each item displays properly on a large number of platforms representative of those used in the field.

5 The final system is deployed to a staging environment accessible to ODE for user acceptance testing and final review.


12.2.2 WEB APPROVAL OF CONTENT DURING DEVELOPMENT

The Item Authoring Tool (IAT) integrates directly with the test delivery system (TDS) display module, and displays each item exactly as it will appear to the student. This process is called web preview, and web preview is tied to specific item review levels. Upon approval at those levels, the system locks content as it will be displayed to the student, transforming the item representation to the exact representation that will be rendered to the student. No change to the display content can occur without a subsequent web preview. This process freezes the display code that will present the item to the student.

Web approval functions as an item-by-item blueline review. It is the final rendering of the item as the student will see it. Layout changes can be made after this process in two ways:

1. Content can be revised and re-approved for web display.

2. Online style sheets can change to revise the layout of all items on the test.

Both of these processes are subject to strict change control protocols to ensure that accidental changes are not introduced. Below, we discuss automated quality control processes during content publication that raise warnings if item content has changed after the most recent web-approved content was generated. The web approval process offers the benefit of allowing final layout review much earlier in the process, reducing the work that must be done during the very busy period just before tests go live.

12.2.3 APPROVAL OF FINAL FORMS

Section 6.3 describes our process for constructing operational test forms, including the approval of test forms by ODE. The forms are built in FormBuilder (a component of our IAT), and upon approval, they are ready for preliminary publication.

12.2.4 PACKAGING

The test packaging system performs two simultaneous roles in the preparation of computer-based products: It compiles the form definitions and other information about how the test is to be administered (e.g., where any embedded field-test items might be inserted) and pulls together the content packaged during web approval.

The test packager assigns form identifiers to each form, evaluates the form against the blueprint, and performs a quality check against the content. The content quality check includes checks to see that every asset (e.g., graphics) referenced in the item is included in the package, confirms that the item has not changed since it was web approved, and ensures that the items have received all the approvals necessary for publication.

12.2.5 PLATFORM REVIEW

Platform review is a process in which each item is checked to ensure that it is displayed appropriately on each tested platform. A platform is a combination of a hardware device and an operating system. In recent years, the number of platforms has proliferated, and platform review now takes place on approximately 15 platforms that are significantly different from one another.

A team conducts platform review. The team leader projects the item as it was web approved in IAT, and team members, each behind a different platform, look at the same item to see that it renders as expected.


12.2.6 USER ACCEPTANCE TESTING AND FINAL REVIEW

Prior to deployment, the testing system and content are deployed to a staging server where they are subject to user acceptance testing (UAT). UAT of the test delivery system serves both a software evaluation and content approval role. The UAT period provides ODE with an opportunity to interact with the exact test with which the students will interact.

12.2.7 FUNCTIONALITY AND CONFIGURATION

The items, both in themselves and as configured onto the tests, form one type of online product. The delivery of that test can be thought of as an independent service. Here, we document quality assurance procedures for delivering the online assessments.

One area of quality unique to online delivery is the quality of the delivery system. Three activities provide for the predictable, reliable, quality performance of our system:

1. Testing on the system itself to ensure function, performance, and capacity

2. Capacity planning

3. Continuous monitoring

AIR statisticians examine the delivery demands, including the number of tests to be delivered, the length of the testing window, and the historic state-specific behaviors to model the likely peak loads. Using data from the load tests, these calculations indicate the number of each type of server necessary to provide continuous, responsive service, and AIR contracts for service in excess of this amount. Once deployed, our servers are monitored at the hardware, operating system, and software platform levels with monitoring software that alerts our engineers at the first signs that trouble may be ahead. Applications log not only errors and exceptions, but also latency (timing) information for critical database calls. This information enables us to know instantly whether the system is performing as designed, or if it is starting to slow down or experience a problem.

In addition, latency data is captured for each assessed student—data about how long it takes to load, view, or respond to an item. All of this information is logged as well, enabling us to automatically identify schools or districts experiencing unusual slowdowns, often before they even notice.


12.3 QUALITY ASSURANCE IN DOCUMENT PROCESSING

12.3.1 SCANNING ACCURACY

Quality assurance procedures for the scanning process begin before the paper-pencil tests are ever shipped to districts.

The scanning process begins with a specifications document that defines the scanning, edit check, and other business rules that will be used to score responses, capture images and flag scans for human review and editing.

Once ODE gives approval, DRC programmers write the customized scanning programs meeting the specifications.

DRC next reviews the programming in systematic testing and code review, as it does with all of its software development.

DRC then tests the system using a test deck (mock answer documents) marked to cover all responses, blanks, multiple marks, imperfect gridding, and other markings to be defined in conjunction with ODE. These tests validate that the programs process each type of marking as intended. The checks to be conducted include

• readability of security, student and school bar codes;

• data capture of pregridded and bar code information;

• accurate capture of district and school codes;

• consistent data capture on all scanners;

• accurate scan positions on all documents and forms; and • scanner calibration and hardware functionality.

Both AIR and DRC quality management staff confirm the results of the tests before the programs are approved for use.

In addition, once real answer forms arrive, DRC visually inspect a sample to verify the accuracy of the scoring and the correct implementation of the business rules.

Throughout the scanning process, batches are checked for quality and scanning accuracy by experienced document processing staff. All scanners are calibrated and cleaned on a regularly scheduled basis to ensure accurate and consistent scoring. DRC also has an on-site field service engineer to resolve any technical issues as they arise.

DRC’s scanning process produces comprehensive, detailed information, including

• student demographic data;

• student multiple-choice response data;

• TIFF images of complete documents; and

• identifiers to link the TIFF images to the student demographic data.

12.3.2 QUALITY ASSURANCE IN EDITING AND DATA INPUT

After each batch is scanned, the documents are processed through a computer-based editing program to detect potential errors because of smudges, multiple marks and omits in the specified response fields. Marks or omits that do not meet the predefined editing standards are routed to the document processing editing staff for resolution.


Using the unique serial number printed on the document during scanning, the editor compares the actual document to the online data. Corrections are then made to the data file according to predefined, Ohio-specific specifications. The editing staff follows strict quality control procedures to produce clean data files that can be submitted for scoring and reporting functions.

Post-Editing

A final edit is performed to confirm that all requirements for final processing have been met. Once the demographic information and multiple-choice data pass all the predefined editing processes, the images of the student responses to constructed-response (CR) items are extracted into files for scoring. The CR student response images are routed through the DRC Imaging Workflow System to handscoring terminals at DRC’s Scoring Centers for scoring by qualified readers. Images are stored so that they can be efficiently retrieved based on student and school identification information, scores, and item information. Upon completion of processing, scannable documents are boxed for security purposes and final storage.

Throughout the process, DRC operators maintain an issues log. The quality assurance staff will review the log to ensure that every issue has been adequately resolved before the final data are validated.

Data File Construction

DRC ensures that all student answer documents have been accounted for and processed through scanning, pre-editing and post-editing processes. After staff confirms that these processes are complete, final data collection processes begin. The original scanned multiple-choice data are converted into a master student file. Record counts are verified against the counts from the document processing staff to ensure that all students are accounted for in the file.

The data file includes scored data. AIR has developed reliable procedures for ensuring accurate answer keys. The answer to each item is maintained along with the item in the IAT. These keys go through extensive review during development, and after books are ready for print, two members of our technical team take each form of the test as a final confirmation of the key information stored in the IAT. From that point, all keys are automatically generated in machine-readable format from the IAT, virtually eliminating the possibility of version errors or other human errors in these critical documents. DRC systems read the key files generated by the IAT. DRC’s Software Quality Assurance staff compares their scoring file against the approved answer key source file to ensure that it is 100% accurate. AIR staff members independently conduct a review of the data received from DRC, providing an independent accuracy check.

12.4 QUALITY ASSURANCE IN DATA PREPARATION

AIR’s test delivery system has a real-time quality-monitoring component built in. As students test, data flow through our Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.

QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem. This automated process is similar to the sorts of checks that are done for data review, but (a) they are done on operational data and (b) they are conducted in real time so that our psychometricians can catch and correct any problems before they have an opportunity to do any harm.

Data pass directly from the QM to the database of record (DoR), which serves as the repository for all test information, and from which all test information for reporting is pulled. The data extract generator (DEG) is the


tool that is used to pull data from the DoR for delivery to ODE and their quality assurance contractor. AIR psychometricians ensure that data in the extract files matches the DoR prior to delivery to ODE.

12.5 QUALITY ASSURANCE IN TEST FORM EQUATING

Item information necessary for statistical and psychometric analyses is provided to ODE and Assessment and Evaluation Services (AES), ODE’s independent quality assurance contractor, prior to test administration. Item information is published as part of the configuration of the online assessment system that AIR employs for administering, scoring, and reporting test scores. Information contained in these workbooks includes, but is not limited to, unique item ID used for item tracking, test form ID, location on the test form, correct answer, item difficulty, and information about the strand, standard, and benchmark each item measures. These item files are used in quality control checks of the assessment data scoring and analysis.

To ensure security, all data is shared using AIR’s SFTP site.

Prior to operational work, AIR produces simulated datasets for the purpose of testing software and analysis procedures, including a dry run of calibration and post-equating activities and compare results. The practice runs serve two functions:

• To verify accuracy of program code and procedures.

• To evaluate the communication and work flow among participants. If necessary, the team will reconcile differences and correct production or verification programs.

Following the completion of these activities and resolution of questions that arise, analysis specifications are finalized.

12.6 QUALITY ASSURANCE IN SCORING AND REPORTING

12.6.1 QUALITY ASSURANCE IN HANDSCORING

The entire scoring process is managed by DRC’s electronic scoring system, which implements many programmatic controls. The system enables team leaders to call up individual responses, monitor a variety of indicators and designate items for rescoring. Throughout the scoring, the following processes will monitor the validity and reliability of the scores assigned:

Backreading, in which team leaders continuously review the work of the scorers on their teams.

Validity testing, in which each scorer scores at least two validity responses each day. A validity response is a response that has been pre-scored by expert scorers and is placed in the queue of responses to be scored. These responses are “blind” to scorers; scorers do not realize they are assessing pre-scored responses. The data generated by these processes will be presented in a set of at least nine reports, as described in Deliverables 22 and 24.

When a reader does not provide scores that are sufficiently reliable or valid, DRC has a range of remediation options:

Individual coaching, which typically occurs when a team leader disagrees with a reader’s score. The reader and team leader can discuss the situation immediately, which has proved to be a very effective type of feedback because it is done with responses that were recently scored by a particular reader.


Retraining, which occurs if scorers show unacceptable rates of missed validity scores or high non-adjacency rates with other scorers. Errant scorers are retrained in scoring the item and do not score any further operational responses until they have re-qualified to score the item.

Dismissal, which occurs if scorers continue to assign inaccurate scores following retraining on an item. All scores of responses by a dismissed employee are erased and the responses are re-routed to accurate scorers.

DRC’s Image Scoring System maintains the information needed to identify and rescore all papers scored by errant readers so that bad scorers will not contaminate student data.

Routing Responses to Ensure “Blind” Second Reads

DRC’s Image Handscoring System separates responses by item and subject and routes to qualified scorers who read each response and enter the score. The process of routing and scoring responses continues until all responses have received the prescribed level of readings. Scorers cannot tell if they are conducting first or second readings; all readings are “blind.”

Monitoring Scorers

DRC generates handscoring reports on demand in order to monitor progress and maintain handscoring quality control. Reports are also automatically generated overnight. DRC provides copies of reports in PDF format to ODE daily or on demand.

During the handscoring process, the scoring directors meet with their team leaders each morning to review the statistics generated from the previous day’s work. If scoring patterns are apparent among individual scorers, team leaders will deal with these issues on an individual basis. DRC’s imaging system allows a team leader to determine read-behind rates (frequency of monitoring) for each scorer. If a scorer appears to need clarification of the scoring rules, or is scoring tentatively, DRC typically monitors one out of five readings. The imaging system randomly selects which images the team leader monitors.

DRC also monitors the inter-rater reliability. If a scorer falls below an acceptable rate of agreement, the team leader re-trains the scorer. If the scorer fails to improve after retraining and feedback, DRC may remove the scorer from the project. In this situation, DRC removes all scores assigned by the scorer in question. The responses are then re-dealt and rescored.

DRC does not report on scorer performance after the fact, as some contractors do. DRC believes that scorers with less-than-acceptable scoring patterns must be identified immediately and those patterns corrected. DRC has worked diligently to devise effective monitoring reports and procedures to accomplish both detection and correction. Accurate and consistent results are the backbone of all handscoring activities. The following methods used by DRC guarantee scoring quality:

• Rigorous training and qualifying for each item ensures a pool of scorers who will apply consistent and accurate scores.

• Recalibration sets re-focus scorers on the scoring standards by comparing the pre-determined score to that assigned by the scorer.

• Validity responses detect possible room drift and individual drift. Validity reports compare scorers’ scores to pre-determined scores. Validity responses are seeded to scorers. Scorers cannot distinguish validity responses from live responses, making this a powerful measure of quality control.

• Team leaders conduct routine read-behinds to observe, in real time, scorers’ performance. Team leaders utilize live, scored responses to provide ongoing feedback and, if necessary, retraining for scorers.

• Inter-rater reliability and score point distribution reports are generated daily or on demand to monitor scorer reliability and maintain an acceptable level of scoring accuracy. The reports compile individual scorer data, including the scorer identification number, the number of responses scored, individual score


point distributions, and exact agreement rates. DRC investigates any issues and resolve any problems identified by the reports.

Handscoring Quality Assurance Monitoring Reports

DRC produces these reports on demand, they can assure that immediate action is taken to resolve scoring discrepancies within minutes (when necessary) of the first and/or second reading. DRC prepares a number of reports to monitor the quality and effectiveness of various aspects of the project. The reports are described in Exhibit 12.6.1.1.

Exhibit 12.6.1.1: QA Monitoring Reports

Report Report Specifics

Scoring Summary Report

DRC’s Scoring Summary Report provides daily and cumulative inter-rater reliability results, score point distribution data, and production volumes for each reader and item.

Inter-rater Reliability

Monitors how often scorers are in exact agreement with each other and ensures that an acceptable agreement rate is maintained. This report provides daily and cumulative exact and adjacent inter- scorer agreement and the percentage of responses requiring resolution (only if required). The calculations for this report are as follows:

• Percent Exact—total number of responses by scorer where scores are equal divided by the number of responses that were scored twice.

• Percent Adjacent—total number of responses by scorer where scores are one point apart divided by the number of responses that were scored twice.

• Percent Non-Adjacent—total number of responses by scorer where scores are more than one score point apart divided by the number of responses that were scored twice.

Score Point Distribution

Monitors the percentage of responses given to each of the score points. For example, for items on a 0–4 point scale, this daily and cumulative report shows how many 0s, 1s, 2s, 3s, and 4s a scorer has given to all the responses he or she has scored at the time the report is produced. These percentages can be compared to room-wide percentages to detect individual scoring issues.

Production Volumes

This report also indicates the number of responses read by each scorer each day so that production rates can be monitored. Additionally, it includes totals for each item, so that progress toward completion can be monitored.

Item Status Report Monitors the progress of handscoring. This report tracks each response and indicates the status (e.g., “needs a second reading,” “complete”). This report ensures that all discrepancies are resolved by the end of the project.

Responses Read by Reader Report

Identifies all responses scored by an individual scorer. This report is useful if any responses need rescoring due to potential scorer drift.


Report Report Specifics

Read-Behind Log

Used by team leaders/scoring directors to monitor scorer reliability. Team leaders randomly select and read scored responses from each team member daily. If the team leader disagrees with the scorer’s score, remediation occurs, either with the team leader or the scoring director. This has proven to be a very effective form of feedback because it is implemented with items live-scored by individual scorers.

Validity Reports

These reports can be generated on demand throughout the scoring process. All validity reports compare pre-determined scores to scorers’ scores for validity responses. These reports can be run at the individual, team, or room level in order to detect individual, team, or room-wide scorer drift.

Identifying, Evaluating, and Informing the State on Alert Papers

DRC applies a nonscorable code to unusual or aberrant responses that cannot be assigned a score. Prior to scoring, DRC and AIR works closely with ODE to define what constitutes a nonscorable response. During handscoring, DRC contacts the designated ODE representative to obtain a ruling on any response that cannot be assigned a score or a nonscorable code based on current understanding. The image handscoring functionality forwards all potential nonscorable responses to the scoring director. Only the scoring director is able to assign the nonscorable code.

To handle possible alerts (responses indicating potential issues that may require attention at the state or local level, such as potential security breaches or concerns about a student’s safety), DRC’s imaging system gives scorers the ability to alert individual student responses. Alerted images are routed to the scoring director who will print any responses deemed to indicate a potential issue. At no time during scoring do scorers have access to the demographic information of any students participating in the assessment.

Next, these alerts are routed to a handscoring senior manager who reviews them and, if needed, sends copies of the student’s responses to DRC project management staff. A project manager forwards copies of the alerts to ODE.

12.6.2 QUALITY ASSURANCE FOR SCORE REPORTING

As test results come back from the Test Delivery System or scanning-scoring process, they are routed to our test integration system and Quality Monitor, and ultimately to our systems for reporting. Here, we summarize the quality checks that are implemented:

• We ensure that the student response and score data to be reported are correct.

• The reporting software systems accurately report and aggregate the student scores.

• Paper reports contain accurate data correctly displayed, and any Ohio-specific programming is tested and replicated to ensure that it is error free.

• Print and packaging quality is maintained for paper reports.

Student Response and Score Data Are Correct

Data entering the reporting process, either from paper-based or online tests, flow through our Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.


QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem. This automated process is similar to the sorts of checks that are done for data review, but (a) they are done on operational data; and (b) they are conducted in real time so that our psychometricians can catch and correct any problems before they have an opportunity to do any harm.

Reporting System Software Accurately Reports and Aggregates Student Scores

Although test scores in the base year could not be reported until after standard-setting activities were completed, online reports can be configured to appear in real time, and ODE may report test scores immediately in future test administrations. Therefore, quality assurance cannot rely on post-hoc reviews of reports. Instead, the accuracy of the reporting system rests on the quality of programming of the system, the implementation of Ohio’s business rules, and the quality of the algorithms used for aggregating scores.

In building our systems, we undergo an extensive software testing process and when we configure the systems for individual clients, we test them by running simulated or real data through the full system, allowing the system to generate reports. Our statistical programming team simultaneously implements the same rules and processes the same data. Statistics from the system are then compared to statistics produced by our statistical team. Discrepancies are tracked down and resolved.

The entire process is guided by a set of complete reporting and analysis specifications. Both the software team and the independent statistical programming team work from the same specifications, but work independently. This process provides an independent check on the immediate reporting system before it is deployed. The data is available for review by ODE upon request.

Quality Assurance—Statistical Programming

All custom programming is guided by detailed and precise specifications in our Reporting Specifications document. Upon approval of the specifications, analytic rules are programmed and each program is extensively tested on test decks and real data from other programs. Two senior statisticians and one senior programmer review the final programs to ensure that they implement agreed-upon procedures.

Custom programming is implemented independently by two statistical programming teams working from the specifications. Only when the output from both teams matches exactly are the scripts released for production. Quality control, however, does not stop there.

Much of the statistical processing is repeated and AIR has implemented a structured software development process to ensure that the repeated tasks are implemented correctly and identically each time. We write small programs (called macros) that take specified data as input and produce data sets containing derived variables as output. Approximately 30 such macros reside in our library for the grades 3–8 program score reports. Each macro is extensively tested and stored in a central development server. Once a macro is tested and stored, changes to the macro must be approved by the Director of Score Reporting and the Director of Psychometrics, as well as by the project directors for affected projects. A complete retesting with the entire collection of scenarios on which the macro was originally tested follows each change.

The main statistical program is mostly made up of calls to various macros, including macros that read in and verify the data and conversion tables and the macros that do the many complex calculations. This program is developed and tested using artificial data generated to test both typical and extreme cases. In addition, the program goes through a rigorous code review by a senior statistician.

Quality Assurance—Display Programming

The reports are programmed in a Xerox-developed language called VIPP. VIPP code is tested using both artificial and real data. AIR’s data generation utilities can read the output layout specifications and generate artificial data


for direct input into the VIPP programs. This allows the testing of these programs to begin before the statistical programming is complete. In later stages, artificial data are generated according to the input layout and run through the psychometric process and the score reporting statistical programs, and the output is formatted as VIPP input. This enables us to test the entire system.

Once we receive final data and VIPP programs, the AIR Score Reporting team reviews proofs that contain actual data based on quality assurance documentation that is provided by the ODE. In addition, we compare data independently calculated by AIR psychometricians with data on the reports. A large sample of each type of report is reviewed by several AIR staff members to make sure that all data are correctly placed on reports. This rigorous review typically is conducted over several days and takes place in a secure location in the AIR building. All reports containing actual data are stored in a locked storage area. Staff from ODE and its Quality Assurance contractor are welcome to visit AIR at any point during this revision process to oversee our procedures.

Prior to printing the reports, AIR provides a live data file and reports with sample districts as chosen by ODE for review. AIR works closely with the ODE to resolve questions and correct any problems. The reports will not be delivered unless ODE approves the sample reports and data file.

Print and Package Quality is Maintained

Automated tools help ensure quality and accuracy. PrintTracker tracks and manages the print and packaging process. AIR’s PrintTracker software is an online tool for managing the print/pack/ship process across multiple print sites and vendors. PrintTracker manages the workload at each printer, reallocating work in response to unforeseen downtime. It monitors the production of each shipment (e.g., reports going to a district) to ensure that the correct number of packets is packed in the expected number of boxes and shipped to the correct address. Any conflicting information entered by the print operator generates a discrepancy report, which is immediately emailed to the print lead and the on-site AIR representative. Each discrepancy is resolved before a shipment can be released. This ensures that the correct score reports make it to the correct destination every time.

Inline smart-press technology allows inline sorting, folding, stitching, trimming, and offset stacking of reports with varying page counts. This all occurs without human intervention, reducing the opportunities for errors. Materials go straight from offset stacked, collated sets into sealed school packages. The packaging can be heavy cardboard envelopes or boxes, depending on size. School packages are boxed for shipping. Customized packing lists and labels are created from the same data set used to generate the report. For quality assurance, each report is assigned a unique serial number, printed inline on the gutter of each signature as an additional human readable QA device.

PrintTracker has multiple built-in quality assurance features to ensure that reports are correctly packaged and shipped. For example, an operator at a print site must record the number of packages and boxes destined for each school or district. PrintTracker confirms this number against the database before indicating to the operator that the package may be sealed and allowing him or her to print shipping labels. Any discrepancies are reported to, and resolved by, AIR print management staff to ensure that every report is shipped to the correct location.

During printing, reports are checked for color against color samples and print site staff review reports as they are printed to make sure that graphics are printing properly, all pages are correctly printed, and there are no printing errors such as ink smudging or faulty lines. AIR will provide documentation of established quality checks upon request.

12.6.3 QUALITY ASSURANCE FOR TEST SCORING

AIR verifies the accuracy of the scoring engine using simulated test administrations. The simulator generates a sample of students with an ability distribution that matches that of the state. The ability of each simulated students is used to generate a sequence of item responses consistent with the underlying ability. Although the simulations were designed to provide a rigorous test of the adaptive algorithm for adaptively administered tests,


they also provide a check of the full range of item responses and test scores in fixed-form tests as well. Simulations are always generated using the production item selection and scoring engine to ensure that verification of the scoring engine is based on a very wide range of student response patterns.

To verify the accuracy of the online reporting system, we merge item response data with the demographic information taken either from previous year assessment data, or if current year enrollment data is available by the time simulated data files are created, we can verify online reporting using current year testing information. By populating the simulated data files with real school information, it is possible to verify that special school types and special districts are being handled properly in the reporting system.

Specifications for generating simulated data files are included in the Analysis Specifications document submitted to ODE each year. Although ODE does not currently provide immediate reporting, review of all simulated data is scheduled to be completed prior to the opening of the test administration, so that the integrity of item administration, data capture, item and test scoring and reporting can be verified before the system goes live.

To monitor the performance of the assessment system during the testing window, a series of Quality Assurance Reports can be generated at any time during the online assessment window. For example, item analysis reports allow psychometricians to ensure that items are performing as intended and serve as an empirical key check through the operational testing window. In the context of adaptive test administrations, other reports such as blueprint match and item exposure reports allow psychometricians to verify that test administrations conform to specifications.

An additional set of cheating analysis reports flags unlikely patterns of behavior in testing administrations aggregated at the test administration, test administrator, and school level. The quality assurance reports can be generated on any desired schedule. Item analysis and blueprint match reports are evaluated frequently at the opening of the testing window to ensure that test administrations conform to blueprint and items are performing as anticipated.

Each time the reports are generated, the lead psychometrician reviews the results. If any unexpected results are identified, the lead psychometrician alerts the project manager immediately to resolve any issues. Exhibit 12.6.3.1 presents an overview of the quality assurance (QA) reports.

Exhibit 12.6.3.1: Overview of Quality Assurance Reports

QA Reports Purpose Rationale

Item Statistics To confirm whether items work as expected

Early detection of errors (key errors for selected-response items and scoring errors for constructed-response, performance, or technology items)

Blueprint Match Rates To monitor unexpected low blueprint match rates

Early detection of unexpected blueprint match issue

Item Exposure Rates

To monitor unlikely high exposure rates of items or passages or unusually low item pool usage (high unused items/passages)

Early detection of any oversight in the blueprint specification


QA Reports Purpose Rationale

Cheating Analysis To monitor testing irregularities

Early detection of testing irregularities

Item Analysis Report

The item analysis report is used to monitor the performance of test items throughout the testing window and serves as a key check for the early detection of potential problems with item scoring, including incorrect designation of a keyed response or other scoring errors, as well as potential breaches of test security that may be indicated by changes in the difficulty of test items. To examine test items for changes in performance, this report generates classical item analysis indicators of difficulty and discrimination, including proportion correct and biserial/polyserial correlation, as well as IRT based item fit statistics. The report is configurable and can be produced so that only items with statistics falling outside a specified range are flagged for reporting or to generate reports based on all items in the pool.

Item p-Value. For multiple-choice items, the proportion of students selecting each of response option is computed; for constructed-response, performance, and technology items, the proportion of student responses classified at each score point is computed. For multiple-choice items, if the keyed response is not the modal response, the item is also flagged. Although the correct response is not always the modal response, keyed response options flagged for both low biserial correlations and non-modal response are indicative of miskeyed item.

Item Discrimination. Biserial correlations for the keyed response for selected-response items and polyserial correlations for polytomous constructed response, performance, and technology items are computed. AIR psychometric staff evaluates all items with biserial correlations below a target level, even if the obtained values are consistent with past item performance.

Item Fit. In addition to the item difficulty and item discrimination indices, an item fit index is produced for each item. For each student, a residual between observed and expected score given the student’s ability is computed for each item. The residuals for each are averaged across all students, and the average residual is used to flag an item.

We begin by defining Pij = pr(zij=1), representing the probability that student i responds correctly to item j ( the term zij represents the student’s score on the item). For selected-response items we use the 3PL IRT model to

calculate the expected score on item j for student i with estimated ability θ̂ as

𝐸(𝑧𝑖𝑗) = 𝑐𝑗 + (1 − 𝑐𝑗)exp (𝐷𝑎𝑗(�̂�𝑖 − 𝑏𝑗))

1 + exp (𝐷𝑎𝑗(�̂�𝑖 − 𝑏𝑗))

For constructed-response, performance, or technology items, using the Generalized Partial Credit model, the

expected score for student i with estimated ability θ̂ on an item j with a maximum possible score of Kj is calculated as

𝐸(𝑧𝑖𝑗) = ∑𝑙exp(𝐷𝑎𝑗 ∑ (�̂�𝑖 − 𝑏𝑗,𝑘)𝑙

𝑘=1 )

1 + ∑ exp(𝐷𝑎𝑗 ∑ (�̂�𝑖 − 𝑏𝑗,𝑘)𝑚𝑘=1 )

𝐾𝑗

𝑚=1

𝐾𝑗

𝑙=1

For each item j, the residual between observed and expected score for each student is defined as

𝛿𝑖𝑗 = 𝑧𝑖𝑗 − 𝐸(𝑧𝑖𝑗)


The statistic is aggregated across students of different abilities for each item,

𝛿�̅� =1

𝑛∑(𝛿𝑖𝑗)

𝑛

𝑖=1

The report can be configured to report all items or flag and report only those items where the fit index is above a given threshold (e.g., items could be flagged when

𝛿�̅�

𝑠𝑒(𝛿�̅�)> 1.96

where 𝛿�̅� =𝑆𝐷(𝛿𝑖𝑗)

√𝑛.

12.6.4 REPORTING

Scores for online assessments are assigned by automated systems in real time. For machine scored portions of assessments, the machine rubrics are created and reviewed along with the items, then validated and finalized during rubric validation following field testing. The review process “locks down” the item and rubric when the item is approved for web display (Web Approval). During operational testing, actual item responses are compared to expected item responses (given the item response theory [IRT] parameters), which can detect miskeyed items, item drift, or other scoring problems. Potential issues are automatically flagged in reports available to our psychometricians.

The handscoring processes include rigorous training, validity and reliability monitoring, and backreading to ensure accurate scoring. Handscored items are married up with the machine-scored items by our Test Integration System (TIS). The integration is based on identifiers that are never separated from their data and are further checked by the quality monitor (QM) system where the integrated record is passed for scoring. Once the integrated scores are sent to the QM, the records are rescored in the test-scoring system, a mature, well-tested real-time system that applies client-specific scoring rules and assigns scores from the calibrated items, including calculating performance-level indicators, subscale scores and other features, which then pass automatically to the reporting system and Database of Record (DoR). The scoring system is tested extensively prior to deployment, including hand checks of scored tests and large-scale simulations to ensure that point estimates and standard errors are correct.

After passing through the series of validation checks in the QM system, data are passed to the DoR, which serves as the centralized location for all student scores and responses, ensuring that there is only one place where the “official” record is stored. Only after scores have passed the QM checks and are uploaded to the DoR are they passed to the Online Reporting System, which is responsible for presenting individual-level results and calculating and presenting aggregate results. Absolutely no score is reported in the Online Reporting System until it passes all of the QM system’s validation checks.


REFERENCES

American Educational Research Association, American Psychological Association, & National Council on

Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC.

Bentler, P. M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238–246.

Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item

response models and standardized indices. British Journal of Mathematical and Statistical Psychology,

38(1), 67–86.

Huynh, H. (1979). Statistical inference for two reliability indices in mastery testing based on the beta-binomial

model. Journal of Educational Statistics, 4, 231–246.

Kish, L. (1967). Survey sampling, New York, John Wiley and Sons.

Lewis, D.M., Mitzel, H.C., Green, D.R. (1996). Standard Setting: A Bookmark Approach. In D.R. Green (Chair), IRT-

Based Standard-Setting Procedures Utilizing Behavioral Anchoring. Symposium presented at the 1996

Council of Chief State School Officers 1996 National Conference on Large Scale Assessment, Phoenix, AZ.

Linacre, J.M. (2004). A user’s guide to WINSTEPS: Rasch-Model Computer Program. Chicago: MESA Press.

Livingston, S. A., & Wingersky, M. S. (1979). Assessing the reliability of tests used to make pass/fail

decisions. Journal of Educational Measurement, 247–260.

Livingston, S. A., & Lewis, C. (1995). Estimating the consistency and accuracy of classifications based on test

scores. Journal of Educational Measurement, 32(2), 179–197.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174.

McLaughlin, D., Scarloss, B. A., Stancavage, F. B., & Blankenship, C. D. (2005). Using State Assessments to Impute

Achievement of Students Absent from NAEP: An Empirical Study in Four States. Washington, DC:

American Institutes for Research. Retrieved from www.air.org/files/McLaughlin_AbsentStudents.pdf

Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York: Routledge.

Millsap, R. E., & Cham, H. (2012). Investigating factorial invariance in longitudinal data. In B. Laursen, T. D. Little, &

N. A. Card (Eds.), Handbook of Developmental Research Methods (pp. 109–126). New York: Guilford

Press.

Mitzel, H.C., Lewis, D.M., Patz, R.J., & Green, D.R. (2001). The bookmark procedure: Psychological perspectives. In

G.J. Cizek (Ed), Setting performance standards: Concepts, methods and perspectives (pp. 249–281).

Mahwah, NJ: Lawrence Eribaum Assoc.

Sotaridona, L. S., Pornel, J. B., & Vallejo, A. (2003). Some applications of item response theory to testing. The

Phillipine Statistician, 52(1–4), 81–92.

Snijders, T.A.B. (2001). Asymptotic null distribution of person fit statistics with estimated person parameter.

Psychometrika, 66(3), 331–342.

http://www.air.org/files/McLaughlin_AbsentStudents.pdf


Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor

analysis. Psychometrika, 38(1), 1–10.

Way, W. D., Davis, L. L., & Fitzpatrick, S. (2006, April). Score comparability of online and paper administrations of

the Texas Assessment of Knowledge and Skills. In annual meeting of the National Council on

Measurement in Education, San Francisco, CA.

Wesolowsky G.O. 2000. "Detecting Excessive Similarity in Answers on Multiple Choice Exams", Journal of Applied

Statistics, Vol. 27, 909–921.

Ohio’s State Tests —Spring 2018 Administration Technical Report

A-1 American Institutes for Research

Appendix A.1a Global Model Fit Indices of Measurement Invariance Tests – Grade 3 ELA

Invariance Model

χ2 df χ2 Difference Test Change in

RMSEA Comparison χ2(df) p value

Model A: Students’ Gender (Female vs. Male)

Configural 176272.607 700

Metric 176935.899 727 Configural 663.293 (27) < 0.01 0.001

Scalar 179591.433 754 Metric 2655.533 (27) < 0.01 0.001

Model B-1: Students’ Ethnicity (African American vs. White)



Scalar 156938.554 754 Metric 5810.986 (27) < 0.01 0.000

Model B-2: Students’ Ethnicity (Hispanics vs. White)



Scalar 128041.882 754 Metric 1307.840 (27) < 0.01 0.001

Model B-3: Students’ Ethnicity (Asian vs. White)



Scalar 125882.150 754 Metric 1715.535 (27) < 0.01 0.001

Model B-4: Students’ Ethnicity (American Indian vs. White)


Metric 119867.039 727 Configural 7.805 (27) 0.99 0.001

Scalar 119901.028 754 Metric 33.989 (27) 0.17 0.001

Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)



Scalar 133787.036 754 Metric 606.288 (27) < 0.01 0.001

Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)



Scalar 181061.496 754 Metric 4396.176 (27) < 0.01 0.000

Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)



Scalar 179987.338 754 Metric 3578.626 (27) < 0.01 0.001

Appendix A.1b Global Model Fit Indices of Scalar Invariance Model – Grade 3 ELA

Model Chi-Square Test

CFI RMSEA Value df P-Value

Model A 157017.616 736 < 0.01 0.970 0.058

Model B-1 134967.598 736 < 0.01 0.970 0.058

Model B-2 114552.858 736 < 0.01 0.970 0.059

Model B-3 114546.382 736 < 0.01 0.968 0.059

Model B-4 4953.975 626 < 0.01 0.991 0.013





Model B-5 120532.604 736 < 0.01 0.970 0.059

Model C 85146.485 680 < 0.01 0.931 0.044

Model D 150361.839 736 < 0.01 0.973 0.057


Invariance Model






Scalar 134746.210 700 Metric 3637.880 (26) < 0.01 0.000




Scalar 117261.749 700 Metric 5389.679 (26) < 0.01 0.000




Scalar 99165.715 700 Metric 1359.886 (26) < 0.01 0.001




Scalar 97294.713 700 Metric 850.454 (26) < 0.01 0.001




Scalar 92836.238 700 Metric 24.727 (26) 0.53 0.001




Scalar 102634.806 700 Metric 722.792 (26) < 0.01 0.001




Scalar 137650.795 700 Metric 11905.701 (26) < 0.01 0.002




Scalar 133310.002 700 Metric 4156.173 (26) < 0.01 0.000






Model A 79175.409 684 < 0.01 0.983 0.043

Model B-1 70429.082 684 < 0.01 0.982 0.043

Model B-2 61495.500 684 < 0.01 0.982 0.044

Model B-3 62589.438 684 < 0.01 0.981 0.045

Model B-4 41506.750 684 < 0.01 0.984 0.037

Model B-5 64156.587 684 < 0.01 0.982 0.044

Model C 80276.805 684 < 0.01 0.982 0.043

Model D 73622.215 684 < 0.01 0.985 0.041


Invariance Model






Scalar 154734.500 754 Metric 5926.262 (27) < 0.01 0.000




Scalar 135617.320 754 Metric 3110.183 (27) < 0.01 0.000




Scalar 109808.786 754 Metric 680.221 (27) < 0.01 0.001




Scalar 107252.901 754 Metric 756.491 (27) < 0.01 0.001




Scalar 102626.133 754 Metric 37.661 (27) 0.08 0.001




Scalar 113281.002 754 Metric 324.193 (27) < 0.01 0.001




Scalar 157882.846 754 Metric 3472.355 (27) < 0.01 0.001



Invariance Model






Scalar 152039.293 754 Metric 1191.966 (27) < 0.01 0.001




Model A 132173.133 738 < 0.01 0.964 0.053

Model B-1 120039.546 738 < 0.01 0.963 0.054

Model B-2 102794.477 738 < 0.01 0.965 0.054

Model B-3 104288.496 738 < 0.01 0.961 0.055

Model B-4 5865.498 628 < 0.01 0.991 0.014

Model B-5 9662.025 628 < 0.01 0.988 0.017

Model C 139017.910 738 < 0.01 0.963 0.054

Model D 11845.152 628 < 0.01 0.991 0.017


Invariance Model




Configural 168282.645 1120


Scalar 182954.777 1188 Metric 11895.685 (34) < 0.01 0.001


Configural 143585.881 1120


Scalar 153361.720 1188 Metric 3001.360 (34) < 0.01 0.001


Configural 124176.961 1120


Scalar 125638.033 1188 Metric 601.928 (34) < 0.01 0.001


Configural 121927.115 1120


Scalar 123241.196 1188 Metric 856.184 (34) < 0.01 0.001


Configural 117833.007 1120

Metric 117622.219 1154 Configural NA (34) NA 0.000

Scalar 117645.099 1188 Metric 22.880 (34) 0.93 0.001




Invariance Model



Configural 129166.104 1120


Scalar 130101.567 1188 Metric 280.424 (34) < 0.01 0.001


Configural 163070.972 1120


Scalar 182002.462 1188 Metric 7975.415 (34) < 0.01 0.000


Configural 167842.566 1120


Scalar 174443.388 1188 Metric 2053.700 (34) < 0.01 0.001




Model A 102199.823 1168 < 0.01 0.969 0.037

Model B-1 82400.115 1168 < 0.01 0.964 0.036

Model B-2 70065.871 1168 < 0.01 0.965 0.035

Model B-3 71153.323 1168 < 0.01 0.961 0.036

Model B-4 6735.073 1030 < 0.01 0.992 0.011

Model B-5 72573.325 1168 < 0.01 0.965 0.035

Model C 94312.750 1168 < 0.01 0.962 0.036

Model D 89135.877 1168 < 0.01 0.981 0.035


Invariance Model




Configural 153106.579 1330


Scalar 165855.870 1404 Metric 10575.806 (37) < 0.01 0.000


Configural 134663.915 1330


Scalar 141723.904 1404 Metric 2102.855 (37) < 0.01 0.000


Configural 119830.630 1330


Scalar 121210.813 1404 Metric 389.956 (37) < 0.01 0.000


Configural 118713.552 1330



Invariance Model




Scalar 120640.953 1404 Metric 1511.866 (37) < 0.01 0.000


Configural 114856.793 1330


Scalar 114946.565 1404 Metric 41.381 (37) 0.29 0.000


Configural 124523.031 1330


Scalar 125309.279 1404 Metric 325.007 (37) < 0.01 0.000


Configural 153916.614 1330


Scalar 164816.218 1404 Metric 5581.196 (37) < 0.01 0.000


Configural 153929.117 1330


Scalar 158653.435 1404 Metric 1747.079 (37) < 0.01 0.001




Model A 153828.801 1379 < 0.01 0.975 0.042

Model B-1 130919.521 1379 < 0.01 0.974 0.042

Model B-2 116459.642 1379 < 0.01 0.974 0.042

Model B-3 117392.902 1379 < 0.01 0.970 0.043

Model B-4 66384.668 1379 < 0.01 0.979 0.032

Model B-5 121854.746 1379 < 0.01 0.973 0.042

Model C 143154.323 1379 < 0.01 0.972 0.041

Model D 125747.203 1379 < 0.01 0.980 0.038


Invariance Model




Configural 180383.360 1188


Scalar 199572.427 1258 Metric 14860.160 (35) < 0.01 0.001


Configural 158655.439 1188




Invariance Model



Scalar 169750.937 1258 Metric 5206.080 (35) < 0.01 0.000


Configural 139622.510 1188


Scalar 141730.165 1258 Metric 1035.556 (35) < 0.01 0.000


Configural 137768.314 1188


Scalar 139931.044 1258 Metric 1593.184 (35) < 0.01 0.000


Configural 133651.357 1188


Scalar 133755.628 1258 Metric 61.818 (35) < 0.01 0.001


Configural 145125.458 1188


Scalar 146393.178 1258 Metric 602.249 (35) < 0.01 0.001


Configural 178864.671 1188


Scalar 195159.754 1258 Metric 9174.820 (35) < 0.01 0.001


Configural 181989.475 1188


Scalar 188164.706 1258 Metric 3004.650 (35) < 0.01 0.000




Model A 115362.862 1237 < 0.01 0.968 0.038

Model B-1 98674.468 1237 < 0.01 0.968 0.038

Model B-2 83198.288 1237 < 0.01 0.969 0.037

Model B-3 84753.871 1237 < 0.01 0.966 0.038

Model B-4 46742.399 1237 < 0.01 0.976 0.029

Model B-5 88122.358 1237 < 0.01 0.970 0.038

Model C 109656.528 1237 < 0.01 0.970 0.037

Model D 68275.316 1165 < 0.01 0.973 0.030



Appendix A.7a Global Model Fit Indices of Measurement Invariance Tests – High School ELA 1

Invariance Model




Configural 139303.829 1188


Scalar 149070.781 1258 Metric 6414.186 (35) < 0.01 0.000


Configural 119251.442 1188


Scalar 132296.422 1258 Metric 2240.881 (35) < 0.01 0.000


Configural 100970.374 1188


Scalar 104232.537 1258 Metric 790.181 (35) < 0.01 0.000




Scalar 101241.561 1258 Metric 1712.257 (35) < 0.01 0.000




Scalar 96212.336 1258 Metric 36.605 (35) 0.39 0.001


Configural 105233.921 1188


Scalar 107031.991 1258 Metric 493.264 (35) < 0.01 0.000


Configural 136531.110 1188


Scalar 152658.585 1258 Metric 3219.603 (35) < 0.01 0.000


Configural 137523.733 1188


Scalar 148020.572 1258 Metric 3272.807 (35) < 0.01 0.000

Appendix A.7b Global Model Fit Indices of Scalar Invariance Model – High School ELA 1



Model A 118863.527 1237 < 0.01 0.978 0.036

Model B-1 113871.033 1237 < 0.01 0.976 0.037

Model B-2 94548.376 1237 < 0.01 0.976 0.037

Model B-3 97930.200 1237 < 0.01 0.972 0.038

Model B-4 10269.973 1095 < 0.01 0.990 0.013





Model B-5 98736.587 1237 < 0.01 0.977 0.037

Model C 73975.406 1165 < 0.01 0.977 0.029

Model D 66457.321 1165 < 0.01 0.984 0.027

Appendix A.8a Global Model Fit Indices of Measurement Invariance Tests – High School ELA 2

Invariance Model




Configural 144160.958 1258


Scalar 160177.179 1330 Metric 10905.709 (36) < 0.01 0.001


Configural 128065.864 1258


Scalar 138294.074 1330 Metric 3955.377 (36) < 0.01 0.000


Configural 116372.511 1258


Scalar 119302.131 1330 Metric 1324.386 (36) < 0.01 0.001


Configural 115083.545 1258


Scalar 117326.031 1330 Metric 1619.237 (36) < 0.01 0.000


Configural 111535.374 1258


Scalar 111628.834 1330 Metric 44.055 (36) 0.17 0.001


Configural 119700.711 1258


Scalar 120561.258 1330 Metric 282.973 (36) < 0.01 0.001


Configural 145117.348 1258


Scalar 158340.252 1330 Metric 6004.559 (36) < 0.01 0.000


Configural 145933.871 1258


Scalar 154366.269 1330 Metric 6333.395 (36) < 0.01 0.000



Appendix A.8b Global Model Fit Indices of Scalar Invariance Model – High School ELA 2



Model A 103674.106 1306 < 0.01 0.987 0.033

Model B-1 91303.503 1306 < 0.01 0.986 0.033

Model B-2 80680.422 1306 < 0.01 0.986 0.034

Model B-3 84810.984 1306 < 0.01 0.983 0.035

Model B-4 48677.650 1306 < 0.01 0.989 0.027

Model B-5 83255.971 1306 < 0.01 0.985 0.034

Model C 100077.934 1306 < 0.01 0.986 0.033

Model D 87238.839 1306 < 0.01 0.991 0.031

Appendix A.9a Global Model Fit Indices of Measurement Invariance Tests – Grade 3 Math

Invariance Model




Configural 190947.987 1720


Scalar 202057.114 1804 Metric 8802.334 (42) < 0.01 0.001


Configural 150540.004 1720


Scalar 174359.453 1804 Metric 6636.970 (42) < 0.01 0.000


Configural 119295.226 1720


Scalar 121474.981 1804 Metric 464.110 (42) < 0.01 0.001


Configural 114655.043 1720


Scalar 116359.095 1804 Metric 1115.940 (42) < 0.01 0.000


Configural 110917.923 1720


Scalar 111005.438 1804 Metric 36.439 (42) 0.71 0.000


Configural 125372.296 1720


Scalar 127300.676 1804 Metric 444.436 (42) < 0.01 0.001


Configural 180820.553 1720


Scalar 199534.697 1804 Metric 4360.044 (42) < 0.01 0.000



Invariance Model




Configural 190572.554 1720


Scalar 194914.246 1804 Metric 906.661 (42) < 0.01 0.000

Appendix A.9b Global Model Fit Indices of Scalar Invariance Model – Grade 3 Math



Model A 178542.083 1767 < 0.01 0.957 0.040

Model B-1 146430.781 1767 < 0.01 0.950 0.039

Model B-2 100392.823 1767 < 0.01 0.959 0.035

Model B-3 91327.522 1767 < 0.01 0.956 0.034

Model B-4 48266.868 1767 < 0.01 0.974 0.025

Model B-5 108469.708 1767 < 0.01 0.957 0.036

Model C 170436.283 1767 < 0.01 0.952 0.039

Model D 161574.471 1767 < 0.01 0.961 0.038


Invariance Model




Configural 132124.195 2160


Scalar 142829.229 2254 Metric 9009.353 (47) < 0.01 0.001


Configural 110522.702 2160


Scalar 124533.299 2254 Metric 4343.412 (47) < 0.01 0.001




Scalar 95852.257 2254 Metric 605.658 (47) < 0.01 0.000




Scalar 93733.957 2254 Metric 1106.995 (47) < 0.01 0.000




Scalar 89192.200 2254 Metric 50.843 (47) 0.32 0.000




Invariance Model





Scalar 99658.985 2254 Metric 517.051 (47) < 0.01 0.000


Configural 128864.507 2160


Scalar 145824.015 2254 Metric 5096.925 (47) < 0.01 0.000


Configural 132333.410 2160


Scalar 135509.068 2254 Metric 1468.324 (47) < 0.01 0.000




Model A 148317.365 2207 < 0.01 0.969 0.032

Model B-1 119688.757 2207 < 0.01 0.964 0.031

Model B-2 97381.534 2207 < 0.01 0.966 0.031

Model B-3 93061.605 2207 < 0.01 0.962 0.030

Model B-4 44760.357 2207 < 0.01 0.981 0.021

Model B-5 103741.708 2207 < 0.01 0.965 0.031

Model C 136891.966 2207 < 0.01 0.966 0.031

Model D 127289.872 2207 < 0.01 0.972 0.030


Invariance Model




Configural 167827.148 1978


Scalar 181780.169 2068 Metric 11984.982 (45) < 0.01 0.001


Configural 131834.412 1978


Scalar 160457.408 2068 Metric 6468.617 (45) < 0.01 0.000


Configural 113472.774 1978


Scalar 116093.143 2068 Metric 461.008 (45) < 0.01 0.001


Configural 110120.724 1978



Invariance Model




Scalar 112014.539 2068 Metric 1245.309 (45) < 0.01 0.000


Configural 106936.222 1978


Scalar 107086.266 2068 Metric 66.851 (45) 0.02 0.000


Configural 118045.905 1978


Scalar 120510.974 2068 Metric 492.522 (45) < 0.01 0.001


Configural 162442.028 1978


Scalar 189410.778 2068 Metric 11166.610 (45) < 0.01 0.001


Configural 167756.026 1978


Scalar 172860.513 2068 Metric 2006.257 (45) < 0.01 0.000




Model A 130334.860 2026 < 0.01 0.975 0.032

Model B-1 101639.676 2026 < 0.01 0.972 0.030

Model B-2 81594.183 2026 < 0.01 0.975 0.029

Model B-3 80287.581 2026 < 0.01 0.972 0.029

Model B-4 41802.299 2026 < 0.01 0.984 0.021

Model B-5 88250.421 2026 < 0.01 0.974 0.030

Model C 124475.421 2026 < 0.01 0.972 0.031

Model D 108542.900 2026 < 0.01 0.978 0.029


Invariance Model




Configural 158114.805 1978


Scalar 176085.262 2068 Metric 15394.581 (45) < 0.01 0.002


Configural 121340.475 1978




Invariance Model



Scalar 150521.986 2068 Metric 6145.446 (45) < 0.01 0.000


Configural 107722.322 1978


Scalar 110576.871 2068 Metric 557.145 (45) < 0.01 0.000


Configural 106152.536 1978


Scalar 108965.289 2068 Metric 2048.268 (45) < 0.01 0.000


Configural 101947.142 1978


Scalar 102136.836 2068 Metric 52.828 (45) 0.20 0.001


Configural 112226.072 1978


Scalar 114827.666 2068 Metric 438.019 (45) < 0.01 0.000


Configural 139877.131 1978


Scalar 178426.752 2068 Metric 15378.976 (45) < 0.01 0.001


Configural 156734.616 1978


Scalar 162529.372 2068 Metric 3241.917 (45) < 0.01 0.000




Model A 112375.374 2029 < 0.01 0.979 0.030

Model B-1 83800.161 2029 < 0.01 0.978 0.027

Model B-2 63884.428 2029 < 0.01 0.981 0.026

Model B-3 63945.406 2029 < 0.01 0.977 0.026

Model B-4 31920.301 2029 < 0.01 0.989 0.018

Model B-5 67946.129 2029 < 0.01 0.980 0.026

Model C 91468.840 2029 < 0.01 0.979 0.027

Model D 86364.284 2029 < 0.01 0.983 0.026




Invariance Model






Scalar 99662.188 1890 Metric 7497.321 (43) < 0.01 0.000




Scalar 87043.400 1890 Metric 1656.242 (43) < 0.01 0.000




Scalar 66604.358 1890 Metric 291.685 (43) < 0.01 0.000




Scalar 66010.518 1890 Metric 1292.931 (43) < 0.01 0.000




Scalar 62127.231 1890 Metric 53.179 (43) 0.14 0.001




Scalar 69006.771 1890 Metric 234.539 (43) < 0.01 0.000




Scalar 106105.765 1890 Metric 10421.537 (43) < 0.01 0.001




Scalar 92614.653 1890 Metric 1797.496 (43) < 0.01 0.000




Model A 96462.431 1854 < 0.01 0.980 0.029

Model B-1 72340.614 1854 < 0.01 0.979 0.027

Model B-2 61077.288 1854 < 0.01 0.980 0.027

Model B-3 60574.534 1854 < 0.01 0.977 0.027





Model B-4 29630.985 1854 < 0.01 0.988 0.019

Model B-5 65535.622 1854 < 0.01 0.979 0.027

Model C 79157.480 1854 < 0.01 0.979 0.026

Model D 74019.021 1854 < 0.01 0.983 0.026


Invariance Model




Configural 142717.934 2350


Scalar 150954.434 2448 Metric 6195.757 (49) < 0.01 0.000


Configural 115632.325 2350


Scalar 131613.357 2448 Metric 6093.358 (49) < 0.01 0.000


Configural 101885.450 2350


Scalar 103538.186 2448 Metric 497.277 (49) < 0.01 0.000




Scalar 102669.061 2448 Metric 2035.983 (49) < 0.01 0.000




Scalar 97024.289 2448 Metric 64.271 (49) 0.07 0.000


Configural 105622.368 2350


Scalar 107048.496 2448 Metric 420.544 (49) < 0.01 0.000


Configural 127209.091 2350


Scalar 153384.474 2448 Metric 8764.976 (49) < 0.01 0.001


Configural 141725.007 2350


Scalar 145242.165 2448 Metric 1876.149 (49) < 0.01 0.000






Model A 108670.655 2401 < 0.01 0.967 0.030

Model B-1 87774.437 2401 < 0.01 0.963 0.029

Model B-2 70271.540 2401 < 0.01 0.966 0.028

Model B-3 69015.526 2401 < 0.01 0.961 0.028

Model B-4 29900.357 2401 < 0.01 0.979 0.018

Model B-5 75489.717 2401 < 0.01 0.965 0.029

Model C 93707.109 2401 < 0.01 0.964 0.028

Model D 88035.755 2401 < 0.01 0.970 0.027

Appendix A.15a Global Model Fit Indices of Measurement Invariance Tests – Algebra

Invariance Model




Configural 145317.574 2068


Scalar 157550.783 2160 Metric 9603.980 (46) < 0.01 0.000


Configural 114668.900 2068


Scalar 141380.030 2160 Metric 7728.899 (46) < 0.01 0.001


Configural 100042.607 2068


Scalar 104948.891 2160 Metric 1609.835 (46) < 0.01 0.000




Scalar 103587.621 2160 Metric 3031.558 (46) < 0.01 0.000




Scalar 95391.742 2160 Metric 51.295 (46) 0.27 0.000


Configural 103671.582 2068


Scalar 105299.079 2160 Metric 681.740 (46) < 0.01 0.001


Configural 133596.131 2068


Scalar 167543.093 2160 Metric 17189.942 (46) < 0.01 0.001



Invariance Model




Configural 143456.998 2068


Scalar 152056.332 2160 Metric 6427.048 (46) < 0.01 0.000

Appendix A.15b Global Model Fit Indices of Scalar Invariance Model – Algebra



Model A 120175.681 2120 < 0.01 0.976 0.028

Model B-1 89500.076 2120 < 0.01 0.976 0.025

Model B-2 69417.062 2120 < 0.01 0.979 0.024

Model B-3 71063.328 2120 < 0.01 0.975 0.025

Model B-4 35671.528 2120 < 0.01 0.986 0.018

Model B-5 74509.579 2120 < 0.01 0.978 0.025

Model C 97354.383 2120 < 0.01 0.976 0.025

Model D 90478.731 2120 < 0.01 0.980 0.024

Appendix A.16a Global Model Fit Indices of Measurement Invariance Tests – Geometry

Invariance Model




Configural 179535.383 2448


Scalar 194081.235 2548 Metric 11317.926 (50) < 0.01 0.000


Configural 149911.287 2448


Scalar 169351.533 2548 Metric 9206.782 (50) < 0.01 0.000


Configural 134034.325 2448


Scalar 136991.025 2548 Metric 1281.016 (50) < 0.01 0.000


Configural 132014.972 2448


Scalar 135537.557 2548 Metric 2076.798 (50) < 0.01 0.000


Configural 128006.674 2448


Scalar 128146.794 2548 Metric 61.014 (50) 0.14 0.000




Invariance Model



Configural 137247.895 2448


Scalar 138446.519 2548 Metric 599.550 (50) < 0.01 0.000


Configural 168429.004 2448


Scalar 197663.061 2548 Metric 19219.817 (50) < 0.01 0.002


Configural 179174.789 2448


Scalar 185655.520 2548 Metric 4809.234 (50) < 0.01 0.001

Appendix A.16b Global Model Fit Indices of Scalar Invariance Model – Geometry



Model A 181322.936 2501 < 0.01 0.968 0.033

Model B-1 135153.027 2501 < 0.01 0.968 0.031

Model B-2 118898.258 2501 < 0.01 0.969 0.031

Model B-3 129396.628 2501 < 0.01 0.963 0.033

Model B-4 54553.445 2501 < 0.01 0.981 0.021

Model B-5 131293.202 2501 < 0.01 0.966 0.032

Model C 142562.024 2501 < 0.01 0.967 0.030

Model D 127841.920 2501 < 0.01 0.974 0.028

Appendix A.17a Global Model Fit Indices of Measurement Invariance Tests – Integrated Math 1

Invariance Model






Scalar 13653.808 2160 Metric 820.554 (46) < 0.01 0.000




Scalar 13016.987 2160 Metric 501.436 (46) < 0.01 0.000




Scalar 7777.285 2160 Metric 102.493 (46) < 0.01 0.000





Invariance Model




Scalar 8483.634 2160 Metric 624.728 (46) < 0.01 0.001


Configural NA NA

Metric 9545.293 1847 Configural NA (NA) NA NA

Scalar 9577.343 1890 Metric 32.050 (43) 0.89 0.000




Scalar 9304.084 2160 Metric 243.343 (46) < 0.01 0.000




Scalar 14654.803 2160 Metric 998.859 (46) < 0.01 0.001




Scalar 14937.766 2160 Metric 1659.406 (46) < 0.01 0.002

Appendix A.17b Global Model Fit Indices of Scalar Invariance Model – Integrated Math 1



Model A 9882.116 2120 < 0.01 0.984 0.024

Model B-1 6869.263 2120 < 0.01 0.984 0.021

Model B-2 4197.159 2120 < 0.01 0.991 0.017

Model B-3 4964.188 2120 < 0.01 0.988 0.020

Model B-4 NA NA NA NA NA

Model B-5 5544.281 2120 < 0.01 0.987 0.021

Model C 7114.987 2120 < 0.01 0.988 0.019

Model D 8118.431 2120 < 0.01 0.986 0.021

Appendix A.18a Global Model Fit Indices of Measurement Invariance Tests – Integrated Math 2

Invariance Model






Scalar 16469.255 2650 Metric 488.245 (51) < 0.01 0.000






Invariance Model



Scalar 15437.452 2650 Metric 681.036 (51) < 0.01 0.001




Scalar 10411.967 2650 Metric 56.157 (51) 0.29 0.000




Scalar 11165.357 2650 Metric 297.036 (51) < 0.01 0.000


Configural NA 2350

Metric NA 2399 Configural 90.952 (49) < 0.01 0.000

Scalar NA 2448 Metric 67.720 (49) 0.04 0.000




Scalar 11509.815 2650 Metric 245.693 (51) < 0.01 0.000




Scalar 17884.401 2650 Metric 1313.544 (51) < 0.01 0.001




Scalar 17663.151 2650 Metric 928.268 (51) < 0.01 0.001

Appendix A.18b Global Model Fit Indices of Scalar Invariance Model – Integrated Math 2



Model A 12720.000 2600 < 0.01 0.969 0.027

Model B-1 13370.135 2600 < 0.01 0.951 0.030

Model B-2 6859.081 2600 < 0.01 0.973 0.023

Model B-3 7025.287 2600 < 0.01 0.973 0.023

Model B-4 3580.814 2399 < 0.01 0.982 0.013

Model B-5 7402.193 2600 < 0.01 0.973 0.023

Model C 8852.619 2600 < 0.01 0.973 0.021

Model D 11279.954 2600 < 0.01 0.968 0.025



Appendix A.19a Global Model Fit Indices of Measurement Invariance Tests – Grade 5 Science

Invariance Model






Scalar 71076.019 2254 Metric 5151.347 (47) < 0.01 0.001




Scalar 66158.488 2254 Metric 3490.436 (47) < 0.01 0.001




Scalar 45290.173 2254 Metric 484.093 (47) < 0.01 0.000




Scalar 44070.379 2254 Metric 797.656 (47) < 0.01 0.000




Scalar 41605.053 2254 Metric 52.302 (47) 0.28 0.000




Scalar 47225.920 2254 Metric 406.461 (47) < 0.01 0.000




Scalar 72884.224 2254 Metric 4250.357 (47) < 0.01 0.000




Scalar 65615.472 2254 Metric 951.261 (47) < 0.01 0.000

Appendix A.19b Global Model Fit Indices of Scalar Invariance Model – Grade 5 Science



Model A 48465.973 2213 < 0.01 0.986 0.018

Model B-1 38704.713 2213 < 0.01 0.983 0.017

Model B-2 26621.863 2213 < 0.01 0.987 0.015

Model B-3 25118.109 2213 < 0.01 0.986 0.015





Model B-4 13685.571 2213 < 0.01 0.991 0.011

Model B-5 28481.220 2213 < 0.01 0.987 0.016

Model C 48502.598 2213 < 0.01 0.983 0.018

Model D 36286.565 2213 < 0.01 0.990 0.016

Appendix A.20a Global Model Fit Indices of Measurement Invariance Tests – Grade 8 Science

Invariance Model






Scalar 107482.525 2350 Metric 12927.735 (48) < 0.01 0.002




Scalar 89892.739 2350 Metric 5991.158 (48) < 0.01 0.001




Scalar 68191.700 2350 Metric 491.400 (48) < 0.01 0.000




Scalar 68468.267 2350 Metric 1347.000 (48) < 0.01 0.000




Scalar 63911.516 2350 Metric 40.325 (48) 0.78 0.000




Scalar 70296.196 2350 Metric 434.842 (48) < 0.01 0.000




Scalar 101895.282 2350 Metric 7077.778 (48) < 0.01 0.001




Scalar 93680.256 2350 Metric 2123.605 (48) < 0.01 0.000



Appendix A.20b Global Model Fit Indices of Scalar Invariance Model – Grade 8 Science



Model A 70504.219 2306 < 0.01 0.976 0.022

Model B-1 51594.463 2306 < 0.01 0.975 0.020

Model B-2 40574.909 2306 < 0.01 0.978 0.019

Model B-3 41410.612 2306 < 0.01 0.976 0.019

Model B-4 19676.823 2306 < 0.01 0.985 0.013

Model B-5 43588.756 2306 < 0.01 0.977 0.019

Model C 54941.635 2306 < 0.01 0.977 0.019

Model D 47452.805 2306 < 0.01 0.983 0.018

Appendix A.21a Global Model Fit Indices of Measurement Invariance Tests – Physical Science

Invariance Model




Configural NA NA

Metric NA NA Configural NA (NA) NA NA

Scalar NA NA Metric NA (NA) NA NA


Configural NA NA




Configural NA NA




Configural NA NA




Configural NA NA




Configural NA NA




Configural NA NA





Invariance Model




Configural NA NA



Appendix A.21b Global Model Fit Indices of Scalar Invariance Model – Physical Science



Model A NA NA NA NA NA






Model C NA NA NA NA NA

Model D NA NA NA NA NA

Appendix A.22a Global Model Fit Indices of Measurement Invariance Tests – Biology

Invariance Model




Configural NA NA






Scalar 66899.524 1638 Metric 2759.013 (40) < 0.01 0.000


Configural NA NA




Configural NA NA






Scalar 47322.862 1638 Metric 40.547 (40) 0.45 0.000




Invariance Model



Configural NA NA




Configural NA NA




Configural NA NA



Appendix A.22b Global Model Fit Indices of Scalar Invariance Model – Biology



Model A NA NA NA NA NA

Model B-1 62644.623 1607 < 0.01 0.971 0.025



Model B-4 27145.815 1607 < 0.01 0.982 0.018


Model C NA NA NA NA NA

Model D NA NA NA NA NA

Appendix A.23a Global Model Fit Indices of Measurement Invariance Tests – American Government

Invariance Model






Scalar 69797.660 1890 Metric 4543.083 (43) < 0.01 0.001




Scalar 62407.716 1890 Metric 3099.028 (43) < 0.01 0.001




Scalar 48683.136 1890 Metric 765.304 (43) < 0.01 0.001





Invariance Model




Scalar 46876.882 1890 Metric 801.329 (43) < 0.01 0.000




Scalar 45032.853 1890 Metric 58.255 (43) 0.06 0.000




Scalar 48997.369 1890 Metric 289.987 (43) < 0.01 0.000




Scalar 70018.085 1890 Metric 5112.435 (43) < 0.01 0.001




Scalar 66816.812 1890 Metric 2655.683 (43) < 0.01 0.000

Appendix A.23b Global Model Fit Indices of Scalar Invariance Model – American Government



Model A 65296.915 1865 < 0.01 0.963 0.028

Model B-1 54450.972 1865 < 0.01 0.959 0.027

Model B-2 42102.698 1865 < 0.01 0.963 0.026

Model B-3 39864.033 1865 < 0.01 0.961 0.025

Model B-4 20341.352 1865 < 0.01 0.973 0.018

Model B-5 43727.796 1865 < 0.01 0.963 0.026

Model C 58145.074 1865 < 0.01 0.960 0.026

Model D 51333.089 1865 < 0.01 0.970 0.025

Appendix A.24a Global Model Fit Indices of Measurement Invariance Tests – American History

Invariance Model






Scalar 64720.766 2448 Metric 18822.416 (49) < 0.01 0.003






Invariance Model



Scalar 50107.218 2448 Metric 4063.818 (49) < 0.01 0.001




Scalar 35813.482 2448 Metric 861.080 (49) < 0.01 0.000




Scalar 34452.210 2448 Metric 790.911 (49) < 0.01 0.000




Scalar 32860.184 2448 Metric 56.798 (49) 0.21 0.000




Scalar 36159.538 2448 Metric 396.177 (49) < 0.01 0.000




Scalar 54970.627 2448 Metric 5277.455 (49) < 0.01 0.000




Scalar 49707.893 2448 Metric 2226.537 (49) < 0.01 0.000

Appendix A.24b Global Model Fit Indices of Scalar Invariance Model – American History



Model A 61853.159 2411 < 0.01 0.985 0.020

Model B-1 48290.257 2411 < 0.01 0.984 0.018

Model B-2 32825.971 2411 < 0.01 0.988 0.016

Model B-3 30701.853 2411 < 0.01 0.988 0.016

Model B-4 15249.497 2411 < 0.01 0.993 0.011

Model B-5 33960.287 2411 < 0.01 0.988 0.016

Model C 48883.897 2411 < 0.01 0.986 0.017

Model D 41688.775 2411 < 0.01 0.990 0.016


B-1 American Institutes for Research

Table B.1 Number and Percentage of Flagged Aggregate Units for Test Integrity Forensic Studies –

English Language Arts

Test Aggregate Unit Total

Change in Student Performance

Response Latency Person Fit

Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Grade 3 ELA

Student 131,376 917 1% 327 0%

Session 19,307 280 1% 52 0%

Test Administrator

12,701 118 1% 54 0%

School 2,124 - 0% 104 5%

Grade 4 ELA

Student 131,355 2 0% 1,102 1% 305 0%

Session 17,954 2 0% 314 2% 35 0%

Test Administrator

12,396 2 0% 130 1% 41 0%

School 2,129 1 0% - 0% 105 5%

Grade 5 ELA

Student 131,589 717 1% 1,008 1% 1,440 1%

Session 17,127 187 1% 276 2% 105 1%

Test Administrator

11,847 129 1% 102 1% 128 1%

School 1,954 150 8% - 0% 151 8%

Grade 6 ELA

Student 130,098 760 1% 967 1% 1,551 1%

Session 15,034 231 2% 245 2% 665 4%

Test Administrator

10,337 182 2% 88 1% 739 7%

School 1,617 191 12% 1 0% 394 24%

Grade 7 ELA

Student 127,605 756 1% 918 1% 1,333 1%

Session 14,035 232 2% 212 2% 155 1%

Test Administrator

9,665 188 2% 77 1% 173 2%

School 1,443 186 13% 1 0% 115 8%

Grade 8 ELA

Student 128,537 693 1% 899 1% 871 1%

Session 14,174 144 1% 185 1% 284 2%

Test Administrator

9,635 113 1% 83 1% 349 4%

School 1,403 132 9% - 0% 279 20%

EOC ELA I

Student 155,905 135 0% 804 1% 2,487 2%

Session 17,840 44 0% 126 1% 531 3%

Test Administrator

9,803 31 0% 45 0% 564 6%

School 1,193 29 2% 1 0% 264 22%






Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

EOC ELA II

Student 146,878 936 1% 780 1% 1,636 1%

Session 17,108 181 1% 117 1% 566 3%

Test Administrator

9,239 124 1% 39 0% 626 7%

School 1,074 136 13% - 0% 374 35%




Mathematics




Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Grade 3 Math

Student 132,658 987 1% 626 0%

Session 18,873 280 1% 24 0%

Test Administrator

12,662 124 1% 25 0%

School 2,152 - 0% 30 1%

Grade 4 Math

Student 130,792 647 0% 1,109 1% 331 0%

Session 17,961 714 4% 311 2% 17 0%

Test Administrator

12,366 677 5% 134 1% 18 0%

School 2,134 563 26% - 0% 34 2%

Grade 5 Math

Student 130,221 4 0% 1,472 1% 251 0%

Session 17,261 5 0% 424 2% 9 0%

Test Administrator

11,891 5 0% 188 2% 20 0%

School 1,978 5 0% - 0% 37 2%

Grade 6 Math

Student 128,486 1 0% 830 1% 494 0%

Session 15,381 1 0% 220 1% 26 0%

Test Administrator

10,524 1 0% 73 1% 26 0%

School 1,731 1 0% 1 0% 27 2%

Grade 7 Math

Student 122,942 - 0% 853 1% 276 0%

Session 14,402 - 0% 207 1% 7 0%

Test Administrator

9,887 - 0% 80 1% 8 0%

School 1,535 - 0% 1 0% 32 2%

Grade 8 Math

Student 100,264 438 0% 807 1% 392 0%

Session 13,411 322 2% 191 1% 38 0%

Test Administrator

9,134 314 3% 81 1% 37 0%

School 1,404 259 18% - 0% 56 4%

Algebra I

Student 153,027 177 0% 971 1% 348 0%

Session 20,231 71 0% 221 1% 120 1%

Test Administrator

11,406 55 0% 88 1% 134 1%

School 1,753 57 3% 2 0% 212 12%






Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Number of

Flagged Unit

Pct. of Flagged

Unit

Geometry

Student 134,970 526 0% 775 1% 348 0%

Session 17,407 249 1% 174 1% 40 0%

Test Administrator

9,519 211 2% 62 1% 48 1%

School 1,286 209 16% 3 0% 56 4%

Integrated Math I

Student 12,687 12 0% 99 1% 18 0%

Session 1,940 6 0% 21 1% 8 0%

Test Administrator

1,137 5 0% 11 1% 9 1%

School 308 - 0% 2 1% 9 3%

Integrated Math II

Student 11,240 50 0% 74 1% 36 0%

Session 1,820 28 2% 17 1% 5 0%

Test Administrator

1,039 20 2% 7 1% 8 1%

School 279 15 5% 1 0% 7 3%




Science



Number of Flagged

Unit

Pct. of Flagged

Unit

Number of Flagged

Unit

Pct. of Flagged

Unit

Grade 5 Science

Student 131,474 687 1% 365 0%

Session 16,530 174 1% 28 0%

Test Administrator 11,591 80 1% 26 0%

School 1,951 0 0% 46 2%

Grade 8 Science

Student 129,418 831 1% 244 0%

Session 13,790 185 1% 36 0%


School 1,409 1 0% 68 5%

Biology

Student 141,089 1,287 1% 115 0%

Session 16,098 227 1% 4 0%


School 1,089 0 0% 15 1%

Physical Science

Student 740 9 1% 1 0%

Session 393 3 1% 0 0%

Test Administrator 210 2 1% 0 0%

School 122 2 2% 0 0%




Social Studies



Number of Flagged

Unit

Pct. of Flagged

Unit

Number of Flagged

Unit

Pct. of Flagged

Unit

American Government

Student 98,804 1,807 2% 816 1%

Session 13,114 287 2% 124 1%


School 1,026 2 0% 140 14%

American History

Student 13,1062 1,729 1% 459 0%

Session 15,338 265 2% 122 1%


School 1,064 1 0% 243 23%



Table B.5 Number and Percentage of Flagged Aggregate Units for Response Pattern Similarity Study –

English Language Arts

Mode Test Aggregate

Unit Total

Number of Flagged Unit Percentage of Flagged Unit

Aggressive method

Conservative method

Aggressive method

Conservative method

Online

G3E School 430 353 14 82% 3% Session 334 310 25 93% 7%




G7E School 301 235 7 78% 2%

Session 284 267 21 94% 7%


ELA I School 560 315 28 56% 5% Session 533 474 61 89% 11%

ELA II School 1,063 492 48 46% 5%

Session 841 717 98 85% 12%

Paper

G3E School 2 2 0 100% 0% G4E School 1 1 1 100% 100% G5E School 2 2 0 100% 0% G6E School 2 2 0 100% 0% G7E School 1 1 0 100% 0% G8E School 3 3 0 100% 0% ELA I School 4 4 0 100% 0% ELA II School 5 4 0 80% 0%

Note: The aggressive approach is used to flag the School/session where flagged pair using alpha=.05 and Bonferroni adjustment factor (n-1). The conservative approach is used to flag a problematic School/session where the flagged pair is using alpha=.01 and Bonferroni adjustment factor (n(n-1)/2).




Mathematics

Mode Test Aggregate

Unit Total


Aggressive method

Conservative method

Aggressive method

Conservative method

Online

G3M School 114 112 0 98% 0% Session 137 136 3 99% 2%






Algebra School 2,616 770 80 29% 3% Session 1,465 1,171 167 80% 11%

Geometry School 796 441 3 55% 0% Session 775 675 32 87% 4%

IM I School 232 55 7 24% 3% Session 125 93 9 74% 7%

IM II School 278 60 12 22% 4% Session 134 110 21 82% 16%

Paper

G3M School 3 2 0 67% 0% G4M School 3 3 0 100% 0% G5M School 3 2 0 67% 0% G6M School 3 3 0 100% 0% G7M School 2 2 0 100% 0% G8M School 13 6 2 46% 15%

Algebra School 4 4 0 100% 0%

Geometry School 1 1 0 100% 0%

IM I School 3 2 0 67% 0% IM II School - - - - -





Science

Mode Test Aggregate Unit

Total


Aggressive method

Conservative method

Aggressive method

Conservative method

Online

G5Sci School 295 240 20 81% 7%

Session 271 246 37 91% 14%

G8Sci School 579 373 25 64% 4%

Session 576 521 60 90% 10%

Biology School 2,148 676 85 31% 4%

Session 1,537 1,190 203 77% 13%

Physical Science

School 6 5 0 83% 0%

Session 1 1 0 100% 0%

Paper

G5Sci School 17 4 3 24% 18%

G8Sci School 9 8 3 89% 33%

Biology School 7 6 2 86% 29%

Physical Science

School 1 1 1 100% 100%





Social Studies

Mode Test Aggregate

Unit Total


Aggressive method

Conservative method

Aggressive method

Conservative method

Online

AG School 464 288 25 62% 5%

Session 556 449 67 81% 12%

AH School 1,449 520 139 36% 10%

Session 1,212 862 234 71% 19%

Paper AG School 3 2 0 67% 0%

AH School 4 3 0 75% 0%


Ohio’s State Tests —Fall 2017 Administration & Spring 2018 Administration Technical Report

C-1 American Institutes for Research

Table C.1. Number of Students Participating in Fall 2017 Online Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

ELA Grade 3 127,757 62,751 64,856 150 23,117 3,083 5,356 158 86,241 9,640 5,079 14,830 ELA I 41,904 18,094 23,036 774 14,167 764 2,207 59 21,694 2,762 2,843 9,906 ELA II 35,466 16,371 18,418 677 11,482 728 1,778 55 18,860 2,324 2,282 7,058

Math Algebra 47,341 22,249 24,121 971 14,667 571 2,599 74 26,117 3,032 2,005 9,283 Geometry 34,514 16,700 17,113 701 10,499 593 1,826 46 19,168 2,120 1,515 6,346

Integrated Math I 4,822 2,342 2,441 39 2,326 213 76 14 1,625 565 812 664 Integrated Math II 4,389 2,092 2,266 31 1,988 184 69 8 1,672 466 619 534

Science Biology 24,772 11,992 12,467 313 8,825 391 1,249 39 12,544 1,620 1,467 4,292

Physical Science 897 435 444 18 448 6 39 325 66 41 196 Social Studies

American Government 32,259 15,625 16,221 413 7,154 749 1,313 47 21,092 1,771 1,100 4,541

American History 20,912 10,329 10,258 325 7,534 348 1,130 28 10,431 1,348 1,404 4,145



Table C.2. Number of Students Participating in Fall 2017 Paper Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

ELA G3 ELA 448 181 265 2 93 4 13 322 15 35 310 ELA I 135 42 89 4 28 7 88 10 6 83 ELA II 87 29 56 2 11 6 58 10 4 53

Math Algebra 150 54 87 9 44 2 9 1 76 17 6 70 Geometry 117 42 61 14 42 3 2 54 15 4 57

Integrated Math I 10 4 6 1 8 1 7 Integrated Math II 12 6 6 10 1 11

Science Biology 65 21 41 3 20 1 6 33 5 3 41

Physical Science 2 1 1 2 2 Social Studies

American Government 71 30 41 7 3 4 53 4 4 42

American History 57 20 34 3 17 3 2 29 5 4 28



Table C.3. Number of Students Participating in Spring 2018 ELA Online Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

G3 ELA 126,050 61,977 63,914 159 22,817 3,033 5,323 157 84,936 9,621 5,202 16,183 G4 ELA 125,854 61,651 64,033 170 20,643 3,115 5,201 162 87,542 9,060 4,044 17,185 G5 ELA 127,446 62,418 64,848 180 21,267 3,125 5,051 188 88,832 8,841 3,549 17,484 G6 ELA 125,943 61,650 64,106 187 20,465 3,011 5,088 173 88,720 8,349 3,231 16,888 G7 ELA 123,841 60,688 62,901 252 18,864 3,063 4,643 160 89,059 7,901 2,934 16,162 G8 ELA 124,880 60,910 63,779 191 19,126 2,978 4,537 158 89,992 7,900 2,966 16,349 ELA I 148,951 71,575 76,796 580 27,502 3,482 5,927 203 102,174 9,355 5,593 21,741 ELA II 139,572 68,420 70,584 568 23,659 3,363 5,294 192 98,465 8,336 4,511 18,913



Table C.4. Number of Students Participating in Spring 2018 ELA Paper Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

G3 ELA 490 204 284 2 96 6 12 356 19 37 317 G4 ELA 640 273 364 3 131 6 24 2 430 46 23 417 G5 ELA 511 212 297 2 98 4 20 354 35 16 320 G6 ELA 466 178 288 94 8 8 1 327 26 17 317 G7 ELA 475 209 265 1 83 5 9 359 18 17 303 G8 ELA 408 174 233 1 68 7 1 310 20 14 250 ELA I 443 143 295 5 109 5 21 1 265 30 20 295 ELA II 402 157 241 4 87 4 15 271 17 13 279



Table C.5. Number of Students Participating in Spring 2018 Mathematics Online Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

G3 Math 126,769 62,334 64,255 180 22,837 3,020 5,351 157 85,601 9,636 5,202 16,243 G4 Math 125,282 61,445 63,659 178 20,639 3,030 5,205 160 87,091 9,024 4,034 17,187 G5 Math 126,093 61,865 64,032 196 21,222 2,928 5,042 191 87,807 8,761 3,544 17,457 G6 Math 124,337 60,962 63,161 214 20,443 2,803 5,019 173 87,489 8,267 3,235 16,919 G7 Math 119,191 58,570 60,362 259 18,598 2,613 4,594 154 85,433 7,644 2,915 16,090 G8 Math 97,029 46,827 50,018 184 16,595 1,939 4,059 129 67,596 6,572 2,811 15,726 Algebra 144,091 69,666 73,795 630 24,933 3,119 5,817 206 101,262 8,470 4,025 20,161 Geometry 126,729 62,827 63,385 517 20,163 2,857 4,876 179 91,392 7,028 3,084 15,831

Integrated Math I 12,152 5,860 6,239 53 3,808 494 300 25 6,206 1,298 1,441 1,830 Integrated Math II 10,490 5,132 5,301 57 3,049 434 266 27 5,725 967 1,004 1,516



Table C.6. Number of Students Participating in Spring 2018 Mathematics Paper Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

G3 Math 653 279 370 4 186 7 19 414 25 42 406 G4 Math 640 273 364 3 142 5 22 1 424 45 27 418 G5 Math 521 210 309 2 99 4 19 361 38 21 328 G6 Math 484 183 301 97 11 8 1 336 29 23 334 G7 Math 502 211 290 1 94 5 9 374 19 24 325 G8 Math 437 187 249 1 76 7 1 327 25 23 285 Algebra 398 127 270 1 97 5 16 249 24 19 268 Geometry 289 118 166 5 62 3 12 1 183 17 8 178

Integrated Math I 76 27 48 1 15 2 2 49 7 6 63 Integrated Math II 46 13 31 2 8 1 2 32 3 4 41



Table C.7. Number of Students Participating in Spring 2018 Science and Social Studies Online Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

Science G5 Science 127,349 62,404 64,751 194 21,166 3,135 5,063 188 88,823 8,831 3,550 17,438 G8 Science 125,778 61,534 64,043 201 19,078 2,995 4,589 155 90,895 7,879 2,967 16,291 Biology 135,109 66,887 67,803 419 21,928 3,225 4,870 185 96,774 7,872 3,878 17,412

Physical Science 478 243 231 4 195 4 20 1 211 40 25 91 Social Studies

American Government 86,861 42,931 43,521 409 14,322 1,565 2,924 121 62,730 5,009 2,149 10,331

American History 125,839 62,295 63,145 399 20,438 2,489 4,677 163 90,536 7,308 3,905 17,060



Table C.8. Number of Students Participating in Spring 2018 Science and Social Studies Paper Assessments

Assessment

Overall

Female

Male

Unk

nown

Afric

an

American

Asian

Hispa

nic /

Latin

o

American

Indian

White

Multip

le

Ethn

icities

LEP

IEP

Science G5 Science 520 210 308 2 99 4 19 361 37 20 328 G8 Science 424 185 238 1 73 6 1 316 26 21 267 Biology 371 138 228 5 67 5 14 1 251 23 10 253

Physical Science 6 2 4 1 5 3 Social Studies

American Government 217 79 133 5 47 3 6 1 147 12 9 143

American History 369 136 230 3 93 2 14 232 22 14 254


D-1 American Institutes for Research

Table D1. Scaled Score Frequency Distributions Fall 2017 – Grade 3 Reading

Reading Promotion Score Frequency Percent Cumulative

Frequency Cumulative Percent

16 89 0.07 89 0.07 21 235 0.19 324 0.26 26 645 0.51 969 0.76 30 1,302 1.03 2,271 1.79 32 2,169 1.71 4,440 3.51 35 2,978 2.35 7,418 5.86 37 3,934 3.11 11,352 8.96 39 4,489 3.54 15,841 12.51 41 5,178 4.09 21,019 16.59 43 5,489 4.33 26,508 20.93 45 5,800 4.58 32,308 25.51 46 6,187 4.88 38,495 30.39 48 6,399 5.05 44,894 35.44 50 6,814 5.38 51,708 40.82 51 6,799 5.37 58,507 46.19 53 6,872 5.43 65,379 51.61 54 7,027 5.55 72,406 57.16 56 7,071 5.58 79,477 62.74 58 7,050 5.57 86,527 68.31 59 7,006 5.53 93,533 73.84 61 6,807 5.37 100,340 79.22 63 6,502 5.13 106,842 84.35 65 5,633 4.45 112,475 88.80 67 4,758 3.76 117,233 92.55 70 3,815 3.01 121,048 95.56 73 2,760 2.18 123,808 97.74 77 1,641 1.30 125,449 99.04 82 884 0.70 126,333 99.74 86 334 0.26 126,667 100.00



Table D2. Scaled Score Frequency Distributions Fall 2017 – Grade 3 ELA

Scaled Score Frequency Percent Cumulative Frequency

Cumulative Percent

545 114 0.09 114 0.09 567 358 0.28 472 0.37 588 913 0.71 1,385 1.08 603 1,909 1.49 3,294 2.56 616 2,981 2.32 6,275 4.88 626 4,246 3.30 10,521 8.19 635 5,097 3.97 15,618 12.16 643 6,058 4.72 21,676 16.87 651 6,438 5.01 28,114 21.88 657 6,813 5.30 34,927 27.19 664 6,816 5.31 41,743 32.49 672 6,974 5.43 48,717 37.92 676 6,598 5.14 55,315 43.06 681 6,435 5.01 61,750 48.06 687 6,374 4.96 68,124 53.03 692 5,716 4.45 73,840 57.47 697 5,500 4.28 79,340 61.76 702 5,285 4.11 84,625 65.87 708 5,065 3.94 89,690 69.81 713 4,802 3.74 94,492 73.55 718 4,584 3.57 99,076 77.12 725 4,334 3.37 103,410 80.49 729 4,163 3.24 107,573 83.73 735 3,817 2.97 111,390 86.70 741 3,456 2.69 114,846 89.39 748 3,071 2.39 117,917 91.78 755 2,708 2.11 120,625 93.89 762 2,214 1.72 122,839 95.61 770 1,812 1.41 124,651 97.02 778 1,450 1.13 126,101 98.15 787 1,008 0.78 127,109 98.94 796 628 0.49 127,737 99.43 807 393 0.31 128,130 99.73 818 197 0.15 128,327 99.89 831 99 0.08 128,426 99.96 846 34 0.03 128,460 99.99 863 14 0.01 128,474 100.00



Table D3. Scaled Score Frequency Distributions Fall 2017 – High School ELA I


Cumulative Percent

606 66 0.15 66 0.15 615 111 0.26 177 0.41 624 195 0.45 372 0.86 631 360 0.84 732 1.70 637 552 1.28 1,284 2.98 642 810 1.88 2,094 4.86 647 1,054 2.45 3,148 7.31 651 1,348 3.13 4,496 10.44 655 1,398 3.25 5,894 13.68 658 1,644 3.82 7,538 17.50 661 1,642 3.81 9,180 21.31 665 1,814 4.21 10,994 25.53 668 1,729 4.01 12,723 29.54 670 1,799 4.18 14,522 33.72 673 1,851 4.30 16,373 38.01 676 1,836 4.26 18,209 42.28 678 1,876 4.36 20,085 46.63 681 1,900 4.41 21,985 51.04 683 1,905 4.42 23,890 55.47 685 1,759 4.08 25,649 59.55 688 1,750 4.06 27,399 63.62 690 1,622 3.77 29,021 67.38 692 1,547 3.59 30,568 70.97 694 1,372 3.19 31,940 74.16 697 1,274 2.96 33,214 77.12 699 1,134 2.63 34,348 79.75 701 1,120 2.60 35,468 82.35 703 965 2.24 36,433 84.59 705 851 1.98 37,284 86.57 707 734 1.70 38,018 88.27 709 671 1.56 38,689 89.83 711 562 1.30 39,251 91.13 713 510 1.18 39,761 92.32 716 407 0.94 40,168 93.26 718 372 0.86 40,540 94.13 720 359 0.83 40,899 94.96 722 277 0.64 41,176 95.60 725 262 0.61 41,438 96.21 727 221 0.51 41,659 96.72 729 204 0.47 41,863 97.20 732 190 0.44 42,053 97.64 734 170 0.39 42,223 98.03 737 135 0.31 42,358 98.35




Cumulative Percent

740 135 0.31 42,493 98.66 743 127 0.29 42,620 98.96 746 114 0.26 42,734 99.22 749 89 0.21 42,823 99.43 752 65 0.15 42,888 99.58 756 49 0.11 42,937 99.69 760 39 0.09 42,976 99.78 764 35 0.08 43,011 99.86 769 24 0.06 43,035 99.92 774 13 0.03 43,048 99.95 780 11 0.03 43,059 99.97 786 6 0.01 43,065 99.99 794 3 0.01 43,068 100.00 800 2 0.00 43,070 100.00



Table D4. Scaled Score Frequency Distributions Fall 2017 – High School ELA II


Cumulative Percent

597 62 0.17 62 0.17 603 94 0.26 156 0.43 613 200 0.55 356 0.98 621 324 0.89 680 1.87 628 557 1.53 1,237 3.40 633 756 2.08 1,993 5.48 639 992 2.73 2,985 8.21 643 1,187 3.27 4,172 11.48 647 1,341 3.69 5,513 15.17 651 1,423 3.92 6,936 19.09 655 1,569 4.32 8,505 23.40 658 1,622 4.46 10,127 27.87 662 1,549 4.26 11,676 32.13 665 1,668 4.59 13,344 36.72 668 1,618 4.45 14,962 41.17 671 1,566 4.31 16,528 45.48 673 1,531 4.21 18,059 49.69 676 1,544 4.25 19,603 53.94 679 1,428 3.93 21,031 57.87 681 1,396 3.84 22,427 61.71 684 1,310 3.60 23,737 65.32 686 1,247 3.43 24,984 68.75 689 1,156 3.18 26,140 71.93 691 1,147 3.16 27,287 75.09 693 1,040 2.86 28,327 77.95 696 941 2.59 29,268 80.54 698 862 2.37 30,130 82.91 700 787 2.17 30,917 85.08 703 685 1.88 31,602 86.96 705 595 1.64 32,197 88.60 707 522 1.44 32,719 90.04 709 435 1.20 33,154 91.23 712 421 1.16 33,575 92.39 714 350 0.96 33,925 93.35 716 313 0.86 34,238 94.22 719 260 0.72 34,498 94.93 721 250 0.69 34,748 95.62 724 215 0.59 34,963 96.21 726 222 0.61 35,185 96.82 729 183 0.50 35,368 97.33 731 162 0.45 35,530 97.77 734 143 0.39 35,673 98.16 737 130 0.36 35,803 98.52




Cumulative Percent

740 122 0.34 35,925 98.86 743 109 0.30 36,034 99.16 746 93 0.26 36,127 99.41 749 58 0.16 36,185 99.57 753 40 0.11 36,225 99.68 757 33 0.09 36,258 99.77 761 31 0.09 36,289 99.86 765 21 0.06 36,310 99.92 770 12 0.03 36,322 99.95 775 6 0.02 36,328 99.97 781 6 0.02 36,334 99.98 788 4 0.01 36,338 99.99 795 2 0.01 36,340 100.00



Table D5. Scaled Score Frequency Distributions Fall 2017 – Algebra


Cumulative Percent

618 213 0.44 213 0.44 630 316 0.65 529 1.10 639 666 1.38 1,195 2.48 646 1,115 2.31 2,310 4.78 652 1,792 3.71 4,102 8.50 657 2,488 5.15 6,590 13.65 662 3,222 6.67 9,812 20.32 666 3,577 7.41 13,389 27.73 670 3,706 7.68 17,095 35.41 674 3,604 7.46 20,699 42.87 677 3,505 7.26 24,204 50.13 682 3,142 6.51 27,346 56.64 684 2,881 5.97 30,227 62.61 687 2,677 5.54 32,904 68.15 690 2,306 4.78 35,210 72.93 693 2,046 4.24 37,256 77.17 696 1,768 3.66 39,024 80.83 698 1,506 3.12 40,530 83.95 701 1,312 2.72 41,842 86.67 704 1,030 2.13 42,872 88.80 706 940 1.95 43,812 90.75 709 735 1.52 44,547 92.27 712 630 1.30 45,177 93.57 714 506 1.05 45,683 94.62 717 420 0.87 46,103 95.49 720 310 0.64 46,413 96.13 722 284 0.59 46,697 96.72 725 224 0.46 46,921 97.19 728 161 0.33 47,082 97.52 730 173 0.36 47,255 97.88 733 127 0.26 47,382 98.14 736 118 0.24 47,500 98.38 739 98 0.20 47,598 98.59 742 79 0.16 47,677 98.75 745 81 0.17 47,758 98.92 748 71 0.15 47,829 99.07 751 76 0.16 47,905 99.22 755 50 0.10 47,955 99.33 758 50 0.10 48,005 99.43 762 54 0.11 48,059 99.54 765 41 0.08 48,100 99.63 769 44 0.09 48,144 99.72 773 30 0.06 48,174 99.78




Cumulative Percent

777 20 0.04 48,194 99.82 782 21 0.04 48,215 99.87 787 13 0.03 48,228 99.89 792 11 0.02 48,239 99.92 798 11 0.02 48,250 99.94 805 13 0.03 48,263 99.96 814 17 0.04 48,280 100.00



Table D6. Scaled Score Frequency Distributions Fall 2017 – Geometry


Cumulative Percent

604 231 0.66 231 0.66 617 646 1.84 877 2.50 630 1,399 3.98 2,276 6.48 640 2,248 6.40 4,524 12.88 648 3,094 8.81 7,618 21.69 655 3,518 10.02 11,136 31.71 661 3,595 10.24 14,731 41.95 666 3,463 9.86 18,194 51.81 671 3,064 8.73 21,258 60.54 675 2,732 7.78 23,990 68.32 679 2,106 6.00 26,096 74.31 683 1,789 5.09 27,885 79.41 687 1,429 4.07 29,314 83.48 691 1,111 3.16 30,425 86.64 694 831 2.37 31,256 89.01 697 674 1.92 31,930 90.93 700 485 1.38 32,415 92.31 704 413 1.18 32,828 93.48 707 296 0.84 33,124 94.33 709 257 0.73 33,381 95.06 712 183 0.52 33,564 95.58 715 167 0.48 33,731 96.06 718 152 0.43 33,883 96.49 721 131 0.37 34,014 96.86 723 99 0.28 34,113 97.14 726 91 0.26 34,204 97.40 729 77 0.22 34,281 97.62 732 73 0.21 34,354 97.83 734 66 0.19 34,420 98.02 737 57 0.16 34,477 98.18 740 73 0.21 34,550 98.39 743 55 0.16 34,605 98.54 745 51 0.15 34,656 98.69 748 49 0.14 34,705 98.83 751 51 0.15 34,756 98.97 754 41 0.12 34,797 99.09 757 38 0.11 34,835 99.20 760 40 0.11 34,875 99.31 763 36 0.10 34,911 99.42 767 31 0.09 34,942 99.50 770 30 0.09 34,972 99.59 774 23 0.07 34,995 99.66 777 20 0.06 35,015 99.71




Cumulative Percent

781 18 0.05 35,033 99.76 786 20 0.06 35,053 99.82 790 14 0.04 35,067 99.86 795 10 0.03 35,077 99.89 801 12 0.03 35,089 99.92 807 12 0.03 35,101 99.96 810 15 0.04 35,116 100.00



Table D7. Scaled Score Frequency Distributions Fall 2017 – Integrated Math I


Cumulative Percent

618 32 0.64 32 0.64 626 55 1.10 87 1.74 635 85 1.70 172 3.45 642 143 2.87 315 6.31 649 223 4.47 538 10.78 654 296 5.93 834 16.71 659 335 6.71 1,169 23.42 663 361 7.23 1,530 30.66 667 373 7.47 1,903 38.13 671 363 7.27 2,266 45.40 675 313 6.27 2,579 51.67 678 278 5.57 2,857 57.24 682 230 4.61 3,087 61.85 685 253 5.07 3,340 66.92 688 233 4.67 3,573 71.59 690 206 4.13 3,779 75.72 693 176 3.53 3,955 79.24 696 145 2.91 4,100 82.15 699 139 2.79 4,239 84.93 701 109 2.18 4,348 87.12 704 85 1.70 4,433 88.82 707 81 1.62 4,514 90.44 709 70 1.40 4,584 91.85 712 47 0.94 4,631 92.79 714 47 0.94 4,678 93.73 717 38 0.76 4,716 94.49 720 34 0.68 4,750 95.17 722 25 0.50 4,775 95.67 725 23 0.46 4,798 96.13 728 31 0.62 4,829 96.75 730 16 0.32 4,845 97.07 733 18 0.36 4,863 97.44 736 23 0.46 4,886 97.90 739 14 0.28 4,900 98.18 742 13 0.26 4,913 98.44 745 16 0.32 4,929 98.76 748 5 0.10 4,934 98.86 751 12 0.24 4,946 99.10 755 6 0.12 4,952 99.22 758 8 0.16 4,960 99.38 762 7 0.14 4,967 99.52 766 9 0.18 4,976 99.70 770 1 0.02 4,977 99.72




Cumulative Percent

774 4 0.08 4,981 99.80 778 3 0.06 4,984 99.86 783 4 0.08 4,988 99.94 795 1 0.02 4,989 99.96 811 2 0.04 4,991 100.00



Table D8. Scaled Score Frequency Distributions Fall 2017 – Integrated Math II


Cumulative Percent

594 15 0.33 15 0.33 604 33 0.73 48 1.06 618 94 2.08 142 3.14 629 143 3.16 285 6.30 637 231 5.11 516 11.41 645 265 5.86 781 17.27 651 340 7.52 1,121 24.78 657 402 8.89 1,523 33.67 662 405 8.95 1,928 42.63 667 392 8.67 2,320 51.29 671 356 7.87 2,676 59.16 675 330 7.30 3,006 66.46 679 260 5.75 3,266 72.21 683 228 5.04 3,494 77.25 687 163 3.60 3,657 80.85 690 160 3.54 3,817 84.39 694 101 2.23 3,918 86.62 697 86 1.90 4,004 88.53 700 64 1.41 4,068 89.94 704 53 1.17 4,121 91.11 707 51 1.13 4,172 92.24 710 36 0.80 4,208 93.04 713 39 0.86 4,247 93.90 716 28 0.62 4,275 94.52 719 20 0.44 4,295 94.96 722 21 0.46 4,316 95.42 725 22 0.49 4,338 95.91 728 13 0.29 4,351 96.20 731 22 0.49 4,373 96.68 734 11 0.24 4,384 96.93 737 7 0.15 4,391 97.08 740 10 0.22 4,401 97.30 743 10 0.22 4,411 97.52 746 13 0.29 4,424 97.81 750 11 0.24 4,435 98.05 753 8 0.18 4,443 98.23 758 5 0.11 4,448 98.34 760 8 0.18 4,456 98.52 764 12 0.27 4,468 98.78 768 8 0.18 4,476 98.96 771 5 0.11 4,481 99.07 776 10 0.22 4,491 99.29 780 3 0.07 4,494 99.36




Cumulative Percent

785 4 0.09 4,498 99.45 790 10 0.22 4,508 99.67 795 5 0.11 4,513 99.78 801 1 0.02 4,514 99.80 807 3 0.07 4,517 99.87 813 6 0.13 4,523 100.00



Table D9. Scaled Score Frequency Distributions Fall 2017 – Biology


Cumulative Percent

617 17 0.07 17 0.07 626 49 0.19 66 0.26 638 97 0.38 163 0.65 647 230 0.91 393 1.56 654 412 1.64 805 3.20 655 1 0.00 806 3.20 660 808 3.21 1,614 6.41 665 1,195 4.74 2,809 11.15 670 1,571 6.24 4,380 17.38 674 1,888 7.49 6,268 24.88 678 2,203 8.74 8,471 33.62 681 2,106 8.36 10,577 41.98 685 2,055 8.16 12,632 50.14 687 1,867 7.41 14,499 57.55 690 1,614 6.41 16,113 63.95 692 1,319 5.24 17,432 69.19 695 1,081 4.29 18,513 73.48 697 862 3.42 19,375 76.90 700 747 2.96 20,122 79.87 702 624 2.48 20,746 82.34 704 528 2.10 21,274 84.44 707 433 1.72 21,707 86.16 709 339 1.35 22,046 87.50 711 288 1.14 22,334 88.64 713 250 0.99 22,584 89.64 715 212 0.84 22,796 90.48 717 214 0.85 23,010 91.33 719 150 0.60 23,160 91.92 722 159 0.63 23,319 92.55 724 146 0.58 23,465 93.13 726 150 0.60 23,615 93.73 728 125 0.50 23,740 94.23 730 125 0.50 23,865 94.72 732 122 0.48 23,987 95.21 735 106 0.42 24,093 95.63 737 136 0.54 24,229 96.17 739 121 0.48 24,350 96.65 742 85 0.34 24,435 96.98 744 121 0.48 24,556 97.46 747 93 0.37 24,649 97.83 749 84 0.33 24,733 98.17 752 84 0.33 24,817 98.50 755 62 0.25 24,879 98.75




Cumulative Percent

758 64 0.25 24,943 99.00 761 54 0.21 24,997 99.21 764 44 0.17 25,041 99.39 768 36 0.14 25,077 99.53 771 36 0.14 25,113 99.67 776 27 0.11 25,140 99.78 780 23 0.09 25,163 99.87 786 13 0.05 25,176 99.92 792 9 0.04 25,185 99.96 799 6 0.02 25,191 99.98 809 1 0.00 25,192 99.99 822 3 0.01 25,195 100.00



Table D10. Scaled Score Frequency Distributions Fall 2017 – Physical Science


Cumulative Percent

634 2 0.21 2 0.21 637 4 0.43 6 0.64 648 7 0.75 13 1.39 655 1 0.11 14 1.50 656 21 2.25 35 3.75 663 34 3.64 69 7.39 668 45 4.82 114 12.21 673 85 9.10 199 21.31 677 109 11.67 308 32.98 680 1 0.11 309 33.08 681 117 12.53 426 45.61 685 102 10.92 528 56.53 688 105 11.24 633 67.77 691 83 8.89 716 76.66 694 46 4.93 762 81.58 697 37 3.96 799 85.55 700 34 3.64 833 89.19 702 26 2.78 859 91.97 704 12 1.28 871 93.25 706 11 1.18 882 94.43 709 14 1.50 896 95.93 711 10 1.07 906 97.00 713 7 0.75 913 97.75 715 3 0.32 916 98.07 717 5 0.54 921 98.61 719 2 0.21 923 98.82 722 4 0.43 927 99.25 724 1 0.11 928 99.36 728 1 0.11 929 99.46 730 1 0.11 930 99.57 736 2 0.21 932 99.79 740 1 0.11 933 99.89 752 1 0.11 934 100.00



Table D11. Scaled Score Frequency Distributions Fall 2017 – American Government


Cumulative Percent

642 3 0.01 3 0.01 645 17 0.05 20 0.06 654 20 0.06 40 0.12 659 47 0.14 87 0.27 664 107 0.33 194 0.59 668 176 0.54 370 1.13 671 272 0.83 642 1.97 674 404 1.24 1,046 3.21 677 523 1.60 1,569 4.81 679 660 2.02 2,229 6.83 681 714 2.19 2,943 9.02 683 839 2.57 3,782 11.59 685 899 2.76 4,681 14.35 687 989 3.03 5,670 17.38 688 931 2.85 6,601 20.23 690 924 2.83 7,525 23.06 691 895 2.74 8,420 25.81 693 895 2.74 9,315 28.55 694 833 2.55 10,148 31.10 695 858 2.63 11,006 33.73 697 773 2.37 11,779 36.10 698 733 2.25 12,512 38.35 699 693 2.12 13,205 40.47 700 714 2.19 13,919 42.66 701 711 2.18 14,630 44.84 703 651 2.00 15,281 46.84 704 641 1.96 15,922 48.80 705 632 1.94 16,554 50.74 706 613 1.88 17,167 52.62 707 617 1.89 17,784 54.51 708 641 1.96 18,425 56.47 709 588 1.80 19,013 58.28 710 631 1.93 19,644 60.21 711 596 1.83 20,240 62.04 713 631 1.93 20,871 63.97 714 631 1.93 21,502 65.90 715 608 1.86 22,110 67.77 716 617 1.89 22,727 69.66 717 586 1.80 23,313 71.46 718 629 1.93 23,942 73.38 719 611 1.87 24,553 75.26




Cumulative Percent

721 621 1.90 25,174 77.16 722 553 1.69 25,727 78.85 723 616 1.89 26,343 80.74 725 574 1.76 26,917 82.50 726 564 1.73 27,481 84.23 728 534 1.64 28,015 85.87 729 545 1.67 28,560 87.54 731 493 1.51 29,053 89.05 733 467 1.43 29,520 90.48 734 462 1.42 29,982 91.90 736 449 1.38 30,431 93.27 739 406 1.24 30,837 94.52 741 359 1.10 31,196 95.62 744 331 1.01 31,527 96.63 746 314 0.96 31,841 97.59 750 270 0.83 32,111 98.42 754 198 0.61 32,309 99.03 758 130 0.40 32,439 99.43 764 104 0.32 32,543 99.75 773 56 0.17 32,599 99.92 774 27 0.08 32,626 100.00



Table D12. Scaled Score Frequency Distributions Fall 2017 – American History


Cumulative Percent

619 4 0.02 4 0.02 622 10 0.05 14 0.07 633 20 0.09 34 0.16 641 44 0.21 78 0.37 648 79 0.37 157 0.74 653 130 0.61 287 1.34 658 236 1.11 523 2.45 662 341 1.60 864 4.05 665 460 2.15 1,324 6.20 669 633 2.97 1,957 9.17 672 728 3.41 2,685 12.58 675 872 4.08 3,557 16.66 678 903 4.23 4,460 20.89 680 1,038 4.86 5,498 25.76 683 1,093 5.12 6,591 30.88 685 1,073 5.03 7,664 35.90 687 1,043 4.89 8,707 40.79 689 1,044 4.89 9,751 45.68 691 978 4.58 10,729 50.26 693 967 4.53 11,696 54.79 695 885 4.15 12,581 58.94 697 870 4.08 13,451 63.01 699 787 3.69 14,238 66.70 701 733 3.43 14,971 70.13 702 655 3.07 15,626 73.20 704 574 2.69 16,200 75.89 706 498 2.33 16,698 78.22 707 468 2.19 17,166 80.41 709 426 2.00 17,592 82.41 711 384 1.80 17,976 84.21 712 359 1.68 18,335 85.89 714 316 1.48 18,651 87.37 716 250 1.17 18,901 88.54 717 221 1.04 19,122 89.58 719 210 0.98 19,332 90.56 721 155 0.73 19,487 91.29 722 141 0.66 19,628 91.95 724 165 0.77 19,793 92.72 726 144 0.67 19,937 93.39 727 158 0.74 20,095 94.14 729 124 0.58 20,219 94.72 731 110 0.52 20,329 95.23 733 121 0.57 20,450 95.80




Cumulative Percent

734 91 0.43 20,541 96.22 736 88 0.41 20,629 96.64 738 78 0.37 20,707 97.00 740 79 0.37 20,786 97.37 742 81 0.38 20,867 97.75 744 78 0.37 20,945 98.12 747 52 0.24 20,997 98.36 749 50 0.23 21,047 98.59 752 58 0.27 21,105 98.87 754 43 0.20 21,148 99.07 757 53 0.25 21,201 99.32 760 43 0.20 21,244 99.52 764 34 0.16 21,278 99.68 767 22 0.10 21,300 99.78 772 18 0.08 21,318 99.86 777 15 0.07 21,333 99.93 783 10 0.05 21,343 99.98 791 2 0.01 21,345 99.99 800 2 0.01 21,347 100.00



Table D13. Scaled Score Frequency Distributions Spring 2018 – Grade 3 Reading

Reading Promotion Score Frequency Percent Cumulative


16 89 0.07 89 0.07 21 235 0.19 324 0.26 26 645 0.51 969 0.76 30 1,302 1.03 2,271 1.79 32 2,169 1.71 4,440 3.51 35 2,978 2.35 7,418 5.86 37 3,934 3.11 11,352 8.96 39 4,489 3.54 15,841 12.51 41 5,178 4.09 21,019 16.59 43 5,489 4.33 26,508 20.93 45 5,800 4.58 32,308 25.51 46 6,187 4.88 38,495 30.39 48 6,399 5.05 44,894 35.44 50 6,814 5.38 51,708 40.82 51 6,799 5.37 58,507 46.19 53 6,872 5.43 65,379 51.61 54 7,027 5.55 72,406 57.16 56 7,071 5.58 79,477 62.74 58 7,050 5.57 86,527 68.31 59 7,006 5.53 93,533 73.84 61 6,807 5.37 100,340 79.22 63 6,502 5.13 106,842 84.35 65 5,633 4.45 112,475 88.80 67 4,758 3.76 117,233 92.55 70 3,815 3.01 121,048 95.56 73 2,760 2.18 123,808 97.74 77 1,641 1.30 125,449 99.04 82 884 0.70 126,333 99.74 86 334 0.26 126,667 100.00



Table D14. Scaled Score Frequency Distributions Spring 2018 – Grade 3 ELA


Cumulative Percent

545 78 0.06 78 0.06 562 191 0.15 269 0.21 583 513 0.40 782 0.62 599 1,040 0.82 1,822 1.44 612 1,732 1.37 3,554 2.81 623 2,285 1.80 5,839 4.61 633 2,961 2.34 8,800 6.95 642 3,377 2.67 12,177 9.61 650 3,938 3.11 16,115 12.72 657 4,181 3.30 20,296 16.02 664 4,399 3.47 24,695 19.50 672 4,726 3.73 29,421 23.23 678 4,954 3.91 34,375 27.14 684 5,004 3.95 39,379 31.09 690 5,476 4.32 44,855 35.41 696 5,622 4.44 50,477 39.85 702 5,669 4.48 56,146 44.33 708 5,639 4.45 61,785 48.78 714 5,861 4.63 67,646 53.40 719 6,026 4.76 73,672 58.16 725 5,782 4.56 79,454 62.73 731 5,645 4.46 85,099 67.18 737 5,687 4.49 90,786 71.67 743 5,437 4.29 96,223 75.97 752 5,259 4.15 101,482 80.12 756 4,859 3.84 106,341 83.95 763 4,404 3.48 110,745 87.43 770 3,988 3.15 114,733 90.58 777 3,385 2.67 118,118 93.25 785 2,658 2.10 120,776 95.35 794 2,143 1.69 122,919 97.04 803 1,569 1.24 124,488 98.28 813 1,009 0.80 125,497 99.08 824 631 0.50 126,128 99.57 837 326 0.26 126,454 99.83 851 144 0.11 126,598 99.95 863 69 0.05 126,667 100.00





Cumulative Percent

549 28 0.02 28 0.02 569 120 0.09 148 0.12 589 266 0.21 414 0.33 603 598 0.47 1,012 0.80 615 955 0.75 1,967 1.55 626 1,394 1.10 3,361 2.65 634 1,783 1.41 5,144 4.06 642 2,241 1.77 7,385 5.83 650 2,642 2.09 10,027 7.92 656 2,933 2.32 12,960 10.24 663 3,269 2.58 16,229 12.82 669 3,751 2.96 19,980 15.78 674 4,062 3.21 24,042 18.99 680 4,460 3.52 28,502 22.51 685 4,705 3.72 33,207 26.23 690 4,879 3.85 38,086 30.08 695 5,184 4.09 43,270 34.17 700 5,221 4.12 48,491 38.30 705 5,575 4.40 54,066 42.70 709 5,647 4.46 59,713 47.16 714 5,747 4.54 65,460 51.70 719 5,764 4.55 71,224 56.25 725 5,636 4.45 76,860 60.70 729 5,722 4.52 82,582 65.22 735 5,674 4.48 88,256 69.70 740 5,473 4.32 93,729 74.02 746 5,459 4.31 99,188 78.33 753 5,197 4.10 104,385 82.44 759 4,745 3.75 109,130 86.19 766 4,336 3.42 113,466 89.61 774 3,707 2.93 117,173 92.54 783 3,087 2.44 120,260 94.98 792 2,456 1.94 122,716 96.92 802 1,689 1.33 124,405 98.25 814 1,109 0.88 125,514 99.12 828 628 0.50 126,142 99.62 845 311 0.25 126,453 99.87 846 169 0.13 126,622 100.00





Cumulative Percent

552 28 0.02 28 0.02 557 64 0.05 92 0.07 576 159 0.12 251 0.20 590 314 0.25 565 0.44 602 495 0.39 1,060 0.83 612 860 0.67 1,920 1.50 621 1,079 0.84 2,999 2.34 629 1,450 1.13 4,449 3.47 636 1,817 1.42 6,266 4.89 643 2,082 1.63 8,348 6.52 650 2,510 1.96 10,858 8.48 656 2,621 2.05 13,479 10.52 662 2,912 2.27 16,391 12.80 669 3,121 2.44 19,512 15.23 673 3,365 2.63 22,877 17.86 678 3,485 2.72 26,362 20.58 683 3,873 3.02 30,235 23.61 689 4,041 3.16 34,276 26.76 694 4,211 3.29 38,487 30.05 700 4,462 3.48 42,949 33.53 704 4,712 3.68 47,661 37.21 708 5,064 3.95 52,725 41.17 713 5,270 4.11 57,995 45.28 718 5,439 4.25 63,434 49.53 725 5,612 4.38 69,046 53.91 729 5,983 4.67 75,029 58.58 734 6,042 4.72 81,071 63.30 740 6,133 4.79 87,204 68.09 745 6,221 4.86 93,425 72.94 751 5,981 4.67 99,406 77.61 758 5,912 4.62 105,318 82.23 765 5,357 4.18 110,675 86.41 773 4,811 3.76 115,486 90.17 782 4,109 3.21 119,595 93.38 791 3,329 2.60 122,924 95.98 802 2,288 1.79 125,212 97.76 815 1,493 1.17 126,705 98.93 830 811 0.63 127,516 99.56 847 376 0.29 127,892 99.86 848 184 0.14 128,076 100.00





Cumulative Percent

555 34 0.03 34 0.03 561 70 0.06 104 0.08 575 120 0.09 224 0.18 586 201 0.16 425 0.34 595 359 0.28 784 0.62 603 509 0.40 1,293 1.02 610 648 0.51 1,941 1.53 617 813 0.64 2,754 2.18 623 935 0.74 3,689 2.91 628 1,130 0.89 4,819 3.81 634 1,231 0.97 6,050 4.78 638 1,557 1.23 7,607 6.01 643 1,696 1.34 9,303 7.35 648 1,868 1.48 11,171 8.83 652 2,072 1.64 13,243 10.46 656 2,249 1.78 15,492 12.24 660 2,434 1.92 17,926 14.16 664 2,535 2.00 20,461 16.17 668 2,712 2.14 23,173 18.31 671 2,814 2.22 25,987 20.53 674 2,879 2.27 28,866 22.81 678 2,961 2.34 31,827 25.15 681 3,033 2.40 34,860 27.54 685 3,147 2.49 38,007 30.03 688 3,251 2.57 41,258 32.60 691 3,283 2.59 44,541 35.19 694 3,378 2.67 47,919 37.86 697 3,443 2.72 51,362 40.58 701 3,524 2.78 54,886 43.36 704 3,674 2.90 58,560 46.27 707 3,611 2.85 62,171 49.12 710 3,708 2.93 65,879 52.05 713 3,692 2.92 69,571 54.97 716 3,759 2.97 73,330 57.94 720 3,816 3.01 77,146 60.95 723 3,946 3.12 81,092 64.07 726 3,920 3.10 85,012 67.16 730 3,873 3.06 88,885 70.22 733 3,878 3.06 92,763 73.29 737 3,865 3.05 96,628 76.34




Cumulative Percent

741 3,744 2.96 100,372 79.30 744 3,544 2.80 103,916 82.10 748 3,402 2.69 107,318 84.79 752 3,232 2.55 110,550 87.34 757 2,871 2.27 113,421 89.61 761 2,671 2.11 116,092 91.72 766 2,297 1.81 118,389 93.53 771 2,073 1.64 120,462 95.17 777 1,660 1.31 122,122 96.48 782 1,311 1.04 123,433 97.52 789 1,064 0.84 124,497 98.36 796 794 0.63 125,291 98.99 803 550 0.43 125,841 99.42 812 356 0.28 126,197 99.70 821 193 0.15 126,390 99.86 833 108 0.09 126,498 99.94 847 51 0.04 126,549 99.98 851 23 0.02 126,572 100.00





Cumulative Percent

568 40 0.03 40 0.03 569 76 0.06 116 0.09 581 178 0.14 294 0.24 592 257 0.21 551 0.44 600 331 0.27 882 0.71 608 546 0.44 1,428 1.15 615 624 0.50 2,052 1.65 621 780 0.63 2,832 2.27 627 847 0.68 3,679 2.95 632 1,002 0.80 4,681 3.76 638 1,197 0.96 5,878 4.72 642 1,437 1.15 7,315 5.87 647 1,601 1.29 8,916 7.16 651 1,715 1.38 10,631 8.54 656 1,809 1.45 12,440 9.99 660 2,093 1.68 14,533 11.67 663 2,235 1.79 16,768 13.46 667 2,294 1.84 19,062 15.31 671 2,448 1.97 21,510 17.27 674 2,582 2.07 24,092 19.35 678 2,713 2.18 26,805 21.52 681 2,947 2.37 29,752 23.89 684 2,912 2.34 32,664 26.23 688 3,097 2.49 35,761 28.72 691 3,164 2.54 38,925 31.26 694 3,197 2.57 42,122 33.82 697 3,358 2.70 45,480 36.52 700 3,471 2.79 48,951 39.31 703 3,536 2.84 52,487 42.15 706 3,576 2.87 56,063 45.02 709 3,686 2.96 59,749 47.98 712 3,793 3.05 63,542 51.02 715 3,787 3.04 67,329 54.06 718 3,914 3.14 71,243 57.21 721 3,907 3.14 75,150 60.34 725 3,954 3.18 79,104 63.52 727 3,820 3.07 82,924 66.59 730 3,951 3.17 86,875 69.76 734 3,924 3.15 90,799 72.91 737 3,737 3.00 94,536 75.91




Cumulative Percent

740 3,597 2.89 98,133 78.80 744 3,507 2.82 101,640 81.62 749 3,296 2.65 104,936 84.26 751 3,089 2.48 108,025 86.74 755 2,924 2.35 110,949 89.09 759 2,702 2.17 113,651 91.26 764 2,372 1.90 116,023 93.17 768 1,999 1.61 118,022 94.77 773 1,665 1.34 119,687 96.11 779 1,283 1.03 120,970 97.14 784 1,069 0.86 122,039 98.00 791 832 0.67 122,871 98.66 797 633 0.51 123,504 99.17 805 419 0.34 123,923 99.51 814 270 0.22 124,193 99.73 824 159 0.13 124,352 99.85 833 182 0.15 124,534 100.00





Cumulative Percent

586 23 0.02 23 0.02 593 43 0.03 66 0.05 603 89 0.07 155 0.12 612 183 0.15 338 0.27 618 306 0.24 644 0.51 625 434 0.35 1,078 0.86 630 598 0.48 1,676 1.34 635 770 0.61 2,446 1.95 639 875 0.70 3,321 2.65 643 1,183 0.94 4,504 3.59 647 1,371 1.09 5,875 4.68 651 1,611 1.28 7,486 5.96 654 1,911 1.52 9,397 7.49 657 2,109 1.68 11,506 9.17 661 2,320 1.85 13,826 11.01 664 2,542 2.02 16,368 13.04 666 2,716 2.16 19,084 15.20 669 2,884 2.30 21,968 17.50 672 2,803 2.23 24,771 19.73 675 2,963 2.36 27,734 22.09 677 2,972 2.37 30,706 24.46 680 3,189 2.54 33,895 27.00 682 3,166 2.52 37,061 29.52 685 3,265 2.60 40,326 32.12 687 3,236 2.58 43,562 34.70 690 3,504 2.79 47,066 37.49 692 3,575 2.85 50,641 40.34 695 3,613 2.88 54,254 43.22 697 3,620 2.88 57,874 46.10 700 3,762 3.00 61,636 49.10 702 3,731 2.97 65,367 52.07 704 3,866 3.08 69,233 55.15 707 3,910 3.11 73,143 58.27 709 3,930 3.13 77,073 61.40 712 4,002 3.19 81,075 64.58 714 4,052 3.23 85,127 67.81 717 3,962 3.16 89,089 70.97 720 3,937 3.14 93,026 74.10 722 3,862 3.08 96,888 77.18 725 3,805 3.03 100,693 80.21




Cumulative Percent

728 3,480 2.77 104,173 82.98 731 3,360 2.68 107,533 85.66 734 3,066 2.44 110,599 88.10 737 2,756 2.20 113,355 90.30 741 2,504 1.99 115,859 92.29 744 2,165 1.72 118,024 94.02 748 1,803 1.44 119,827 95.45 752 1,513 1.21 121,340 96.66 756 1,198 0.95 122,538 97.61 760 998 0.80 123,536 98.41 765 679 0.54 124,215 98.95 770 515 0.41 124,730 99.36 776 339 0.27 125,069 99.63 782 219 0.17 125,288 99.80 789 114 0.09 125,402 99.89 797 75 0.06 125,477 99.95 805 57 0.05 125,534 100.00



Table D20. Scaled Score Frequency Distributions Spring 2018 – High School ELA I


Cumulative Percent

606 84 0.06 84 0.06 613 155 0.10 239 0.16 621 238 0.16 477 0.32 629 422 0.28 899 0.59 635 585 0.39 1,484 0.98 640 897 0.59 2,381 1.58 645 1,168 0.77 3,549 2.35 649 1,470 0.97 5,019 3.32 653 1,679 1.11 6,698 4.43 656 1,868 1.24 8,566 5.67 660 1,989 1.32 10,555 6.99 663 2,148 1.42 12,703 8.41 666 2,372 1.57 15,075 9.98 669 2,536 1.68 17,611 11.66 672 2,676 1.77 20,287 13.43 675 2,960 1.96 23,247 15.39 677 3,074 2.03 26,321 17.42 680 3,385 2.24 29,706 19.66 683 3,547 2.35 33,253 22.01 685 3,562 2.36 36,815 24.37 687 3,634 2.41 40,449 26.77 689 3,748 2.48 44,197 29.25 692 3,846 2.55 48,043 31.80 694 3,796 2.51 51,839 34.31 696 3,704 2.45 55,543 36.76 698 3,674 2.43 59,217 39.19 701 3,771 2.50 62,988 41.69 703 3,913 2.59 66,901 44.28 705 3,910 2.59 70,811 46.87 707 3,841 2.54 74,652 49.41 710 4,064 2.69 78,716 52.10 712 4,027 2.67 82,743 54.76 714 4,129 2.73 86,872 57.49 716 4,068 2.69 90,940 60.19 718 4,091 2.71 95,031 62.89 721 4,219 2.79 99,250 65.69 723 4,274 2.83 103,524 68.52 726 4,258 2.82 107,782 71.33 728 4,226 2.80 112,008 74.13 730 4,251 2.81 116,259 76.94




Cumulative Percent

733 4,143 2.74 120,402 79.69 736 3,915 2.59 124,317 82.28 739 4,051 2.68 128,368 84.96 741 3,630 2.40 131,998 87.36 744 3,510 2.32 135,508 89.68 748 3,123 2.07 138,631 91.75 751 2,720 1.80 141,351 93.55 755 2,469 1.63 143,820 95.19 758 2,013 1.33 145,833 96.52 763 1,678 1.11 147,511 97.63 767 1,211 0.80 148,722 98.43 772 926 0.61 149,648 99.04 777 650 0.43 150,298 99.47 783 398 0.26 150,696 99.74 790 230 0.15 150,926 99.89 798 111 0.07 151,037 99.96 800 58 0.04 151,095 100.00



Table D21. Scaled Score Frequency Distributions Spring 2018 – High School ELA II


Cumulative Percent

597 141 0.10 141 0.10 606 208 0.15 349 0.25 616 335 0.24 684 0.48 624 597 0.42 1,281 0.91 631 922 0.65 2,203 1.56 637 1,282 0.91 3,485 2.47 642 1,579 1.12 5,064 3.58 647 2,006 1.42 7,070 5.00 651 2,188 1.55 9,258 6.55 655 2,396 1.70 11,654 8.25 659 2,682 1.90 14,336 10.15 662 2,607 1.85 16,943 11.99 666 2,800 1.98 19,743 13.98 669 2,747 1.94 22,490 15.92 672 2,939 2.08 25,429 18.00 675 3,056 2.16 28,485 20.16 679 2,933 2.08 31,418 22.24 681 3,104 2.20 34,522 24.44 683 3,267 2.31 37,789 26.75 686 3,165 2.24 40,954 28.99 688 3,341 2.37 44,295 31.36 691 3,540 2.51 47,835 33.86 693 3,476 2.46 51,311 36.32 696 3,750 2.65 55,061 38.98 698 4,018 2.84 59,079 41.82 700 3,894 2.76 62,973 44.58 703 4,013 2.84 66,986 47.42 705 4,117 2.91 71,103 50.33 707 4,246 3.01 75,349 53.34 710 4,282 3.03 79,631 56.37 712 4,234 3.00 83,865 59.37 714 4,326 3.06 88,191 62.43 716 4,305 3.05 92,496 65.48 719 4,385 3.10 96,881 68.58 721 4,234 3.00 101,115 71.58 723 4,234 3.00 105,349 74.58 726 4,084 2.89 109,433 77.47 728 4,008 2.84 113,441 80.30 731 3,664 2.59 117,105 82.90 733 3,571 2.53 120,676 85.43




Cumulative Percent

736 3,191 2.26 123,867 87.69 739 3,076 2.18 126,943 89.86 742 2,747 1.94 129,690 91.81 745 2,321 1.64 132,011 93.45 748 2,075 1.47 134,086 94.92 752 1,723 1.22 135,809 96.14 755 1,451 1.03 137,260 97.17 759 1,121 0.79 138,381 97.96 763 867 0.61 139,248 98.57 768 662 0.47 139,910 99.04 773 504 0.36 140,414 99.40 778 348 0.25 140,762 99.65 783 211 0.15 140,973 99.79 790 148 0.10 141,121 99.90 797 72 0.05 141,193 99.95 805 44 0.03 141,237 99.98 808 26 0.02 141,263 100.00



Table D22. Scaled Score Frequency Distributions Spring 2018 – Grade 3 Math


Cumulative Percent

587 235 0.18 235 0.18 589 337 0.26 572 0.45 600 493 0.39 1,065 0.83 601 16 0.01 1,081 0.85 609 651 0.51 1,732 1.36 610 5 0.00 1,737 1.36 617 869 0.68 2,606 2.04 618 18 0.01 2,624 2.06 624 1,072 0.84 3,696 2.90 625 20 0.02 3,716 2.91 630 1,136 0.89 4,852 3.80 631 16 0.01 4,868 3.82 636 1,351 1.06 6,219 4.88 637 18 0.01 6,237 4.89 642 1,500 1.18 7,737 6.07 643 15 0.01 7,752 6.08 647 1,669 1.31 9,421 7.39 648 23 0.02 9,444 7.40 651 1,838 1.44 11,282 8.84 653 15 0.01 11,297 8.86 656 2,114 1.66 13,411 10.51 657 24 0.02 13,435 10.53 660 2,204 1.73 15,639 12.26 662 12 0.01 15,651 12.27 664 2,446 1.92 18,097 14.19 666 13 0.01 18,110 14.20 669 2,508 1.97 20,618 16.16 670 17 0.01 20,635 16.18 673 2,687 2.11 23,322 18.28 674 10 0.01 23,332 18.29 677 2,751 2.16 26,083 20.45 678 19 0.01 26,102 20.46 680 2,979 2.34 29,081 22.80 683 17 0.01 29,098 22.81 684 3,007 2.36 32,105 25.17 686 27 0.02 32,132 25.19 688 3,164 2.48 35,296 27.67 690 25 0.02 35,321 27.69 692 3,315 2.60 38,636 30.29 694 19 0.01 38,655 30.30 696 3,379 2.65 42,034 32.95 698 15 0.01 42,049 32.96 700 3,347 2.62 45,396 35.59




Cumulative Percent

702 19 0.01 45,415 35.60 703 3,518 2.76 48,933 38.36 706 15 0.01 48,948 38.37 707 3,735 2.93 52,683 41.30 710 18 0.01 52,701 41.32 711 3,718 2.91 56,419 44.23 714 29 0.02 56,448 44.25 715 3,804 2.98 60,252 47.23 717 18 0.01 60,270 47.25 719 3,955 3.10 64,225 50.35 721 20 0.02 64,245 50.37 723 3,845 3.01 68,090 53.38 725 15 0.01 68,105 53.39 727 4,034 3.16 72,139 56.55 729 12 0.01 72,151 56.56 731 4,107 3.22 76,258 59.78 733 21 0.02 76,279 59.80 735 4,027 3.16 80,306 62.96 738 19 0.01 80,325 62.97 739 4,140 3.25 84,465 66.22 742 4 0.00 84,469 66.22 744 4,152 3.25 88,621 69.48 746 11 0.01 88,632 69.48 748 4,180 3.28 92,812 72.76 750 16 0.01 92,828 72.77 753 4,092 3.21 96,920 75.98 755 15 0.01 96,935 75.99 758 4,084 3.20 101,019 79.19 760 9 0.01 101,028 79.20 763 3,907 3.06 104,935 82.26 764 9 0.01 104,944 82.27 768 3,807 2.98 108,751 85.26 769 10 0.01 108,761 85.26 774 3,631 2.85 112,392 88.11 775 5 0.00 112,397 88.11 780 3,191 2.50 115,588 90.62 781 8 0.01 115,596 90.62 787 2,968 2.33 118,564 92.95 794 6 0.00 118,570 92.95 795 2,546 2.00 121,116 94.95 803 4 0.00 121,120 94.95 804 2,257 1.77 123,377 96.72 814 1 0.00 123,378 96.72 815 1,804 1.41 125,182 98.14




Cumulative Percent

818 2,376 1.86 127,558 100.00





Cumulative Percent

605 336 0.27 336 0.27 606 649 0.52 985 0.78 620 1,121 0.89 2,106 1.67 631 1,499 1.19 3,605 2.86 640 1,747 1.39 5,352 4.25 647 1,841 1.46 7,193 5.71 653 2,088 1.66 9,281 7.37 659 2,229 1.77 11,510 9.13 664 2,287 1.81 13,797 10.95 669 2,400 1.90 16,197 12.85 674 2,414 1.92 18,611 14.77 678 2,575 2.04 21,186 16.81 682 2,656 2.11 23,842 18.92 686 2,574 2.04 26,416 20.96 689 2,724 2.16 29,140 23.12 693 2,770 2.20 31,910 25.32 696 2,680 2.13 34,590 27.45 700 2,858 2.27 37,448 29.72 703 2,748 2.18 40,196 31.90 706 2,818 2.24 43,014 34.13 710 2,805 2.23 45,819 36.36 713 2,805 2.23 48,624 38.59 716 2,900 2.30 51,524 40.89 719 2,951 2.34 54,475 43.23 722 3,015 2.39 57,490 45.62 725 2,975 2.36 60,465 47.98 728 3,120 2.48 63,585 50.46 732 3,163 2.51 66,748 52.97 735 3,144 2.49 69,892 55.46 738 3,198 2.54 73,090 58.00 741 3,280 2.60 76,370 60.60 745 3,182 2.53 79,552 63.13 748 3,360 2.67 82,912 65.80 752 3,382 2.68 86,294 68.48 756 3,445 2.73 89,739 71.21 760 3,375 2.68 93,114 73.89 764 3,410 2.71 96,524 76.60 768 3,400 2.70 99,924 79.30 773 3,431 2.72 103,355 82.02 778 3,266 2.59 106,621 84.61 783 3,296 2.62 109,917 87.23 789 3,173 2.52 113,090 89.74 796 3,002 2.38 116,092 92.13




Cumulative Percent

803 2,776 2.20 118,868 94.33 812 2,412 1.91 121,280 96.24 823 2,015 1.60 123,295 97.84 835 2,719 2.16 126,014 100.00





Cumulative Percent

624 1,562 1.23 1,562 1.23 631 1,634 1.29 3,196 2.52 639 1,934 1.53 5,130 4.05 646 2,334 1.84 7,464 5.89 652 2,639 2.08 10,103 7.97 658 2,765 2.18 12,868 10.15 662 2,899 2.29 15,767 12.44 667 2,957 2.33 18,724 14.78 671 3,109 2.45 21,833 17.23 674 3,105 2.45 24,938 19.68 678 3,101 2.45 28,039 22.13 681 3,057 2.41 31,096 24.54 685 3,150 2.49 34,246 27.02 688 3,249 2.56 37,495 29.59 691 3,256 2.57 40,751 32.16 694 3,277 2.59 44,028 34.74 696 3,287 2.59 47,315 37.34 700 3,406 2.69 50,721 40.03 702 3,367 2.66 54,088 42.68 705 3,370 2.66 57,458 45.34 707 3,312 2.61 60,770 47.96 710 3,456 2.73 64,226 50.68 713 3,361 2.65 67,587 53.34 715 3,252 2.57 70,839 55.90 718 3,249 2.56 74,088 58.47 720 3,264 2.58 77,352 61.04 723 3,240 2.56 80,592 63.60 726 3,139 2.48 83,731 66.08 728 3,090 2.44 86,821 68.51 731 2,980 2.35 89,801 70.87 734 2,826 2.23 92,627 73.10 737 2,806 2.21 95,433 75.31 739 2,725 2.15 98,158 77.46 742 2,615 2.06 100,773 79.52 745 2,506 1.98 103,279 81.50 749 2,460 1.94 105,739 83.44 752 2,406 1.90 108,145 85.34 755 2,317 1.83 110,462 87.17 759 2,275 1.80 112,737 88.97 762 2,116 1.67 114,853 90.64 767 2,052 1.62 116,905 92.25 771 1,926 1.52 118,831 93.77 776 1,727 1.36 120,558 95.14




Cumulative Percent

782 1,684 1.33 122,242 96.47 788 1,463 1.15 123,705 97.62 797 1,228 0.97 124,933 98.59 804 1,787 1.41 126,720 100.00





Cumulative Percent

616 264 0.21 264 0.21 619 470 0.38 734 0.59 628 805 0.64 1,539 1.23 635 1,290 1.03 2,829 2.26 640 1,605 1.28 4,434 3.55 645 1,886 1.51 6,320 5.06 650 2,014 1.61 8,334 6.67 654 2,111 1.69 10,445 8.36 658 2,220 1.78 12,665 10.13 661 2,336 1.87 15,001 12.00 665 2,473 1.98 17,474 13.98 668 2,472 1.98 19,946 15.96 671 2,594 2.08 22,540 18.04 674 2,647 2.12 25,187 20.15 677 2,767 2.21 27,954 22.37 680 2,718 2.17 30,672 24.54 682 2,684 2.15 33,356 26.69 685 2,795 2.24 36,151 28.93 688 2,877 2.30 39,028 31.23 690 2,857 2.29 41,885 33.52 693 3,007 2.41 44,892 35.92 696 3,001 2.40 47,893 38.32 698 3,079 2.46 50,972 40.79 701 3,104 2.48 54,076 43.27 703 3,284 2.63 57,360 45.90 706 3,228 2.58 60,588 48.48 708 3,255 2.60 63,843 51.09 711 3,292 2.63 67,135 53.72 713 3,356 2.69 70,491 56.41 716 3,330 2.66 73,821 59.07 719 3,230 2.58 77,051 61.66 721 3,146 2.52 80,197 64.17 725 3,205 2.56 83,402 66.74 727 3,242 2.59 86,644 69.33 729 3,123 2.50 89,767 71.83 732 3,089 2.47 92,856 74.30 735 3,035 2.43 95,891 76.73 738 2,918 2.33 98,809 79.07 742 2,793 2.23 101,602 81.30 745 2,755 2.20 104,357 83.50 748 2,703 2.16 107,060 85.67 752 2,548 2.04 109,608 87.71 756 2,405 1.92 112,013 89.63




Cumulative Percent

760 2,312 1.85 114,325 91.48 765 2,161 1.73 116,486 93.21 770 2,087 1.67 118,573 94.88 776 1,807 1.45 120,380 96.33 783 1,550 1.24 121,930 97.57 790 3,041 2.43 124,971 100.00





Cumulative Percent

605 168 0.14 168 0.14 606 477 0.40 645 0.54 619 977 0.81 1,622 1.35 629 1,476 1.23 3,098 2.58 636 2,050 1.71 5,148 4.29 643 2,411 2.01 7,559 6.30 648 2,583 2.15 10,142 8.46 653 2,712 2.26 12,854 10.72 658 2,715 2.26 15,569 12.98 662 2,786 2.32 18,355 15.31 666 2,751 2.29 21,106 17.60 670 2,725 2.27 23,831 19.87 673 2,751 2.29 26,582 22.17 677 2,717 2.27 29,299 24.44 680 2,834 2.36 32,133 26.80 684 2,851 2.38 34,984 29.18 686 2,835 2.36 37,819 31.54 689 2,753 2.30 40,572 33.84 692 2,818 2.35 43,390 36.19 695 2,885 2.41 46,275 38.59 697 2,771 2.31 49,046 40.90 700 2,831 2.36 51,877 43.27 703 2,887 2.41 54,764 45.67 706 2,838 2.37 57,602 48.04 708 2,874 2.40 60,476 50.44 711 2,979 2.48 63,455 52.92 713 2,864 2.39 66,319 55.31 716 2,869 2.39 69,188 57.70 719 2,971 2.48 72,159 60.18 721 2,885 2.41 75,044 62.59 725 2,946 2.46 77,990 65.04 727 2,886 2.41 80,876 67.45 730 2,932 2.45 83,808 69.90 732 2,884 2.41 86,692 72.30 735 2,741 2.29 89,433 74.59 738 2,809 2.34 92,242 76.93 741 2,638 2.20 94,880 79.13 744 2,648 2.21 97,528 81.34 747 2,505 2.09 100,033 83.43 751 2,458 2.05 102,491 85.48 755 2,477 2.07 104,968 87.54 758 2,354 1.96 107,322 89.51 762 2,204 1.84 109,526 91.34




Cumulative Percent

767 2,085 1.74 111,611 93.08 771 1,826 1.52 113,437 94.61 777 1,709 1.43 115,146 96.03 784 1,471 1.23 116,617 97.26 791 1,237 1.03 117,854 98.29 801 964 0.80 118,818 99.09 806 1,087 0.91 119,905 100.00





Cumulative Percent

633 548 0.56 548 0.56 639 637 0.65 1,185 1.21 644 995 1.02 2,180 2.23 649 1,288 1.32 3,468 3.55 654 1,567 1.60 5,035 5.15 658 1,751 1.79 6,786 6.95 661 1,957 2.00 8,743 8.95 665 2,081 2.13 10,824 11.08 668 2,136 2.19 12,960 13.27 671 2,266 2.32 15,226 15.59 674 2,326 2.38 17,552 17.97 676 2,438 2.50 19,990 20.46 679 2,479 2.54 22,469 23.00 682 2,553 2.61 25,022 25.62 684 2,720 2.78 27,742 28.40 687 2,744 2.81 30,486 31.21 690 2,805 2.87 33,291 34.08 692 2,922 2.99 36,213 37.07 694 2,941 3.01 39,154 40.08 696 3,002 3.07 42,156 43.16 699 3,020 3.09 45,176 46.25 701 2,981 3.05 48,157 49.30 703 3,089 3.16 51,246 52.46 705 2,992 3.06 54,238 55.53 708 3,041 3.11 57,279 58.64 710 2,960 3.03 60,239 61.67 712 2,990 3.06 63,229 64.73 714 2,962 3.03 66,191 67.76 717 2,925 2.99 69,116 70.76 719 2,766 2.83 71,882 73.59 721 2,637 2.70 74,519 76.29 723 2,578 2.64 77,097 78.93 726 2,495 2.55 79,592 81.48 728 2,191 2.24 81,783 83.72 730 2,115 2.17 83,898 85.89 733 1,944 1.99 85,842 87.88 735 1,817 1.86 87,659 89.74 738 1,648 1.69 89,307 91.43 741 1,416 1.45 90,723 92.88 744 1,279 1.31 92,002 94.19 747 1,145 1.17 93,147 95.36 750 1,031 1.06 94,178 96.41 753 816 0.84 94,994 97.25




Cumulative Percent

757 700 0.72 95,694 97.97 761 584 0.60 96,278 98.56 766 462 0.47 96,740 99.04 771 377 0.39 97,117 99.42 774 564 0.58 97,681 100.00



Table D28. Scaled Score Frequency Distributions Spring 2018 – Algebra


Cumulative Percent

618 435 0.30 435 0.30 629 780 0.54 1,215 0.83 638 1,504 1.03 2,719 1.87 645 2,481 1.70 5,200 3.57 652 3,558 2.44 8,758 6.02 657 4,321 2.97 13,079 8.99 662 5,011 3.44 18,090 12.43 666 5,299 3.64 23,389 16.07 670 5,260 3.61 28,649 19.68 674 5,198 3.57 33,847 23.25 677 4,914 3.38 38,761 26.63 682 4,747 3.26 43,508 29.89 684 4,630 3.18 48,138 33.07 687 4,447 3.06 52,585 36.13 690 4,553 3.13 57,138 39.26 693 4,267 2.93 61,405 42.19 695 4,201 2.89 65,606 45.07 698 3,965 2.72 69,571 47.80 701 4,025 2.77 73,596 50.56 703 4,064 2.79 77,660 53.35 706 3,859 2.65 81,519 56.01 708 3,965 2.72 85,484 58.73 711 3,682 2.53 89,166 61.26 713 3,718 2.55 92,884 63.81 716 3,551 2.44 96,435 66.25 718 3,546 2.44 99,981 68.69 721 3,449 2.37 103,430 71.06 723 3,353 2.30 106,783 73.36 726 3,170 2.18 109,953 75.54 728 3,243 2.23 113,196 77.77 731 2,989 2.05 116,185 79.82 733 2,831 1.94 119,016 81.77 736 2,773 1.91 121,789 83.67 739 2,569 1.76 124,358 85.44 741 2,370 1.63 126,728 87.07 744 2,388 1.64 129,116 88.71 747 2,111 1.45 131,227 90.16 750 2,029 1.39 133,256 91.55 754 1,822 1.25 135,078 92.80 757 1,663 1.14 136,741 93.95 760 1,553 1.07 138,294 95.01 764 1,329 0.91 139,623 95.93 768 1,163 0.80 140,786 96.72




Cumulative Percent

772 1,077 0.74 141,863 97.46 777 917 0.63 142,780 98.09 782 753 0.52 143,533 98.61 788 626 0.43 144,159 99.04 795 552 0.38 144,711 99.42 803 354 0.24 145,065 99.66 813 260 0.18 145,325 99.84 814 229 0.16 145,554 100.00



Table D29. Scaled Score Frequency Distributions Spring 2018 – Geometry


Cumulative Percent

604 628 0.49 628 0.49 609 1,465 1.15 2,093 1.64 622 2,929 2.29 5,022 3.93 632 4,164 3.26 9,186 7.19 640 5,069 3.97 14,255 11.16 647 5,505 4.31 19,760 15.46 653 5,654 4.43 25,414 19.89 658 5,505 4.31 30,919 24.20 663 5,351 4.19 36,270 28.39 668 4,935 3.86 41,205 32.25 672 4,738 3.71 45,943 35.96 676 4,404 3.45 50,347 39.40 679 4,092 3.20 54,439 42.61 683 3,826 2.99 58,265 45.60 686 3,704 2.90 61,969 48.50 689 3,585 2.81 65,554 51.30 693 3,317 2.60 68,871 53.90 696 3,154 2.47 72,025 56.37 700 3,068 2.40 75,093 58.77 702 2,947 2.31 78,040 61.08 705 2,922 2.29 80,962 63.36 707 2,766 2.16 83,728 65.53 710 2,658 2.08 86,386 67.61 713 2,631 2.06 89,017 69.67 716 2,453 1.92 91,470 71.59 719 2,459 1.92 93,929 73.51 721 2,223 1.74 96,152 75.25 725 2,280 1.78 98,432 77.04 727 2,142 1.68 100,574 78.71 730 2,077 1.63 102,651 80.34 732 2,047 1.60 104,698 81.94 735 2,017 1.58 106,715 83.52 738 1,858 1.45 108,573 84.97 741 1,825 1.43 110,398 86.40 744 1,713 1.34 112,111 87.74 747 1,507 1.18 113,618 88.92 750 1,558 1.22 115,176 90.14 753 1,524 1.19 116,700 91.33 756 1,421 1.11 118,121 92.45 759 1,334 1.04 119,455 93.49 763 1,197 0.94 120,652 94.43 766 1,103 0.86 121,755 95.29 770 1,063 0.83 122,818 96.12




Cumulative Percent

774 930 0.73 123,748 96.85 778 807 0.63 124,555 97.48 783 739 0.58 125,294 98.06 788 612 0.48 125,906 98.54 793 548 0.43 126,454 98.97 799 452 0.35 126,906 99.32 806 291 0.23 127,197 99.55 810 577 0.45 127,774 100.00



Table D30. Scaled Score Frequency Distributions Spring 2018 – Integrated Math I

Scaled Score Frequency Percent Cumulative


618 124 0.99 124 0.99 627 163 1.31 287 2.30 636 308 2.47 595 4.77 644 469 3.76 1,064 8.53 650 587 4.71 1,651 13.24 655 632 5.07 2,283 18.31 660 639 5.12 2,922 23.43 664 625 5.01 3,547 28.44 669 592 4.75 4,139 33.19 672 484 3.88 4,623 37.07 676 469 3.76 5,092 40.83 679 449 3.60 5,541 44.43 682 396 3.18 5,937 47.61 685 387 3.10 6,324 50.71 688 344 2.76 6,668 53.47 691 297 2.38 6,965 55.85 694 338 2.71 7,303 58.56 696 284 2.28 7,587 60.84 700 295 2.37 7,882 63.21 702 276 2.21 8,158 65.42 704 268 2.15 8,426 67.57 707 217 1.74 8,643 69.31 709 247 1.98 8,890 71.29 711 232 1.86 9,122 73.15 714 237 1.90 9,359 75.05 716 194 1.56 9,553 76.61 719 204 1.64 9,757 78.24 721 184 1.48 9,941 79.72 724 216 1.73 10,157 81.45 726 218 1.75 10,375 83.20 729 165 1.32 10,540 84.52 731 173 1.39 10,713 85.91 734 168 1.35 10,881 87.26 737 161 1.29 11,042 88.55 739 161 1.29 11,203 89.84 742 163 1.31 11,366 91.15 745 139 1.11 11,505 92.26 748 137 1.10 11,642 93.36 751 118 0.95 11,760 94.31 754 126 1.01 11,886 95.32 758 103 0.83 11,989 96.14 761 123 0.99 12,112 97.13 765 78 0.63 12,190 97.75



Scaled Score Frequency Percent Cumulative


769 62 0.50 12,252 98.25 774 59 0.47 12,311 98.72 779 45 0.36 12,356 99.09 784 43 0.34 12,399 99.43 791 31 0.25 12,430 99.68 798 15 0.12 12,445 99.80 808 15 0.12 12,460 99.92 814 10 0.08 12,470 100.00



Table D31. Scaled Score Frequency Distributions Spring 2018 – Integrated Math II


Cumulative Percent

594 17 0.16 17 0.16 600 50 0.47 67 0.63 614 113 1.06 180 1.68 624 191 1.78 371 3.46 632 313 2.92 684 6.39 640 420 3.92 1,104 10.31 646 552 5.15 1,656 15.46 651 617 5.76 2,273 21.23 657 618 5.77 2,891 27.00 661 612 5.71 3,503 32.71 666 563 5.26 4,066 37.97 670 536 5.01 4,602 42.97 674 462 4.31 5,064 47.29 678 457 4.27 5,521 51.55 681 411 3.84 5,932 55.39 685 372 3.47 6,304 58.87 688 330 3.08 6,634 61.95 691 321 3.00 6,955 64.95 695 301 2.81 7,256 67.76 698 253 2.36 7,509 70.12 701 231 2.16 7,740 72.28 704 207 1.93 7,947 74.21 707 183 1.71 8,130 75.92 710 190 1.77 8,320 77.69 713 179 1.67 8,499 79.36 716 197 1.84 8,696 81.20 719 164 1.53 8,860 82.73 722 142 1.33 9,002 84.06 725 129 1.20 9,131 85.26 728 136 1.27 9,267 86.53 731 148 1.38 9,415 87.92 734 108 1.01 9,523 88.93 738 147 1.37 9,670 90.30 741 102 0.95 9,772 91.25 744 112 1.05 9,884 92.30 747 112 1.05 9,996 93.34 751 74 0.69 10,070 94.03 754 81 0.76 10,151 94.79 758 78 0.73 10,229 95.52 762 73 0.68 10,302 96.20 765 67 0.63 10,369 96.83 769 70 0.65 10,439 97.48 774 57 0.53 10,496 98.01




Cumulative Percent

778 36 0.34 10,532 98.35 783 44 0.41 10,576 98.76 788 38 0.35 10,614 99.11 793 31 0.29 10,645 99.40 799 20 0.19 10,665 99.59 806 14 0.13 10,679 99.72 813 30 0.28 10,709 100.00



Table D32. Scaled Score Frequency Distributions Spring 2018 – Grade 5 Science


Cumulative Percent

559 12 0.01 12 0.01 569 27 0.02 39 0.03 582 67 0.05 106 0.08 593 162 0.13 268 0.21 602 297 0.23 565 0.44 610 449 0.35 1,014 0.79 617 733 0.57 1,747 1.36 623 945 0.74 2,692 2.10 629 1,114 0.87 3,806 2.97 635 1,362 1.06 5,168 4.04 640 1,571 1.23 6,739 5.26 645 1,707 1.33 8,446 6.60 649 1,883 1.47 10,329 8.07 650 15 0.01 10,344 8.08 654 1,960 1.53 12,304 9.61 658 2,029 1.59 14,333 11.20 659 16 0.01 14,349 11.21 664 2,088 1.63 16,437 12.84 667 2,293 1.79 18,730 14.63 671 2,327 1.82 21,057 16.45 675 2,416 1.89 23,473 18.34 678 2,515 1.96 25,988 20.30 679 18 0.01 26,006 20.32 682 2,610 2.04 28,616 22.36 683 17 0.01 28,633 22.37 686 2,748 2.15 31,381 24.52 687 13 0.01 31,394 24.53 690 2,856 2.23 34,250 26.76 691 19 0.01 34,269 26.77 693 2,916 2.28 37,185 29.05 694 24 0.02 37,209 29.07 697 3,181 2.49 40,390 31.55 698 14 0.01 40,404 31.56 701 3,216 2.51 43,620 34.08 702 18 0.01 43,638 34.09 704 3,324 2.60 46,962 36.69 705 13 0.01 46,975 36.70 708 3,540 2.77 50,515 39.46 709 24 0.02 50,539 39.48 712 3,583 2.80 54,122 42.28 713 16 0.01 54,138 42.29 715 3,823 2.99 57,961 45.28 716 18 0.01 57,979 45.29




Cumulative Percent

719 3,920 3.06 61,899 48.36 720 15 0.01 61,914 48.37 723 4,112 3.21 66,026 51.58 725 13 0.01 66,039 51.59 726 4,230 3.30 70,269 54.90 728 12 0.01 70,281 54.91 730 4,334 3.39 74,615 58.29 731 6 0.00 74,621 58.30 734 4,379 3.42 79,000 61.72 735 12 0.01 79,012 61.73 738 4,402 3.44 83,414 65.17 739 11 0.01 83,425 65.17 742 4,427 3.46 87,852 68.63 743 14 0.01 87,866 68.64 746 4,478 3.50 92,344 72.14 748 13 0.01 92,357 72.15 751 4,251 3.32 96,608 75.47 753 8 0.01 96,616 75.48 755 4,170 3.26 100,786 78.74 756 10 0.01 100,796 78.74 760 4,136 3.23 104,932 81.98 761 9 0.01 104,941 81.98 765 3,821 2.99 108,762 84.97 766 6 0.00 108,768 84.97 770 3,584 2.80 112,352 87.77 771 4 0.00 112,356 87.78 776 3,256 2.54 115,612 90.32 777 3 0.00 115,615 90.32 782 2,902 2.27 118,517 92.59 783 3 0.00 118,520 92.59 789 2,490 1.95 121,010 94.54 790 5 0.00 121,015 94.54 797 2,117 1.65 123,132 96.19 805 1,666 1.30 124,798 97.50 806 1 0.00 124,799 97.50 815 1,263 0.99 126,062 98.48 827 883 0.69 126,945 99.17 841 598 0.47 127,543 99.64 845 461 0.36 128,004 100.00



Table D33. Scaled Score Frequency Distributions Spring 2018 – Grade 8 Science


Cumulative Percent

575 17 0.01 17 0.01 579 55 0.04 72 0.06 593 109 0.09 181 0.14 598 3 0.00 184 0.15 604 246 0.19 430 0.34 610 1 0.00 431 0.34 613 456 0.36 887 0.70 619 1 0.00 888 0.70 622 694 0.55 1,582 1.25 627 4 0.00 1,586 1.25 629 1,082 0.86 2,668 2.11 634 8 0.01 2,676 2.12 636 1,451 1.15 4,127 3.26 641 12 0.01 4,139 3.27 642 1,773 1.40 5,912 4.68 647 10 0.01 5,922 4.68 648 2,064 1.63 7,986 6.32 653 2,487 1.97 10,473 8.28 658 18 0.01 10,491 8.30 659 2,633 2.08 13,124 10.38 663 16 0.01 13,140 10.39 664 2,839 2.25 15,979 12.64 668 19 0.02 15,998 12.65 669 3,097 2.45 19,095 15.10 674 3,361 2.66 22,456 17.76 677 27 0.02 22,483 17.78 678 3,421 2.71 25,904 20.49 681 14 0.01 25,918 20.50 682 3,540 2.80 29,458 23.30 686 3,720 2.94 33,178 26.24 690 23 0.02 33,201 26.26 691 3,891 3.08 37,092 29.33 694 16 0.01 37,108 29.35 695 4,018 3.18 41,126 32.53 697 14 0.01 41,140 32.54 700 4,079 3.23 45,219 35.76 701 14 0.01 45,233 35.77 703 4,066 3.22 49,299 38.99 705 16 0.01 49,315 39.00 707 4,173 3.30 53,488 42.30 708 17 0.01 53,505 42.32 710 4,337 3.43 57,842 45.75 712 7 0.01 57,849 45.75




Cumulative Percent

714 4,287 3.39 62,136 49.14 716 8 0.01 62,144 49.15 718 4,245 3.36 66,389 52.50 719 8 0.01 66,397 52.51 722 4,252 3.36 70,649 55.87 723 8 0.01 70,657 55.88 726 4,291 3.39 74,948 59.27 730 4,238 3.35 79,186 62.63 733 6 0.00 79,192 62.63 734 4,044 3.20 83,236 65.83 737 4 0.00 83,240 65.83 738 3,863 3.06 87,103 68.89 741 9 0.01 87,112 68.89 742 3,851 3.05 90,963 71.94 744 6 0.00 90,969 71.94 746 3,645 2.88 94,614 74.83 748 4 0.00 94,618 74.83 750 3,509 2.78 98,127 77.61 752 4 0.00 98,131 77.61 754 3,387 2.68 101,518 80.29 756 6 0.00 101,524 80.29 758 3,205 2.53 104,729 82.83 760 9 0.01 104,738 82.83 763 2,916 2.31 107,654 85.14 766 8 0.01 107,662 85.15 768 2,691 2.13 110,353 87.27 769 6 0.00 110,359 87.28 772 2,604 2.06 112,963 89.34 773 2 0.00 112,965 89.34 777 2,325 1.84 115,290 91.18 778 2 0.00 115,292 91.18 782 2,077 1.64 117,369 92.82 783 1 0.00 117,370 92.82 788 1,885 1.49 119,255 94.31 789 1 0.00 119,256 94.32 793 1,593 1.26 120,849 95.58 794 2 0.00 120,851 95.58 799 1,328 1.05 122,179 96.63 800 1 0.00 122,180 96.63 806 1,174 0.93 123,354 97.56 813 950 0.75 124,304 98.31 820 747 0.59 125,051 98.90 822 1 0.00 125,052 98.90 828 580 0.46 125,632 99.36




Cumulative Percent

832 3 0.00 125,635 99.36 838 340 0.27 125,975 99.63 849 247 0.20 126,222 99.82 863 140 0.11 126,362 99.94 868 82 0.06 126,444 100.00



Table D34. Scaled Score Frequency Distributions Spring 2018 – Biology


Cumulative Percent

617 19 0.01 19 0.01 619 29 0.02 48 0.04 631 89 0.07 137 0.10 637 1 0.00 138 0.10 641 176 0.13 314 0.23 647 2 0.00 316 0.23 648 390 0.29 706 0.52 654 4 0.00 710 0.52 655 679 0.50 1,389 1.02 660 1,116 0.82 2,505 1.84 665 1,716 1.26 4,221 3.10 669 2,379 1.75 6,600 4.85 670 5 0.00 6,605 4.85 673 2,907 2.14 9,512 6.99 674 20 0.01 9,532 7.00 677 3,413 2.51 12,945 9.51 678 21 0.02 12,966 9.53 680 3,871 2.84 16,837 12.37 681 30 0.02 16,867 12.39 683 4,047 2.97 20,914 15.37 685 27 0.02 20,941 15.39 686 4,158 3.06 25,099 18.44 688 26 0.02 25,125 18.46 689 4,293 3.15 29,418 21.62 690 32 0.02 29,450 21.64 692 3,969 2.92 33,419 24.56 693 21 0.02 33,440 24.57 694 4,124 3.03 37,564 27.60 696 26 0.02 37,590 27.62 697 3,991 2.93 41,581 30.55 698 10 0.01 41,591 30.56 700 3,865 2.84 45,456 33.40 701 13 0.01 45,469 33.41 702 3,860 2.84 49,329 36.25 703 14 0.01 49,343 36.26 704 3,807 2.80 53,150 39.06 705 8 0.01 53,158 39.06 706 3,766 2.77 56,924 41.83 708 11 0.01 56,935 41.84 709 3,763 2.77 60,698 44.60 710 11 0.01 60,709 44.61 711 3,749 2.75 64,458 47.37 712 12 0.01 64,470 47.37




Cumulative Percent

713 3,859 2.84 68,329 50.21 714 4 0.00 68,333 50.21 715 3,736 2.75 72,069 52.96 716 7 0.01 72,076 52.96 718 3,735 2.74 75,811 55.71 720 3,573 2.63 79,384 58.33 722 3,630 2.67 83,014 61.00 724 3,708 2.72 86,722 63.73 725 7 0.01 86,729 63.73 726 3,566 2.62 90,295 66.35 727 1 0.00 90,296 66.35 728 3,467 2.55 93,763 68.90 729 5 0.00 93,768 68.90 731 3,417 2.51 97,185 71.41 733 3,315 2.44 100,500 73.85 735 3,213 2.36 103,713 76.21 737 1 0.00 103,714 76.21 738 3,137 2.31 106,851 78.52 740 2,990 2.20 109,841 80.71 742 2,961 2.18 112,802 82.89 744 3 0.00 112,805 82.89 745 2,754 2.02 115,559 84.92 747 1 0.00 115,560 84.92 748 2,566 1.89 118,126 86.80 749 2 0.00 118,128 86.80 750 2,523 1.85 120,651 88.66 752 1 0.00 120,652 88.66 753 2,365 1.74 123,017 90.40 755 3 0.00 123,020 90.40 756 2,091 1.54 125,111 91.93 759 2,058 1.51 127,169 93.45 763 1,821 1.34 128,990 94.78 764 2 0.00 128,992 94.79 766 1,619 1.19 130,611 95.98 770 1,408 1.03 132,019 97.01 775 1,234 0.91 133,253 97.92 780 952 0.70 134,205 98.62 785 765 0.56 134,970 99.18 792 505 0.37 135,475 99.55 794 1 0.00 135,476 99.55 800 317 0.23 135,793 99.78 811 192 0.14 135,985 99.93 823 102 0.07 136,087 100.00



Table D35. Scaled Score Frequency Distributions Spring 2018 – Physical Science


Cumulative Percent

634 4 0.78 4 0.78 635 12 2.33 16 3.10 646 15 2.91 31 6.01 654 23 4.46 54 10.47 661 31 6.01 85 16.47 667 44 8.53 129 25.00 671 58 11.24 187 36.24 676 49 9.50 236 45.74 680 46 8.91 282 54.65 684 51 9.88 333 64.53 685 1 0.19 334 64.73 687 41 7.95 375 72.67 688 2 0.39 377 73.06 690 29 5.62 406 78.68 691 1 0.19 407 78.88 693 24 4.65 431 83.53 696 23 4.46 454 87.98 698 13 2.52 467 90.50 701 5 0.97 472 91.47 702 1 0.19 473 91.67 703 5 0.97 478 92.64 706 5 0.97 483 93.60 708 5 0.97 488 94.57 710 6 1.16 494 95.74 712 1 0.19 495 95.93 714 3 0.58 498 96.51 717 7 1.36 505 97.87 719 3 0.58 508 98.45 723 2 0.39 510 98.84 729 1 0.19 511 99.03 731 2 0.39 513 99.42 743 1 0.19 514 99.61 745 1 0.19 515 99.81 754 1 0.19 516 100.00



Table D36. Scaled Score Frequency Distributions Spring 2018 – American Government


Cumulative Percent

642 5 0.01 5 0.01 644 14 0.02 19 0.02 652 23 0.03 42 0.05 658 45 0.05 87 0.10 663 109 0.12 196 0.22 667 158 0.18 354 0.40 670 5 0.01 359 0.41 671 256 0.29 615 0.70 673 4 0.00 619 0.71 674 340 0.39 959 1.10 676 5 0.01 964 1.10 677 469 0.54 1,433 1.64 678 5 0.01 1,438 1.64 679 610 0.70 2,048 2.34 681 743 0.85 2,791 3.19 683 881 1.01 3,672 4.20 685 1,126 1.29 4,798 5.48 687 1,258 1.44 6,056 6.92 689 1,324 1.51 7,380 8.44 690 11 0.01 7,391 8.45 691 1,421 1.62 8,812 10.07 692 1,488 1.70 10,300 11.77 694 1,616 1.85 11,916 13.62 695 1,702 1.95 13,618 15.57 697 1,889 2.16 15,507 17.73 698 1,860 2.13 17,367 19.85 699 5 0.01 17,372 19.86 700 1,989 2.27 19,361 22.13 701 2,039 2.33 21,400 24.46 702 2,249 2.57 23,649 27.03 703 2,218 2.54 25,867 29.57 704 4 0.00 25,871 29.57 705 2,218 2.54 28,089 32.11 706 2,397 2.74 30,486 34.85 707 2,430 2.78 32,916 37.63 708 2,491 2.85 35,407 40.48 709 2,512 2.87 37,919 43.35 710 2,558 2.92 40,477 46.27 712 2,515 2.88 42,992 49.15 713 2,465 2.82 45,457 51.96 714 2,523 2.88 47,980 54.85 715 2,470 2.82 50,450 57.67 716 2,442 2.79 52,892 60.46




Cumulative Percent

717 2,367 2.71 55,259 63.17 718 2,373 2.71 57,632 65.88 719 2,281 2.61 59,913 68.49 720 1 0.00 59,914 68.49 721 2,268 2.59 62,182 71.08 722 2,157 2.47 64,339 73.55 723 2,154 2.46 66,493 76.01 724 2,048 2.34 68,541 78.35 725 1,893 2.16 70,434 80.52 727 1,832 2.09 72,266 82.61 728 1,780 2.03 74,046 84.65 730 1,672 1.91 75,718 86.56 731 1,581 1.81 77,299 88.36 733 1,469 1.68 78,768 90.04 734 1,299 1.48 80,067 91.53 736 1,237 1.41 81,304 92.94 737 1 0.00 81,305 92.94 738 1,147 1.31 82,452 94.25 739 1 0.00 82,453 94.26 740 960 1.10 83,413 95.35 742 903 1.03 84,316 96.39 745 784 0.90 85,100 97.28 747 627 0.72 85,727 98.00 751 583 0.67 86,310 98.66 754 450 0.51 86,760 99.18 759 309 0.35 87,069 99.53 765 226 0.26 87,295 99.79 773 106 0.12 87,401 99.91 774 77 0.09 87,478 100.00



Table D37. Scaled Score Frequency Distributions Spring 2018 – American History


Cumulative Percent

619 24 0.02 24 0.02 622 1 0.00 25 0.02 630 19 0.01 44 0.03 639 40 0.03 84 0.07 641 1 0.00 85 0.07 645 75 0.06 160 0.13 651 123 0.10 283 0.22 653 1 0.00 284 0.22 655 208 0.16 492 0.39 658 2 0.00 494 0.39 659 360 0.28 854 0.67 662 6 0.00 860 0.68 663 586 0.46 1,446 1.14 665 13 0.01 1,459 1.15 667 911 0.72 2,370 1.87 669 12 0.01 2,382 1.88 670 1,232 0.97 3,614 2.85 672 10 0.01 3,624 2.86 673 1,585 1.25 5,209 4.11 675 12 0.01 5,221 4.12 676 1,989 1.57 7,210 5.69 678 2,184 1.72 9,394 7.41 680 17 0.01 9,411 7.42 681 2,407 1.90 11,818 9.32 682 25 0.02 11,843 9.34 683 2,590 2.04 14,433 11.39 685 2,667 2.10 17,100 13.49 687 2,747 2.17 19,847 15.66 689 2,768 2.18 22,615 17.84 691 14 0.01 22,629 17.85 692 2,733 2.16 25,362 20.01 693 2,708 2.14 28,070 22.14 695 2,711 2.14 30,781 24.28 697 2,586 2.04 33,367 26.32 699 2,536 2.00 35,903 28.32 700 11 0.01 35,914 28.33 701 2,574 2.03 38,488 30.36 702 4 0.00 38,492 30.36 703 2,710 2.14 41,202 32.50 704 2,740 2.16 43,942 34.66 706 2,643 2.08 46,585 36.75 707 3 0.00 46,588 36.75




Cumulative Percent

708 2,731 2.15 49,319 38.91 709 2,702 2.13 52,021 41.04 710 6 0.00 52,027 41.04 711 2,716 2.14 54,743 43.18 712 9 0.01 54,752 43.19 713 2,790 2.20 57,542 45.39 714 2,833 2.23 60,375 47.63 715 1 0.00 60,376 47.63 716 2,853 2.25 63,229 49.88 717 2,976 2.35 66,205 52.23 718 4 0.00 66,209 52.23 719 2,946 2.32 69,155 54.55 720 4 0.00 69,159 54.56 721 2,734 2.16 71,893 56.71 722 2,760 2.18 74,653 58.89 723 7 0.01 74,660 58.90 724 2,915 2.30 77,575 61.19 725 7 0.01 77,582 61.20 726 2,954 2.33 80,536 63.53 728 2,994 2.36 83,530 65.89 729 3,158 2.49 86,688 68.38 730 3 0.00 86,691 68.39 731 3,062 2.42 89,753 70.80 733 2,970 2.34 92,723 73.14 735 3,015 2.38 95,738 75.52 737 3,036 2.39 98,774 77.92 739 2,998 2.36 101,772 80.28 741 2,919 2.30 104,691 82.59 743 2,892 2.28 107,583 84.87 746 2,745 2.17 110,328 87.03 748 2,748 2.17 113,076 89.20 751 2,454 1.94 115,530 91.14 754 2,303 1.82 117,833 92.95 757 2,098 1.66 119,931 94.61 760 1,797 1.42 121,728 96.02 764 1,561 1.23 123,289 97.26 768 1 0.00 123,290 97.26 769 1,226 0.97 124,516 98.22 774 903 0.71 125,419 98.94 780 608 0.48 126,027 99.42 788 390 0.31 126,417 99.72 799 233 0.18 126,650 99.91 800 117 0.09 126,767 100.00

Ohio’s State Tests—Spring 2018 Administration Technical Report

E-1 American Institutes for Research

Table E1. Operational Item Parameter Estimates – English Language Arts Grade 3

Item Item Type Item Parameter Estimates Average

Rasch Value Step 1 Step 2 Step 3 Step 4

25862 multipleChoice -0.51358 -0.51358

24759 multipleChoice 0.28037 0.28037

25875 multipleChoice -0.58801 -0.58801

24772 multipleChoice 0.90872 0.90872

24783 multipleChoice 0.45596 0.45596

27066 multipleSelect 0.08976 0.08976

24756 multipleChoice 0.16383 0.16383

31212 multipleChoice -1.76431 -1.76431

31213 multipleChoice -0.7541 -0.7541

31217 multipleChoice -0.34516 -0.34516

31220 multipleChoice -1.68894 -1.68894

31223 multipleChoice, multipleSelect

1.28346 2.161 1.72223

31224 tableMatch 1.70256 1.70256

31664_E textEntryExtendedResponse -0.03933 1.29975 2.62324 3.95269 1.959088

31664_O textEntryExtendedResponse -0.00337 1.24574 2.52953 3.92214 1.92351

31664_C textEntryExtendedResponse -1.09795 0.48455 -0.3067

26907 multipleChoice -0.7537 -0.7537

26923 multipleChoice -0.54825 -0.54825

26938 multipleChoice 0.23929 0.23929

26912 multipleChoice -1.54273 -1.54273

26940 multipleChoice -0.54281 -0.54281

26935 multipleChoice -0.28401 -0.28401


-1.29207 2.74061 0.72427

30441 multipleChoice 0.37338 0.37338

30377 hotTextCustom 1.00733 1.00733

30401 multipleChoice -0.01883 -0.01883

30440 multipleChoice -0.10418 -0.10418

30450 multipleChoice -1.24834 -1.24834

30382 multipleChoice, multipleChoice

0.98573 -0.36036 0.312685

30374 tableMatch 1.13778 1.13778 *Note: The item that include _C, _E, _O are the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration and O is Organization.






26101 multipleChoice -1.47924 -1.47924


0.04984 0.64621 0.348025

26096 multipleChoice -0.64156 -0.64156

26098 multipleChoice -0.16945 -0.16945

26090 multipleChoice -2.31003 -2.31003

26091 hotTextCustom -0.28987 1.38096 0.545545

28240 multipleChoice 0.18033 0.18033

27699 multipleChoice 1.25277 1.25277

27685 multipleChoice -0.34283 -0.34283

27698 multipleChoice -1.30588 -1.30588

28245 multipleChoice 0.27098 0.27098

27704 multipleChoice -0.20895 -0.20895

27693 multipleSelect, multipleChoice

-0.22254 0.48933 0.133395




27297 multipleChoice -0.44982 -0.44982

24671 multipleChoice -0.30015 -0.30015

26875 multipleChoice -0.291 -0.291

24661 multipleChoice -0.10594 -0.10594

24668 multipleChoice -1.30412 -1.30412

24672 multipleChoice 0.31595 0.31595

26874 tableMatch 1.58679 1.58679

26734 multipleChoice -0.00577 -0.00577

26733 multipleSelect 2.48263 2.48263

28370 multipleChoice -0.07815 -0.07815

26726 hotTextCustom -0.24864 -0.24864

26735 multipleSelect 1.71361 1.71361


1.71089 -1.17365 0.26862

*Note: The item that include _C, _E, _O are the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration and O is Organization.






26656 multipleChoice -0.27554 -0.27554


1.93508 -0.42745 0.753815

26667 multipleChoice -0.00665 -0.00665


0.00075 0.37582 0.188285


-0.16897 -0.52732 -0.34815

30753 multipleChoice -0.5249 -0.5249

30757 multipleChoice -0.86081 -0.86081

30748 multipleChoice -0.12624 -0.12624

30756 multipleChoice -1.46763 -1.46763


1.50975 -0.07645 0.71665

30751 multipleChoice 0.81347 0.81347

30755 tableMatch 2.48511 2.48511

30754 multipleChoice 1.04094 1.04094



32035_C textEntryExtendedResponse -2.39768 -1.10781 -1.75275

28309 multipleChoice -1.22741 -1.22741

28308 multipleChoice 0.00366 0.00366

26910 multipleChoice 0.46695 0.46695

26916 hotTextCustom -1.45276 -1.45276

26894 multipleChoice 0.52243 0.52243

26925 multipleSelect 0.34783 0.34783

26902 multipleSelect 0.31747 0.31747

27045 multipleChoice -1.15754 -1.15754

27050 multipleChoice -0.98601 -0.98601

27049 multipleChoice -0.31096 -0.31096

27047 multipleSelect 1.93112 1.93112

27046 multipleChoice 0.10667 0.10667

27040 multipleChoice -1.23815 -1.23815

27048 hotTextCustom -1.3071 0.41946 -0.44382 *Note: The item that include _C, _E, _O are the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration and O is Organization.






27397 multipleChoice 0.57743 0.57743

28200 multipleSelect 1.34037 1.34037

27406 multipleChoice 0.75512 0.75512

28189 multipleChoice 0.72681 0.72681

27376 multipleChoice -2.02554 -2.02554


0.999 -1.07631 -0.03866

28266 multipleChoice -1.47034 -1.47034

27405 tableMatch 1.55259 1.55259

30424 multipleChoice -0.50231 -0.50231

30419 multipleSelect 1.31736 1.31736

30414 multipleSelect 1.67789 1.67789

30411 multipleChoice -0.48177 -0.48177

30443 multipleChoice -2.04217 -2.04217

30428 multipleChoice 0.07108 0.07108

30437 multipleChoice -0.09652 -0.09652

31711_E textEntryExtendedResponse -2.1755 -0.44258 1.24382 2.91468 0.385105

31711_O textEntryExtendedResponse -2.28712 -0.50726 1.03694 2.90727 0.287458


31077 multipleChoice -1.40513 -1.40513


0.86457 -1.99897 -0.5672

31079 multipleChoice -0.97851 -0.97851


0.20393 1.22595 0.71494

31078 multipleChoice -0.45674 -0.45674

31083 multipleChoice -1.84572 -1.84572


1.27041 -1.7974 -0.2635

30596 multipleChoice -0.19419 -0.19419

30793 multipleChoice 0.54423 0.54423

30588 multipleChoice -1.03902 -1.03902


2.35609 -1.43596 0.460065


1.70071 -0.47066 0.615025

31762 multipleChoice 0.15788 0.15788


0.45428 0.6636 0.55894






0.34763 0.03426 0.190945

31752 multipleSelect 0.81296 0.81296

31759 multipleChoice -0.15812 -0.15812

31754 multipleChoice -0.16302 -0.16302

31766_E textEntryExtendedResponse -1.01954 -0.93593 1.79549 2.3993 0.55983


31766_C textEntryExtendedResponse -2.07575 -1.31535 -1.69555 *Note: The item that include _C, _E, _O are the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration and O is Organization.






31023 multipleChoice -0.56586 -0.56586


1.55504 -1.50591 0.024565

31032 multipleSelect 1.14036 1.14036

31021 multipleChoice -0.8639 -0.8639

31024 multipleChoice 0.62635 0.62635


1.2891 -0.00621 0.641445

27064 multipleChoice 0.71433 0.71433

27062 hotTextCustom 0.48573 0.48573

28473 multipleChoice -2.23718 -2.23718

28474 multipleChoice -1.35959 -1.35959

27053 multipleSelect 1.75525 1.75525


0.41915 0.01625 0.2177

27055 multipleSelect 1.21371 1.21371

31594 multipleChoice -0.86628 -0.86628

31599 multipleChoice -2.29263 -2.29263

31595 hotTextCustom 1.43375 1.72133 1.57754

31597 multipleChoice -0.11666 -0.11666

31593 multipleChoice -0.09674 -0.09674

31598 multipleChoice 0.3994 0.3994

31601 tableMatch 0.1095 0.1095




31444 multipleChoice -2.03737 -2.03737

31425 multipleChoice -0.06718 -0.06718

31446 multipleSelect 1.19092 1.19092

31428 multipleChoice -0.6831 -0.6831

31430 multipleChoice -1.45137 -1.45137


-0.20489 0.1525 -0.0262




26956 multipleChoice -0.34722 -0.34722

26961 multipleChoice 0.31368 0.31368

27727 multipleSelect 0.3158 0.3158





26959 multipleChoice -1.87329 -1.87329

27998 multipleChoice 0.06576 0.06576

28023 hotTextCustom 0.61062 0.61062

28211 multipleChoice 0.19803 0.19803

28212 hotTextCustom 0.96254 -0.03328 0.46463

28028 multipleSelect 1.46287 1.46287

28206 multipleSelect 0.24186 0.24186 *Note: The item that include _C, _E, _O are the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration and O is Organization.






31383 multipleChoice -0.55067 -0.55067

31377 multipleChoice -0.32759 -0.32759

31380 multipleChoice -0.48801 -0.48801

31375 multipleChoice -0.1515 -0.1515

31376 multipleSelect 0.72229 0.72229

31390 multipleChoice -0.83735 -0.83735

26682 multipleChoice -0.58563 -0.58563

26689 multipleChoice -0.7992 -0.7992

27629 multipleChoice 0.15096 0.15096


1.7132 2.07288 1.89304


26683 hotTextCustom 0.83429 0.59436 0.714325

26688 multipleSelect 1.32824 1.32824

30901 multipleChoice -2.01057 -2.01057

30897 multipleChoice 0.23526 0.23526

30900 multipleChoice 0.68989 0.68989

30895 multipleChoice 0.60201 0.60201


0.54015 0.83416 0.687155

30898 multipleChoice -0.17447 -0.17447

30902 multipleChoice -0.17224 -0.17224


0.03536 0.34237 0.188865





1.16051 -1.21849 -0.02899

31291 multipleChoice 0.36046 0.36046

31054 multipleChoice -0.58357 -0.58357


1.38379 -1.7015 -0.15886

31056 multipleChoice -2.44654 -2.44654

31057 multipleSelect -0.89718 -0.89718

31053 multipleChoice 0.29311 0.29311

26225 multipleChoice -0.08456 -0.08456

28025 multipleChoice 1.04605 1.04605

26232 multipleChoice -1.33216 -1.33216

26233 multipleChoice -0.3111 -0.3111






-0.6218 0.46594 -0.07793


-0.58882 1.67072 0.54095






Table E7. Operational Item Parameter Estimates – English Language Arts High School I



30951 multipleChoice -0.13614 -0.13614

30957 multipleChoice 0.40764 0.40764

30950 multipleChoice 0.21227 0.21227

30946 multipleChoice 0.46709 0.46709

30953 multipleChoice -1.58898 -1.58898

30954 multipleChoice -1.12329 -1.12329

30958 hotTextCustom 0.93252 0.93252

30956 multipleChoice -0.48663 -0.48663

31556 multipleChoice -0.49819 -0.49819


0.54825 0.66111 0.60468


31560 multipleChoice -0.23619 -0.23619

31567 multipleSelect 2.41602 2.41602


-0.29548 0.56089 0.132705


0.10151 1.23776 0.669635





1.54144 -2.73577 -0.59717

30550 multipleChoice -0.91463 -0.91463

30552 multipleChoice -1.17356 -1.17356

30555 multipleChoice -1.15727 -1.15727

30548 multipleChoice -1.77249 -1.77249


1.86504 -1.19054 0.33725


-0.23161 -0.35943 -0.29552

26334 multipleChoice 0.68089 0.68089

26329 hotTextCustom 0.80981 0.80981

26325 multipleChoice -0.79961 -0.79961

26330 multipleChoice -1.01538 -1.01538

26336 multipleSelect 1.51159 1.51159

30092 multipleChoice 0.58017 0.58017

30095 multipleChoice -0.1724 -0.1724

30097 multipleChoice 0.21786 0.21786






1.25035 0.51591 0.88313

30093 multipleChoice -1.67816 -1.67816


1.25707 0.88235 1.06971

30100 multipleChoice -0.27289 -0.27289






Table E8. Operational Item Parameter Estimates – English Language Arts High School II



25190 multipleChoice -1.38713 -1.38713


1.60918 -1.00029 0.304445

25189 multipleChoice 0.67163 0.67163

25198 multipleChoice 1.32166 1.32166

26637 multipleSelect 1.12953 1.12953

26346 multipleChoice -1.22044 -1.22044

25191 multipleChoice -0.13953 -0.13953

25193 multipleSelect -0.08459 -0.08459

30934 multipleChoice -0.38673 -0.38673

30941 multipleChoice 0.49817 0.49817

30931 multipleChoice -1.26484 -1.26484

30933 multipleChoice -0.02776 -0.02776

30944 multipleChoice -0.48321 -0.48321


0.7873 -0.33296 0.22717

30938 multipleChoice 0.33021 0.33021




27282 multipleChoice -1.274 -1.274

27287 multipleSelect 1.85964 1.85964

27288 multipleChoice -0.45826 -0.45826

27286 multipleChoice 0.23141 0.23141

27280 multipleChoice -1.78291 -1.78291

27290 multipleChoice 0.12689 0.12689

27292 multipleSelect 0.53887 0.53887

25230 multipleChoice 0.41206 0.41206

25229 multipleSelect 2.05923 2.05923

25226 multipleChoice -0.62109 -0.62109


2.90686 -1.7798 0.56353


0.13315 -0.03889 0.04713

27595 multipleChoice -0.42206 -0.42206

31621 multipleChoice -0.2174 -0.2174

31613 multipleChoice -0.70043 -0.70043


1.1151 0.00262 0.55886

31617 multipleChoice -2.13025 -2.13025






0.17067 2.23294 1.201805

31616 multipleChoice -1.48749 -1.48749

31620 multipleChoice 0.75856 0.75856






Table E9. Operational Item Parameter Estimates – Mathematics Grade 3



25661 equation -1.78259 -1.78259

29257 grid -0.70287 -0.70287

23577 multipleChoice -2.1213 -2.1213

28618 grid -0.54177 -0.54177

23836 equation 0.47113 0.47113

28529 grid 0.61526 0.61526


26279 multipleChoice -2.77373 -2.77373

28885 equation 1.3547 1.3547

24514 multipleChoice -1.6448 -1.6448

24373 multipleChoice 0.74207 0.74207

29321 grid 1.72593 1.72593

33654 grid, equation -1.23304 -0.24168 -0.73736

25470 equation -1.63217 -1.63217


29330 grid 1.89706 2.3401 2.11858

24813 tableInput -0.92575 -0.92575

23518 grid -0.84628 0.07584 -0.38522

24511 grid 2.34634 1.53813 1.942235

24369 tableInput -0.39255 -0.39255

25469 multipleChoice -0.15975 -0.15975

23585 equation -0.68999 -0.68999

25124 grid -1.77938 -1.77938

29339 equation -0.81774 -0.81774


26998 multipleChoice -2.61914 -2.61914

29005 multipleSelect -1.27174 -1.27174

23858 equation 1.41295 1.41295

23567 tableInput 0.84694 0.84694

27001 equation -0.55102 -0.55102

28551 multipleChoice -0.68254 -0.68254

29081 equation 0.84854 0.84854

25861 grid -1.41323 -1.41323

23600 equation -1.10248 -1.10248

28709 tableInput 0.67996 2.84826 1.76411

26971 multipleChoice -1.19719 -1.19719

28568 equation 0.9789 0.9789

33653 grid, equation, equation -0.46244 3.32124 1.4294

28526 equation 0.18229 0.18229





28867 grid 1.68554 1.68554

28530 equation 0.5425 0.5425

23597 tableInput -0.40709 -0.40709

23572 equation -1.83296 -1.83296






27445 multipleChoice -1.83466 -1.83466

29186 equation -1.74196 -1.74196

24180 equation -1.39733 -1.39733

24716 multipleSelect -0.79471 -0.79471

27538 multipleSelect -0.53704 -0.53704

29190 grid 0.70104 0.70104

28237 equation 1.03322 1.03322

26379 multipleSelect -0.4354 -0.4354

28762 tableInput 0.4339 0.4339

25755 multipleChoice -0.00197 -0.00197

29193 equation 2.22057 2.22057

27449 equation 0.28056 0.28056

29322 multipleChoice 0.25681 0.25681

29184 equation -0.60709 -0.60709

28774 tableInput 2.13184 2.13184

24349 equation 0.13152 0.13152

29289 multipleChoice 0.04224 0.04224

25518 tableMatch 1.3496 1.3496

24182 equation 0.36665 0.36665

25712 equation 0.83204 0.83204

26151 multipleSelect 0.10503 0.10503

29312 grid -0.41946 -0.41946

29291 equation -0.75822 -0.75822

27402 multipleChoice -0.80318 -0.80318

24020 equation -1.71258 -1.71258

27248 grid -0.86221 -0.86221

25109 equation -0.73885 -0.73885

25884 equation -0.17991 -0.17991

24714 multipleChoice 0.12111 0.12111

25763 equation 0.23493 0.23493


23525 multipleSelect -0.27031 -0.27031

29189 grid -0.64618 -0.64618

25711 grid 2.16269 2.16269

26376 equation 0.18182 0.18182

24510 grid 0.12348 0.46686 0.29517

24801 grid 0.42449 0.42449

29192 equation 1.77962 1.77962

24343 equation 0.08131 0.08131





24188 grid 0.0463 0.0463

27249 multipleChoice 0.16513 0.16513

27947 equation -0.53084 -0.53084

26629 tableMatch 0.80042 0.80042

27946 multipleSelect 0.24841 0.24841

26373 grid 0.00897 0.00897

28776 multipleChoice -0.5088 -0.5088

27399 multipleChoice -0.53706 -0.53706

29191 equation -1.14969 -1.14969






29393 equation -1.17303 -1.17303

29077 grid -2.43274 -2.43274

26469 equation 0.05173 0.05173

26470 equation 0.35295 0.35295

26575 multipleChoice -1.59654 -1.59654

28982 equation 0.75717 0.75717

26473 equation -0.11233 -0.11233

28015 multipleSelect 1.09796 1.09796

29057 equation 1.24738 1.24738

27937 grid -1.44051 -1.44051

33658 equation, equation,

equation 0.00026 1.94693 1.67589 1.207693

29053 equation 0.24494 0.24494

26844 multipleChoice -1.92524 -1.92524

29395 equation 0.10069 0.10069

28073 equation -0.38274 -0.38274

27782 grid -0.08631 0.79111 0.3524

26445 equation 1.27954 1.27954

29475 equation 0.77376 0.77376

26578 multipleSelect 1.30185 1.30185

29063 equation 1.55816 1.55816

29054 grid 0.1345 0.1345

28010 equation -1.86763 -0.31855 -1.09309

29055 equation -1.1243 -1.1243

26560 equation -0.92813 -0.92813

29469 tableInput -0.75977 -0.75977

27731 equation -0.68964 -0.68964

26256 tableInput -1.1864 -1.1864

27926 equation 1.00069 1.00069

28076 multipleChoice -1.33085 -1.33085

26447 multipleChoice -0.13752 -0.13752

28316 equation 2.5563 2.5563

26721 multipleChoice 0.09168 0.09168

26891 equation -0.19235 -0.19235

27973 grid 0.04739 0.04739

29059 equation 1.64473 1.64473

28314 multipleChoice 0.41186 0.41186

27348 equation 1.29193 1.29193

27236 multipleSelect -0.34515 -0.34515





27744 equation 1.37688 1.37688

29390 equation 1.07762 1.07762

28981 multipleSelect -0.38154 -0.38154

29394 equation -0.00655 -0.00655

28170 equation 0.25053 0.25053

28072 multipleChoice -2.22454 -2.22454

26454 grid -0.49473 -0.49473

27116 equation -0.62614 -0.62614






25154 equation -1.52018 -1.52018

26286 multipleChoice -2.5914 -2.5914

33662 multipleChoice, multipleChoice, multipleChoice

-1.9145 0.22474 2.41447 0.24157

24377 tableInput -0.8363 -0.8363

28983 equation -0.65832 -0.65832

29176 equation 0.23555 0.23555

28669 equation 1.96613 0.94358 1.454855

28883 equation -0.31261 -0.31261

28704 equation 1.0977 1.0977

25423 equation 1.87181 1.87181

28989 multipleChoice -0.24068 -0.24068

25426 grid -0.96464 -0.96464

26980 equation 0.63065 0.63065

25484 equation 2.00575 2.00575

29308 multipleSelect -0.40145 -0.40145

28513 multipleChoice 1.52327 1.52327

29045 multipleChoice -0.67372 -0.67372

26292 multipleChoice -1.00471 -1.00471

28578 equation 2.49015 2.49015

28833 multipleSelect -0.13065 -0.13065

26984 multipleSelect -0.4251 -0.4251

28565 equation -0.68866 -0.68866

28994 equation -0.70868 -0.70868

29042 multipleChoice -1.53868 -1.53868

24382 equation -1.62708 -1.62708

27095 equation -2.5287 -2.5287

25155 equation -0.47653 -0.47653

24473 multipleChoice -0.74595 -0.74595

23819 equation -0.00324 -0.00324

29265 multipleChoice 2.46168 2.46168

28528 equation -0.17241 -0.17241

25419 equation 0.21512 0.21512

28995 multipleChoice 0.63867 0.63867

24383 equation 0.67979 0.67979

25485 equation 2.3476 2.3476

27091 multipleChoice -1.76916 -1.76916

33661 equation, grid -0.5282 0.4976 -0.0153






24760 multipleChoice 0.00616 0.00616

23834 multipleSelect 2.39386 2.39386

33663 equation, tableInput,

equation 0.15628 0.82401 2.27284 1.084377

27925 equation -0.93773 -0.93773

28541 equation -1.24736 0.85404 -0.19666

28745 equation -0.88921 -0.88921

25417 equation -1.34438 -1.34438

28990 multipleChoice -0.27495 -0.27495






24031 multipleChoice -1.93977 -1.93977

27865 equation -0.82288 -0.82288

25773 multipleChoice -1.02455 -1.02455

33666 equation, grid, equation -0.30161 2.54453 -0.03032 0.737533

24729 multipleChoice -0.59856 -0.59856

29424 equation -0.70083 -0.70083

24448 equation 0.42042 0.42042

28744 equation 1.31492 1.31492

26384 multipleChoice 0.15613 0.15613

26594 equation 0.47133 0.47133

28806 multipleChoice 0.29465 0.29465

29284 grid 0.29375 0.29375

28823 equation -0.15624 -0.15624

28798 multipleChoice 0.02391 0.02391

29438 equation -0.16596 -0.16596

33667 multipleChoice, tableMatch,

equation -0.19441 1.55301 3.46423 1.60761

29414 multipleSelect 0.13742 0.13742

25765 multipleChoice -0.44495 -0.44495

29490 multipleChoice -0.15851 -0.15851

29136 equation 2.15782 2.15782

25691 multipleChoice -1.28556 -1.28556

28795 equation -1.87004 0.99348 -0.43828

27541 multipleChoice -1.42701 -1.42701

27597 equation -0.02817 -0.02817

33664 grid, multipleChoice,

equation -0.77251 0.83918 1.70761 0.591427

26391 equation -0.09251 -0.09251

27394 equation 0.25139 0.25139

25771 equation 1.3479 1.3479

24022 tableMatch 0.3224 0.3224

29607 equation 0.12659 0.12659

26383 equation -0.28778 -0.28778

26191 equation -0.14311 -0.14311

29474 multipleChoice -1.49796 -1.49796

24161 equation 0.13008 0.13008

29354 equation -0.56223 0.74649 0.09213

27483 equation -0.53548 -0.53548

27826 tableInput -1.04169 -1.04169





26271 equation 1.09318 1.09318

26440 equation 1.12143 1.12143

24725 equation 0.36639 0.36639

25770 multipleChoice -0.06662 -0.06662

26242 equation 0.52242 0.52242

29491 equation 0.81976 0.81976

29137 equation 1.92173 1.92173






28125 multipleChoice -2.04461 -2.04461

27942 equation -2.70256 -2.70256

26591 equation 0.29072 0.29072

26861 tableInput 0.1606 0.1606

26563 multipleChoice -1.81844 -1.81844

29623 multipleChoice -0.09442 -0.09442

27327 equation 1.39545 1.39545

25500 multipleChoice -1.14419 -1.14419

29540 equation 0.44009 0.44009

26900 tableInput 0.65227 0.65227


26463 multipleChoice 0.05963 0.05963

28101 multipleChoice -1.0249 -1.0249

29585 equation 2.36331 2.36331

29016 multipleChoice -0.09024 -0.09024

29850 equation 0.8892 0.8892

29582 multipleChoice -1.51651 -1.51651

27979 equation 1.82285 1.82285

28134 equation -0.78704 -0.78704

27954 equation -1.68922 2.72496 0.51787

29931 equation 2.37478 2.37478

29539 equation 1.31319 1.31319

29544 grid -1.12358 -1.12358

26474 multipleChoice -1.35419 -1.35419

29528 equation -1.14941 -1.14941

29580 equation -2.73721 -2.73721

27994 multipleChoice -0.59947 -0.59947

28225 equation 0.30136 0.30136

29937 multipleChoice -2.13658 -2.13658

27108 equation -1.16486 -1.16486

29108 grid 1.52522 2.03748 1.78135

26565 multipleChoice -0.02511 -0.02511

26459 equation 0.74127 0.74127

29583 multipleChoice -0.93428 -0.93428

27790 tableInput 1.52332 1.52332

30242 equation 1.66703 1.66703

29017 multipleChoice 0.34282 0.34282

27471 equation 1.05664 1.05664

26738 grid -1.38276 0.8594 -0.26168





28033 multipleSelect 0.90768 0.90768

26261 equation 2.36342 2.36342

28130 multipleChoice -2.41372 -2.41372

30245 equation 2.05957 2.05957

29549 equation 0.61948 0.61948

27997 equation 1.04529 1.04529

30235 multipleSelect 1.90899 1.90899

28224 multipleSelect 0.26013 0.26013

29164 equation 1.9191 1.9191

29542 multipleChoice -0.23869 -0.23869

29579 multipleChoice -2.21812 -2.21812



Table E15. Operational Item Parameter Estimates – Algebra



27243 equation -2.08738 -2.08738

27130 multipleChoice -0.67775 -0.67775

24154 multipleSelect -0.53029 -0.53029


26986 multipleChoice -0.26204 -0.26204

30264 tableInput 0.53619 0.53619

25159 multipleChoice 0.30613 0.30613

26181 tableInput -0.30715 2.32257 1.00771

24026 multipleChoice -0.63474 -0.63474

30273 equation 0.22863 0.22863

29708 equation 0.08013 0.21808 0.149105

29800 tableInput -0.16107 -0.87565 -0.51836

26418 multipleChoice 0.58758 0.58758

25440 equation -0.36289 -0.36289

25790 equation -0.2077 3.28148 1.53689

25442 multipleChoice 0.11279 0.11279

29778 multipleChoice -0.10838 -0.10838

25697 multipleChoice -0.7898 -0.7898

33646 hotTextSelectable, equation,

equation 0.41982 0.61361 0.516715

28633 equation 1.70328 1.70328

29328 grid 0.7869 0.7869

28997 multipleChoice -0.53511 -0.53511

25480 multipleChoice -0.62439 -0.62439

29277 multipleChoice -1.08298 -1.08298

26704 equation -1.54533 -1.54533

26876 multipleChoice -1.80846 -1.80846

26989 multipleChoice -1.59233 -1.59233

28856 equation -1.30973 -1.30973

28870 equation 1.86405 0.8173 1.340675

30269 equation -0.04757 -0.04757

29674 multipleChoice -0.04274 -0.04274

28118 equation 0.49438 0.49438

24396 equation 1.61637 1.61637

28643 multipleChoice 0.98869 0.98869

29207 multipleChoice -0.01165 -0.01165

29935 tableMatch -0.90445 -0.90445

29836 equation 2.33268 2.33268

30539 tableInput 1.08831 1.08831





29492 multipleChoice -0.2961 -0.2961

25886 multipleChoice -0.52299 -0.52299

28720 multipleChoice 0.28116 0.28116

24638 equation 1.94598 1.94598

27083 equation -0.82419 -0.82419

29287 equation -0.01199 2.69615 1.34208

26426 multipleChoice -0.50014 -0.50014

26985 multipleChoice -0.37111 -0.37111

28566 multipleChoice -0.28946 -0.28946



Table E16. Operational Item Parameter Estimates – Geometry



28646 hotTextCustom -1.99619 -1.99619

26436 multipleChoice -0.85378 -0.85378

29797 multipleChoice -0.70297 -0.70297

29769 multipleChoice -0.15812 -0.15812

24834 grid 2.56981 2.56981

29516 equation 0.49399 0.49399

27506 equation 0.98612 0.98612

26255 equation 0.7528 0.7528

26070 equation 2.56238 2.56238

29807 multipleChoice 0.19627 0.19627

27490 equation 1.86812 1.86812

29409 equation 0.39963 0.39963

27246 equation -0.92409 -0.92409

24457 multipleChoice 0.24553 0.24553

27703 equation 1.95523 1.95523

27621 multipleChoice 0.73628 0.73628

29288 equation 1.15982 1.15982

26363 equation 1.62716 1.62716

29753 equation -1.11056 -1.11056

26086 equation 1.15562 1.15562

29893 hotTextCustom 2.14979 2.14979

26709 equation 1.00788 1.00788

27466 equation -0.14625 -0.14625

28937 multipleChoice -0.49776 -0.49776

29841 multipleChoice -0.53051 -0.53051

24167 multipleChoice -0.48169 -0.48169

24737 hotTextCustom -0.75547 -0.75547

26444 multipleChoice -1.25539 -1.25539

27505 equation -0.01091 -0.01091

26254 equation 0.5252 0.5252

28722 multipleChoice -2.15188 -2.15188

27679 equation 0.53601 0.53601

29523 equation 0.06605 1.258 0.662025

30257 equation 2.18607 2.18607

26441 multipleChoice 0.36686 0.36686

28964 equation 0.977 0.977

27495 equation 0.57981 0.57981

25702 equation 1.36028 1.36028

29884 hotTextCustom 1.925 1.925





29500 equation 1.28158 1.28158

26083 equation 0.28851 0.28851

29770 equation 0.75335 0.75335

26397 equation 0.92528 0.92528

29359 hotTextCustom 0.64314 3.08517 1.864155

25784 equation 1.72064 3.50211 2.611375

27300 equation 1.24809 1.24809

28108 multipleSelect -0.11239 -0.11239

29669 tableInput 0.36336 0.36336

27070 multipleChoice -0.11106 -0.11106

28095 hotTextCustom -1.27425 0.13845 -0.5679

28685 multipleChoice -1.0984 -1.0984



Table E17. Operational Item Parameter Estimates – Integrated Mathematics I



27243 equation -2.08738 -2.08738

28646 hotTextCustom -1.99619 -1.99619

28856 equation -1.30973 -1.30973

27130 multipleChoice -0.67775 -0.67775

29674 multipleChoice -0.04274 -0.04274

25440 equation -0.36289 -0.36289

25159 multipleChoice 0.30613 0.30613


24451 multipleSelect 1.36135 1.36135

26418 multipleChoice 0.58758 0.58758

33646 hotTextSelectable, equation,

equation 0.41982 0.61361 0.516715

24396 equation 1.61637 1.61637

29328 grid 0.7869 0.7869

28929 multipleChoice -0.21343 -0.21343

28118 equation 0.49438 0.49438

29836 equation 2.33268 2.33268

27070 multipleChoice -0.11106 -0.11106

29287 equation -0.01199 2.69615 1.34208

26426 multipleChoice -0.50014 -0.50014

26986 multipleChoice -0.26204 -0.26204

24457 multipleChoice 0.24553 0.24553

24154 multipleSelect -0.53029 -0.53029

29800 tableInput -0.16107 -0.87565 -0.51836

26876 multipleChoice -1.80846 -1.80846

25697 multipleChoice -0.7898 -0.7898

26989 multipleChoice -1.59233 -1.59233

29935 tableMatch -0.90445 -0.90445

24737 hotTextCustom -0.75547 -0.75547

26086 equation 1.15562 1.15562

29277 multipleChoice -1.08298 -1.08298

29778 multipleChoice -0.10838 -0.10838

28095 hotTextCustom -1.27425 0.13845 -0.5679

30264 tableInput 0.53619 0.53619

29708 equation 0.08013 0.21808 0.149105

26181 tableInput -0.30715 2.32257 1.00771

30273 equation 0.22863 0.22863

28675 multipleChoice -0.00596 -0.00596

29718 equation 1.70761 1.70761





26985 multipleChoice -0.37111 -0.37111

27004 multipleChoice -0.04457 -0.04457

28633 equation 1.70328 1.70328

28870 equation 1.86405 0.8173 1.340675

28108 multipleSelect -0.11239 -0.11239

25886 multipleChoice -0.52299 -0.52299

28997 multipleChoice -0.53511 -0.53511

25480 multipleChoice -0.62439 -0.62439

27083 equation -0.82419 -0.82419



Table E18. Operational Item Parameter Estimates – Integrated Mathematics II



26704 equation -1.54533 -1.54533

26436 multipleChoice -0.85378 -0.85378

27506 equation 0.98612 0.98612

28720 multipleChoice 0.28116 0.28116

29409 equation 0.39963 0.39963

28685 multipleChoice -1.0984 -1.0984

28897 equation 0.92233 0.92233

28643 multipleChoice 0.98869 0.98869

26255 equation 0.7528 0.7528

29492 multipleChoice -0.2961 -0.2961

30539 tableInput 1.08831 1.08831

29516 equation 0.49399 0.49399

29529 multipleChoice 0.73545 0.73545

27703 equation 1.95523 1.95523

28566 multipleChoice -0.28946 -0.28946

29288 equation 1.15982 1.15982

24639 equation 2.74918 2.74918

29548 multipleChoice 1.01781 1.01781

29938 equation -1.44802 -1.44802

26363 equation 1.62716 1.62716

24394 multipleSelect 1.29338 1.29338

26709 equation 1.00788 1.00788

29207 multipleChoice -0.01165 -0.01165

23535 hotTextCustom 1.51424 1.51424

24647 multipleChoice -0.67907 -0.67907

28945 grid 0.1539 0.1539

29841 multipleChoice -0.53051 -0.53051

26251 equation -2.10336 -2.10336

28923 multipleChoice -0.4526 -0.4526

29875 multipleChoice -0.11267 -0.11267

29753 equation -1.11056 -1.11056

26444 multipleChoice -1.25539 -1.25539

26254 equation 0.5252 0.5252

25790 equation -0.2077 3.28148 1.53689

23844 multipleChoice -0.09876 -0.09876

29500 equation 1.28158 1.28158

27679 equation 0.53601 0.53601

24026 multipleChoice -0.63474 -0.63474

30257 equation 2.18607 2.18607





26441 multipleChoice 0.36686 0.36686

28850 multipleChoice -0.04539 -0.04539

29924 hotTextCustom 2.68437 2.68437

28676 equation 2.49616 2.49616

29769 multipleChoice -0.15812 -0.15812

29157 multipleChoice 0.57159 0.57159

23589 equation 3.60937 1.43799 2.52368

29807 multipleChoice 0.19627 0.19627

29770 equation 0.75335 0.75335

23534 grid 1.41159 1.41159

29669 tableInput 0.36336 0.36336

28677 multipleChoice 0.41899 0.41899

29457 multipleChoice -1.00169 -1.00169



Table E19. Operational Item Parameter Estimates – Science Grade 5



29203 multipleChoice -1.58109 -1.58109

21159 multipleChoice -1.6228 -1.6228

17690 multipleChoice -0.7568 -0.7568

17691 multipleChoice -1.00261 -1.00261

17752 multipleChoice -1.1029 -1.1029

21865 grid -0.94277 0.23491 -0.35393

18570 multipleChoice 0.04933 0.04933


0.92422 -0.0561 0.43406

17761 multipleChoice -0.68586 -0.68586

15238 grid -0.1924 1.51032 0.65896

17704 multipleChoice -0.96725 -0.96725


1.83093 1.83093

21279 multipleChoice -0.38782 -0.38782

20413 multipleChoice 0.15027 0.15027

14351 grid 2.88089 2.88089

16259 grid 1.61871 1.61871

15020 multipleChoice -0.93952 -0.93952

19935 tableMatch -0.56749 -0.56749


19079 multipleChoice 0.17869 0.17869

15789 grid 0.07965 0.07965

17730 multipleChoice -1.24531 -1.24531

16063 multipleChoice -0.77081 -0.77081


0.67127 0.67127

21347 multipleSelect 2.30552 2.30552

20420 multipleChoice -0.30901 -0.30901

28835 grid -1.0238 0.44349 -0.29016

28832 tableMatch -0.45956 -0.45956

28834 multipleSelect -0.82699 -0.82699

29208 tableInput 0.21758 0.21758


-0.2408 0.03736 -0.10172

21474 multipleChoice -1.54729 -1.54729


2.4077 -0.35718 1.02526

21268 multipleChoice -0.79015 -0.79015

14372 grid 1.04096 1.04096





20411 multipleChoice -0.18911 -0.18911

21138 multipleChoice -0.29587 -0.29587


1.55777 1.55777

21333 multipleChoice 1.01285 1.01285

14337 grid 0.21771 0.21771


2.72563 2.72563



0.92152 1.3126 1.11706

28763 multipleChoice -0.01389 -0.01389

15913 multipleChoice -1.69533 -1.69533

18576 multipleChoice -1.55166 -1.55166

17763 multipleChoice -1.6485 -1.6485

21128 multipleChoice -1.42229 -1.42229



Table E20. Operational Item Parameter Estimates – Science Grade 8



18566 multipleChoice -2.06775 -2.06775

17016 multipleChoice -2.04356 -2.04356

30279 multipleChoice -1.30959 -1.30959

16476 multipleChoice -1.71798 -1.71798

22205 grid -0.131 -0.78732 -0.45916

30369 multipleSelect -1.09285 -1.09285

17919 multipleChoice -0.05924 -0.05924

15821 grid -0.94453 -0.94453

19839 multipleChoice -0.69399 -0.69399

16397 grid 0.43831 0.43831

17925 multipleChoice -0.71846 -0.71846

16523 multipleChoice -0.7211 -0.7211

19013 tableMatch 0.56419 0.56419

15796 multipleChoice -0.47319 -0.47319

16747 grid -1.94559 -1.01263 -1.47911

29514 tableMatch 0.34591 0.34591

16521 multipleChoice 0.22712 0.22712

18523 multipleChoice 0.28258 0.28258

22294 grid -0.14883 -0.14883

17899 multipleChoice -0.87987 -0.87987

19802 multipleChoice 0.47772 0.47772

22226 grid -0.26399 0.87187 0.30394

17455 multipleChoice -0.40412 -0.40412

21227 multipleSelect 0.36309 0.36309

15274 grid 1.47057 -0.76461 0.35298

21235 multipleChoice -0.34254 -0.34254


16937 grid 2.16866 2.16866

19123 tableMatch 2.44517 2.44517

29902 multipleSelect 1.56687 1.56687

21248 multipleChoice -0.83731 -0.83731


1.25318 1.25318

22204 grid 0.65711 0.65711

18526 multipleChoice -2.03158 -2.03158

18527 multipleChoice -0.64404 -0.64404

29946 tableInput 2.23124 2.23124

21226 multipleSelect 0.31853 0.31853

19040 tableInput 1.27997 1.27997





17843 multipleChoice -0.54066 -0.54066

17820 multipleChoice 0.17989 0.17989

19006 tableMatch 2.16307 2.16307

15192 grid 2.04999 1.71399 1.88199

22145 multipleSelect, multipleSelect

0.70019 0.70019

17436 multipleChoice 0.01653 0.01653

14436 grid 1.49465 1.49465


1.91586 1.91586

15754 multipleChoice -0.95311 -0.95311

15873 multipleChoice -1.8218 -1.8218

18567 multipleChoice -1.85211 -1.85211

18543 multipleChoice -2.0248 -2.0248



Table E21. Operational Item Parameter Estimates – Biology



17928 multipleChoice -1.76541 -1.76541

22092 multipleChoice -0.97475 -0.97475

20226 multipleChoice -0.65732 -0.65732


0.17693 -1.27173 -0.5474

16102 grid -0.77951 -0.77951


-0.34335 -1.19265 -0.768

16505 multipleChoice 0.32597 0.32597


0.26213 0.26213

16803 grid -0.72136 -0.72136

16812 multipleChoice 0.50776 0.50776

15525 grid 1.19633 0.43382 0.815075

15171 grid 1.40587 1.40587

16358 grid 0.17575 0.17575

21844 multipleChoice 0.34955 0.34955

16251 grid 1.36382 -0.77612 0.29385


0.65389 0.00279 0.32834

16754 grid -0.26577 -0.26577

16741 multipleChoice -0.15714 -0.15714

16847 multipleChoice -0.79144 -0.79144

16842 grid 0.4727 0.4727

16961 multipleChoice 0.39063 0.39063

19913 grid -1.74706 -0.6623 -1.20468


-0.98275 0.78479 -0.09898


0.62346 0.06442 0.34394

20381 tableInput 0.65528 0.65528

22040 grid 0.82054 0.83554 0.82804

16849 multipleChoice 0.33521 0.33521

16851 textEntryNaturalLanguage -0.04618 -0.04618

17462 multipleChoice 0.46589 0.46589

21927 tableMatch 0.80675 0.80675


0.39559 0.39559

17932 multipleChoice -0.32609 -0.32609

17933 multipleChoice -1.18548 -1.18548





17935 multipleChoice -0.86944 -0.86944

15237 grid -0.48652 -0.60252 -0.54452

22060 tableMatch 1.12187 1.12187

16388 grid 0.22765 0.22765

28630 multipleChoice 0.08316 0.08316



22055 grid -2.08387 4.22915 1.07264

17924 multipleChoice 0.46803 0.46803

20269 multipleChoice -0.17137 -0.17137

15761 multipleChoice -0.38141 -0.38141

16498 multipleChoice -0.47677 -0.47677



Table E22. Operational Item Parameter Estimates – Physical Science



17512 multipleChoice -1.37535 -1.37535

20672 multipleChoice -1.25898 -1.25898

18517 multipleChoice -1.38934 -1.38934

18583 multipleChoice -1.14114 -1.14114

18518 multipleChoice -0.93137 -0.93137

19072 multipleChoice -1.22237 -1.22237

18988 tableMatch 0.51899 0.51899


1.98437 -0.26886 0.857755

21470 multipleSelect 0.94195 0.94195

20396 tableMatch 1.27228 1.27228

16809 grid 0.60826 1.21914 0.9137

14507 grid 0.80053 0.80053

20579 simulation -0.97698 -0.32934 -0.65316


1.07005 1.34451 1.20728


17800 textEntrySimple 0.88271 0.27587 0.57929

21427 multipleSelect 1.04038 1.04038

20103 multipleChoice 0.51331 0.51331

21335 multipleChoice 0.06696 0.06696

16732 grid 0.7124 -0.57406 0.06917

21232 multipleChoice -0.3057 -0.3057

17425 multipleChoice -1.0752 -1.0752

16220 grid -1.49964 -1.49964

19466 multipleChoice -0.93302 -0.93302

21555 multipleChoice -0.62627 -0.62627

21410 multipleChoice -0.11986 -0.11986

21495 multipleChoice -0.0688 -0.0688


20488 tableInput 0.90325 0.90325

20996 multipleSelect -0.05922 -0.05922

18484 textEntrySimple 0.76281 -1.34063 -0.28891

19063 tableMatch 1.17735 1.17735

20998 multipleSelect 1.75409 1.75409

19173 textEntrySimple 0.83551 0.48433 0.65992

19926 multipleSelect 1.47008 1.47008

19530 textEntrySimple -0.13462 -0.20426 0.14279 1.2915 0.273853

15432 grid -0.98531 -0.98531





16751 grid -0.75315 -0.75315

14723 grid -0.98096 -0.40658 -0.69377

21000 multipleSelect 1.21358 1.21358

20723 multipleChoice -1.01753 -1.01753

21511 multipleChoice -0.90422 -0.90422

21092 multipleChoice -1.22478 -1.22478

15655 multipleChoice -1.29953 -1.29953



Table E25. Operational Item Parameter Estimates – American Government



20795 multipleChoice -1.18425 -1.18425

16468 multipleChoice -1.7603 -1.7603

16129 multipleChoice 0.14344 0.14344

20793 multipleChoice -0.0317 -0.0317

14957 grid 0.73866 0.33396 0.53631

18884 tableMatch -1.25458 -1.25458

19294 multipleChoice 0.18226 0.18226


1.85196 -1.02566 0.41315

20964 multipleChoice -0.76606 -0.76606

20777 multipleChoice -0.99384 -0.99384

20801 multipleChoice 0.73048 0.73048

16345 multipleChoice 0.23582 0.23582

19289 tableMatch -0.7507 -0.7507


1.42078 0.4145 0.91764

14573 grid 2.42041 -1.33447 0.54297

21964 hotTextCustom -0.26174 -0.26174

30716 multipleChoice -0.73916 -0.73916



1.61641 0.16787 0.89214


-0.89065 1.48073 0.29504

15741 grid 0.33478 0.43615 0.385465


0.70423 -0.15814 0.273045


-0.41538 -0.79806 -0.60672

21952 multipleChoice -0.11643 -0.11643

16821 grid -2.09801 2.92869 0.41534

19724 multipleSelect 1.36051 1.36051


0.22403 -0.90413 -0.34005


2.07563 -0.86449 0.60557

20949 multipleChoice -1.30796 -1.30796


1.73655 -1.75449 -0.00897






0.80401 1.71007 1.25704

20849 multipleChoice -0.47215 -0.47215


0.51228 1.13658 0.82443

19045 multipleSelect 1.16235 1.16235


0.58391 -0.21505 0.18443

19871 multipleChoice -0.84131 -0.84131

19812 multipleChoice 0.71513 0.71513

20868 multipleChoice 0.86757 0.86757


1.26671 -1.24791 0.0094

14955 grid 0.15002 -2.40744 -1.12871


2.6407 -0.92816 0.85627

21780 tableMatch 0.05067 0.05067

18902 tableMatch 0.65411 0.65411

19894 multipleChoice -1.96274 -1.96274



Table E26. Operational Item Parameter Estimates – American History



20883 multipleChoice -0.67498 -0.67498

21460 multipleChoice 0.10604 0.10604

18036 multipleChoice -0.22318 -0.22318

18973 multipleSelect 0.95043 0.95043


2.95645 -1.52593 0.71526

19326 multipleChoice 0.25066 0.25066

29911 multipleChoice -0.42785 -0.42785

29957 multipleChoice -0.36061 -0.36061


-1.02121 1.83121 0.405

29964 multipleChoice 0.49904 0.49904

29961 multipleChoice -0.44507 -0.44507


1.81701 1.13552 1.476265


0.00044 -0.21772 -0.10864

14828 grid -2.04957 1.43645 -0.30656

19389 multipleSelect 0.98877 0.98877

21576 multipleChoice -0.55486 -0.55486

21141 multipleChoice -1.62106 -1.62106

20774 multipleChoice -0.19826 -0.19826

18003 multipleChoice -1.07938 -1.07938


1.58199 -0.84331 0.36934

21441 multipleChoice -0.41929 -0.41929


1.67627 -0.63645 0.51991

21136 multipleChoice -1.03587 -1.03587

15232 grid 0.35952 -0.87663 -0.25856

18016 multipleChoice -0.27497 -0.27497

19526 multipleSelect 0.72824 0.72824

28733 multipleChoice -0.1197 -0.1197

18139 multipleChoice -0.01626 -0.01626

18081 multipleChoice 0.19987 0.19987

14784 grid -0.23448 0.50972 0.13762

16047 multipleChoice 0.21402 0.21402

18156 multipleChoice -0.82391 -0.82391

18191 multipleChoice -0.76695 -0.76695

18093 multipleChoice -0.41319 -0.41319





18920 hotTextCustom -0.03893 -0.03893

18178 multipleChoice -0.22247 -0.22247

21384 multipleChoice 0.80694 0.80694


-0.03477 1.91723 0.94123

18102 multipleChoice -0.28109 -0.28109


0.80219 -0.27797 0.26211

16051 multipleChoice -0.41845 -0.41845

20900 multipleChoice -0.16332 -0.16332

16696 grid -1.35901 1.08607 -0.13647

18140 multipleChoice -0.63278 -0.63278

20749 multipleChoice 0.06319 0.06319

18065 multipleChoice -0.34193 -0.34193

20933 multipleChoice 0.83768 0.83768


0.79864 0.243 0.52082

16783 grid 0.02495 0.02495

21326 multipleChoice -0.28605 -0.28605


F-1 American Institutes for Research

Table F. OST Performance Standards – Spring 2018

Basic Proficient Accelerated Advanced

Test Theta Scaled Score

Theta Scaled Score

Theta Scaled Score

Theta Scaled Score

ELA

Grade 3 -0.70 672 -0.09 700 0.46 725 1.06 752

Grade 4 -0.56 674 0.06 700 0.65 725 1.32 753

Grade 5 -0.74 669 0.00 700 0.59 725 1.29 755

Grade 6 -0.83 668 -0.07 700 0.52 725 1.14 751

Grade 7 -0.80 670 -0.01 700 0.65 725 1.29 749

Grade 8 -0.43 682 0.15 700 0.95 725 1.55 744

EOC ELA I -0.71 683 -0.11 700 0.79 725 1.31 739

EOC ELA II -0.77 679 -0.08 700 0.75 725 1.30 742

Mathematics

Grade 3 -0.61 683 -0.08 700 0.68 725 1.53 753

Grade 4 -1.05 686 -0.61 700 0.15 725 1.19 759

Grade 5 -1.05 687 -0.54 700 0.43 725 1.35 749

Grade 6 -0.83 682 -0.12 700 0.89 725 1.65 744

Grade 7 -0.76 684 -0.19 700 0.68 725 1.74 755

Grade 8 -0.69 690 -0.18 700 1.06 725 2.00 744

Algebra -1.21 682 -0.57 700 0.32 725 1.37 754

Geometry -1.63 678 -0.89 700 -0.04 725 1.01 756

Int Math I -1.15 682 -0.52 700 0.37 725 1.42 754

Int Math II -1.37 677 -0.63 700 0.17 725 1.23 758

Science

Grade 5 -0.92 664 -0.04 700 0.57 725 1.25 753

Grade 8 -1.14 674 -0.51 700 0.09 725 1.08 766

Biology -1.19 685 -0.67 700 0.18 725 0.51 735

Physical Science

-1.56 684 -0.94 700 0.02 725 0.95 749

Social Studies

American History

-0.98 684 -0.37 700 0.60 725 1.12 738

American Government

-1.11 687 -0.41 700 0.92 725 1.66 739

Ohio’s State Tests — Fall 2017 Administration & Spring 2018 Administration Technical Report

G-1 American Institutes for Research

Table G1. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 Reading

Raw Score

Ohio Theta

Before Ohio Rounding/Truncation

Scaled Score

After Ohio Rounding/Truncation

Scaled Score


Proficiency 0 ‐3.50 545 545 Promotion No 1 ‐3.50 545 545 Promotion No 2 ‐3.01 567 567 Promotion No 3 ‐2.56 588 588 Promotion No 4 ‐2.22 603 603 Promotion No 5 ‐1.95 616 616 Promotion No 6 ‐1.72 626 626 Promotion No 7 ‐1.52 635 635 Promotion No 8 ‐1.34 643 643 Promotion No 9 ‐1.18 651 651 Promotion No 10 ‐1.03 657 657 Promotion No 11 ‐0.89 664 664 Promotion No 12 ‐0.75 670 672 Promotion Yes 13 ‐0.63 676 676 Promotion Yes 14 ‐0.50 681 681 Promotion Yes 15 ‐0.38 687 687 Promotion Yes 16 ‐0.27 692 692 Promotion Yes 17 ‐0.15 697 697 Promotion Yes 18 ‐0.04 702 702 Promotion Yes 19 0.08 708 708 Promotion Yes 20 0.19 713 713 Promotion Yes 21 0.31 718 718 Promotion Yes 22 0.43 724 725 Promotion Yes 23 0.56 729 729 Promotion Yes 24 0.69 735 735 Promotion Yes 25 0.82 741 741 Promotion Yes 26 0.96 748 748 Promotion Yes 27 1.11 755 755 Promotion Yes 28 1.27 762 762 Promotion Yes 29 1.44 770 770 Promotion Yes 30 1.62 778 778 Promotion Yes 31 1.82 787 787 Promotion Yes 32 2.03 796 796 Promotion Yes 33 2.26 807 807 Promotion Yes 34 2.51 818 818 Promotion Yes 35 2.80 831 831 Promotion Yes 36 3.12 846 846 Promotion Yes 37 3.50 863 863 Promotion Yes 38 3.50 863 863 Promotion Yes 39 3.50 863 863 Promotion Yes



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 3.50 863 863 Promotion Yes



Table G2. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 ELA Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.01 567 567 Limited 3 ‐2.56 588 588 Limited 4 ‐2.22 603 603 Limited 5 ‐1.95 616 616 Limited 6 ‐1.72 626 626 Limited 7 ‐1.52 635 635 Limited 8 ‐1.34 643 643 Limited 9 ‐1.18 651 651 Limited 10 ‐1.03 657 657 Limited 11 ‐0.89 664 664 Limited 12 ‐0.75 670 672 Basic 13 ‐0.63 676 676 Basic 14 ‐0.50 681 681 Basic 15 ‐0.38 687 687 Basic 16 ‐0.27 692 692 Basic 17 ‐0.15 697 697 Basic 18 ‐0.04 702 702 Proficient 19 0.08 708 708 Proficient 20 0.19 713 713 Proficient 21 0.31 718 718 Proficient 22 0.43 724 725 Accelerated 23 0.56 729 729 Accelerated 24 0.69 735 735 Accelerated 25 0.82 741 741 Accelerated 26 0.96 748 748 Accelerated 27 1.11 755 755 Advanced 28 1.27 762 762 Advanced 29 1.44 770 770 Advanced 30 1.62 778 778 Advanced 31 1.82 787 787 Advanced 32 2.03 796 796 Advanced 33 2.26 807 807 Advanced 34 2.51 818 818 Advanced 35 2.80 831 831 Advanced 36 3.12 846 846 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 3.50 863 863 Advanced



Table G3. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 ELA Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.01 567 567 Limited 3 ‐2.56 588 588 Limited 4 ‐2.22 603 603 Limited 5 ‐1.95 616 616 Limited 6 ‐1.72 626 626 Limited 7 ‐1.52 635 635 Limited 8 ‐1.34 643 643 Limited 9 ‐1.18 651 651 Limited 10 ‐1.03 657 657 Limited 11 ‐0.89 664 664 Limited 12 ‐0.75 670 672 Basic 13 ‐0.63 676 676 Basic 14 ‐0.50 681 681 Basic 15 ‐0.38 687 687 Basic 16 ‐0.27 692 692 Basic 17 ‐0.15 697 697 Basic 18 ‐0.04 702 702 Proficient 19 0.08 708 708 Proficient 20 0.19 713 713 Proficient 21 0.31 718 718 Proficient 22 0.43 724 725 Accelerated 23 0.56 729 729 Accelerated 24 0.69 735 735 Accelerated 25 0.82 741 741 Accelerated 26 0.96 748 748 Accelerated 27 1.11 755 755 Advanced 28 1.27 762 762 Advanced 29 1.44 770 770 Advanced 30 1.62 778 778 Advanced 31 1.82 787 787 Advanced 32 2.03 796 796 Advanced 33 2.26 807 807 Advanced 34 2.51 818 818 Advanced 35 2.80 831 831 Advanced 36 3.12 846 846 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G4. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA I Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 606 606 Limited 1 ‐3.50 606 606 Limited 2 ‐3.50 606 606 Limited 3 ‐3.17 615 615 Limited 4 ‐2.85 624 624 Limited 5 ‐2.60 631 631 Limited 6 ‐2.39 637 637 Limited 7 ‐2.20 642 642 Limited 8 ‐2.03 647 647 Limited 9 ‐1.88 651 651 Limited 10 ‐1.75 655 655 Limited 11 ‐1.62 658 658 Limited 12 ‐1.50 661 661 Limited 13 ‐1.39 665 665 Limited 14 ‐1.28 668 668 Limited 15 ‐1.18 670 670 Limited 16 ‐1.08 673 673 Limited 17 ‐0.98 676 676 Limited 18 ‐0.89 678 678 Limited 19 ‐0.80 681 681 Limited 20 ‐0.72 683 683 Basic 21 ‐0.63 685 685 Basic 22 ‐0.55 688 688 Basic 23 ‐0.47 690 690 Basic 24 ‐0.39 692 692 Basic 25 ‐0.31 694 694 Basic 26 ‐0.24 697 697 Basic 27 ‐0.16 699 699 Basic 28 ‐0.08 701 701 Proficient 29 ‐0.01 703 703 Proficient 30 0.07 705 705 Proficient 31 0.15 707 707 Proficient 32 0.22 709 709 Proficient 33 0.30 711 711 Proficient 34 0.37 713 713 Proficient 35 0.45 716 716 Proficient 36 0.53 718 718 Proficient 37 0.61 720 720 Proficient 38 0.69 722 722 Proficient 39 0.77 725 725 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.86 727 727 Accelerated 41 0.95 729 729 Accelerated 42 1.04 732 732 Accelerated 43 1.13 734 734 Accelerated 44 1.22 737 737 Accelerated 45 1.32 740 740 Advanced 46 1.43 743 743 Advanced 47 1.54 746 746 Advanced 48 1.65 749 749 Advanced 49 1.78 752 752 Advanced 50 1.91 756 756 Advanced 51 2.05 760 760 Advanced 52 2.20 764 764 Advanced 53 2.36 769 769 Advanced 54 2.55 774 774 Advanced 55 2.75 780 780 Advanced 56 2.99 786 786 Advanced 57 3.26 794 794 Advanced 58 3.50 800 800 Advanced 59 3.50 800 800 Advanced 60 3.50 800 800 Advanced 61 3.50 800 800 Advanced 62 3.50 800 800 Advanced



Table G5. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA I Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score





Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G6. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA II Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.30 603 603 Limited 4 ‐2.97 613 613 Limited 5 ‐2.70 621 621 Limited 6 ‐2.48 628 628 Limited 7 ‐2.29 633 633 Limited 8 ‐2.12 639 639 Limited 9 ‐1.97 643 643 Limited 10 ‐1.83 647 647 Limited 11 ‐1.70 651 651 Limited 12 ‐1.58 655 655 Limited 13 ‐1.46 658 658 Limited 14 ‐1.35 662 662 Limited 15 ‐1.25 665 665 Limited 16 ‐1.15 668 668 Limited 17 ‐1.06 671 671 Limited 18 ‐0.96 673 673 Limited 19 ‐0.88 676 676 Limited 20 ‐0.79 679 679 Basic 21 ‐0.70 681 681 Basic 22 ‐0.62 684 684 Basic 23 ‐0.54 686 686 Basic 24 ‐0.46 689 689 Basic 25 ‐0.38 691 691 Basic 26 ‐0.30 693 693 Basic 27 ‐0.22 696 696 Basic 28 ‐0.15 698 698 Basic 29 ‐0.07 700 700 Proficient 30 0.01 703 703 Proficient 31 0.08 705 705 Proficient 32 0.16 707 707 Proficient 33 0.23 709 709 Proficient 34 0.31 712 712 Proficient 35 0.39 714 714 Proficient 36 0.46 716 716 Proficient 37 0.54 719 719 Proficient 38 0.62 721 721 Proficient 39 0.70 724 724 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.79 726 726 Accelerated 41 0.87 729 729 Accelerated 42 0.96 731 731 Accelerated 43 1.05 734 734 Accelerated 44 1.14 737 737 Accelerated 45 1.24 740 740 Accelerated 46 1.34 743 743 Advanced 47 1.45 746 746 Advanced 48 1.56 749 749 Advanced 49 1.68 753 753 Advanced 50 1.81 757 757 Advanced 51 1.94 761 761 Advanced 52 2.09 765 765 Advanced 53 2.25 770 770 Advanced 54 2.42 775 775 Advanced 55 2.62 781 781 Advanced 56 2.84 788 788 Advanced 57 3.09 795 795 Advanced 58 3.39 804 804 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced 62 3.50 808 808 Advanced



Table G7. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA II Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.30 603 603 Limited 4 ‐2.97 613 613 Limited 5 ‐2.70 621 621 Limited 6 ‐2.48 628 628 Limited 7 ‐2.29 633 633 Limited 8 ‐2.12 639 639 Limited 9 ‐1.97 643 643 Limited 10 ‐1.83 647 647 Limited 11 ‐1.70 651 651 Limited 12 ‐1.58 655 655 Limited 13 ‐1.46 658 658 Limited 14 ‐1.35 662 662 Limited 15 ‐1.25 665 665 Limited 16 ‐1.15 668 668 Limited 17 ‐1.06 671 671 Limited 18 ‐0.96 673 673 Limited 19 ‐0.88 676 676 Limited 20 ‐0.79 679 679 Basic 21 ‐0.70 681 681 Basic 22 ‐0.62 684 684 Basic 23 ‐0.54 686 686 Basic 24 ‐0.46 689 689 Basic 25 ‐0.38 691 691 Basic 26 ‐0.30 693 693 Basic 27 ‐0.22 696 696 Basic 28 ‐0.15 698 698 Basic 29 ‐0.07 700 700 Proficient 30 0.01 703 703 Proficient 31 0.08 705 705 Proficient 32 0.16 707 707 Proficient 33 0.23 709 709 Proficient 34 0.31 712 712 Proficient 35 0.39 714 714 Proficient 36 0.46 716 716 Proficient 37 0.54 719 719 Proficient 38 0.62 721 721 Proficient 39 0.70 724 724 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.79 726 726 Accelerated 41 0.87 729 729 Accelerated 42 0.96 731 731 Accelerated 43 1.05 734 734 Accelerated 44 1.14 737 737 Accelerated 45 1.24 740 740 Accelerated 46 1.34 743 743 Advanced 47 1.45 746 746 Advanced 48 1.56 749 749 Advanced 49 1.68 753 753 Advanced 50 1.81 757 757 Advanced 51 1.94 761 761 Advanced 52 2.09 765 765 Advanced 53 2.25 770 770 Advanced 54 2.42 775 775 Advanced 55 2.62 781 781 Advanced 56 2.84 788 788 Advanced 57 3.09 795 795 Advanced 58 3.39 804 804 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced 62 3.50 808 808 Advanced



Table G8. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Algebra Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.08 630 630 Limited 4 ‐2.76 639 639 Limited 5 ‐2.50 646 646 Limited 6 ‐2.28 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.77 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.14 684 684 Basic 15 ‐1.03 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.73 696 696 Basic 19 ‐0.63 698 698 Basic 20 ‐0.53 701 701 Proficient 21 ‐0.44 704 704 Proficient 22 ‐0.34 706 706 Proficient 23 ‐0.25 709 709 Proficient 24 ‐0.15 712 712 Proficient 25 ‐0.06 714 714 Proficient 26 0.03 717 717 Proficient 27 0.13 720 720 Proficient 28 0.22 722 722 Proficient 29 0.32 725 725 Accelerated 30 0.41 728 728 Accelerated 31 0.51 730 730 Accelerated 32 0.61 733 733 Accelerated 33 0.71 736 736 Accelerated 34 0.82 739 739 Accelerated 35 0.92 742 742 Accelerated 36 1.03 745 745 Accelerated 37 1.14 748 748 Accelerated 38 1.26 751 751 Accelerated 39 1.37 755 755 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.49 758 758 Advanced 41 1.62 762 762 Advanced 42 1.75 765 765 Advanced 43 1.89 769 769 Advanced 44 2.03 773 773 Advanced 45 2.18 777 777 Advanced 46 2.34 782 782 Advanced 47 2.51 787 787 Advanced 48 2.70 792 792 Advanced 49 2.92 798 798 Advanced 50 3.17 805 805 Advanced 51 3.49 814 814 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced



Table G9. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Algebra Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.08 630 630 Limited 4 ‐2.76 639 639 Limited 5 ‐2.50 646 646 Limited 6 ‐2.28 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.77 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.14 684 684 Basic 15 ‐1.03 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.73 696 696 Basic 19 ‐0.63 698 698 Basic 20 ‐0.53 701 701 Proficient 21 ‐0.44 704 704 Proficient 22 ‐0.34 706 706 Proficient 23 ‐0.25 709 709 Proficient 24 ‐0.15 712 712 Proficient 25 ‐0.06 714 714 Proficient 26 0.03 717 717 Proficient 27 0.13 720 720 Proficient 28 0.22 722 722 Proficient 29 0.32 725 725 Accelerated 30 0.41 728 728 Accelerated 31 0.51 730 730 Accelerated 32 0.61 733 733 Accelerated 33 0.71 736 736 Accelerated 34 0.82 739 739 Accelerated 35 0.92 742 742 Accelerated 36 1.03 745 745 Accelerated 37 1.14 748 748 Accelerated 38 1.26 751 751 Accelerated 39 1.37 755 755 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G10. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Geometry Online

Raw Score Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.08 617 617 Limited 3 ‐2.62 630 630 Limited 4 ‐2.28 640 640 Limited 5 ‐2.01 648 648 Limited 6 ‐1.78 655 655 Limited 7 ‐1.58 661 661 Limited 8 ‐1.40 666 666 Limited 9 ‐1.23 671 671 Limited 10 ‐1.08 675 675 Limited 11 ‐0.94 679 679 Basic 12 ‐0.81 683 683 Basic 13 ‐0.68 687 687 Basic 14 ‐0.56 691 691 Basic 15 ‐0.45 694 694 Basic 16 ‐0.34 697 697 Basic 17 ‐0.23 700 700 Proficient 18 ‐0.12 704 704 Proficient 19 ‐0.02 707 707 Proficient 20 0.08 709 709 Proficient 21 0.17 712 712 Proficient 22 0.27 715 715 Proficient 23 0.36 718 718 Proficient 24 0.46 721 721 Proficient 25 0.55 723 723 Proficient 26 0.64 726 726 Accelerated 27 0.74 729 729 Accelerated 28 0.83 732 732 Accelerated 29 0.92 734 734 Accelerated 30 1.01 737 737 Accelerated 31 1.11 740 740 Accelerated 32 1.20 743 743 Accelerated 33 1.30 745 745 Accelerated 34 1.39 748 748 Accelerated 35 1.49 751 751 Accelerated 36 1.59 754 754 Accelerated 37 1.70 757 757 Advanced 38 1.80 760 760 Advanced 39 1.91 763 763 Advanced





Scaled Score


Scaled Score





Table G11. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Geometry Paper



Scaled Score


Scaled Score


Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.08 617 617 Limited 3 ‐2.62 630 630 Limited 4 ‐2.28 640 640 Limited 5 ‐2.01 648 648 Limited 6 ‐1.78 655 655 Limited 7 ‐1.58 661 661 Limited 8 ‐1.40 666 666 Limited 9 ‐1.23 671 671 Limited 10 ‐1.08 675 675 Limited 11 ‐0.94 679 679 Basic 12 ‐0.81 683 683 Basic 13 ‐0.68 687 687 Basic 14 ‐0.56 691 691 Basic 15 ‐0.45 694 694 Basic 16 ‐0.34 697 697 Basic 17 ‐0.23 700 700 Proficient 18 ‐0.12 704 704 Proficient 19 ‐0.02 707 707 Proficient 20 0.08 709 709 Proficient 21 0.17 712 712 Proficient 22 0.27 715 715 Proficient 23 0.36 718 718 Proficient 24 0.46 721 721 Proficient 25 0.55 723 723 Proficient 26 0.64 726 726 Accelerated 27 0.74 729 729 Accelerated 28 0.83 732 732 Accelerated 29 0.92 734 734 Accelerated 30 1.01 737 737 Accelerated 31 1.11 740 740 Accelerated 32 1.20 743 743 Accelerated 33 1.30 745 745 Accelerated 34 1.39 748 748 Accelerated 35 1.49 751 751 Accelerated 36 1.59 754 754 Accelerated 37 1.70 757 757 Advanced 38 1.80 760 760 Advanced 39 1.91 763 763 Advanced





Scaled Score


Scaled Score





Table G12. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math I Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.21 626 626 Limited 4 ‐2.88 635 635 Limited 5 ‐2.62 642 642 Limited 6 ‐2.40 649 649 Limited 7 ‐2.20 654 654 Limited 8 ‐2.03 659 659 Limited 9 ‐1.87 663 663 Limited 10 ‐1.73 667 667 Limited 11 ‐1.59 671 671 Limited 12 ‐1.47 675 675 Limited 13 ‐1.35 678 678 Limited 14 ‐1.23 681 682 Basic 15 ‐1.12 685 685 Basic 16 ‐1.01 688 688 Basic 17 ‐0.91 690 690 Basic 18 ‐0.81 693 693 Basic 19 ‐0.71 696 696 Basic 20 ‐0.62 699 699 Basic 21 ‐0.52 701 701 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.33 707 707 Proficient 24 ‐0.24 709 709 Proficient 25 ‐0.15 712 712 Proficient 26 ‐0.06 714 714 Proficient 27 0.04 717 717 Proficient 28 0.13 720 720 Proficient 29 0.22 722 722 Proficient 30 0.32 725 725 Accelerated 31 0.41 728 728 Accelerated 32 0.51 730 730 Accelerated 33 0.61 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.92 742 742 Accelerated 37 1.03 745 745 Accelerated 38 1.14 748 748 Accelerated 39 1.26 751 751 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G13. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math I Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.21 626 626 Limited 4 ‐2.88 635 635 Limited 5 ‐2.62 642 642 Limited 6 ‐2.40 649 649 Limited 7 ‐2.20 654 654 Limited 8 ‐2.03 659 659 Limited 9 ‐1.87 663 663 Limited 10 ‐1.73 667 667 Limited 11 ‐1.59 671 671 Limited 12 ‐1.47 675 675 Limited 13 ‐1.35 678 678 Limited 14 ‐1.23 681 682 Basic 15 ‐1.12 685 685 Basic 16 ‐1.01 688 688 Basic 17 ‐0.91 690 690 Basic 18 ‐0.81 693 693 Basic 19 ‐0.71 696 696 Basic 20 ‐0.62 699 699 Basic 21 ‐0.52 701 701 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.33 707 707 Proficient 24 ‐0.24 709 709 Proficient 25 ‐0.15 712 712 Proficient 26 ‐0.06 714 714 Proficient 27 0.04 717 717 Proficient 28 0.13 720 720 Proficient 29 0.22 722 722 Proficient 30 0.32 725 725 Accelerated 31 0.41 728 728 Accelerated 32 0.51 730 730 Accelerated 33 0.61 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.92 742 742 Accelerated 37 1.03 745 745 Accelerated 38 1.14 748 748 Accelerated 39 1.26 751 751 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G14. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math II Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.18 604 604 Limited 3 ‐2.72 618 618 Limited 4 ‐2.39 629 629 Limited 5 ‐2.12 637 637 Limited 6 ‐1.88 645 645 Limited 7 ‐1.68 651 651 Limited 8 ‐1.50 657 657 Limited 9 ‐1.34 662 662 Limited 10 ‐1.18 667 667 Limited 11 ‐1.04 671 671 Limited 12 ‐0.91 675 675 Limited 13 ‐0.78 679 679 Basic 14 ‐0.65 683 683 Basic 15 ‐0.54 687 687 Basic 16 ‐0.42 690 690 Basic 17 ‐0.31 694 694 Basic 18 ‐0.21 697 697 Basic 19 ‐0.10 700 700 Proficient 20 0.00 704 704 Proficient 21 0.10 707 707 Proficient 22 0.20 710 710 Proficient 23 0.30 713 713 Proficient 24 0.40 716 716 Proficient 25 0.49 719 719 Proficient 26 0.59 722 722 Proficient 27 0.68 725 725 Accelerated 28 0.78 728 728 Accelerated 29 0.88 731 731 Accelerated 30 0.97 734 734 Accelerated 31 1.07 737 737 Accelerated 32 1.17 740 740 Accelerated 33 1.27 743 743 Accelerated 34 1.37 746 746 Accelerated 35 1.48 750 750 Accelerated 36 1.59 753 753 Accelerated 37 1.69 757 758 Advanced 38 1.81 760 760 Advanced 39 1.93 764 764 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G15. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math II Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.18 604 604 Limited 3 ‐2.72 618 618 Limited 4 ‐2.39 629 629 Limited 5 ‐2.12 637 637 Limited 6 ‐1.88 645 645 Limited 7 ‐1.68 651 651 Limited 8 ‐1.50 657 657 Limited 9 ‐1.34 662 662 Limited 10 ‐1.18 667 667 Limited 11 ‐1.04 671 671 Limited 12 ‐0.91 675 675 Limited 13 ‐0.78 679 679 Basic 14 ‐0.65 683 683 Basic 15 ‐0.54 687 687 Basic 16 ‐0.42 690 690 Basic 17 ‐0.31 694 694 Basic 18 ‐0.21 697 697 Basic 19 ‐0.10 700 700 Proficient 20 0.00 704 704 Proficient 21 0.10 707 707 Proficient 22 0.20 710 710 Proficient 23 0.30 713 713 Proficient 24 0.40 716 716 Proficient 25 0.49 719 719 Proficient 26 0.59 722 722 Proficient 27 0.68 725 725 Accelerated 28 0.78 728 728 Accelerated 29 0.88 731 731 Accelerated 30 0.97 734 734 Accelerated 31 1.07 737 737 Accelerated 32 1.17 740 740 Accelerated 33 1.27 743 743 Accelerated 34 1.37 746 746 Accelerated 35 1.48 750 750 Accelerated 36 1.59 753 753 Accelerated 37 1.69 757 758 Advanced 38 1.81 760 760 Advanced 39 1.93 764 764 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G16. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Biology Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.18 626 626 Limited 3 ‐2.76 638 638 Limited 4 ‐2.46 647 647 Limited 5 ‐2.21 654 654 Limited 6 ‐2.02 660 660 Limited 7 ‐1.84 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.56 674 674 Limited 10 ‐1.43 678 678 Limited 11 ‐1.32 681 681 Limited 12 ‐1.21 684 685 Basic 13 ‐1.11 687 687 Basic 14 ‐1.02 690 690 Basic 15 ‐0.93 692 692 Basic 16 ‐0.84 695 695 Basic 17 ‐0.76 697 697 Basic 18 ‐0.68 700 700 Proficient 19 ‐0.60 702 702 Proficient 20 ‐0.52 704 704 Proficient 21 ‐0.45 707 707 Proficient 22 ‐0.37 709 709 Proficient 23 ‐0.30 711 711 Proficient 24 ‐0.23 713 713 Proficient 25 ‐0.16 715 715 Proficient 26 ‐0.08 717 717 Proficient 27 ‐0.01 719 719 Proficient 28 0.06 722 722 Proficient 29 0.13 724 724 Proficient 30 0.20 726 726 Accelerated 31 0.28 728 728 Accelerated 32 0.35 730 730 Accelerated 33 0.43 732 732 Accelerated 34 0.50 735 735 Advanced 35 0.58 737 737 Advanced 36 0.66 739 739 Advanced 37 0.74 742 742 Advanced 38 0.82 744 744 Advanced 39 0.91 747 747 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.00 749 749 Advanced 41 1.09 752 752 Advanced 42 1.18 755 755 Advanced 43 1.28 758 758 Advanced 44 1.39 761 761 Advanced 45 1.50 764 764 Advanced 46 1.62 768 768 Advanced 47 1.75 771 771 Advanced 48 1.89 776 776 Advanced 49 2.05 780 780 Advanced 50 2.23 786 786 Advanced 51 2.44 792 792 Advanced 52 2.70 799 799 Advanced 53 3.02 809 809 Advanced 54 3.46 822 822 Advanced 55 3.50 823 823 Advanced 56 3.50 823 823 Advanced



Table G17. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Biology Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.18 626 626 Limited 3 ‐2.76 638 638 Limited 4 ‐2.45 647 647 Limited 5 ‐2.21 655 655 Limited 6 ‐2.01 660 660 Limited 7 ‐1.84 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.56 674 674 Limited 10 ‐1.43 678 678 Limited 11 ‐1.32 681 681 Limited 12 ‐1.21 684 685 Basic 13 ‐1.11 687 687 Basic 14 ‐1.02 690 690 Basic 15 ‐0.93 692 692 Basic 16 ‐0.84 695 695 Basic 17 ‐0.76 697 697 Basic 18 ‐0.68 700 700 Proficient 19 ‐0.60 702 702 Proficient 20 ‐0.52 704 704 Proficient 21 ‐0.45 707 707 Proficient 22 ‐0.37 709 709 Proficient 23 ‐0.30 711 711 Proficient 24 ‐0.23 713 713 Proficient 25 ‐0.16 715 715 Proficient 26 ‐0.09 717 717 Proficient 27 ‐0.02 719 719 Proficient 28 0.06 721 721 Proficient 29 0.13 724 724 Proficient 30 0.20 726 726 Accelerated 31 0.27 728 728 Accelerated 32 0.35 730 730 Accelerated 33 0.42 732 732 Accelerated 34 0.50 734 735 Advanced 35 0.57 737 737 Advanced 36 0.65 739 739 Advanced 37 0.73 741 741 Advanced 38 0.81 744 744 Advanced 39 0.90 746 746 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G18. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Physical Science Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.39 637 637 Limited 3 ‐2.95 648 648 Limited 4 ‐2.63 656 656 Limited 5 ‐2.38 663 663 Limited 6 ‐2.16 668 668 Limited 7 ‐1.98 673 673 Limited 8 ‐1.81 677 677 Limited 9 ‐1.67 681 681 Limited 10 ‐1.53 685 685 Basic 11 ‐1.40 688 688 Basic 12 ‐1.29 691 691 Basic 13 ‐1.18 694 694 Basic 14 ‐1.07 697 697 Basic 15 ‐0.97 699 700 Proficient 16 ‐0.87 702 702 Proficient 17 ‐0.78 704 704 Proficient 18 ‐0.69 706 706 Proficient 19 ‐0.60 709 709 Proficient 20 ‐0.52 711 711 Proficient 21 ‐0.44 713 713 Proficient 22 ‐0.35 715 715 Proficient 23 ‐0.27 717 717 Proficient 24 ‐0.19 719 719 Proficient 25 ‐0.11 722 722 Proficient 26 ‐0.03 724 724 Proficient 27 0.05 726 726 Accelerated 28 0.13 728 728 Accelerated 29 0.21 730 730 Accelerated 30 0.29 732 732 Accelerated 31 0.37 734 734 Accelerated 32 0.45 736 736 Accelerated 33 0.53 738 738 Accelerated 34 0.62 740 740 Accelerated 35 0.70 743 743 Accelerated 36 0.79 745 745 Accelerated 37 0.88 747 747 Accelerated 38 0.98 750 750 Advanced 39 1.07 752 752 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G19. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Physical Science Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.46 635 635 Limited 3 ‐3.02 646 646 Limited 4 ‐2.69 655 655 Limited 5 ‐2.43 661 661 Limited 6 ‐2.22 667 667 Limited 7 ‐2.03 672 672 Limited 8 ‐1.86 676 676 Limited 9 ‐1.70 680 680 Limited 10 ‐1.56 684 684 Basic 11 ‐1.43 687 687 Basic 12 ‐1.31 691 691 Basic 13 ‐1.19 693 693 Basic 14 ‐1.08 696 696 Basic 15 ‐0.98 699 700 Proficient 16 ‐0.88 702 702 Proficient 17 ‐0.78 704 704 Proficient 18 ‐0.69 707 707 Proficient 19 ‐0.60 709 709 Proficient 20 ‐0.51 711 711 Proficient 21 ‐0.42 714 714 Proficient 22 ‐0.33 716 716 Proficient 23 ‐0.24 718 718 Proficient 24 ‐0.16 720 720 Proficient 25 ‐0.07 723 723 Proficient 26 0.01 725 725 Accelerated 27 0.10 727 727 Accelerated 28 0.18 729 729 Accelerated 29 0.27 731 731 Accelerated 30 0.35 734 734 Accelerated 31 0.44 736 736 Accelerated 32 0.53 738 738 Accelerated 33 0.62 741 741 Accelerated 34 0.72 743 743 Accelerated 35 0.81 745 745 Accelerated 36 0.91 748 749 Advanced 37 1.01 751 751 Advanced 38 1.12 753 753 Advanced 39 1.22 756 756 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G20. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American Government Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.30 645 645 Limited 3 ‐2.88 654 654 Limited 4 ‐2.57 659 659 Limited 5 ‐2.32 664 664 Limited 6 ‐2.12 668 668 Limited 7 ‐1.95 671 671 Limited 8 ‐1.79 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.53 679 679 Limited 11 ‐1.42 681 681 Limited 12 ‐1.32 683 683 Limited 13 ‐1.22 685 685 Limited 14 ‐1.13 687 687 Basic 15 ‐1.04 688 688 Basic 16 ‐0.96 690 690 Basic 17 ‐0.88 691 691 Basic 18 ‐0.80 693 693 Basic 19 ‐0.73 694 694 Basic 20 ‐0.66 695 695 Basic 21 ‐0.59 697 697 Basic 22 ‐0.53 698 698 Basic 23 ‐0.46 699 699 Basic 24 ‐0.40 700 700 Proficient 25 ‐0.34 701 701 Proficient 26 ‐0.28 703 703 Proficient 27 ‐0.22 704 704 Proficient 28 ‐0.16 705 705 Proficient 29 ‐0.10 706 706 Proficient 30 ‐0.04 707 707 Proficient 31 0.02 708 708 Proficient 32 0.08 709 709 Proficient 33 0.14 710 710 Proficient 34 0.19 711 711 Proficient 35 0.25 713 713 Proficient 36 0.31 714 714 Proficient 37 0.37 715 715 Proficient 38 0.43 716 716 Proficient 39 0.50 717 717 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.56 718 718 Proficient 41 0.62 719 719 Proficient 42 0.69 721 721 Proficient 43 0.76 722 722 Proficient 44 0.83 723 723 Proficient 45 0.90 725 725 Accelerated 46 0.98 726 726 Accelerated 47 1.05 728 728 Accelerated 48 1.14 729 729 Accelerated 49 1.22 731 731 Accelerated 50 1.32 733 733 Accelerated 51 1.42 734 734 Accelerated 52 1.52 736 736 Accelerated 53 1.64 739 739 Advanced 54 1.76 741 741 Advanced 55 1.90 744 744 Advanced 56 2.05 746 746 Advanced 57 2.23 750 750 Advanced 58 2.44 754 754 Advanced 59 2.69 758 758 Advanced 60 3.00 764 764 Advanced 61 3.44 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced



Table G21. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American Government Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.30 645 645 Limited 3 ‐2.88 654 654 Limited 4 ‐2.57 659 659 Limited 5 ‐2.32 664 664 Limited 6 ‐2.12 668 668 Limited 7 ‐1.95 671 671 Limited 8 ‐1.79 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.53 679 679 Limited 11 ‐1.42 681 681 Limited 12 ‐1.32 683 683 Limited 13 ‐1.22 685 685 Limited 14 ‐1.13 687 687 Basic 15 ‐1.04 688 688 Basic 16 ‐0.96 690 690 Basic 17 ‐0.88 691 691 Basic 18 ‐0.80 693 693 Basic 19 ‐0.73 694 694 Basic 20 ‐0.66 695 695 Basic 21 ‐0.59 697 697 Basic 22 ‐0.53 698 698 Basic 23 ‐0.46 699 699 Basic 24 ‐0.40 700 700 Proficient 25 ‐0.34 701 701 Proficient 26 ‐0.28 703 703 Proficient 27 ‐0.22 704 704 Proficient 28 ‐0.16 705 705 Proficient 29 ‐0.10 706 706 Proficient 30 ‐0.04 707 707 Proficient 31 0.02 708 708 Proficient 32 0.08 709 709 Proficient 33 0.14 710 710 Proficient 34 0.19 711 711 Proficient 35 0.25 713 713 Proficient 36 0.31 714 714 Proficient 37 0.37 715 715 Proficient 38 0.43 716 716 Proficient 39 0.50 717 717 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.56 718 718 Proficient 41 0.62 719 719 Proficient 42 0.69 721 721 Proficient 43 0.76 722 722 Proficient 44 0.83 723 723 Proficient 45 0.90 725 725 Accelerated 46 0.98 726 726 Accelerated 47 1.05 728 728 Accelerated 48 1.14 729 729 Accelerated 49 1.22 731 731 Accelerated 50 1.32 733 733 Accelerated 51 1.42 734 734 Accelerated 52 1.52 736 736 Accelerated 53 1.64 739 739 Advanced 54 1.76 741 741 Advanced 55 1.90 744 744 Advanced 56 2.05 746 746 Advanced 57 2.23 750 750 Advanced 58 2.44 754 754 Advanced 59 2.69 758 758 Advanced 60 3.00 764 764 Advanced 61 3.44 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced



Table G22. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American History Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.39 622 622 Limited 3 ‐2.96 633 633 Limited 4 ‐2.65 641 641 Limited 5 ‐2.40 648 648 Limited 6 ‐2.19 653 653 Limited 7 ‐2.01 658 658 Limited 8 ‐1.85 662 662 Limited 9 ‐1.71 665 665 Limited 10 ‐1.58 669 669 Limited 11 ‐1.46 672 672 Limited 12 ‐1.34 675 675 Limited 13 ‐1.24 678 678 Limited 14 ‐1.14 680 680 Limited 15 ‐1.05 683 683 Limited 16 ‐0.96 685 685 Basic 17 ‐0.87 687 687 Basic 18 ‐0.79 689 689 Basic 19 ‐0.71 691 691 Basic 20 ‐0.63 693 693 Basic 21 ‐0.56 695 695 Basic 22 ‐0.48 697 697 Basic 23 ‐0.41 699 699 Basic 24 ‐0.34 701 701 Proficient 25 ‐0.27 702 702 Proficient 26 ‐0.21 704 704 Proficient 27 ‐0.14 706 706 Proficient 28 ‐0.08 707 707 Proficient 29 ‐0.01 709 709 Proficient 30 0.05 711 711 Proficient 31 0.12 712 712 Proficient 32 0.18 714 714 Proficient 33 0.24 716 716 Proficient 34 0.31 717 717 Proficient 35 0.37 719 719 Proficient 36 0.43 721 721 Proficient 37 0.50 722 722 Proficient 38 0.56 724 724 Proficient 39 0.63 726 726 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.69 727 727 Accelerated 41 0.76 729 729 Accelerated 42 0.83 731 731 Accelerated 43 0.90 733 733 Accelerated 44 0.97 734 734 Accelerated 45 1.04 736 736 Accelerated 46 1.11 738 738 Advanced 47 1.19 740 740 Advanced 48 1.27 742 742 Advanced 49 1.36 744 744 Advanced 50 1.44 747 747 Advanced 51 1.54 749 749 Advanced 52 1.63 752 752 Advanced 53 1.74 754 754 Advanced 54 1.85 757 757 Advanced 55 1.97 760 760 Advanced 56 2.10 764 764 Advanced 57 2.25 767 767 Advanced 58 2.42 772 772 Advanced 59 2.62 777 777 Advanced 60 2.85 783 783 Advanced 61 3.15 791 791 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced 64 3.50 800 800 Advanced



Table G23. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American History Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score





Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.69 727 727 Accelerated 41 0.76 729 729 Accelerated 42 0.83 731 731 Accelerated 43 0.90 733 733 Accelerated 44 0.97 734 734 Accelerated 45 1.04 736 736 Accelerated 46 1.11 738 738 Advanced 47 1.19 740 740 Advanced 48 1.27 742 742 Advanced 49 1.36 744 744 Advanced 50 1.44 747 747 Advanced 51 1.54 749 749 Advanced 52 1.63 752 752 Advanced 53 1.74 754 754 Advanced 54 1.85 757 757 Advanced 55 1.97 760 760 Advanced 56 2.10 764 764 Advanced 57 2.25 767 767 Advanced 58 2.42 772 772 Advanced 59 2.62 777 777 Advanced 60 2.85 783 783 Advanced 61 3.15 791 791 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced 64 3.50 800 800 Advanced



Table G24. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Reading

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 545 545 Promotion No 1 ‐3.50 545 545 Promotion No 2 ‐3.12 562 562 Promotion No 3 ‐2.65 583 583 Promotion No 4 ‐2.31 599 599 Promotion No 5 ‐2.03 612 612 Promotion No 6 ‐1.78 623 623 Promotion No 7 ‐1.57 633 633 Promotion No 8 ‐1.38 642 642 Promotion No 9 ‐1.20 650 650 Promotion No 10 ‐1.03 657 657 Promotion No 11 ‐0.87 664 664 Promotion No 12 ‐0.72 671 672 Promotion Yes 13 ‐0.58 678 678 Promotion Yes 14 ‐0.44 684 684 Promotion Yes 15 ‐0.31 690 690 Promotion Yes 16 ‐0.18 696 696 Promotion Yes 17 ‐0.05 702 702 Promotion Yes 18 0.08 708 708 Promotion Yes 19 0.21 714 714 Promotion Yes 20 0.34 719 719 Promotion Yes 21 0.46 725 725 Promotion Yes 22 0.59 731 731 Promotion Yes 23 0.72 737 737 Promotion Yes 24 0.86 743 743 Promotion Yes 25 1.00 749 752 Promotion Yes 26 1.14 756 756 Promotion Yes 27 1.29 763 763 Promotion Yes 28 1.45 770 770 Promotion Yes 29 1.61 777 777 Promotion Yes 30 1.79 785 785 Promotion Yes 31 1.97 794 794 Promotion Yes 32 2.18 803 803 Promotion Yes 33 2.40 813 813 Promotion Yes 34 2.64 824 824 Promotion Yes 35 2.92 837 837 Promotion Yes 36 3.24 851 851 Promotion Yes 37 3.50 863 863 Promotion Yes 38 3.50 863 863 Promotion Yes 39 3.50 863 863 Promotion Yes



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 3.50 863 863 Promotion Yes



Table G25. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 ELA Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.12 562 562 Limited 3 ‐2.65 583 583 Limited 4 ‐2.31 599 599 Limited 5 ‐2.03 612 612 Limited 6 ‐1.78 623 623 Limited 7 ‐1.57 633 633 Limited 8 ‐1.38 642 642 Limited 9 ‐1.20 650 650 Limited 10 ‐1.03 657 657 Limited 11 ‐0.87 664 664 Limited 12 ‐0.72 671 672 Basic 13 ‐0.58 678 678 Basic 14 ‐0.44 684 684 Basic 15 ‐0.31 690 690 Basic 16 ‐0.18 696 696 Basic 17 ‐0.05 702 702 Proficient 18 0.08 708 708 Proficient 19 0.21 714 714 Proficient 20 0.34 719 719 Proficient 21 0.46 725 725 Accelerated 22 0.59 731 731 Accelerated 23 0.72 737 737 Accelerated 24 0.86 743 743 Accelerated 25 1.00 749 752 Advanced 26 1.14 756 756 Advanced 27 1.29 763 763 Advanced 28 1.45 770 770 Advanced 29 1.61 777 777 Advanced 30 1.79 785 785 Advanced 31 1.97 794 794 Advanced 32 2.18 803 803 Advanced 33 2.40 813 813 Advanced 34 2.64 824 824 Advanced 35 2.92 837 837 Advanced 36 3.24 851 851 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G26. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 ELA Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.12 562 562 Limited 3 ‐2.65 583 583 Limited 4 ‐2.31 599 599 Limited 5 ‐2.03 612 612 Limited 6 ‐1.78 623 623 Limited 7 ‐1.57 633 633 Limited 8 ‐1.38 642 642 Limited 9 ‐1.20 650 650 Limited 10 ‐1.03 657 657 Limited 11 ‐0.87 664 664 Limited 12 ‐0.72 671 672 Basic 13 ‐0.58 678 678 Basic 14 ‐0.44 684 684 Basic 15 ‐0.31 690 690 Basic 16 ‐0.18 696 696 Basic 17 ‐0.05 702 702 Proficient 18 0.08 708 708 Proficient 19 0.21 714 714 Proficient 20 0.34 719 719 Proficient 21 0.46 725 725 Accelerated 22 0.59 731 731 Accelerated 23 0.72 737 737 Accelerated 24 0.86 743 743 Accelerated 25 1.00 749 752 Advanced 26 1.14 756 756 Advanced 27 1.29 763 763 Advanced 28 1.45 770 770 Advanced 29 1.61 777 777 Advanced 30 1.79 785 785 Advanced 31 1.97 794 794 Advanced 32 2.18 803 803 Advanced 33 2.40 813 813 Advanced 34 2.64 824 824 Advanced 35 2.92 837 837 Advanced 36 3.24 851 851 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 549 549 Limited 1 ‐3.50 549 549 Limited 2 ‐3.04 569 569 Limited 3 ‐2.57 589 589 Limited 4 ‐2.22 603 603 Limited 5 ‐1.94 615 615 Limited 6 ‐1.70 626 626 Limited 7 ‐1.49 634 634 Limited 8 ‐1.30 642 642 Limited 9 ‐1.13 650 650 Limited 10 ‐0.97 656 656 Limited 11 ‐0.82 663 663 Limited 12 ‐0.68 669 669 Limited 13 ‐0.55 674 674 Basic 14 ‐0.42 680 680 Basic 15 ‐0.30 685 685 Basic 16 ‐0.18 690 690 Basic 17 ‐0.06 695 695 Basic 18 0.05 700 700 Proficient 19 0.17 705 705 Proficient 20 0.28 709 709 Proficient 21 0.40 714 714 Proficient 22 0.51 719 719 Proficient 23 0.63 724 725 Accelerated 24 0.76 729 729 Accelerated 25 0.88 735 735 Accelerated 26 1.01 740 740 Accelerated 27 1.15 746 746 Accelerated 28 1.30 753 753 Advanced 29 1.46 759 759 Advanced 30 1.63 766 766 Advanced 31 1.81 774 774 Advanced 32 2.01 783 783 Advanced 33 2.23 792 792 Advanced 34 2.48 802 802 Advanced 35 2.76 814 814 Advanced 36 3.08 828 828 Advanced 37 3.47 845 845 Advanced 38 3.50 846 846 Advanced 39 3.50 846 846 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 549 549 Limited 1 ‐3.50 549 549 Limited 2 ‐3.04 569 569 Limited 3 ‐2.57 589 589 Limited 4 ‐2.22 603 603 Limited 5 ‐1.94 615 615 Limited 6 ‐1.70 626 626 Limited 7 ‐1.49 634 634 Limited 8 ‐1.30 642 642 Limited 9 ‐1.13 650 650 Limited 10 ‐0.97 656 656 Limited 11 ‐0.82 663 663 Limited 12 ‐0.68 669 669 Limited 13 ‐0.55 674 674 Basic 14 ‐0.42 680 680 Basic 15 ‐0.30 685 685 Basic 16 ‐0.18 690 690 Basic 17 ‐0.06 695 695 Basic 18 0.05 700 700 Proficient 19 0.17 705 705 Proficient 20 0.28 709 709 Proficient 21 0.40 714 714 Proficient 22 0.51 719 719 Proficient 23 0.63 724 725 Accelerated 24 0.76 729 729 Accelerated 25 0.88 735 735 Accelerated 26 1.01 740 740 Accelerated 27 1.15 746 746 Accelerated 28 1.30 753 753 Advanced 29 1.46 759 759 Advanced 30 1.63 766 766 Advanced 31 1.81 774 774 Advanced 32 2.01 783 783 Advanced 33 2.23 792 792 Advanced 34 2.48 802 802 Advanced 35 2.76 814 814 Advanced 36 3.08 828 828 Advanced 37 3.47 845 845 Advanced 38 3.50 846 846 Advanced 39 3.50 846 846 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 552 552 Limited 1 ‐3.50 552 552 Limited 2 ‐3.39 557 557 Limited 3 ‐2.93 576 576 Limited 4 ‐2.59 590 590 Limited 5 ‐2.31 602 602 Limited 6 ‐2.07 612 612 Limited 7 ‐1.87 621 621 Limited 8 ‐1.68 629 629 Limited 9 ‐1.50 636 636 Limited 10 ‐1.34 643 643 Limited 11 ‐1.19 650 650 Limited 12 ‐1.04 656 656 Limited 13 ‐0.90 662 662 Limited 14 ‐0.77 667 669 Basic 15 ‐0.64 673 673 Basic 16 ‐0.51 678 678 Basic 17 ‐0.39 683 683 Basic 18 ‐0.27 689 689 Basic 19 ‐0.15 694 694 Basic 20 ‐0.03 699 700 Proficient 21 0.08 704 704 Proficient 22 0.20 708 708 Proficient 23 0.32 713 713 Proficient 24 0.44 718 718 Proficient 25 0.56 724 725 Accelerated 26 0.68 729 729 Accelerated 27 0.80 734 734 Accelerated 28 0.93 740 740 Accelerated 29 1.07 745 745 Accelerated 30 1.21 751 751 Accelerated 31 1.37 758 758 Advanced 32 1.54 765 765 Advanced 33 1.72 773 773 Advanced 34 1.92 782 782 Advanced 35 2.15 791 791 Advanced 36 2.41 802 802 Advanced 37 2.71 815 815 Advanced 38 3.06 830 830 Advanced 39 3.47 847 847 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 3.50 848 848 Advanced 41 3.50 848 848 Advanced 42 3.50 848 848 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 552 552 Limited 1 ‐3.50 552 552 Limited 2 ‐3.39 557 557 Limited 3 ‐2.93 576 576 Limited 4 ‐2.59 590 590 Limited 5 ‐2.31 602 602 Limited 6 ‐2.07 612 612 Limited 7 ‐1.87 621 621 Limited 8 ‐1.68 629 629 Limited 9 ‐1.50 636 636 Limited 10 ‐1.34 643 643 Limited 11 ‐1.19 650 650 Limited 12 ‐1.04 656 656 Limited 13 ‐0.90 662 662 Limited 14 ‐0.77 667 669 Basic 15 ‐0.64 673 673 Basic 16 ‐0.51 678 678 Basic 17 ‐0.39 683 683 Basic 18 ‐0.27 689 689 Basic 19 ‐0.15 694 694 Basic 20 ‐0.03 699 700 Proficient 21 0.08 704 704 Proficient 22 0.20 708 708 Proficient 23 0.32 713 713 Proficient 24 0.44 718 718 Proficient 25 0.56 724 725 Accelerated 26 0.68 729 729 Accelerated 27 0.80 734 734 Accelerated 28 0.93 740 740 Accelerated 29 1.07 745 745 Accelerated 30 1.21 751 751 Accelerated 31 1.37 758 758 Advanced 32 1.54 765 765 Advanced 33 1.72 773 773 Advanced 34 1.92 782 782 Advanced 35 2.15 791 791 Advanced 36 2.41 802 802 Advanced 37 2.71 815 815 Advanced 38 3.06 830 830 Advanced 39 3.47 847 847 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 3.50 848 848 Advanced 41 3.50 848 848 Advanced 42 3.50 848 848 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 555 555 Limited 1 ‐3.50 555 555 Limited 2 ‐3.50 555 555 Limited 3 ‐3.35 561 561 Limited 4 ‐3.03 575 575 Limited 5 ‐2.77 586 586 Limited 6 ‐2.55 595 595 Limited 7 ‐2.36 603 603 Limited 8 ‐2.19 610 610 Limited 9 ‐2.03 617 617 Limited 10 ‐1.89 623 623 Limited 11 ‐1.76 628 628 Limited 12 ‐1.64 634 634 Limited 13 ‐1.52 638 638 Limited 14 ‐1.41 643 643 Limited 15 ‐1.31 648 648 Limited 16 ‐1.21 652 652 Limited 17 ‐1.11 656 656 Limited 18 ‐1.02 660 660 Limited 19 ‐0.93 664 664 Limited 20 ‐0.84 667 668 Basic 21 ‐0.76 671 671 Basic 22 ‐0.67 674 674 Basic 23 ‐0.59 678 678 Basic 24 ‐0.51 681 681 Basic 25 ‐0.43 685 685 Basic 26 ‐0.36 688 688 Basic 27 ‐0.28 691 691 Basic 28 ‐0.21 694 694 Basic 29 ‐0.13 697 697 Basic 30 ‐0.06 701 701 Proficient 31 0.02 704 704 Proficient 32 0.09 707 707 Proficient 33 0.17 710 710 Proficient 34 0.24 713 713 Proficient 35 0.32 716 716 Proficient 36 0.40 720 720 Proficient 37 0.47 723 723 Proficient 38 0.55 726 726 Accelerated 39 0.63 730 730 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.72 733 733 Accelerated 41 0.80 737 737 Accelerated 42 0.89 741 741 Accelerated 43 0.98 744 744 Accelerated 44 1.07 748 748 Accelerated 45 1.17 752 752 Advanced 46 1.27 757 757 Advanced 47 1.38 761 761 Advanced 48 1.49 766 766 Advanced 49 1.61 771 771 Advanced 50 1.74 777 777 Advanced 51 1.88 782 782 Advanced 52 2.02 789 789 Advanced 53 2.19 796 796 Advanced 54 2.36 803 803 Advanced 55 2.56 812 812 Advanced 56 2.79 821 821 Advanced 57 3.06 833 833 Advanced 58 3.40 847 847 Advanced 59 3.50 851 851 Advanced 60 3.50 851 851 Advanced 61 3.50 851 851 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 555 555 Limited 1 ‐3.50 555 555 Limited 2 ‐3.50 555 555 Limited 3 ‐3.35 561 561 Limited 4 ‐3.03 575 575 Limited 5 ‐2.77 586 586 Limited 6 ‐2.55 595 595 Limited 7 ‐2.36 603 603 Limited 8 ‐2.19 610 610 Limited 9 ‐2.03 617 617 Limited 10 ‐1.89 623 623 Limited 11 ‐1.76 628 628 Limited 12 ‐1.64 634 634 Limited 13 ‐1.52 638 638 Limited 14 ‐1.41 643 643 Limited 15 ‐1.31 648 648 Limited 16 ‐1.21 652 652 Limited 17 ‐1.11 656 656 Limited 18 ‐1.02 660 660 Limited 19 ‐0.93 664 664 Limited 20 ‐0.84 667 668 Basic 21 ‐0.76 671 671 Basic 22 ‐0.67 674 674 Basic 23 ‐0.59 678 678 Basic 24 ‐0.51 681 681 Basic 25 ‐0.43 685 685 Basic 26 ‐0.36 688 688 Basic 27 ‐0.28 691 691 Basic 28 ‐0.21 694 694 Basic 29 ‐0.13 697 697 Basic 30 ‐0.06 701 701 Proficient 31 0.02 704 704 Proficient 32 0.09 707 707 Proficient 33 0.17 710 710 Proficient 34 0.24 713 713 Proficient 35 0.32 716 716 Proficient 36 0.40 720 720 Proficient 37 0.47 723 723 Proficient 38 0.55 726 726 Accelerated 39 0.63 730 730 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 568 568 Limited 1 ‐3.50 568 568 Limited 2 ‐3.50 568 568 Limited 3 ‐3.48 569 569 Limited 4 ‐3.14 581 581 Limited 5 ‐2.87 592 592 Limited 6 ‐2.64 600 600 Limited 7 ‐2.44 608 608 Limited 8 ‐2.26 615 615 Limited 9 ‐2.09 621 621 Limited 10 ‐1.94 627 627 Limited 11 ‐1.79 632 632 Limited 12 ‐1.66 638 638 Limited 13 ‐1.53 642 642 Limited 14 ‐1.41 647 647 Limited 15 ‐1.29 651 651 Limited 16 ‐1.18 656 656 Limited 17 ‐1.08 660 660 Limited 18 ‐0.97 663 663 Limited 19 ‐0.87 667 667 Limited 20 ‐0.78 671 671 Basic 21 ‐0.69 674 674 Basic 22 ‐0.59 678 678 Basic 23 ‐0.51 681 681 Basic 24 ‐0.42 684 684 Basic 25 ‐0.34 688 688 Basic 26 ‐0.25 691 691 Basic 27 ‐0.17 694 694 Basic 28 ‐0.09 697 697 Basic 29 ‐0.01 700 700 Proficient 30 0.07 703 703 Proficient 31 0.15 706 706 Proficient 32 0.23 709 709 Proficient 33 0.31 712 712 Proficient 34 0.38 715 715 Proficient 35 0.46 718 718 Proficient 36 0.54 721 721 Proficient 37 0.63 724 725 Accelerated 38 0.71 727 727 Accelerated 39 0.79 730 730 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.88 734 734 Accelerated 41 0.96 737 737 Accelerated 42 1.06 740 740 Accelerated 43 1.15 744 744 Accelerated 44 1.24 748 749 Advanced 45 1.35 751 751 Advanced 46 1.45 755 755 Advanced 47 1.56 759 759 Advanced 48 1.67 764 764 Advanced 49 1.80 768 768 Advanced 50 1.93 773 773 Advanced 51 2.07 779 779 Advanced 52 2.22 784 784 Advanced 53 2.38 791 791 Advanced 54 2.56 797 797 Advanced 55 2.77 805 805 Advanced 56 3.00 814 814 Advanced 57 3.27 824 824 Advanced 58 3.50 833 833 Advanced 59 3.50 833 833 Advanced 60 3.50 833 833 Advanced 61 3.50 833 833 Advanced 62 3.50 833 833 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 568 568 Limited 1 ‐3.50 568 568 Limited 2 ‐3.50 568 568 Limited 3 ‐3.48 569 569 Limited 4 ‐3.14 581 581 Limited 5 ‐2.87 592 592 Limited 6 ‐2.64 600 600 Limited 7 ‐2.44 608 608 Limited 8 ‐2.26 615 615 Limited 9 ‐2.09 621 621 Limited 10 ‐1.94 627 627 Limited 11 ‐1.79 632 632 Limited 12 ‐1.66 638 638 Limited 13 ‐1.53 642 642 Limited 14 ‐1.41 647 647 Limited 15 ‐1.29 651 651 Limited 16 ‐1.18 656 656 Limited 17 ‐1.08 660 660 Limited 18 ‐0.97 663 663 Limited 19 ‐0.87 667 667 Limited 20 ‐0.78 671 671 Basic 21 ‐0.69 674 674 Basic 22 ‐0.59 678 678 Basic 23 ‐0.51 681 681 Basic 24 ‐0.42 684 684 Basic 25 ‐0.34 688 688 Basic 26 ‐0.25 691 691 Basic 27 ‐0.17 694 694 Basic 28 ‐0.09 697 697 Basic 29 ‐0.01 700 700 Proficient 30 0.07 703 703 Proficient 31 0.15 706 706 Proficient 32 0.23 709 709 Proficient 33 0.31 712 712 Proficient 34 0.38 715 715 Proficient 35 0.46 718 718 Proficient 36 0.54 721 721 Proficient 37 0.63 724 725 Accelerated 38 0.71 727 727 Accelerated 39 0.79 730 730 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 586 586 Limited 1 ‐3.50 586 586 Limited 2 ‐3.50 586 586 Limited 3 ‐3.27 593 593 Limited 4 ‐2.94 603 603 Limited 5 ‐2.68 612 612 Limited 6 ‐2.46 618 618 Limited 7 ‐2.27 625 625 Limited 8 ‐2.09 630 630 Limited 9 ‐1.94 635 635 Limited 10 ‐1.80 639 639 Limited 11 ‐1.67 643 643 Limited 12 ‐1.54 647 647 Limited 13 ‐1.43 651 651 Limited 14 ‐1.32 654 654 Limited 15 ‐1.21 657 657 Limited 16 ‐1.11 661 661 Limited 17 ‐1.02 664 664 Limited 18 ‐0.92 666 666 Limited 19 ‐0.83 669 669 Limited 20 ‐0.75 672 672 Limited 21 ‐0.66 675 675 Limited 22 ‐0.58 677 677 Limited 23 ‐0.49 680 680 Limited 24 ‐0.41 682 682 Basic 25 ‐0.33 685 685 Basic 26 ‐0.26 687 687 Basic 27 ‐0.18 690 690 Basic 28 ‐0.10 692 692 Basic 29 ‐0.02 695 695 Basic 30 0.05 697 697 Basic 31 0.13 699 700 Proficient 32 0.21 702 702 Proficient 33 0.29 704 704 Proficient 34 0.37 707 707 Proficient 35 0.44 709 709 Proficient 36 0.53 712 712 Proficient 37 0.61 714 714 Proficient 38 0.69 717 717 Proficient 39 0.78 720 720 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.87 722 722 Proficient 41 0.96 725 725 Accelerated 42 1.05 728 728 Accelerated 43 1.15 731 731 Accelerated 44 1.24 734 734 Accelerated 45 1.35 737 737 Accelerated 46 1.46 741 741 Accelerated 47 1.57 744 744 Advanced 48 1.69 748 748 Advanced 49 1.81 752 752 Advanced 50 1.94 756 756 Advanced 51 2.09 760 760 Advanced 52 2.24 765 765 Advanced 53 2.40 770 770 Advanced 54 2.58 776 776 Advanced 55 2.78 782 782 Advanced 56 3.00 789 789 Advanced 57 3.26 797 797 Advanced 58 3.50 805 805 Advanced 59 3.50 805 805 Advanced 60 3.50 805 805 Advanced 61 3.50 805 805 Advanced 62 3.50 805 805 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 586 586 Limited 1 ‐3.50 586 586 Limited 2 ‐3.50 586 586 Limited 3 ‐3.27 593 593 Limited 4 ‐2.94 603 603 Limited 5 ‐2.68 612 612 Limited 6 ‐2.46 618 618 Limited 7 ‐2.27 625 625 Limited 8 ‐2.09 630 630 Limited 9 ‐1.94 635 635 Limited 10 ‐1.80 639 639 Limited 11 ‐1.67 643 643 Limited 12 ‐1.54 647 647 Limited 13 ‐1.43 651 651 Limited 14 ‐1.32 654 654 Limited 15 ‐1.21 657 657 Limited 16 ‐1.11 661 661 Limited 17 ‐1.02 664 664 Limited 18 ‐0.92 666 666 Limited 19 ‐0.83 669 669 Limited 20 ‐0.75 672 672 Limited 21 ‐0.66 675 675 Limited 22 ‐0.58 677 677 Limited 23 ‐0.49 680 680 Limited 24 ‐0.41 682 682 Basic 25 ‐0.33 685 685 Basic 26 ‐0.26 687 687 Basic 27 ‐0.18 690 690 Basic 28 ‐0.10 692 692 Basic 29 ‐0.02 695 695 Basic 30 0.05 697 697 Basic 31 0.13 699 700 Proficient 32 0.21 702 702 Proficient 33 0.29 704 704 Proficient 34 0.37 707 707 Proficient 35 0.44 709 709 Proficient 36 0.53 712 712 Proficient 37 0.61 714 714 Proficient 38 0.69 717 717 Proficient 39 0.78 720 720 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.87 722 722 Proficient 41 0.96 725 725 Accelerated 42 1.05 728 728 Accelerated 43 1.15 731 731 Accelerated 44 1.24 734 734 Accelerated 45 1.35 737 737 Accelerated 46 1.46 741 741 Accelerated 47 1.57 744 744 Advanced 48 1.69 748 748 Advanced 49 1.81 752 752 Advanced 50 1.94 756 756 Advanced 51 2.09 760 760 Advanced 52 2.24 765 765 Advanced 53 2.40 770 770 Advanced 54 2.58 776 776 Advanced 55 2.78 782 782 Advanced 56 3.00 789 789 Advanced 57 3.26 797 797 Advanced 58 3.50 805 805 Advanced 59 3.50 805 805 Advanced 60 3.50 805 805 Advanced 61 3.50 805 805 Advanced 62 3.50 805 805 Advanced



Table G37. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA I Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score





Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G38. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA I Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score





Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G39. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA II Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.19 606 606 Limited 4 ‐2.86 616 616 Limited 5 ‐2.59 624 624 Limited 6 ‐2.37 631 631 Limited 7 ‐2.17 637 637 Limited 8 ‐2.00 642 642 Limited 9 ‐1.84 647 647 Limited 10 ‐1.70 651 651 Limited 11 ‐1.57 655 655 Limited 12 ‐1.44 659 659 Limited 13 ‐1.33 662 662 Limited 14 ‐1.21 666 666 Limited 15 ‐1.11 669 669 Limited 16 ‐1.01 672 672 Limited 17 ‐0.91 675 675 Limited 18 ‐0.81 678 679 Basic 19 ‐0.72 681 681 Basic 20 ‐0.63 683 683 Basic 21 ‐0.55 686 686 Basic 22 ‐0.46 688 688 Basic 23 ‐0.38 691 691 Basic 24 ‐0.30 693 693 Basic 25 ‐0.22 696 696 Basic 26 ‐0.14 698 698 Basic 27 ‐0.07 700 700 Proficient 28 0.01 703 703 Proficient 29 0.09 705 705 Proficient 30 0.16 707 707 Proficient 31 0.24 710 710 Proficient 32 0.31 712 712 Proficient 33 0.39 714 714 Proficient 34 0.46 716 716 Proficient 35 0.54 719 719 Proficient 36 0.62 721 721 Proficient 37 0.70 723 723 Proficient 38 0.78 726 726 Accelerated 39 0.86 728 728 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.94 731 731 Accelerated 41 1.03 733 733 Accelerated 42 1.12 736 736 Accelerated 43 1.21 739 739 Accelerated 44 1.31 742 742 Advanced 45 1.42 745 745 Advanced 46 1.52 748 748 Advanced 47 1.64 752 752 Advanced 48 1.76 755 755 Advanced 49 1.89 759 759 Advanced 50 2.02 763 763 Advanced 51 2.17 768 768 Advanced 52 2.33 773 773 Advanced 53 2.50 778 778 Advanced 54 2.69 783 783 Advanced 55 2.90 790 790 Advanced 56 3.14 797 797 Advanced 57 3.42 805 805 Advanced 58 3.50 808 808 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced



Table G40. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA II Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.19 606 606 Limited 4 ‐2.86 616 616 Limited 5 ‐2.59 624 624 Limited 6 ‐2.37 631 631 Limited 7 ‐2.17 637 637 Limited 8 ‐2.00 642 642 Limited 9 ‐1.84 647 647 Limited 10 ‐1.70 651 651 Limited 11 ‐1.57 655 655 Limited 12 ‐1.44 659 659 Limited 13 ‐1.33 662 662 Limited 14 ‐1.21 666 666 Limited 15 ‐1.11 669 669 Limited 16 ‐1.01 672 672 Limited 17 ‐0.91 675 675 Limited 18 ‐0.81 678 679 Basic 19 ‐0.72 681 681 Basic 20 ‐0.63 683 683 Basic 21 ‐0.55 686 686 Basic 22 ‐0.46 688 688 Basic 23 ‐0.38 691 691 Basic 24 ‐0.30 693 693 Basic 25 ‐0.22 696 696 Basic 26 ‐0.14 698 698 Basic 27 ‐0.07 700 700 Proficient 28 0.01 703 703 Proficient 29 0.09 705 705 Proficient 30 0.16 707 707 Proficient 31 0.24 710 710 Proficient 32 0.31 712 712 Proficient 33 0.39 714 714 Proficient 34 0.46 716 716 Proficient 35 0.54 719 719 Proficient 36 0.62 721 721 Proficient 37 0.70 723 723 Proficient 38 0.78 726 726 Accelerated 39 0.86 728 728 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.94 731 731 Accelerated 41 1.03 733 733 Accelerated 42 1.12 736 736 Accelerated 43 1.21 739 739 Accelerated 44 1.31 742 742 Advanced 45 1.42 745 745 Advanced 46 1.52 748 748 Advanced 47 1.64 752 752 Advanced 48 1.76 755 755 Advanced 49 1.89 759 759 Advanced 50 2.02 763 763 Advanced 51 2.17 768 768 Advanced 52 2.33 773 773 Advanced 53 2.50 778 778 Advanced 54 2.69 783 783 Advanced 55 2.90 790 790 Advanced 56 3.14 797 797 Advanced 57 3.42 805 805 Advanced 58 3.50 808 808 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced



Table G41. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Math Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 587 587 Limited 1 ‐3.50 587 587 Limited 2 ‐3.50 587 587 Limited 3 ‐3.46 589 589 Limited 4 ‐3.11 600 600 Limited 5 ‐2.83 609 609 Limited 6 ‐2.59 617 617 Limited 7 ‐2.38 624 624 Limited 8 ‐2.19 630 630 Limited 9 ‐2.02 636 636 Limited 10 ‐1.86 642 642 Limited 11 ‐1.70 647 647 Limited 12 ‐1.56 651 651 Limited 13 ‐1.42 656 656 Limited 14 ‐1.29 660 660 Limited 15 ‐1.16 664 664 Limited 16 ‐1.04 669 669 Limited 17 ‐0.91 673 673 Limited 18 ‐0.79 677 677 Limited 19 ‐0.68 680 680 Limited 20 ‐0.56 684 684 Basic 21 ‐0.44 688 688 Basic 22 ‐0.33 692 692 Basic 23 ‐0.21 696 696 Basic 24 ‐0.10 699 700 Proficient 25 0.02 703 703 Proficient 26 0.13 707 707 Proficient 27 0.25 711 711 Proficient 28 0.37 715 715 Proficient 29 0.49 719 719 Proficient 30 0.61 723 723 Proficient 31 0.73 727 727 Accelerated 32 0.85 731 731 Accelerated 33 0.98 735 735 Accelerated 34 1.11 739 739 Accelerated 35 1.25 744 744 Accelerated 36 1.39 748 748 Accelerated 37 1.53 753 753 Advanced 38 1.68 758 758 Advanced 39 1.83 763 763 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.99 768 768 Advanced 41 2.17 774 774 Advanced 42 2.35 780 780 Advanced 43 2.56 787 787 Advanced 44 2.79 795 795 Advanced 45 3.07 804 804 Advanced 46 3.41 815 815 Advanced 47 3.50 818 818 Advanced 48 3.50 818 818 Advanced 49 3.50 818 818 Advanced



Table G42. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Math Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 587 587 Limited 1 ‐3.50 587 587 Limited 2 ‐3.50 587 587 Limited 3 ‐3.44 589 589 Limited 4 ‐3.09 601 601 Limited 5 ‐2.81 610 610 Limited 6 ‐2.57 618 618 Limited 7 ‐2.36 625 625 Limited 8 ‐2.17 631 631 Limited 9 ‐1.99 637 637 Limited 10 ‐1.82 643 643 Limited 11 ‐1.67 648 648 Limited 12 ‐1.52 653 653 Limited 13 ‐1.38 657 657 Limited 14 ‐1.24 662 662 Limited 15 ‐1.11 666 666 Limited 16 ‐0.98 670 670 Limited 17 ‐0.86 674 674 Limited 18 ‐0.73 678 678 Limited 19 ‐0.61 682 683 Basic 20 ‐0.49 686 686 Basic 21 ‐0.37 690 690 Basic 22 ‐0.26 694 694 Basic 23 ‐0.14 698 698 Basic 24 ‐0.02 702 702 Proficient 25 0.10 706 706 Proficient 26 0.21 710 710 Proficient 27 0.33 714 714 Proficient 28 0.45 717 717 Proficient 29 0.57 721 721 Proficient 30 0.69 725 725 Accelerated 31 0.81 729 729 Accelerated 32 0.94 733 733 Accelerated 33 1.06 738 738 Accelerated 34 1.19 742 742 Accelerated 35 1.32 746 746 Accelerated 36 1.45 750 750 Accelerated 37 1.59 755 755 Advanced 38 1.73 760 760 Advanced 39 1.88 764 764 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score








Scaled Score


Scaled Score


Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.47 606 606 Limited 3 ‐3.03 620 620 Limited 4 ‐2.70 631 631 Limited 5 ‐2.44 640 640 Limited 6 ‐2.22 647 647 Limited 7 ‐2.02 653 653 Limited 8 ‐1.85 659 659 Limited 9 ‐1.69 664 664 Limited 10 ‐1.55 669 669 Limited 11 ‐1.41 674 674 Limited 12 ‐1.28 678 678 Limited 13 ‐1.16 682 682 Limited 14 ‐1.04 686 686 Basic 15 ‐0.93 689 689 Basic 16 ‐0.82 693 693 Basic 17 ‐0.72 696 696 Basic 18 ‐0.62 700 700 Proficient 19 ‐0.52 703 703 Proficient 20 ‐0.42 706 706 Proficient 21 ‐0.32 710 710 Proficient 22 ‐0.22 713 713 Proficient 23 ‐0.13 716 716 Proficient 24 ‐0.03 719 719 Proficient 25 0.06 722 722 Proficient 26 0.16 725 725 Accelerated 27 0.25 728 728 Accelerated 28 0.35 732 732 Accelerated 29 0.45 735 735 Accelerated 30 0.55 738 738 Accelerated 31 0.65 741 741 Accelerated 32 0.75 745 745 Accelerated 33 0.86 748 748 Accelerated 34 0.97 752 752 Accelerated 35 1.08 756 756 Accelerated 36 1.20 760 760 Advanced 37 1.33 764 764 Advanced 38 1.46 768 768 Advanced 39 1.60 773 773 Advanced





Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.47 606 606 Limited 3 ‐3.03 620 620 Limited 4 ‐2.70 631 631 Limited 5 ‐2.44 640 640 Limited 6 ‐2.22 647 647 Limited 7 ‐2.02 653 653 Limited 8 ‐1.85 659 659 Limited 9 ‐1.69 664 664 Limited 10 ‐1.55 669 669 Limited 11 ‐1.41 674 674 Limited 12 ‐1.28 678 678 Limited 13 ‐1.16 682 682 Limited 14 ‐1.04 686 686 Basic 15 ‐0.93 689 689 Basic 16 ‐0.82 693 693 Basic 17 ‐0.72 696 696 Basic 18 ‐0.62 700 700 Proficient 19 ‐0.52 703 703 Proficient 20 ‐0.42 706 706 Proficient 21 ‐0.32 710 710 Proficient 22 ‐0.22 713 713 Proficient 23 ‐0.13 716 716 Proficient 24 ‐0.03 719 719 Proficient 25 0.06 722 722 Proficient 26 0.16 725 725 Accelerated 27 0.25 728 728 Accelerated 28 0.35 732 732 Accelerated 29 0.45 735 735 Accelerated 30 0.55 738 738 Accelerated 31 0.65 741 741 Accelerated 32 0.75 745 745 Accelerated 33 0.86 748 748 Accelerated 34 0.97 752 752 Accelerated 35 1.08 756 756 Accelerated 36 1.20 760 760 Advanced 37 1.33 764 764 Advanced 38 1.46 768 768 Advanced 39 1.60 773 773 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score






Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 624 624 Limited 1 ‐3.50 624 624 Limited 2 ‐3.50 624 624 Limited 3 ‐3.23 631 631 Limited 4 ‐2.90 639 639 Limited 5 ‐2.62 646 646 Limited 6 ‐2.39 652 652 Limited 7 ‐2.19 658 658 Limited 8 ‐2.00 662 662 Limited 9 ‐1.83 667 667 Limited 10 ‐1.68 671 671 Limited 11 ‐1.53 674 674 Limited 12 ‐1.40 678 678 Limited 13 ‐1.26 681 681 Limited 14 ‐1.14 685 685 Limited 15 ‐1.02 688 688 Basic 16 ‐0.90 691 691 Basic 17 ‐0.79 694 694 Basic 18 ‐0.68 696 696 Basic 19 ‐0.57 699 700 Proficient 20 ‐0.46 702 702 Proficient 21 ‐0.36 705 705 Proficient 22 ‐0.26 707 707 Proficient 23 ‐0.15 710 710 Proficient 24 ‐0.05 713 713 Proficient 25 0.05 715 715 Proficient 26 0.15 718 718 Proficient 27 0.25 720 720 Proficient 28 0.35 723 723 Proficient 29 0.46 726 726 Accelerated 30 0.56 728 728 Accelerated 31 0.66 731 731 Accelerated 32 0.77 734 734 Accelerated 33 0.88 737 737 Accelerated 34 0.99 739 739 Accelerated 35 1.10 742 742 Accelerated 36 1.22 745 745 Accelerated 37 1.34 748 749 Advanced 38 1.47 752 752 Advanced 39 1.60 755 755 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.74 759 759 Advanced 41 1.88 762 762 Advanced 42 2.04 767 767 Advanced 43 2.21 771 771 Advanced 44 2.41 776 776 Advanced 45 2.63 782 782 Advanced 46 2.89 788 788 Advanced 47 3.21 797 797 Advanced 48 3.50 804 804 Advanced 49 3.50 804 804 Advanced 50 3.50 804 804 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 624 624 Limited 1 ‐3.50 624 624 Limited 2 ‐3.50 624 624 Limited 3 ‐3.23 631 631 Limited 4 ‐2.90 639 639 Limited 5 ‐2.62 646 646 Limited 6 ‐2.39 652 652 Limited 7 ‐2.19 658 658 Limited 8 ‐2.00 662 662 Limited 9 ‐1.83 667 667 Limited 10 ‐1.68 671 671 Limited 11 ‐1.53 674 674 Limited 12 ‐1.40 678 678 Limited 13 ‐1.26 681 681 Limited 14 ‐1.14 685 685 Limited 15 ‐1.02 688 688 Basic 16 ‐0.90 691 691 Basic 17 ‐0.79 694 694 Basic 18 ‐0.68 696 696 Basic 19 ‐0.57 699 700 Proficient 20 ‐0.46 702 702 Proficient 21 ‐0.36 705 705 Proficient 22 ‐0.26 707 707 Proficient 23 ‐0.15 710 710 Proficient 24 ‐0.05 713 713 Proficient 25 0.05 715 715 Proficient 26 0.15 718 718 Proficient 27 0.25 720 720 Proficient 28 0.35 723 723 Proficient 29 0.46 726 726 Accelerated 30 0.56 728 728 Accelerated 31 0.66 731 731 Accelerated 32 0.77 734 734 Accelerated 33 0.88 737 737 Accelerated 34 0.99 739 739 Accelerated 35 1.10 742 742 Accelerated 36 1.22 745 745 Accelerated 37 1.34 748 749 Advanced 38 1.47 752 752 Advanced 39 1.60 755 755 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.74 759 759 Advanced 41 1.88 762 762 Advanced 42 2.04 767 767 Advanced 43 2.21 771 771 Advanced 44 2.41 776 776 Advanced 45 2.63 782 782 Advanced 46 2.89 788 788 Advanced 47 3.21 797 797 Advanced 48 3.50 804 804 Advanced 49 3.50 804 804 Advanced 50 3.50 804 804 Advanced






Scaled Score


Scaled Score


Proficiency 0 ‐3.50 616 616 Limited 1 ‐3.50 616 616 Limited 2 ‐3.50 616 616 Limited 3 ‐3.38 619 619 Limited 4 ‐3.04 628 628 Limited 5 ‐2.77 635 635 Limited 6 ‐2.53 640 640 Limited 7 ‐2.33 645 645 Limited 8 ‐2.15 650 650 Limited 9 ‐1.98 654 654 Limited 10 ‐1.83 658 658 Limited 11 ‐1.68 661 661 Limited 12 ‐1.54 665 665 Limited 13 ‐1.41 668 668 Limited 14 ‐1.29 671 671 Limited 15 ‐1.17 674 674 Limited 16 ‐1.05 677 677 Limited 17 ‐0.94 680 680 Limited 18 ‐0.83 682 682 Basic 19 ‐0.72 685 685 Basic 20 ‐0.61 688 688 Basic 21 ‐0.51 690 690 Basic 22 ‐0.40 693 693 Basic 23 ‐0.30 696 696 Basic 24 ‐0.20 698 698 Basic 25 ‐0.10 701 701 Proficient 26 0.00 703 703 Proficient 27 0.11 706 706 Proficient 28 0.21 708 708 Proficient 29 0.31 711 711 Proficient 30 0.42 713 713 Proficient 31 0.52 716 716 Proficient 32 0.63 719 719 Proficient 33 0.74 721 721 Proficient 34 0.84 724 725 Accelerated 35 0.96 727 727 Accelerated 36 1.07 729 729 Accelerated 37 1.19 732 732 Accelerated 38 1.31 735 735 Accelerated 39 1.43 738 738 Accelerated





Scaled Score


Scaled Score


Proficiency 40 1.56 742 742 Accelerated 41 1.70 745 745 Advanced 42 1.84 748 748 Advanced 43 1.99 752 752 Advanced 44 2.15 756 756 Advanced 45 2.32 760 760 Advanced 46 2.51 765 765 Advanced 47 2.72 770 770 Advanced 48 2.96 776 776 Advanced 49 3.24 783 783 Advanced 50 3.50 790 790 Advanced 51 3.50 790 790 Advanced 52 3.50 790 790 Advanced 53 3.50 790 790 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 616 616 Limited 1 ‐3.50 616 616 Limited 2 ‐3.50 616 616 Limited 3 ‐3.38 619 619 Limited 4 ‐3.04 628 628 Limited 5 ‐2.77 635 635 Limited 6 ‐2.53 640 640 Limited 7 ‐2.33 645 645 Limited 8 ‐2.15 650 650 Limited 9 ‐1.98 654 654 Limited 10 ‐1.83 658 658 Limited 11 ‐1.68 661 661 Limited 12 ‐1.54 665 665 Limited 13 ‐1.41 668 668 Limited 14 ‐1.29 671 671 Limited 15 ‐1.17 674 674 Limited 16 ‐1.05 677 677 Limited 17 ‐0.94 680 680 Limited 18 ‐0.83 682 682 Basic 19 ‐0.72 685 685 Basic 20 ‐0.61 688 688 Basic 21 ‐0.51 690 690 Basic 22 ‐0.40 693 693 Basic 23 ‐0.30 696 696 Basic 24 ‐0.20 698 698 Basic 25 ‐0.10 701 701 Proficient 26 0.00 703 703 Proficient 27 0.11 706 706 Proficient 28 0.21 708 708 Proficient 29 0.31 711 711 Proficient 30 0.42 713 713 Proficient 31 0.52 716 716 Proficient 32 0.63 719 719 Proficient 33 0.74 721 721 Proficient 34 0.84 724 725 Accelerated 35 0.96 727 727 Accelerated 36 1.07 729 729 Accelerated 37 1.19 732 732 Accelerated 38 1.31 735 735 Accelerated 39 1.43 738 738 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.56 742 742 Accelerated 41 1.70 745 745 Advanced 42 1.84 748 748 Advanced 43 1.99 752 752 Advanced 44 2.15 756 756 Advanced 45 2.32 760 760 Advanced 46 2.51 765 765 Advanced 47 2.72 770 770 Advanced 48 2.96 776 776 Advanced 49 3.24 783 783 Advanced 50 3.50 790 790 Advanced 51 3.50 790 790 Advanced 52 3.50 790 790 Advanced 53 3.50 790 790 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.45 606 606 Limited 3 ‐3.00 619 619 Limited 4 ‐2.67 629 629 Limited 5 ‐2.41 636 636 Limited 6 ‐2.18 643 643 Limited 7 ‐1.99 648 648 Limited 8 ‐1.81 653 653 Limited 9 ‐1.65 658 658 Limited 10 ‐1.51 662 662 Limited 11 ‐1.37 666 666 Limited 12 ‐1.24 670 670 Limited 13 ‐1.12 673 673 Limited 14 ‐1.00 677 677 Limited 15 ‐0.89 680 680 Limited 16 ‐0.78 683 684 Basic 17 ‐0.67 686 686 Basic 18 ‐0.57 689 689 Basic 19 ‐0.47 692 692 Basic 20 ‐0.37 695 695 Basic 21 ‐0.28 697 697 Basic 22 ‐0.18 700 700 Proficient 23 ‐0.09 703 703 Proficient 24 0.00 706 706 Proficient 25 0.10 708 708 Proficient 26 0.19 711 711 Proficient 27 0.28 713 713 Proficient 28 0.37 716 716 Proficient 29 0.46 719 719 Proficient 30 0.56 721 721 Proficient 31 0.65 724 725 Accelerated 32 0.74 727 727 Accelerated 33 0.84 730 730 Accelerated 34 0.93 732 732 Accelerated 35 1.03 735 735 Accelerated 36 1.13 738 738 Accelerated 37 1.24 741 741 Accelerated 38 1.34 744 744 Accelerated 39 1.45 747 747 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.57 751 751 Accelerated 41 1.70 754 755 Advanced 42 1.83 758 758 Advanced 43 1.97 762 762 Advanced 44 2.12 767 767 Advanced 45 2.30 771 771 Advanced 46 2.49 777 777 Advanced 47 2.72 784 784 Advanced 48 2.99 791 791 Advanced 49 3.33 801 801 Advanced 50 3.50 806 806 Advanced 51 3.50 806 806 Advanced 52 3.50 806 806 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.45 606 606 Limited 3 ‐3.00 619 619 Limited 4 ‐2.67 629 629 Limited 5 ‐2.41 636 636 Limited 6 ‐2.18 643 643 Limited 7 ‐1.99 648 648 Limited 8 ‐1.81 653 653 Limited 9 ‐1.65 658 658 Limited 10 ‐1.51 662 662 Limited 11 ‐1.37 666 666 Limited 12 ‐1.24 670 670 Limited 13 ‐1.12 673 673 Limited 14 ‐1.00 677 677 Limited 15 ‐0.89 680 680 Limited 16 ‐0.78 683 684 Basic 17 ‐0.67 686 686 Basic 18 ‐0.57 689 689 Basic 19 ‐0.47 692 692 Basic 20 ‐0.37 695 695 Basic 21 ‐0.28 697 697 Basic 22 ‐0.18 700 700 Proficient 23 ‐0.09 703 703 Proficient 24 0.00 706 706 Proficient 25 0.10 708 708 Proficient 26 0.19 711 711 Proficient 27 0.28 713 713 Proficient 28 0.37 716 716 Proficient 29 0.46 719 719 Proficient 30 0.56 721 721 Proficient 31 0.65 724 725 Accelerated 32 0.74 727 727 Accelerated 33 0.84 730 730 Accelerated 34 0.93 732 732 Accelerated 35 1.03 735 735 Accelerated 36 1.13 738 738 Accelerated 37 1.24 741 741 Accelerated 38 1.34 744 744 Accelerated 39 1.45 747 747 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.57 751 751 Accelerated 41 1.70 754 755 Advanced 42 1.83 758 758 Advanced 43 1.97 762 klklkl762 Advanced 44 2.12 767 767 Advanced 45 2.30 771 771 Advanced 46 2.49 777 777 Advanced 47 2.72 784 784 Advanced 48 2.99 791 791 Advanced 49 3.33 801 801 Advanced 50 3.50 806 806 Advanced 51 3.50 806 806 Advanced 52 3.50 806 806 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 633 633 Limited 1 ‐3.50 633 633 Limited 2 ‐3.50 633 633 Limited 3 ‐3.50 633 633 Limited 4 ‐3.23 639 639 Limited 5 ‐2.94 644 644 Limited 6 ‐2.70 649 649 Limited 7 ‐2.48 654 654 Limited 8 ‐2.29 658 658 Limited 9 ‐2.11 661 661 Limited 10 ‐1.94 665 665 Limited 11 ‐1.78 668 668 Limited 12 ‐1.63 671 671 Limited 13 ‐1.48 674 674 Limited 14 ‐1.35 676 676 Limited 15 ‐1.21 679 679 Limited 16 ‐1.08 682 682 Limited 17 ‐0.96 684 684 Limited 18 ‐0.83 687 687 Limited 19 ‐0.71 689 690 Basic 20 ‐0.59 692 692 Basic 21 ‐0.48 694 694 Basic 22 ‐0.36 696 696 Basic 23 ‐0.25 699 699 Basic 24 ‐0.13 701 701 Proficient 25 ‐0.02 703 703 Proficient 26 0.09 705 705 Proficient 27 0.20 708 708 Proficient 28 0.31 710 710 Proficient 29 0.42 712 712 Proficient 30 0.53 714 714 Proficient 31 0.64 717 717 Proficient 32 0.75 719 719 Proficient 33 0.87 721 721 Proficient 34 0.98 723 723 Proficient 35 1.09 726 726 Accelerated 36 1.21 728 728 Accelerated 37 1.33 730 730 Accelerated 38 1.45 733 733 Accelerated 39 1.58 735 735 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.71 738 738 Accelerated 41 1.84 741 741 Accelerated 42 1.98 744 744 Advanced 43 2.13 747 747 Advanced 44 2.29 750 750 Advanced 45 2.46 753 753 Advanced 46 2.64 757 757 Advanced 47 2.84 761 761 Advanced 48 3.07 766 766 Advanced 49 3.35 771 771 Advanced 50 3.50 774 774 Advanced 51 3.50 774 774 Advanced 52 3.50 774 774 Advanced 53 3.50 774 774 Advanced




Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 633 633 Limited 1 ‐3.50 633 633 Limited 2 ‐3.50 633 633 Limited 3 ‐3.50 633 633 Limited 4 ‐3.23 639 639 Limited 5 ‐2.94 644 644 Limited 6 ‐2.70 649 649 Limited 7 ‐2.48 654 654 Limited 8 ‐2.29 658 658 Limited 9 ‐2.11 661 661 Limited 10 ‐1.94 665 665 Limited 11 ‐1.78 668 668 Limited 12 ‐1.63 671 671 Limited 13 ‐1.48 674 674 Limited 14 ‐1.35 676 676 Limited 15 ‐1.21 679 679 Limited 16 ‐1.08 682 682 Limited 17 ‐0.96 684 684 Limited 18 ‐0.83 687 687 Limited 19 ‐0.71 689 690 Basic 20 ‐0.59 692 692 Basic 21 ‐0.48 694 694 Basic 22 ‐0.36 696 696 Basic 23 ‐0.25 699 699 Basic 24 ‐0.13 701 701 Proficient 25 ‐0.02 703 703 Proficient 26 0.09 705 705 Proficient 27 0.20 708 708 Proficient 28 0.31 710 710 Proficient 29 0.42 712 712 Proficient 30 0.53 714 714 Proficient 31 0.64 717 717 Proficient 32 0.75 719 719 Proficient 33 0.87 721 721 Proficient 34 0.98 723 723 Proficient 35 1.09 726 726 Accelerated 36 1.21 728 728 Accelerated 37 1.33 730 730 Accelerated 38 1.45 733 733 Accelerated 39 1.58 735 735 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.71 738 738 Accelerated 41 1.84 741 741 Accelerated 42 1.98 744 744 Advanced 43 2.13 747 747 Advanced 44 2.29 750 750 Advanced 45 2.46 753 753 Advanced 46 2.64 757 757 Advanced 47 2.84 761 761 Advanced 48 3.07 766 766 Advanced 49 3.35 771 771 Advanced 50 3.50 774 774 Advanced 51 3.50 774 774 Advanced 52 3.50 774 774 Advanced 53 3.50 774 774 Advanced



Table G53. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Algebra Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.10 629 629 Limited 4 ‐2.78 638 638 Limited 5 ‐2.51 645 645 Limited 6 ‐2.29 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.78 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.15 684 684 Basic 15 ‐1.04 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.74 695 695 Basic 19 ‐0.64 698 698 Basic 20 ‐0.55 701 701 Proficient 21 ‐0.46 703 703 Proficient 22 ‐0.37 706 706 Proficient 23 ‐0.28 708 708 Proficient 24 ‐0.19 711 711 Proficient 25 ‐0.10 713 713 Proficient 26 ‐0.01 716 716 Proficient 27 0.07 718 718 Proficient 28 0.16 721 721 Proficient 29 0.25 723 723 Proficient 30 0.34 726 726 Accelerated 31 0.43 728 728 Accelerated 32 0.52 731 731 Accelerated 33 0.62 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.91 741 741 Accelerated 37 1.01 744 744 Accelerated 38 1.11 747 747 Accelerated 39 1.22 750 750 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G54. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Algebra Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.10 629 629 Limited 4 ‐2.78 638 638 Limited 5 ‐2.51 645 645 Limited 6 ‐2.29 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.78 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.15 684 684 Basic 15 ‐1.04 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.74 695 695 Basic 19 ‐0.64 698 698 Basic 20 ‐0.55 701 701 Proficient 21 ‐0.46 703 703 Proficient 22 ‐0.37 706 706 Proficient 23 ‐0.28 708 708 Proficient 24 ‐0.19 711 711 Proficient 25 ‐0.10 713 713 Proficient 26 ‐0.01 716 716 Proficient 27 0.07 718 718 Proficient 28 0.16 721 721 Proficient 29 0.25 723 723 Proficient 30 0.34 726 726 Accelerated 31 0.43 728 728 Accelerated 32 0.52 731 731 Accelerated 33 0.62 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.91 741 741 Accelerated 37 1.01 744 744 Accelerated 38 1.11 747 747 Accelerated 39 1.22 750 750 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G55. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Geometry Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.34 609 609 Limited 3 ‐2.88 622 622 Limited 4 ‐2.55 632 632 Limited 5 ‐2.27 640 640 Limited 6 ‐2.04 647 647 Limited 7 ‐1.84 653 653 Limited 8 ‐1.66 658 658 Limited 9 ‐1.50 663 663 Limited 10 ‐1.35 668 668 Limited 11 ‐1.21 672 672 Limited 12 ‐1.07 676 676 Limited 13 ‐0.95 679 679 Basic 14 ‐0.83 683 683 Basic 15 ‐0.71 686 686 Basic 16 ‐0.60 689 689 Basic 17 ‐0.49 693 693 Basic 18 ‐0.39 696 696 Basic 19 ‐0.29 699 700 Proficient 20 ‐0.19 702 702 Proficient 21 ‐0.09 705 705 Proficient 22 0.01 707 707 Proficient 23 0.11 710 710 Proficient 24 0.20 713 713 Proficient 25 0.29 716 716 Proficient 26 0.39 719 719 Proficient 27 0.48 721 721 Proficient 28 0.57 724 725 Accelerated 29 0.67 727 727 Accelerated 30 0.76 730 730 Accelerated 31 0.85 732 732 Accelerated 32 0.95 735 735 Accelerated 33 1.05 738 738 Accelerated 34 1.14 741 741 Accelerated 35 1.24 744 744 Accelerated 36 1.34 747 747 Accelerated 37 1.45 750 750 Accelerated 38 1.55 753 753 Accelerated 39 1.66 756 756 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.78 759 759 Advanced 41 1.89 763 763 Advanced 42 2.02 766 766 Advanced 43 2.14 770 770 Advanced 44 2.28 774 774 Advanced 45 2.42 778 778 Advanced 46 2.58 783 783 Advanced 47 2.74 788 788 Advanced 48 2.93 793 793 Advanced 49 3.13 799 799 Advanced 50 3.36 806 806 Advanced 51 3.50 810 810 Advanced 52 3.50 810 810 Advanced 53 3.50 810 810 Advanced 54 3.50 810 810 Advanced 55 3.50 810 810 Advanced



Table G56. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Geometry Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.34 609 609 Limited 3 ‐2.88 622 622 Limited 4 ‐2.55 632 632 Limited 5 ‐2.27 640 640 Limited 6 ‐2.04 647 647 Limited 7 ‐1.84 653 653 Limited 8 ‐1.66 658 658 Limited 9 ‐1.50 663 663 Limited 10 ‐1.35 668 668 Limited 11 ‐1.21 672 672 Limited 12 ‐1.07 676 676 Limited 13 ‐0.95 679 679 Basic 14 ‐0.83 683 683 Basic 15 ‐0.71 686 686 Basic 16 ‐0.60 689 689 Basic 17 ‐0.49 693 693 Basic 18 ‐0.39 696 696 Basic 19 ‐0.29 699 700 Proficient 20 ‐0.19 702 702 Proficient 21 ‐0.09 705 705 Proficient 22 0.01 707 707 Proficient 23 0.11 710 710 Proficient 24 0.20 713 713 Proficient 25 0.29 716 716 Proficient 26 0.39 719 719 Proficient 27 0.48 721 721 Proficient 28 0.57 724 725 Accelerated 29 0.67 727 727 Accelerated 30 0.76 730 730 Accelerated 31 0.85 732 732 Accelerated 32 0.95 735 735 Accelerated 33 1.05 738 738 Accelerated 34 1.14 741 741 Accelerated 35 1.24 744 744 Accelerated 36 1.34 747 747 Accelerated 37 1.45 750 750 Accelerated 38 1.55 753 753 Accelerated 39 1.66 756 756 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G57. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math I Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.16 627 627 Limited 4 ‐2.84 636 636 Limited 5 ‐2.58 644 644 Limited 6 ‐2.35 650 650 Limited 7 ‐2.16 655 655 Limited 8 ‐1.99 660 660 Limited 9 ‐1.83 664 664 Limited 10 ‐1.69 669 669 Limited 11 ‐1.56 672 672 Limited 12 ‐1.43 676 676 Limited 13 ‐1.32 679 679 Limited 14 ‐1.20 682 682 Basic 15 ‐1.10 685 685 Basic 16 ‐0.99 688 688 Basic 17 ‐0.89 691 691 Basic 18 ‐0.79 694 694 Basic 19 ‐0.70 696 696 Basic 20 ‐0.61 699 700 Proficient 21 ‐0.52 702 702 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.34 707 707 Proficient 24 ‐0.25 709 709 Proficient 25 ‐0.16 711 711 Proficient 26 ‐0.08 714 714 Proficient 27 0.01 716 716 Proficient 28 0.10 719 719 Proficient 29 0.19 721 721 Proficient 30 0.27 724 724 Proficient 31 0.36 726 726 Accelerated 32 0.45 729 729 Accelerated 33 0.54 731 731 Accelerated 34 0.64 734 734 Accelerated 35 0.73 737 737 Accelerated 36 0.83 739 739 Accelerated 37 0.93 742 742 Accelerated 38 1.03 745 745 Accelerated 39 1.14 748 748 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.25 751 751 Accelerated 41 1.36 754 754 Advanced 42 1.48 758 758 Advanced 43 1.61 761 761 Advanced 44 1.75 765 765 Advanced 45 1.90 769 769 Advanced 46 2.06 774 774 Advanced 47 2.23 779 779 Advanced 48 2.43 784 784 Advanced 49 2.66 791 791 Advanced 50 2.93 798 798 Advanced 51 3.27 808 808 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced



Table G58. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math I Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.16 627 627 Limited 4 ‐2.84 636 636 Limited 5 ‐2.58 644 644 Limited 6 ‐2.35 650 650 Limited 7 ‐2.16 655 655 Limited 8 ‐1.99 660 660 Limited 9 ‐1.83 664 664 Limited 10 ‐1.69 669 669 Limited 11 ‐1.56 672 672 Limited 12 ‐1.43 676 676 Limited 13 ‐1.32 679 679 Limited 14 ‐1.20 682 682 Basic 15 ‐1.10 685 685 Basic 16 ‐0.99 688 688 Basic 17 ‐0.89 691 691 Basic 18 ‐0.79 694 694 Basic 19 ‐0.70 696 696 Basic 20 ‐0.61 699 700 Proficient 21 ‐0.52 702 702 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.34 707 707 Proficient 24 ‐0.25 709 709 Proficient 25 ‐0.16 711 711 Proficient 26 ‐0.08 714 714 Proficient 27 0.01 716 716 Proficient 28 0.10 719 719 Proficient 29 0.19 721 721 Proficient 30 0.27 724 724 Proficient 31 0.36 726 726 Accelerated 32 0.45 729 729 Accelerated 33 0.54 731 731 Accelerated 34 0.64 734 734 Accelerated 35 0.73 737 737 Accelerated 36 0.83 739 739 Accelerated 37 0.93 742 742 Accelerated 38 1.03 745 745 Accelerated 39 1.14 748 748 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.25 751 751 Accelerated 41 1.36 754 754 Advanced 42 1.48 758 758 Advanced 43 1.61 761 761 Advanced 44 1.75 765 765 Advanced 45 1.90 769 769 Advanced 46 2.06 774 774 Advanced 47 2.23 779 779 Advanced 48 2.43 784 784 Advanced 49 2.66 791 791 Advanced 50 2.93 798 798 Advanced 51 3.27 808 808 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced



Table G59. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math II Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.33 600 600 Limited 3 ‐2.88 614 614 Limited 4 ‐2.54 624 624 Limited 5 ‐2.27 632 632 Limited 6 ‐2.05 640 640 Limited 7 ‐1.85 646 646 Limited 8 ‐1.67 651 651 Limited 9 ‐1.50 657 657 Limited 10 ‐1.35 661 661 Limited 11 ‐1.21 666 666 Limited 12 ‐1.08 670 670 Limited 13 ‐0.95 674 674 Limited 14 ‐0.83 678 678 Basic 15 ‐0.71 681 681 Basic 16 ‐0.60 685 685 Basic 17 ‐0.49 688 688 Basic 18 ‐0.39 691 691 Basic 19 ‐0.28 695 695 Basic 20 ‐0.18 698 698 Basic 21 ‐0.08 701 701 Proficient 22 0.02 704 704 Proficient 23 0.12 707 707 Proficient 24 0.21 710 710 Proficient 25 0.31 713 713 Proficient 26 0.40 716 716 Proficient 27 0.50 719 719 Proficient 28 0.60 722 722 Proficient 29 0.69 725 725 Accelerated 30 0.79 728 728 Accelerated 31 0.89 731 731 Accelerated 32 0.99 734 734 Accelerated 33 1.09 738 738 Accelerated 34 1.19 741 741 Accelerated 35 1.30 744 744 Accelerated 36 1.40 747 747 Accelerated 37 1.51 751 751 Accelerated 38 1.62 754 754 Accelerated 39 1.74 758 758 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G60. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math II Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.33 600 600 Limited 3 ‐2.88 614 614 Limited 4 ‐2.54 624 624 Limited 5 ‐2.27 632 632 Limited 6 ‐2.05 640 640 Limited 7 ‐1.85 646 646 Limited 8 ‐1.67 651 651 Limited 9 ‐1.50 657 657 Limited 10 ‐1.35 661 661 Limited 11 ‐1.21 666 666 Limited 12 ‐1.08 670 670 Limited 13 ‐0.95 674 674 Limited 14 ‐0.83 678 678 Basic 15 ‐0.71 681 681 Basic 16 ‐0.60 685 685 Basic 17 ‐0.49 688 688 Basic 18 ‐0.39 691 691 Basic 19 ‐0.28 695 695 Basic 20 ‐0.18 698 698 Basic 21 ‐0.08 701 701 Proficient 22 0.02 704 704 Proficient 23 0.12 707 707 Proficient 24 0.21 710 710 Proficient 25 0.31 713 713 Proficient 26 0.40 716 716 Proficient 27 0.50 719 719 Proficient 28 0.60 722 722 Proficient 29 0.69 725 725 Accelerated 30 0.79 728 728 Accelerated 31 0.89 731 731 Accelerated 32 0.99 734 734 Accelerated 33 1.09 738 738 Accelerated 34 1.19 741 741 Accelerated 35 1.30 744 744 Accelerated 36 1.40 747 747 Accelerated 37 1.51 751 751 Accelerated 38 1.62 754 754 Accelerated 39 1.74 758 758 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G61. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American Government Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.39 644 644 Limited 3 ‐2.94 652 652 Limited 4 ‐2.62 658 658 Limited 5 ‐2.36 663 663 Limited 6 ‐2.15 667 667 Limited 7 ‐1.96 671 671 Limited 8 ‐1.80 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.52 679 679 Limited 11 ‐1.40 681 681 Limited 12 ‐1.29 683 683 Limited 13 ‐1.18 685 685 Limited 14 ‐1.08 687 687 Basic 15 ‐0.99 689 689 Basic 16 ‐0.90 691 691 Basic 17 ‐0.81 692 692 Basic 18 ‐0.73 694 694 Basic 19 ‐0.65 695 695 Basic 20 ‐0.58 697 697 Basic 21 ‐0.50 698 698 Basic 22 ‐0.43 700 700 Proficient 23 ‐0.36 701 701 Proficient 24 ‐0.30 702 702 Proficient 25 ‐0.23 703 703 Proficient 26 ‐0.17 705 705 Proficient 27 ‐0.10 706 706 Proficient 28 ‐0.04 707 707 Proficient 29 0.02 708 708 Proficient 30 0.08 709 709 Proficient 31 0.14 710 710 Proficient 32 0.20 712 712 Proficient 33 0.26 713 713 Proficient 34 0.32 714 714 Proficient 35 0.38 715 715 Proficient 36 0.44 716 716 Proficient 37 0.50 717 717 Proficient 38 0.56 718 718 Proficient 39 0.62 719 719 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.68 721 721 Proficient 41 0.74 722 722 Proficient 42 0.81 723 723 Proficient 43 0.87 724 724 Proficient 44 0.94 725 725 Accelerated 45 1.01 727 727 Accelerated 46 1.08 728 728 Accelerated 47 1.16 730 730 Accelerated 48 1.23 731 731 Accelerated 49 1.32 733 733 Accelerated 50 1.40 734 734 Accelerated 51 1.49 736 736 Accelerated 52 1.59 738 738 Accelerated 53 1.70 740 740 Advanced 54 1.82 742 742 Advanced 55 1.95 745 745 Advanced 56 2.10 747 747 Advanced 57 2.27 751 751 Advanced 58 2.47 754 754 Advanced 59 2.71 759 759 Advanced 60 3.03 765 765 Advanced 61 3.47 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced



Table G62. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American Government Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.43 643 643 Limited 3 ‐2.98 651 651 Limited 4 ‐2.66 658 658 Limited 5 ‐2.40 662 662 Limited 6 ‐2.19 666 666 Limited 7 ‐2.00 670 670 Limited 8 ‐1.84 673 673 Limited 9 ‐1.69 676 676 Limited 10 ‐1.55 678 678 Limited 11 ‐1.43 681 681 Limited 12 ‐1.31 683 683 Limited 13 ‐1.21 685 685 Limited 14 ‐1.11 687 687 Basic 15 ‐1.01 689 689 Basic 16 ‐0.92 690 690 Basic 17 ‐0.83 692 692 Basic 18 ‐0.75 694 694 Basic 19 ‐0.67 695 695 Basic 20 ‐0.59 697 697 Basic 21 ‐0.52 698 698 Basic 22 ‐0.45 699 699 Basic 23 ‐0.38 701 701 Proficient 24 ‐0.31 702 702 Proficient 25 ‐0.24 703 703 Proficient 26 ‐0.17 704 704 Proficient 27 ‐0.11 706 706 Proficient 28 ‐0.05 707 707 Proficient 29 0.02 708 708 Proficient 30 0.08 709 709 Proficient 31 0.14 710 710 Proficient 32 0.20 712 712 Proficient 33 0.26 713 713 Proficient 34 0.32 714 714 Proficient 35 0.38 715 715 Proficient 36 0.45 716 716 Proficient 37 0.51 717 717 Proficient 38 0.57 718 718 Proficient 39 0.63 720 720 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.70 721 721 Proficient 41 0.76 722 722 Proficient 42 0.83 723 723 Proficient 43 0.90 725 725 Accelerated 44 0.97 726 726 Accelerated 45 1.04 727 727 Accelerated 46 1.11 729 729 Accelerated 47 1.19 730 730 Accelerated 48 1.27 732 732 Accelerated 49 1.36 733 733 Accelerated 50 1.45 735 735 Accelerated 51 1.54 737 737 Accelerated 52 1.64 739 739 Advanced 53 1.75 741 741 Advanced 54 1.87 743 743 Advanced 55 2.01 746 746 Advanced 56 2.16 748 748 Advanced 57 2.33 752 752 Advanced 58 2.53 755 755 Advanced 59 2.77 760 760 Advanced 60 3.07 766 766 Advanced 61 3.50 774 774 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced



Table G63. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American History Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.50 619 619 Limited 3 ‐3.07 630 630 Limited 4 ‐2.75 639 639 Limited 5 ‐2.50 645 645 Limited 6 ‐2.29 651 651 Limited 7 ‐2.11 655 655 Limited 8 ‐1.94 659 659 Limited 9 ‐1.80 663 663 Limited 10 ‐1.66 667 667 Limited 11 ‐1.54 670 670 Limited 12 ‐1.43 673 673 Limited 13 ‐1.32 676 676 Limited 14 ‐1.22 678 678 Limited 15 ‐1.12 681 681 Limited 16 ‐1.03 683 683 Limited 17 ‐0.94 685 685 Basic 18 ‐0.86 687 687 Basic 19 ‐0.78 689 689 Basic 20 ‐0.70 692 692 Basic 21 ‐0.62 693 693 Basic 22 ‐0.55 695 695 Basic 23 ‐0.48 697 697 Basic 24 ‐0.40 699 699 Basic 25 ‐0.34 701 701 Proficient 26 ‐0.27 703 703 Proficient 27 ‐0.20 704 704 Proficient 28 ‐0.14 706 706 Proficient 29 ‐0.07 708 708 Proficient 30 ‐0.01 709 709 Proficient 31 0.06 711 711 Proficient 32 0.12 713 713 Proficient 33 0.18 714 714 Proficient 34 0.25 716 716 Proficient 35 0.31 717 717 Proficient 36 0.37 719 719 Proficient 37 0.44 721 721 Proficient 38 0.50 722 722 Proficient 39 0.57 724 724 Proficient



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.63 726 726 Accelerated 41 0.70 728 728 Accelerated 42 0.77 729 729 Accelerated 43 0.84 731 731 Accelerated 44 0.91 733 733 Accelerated 45 0.99 735 735 Accelerated 46 1.06 737 737 Accelerated 47 1.14 739 739 Advanced 48 1.23 741 741 Advanced 49 1.31 743 743 Advanced 50 1.41 746 746 Advanced 51 1.50 748 748 Advanced 52 1.61 751 751 Advanced 53 1.72 754 754 Advanced 54 1.84 757 757 Advanced 55 1.98 760 760 Advanced 56 2.13 764 764 Advanced 57 2.30 769 769 Advanced 58 2.50 774 774 Advanced 59 2.74 780 780 Advanced 60 3.04 788 788 Advanced 61 3.46 799 799 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced



Table G64. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American History Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score





Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 0.65 726 726 Accelerated 41 0.72 728 728 Accelerated 42 0.79 730 730 Accelerated 43 0.85 731 731 Accelerated 44 0.92 733 733 Accelerated 45 1.00 735 735 Accelerated 46 1.07 737 737 Accelerated 47 1.15 739 739 Advanced 48 1.23 741 741 Advanced 49 1.31 743 743 Advanced 50 1.40 746 746 Advanced 51 1.50 748 748 Advanced 52 1.60 751 751 Advanced 53 1.71 754 754 Advanced 54 1.83 757 757 Advanced 55 1.96 760 760 Advanced 56 2.11 764 764 Advanced 57 2.27 768 768 Advanced 58 2.47 773 773 Advanced 59 2.70 779 779 Advanced 60 3.00 787 787 Advanced 61 3.42 798 798 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced



Table G65. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Science Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 559 559 Limited 1 ‐3.50 559 559 Limited 2 ‐3.50 559 559 Limited 3 ‐3.25 569 569 Limited 4 ‐2.92 582 582 Limited 5 ‐2.66 593 593 Limited 6 ‐2.44 602 602 Limited 7 ‐2.25 610 610 Limited 8 ‐2.08 617 617 Limited 9 ‐1.92 623 623 Limited 10 ‐1.78 629 629 Limited 11 ‐1.65 635 635 Limited 12 ‐1.52 640 640 Limited 13 ‐1.40 645 645 Limited 14 ‐1.28 649 649 Limited 15 ‐1.17 654 654 Limited 16 ‐1.06 658 658 Limited 17 ‐0.96 663 664 Basic 18 ‐0.86 667 667 Basic 19 ‐0.76 671 671 Basic 20 ‐0.67 675 675 Basic 21 ‐0.57 678 678 Basic 22 ‐0.48 682 682 Basic 23 ‐0.39 686 686 Basic 24 ‐0.30 690 690 Basic 25 ‐0.21 693 693 Basic 26 ‐0.12 697 697 Basic 27 ‐0.03 701 701 Proficient 28 0.06 704 704 Proficient 29 0.15 708 708 Proficient 30 0.24 712 712 Proficient 31 0.33 715 715 Proficient 32 0.42 719 719 Proficient 33 0.51 723 723 Proficient 34 0.60 726 726 Accelerated 35 0.70 730 730 Accelerated 36 0.79 734 734 Accelerated 37 0.89 738 738 Accelerated 38 0.99 742 742 Accelerated 39 1.09 746 746 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 40 1.20 751 751 Accelerated 41 1.31 755 755 Advanced 42 1.43 760 760 Advanced 43 1.55 765 765 Advanced 44 1.68 770 770 Advanced 45 1.82 776 776 Advanced 46 1.98 782 782 Advanced 47 2.14 789 789 Advanced 48 2.33 797 797 Advanced 49 2.54 805 805 Advanced 50 2.78 815 815 Advanced 51 3.06 827 827 Advanced 52 3.42 841 841 Advanced 53 3.50 845 845 Advanced 54 3.50 845 845 Advanced 55 3.50 845 845 Advanced



Table G66. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Science Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 559 559 Limited 1 ‐3.50 559 559 Limited 2 ‐3.50 559 559 Limited 3 ‐3.26 569 569 Limited 4 ‐2.93 582 582 Limited 5 ‐2.67 593 593 Limited 6 ‐2.45 602 602 Limited 7 ‐2.26 610 610 Limited 8 ‐2.08 617 617 Limited 9 ‐1.92 623 623 Limited 10 ‐1.78 629 629 Limited 11 ‐1.64 635 635 Limited 12 ‐1.51 640 640 Limited 13 ‐1.39 645 645 Limited 14 ‐1.27 650 650 Limited 15 ‐1.16 654 654 Limited 16 ‐1.05 659 659 Limited 17 ‐0.95 663 664 Basic 18 ‐0.84 667 667 Basic 19 ‐0.74 671 671 Basic 20 ‐0.65 675 675 Basic 21 ‐0.55 679 679 Basic 22 ‐0.46 683 683 Basic 23 ‐0.36 687 687 Basic 24 ‐0.27 691 691 Basic 25 ‐0.18 694 694 Basic 26 ‐0.09 698 698 Basic 27 0.00 702 702 Proficient 28 0.09 705 705 Proficient 29 0.18 709 709 Proficient 30 0.27 713 713 Proficient 31 0.36 716 716 Proficient 32 0.45 720 720 Proficient 33 0.54 724 725 Accelerated 34 0.63 728 728 Accelerated 35 0.73 731 731 Accelerated 36 0.82 735 735 Accelerated 37 0.92 739 739 Accelerated 38 1.02 743 743 Accelerated 39 1.12 748 748 Accelerated



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G67. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Science Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 575 575 Limited 1 ‐3.50 575 575 Limited 2 ‐3.50 575 575 Limited 3 ‐3.40 579 579 Limited 4 ‐3.07 593 593 Limited 5 ‐2.80 604 604 Limited 6 ‐2.58 613 613 Limited 7 ‐2.38 622 622 Limited 8 ‐2.20 629 629 Limited 9 ‐2.04 636 636 Limited 10 ‐1.89 642 642 Limited 11 ‐1.75 648 648 Limited 12 ‐1.62 653 653 Limited 13 ‐1.49 659 659 Limited 14 ‐1.37 664 664 Limited 15 ‐1.26 669 669 Limited 16 ‐1.14 673 674 Basic 17 ‐1.04 678 678 Basic 18 ‐0.93 682 682 Basic 19 ‐0.83 686 686 Basic 20 ‐0.73 691 691 Basic 21 ‐0.63 695 695 Basic 22 ‐0.54 699 700 Proficient 23 ‐0.44 703 703 Proficient 24 ‐0.35 707 707 Proficient 25 ‐0.25 710 710 Proficient 26 ‐0.16 714 714 Proficient 27 ‐0.07 718 718 Proficient 28 0.02 722 722 Proficient 29 0.11 726 726 Accelerated 30 0.21 730 730 Accelerated 31 0.30 734 734 Accelerated 32 0.40 738 738 Accelerated 33 0.49 742 742 Accelerated 34 0.59 746 746 Accelerated 35 0.69 750 750 Accelerated 36 0.79 754 754 Accelerated 37 0.89 758 758 Accelerated 38 1.00 763 763 Accelerated 39 1.11 768 768 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G68. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Science Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 575 575 Limited 1 ‐3.50 575 575 Limited 2 ‐3.50 575 575 Limited 3 ‐3.26 585 585 Limited 4 ‐2.93 598 598 Limited 5 ‐2.67 610 610 Limited 6 ‐2.44 619 619 Limited 7 ‐2.25 627 627 Limited 8 ‐2.07 634 634 Limited 9 ‐1.91 641 641 Limited 10 ‐1.77 647 647 Limited 11 ‐1.63 653 653 Limited 12 ‐1.50 658 658 Limited 13 ‐1.38 663 663 Limited 14 ‐1.27 668 668 Limited 15 ‐1.16 673 674 Basic 16 ‐1.05 677 677 Basic 17 ‐0.95 681 681 Basic 18 ‐0.85 686 686 Basic 19 ‐0.75 690 690 Basic 20 ‐0.66 694 694 Basic 21 ‐0.57 697 697 Basic 22 ‐0.48 701 701 Proficient 23 ‐0.39 705 705 Proficient 24 ‐0.30 708 708 Proficient 25 ‐0.22 712 712 Proficient 26 ‐0.13 716 716 Proficient 27 ‐0.05 719 719 Proficient 28 0.04 723 723 Proficient 29 0.12 726 726 Accelerated 30 0.21 730 730 Accelerated 31 0.29 733 733 Accelerated 32 0.38 737 737 Accelerated 33 0.47 741 741 Accelerated 34 0.56 744 744 Accelerated 35 0.65 748 748 Accelerated 36 0.74 752 752 Accelerated 37 0.83 756 756 Accelerated 38 0.93 760 760 Accelerated 39 1.03 764 766 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G69. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Biology Science Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.43 619 619 Limited 3 ‐3.00 631 631 Limited 4 ‐2.68 641 641 Limited 5 ‐2.42 648 648 Limited 6 ‐2.21 655 655 Limited 7 ‐2.03 660 660 Limited 8 ‐1.87 665 665 Limited 9 ‐1.72 669 669 Limited 10 ‐1.59 673 673 Limited 11 ‐1.47 677 677 Limited 12 ‐1.35 680 680 Limited 13 ‐1.25 683 683 Limited 14 ‐1.14 686 686 Basic 15 ‐1.05 689 689 Basic 16 ‐0.95 692 692 Basic 17 ‐0.86 694 694 Basic 18 ‐0.78 697 697 Basic 19 ‐0.69 699 700 Proficient 20 ‐0.61 702 702 Proficient 21 ‐0.53 704 704 Proficient 22 ‐0.45 706 706 Proficient 23 ‐0.38 709 709 Proficient 24 ‐0.30 711 711 Proficient 25 ‐0.23 713 713 Proficient 26 ‐0.15 715 715 Proficient 27 ‐0.08 718 718 Proficient 28 0.00 720 720 Proficient 29 0.07 722 722 Proficient 30 0.14 724 724 Proficient 31 0.22 726 726 Accelerated 32 0.29 728 728 Accelerated 33 0.37 731 731 Accelerated 34 0.45 733 733 Accelerated 35 0.52 735 735 Advanced 36 0.60 738 738 Advanced 37 0.68 740 740 Advanced 38 0.77 742 742 Advanced 39 0.85 745 745 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G70. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Biology Science Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.22 625 625 Limited 3 ‐2.79 637 637 Limited 4 ‐2.48 647 647 Limited 5 ‐2.23 654 654 Limited 6 ‐2.03 660 660 Limited 7 ‐1.85 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.55 674 674 Limited 10 ‐1.42 678 678 Limited 11 ‐1.31 681 681 Limited 12 ‐1.20 684 685 Basic 13 ‐1.09 688 688 Basic 14 ‐1.00 690 690 Basic 15 ‐0.90 693 693 Basic 16 ‐0.82 696 696 Basic 17 ‐0.73 698 698 Basic 18 ‐0.65 701 701 Proficient 19 ‐0.57 703 703 Proficient 20 ‐0.49 705 705 Proficient 21 ‐0.42 708 708 Proficient 22 ‐0.34 710 710 Proficient 23 ‐0.27 712 712 Proficient 24 ‐0.20 714 714 Proficient 25 ‐0.12 716 716 Proficient 26 ‐0.05 718 718 Proficient 27 0.02 720 720 Proficient 28 0.09 722 722 Proficient 29 0.16 724 725 Accelerated 30 0.23 727 727 Accelerated 31 0.30 729 729 Accelerated 32 0.37 731 731 Accelerated 33 0.44 733 733 Accelerated 34 0.52 735 735 Advanced 35 0.59 737 737 Advanced 36 0.67 740 740 Advanced 37 0.75 742 742 Advanced 38 0.83 744 744 Advanced 39 0.91 747 747 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G71. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Physical Science Online

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.47 635 635 Limited 3 ‐3.03 646 646 Limited 4 ‐2.71 654 654 Limited 5 ‐2.45 661 661 Limited 6 ‐2.23 667 667 Limited 7 ‐2.04 671 671 Limited 8 ‐1.88 676 676 Limited 9 ‐1.72 680 680 Limited 10 ‐1.58 683 684 Basic 11 ‐1.46 687 687 Basic 12 ‐1.33 690 690 Basic 13 ‐1.22 693 693 Basic 14 ‐1.11 696 696 Basic 15 ‐1.01 698 698 Basic 16 ‐0.91 701 701 Proficient 17 ‐0.82 703 703 Proficient 18 ‐0.72 706 706 Proficient 19 ‐0.64 708 708 Proficient 20 ‐0.55 710 710 Proficient 21 ‐0.47 712 712 Proficient 22 ‐0.38 714 714 Proficient 23 ‐0.30 717 717 Proficient 24 ‐0.22 719 719 Proficient 25 ‐0.15 721 721 Proficient 26 ‐0.07 723 723 Proficient 27 0.01 725 725 Accelerated 28 0.08 727 727 Accelerated 29 0.16 729 729 Accelerated 30 0.24 731 731 Accelerated 31 0.31 733 733 Accelerated 32 0.39 735 735 Accelerated 33 0.47 737 737 Accelerated 34 0.55 739 739 Accelerated 35 0.63 741 741 Accelerated 36 0.71 743 743 Accelerated 37 0.79 745 745 Accelerated 38 0.88 747 747 Accelerated 39 0.97 749 749 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score





Table G72. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Physical Science Paper

Raw Score

Ohio Theta


Scaled Score


Scaled Score


Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.50 634 634 Limited 3 ‐3.12 644 644 Limited 4 ‐2.80 652 652 Limited 5 ‐2.54 659 659 Limited 6 ‐2.32 664 664 Limited 7 ‐2.13 669 669 Limited 8 ‐1.96 674 674 Limited 9 ‐1.80 678 678 Limited 10 ‐1.66 681 681 Limited 11 ‐1.53 685 685 Basic 12 ‐1.40 688 688 Basic 13 ‐1.29 691 691 Basic 14 ‐1.17 694 694 Basic 15 ‐1.07 697 697 Basic 16 ‐0.96 699 700 Proficient 17 ‐0.87 702 702 Proficient 18 ‐0.77 704 704 Proficient 19 ‐0.68 707 707 Proficient 20 ‐0.59 709 709 Proficient 21 ‐0.50 711 711 Proficient 22 ‐0.42 714 714 Proficient 23 ‐0.33 716 716 Proficient 24 ‐0.25 718 718 Proficient 25 ‐0.17 720 720 Proficient 26 ‐0.09 722 722 Proficient 27 ‐0.01 724 724 Proficient 28 0.07 726 726 Accelerated 29 0.15 728 728 Accelerated 30 0.23 730 730 Accelerated 31 0.31 732 732 Accelerated 32 0.39 735 735 Accelerated 33 0.47 737 737 Accelerated 34 0.56 739 739 Accelerated 35 0.64 741 741 Accelerated 36 0.72 743 743 Accelerated 37 0.81 745 745 Accelerated 38 0.90 748 748 Accelerated 39 0.99 750 750 Advanced



Raw Score

Ohio Theta


Scaled Score


Scaled Score




H-1 American Institutes for Research

Table H.1 DRC Item Summary Report – Grade 3 ELA Fall 2017

Item ID Dimension Inter‐Rater Reliability Score Point Distribution

2X % EX % AD % NA Total 0% 1% 2% 3% 4% %B %F %T %U

31679 Conventions 8,834 84 16 0 37,914 9 39 47 0 0 0 0 0 6 Purpose/Organization 8,834 80 20 0 37,914 21 45 24 3 0 0 0 1 6 Evidence/Elaboration 8,834 80 20 0 37,914 21 47 23 3 0 0 0 1 6

Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses



Table H.2 DRC Item Summary Report – High School ELA I Fall 2017








Table H.3 DRC Item Summary Report – High School ELA II Fall 2017








Table H.4 DRC Item Summary Report – Grade 3 ELA Spring 2018


2X % EX % AD % NA Total 0% 1% 2% 3% 4% %B %C %F %T %U

31664 First 600

Conventions 1,532 81 18 1 1,532 9 28 35 0 0 2 0 0 0 25 Purpose/Organization 1,532 81 18 1 1,532 22 35 12 2 1 2 0 0 1 25 Evidence/Elaboration 1,532 80 19 1 1,532 23 34 12 2 1 2 0 0 1 25

31664 Online








31960 First 600


31960 Online








32035 First 600


32035 Online








31711 First 600


31711 Online


31766 First 600


31766 Online








31604 First 600


31604 Online


31978 First 600


31978 Online








32037 First 600


32037 Online


32110 First 600


32110 Online





Table H.10 DRC Item Summary Report – High School ELA I Spring 2018



31555 First 600


31555 Online


31583 First 600


31583 Online





Table H.11 DRC Item Summary Report – High School ELA II Spring 2018



31605 First 600


31605 Online


31622 First 600


31622 Online




I-1 American Institutes for Research

Fall 2017 Grade 3 ELA

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta

Test Characteristic Curves

OH Online G3ELA SP16 Proportional Score

OH Online G3ELA FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online G3ELA SP16 SEM

OH Online G3ELA FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta

TCC Prop. Difference

OH Online G3ELA SP16 Prop. ‐ OH Online G3ELA FA17Prop.



Fall 2017 HS1 ELA

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Fall 2017 HS2 ELA

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Fall 2017 Algebra

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online Alg SP16 Proportional Score

OH Online Alg FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online Alg SP16 SEM OH Online Alg FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Alg SP16 Prop. ‐ OH Online Alg FA17 Prop.



Fall 2017 Geometry

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online Geo SP16 Proportional Score

OH Online Geo FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online Geo SP16 SEM OH Online Geo FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Geo SP16 Prop. ‐ OH Online Geo FA17 Prop.



Fall 2017 Integrated Math 1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online Int Math1 SP16 Proportional Score

OH Online Int Math1 FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online Int Math1 SP16 SEM

OH Online Int Math1 FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Int Math1 SP16 Prop. ‐ OH Online Int Math1FA17 Prop.



Fall 2017 Integrated Math 2

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online Int Math2 FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Online Int Math2 FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Int Math2 SP16 Prop. ‐ OH Online Int Math2FA17 Prop.



Fall 2017 Biology

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online HS Biology SP16 Proportional Score

OH Online HS Biology FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online HS Biology SP16 SEM

OH Online HS Biology FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS Biology SP16 Prop. ‐ OH Online HS BiologyFA17 Prop.



Fall 2017 American Government

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online HS AG SP16 Proportional Score

OH Online HS AG FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online HS AG SP16 SEM

OH Online HS AG FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS AG SP16 Prop. ‐ OH Online HS AG FA17Prop.



Fall 2017 American History

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online HS AH SP16 Proportional Score

OH Online HS AH FA17 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online HS AH SP16 SEM

OH Online HS AH FA17 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS AH SP16 Prop. ‐ OH Online HS AH FA17 Prop.



Spring 2018 Grade 3 ELA

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH OL PP G3ELA SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH OL PP G3ELA SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G3ELA SP16 Prop. ‐ OH OL PP G3ELA SP18Prop.




0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta






0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta






0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Spring 2018 High School ELA I

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Spring 2018 High School ELA II

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Spring 2018 Grade 3 Math ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online G3M SP16 Proportional Score


00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online G3M SP16 SEM


‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G3M SP16 Prop. ‐ OH Online G3M SP18 Prop.



Spring 2018 Grade 3 Math ‐ Paper



Spring 2018 Grade 4 Math

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online_Paper G4M SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Online_Paper G4M SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G4M SP16 Prop. ‐ OH Online_Paper G4M SP18Prop.




0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online/Paper G5M SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Online/Paper G5M SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G5M SP16 Prop. ‐ OH Online/Paper G5M SP18Prop.




0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online_Paper G6M SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM


OH Online_Paper G6M SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G6M SP16 Prop. ‐ OH Online_Paper G6M SP18Prop.




0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta






0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta





Spring 2018 Algebra

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online Alg SP16 Proportional Score

OH OLPP Alg SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online Alg SP16 SEM OH OLPP Alg SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Alg SP16 Prop. ‐ OH OLPP Alg SP18 Prop.



Spring 2018 Geometry

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online Geo SP16 Proportional Score

OH Online_PP Geo SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM

OH Online Geo SP16 SEM

OH Online_PP Geo SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Geo SP16 Prop. ‐ OH Online_PP Geo SP18Prop.



Spring 2018 Integrated Math 1

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online/Paper Int Math1 SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Online/Paper Int Math1 SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Int Math1 SP16 Prop. ‐ OH Online/Paper IntMath1 SP18 Prop.



Spring 2018 Integrated Math 2

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Online/Paper Int Math2 SP18 Proportional Score

00.150.3

0.450.6

0.750.9

1.051.2

1.351.5

1.651.8

1.952.1

2.252.4

2.552.7

2.853

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Online/Paper Int Math2 SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online Int Math2 SP16 Prop. ‐ OH Online/Paper IntMath2 SP18 Prop.



Spring 2018 Grade 5 Science ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


OH Online G5S SP16 Proportional Score


00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online G5S SP16 SEM OH Online G5S SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G5S SP16 Prop. ‐ OH Online G5S SP18 Prop.



Spring 2018 Grade 5 Science ‐ Paper

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Paper G5S SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online G5S SP16 SEM OH Paper G5S SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G5S SP16 Prop. ‐ OH Paper G5S SP18 Prop.



Spring 2018 Grade 8 Science ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online G8S SP16 SEM OH Online G8S SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G8S SP16 Prop. ‐ OH Online G8S SP18 Prop.



Spring 2018 Grade 8 Science ‐ Paper

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Paper G8S SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

OH Online G8S SP16 SEM OH Paper G8S SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online G8S SP16 Prop. ‐ OH Paper G8S SP18 Prop.



Spring 2018 Biology ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS Biology SP16 Prop. ‐ OH Online HS BiologySP18 Prop.



Spring 2018 Biology ‐ Paper

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Paper HS Biology SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Paper HS Biology SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS Biology SP16 Prop. ‐ OH Paper HS BiologySP18 Prop.



Spring 2018 American Government ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



Proficient

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS AG SP16 Prop. ‐ OH Online HS AG SP18Prop.



Spring 2018 American Government ‐ Paper

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta



OH Paper HS AG SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM


OH Paper HS AG SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS AG SP16 Prop. ‐ OH Paper HS AG SP18Prop.



Spring 2018 American History ‐ Online

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta




00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

CSEM

Theta

CSEM



‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


OH Online HS AH SP16 Prop. ‐ OH Online HS AH SP18Prop.



Spring 2018 American History ‐ Paper

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

ortio

n

Theta


AH Online SP16 Proportional Score

AH Paper SP18 Proportional Score

00.150.30.450.60.750.91.051.21.351.51.651.81.952.12.252.42.552.72.85

3

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5CSEM

Theta

CSEM

AH Online SP16 SEM AH Paper SP18 SEM

‐0.2

‐0.15

‐0.1

‐0.05

0

0.05

0.1

0.15

0.2

‐5 ‐4 ‐3 ‐2 ‐1 0 1 2 3 4 5

TCC Prop

. Differen

ce

Theta


AH Online SP16 Prop. ‐ AH Paper SP18 Prop.

Descriptions of the operation of the Test Information Distribution Engine, Test Delivery System, and related systems are property of the American Institutes for Research (AIR) and are used with the permission of AIR.

Test Delivery System

Test Administrator User Guide

2017-2018

Published February 9, 2018

Prepared by the American Institutes for Research®

Test Delivery System Table of Contents

ii

Table of Contents

Section I. Introduction to the User Guide ................................................................................................. 1

Organization of the User Guide ................................................................................................................ 1

Document Conventions ............................................................................................................................ 1

Intended Audience .................................................................................................................................... 2

Additional Resources ................................................................................................................................ 2

Section II. Overview of the Test Delivery System .................................................................................... 3

Description of the Test Delivery System’s Sites ....................................................................................... 3

User Roles and System Requirements .................................................................................................... 3

General Rules of Online Testing .............................................................................................................. 4

Test Setting Rules ................................................................................................................................ 4

Pause Rules ......................................................................................................................................... 4

Test Timeout Rules .............................................................................................................................. 4

Test Opportunity Expiration Rules ........................................................................................................ 4

Section III. Accessing the Test Administration Sites .............................................................................. 5

About Usernames and Passwords ........................................................................................................... 6

Section IV. Overview of the Test Administration Sites ........................................................................... 7

Test Administrator Site Layout ................................................................................................................. 7

TA Site Features ....................................................................................................................................... 8

Looking Up Students ............................................................................................................................ 9

Printing Session Information............................................................................................................... 10

Section V. Administering Online Tests ................................................................................................... 11

Starting a Test Session........................................................................................................................... 11

Approving Students for Testing .............................................................................................................. 12

Monitoring Students’ Testing Progress ................................................................................................... 15

About the Timer .................................................................................................................................. 16

Pausing a Student’s Test.................................................................................................................... 17

Stopping a Test Session and Logging Out ............................................................................................. 17

Stopping a Test Session ..................................................................................................................... 17

Logging Out of the Test Administrator Site ........................................................................................ 18

Accidentally Closing the Browser Window ..................................................................................... 18

Section VI. Signing in to the Student Testing Site ................................................................................ 19

Step 1: Signing Students In .................................................................................................................... 19

Common Student Sign-in Errors ........................................................................................................ 20

Enabling Settings from the Sign-in Page ............................................................................................ 20

Step 2: Verifying Student Information ..................................................................................................... 21


iii

Step 3: Selecting a Test.......................................................................................................................... 22

Step 4: Verifying Test Information .......................................................................................................... 23

Step 5: Functionality Checks .................................................................................................................. 24

Step 5a: Text-to-Speech Check ............................................................................................................. 24

Step 5b: Audio Playback Check ............................................................................................................. 26

Troubleshooting Audio Issues ........................................................................................................ 26

Step 5c: Recording Device Check .......................................................................................................... 27

Troubleshooting Recording Device Issues .................................................................................... 27

Step 6: Viewing Instructions and Starting the Test ................................................................................. 29

Section VII. Overview of the Student Testing Site ................................................................................. 30

Test Layout ............................................................................................................................................. 30

Test Tools ............................................................................................................................................... 31

Using Menus and Tools ...................................................................................................................... 34

About the Global Menu .................................................................................................................. 34

About the Context Menus............................................................................................................... 34

Opening a Context Menu for Passages and Questions ................................................................. 35

Opening a Context Menu for Answer Options ............................................................................... 35

About the Masking Tool ................................................................................................................. 36

About Text-to-Speech (TTS) .......................................................................................................... 37

Selecting a Previous Response Version ........................................................................................ 38

Section VIII. Proceeding Through a Test ................................................................................................ 39

About Reading Passages ....................................................................................................................... 39

Responding to Test Questions ............................................................................................................... 39

Reviewing Questions in a Test ............................................................................................................... 40

Pausing Tests ......................................................................................................................................... 40

Submitting a Test .................................................................................................................................... 41

Reaching the End of a Test ................................................................................................................ 41

End Test Page .................................................................................................................................... 41

Your Results Page .............................................................................................................................. 42

Appendix A. About the Secure Browser ................................................................................................. 44

Additional Measures for Securing the Test Environment ....................................................................... 44

Forbidden Application Detection ........................................................................................................ 45

Configuring Tablets for Testing .............................................................................................................. 45

Closing the Student Testing Site on Tablets ...................................................................................... 45

About Permissive Mode .......................................................................................................................... 46

Troubleshooting ...................................................................................................................................... 47

Resolving Secure Browser Error Messages ....................................................................................... 47


iv

Force-Quit Commands ........................................................................................................................... 48

Appendix B. Text Response Formatting Toolbar .................................................................................. 49

Spell Check ............................................................................................................................................. 50

Special Characters ................................................................................................................................. 50

Appendix C. Keyboard Navigation for Students .................................................................................... 51

Sign-In Pages and In-Test Pop-ups ....................................................................................................... 51

Keyboard Commands for Test Navigation .............................................................................................. 51

Keyboard Commands for Global and Context Menus ............................................................................ 52

Global Menu ....................................................................................................................................... 52

Context Menus ................................................................................................................................... 52

Highlighting Selected Regions of Text ............................................................................................... 52

Keyboard Commands for Grid Questions .......................................................................................... 53

Appendix D. Transferring a Test Session ............................................................................................... 54

Appendix E. User Support........................................................................................................................ 55

Appendix F. Change Log .......................................................................................................................... 56

Test Delivery System Table of Figures

v

Table of Figures

Figure 1. Portal User Cards .......................................................................................................................... 5

Figure 2. Card for Test Administrator Interface ............................................................................................. 5

Figure 3. Cards for Test Administrator Practice Site .................................................................................... 5

Figure 4. Login Page ..................................................................................................................................... 5

Figure 5. Test Administrator Site Layout ....................................................................................................... 7

Figure 6. Test Administrator Site Banner ...................................................................................................... 8

Figure 7. Student Lookup: Quick Search ...................................................................................................... 9

Figure 8. Student Lookup: Advanced Search ............................................................................................. 10

Figure 9. Test Selection Box ....................................................................................................................... 11

Figure 10. Students Awaiting Approval ....................................................................................................... 12

Figure 11. Approvals and Student Test Settings Window .......................................................................... 13

Figure 12. Test Settings Window for a Selected Student ........................................................................... 14

Figure 13. Student Sign-In Page ................................................................................................................. 19

Figure 14. Choose Settings Window ........................................................................................................... 20

Figure 15. Is This You? Page ..................................................................................................................... 21

Figure 16. Your Tests Page ........................................................................................................................ 22

Figure 17. Is This Your Test? Page ............................................................................................................ 23

Figure 18. Text-to-Speech Sound Check Page .......................................................................................... 24

Figure 19. Sound Check Page .................................................................................................................... 26

Figure 20. Recording Device Check Page .................................................................................................. 27

Figure 21. Recording Input Device Selection Page .................................................................................... 28

Figure 22. Instructions and Help Page ........................................................................................................ 29

Figure 23. Test Layout ................................................................................................................................ 30

Figure 24. Test Page ................................................................................................................................... 31

Figure 25. Global Menu ............................................................................................................................... 34

Figure 26. Context Menu for Questions ...................................................................................................... 35

Figure 27. Context Menu for Answer Options ............................................................................................. 35

Figure 28. Test Page with Masked Area ..................................................................................................... 36

Figure 29. Speak Tool Options for Questions ............................................................................................. 37

Figure 30. Select Previous Version Window ............................................................................................... 38

Figure 31. Reading Passage....................................................................................................................... 39

Figure 32. Question Marked for Review ..................................................................................................... 40

Figure 33. Global Menu with End Test Button ............................................................................................ 41

Figure 34. End Test Page ........................................................................................................................... 41

Figure 35. Your Results Page ..................................................................................................................... 42

Test Delivery System List of Tables

vi

Figure 36. Practice Test Summary Report .................................................................................................. 42

Figure 37. Text Response Question with Formatting Toolbar .................................................................... 49

Figure 38. Spell Check Tool ........................................................................................................................ 50

Figure 39. Special Characters Window ....................................................................................................... 50

Figure 40. Grid Question ............................................................................................................................. 53

List of Tables

Table 1. Key Symbols and Elements ............................................................................................................ 1

Table 2. Test Administrator Site Features .................................................................................................... 8

Table 3. Columns in the Students in Your Test Session Table .................................................................. 15

Table 4. Student Testing Statuses .............................................................................................................. 16

Table 5. Global Tools .................................................................................................................................. 31

Table 6. Context Menu Tools and Stimulus Tools ...................................................................................... 33

Table 7. Overview of the Practice Test Summary Report ........................................................................... 43

Table 8. Description of Formatting Tools .................................................................................................... 49

Table 9. Keyboard Commands for Sign-In Pages and Pop-Up Windows .................................................. 51

Table 10. Keyboard Commands for Test Navigation .................................................................................. 51

Test Delivery System Introduction to the User Guide

1

Section I. Introduction to the User Guide This user guide supports test administrators who manage testing for students participating in the Ohio's State Tests and Ohio English Language Proficiency Assessment practice tests and operational tests.

Organization of the User Guide

• Overview of the Test Delivery System provides an overview of online testing and general test rules.

• Accessing the Test Administration Sites explains how to log in to the test administrator sites.

• Overview of the Test Administration Sites describes the overall layout of the test administrator sites and highlights the important tasks and functions.

• Administering Online Tests outlines the process for creating a test session, approving students for testing, pausing tests, and logging out.

• Signing in to the Student Testing Site explains how students sign in to a test session.

• Overview of the Student Testing Site describes the layout of an online test, as well as the tools available to students.

• Proceeding Through a Test explains how students complete tests.

• The Appendices provide additional information about the secure browser, keyboard commands, transferring test sessions and user support.

Document Conventions

Table 1 describes the conventions appearing in this guide.

Table 1. Key Symbols and Elements

Element Description

Alert: This symbol accompanies important information regarding a task that may cause minor errors.

Note: This symbol accompanies additional information or instructions of which users must take note.

Policy: This symbol accompanies information regarding test administration policies.

Test Delivery System Introduction to the User Guide

2

Element Description

Warning: This symbol accompanies important information regarding actions that may cause major errors.

Intended Audience

This user guide is intended for test administrators responsible for proctoring tests with the Test Delivery System. To use this system, you should be familiar with using a web browser to retrieve data and with filling out web forms. You should also be familiar with printing documents and adjusting a computer’s audio settings. If you or your students use Chromebooks, iPads, or other tablets for testing, then you should be familiar with operating these devices as well.

Additional Resources

The following publications provide additional information:

• For information about policies and procedures that govern secure and valid test administration, see the Test Administration Manual.

• For information about supported operating systems and browsers, see the Online System Requirements document.

• For information about student and user management, rosters, and test status requests, see the TIDE User Guide.

• For information about network and internet requirements, general peripheral and software requirements, and configuring text-to-speech settings, see the Technical Specifications Manual.

• For information about installing secure browsers, see the Secure Browser Installation Manual.

The above resources are available on the Ohio's State Tests Portal (www.ohiostatetests.org).

http://oh.portal.airast.org/ocba/wp-content/uploads/OST_Fall2017_Test_Administration_Manual.pdf

http://oh.portal.airast.org/ocba/wp-content/uploads/OH_System_Requirements_2017-2018.pdf


http://oh.portal.airast.org/ocba/wp-content/uploads/TIDE_User_Guide_2017_2018.pdf

http://oh.portal.airast.org/ocba/wp-content/uploads/OH_Tech_Specs_Manual_2017-2018.pdf


http://oh.portal.airast.org/ocba/wp-content/uploads/OH_Secure_Browser_Installation_Manual_2017-2018.pdf

http://oh.portal.airast.org/ocba/wp-content/uploads/OH_Secure_Browser_Installation_Manual_2017-2018.pdf

Test Delivery System Overview of the Test Delivery System

3

Section II. Overview of the Test Delivery System The Test Delivery System delivers Ohio’s online tests. The following sections describe highlights of online testing in general and the Test Delivery System in particular.

Description of the Test Delivery System’s Sites

The Test Delivery System consists of practice sites and operational testing sites. The practice sites function identically to the operational testing sites.

• Practice Sites

o Test Administrator Practice Site: Allows test administrators to practice administering tests.

o Student Practice Site: Allows students to practice taking tests online and using test tools.

• Operational Testing Sites

o Test Administrator Interface: Allows test administrators to administer operational tests.

o Student Testing Site: Allows students to take operational tests.

User Roles and System Requirements

Access to the practice and operational testing sites depends on your user role and browser.

• Test administrators can use any supported web browser to access either the Test Administrator Practice Site or the Test Administrator Interface. For a list of user roles that can access the Test Administrator Sites, see the User Role Matrix document available in the Resources section of the Ohio's State Tests Portal (www.ohiostatetests.org).

• Students, test administrators, and parents can use a supported web browser or secure browser to access the Student Practice Site as guests. Students can also sign in to a practice test session created by a test administrator.

• Students use a secure browser to access the Student Testing Site.

For information about supported operating systems and browsers, see the Online System Requirements document available on the Ohio's State Tests Portal (www.ohiostatetests.org).



Test Delivery System Overview of the Test Delivery System

4

General Rules of Online Testing

This section describes the rules for administering online tests.

Test Setting Rules

Students should not begin testing until they are assigned the correct test settings. You may have to update some test settings in the Test Information and Distribution Engine (TIDE).

Pause Rules

Test administrators and students can pause a test in order to temporarily log the student out of the test session. Students cannot access their test if it is paused for more than one day, even if they marked questions for review. The only exception to this rule is if the district test coordinator submits a test status request for reopen in TIDE.

These pause rules apply regardless of whether the student or the test administrator pauses the test or a technical issue logs the student out.

Test Timeout Rules

A warning message displays after 20 minutes of test inactivity. Students who do not click OK within 30 seconds after this message appears are logged out. This timeout automatically pauses the test.

Test Opportunity Expiration Rules

Opportunities refer to the number of times a student can take a test within a range of dates. Ohio tests have one opportunity per test part during the test window. A student’s test opportunity remains active until the student submits the test or until the opportunity expires at the end of the test window. Once a test opportunity expires, the student cannot submit or review the test. Opportunities that have been started but not submitted by the student will be automatically submitted by the system.

https://oh.sso.airast.org/auth/XUI/#login/&realm=/ohio&forward=true&spEntityID=SP_OHIO_TIDE_PROD&goto=%2FSSORedirect%2FmetaAlias%2Fohio%2Fidp%3FReqID%3DS2C6DDB24FD69A4A57CBED15F8FD309B421BC445BA%26index%3Dnull%26acsURL%3D%26spEntityID%3DSP_OHIO_TIDE_PROD%26binding%3D&AIRSSO-DFW-PROD4-prod-A

Test Delivery System Accessing the Test Administration Sites

5

Section III. Accessing the Test Administration Sites This section describes how to access the Test Administrator Sites.

To access the Test Administrator Interface:

1. Navigate to the Ohio's State Tests Portal (www.ohiostatetests.org).

2. Select the Teachers/ Test Administrators or Test Coordinators card (see Figure 1).

3. Select the appropriate TA Site:

o To access the Test Administrator Interface, click TA Interface (see Figure 2).

o To access the Test Administrator Practice Site, click TA Practice Site, (see Figure 3).

4. The login page appears (see Figure 3). Enter your email address and password.

5. Click Secure Login. The selected TA Site appears.

Figure 1. Portal User Cards

Figure 2. Card for Test Administrator Interface

Figure 3. Cards for Test Administrator Practice

Site

Figure 4. Login Page

Test Delivery System Accessing the Test Administration Sites

6

Note: For information about logging out of the TA Site, see the section Logging Out of the Test Administrator Site.

About Usernames and Passwords

Your username is the email address associated with your account in TIDE. When you are added to TIDE, you receive an email containing a temporary link to the Reset Your Password page. To activate your account, you must set up your password and set a security question within 15 minutes of receiving this email.

• If your first temporary link expired or you forgot your password:

On the login page, click Forgot Your Password? and then enter your email address in the Email Address field to reset your password. If your account is already set up, you need to answer your security question as well. You will receive an email with a new link to reset your password.

• If you did not receive an email containing a temporary password:

Check your spam folder to make sure your email program did not categorize it as junk mail. If you still do not have an email, contact your Building or District Test Coordinator to make sure you are listed in TIDE.

• Additional help:

If you are unable to log in, contact the Ohio Help Desk for assistance. You must provide your name and email address. Contact information is available in the User Support section of this user guide.

Test Delivery System Overview of the Test Administration Sites

7

Section IV. Overview of the Test Administration Sites This section describes the test administration sites for test administrators. Throughout the rest of this user guide, “Test Administrator Site” refers to both the Test Administrator Interface and Test Administrator Practice Site.

Warning: Do not use the Test Administrator Interface for practice. To practice administering tests, use the Test Administrator Practice Site. Both Test Administrator Sites have the same functionality, but the available tests are different. The table header will have “Practice” in the Test Administrator Practice Site and “Operational” in the Test Administrator Interface. Tests provided in the Test Administrator Interface are operational and will expend the students’ test opportunities.

Test Administrator Site Layout

Figure 5 displays the layout of the Test Administrator Site during an active test session.

Figure 5. Test Administrator Site Layout

Essential features in the Test Administrator Site:

1. Session ID

2. Select Tests button

3. Approvals button

4. Students in Your Test Session table


8

Table 2 provides an overview of the major features available in the Test Administrator Site.

Table 2. Test Administrator Site Features

Feature Description/More Information

Student Lookup button Searches for student information. See the section Looking Up Students.

Print Session button Prints your screen. See the section Printing Session Information.

Help Guide button Displays the online version of this user guide.

Log Out button Logs you out of the Test Administrator Site. See the section Stopping a Test Session and Logging Out.

Stop Session button* Ends the test session. See the section Stopping a Test Session and Logging Out.

Session ID* Displays the unique ID generated for the test session.

Select Tests button Opens the Test Selection window. See the section Starting a Test Session.

Approvals button* Opens the Approvals and Student Test Settings window. See the section Approving Students for Testing.

Refresh button* Updates the on-screen information.

Students in Your Test Session table**

Displays the testing progress for students in your test session. See the section Monitoring Students’ Testing Progress.

*Feature appears after you start a test session.

**Feature appears after you approve students for testing.

TA Site Features

This section provides instructions for using the features available in the banner at the top of the Test Administrator Site (see Figure 6).

Figure 6. Test Administrator Site Banner


9

Looking Up Students

You can use the student lookup feature to perform a quick or advanced search for student information. This is useful if students signing in to your test session cannot remember their login information.

Warning: You must ensure that a student’s demographic information is correct before testing begins. If a student’s information is not correct, that student should not begin testing.

To perform a quick search:

1. In the banner, click Student Lookup.

2. Enter a student’s full SSID and click Submit SSID. Search results appear below the search field (see Figure 7).

Figure 7. Student Lookup: Quick Search

To perform an advanced search:

1. Click Student Lookup > Advanced Search.

a. Select the appropriate district and school from the drop-down lists.

b. Select the appropriate grade.

c. Optional: Enter a student’s exact first or last name. Partial names are not allowed.


10

2. Click Search. Search results appear below the search fields (see Figure 8).

Figure 8. Student Lookup: Advanced Search

3. To view a student’s information, click in the Details column.

Printing Session Information

You can print a snapshot of the Test Administrator Site as it currently appears if you wish to keep a hard-copy record of the Session ID or list of approved students.

To print a snapshot of the page:

1. In the banner, click Print Session. The computer’s print dialog window appears.

2. Click OK.

Policy Note: Federal law prohibits the release of students' personally identifiable information. All printouts must be securely stored and then destroyed when no longer needed.

Test Delivery System Administering Online Tests

11

Section V. Administering Online Tests The basic workflow for administering online tests is as follows:

1. The test administrator selects tests and starts a test session.

2. Students sign in and request approval for tests.

3. The test administrator reviews students’ requests and approves them for testing.

4. Students complete and submit their tests.

5. The test administrator stops the test session and logs out.

For information about the testing process from a student’s perspective, see the sections Signing in to the Student Testing Site and Overview of the Student Testing Site.

Starting a Test Session

When you log in to the Test Administrator Site, the Test Selection window opens automatically (see Figure 9). This window allows you to select tests and start the session. Only the tests that you select will be available to students who join your session.

Figure 9. Test Selection Box

The Test Selection window color-codes tests and groups them into subjects. A test group may include one or more sub-groups. All test groups and sub-groups appear collapsed by default. To

expand a test group, click (or Expand All). To collapse an expanded test group, click (or Collapse All).


12

To create a new test session:

1. If the Test Selection window is not open, click Select Tests in the upper-right corner of the Test Administrator Site (otherwise skip to step 2).

2. To select tests for the session, do one of the following:

o To select individual tests, mark the checkbox for each test you want to include.

o To select all the tests in a test group, mark the checkbox for that group.

3. In the lower-left corner of the window, click Start Session (the exact label for this button may vary depending on whether you are starting a practice or operational session). The window closes and the Session ID appears on the Test Administrator Site.

4. Provide the Session ID to your students.

Note: Write down the Session ID in case you accidentally close the browser window and need to return to the active test session. You may have only one session open at a time. You cannot reopen closed sessions, but students can resume a test opportunity in a new session.

To add tests to an active test session:

1. In the upper-right corner of the Test Administrator Site, click Select Tests.

2. In the Test Selection window, mark the checkbox for the required test and click Add to Session in the lower-left corner.

3. A confirmation message asks if you are sure you want to modify the tests in your session. To continue, click Yes.

Note: You cannot remove tests from an active session.

Approving Students for Testing

After students sign in and select tests, you must verify that their test settings are correct before approving them for testing. When students are awaiting approval, the Approvals button next to the Session ID becomes active and shows you how many students are awaiting approval (see Figure 10).

Figure 10. Students Awaiting Approval


13

Note: The Approvals notification updates regularly, but you can also click in the upper-

right corner to update it manually.

To approve students for testing:

1. Click Approvals. The Approvals and Student Test Settings window appears, displaying a list of students grouped by test (see Figure 11).

Figure 11. Approvals and Student Test Settings Window

2. To check a student’s test settings, click for that student. The student’s information

appears in the Test Settings window (see Figure 12). This window groups test settings by their tool categories.


14

Figure 12. Test Settings Window for a Selected Student

a. If any settings are incorrect, update them as required. Students should not begin testing until their settings are correct.

Alert: When approving students for testing, you must update the editable settings in this window, rather than in TIDE.

b. Do one of the following:

To confirm the settings, click Set. You must still approve the student for testing (see step 5).

To confirm the settings and approve the student, click Set & Approve. Students can start testing once you approve them.

To return to the Approvals and Student Test Settings window without confirming settings, click Cancel.

3. Repeat step 2 for each student in the Approvals and Student Test Settings list.

Note: The Approvals and Student Test Settings window does not automatically refresh. To update the list of students awaiting approval, click Refresh at the top of the window.

4. If you need to deny a student access to testing, do the following (otherwise skip to step 5):

a. Click for that student.


15

b. Optional: In the window that appears, enter a brief reason for denying the student.

c. Click Deny. The student receives a message explaining the reason for the denial if you entered a reason, and is logged out.

Note: If you deny students entry for a test, they can still request access to that test again.

5. If you wish to approve students directly from the Approvals and Student Test Settings window, do the following:

o To approve individual students, click for each student.

o To approve all students displayed in the list, click Approve All Students for that subject.

Monitoring Students’ Testing Progress

After you approve students for testing, the Students in Your Test Session table appears (see Figure 5). This table displays the testing progress for each student logged in to your session. Table 3 describes the columns in this table. To sort the table by a given column, click that column header.

Table 3. Columns in the Students in Your Test Session Table

Column Description

Student Name Last and first name of the student in the session.

SSID SSID associated with the student.

Opp # Opportunity number for the student’s selected test.

Test Name of the test the student selected.

Time Indicates the approximate elapsed time (in minutes only) in the student’s test. There will be an approximate one minute delay between the elapsed time showing on the student’s test and the time displayed in this column.

Student Status

Current status for each student in the session. This column may also indicate how many questions the student has completed out of the total number of test questions. For more information about the statuses in this column, see Table 4.

Test Settings This column displays one of the following:

• Standard: Default test settings are applied for this test opportunity.

• Custom: One or more of the student’s test settings differ from the default settings.

To view the student’s settings for the current test opportunity, click .

Pause Test Pauses the student’s test. When a test pauses, this column displays an information button that opens a pop-up message explaining how the test became paused. For more information, see the section Pause Rules.


16

Table 4 describes the codes in the Student Status column of the Students in Your Test Session table.

Table 4. Student Testing Statuses

Status Description

Approved You approved the student, but the student did not yet start or resume the test.

Started Student started the test and is actively testing.

Review Student visited all questions and is currently reviewing answers before completing the test.

Completed Student submitted the test. The student can take no additional action at this point.

Submitted Test was submitted for quality assurance review and validation.

Reported Test passed quality assurance and is undergoing further processing.

Paused* Student’s test is paused. The time listed indicates how long the test has been paused.

Expired* Test was not completed by the end of the testing window and the opportunity expired.

Pending* Student is awaiting approval for a new test opportunity.

Suspended* Student is awaiting approval to resume a test opportunity.

*Appears when the student is not actively testing. The student’s row grays out in such cases.

Note: The Students in Your Test Session table refreshes at regular intervals, but you can

also refresh it manually by clicking in the upper-right corner.

About the Timer

You can view the approximate time a student has been actively testing via the time column on the TA Interface (see Figure 5). The time column updates approximately once per minute and will only display the approximate time in minutes, not seconds. If a student’s test is paused, the timer in the time column on the TA Interface will pause as well. The student status column will reflect how much time has passed since the student’s test was paused. If a student logs back in and resumes a test, the approximate time will resume counting up from the point the student’s test was paused previously. Elapsed time that a student has spent in the test will continue to accrue, even if the student resumes testing in a new session.

If the test clock setting is turned on for the student, the Student Interface displays a test clock at the top right of the screen (see Figure 23). The test clock displays in real-time the amount of time that a student has spent viewing item content. Students can choose to hide the displayed time by selecting the test clock button on the Student Interface. The Student Interface will continue to keep track of the time elapsed, even if the student hides the test clock time. Students can view the time elapsed display by selecting the test clock button again.


17

If the test clock setting is turned off for the student, the Student Interface will not display the test clock icon nor the displayed time. However, the time column on the TA Interface will still provide the approximate time the student has been actively testing.

Note: The time reflects only the time a student spends viewing test content. It does not include the time a student spends on the log-in pages, the review page or when the test is paused.

The TA Interface and Student Testing Site do not enforce a time limit. Test administrators are responsible for ensuring that students complete each part of their tests within the testing time published on the portal.

Pausing a Student’s Test

You can pause a student’s test via the Pause Test column in the Students in Your Test Session table (see Figure 5). For information about pause rules, see the section Pause Rules.

To pause an individual student’s test:

1. In the Pause Test column, click for that student.

2. Click Yes to confirm. The Test Delivery System logs the student out and an information button appears in the Pause Test column.

Stopping a Test Session and Logging Out

This section explains how to stop a test session and log out of the Test Administrator Site.

Stopping a Test Session

When students finish testing or the current testing time slot is over, you should stop the test session. Stopping a session automatically logs out all the students in the session and pauses their tests.

Once you stop a test session, you cannot resume it. To resume testing students, you must start a new session.

Warning: The Test Delivery System automatically logs you out after 20 minutes of both user and student inactivity in the session. This action automatically stops the test session.

To stop a test session:

1. In the upper-right corner, click (see Figure 10). A confirmation message appears.

2. Click OK. The test session stops.


18

Logging Out of the Test Administrator Site

You should log out of the Test Administrator Site only after stopping a test session.

To log out of the Test Administrator Site:

1. In the banner, click Log Out. A warning message appears.

2. In the warning message, click Log Out. The Ohio's State Tests Portal appears.

Alert: Navigating away from the Test Administrator Site will also log you out. Logging out while a session is in progress stops the session. If you need to access another application while administering tests, open it in a separate browser window.

If you log out from another Ohio's State Tests system, such as TIDE, you will also log out of the TA Site.

Accidentally Closing the Browser Window

If you accidentally close the browser while students are testing, your session remains open until it times out. To return to the test session in the Test Administrator Site, you must enter the active Session ID.

If you do not return to the active session within 20 minutes and there is no student activity during that time, the Test Delivery System logs you out and pauses the students’ tests.

Test Delivery System Signing in to the Student Testing Site

19

Section VI. Signing in to the Student Testing Site This section describes the student sign-in process for the Student Testing Site. Students follow this procedure when starting a new test or resuming a paused test.

Note: Students must sign in to the appropriate testing site:

• For sessions created in the Test Administrator Interface, students sign in to the Student Testing Site on the secure browser.

• For sessions created in the Test Administrator Practice Site, students sign in to the Student Practice Site. Students can access the Student Practice Site on the Ohio's State Tests Portal.

Step 1: Signing Students In

To sign students in to a test session:

1. Launch the secure browser on the student’s testing device. The Student Sign-In page appears (see Figure 13).

Figure 13. Student Sign-In Page

2. Students enter the following information:

a. In the First Name field, students enter their first name as it appears in TIDE.

b. In the Student ID field, students enter their SSID as it appears in TIDE. Non-public students enter the non-public student ID. Home schooled students enter the home schooled student ID.

Note: If students do not know their exact information as it appears in TIDE, you can retrieve it in the Test Administrator Site (see the section Looking Up Students).

c. In the Session ID field, students enter the Session ID as it appears on the Test Administrator Site.

3. Students select Sign In. The Is This You? page appears.


20

Common Student Sign-in Errors

The Test Delivery System generates an error message if a student cannot sign in. The following are the most common student sign-in issues:

• Session does not exist: The student entered the Session ID incorrectly or signed in to the wrong site. Verify that the student correctly entered the active Session ID. Also, verify that both you and the student are using the correct sites. For example, students signed in to the Student Practice Site cannot access sessions created in the Test Administrator Interface.

• Student information is not entered correctly: Verify that the student correctly entered the SSID. If this does not resolve the error, use the Student Lookup tool to verify the student's information. See the section Looking Up Students.

• Session has expired: The Session ID corresponds to a closed session. Ensure that the student enters the correct Session ID and verify that your session is open. For more information about test sessions, see the section Starting a Test Session.

• Student is not associated with the school: The student is not associated with your school, or you are not associated with the student’s school. Contact your test coordinator to make the appropriate update in TIDE.

Enabling Settings from the Sign-in Page

On the Student Practice Site, students can modify the settings they want to use during the sign-in process.

Note: On the operational Student Testing Site, students cannot modify the settings they want during the sign-in process

To edit settings:

1. Students select the cog wheel in the upper-right corner of the login page. The Choose Settings window appears (see Figure 14).

2. Students select their preferred options from the available drop-down lists. These settings persist until you set the actual test settings during the test administrator approval process.

Figure 14. Choose Settings Window


21

Step 2: Verifying Student Information

After students sign in, the Is This You? page appears (see Figure 15). On this page, students verify their personal information.

Figure 15. Is This You? Page

To verify personal information:

• If all the information is correct, students select Yes. The Your Tests page appears.

• If any of the information displayed is incorrect, the student must not proceed with testing. The student should select No. You must notify the building or district test coordinator that the student’s information is incorrect.

Warning: Incorrect student demographic information (including SSID) must be updated before the student begins testing.

Note: When signing in to the Student Practice Site as a guest, the Is This You? page displays a Student Grade Level drop-down list, from which students select the grade they wish to use for testing.


22

Step 3: Selecting a Test

The Your Tests page displays all the tests that a student is eligible to take (see Figure 16). Students can only select tests that are included in the session and still need to be completed.

Available tests are color-coded and grouped into categories, just like the tests listed in the Test Selection window of the TA Site (see Figure 9).

If the student has not started a test opportunity, the button for that test is labeled Start [Test Name]. If the student has started and paused a test opportunity, the button for that test is labeled Resume [Test Name].

Figure 16. Your Tests Page

To select an available test:

• Students select the required test name. The request is sent to the test administrator for approval and the Waiting for TA Approval message appears.

• If a student’s required test is inactive or not displayed, the student should click Back to Login. You should verify the test session includes the correct tests and add additional tests, if necessary.


23

Step 4: Verifying Test Information

After you approve the student for testing, the student should verify the test information and settings on the Is This Your Test? page (see Figure 17). At this point, the student’s actual test settings override any settings selected earlier in the sign-in process.

Figure 17. Is This Your Test? Page

To verify test information:

• If the settings are correct, students select Yes.

• If the settings are incorrect, students select No. After a student’s test settings are corrected, the student must sign in and request approval again.

Note: When signing in to the Student Practice Site, a Choose Settings page appears in place of the Is This Your Test? page. On this page, students can select the test settings they wish to use.


24

Step 5: Functionality Checks

Depending on the test content and the specified test settings, students may need to verify that their testing device is functioning properly. Any of the following verification pages may appear:

• Step 5a: Text-to-Speech Check

• Step 5b: Audio Playback Check

• Step 5c: Recording Device Check

Step 5a: Text-to-Speech Check

The Text-to-Speech Sound Check page appears if a student has the text-to-speech (TTS) setting (see Figure 18). On this page, students verify that text-to-speech is working properly on their device. Students can only use text-to-speech within a supported browser. This check can also be performed from the Diagnostics page. The Diagnostics page can be accessed from the homepage of the Student Practice Site. If the student has the Bilingual English-Spanish accommodation in addition to the TTS setting, a second Text-to-Speech Sound Check page appears to verify the Spanish voice.

Figure 18. Text-to-Speech Sound Check Page

To check text-to-speech functionality:

1. Students select the speaker icon and listen to the audio.


25

o If the voice is clearly audible, students select I heard the voice.

o If the voice is not clearly audible, students adjust the settings using the sliders and select the speaker icon again.

o If students still cannot hear the voice clearly, they select I did not hear the voice and close the browser. You can work with students to adjust their audio or headset settings (for more information, see the section Troubleshooting Audio Issues). They can sign in again when the issue is resolved.


26

Step 5b: Audio Playback Check

The Audio Playback Check page appears for tests with listening questions administered for the Ohio English Language Proficiency Assessment (OELPA) (see Figure 19). On this page, students verify that they can hear the sample audio. This check can also be performed from the Diagnostics page. The Diagnostics page can be accessed from the homepage of the Student Practice Site.

Figure 19. Sound Check Page

To check audio settings:

1. Students select the icon and listen to the audio.

2. Depending on the sound quality, students do one of the following:

o If the sound is audible, students select I heard the sound.

o If the sound is not audible, students select I did not hear the sound. The Sound Check: Audio Problem page appears, giving students two options:

Students can select Try Again. This returns them to the Audio Playback Check page.

Students can select Log Out. You should troubleshoot the device and headphones or move the student to another device with working audio.

Troubleshooting Audio Issues

Prior to testing, ensure that audio is enabled on each device and that headsets are functioning correctly. If audio issues occur, do the following:

• Ensure headphones are securely plugged in to the correct jack or USB port.

• If the headphones have a volume control, ensure the volume is not muted.

• Ensure that the audio on the device is not muted.


27

Step 5c: Recording Device Check

The Recording Device Check page appears for tests with speaking questions administered for the Ohio English Language Proficiency Assessment (OELPA) (see Figure 20). On this page, students record their voice and verify that they can hear the recorded audio.

Figure 20. Recording Device Check Page

To check recording device settings:

1. To begin recording, students select the icon.

2. Students speak into their recording device.

3. To stop recording, students select the icon.

4. To listen to their recorded audio, students select the icon.

5. Depending on the recorded audio quality, students do one of the following:

o If the recorded audio is audible, students select I heard my recording.

o If the recorded audio is not audible, students select I did not hear my recording. The Problem Recording Audio page appears.

Troubleshooting Recording Device Issues

The Problem Recording Audio page appears when students experience difficulties recording audio or playing back recorded audio. This page gives students up to three options:

• Try Again: This returns students to the Recording Device Check page.

• Log Out: This returns students to the sign-in page. You should troubleshoot the recording device or set up a new recording device.


28

• Select New Recording Device: This option only appears for students testing on computers or tablets with multiple recording devices. When students select this option, the Recording Input Device Selection page appears (see Figure 21), listing the available recording devices.

Figure 21. Recording Input Device Selection Page

a. To select a different recording device, students speak their names. The blue bar to the right of each recording device indicates the strength of the audio detection for that device.

b. Students select the recording device with the strongest audio detection.

c. Students select Yes.

Note: The Recording Input Device Selection page only allows students to change recording input devices. The audio output device does not change.


29

Step 6: Viewing Instructions and Starting the Test

The Instructions and Help page is the last step of the sign-in process (see Figure 22). Students may review this page to understand how to navigate the test and use test tools.

Figure 22. Instructions and Help Page

To proceed and begin the test:

• After reviewing this page, students select Begin Test Now. The test opportunity officially begins or resumes.

Test Delivery System Overview of the Student Testing Site

30

Section VII. Overview of the Student Testing Site This section describes the layout of the Student Testing Site and the available testing tools.

Test Layout

Figure 23 shows the main sections of the layout for a test page that includes a stimulus. A stimulus is a reading passage or other testing material (such as a simulation or graphic) that students review in order to answer associated questions.

Figure 23. Test Layout

A test page can include the following sections:

• The Global Menu section displays the global navigation and tool buttons. The banner above the global menu displays the Questions drop-down list, test information, help button, system settings button and test clock.

Note: To hide the elapsed time, students can select the test clock in the upper-right corner.

The TA Interface and Student Test Site will still maintain the time elapsed while the time is hidden. To display the hidden time, students can select the test clock again.

• The Stimulus section appears only for questions associated with a stimulus. This section contains the stimulus content (such as a reading passage or graphic), context menu and either the expand passage button or reading mode button.

• The Question section contains one or more test questions (also known as “items”). Each question includes a number, context menu, stem, and response area.

For more information about the global menu and context menus, see the section Using Menus and Tools.


31

Test Tools

This section provides an overview of the Test Delivery System’s available tools.

Figure 24 shows the primary features and tools available in the Student Testing Site.

Figure 24. Test Page

Note: Some tools are available for all tests, while others are only available for a particular subject, test setting, or type of question.

Table 5 and Table 6 list the Student Testing Site’s available global tools and context menu tools, respectively.

Table 5. Global Tools

Global Tool Instructions

Help To view the on-screen Test Instructions and Help window, select in the

upper-right corner.

Test Clock

To hide the time elapsed from displaying, select the test clock in

the upper-right corner. To unhide the elapsed time, select the test clock again. Note: The test clock and elapsed time will not display at all if the test clock setting is turned off for the student.


32


Calculator

To use the on-screen calculator, select Calculator.

The graphing calculator is available on the following tests:

• Algebra I

• Geometry

• Integrated Mathematics I

• Integrated Mathematics II

The scientific calculator is available on the following tests:

• Physical Science parts 1 and 2

• Grade 6 Mathematics part 2

• Grade 7 Mathematics part 2

• Grade 8 Mathematics parts 1 and 2

Formula

To view the on-screen reference sheet, select Formula.

Available on the following tests (sheet specifics vary by test):

• Grades 4 to 8 Mathematics

• Algebra I

• Geometry



• Physical Science

Line Reader

To highlight an individual line of text in a passage or question, select Line Reader. This tool is not available while the Highlighter tool is in use.

Masking

To temporarily cover a distracting area of the test page:

1. Select Masking.

2. Click and drag across the distracting area.

3. Release the mouse button.

To close the Masking tool, select Masking again. To remove a masked area, select X in the upper-right corner of that area.

Notes

To open the on-screen notepad, select Notes.

Students cannot copy/paste text from notepad into a response space.

Periodic Table

To view the on-screen periodic table, select Periodic Table.

Available on the Physical Science tests.

System Settings

To adjust text-to-speech settings during the test, select in the upper-right

corner.

Students testing on mobile devices cannot use this tool to adjust volume. To adjust audio volume on mobile devices, students must use the device's built-in volume control.


33


Zoom buttons

To enlarge the text and images on a test page, select Zoom In. You can zoom in up to four levels. To undo zooming, select Zoom Out.

Table 6. Context Menu Tools and Stimulus Tools

Tool Name Instructions

Expand Passage To expand the passage section, select the double arrow icon. The section

will expand and overlap the question section for easier readability. To collapse

the expanded section, select the double arrow icon again.

Expand Buttons You can expand the passage section or the question section for easier readability.

• To expand the passage section, select the right arrow icon below the

global menu. To collapse the expanded passage section, select the left arrow

icon in the upper-right corner.

To expand the question section, select the left arrow icon below the global

menu. To collapse the expanded question section, select the right arrow icon

in the upper-left corner.

Highlighter To highlight text, select the text on the screen and then select Highlight Selection from the context menu. To remove highlighting, select Reset Highlighting from the context menu.

Text in images cannot be highlighted. This tool is not available while the Line Reader tool is in use.

Mark for Review To mark a question for review, select Mark for Review from the context menu.

The question number displays a flap in the upper-right corner. A flag icon

appears next to the number on the test page. The Questions drop-down list

displays "(marked)" for the selected question.

Reading Mode Reading Mode opens a pop-up window that lets you view two pages of a reading

passage at a time. To open Reading Mode, select below a reading passage.

To exit Reading Mode, select in the lower-right corner of the pop-up window.

Select Previous Version

To view and restore responses previously entered for a Text Response question, select the Select Previous Version option from the context menu. A list of saved responses appears. Select the appropriate response and click Select.


34

Tool Name Instructions

Strikethrough

For selected-response questions, you can cross out an answer option to focus on the options you think might be correct. There are two options for using this tool:

• Option A:

a. To activate Strikethrough mode, open the context menu and select Strikethrough.

b. Select each answer option you wish to strike out.

c. To deactivate Strikethrough mode, press Esc or click outside the question’s response area.

• Option B:

a. Right-click an answer option and select Strikethrough.

Text-to-Speech To listen to passages and questions, select a Speak option from the context menu.

Text-to-Speech Tracking

When this tool is enabled, words become highlighted as text-to-speech reads them aloud.

Tutorial To view a short video demonstrating how to respond to a particular question type, select Tutorial from the context menu. The tutorials do not include audio.

Using Menus and Tools

This section describes how to use the global and context menus to access on-screen tools. This section also provides further details for using some of the Student Testing Site tools.

Note: Students can access tools using a mouse or keyboard commands. For information about keyboard commands, see Appendix C.

About the Global Menu

The global menu at the top of the test page contains navigation buttons on the left and tool buttons on the right (see Figure 25).

Figure 25. Global Menu

To open a test tool in the global menu:

1. Select the button for the tool. The selected test tool activates.

About the Context Menus

Each test page may include several elements, such as the question, answer options and stimulus (see Figure 23). The context menu for each element (including the stimulus) only contains tools that are applicable to that element (see Figure 26 and Figure 27).


35

Figure 26. Context Menu for Questions

Figure 27. Context Menu for Answer Options

Opening a Context Menu for Passages and Questions

Students can access context menus by right-clicking elements or by selecting elements and then clicking the context menu button.

To access the context menu for a passage or question:

1. Click the context menu button in the upper-right corner of the passage or question.

The context menu opens.

2. Select a tool.

Opening a Context Menu for Answer Options

Students can use the context menu to access tools for answer options in a multiple-choice or multi-select question.

To access an answer option’s context menu:

1. To open the context menu, do one of the following:

o If you are using a two-button mouse, right-click an answer option.

o If you are using a single-button mouse, click an answer option while pressing Ctrl.

o If you are using a Chromebook, click an answer option while pressing Alt.

o If you are using a tablet, tap the answer option and then tap the context menu button (this selects the answer option until you select a different option).

2. Select a tool from the context menu.


36

About the Masking Tool

The Masking tool allows students to hide distracting areas of the test page (see Figure 28).

Figure 28. Test Page with Masked Area

To mask an area of a test page:

1. To activate the Masking tool, select Masking in the global menu. The button becomes orange.

2. Click and drag across the distracting area of the test page.

3. Release the mouse button. The selected area becomes dark gray. The tool remains active until you deactivate it.

To deactivate the Masking tool:

1. Select Masking in the global menu again. The button becomes green. Any masked areas remain on the screen until you remove them.

To remove a masked area from a test page:

1. Select X in the upper-right corner of a masked area.


37

About Text-to-Speech (TTS)

Students testing with text-to-speech can listen to passages, questions, and answer options (see Figure 29). If a student is using Text-to-Speech Tracking, the words become highlighted as they are read aloud. Text-to-speech is only available when using the secure browser or a supported Chrome or Firefox browser.

For information about setting up text-to-speech, see the Technical Specifications Manual.

Figure 29. Speak Tool Options for Questions

To listen to content with the Text-to-Speech tool:

• To listen to a passage, students open the passage context menu and select a Speak option. Students can also select a portion of text to listen to, such as a word or phrase. To do this, students select the text, open the passage context menu and select Speak Selection.

Note: When listening to passages, students can pause TTS and then resume it at the point where it was paused. However, this feature is not available on mobile devices. Students testing on mobile devices can resume a paused TTS passage by selecting the remaining text to be read aloud and selecting Speak Selection from the context menu.

• To listen to a question or answer options, students open the question context menu and select one of the following Speak options:

o To listen only to the question, students select Speak Question.

o To listen to a multiple-choice question and all answer options, students select Speak Question and Options.

o To listen only to an answer option, select Speak Option from the context menu and then select the answer option. Students could also right-click the answer option and select Speak Option.


38

Selecting a Previous Response Version

The Select Previous Version tool allows students to view and restore responses they previously entered for a Text Response question. For example, if students type a response, click Save, delete the text, and enter new text, they can use this tool to recover the original response.

To recover a previously-entered response:

2. Select the Select Previous Version option from the context menu. The Select Previous Version window appears, listing all the saved responses for the question in the left panel (see Figure 30).

Figure 30. Select Previous Version Window

3. Select a response version from the left panel. The text associated with that response appears in the right panel.

4. Click Select. The selected response appears in the text box for the question.

Note: This tool is only available for Text Response questions. If the student or test administrator pauses the test, any responses entered prior to pausing will no longer appear in the Select Previous Version window.

Test Delivery System Proceeding Through a Test

39

Section VIII. Proceeding Through a Test Students can view reading passages, respond to questions, review previously answered questions, pause a test, and submit a test. The following sections describe each of these tasks.

About Reading Passages

Responding to Test Questions

Students answer test questions depending on the question type.

• Multiple choice questions: Students select a single answer option.

• Multi-select questions: Students select one or more answer options.

• Technology-enhanced questions: Students follow the instructions given for each question. Technology-enhanced questions require students to do any of the following tasks:

o Use an on-screen keypad to generate an answer.

o Select an object or text excerpt on the screen.

o Place points, lines, or bars on a graph.

o Drag and drop text or graphic objects.

o Enter text in a text box or table.

o Match answer options together.

o Modify a highlighted word or phrase in a reading selection.

o Enter input parameters to run an on-screen simulation.

When test question is associated with a reading passage, students should review the passage before responding to the question. The content for a reading passage may be paginated. To move between the

pages of a reading passage, students can select

and below the stimulus. Students can also select

to open the Reading Mode window, which displays

two pages at a time.

Note: If students want to highlight text that spans multiple pages in a reading passage, they must highlight the text on each page separately.

Figure 31. Reading Passage


40

Some questions may consist of multiple parts that students must answer. After students respond to all the questions on a page, they select Next to proceed to the next page.

All responses are saved automatically. Students can also manually save their responses to questions by selecting Save in the global menu.

Questions grouped with the same stimulus are tabbed for individual viewing ( ).

Students select the tabs in the upper-right corner to proceed to the corresponding question.

The navigation tabs may also include a stimulus icon ( ) that students can select to view the

stimulus associated with the grouped questions.

Note: Students can use the Student Practice Site to familiarize themselves with the question types that may appear on tests.

Reviewing Questions in a Test

Students may return to a previous question and modify their response if the test was not paused for more than one day. See the Pause Rules section for more information.

Students can use the Back button or the Questions drop-down list to return to questions they want to review. The drop-down list displays "(marked)" for any questions marked for review (see Figure 32).

Figure 32. Question Marked for Review

Pausing Tests

Students can pause the test at any time. Pausing a test logs the student out. To resume testing, students must repeat the sign-in process (see the section Signing in to the Student Testing Site).

To pause a test:

1. The student selects Pause in the global menu. A confirmation message appears.

2. The student selects Yes. The Student Sign-In page appears.


41

Submitting a Test

This section describes how students submit a test when they are done answering questions.

Reaching the End of a Test

After students respond to the last test question, the End Test button appears in the global menu (see Figure 33).

Figure 33. Global Menu with End Test Button

To end a test:

1. Students select End Test. A confirmation message appears.

2. Students select OK.

End Test Page

When students end a test, the End Test page appears (see Figure 34). This page allows students to review answers and submit the test for scoring. A flag icon appears for any questions marked for review.

Figure 34. End Test Page

To review answers:

1. Students select a question number.

2. To return to the End Test page, students select End Test in the global menu.


42

To submit the test:

1. Students select Submit Test.

Warning: Once students select Submit Test, they cannot return to the test or modify answers.

Your Results Page

After students submit the test, the Your Results page appears, displaying the student’s name, the test name, and the completion date (see Figure 35).

Figure 35. Your Results Page

For some practice tests, this page also displays a summary report (see Figure 36).

Figure 36. Practice Test Summary Report


43

Table 7 provides an overview of the columns in the practice test summary report.

Table 7. Overview of the Practice Test Summary Report

Column Description

Item Number The link in this column opens the question page with the student’s entered response.

Achieved This column displays the student’s achieved points for the item.

Max This column displays the maximum amount of points possible for the item.

Score Rationale This column displays information about the correct answer to a part of or the whole item. A check mark is shown next to the score rationale for each item or part of an item that the student responded to correctly. An X is shown next to the score rationale for each item or part of an item that the student responded to incorrectly.

To exit the Student Testing Site:

2. Select Log Out.

3. In the upper-right corner, select Close Secure Browser. For information about exiting the Student Testing Site on mobile devices, see Appendix A.

Note: If you are testing with the Take a Test app on Windows 10, you must press Ctrl + Alt + Delete to exit the Student Testing Site. For more information about the Take a Test app, see the Technical Specifications Manual.


44

Appendix A. About the Secure Browser This appendix includes the following sections:

• Additional Measures for Securing the Test Environment

• Configuring Tablets for Testing

• About Permissive Mode

• Troubleshooting

For more information about the secure browser, see the Secure Browser Installation Manual.

Additional Measures for Securing the Test Environment

The secure browser ensures test security by prohibiting access to external applications and navigation away from the test. This section provides additional measures you can implement to ensure the test environment is secure.

• Close External User Applications

Before launching the secure browser, or prior to administering the online tests, close all non-required applications on testing devices, such as word processors and web browsers.

• Avoid Testing with Dual Monitors

Students should not take online tests on computers connected to more than one monitor. Systems that use a dual monitor setup typically display an application on one screen while another application is accessible on the other screen.

• Disable Screen Savers and Timeout Features

On all testing devices, be sure to disable any features that display a screen saver or log users out after a period of inactivity. If such features activate while a student is testing, the secure browser logs the student out of the test.

Test Delivery System About the Secure Browser

45

Forbidden Application Detection

When the secure browser launches, it checks for other applications running on the device. If it detects a forbidden application, it displays a message listing the offending application and prevents the student from testing. This also occurs if a forbidden application launches while the student is already in a test.

In most cases, a detected forbidden application is a scheduled or background job, such as anti-virus scans or software updates. The best way to prevent forbidden applications from running during a test is to schedule such jobs outside of planned testing hours.

Configuring Tablets for Testing

Tablets and Chromebooks should be configured for testing before you provide them to students. For more information, see the Technical Specifications Manual on the Ohio's State Tests Portal.

To configure iOS devices:

1. Tap the AIRSecureTest secure browser icon.

To configure Android tablets:

1. Tap the AIRSecureTest secure browser icon. 2. If the secure browser keyboard is not selected, follow the prompts on the screen.

When the secure browser keyboard is selected, the secure browser app opens.

To configure Chromebooks:

1. From the Apps link on the Chrome OS login screen, select AIRSecureTest secure browser.

Closing the Student Testing Site on Tablets

After a test session ends, close the AIRSecureTest application on student tablets.

To close the Student Testing Site on iOS devices:

1. Double-tap the Home button. The multitasking bar appears. 2. Locate the AIRSecureTest app preview and slide it upward.

To close the Student Testing Site on Android tablets:

1. Tap the Menu icon in the upper-right corner.

2. Tap Exit. A confirmation message appears. 3. Tap Exit.

To close the Student Testing Site on Chromebooks:

1. Click Close Secure Browser in the upper-right corner.



46

About Permissive Mode

Permissive Mode is a test setting option that allows students to use accessibility software in addition to the secure browser.

Policy: Requests to use permissive mode for operational testing must be submitted in advance of testing. Districts may submit student-specific requests by contacting the Ohio Help Desk. All requests will be reviewed by the Ohio Department of Education.

Permissive Mode activates when the student is approved for testing. Students who have the Permissive Mode setting enabled should not continue with the sign-in process until their accessibility software is correctly configured.

To use accessibility software with the secure browser:

1. Open the required accessibility software.

2. Open the secure browser. Begin the normal sign-in process up to the test administrator approval step.

3. When a student is approved for testing, the secure browser allows the operating system’s menu and task bar to appear.

4. The student must immediately switch to the accessibility software that is already open on the computer so that it appears over the secure browser. The student cannot click within the secure browser until the accessibility software is configured.

o Windows: To switch to the accessibility software application, click the application in the task bar.

o Mac: To switch to the accessibility software application, click the application in the dock.

Note: When using Windows 8 and above, the task bar remains on-screen throughout the test after enabling accessibility software. However, forbidden applications are still prohibited.

5. The student configures the accessibility software settings as needed.

6. After configuring the accessibility software settings, the student returns to the secure browser. At this point, the student can no longer switch back to the accessibility software. If changes need to be made, the student must sign out and then sign in again.

7. The student continues with the sign-in process.


47

Permissive Mode is available only for computers running supported desktop Windows and Mac operating systems. For information about supported operating systems, see the Technical Specifications Manual.

Personnel should test the compatibility of assistive technology software with permissive mode enabled by accessing the Student Practice Site via the Secure Browser in advance of testing.

Forbidden applications will still not be allowed to run.

Troubleshooting

This section describes how to troubleshoot some situations in which a student cannot connect to a test.

Resolving Secure Browser Error Messages

This section provides possible resolutions for the following messages that students may receive when signing in.

• You cannot login with this browser: This message occurs when the student is not using the correct secure browser. To resolve this issue, ensure the latest version of the secure browser is installed, and that the student launched the secure browser instead of a standard web browser. If the latest version of the secure browser is already running, then log the student out, restart the computer, and try again.

• Looking for an internet connection: This message occurs when the secure browser cannot connect with the Test Delivery System. This can occur if there is a network-related problem. Make sure that either the network cable is plugged in (for wired connections) or the Wi-Fi connection is live (for wireless connections). Also check if the secure browser must use specific proxy settings; if so, those settings must be part of the command that launches the secure browser.

• Test Environment Is Not Secure: This message can occur when the secure browser detects a forbidden application running on the device (see the section Additional Measures for Securing the Test Environment). If this message appears on an iPad, ensure that either Autonomous Single App Mode or Automatic Assessment Configuration is enabled (see the section Configuring Tablets for Testing).



48

Force-Quit Commands

In the rare event that the secure browser or test becomes unresponsive, you can force-quit the secure browser.

To force the secure browser to close, use the keyboard command for your operating system as shown below. This action logs the student out of the test. When the secure browser is opened again, the student logs back in to resume testing.

Operating System Key Combination

Windows* Ctrl + Alt + Shift + F10

Mac OS X* Ctrl + Alt + Shift + F10. The Ctrl key may appear as Control, Ctrl, or ^

Linux Ctrl + Alt + Shift + Esc

*If you are using a laptop or notebook, you may need to press Function before pressing F10.

Caution: Use of Force-Quit Commands

The secure browser hides features such as the Windows task bar or Mac OS X dock. If the secure browser is not closed correctly, then the task bar or dock may not reappear correctly, requiring you to reboot the device. Avoid using a force-quit command if possible.

Force-quit commands do not exist for the secure browser for iOS, Chrome OS, and Android devices.

• iOS: Double-tap the Home button, then close the app as you would any other iOS app.

• Chrome OS: To exit the secure browser, press Ctrl + Shift + S.

• Android: To close the secure browser, tap the menu button in the upper-right corner and select Exit.

Test Delivery System Text Response Formatting Toolbar

49

Appendix B. Text Response Formatting Toolbar In addition to the standard test tools described in the section Test Tools, students can use a formatting toolbar above the response field for some text response questions (see Figure 37). The formatting toolbar allows students to apply styling to text and use standard word-processing features.

Figure 37. Text Response Question with Formatting Toolbar

The lower-right corner of the response field displays the word count and character count for the student's response.

Table 8 provides an overview of the formatting tools available.

Table 8. Description of Formatting Tools

Tool Description of Function

Bold, italicize, or underline selected text.

Remove formatting that was applied to the selected text.

Insert a numbered or bulleted list.

Indent a line of selected text.

Decrease indent of text.

Cut selected text.

Copy selected text.

Paste copied or cut text.

Undo the last edit to text or formatting in the response field.

Redo the last undo action.

Use spell check (if available) to identify potentially misspelled words in the response field. Not available for English language arts tests.

Add special characters in the response field.

Test Delivery System Text Response Formatting Toolbar

50

Spell Check

The spell check tool identifies words in the response field that may be misspelled (see Figure 38).

Figure 38. Spell Check Tool

To use spell check:

1. In the toolbar, select .

2. Potentially incorrect words change color and become underlined.

3. Select a misspelled word. A list of suggestions appears.

4. Select a replacement word from the list. If none of the replacement words are correct, close the list by clicking anywhere outside it.

5. To exit spell check, select again.

Special Characters

Students can add mathematical, accented characters, and other symbols.

To add a special character:

1. In the toolbar, select .

2. In the window that pops up, select the required character (see Figure 39).

Figure 39. Special Characters Window

Test Delivery System Keyboard Navigation for Students

51

Appendix C. Keyboard Navigation for Students Students can use keyboard commands to navigate between test elements, features, and tools.

Keyboard commands require the use of the primary keyboard. Do not use keys in a numeric keypad.

Sign-In Pages and In-Test Pop-ups

Table 9 lists keyboard commands for selecting options on the sign-in pages or pop-up windows that appear during a test.

Table 9. Keyboard Commands for Sign-In Pages and Pop-Up Windows

Function Keyboard Commands

Move to the next option Tab

Move to the previous option Shift + Tab

Select the active option Enter

Mark checkbox Space

Scroll through drop-down list options Arrow Keys

Close pop-up window Esc

Keyboard Commands for Test Navigation

Table 10 lists keyboard commands for navigating tests and responding to questions.

Table 10. Keyboard Commands for Test Navigation

Function Keyboard Commands

Scroll up Up Arrow

Scroll down Down Arrow

Scroll to the right Right Arrow

Scroll to the left Left Arrow

Move to the next element Tab

Move to the previous element Shift + Tab

Select an answer option Space

Go to the next test page Ctrl + Right Arrow

Go to the previous test page Ctrl + Left Arrow

Open the global menu Ctrl + G

Open a context menu Ctrl + M


52

Keyboard Commands for Global and Context Menus

Students can use keyboard commands to access tools in the global and context menus. For more information about tools in the global menu, see Table 5. For more information about tools in the context menu, see Table 6.

Global Menu

To access the global menu tools using keyboard commands:

1. Press Ctrl + G. The global menu list opens.

2. To move between options in the global menu, use the Up or Down arrow key.

3. To select an option, press Enter.

4. To close the global menu without selecting an option, press Esc.

Context Menus

To open the context menu for an element:

5. Navigate to the element using the Tab or Shift + Tab command.

1. Press Ctrl + M. The context menu for the selected element opens.

2. To move between options in the context menu, use the Up or Down arrow keys.

3. To select an option, press Enter.

4. To close the context menu without selecting an option, press Esc.

Highlighting Selected Regions of Text

This section explains how to use keyboard commands to select a text excerpt (such as a word in a passage) and highlight it. These instructions only apply to students using the secure browser.

To select text and highlight it:

1. Navigate to the element containing the text you want to select.

2. Press Ctrl + M to open the context menu and navigate to Enable Text Selection.

3. Press Enter. A flashing cursor appears at the upper-left corner of the active element.

4. To move the cursor to the beginning of the text you want to select, use the arrow keys.

5. Press Shift and an arrow key to select your text. The text you select appears shaded.

6. Press Ctrl + M and select Highlight Selection.


53

Keyboard Commands for Grid Questions

Questions with the grid response area (see Figure 40) may have up to three main sections:

• Answer Space: The grid area where students enter the response.

• Button Row: The following buttons may appear above the answer space: Delete, Add Point, Add Arrow, Add Line, Add Circle, Add Dashed Line, and Connect Line.

• Object Bank: A panel containing objects you can move to the answer space.

Figure 40. Grid Question

To move between the main sections:

1. To move clockwise, press Tab. To move counter-clockwise, press Shift + Tab.

To add an object to the answer space:

1. With the object bank active, use the arrow keys to move between objects. The active object has a blue background.

2. To add the active object to the answer space, press Space.

To use the action buttons:

1. With the button row active, use the left and right arrow keys to move between the buttons. The active button is white.

2. To select a button, press Enter.

3. Press Space to apply the point, arrow, or line to the answer space.

To move objects and graph elements in the answer space:

1. With the answer space active, press Enter to move between the objects. The active object displays a blue border.

2. Press Space.

3. Press an arrow key to move the object. To move the object in smaller increments, hold Shift while pressing an arrow key.

Test Delivery System Transferring a Test Session

54

Appendix D. Transferring a Test Session You can transfer an active test session from one device or browser to another without stopping the session or interrupting in-progress tests. This is useful in scenarios when your computer malfunctions while a session is in progress.

Warning: If you do not know the active Session ID, you cannot transfer the session.

The Test Delivery System ensures that you can only administer a test session from one browser at a time. If you move a test session to a new device, you cannot simultaneously administer the session from the original browser or device.

These instructions apply to both the Test Administrator Interface and Test Administrator Practice Site. However, you cannot transfer a session from the Test Administrator Interface to the Test Administrator Practice Site or vice versa.

To transfer a test session to a new device or browser:

1. While the session is still active on the original device or browser, log in to the Test Administrator Site on the new device or browser. A Session ID prompt appears.

2. Enter the active Session ID in the text box and click Enter. The Test Administrator Site appears, allowing you to continue monitoring your students’ progress. The test session on the previous computer or browser automatically closes.

The Session ID prompt appears any time you access the TA Site during an active session. If you do not wish to return to the active session, you can click Start a Different Session to create a new session or Logout to close the active session and log out of the TA Site.

Test Delivery System User Support

55

Appendix E. User Support For additional information and assistance in using the Test Delivery System, contact the Ohio Help Desk.

The Help Desk is open Monday-Friday 7 a.m. to 5 p.m. (except holidays or as otherwise indicated on the Ohio's State Tests portal).

Ohio Help Desk

Toll-Free Phone Support: 877-231-7809

Email Support: [email protected]

Please provide the Help Desk with a detailed description of your problem, as well as the following:

• Test Administrator name

• If the issue pertains to a student, provide the student’s SSID, test name, test part, and associated district or school. Do not provide the student’s name.

• If the issue pertains to a TIDE user, provide the user’s full name and email address.

• Any error messages and codes that appeared, if applicable.

• Affected test ID and question number, if applicable.

• Operating system and browser version information, including version numbers (for example, Windows 7 and Firefox 45 or Mac OS 10.10 and Safari 8)

• Information about your network configuration, if known:

o Secure browser installation (to individual devices or network)

o Wired or wireless internet network setup

Test Delivery System Change Log

56

Appendix F. Change Log

Date Section/Element Note

9/25/17 Initial version 2017-2018

02/06/18 All Updates made to reflect TA Interface time column and Student Interface test clock feature

FOURTH EDITION

JANUARY 2018

Ohio’s Accessibility Manual

Table of Contents

Section 1: Introduction ........................................................................................................................... 4

1.1 About this Manual ........................................................................................................................................4

1.2 About Accessibility Features on Ohio’s State Tests ...............................................................................4

1.3 General Testing Procedures .......................................................................................................................4

Section 2: Ohio’s Accessibility Features for Students Taking Ohio’s State Tests ........................ 4

2.1 Decision-Making Framework for Accessibility Features ........................................................................4

2.2 Ohio’s Accessibility Features .....................................................................................................................5

2.3 Administrative Considerations ...................................................................................................................6

2.4 Universal Tools ............................................................................................................................................7

2.5 Designated Supports ................................................................................................................................. 10

2.6 Accommodations for Students with Disabilities and English learners............................................... 13

2.7 Considerations for English Learner Accommodations ......................................................................... 23

2.8 Other Accommodations and Modifications ............................................................................................ 27

Section 3: Universal Design and Ohio’s State Tests ........................................................................ 28

Acknowledgements

The Ohio Department of Education would like to acknowledge the members of the Ohio AT Network for giving their time, insight and expertise to this manual.

Revision History The revision history of this manual provides a means for readers to easily navigate to places in the relevant section where updates have occurred. Significant changes and updates are indicated with red text and underline for additions and strike-throughs for deletions. Minor changes, such as typos, formatting and grammar corrections or updates, are not highlighted.

Page Description

4 Noted that some universal features can be turned off.

4 Noted new Desmos calculator.

5 Changed description of Highlighter to match Test Administrator Manual (TAM).

5 Changed name of General Masking to Masking to match TAM.

5 Changed description of Line reader to match TAM.

5 Changed name of Flag items to Mark for review to align with TAM.

5 Changed description of Paginated stimuli and reading mode to match TAM.

6 Changed name of Eliminate answer choices to Strikethrough to align with TAM.

6 Added new feature, Test Timer.

6 Added reminder about voice packs for Text-to-speech.

6 Changed name of Magnification or enlargement to Zoom to align with TAM.

7 Noted that online features can be turned on and off in the student test settings.

7-8 Removed multiple features that can be disabled and consolidated under term Disable universal tool.

8 Added Fact charts to Calculator - handheld.

8 Added Line reader tool - handheld.

9 Added Spellchecker - handheld.

9 Added Tactile fidgets/Fidget devices.

9 Changed name Timer to Timer – external to differentiate from new universal Test timer feature.

10 Added note about documenting accommodations for different standardized tests.

11 Added note to not document accommodations for college and career readiness tests on IEPs or 504 plans.

11 Added that graphic organizers are not allowable on Ohio State Tests.

11 Added note about the Assistive Technology and Accessible Educational Materials Center.

12 Added reminder that reading only questions and answer options to students is not allowed.


15 Added note about the Assistive Technology and Accessible Educational Materials Center.

16 Noted new Desmos calculator.

16-17 Changed calculator policy for science tests.

17 Added rekenrek and removed limit on use to visually impaired.

17 Added comment about Mathematical tools.

18 Removed prohibition of calculators on science tests.


23 Added word-to-word glossaries and dictionaries approved by ACT and College Board to state allowed dictionaries.

23 Added additional information to section on emergency accommodations.

P a g e 4 | O H I O ’ S A C C E S S I B I L I T Y M A N U A L | J a n u a r y 2 0 1 8

Section 1: Introduction

1.1 About this Manual Ohio’s Accessibility Manual is a comprehensive policy document providing information about the accessibility features of Ohio’s State Tests for grades 3-8 and high school in English language arts, mathematics, science and social studies. The manual helps to define the specific accessibility features available for all students, students with disabilities, students who are English learners and students who are English learners with disabilities. The intended audience of the manual is district decision makers and teams who will determine the accessibility features for all students taking the tests.

1.2 About Accessibility Features on Ohio’s State Tests Ohio regards tests as tools for enhancing teaching and learning. Ohio is committed to providing all students, including but not limited to, students with disabilities, English learners, English learners with disabilities, and underserved populations, with equitable access to high-quality, 21st century assessments. By applying principles of universal design, leveraging technology, and embedding and allowing a broad range of accessibility features, Ohio’s State Tests provide opportunities for the widest possible number of students to demonstrate their knowledge and skills. Ohio sets and maintains high expectations that all students will have access to the full range of grade-level and course content standards. Together, these elements will increase student access to Ohio’s State Tests with fidelity of implementation. Ohio’s goals for promoting student access include:

● Applying principles of universal design to the development of the assessments such that the assessments provide the greatest amount of accessibility and minimize test related barriers for all students;

● Measuring the full range of complexity of the standards; ● Leveraging technology for the accessible delivery of the assessments; ● Building accessibility throughout the test without sacrificing assessment validity; and ● Using a combination of accessible design and accessible technologies from the inception of items and

tasks.

1.3 General Testing Procedures For information about coordinating or administering Ohio’s State Tests, including test security policies, administrative procedures and tasks to complete before, during and after testing, refer to the Test Administration Manual. Manuals are available on Ohio’s State Tests Portal.

Section 2: Ohio’s Accessibility Features for Students Taking Ohio’s State Tests

2.1 Decision-Making Framework for Accessibility Features Students should be familiar with accessibility features prior to testing and should have the opportunity to select, practice and use those features in instruction before test day. Students can become familiar with the computer-based features by accessing the practice items available on the Student Practice Site on Ohio’s State Test

http://www.ohiostatetests.org/

http://oh.portal.airast.org/ocba/students-and-families/


Portal. Appendix G provides a graphic to assist district testing accessibility decision makers in selecting appropriate features based on student needs. The graphic shows the various layers of features and provides guiding questions to support the district’s selection process.

2.2 Ohio’s Accessibility Features Through a combination of universal design principles and computer-embedded accessibility features, Ohio has designed an inclusive assessment system by considering accessibility from initial design through item development, field-testing and implementation of the assessments for all students. Although accommodations may still be needed for some students with disabilities and English learners to assist in demonstrating what they know and can do, the computer-embedded accessibility features should minimize the need for accommodations during testing and ensure the inclusive, accessible and fair testing of the diverse students being assessed.

Ohio’s Accessibility System

Accommodations for students with disabilities must be documented on IEPs or 504 plans. Other accessibility features are not required to be documented to be provided. However, if there is an accessibility feature that a team wants to ensure a student receives, the team should document the feature on the student’s IEP or 504 plan as well.


For example, if a student with a disability needs to have the test administered in a small group setting or if a student must have color contrast for testing, these features also should be included on the IEP or 504 plan. If they are not included on a plan, they may still be provided, but documenting the student’s need ensures that the features are provided.

2.3 Administrative Considerations Students are typically tested in their general education classrooms following the test administration schedule for the grade and content area being administered. However, the administrator has the authority to schedule students in testing spaces other than general education classrooms and at different scheduled times, as long as all requirements for testing conditions and test security are met as set forth in the Test Administration Manual. Decisions may be considered, for example, that benefit students who are easily distracted in large group settings by testing them in a small group or individual setting. In general, changes to the timing, setting or conditions of testing are left to the discretion of the principal or test coordinator. In accordance with principles of universal design for assessment, these administrative considerations are available to all students.

Administrative Considerations Description

Familiar test administrator The student knows the test administrator and/or interpreter.

Frequent breaks

All students may take breaks as needed. Frequent breaks refers

to multiple, planned, short breaks during testing based on a

specific student’s needs (for example, the student fatigues

easily). During each break, the testing clock is stopped.

Students should pause their test when taking a break. Students may pause their test from the student testing site or the test administrator may do so from the Test Administrator Interface. Pausing a student’s test signs the student out of his or her test. A student who pauses his or her test and signs back into the test on the same school day will be able to revisit all the items on the test. A warning message displays after 20 minutes of test inactivity. If

the student does not click OK within 30 seconds after this

message appears, the test is paused and they are signed out.

Separate or alternate location

The test is administered in a different location than the location

where other students are testing (for example, a different

classroom).

Small group

A small group is a subset of a larger testing group assessed in a

separate location. There is no specific number defined for a small

group, but two to eight students is typical. A “group” of one also is

permissible. Small groups may be appropriate for human read-

aloud and translated test administration or to reduce distractors

for some students.




Specialized equipment or furniture This includes equipment such as adjustable desks or chairs.

Specified area or seating

The student sits in a specific place in the test setting, such as by

the window for natural light or beside the test administrator’s

desk.

Time of day

The student takes test during time of day most beneficial to his or

her performance. Care must be taken to ensure that the student

has all allowable time available for testing.

2.4 Universal Tools On the Ohio computer-based assessments, universal tools are features or preferences that are either built into the assessment system or provided externally by test administrators. Universal tools are available for all students taking Ohio’s State Tests. Since these features are available for all students, they are not classified as accommodations. Students should be familiar with these features prior to testing and should have the opportunity to select and practice using them in order to appropriately use these features on test day. Universal tools are intended to benefit a wide range of students and may be used by the student at his or her discretion during testing. Universal tools embedded in the test delivery system are on by default but some may be turned off. See the Test Administration Manual for detailed information about turning features on and off in the student test settings.

Universal Tools Description

Blank paper

The test administrator provides blank scratch paper to students to take

notes and/or work through items during testing. Blank paper is required

for the English language arts tests. For mathematics, science and social

studies, blank paper must be available upon request. Refer to the Test

Administration Manual for more information about blank paper.

Calculator – Test Delivery System

The Test Delivery System provides a calculator for student use on calculator-allowable mathematics tests or parts of test and the physical science test. Beginning with the 2017-2018 school year, the Ohio’s State Tests will use Desmos as the online calculator. The previous calculator versions will no longer be available on the tests or under the Student Practice Test resources. Practice tests that have the calculator tool have been updated to provide students with the Desmos calculator. The Desmos calculators are also available in the Student Practice Resources folder on the Ohio’s State Tests portal. Additional calculator guidance is in the Test Administration Manual. A graphing calculator is available on the following tests:

• Algebra I

• Geometry



http://oh.portal.airast.org/ocba/



http://oh.portal.airast.org/ocba/resources/



A scientific calculator is available on the following tests:

• Physical Science

• Grades 6 to 8 Mathematics

General directions

The test administrator must read the scripted general directions for starting all administrations and must not deviate from the script. After the test administrator has read the directions, students may ask for the directions to be repeated or clarified. General directions may be translated or signed (e.g. ASL). General directions include the scripted information for students that comes before the test starts. Once students have begun the test, nothing may be clarified.

General Masking

The student electronically “covers” parts of the test with a blank box, as

needed covers an area of the item so they can focus on certain item

elements. The student may uncover anything masked when ready. This

setting can be changed in TIDE and the Test Administrator Interface.

Headphones

The student uses headphones or earbuds to access text-to-speech or

media on the assessment.

Students using text-to-speech must use headphones if tested in a group

setting.

At this time, there are no audio clips embedded in any content area

test. Therefore, headphones are not required for testing unless a

student is using the text-to-speech feature in a group setting.

Students with hearing impairments may use personal FM systems. For

more information on additional assistive technology devices and

software for use on Ohio’s State Tests, refer to Appendix D of this

manual.

Highlighter

The student electronically highlights text as needed to recall and/or

emphasize. This setting can be changed in TIDE and the Test

Administrator Interface.

Line reader

The student uses an onscreen tool to highlight lines of text as they read.

This setting can be changed in TIDE and the Test Administrator

Interface.

Mark for review (Flag items) The student electronically “flags” or “bookmarks” items to review later.

Notepad The student writes notes using the embedded notepad feature.

Paginated stimuli and reading mode

The student moves between pages of a reading passage by clicking on the arrow keys below the passage reads a passage by flipping pages, similar to a book or e-reader. This eliminates vertical scrolling on passages. The student can also select to open the reading mode window which displays two pages of the reading passage at a time.


Paginated stimuli and reading mode are available only for ELA and some social studies tests. This setting can be changed in TIDE and the Test Administrator Interface

Redirect student to the test

The test administrator redirects the student’s attention to the test

without coaching or assisting the student in any way. To redirect a

student is not the same as to cueing or prompting the student.

Spellcheck

This feature allows the student to check the spelling of words in student-

generated responses.

Spellcheck is available only for some science and social studies items that require a student to write/type a response. Unlike some word processing programs, the Student Testing Site does not automatically highlight misspelled words as the student types. Students must click the ABC button to check spelling. Spellcheck is not allowed on the English language arts tests. There are no type-written responses for mathematics.

Strikethrough

(Eliminate answer choices)

The student electronically crosses out possible answer choices on multiple choice items. This setting can be changed in TIDE and the Test Administrator Interface

Test timer

The student test timer displays the amount of time the student has been in the test. The timer only runs while the student is viewing test content. The test timer does not enforce a time limit. Test administrators are responsible for ensuring that students complete each part of their tests within the posted testing time, or when applicable, within the student’s allotted extended time. The student can collapse or un-collapse the test timer by clicking on it. This feature may be turned off.

Text-to-speech for mathematics,

science and social studies

Text-to-speech as a universal tool will be turned on for mathematics,

science and social studies. The text-to-speech feature reads aloud the

test to the student when the student selects an available “speak” option.

Student must use headphones if tested in a group setting.

Only students who meet the criteria to have a read-aloud

accommodation on the English language arts test may use this feature

for English language arts.

Students who use text-to-speech should use a voice pack they are

familiar with and adjust the volume, pitch and rate prior to starting the

test. Detailed information about text-to-speech functionality is in the

Test Administration Manual. Manuals are available on Ohio’s State

Tests Portal.

Text-to-speech tracking for

mathematics, science and social

studies

The feature will highlight words in test questions as the embedded text-

to-speech feature reads the test aloud.

Only students who meet the criteria to have a read-aloud

accommodation on the English language arts test may use this feature

for English language arts.




Writing tools Writing tools (cut and paste, copy, underline, bold and insert bullets) are

available for select constructed-response items.

Zoom

(Magnification or enlargement)

Students use the zoom out and zoom in buttons to decrease and increase the size of the text and graphics on the page. Maximum zoom is about 250 percent depending on the device.

2.5 Designated Supports A relatively small number of students will require additional features for their particular needs (for example, changing the background or font color or disabling text-to-speech for the mathematics assessments). Providing too many tools on screen might distract some students. Therefore, some designated features will be selected ahead of time based on the individual needs and preferences of the student. Students should practice using these features and understand when and how to use them. Students can decide whether or not to use a pre-selected support without any consequence to the student, school or district. Individualizing access needs on the test for each student provides increased opportunities to accurately demonstrate knowledge and skills. Designated supports are divided into two types: 1) embedded designated supports; and 2) non-embedded

designated supports. Embedded supports are those that are available as part of the technology platform. They

can be turned on three different ways:

1. By uploading a student settings file in TIDE;

2. By marking the features under the “Test Settings” section of the student’s record manually in TIDE; or

3. Test administrators can select the feature(s) under “Test Settings” in the Test Administrator Interface

when approving the student to test during the test session.

See the Test Administration Manual for detailed information about turning features on and off in the student

test settings. Non-embedded supports are not part of the technology platform so test administrators must

provide them locally.

Designated Supports

Embedded Designated

Supports

Description

Background/font color choice

• Black on light yellow

• Black on light blue

• Black on light magenta

• White on black (inverted)

• White on navy blue

Alternate on-screen background and font color is enabled.

A note about color blindness: The Department follows accessibility color guidelines when developing test items. Items on state tests should not be color dependent. Graphs, maps, charts and other images may have color, but being able to distinguish the colors should not affect a student’s ability to respond to a question. When using color-contrast options, the contrast may not transfer to some images or text in images. If a student comes to an item that he or she cannot answer, either because it is not universally accessible or the color contrast does not work properly, it is allowable for the test administrator to describe what needs to be explained to the student to be able to answer the question. The test administrator must be



cautious to not provide any information that gives the answer to the student.

Disable universal tool Some students may benefit from fewer tools in the Test Delivery System when testing. Many of the universal tools available in the Test Delivery System can be turned off. See the Test Administration Manual for details about turning student settings on and off.

Disable general masking Turn off general masking to reduce student distraction.

Disable paginated stimuli and reading mode

Turn off paginated stimuli and reading mode to reduce student distraction.

Disable text-to-speech for


studies

Turn off text-to-speech to reduce student distraction.

Disable text-to-speech tracking for


studies

Turn off text-to-speech tracking to reduce student distraction.

Mouse pointer size and color

• Large/extra large black

• Large/extra large green

• Large/extra large red

• Large/extra large yellow

• Large/extra large white

Adjust the size and color of the mouse cursor as it appears on the student's screen.

Print size

• Level 0: 1X (default/no zoom)

• Level 1: 1.5X

• Level 2: 1.75X

• Level 3: 2.5X

• Level 4: 3X

The print size can be pre-set to one to four levels larger than the

default.

Non-embedded Designated

Supports

Description

Calculator or fact charts - handheld

Students may use handheld calculators and fact charts (addition, subtraction, multiplication or division only) for calculator-allowable mathematics tests or parts of test and the physical science test. Additional calculator guidance is in the Test Administration Manual.

External magnification or

enlargement device

The student uses external magnification or enlargement devices to

increase the font or graphic size (e.g., projector, closed-circuit

television, eye-glass mounted or hand-held magnifiers, electronic

magnification systems, etc.).

Line reader tool - handheld The student uses a blank straight edge as he or she reads and follows along with the text on the screen.



Music and white noise

A student or group of students listens to background music during testing. The test administrator may play music to a student or group of students, or a student may use a teacher-provided device and earbuds. Music selections should be free of any test content-specific lyrics. Test security must be maintained. Students may not use a personal device (e.g. cell phone, MP3 player). Additional information about the electronic device policy is in the Test Administration Manual.

Noise buffers

The student uses headphones/earbuds or earplugs to minimize

distraction or filter external noise during testing. If students use

headphones/earbuds as noise buffers, they should not be plugged into

a device.

Rulers, angled-rulers, compasses and protractors

Students may be familiar with these tools from instruction at various grade levels and want to use them on the test. While these tools are not required for testing, districts may choose to provide them to students or allow students to provide their own. The tools cannot contain any additional writing or information that may provide an unfair testing advantage. Examples of additional writing could include but are not limited to multiplication tables, formulas or conversion charts. A student with a visual impairment may need adapted mathematical tools such as a large print ruler, Braille ruler, tactile compass or Braille protractor.

Specialized paper

In addition to blank paper, students may use test administrator-

provided grid paper, wide-ruled paper, Braille paper, raised-line paper,

bold-line paper, raised-line grid paper, bold-line grid paper, colored

paper, etc. The paper provided cannot contain any writing that may

give the student an unfair testing advantage. Examples of additional

writing that is prohibited can include but is not limited to number lines,

two-column tables, fraction models and coordinate grids. Students also

may use personal white boards.

Spellchecker - handheld

The student uses a handheld spellchecker during testing instead of the universal spellcheck embedded into the test delivery system. A handheld spellchecker may be used on all Ohio state tests except English language arts. Spell checkers may not connect to the internet, store information or include definitions, phrases, sentences or pictures. The student should be familiar with the spellchecker he or she will use during testing.

Student reads test aloud to self

Student reads aloud to self. This feature includes the use of whisper

phones. Student must be tested in a one-on-one setting so that the

student does not disturb other students or in a setting in which students

are separated enough from each other that they cannot hear each

other and do not disturb one another.




Tactile fidgets/Fidget devices

Student uses tool for self-regulation, to help with focus, attention, calming, and active listening. (e.g. Fidget Spinner, squish ball, focus cube, pencil topper, etc.). Tool must be free of anything that may give an advantage during testing or test content.

Timer - external

Student uses a timer. There are a variety of timers that students may use, ranging from basic kitchen timers to more complex wearable devices that vibrate or flash at preset intervals or timers with visual clues such as a red covering that disappears as the timer counts down.

Students may not use cell phones and devices must not connect to the internet.

2.6 Accommodations for Students with Disabilities and English learners While all students potentially can benefit from the universal tools and designated supports embedded within the test, some students may still need further support to access the tests and show what they know. Those students may require testing accommodations. Accommodations for testing are supports that are already familiar to the student because they are being used in the classroom to support instruction. Four distinct groups of students may receive accommodations on Ohio’s State Tests:

1. Students with disabilities who have an Individualized Education Program (IEP);

2. Students with a Section 504 plan who have physical or mental impairments that substantially limit

one or more major life activities, have records of such impairments, or are regarded as having such

impairments, but who do not qualify for special education services;

3. Students who are English learners (Guidelines for determining English learner status can be

found in the Ohio Statewide Assessments Rules Book.) Students who have exited English learner

status may not receive English learner accommodations on Ohio’s State Tests; and

4. Students who are English learners with disabilities who have IEPs or 504 plans are eligible for

both accommodations for students with disabilities and English learners. For additional guidance

and information about English learners with disabilities, access the About the Lau Resource Center

page of the Ohio Department of Education website.

For Ohio’s State Tests, accommodations are considered to be adjustments to the testing conditions, test format or test administration that provide equitable access during assessments for students with disabilities and students who are English learners. The administration of the assessment should never be the first occasion in which an accommodation is introduced to the student. Accommodations should:

• Provide equitable access during instruction and assessment;

• Mitigate the effects of a student’s disability or English learner status;

• Not reduce learning or performance expectations;

• Not change the construct being assessed; and

• Not compromise the integrity or validity of the assessment. The guidelines provided in this manual are intended to ensure that valid and reliable scores are produced on Ohio’s State Tests and that an unfair advantage is not given to students who receive accommodations. Outside of the guidance provided in this manual, changes to an accommodation or the conditions in which it is provided may change what the test is measuring, and will likely call into question the reliability and validity of the results regarding what a student knows and is able to do as measured by the test.

http://education.ohio.gov/Topics/Testing/Testing-Forms-Rules-and-Committees/Ohio-Statewide-Assessment-Program-Rules-Book



Accommodations should adhere to the following principles:

• Accommodations enable students to participate more fully and fairly on assessments and to demonstrate their knowledge and skills;

• Accommodations should be based upon an individual student’s needs rather than on the category of a student’s disability, level of English language proficiency alone, level of or access to grade-level instruction, amount of time spent in a general classroom, current program setting or availability of staff;

• Teams should base accommodations on a documented need in the instruction and assessment setting and educators should not provide accommodations in order to give the student an enhancement that others could view as an unfair advantage;

• IEP teams and 504 Plan coordinators should describe and document accommodations for students with disabilities in the student’s appropriate plan (i.e., either the IEP or 504 Plan);

• Ohio requires that districts develop district-wide educational plans for English learners that include testing accessibility features;

• Educators should not introduce accommodations to the student for the first time during testing; • When allowable, students also should use accommodations used during instruction on district

assessments and state tests. • Policies about allowable accommodations sometimes differ between standardized tests. For example,

an accommodation allowed on the state ELA test may not be allowed on a district ELA test. To help ensure that students only receive accommodations that result in a valid assessment, IEP teams and 504 plan coordinators should document student accommodations by test and content area, not content area alone.

• Tests that require application for and vendor approval of accommodations, such as college and career readiness tests, should not be documented on an IEP or 504 plan. Because students will have a valid score only if they use the accommodations the vendor approves, IEP teams and 504 plan coordinators cannot ensure that a student will be provided the accommodations documented in their plan.

The table below shows the allowable accommodations for Ohio’s State Tests. Note that some accommodations students use in the classroom will reduce the validity of a student’s test score and are not allowable, such as use of a thesaurus, graphic organizers or access to the Internet during testing.

Accommodations for Students with Disabilities

Presentation Accommodations

Presentation accommodations alter the method or format used to administer Ohio’s State Tests to a student, by changing either the auditory, tactile, visual and/or a combination of these characteristics.

Description

Students who benefit most from presentation accommodations are those with disabilities that affect reading standard print, typically as a result of a physical, sensory, cognitive or specific learning disability.


Students may use a range of assistive technologies (AT) on Ohio’s State Tests including devices that are compatible with the AIR Student Testing Site, and those that are used externally (i.e., on a separate device).


For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix D of this manual. For information about who needs AT, how to obtain AT and AT tools, visit the Assistive Technology & Accessible Educational Materials Center online (ataem.org).

Human reader for computer-based test

A test administrator or monitor reads from the student’s computer screen to the student. For computer-based testing, most students should be able to use text-to-speech for a read-aloud. In some cases, a student’s disability may prohibit them from using the text-to-speech feature and require a human reader.

If testing in a small group, test administrators should ensure that all students in the group have similar abilities so that the reader’s pace meets all student’s needs without being too slow or too fast for any student.

Refer to the TIDE User Guide for information about setting up groups for computer-based testing.

If a student need this accommodation, then the person providing the accommodation must read the entire test to the student. It cannot be “as needed” or “on demand.”

Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.

Paper version of test instead of online

A paper version of the test is available for the small number of students who are unable to use a computer due to the impact of their disability. Before selecting a paper version of the test, IEP teams and 504 plan coordinators should first consider other accessibility features. Students who take a paper-based test should be unable, even with support, to use technology to produce and publish writing using keyboarding. Situations that may require this accommodation include:

● A student with a disability who cannot participate in the online assessment due to a health-related disability, neurological disorder or other complex disability and/or cannot meet the demands of a computer-based test administration even with other accessibility features such as extended time, frequent breaks or a scribe;

● A student with an emotional, behavioral or other disability who is unable to maintain sufficient concentration to participate in a computer-based test administration, even with other accessibility features such as a familiar test administrator, frequent breaks, small group, specified seating or a timer;

● A student with a disability who requires assistive technology that is not compatible with the testing platform.



If a student takes a paper version of a test, the student must take both parts of the test on paper.

Refer to Appendix A of this manual for additional information about paper-based testing.

Read-aloud on English language arts

“Read-aloud” as a general term is when a student is administered a test via text-to-speech, human reader, screen reader or sign language interpreter.

The read-aloud accommodation for the English language arts test is intended to provide access for a very small number of students to printed or written texts on the tests. These students have print-related disabilities and otherwise would be unable to participate in the state tests because their disabilities severely limit or prevent them from decoding, thus accessing printed text.

Because students who require this accommodation are unable to access printed text, they must have a read-aloud for the entire test, including the items, answer options, charts/graphs/figures and passages. This accommodation is not intended for students reading somewhat (only moderately) below grade level.

Reading only questions and answer options to a student is not allowable on the ELA test. If a student qualifies for this accommodation, then they must have the entire test read, including the passages.

In making decisions on whether to provide a student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has:

• A disability that severely limits or prevents him or her from accessing printed text, even after varied and repeated attempts to teach the student to do so (for example, the student is unable to decode printed text);

OR

• Blindness or a visual impairment and has not learned (or is unable to use) Braille;

OR

• Deafness or hearing loss and is severely limited or prevented from decoding text due to a documented history of early and prolonged language deprivation.

Before documenting the accommodation in the student’s IEP or 504 plan, teams/coordinators also should consider whether:

• The student has access to printed text during routine instruction through a reader, or other spoken-text audio format


accessible educational materials (AEM) or sign language interpreter;

• The student’s inability to decode printed text or read Braille is documented in evaluation summaries from locally administered diagnostic assessments;

• The student receives ongoing, intensive instruction and/or interventions in foundational reading skills to continue attaining the important college- and career-ready skill of independent reading.

For information about who needs AEM, how to obtain AEM and tools to support AEM, visit the Assistive Technology & Accessible Educational Materials Center online (http://ataem.org/); IEP teams and 504 plan coordinators make decisions about who receives this accommodation. Schools should use a variety of sources as evidence (including state assessments, district assessments and one or more locally administered diagnostic assessments or other evaluation).

For students who receive this accommodation, no claims should be inferred regarding the student’s ability to demonstrate foundational reading skills.

Refer to the Test Administration Manual for more information about administering a test through a human reader.

Screen reader mode

Screen reader mode is for students with visual impairments who use

screen readers. Students who do not use screen readers should not

use screen reader mode. Screen reader mode changes the

presentation of items and removes some features. Students working

in this mode do not have the same access to tools. Additional

information about the screen reader and functionality is in the Test

Administration Manual, Practice Test Guidance Document and TIDE

User Guide.

Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts. Screen reader mode is not available for grade 8 science, biology or physical science. By design, screen reader mode does not render simulations and displays alternate text that describes the key information about the simulation needed to answer the associated items. Screen reader mode is not available for these tests because they contain simulations that cannot be adequately described due to the complexity of the simulations.

Sign language interpreter Any student who is deaf or has hearing loss may have a sign language interpreter reflecting their IEP accommodations (American

http://ataem.org/






Sign Language, Signed English, Cued Speech) for mathematics, science and social studies.

For the purposes of statewide testing, sign language is considered a second language and should be treated the same as any other language from a translational standpoint. The test must be signed verbatim. The intent of the phrase “signed verbatim” does not mean a word-to-word translation, as this is not appropriate for any language translation. The expectation is that the interpreter should faithfully translate, to the greatest extent possible, all of the words on the test without changing or enhancing the meaning of the content, adding information or explaining concepts unknown to the student.

If a sign language interpreter perceives that a specific sign gives a student the answer or otherwise provides an unfair advantage, an alternate sign or finger spelling should be used.


Text-to-speech for English language arts

The text-to-speech feature reads aloud the test to the student when the student selects an available “speak” option.

Student must use headphones if tested in a group setting.


Students who use text-to-speech should use a voice pack they are familiar with and adjust the volume, pitch and rate prior to starting the test. Detailed information about text-to-speech functionality is in the Test Administration Manual. Manuals are available on Ohio’s State Tests Portal.

Text-to-speech tracking for English language arts

The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud.





Response Accommodations

Response accommodations allow students to use alternative methods for providing responses to test items, such as through dictating to a scribe or using an assistive device.

Description

Response accommodations can benefit students who have physical, sensory or learning disabilities who have difficulties with memory, fine-motor skills, sequencing, directionality, alignment and organization.


Students may use a range of assistive technologies (AT)on Ohio’s State Tests, including devices that are compatible with the Student Testing Site and those that are used externally (i.e., on a separate device).

For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix D.

For information about who needs AT, how to obtain AT and AT tools, visit the Assistive Technology & Accessible Educational Materials Center online (ataem.org).

Answers transcribed by test administrator

The student records his or her answers directly on paper and the test administrator/monitor transcribes the responses verbatim into the Student Testing Site.

Braille notetaker

A student who is blind or has visual impairments may use an electronic Braille notetaker. For Ohio’s State Tests, grammar checker, Internet and stored file functionalities must be turned off. The responses of a student who uses an electronic Braille notetaker during Ohio’s State Tests must be transcribed exactly as entered in the electronic Braille notetaker. Only transcribed responses will be scored. Transcription guidelines are available in Appendix C of this manual.

Braille writer

A student who is blind or has visual impairments may use an electronic Braille writer. A test administrator must transcribe into the computer the student’s responses exactly as entered in the electronic Braille writer.

Only transcribed responses will be scored. Transcription guidelines are available in Appendix C of this manual.

Calculator or fact charts on non-calculator mathematics test or part of test

The student uses a handheld or embedded calculator or fact chart (addition, subtraction, multiplication or division only) on a non-calculator mathematics test or part of test. Both parts of grades 3 through 5 mathematics tests and part 1 of grades 6 and 7 mathematics tests are non-calculator tests.

The accommodation would be permitted on test sections for which calculators are not allowed for other students. IEP teams and 504


plan coordinators should carefully review the following guidelines for identifying students to receive this accommodation.

This accommodation is for students with disabilities that severely limit or prevent their abilities to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division).

In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has a disability that severely limits or prevents the student’s ability to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division), even after varied and repeated attempts to teach the student to do so.

Before documenting the accommodation in the student’s IEP or 504 plan, teams also should consider whether:

● The student is unable to perform calculations without the use of a calculation device, arithmetic table or manipulative during routine instruction;

● The student’s inability to perform mathematical calculations is documented in evaluation summaries from locally administered diagnostic assessments;

● The student receives ongoing, intensive instruction and/or interventions to learn to calculate without using a calculation device, in order to ensure that the student continues to learn basic calculation and fluency.

If students in grades 3-5 will use the embedded Desmos calculator within the Student Testing Site for a math test, the test administrator must turn on this accommodation when approving the student to test for part 1 and part 2. If students in grades 6 and 7 will use the embedded Desmos calculator within the Student Testing Site as an accommodation, the test administrator must turn it on this accommodation when approving the student to test for part 1. An embedded calculator for non-calculator math tests or parts of math tests cannot be turned on ahead of testing in TIDE.

Calculators are not allowed on the grades 5 and 8 science tests and the biology end-of-course test for students with disabilities. However, there are no mathematical calculations on these Ohio science tests and a calculator should not be needed. An embedded calculator is not available for these tests.

Calculator guidance is in the Test Administration Manual.

Mathematical tools – allowable tools as accommodation include:

• 100s chart • Abacus/rekenrek and other

specialized tools for students with visual impairments

• Algebra Tiles

Student uses these tools and manipulatives to assist mathematical problem solving. These manipulatives allow the flexibility of grouping, representing or counting without numeric labels.

Tools that give students answers (e.g. fraction tiles with numerical labels) or lead a student to use a specific strategy (e.g. number lines) are not allowed. These types of tools can be effective for instruction



• Base 10 blocks • Counters and counting chips • Cubes • Fraction tiles and pies without

numerical labels • Square tiles • Two-colored chips

and while students may create their own during testing as a strategy, they may not be provided to students on Ohio state tests.

For information about fact charts, see calculation device or fact charts on non-calculator mathematics test or part of test in this section.

Information about rulers, angled-rulers, compasses and protractors is located in the non-embedded designated supports section of this manual.

The Department will review and revise this list annually as needed.

Allowable for mathematics and physical science tests only.

Scribe

The student dictates responses either verbally, using a speech-to text device, augmentative or assistive communication device (e.g., picture or word board), or by signing, gesturing, pointing or eye gazing. Grammar checker, Internet and stored files functionalities must be turned off. Word prediction must also be turned off for students who do not receive this accommodation. The student must test in a separate setting.

In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has:

● A physical disability that severely limits or prevents the student’s motor process of writing through keyboarding;

OR

● A disability that severely limits or prevents the student from expressing written language, even after varied and repeated attempts to teach the student to do so.

Before documenting the accommodation in the student’s IEP or 504 plan, teams/coordinators should also consider whether:

● The student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments;

● The student routinely uses a scribe for written assignments; and

● The student receives ongoing, intensive instruction and/or interventions to learn written expression, as deemed appropriate by the IEP team or 504 plan coordinator.

Student’s responses must be transcribed exactly as dictated.

Information about the scribing process is available in Appendix C of this manual.

Specialized calculation device A student uses a specialized calculation device (for example, a large key, talking or other adapted calculator) on the calculator part of the


mathematics assessments. If a talking calculator is used, the student must use headphones or test in a separate setting.

The student must qualify for the calculation device or fact charts on non-calculator mathematics test or part of test accommodation to use a specialized calculator in those tests.

Calculators are not allowed on science tests except physical science.

Word prediction external device

The student uses an external word prediction device that provides a bank of frequently or recently used words on screen as a result of the student entering the first few letters of a word.

The student must be familiar with the use of the external device prior to assessment administration. The device cannot connect to the Internet or save information.

In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators are instructed to consider whether the student has:

● A physical disability that severely limits or prevents the student from writing or keyboarding responses;

OR

● A disability that severely limits or prevents the student from recalling, processing and expressing written language, even after varied and repeated attempts to teach the student to do so.

Before documenting the accommodation in the student’s IEP/504 plan, teams/coordinators are instructed to consider whether:

● The student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments; and

● The student receives ongoing, intensive instruction and/or intervention in language processing and writing, as deemed appropriate by the IEP team/504 plan coordinator.

Timing Accommodation

Timing and scheduling accommodations are changes in the allowable length of time in which a student may complete the test.

Description

The extended-time accommodation is most beneficial for students who routinely need more time than is generally allowed to complete activities, assignments and tests. Extra time may be needed to:

● Process written text (for a student who processes information slowly or has a human reader);

● Write (for a student with limited dexterity); ● Use other accommodations or devices.

Extended time Student is allowed more time than allotted for each test part.


In most cases, the Department recommends that extended time be defined for students and not open-ended. This accommodation is usually expressed as one and a one-half time (1.5x) or double time (2x). A student who has one and one-half time on a test that normally takes 90 minutes may be allowed 135 minutes. Extended time may not exceed one school day; students must complete each test part on the same day that part is started.

Decisions about how much extended time is provided must be made on a case-by-case basis for each individual student, not for any category of students or group. Teams should keep in mind the purposes of different accommodations as they relate to disability characteristic or language barrier. Typically, if a student needs extended time, one and one-half time is sufficient. For some accommodations, such as use of a human reader or scribe, double time may be appropriate. Rarely is unlimited time (an entire school day) applicable.

Schools may choose to test students with the extended-time accommodation in a separate setting to minimize distractions. Department recommends scheduling these students for testing in the morning to allow adequate time for completion of a test part by the end of the school day.

2.7 Considerations for English Learner Accommodations While all English learners have in common that they are acquiring English language proficiency, they are not a homogenous group. Similar to students with disabilities, English learners should not be assigned accommodations using a one-size-fits-all approach. Knowing the student is key. When considering accommodations for English learners, it is important to focus on the effectiveness of each accommodation for each individual student. Not only does an English learner’s English language proficiency influence accommodation effectiveness, but so do other factors, including their literacy development in English and their native language, grade, age, affective needs and time in U.S. schools. Keep in mind that the purpose of English language assessment accommodations is not to improve an English learners rate of passing state assessments, but to allow more accurate demonstration of their knowledge of the content being assessed. All students who have been identified as an English learner may receive accommodations for English learners even if they do not participate in the district English learner program. Schools should monitor how English learners in the classroom benefit from English learner-specific accommodations when determining accommodations for state tests.


Accommodations for English

Learners

Accommodations for English

learners are intended to reduce

and/or eliminate the effects of a

student’s lack of English

language proficiency.

Description

When making decisions about accommodations for English

learners, teams should consider the effectiveness of the

accommodation based on the English language proficiency

level of the student.

Extended time

Student is allowed more time than allotted for each test part.

In most cases, the Department recommends that extended time be defined for students and not open-ended. This accommodation is usually expressed as one and a one-half time (1.5x) or double time (2x). A student who has one and one-half time on a test that normally takes 60 minutes may be allowed 90. Extended time may not exceed one school day; students must complete each test part on the same day that part is started.

Decisions about how much extended time is provided must be made on a case-by-case basis for each individual student, not for any category of students or group. Teams should keep in mind the purposes of different accommodations as they relate to disability characteristic or language barrier. Typically, if a student needs extended time, one and one-half time is sufficient. For some accommodations, such as an oral translation, double time may be appropriate. Rarely is unlimited time (an entire school day) applicable.

Schools may choose to test students with the extended-time

accommodation in a separate setting to minimize distractions. The

department recommends scheduling these students for testing in the

morning to allow adequate time for completion of a test part by the

end of the school day.

Appropriate for all English language proficiency levels.

Human reader for computer-based

test

Not allowed for English learners on the English language arts test.

A test administrator reads in English from the student’s computer

screen to the student. For computer-based testing, most students

should be able to use text-to-speech for a read-aloud.

Test administrators must administer the read-aloud accommodation

in a separate setting. This feature can be provided in small groups if

set up as a small group administration in the Student Testing Site. If

testing in a small group, test administrators should ensure that all

students in the group have similar abilities so that the reader’s pace


meets all student’s needs without being too slow or too fast for some

students.

If a student need this accommodation, the person providing the

accommodation must read the entire test to the student. It cannot be

“as needed” or “on demand.”

Appropriate for students who regularly have a human reader in the

classroom and who have had very little or no prior experience or

familiarity with computer-based testing technology.

Refer to the Test Administration Manual for more information about

administering a test through a human reader.

Oral translation of the test

Not allowed for English language arts test.

Note: The general directions for all tests, including English language

arts, may be translated. The general directions are the scripted

directions the test administrator reads to all students before the test

begins. The Department will not reimburse translators for translating

general directions only.

A translator reads aloud the test to a student in his or her native

language. Translators will translate the test from the student’s device.

Student responses must be recorded in the Student Testing Site in

English. Responses submitted in a language other than English will

not be scored.

Refer to the Test Administration Manual for additional information

about how to administer an oral translation.

A translator must administer an oral translation of the test in a

separate setting.

Appropriate for beginning and some intermediate-level English

learners but may not be appropriate for advanced-level English

learners.

Refer to the Test Administration Manual for more information about

administering an oral translation.

Scribe (In English)

Not allowed for the English language arts test.

The student dictates responses in English. The test administrator or

monitor must test the student in a separate setting.

May be appropriate for beginning-level English learners who do not

have translators and who have better spoken than written English

language proficiency. Typically, not appropriate for intermediate- or

advanced-level English learners.





Stacked Spanish/English bilingual

form of the test


Test items presented with Spanish on the top and English on the

bottom. Only responses in English will be scored.

Appropriate for students who have content knowledge in both

Spanish and English. Not appropriate for students who have not been

instructed in tested content in Spanish.

Text-to-speech Spanish/English


The text-to-speech feature reads aloud the test to the student.

Students who use text-to-speech should use a voice pack they are

familiar with and adjust the volume, pitch and rate prior to starting the

test. Detailed information about text-to-speech functionality is in the

Test Administration Manual. Manuals are available on Ohio’s State

Tests Portal.

Recommended for beginning and some intermediate English learners

but may not be appropriate for advanced-level English learners.

Text-to-speech tracking


The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud. May help some students who use text-to-speech.

Word-to-word dictionary

(English/Native Language)

The student uses an allowable bilingual, word-to-word dictionary.

Dictionaries that include definitions, phrases, sentences or pictures

are not allowed. The student should be familiar with the dictionary he

or she will use during testing. An electronic translator may be used

instead of a paper dictionary. An electronic translator cannot connect

to the Internet or store information.

Recommended for intermediate and advanced English learners but

may not be appropriate for beginning-level English learners.

The Massachusetts Department of Elementary and Secondary

Education has released a list of dictionaries that are known to meet

the criteria for allowable dictionaries for statewide testing. This list

may be accessed at doe.mass.edu/mcas/testadmin/lep-bilingual-

dictionary.pdf.

Word-to-word glossaries and dictionaries approved by ACT or the

College Board are allowable.

Assessment scores for students who qualify and receive any of the accommodations listed in this manual will be aggregated with the scores of other students and those of relevant groups and will be included for accountability purposes.



http://www.doe.mass.edu/mcas/testadmin/lep-bilingual-dictionary.pdf

http://www.doe.mass.edu/mcas/testadmin/lep-bilingual-dictionary.pdf


2.8 Other Accommodations and Modifications

Emergency Accommodations

An emergency accommodation may be appropriate for a student who incurs a temporary disabling condition

that interferes with test performance shortly before or during the assessment window (e.g. the student has a

recently fractured limb that affects physical access to the test, a student whose only pair of eyeglasses has

broken or a student returning after a serious or prolonged illness or injury). Scribe is the most common

emergency accommodation for the examples given. Extended time may also be considered when providing a

scribe but it is not required. For a student with a concussion, a paper test may be an appropriate emergency

accommodation, alternately, frequent breaks and a human reader may provide needed access for a student in

this situation.

If the principal (or designee) determines that a student requires an emergency accommodation, the optional Emergency Accommodation form found in Appendix E may be completed and maintained in the student’s file. The Department recommends that the school notify the parent or guardian that an emergency accommodation was provided. If appropriate, the form also may be submitted to the district testing coordinator to be retained in the student’s central office file. Accommodation Irregularities In the event that a student was provided a test accommodation the student was not entitled to, or if a student was not provided a test accommodation the student was entitled to, the school should refer to the Test Incident Guidance Document located in the Test Administration Manual to determine next steps. Modifications on Assessments Modifications are not permitted on Ohio’s State Tests. Modifications, as contrasted with accessibility features, involve changes in the standards being measured on the test, or in the conditions in which a student takes the test that would result in changes in what the assessment is designed to measure (e.g., reducing or changing expectations for students), or provides an unfair advantage to a student. Examples of modifications the Department does not permit on Ohio’s State Tests include:

● Allowing a student to be assessed off grade level; ● Instructing a student to skip selected items, reducing the scope of assessments so a student needs to

complete only a limited number of problems or items; ● Modifying the complexity of assessments to make them easier (e.g., deleting response choices on a

multiple-choice assessment so that a student selects from two or three options instead of four); ● Providing hints, clues or other coaching that directs the student to correct responses; ● Defining vocabulary on the assessment, for non-glossed words, or explaining assessment items; ● Allowing the student to complete an assessment of English language arts in a language other than

English; and ● Using a dictionary that provides definitions (rather than an acceptable word-to-word dual language

dictionary). Providing a student with modifications during Ohio’s State Tests may constitute a test irregularity and will result in an invalidated score (the score will not be counted) and/or an investigation by the state into the school’s or district’s testing practices. Moreover, providing modifications to students during statewide tests may have the unintended consequence of reducing their opportunities to learn critical content and may result in adverse effects on the students throughout their educational careers.



Section 3: Universal Design and Ohio’s State Tests The Department designed Ohio’s State Tests to ensure all students have the tools and supports to

demonstrate what they know. Using universal design approaches, the test makers ensure that all students

have an equal opportunity to show what they have learned. All students benefit from the flexibility universal

design can bring to assessment design and administration, including students who need accommodations.

Universally designed assessment aims to create multiple alternatives and approaches, so a maximum number

of students can take the assessment without accommodations.

Ohio has included the following universal-design requirements for item development for Ohio’s State Tests:

● The item or task takes into consideration the diversity of the assessment population and the need to

allow the full range of eligible students to respond to the item/stimulus.

● Constructs have been precisely defined and the item or task measures what is intended.

● Assessments contain accessible, non-biased items.

● Assessments are designed to be amenable to accommodations.

● Instructions and procedures are simple, clear and intuitive.

● Assessments are designed for maximum readability, comprehensibility and legibility.

● The item or task material uses a clear and accessible text format.

● The item or task material uses clear and accessible visual elements (when essential to the item).

● The item or task material uses text appropriate for the intended grade level.

● Decisions will be made to ensure that items and tasks measure what they are intended to measure for

English language learner students with different levels of English language proficiency and/or first

language proficiency.

● All accessibility features have been considered that may increase access while preserving the targeted

construct.

● Test developers considered multiple means of item presentation, expression and student engagement

with regard to items/tasks for both students with disabilities and English learners.


L-1 American Institutes for Research

Table L1. Summary of Human and Machine Scores for Fall 2017 Writing Prompts

Grade

Item D

Dim

ension

Mean Standard Deviation Human1‐Human2 Agreement Machine‐Human Agreement

Hum

an

Engine

Hum

an

Engine

Pearson r

% Exact

Weighted κ*

SMD*

Pearson r

% Exact

Weighted κ*

SMD*

3 31679

Conventions 1.42 1.47 0.64 0.61 0.70 0.74 0.62 0.00 0.73 0.79 0.66 0.07

Evidence 1.33 1.30 0.81 0.70 0.72 0.71 0.61 0.00 0.67 0.67 0.53 0.03

Purpose 1.42 1.36 0.78 0.64 0.73 0.72 0.63 0.01 0.66 0.67 0.52 0.08

9 31578

Conventions 1.50 1.53 0.68 0.65 0.79 0.81 0.72 0.00 0.82 0.84 0.75 0.04

Evidence 1.66 1.64 0.90 0.81 0.86 0.77 0.76 0.01 0.82 0.75 0.71 0.03

Purpose 1.81 1.82 0.91 0.81 0.85 0.75 0.75 0.01 0.85 0.78 0.76 0.02

9 31588

Conventions 1.35 1.38 0.74 0.72 0.74 0.74 0.65 0.04 0.80 0.79 0.71 0.03

Evidence 1.54 1.56 0.85 0.80 0.83 0.76 0.74 0.04 0.81 0.76 0.71 0.03

Purpose 1.76 1.71 0.81 0.73 0.81 0.77 0.73 0.01 0.79 0.78 0.71 0.07

10 31662

Conventions 1.54 1.59 0.68 0.63 0.81 0.84 0.74 0.01 0.79 0.83 0.72 0.07

Evidence 1.72 1.76 0.88 0.81 0.86 0.79 0.77 0.01 0.81 0.74 0.70 0.04

Purpose 1.82 1.82 0.84 0.78 0.87 0.82 0.80 0.00 0.84 0.80 0.76 0.00

10 31513

Conventions 1.53 1.54 0.68 0.68 0.83 0.83 0.76 0.02 0.81 0.82 0.73 0.01

Evidence 1.59 1.60 0.90 0.84 0.88 0.80 0.80 0.02 0.85 0.77 0.76 0.01

Purpose 1.85 1.83 0.89 0.81 0.87 0.79 0.79 0.01 0.84 0.77 0.74 0.03 *Weighted K = Quadratic weighted kappa; SMD = Standardized Mean Difference



Table L2. Summary of Dimension Intercorrelations for Fall 2017 Writing Prompts

Grade ITS ID Dimensions Correlations Among Dimensions

Human1 vs Human2 Machine vs Final Human Conventions Evidence Conventions Evidence

3 31664 Evidence 0.52 0.49 Purpose 0.66 0.96 0.46 0.85







Table L3. Summary of Human and Machine Scores for Spring 2018 Writing Prompts

Grade

Item D

Dim

ension

Mean Standard Deviation

Human1‐Human2 Agreement Machine‐Human Agreement

Hum

an

Engine

Hum

an

Engine

Pearson r

% Exact

Weighted

κ*

SMD*

Pearson r

% Exact

Weighted

κ*

SMD*

3 31664

Conventions 1.35 1.35 0.63 0.57 0.71 0.75 0.64 0.02 0.63 0.73 0.56 0.00

Evidence 0.98 0.96 0.86 0.75 0.76 0.72 0.66 0.03 0.73 0.66 0.60 0.03

Purpose 1.03 0.97 0.86 0.80 0.76 0.70 0.65 0.03 0.74 0.65 0.60 0.08

4 31960

Conventions 1.21 1.22 0.66 0.59 0.71 0.75 0.63 0.00 0.62 0.73 0.56 0.02

Evidence 1.46 1.46 0.99 0.95 0.83 0.71 0.72 0.03 0.84 0.71 0.72 0.01

Purpose 1.45 1.48 1.05 0.97 0.82 0.69 0.70 0.00 0.81 0.68 0.69 0.03

5 32035 Conventions 1.61 1.65 0.59 0.56 0.62 0.75 0.56 0.01 0.70 0.80 0.63 0.06 Evidence 1.55 1.53 0.71 0.61 0.75 0.73 0.65 0.03 0.75 0.77 0.67 0.04 Purpose 1.72 1.74 0.77 0.68 0.76 0.75 0.67 0.02 0.78 0.77 0.69 0.03







Grade

Item D

Dim

ension

Mean Standard Deviation

Human1‐Human2 Agreement Machine‐Human Agreement

Hum

an

Engine

Hum

an

Engine

Pearson r

% Exact

Weighted

κ*

SMD*

Pearson r

% Exact

Weighted

κ*

SMD*






10 31622

Conventions 1.55 1.59 0.67 0.65 0.83 0.85 0.77 0.00 0.75 0.80 0.67 0.06

Evidence 1.53 1.56 1.02 0.92 0.85 0.73 0.74 0.00 0.81 0.70 0.70 0.03

Purpose 1.84 1.83 0.83 0.80 0.85 0.78 0.77 0.00 0.81 0.76 0.72 0.01 *Weighted K = Quadratic weighted kappa; SMD = Standardized Mean Difference



Table L4. Summary of Dimension Intercorrelations for Spring 2018 Writing Prompts

Grade ITS ID Dimensions Correlations Among Dimensions

Human1 vs Human2 Machine vs Final Human Conventions Evidence Conventions Evidence