Annual Technical Report
Ohio’s State Tests in English Language
Arts, Mathematics, Science, and Social
Studies
2017–2018 School Year
September 2018
OHIO STATEWIDE ASSESSMENT
OHIO’S STATE TESTS (OST)
ELA GRADES 3 THROUGH 8, HIGH SCHOOL ELA I, AND ELA II
MATHEMATICS GRADES 3 THROUGH 8, HIGH SCHOOL ALGEBRA, GEOMETRY,
INTEGRATED MATHEMATICS I, AND INTEGRATED MATHEMATICS II
SCIENCE GRADE 5, GRADE 8, BIOLOGY, AND PHYSICAL SCIENCE
SOCIAL STUDIES AMERICAN HISTORY AND AMERICAN GOVERNMENT
2017–2018 ANNUAL TECHNICAL REPORT
SEPTEMBER 2018
Prepared by American Institutes for Research (AIR) in collaboration with the Ohio Department of Education (ODE)
TABLE OF CONTENTS
1. Introduction: The Validity of OST Test Score Interpretations ..... 5
1.1 Overview ..... 5
1.2 Validity Evidence ..... 6
1.2.1 Evidence Based on Test Content ..... 10
1.2.2 Evidence for Performance Standard Interpretation of Test Scores ..... 12
1.2.3 Evidence Based on Internal Structure ..... 15
1.2.4 Measurement Invariance Across Subgroups ..... 18
1.2.5 Test Integrity Forensics ..... 20
1.2.6 Summary of Validity of Test Score Interpretations ..... 23
2. Background of Ohio Computer-Based Assessments ..... 24
2.1 Background of ELA and Mathematics Assessments ..... 24
2.2 Background of Science and Social Studies Assessments ..... 24
2.3 OST Test Design ..... 25
3. Summary of Fall 2017 Operational Test Administration ..... 29
3.1 Student Population and Participation ..... 29
3.2 Summary of Overall Student Performance for Fall 2017 ..... 30
3.3 Student Performance by Subgroup for Fall 2017 ..... 31
3.4 Reliability for Fall 2017 ..... 34
3.4.1 Internal Consistency ..... 34
3.4.2 Standard Error of Measurement ..... 35
3.4.3 Student Classification Reliability ..... 40
3.4.4 Classification Accuracy ..... 40
3.4.5 Classification Consistency ..... 41
3.4.6 Classification Accuracy and Consistency Estimates ..... 41
3.4.7 Reliability for Subgroups in the Population ..... 42
3.4.8 Reliability for Subscales ..... 44
3.4.9 Subscale Intercorrelation ..... 46
4. Summary of Spring 2018 Operational Test Administration ..... 49
4.1 Student Population and Participation ..... 50
4.2 Summary of Overall Student Performance for Spring 2018 ..... 51
4.3 Student Performance by Subgroup for Spring 2018 ..... 53
4.4 Classical Item Analysis ..... 58
4.5 Item Response Theory Analysis ..... 59
4.6 Reliability for Spring 2018 ..... 63
4.6.1 Internal Consistency ..... 63
4.6.2 Standard Error of Measurement ..... 65
4.6.3 Student Classification Reliability ..... 70
4.6.4 Classification Accuracy ..... 71
4.6.5 Classification Consistency ..... 71
4.6.6 Classification Accuracy and Consistency Estimates ..... 72
4.6.7 Reliability for Subgroups in the Population ..... 73
4.6.8 Reliability for Subscales ..... 77
4.7 Subscale Intercorrelations ..... 80
4.8 Rater Agreement ..... 85
5. Item Development and Test Construction ..... 86
5.1 Item Development Process ..... 86
5.2 Machine-Scored Constructed-Response Item Development Tools ..... 87
5.3 Item Types ..... 87
5.4 Item Review ..... 88
6. Field Testing ..... 91
6.1 Item Statistics ..... 92
6.1.1 Classical Statistics ..... 92
6.1.2 IRT Statistics ..... 93
6.1.3 Analysis of Differential Item Functioning ..... 93
6.2 Data Review Summary ..... 95
6.3 Test Construction ..... 95
6.3.1 Operational Form Construction ..... 96
6.3.2 Assembling Test Forms ..... 97
6.3.3 Embedded Field-Test Slots ..... 98
7. Test Administration ..... 100
7.1 Eligibility ..... 100
7.2 Administration Procedures ..... 100
7.3 Accommodations ..... 102
7.4 Test Security ..... 109
8. Reporting and Interpreting OST Scores ..... 112
8.1 Appropriate Uses for Scores and Reports ..... 112
8.2 Reports Provided ..... 113
8.2.1 Online Reporting System for Educators ..... 114
8.3 Interpretation of Scores ..... 121
8.3.1 Scale Scores ..... 121
8.3.2 Performance Standards ..... 123
8.3.3 Performance-Level Descriptors ..... 123
9. Performance Standards ..... 124
9.1 Standard Setting Procedures ..... 124
9.1.1 Performance-Level Descriptors ..... 125
9.2 Recommended Performance Standards ..... 126
9.3 OST Transformations and Rounding Rules ..... 132
9.3.1 Rules for Transforming the Within-Grade Theta to the OST Scale ..... 132
9.3.2 OST Rounding Rules ..... 133
9.3.3 Rules for Overall Performance Level Classification ..... 133
9.3.4 OST Subscale Performance Classification ..... 134
10. Scaling and Equating ..... 135
10.1 Item Response Theory Procedures ..... 135
10.1.1 Calibration of OST Item Banks ..... 135
10.1.2 Estimating Student Ability Using Maximum Likelihood Estimation ..... 136
10.2 OST Reporting Scale (Scale Scores) ..... 138
10.3 Equating Paper-Pencil and Online Test Scores ..... 140
11. Constructed-Response Scoring ..... 142
11.1 Machine-Scoring ..... 142
11.1.1 Explicit Rubrics ..... 142
11.1.2 Essay Autoscoring ..... 142
11.2 Handscoring ..... 144
11.2.1 Rangefinding ..... 144
11.2.2 Developing Training Materials After Rangefinding ..... 145
11.2.3 Scoring Guides with Anchor Responses ..... 145
11.2.4 Training Sets ..... 145
11.2.5 Operational Training and Qualifying Materials ..... 145
11.2.6 Handscoring Procedures ..... 146
11.2.7 Training of Scorers ..... 147
11.2.8 Monitoring and Maintaining Quality Control ..... 147
11.2.9 Handling Unusual Responses and Disturbing Responses ..... 148
12. Quality Control Procedures ..... 149
12.1 Quality Assurance in Test Construction ..... 149
12.2 Quality Assurance in Test Production ..... 151
12.2.1 Production of Content ..... 151
12.2.2 Web Approval of Content During Development ..... 152
12.2.3 Approval of Final Forms ..... 152
12.2.4 Packaging ..... 152
12.2.5 Platform Review ..... 152
12.2.6 User Acceptance Testing and Final Review ..... 153
12.2.7 Functionality and Configuration ..... 153
12.3 Quality Assurance in Document Processing ..... 154
12.3.1 Scanning Accuracy ..... 154
12.3.2 Quality Assurance in Editing and Data Input ..... 154
12.4 Quality Assurance in Data Preparation ..... 155
12.5 Quality Assurance in Test Form Equating ..... 156
12.6 Quality Assurance in Scoring and Reporting ..... 156
12.6.1 Quality Assurance in Handscoring ..... 156
12.6.2 Quality Assurance for Score Reporting ..... 159
12.6.3 Quality Assurance for Test Scoring ..... 161
12.6.4 Reporting ..... 164
APPENDICES
Appendix A Global Model Fit ..... A-1
Appendix B Test Integrity Forensics Report ..... B-1
Appendix C Number of Students Participating by Test Mode ..... C-1
Appendix D Test Score Frequency Distributions ..... D-1
Appendix E Operational Bank Parameters ..... E-1
Appendix F Ability Measures at Raw Score Cuts ..... F-1
Appendix G Raw-to-Scale Score Conversion Tables ..... G-1
Appendix H Rater Agreement Rates ..... H-1
Appendix I Test Characteristics Curve Graphs ..... I-1
Appendix J Test Administrator User Guide ..... J-1
Appendix K Ohio Accessibility Manual ..... K-1
Appendix L ELA Writing Prompts Scoring Rubric Summary ..... L-1
1. INTRODUCTION: THE VALIDITY OF OST TEST SCORE INTERPRETATIONS
1.1 OVERVIEW
The purpose of this technical report is to document the evidence supporting the claims made for how Ohio’s State Tests (OST) scores may be interpreted. Evidence for the validity of test score interpretations is central to claims that OST test scores can be used to evaluate the effectiveness with which Ohio districts and schools teach students Ohio’s Learning Standards and whether individual students have achieved those standards by the end of each school year. Thus, the report begins with a review of validity evidence evaluated to date. Evidence for the validity of test score interpretations is expected to accrue over time, and this section will be expanded as further evidence is gained.
Chapter 2 of the report describes the design and development of OST assessments, including Ohio’s Learning Standards, which define the content domain to be assessed by OST; the development of test specifications, including blueprints, that ensure the breadth and depth of the content domain are adequately sampled by the assessments; and test development procedures that ensure alignment of test forms with the blueprint specifications.
Chapters 3 and 4 present results of the fall 2017 and spring 2018 OST test administrations, respectively. The fall administration is limited and includes only the grade 3 ELA assessment and the high school end-of-course (EOC) tests in ELA, mathematics, science, and social studies. The full OST assessment system administered in spring includes end-of-year assessments in ELA and mathematics for grades 3–8, as well as end-of-year assessments in science at grades 5 and 8. The spring assessments also include the high school EOC tests in ELA (grade 10 and grade 11), mathematics (Algebra I, Geometry, Integrated Mathematics I, and Integrated Mathematics II), science (Physical Science and Biology), and social studies (American Government and American History). These chapters provide summaries of the test-taking student population and their performance on the assessments. In addition, these chapters describe administration-specific evidence for the reliability of OST assessments, including internal consistency reliability, standard errors of measurement, and the reliability of performance level classifications.
The remaining chapters document technical details of the test development, administration, scoring, and reporting activities. Chapter 5 describes the item development process and especially the sequence of reviews that each item must pass through before becoming eligible for OST test administration. Chapter 6 describes the field testing of new items and the procedures for constructing operational test forms from items successfully passing through the review process.
Chapter 7 documents the test administration procedures, including eligibility for participation in OST assessments; testing conditions, including accessibility tools and accommodations; systems security for assessments administered online; and test security procedures for all test administrations.
A description of the score reporting system and the interpretation of test scores is provided in Chapter 8. Chapter 9 describes the procedures that ODE used to identify and adopt performance standards for OST assessments, and Chapter 10 describes the procedures used to scale and equate OST assessments for scoring and reporting.
Chapter 11 describes the procedures for scoring constructed-response items, both machine-scored and handscored, and provides summary rater agreement results. Finally, Chapter 12 provides an overview of the quality assurance processes described throughout that are used to ensure that all test development, administration, scoring, and reporting activities are conducted with fidelity to the developed procedures.
1.2 VALIDITY EVIDENCE
Validity refers to the degree to which test score interpretations are supported by evidence and speaks directly to the legitimate uses of test scores. Establishing the validity of test score interpretations is thus the most fundamental component of test design and evaluation. The Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 2014) provide a framework for evaluating whether claims based on test score interpretations are supported by evidence. Within this framework, the standards describe the range of evidence that may be brought to bear to support the validity of test score interpretations.
The types of evidence required to support the validity of test score interpretations depend centrally on the claims made for how test scores may be interpreted. Moreover, the standards make clear that validity is an attribute not of tests but rather of test score interpretations. Some test score interpretations may be supported by validity evidence while others are not. The validity of the test itself is not considered; rather, the validity of the intended interpretation and use of test scores is evaluated.
OST assessments are designed to measure the degree to which students have achieved the academic learning standards defined by Ohio’s Learning Standards. The evidence presented here focuses on the validity of test score and performance level interpretations about student achievement of Ohio’s Learning Standards. There are a number of intended uses for Ohio’s State Tests (OST) scores, including school accountability, feedback about student and class performance, measurement of student growth over time, evaluation of performance gaps between groups, evaluation of teacher performance, and diagnosis of individual student strengths and weaknesses. Each intended use requires claims to be made about the interpretation of test scores, and the strength of those claims rests on the validity evidence supporting those claims. Some validity evidence will be central to all of the claims, especially evidence for the alignment of test items and administrations to Ohio’s Learning Standards. Other evidence may target more specific claims, such as evidence for measurement of student growth or evaluation of teacher performance. Evaluation of validity evidence should therefore be made with respect to the claim that it is purported to support.
Determining whether the test measures the intended construct is central to evaluating the validity of test score interpretations. Such an evaluation in turn requires a clear definition of the measurement construct. For OST assessments, the definition of the measurement construct is provided by Ohio’s Learning Standards.
Ohio’s Learning Standards specify what students should know and be able to do by the end of each grade level, or by end-of-course for high school, in order to graduate ready for post-secondary education or entry into the workforce.1 Ohio first adopted learning standards in 2001, recognizing that learning standards would continue to be revised over time. The Ohio State Board of Education adopted Ohio’s Learning Standards for ELA and mathematics in 2010 as part of a multi-state effort. In 2010, the Ohio State Board of Education also adopted new, more rigorous, science and social studies standards. Ohio’s Learning Standards for ELA, mathematics, science, and social studies describe the educational targets for students in each subject area.
Because directly measuring student achievement against each benchmark in Ohio’s Learning Standards would result in an impractically long test, each test administration is designed to measure a representative sample of the content domain defined by Ohio’s Learning Standards.2 To ensure that each student is assessed on the intended breadth and depth of Ohio’s Learning Standards, item selection is guided by a set of test specifications, or blueprints, that indicate
1 Standard 1.1 – The test developer should clearly set forth how test scores are intended to be interpreted and consequently used. The population(s) for which a test is intended should be delimited clearly, and the construct or constructs that the test is intended to assess should be described clearly.
2 Standard 4.0 – Tests and testing programs should be designed and developed in a way that supports the validity of interpretations of the test scores for their intended uses. Test developers and publishers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses for individuals in the intended examinee population.
the number of items that should be sampled from each content strand, standard, and benchmark.3 Thus, the test blueprints represent a policy statement about the relative importance of content strands and standards in addition to meeting important measurement goals (e.g., sufficient items to report strand performance levels reliably). Because the test blueprint determines how student achievement of Ohio’s Learning Standards is evaluated, alignment of test blueprints with the content standards is critical. ODE has published the OST test blueprints that specify the distribution of items across reporting categories.
The principles of universal design of assessments provide guidelines for test design to minimize the impact of construct-irrelevant factors in assessing student achievement.4 Universal design removes barriers to access for the widest range of students possible. Seven principles of universal design are applied in the process of test development (Thompson, Johnstone, & Thurlow, 2002):
• Inclusive assessment population
• Precisely defined constructs
• Accessible, non-biased items
• Amenable to accommodations
• Simple, clear, and intuitive instructions and procedures
• Maximum readability and comprehensibility
• Maximum legibility
Test development specialists receive extensive training on the principles of universal design and apply these principles in the development of all test materials, including items and accompanying stimuli. In the review process, adherence to the principles of universal design is verified.
In addition, the OST test delivery system provides a range of accessibility tools and accommodations for reducing construct-irrelevant barriers to accessing test content for virtually all students.5 The range of accommodations provided in the online testing environment far exceeds the typical accommodations made available in paper-based test administrations, which were typically limited to large print, Braille, and English and foreign language audio translations. Exhibit 1.2.1 lists the accommodations and accessibility supports currently available for OST assessments.
3 Standard 4.1 – Test specifications should describe the purpose(s) of the test, the definition of the construct or domain measured, the intended examinee population, and interpretations for intended uses. The specifications should include a rationale supporting the interpretations and uses of test results for the intended purpose(s).
4 Standard 3.0 – All steps in the testing process, including test design, validation, development, administration, and scoring procedures, should be designed in such a manner as to minimize construct-irrelevant variance and to promote valid score interpretations for the intended uses for all examinees in the intended population.
5 Standard 3.1 – Those responsible for test development, revision, and administration should design all steps of the testing process to promote valid score interpretations for intended score uses for the widest possible range of individuals and relevant subgroups in the intended population. Standard 3.2 – Test developers are responsible for developing tests that measure the intended construct and for minimizing the potential for tests to be affected by construct-irrelevant characteristics, such as linguistic, communicative, cognitive, cultural, physical, or other characteristics. Standard 12.3 – Those responsible for the development and use of educational assessments should design all relevant steps of the testing process to promote access to the construct for all individuals and subgroups for whom the assessment is intended.
Exhibit 1.2.1: Accommodations and Accessibility Supports
• Text-to-Speech—Directions, Passages, Items: Computer reads text and graphics aloud on directions, passages, and items. What is read and how it is read is configurable.
• Text-to-Speech—Graphic Description: Computer reads graphics and tables aloud.
• Magnification Interface: Student can zoom in and zoom out on the entire page. This capability persists throughout the test.
• Magnifier: Student can magnify a selected portion of an item.
• Variable Font Size: The number of levels (generally, five levels) and rate of increase (generally, 1.25x the previous level) are configurable.
• Refreshable Braille/Tactile With External Embosser Printer: Items can be rendered to desktop embossers that can integrate Braille and tactile graphics. The items are simultaneously rendered on a reader-accessible screen, and the student can navigate to response spaces to provide answers.
• Reverse Contrast: Background is black, while text is white.
• Administrator-Selectable Variable Font and Background Colors: Any foreground and background color can be supported.
• Color Overlay: Any color can be laid on the screen. This persists throughout the test.
• Increased White Space: This is the streamlined interface.
• Sign Language—Directions, Passages, Items: This capability consists of recorded videos using American Sign Language. Experts on hearing impairment do not recommend avatars because they do not translate well to American Sign Language.
• Translations: Versions are available in alternate languages.
• Keyword Translation: This enables translators to associate keyword translations.
• Glossaries and Dictionaries: These enable content developers to associate additional content with words or phrases. The content can comprise multiple types, and the content shown to a student can be controlled by his or her personal profile.
• Alternate Language Glossaries and Dictionaries: These enable content developers to associate alternate-language content with words or phrases. The content can comprise multiple types, and the content shown to a student can be controlled by his or her personal profile.
• Administrator-Selectable Assistive Devices Integration: Our system has a standard interface and a streamlined interface. Most assistive devices can work with the former, and an even wider group works with the latter. If the use of the device requires relaxation of certain security features (e.g., if suppression of pop-up windows interferes with on-screen keyboards), the system can be configured to allow the test administrator to select a more permissive mode.
• Line Reader: This feature allows a student to track the line he or she is reading.
• Masking: Students can mask extraneous information on the screen.
• Speech-to-Text: Speech is converted to text and then saved in the database. (Available through compatibility with third-party assistive technology.)
• Auditory Calming: This enables music or white noise to be played in the background. (Available through third-party software.)
• Administrator-Selectable Zoom: Default font size can be set in advance through a file upload or user interface or at the time of testing by the test administrator. Student can zoom in or zoom out at any time.
• Administrator-Selectable Large Print Font: Default font size can be set in advance through a file upload or user interface or at the time of testing by the test administrator. Student can zoom in or zoom out at any time.
• Administrator-Selectable Screen-Reader: The system supports an integrated screen reader that can be configured to provide a variety of support levels, each selectable by the test administrator.
• Additional Time: AIR’s system currently does not impose a time limit on the test. It is up to the proctor to stop a student’s test or stop the entire session. However, if there are unforeseen events, such as a fire alarm, that trigger a need for additional testing time, AIR’s system can enable a grace period extension (GPE) for a single test opportunity or for multiple test opportunities.
• Segment Breaks: AIR’s system has the capability of adding test segments within a test. A test segment is made up of multiple item groups and creates a logical break between segments within a test. For example, a segment break might separate a calculator from a non-calculator segment of a test.
• Recorded Audio: AIR’s system efficiently delivers recorded audio. We are able to deliver voice-audio using only about 10 Kbps of bandwidth.
• Secure Print Facility: A visual accessibility feature, the secure print facility allows the secure printing of items or passages. A student requests that a passage or item be printed; the request is then encrypted and sent securely to the proctor; the proctor approves the request before it is sent to the printer. In addition, this feature also allows for the delivery of real-time paper-pencil tests, including large print tests.
• Test Pauses and Restarts: An attention accessibility feature, test pauses and restarts allow the test to be paused at any time and restarted and taken over many days. So that security is not compromised, visibility on past items is not allowed when the test has been paused longer than a specified period of time.
• Writing Checklist: An attention accessibility feature generally used for essay items, the writing checklist enables a student to check off writing guidelines from a checklist.
• Review Test: Students can review the test before ending it.
• Area Boundaries: An agility accessibility feature, area boundaries for mouse-clicking multiple-choice options allow students to click anywhere on the selected-response text or button.
• Language: Any language that is necessary can be supported.
• Help Section: A reference feature, the Help Section explains how the system and its tools work.
• Performance Report: A reference feature, a performance report is available at the end of the test for the student.
1.2.1 EVIDENCE BASED ON TEST CONTENT
Because OST assessments are designed to measure student progress toward achievement of Ohio’s Learning Standards, the validity of OST test score interpretations critically depends on the degree to which test content is aligned with expectations for student learning specified in the academic standards.6
Alignment with Ohio’s Learning Standards is achieved through a rigorous test development process that proceeds from the learning standards and refers back to them at every stage of a highly iterative process that includes ODE, test developers, and educator and stakeholder committees.
In addition to ensuring that test items are aligned with their intended learning standards, each assessment is intended to measure a representative sample of the knowledge and skills identified in the standards. Test blueprints specify the range of items with which each of the content strands and standards will be covered in each test administration.7 Thus, the test blueprints represent a policy document specifying the relative importance of content strands and standards in addition to meeting important measurement goals (e.g., sufficient items to report strand performance levels reliably). Because the test blueprint determines how student achievement of Ohio’s Learning Standards is evaluated, alignment of test blueprints with the learning standards is critical.
With the desired alignment of test blueprints to Ohio’s Learning Standards, alignment of test forms to the learning standards becomes a mechanical, although sometimes difficult, task of developing test forms that meet the
6 Standard 12.4 – When a test is used as an indicator of achievement in an instructional domain or with respect to specified content standards, evidence of the extent to which the test samples the range of knowledge and elicits the processes reflected in the target domain should be provided. Both the tested and the target domains should be described in sufficient detail for their relationship to be evaluated. The analyses should make explicit those aspects of the target domain that the test represents, as well as those aspects that the test fails to represent.
7 Standard 4.1 – Test specifications should describe the purpose(s) of the test, the definition of the construct or domain measured, the intended examinee population, and interpretations for intended uses. The specifications should include a rationale supporting the interpretations and uses of test results for the intended purpose(s).
blueprints. Developing test forms is difficult because test blueprints can be highly complex, specifying not only the range of items and points for each strand and standard, but also cross-cutting criteria such as distribution across item types, writing genre, and so on. Also, in addition to meeting complex blueprint requirements, test developers must work to meet psychometric goals so that alternate test forms measure equivalently across the range of ability.
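To make the blueprint constraints described above concrete, the following minimal sketch encodes a simplified blueprint as a data structure and checks a candidate form against it. The strand names, item counts, and item-type caps are hypothetical illustrations, not the actual OST blueprint values, and the check covers only count constraints, not the psychometric targets mentioned above.

```python
# Minimal sketch of a blueprint conformance check (hypothetical values, not the OST blueprints).
from collections import Counter

# Strand -> (minimum items, maximum items) allowed on a form.
blueprint = {
    "Reading Literature": (6, 8),
    "Reading Information": (6, 8),
    "Writing": (4, 6),
}
# Cross-cutting cap on item types (also hypothetical).
max_items_per_type = {"multiple_choice": 14, "constructed_response": 4}

def blueprint_violations(form):
    """Return a list of violated constraints for a candidate form (empty list = conforms).

    `form` is a list of dicts, each with a 'strand' and a 'type' key."""
    violations = []
    strand_counts = Counter(item["strand"] for item in form)
    type_counts = Counter(item["type"] for item in form)

    for strand, (lo, hi) in blueprint.items():
        n = strand_counts.get(strand, 0)
        if not lo <= n <= hi:
            violations.append(f"{strand}: {n} items, expected {lo}-{hi}")
    for item_type, cap in max_items_per_type.items():
        if type_counts.get(item_type, 0) > cap:
            violations.append(f"{item_type}: more than {cap} items")
    return violations

# Example use with a (truncated) candidate form:
candidate_form = [
    {"id": "itm001", "strand": "Reading Literature", "type": "multiple_choice"},
    {"id": "itm002", "strand": "Writing", "type": "constructed_response"},
]
print(blueprint_violations(candidate_form))
```

In practice, automated form assembly balances these count constraints against psychometric targets (for example, matching a target test characteristic curve), so blueprint conformance is a necessary but not sufficient condition for an acceptable form.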
Following a standard item-review process, items proceed initially through a series of internal reviews before they are eligible for review by ODE content experts. Most of AIR’s content staff members, who are responsible for conducting internal reviews, are former classroom teachers who hold degrees in education and/or their respective content areas. Each item passes through four internal review steps before it is eligible for review by ODE. Those steps include the following:
• Preliminary Review, in which the item is reviewed by a group of AIR content area experts
• Content Review 1, in which the item is reviewed by an AIR content specialist
• Editorial Review, in which a copyeditor checks the item for correct grammar/usage
• Senior Content Review, in which the item is reviewed by the lead content expert.
At every stage of the item-review process, beginning with preliminary review, AIR’s test developers analyze each item to ensure the following:
• The item is well-aligned with the intended learning standard.
• The item conforms to the item specifications for the target being assessed.
• The item is based on a quality idea (i.e., it assesses something worthwhile in a reasonable way).
• The vocabulary used in the item is appropriate for the intended grade/age and subject matter, and takes into consideration language accessibility, bias, and sensitivity.
• The item content is accurate and straightforward.
• Any accompanying graphic and stimulus materials are actually necessary to answer the question.
• The item stem is clear, concise, and succinct, meaning that it contains enough information to know what is being asked, it is stated positively (and does not rely on negatives such as no, not, none, or never unless absolutely necessary), and it ends with a question.
• For selected-response items, the set of response options is succinct; parallel in structure, grammar, length, and content; sufficiently distinct from one another; and all plausible, with all non-keyed response options unambiguously incorrect.
• There is no obvious or subtle cluing within the item.
• The score points for constructed-response items are clearly defined.
• For machine-scored constructed-response (MSCR) items, item responses yield the intended score points and rationales based on the rubric.
• For handscored constructed-response items, the scoring rubric clearly explains what characterizes responses at each possible level of achievement.
In addition, rubric-scored items, both machine-scored and handscored, are validated following field-test administration. Machine-scored items go through a rubric validation process wherein samples of student responses are reviewed, along with resulting scores, to ensure that rubrics are enacted as intended. This process is described in Section 11.1. Handscored items go through a range-finding process prior to scoring, in which samples of item responses are used to create scorer-training materials and to ensure that the scoring rubric is appropriate, as described in Section 11.2.
Based on their review of each item, the test developer may have accepted the item and classification as written, revised the item, or rejected the item outright.
Items passing through the internal review process were sent to ODE for their review. At this stage, items may have been further revised based on any edits or changes requested by ODE, or rejected outright. Items passing through the ODE review level then had to pass through two stakeholder reviews in which committees of Ohio educators and stakeholders reviewed each item’s accuracy, alignment to the intended standard and depth of knowledge (DOK) level, and item fairness and language sensitivity. Thus, all items considered for inclusion in the OST item pools were initially reviewed by the following committees.
• A content advisory committee checked to ensure that each item is
o aligned to Ohio’s Learning Standards;
o appropriate for the grade level;
o accurate; and
o presented online in a way that is clear and appropriate.
• A fairness and sensitivity committee checked to ensure that each item and any associated stimulus materials were free from bias, sensitive issues, controversial language, stereotyping, and statements that reflect negatively on race, ethnicity, gender, culture, region, disability, or other social and economic conditions and characteristics.
Items successfully passing through this committee review process were then field tested to ensure that the items behaved as intended when administered to students. Despite conscientious item development, some items perform differently than expected when administered to students. Using the item statistics computed following field testing to review item performance is an important step in constructing equivalent operational test forms that support valid inferences.
Classical item analyses ensure that items function as intended with respect to the underlying scales. Classical item statistics are designed to evaluate the item difficulty and the relationship of each item to the overall scale (item discrimination) and to identify items that may exhibit a bias across subgroups (differential item functioning analyses).
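To illustrate the classical statistics referred to here, the sketch below computes each item's p-value (difficulty) and corrected point-biserial correlation (discrimination) from a matrix of scored responses. The flagging thresholds shown are common illustrative choices, not the specific OST criteria, and differential item functioning is evaluated separately from subgroup-matched comparisons (see Section 6.1.3).

```python
# Minimal sketch of classical item statistics for dichotomous items (illustrative thresholds).
import numpy as np

def classical_item_stats(scores, p_flag=(0.10, 0.90), rpb_flag=0.20):
    """scores: 2-D array (students x items) of 0/1 item scores.

    Returns per-item p-values, corrected point-biserial correlations, and a flag vector."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    p_values = scores.mean(axis=0)                       # proportion correct = item difficulty
    rpb = np.empty(scores.shape[1])
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]                      # criterion score excludes the item itself
        rpb[j] = np.corrcoef(scores[:, j], rest)[0, 1]   # corrected point-biserial discrimination
    flagged = (p_values < p_flag[0]) | (p_values > p_flag[1]) | (rpb < rpb_flag)
    return p_values, rpb, flagged

# Tiny example: six students, three items
responses = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [1, 0, 0], [1, 1, 1], [0, 0, 1]]
p, r, flags = classical_item_stats(responses)
```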
Items flagged for review based on their statistical performance have to pass a three-stage review to be included in the final item pool from which operational forms are created. In the first stage of this review, a team of psychometricians reviews all flagged items to ensure that the data are accurate and properly analyzed, response keys are correct, and there are no other obvious problems with the items.
ODE then reconvened their content review and fairness and sensitivity committees to re-evaluate flagged field-test items in the context of each item’s statistical performance. Based on their review of each item’s performance, the content review and fairness and sensitivity committees could recommend that flagged items be rejected or deem them eligible for inclusion in operational test administrations.
1.2.2 EVIDENCE FOR PERFORMANCE STANDARD INTERPRETATION OF TEST SCORES
Alignment of test content to Ohio’s Learning Standards ensures that test scores can serve as valid indicators of the degree to which students have achieved the learning expectations detailed in the standards. However, the interpretation of the OST test scores rests fundamentally on how test scores relate to performance standards, which define the extent to which students have achieved the expectations defined in the standards. OST test scores are reported with respect to five proficiency levels, indicating the degree to which Ohio students have achieved the learning expectations defined by Ohio’s Learning Standards. The cut score establishing the Proficient level of performance is the most critical, since it indicates that students are meeting grade-level expectations for achievement of Ohio’s Learning Standards and that they are prepared to benefit from instruction at the next grade level. The Accelerated level is also of critical importance because performance at this level is intended to indicate
that students are on track to pursue post-secondary education. Procedures used to adopt performance standards for OST assessments are therefore central to the validity of test score interpretations.8
Following the first operational administration of the science and social studies assessments in spring 2015, a standard-setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards for reporting student achievement of Ohio’s Learning Standards. In December 2015, a standard-setting workshop was conducted to recommend the performance standards in ELA and mathematics. For each of the standard-setting workshops, a technical report was produced that describes the standardized and rigorous procedures that Ohio educators, serving as standard-setting panelists, followed to recommend performance standards. The workshops employed the Bookmark procedure, a widely used method in which standard-setting panelists use their expert knowledge of the academic content standards and student achievement to map the performance level descriptors adopted by the Ohio State Board of Education onto an ordered-item booklet based on the first operational test forms administered to students in spring 2015 for science and social studies, and in fall 2015 and spring 2016 for ELA and mathematics.9
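The mechanics of the Bookmark procedure can be sketched briefly: under an IRT model, each item in the ordered-item booklet is located at the ability at which a borderline student would have a specified response probability of success (commonly RP67), and a panelist's bookmark placement maps to one of those locations. The sketch below assumes a 2PL item response model, an RP67 criterion, and the median of panelist placements as the panel cut; the parameters and placements are hypothetical, and this is not the exact computation used in the OST standard-setting workshops.

```python
# Minimal Bookmark-procedure sketch (2PL model, RP67 criterion; hypothetical values).
import math
from statistics import median

def rp_location(a, b, rp=0.67):
    """Ability theta at which P(correct) = rp under a 2PL item:
    P(theta) = 1 / (1 + exp(-a * (theta - b)))."""
    return b + math.log(rp / (1.0 - rp)) / a

# Hypothetical item parameters (a = discrimination, b = difficulty).
items = [(1.1, -1.2), (0.9, -0.6), (1.3, -0.1), (1.0, 0.4), (1.2, 0.9), (0.8, 1.5)]
# Ordered-item booklet: items sorted by their RP67 locations.
booklet = sorted(rp_location(a, b) for a, b in items)

# Each panelist places a bookmark on the first page (item) that a borderline-Proficient student
# would NOT be expected to answer with RP67 success; the cut is read from the preceding item.
bookmark_pages = [4, 5, 4, 3, 4]                 # 1-based page placements from five panelists
panelist_cuts = [booklet[page - 2] for page in bookmark_pages]
theta_cut = median(panelist_cuts)                # panel-recommended cut score on the theta scale
```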
Panelists were also provided with contextual information to help inform their primarily content-driven performance standard recommendations. Panelists recommending performance standards were provided with the approximate location of performance standards from other statewide assessment systems, including the Partnership for Assessments of Readiness for College and Careers (PARCC) and Smarter Balanced. Panelists recommending performance standards for the grades 3–8 summative assessments were provided with the approximate location of relevant National Assessment of Educational Progress (NAEP) performance standards at grades 4 and 8, as well as interpolated NAEP standards for grade 6. High school end-of-course panelists were also informed of the approximate location of the ACT college-ready cut scores for appropriate subject area assessments. Panelists were asked to consider these benchmark locations when making their content-based performance standard recommendations. When panelists are able to use benchmark information to locate performance standards that converge across assessment systems, the validity of test score interpretations is bolstered.
In addition, panelists were provided with feedback about the vertical articulation of their recommended performance standards so that they could view how the locations of their recommended performance standards for each grade-level assessment sat in relation to the performance standard recommendations at the other grade levels. This approach allowed panelists to view their performance standard recommendations as a coherent system of performance standards, which further reinforces the interpretation of test scores as indicating not only achievement of current grade-level standards, but also preparedness to benefit from instruction in the subsequent grade level.
Based on the recommended performance standards, Exhibit 1.2.2.1 shows the estimated percentage of students meeting or exceeding the OST assessments’ proficient and accelerated performance standards for each of the ELA and mathematics assessments, while Exhibit 1.2.2.2 shows the percentage of students expected to meet or exceed the OST assessments’ proficient and accelerated standards for the science and social studies assessments. Exhibits 1.2.2.1 and 1.2.2.2 also show the approximate percentage of Ohio students that would be expected to meet the relevant ACT college-ready standard, and the percentage of Ohio students meeting the NAEP proficient standards at grades 4 and 8, and as interpolated at grade 6. Exhibit 1.2.2.1 also presents the estimated percentage of Ohio students meeting the PARCC and Smarter Balanced proficient standards. Exhibit 1.2.2.2 provides the estimated percentage of students nationally scoring at or above the highly proficient standard for TIMSS. As the exhibits indicate, the recommended OST performance standards are quite consistent with the relevant ACT college-ready standards and the NAEP and Smarter Balanced proficient benchmarks. Moreover, because the performance standards were vertically articulated, the proficiency rates across grade levels are generally consistent.
8 Standard 4.22 – Test developers should specify the procedures used to interpret test scores and, when appropriate, the normative or standardization samples or the criterion used.
9 Standard 1.18 – When it is asserted that a certain level of test performance predicts adequate or inadequate criterion performance, information about the levels of criterion performance associated with given levels of test scores should be provided.
Exhibit 1.2.2.1: Percentage of Students Meeting OST and Benchmark Proficient Standards — ELA and Mathematics

Grade / Course | OST Proficient | OST Accelerated | ACT College Ready | NAEP Proficient | PARCC Proficient | SBAC Meets

ELA
Grade 3 | 56 | 36 | -- | -- | -- | 52
Grade 4 | 54 | 33 | -- | 37 | 71 | 55
Grade 5 | 57 | 36 | -- | -- | 66 | 58
Grade 6 | 58 | 37 | -- | 37 | 69 | 54
Grade 7 | 55 | 32 | -- | -- | 68 | 57
Grade 8 | 55 | 28 | -- | 36 | 64 | 57
ELA I | 53 | 24 | -- | -- | 71 | 63
ELA II | 52 | 25 | 37 | -- | 72 | 63

Mathematics
Grade 3 | 66 | 36 | -- | -- | 66 | 58
Grade 4 | 65 | 37 | -- | 45 | 65 | 54
Grade 5 | 65 | 34 | -- | -- | 65 | 49
Grade 6 | 62 | 31 | -- | 40 | 63 | 45
Grade 7 | 61 | 36 | -- | -- | 63 | 49
Grade 8 | 63 | 32 | -- | 35 | 53 | 50
Algebra I | 58 | 36 | -- | -- | 65 | 47
Geometry | 59 | 38 | 31 | -- | -- | 48
Integrated Math I | 58 | 36 | -- | -- | 58 | 47
Integrated Math II | 56 | 36 | 32 | -- | -- | 45
Exhibit 1.2.2.2: Percentage of Students Meeting OST and Benchmark Proficient Standards — Science and Social Studies

Grade / Course | OST Proficient | OST Accelerated | ACT College Ready | NAEP Proficient | National TIMSS High

Science
Grade 5 | 62 | 38 | -- | -- | 47
Grade 8 | 60 | 37 | -- | 38 | 40
Physical Science | 63 | 22 | 26 | -- | --
Biology | 60 | 27 | 26 | -- | --

Social Studies
Grade 4 | 70 | 29 | -- | 37 | --
Grade 6 | 57 | 36 | -- | 38 | --
American History | 71 | 35 | 37 | -- | --
American Government | 67 | 18 | 37 | -- | --
1.2.3 EVIDENCE BASED ON INTERNAL STRUCTURE
Ohio’s State Tests represent a structural model of student achievement in grade-level and course-specific content areas. Within each subject area (e.g., ELA), items are designed to measure a single content strand (e.g., Reading Information, Reading Literature, Writing). Content strands within each subject area are, in turn, indicators of achievement in the subject area. The form of the second-order confirmatory factor analyses is illustrated in Exhibit 1.2.3.1 with each item as an indicator of an academic content strand. Because items are never pure indicators of an underlying factor, each item also includes an error component. Similarly, each academic content strand serves as an indicator of achievement in a subject area. As at the item level, the content strands include an error term indicating that the content strands are not pure indicators of overall achievement in the subject area. The paths from the content strands to the items represent the first-order factor loadings or the degree to which items are correlated with the underlying academic content strand construct. Similarly, the paths from subject area achievement to the content strands represent the second-order factor loading, indicating the degree to which academic content strand constructs are correlated with the underlying construct of subject area achievement.
Exhibit 1.2.3.1: Second-Order Structural Model for OST Assessments
Confirmatory factor analysis was used to evaluate the fit of this structural model to student response data from the OST assessments’ test administrations.10 For each of the test forms administered in spring 2018, we examined the goodness-of-fit between the structural model and the operational test data. Goodness-of-fit is typically indexed by a χ2 statistic, with good model fit indicated by a non-significant χ2 statistic. The χ2 statistic is sensitive to sample size, however; even well-fitting models will demonstrate highly significant χ2 statistics given a very large number of students. Therefore, fit indices such as the Comparative Fit Index (CFI; Bentler, 1990), the Tucker-Lewis Index (TLI; Tucker & Lewis, 1973), and the Root Mean Square Error of Approximation (RMSEA) were also used to evaluate model fit. Exhibit 1.2.3.2 illustrates the guidelines for evaluating goodness-of-fit.
10 Standard 1.13 – If the rationale for a test score interpretation for a given use depends on premises about the relationships among test items or among parts of the test, evidence concerning the internal structure of the test should be provided.
Exhibit 1.2.3.2: Guidelines for Evaluating Goodness-of-Fit
Goodness-of-Fit Index Indication of Good Fit
CFI ≥ 0.95
TLI ≥ 0.95
RMSEA ≤ 0.05
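For illustration, the following minimal Python sketch shows how these fit indices can be computed from the chi-square statistics and degrees of freedom of the fitted and baseline models; the function name and example values are hypothetical and are not taken from this report.

    from math import sqrt

    def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n):
        """Compute CFI, TLI, and RMSEA from model and baseline chi-square values."""
        # CFI: 1 minus the ratio of the fitted model's non-centrality to the baseline's
        d_model = max(chi2_model - df_model, 0.0)
        d_base = max(chi2_baseline - df_baseline, 0.0)
        cfi = 1.0 - d_model / max(d_base, d_model, 1e-12)
        # TLI (non-normed fit index)
        tli = ((chi2_baseline / df_baseline) - (chi2_model / df_model)) / ((chi2_baseline / df_baseline) - 1.0)
        # RMSEA based on the non-centrality parameter and the sample size n
        rmsea = sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))
        return cfi, tli, rmsea

    # Hypothetical example: a well-fitting model estimated on a very large sample
    print(fit_indices(chi2_model=1200.0, df_model=950, chi2_baseline=90000.0, df_baseline=990, n=120000))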
In addition to testing the fit of the hypothesized OST second-order confirmatory factor analysis model, we examined the degree to which the second-order model improved fit over the more general one-factor model (i.e., first-order model) of academic achievement in each subject area. Because the one-factor, general achievement model was nested within the second-order model, a simple likelihood ratio test was used to determine whether the added information provided by the structure of the OST assessments’ frameworks improved model fit over a general achievement model. Results indicating improved model fit for the second-order factor model provide support for the interpretation of learning standard performance at the strand level above that provided by the overall subject area score.11
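As a concrete illustration of the likelihood ratio comparison between the nested first-order and second-order models, the sketch below computes the chi-square difference test with scipy; the statistics shown are placeholders, not results from this report.

    from scipy.stats import chi2

    def chi_square_difference(chi2_first, df_first, chi2_second, df_second):
        """Likelihood ratio test comparing a nested first-order model to a second-order model."""
        # The more restricted first-order model has the larger chi-square and more degrees
        # of freedom; their difference is chi-square distributed with df_diff degrees of freedom.
        diff = chi2_first - chi2_second
        df_diff = df_first - df_second
        p_value = chi2.sf(diff, df_diff)
        return diff, df_diff, p_value

    # Hypothetical example values
    print(chi_square_difference(chi2_first=45000.0, df_first=1175, chi2_second=2100.0, df_second=1172))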
ELA Content Model
We began by evaluating the fit of the first-order, general achievement model in which all items are indicators of a common subject area factor. This model importantly evaluates the assumption of unidimensionality of the subject area assessments, and provides a baseline for evaluating the improvement of fit for the more differentiated second-order (i.e., strand) model. The goodness-of-fit statistics for the first-order, general achievement models in ELA are shown in Exhibit 1.2.3.3. All of the statistics indicate the general achievement factor model fit the data well. This pattern was true across all grades. The CFI and TLI values were all greater than 0.93, and the RMSEA values were at or below 0.06, indicating reasonable fit for the base model. The goodness-of-fit statistics for the hypothesized OST second-order models in ELA are shown in Exhibit 1.2.3.3. All of the statistics indicate the second-order models posited by OST assessments fit the data well. This pattern was true across all grades. The CFI and TLI values for the second-order models were all equal to or greater than 0.95, with RMSEA values well below the 0.05 threshold used to indicate good fit.
The results of the comparison between the hypothesized OST model and the more general achievement model are presented in Exhibit 1.2.3.3. We note that model fit for the first-order models of general achievement is reasonably high and provides evidence for the unidimensionality of the subject area assessments. The purpose of these analyses is to determine whether the posited second-order reporting model adds information beyond that provided by the first-order model. The chi-square difference test shows that across grade levels, the strand-based, second-order model showed significantly better fit than the general achievement first-order model. The χ2Diff p-values were less than 0.001 across all grade levels. The improved fit derives primarily from the differentiation between the reading and writing assessments. While the evidence supports a unified ELA construct, reading and writing are sufficiently independent to warrant differentiated reporting.
Exhibit 1.2.3.3: Goodness-of-Fit for the OST First-Order Model and Second-Order Model and Difference in Fit
between Two Competing Models — ELA
Grade / Course | First-Order Model: CFI, TLI, RMSEA | Second-Order Model: CFI, TLI, RMSEA | Difference in Fit between First- and Second-Order Models: χ2, df, p value
Grade 3 0.97 0.96 0.06 1.00 1.00 0.02 42920.307 3 p < 0.001
Grade 4 0.98 0.98 0.04 1.00 1.00 0.02 30094.493 3 p < 0.001
11 Standard 1.14 – When interpretation of subscores, score differences, or profiles is suggested, the rationale and relevant evidence in support of such interpretation should be provided. Where composite scores are developed, the basis and rationale for arriving at the composites should be given.
Grade 5 0.96 0.96 0.06 1.00 1.00 0.02 39872.871 3 p < 0.001
Grade 6 0.97 0.97 0.05 0.99 0.99 0.03 34166.295 3 p < 0.001
Grade 7 0.97 0.97 0.05 0.99 0.99 0.03 48889.907 3 p < 0.001
Grade 8 0.97 0.97 0.05 0.99 0.99 0.03 37699.965 3 p < 0.001
ELA I 0.98 0.98 0.05 0.99 0.99 0.03 35643.652 3 p < 0.001
ELA II 0.99 0.98 0.05 0.99 0.99 0.03 34637.670 3 p < 0.001
Mathematics Content Model
As with ELA, structural analyses of the mathematics assessments began with an evaluation of fit for the first-order, general achievement model in which all items are indicators of a common mathematics subject area factor. This model provides for an evaluation of the unidimensionality assumption of the subject area assessments and provides a baseline for evaluating the improvement of fit for the more differentiated second-order model. The goodness-of-fit statistics for the general achievement models in mathematics are shown in Exhibit 1.2.3.4. Fit statistics indicate that the general achievement factor model fit the data well. This pattern was true across all grades. The CFI and TLI values were equal to or greater than 0.95, and the RMSEA values were below 0.05 for all grades, indicating adequate fit for the base model.
The goodness-of-fit statistics for the hypothesized OST second-order models in mathematics are shown in Exhibit 1.2.3.4. Fit statistics indicate the second-order models posited by OST assessments also fit the data well. This pattern was true across all grades. The CFI and TLI values for the second-order models were all equal to or greater than 0.95, with RMSEA values well below the 0.05 threshold used to indicate good fit.
The results of the comparison between the hypothesized OST model and the more general achievement model for mathematics tests are presented in Exhibit 1.2.3.4. The chi-square difference test shows that across grade levels, the strand-based, second-order model showed significantly better fit than the general achievement first-order model. The χ2Diff p-values were less than 0.001 across all grade levels. We note that the χ2Diff values are much smaller than those observed for ELA, as are the differences between the first- and second-order model fit indices. This suggests that while differentiation among the latent traits may be supported, the precision of observed subscale scores may not be sufficient to support differential comparisons between subscale scores. This appears especially relevant for grade 6 and the high school end-of-course tests, which are focused more narrowly on a single content strand.
Exhibit 1.2.3.4: Goodness-of-Fit for the OST First-Order Model and Second-Order Model and Difference in Fit
between Two Competing Models — Mathematics
Grade / Course | First-Order Model: CFI, TLI, RMSEA | Second-Order Model: CFI, TLI, RMSEA | Difference in Fit between First- and Second-Order Models: χ2, df, p value
Grade 3 0.96 0.95 0.04 0.96 0.95 0.04 617.339 4 p < 0.001
Grade 4 0.97 0.97 0.03 0.97 0.97 0.03 2770.506 3 p < 0.001
Grade 5 0.97 0.97 0.03 0.98 0.98 0.03 7981.289 3 p < 0.001
Grade 6 0.98 0.98 0.03 0.98 0.98 0.03 40.451 4 p < 0.001
Grade 7 0.98 0.98 0.03 0.98 0.98 0.03 711.761 4 p < 0.001
Grade 8 0.97 0.97 0.03 0.97 0.97 0.03 323.786 4 p < 0.001
Algebra 0.98 0.98 0.03 0.98 0.98 0.03 2087.410 3 p < 0.001
Geometry 0.97 0.97 0.03 0.97 0.97 0.03 341.206 4 p < 0.001
Int Math I 0.99 0.98 0.02 0.99 0.98 0.02 81.381 4 p < 0.001
Int Math II 0.97 0.97 0.03 0.97 0.97 0.03 244.299 4 p < 0.001
Science and Social Studies Content Models
Structural analyses of the science and social studies assessments also began with an evaluation of fit for the first-order, general achievement model in which all items are indicators of a common subject area factor. The goodness-of-fit statistics for the general achievement models in science and social studies are shown in Exhibit 1.2.3.5. With the exception of Physical Science, the statistics indicate the general achievement factor model fit the data well: the CFI and TLI values were equal to or greater than 0.95, and the RMSEA values were below 0.05, indicating good fit for the base model. The goodness-of-fit statistics for the hypothesized OST second-order models in science and social studies are shown in Exhibit 1.2.3.5. As with the general factor model, the fit statistics indicate the second-order models posited by OST assessments fit the data well for all assessments except Physical Science, with CFI and TLI values equal to or greater than 0.95 and RMSEA values well below the 0.05 threshold used to indicate good fit. For Physical Science, the CFI and TLI values fell well below these guidelines, although the RMSEA was at the 0.05 threshold.
The results of the comparison between the hypothesized OST model and the more general achievement model for science and social studies tests are presented in Exhibit 1.2.3.5. The chi-square difference test shows that across grade levels, the strand-based second-order model showed significantly better fit than the general achievement first-order model. The χ2Diff p-values were less than 0.001 across all grade levels. As observed with respect to the mathematics assessments, the χ2Diff values are relatively smaller than those observed for ELA, as are the differences between the first- and second-order model fit indices. Thus, while differentiation among the latent traits may be supported theoretically, the precision of observed subscale scores is likely not sufficient to support differential comparisons between subscale scores.
Exhibit 1.2.3.5: Goodness-of-Fit for the OST First-Order Model and Second-Order Model and Difference in Fit
between Two Competing Models — Science and Social Studies
Grade / Course | First-Order Model: CFI, TLI, RMSEA | Second-Order Model: CFI, TLI, RMSEA | Difference in Fit between First- and Second-Order Models: χ2, df, p value
Science
Grade 5 0.99 0.99 0.02 0.99 0.99 0.02 56.579 3 p < 0.001
Grade 8 0.95 0.95 0.03 0.96 0.95 0.03 4334.645 3 p < 0.001
Biology 0.98 0.98 0.02 0.98 0.98 0.02 8409.255 4 p < 0.001
Physical Science 0.41 0.38 0.05 0.42 0.39 0.05 58.596 4 p < 0.001
Social Studies
American Government 0.96 0.96 0.03 0.96 0.96 0.03 604.544 3 p < 0.001
American History 0.99 0.99 0.02 0.99 0.99 0.02 316.798 3 p < 0.001
1.2.4 MEASUREMENT INVARIANCE ACROSS SUBGROUPS
Measurement invariance occurs when the likelihood of a correct response conforms to the measurement model, is independent of group membership, and the parameters of the measurement model are statistically equivalent across
groups. 12 The parameters of interest in measurement invariance testing are the factor loadings and intercepts/thresholds. Invariance in residual variances or scale factors can also be tested, but there is consensus that it is not necessary to demonstrate invariance across groups on these parameters. In general, measurement invariance testing can be conducted using a series of multiple-group confirmatory factor analysis (CFA) models, which impose identical parameters across groups. The measurement model parameters, including factor patterns (configural invariance), factor loadings (metric or weak invariance), latent intercepts/thresholds (scalar or strong invariance), and unique or residual factor variances (strict invariance), are tested across groups in that sequential order. When factor loadings and intercepts/thresholds are invariant across groups, scores on latent variables can be validly compared across the groups and the latent variables can be used in structural models that hypothesize relationships among latent variables (Millsap, 2011).
Items comprising the spring 2018 operational test administration were used to investigate measurement invariance across subgroups for all subjects. The full set of tables associated with these analyses is provided in Appendix A for each of the grade and subject area assessments.
The series “a” tables (e.g., A.1a, A.2a, etc., in Appendix A) present the global model fit indices for the measurement invariance tests for each assessment. Following the sequence of tests of measurement invariance (Millsap & Cham, 2012), we tested configural, metric, and scalar invariance models using the χ2 difference test (at α ≤ 0.05) and the examination of differences in the Root Mean Square Error of Approximation (RMSEA; change in RMSEA ≤ 0.015; Chen, 2007) between the two nested invariance models. Measurement invariance was investigated across the following subgroups: gender (Model A); ethnicity, including African American vs. White (Model B-1), Hispanic vs. White (Model B-2), Asian vs. White (Model B-3), American Indian vs. White (Model B-4), and Multi-ethnic vs. White (Model B-5); Individualized Education Program status (IEP; Model C); and Limited English Proficiency status (LEP; Model D). Invariance tests of subgroups were investigated separately for each grade and subject area test. Please note that multiple-group CFA could not be tested for Physical Science and Biology because of the very small sizes of the focal groups in Physical Science and missing responses to certain items in Biology.
The null hypothesis of the χ2 difference test is that the more restricted invariance model (e.g., metric) fits the data as well as the less restricted invariance model (e.g., configural). Given the sensitivity of the χ2 difference test to sample size, we additionally examined significant differences on this test with an examination of the change in RMSEA. A small change in the RMSEA between the more restricted and less restricted invariance models supports retention of the more restricted invariance model (Chen, 2007).
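The following minimal Python sketch, using assumed placeholder fit statistics rather than values from this report, illustrates how the chi-square difference test and the change-in-RMSEA criterion (≤ 0.015; Chen, 2007) can be combined to decide whether the more restricted invariance model is retained.

    from scipy.stats import chi2

    def retain_restricted_model(chi2_restricted, df_restricted, rmsea_restricted,
                                chi2_free, df_free, rmsea_free,
                                alpha=0.05, rmsea_change_cut=0.015):
        """Decide whether the more restricted invariance model (e.g., scalar) is retained
        relative to the less restricted one (e.g., metric), per the criteria described above."""
        diff = chi2_restricted - chi2_free
        df_diff = df_restricted - df_free
        p_value = chi2.sf(diff, df_diff)
        delta_rmsea = rmsea_restricted - rmsea_free
        # Retain if the chi-square difference is nonsignificant or, given very large samples,
        # if the change in RMSEA is negligible.
        return (p_value > alpha) or (delta_rmsea <= rmsea_change_cut)

    # Hypothetical example: significant chi-square difference but negligible RMSEA change
    print(retain_restricted_model(5120.4, 310, 0.031, 5010.2, 290, 0.030))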
The series “b” tables (e.g., A.1b, A.2b, etc.) show the model fit indices of the scalar invariance models, assuming the same factor pattern, identical factor loadings, and identical latent intercepts/thresholds across subgroups. Global model fit indices included the Comparative Fit Index (CFI; Bentler, 1990) and the Root Mean Square Error of Approximation (RMSEA). CFI values ≥ 0.90 and RMSEA values ≤ 0.08 were used to evaluate acceptable model fit. The model fit indices of the scalar invariance models for all tests suggested acceptable fit to the data. For ELA, CFI ranged from 0.93 to 0.99 and RMSEA ranged from 0.01 to 0.06. For mathematics, CFI values ranged from 0.95 to 0.99 and RMSEA ranged from 0.03 to 0.04. For science, CFI values ranged from 0.90 to 0.97 and RMSEA ranged from 0.02 to 0.03. For social studies, CFI values ranged from 0.96 to 0.99 and RMSEA ranged from 0.01 to 0.03.
Although the χ2 difference test should ideally be nonsignificant, almost all χ2 difference tests were significant at α = 0.05 due to large sample sizes. An exception to this was observed for Model B-4 (American Indian vs. White), where the χ2 difference tests for most grades were nonsignificant or marginally significant at α = 0.05. In spite of significant χ2 difference tests for most models, we found that changes in the RMSEA between the two nested invariance models were very small (ranging from 0.000 to 0.005 across assessments in all grades and subjects), which indicates acceptable fit of the scalar model. Based on the similar magnitudes of the RMSEA (i.e., no material change across all tested models; Cheung & Rensvold, 2002) and the acceptable fit of the scalar invariance model to the data, OST spring 2018 test scores have the same measurement structure across gender, ethnicity (African
12 Standard 3.15 – Test developers and publishers who claim that a test can be used with examinees from specific subgroups are responsible for providing the necessary information to support appropriate test score interpretations for their intended uses for individuals from these subgroups.
American vs. White, Hispanics vs. White, Asian vs. White, American Indian vs. White, and Multi-Ethnic vs. White), individualized education program status, and limited English proficiency status.
1.2.5 TEST INTEGRITY FORENSICS
The validity of test score interpretation depends critically on the integrity of the test administrations on which those scores are based. Any irregularities in the administration of assessments can therefore cast doubt on the validity of the inferences based on those test scores. Multiple facets work together to ensure that tests are administered properly, including clear test administration policies, effective test administrator training, and tools to identify possible irregularities in test administrations.13
For online administrations, quality assurance (QA) reports are generated during and after the testing windows. These are geared toward detection of possible cheating, aggregating unusual responses at the student level to detect possible group-level testing anomalies.
Online test administration allows Ohio’s testing contractor to track information that was not possible to track in the context of the paper-pencil tests. This information includes not only item responses but also item response changes, latencies between item responses and changes, number of revisits to an item or items, test start and end times, scores in each opportunity in the current year, scores in the previous year, and other selected information in the system (e.g., accommodations) as requested by the state. AIR’s Test Delivery System (TDS) captures all of this information.
Unlike with paper-based assessments where data analysis must await the close of the testing window and processing of answer documents, AIR’s TDS allows AIR psychometricians and state assessment staff to monitor testing anomalies throughout each testing window, following the first operational administration. Following the base year, the analyses used to detect the testing anomalies can be run at any time within the testing window. Evidence evaluated includes changes in test scores across administrations, item response times, and item response patterns using the person-fit index. The flagging criteria used for these analyses are configurable and can be changed by the user. Analyses are performed at student-level and summarized for each aggregate unit, including testing session, test administrator, and school.
Changes in Student Performance
Score-change analyses were not available for spring 2016; beginning in the 2016–2017 school year, however, it became possible to examine score changes between test administrations for both online and paper-pencil tests using a regression model. For between-year comparisons, the scores from the past and current years are compared, with the current-year score regressed on the test score from the previous year. Between-year comparisons are performed starting with the second year of the test administration.
A large score gain or loss between grades is detected by examining the residuals for outliers. The residuals are computed as observed value minus predicted value. To detect unusual residuals, we compute the studentized t residuals. An unusual increase or decrease in student scores between opportunities is flagged when studentized t residuals are greater than 3 or less than -3.
The number of students with a large score gain or loss is aggregated for a testing session, test administrator, and school. Unusual changes in an aggregate performance between administrations and/or years is flagged based on the average studentized t residuals in an aggregate unit (e.g., a testing session or a test administrator). For each aggregate unit, a critical t value is computed and flagged when t was greater than 3 or less than -3,
13 Standard 6.6 – Reasonable efforts should be made to ensure the integrity of test scores by eliminating
opportunities for students to attain scores by fraudulent or deceptive means.
t = \frac{\text{Average residuals}}{\sqrt{\frac{s^2}{n} + \frac{\sum_{i=1}^{n}\mathrm{var}(e_i)}{n^2}}} ,
where s = standard deviation of residuals in an aggregate unit; n = number of students in an aggregate unit (e.g., testing session or test administrator); and 𝑣𝑎𝑟(𝑒𝑖) = 𝜎2(1 − ℎ𝑖𝑖). The term 1 − ℎ𝑖𝑖 is the diagonal component of the variance-covariance matrix of the residuals. The QA report includes a list of the flagged aggregate units with the number of flagged students in the aggregate unit.
If the aggregate unit size is 1–5 students, the aggregate unit is flagged if the percentage of flagged students is greater than 50%. The aggregate unit size for the score change is based on the number of students included in the within- or between-year regression analyses in the aggregate unit. The number of flagged aggregate units and the percentage of flagged aggregate units are presented in Tables B.1 to B.4 in Appendix B.
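As an illustration of the flagging logic just described, the Python sketch below regresses current-year scores on prior-year scores, computes studentized residuals, and forms the aggregate-unit t statistic; the data and identifiers are hypothetical, and the statsmodels calls represent one possible implementation, not the contractor's production code.

    import numpy as np
    import statsmodels.api as sm

    def score_change_flags(prev_scores, curr_scores, unit_ids, crit=3.0):
        """Flag students and aggregate units with unusual score changes between years."""
        X = sm.add_constant(np.asarray(prev_scores, dtype=float))
        fit = sm.OLS(np.asarray(curr_scores, dtype=float), X).fit()
        infl = fit.get_influence()
        t_resid = infl.resid_studentized_external              # studentized t residuals
        var_e = fit.mse_resid * (1.0 - infl.hat_matrix_diag)   # var(e_i) = sigma^2 (1 - h_ii)
        student_flags = np.abs(t_resid) > crit

        unit_flags = {}
        for unit in set(unit_ids):
            idx = np.array([u == unit for u in unit_ids])
            r, n = t_resid[idx], idx.sum()
            # Aggregate-unit t statistic described above
            t_unit = r.mean() / np.sqrt(r.var(ddof=1) / n + var_e[idx].sum() / n**2)
            unit_flags[unit] = abs(t_unit) > crit
        return student_flags, unit_flags

    # Hypothetical usage with simulated scores for ten sessions of 20 students each
    rng = np.random.default_rng(0)
    prev = rng.normal(700, 25, size=200)
    curr = prev + rng.normal(0, 10, size=200)
    units = ["session%02d" % (i // 20) for i in range(200)]
    flags, unit_flags = score_change_flags(prev, curr, units)
    print(flags.sum(), unit_flags)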
Item Response Latency
The online environment also allows item response latency to be captured as the item page time (the length of time that each item page is presented) in milliseconds. Discrete items appear on the screen one at a time. However, for stimulus-based items selected as part of an item group, all items associated with the stimulus are selected and loaded as a group. For each student, the total time taken to complete the test is computed by summing the page time across all items and item groups.
An example of unusual item response time would be a test record for an individual who scores very well on the
test even though the average time spent for each item was far less than that required of students statewide. If
students already know the answers to the questions, the response time is much shorter than the response time for
those items where the student has no prior knowledge of the item content. Conversely, if a test administrator
helps students by “coaching” them to change their responses during the test, the testing time could be longer than
expected.
The average and the standard deviation of test-taking time are computed across all students for each opportunity. Students and aggregate units were flagged if the test-taking time was more than 3 standard deviations above or below the state average. The state average and standard deviation were computed based on all students at the time the analysis was performed. The QA report includes a list of the flagged aggregate units with the number of flagged students in the aggregate unit. The number of flagged aggregate units and the percentage of flagged aggregate units are presented in Tables B.1 to B.4 in Appendix B.
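A minimal sketch of this flagging rule, assuming hypothetical per-student total test times in milliseconds, is shown below; it simply standardizes total test-taking time against the state mean and flags values beyond ±3 standard deviations.

    import numpy as np

    def flag_response_times(total_times_ms, crit=3.0):
        """Flag students whose total test-taking time is an outlier relative to the state average."""
        times = np.asarray(total_times_ms, dtype=float)
        z = (times - times.mean()) / times.std(ddof=1)
        return np.abs(z) > crit

    # Hypothetical example values in milliseconds
    times = [4_200_000, 3_900_000, 4_500_000, 5_100_000, 600_000]
    print(flag_response_times(times))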
Inconsistent Item Response Pattern (Person Fit)
In Item Response Theory (IRT) models, person-fit measurement is used to identify students whose response patterns
are improbable given an IRT model. If a test has psychometric integrity, little irregularity will be seen in the item
responses of the individual who responds to the items fairly and honestly.
If a student has prior knowledge of some test items (or is provided answers during the exam), the student will
respond correctly to those items at a higher probability than indicated by his or her ability as estimated across all
items. In this case, the person-fit index will be large for the student. We note, however, that if a student has prior
knowledge of the entire test content, this will not be detected based on the person-fit index, although the item
response latency index might flag such a student.
The person-fit index is based on all item responses. An unlikely response to a single test question may not result in
a flagged person-fit index. Of course, not all unlikely patterns indicate cheating, as in the case of a student who is
able to guess a significant number of correct answers. Therefore, the person-fit index should be evaluated along with other evidence of testing irregularities to determine whether an irregularity has occurred. The number of flagged students is summarized for every testing session and test administrator.
The person-fit index, zl , is computed using a standardized log-likelihood statistic. Following Drasgow, Levine, and
Williams (1985), Sotaridona, Pornell, and Vallejo (2003) define aberrant response patterns as a deviation from the
expected item score model. Snijders (2001) showed that the distribution of zl is asymptotically normal (i.e., with
an increasing number of administered items, i). Even at shorter test lengths of 8 or 15 items, the “asymptotic error
probabilities are quite reasonable for nominal Type I error probabilities of 0.10 and 0.05” (Snijders, 2001).
Sotaridona et al. (2003) report promising results of using zl for systematic flagging of aberrant response patterns.
Students with zl values greater than 3 or smaller than -3 are flagged. Aggregate units with t values greater than 3 or smaller than -3 are also flagged, where the aggregate t statistic is computed as
t = \frac{\text{Average } z_l \text{ values}}{\sqrt{(s^2 + 1)/n}} ,
where s = the standard deviation of zl values in an aggregate unit and n = the number of students in an aggregate unit. The QA report includes a list of the flagged aggregate units with the number of flagged students in the aggregate unit (e.g., test session, test administrator, school). The number of flagged aggregate units and the percentage of flagged aggregate units are presented in Tables B.1 to B.4 in Appendix B.
Item Response Similarity
Item response similarity was investigated using the cheating detection method proposed in “Detecting excessive similarity in answers on multiple choice exams” (Wesolowsky, 2000). This method uses the similarity of responses between a pair of students to estimate the probability of possible cheating. The computational steps are as follows:
1. Based on assumptions and probability theory (pp. 911–912), \hat{p}_{ji} is estimated by solving the two equations

   c_j = \frac{1}{q}\sum_{i=1}^{q} p_{ji} \qquad \text{and} \qquad p_{ji} = 1 - \left(1 - r_i^{a_j}\right)^{1/a_j}

   for a_j, and then using \hat{a}_j and r_i to obtain \hat{p}_{ji} = 1 - \left(1 - r_i^{\hat{a}_j}\right)^{1/\hat{a}_j}, where r_i is the proportion of the analysis unit (e.g., school) that answered item i correctly and c_j is the proportion of the q items answered correctly by student j;
2. w_{ti} is the probability that, conditional on the answer being wrong, distractor t is chosen on question i. This is estimated by the proportion of students who chose option t among students who chose a wrong option on that item;
3. The estimates from steps 1 and 2 are used to estimate the mean and variance \hat{\mu}_{jk} and \hat{\sigma}^2_{jk}, and hence Z_{jk};
4. Based on Z_{jk} and the chosen significance level, it is decided whether students j and k have a significant probability of having copied from each other.
In order to investigate the probability of a false positive from the estimation procedure, the procedure is applied to estimate the probability of cheating for each pair of students within each aggregate unit (school/session), and two Bonferroni adjustments are used, one based on (n − 1) and the other based on n(n − 1)/2, where n is the number of students within the aggregate unit (school/session).
Aggregate units are flagged using two different methods: an aggressive method and a conservative method. The aggressive method uses α = 0.05 and a Bonferroni adjustment factor of (n − 1) to flag test sessions and schools. The more conservative method uses α = 0.01 and a Bonferroni adjustment factor of n(n − 1)/2 to flag suspect test sessions and schools.
The Bonferroni adjustment with factor (n − 1) is appropriate when the seating of students is known and possible cheating can only occur between adjacent (front and back) pairs of students. If no seating chart is available, the factor n(n − 1)/2 is usually used. Based on simulation studies, the results based on n(n − 1)/2 provide a good safety buffer against false positives, such that only a slight chance of a false positive remains. As for the alpha level, α = 0.01 is preferred so that only extreme pairs that warrant investigation are flagged. The number of flagged aggregate units and the percentage of flagged aggregate units for the two methods are presented in Tables B.5 to B.8 in Appendix B.
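As an illustration of the two flagging rules, the sketch below applies the aggressive and conservative Bonferroni adjustments to a set of pairwise similarity p-values; the p-values are assumed to come from a similarity statistic such as Wesolowsky's Z_jk and are hypothetical here.

    def flag_similar_pairs(pair_p_values, n_students, aggressive=True):
        """Flag student pairs whose similarity p-value survives a Bonferroni adjustment."""
        if aggressive:
            alpha, m = 0.05, n_students - 1                        # seating known: adjacent pairs only
        else:
            alpha, m = 0.01, n_students * (n_students - 1) // 2    # all possible pairs
        threshold = alpha / m
        return [pair for pair, p in pair_p_values.items() if p < threshold]

    # Hypothetical p-values for three pairs in a 25-student session
    p_values = {("s01", "s02"): 1.2e-7, ("s05", "s09"): 4.0e-4, ("s11", "s17"): 0.03}
    print(flag_similar_pairs(p_values, n_students=25, aggressive=False))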
1.2.6 SUMMARY OF VALIDITY OF TEST SCORE INTERPRETATIONS
Evidence for the validity of test score interpretations is strengthened as evidence supporting test score
interpretations accrues. In this sense, the process of seeking and evaluating evidence for the validity of test score
interpretation is ongoing. Nevertheless, there currently exists sufficient evidence to support the principal claims for
the test scores, including that OST test scores indicate the degree to which students have achieved Ohio’s Learning
Standards at each grade level, and that students scoring at the proficient level or higher demonstrate levels of
achievement consistent with extrapolations of national benchmarks indicating that they are on track to graduate
and are ready for post-secondary education or entry into the workforce. These claims are supported by evidence of
a test development process that ensures alignment of test content to Ohio’s Learning Standards, a standard-setting
process that yielded performance standards consistent with those of rigorous, national benchmarks, and evidence
that the structural model described by the new standards and implemented in Ohio’s State Tests is sound.
2. BACKGROUND OF OHIO COMPUTER-BASED ASSESSMENTS
2.1 BACKGROUND OF ELA AND MATHEMATICS ASSESSMENTS
The Ohio State Board of Education adopted the Common Core State Standards (CCSS) in both English Language Arts
(ELA) and mathematics as Ohio’s Learning Standards. Ohio’s Learning Standards are designed to help ensure that
students are college and career ready by the end of high school. Ohio’s Learning Standards in ELA and mathematics
were fully implemented in classrooms and assessed starting in the 2014–2015 school year. Beginning in the 2015–
2016 school year, ODE began administering Ohio’s State Tests (OST) to assess student proficiency in ELA and
mathematics at grades 3–8, and following completion of high school coursework in ELA I, ELA II, Algebra, and
Geometry (or alternatively, following coursework in Integrated Mathematics I and Integrated Mathematics II).
The first operational administration of OST assessments in ELA and mathematics took place in fall 2015, with
administration of grade 3 ELA and high school end-of-course (EOC) assessments in ELA and mathematics. The first
operational forms of OST assessments in ELA and mathematics were constructed using items from the AIRCore item
bank. The AIRCore items were developed to be aligned to the CCSS and had been previously administered as part of
statewide assessments in Arizona, Florida, Utah, and/or Oregon. Following administration in one or more of the
statewide assessment systems and completion of the item-review process, AIRCore items were calibrated using
Rasch and Masters’ Partial Credit models, and linked to a common AIRCore scale. In December 2015, a standard-
setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards
for reporting student achievement of Ohio’s Learning Standards in ELA and mathematics.14
The full system of grade-level and end-of-course assessments in ELA and mathematics was administered in spring
2016. Following the close of the testing window, the item pools for grade-level summative and high school EOC
assessments were calibrated. The Rasch model and Masters’ (1982) partial credit model, an extension of the one
parameter Rasch model that allows for graded responses, were used to estimate item parameters for OST
assessments. The OST assessments’ scale for each of the ELA and mathematics assessments was established by
centering on the spring 2016 operational form. The performance standards previously set on the AIRCore scale were
shifted to the new OST scale by the linking constants obtained through the mean-mean equating method. In
subsequent years, pre-equated bank item parameter estimates will be applied directly for final scoring and reporting,
a strategy that allows for more rapid reporting of tests administered online.
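The paragraph above describes shifting previously set performance standards onto the new OST scale using linking constants obtained through the mean-mean equating method. A minimal sketch of that idea, using hypothetical Rasch item difficulties for common items on the two scales, is shown below; it illustrates mean-mean linking in general, not the specific constants used for OST.

    import numpy as np

    def mean_mean_link(b_old_scale, b_new_scale, old_cuts):
        """Shift cut scores from an old Rasch scale to a new one via mean-mean linking."""
        # For the Rasch model the slope is fixed at 1, so the linking constant is simply
        # the difference in mean difficulty of the common items on the two scales.
        shift = np.mean(b_new_scale) - np.mean(b_old_scale)
        return [cut + shift for cut in old_cuts]

    # Hypothetical common-item difficulties and performance-standard cuts (theta metric)
    b_old = np.array([-1.2, -0.4, 0.1, 0.8, 1.5])
    b_new = np.array([-1.0, -0.2, 0.3, 1.0, 1.7])
    print(mean_mean_link(b_old, b_new, old_cuts=[-0.5, 0.0, 0.6, 1.2]))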
2.2 BACKGROUND OF SCIENCE AND SOCIAL STUDIES ASSESSMENTS
Ohio adopted new academic learning standards in social studies and science in 2010 and 2011, respectively. Ohio’s
Learning Standards in science and social studies are designed to ensure that students across grades are receiving the
instruction they need to become scientifically literate and civic-minded citizens equipped with knowledge and skills
for the 21st century workforce and able to successfully transition to higher education. In spring 2015, ODE
administered the OST for the first time in social studies and science to assess proficiency with respect to Ohio’s
Learning Standards. The OST assessments assess science achievement in grades 5 and 8, and following instruction in
Physical Science and Biology in high school. Social studies achievement is assessed following high school coursework
14 Standard 7.1 – The rationale for a test, recommended uses of the test, support for such uses, and information that assists in score interpretation should be documented. When particular misuses of a test can be reasonably anticipated, cautions against such misuses should be specified.
in American History and American Government. Prior to spring 2018, social studies achievement was also assessed
in grades 4 and 6.
The first operational administration of OST assessments in science and social studies took place in spring 2015. The
paper-based and online administrations occurred through the months of March and April. Following the close of the
testing window, the American Institutes for Research (AIR), under contract to ODE, convened eight panels of Ohio
educators to recommend performance standards on the assessments.
The Rasch model and Masters’ (1982) partial credit model were used to estimate item parameters for OST
assessments in science and social studies. Item pools for grade-level summative and end-of-course assessments
were calibrated following the first operational administration in spring 2015. In subsequent years, pre-equated bank
item parameter estimates will be applied directly for final scoring and reporting, a strategy that allows for more
rapid reporting of tests administered online.
2.3 OST TEST DESIGN
OST assessments are a series of fixed-form assessments that are intended to be administered online, although the
assessment is offered as a dual-mode (online and paper-based) assessment to accommodate schools that are not
ready to transition to the online testing environment.
Five types of machine-scored constructed-response (MSCR) items were included in OST assessments’ forms: graphic
response, natural language, equation response, hot text, and table input items. The graphic response item types
require students to place objects or move objects around in the answer space. A student can also plot points, draw
lines, and draw shapes. The natural language item types require students to type an English language answer. The
equation response items require students to enter a value or equation. The table input item types require students
to input numerical values into a table. Rubric validation for all operational test items was completed prior to test
construction and was based on the previous field-test administration of those items.
Each ELA assessment included one writing essay prompt that required an extended essay response. For the online
test administrations, a random sample of student responses to each writing task was selected for handscoring. These
responses were scored by two human raters on three distinct scoring dimensions or rubrics: Statement of
Purpose/Focus and Organization, Evidence/Elaboration, and Conventions/Editing, with any discrepancy adjudicated
in a resolution score. This sample of essay responses and writing scores was used to develop the statistical models
used for machine-scoring the remaining online essay responses. All essay responses captured from paper-pencil
tests were handscored.
Exhibits 2.3.1–2.3.16 provide the test blueprints that guide the construction of OST assessments’ test forms.
Exhibit 2.3.1: Point Range by Subscale — ELA
Grade / Course RL RI W
Grade 3 14–16 14–16 10
Grade 4 14–16 14–16 10
Grade 5 14–16 14–16 10
Grade 6 16–20 20–24 20
Grade 7 16–20 20–24 20
Grade 8 16–20 20–24 20
ELA I 14–18 22–26 20
ELA II 14–18 22–26 20
Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing
Exhibit 2.3.2: Point Range by Subscale — Grade 3 Mathematics
Grade FRA G MUD NO
Grade 3 11–13 11–13 12–16 11–13
Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; NO = Numbers & Operations
Exhibit 2.3.3: Point Range by Subscale — Grade 4 Mathematics
Grade FRA G MUD
Grade 4 17–21 11–13 17–21
Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division
Exhibit 2.3.4: Point Range by Subscale — Grade 5 Mathematics
Grade D FRA G
Grade 5 17-21 17-21 11-13
Note: D = Decimals; FRA = Fraction; G = Geometry
Exhibit 2.3.5: Point Range by Subscale — Grade 6 Mathematics
Grade EE GS NS RP
Grade 6 17–23 11–13 11–13 13–17
Note: EE = Expression and Equations; GS = Geometry and Statistics; NS = The Number System; RP = Ratios and Proportions
Exhibit 2.3.6: Point Range by Subscale — Grade 7 Mathematics
Grade G NS RP SP
Grade 7 11–13 15–19 12–16 12–15
Note: G = Geometry; NS = The Number System; RP = Ratios and Proportions; SP = Statistics and Probability
Exhibit 2.3.7: Point Range by Subscale — Grade 8 Mathematics
Grade EE F G NS
Grade 8 11–15 11–15 15–19 11–13
Note: EE = Expression and Equations; F = Functions; G = Geometry; NS = The Number System
Exhibit 2.3.8: Point Range by Subscale — Algebra
Course F NQEE S
Algebra 23–27 19–22 10–12
Note: F = Functions; NQEE = Number, Quantities, Equations and Expressions; S = Statistics
Exhibit 2.3.9: Point Range by Subscale — Geometry
Course CP C P ST
Geometry 19–21 10–13 10–12 13–19
Note: CP = Congruency & Proof; C = Circles; P = Probability; ST = Similarity & Trigonometry
Exhibit 2.3.10: Point Range by Subscale — Integrated Mathematics I
Course A G NQF S
Integrated Math I 13–15 11–13 18–21 10–12
Note: A = Algebra; G = Geometry; NQF = Number & Quantity/Functions; S = Statistics
Exhibit 2.3.11: Point Range by Subscale — Integrated Mathematics II
Course F G NQEE P
Integrated Math II 11–13 17–22 14–18 10–12
Note: F = Functions; G = Geometry; NQEE = Number, Quantities, Equations and Expressions; P = Probability
Exhibit 2.3.12 Point Range by Subscale — Grade 5 & 8 Science
Grade ES LS PS
Grade 5 15–17 19–21 19–21
Grade 8 21–23 16–18 16–18
Note: ES = Earth Science, LS = Life Science, PS = Physical Science
Exhibit 2.3.13 Point Range by Subscale — Biology
Course BS.A BS.B BS.C BS.D
Biology 13–15 13–15 13–15 13–15
Note: BS.A = Heredity; BS.B = Evolution; BS.C = Diversity and Interdependence of Life; BS.D = Cells
Exhibit 2.3.14: Point Range by Subscale — Physical Science
Course PS-HS.A PS-HS.B PS-HS.C PS-HS.D
Physical Science 15–17 15–17 15–17 7–9
Note: PS-HS.A = Study of Matter, PS-HS.B = Energy and Waves, PS-HS.C = Forces and Motion, PS-HS.D = The Universe
Exhibit 2.3.15: Point Range by Subscale — American Government
Course AGA AGB AGC
American Government 23–25 23–25 15–17
Note: AGA = Historic Documents, AGB = Principles and Structure, AGC = Ohio/Policy/ Economy
Exhibit 2.3.16: Point Range by Subscale — American History
Course AHA AHB AHC
American History 17–19 24–26 20–22
Note: AHA = Skills and Document, AHB = 1877-1945, AHC = 1945 - Present
3. SUMMARY OF FALL 2017 OPERATIONAL TEST ADMINISTRATION
The following tests were administered in fall 2017:
• Grade 3 ELA assessment
• High school end-of-course assessments
o ELA I and ELA II
o Algebra I, Geometry, Integrated Mathematics I, and Integrated Mathematics II
o Biology and Physical Science
o American Government and American History
The online testing window for grade 3 ELA was open from October 23 through November 3.
The high school end-of-course tests are scheduled to be administered following completion of instruction in courses
targeted for assessment. Most of the courses are taught over an academic year, but some students receive
instruction in semester-based courses, necessitating a fall administration. The online and paper-based testing
window for the high school EOC tests was open from December 4 through January 12. All fall 2017 tests were scored
using the pre-equated parameters. This section summarizes the operational test results for the fall 2017
administration of OST.
3.1 STUDENT POPULATION AND PARTICIPATION
Assessment data for operational analyses included Ohio public school students who met minimum attemptedness
requirements for scoring and reporting. The demographic composition of students taking the fall 2017 OST is
presented in Exhibit 3.1.1 by assessment and subgroup.15 The number of students participating in each assessment
by test mode is presented in Appendix C.
Exhibit 3.1.1: Number of Students Participating in Fall 2017 Assessments
Grade / Course | Overall | Female | Male | Unknown | African American | Asian | Hispanic / Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
ELA
Grade 3 128,205 62,932 65,121 152 23,210 3,087 5,369 158 86,563 9,655 5,114 15,140
ELA I 42,039 18,136 23,125 778 14,195 764 2,214 59 21,782 2,772 2,849 9,989
ELA II 35,553 16,400 18,474 679 11,493 728 1,784 55 18,918 2,334 2,286 7,111
Mathematics
Algebra 47,491 22,303 24,208 980 14,711 573 2,608 75 26,193 3,049 2,011 9,353
Geometry 34,631 16,742 17,174 715 10,541 596 1,828 46 19,222 2,135 1,519 6,403
Integrated Math I 4,832 2,346 2,447 39 2,327 213 76 14 1,633 566 812 671
15 Standard 1.8 – The composition of any sample of students from which validity evidence is obtained should be described in as much detail as is practical and permissible, including major relevant socio-demographic and developmental characteristics.
Integrated Math II 4,401 2,098 2,272 31 1,988 184 69 8 1,682 467 619 545
Science
Biology 24,837 12,013 12,508 316 8,845 392 1,255 39 12,577 1,625 1,470 4,333
Physical Science 899 436 445 18 448 6 39 327 66 41 198
Social Studies
American Government 32,330 15,655 16,262 413 7,161 752 1,317 47 21,145 1,775 1,104 4,583
American History 20,969 10,349 10,292 328 7,551 351 1,132 28 10,460 1,353 1,408 4,173
3.2 SUMMARY OF OVERALL STUDENT PERFORMANCE FOR FALL 2017
Exhibit 3.2.1 shows the statewide summary statistics for the fall 2017 OST administration. The results include
the minimum and maximum observed scale scores, scale score mean, standard deviation, standard error of the
measure (SEM), and internal consistency reliability. Frequency distributions for each of the assessments are provided
in Appendix D.
Exhibit 3.2.1 Fall 2017 Operational Test Summary Statistics
Grade / Course | N-count | Max Obtained Scaled Score | Min Obtained Scaled Score | Scaled Score Mean | Scaled Score Standard Deviation | Scaled Score SEM | Reliability
ELA
Grade 3 128,205 863 545 688.63 43.37 17.49 0.84
ELA I 42,039 800 606 681.63 23.55 8.89 0.86
ELA II 35,553 795 597 675.63 26.01 9.73 0.86
Mathematics
Algebra I 47,491 814 618 680.79 21.74 10.32 0.77
Geometry 34,631 810 604 669.11 25.16 12.98 0.73
Integrated Math I 4,832 811 618 677.9 24.21 10.47 0.81
Integrated Math II 4,401 813 594 670.62 28.50 12.65 0.80
Science
Biology 24,837 822 617 688.77 21.43 9.85 0.79
Physical Science 899 752 634 684.59 14.06 9.85 0.51
Social Studies
American Government 32,330 774 642 706.56 19.61 5.62 0.92
American History 20,969 800 619 693.96 19.93 7.65 0.85
Exhibit 3.2.2 shows the percentage of students classified in each of the performance levels for each of the fall 2017
tests.
Exhibit 3.2.2: Fall 2017 Percentage of Students in Performance Levels
Grade / Course | Number Tested | % Limited | % Basic | % Proficient | % Accelerated | % Advanced | % At or Above Proficient
ELA
Grade 3 128,205 33 29 15 15 8 38
ELA I 42,039 52 29 16 3 2 21
ELA II 35,553 55 30 14 3 1 17
Mathematics
Algebra I 47,491 51 34 13 3 1 16
Geometry 34,631 69 23 6 2 1 9
Integrated Math I 4,832 59 29 11 4 1 16
Integrated Math II 4,401 68 23 7 3 2 12
Science
Biology 24,837 43 35 16 2 5 23
Physical Science 899 47 41 14 1 0 15
Social Studies
American Government 32,330 14 26 41 13 7 60
American History 20,969 31 36 26 4 3 34
3.3 STUDENT PERFORMANCE BY SUBGROUP FOR FALL 2017
Exhibits 3.3.1–3.3.4 present the percentage of students in each grade and subject at each performance level, by gender and ethnicity, including female, male, African American, Asian, Hispanic/Latino, American Indian/Alaskan Native, White, and Multiple Ethnicities. Overall, the achievement gap between subgroups continues to exist. For example, the White group outperforms the African American and Hispanic groups.
Exhibit 3.3.1 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — ELA
Grade / Course, Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic / Latino | American Indian / Alaskan | White | Multiple Ethnicities | LEP | IEP
Grade 3
Limited 33 30 35 61 56 23 46 36 25 39 68 61
Basic 29 30 29 22 28 23 30 32 30 31 25 25
Proficient 15 16 15 11 8 16 12 12 18 14 5 8
Accelerated 15 15 14 6 6 21 9 15 18 12 2 5
Advanced 8 9 7 3 2 17 3 5 10 6 0 2
ELA I
Limited 52 48 56 51 65 50 55 56 43 57 70 72
Basic 29 31 28 32 27 28 30 24 32 27 23 23
Proficient 16 18 15 18 11 18 15 20 20 16 9 7
Accelerated 3 3 3 2 1 4 1 2 4 2 0 0
Advanced 2 2 1 0 0 3 1 2 3 1 0 0
ELA II
Limited 55 51 59 49 68 53 62 55 46 58 75 79
Basic 30 32 27 32 26 30 25 27 32 29 21 19
Proficient 14 15 12 18 8 13 13 16 17 12 5 5
Accelerated 3 3 2 2 1 4 2 2 4 2 0 0
Advanced 1 1 1 0 0 2 1 2 2 1 0 0
Exhibit 3.3.2 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity —
Mathematics
Grade / Course, Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic / Latino | American Indian / Alaskan | White | Multiple Ethnicities | LEP | IEP
Algebra
Limited 51 48 54 45 62 33 54 48 44 55 64 76
Basic 34 37 32 37 31 30 34 41 37 34 28 22
Proficient 13 14 12 14 9 19 12 11 16 11 8 4
Accelerated 3 2 3 4 1 10 2 1 3 2 1 0
Advanced 1 1 1 1 0 9 0 0 1 0 0 0
Geometry
Limited 69 69 69 65 82 46 73 63 62 74 75 89
Basic 23 23 22 25 17 25 21 24 26 20 19 11
Proficient 6 6 7 7 3 14 5 11 8 6 5 2
Accelerated 2 2 2 4 0 9 1 0 3 1 2 0
Advanced 1 1 1 0 0 7 0 2 1 0 0 0
Integrated Math I
Limited 59 56 61 74 66 60 59 86 48 61 73 80
Basic 29 31 27 21 27 26 29 7 30 31 23 18
Proficient 11 12 11 8 9 12 12 7 14 9 7 6
Accelerated 4 4 3 0 1 1 3 0 8 2 1 1
Advanced 1 1 1 0 0 2 0 0 2 1 0 0
Integrated Math II
Limited 68 68 68 71 77 63 59 63 57 78 82 90
Basic 23 23 23 19 22 18 26 25 24 21 17 12
Proficient 7 7 7 6 4 11 10 13 10 4 5 1
Accelerated 3 2 3 3 1 5 1 0 6 2 0 0
Advanced 2 2 2 0 0 5 3 0 4 0 0 0
Exhibit 3.3.3 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — Science
Grade / Course, Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic / Latino | American Indian / Alaskan | White | Multiple Ethnicities | LEP | IEP
Biology
Limited 43 41 44 45 55 40 49 46 33 43 59 62
Basic 35 37 34 38 37 35 36 44 34 38 35 32
Proficient 16 17 16 18 10 11 15 13 21 16 6 7
Accelerated 2 2 2 1 0 4 1 3 3 1 0 0
Advanced 5 5 5 1 1 11 2 3 8 4 0 1
Physical Science
Limited 47 48 47 56 54 33 41 44 33 44 61
Basic 41 42 41 39 40 50 46 41 50 49 36
Proficient 14 14 15 11 10 17 13 18 20 10 7
Accelerated 1 1 0 0 0 0 0 1 2 0 0
Advanced 0 0 0 0 0 0 0 0 0 0 0
Exhibit 3.3.4 Fall 2017 Percentage of Students at Each Performance Level by Gender and Ethnicity — Social
Studies
Grade / Course, Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic / Latino | American Indian / Alaskan | White | Multiple Ethnicities | LEP | IEP
American Government
Limited 14 13 16 22 31 9 22 11 8 17 36 35
Basic 26 28 25 35 37 19 37 36 22 31 44 39
Proficient 41 42 40 38 29 38 35 34 45 39 20 23
Accelerated 13 12 13 5 4 19 5 19 16 10 1 3
Advanced 7 6 8 1 1 15 2 0 9 5 0 1
American History
Limited 31 28 35 28 41 27 36 32 24 34 47 48
Basic 36 40 33 31 39 32 39 29 34 36 38 38
Proficient 26 27 26 31 20 30 22 36 32 26 16 14
Accelerated 4 3 4 7 1 5 2 0 6 3 1 1
Advanced 3 2 4 5 1 7 1 4 6 3 1 1
3.4 RELIABILITY FOR FALL 2017
Reliability refers to the consistency or precision of test scores and performance level classifications, and essentially
addresses the question of how likely a student would be to achieve the same score, or be classified in the same
performance level, across multiple administrations of equivalently constructed and administered test forms. As part
of each test administration, the reliability of test scores and performance classifications is evaluated from a variety
of perspectives. The reliability evidence of OST assessments in ELA, mathematics, science, and social studies is
demonstrated with respect to both classical and IRT indices of internal consistency of test scores, and decision
accuracy and consistency of performance level classifications.16
3.4.1 INTERNAL CONSISTENCY
Test score reliability is traditionally estimated using both classical and IRT approaches. While measurement error is
conditional on test information, it is nevertheless desirable to provide a single index of a test’s internal consistency
or reliability. Classical estimates of test reliability, such as Cronbach’s alpha, provide an index of the internal
consistency reliability of the test, or the likelihood that a student would achieve the same score in an equivalently
16 Standard 2.2 – The evidence provided for the reliability/precision of the scores should be consistent with the domain of replications associated with the testing procedures, and with the intended interpretations for use of the test scores. Standard 2.3 – For each total score, subscore, or combination of scores that is to be interpreted, estimates of relevant indices of reliability/precision should be reported.
constructed test form. 17 Exhibit 3.4.1.1 shows the internal consistency estimates for each of the assessments.
Internal consistency estimates are around 0.8, typical of most similar-length achievement tests. The internal consistency estimate for Physical Science is, however, quite low and appears to be due to restriction of score range resulting
from the very high difficulty of test items.
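For reference, the sketch below computes Cronbach's alpha from a students-by-items matrix of item scores using the standard coefficient alpha formula; the response matrix is hypothetical and is only meant to illustrate the index reported in Exhibit 3.4.1.1.

    import numpy as np

    def cronbach_alpha(item_scores):
        """Coefficient alpha for a (students x items) matrix of item scores."""
        X = np.asarray(item_scores, dtype=float)
        k = X.shape[1]                                   # number of items
        item_vars = X.var(axis=0, ddof=1).sum()          # sum of item score variances
        total_var = X.sum(axis=1).var(ddof=1)            # variance of total scores
        return (k / (k - 1.0)) * (1.0 - item_vars / total_var)

    # Hypothetical item-score matrix for five students on four items
    scores = np.array([[1, 0, 2, 1],
                       [0, 0, 1, 0],
                       [2, 1, 2, 2],
                       [1, 1, 1, 1],
                       [2, 2, 2, 2]])
    print(cronbach_alpha(scores))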
Exhibit 3.4.1.1 Internal Consistency Reliabilities (Cronbach’s alpha) for Fall 2017 OST Scores
Grade / Course | Internal Consistency Reliability | Variance
ELA
Grade 3 0.84 1882
ELA I 0.86 555
ELA II 0.86 677
Mathematics
Algebra 0.78 473
Geometry 0.73 633
Integrated Math I 0.81 586
Integrated Math II 0.80 813
Science
Biology 0.79 460
Physical Science NA NA
Social Studies
American Government 0.92 385
American History 0.85 397
NA: Not enough information to estimate reliably.
3.4.2 STANDARD ERROR OF MEASUREMENT
Because measurement error is conditional on test information, the precision of test scores varies with respect to the
information value of the test at each location along the ability distribution. Precision of individual test scores is
critically important to valid test score interpretation and is provided along with test scores as part of all student-
level reporting. Test scores are most precise in locations where test information is greatest. Because relatively little
test information is targeted to measurement of very low and high performing students, the precision of test scores
decreases near the tails of the ability distribution.
For OST assessments scored using MLE, the mathematical statement of the conditional standard error of
measurement (CSEM) for student i is:
CSEM(\hat{\theta}_i) = \frac{1}{\sqrt{I(\hat{\theta}_i)}}

where I(\hat{\theta}_i) is the Fisher information at the MLE and is calculated as:
17 Standard 2.19
I(\hat{\theta}) = -\left. \frac{\partial^2 l(\theta)}{\partial\theta^2} \right|_{\theta=\hat{\theta}} .
In general, the second derivative for the ith 1PL item is

\frac{\partial^2 \log\left([p_i(\theta)]^{z_i}[q_i(\theta)]^{1-z_i}\right)}{\partial\theta^2} =
\begin{cases}
-D^2 \, \dfrac{q_i(\theta)\, p_i^3(\theta)}{p_i^2(\theta)} & \text{if } z_i = 1 \\
-D^2 \, q_i(\theta)\, p_i(\theta) & \text{if } z_i = 0
\end{cases}
The second derivative for the ith Masters’ Partial Credit Model item is

\frac{\partial^2 \log P(z_i \mid \theta)}{\partial\theta^2} =
D^2 \, \frac{\left[\sum_{j=1}^{m_i} j \exp\!\left(\sum_{k=1}^{j} D(\theta - b_{ki})\right)\right]^2}{\left[1 + \sum_{j=1}^{m_i} \exp\!\left(\sum_{k=1}^{j} D(\theta - b_{ki})\right)\right]^2}
- D^2 \, \frac{\sum_{j=1}^{m_i} j^2 \exp\!\left(\sum_{k=1}^{j} D(\theta - b_{ki})\right)}{1 + \sum_{j=1}^{m_i} \exp\!\left(\sum_{k=1}^{j} D(\theta - b_{ki})\right)}
Standard errors of the MLEs are transformed to be placed onto the reporting scale. This transformation is

SE_{ss} = a \cdot SE_{\theta} ,

where SE_{\theta} is the standard error of the ability estimate on the \theta scale, and a is the slope of the scaling constants that transform \theta to the reporting scale. For OST assessments, a = \frac{725 - 700}{\theta_{Accelerated} - \theta_{Proficient}}.
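A minimal Python sketch of these computations for dichotomous Rasch items is shown below; the item difficulties, the theta estimate, and the scaling slope are hypothetical values used only to illustrate the CSEM formula and the transformation to the reporting scale.

    import numpy as np

    def rasch_csem(theta_hat, b, D=1.0):
        """Conditional SEM at the MLE for dichotomous Rasch items: 1 / sqrt(Fisher information)."""
        b = np.asarray(b, dtype=float)
        p = 1.0 / (1.0 + np.exp(-D * (theta_hat - b)))
        info = np.sum(D**2 * p * (1.0 - p))        # Fisher information summed over items
        return 1.0 / np.sqrt(info)

    # Hypothetical item difficulties and scaling constants
    b_items = np.linspace(-2.0, 2.0, 40)
    se_theta = rasch_csem(theta_hat=0.4, b=b_items)
    a = (725 - 700) / (1.1 - 0.3)                  # assumed theta cuts for Accelerated and Proficient
    print(se_theta, a * se_theta)                  # CSEM on the theta scale and on the reporting scale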
The figures in Exhibit 3.4.2.1–3.4.2.3 present graphically the standard errors of measurement for the grade-level and
end-of-course assessments. Each figure also includes the location of the four OST performance standard cuts. As the
figures indicate, OST assessments’ standard errors are smallest near the middle of the ability distribution, and
especially near the Proficient and Accelerated performance standard. 18 Test scores near the tails of the ability
distribution have larger standard errors as expected. We note that the test precision for some assessments,
especially the elementary grade ELA tests, does not support the number of performance standards adopted for OST
assessments. Thus, the standard errors for scores within some performance levels are nearly the size of the
performance level. Nevertheless, classification consistency estimates of scores at or above each performance
standard are strong.
18 Standard 2.14 – When possible and appropriate, conditional standard errors of measurement should be reported at several score levels unless there is evidence that the standard error is constant across score levels. Where cut scores are specified for selection or classification, the standard errors of measurement should be reported in the vicinity of each cut score.
Exhibit 3.4.2.1: Overall Standard Error of Measurement for Fall 2017 ELA
Exhibit 3.4.2.2: Overall Standard Error of Measurement for Fall 2017 Mathematics
Exhibit 3.4.2.3: Overall Standard Error of Measurement for Fall 2017 Science and Social Studies
3.4.3 STUDENT CLASSIFICATION RELIABILITY
When student performance is reported in terms of performance categories, a reliability index is computed to
estimate the likelihood of consistent classification of students as specified in standard 2.15 in the Standards for
Educational and Psychological Testing (AERA, APA, NCME, 2014). 19 This index considers the consistency of
classifications for the percentage of students that would, hypothetically, be classified in the same category on an
alternate, equivalent form.
For a fixed-form test, the consistency of classifications is typically estimated on test scores based on a single test
form from a single test administration using the true-score distribution estimated by fitting a bivariate beta-binomial
model or a four-parameter beta model (Huynh, 1976; Livingston & Wingersky, 1979; Subkoviak, 1976; Livingston &
Lewis, 1995).
The classification index can be examined for classification accuracy and classification consistency. Classification
accuracy refers to the agreement between the classifications based on the form actually taken and the classifications
that would be made on the basis of the students’ true scores, if their true scores could somehow be known.
Classification consistency refers to the agreement between the classifications based on the form actually taken and
the classifications that would be made on the basis of an alternate, equivalently constructed test form—that is, the
percentages of students who would be consistently classified in the same performance levels on two equivalent test
administrations.
In reality, the student’s true ability is unknown, and students are not administered an alternate, equivalent form.
Therefore, classification accuracy and consistency are estimated based on students' item scores, the item parameters, and the assumed underlying latent ability distribution, as described below. The true score is the expected value of the observed test score, that is, the score a student would obtain in the absence of measurement error.
3.4.4 CLASSIFICATION ACCURACY
Instead of assuming a normal distribution, we can directly estimate the probability of consistent classification using
the likelihood function. The likelihood function of 𝜃 given a student’s item scores represents the likelihood of the
student’s ability at that theta value. Integrating the likelihood values over the range of theta at and above the cut
score (with proper normalization) represents the probability of the student’s latent ability or the true score being at
or above that cut point.
If a student's estimated ability (theta) is below the cut score, the probability that the latent ability is at or above the cut score estimates the chance that this student has been misclassified as below the cut score, and 1 minus that probability estimates the chance that the student has been correctly classified as below the cut score. Using this logic, we can define the various classification probabilities.
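As an illustration only, the sketch below evaluates the likelihood of theta over a grid for a set of dichotomous Rasch items, normalizes it, and integrates it at and above a cut score; the responses, difficulties, and cut value are made up, and this is not the operational implementation.

```python
# Minimal sketch (assumed inputs): probability that a student's latent ability
# is at or above a theta cut score, from the normalized Rasch likelihood.
import numpy as np

def p_correct(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def prob_at_or_above_cut(responses, b, cut):
    """responses: 0/1 item scores; b: Rasch difficulties; cut: theta cut score."""
    grid = np.linspace(-6, 6, 1201)
    dx = grid[1] - grid[0]
    p = p_correct(grid[:, None], b[None, :])                 # grid points x items
    like = np.prod(np.where(responses[None, :] == 1, p, 1.0 - p), axis=1)
    post = like / (like.sum() * dx)                          # normalize over the grid
    return post[grid >= cut].sum() * dx                      # mass at/above the cut

resp = np.array([1, 0, 1, 1, 0, 1])          # illustrative item scores
diff = np.array([-1.2, -0.4, 0.0, 0.3, 0.9, 1.5])  # illustrative difficulties
print(prob_at_or_above_cut(resp, diff, cut=0.25))
```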
In Exhibit 3.4.4.1, accurate classifications occur when the classification decision made on the basis of the hypothetical true score agrees with the decision made on the basis of the form actually taken. Misclassifications, false positives and false negatives, occur when a student's true score classification differs from the classification based on the student's observed score (e.g., a student whose true score results in a classification as Proficient, but whose observed score results in an incorrect classification as Partially Proficient). N11 represents the expected number of students who are truly above the cut score; N01 represents the expected number of students falsely above the cut score; N00 represents the expected number of students truly below the cut score; and N10 represents the expected number of students falsely below the cut score.
19 Standard 2.16 – When a test or combination of measures is used to make classification decisions, estimates should be provided of the percentage of students who would be classified in the same way on two replications of the procedure.
Exhibit 3.4.4.1: Classification Accuracy
                                     Classification on a Form Actually Taken
Classification on True Score         At or Above the Cut Score       Below the Cut Score
At or Above the Cut Score            N11 (Truly above the cut)       N10 (False negative)
Below the Cut Score                  N01 (False positive)            N00 (Truly below the cut)
3.4.5 CLASSIFICATION CONSISTENCY
As shown in Exhibit 3.4.5.1, consistent classification occurs when two forms agree on the classification of a student as either at or above or below the performance standard, whereas inconsistent classification occurs when the two decisions made on the basis of results from the two forms differ.
Exhibit 3.4.5.1: Classification Consistency
                                       Classification on the 2nd Form Taken
Classification on the 1st Form Taken   At or Above the Cut Score          Below the Cut Score
At or Above the Cut Score              N11 (Consistently above the cut)   N10 (Inconsistent)
Below the Cut Score                    N01 (Inconsistent)                 N00 (Consistently below the cut)
3.4.6 CLASSIFICATION ACCURACY AND CONSISTENCY ESTIMATES
Exhibit 3.4.6.1 presents the classification accuracy and consistency indices for the fall 2017 administration of OST. Accuracy classifications are slightly higher than the consistency classifications at almost all performance standards. The consistency rate can be somewhat lower than the accuracy rate because the consistency index assumes two test scores, both of which include measurement error, while the accuracy index assumes a single test score plus the true score, which does not include measurement error. However, the accuracy index is lower than the consistency rate in Geometry and Integrated Mathematics II, especially at the Basic cut. This may indicate that the Geometry and Integrated Mathematics II tests are especially difficult for low-achieving students.
Exhibit 3.4.6.1: Decision Accuracy and Consistency Indices for Performance Standards
Grade / Course | Accuracy: Basic, Proficient, Accelerated, Advanced | Consistency: Basic, Proficient, Accelerated, Advanced
ELA
Grade 3 0.88 0.90 0.92 0.96 0.84 0.86 0.89 0.94
ELA I 0.88 0.93 0.98 0.99 0.84 0.89 0.97 0.99
ELA II 0.89 0.94 0.98 0.99 0.85 0.91 0.98 0.99
Mathematics
Algebra 0.85 0.92 0.99 1.00 0.79 0.89 0.98 1.00
Geometry 0.66 0.79 0.95 0.99 0.77 0.81 0.96 0.99
Integrated Math I 0.87 0.94 0.99 1.00 0.83 0.91 0.98 1.00
Integrated Math II 0.69 0.84 0.96 0.99 0.78 0.84 0.96 0.99
Science
Biology 0.82 0.90 0.98 0.99 0.76 0.86 0.97 0.98
Physical Science 0.79 0.90 0.99 1.00 0.71 0.85 0.99 1.00
Social Studies
American Government 0.93 0.93 0.95 0.97 0.90 0.89 0.93 0.96
American History 0.87 0.90 0.98 0.99 0.83 0.85 0.97 0.98
3.4.7 RELIABILITY FOR SUBGROUPS IN THE POPULATION
Exhibits 3.4.7.1–3.4.7.3 show Cronbach’s alpha estimates of the internal consistency reliability of OST assessments for each of the subgroups: gender (females and males), ethnicity (African American, Asian, Hispanic/Latino, American Indian, White, and students reporting multiple ethnicities), as well as students’ Limited English Proficient (LEP) and Individualized Education Program (IEP) status. 20 Each of the ethnicity subgroups was composed of approximately equal numbers of males and females. As Exhibits 3.4.7.1–3.4.7.3 indicate, internal consistency reliabilities are generally consistent across subgroups, indicating that OST assessments measure an underlying achievement dimension that is in common across all subgroups. Where group reliabilities are attenuated, there is a corresponding decrease in test score variance for the subgroup, likely indicating that the attenuation of reliability is due to restriction of range in the subgroup.
20 Standard 2.11 – Test publishers should provide estimates of reliability/precision as soon as feasible for each relevant subgroup for which the test is recommended.
Exhibit 3.4.7.1: Internal Consistency Reliability by Subgroup Fall 2017 — Grade 3 and High School ELA Assessments
Subgroup | Grade 3: N, Reliability, Variance | ELA I: N, Reliability, Variance | ELA II: N, Reliability, Variance
All Students 127,757 0.84 1882 41,904 0.86 555 35,466 0.86 677
Female 62,751 0.84 1865 18,094 0.86 547 16,371 0.86 636
Male 64,856 0.84 1887 23,036 0.85 554 18,418 0.86 700
Unknown Gender 150 0.78 1574 774 0.84 489 677 0.85 632
African American 23,117 0.75 1354 14,167 0.82 465 11,482 0.81 539
Asian 3,083 0.87 2360 764 0.87 604 728 0.88 751
Hispanic/Latino 5,356 0.79 1523 2,207 0.83 482 1,778 0.84 620
American Indian/Alaskan 158 0.83 1866 59 0.87 612 55 0.87 678
White 86,241 0.84 1793 21,694 0.87 563 18,860 0.87 701
Multi-Ethnic 9,640 0.82 1744 2,762 0.86 571 2,324 0.85 634
LEP 5,079 0.65 1036 2,843 0.79 416 2,282 0.77 466
IEP 14,830 0.76 1467 9,906 0.78 393 7,058 0.76 456
Exhibit 3.4.7.2: Internal Consistency Reliability by Subgroup Fall 2017 — High School Mathematics Assessments
Subgroup | Algebra: N, Reliability, Variance | Geometry: N, Reliability, Variance | Integrated Math I: N, Reliability, Variance | Integrated Math II: N, Reliability, Variance
All Students 47,341 0.78 473 34,514 0.73 633 4,822 0.81 586 4,389 0.80 813
Female 22,249 0.76 440 16,700 0.71 583 2,342 0.81 564 2,092 0.79 736
Male 24,121 0.78 500 17,113 0.75 681 2,441 0.82 607 2,266 0.82 887
Unknown Gender 971 0.79 487 701 0.75 653 39 0.72 448 31 0.74 618
African American 14,667 0.69 373 10,499 0.56 427 2,326 0.74 429 1,988 0.65 487
Asian 571 0.92 1249 593 0.90 1367 213 0.83 641 184 0.88 1384
Hispanic/Latino 2,599 0.74 410 1,826 0.67 529 76 0.77 469 69 0.83 884
American Indian/Alaskan 74 0.67 311 46 0.70 465 14 0.61 296 8 0.81 1007
White 26,117 0.79 486 19,168 0.77 683 1,625 0.86 761 1,672 0.86 1086
Multi-Ethnic 3,032 0.75 429 2,120 0.69 562 565 0.79 540 466 0.71 568
LEP 2,005 0.71 402 1,515 0.69 585 812 0.72 415 619 0.67 552
IEP 9,283 0.61 313 6,346 0.50 412 664 0.70 426 534 0.59 465
Exhibit 3.4.7.3: Internal Consistency Reliability by Subgroup Fall 2017 — High School Social Studies and Science Assessments
Subgroup | Biology: N, Reliability, Variance | Physical Science: N, Reliability, Variance | American Government: N, Reliability, Variance | American History: N, Reliability, Variance
All Students 24,772 0.79 460 897 0.51 197 32,259 0.92 385 20,912 0.85 397
Female 11,992 0.78 430 435 0.48 185 15,625 0.91 355 10,329 0.83 326
Male 12,467 0.80 492 444 0.53 206 16,221 0.92 415 10,258 0.87 467
Unknown Gender 313 0.65 284 18 0.62 288 413 0.87 233 325 0.87 447
African American 8,825 0.56 238 448 0.43 181 7,154 0.87 248 7,534 0.77 266
Asian 391 0.87 793 6 0.09 97 749 0.92 463 348 0.88 472
Hispanic/Latino 1,249 0.67 301 39 0.05 94 1,313 0.89 272 1,130 0.80 305
American Indian/Alaskan 39 0.73 401 0 NA NA 47 0.91 335 28 0.85 379
White 12,544 0.84 559 325 0.58 220 21,092 0.91 364 10,431 0.88 452
Multi-Ethnic 1,620 0.77 437 66 0.53 183 1,771 0.91 353 1,348 0.84 375
LEP 1,467 0.46 199 41 -0.04 88 1,100 0.80 168 1,404 0.75 250
IEP 4,292 0.54 239 196 0.24 136 4,541 0.85 223 4,145 0.75 263
3.4.8 RELIABILITY FOR SUBSCALES
Internal consistency reliability estimates associated with the subscales of the fall 2017 operational forms are presented in Exhibits 3.4.8.1–3.4.8.4. As indicated in the exhibits, subscale reliabilities are generally moderate in magnitude, as expected for subscales of the length used in OST. The very low subscale reliabilities for writing in ELA I and ELA II and for probability in Integrated Mathematics II are due to skewed subscore distributions. For example, on the writing subscales, 70% of students earned a raw score of 0 or 1; on the probability subscale, 57% of students earned a raw score of 0 or 1.
Exhibit 3.4.8.1: Subscale Reliabilities — Fall 2017 ELA
Grade / Course | Reading Informational Text | Reading Literary Text | Writing
Grade 3 0.68 0.68 0.81
ELA I 0.70 0.58 0.73
ELA II 0.71 0.61 0.71
Exhibit 3.4.8.2: Subscale Reliabilities — Fall 2017 Mathematics
Algebra
Functions | Modeling and Reasoning | Number, Quantities, Equations and Expressions | Statistics
0.58 | 0.67 | 0.54 | 0.41
Geometry
Congruence & Proof | Circles | Modeling and Reasoning | Probability | Similarity & Trigonometry
0.44 | NA | 0.52 | 0.20 | 0.33
Integrated Mathematics I
Algebra | Geometry | Modeling and Reasoning | Number & Quantity/Functions | Statistics
0.48 | 0.25 | 0.69 | 0.58 | 0.46
Integrated Mathematics II
Functions | Geometry | Modeling and Reasoning | Number, Quantities, Equations and Expressions | Probability
0.19 | 0.52 | 0.56 | 0.55 | 0.24
NA: Negative reliability due to large SEM and small variance of scale scores.
Exhibit 3.4.8.3: Subscale Reliabilities — Fall 2017 Biology and Physical Science
Biology
Heredity | Evolution | Diversity and Interdependence of Life | Cells
0.31 | 0.41 | 0.48 | 0.49
Physical Science
Study of Matter | Energy and Waves | Forces and Motion | The Universe
0.01 | 0.11 | 0.14 | -0.09
NA: Negative reliability due to large SEM and small variance of scale scores.
Exhibit 3.4.8.4: Subscale Reliabilities — Fall 2017 Social Studies
American Government
Historic Documents Principles and Structure Ohio/Policy/Economy
0.82 0.83 0.69
American History
Skills and Documents 1877-1945 1945-Present
0.66 0.72 0.59
3.4.9 SUBSCALE INTERCORRELATION
The observed correlations among reporting category scores are presented in Exhibits 3.4.9.1–3.4.9.9.
Exhibit 3.4.9.1 Subscale Intercorrelations — Fall 2017 ELA
Grade / Course
Subscale Observed Correlation
RI RL
Grade 3 RL 0.66
W 0.45 0.44
ELA I RL 0.58
W 0.55 0.50
ELA II RL 0.60
W 0.55 0.50
Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing
Exhibit 3.4.9.2 Subscale Intercorrelations — Fall 2017 Algebra
Grade Subscale Observed Correlation
F MR NQEE
Algebra
MR 0.77
NQEE 0.57 0.69
S 0.50 0.79 0.48
Note: F = Functions; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; S = Statistics
Exhibit 3.4.9.3 Subscale Intercorrelations — Fall 2017 Geometry
Grade Subscale Observed Correlation
CP C MR P
Geometry C 0.40
MR 0.70 0.60
P 0.44 0.36 0.73
ST 0.52 0.46 0.63 0.48
Note: CP = Congruency & Proof; C = Circles; MR = Model Reasoning; P = Probability; ST = Similarity & Trigonometry
Exhibit 3.4.9.4 Subscale Intercorrelations — Fall 2017 Integrated Mathematics I
Grade Subscale Observed Correlation
A G MR NQF
Integrated Math I
G 0.48
MR 0.72 0.61
NQF 0.59 0.50 0.77
S 0.52 0.45 0.81 0.55
Note: A = Algebra; G = Geometry; MR = Model Reasoning; NQF = Number & Quantity/Functions; S = Statistics
Exhibit 3.4.9.5 Subscale Intercorrelations — Fall 2017 Integrated Mathematics II
Grade Subscale Observed Correlation
F G MR NQEE
Integrated Math II
G 0.47
MR 0.65 0.69
NQEE 0.50 0.52 0.63
P 0.46 0.50 0.75 0.49
Note: F = Functions; G = Geometry; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; P = Probability
Exhibit 3.4.9.6 Subscale Intercorrelations — Fall 2017 Biology
Grade Subscale Observed Correlations
BS-A BS-B BS-C
Biology
BS-B 0.45
BS-C 0.49 0.50
BS-D 0.48 0.47 0.51
Note: BS-A = Heredity; BS-B = Evolution; BS-C = Diversity and Interdependence of Life; BS-D = Cells
Exhibit 3.4.9.7 Subscale Intercorrelations — Fall 2017 Physical Science
Grade Subscale Observed Correlations
PS-A PS-B PS-C
Physical Science
PS-B 0.23
PS-C 0.26 0.17
PS-D 0.18 0.15 0.12
Note: PS-A = Study of Matter; PS-B = Energy & Waves; PS-C = Forces & Motions; PS-D = The Universe
Exhibit 3.4.9.8 Subscale Intercorrelations and Reliability Estimates — Fall 2017 American Government
Grade Subscale
Observed Correlations
AGA AGB
American Government AGB 0.78
AGC 0.71 0.73
Note: AGA = Historic Documents; AGB = Principles & Structures; AGC = Ohio/Policy/Economy
Exhibit 3.4.9.9 Subscale Intercorrelations and Reliability Estimates — Fall 2017 American History
Grade Subscale
Observed Correlations
AHA AHB
American History AHB 0.67
AHC 0.58 0.62
Note: AHA = Skills & Documents; AHB = 1877–1945; AHC = 1945–Present
4. SUMMARY OF SPRING 2018 OPERATIONAL TEST ADMINISTRATION
The following OST assessments were administered in spring 2018:
• ELA in grades 3–8, and high school EOC assessments in ELA I and ELA II
• Mathematics in grades 3–8 and high school EOC assessments in Algebra I, Geometry (or alternatively
Integrated Mathematics I and Integrated Mathematics II)
• Science in grades 5, grade 8, and high school EOC assessments in Biology and Physical Science (the Physical
Science assessment is being phased out; only a small number of students who had previously failed the
exam participated in the assessment)
• Social Studies high school EOC assessments in American Government and American History
The third operational administration of the full system of OST assessments in ELA and mathematics took place in spring 2018. The ELA testing window ran from March 26 through April 27, and the mathematics, science, and social studies testing window ran from April 2 through May 11. Item parameters for all of the ELA assessments were freely calibrated following the spring administration. The mean-mean equating procedure was used to link the spring 2017 OST ELA item parameters to the OST assessments' scale, which was established following the spring 2016 administration.
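Because the Rasch family fixes a common item discrimination, the mean-mean linking step reduces to shifting the newly calibrated difficulties by the difference between the anchor items' mean difficulties on the base scale and in the new calibration. The sketch below illustrates that shift with made-up anchor values; it is not the operational equating code.

```python
# Minimal sketch (illustrative anchor values, not operational code):
# mean-mean linking constant for Rasch difficulties, applied to place a
# freely calibrated item bank on the base scale.
import numpy as np

b_new_anchor = np.array([-0.42, 0.15, 0.88, -1.10])   # anchors, new free calibration
b_base_anchor = np.array([-0.30, 0.22, 0.95, -1.02])  # same anchors, base-scale values
shift = b_base_anchor.mean() - b_new_anchor.mean()     # mean-mean linking constant

b_new_all = np.array([-0.42, 0.15, 0.88, -1.10, 0.40, 1.25])  # full new calibration
b_linked = b_new_all + shift                            # difficulties on the base scale
print(shift, b_linked)
```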
The mathematics, science, and social studies tests were scored using the pre-equated parameters calibrated following the spring 2016 administration of Ohio’s State Tests in those subject areas.
This section summarizes the operational test results for the spring 2018 administration of Ohio’s State Tests. Detailed descriptions of procedures for item and test development, test administration, scaling, equating, and scoring are presented in subsequent sections.
4.1 STUDENT POPULATION AND PARTICIPATION
Assessment data for operational analyses included Ohio public school students who met minimum attemptedness requirements for scoring and reporting. The demographic composition of students taking OST assessments is presented in Exhibits 4.1.1–4.1.3 by assessment and subgroup.21 The number of students participating in each assessment by test mode is presented in Appendix C.
Exhibit 4.1.1: Number of Students Participating in Spring 2018 — ELA Online and Paper-Pencil
21 Standard 1.8 – The composition of any sample of students from which validity evidence is obtained should be described in as much detail as is practical and permissible, including major relevant socio-demographic and developmental characteristics.
Grade / Course | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | LEP | IEP
Grade 3 126,540 62,181 64,198 161 22,913 3,039 5,335 157 85,292 9,640 5,239 16,500
Grade 4 126,494 61,924 64,397 173 20,774 3,121 5,225 164 87,972 9,106 4,067 17,602
Grade 5 127,957 62,630 65,145 182 21,365 3,129 5,071 188 89,186 8,876 3,565 17,804
Grade 6 126,408 61,828 64,393 187 20,559 3,019 5,096 174 89,046 8,375 3,248 17,204
Grade 7 124,315 60,896 63,166 253 18,947 3,068 4,652 160 89,417 7,919 2,951 16,465
Grade 8 125,288 61,084 64,012 192 19,194 2,985 4,538 158 90,302 7,920 2,980 16,599
ELA I 149,393 71,718 77,090 585 27,610 3,487 5,948 204 102,439 9,385 5,613 22,036
ELA II 139,973 68,577 70,824 572 23,745 3,367 5,309 192 98,736 8,353 4,524 19,191
Exhibit 4.1.2: Number of Students Participating in Spring 2018 — Mathematics Online and Paper-Pencil
Grade / Course | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | LEP | IEP
Grade 3 127,422 62,613 64,625 184 23,023 3,027 5,370 157 86,015 9,661 5,244 16,649
Grade 4 125,922 61,718 64,023 181 20,781 3,035 5,227 161 87,515 9,069 4,061 17,605
Grade 5 126,613 62,075 64,340 198 21,321 2,932 5,061 191 88,167 8,799 3,565 17,784
Grade 6 124,820 61,145 63,461 214 20,540 2,814 5,027 174 87,824 8,296 3,258 17,252
Grade 7 119,692 58,780 60,652 260 18,692 2,618 4,603 154 85,806 7,663 2,939 16,415
Grade 8 97,465 47,014 50,266 185 16,670 1,946 4,060 129 67,923 6,597 2,834 16,011
Algebra 144,489 69,793 74,065 631 25,030 3,124 5,833 206 101,511 8,494 4,044 20,429
Geometry 127,017 62,944 63,551 522 20,224 2,860 4,888 180 91,575 7,045 3,092 16,009
Int Math I 12,228 5,887 6,287 54 3,823 496 302 25 6,255 1,305 1,447 1,893
Int Math II 10,536 5,145 5,332 59 3,057 435 268 27 5,757 970 1,008 1,557
Exhibit 4.1.3: Number of Students Participating in Spring 2018 — Science and Social Studies Online and Paper-Pencil
Grade / Course | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | LEP | IEP
Science
Grade 5 127,869 62,614 65,059 196 21,265 3,139 5,082 188 89,184 8,868 3,570 17,766
Grade 8 126,202 61,719 64,281 202 19,151 3,001 4,590 155 91,211 7,905 2,988 16,558
Biology 135,480 67,025 68,031 424 21,995 3,230 4,884 186 97,025 7,895 3,888 17,665
Physical Science 484 245 235 4 196 4 20 1 216 40 25 94
Social Studies
American Government 87,077 43,010 43,653 414 14,369 1,568 2,930 122 62,876 5,021 2,158 10,474
American History 126,208 62,431 63,375 402 20,531 2,491 4,691 163 90,768 7,330 3,919 17,314
4.2 SUMMARY OF OVERALL STUDENT PERFORMANCE FOR SPRING 2018
The state summary results for the average scale scores, standard deviations, standard errors of measurement, minimum and maximum observed scale scores, and reliability of the overall test are presented in Exhibit 4.2.1.
Exhibit 4.2.1: Spring 2018 Operational Test Summary Statistics
Operational Summary Statistics
Grade / Course | N-count | Max Obtained Scaled Score | Min Obtained Scaled Score | Scale Score Mean | Scale Score Standard Deviation | Scale Score SEM | Reliability
ELA
Grade 3 126,540 863 545 710.88 48.41 17.78 0.87
Grade 4 126,494 846 549 715.11 43.83 16.33 0.86
Grade 5 127,957 848 552 719.40 46.20 16.54 0.87
Grade 6 126,408 851 555 707.69 41.83 12.99 0.90
Grade 7 124,315 833 568 710.20 40.01 11.94 0.91
Grade 8 125,288 805 586 700.32 30.45 9.62 0.90
ELA I 149,393 800 606 707.94 30.52 8.78 0.92
ELA II 139,973 808 597 703.19 30.96 9.27 0.91
Mathematics
Grade 3 127,422 818 587 719.56 47.92 12.86 0.93
Grade 4 125,922 835 605 728.85 49.05 12.73 0.93
Grade 5 126,613 804 624 711.09 39.20 9.90 0.94
Grade 6 124,820 790 616 708.74 37.46 9.29 0.94
Grade 7 119,692 806 605 708.49 40.74 10.55 0.93
Grade 8 97,465 774 633 701.91 27.78 7.49 0.93
Algebra 144,489 814 618 703.27 34.14 9.78 0.92
Geometry 127,017 810 604 693.30 41.93 11.43 0.93
Integrated Math I 12,228 814 618 690.82 35.64 10.40 0.91
Integrated Math II 10,536 813 594 684.48 37.95 11.79 0.90
Science
Grade 5 127,869 845 559 720.45 46.43 13.82 0.91
Grade 8 126,202 868 575 718.20 44.71 13.95 0.90
Biology 135,480 823 617 715.36 29.09 9.14 0.90
Physical Science 484 754 634 679.34 18.24 10.73 0.65
Social Studies
American Government 87,077 774 642 712.61 16.68 5.41 0.89
American History 126,208 800 619 716.79 26.15 7.62 0.92
The percentage of students in each performance level by grade and content area, as well as the percent of students at or above Proficient are presented in Exhibit 4.2.2.
Exhibit 4.2.2: Spring 2018 Percentage of Students in Performance Levels
Grade / Course | Number Tested | % Limited | % Basic | % Proficient | % Accelerated | % Advanced | % At or Above Proficient
ELA
Grade 3 126,540 20 20 18 18 24 60
Grade 4 126,494 16 18 22 22 22 66
Grade 5 127,957 13 17 19 28 22 70
Grade 6 126,408 16 24 24 21 15 59
Grade 7 124,315 15 21 24 21 18 64
Grade 8 125,288 27 19 31 15 8 54
ELA I 149,393 20 20 30 14 18 62
ELA II 139,973 20 22 33 15 10 59
Mathematics
Grade 3 127,422 23 10 20 19 27 67
Grade 4 125,922 19 9 18 26 29 73
Grade 5 126,613 27 10 26 18 19 63
Grade 6 124,820 25 16 23 17 19 59
Grade 7 119,692 27 14 22 23 15 59
Grade 8 97,465 31 15 33 14 7 54
Algebra 144,489 27 21 26 18 9 53
Geometry 127,017 40 17 19 16 9 44
Integrated Math I 12,228 45 17 21 13 6 40
Integrated Math II 10,536 48 23 14 11 5 30
Science
Grade 5 127,869 11 20 20 24 25 69
Grade 8 126,202 15 17 23 29 15 68
Biology 135,480 15 15 33 10 26 70
Physical Science 484 58 38 9 1 0 10
Social Studies
American Government 87,077 6 14 59 16 6 81
American History 126,208 11 17 33 17 22 72
4.3 STUDENT PERFORMANCE BY SUBGROUP FOR SPRING 2018
Exhibits 4.3.1–4.3.4 present the percentage of students in each grade and subject at each performance level by gender and ethnicity (female, male, African American, Asian, Hispanic/Latino, American Indian/Alaskan, White, and Multiple Ethnicities) and by other demographic characteristics, such as Individualized Education Program (IEP) and Limited English Proficient (LEP) status. Performance of African American and Hispanic students lags considerably behind performance of White and Asian students, and this performance gap continues to be a concern.
Exhibit 4.3.1: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 — ELA
Percentage of Students in Each Grade and Subject at Each Performance Level
Grade / Course | Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | IEP | LEP
Grade 3
Limited 20 17 22 29 38 13 29 19 14 23 48 48
Basic 20 19 21 30 26 15 26 20 18 23 25 29
Proficient 18 18 19 19 16 14 18 18 19 18 13 14
Accelerated 18 18 17 14 11 19 13 18 20 17 8 7
Advanced 24 27 21 9 8 39 13 24 29 19 6 3
Grade 4
Limited 16 14 18 31 32 11 24 20 11 20 47 49
Basic 18 18 19 25 27 11 23 14 16 21 24 27
Proficient 22 22 22 20 21 16 22 25 22 22 16 15
Accelerated 22 22 22 14 14 23 17 19 25 19 9 7
Advanced 22 24 19 10 7 40 13 22 26 17 4 2
Grade 5
Limited 13 10 15 24 29 10 19 15 8 16 43 47
Basic 17 16 18 27 27 9 24 16 14 20 28 30
Proficient 19 19 20 22 21 13 21 22 19 21 15 15
Accelerated 28 29 27 16 17 28 24 30 31 25 10 7
Advanced 22 25 20 12 6 40 12 16 27 17 4 2
Grade 6
Limited 16 12 20 36 35 12 23 19 11 20 52 58
Basic 24 22 27 32 33 13 30 30 22 30 31 31
Proficient 24 24 23 19 19 19 23 24 25 23 11 9
Accelerated 21 23 19 10 9 23 15 21 24 17 4 3
Advanced 15 19 12 4 4 34 8 6 18 11 1 1
Grade 7
Limited 15 12 19 18 34 11 24 18 11 20 51 60
Basic 21 19 23 25 31 10 26 25 19 26 30 28
Proficient 24 24 24 27 20 17 24 23 25 23 13 9
Accelerated 21 23 20 22 10 23 16 19 24 18 5 3
Advanced 18 22 15 9 4 38 10 16 22 13 2 1
Grade 8
Limited 27 22 32 48 55 17 40 28 20 33 71 79
Basic 19 18 20 17 21 11 21 17 19 21 17 14
Proficient 31 33 29 24 18 27 26 37 34 29 10 6
Accelerated 15 17 13 9 5 22 9 13 18 12 2 1
Advanced 8 9 6 3 1 23 3 4 9 6 1 0
ELA I
Limited 20 16 24 49 42 20 33 23 13 25 52 66
Basic 20 18 21 26 29 11 23 24 17 23 30 24
Proficient 30 30 29 22 23 20 27 34 32 29 16 10
Accelerated 14 15 13 5 6 14 9 12 17 11 2 1
Advanced 18 22 14 2 4 36 9 10 22 13 1 0
ELA II
Limited 20 16 24 48 43 20 35 27 14 25 55 70
Basic 22 21 23 29 30 15 24 23 20 24 30 23
Proficient 33 35 31 16 22 26 29 32 36 32 14 9
Accelerated 15 17 14 4 5 18 9 13 18 12 2 0
Advanced 10 12 9 4 2 22 4 7 12 7 1 0
Exhibit 4.3.2: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 —
Mathematics
Percentage of Students in Each Grade and Subject at Each Performance Level
Grade / Course | Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | IEP | LEP
Grade 3
Limited 23 22 23 34 46 11 31 31 16 29 52 44
Basic 10 11 10 20 14 5 13 8 9 12 13 14
Proficient 20 21 20 29 20 14 24 21 21 22 17 22
Accelerated 19 20 19 9 12 18 17 17 22 18 10 12
Advanced 27 26 28 10 9 53 15 24 33 20 8 8
Grade 4
Limited 19 19 19 36 43 8 26 23 12 25 49 44
Basic 9 9 8 10 13 5 12 10 7 11 12 13
Proficient 18 19 17 25 20 11 21 21 18 21 17 20
Accelerated 26 26 25 17 16 22 24 24 28 24 14 15
Advanced 29 27 31 11 8 55 17 22 35 19 7 8
Grade 5
Limited 27 26 28 48 57 13 38 35 19 35 60 60
Basic 10 11 10 11 13 6 13 13 10 12 12 13
Proficient 26 28 25 21 20 18 28 26 28 27 18 18
Accelerated 18 18 18 13 7 20 13 12 21 15 6 5
Advanced 19 17 20 7 4 43 9 15 23 12 4 4
Grade 6
Limited 25 23 26 49 53 13 35 31 17 32 63 65
Basic 16 17 16 24 20 9 20 16 15 20 17 17
Proficient 23 24 22 17 17 16 23 25 25 23 13 12
Accelerated 17 18 17 6 7 17 13 18 20 13 5 4
Advanced 19 18 19 4 4 44 9 10 23 12 3 3
Grade 7
Limited 27 26 28 39 55 15 38 36 20 36 67 68
Basic 14 14 14 22 18 8 17 15 13 16 14 15
Proficient 22 22 21 22 17 17 22 19 23 22 11 11
Accelerated 23 24 22 15 9 26 16 21 27 17 6 5
Advanced 15 14 15 3 3 35 7 8 17 9 2 2
Grade 8
Limited 31 29 34 57 58 17 40 40 24 38 68 67
Basic 15 15 15 11 16 9 17 21 15 17 14 14
Proficient 33 34 31 25 20 25 30 24 36 31 14 15
Accelerated 14 15 13 6 5 20 10 11 17 10 3 3
Advanced 7 7 7 2 2 28 3 4 8 5 1 2
Algebra
Limited 27 24 29 55 54 12 43 30 19 32 65 65
Basic 21 21 21 27 26 11 24 26 20 24 22 22
Proficient 26 28 24 13 17 19 21 23 29 24 10 10
Accelerated 18 20 17 5 5 25 10 17 22 14 3 3
Advanced 9 8 9 1 1 34 3 5 10 6 1 1
Geometry
Limited 40 38 41 79 73 19 58 47 31 49 83 78
Basic 17 18 16 15 15 10 17 20 18 17 10 12
Proficient 19 20 18 5 9 16 14 16 22 17 5 7
Accelerated 16 17 16 3 4 23 8 11 19 12 2 3
Advanced 9 8 10 1 1 32 3 7 10 6 1 1
Integrated Math I
Limited 45 41 49 76 69 52 40 60 28 58 80 84
Basic 17 17 16 17 19 13 15 24 15 18 12 12
Proficient 21 23 19 6 13 12 26 8 27 19 9 6
Accelerated 13 14 12 0 3 14 15 8 21 6 2 1
Advanced 6 6 6 2 0 10 4 0 10 3 0 0
Integrated Math II
Limited 48 47 49 85 74 47 49 56 32 59 82 85
Basic 23 24 22 10 21 22 22 30 24 25 15 15
Proficient 14 16 13 5 6 12 18 7 19 10 3 3
Accelerated 11 11 11 5 2 10 9 7 17 6 1 0
Advanced 5 5 6 0 0 10 5 0 8 3 1 0
Exhibit 4.3.3: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 —
Science
Percentage of Students in Each Grade and Subject at Each Performance Level
Grade / Course | Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | LEP | IEP
Grade 5
Limited 11 11 11 26 30 7 17 12 6 14 36 38
Basic 20 22 19 28 34 11 29 23 17 25 32 37
Proficient 20 21 19 18 19 15 22 26 20 22 16 15
Accelerated 24 24 24 19 13 25 19 25 27 21 10 8
Advanced 25 22 27 9 5 42 13 15 30 18 6 3
Grade 8
Limited 15 14 17 30 39 8 23 17 9 20 45 51
Basic 17 18 16 25 28 10 24 21 15 22 27 27
Proficient 23 25 22 22 20 16 25 27 24 25 17 14
Accelerated 29 30 29 18 11 33 21 28 34 24 9 7
Advanced 15 13 16 6 2 32 6 9 18 10 2 1
Biology
Limited 15 13 17 39 35 10 26 16 10 20 43 48
Basic 15 15 15 25 26 11 21 17 12 19 28 27
Proficient 33 36 31 29 30 23 32 41 34 34 24 21
Accelerated 10 11 10 4 4 10 7 9 12 8 3 2
Advanced 26 25 28 7 6 45 14 18 31 20 4 2
Physical Science
Limited 58 61 54 125 68 50 60 100 49 60 81 76
Basic 38 41 36 25 35 50 30 0 42 33 22 20
Proficient 9 7 11 25 6 0 5 0 12 10 5 0
Accelerated 1 0 2 0 1 0 5 0 1 0 0 4
Advanced 0 0 0 0 0 0 0 0 0 3 0 0
Exhibit 4.3.4: Percentage of Students at Each Performance Level by Gender and Ethnicity in Spring 2018 — Social
Studies
Percentage of Students in Each Grade and Subject at Each Performance Level
Grade / Course | Performance Level | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian/Alaskan | White | Multiple Ethnicities | LEP | IEP
American Government
Limited 6 5 6 19 13 6 12 10 3 8 21 25
Basic 14 14 15 26 27 13 23 19 11 18 37 37
Proficient 59 62 56 48 53 51 54 57 60 60 40 36
Accelerated 16 15 17 7 6 21 9 8 19 12 3 2
Advanced 6 5 7 1 1 10 2 6 7 3 1 0
American History
Limited 11 10 13 29 27 10 21 12 7 15 34 39
Basic 17 18 16 31 29 15 24 17 14 21 33 34
Proficient 33 36 30 28 30 28 32 31 34 34 25 22
Accelerated 17 17 17 7 9 17 12 17 19 15 5 3
Advanced 22 19 25 7 6 30 11 24 27 17 4 2
4.4 CLASSICAL ITEM ANALYSIS
Classical item statistics for multiple-choice (MC) and constructed-response (CR) items are calculated based on all student responses and used to monitor item behavior and investigate irregularities in item scoring throughout the testing window. Classical item analyses ensure that the items function as intended with respect to the underlying scales. AIR’s analysis program computed the required item and test statistics for each multiple-choice and constructed-response (CR) item to check the integrity of the item and to verify the appropriateness of the difficulty level of the item. Key statistics computed and examined include point biserial/polyserial correlations for item discrimination, biserial correlations for distractors for selected-response items, and proportion correct for item difficulty.
The point biserial/polyserial correlations indicate the extent to which each item differentiates between those students who possess the skills being measured and those who do not. In general, the higher the value, the better the item is able to differentiate between high- and low-achieving students. The point biserial/polyserial correlations are calculated as the correlation between the focal item score and the student's IRT-based ability estimate. For polytomous items, the mean total number correct for students scoring within each of the possible score categories is also computed. Items with point biserial/polyserial correlations less than 0.25 are flagged and further reviewed by test development experts. For multiple-choice items, the point biserial correlation for each of the distractor response options is also computed.
The proportion correct score is the average number of available points achieved by students on the item. For dichotomous items, this is simply the proportion of students responding correctly. For polytomous items, dividing the average score on the item by the points possible produces a comparable index. The proportion correct score is commonly referred to as the p-value.
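The sketch below illustrates, with assumed illustrative data, how the p-values and point biserial correlations described above could be computed and how the 0.25 flagging rule would be applied; it is not AIR's analysis program.

```python
# Minimal sketch (illustrative data, not operational code): classical item
# statistics from a students x items matrix of item scores.
import numpy as np

def item_p_values(scores, max_points):
    """Proportion of available points earned per item (the p-value)."""
    return scores.mean(axis=0) / np.asarray(max_points)

def point_biserials(scores, ability):
    """Correlation of each item score with an ability estimate (e.g., IRT theta)."""
    return np.array([np.corrcoef(scores[:, j], ability)[0, 1]
                     for j in range(scores.shape[1])])

# Illustrative data: 5 students, 3 items (last item worth 2 points).
scores = np.array([[1, 0, 2],
                   [1, 1, 1],
                   [0, 0, 0],
                   [1, 1, 2],
                   [0, 1, 1]])
theta = np.array([0.8, 0.2, -1.1, 1.5, -0.3])

p = item_p_values(scores, max_points=[1, 1, 2])
pb = point_biserials(scores, theta)
flagged = np.where(pb < 0.25)[0]   # items flagged for review per the 0.25 rule
print(p, pb, flagged)
```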
Exhibit 4.4.1 presents the average item p-values (proportion of total points) and average point biserial/polyserial correlations for the operational test items. As indicated in Exhibit 4.4.1, the mean difficulty of ELA and social studies items is relatively consistent across grade-level assessments. However, the average difficulty of mathematics and science items generally increases across grade levels and course assessments. The proportion of students responding correctly to test items in the end-of-course assessments in mathematics and science was relatively low. Mean point biserial correlations for the grade-level and end-of-course assessments are moderately high and generally consistent across assessments.
Exhibit 4.4.1: Average p-Value in Operational Test Administration
Grade / Course | Average p-Value | p-Value SD | Average Point-Biserial | Point-Biserial SD
ELA
Grade 3 0.53 0.19 0.45 0.11
Grade 4 0.56 0.19 0.45 0.14
Grade 5 0.59 0.17 0.47 0.10
Grade 6 0.55 0.19 0.46 0.14
Grade 7 0.55 0.18 0.46 0.13
Grade 8 0.54 0.18 0.46 0.14
ELA I 0.54 0.19 0.46 0.16
ELA II 0.51 0.19 0.45 0.15
Mathematics
Grade 3 0.61 0.20 0.51 0.08
Grade 4 0.54 0.15 0.52 0.08
Grade 5 0.48 0.18 0.53 0.10
Grade 6 0.54 0.20 0.52 0.11
Grade 7 0.51 0.15 0.53 0.12
Grade 8 0.48 0.23 0.47 0.09
Algebra 0.43 0.17 0.48 0.12
Geometry 0.36 0.18 0.51 0.11
Integrated Math I 0.37 0.16 0.49 0.14
Integrated Math II 0.33 0.17 0.46 0.11
Science
Grade 5 0.59 0.21 0.45 0.09
Grade 8 0.49 0.23 0.42 0.09
Biology 0.47 0.14 0.44 0.12
Physical Science 0.21 0.16 0.27 0.11
Social Studies
American Government 0.55 0.18 0.43 0.09
American History 0.56 0.12 0.46 0.10
4.5 ITEM RESPONSE THEORY ANALYSIS
Calibration is the process by which the statistical relationship between student responses and the underlying measurement construct is estimated. Traditional item response models assume a single underlying trait and assume that items are independent given that underlying trait. In other words, the models assume that given the value of the underlying trait, knowing the response to one item provides no information about responses to other items. This basic simplifying assumption allows the likelihood function for these models to take the relatively simple form of a product over items for a single student:
$$L(Z) = \prod_{j=1}^{n} P(z_j \mid \theta),$$
where Z represents the vector of item responses, and θ represents a student’s true ability.
Traditional item response models differ only in the form of the function P(Z). The one-parameter model (also known as the Rasch model) is used to calibrate dichotomously scored OST items and takes the form
$$P(x_j = 1 \mid \theta_k, b_j) = \frac{1}{1 + e^{-(\theta_k - b_j)}} = P_{j1}(\theta_k).$$
The b parameter is often called the location or difficulty parameter—the greater the value of b, the greater the difficulty of the item. The one-parameter model assumes that the probability of a correct response approaches zero as proficiency (θk – bj) decreases toward negative infinity. In other words, the one-parameter model assumes that no guessing occurs. In addition, the one-parameter model assumes that all items are equally discriminating.
For items that have multiple, ordered response categories (i.e., partial credit items), OST items are calibrated using the Rasch family Masters’ (1982) partial credit model. Under Masters’ model, the probability of a response in category i for an item with mj categories can be written as
$$P\!\left(x_j = i \mid \theta_k, b_{j0}, \ldots, b_{j,m_j-1}\right) = \frac{e^{\sum_{v=0}^{i}(\theta_k - b_{jv})}}{\sum_{g=0}^{m_j-1} e^{\sum_{v=0}^{g}(\theta_k - b_{jv})}}.$$
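A minimal sketch of the two response models follows, using the common convention (consistent with the information formulas in Section 3.4.2) that the category-0 numerator of the partial credit model is 1; the parameters passed in are illustrative, and this is not the operational calibration code.

```python
# Minimal sketch (illustrative parameters, not operational code): response
# probabilities under the Rasch model and Masters' partial credit model.
import numpy as np

def rasch_prob(theta, b):
    """P(correct) for a dichotomous Rasch item with difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def pcm_probs(theta, b_steps):
    """Category probabilities (0..m) for a partial-credit item with step
    parameters b_steps = [b_1, ..., b_m]; the category-0 term is fixed at 0."""
    cum = np.concatenate(([0.0], np.cumsum(theta - np.asarray(b_steps))))
    num = np.exp(cum - cum.max())          # subtract max for numerical stability
    return num / num.sum()

print(rasch_prob(0.3, b=-0.5))
print(pcm_probs(0.3, b_steps=[-0.8, 0.2, 1.1]))
```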
Item banks for ELA and mathematics were freely calibrated following the close of the spring 2016 testing window, centering the mean item difficulty of the operational test form to establish the OST assessments' scale. The linking constant necessary to bring the previously adopted performance standards onto the new OST scale was then computed. The procedures for calibration, equating, and scaling of tests are described in the Scaling and Equating section. Appendix E shows the operational item parameters for each test.
The tables in Appendix E provide Rasch and Masters' partial credit model item parameter estimates for the spring 2018 operational test items. Because OST is an online assessment system, bank item parameters were estimated based only on online responses to test items. Exhibits 4.5.1–4.5.4 present the mean and standard deviation of the Rasch item parameters by item type for each test for items administered online. Item types include traditional four-option multiple-choice (MC) items and machine-scored constructed-response (MSCR) items, for which students' constructed responses are scored electronically using explicit rubrics. MSCR includes natural-language items, grid items, and table items. In addition, there are technology-enhanced (TE) items, hotspot (HS) items, and writing text (ER) items. The average Rasch difficulty is presented for each scoring dimension of the writing prompt administered at each grade. As illustrated in Exhibits 4.5.1–4.5.4, selected-response items are, on average, less difficult than the constructed-response item types.
Exhibit 4.5.1: Rasch Summary Statistics by Item Type — ELA
Grade / Course | MC: N, Avg Rasch, SD | TE: N, Avg Rasch, SD | HS: N, Avg Rasch, SD | Writing Prompt Average Rasch: Org, Ev/Elab, Conv
Grade 3 20 -0.41 0.74 7 0.96 0.63 3 1.19 1.30 1.92 1.96 -0.31
Grade 4 18 -0.39 0.80 8 0.85 0.95 3 1.09 0.95 1.60 1.67 -0.01
Grade 5 17 -0.31 0.76 10 0.45 1.14 3 0.21 1.70 1.19 1.19 -1.75
Grade 6 20 -0.50 0.90 13 0.64 0.70 6 -0.28 1.01 0.25 0.47 -1.57
Grade 7 20 -0.63 0.96 16 0.71 0.59 6 0.19 1.01 0.77 0.87 -1.07
Grade 8 23 -0.28 0.86 11 0.45 0.76 6 0.22 1.20 0.72 1.21 -1.27
ELA I 22 -0.41 0.86 12 0.71 0.80 6 0.32 1.36 1.04 1.33 -1.43
ELA II 24 -0.40 0.88 11 0.76 0.71 6 0.36 1.00 0.85 1.11 -0.89
Exhibit 4.5.2: Rasch Summary Statistics by Item Type — Mathematics
Grade / Course | MC: N, Avg Rasch, SD | MSCR: N, Avg Rasch, SD | TE: N, Avg Rasch, SD
Grade 3 9 -1.13 1.27 31 0.07 1.24 3 0.20 1.27
Grade 4 10 -0.19 0.79 30 0.09 1.06 8 0.06 0.73
Grade 5 7 -0.96 1.06 35 0.13 1.07 4 0.42 0.91
Grade 6 12 -0.35 1.40 28 0.01 1.26 6 0.69 1.37
Grade 7 12 -0.66 0.75 30 0.37 0.80 3 0.69 0.80
Grade 8 18 -0.94 0.93 29 0.60 1.39 3 1.03 0.83
Algebra 23 -0.34 0.64 22 0.45 1.18 2 -0.72 0.26
Geometry 14 -0.45 0.75 30 0.95 0.91 7 0.36 1.62
Int Math I 21 -0.38 0.59 19 0.52 1.14 7 -0.50 1.01
Int Math II 23 -0.13 0.64 26 0.80 1.24 3 1.83 0.75
Exhibit 4.5.3: Rasch Summary Statistics by Item Type — Science
Grade / Course | MC: N, Avg Rasch, SD | MSCR: N, Avg Rasch, SD | TE: N, Avg Rasch, SD | HS: N, Avg Rasch, SD
Grade 5 26 -0.73 0.72 9 0.67 1.04 13 0.95 1.22 - - -
Grade 8 25 -0.84 0.82 13 0.60 1.18 12 0.98 0.98 - - -
Biology 21 -0.19 0.67 15 0.16 0.75 9 0.20 0.60 - - -
Physical Science 18 -0.80 0.57 9 -0.21 0.91 13 1.16 0.57 4 0.31 0.43
Exhibit 4.5.4: Rasch Summary Statistics by Item Type — Social Studies
Grade / Course | MC: N, Avg Rasch, SD | MSCR: N, Avg Rasch, SD | TE: N, Avg Rasch, SD
American Government 18 -0.35 0.91 5 0.15 0.72 21 0.31 0.69
American History 32 -0.28 0.51 5 -0.11 0.19 13 0.59 0.44
Item fit is evaluated via the mean square Infit and mean square Outfit statistics reported by WINSTEPS, which are based on weighted and unweighted standardized residuals for each item response, respectively. These residual statistics indicate the discrepancy between observed item responses and the predicted item responses based on the Rasch and Masters models. Both fit statistics have an expected value of 1. Values substantially greater than 1 indicate model underfit, while values substantially less than 1 indicate model overfit (Linacre, 2004). Items are flagged if Infit or Outfit values are less than 0.7 or greater than 1.3 (a computational sketch of these statistics follows Exhibit 4.5.5). Exhibit 4.5.5 summarizes the number of operational test items with Infit and Outfit statistics within the range of 0.7 to 1.3 and those items outside of that range. Appendix F shows OST assessments' performance standards on the theta and scale score metrics for the current operational test administrations, and Appendix G provides the raw-to-scale-score conversion tables. The operational field-test design of the spring 2018 ELA assessments makes it impossible to report the raw-to-scale-score transformation; therefore, only mathematics, science, and social studies conversion tables are provided in Appendix G for the spring 2018 administration.
Exhibit 4.5.5: Summary of Item Fit Statistics
Grade / Course | Infit: Below 0.7, Between 0.7–1.3, Above 1.3 | Outfit: Below 0.7, Between 0.7–1.3, Above 1.3
ELA
Grade 3 2 28 0 3 26 1
Grade 4 4 24 1 5 23 1
Grade 5 2 27 1 3 24 3
Grade 6 4 35 0 7 27 5
Grade 7 4 36 2 5 31 6
Grade 8 6 31 3 8 28 4
ELA I 4 34 2 6 29 5
ELA II 6 32 3 9 28 4
Mathematics
Grade 3 0 41 2 0 39 4
Grade 4 0 44 4 1 42 5
Grade 5 0 44 2 3 36 7
Grade 6 1 43 2 4 32 10
Grade 7 0 41 3 3 32 9
Grade 8 1 48 1 1 41 8
Algebra 1 44 2 8 32 7
Geometry 0 48 3 10 32 9
Integrated Math I 0 42 5 12 24 11
Integrated Math II 5 45 2 13 27 12
Science
Grade 5 1 45 2 4 39 5
Grade 8 1 48 1 4 42 4
Biology 0 45 0 2 42 1
Physical Science 7 29 8 12 24 8
Social Studies
American History 0 45 5 1 43 6
American Government 2 40 2 4 34 6
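As a point of reference only, the following minimal sketch (assumed; the report's operational values come from WINSTEPS) computes mean square Infit and Outfit for a single dichotomous Rasch item from made-up responses and ability estimates, and applies the 0.7–1.3 flagging rule described above Exhibit 4.5.5.

```python
# Minimal sketch (illustrative inputs, not WINSTEPS): mean square Infit and
# Outfit for one dichotomous Rasch item.
import numpy as np

def rasch_p(theta, b):
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def infit_outfit(x, theta, b):
    """x: 0/1 responses to one item; theta: ability estimates; b: item difficulty."""
    p = rasch_p(theta, b)
    w = p * (1.0 - p)                        # model variance of each response
    z2 = (x - p) ** 2 / w                    # squared standardized residuals
    outfit = z2.mean()                       # unweighted mean square
    infit = ((x - p) ** 2).sum() / w.sum()   # information-weighted mean square
    return infit, outfit

x = np.array([1, 0, 1, 1, 0, 1, 0, 1])
theta = np.array([0.5, -1.0, 1.2, 0.1, -0.6, 2.0, -1.5, 0.8])
infit, outfit = infit_outfit(x, theta, b=0.0)
flag = not (0.7 <= infit <= 1.3) or not (0.7 <= outfit <= 1.3)
print(infit, outfit, flag)
```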
4.6 RELIABILITY FOR SPRING 2018
Reliability refers to the consistency or precision of test scores and performance level classifications, and essentially addresses the question of how likely a student would be to achieve the same score, or be classified in the same performance level, across multiple administrations of equivalently constructed and administered test forms. As part of each test administration, the reliability of test scores and performance classifications is evaluated from a variety of perspectives. The reliability evidence of OST assessments in ELA and mathematics is demonstrated with respect to both classical and IRT indices of internal consistency of test scores, and decision accuracy and consistency of performance level classifications.22
4.6.1 INTERNAL CONSISTENCY
Test score reliability is traditionally estimated using both classical and IRT approaches. While measurement error is conditional on test information, it is nevertheless desirable to provide a single index of a test’s internal consistency
22 Standard 2.2 – The evidence provided for the reliability/precision of the scores should be consistent with the domain of replications associated with the testing procedures, and with the intended interpretations for use of the test scores. Standard 2.3 – For each total score, subscore, or combination of scores that is to be interpreted, estimates of relevant indices of reliability/precision should be reported.
or reliability. Classical estimates of test reliability, such as Cronbach's alpha, provide an index of the internal consistency reliability of the test, or the likelihood that a student would achieve the same score on an equivalently constructed test form. 23 Exhibit 4.6.1.1 shows the internal consistency estimates for each OST assessment. Internal consistency estimates are uniformly near 0.9, typical of most achievement tests of similar length. Internal consistency reliability for the Physical Science assessment is quite low, likely due to truncation of range in the highly selected sample of students participating in the test administration. The Physical Science assessment is being phased out, and only students who had previously taken the test and not met performance requirements participated.
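For reference, a minimal sketch of the Cronbach's alpha computation on a small illustrative score matrix follows; it is not the operational analysis program.

```python
# Minimal sketch (illustrative data, not operational code): Cronbach's alpha
# from a students x items matrix of item scores.
import numpy as np

def cronbach_alpha(scores):
    """scores: 2-D array, rows = students, columns = items."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)     # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars / total_var)

scores = np.array([[1, 0, 2, 1],
                   [1, 1, 2, 1],
                   [0, 0, 1, 0],
                   [1, 1, 2, 1],
                   [0, 1, 0, 0]])
print(cronbach_alpha(scores))
```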
Exhibit 4.6.1.1: Internal Consistency Reliabilities (Cronbach’s alpha) for OST Scores
Grade / Course | Internal Consistency Reliability | Variance
ELA
Grade 3 0.87 2344
Grade 4 0.86 1920
Grade 5 0.87 2132
Grade 6 0.90 1747
Grade 7 0.91 1598
Grade 8 0.90 926
ELA I 0.92 931
ELA II 0.91 958
Mathematics
Grade 3 0.93 2292
Grade 4 0.93 2400
Grade 5 0.94 1536
Grade 6 0.94 1401
Grade 7 0.93 1658
Grade 8 0.93 770
Algebra 0.92 1165
Geometry 0.93 1756
Integrated Math I 0.91 1270
Integrated Math II 0.90 1439
Science
Grade 5 0.91 2154
Grade 8 0.90 1999
Biology 0.90 846
Physical Science NA NA
Social Studies
American Government 0.89 278
American History 0.91 683
NA: Not enough information to estimate reliably.
23 Standard 2.19
4.6.2 STANDARD ERROR OF MEASUREMENT
The figures in Exhibits 4.6.2.1–4.6.2.4 graphically present the standard errors of measurement for the grade-level and end-of-course assessments. Each figure also includes the locations of the four OST performance standard cuts. As the figures indicate, OST assessments' test scores are most precise near the middle of the ability distribution, and especially near the Proficient and Accelerated performance standards.24 Test scores near the tails of the ability distribution are somewhat less precise, as expected. We note that the test precision of some assessments, especially the elementary grade ELA tests, does not support the number of performance standards adopted for OST assessments. Thus, the standard errors for scores within some performance levels are nearly as large as the width of the performance level. Nevertheless, classification consistency estimates of scores at or above each performance standard are strong.
Exhibit 4.6.2.1: Overall Standard Error of Measurement for ELA
24 Standard 2.14 – When possible and appropriate, conditional standard errors of measurement should be reported at several score levels unless there is evidence that the standard error is constant across score levels. Where cut scores are specified for selection or classification, the standard errors of measurement should be reported in the vicinity of each cut score.
Exhibit 4.6.2.2: Overall Standard Error of Measurement for Mathematics
Exhibit 4.6.2.3: Overall Standard Error of Measurement for Science
Exhibit 4.6.2.4: Overall Standard Error of Measurement for Social Studies
4.6.3 STUDENT CLASSIFICATION RELIABILITY
When student performance is reported in terms of performance categories, a reliability index is computed to estimate the likelihood of consistent classification of students as specified in standard 2.15 in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014). 25 This index considers the consistency of classifications for the percentage of students that would, hypothetically, be classified in the same category on an alternate, equivalent form.
For a fixed-form test, the consistency of classifications is typically estimated on test scores based on a single test form from a single test administration using the true-score distribution, which is estimated by fitting a bivariate beta-binomial model or a four-parameter beta model (Huynh, 1976; Livingston & Wingersky, 1979; Subkoviak, 1976; Livingston & Lewis, 1995).
The classification index can be examined for classification accuracy and classification consistency. Classification accuracy refers to the agreement between the classifications based on the form actually taken and the classifications that would be made on the basis of the students’ true scores, if their true scores could somehow be known. Classification consistency refers to the agreement between the classifications based on the form actually taken and the classifications that would be made on the basis of an alternate, equivalently constructed test form—that is, the percentages of students who would be consistently classified in the same performance levels on two equivalent test administrations.
In reality, the student's true ability is unknown, and students are not administered an alternate, equivalent form. Therefore, classification accuracy and consistency are estimated based on students' item scores, the item parameters, and the assumed underlying latent ability distribution, as described below. The true score is the expected value of the observed test score, that is, the score a student would obtain in the absence of measurement error.
25 Standard 2.16 – When a test or combination of measures is used to make classification decisions, estimates should be provided of the percentage of students who would be classified in the same way on two replications of the procedure.
4.6.4 CLASSIFICATION ACCURACY
Instead of assuming a normal distribution, we can directly estimate the probability of consistent classification using the likelihood function. The likelihood function of 𝜃 given a student's item scores represents the likelihood of the student's ability at that theta value. Integrating the likelihood values over the range of theta at and above the cut score (with proper normalization) represents the probability of the student's latent ability or true score being at or above that cut point.
If a student’s estimated ability (theta) is below the cut score, the probability of at or above the cut score is an estimate of the chance that this student is misclassified as below the cut score, and 1 minus that probability is the estimate of the chance that the student is correctly classified as below the cut score. Using this logic, we can define various classification probabilities.
In Exhibit 4.6.4.1, accurate classifications occur when the classification decision made on the basis of the hypothetical true score agrees with the decision made on the basis of the form actually taken. Misclassifications, false positives and false negatives, occur when students' true score classifications differ from the classifications based on their observed scores (e.g., a student whose true score results in a classification as Proficient, but whose observed score results in an incorrect classification as Partially Proficient). N11 represents the expected number of students who are truly above the cut score; N01 represents the expected number of students falsely above the cut score; N00 represents the expected number of students truly below the cut score; and N10 represents the expected number of students falsely below the cut score.
Exhibit 4.6.4.1: Classification Accuracy
                                     Classification on a Form Actually Taken
Classification on True Score         At or Above the Cut Score       Below the Cut Score
At or Above the Cut Score            N11 (Truly above the cut)       N10 (False negative)
Below the Cut Score                  N01 (False positive)            N00 (Truly below the cut)
4.6.5 CLASSIFICATION CONSISTENCY
As shown in Exhibit 4.6.5.1, consistent classification occurs when two forms agree on the classification of a student as either at or above or below the performance standard, whereas inconsistent classification occurs when the two decisions made on the basis of results from the two forms differ.
Exhibit 4.6.5.1: Classification Consistency
                                       Classification on the 2nd Form Taken
Classification on the 1st Form Taken   At or Above the Cut Score          Below the Cut Score
At or Above the Cut Score              N11 (Consistently above the cut)   N10 (Inconsistent)
Below the Cut Score                    N01 (Inconsistent)                 N00 (Consistently below the cut)
4.6.6 CLASSIFICATION ACCURACY AND CONSISTENCY ESTIMATES
Exhibit 4.6.6.1 presents the classification accuracy and consistency indices for the spring 2018 administration of OST. Accuracy classifications are slightly higher than the consistency classifications in almost all performance standards. The consistency classification rate can be somewhat lower than the accuracy rate because the consistency index assumes two test scores, both of which include measurement error, while the accuracy index assumes only a single test score plus the true score, which does not include measurement error.
Exhibit 4.6.6.1: Decision Accuracy and Consistency Indices for Performance Standards
Grade / Course | Accuracy: Basic, Proficient, Accelerated, Advanced | Consistency: Basic, Proficient, Accelerated, Advanced
ELA
Grade 3 0.92 0.90 0.90 0.91 0.89 0.86 0.86 0.89
Grade 4 0.93 0.90 0.90 0.92 0.90 0.86 0.86 0.89
Grade 5 0.94 0.91 0.89 0.91 0.92 0.88 0.86 0.87
Grade 6 0.94 0.92 0.91 0.94 0.92 0.89 0.88 0.91
Grade 7 0.95 0.92 0.91 0.93 0.93 0.89 0.88 0.91
Grade 8 0.93 0.91 0.92 0.96 0.90 0.88 0.89 0.94
ELA I 0.94 0.93 0.93 0.94 0.91 0.90 0.90 0.92
ELA II 0.94 0.92 0.92 0.95 0.92 0.89 0.89 0.93
Mathematics
Grade 3 0.94 0.93 0.93 0.93 0.92 0.91 0.90 0.91
Grade 4 0.95 0.94 0.94 0.94 0.93 0.92 0.91 0.91
Grade 5 0.94 0.93 0.94 0.96 0.92 0.91 0.91 0.94
Grade 6 0.95 0.94 0.94 0.95 0.92 0.91 0.91 0.93
Grade 7 0.94 0.94 0.94 0.96 0.92 0.91 0.91 0.94
Grade 8 0.93 0.93 0.94 0.97 0.90 0.90 0.92 0.96
Algebra I 0.92 0.93 0.94 0.97 0.89 0.90 0.92 0.96
Geometry 0.83 0.85 0.88 0.92 0.88 0.89 0.92 0.95
Integrated Math I 0.92 0.94 0.96 0.98 0.89 0.92 0.94 0.97
Integrated Math II 0.80 0.85 0.92 0.96 0.84 0.88 0.93 0.97
Science
Grade 5 0.95 0.93 0.92 0.92 0.94 0.90 0.88 0.89
Grade 8 0.94 0.91 0.91 0.95 0.91 0.88 0.88 0.93
Biology 0.93 0.92 0.92 0.93 0.90 0.89 0.89 0.90
Physical Science 0.84 0.92 0.98 1.00 0.78 0.88 0.97 0.99
Social Studies
American Government 0.97 0.93 0.93 0.97 0.95 0.90 0.91 0.96
American History 0.95 0.94 0.93 0.93 0.93 0.91 0.90 0.91
4.6.7 RELIABILITY FOR SUBGROUPS IN THE POPULATION
Exhibits 4.6.7.1–4.6.7.7 show the Cronbach’s alpha estimates of internal consistency reliability for each of the subgroups: gender (females and males), ethnicity (African American, Asian, Hispanic/Latino, American Indian, White, and students reporting multiple ethnicities), as well as students’ Limited English Proficient (LEP) and Individualized Education Program (IEP) status.26 Each of the ethnicity subgroups was composed of approximately equal numbers of males and females. As the Exhibits indicate, internal consistency reliabilities are generally consistent across subgroups, indicating that OST assessments measure an underlying achievement dimension that is in common across all subgroups. Where group reliabilities are attenuated, there is a corresponding decrease in test score variance for the subgroup, likely indicating that the attenuation of reliability is due to restriction of range in the subgroup.
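For reference, Cronbach's alpha for any of the groups reported below can be computed from a students-by-items matrix of item scores, as in this brief sketch; the item responses here are simulated and purely illustrative.

```python
import numpy as np

def cronbach_alpha(scores):
    # scores: (students x items) matrix of item scores
    n_items = scores.shape[1]
    sum_item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1.0 - sum_item_vars / total_var)

rng = np.random.default_rng(0)
ability = rng.normal(size=(1000, 1))
# Simulated dichotomous item scores driven by a common ability dimension
items = (ability + rng.normal(scale=1.2, size=(1000, 40)) > 0).astype(float)
print(round(cronbach_alpha(items), 3))
```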
Exhibit 4.6.7.1: Internal Consistency Reliability by Subgroup — Grades 3–6 ELA Assessments
Subgroup
Grade 3 Grade 4 Grade 5 Grade 6
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 126,050 0.87 2344 125,854 0.86 1920 127,446 0.87 2132 125,943 0.90 1747
Female 61,977 0.86 2325 61,651 0.86 1919 62,418 0.87 2050 61,650 0.90 1662
Male 63,914 0.86 2323 64,033 0.86 1900 64,848 0.88 2172 64,106 0.90 1750
Unknown Gender 159 0.82 1802 170 0.85 1777 180 0.88 2289 187 0.89 1515
African American 22,817 0.83 2004 20,643 0.83 1569 21,267 0.86 1803 20,465 0.88 1463
Asian 3,033 0.88 2746 3,115 0.88 2515 3,125 0.88 2683 3,011 0.92 2419
Hispanic/Latino 5,323 0.85 2148 5,201 0.86 1860 5,051 0.87 1971 5,088 0.90 1684
American Indian/Alaskan 157 0.87 2378 162 0.87 2183 188 0.86 1826 173 0.89 1467
White 84,936 0.85 2117 87,542 0.85 1758 88,832 0.85 1891 88,720 0.89 1547
Multi-Ethnic 9,621 0.86 2280 9,060 0.86 1867 8,841 0.87 2078 8,349 0.90 1671
LEP 5,202 0.78 1581 4,044 0.79 1364 3,549 0.82 1527 3,231 0.85 1293
IEP 16,183 0.82 2003 17,185 0.82 1599 17,484 0.85 1728 16,888 0.86 1302
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
26 Standard 2.11 – Test publishers should provide estimates of reliability/precision as soon as feasible for each relevant subgroup for which the test is recommended.
Exhibit 4.6.7.2: Internal Consistency Reliability by Subgroup — Grades 7–HS ELA Assessments
Subgroup
Grade 7 Grade 8 ELA I ELA II
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 123,841 0.91 1598 124,880 0.90 926 148,951 0.92 931 139,572 0.91 958
Female 60,688 0.91 1514 60,910 0.89 873 71,575 0.91 893 68,420 0.90 880
Male 62,901 0.91 1620 63,779 0.90 941 76,796 0.92 929 70,584 0.91 1003
Unknown Gender 252 0.90 1459 191 0.89 891 580 0.88 667 568 0.89 893
African American 18,864 0.89 1392 19,126 0.88 762 27,502 0.89 723 23,659 0.88 810
Asian 3,063 0.92 2179 2,978 0.92 1222 3,482 0.94 1391 3,363 0.93 1373
Hispanic/Latino 4,643 0.91 1669 4,537 0.90 888 5,927 0.91 908 5,294 0.90 968
American Indian/Alaskan 160 0.92 1710 158 0.89 826 203 0.91 787 192 0.91 957
White 89,059 0.90 1397 89,992 0.89 815 102,174 0.91 807 98,465 0.90 823
Multi-Ethnic 7,901 0.91 1575 7,900 0.90 911 9,355 0.92 903 8,336 0.91 959
LEP 2,934 0.87 1322 2,966 0.82 595 5,593 0.83 509 4,511 0.80 598
IEP 16,162 0.87 1234 16,349 0.84 610 21,741 0.85 544 18,913 0.84 632
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
Exhibit 4.6.7.3: Internal Consistency Reliability by Subgroup — Grades 3–5 Mathematics Assessments
Subgroup
Grade 3 Grade 4 Grade 5
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 126,769 0.93 2292 125,282 0.93 2400 126,093 0.94 1536
Female 62,334 0.92 2139 61,445 0.93 2248 61,865 0.93 1424
Male 64,255 0.93 2439 63,659 0.93 2542 64,032 0.94 1643
Unknown Gender 180 0.92 1759 178 0.92 2010 196 0.92 1354
African American 22,837 0.92 1920 20,639 0.91 1809 21,222 0.90 1083
Asian 3,020 0.91 2390 3,030 0.92 2561 2,928 0.93 1785
Hispanic/Latino 5,351 0.92 1939 5,205 0.93 2113 5,042 0.92 1236
American Indian/Alaskan 157 0.93 2514 160 0.93 2191 191 0.93 1402
White 85,601 0.92 2030 87,091 0.92 2134 87,807 0.93 1372
Multi-Ethnic 9,636 0.93 2115 9,024 0.93 2204 8,761 0.93 1368
LEP 5,202 0.92 1814 4,034 0.91 1924 3,544 0.90 1171
IEP 16,243 0.92 2107 17,187 0.91 2017 17,457 0.90 1144
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
Exhibit 4.6.7.4: Internal Consistency Reliability by Subgroup — Grades 6–8 Mathematics Assessments
Subgroup
Grade 6 Grade 7 Grade 8
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 124,337 0.94 1401 119,191 0.93 1658 97,029 0.93 770
Female 60,962 0.94 1306 58,570 0.93 1565 46,827 0.92 727
Male 63,161 0.94 1491 60,362 0.93 1747 50,018 0.93 804
Unknown Gender 214 0.91 1016 259 0.91 1250 184 0.92 781
African American 20,443 0.91 994 18,598 0.90 1173 16,595 0.90 635
Asian 2,803 0.94 1730 2,613 0.93 1981 1,939 0.94 1041
Hispanic/Latino 5,019 0.93 1176 4,594 0.92 1438 4,059 0.92 682
American Indian/Alaskan 173 0.93 1150 154 0.93 1622 129 0.92 699
White 87,489 0.93 1250 85,433 0.93 1505 67,596 0.92 692
Multi-Ethnic 8,267 0.93 1291 7,644 0.93 1558 6,572 0.92 715
LEP 3,235 0.90 970 2,915 0.88 1135 2,811 0.90 643
IEP 16,919 0.90 956 16,090 0.88 1119 15,726 0.89 574
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
Exhibit 4.6.7.5: Internal Consistency Reliability by Subgroup — High School Mathematics Assessments
Subgroup
Algebra Geometry Integrated Math I Integrated Math II
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 144,091 0.92 1165 126,729 0.93 1756 12,152 0.91 1270 10,490 0.90 1439
Female 69,666 0.91 1081 62,827 0.92 1606 5,860 0.92 1234 5,132 0.90 1337
Male 73,795 0.92 1239 63,385 0.93 1901 6,239 0.91 1291 5,301 0.91 1540
Unknown Gender 630 0.84 697 517 0.78 858 53 0.79 643 57 0.77 736
African American 24,933 0.84 697 20,163 0.83 998 3,808 0.8 618 3,049 0.75 637
Asian 3,119 0.94 1674 2,857 0.95 2358 494 0.93 1661 434 0.93 1954
Hispanic/Latino 5,817 0.88 904 4,876 0.88 1310 300 0.91 1105 266 0.89 1234
American Indian/Alaskan 206 0.90 965 179 0.91 1508 25 0.83 616 27 0.83 819
White 101,262 0.92 1080 91,392 0.93 1644 6,206 0.93 1284 5,725 0.92 1494
Multi-Ethnic 8,470 0.91 1073 7,028 0.91 1606 1,298 0.88 979 967 0.87 1142
LEP 4,025 0.82 659 3,084 0.81 969 1,441 0.7 464 1,004 0.65 493
IEP 20,161 0.80 606 15,831 0.77 841 1,830 0.77 616 1,516 0.74 660
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
Exhibit 4.6.7.6: Internal Consistency Reliability by Subgroup — Science Assessments
Subgroup
Grade 5 Grade 8 Biology Physical Science
N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance  N  Reliability  Variance
All Students 127,349 0.91 2154 125,778 0.90 1999 135,109 0.90 846 478 0.65 333
Female 62,404 0.91 2016 61,534 0.89 1787 66,887 0.89 758 243 0.60 293
Male 64,751 0.91 2272 64,043 0.91 2201 67,803 0.91 932 231 0.70 376
Unknown Gender 194 0.91 2103 201 0.89 1770 419 0.85 612 4 0.68 336
African American 21,166 0.89 1670 19,078 0.85 1348 21,928 0.83 536 195 0.61 330
Asian 3,135 0.90 2287 2,995 0.91 2398 3,225 0.91 1089 4 -0.12 95
Hispanic/Latino 5,063 0.91 1861 4,589 0.88 1634 4,870 0.88 734 20 0.74 484
American Indian/Alaskan 188 0.91 2056 155 0.89 1702 185 0.89 707 1 NA NA
White 88,823 0.90 1862 90,895 0.89 1789 96,774 0.90 785 211 0.66 303
Multi-Ethnic 8,831 0.91 2037 7,879 0.90 1865 7,872 0.89 789 40 0.67 331
LEP 3,550 0.88 1462 2,967 0.82 1206 3,878 0.74 372 25 0.53 258
IEP 17,438 0.90 1899 16,291 0.85 1360 17,412 0.80 469 91 0.48 262
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
Exhibit 4.6.7.7: Internal Consistency Reliability by Subgroup — Social Studies Assessments
Subgroup
American Government American History
N  Reliability  Variance  N  Reliability  Variance
All Students 86,861 0.89 278 125,839 0.91 683
Female 42,931 0.89 252 62,295 0.91 602
Male 43,521 0.90 302 63,145 0.92 758
Unknown Gender 409 0.89 267 399 0.89 526
African American 14,322 0.87 232 20,438 0.89 510
Asian 1,565 0.91 355 2,489 0.92 744
Hispanic/Latino 2,924 0.89 267 4,677 0.91 613
American Indian/Alaskan 121 0.91 338 163 0.91 648
White 62,730 0.89 255 90,536 0.91 632
Multi-Ethnic 5,009 0.89 256 7,308 0.91 652
LEP 2,149 0.84 200 3,905 0.83 345
IEP 10,331 0.84 199 17,060 0.87 444
Note: LEP: Limited English Proficiency; IEP: Individualized Education Program
4.6.8 RELIABILITY FOR SUBSCALES
The Cronbach’s alpha internal consistency reliability estimates associated with the subscales for the spring 2018 operational forms are presented in Exhibits 4.6.8.1–4.6.8.5. As indicated in the Exhibits, subscale reliabilities are generally moderate in magnitude, as expected for subscales of the length observed in OST. Subscale reliabilities for the Physical Science assessment are quite low and likely due to restriction of range resulting from very difficult test items.
Exhibit 4.6.8.1: Subscale Reliabilities — ELA
Grade / Course    Reading Informational Text    Reading Literary Text    Writing
Grade 3 0.71 0.72 0.52
Grade 4 0.71 0.66 0.63
Grade 5 0.74 0.71 0.59
Grade 6 0.77 0.70 0.81
Grade 7 0.80 0.74 0.82
Grade 8 0.76 0.74 0.80
ELA I 0.81 0.69 0.84
ELA II 0.77 0.71 0.83
Exhibit 4.6.8.2: Subscale Reliabilities — Mathematics
Grade 3: Fractions 0.68; Geometry 0.74; Multiplication & Division 0.80; Modeling & Reasoning 0.90; Numbers & Operations 0.73
Grade 4: Fractions 0.84; Geometry 0.71; Multiplication & Division 0.84; Modeling & Reasoning 0.86
Grade 5: Decimals 0.85; Fractions 0.82; Geometry 0.77; Modeling & Reasoning 0.87
Grade 6: Expressions and Equations 0.83; Geometry and Statistics 0.73; Modeling & Reasoning 0.88; The Number System 0.70; Ratios and Proportions 0.79
Grade 7: Geometry 0.68; Modeling & Reasoning 0.88; The Number System 0.80; Ratios and Proportions 0.78; Statistics & Probability 0.75
Grade 8: Expressions and Equations 0.77; Functions 0.69; Geometry 0.80; Modeling & Reasoning 0.85; The Number System 0.74
Algebra I: Functions 0.85; Modeling & Reasoning 0.86; Number, Quantities, Equations and Expressions 0.76; Statistics 0.63
Geometry: Congruence & Proof 0.83; Circles 0.61; Modeling & Reasoning 0.87; Similarity & Trigonometry 0.74; Probability 0.65
Integrated Mathematics I: Algebra 0.64; Geometry 0.64; Modeling & Reasoning 0.88; Number & Quantity/Functions 0.80; Statistics 0.63
Integrated Mathematics II: Functions 0.66; Geometry 0.75; Modeling & Reasoning 0.81; Number, Quantities, Equations and Expressions 0.65; Probability 0.63
Exhibit 4.6.8.3: Subscale Reliabilities — Grade 5 and Grade 8 Science
Grade    Earth Science    Life Science    Physical Science
Grade 5 0.73 0.80 0.79
Grade 8 0.75 0.78 0.75
Exhibit 4.6.8.4: Subscale Reliabilities — Biology and Physical Science
Biology: Heredity 0.71; Evolution 0.67; Diversity and Interdependence of Life 0.69; Cells 0.70
Physical Science: Study of Matter NA; Energy and Waves NA; Forces and Motion 0.14; The Universe 0.27
NA: Negative reliability due to large SEM and small variance of scale scores.
Exhibit 4.6.8.5: Subscale Reliabilities — Social Studies
American Government: Historic Documents 0.78; Principles and Structure 0.77; Ohio/Policy/Economy 0.67
American History: Skills and Documents 0.71; 1877–1945 0.82; 1945–Present 0.79
4.7 SUBSCALE INTERCORRELATIONS
The observed correlations among reporting category scores are presented in Exhibits 4.7.1–4.7.16.
Exhibit 4.7.1: Subscale Intercorrelations — ELA
Grade / Course
Subscale
Observed Correlation
RI RL
Grade 3 RL 0.69
W 0.50 0.48
Grade 4 RL 0.66
W 0.62 0.56
Grade 5 RL 0.70
W 0.56 0.55
Grade 6 RL 0.70
W 0.65 0.64
Grade 7 RL 0.71
W 0.64 0.63
Grade 8 RL 0.70
W 0.65 0.61
ELA I RL 0.71
W 0.72 0.66
ELA II RL 0.70
W 0.67 0.64
Note: RL = Reading Literary Text; RI = Reading Informational Text; W = Writing
Exhibit 4.7.2: Subscale Intercorrelations — Grade 3 Mathematics
Grade Subscale Observed Correlation
FRA G MUD MR
Grade 3
G 0.68
MUD 0.70 0.74
MR 0.83 0.84 0.92
NO 0.69 0.71 0.78 0.88
Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; MR = Model Reasoning; NO = Numbers & Operations
Exhibit 4.7.3: Subscale Intercorrelations — Grade 4 Mathematics
Grade Subscale Observed Correlation
FRA G MUD
Grade 4
G 0.76
MUD 0.83 0.73
MR 0.91 0.82 0.91
Note: FRA = Fraction; G = Geometry; MUD = Multiplication & Division; MR = Model Reasoning;
Exhibit 4.7.4: Subscale Intercorrelations — Grade 5 Mathematics
Grade Subscale Observed Correlation
D FRA G
Grade 5
FRA 0.83
G 0.80 0.79
MR 0.88 0.94 0.88
Note: D = Decimals; FRA = Fraction; G = Geometry; MR = Model Reasoning;
Exhibit 4.7.5: Subscale Intercorrelations — Grade 6 Mathematics
Grade Subscale Observed Correlation
EE GS MR NS
Grade 6
GS 0.79
MR 0.90 0.88
NS 0.78 0.73 0.83
RP 0.83 0.77 0.90 0.76
Note: EE = Expression and Equations; GS = Geometry and Statistics; MR = Model Reasoning; NS = The Number System; RP = Ratios and Proportions
Exhibit 4.7.6: Subscale Intercorrelations — Grade 7 Mathematics
Grade Subscale Observed Correlation
G MR NS RP
Grade 7
MR 0.87
NS 0.75 0.90
RP 0.77 0.89 0.80
SP 0.73 0.86 0.75 0.77
Note: G = Geometry; MR = Model Reasoning; NS = The Number System; RP = Ratios and Proportions; SP = Statistics and Probability
Exhibit 4.7.7: Subscale Intercorrelations — Grade 8 Mathematics
Grade Subscale Observed Correlation
EE F G MR
Grade 8
F 0.71
G 0.78 0.72
MR 0.92 0.80 0.87
NS 0.74 0.67 0.74 0.81
Note: EE = Expression and Equations; F = Functions; G = Geometry; MR = Model Reasoning; NS = The Number System
Exhibit 4.7.8: Subscale Intercorrelations — Algebra
Grade Subscale Observed Correlation
F MR NQEE
Algebra
MR 0.92
NQEE 0.81 0.88
S 0.74 0.84 0.71
Note: F = Functions; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; S = Statistics
Exhibit 4.7.9: Subscale Intercorrelations — Geometry
Grade Subscale Observed Correlation
CP C MR P
Geometry
C 0.77
MR 0.90 0.85
P 0.73 0.69 0.86
ST 0.81 0.77 0.88 0.72
Note: CP = Congruency & Proof; C = Circles; MR = Model Reasoning; P = Probability; ST = Similarity & Trigonometry
Exhibit 4.7.10: Subscale Intercorrelations — Integrated Mathematics I
Grade Subscale Observed Correlation
A G MR NQF
IM I
G 0.74
MR 0.90 0.85
NQF 0.79 0.77 0.91
S 0.71 0.71 0.85 0.75
Note: A = Algebra; G = Geometry; MR = Model Reasoning; NQF = Number & Quantity/Functions; S = Statistics
Exhibit 4.7.11: Subscale Intercorrelations — Integrated Mathematics II
Grade Subscale Observed Correlation
F G MR NQEE
IM II
G 0.73
MR 0.81 0.84
NQEE 0.66 0.70 0.79
P 0.67 0.73 0.86 0.64
Note: F = Functions; G = Geometry; MR = Model Reasoning; NQEE = Number, Quantities, Equations and Expressions; P = Probability
Exhibit 4.7.12: Subscale Intercorrelations — Grade 5 and Grade 8 Science
Grade Subscale Observed Correlation
ES LS
Grade 5 LS 0.72
PS 0.73 0.76
Grade 8 LS 0.73
PS 0.72 0.73
Note: ES = Earth Science; LS = Life Science; PS = Physical Science
Exhibit 4.7.13: Subscale Intercorrelations — Biology
Grade Subscale Observed Correlation
BS-A BS-B BS-C
Biology
BS-B 0.66
BS-C 0.70 0.67
BS-D 0.69 0.66 0.67
Note: BS-A = Heredity; BS-B = Evolution; BS-C = Diversity and Interdependence of Life; BS-D = Cells
Exhibit 4.7.14: Subscale Intercorrelations — Physical Science
Grade Subscale Observed Correlation
PS-A PS-B PS-C
Physical Science
PS-B 0.34
PS-C 0.35 0.38
PS-D 0.36 0.38 0.40
Note: PS-A = Study of Matter; PS-B = Energy & Waves; PS-C = Forces & Motions; PS-D = The Universe
Exhibit 4.7.15: Subscale Intercorrelations and Reliability Estimates — American Government
Grade Subscale Observed Correlation
AGA AGB
American Government
AGB 0.71
AGC 0.66 0.73
Note: AGA = Historic Documents; AGB = Principles & Structures; AGC = Ohio/Policy/Economy
Exhibit 4.7.16: Subscale Intercorrelations and Reliability Estimates — American History
Grade Subscale Observed Correlation
AHA AHB
American History
AHB 0.74
AHC 0.73 0.80
Note: AHA = Skills & Documents; AHB = 1877–1945; AHC = 1945–Present
4.8 RATER AGREEMENT
All essay responses for spring 2018 online and paper-pencil tests were handscored by Data Recognition Corporation (DRC). In addition, approximately 45% of handscored essay responses were routed to a second human reader. Appendix H.1 shows the rater agreement rates for each of the writing prompts administered on OST assessments. Exhibit 4.8.1 provides a summary of those results, showing the exact human rater agreement rate for dimension scores across grades. The rater agreement reports in Appendix H.1 show percentages of exact agreement (%EX), adjacent scores (%AD), and nonadjacent scores (%NA). The tables also provide the score point distribution (from 0 to 2 for Conventions; from 0 to 4 for Purpose/Organization and Evidence/Elaboration), including condition codes such as the percent of Blank/No Response (%B), Unreadable (%U), Foreign Language (%F), and Off Topic (%T) responses. Generally, exact agreement rates ranged from 67% to 86% (averaging 76%), with little variability across the essay prompts.
Exhibit 4.8.1: Mean Exact Agreement Rates for Online Essay Responses — Spring 2018
Dimension Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8 ELA I ELA II
Conventions 82 79 81 80 82 85 79 86
Purpose/Organization 78 72 75 69 71 69 75 75
Evidence/Elaboration 79 74 73 68 69 67 77 75
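The exact, adjacent, and nonadjacent agreement percentages reported in Appendix H.1 and summarized above can be computed from paired reader scores as in the sketch below; the dimension scores here are simulated for illustration and do not reflect DRC scoring data.

```python
import numpy as np

def agreement_rates(first_read, second_read):
    # Percent exact, adjacent, and nonadjacent agreement between two readers
    diff = np.abs(np.asarray(first_read) - np.asarray(second_read))
    return {"%EX": 100 * np.mean(diff == 0),
            "%AD": 100 * np.mean(diff == 1),
            "%NA": 100 * np.mean(diff >= 2)}

# Simulated 0-4 dimension scores for responses read by two readers
rng = np.random.default_rng(1)
first = rng.integers(0, 5, size=500)
second = np.clip(first + rng.choice([-1, 0, 0, 0, 1], size=500), 0, 4)
print(agreement_rates(first, second))
```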
5. ITEM DEVELOPMENT AND TEST CONSTRUCTION
OST assessments are rigorously examined in accordance with the guidelines provided in the Standards for Educational and Psychological Testing (AERA, APA, NCME, 2014). The Elementary and Secondary Education Act (ESEA) legislation also describes the evidence based on these standards that is necessary to validate assessments for their intended purposes.
OST assessments were designed to measure student progress toward achievement of Ohio’s Learning Standards. Although the validity of OST test score interpretations is evaluated along several dimensions, as a content criterion-referenced system of tests, the meaning of test scores is critically evaluated by the degree to which test content is aligned with Ohio content standards.27
Alignment to the learning standards is achieved through a rigorous and highly iterative test development process that proceeds from the learning standards and refers back to those standards, involving ODE, test developers, and educator and stakeholder committees.
In spring 2014, an independent field test was conducted to develop the item pool for future test forms in science and social studies. The independent field test consisted of newly developed items, as well as items from Ohio’s previous statewide tests, the Ohio Achievement Assessment and the Ohio Graduation Test (OGT). Items from Ohio’s prior tests went through AIR and ODE review before being approved for use in the new OST tests.
Items administered in the first administration of OST assessments in ELA and mathematics went through a very similar development procedure. These AIRCore items were designed to align with the CCSS that Ohio adopted as Ohio’s Learning Standards in ELA and mathematics. All of the AIRCore items administered had also been previously field tested in embedded slots within statewide assessments, and passed through content and fairness data review prior to inclusion in the item pool from which the spring 2018 OST test forms were constructed.
5.1 ITEM DEVELOPMENT PROCESS 28
The content development process for Ohio’s State Tests is managed by AIR’s Item Authoring Tool (IAT), which acts as a content development and management tool, item bank, and publication system supporting both paper-based and online publication. This item development process leads items from inception, through a series of content, fairness, graphic, and other reviews to final publication. The system captures the outcomes and rationales at each review and maintains previous drafts of each item. The workflow management ensures that each item receives each review in the designated sequence, and that the review is conducted (or recorded in the case of committee review) by an authorized person. As items travel through Ohio’s extensive review process, every version of every item is archived, along with each comment received in any review. Reviewers have immediate access to all prior versions, providing version control throughout development.
IAT allows remote Internet access by item writers and reviewers while ensuring security with individualized passwords for all users, limited access for external users, and strong encryption of all information. IAT tracks item use on test forms or adaptive pools. After items are used, IAT stores the resulting statistics, including exposure statistics, classical item statistics, and statistics based on item response theory (IRT).
27 Standard 1.11 – When the rationale for test score interpretation for a given use rests in part on the appropriateness of test content, the procedures followed in specifying and generating test content should be described and justified with reference to the intended population to be tested and the construct the test is intended to measure or the domain it is intended to represent. If the definition of the content sampled incorporates criteria such as importance, frequency, or criticality, these criteria should also be clearly explained and justified. 28 Standard 4.7 – The procedures used to develop, review, and tryout items and to select items from the item pool should be documented.
OST assessments’ item development process is predicated on a high level of interaction between test developers at AIR and the ODE, as well as with Ohio educators and stakeholders. AIR’s IAT manages item content throughout the entire life cycle of an item, from inception through a series of agreed-upon item review levels culminating in operational pool approval. It also manages item content beyond the operational life of the item, including migration of items for use in practice tests or other training materials. IAT ensures that every item proceeds through the entire sequence of development and provides Ohio and AIR management with on-demand reports of the content and status of the inventory of items. Each item is directed through a sequence of reviews (described in this section) and sign-offs before it is locked for field-test or operational administration.
The IAT is integrated with the item display engine used by the online test delivery system. This feature, combined with a “web approval” process, allows the display of online items to be “locked” well before test forms are constructed and ensures that only approved items are administered to Ohio students.29
5.2 MACHINE-SCORED CONSTRUCTED-RESPONSE ITEM DEVELOPMENT TOOLS
OST assessments include a number of machine-scored constructed-response (MSCR) items, which leverage a sophisticated system that allows a large variety of item types, calling for varied student responses, to be developed and scored efficiently and economically.
MSCR item development tools put the power of both item and rubric creation into the hands of item writers, and allow reviewers to score possible responses to ensure that the rubric is enacted correctly. For example, when administered a graphic-response item, students can respond by drawing, moving, arranging, or selecting graphic regions. The scoring rubric allows for each answer to be scored using scoring logic created by the item writer. Test developers have flexibility in identifying features of student responses to score, which go beyond simple features (e.g., whether the correct object is put in the correct place) but can involve abstraction. For example, if a student is asked to design an experiment, the rubric can discern whether the objects representing the experimental variable actually vary across conditions or cover the range of inquiry, among other capabilities. These concepts are abstract and many different responses may reflect those abstract features. This ability enables machine rubrics to “justify” the partial credit assigned in terms of the skills that particular response features exemplify.
In addition, throughout the item development and review process, test developers can mimic the many different possible student responses, and review how the rubric is applied to those responses. Test developers can test the scoring rubric and make corrections to the scoring logic at each step.
When creating equation items, test developers have access to the Equation Editor tool. Student responses can be simple numeric responses or complex equations or even sets of equations. This tool allows for multiple answers and the development of multistep items. Test developers can customize the equation palette to show the appropriate functions. Just as the keypad is customizable, the answer spaces are as well. Additional answer spaces can be added as needed by the item writer. The scoring rubric allows for each answer to be scored using scoring logic created by the item writer.
Such tools are integrated into the IAT, providing test developers the power and flexibility to use technology to create sophisticated OST items.
5.3 ITEM TYPES
OST assessments include a wide variety of item types that are designed around a broad and growing variety of response mechanisms. In addition to selected-response items, which include traditional multiple-choice items and more advanced multi-select and two-part items, OST assessments utilize items with the following response mechanisms:
29 Standard 4.1
• Graphic Response, which includes any item to which students respond by drawing, moving, arranging,
or selecting graphic regions.
• Hot Text, in which students select or rearrange sentences or phrases in a passage.
• Word Builder, in which students respond by entering a single number or word.
• Proposition Response, in which students respond in one English language sentence or more, which may
be scored by our proposition-scoring engine, handscored, or a mixture of both.
OST items use technology to machine-score many such items that measure deeper knowledge and application of knowledge in a more open-ended way. Most MSCR items remain accessible. If accessibility is sacrificed for some population, test development staff weigh the measurement benefit before deploying that item. For example, recognizing cells under a microscope is an inherently visual task. Our simulation items can measure this ability, but the task itself cannot readily be made accessible to students who are blind. In this case, the skill itself limits accessibility, not the construction of the item. This is very different from presenting a selected-response item as a spatial matching or drag-and-drop task.
The graphic-response mechanism supports most of the typical technology-enhanced item types, including sorting, matching, hot spot, and drag-and-drop. In addition, it supports items where students actually draw a machine-scorable response and respond by constructing complex, open-ended diagrams, as well as many other possibilities. Because they are uniformly derived from a single response mechanism, the manipulations and interactions are consistent across these technology-enhanced item types, diminishing a possible source of construct-irrelevant variance.
Hot-text items are effectively selected-response items, though in some cases the number of potential selections is quite large. These machine-scored items can have multiple correct answers and allow flexibility in scoring of student responses.
The equation response mechanism asks students to enter one or more equations using a palette of symbols. Test developers can specify which symbols are available on an item-by-item basis.
The availability of tools organized around response mechanisms creates a very flexible capability for test developers to create authentic, challenging tasks.
5.4 ITEM REVIEW
This section describes the multi-step item-review process that items travel through from inception, to several rounds of test developer, ODE, educator, and stakeholder review, to field testing and final review prior to inclusion on operational test forms. 30
The item-review procedures used to develop and review OST test items are designed to ensure item accuracy and alignment with the intent of Ohio’s Learning Standards. Following a standard item-review process, item reviews proceed initially through a series of internal reviews before items are eligible for review by ODE content experts. Most of AIR’s content staff members, who are responsible for conducting internal reviews, are former classroom teachers who hold degrees in education and/or their respective content areas. Each item passes through four internal review steps before it is eligible for review by ODE. Those steps include the following:
• Preliminary Review, in which a review is conducted by a group of AIR content area experts
• Content Review 1, in which a review is performed by an AIR content specialist
30 Standard 4.8 – The test review process should include empirical analyses and/or the use of expert judges to review items and scoring criteria. When expert judges are used, their qualifications, relevant experiences, and demographic characteristics should be documented, along with the instructions and training in the item review process that the judges receive.
• Editorial Review, in which a copyeditor checks the item for correct grammar/usage
• Senior Content Review, in which a review is conducted by the lead content expert
At every stage of the item-review process, beginning with preliminary review, AIR’s test developers analyze each item to ensure that the following are true.
• The item is well-aligned with the intended learning standard.
• The item conforms to the item specifications for the target being assessed.
• The item is based on a quality idea (i.e., it assesses something worthwhile in a reasonable way).
• The item is properly aligned to a depth of knowledge (DOK) level.
• The vocabulary used in the item is appropriate for the intended grade/age and subject matter, and takes
into consideration language accessibility, bias, and sensitivity.
• The item content is accurate and straightforward.
• Any accompanying graphic and stimulus materials are actually necessary to answer the question.
• The item stem is clear, concise, and succinct, meaning it contains enough information to know what is being
asked; is stated positively (and does not rely on negatives such as no, not, none, never, unless absolutely
necessary); and it ends with a question.
• For selected-response items, the set of response options is succinct; parallel in structure, grammar, length,
and content; sufficiently distinct from one another; and all plausible, but with only one correct option.
• There is no obvious or subtle cluing within the item.
• The score points for constructed-response items are clearly defined.
• For machine-scored constructed-response (MSCR) items, the items are scored as intended at each score
point in the rubric.
Based on their review of each item, test developers can accept the item and classification as written, revise the item, or reject the item outright.
Items passing through the internal review process are sent to ODE for review. At this stage, items may be further revised based on any edits or changes requested by ODE, or rejected outright. Items passing through the ODE review level then must pass through two stakeholder reviews in which committees of Ohio educators and stakeholders review each item’s accuracy, alignment to the intended standard, and DOK level, as well as item fairness and language sensitivity. Thus, all items considered for inclusion in the operational item pools are reviewed by:
• A content advisory educator committee, which checks to ensure that each item is
o aligned to the learning standards;
o appropriate for the grade level;
o accurate; and
o presented online in a way that is clear and appropriate.
• A fairness and sensitivity educator committee, which checks to ensure that each item and any associated
stimulus materials are free from bias, sensitive issues, controversial language, stereotyping, and statements
that reflect negatively on race, ethnicity, gender, culture, region, disability, or other social and economic
conditions and characteristics.
Items successfully passing through this committee review process are then field tested to ensure that the items
behave as intended when administered to students. Despite conscientious item development, some items perform
differently than expected when administered to students. Using the item statistics gathered in field testing to
review item performance is an important step in constructing valid and equivalent operational test forms.
Classical item analyses ensure that items function as intended with respect to the underlying scales. Classical item
statistics are designed to evaluate the item difficulty and the relationship of each item to the overall scale (item
discrimination) and to identify items that may exhibit a bias across subgroups (differential item functioning
analyses).
Items flagged for review based on their statistical performance must pass a three-stage review to be included in
the final item pool from which operational forms are created. In the first stage of this review, a team of
psychometricians reviews all flagged items to ensure that the data are accurate and properly analyzed, that response
keys are correct, and that there are no other obvious problems with the items.
Content review and fairness and sensitivity committees are again convened to re-evaluate flagged field-test items
in the context of each item’s statistical performance. Based on their review of each item’s performance, the
content review and fairness and sensitivity committees can recommend that flagged items be rejected or deem
them eligible for inclusion in operational test administrations.
6. FIELD TESTING
Items selected for operational use in the base year ELA and mathematics forms in 2015 were previously administered as part of statewide assessments in Arizona, Florida, Utah, and/or Oregon. The Ohio science and social studies test items are field tested prior to inclusion on operational test forms. Items selected for operational use in the base year form in 2015 were previously administered as part of an independent field test in spring 2014. Additionally, newly developed items were embedded in all Ohio’s 2018 tests (except for Physical Science) and field tested, expanding the base of items for building future test forms.
Embedding field-test items in operational assessments yields item parameter estimates that capture many of the contextual effects that contribute to item difficulty in operational test administrations. A number of factors that may influence item difficulty estimates in the context of operational test administrations may be less relevant in stand-alone field-test contexts. For example, in a high-stakes test, such as a high school end-of-year (EOY) exam where test performance impacts graduation, students may be motivated to expend greater effort to achieve maximum performance. Conversely, high-stakes assessments may also be more likely to elicit anxiety in some students, thus impairing their performance on the tests. Even when assessments are low stakes for students, schools often work to convey to students the importance of statewide assessments in ways that are likely not done for independent field tests. While the impact of contextual factors may not be great, embedded field testing ensures that many aspects of the operational testing context influencing item difficulty are incorporated into the resulting item parameter estimates.
Embedded field testing is especially useful in the context of a pre-equating model for scoring and reporting test results. Because the test administration context remains the same between the embedded field test (EFT) and subsequent operational test administration, item parameter estimates are more stable than they may be when obtained through stand-alone field testing.
A potential drawback of the EFT approach is the increased assessment burden placed on students and schools. For this reason, Ohio utilizes EFT designs for purposes of item bank maintenance. Ohio uses AIR’s online field-test engine, which, when combined with Ohio’s large student population, serves to greatly reduce the number of EFT slots necessary to replenish or grow the item banks for the Ohio assessments.
The field-test engine randomly samples field-test items for each individual test administration, essentially creating thousands of unique EFT forms. This sampling approach to embedding field-test items results in several important outcomes:31
• Reduction in the number of embedded field-test items that each student must respond to and more
efficient “spiraling” of items, which reduces clustering of item responses, resulting in more precise
parameter estimates
• More generalizable item statistics because they are not based on items appearing in a single position
• A more representative sample of respondents for each item
The embedded field-testing algorithm consists of two components: one for identifying which field-test items will be administered to which student (the distribution algorithm), and one for selecting the position on the test of each item administered to the student (the positioning algorithm). When a student starts a test, the system randomly selects a pre-determined number of item groups (depending on whether items have a shared stimulus, etc.), stopping when it has selected item groups containing at least the minimum number of field-test items designated for administration to each student. This structured randomization ensures that a) each item is seen by a representative sample of Ohio students, and b) every item is as likely as every other item to appear in a class or school, minimizing clustering effects.
31 Standard 4.9 – When item or test form tryouts are conducted, the procedures used to select the sample(s) of students as well as the resulting characteristics of the sample(s) should be documented. The sample(s) should be as representative as possible of the population(s) for which the test is intended.
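The sketch below illustrates the distribution step under simplified, hypothetical assumptions (made-up item-group sizes and a made-up minimum number of field-test items per student): item groups are drawn at random until the minimum count is reached, mirroring the structured randomization described above.

```python
import random

def select_field_test_groups(item_groups, min_ft_items, seed=None):
    # item_groups maps a group id to the number of field-test items it contains;
    # groups (e.g., items sharing a stimulus) are assigned as a unit.
    rng = random.Random(seed)
    remaining = list(item_groups)
    rng.shuffle(remaining)
    selected, count = [], 0
    while remaining and count < min_ft_items:
        group = remaining.pop()
        selected.append(group)
        count += item_groups[group]
    return selected

# Hypothetical pool: group id -> number of embedded field-test items in the group
pool = {"G01": 1, "G02": 3, "G03": 1, "G04": 2, "G05": 1, "G06": 4}
print(select_field_test_groups(pool, min_ft_items=4, seed=7))
```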
6.1 ITEM STATISTICS
Following the close of the testing window, AIR psychometrics staff works to analyze field-test data in preparation for item data review meetings and promotion of high-quality test items to operational item pools.32 The item analyses include classical item statistics as well as the IRT item calibrations. Classical item statistics are used to evaluate the relationship of each item to the overall scale, evaluate the quality of the distractors, and identify items that may exhibit bias across subgroups (DIF analyses). The IRT item analyses allow examination of the fit of items to the measurement model and provide the statistical foundation for operational form construction and test scoring and reporting. Items are flagged if analyses indicate resulting values are out of range; flagged items are reviewed by AIR and ODE psychometric and content staff for possible miskey or scoring errors; items that pass through AIR and ODE statistical review are then sent to item data review committees comprised of Ohio educators for a final external review.
6.1.1 CLASSICAL STATISTICS
Classical item analyses inspect whether the items function as intended with respect to the underlying scales. AIR’s analysis program computes the required item and test statistics for each dichotomous multiple-choice (MC) and polytomous constructed-response (CR) item to check the integrity of the item and to verify the appropriateness of its difficulty level. Key statistics that are computed and examined include item difficulty, item discrimination, and distractor analysis.
Items that are either extremely difficult or extremely easy are flagged for review but not necessarily rejected if they align with the test specifications. For multiple-choice items, the proportion of students in the sample selecting the correct answer (the p-value) is computed, as well as the proportions selecting the incorrect responses. Multiple-choice items are flagged for review if the p-value is less than 0.30 or greater than 0.95. For constructed-response items, item difficulty is calculated both as the item’s mean score and as the average proportion of points earned (analogous to the p-value, and computed as the ratio of an item’s mean score to the number of points possible). Constructed-response items are flagged for review if the proportion of students assigned any score-point category is greater than 0.95. In addition, items are flagged if the average IRT-based ability estimate of students in a score-point category is lower than the average IRT-based ability estimate of students in the next lower score-point category (i.e., when students who receive 3 points score lower, on average, on the total test than students who received only 2 points on the item).
The item discrimination index indicates the extent to which each item differentiates between students who possess the skills being measured and those who do not. In general, the higher the value, the better the item is able to differentiate between high- and low-achieving students. The discrimination index for multiple-choice items is calculated as the correlation between the item score and the student’s IRT-based ability estimate. For constructed-response items, the mean total number correct is computed for students scoring within each of the possible score categories. Items are flagged for subsequent review if the point-biserial correlation for the keyed (correct) response is less than 0.25.
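A minimal sketch of these classical flagging rules for dichotomous items is shown below. It assumes a simple students-by-items matrix of 0/1 scores and a vector of ability estimates; the flag thresholds (p-value outside 0.30–0.95, point-biserial below 0.25) follow the rules described above, while the response data are simulated.

```python
import numpy as np

def classical_flags(item_scores, abilities, p_low=0.30, p_high=0.95, min_pb=0.25):
    # Flag dichotomous items on p-value range and item-ability point-biserial correlation
    flags = {}
    for j in range(item_scores.shape[1]):
        p_value = item_scores[:, j].mean()
        point_biserial = np.corrcoef(item_scores[:, j], abilities)[0, 1]
        flags[j] = (p_value < p_low) or (p_value > p_high) or (point_biserial < min_pb)
    return flags

rng = np.random.default_rng(2)
theta = rng.normal(size=2000)
difficulty = rng.normal(size=30)
prob = 1 / (1 + np.exp(-(theta[:, None] - difficulty[None, :])))
scores = (rng.random((2000, 30)) < prob).astype(float)   # simulated Rasch item responses
print(classical_flags(scores, theta))
```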
32 Standard 4.10 – When a test developer evaluates the psychometric properties of items, the model used for that purpose (e.g., classical test theory, item response theory, or another model) should be documented. The sample used for estimating item properties should be described and should be of adequate size and diversity for the procedure. The process by which items are screened and the data used for screening, such as item difficulty, item discrimination, or differential item functioning (DIF) for major examinee groups, should also be documented. When model-based methods (e.g., IRT) are used to estimate item parameters in test development, the item response model, estimation procedures, and evidence of model fit should be documented.
Distractor analysis for the multiple-choice items is used to identify items that have marginal distractors or ambiguous correct responses that were overlooked by the Content Advisory Committee. In the distractor analysis, the correct response should be the most frequently selected option among high-scoring students. The discrimination value of the correct response should be substantial and positive, and the discrimination values for distractors should be lower and, generally, negative. The point biserial correlation for distractors is the correlation between the item score, treating the target distractor as the correct response, and the student’s IRT ability estimate, restricting the analysis to those students selecting either the target distractor or the keyed response. Items are flagged for subsequent reviews if the point biserial correlation for the distractor response is greater than zero. In addition, items are flagged if the proportion of students responding to a distractor exceeds the proportion selecting the keyed response.
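The distractor point-biserial described above can be sketched as follows: the sample is restricted to students choosing either the target distractor or the key, the distractor is scored as if it were correct, and that recoded score is correlated with the ability estimate. The option choices here are simulated for illustration.

```python
import numpy as np

def distractor_point_biserial(choices, abilities, distractor, key):
    # Correlation of a recoded distractor "score" with ability, among students
    # who selected either the target distractor or the keyed response
    mask = np.isin(choices, [distractor, key])
    recoded = (choices[mask] == distractor).astype(float)
    return np.corrcoef(recoded, abilities[mask])[0, 1]

rng = np.random.default_rng(3)
ability = rng.normal(size=5000)
# Simulated option choices: higher-ability students more often pick the key ("A")
pick_key = rng.random(5000) < 1 / (1 + np.exp(-ability))
choices = np.where(pick_key, "A", rng.choice(["B", "C", "D"], size=5000))
print(round(distractor_point_biserial(choices, ability, distractor="B", key="A"), 3))
```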
6.1.2 IRT STATISTICS
AIR applied the Rasch model and Masters’ Partial Credit Model to estimate the item response theory (IRT) model parameters for dichotomously and polytomously scored items, respectively. The WINSTEPS output showing the item statistics resulting from the free (unanchored) estimation of parameters for items in the operational tests is reviewed, as well as the WINSTEPS-generated item and person maps. Item fit is evaluated via the mean square Infit and mean square Outfit statistics reported by WINSTEPS, which are based on weighted and unweighted standardized residuals for each item response, respectively. These residual statistics indicate the discrepancy between observed item responses and the predicted item responses based on the IRT model. Both fit statistics have an expected value of 1. Values substantially greater than 1 indicate model underfit, while values substantially less than 1 indicate model overfit (Linacre, 2004).
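As a rough illustration of these statistics for dichotomous items, the sketch below computes mean square Outfit (the average standardized squared residual) and mean square Infit (its information-weighted counterpart) from simulated Rasch responses. This is a simplified rendering of the formulas, not the WINSTEPS implementation.

```python
import numpy as np

def rasch_fit_mean_squares(scores, theta, b):
    # Infit and Outfit mean squares per item for the dichotomous Rasch model
    p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))   # model-expected scores
    resid_sq = (scores - p) ** 2
    var = p * (1 - p)                                       # model variance of each response
    outfit = (resid_sq / var).mean(axis=0)                  # unweighted mean square
    infit = resid_sq.sum(axis=0) / var.sum(axis=0)          # information-weighted mean square
    return infit, outfit

rng = np.random.default_rng(4)
theta = rng.normal(size=3000)
b = np.linspace(-2, 2, 20)
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))
x = (rng.random((3000, 20)) < p).astype(float)
infit, outfit = rasch_fit_mean_squares(x, theta, b)
print(np.round(infit, 2), np.round(outfit, 2))
```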
6.1.3 ANALYSIS OF DIFFERENTIAL ITEM FUNCTIONING
Differential item functioning (DIF) refers to items that appear to function differently across identifiable groups, typically across different demographic groups. Identifying DIF is important because sometimes it is a clue that an item contains a cultural or other bias. Not all items that exhibit DIF are biased; characteristics of the educational system may also lead to DIF. For example, if schools in low-income areas are less likely to offer geometry classes, students at those schools might perform more poorly on geometry items than would be expected, given their proficiency on other types of items. In this example, it is not the item that exhibits bias but the curriculum. However, DIF can indicate bias, so all field-tested items were evaluated for DIF, and all items exhibiting DIF were flagged for further examination by a Fairness and Sensitivity Committee. Committee members were asked to reexamine each flagged item, using the statistics as a guide, and to make a final decision about whether the item should be excluded from the pool of potential items given its performance during field testing.
AIR conducts DIF analysis on all field-tested items to detect potential item bias across major ethnic and gender groups. In Ohio, DIF is investigated among the following group comparisons (reference group/focal group):
• Male/Female
• White/Hispanic
• White/Black
• White/Multiple ethnicities selected
AIR uses a generalized Mantel-Haenszel (MH) procedure to evaluate DIF. The generalizations include (1) adaptation to polytomous items, and (2) improved variance estimators to render the test statistics valid under complex sample designs. Because students within a district, school, and classroom are more similar than would be expected in a simple random sample of students statewide, the information provided by students within a school is not independent, so that standard errors based on the assumption of simple random samples are underestimated. We compute design-consistent standard errors that reflect the clustered nature of educational systems. While clustering is mitigated through random administration of large numbers of embedded field-test items, design effects (Kish, 1967) in student samples are rarely reduced to the level of a simple random sample.
The ability distribution is divided into a configurable number of intervals to compute the Mantel-Haenszel (MH) chi-square DIF statistics. The analysis program computes the MH chi-square value, the log-odds ratio, the standard error of the log-odds ratio, and the MH-delta statistic (Δ̂MH) for the MC items; and the MH chi-square, the standardized mean difference (SMD), and the standard error of the SMD for the CR items.
Items are classified into three categories (A, B, or C), ranging from no evidence of DIF to severe DIF according to the DIF classification convention illustrated in Exhibit 6.1.3.1. Items are also categorized as positive DIF (i.e., +A, +B, or +C), signifying that the item favors the focal group (e.g., African American/Black, Hispanic, or female), or negative DIF (i.e., –A, –B, or –C), signifying that the item favors the reference group (e.g., white or male). Items are flagged if their DIF statistics fall into the “C” category for any group. A DIF classification of “C” indicates that the item shows significant DIF and should be reviewed for potential content bias, differential validity, or other issues that may reduce item fairness.
Exhibit 6.1.3.1: DIF Classification Rules
Category  Rule
Dichotomous Items
C  MH chi-square is significant and |Δ̂MH| ≥ 1.5
B  MH chi-square is significant and |Δ̂MH| < 1.5
A  MH chi-square is not significant.
Polytomous Items
C  MH chi-square is significant and |SMD|/|SD| ≥ .25
B  MH chi-square is significant and |SMD|/|SD| < .25
A  MH chi-square is not significant.
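To illustrate the core of the MH computation for a dichotomous item, the sketch below forms the common odds ratio across ability strata and converts it to the delta metric (ΔMH = -2.35 ln αMH) used in the classification above. The strata counts are hypothetical, and the sketch omits the chi-square test and the design-consistent variance estimation described earlier.

```python
import math

def mh_delta(strata):
    # Each stratum: (ref_correct, ref_incorrect, focal_correct, focal_incorrect)
    num = sum(rc * fi / (rc + ri + fc + fi) for rc, ri, fc, fi in strata)
    den = sum(ri * fc / (rc + ri + fc + fi) for rc, ri, fc, fi in strata)
    alpha_mh = num / den                 # common odds ratio across strata
    return alpha_mh, -2.35 * math.log(alpha_mh)

# Hypothetical counts in three ability intervals
strata = [(80, 40, 60, 45), (150, 50, 120, 55), (200, 30, 160, 35)]
alpha_mh, delta_mh = mh_delta(strata)
print(round(alpha_mh, 2), round(delta_mh, 2))
```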
6.2 DATA REVIEW SUMMARY
Exhibit 6.2.1 provides a summary of items flagged for review and the number of items rejected following review.
Exhibit 6.2.1: Summary of Content Flagged Items During Field Testing — Spring 2018
Grade / Course    Number of Field-Test Items    Number of Flagged Items    Number of Rejected Items
ELA
Grade 3 135 19 5
Grade 4 129 21 5
Grade 5 166 23 6
Grade 6 150 23 5
Grade 7 130 15 3
Grade 8 177 33 12
ELA I 167 28 10
ELA II 171 47 18
Mathematics
Grade 3 104 7 5
Grade 4 106 7 3
Grade 5 109 5 3
Grade 6 123 12 3
Grade 7 118 16 7
Grade 8 122 26 10
Algebra I 433 129 108
Geometry 450 101 98
Science
Grade 5 44 8 0
Grade 8 36 2 0
Biology 72 20 11
Social Studies
American Government 42 12 7
American History 40 8 3
6.3 TEST CONSTRUCTION
The process for constructing fixed-form operational tests began after field testing and committee review of items. Once the item pool was finalized, AIR content specialists began the process of constructing test forms. Test forms were constructed in two stages—first with intact operational forms, and second, specification of field-test item locations within the forms. Operational passages and items qualified for operational forms were those that met all of the criteria established by ODE in terms of content, fairness review, and data characteristics.
6.3.1 OPERATIONAL FORM CONSTRUCTION
Each OST form is built to match the detailed test blueprint and the target distribution of item difficulty and test information. The blueprint describes the content to be covered, the types of items that measure the constructs, and other content-relevant aspects of the test. The statistical targets ensure that students receive scores of similar precision, regardless of which form of the test they receive.33
AIR’s test developers used AIR’s FormBuilder software to help construct operational forms. FormBuilder interfaces
with AIR’s Item Authoring Tool (IAT) to extract test information and interactively create test characteristics curves
(TCCs), test information curves (TICs), and Standard Error of Measurement Curves (SEMCs) as test developers built
a test map, which provides the relevant information that the content specialists need to ensure that the test forms
are statistically parallel, in addition to ensuring that the test blueprint is met.
Immediately upon generation of a test form, the FormBuilder generates a blueprint match report to ensure that all elements of the test blueprint are satisfied. In addition, the FormBuilder produces a statistical summary of form characteristics to ensure consistency of test characteristics across test forms.
The summary report also flags items with low biserial correlations, as well as very easy and very difficult items. Although items in the operational pool have passed through data review, construction of fixed-form assessments allows another opportunity to ensure that poorly performing items are not included in operational test forms.
The FormBuilder also plots the distribution of item difficulties, both classical and IRT indices, to flag extremely easy or difficult items and to ensure that the distribution of item difficulties is consistent across test forms and with the bank.
As test developers construct forms, FormBuilder-generated TCCs and SEMCs are plotted using a different color trace line for each prototype form. At this point, the test developer can see the relationship in difficulty between the target and reference forms. Exhibit 6.3.1.1 shows a sample graph of TCC differences. When examining TCC differences, note that differences can occur at specific locations along the ability range; these differences reflect different emphases in test information across forms at those ability levels. If the difficulty and error structure of the target forms is sufficiently close to that of the reference form, as in the sample TCC and SEM curves, then the item selection process concludes with newly created, multiple, parallel test forms. Once the goal of parallel forms is achieved, the information is entered into IAT, which tracks item usage and generates test maps (tables of data for the items on the form) for use in scoring, forms development, and other processes.
33 Standard 4.12 – Test developers should document the extent to which the content domain of a test represents the domain defined in the test specifications.
Exhibit 6.3.1.1: Test Characteristic Curve Differences
[Figure: plot of TCC differences (vertical axis: TCC Differences, approximately -0.03 to 0.04) against Theta (horizontal axis: -4 to 4).]
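For illustration, the sketch below computes Rasch test characteristic curves for two hypothetical 40-item forms and plots their difference across theta, producing a graph of the same kind as Exhibit 6.3.1.1; the item difficulties are invented for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

def tcc(theta, difficulties):
    # Rasch test characteristic curve: expected raw score at each theta
    p = 1 / (1 + np.exp(-(theta[:, None] - difficulties[None, :])))
    return p.sum(axis=1)

theta = np.linspace(-4, 4, 161)
target_form = np.sort(np.random.default_rng(5).normal(0.0, 1.0, 40))          # hypothetical target form
reference_form = target_form + np.random.default_rng(6).normal(0, 0.05, 40)   # slightly perturbed reference

plt.plot(theta, tcc(theta, target_form) - tcc(theta, reference_form))
plt.xlabel("Theta")
plt.ylabel("TCC Differences")
plt.title("TCC Differences")
plt.show()
```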
For the base year, test construction targets were based on psychometric characteristics of the bank. Subsequent to the base year, construction of OST test forms will seek to match the distribution of test information in the base year’s form. As illustrated in Exhibit 6.3.1.2, by evaluating test characteristics in reference to the likely location of important cut scores, test developers can ensure that test forms measure with precision in the locations where students are being classified into performance levels. Appendix I shows the test information, test information difference, and CSEM graphs that were used to evaluate the science and social studies test forms against their base year forms. The spring 2017 ELA and mathematics operational forms will be considered base year forms for future test development.
Exhibit 6.3.1.2: Test Information and Standard Errors Relative to Performance Standard Locations
6.3.2 ASSEMBLING TEST FORMS
The mechanical features of a test—arrangement, directions, and production—are just as important as the quality of the items. Many factors directly affect a student’s ability to demonstrate proficiency on the assessment, while others relate to the ability to score the assessment accurately and efficiently. Still others affect the inferences made from the test results.
When the test developer reviews a test form for content, in addition to making sure all the benchmark/indicator item requirements are met, he or she also ensures that the items on the form do not cue each other—that one item does not present material that indicates the answer to another item. This is important to ensure that a student’s
response on any particular test item is unaffected by, and is statistically independent of, a response to any other test item. This is called “local independence.” Independence is most commonly violated when there is a hint in one item about the answer to another item or when two items narrowly assess the same ability/content. In that case, a student’s true ability on the second item is not being assessed independent of the first item.
Test developers begin the form construction process by identifying the pool of items from which forms are built. This pool of items resides at a locked operational status in the Item Tracking System. Each item carries a historical record that clearly demonstrates it has survived the full review process, from internal development through client, committee, and statistical data review.
Upon identifying and reviewing the eligible pool of items, a test developer then considers the limitations of the pool, if any. For example, there might be a shortage of high depth-of-knowledge items at a particular benchmark. The test developer reviews and selects from among these items first to ensure that the constraints of the blueprint are met.
Once the items and passages for the form are selected and matched against the blueprint, the test developer reviews the form for a variety of additional content considerations, including the following.
• The items are sequentially ordered.
• Each item of the same type is presented in a consistent manner.
• The listing of the options for the multiple-choice items is consistent.
• The answer options are lettered with A, B, C, and D.
• All graphics are consistently presented.
• All tables and charts have titles and are consistently formatted.
• The answer key positions (A, B, C, and D) are used approximately equally often across the form.
• The answer key is checked by the initial reviewer and one additional independent reviewer.
• All stimuli have items associated with them.
• The topics of items, passages, or stimuli are not too similar to one another.
• There are no errors in spelling, grammar, or accuracy of graphics.
• The wording, layout, and appearance of each item match how the item was field-tested.
• There is gender and ethnic balance where perceivable in the passages or prompts.
• The passage sets do not start with or end with a constructed-response item.
• Each item and the form are checked against the appropriate style guide.
• The directions are consistent across items and are accurate.
• All copyrighted materials have up-to-date permissions agreements.
• Word counts are within documented ranges.
After completing the initial build of the form, the test developer hands it off to another content specialist, who conducts a final review against the criteria listed above. If the reviewer finds any issues, the form is sent back for revisions. If the form meets the blueprint and complies with all specified criteria, the test developer sends it to the psychometric team for review. When the form is approved by the psychometric team, the test developer uploads the item list into IAT.
6.3.3 EMBEDDED FIELD-TEST SLOTS
Each operational test form contains designated slots for administration of items that do not contribute to students’ test scores.
For online test administrations, Ohio employs AIR’s field-test engine to administer test items. As described previously, the field-test algorithm randomly assigned both the field-test items/item groups and the field-test item positions, ensuring that
• a random sample of students was administered each item; and
• for any given item, the students were sampled with equal probability.
AIR’s field-test algorithm yields a representative, randomized sample of student responses for each item. The field-test algorithm also leads to randomization of item position and the context in which items appear.
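A minimal sketch of this kind of embedded field-test assignment is shown below; the function and item identifiers are hypothetical, and this is not AIR's production field-test engine. It samples field-test items with equal probability for each student and randomizes which designated slot each sampled item occupies.

import random

def assign_embedded_field_test(operational_items, ft_pool, ft_slots, seed=None):
    # Hypothetical sketch: embed randomly sampled field-test items into the
    # designated (non-scored) slot positions of an operational form.
    rng = random.Random(seed)
    ft_items = rng.sample(ft_pool, k=len(ft_slots))      # equal-probability sample
    slot_items = dict(zip(sorted(ft_slots), rng.sample(ft_items, k=len(ft_items))))
    op = iter(operational_items)
    total = len(operational_items) + len(ft_slots)
    return [slot_items[pos] if pos in slot_items else next(op) for pos in range(total)]

# Example: a 30-item operational form with 4 embedded field-test slots.
form = assign_embedded_field_test(
    operational_items=[f"OP{i:03d}" for i in range(1, 31)],
    ft_pool=[f"FT{i:03d}" for i in range(1, 201)],
    ft_slots=[5, 12, 19, 26],
    seed=2018,
)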
For paper-pencil assessments, AIR staff constructed fixed EFT blocks. Selection of items for EFT slots was designed to ensure proportional representation of the field-test items. Items selected for paper-based EFT slots are submitted to ODE for review and approval regarding positioning and frequency within the paper-pencil forms.
7. TEST ADMINISTRATION
7.1 ELIGIBILITY
Ohio public school students in grades 3–8 participated in grade-level ELA and mathematics testing. Students enrolled in grades 5 and 8 were administered the grade-level science assessments.34 Beginning with the class of 2018, the high school end-of-course (EOC) tests will be part of the high school graduation requirements for all Ohio students. High school students take EOC assessments following coursework in ELA I, ELA II, Algebra I and Geometry (or, alternatively, Integrated Mathematics I and Integrated Mathematics II), American Government, American History, and Biology and/or Physical Science when enrolled in eligible courses.
Students with significant cognitive disabilities who are eligible for the alternate assessment do not participate in OST assessments.
7.2 ADMINISTRATION PROCEDURES
Tests were administered in both an online format and a paper-based format. For administration of the online format tests, a secure browser, developed by AIR, was required to access the computer-based tests. The browser provided a secure environment for student testing by disabling the hot keys, copy and screenshot capabilities, and access to desktop functionalities, such as the Internet and email. Other measures that protect the integrity and security of the online test are presented in the “Test Security” section of this document.
Prior to the beginning of each test administration, AIR released guidance documents, user guides and video tutorials. The materials provided information on using the Test Delivery System (TDS), Online Reporting System (ORS), and Test Information Distribution Engine (TIDE). AIR posted these training materials on the publicly available online portal for OST.
Key personnel involved with OST administration include the District Test Coordinators (DTCs), School Test Coordinators (STCs), and test administrators (TAs) who proctor the test. Materials were developed and provided by AIR for each of these roles.
The Test Administrator User Guide (included as Appendix J)35 is designed to familiarize TAs with the Test Delivery System and contains tips and screenshots throughout. The guide provides enough how-to information to enable TAs to access and navigate the Test Delivery System, and covers the following:
• Steps to take prior to accessing the system and logging in
• Navigating the TA Interface application
• The Student Interface, used by students for computer-based testing
• Secure browsers and keyboard shortcut keys
The Spring 2015 OST Test Coordinator’s Manual provides information about policies and procedures for OST Test Coordinators. This manual is updated prior to each test administration and includes test administration policies and guidance for Test Coordinators before, during, and after the testing window.
34 Standard 7.2 – The population for whom a test is intended and specifications for the test should be documented. If normative data are provided, the procedures used to gather the data should be explained; the norming population should be described in terms of relevant demographic variables; and the year(s) in which the data were collected should be reported. 35 Supporting documents (e.g., test manuals, technical manuals, user’s guides, and supplemental material) should be made available to the appropriate people in a timely manner.
The Spring 2015 OST Directions for Administration Manual and Online Testing Checklist provides easy-to-follow instructions about administering tests, creating testing sessions, monitoring sessions, verifying student information, assigning test accommodations, and starting and pausing test sessions.36 Additional instructions for administering tests to students using Braille and large print accommodated test booklets are provided in the Supplemental Instructions Appendices of the Test Coordinator’s Manual and Directions for Administration Manual.
In addition to the guidance documents and manuals, AIR provided a TA Certification Course for personnel administering tests online.37 The course provided step-by-step instructions for starting a test session in the TA Interface, marking student test settings, approving students to test and monitoring a test session. All TAs were encouraged to complete the course to prepare for the online administration.
TAs who administered the computer-based OST assessments were also encouraged to conduct a training test session using OST assessments’ sample tests.
Personnel involved with OST test administration played an important role in ensuring the validity of the assessment by maintaining both standardized administration conditions and test security.
DTCs were responsible for coordinating testing at the district level. They ensured that the STCs in each school were appropriately trained and aware of policies and procedures, and that they were trained to use the reporting system.38
STCs were ultimately accountable for ensuring that testing was conducted in accordance with the test security and other policies and procedures established by ODE. STCs were primarily responsible for identifying and training TAs. They also created or approved testing schedules and procedures for the school. STCs worked with Technology Coordinators to ensure that the necessary secure browsers were installed and any other technical issues were resolved. During the testing window, Test Coordinators needed to monitor testing progress, ensure that all students participated as appropriate, and handle testing incidents as necessary.
TAs were responsible for reviewing necessary manuals and user guides to prepare the testing environment and for ensuring that students did not have unapproved books, notes, or electronic devices out during testing. They were required to administer OST following the directions in the Directions for Administration Manual and the Online Testing Checklist.39 Any deviation in test administration was to be reported by TAs to the STC, who reported it to the DTC; the DTC then reported it to ODE.
TAs also were responsible for ensuring that only resources that were allowed for specific tests were available and no additional resources were being used during the test. The Test Coordinator’s Manual and Directions for Administration Manual addressed allowable resources.
36 Standard 4.15 – The directions for test administration should be presented with sufficient clarity so that it is possible for others to replicate the administration conditions under which the data on reliability, validity, and (where appropriate) norms were obtained. Allowable variations in administration procedures should be clearly described. The process for reviewing requests for additional testing variations should also be documented. 37 Standard 6.1 – TAs should follow carefully the standardized procedures for administration and scoring specified by the test developer and any instructions from the test user. Standard 12.16 – Those responsible for educational testing programs should provide appropriate training, documentation, and oversight so that the individuals who administer and score the test(s) are proficient in the appropriate test administration and scoring procedures and understand the importance of adhering to the directions provided by the test developer. 38 Standard 12.16 – Those responsible for educational testing programs should provide appropriate training, documentation, and oversight so that the individuals who administer and score the test(s) are proficient in the appropriate test administration and scoring procedures and understand the importance of adhering to the directions provided by the test developer. 39 Standard 6.1 – TAs should follow carefully the standardized procedures for administration and scoring specified by the test developer and any instructions from the test user.
The STC and TAs worked together to determine the most appropriate testing option(s) and testing environment and the average time needed to complete each test. The appropriate protocols were established to maintain a quiet testing environment throughout the testing session. TAs also needed to ensure that adequate time was available to start computers, load secure browsers, and log in students for computer-based tests and pass out and collect test booklets and materials for paper-pencil tests.40
7.3 ACCOMMODATIONS
Some students require testing accommodations. Accommodations are supports that are already familiar to the student because they are being used in the classroom to support instruction.
Four distinct groups of students may receive accommodations on Ohio’s State Tests:
• Students with disabilities who have an Individualized Education Program (IEP).
• Students with a Section 504 Plan who have physical or mental impairments that substantially limit one
or more major life activities, have records of such impairments, or are regarded as having such
impairments, but who do not qualify for special education services.
• Students who are English learners (ELs). (Guidelines for determining EL status can be found in the Ohio
Statewide Assessments Rules Book.) Students who have exited EL status may not receive EL
accommodations on Ohio’s State Tests.
• Students who are ELs with disabilities who have IEPs or Section 504 Plans are eligible for both
accommodations for students with disabilities and ELs. For additional guidance and information about
ELs with disabilities, access the About the Lau Resource Center for English Learners page of ODE’s
website.
For Ohio’s State Tests, accommodations are considered to be adjustments to the testing conditions, test format, or test administration that provide equitable access during assessments for students with disabilities and students who are ELs. The administration of the assessment should never be the first occasion in which an accommodation is introduced to the student.41
To the extent possible, accommodations should
• provide equitable access during instruction and assessments;
• mitigate the effects of a student’s disability;
• not reduce learning or performance expectations;
• not change the construct being assessed; and
• not compromise the integrity or validity of the inferences to be made from the assessment.
40 Standard 3.4 – Students should receive comparable treatment during the test administration and scoring process. Standard 4.5 – If the test developer indicates that the conditions of administration are permitted to vary from one student or group to another, permissible variation in conditions for administration should be identified. A rationale for permitting the different conditions and any requirements for permitting the different conditions should be documented. Standard 6.4 – The testing environment should furnish reasonable comfort with minimal distractions to avoid construct-irrelevant variance. 41 Standard 3.10 – When test accommodations are permitted, test developers and/or test users are responsible for documenting standard provisions for using the accommodation and for monitoring the appropriate implementation of the accommodation.
Ohio's Accessibility Manual (included as Appendix K) describes the accommodations and accessibility features available for the spring 2018 administration. Exhibit 7.3.1 shows the available accommodations for Ohio's State Tests.42
42 Standard 3.9 – Test developers and/or test users are responsible for developing and providing test accommodations, when appropriate and feasible, to remove construct-irrelevant barriers that otherwise would interfere with examinees’ ability to demonstrate their standing on the target constructs.
Exhibit 7.3.1: Accommodations
Accommodation Description
Additional assistive technology regularly used in instruction
Students may use a range of assistive technologies on Ohio’s State Tests including devices that are compatible with the AIR Student Testing Site, and those that are used externally (i.e., on a separate device).
For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix K.
Human read-aloud (on computer-based test)
A TA or monitor reads from the student’s computer screen to the student. For computer-based testing, most students should be able to use text-to-speech for a read-aloud. In some cases, a student’s disability may prohibit them from using the text-to-speech feature and require a human reader.
If testing in a small group, TAs should ensure that all students in the group have similar abilities so that the reader's pace meets all students' needs without being too slow or too fast for some students.
Refer to the TIDE User Guide for information about setting up groups for computer-based testing.
If a student needs this accommodation, then the person providing the accommodation must read the entire test to the student. It cannot be “as needed” or “on demand.”
Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.
Paper version of test instead of online
If a student’s class is taking Ohio’s State Tests in an online environment and a student is unable to use a computer due to the impact of his or her disability, it is allowable for the student to take the test on a paper-pencil form instead.
Situations that may require this accommodation include:
● A student with a disability who cannot participate in the online assessment due to a health-related disability, neurological disorder or other complex disability and/or cannot meet the demands of a computer-based test administration
● A student with an emotional, behavioral or other disability who is unable to maintain sufficient concentration to participate in a computer-based test administration, even with other accessibility features
● A student with a disability who requires assistive technology that is not compatible with the testing platform
If a student takes a paper-based version of the test, the student must take both parts of the test on a paper-pencil form.
Read-aloud on English language arts
“Read-aloud” as a general term is when a student is administered a test via text-to-speech, human read-aloud, screen reader or sign language interpreter.
The read-aloud accommodation for the ELA test is intended to provide access for a very small number of students to printed or written texts in the ELA tests. These students have print-related disabilities and otherwise would be unable to participate in Ohio’s State Tests because their disabilities severely limit or prevent them from decoding, thus accessing printed text.
This accommodation is not intended for students reading somewhat (only moderately) below grade level.
In making decisions on whether to provide a student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has
• a disability that severely limits or prevents him or her from accessing printed text, even after varied and repeated attempts to teach the student to do so (for example, the student is unable to decode printed text);
• blindness or a visual impairment and has not learned (or is
unable to use) Braille; or
• deafness or hearing loss and is severely limited or prevented from decoding text due to a documented history of early and prolonged language deprivation.
Before listing the accommodation in the student’s IEP or Section 504 Plan, teams/coordinators also should consider whether
• the student has access to printed text during routine instruction through a reader or other spoken-text audio format or sign language interpreter;
• the student’s inability to decode printed text or read Braille is documented in evaluation summaries from locally administered diagnostic assessments; or
• the student receives ongoing, intensive instruction and/or interventions in foundational reading skills to continue attaining the important college and career-ready skill of independent reading.
IEP teams and Section 504 Plan coordinators make decisions about who receives this accommodation. Schools should use a variety of sources as evidence (including state assessments, district assessments and one or more locally administered diagnostic assessments or other evaluation).
For students who receive this accommodation, no claims should be inferred regarding the student’s ability to demonstrate foundational reading skills.
Screen reader mode (English language arts) (formerly called enhanced accessibility mode or streamlined mode; not available 2015–2016 for grade 8 science, biology or physical science)
Screen reader mode is for students with visual impairments who use screen readers.
Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.
Sign language interpreter
Any student who is deaf or has hearing loss may have a sign language interpreter (American Sign Language, signed English, Cued Speech) for mathematics, science, and social studies.
For the purposes of statewide testing, sign language is considered a second language and should be treated the same as any other language from a translational standpoint. The test must be signed verbatim. "Signed verbatim" does not mean a word-for-word translation, as this is not appropriate for any language translation. The expectation is that the interpreter should faithfully translate, to the greatest extent possible, all of the words on the test without changing or enhancing the meaning of the content, adding information, or explaining concepts unknown to the student.
Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.
Text-to-speech for English language arts
The text-to-speech feature reads aloud the test to the student.
Student must use headphones if not tested in a one-on-one setting.
Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.
Text-to-speech tracking for English language arts
The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud.
Only students who meet the criteria to have a read-aloud accommodation on the ELA test may use this feature for ELA.
Additional assistive technology regularly used in instruction
Students may use a range of assistive technologies on Ohio’s State Tests, including devices that are compatible with the Student Testing Site and those that are used externally (i.e., on a separate device).
For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix K.
Answers transcribed by test administrator
The student records his or her answers directly on paper and the test administrator/monitor transcribes the responses verbatim into the Student Testing Site.
Braille note taker
A student who is blind or has visual impairments may use an electronic Braille note taker. For Ohio’s State Tests, grammar checker, Internet and stored file functionalities must be turned off.
The responses of a student who uses an electronic Braille note taker during Ohio’s State Tests must be transcribed exactly as entered in the electronic Braille note-taker. Only transcribed responses will be scored. Transcription
guidelines are available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).
Braille writer
A student who is blind or has visual impairments may use an electronic Braille writer. A TA must transcribe into the computer the student’s responses exactly as entered in the electronic Braille writer.
Only transcribed responses will be scored. Transcription guidelines are available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).
Calculation device or fact charts for non-calculator mathematics test or part of test
The student uses a calculation device or fact chart (addition/subtraction/multiplication/division charts) on the non-calculator sections of the mathematics assessments.
The accommodation would be permitted on test sections for which calculators are not allowed for other students. IEP teams and Section 504 Plan coordinators should carefully review the following guidelines for identifying students to receive this accommodation.
This accommodation is for students with disabilities that severely limit or prevent their abilities to perform basic calculations (i.e., single-digit addition, subtraction, multiplication, or division).
In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has a disability that severely limits or prevents the student’s ability to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division), even after varied and repeated attempts to teach the student to do so.
Before listing the accommodation in the student’s IEP or Section 504 Plan, teams also should consider whether
● the student is unable to perform calculations without the use of a calculation device, arithmetic table or manipulative during routine instruction;
● the student’s inability to perform mathematical calculations is documented in evaluation summaries from locally administered diagnostic assessments; or
● the student receives ongoing, intensive instruction and/or interventions to learn to calculate without using a calculation device, in order to ensure that the student continues to learn basic calculation and fluency.
Specific calculation devices must match the Ohio’s State Tests calculator policy.
Mathematical tools (mathematics and physical science only) —allowable tools include:
• 100s chart
Student uses these tools and manipulatives to assist mathematical problem solving. These manipulatives allow the flexibility of grouping, representing or counting without numeric labels.
• Abacus and other specialized tools for students with visual impairments
• Base 10 blocks
• Counters and counting chips
• Cubes
• Square tiles
• Two-colored chips
A student with a visual impairment may need other mathematical tools such as a large print ruler, Braille ruler, tactile compass or Braille protractor.
ODE will review and revise this list annually as needed.
Scribe
The student dictates responses verbally, using a speech-to-text device or an augmentative or assistive communication device (e.g., picture or word board), or by signing, gesturing, pointing, or eye gazing.
Grammar checker, Internet, and stored files functionalities must be turned off. Word prediction must also be turned off for students who do not receive this accommodation. The student must test in a separate setting.
In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators should consider whether the student has
● a physical disability that severely limits or prevents the student’s motor process of writing through keyboarding; or
● a disability that severely limits or prevents the student from expressing written language, even after varied and repeated attempts to teach the student to do so.
Before listing the accommodation in the student’s IEP or Section 504 Plan, teams/coordinators should also consider whether
● the student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments;
● the student routinely uses a scribe for written assignments; and
● the student receives ongoing, intensive instruction and/or interventions to learn written expression, as deemed appropriate by the IEP team or Section 504 Plan coordinator.
Student’s responses must be transcribed exactly as dictated.
Information about the scribing process is available in Appendix K (Appendix C: Protocol for the Use of the Scribe Accommodation and for Transcribing Student Responses).
Specialized calculation device
A student uses a specific calculation device (for example, a large key, talking or other adapted calculator) on the calculator part of the mathematics assessments. If a talking calculator is used, the student must use headphones or test in a separate setting.
The student must qualify for the "calculation device or fact charts for non-calculator mathematics test or part of test" accommodation to use a specialized calculator on those tests.
Word prediction external device
The student uses an external word prediction device that provides a bank of frequently or recently used words on screen as a result of the student entering the first few letters of a word.
The student must be familiar with the use of the external device prior to assessment administration. The device cannot connect to the Internet or save information.
In making decisions whether to provide the student with this accommodation, IEP teams and Section 504 Plan coordinators are instructed to consider whether the student has
● a physical disability that severely limits or prevents the student from writing or keyboarding responses; or
● a disability that severely limits or prevents the student from recalling, processing and expressing written language, even after varied and repeated attempts to teach the student to do so.
Before listing the accommodation in the student's IEP or Section 504 Plan, teams/coordinators are instructed to consider whether
● the student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments; and
● the student receives ongoing, intensive instruction and/or intervention in language processing and writing, as deemed appropriate by the IEP team and Section 504 Plan coordinator.
7.4 TEST SECURITY
Maintaining a secure test environment is critical to ensure that scores represent what students know and are able to do. Because OST assessments were administered both as a paper-based and a computer-based assessment, test security procedures must guard against item exposure, cheating, or other security problems for all testing modes.
The test security procedures involve the following:
• Procedures to ensure security of test materials
• Procedures to investigate test irregularities
The Test Coordinator's Manual provides detailed instructions on test security policies and procedures, briefly described as follows. All test items, test materials, and student-level testing information are secure documents and must be appropriately handled. Secure handling protects the integrity, validity, and confidentiality of assessment questions, prompts, and student results. Any deviation in test administration must be reported to ensure the validity of the assessment results. Mishandling of test administration puts student information at risk and disadvantages the student. Failure to honor security severely jeopardizes district and state accountability requirements and the accuracy of student data.
The security of all test materials must be maintained before, during, and after test administration. Under no circumstances are students permitted to assist in preparing secure materials before testing or in organizing and returning materials after testing. After any administration, whether initial or make-up, secure materials (e.g., test booklets) must be returned immediately to the STC and placed in locked storage. Secure materials must never be left unsecured and must not remain in classrooms or be taken off the school’s campus overnight.43
It is unethical and shall be viewed as a violation of test security for any person to
• capture images of any part of the test via any electronic or photographic device;
• duplicate in any way any part of the test;
• examine, read, or review the content of any portion of the test;
• disclose or allow to be disclosed the content of any portion of the test before, during, or after test
administration;
• discuss any OST test item before, during, or after test administration;
• allow students’ access to any test content prior to testing;
• allow students to share information during the test administration;
• read any parts of the test to students except as indicated in the test administration directions or as part of
an accommodation;
• influence students’ responses by making any kind of gestures (for example, pointing to items, holding up
fingers to signify item numbers or answer options) while students are taking the test;
• instruct students to go back and reread/redo responses after they have finished their test since this
instruction may only be given before the students take the test;
• review students’ responses;
• read or review students’ scratch paper; or
• participate in, direct, aid, counsel, assist in, encourage, or fail to report any violations of these test
administration security procedures.
Additional security violations for paper-pencil testing include
• reading or reviewing any test booklet during or after testing,
• changing any student response in the student’s scorable document,
• erasing any student response in the student’s scorable document,
• erasing any stray marks in the student’s scorable document, and
• failing to return all test booklets and other test materials.
TAs and proctors may not assist students in answering questions or reword or explain any test content. No test content may ever be discussed before, during, or after test administration.
All regular test booklets and special documents (large print and Braille) are secure documents and must be protected from loss, theft, and reproduction in any medium. A unique identification number and a bar code were printed on the front cover of all test booklets. Schools were expected to maintain test security by using the security numbers to account for all secure test materials before, during, and after test administration until the time they were returned to the contractor.
43 Standard 6.7 – Test users have the responsibility of protecting the security of test materials at all times. Standard 7.9 – If test security is critical to the interpretation of test scores, the documentation should explain the steps necessary to protect test materials and to prevent inappropriate exchange of information during the test administration session.
To access the computer-based tests, the AIR-developed secure Internet browser was required. The secure browser provides a secure environment for student testing by disabling the hot keys, copy and screenshot capabilities, and access to the desktop (Internet, email, and other files or programs installed on school machines). The secure browser did not display the IP address or other URL for the site. Users could not access other applications from within the secure browser, even if they knew the keystroke sequences. The "back" and "forward" browser options were not available, except as allowed in the testing environment as testing navigation tools. Students were not able to print from the secure browsers. During testing, the device was locked down, and students were required to "Pause" (to save the test for another session) or "Submit" (to indicate they were finished with the test). The secure browser was designed to ensure test security by prohibiting access to external applications or navigation away from the test. See the Test Administrator User Guide (Appendix J) for further details.44
Throughout the testing window, TAs were to report any test incidents (e.g., disruptive students, loss of Internet connectivity) to the STC immediately. A test incident could include testing that was interrupted for an extended period of time due to a local technical malfunction or severe weather. STCs notified DTCs of any test irregularities that were reported, and DTCs were to discuss test incidents with ODE.
44 Standard 6.16 – Transmission of individually identified test scores to authorized individuals or institutions should be done in a manner that protects the confidential nature of the scores and pertinent ancillary information. Standard 8.6 – Test data maintained or transmitted in data files, including all personally identifiable information (not just results), should be adequately protected from improper access, use, or disclosure, including by reasonable physical, technical, and administrative protections as appropriate to the particular data set and its risks, and in compliance with applicable legal requirements. Use of facsimile transmission, computer networks, data banks, or other electronic data-processing or transmittal systems should be restricted to situations in which confidentiality can be reasonably assured. Users should develop and/or follow policies, consistent with any legal requirements, for whether and how students may review and correct personal information.
8. REPORTING AND INTERPRETING OST SCORES
A set of score reports is provided for each administration that summarizes student performance in each grade and content area. Score reports provide data on the performance of individual students and on the aggregated performance of students at various levels—such as state, districts, schools, and teachers. The test data are based on all students who participated in OST assessments for the 2017–2018 school year.
The score reports include information describing student progress toward mastery of the state learning standards. OST provides individual student score reports that are mailed directly to families, detailing student performance on overall tests and subscores. In addition, Ohio offers detailed individual and aggregate level data to educators via AIR’s Online Reporting System (ORS), which provides score data for each OST assessment, both computer-based and paper-pencil. The ORS allows users to compare score data between individual students and the school, district, or overall state, and also provides information about performance on subscore categories.
8.1 APPROPRIATE USES FOR SCORES AND REPORTS
The state provides a variety of resources for helping parents and educators understand and apply student performance results to improve student learning and classroom instruction. All reporting systems for OST assessments, both paper-based and online, are designed with stakeholders, such as teachers, parents, and students (who are not technical measurement experts) in mind and ensure that test results are used in ways that support valid inferences about student achievement and contribute to student learning.45 For example, similar colors are used for groups of similar elements, such as performance levels, throughout the design. This design strategy guides the reader to compare like elements and avoid comparison of dissimilar elements.
Sample reports are available on the portal. The sections below provide additional guidance for interpreting results.
45 Standard 6.10 – When test score information is released, those responsible for testing programs should provide interpretations appropriate to the audience. The interpretations should describe in simple language what the test covers, what scores represent, the precision/reliability of the scores, and how scores are intended to be used. Standard 13.5 – Those responsible for the development and use of tests for evaluation or accountability purposes should take steps to promote accurate interpretations and appropriate uses for all groups for which results will be applied.
8.2 REPORTS PROVIDED
FAMILY REPORTS
Ohio provides full-color individual student reports to families of all OST testers. Reports are designed to be useful to families, and include
• full color to aid readers’ interpretation of the data;
• scale scores and performance level descriptors;
• scoring category performance, including descriptions of what was assessed and what results mean for each
scoring category to guide parents and students in their understanding of student scores; and
• school, district, and state average scores for comparative purposes.
8.2.1 ONLINE REPORTING SYSTEM FOR EDUCATORS
OST results are reported using AIR’s Online Reporting System, which is designed to support educators as they evaluate the needs of their students and reflect on their own curricula and practice. Navigation in the system mirrors the instructional decision-making process, meaning the user can intuitively navigate in any of the three dimensions inherent in the data, helping the user answer three kinds of questions:
1) Who? The data can be displayed at levels of aggregation anywhere from the individual level for a specific
student up to the entire state. Demographic breakdowns are immediately available at any level of
aggregation.
2) What? The subject area data can be broken down into finer or coarser “chunks” of content. Navigating this
dimension allows the user to travel from subject to scoring category and back.
3) When? When data are available over time, the system allows the user to view a data trend over time or
toggle to a fixed point in time.
Each navigational step changes the reporting display, providing richer context when interpreting a class’s or individual student’s performance. While the system contains many reports, the interface design encourages users to think about the substantive, educational questions to which they need answers and access information from that perspective. In addition, while finding and interpreting data from multiple online assessments can easily become overwhelming, the ORS minimizes information overload for educators and administrators by organizing score information in a conceptual framework that helps users quickly locate the right level of data, evaluate its impact, and identify the concrete actions they can take to help students improve.
OST assessments’ online system produces the following online score reports: individual student reports and aggregate reports at the teacher, school, district, and state level.
OST assessments' online score reports are structured hierarchically. Upon selecting "Home" on the Welcome page, a user is taken to the Home Page Dashboard, which displays the number of students tested and the percent of students passing by grade and content area. Users who have access to multiple districts or schools are first required to select a single district or school. Once an aggregate unit is selected, the summary table of student performance for the selected entity displays. For more detailed information for a subject and grade, the user must select that subject and grade.
On each aggregate report, the summary report presents the results for the selected aggregate unit as well as the results for the state and the aggregate unit above the selected aggregate. For example, if a school is selected on the school report page, the summary results of the state and the district the school belongs to are provided above the school summary results so that the school performance can be compared with the district and the state. If a teacher is selected, the summary results for state, district, and school are provided above the summary results for the teacher.
For a more detailed overview of the Online Reporting System, users can log in and select "Help" to view the ORS User Guide.
Exhibit 8.2.2.1 summarizes the types of online score reports available and the levels at which they can be viewed (e.g., student, roster, teacher, school, district).
Exhibit 8.2.2.1: OST Online Score Report Summary
Type of Report Page | Level of Aggregation | Description
Home Page Dashboard | District, school, and teacher | Summary of performance and participation (Number Tested and Percent Passing) across grades and subjects or courses
Subject Detail | District | Average scale score, percent passing, and percent at each performance level for a district and each school within that district; ability to disaggregate data by subgroup
Subject Detail | School | Average scale score, percent passing, and percent at each performance level for a school and each teacher within that school; ability to disaggregate data by subgroup
Subject Detail | Teacher | Average scale score, percent passing, and percent at each performance level for a teacher and each class roster associated with that teacher; ability to disaggregate data by subgroup
Scoring Category Detail | District, school, teacher, and roster | Performance on the scoring category for a subject and a grade for all students and by subgroups; a relative strength and weakness indicator is also reported for each category
Student Roster | School, teacher, and roster | List of students with performance on overall subject and scoring categories for a group of students associated with a school, teacher, or roster
Individual Student Report | Student | Student performance for a selected subject; report includes performance on each scoring category and performance on the writing essay dimensions, if applicable
SUBJECT DETAIL REPORTS
The screenshot above demonstrates the Subject Detail Reports at the district level. Aggregated subject reports show average performance for the state, districts, schools, teachers, and classes. Bar chart displays show the distribution of students’ performance levels. These reports provide users with rosters of schools, teachers, and classes, allowing for simple comparisons across smaller groups.
The Subject Detail Report page shows the following data:
• Student Count: Number of students who completed the selected test
• Average Scale Score: Average scale score of students who completed the selected test
• Percent Proficient: The percent of tested students reaching the proficient threshold on the selected test
• Percent at Each Performance Level: The distribution of students across each of the four performance levels
SCORING CATEGORY DETAIL REPORTS
The screenshot above shows the Scoring Category Detail Reports. Aggregated scoring category detail reports follow the layout of the subject detail reports, displaying the performance data for the state, districts, schools, teachers, and classes. These can be accessed by selecting the desired entity and choosing “Reporting Category” in the drop-down menu.
STUDENT ROSTER REPORTS
The screenshot above shows the Student Roster Report, which provides users with performance data for a group of students associated with a teacher or a school, as defined in TIDE. The report includes each student's unique state ID, overall subject score, and overall subject performance level. Using the exploration menu, a user can also view each student's scoring category performance for the selected test.
The table that appears on the Student Roster Report page shows the following data:
• Scale score: The score of each student who completed the test
• Performance level: Represents levels of overall subject mastery
• Scoring Categories: Represents levels of scoring category mastery
INDIVIDUAL STUDENT REPORTS
The screenshot above shows the Individual Student Report, which summarizes a student's performance in an organized, easy-to-understand document that can be distributed to educators, parents, and students. The student's performance is plotted against cut scores on a barrel chart that provides detailed explanations of each performance level. A student's scoring category scores and comparison data for the state, district, and school are provided in separate tables. The report can be exported as a PDF document, and users can batch-print multiple students' reports, allowing for electronic distribution of student reports.
The Individual Student Report page contains the following information:
• Barrel chart: Presents the student’s performance and where his or her performance lies on the OST
assessments’ scale. The following information is presented in the barrel chart:
o Scale score: The score the student received on the selected test.
o Performance-level descriptors (PLDs): PLDs define the content area knowledge, skills, and
processes that students at a performance level are expected to possess.
o Cut scores: The barrel chart shows the cut scores for each performance level for a particular
grade and subject.
• Student performance on scoring categories: Shows the student’s performance on each of the scoring
categories, including text descriptions of what was assessed in each category and what the student’s
results mean.
PARTICIPATION REPORTS
The screenshot above shows the Participation Report. To help schools manage their test schedules, allocate testing resources, and prioritize testing, the online reporting system offers participation reports for online testers. From the "Data Files and Participation Reports" drop-down, users can select "Plan and Manage Testing" to generate up-to-the-minute reports showing students' test status. In addition, users can set testing schedules, monitor testing progress across schools, and track students' participation based on their performance on previous tests.
8.3 INTERPRETATION OF SCORES
Ohio provides a variety of resources to help parents and educators understand and apply student performance results to improve student learning, including interpretive guides for navigating the online reporting system, and understanding paper family reports.46 This section describes many of the measures presented in the paper and online score reports.
8.3.1 SCALE SCORES
The student's performance in each content area assessment is summarized in an overall test score referred to as a scale score. The number of items a student answers correctly and the difficulty of the items presented are used to statistically transform theta scores (student ability expressed in logits) to scale scores, so that scores from different sets of items (test forms) can be meaningfully compared on a linear and invariant scale. The scale score indicates how well students perform on each subject area assessment and can indicate how much students know and are able to do. Scale scores can also be used to compare student performance across administrations for the same grade and content area; for example, an average scale score of 700 for grade 5 students in the 2016–2017 school year indicates the same level of achievement as an average scale score of 700 for grade 5 students in the 2017–2018 school year. Scale scores are expressed as integers to facilitate communication, whereas theta scores are cumbersome because they require several decimal places.
As presented in chapter 9, Scaling and Equating, ability estimates are truncated at ±3.5 logits on the theta scale prior to transformation to the OST assessments’ reporting scale. This truncation rule suppresses reporting extreme scale scores where the standard error of the estimate is very large. Overall scale scores for science and social studies are mapped into five performance levels using four performance standards (i.e., cut scores). The OST assessments’ scale score ranges can be found in Exhibit 8.3.1.1.
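As a sketch of this transformation, the truncated theta estimate and resulting scale score can be written as shown below; the slope a and intercept b stand in for the grade- and subject-specific scaling constants documented in chapter 9 and are not reproduced here.

\theta^{*} = \min\!\left\{\max\!\left(\hat{\theta},\,-3.5\right),\; 3.5\right\}, \qquad SS = \operatorname{round}\!\left(a\,\theta^{*} + b\right)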
Exhibit 8.3.1.1: OST Scale Score Ranges
Assessment Limited Basic Proficient Accelerated Advanced
ELA
Grade 3 545–671 672–699 700–724 725–751 752–863
Grade 4 549–673 674–699 700–724 725–752 753–846
Grade 5 552–668 669–699 700–724 725–754 755–848
Grade 6 555–667 668–699 700–724 725–750 751–851
Grade 7 568–669 670–699 700–724 725–748 749–833
Grade 8 586–681 682–699 700–724 725–743 744–805
ELA I 606–682 683–699 700–724 725–738 739–800
ELA II 597–678 679–699 700–724 725–741 742–808
Mathematics
Grade 3 587–682 683–699 700–724 725–752 753–818
Grade 4 605–685 686–699 700–724 725–758 759–835
Grade 5 624–686 687–699 700–724 725–748 749–804
Grade 6 616–681 682–699 700–724 725–743 744–790
Grade 7 605–683 684–699 700–724 725–754 755–806
46 Standard 12.18 – In educational settings, score reports should be accompanied by a clear presentation of information on how to interpret the scores, including the degree of measurement error associated with each score or classification level, and by supplementary information related to group summary scores. In addition, dates of test administration and relevant norming studies should be included in score reports.
Assessment Limited Basic Proficient Accelerated Advanced
Grade 8 633–689 690–699 700–724 725–743 744–774
Algebra 618–681 682–699 700–724 725–753 754–814
Geometry 604–677 678–699 700–724 725–755 756–810
Integrated Math I 618–681 682–699 700–724 725–753 754–814
Integrated Math II 594–676 677–699 700–724 725–757 758–813
Science
Grade 5 559–663 664–699 700–724 725–752 753–845
Grade 8 575–673 674–699 700–724 725–765 766–868
Biology 617–684 685–699 700–724 725–734 735–823
Physical Science 634–683 684–699 700–724 725–748 749–815
Social Studies
American History 619–683 684–699 700–724 725–737 738–800
American Government 642–686 687–699 700–724 725–738 739–774
8.3.2 PERFORMANCE STANDARDS
Performance standards are the points (or cut scores) on the achievement scale that differentiate performance levels. Four performance standards are used to classify students into one of five proficiency levels. Performance standard cut scores were recommended by panels of Ohio educators following the first administration of OST in 2015, and subsequently adopted by the Ohio Board of Education. Panelists engaged in a rigorous, technically sound standard setting process that is summarized in the Performance Standards section of this technical manual, and documented in detail in the 2015 “Recommending Ohio Computer-Based Assessment Performance Standards” technical report, available from ODE.47
Performance levels represent levels of mastery with respect to Ohio’s Learning Standards for a particular subject and grade. Performance levels are labeled as Limited, Basic, Proficient, Accelerated, and Advanced in accordance with Ohio Revised Code. Performance level labels and performance level descriptors (PLDs) are developed to define and illustrate the level of achievement that characterizes students in each group.
Performance levels provide context for interpreting the meaning of scale scores. While scale scores indicate how much a student knows and is able to do, performance levels indicate how much students must know and be able to do to receive a Limited, Basic, Proficient, Accelerated, or Advanced label for a subject area assessment. Teachers can evaluate how their students are performing compared with other students in the school, LEA, and state in terms of the percentage of students in each performance level for the same grade and content area.
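As an illustration of how cut scores map scale scores to performance levels, the sketch below classifies a scale score using the four cut scores implied by Exhibit 8.3.1.1 for grade 3 ELA (672, 700, 725, and 752). The function name and structure are hypothetical; this is a simple illustration, not the operational scoring or reporting code.

from bisect import bisect_right

PERFORMANCE_LEVELS = ["Limited", "Basic", "Proficient", "Accelerated", "Advanced"]

def performance_level(scale_score, cut_scores):
    # cut_scores: the lowest scale score in each level above Limited, in ascending order.
    return PERFORMANCE_LEVELS[bisect_right(cut_scores, scale_score)]

grade3_ela_cuts = [672, 700, 725, 752]       # from Exhibit 8.3.1.1
performance_level(699, grade3_ela_cuts)      # 'Basic'
performance_level(700, grade3_ela_cuts)      # 'Proficient'
performance_level(752, grade3_ela_cuts)      # 'Advanced'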
8.3.3 PERFORMANCE-LEVEL DESCRIPTORS
PLDs define the content area knowledge, skills, and processes that students at a performance level are expected to possess. The descriptions of Limited, Basic, Proficient, Accelerated, or Advanced performance are the public statements about what and how much Ohio educators want students to know and be able to do for each grade level and content area. The PLD development process includes rounds of review from test development experts within ODE, AIR, and Ohio educators and parents. The very detailed PLDs are summarized and included in score reports to provide context for the score and are designed to help parents understand what their students can and cannot do.
47 Standard 5.21 – When proposed score interpretations involve one or more cut scores, the rationale and procedures used for establishing cut scores should be documented clearly. Standard 7.4 – Test documentation should summarize test development procedures, including descriptions and the results of the statistical analyses that were used in the development of the test, evidence of the reliability/precision of scores and the validity of their recommended interpretations, and the methods for establishing performance cut scores.
9. PERFORMANCE STANDARDS
In the summer of 2015, following the first administration of OST assessments in science and social studies, AIR convened panels of Ohio educators to recommend performance standards on each of the science and social studies assessments. Details of the panels, procedures, and outcomes are documented in the “Recommending Ohio Computer-Based Assessment Performance Standards” technical report, which is available from ODE.
To comply with legislatively mandated reporting requirements, performance standards for ELA and mathematics were recommended prior to any test administrations. In December 2015, AIR convened panels of Ohio educators to recommend performance standards on each of the ELA and mathematics assessments based on an ordered-item booklet (OIB) that comprised AIRCore items that had been previously calibrated and equated based on administration in other statewide assessments. Details of the panels, procedures, and outcomes are documented in the “Recommending Performance Standards for Ohio’s State Tests” report, which is available from ODE. This section briefly describes the procedures educators used to recommend standards and the resulting performance standards.
9.1 STANDARD SETTING PROCEDURES
Student achievement on OST assessments is classified into five performance levels: Limited, Basic, Proficient, Accelerated, and Advanced. Interpretation of OST scores rests fundamentally on how student ability estimates, indicated by test scores, relate to the performance standards that define the extent to which students have achieved the expectations defined in Ohio’s Learning Standards. OST test scores are reported with respect to five performance levels, demarcating the degree to which Ohio students have achieved the learning expectations defined by Ohio’s Learning Standards. The levels are defined in Ohio Revised Code 3301.0710(A)(2). The cut score establishing the Proficient level of performance is the most critical, since it indicates that students are meeting grade-level expectations for achievement of Ohio’s Learning Standards and that they are prepared to benefit from instruction at the next grade level. Additionally, the Accelerated level is important, as it indicates that students are on track to pursue post-secondary education or enter the workforce. Procedures used to adopt performance standards for OST assessments are therefore central to the validity of test score interpretations.
Following the first operational administration of OST assessments in spring 2015, a standard-setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards for reporting student achievement of Ohio’s Learning Standards in science and social studies. Ohio educators, serving as standard-setting panelists, engaged in a standardized and rigorous process to recommend performance standards. The workshops employed the Bookmark procedure (Mitzel, Lewis, Patz, & Green, 2001; Lewis, Mitzel, & Green, 1996), a widely used method in which standard-setting panelists use their expert knowledge of Ohio’s Learning Standards and student achievement to map the performance level descriptors adopted by the Ohio State Board of Education onto an OIB based on the first operational test form administered to students in spring 2015.
Similar procedures were adopted to recommend performance standards for Ohio’s State Tests in ELA and
mathematics, but with notable differences. The ELA and mathematics standard setting workshops were conducted
in December 2015, prior to test administration. The AIRCore items used to construct those initial test forms had
been previously field tested as part of other statewide assessments, with IRT parameters calibrated and linked to a
common scale based on those state test administrations. Further, because the ELA and mathematics assessments
had never been administered to Ohio students, impact of recommended cut scores had to be estimated based on
the performance of students who had been administered tests that could be linked to the AIRCore scale. Because
Washington has NAEP reading and mathematics scores that are very similar to those of Ohio, estimated impact of
recommended performance standards was projected from student performance on the Smarter Balanced
assessments administered in Washington State.
Thus, panelists in both workshops were provided with contextual information to help inform their primarily
content-driven performance standard recommendations. Panelists were charged with recommending
performance standards comparable to other important assessment systems, including multi-state consortia
(PARCC and Smarter Balanced) and national benchmarks such as NAEP. To facilitate comparisons of Ohio
performance standards with other important benchmark assessments, panelists were provided with the locations
of performance standards from these other assessment systems in their OIBs. Performance standard locations for
the following assessments were provided as part of panelists’ OIB review.
ELA and mathematics workshops:
• PARCC ELA and mathematics performance standards in grades 3–8 and end-of-course assessments
• Smarter Balanced ELA and mathematics performance standards in grades 3–8, since Smarter Balanced includes only a grade 11 assessment in high school
• NAEP performance standards in reading and mathematics in grades 4 and 8 (and interpolated for grade 6)
• ACT college-ready performance standard for reading and mathematics in grade 11
Science and social studies workshops:
• ACT college-ready performance standard for Physical Science, Biology, American Government, and American History
• NAEP reading performance standards for American Government and American History
• NAEP reading performance standards for social studies grade 4 and grade 6 (interpolated value for grade 6)
• TIMSS science assessment benchmarks for science grade 5 and grade 8 (interpolated value for grade 5)
Because the AIRCore items used to build OST assessments in ELA and mathematics can be linked to the reporting
scales of the Smarter Balanced assessments, the locations of the Smarter Balanced performance standards were
mapped directly to the OIBs. The locations of performance standards for the PARCC, NAEP, and ACT assessments
were inferred through estimated impact rates (the percentage of students expected to meet or exceed
the performance level indicated by each page in the OIB).
In addition, following recommendation of performance standards in each of the panels, panelists were provided
with feedback about the vertical articulation of their recommended performance standards so that they could
view how the locations of their recommended performance standards for each of the grade-level and EOC
assessments sat in relation to the cut score recommendations for the other assessments. This approach allowed
panelists to view their cut score recommendations as a coherent system of performance standards, and further
reinforces the interpretation of test scores as indicating not only achievement of current grade-level standards but
also preparedness to benefit from instruction in the subsequent grade levels.
9.1.1 PERFORMANCE-LEVEL DESCRIPTORS
Student achievement on OST assessments is classified into five performance levels: Limited, Basic, Proficient,
Accelerated, and Advanced as prescribed by Ohio Revised Code 3301.0710(A)(2). Performance level descriptors
(PLDs) define the content area knowledge and skills that students at each performance level are expected to
demonstrate. The standard-setting panelists based their judgments about the location of the performance
standards on the PLDs as well as Ohio’s Learning Standards. OST assessments’ PLDs describe five levels of
achievement:
• Limited
• Basic
• Proficient
• Accelerated
• Advanced
Prior to convening the standard setting workshops, ODE, in consultation with AIR, drafted PLDs for each test that
describe the range of achievement encompassed by each performance level on the test. The PLDs were designed to
be clear and concrete and to reflect Ohio’s expectations for proficiency based on Ohio’s Learning Standards. ODE
considered any need for clarification or revision that arose throughout the standard setting process prior to
publishing the final versions of the PLDs following the standard setting workshop. Ohio’s PLDs are available at
education.ohio.gov.
9.2 RECOMMENDED PERFORMANCE STANDARDS
Panelists were tasked with recommending four performance standards (the Basic, Proficient, Accelerated, and
Advanced cut scores) that resulted in five performance levels (Limited, Basic, Proficient, Accelerated, and Advanced). The final
recommended performance standards for each OST assessment are provided in Exhibits 9.2.1–9.2.4, which include the
panelist-recommended OIB page numbers, theta value of the performance standard (in logit scale), as well as the
percentage of Ohio students classified as meeting or exceeding each standard. Following the standard-setting
workshop, panelist recommendations were submitted to Ohio’s State Board of Education; the Board formally
adopted the standards in spring 2015 for science and social studies and in winter 2015 for ELA and mathematics.
The estimated percentage of students at each performance level for each test is shown in Exhibit 9.2.5.
Exhibit 9.2.1 Final Recommended Performance Standards for OST Assessments — ELA
Test | Performance Level | Ordered-Item Booklet Page | Theta | Estimated Percentage of Students At or Above Performance Standard | Approximate Percentage of Raw Score Points
Grade 3
Basic 8 -0.84 75 33
Proficient 14 -0.23 56 46
Accelerated 26 0.32 36 56
Advanced 36 0.92 17 67
Grade 4
Basic 4 -0.56 73 38
Proficient 17 0.06 54 50
Accelerated 30 0.65 33 63
Advanced 42 1.32 14 73
Grade 5
Basic 6 -0.74 78 38
Proficient 15 0.00 57 50
Accelerated 28 0.59 36 60
Advanced 41 1.29 15 73
Grade 6
Basic 6 -0.88 80 31
Proficient 13 -0.12 58 48
Accelerated 27 0.47 37 60
Advanced 48 1.09 18 71
Grade 7
Basic 5 -0.80 76 35
Proficient 15 -0.01 55 50
Accelerated 36 0.65 32 63
Advanced 49 1.29 14 71
Grade 8
Basic 9 -0.43 72 40
Proficient 21 0.15 55 52
Accelerated 41 0.95 28 69
Advanced 56 1.55 13 79
ELA I
Basic 3 -0.71 71 35
Proficient 14 -0.11 53 48
Accelerated 36 0.79 24 65
Advanced 48 1.31 12 73
ELA II
Basic 6 -0.77 72 35
Proficient 19 -0.08 52 49
Accelerated 40 0.75 25 65
Advanced 53 1.30 11 76
Exhibit 9.2.2 Final Recommended Performance Standards for OST Assessments — Mathematics
Test | Performance Level | Ordered-Item Booklet Page | Theta | Estimated Percentage of Students At or Above Performance Standard | Approximate Percentage of Raw Score Points
Grade 3
Basic 17 -0.61 82 39
Proficient 24 -0.08 66 49
Accelerated 36 0.68 36 63
Advanced 51 1.53 11 75
Grade 4
Basic 5 -1.05 78 30
Proficient 13 -0.61 65 38
Accelerated 31 0.15 37 52
Advanced 49 1.19 9 70
Grade 5
Basic 7 -1.05 78 29
Proficient 13 -0.54 65 39
Accelerated 33 0.43 34 59
Advanced 55 1.35 10 76
Grade 6
Basic 8 -0.83 79 31
Proficient 18 -0.12 62 46
Accelerated 38 0.89 31 69
Advanced 60 1.65 13 81
Grade 7
Basic 3 -0.76 75 35
Proficient 13 -0.19 61 46
Accelerated 40 0.68 36 65
Advanced 60 1.74 11 81
Grade 8
Basic 12 -0.69 74 38
Proficient 24 -0.18 63 46
Accelerated 50 1.06 32 69
Advanced 62 2.00 13 83
Algebra
Basic 2 -1.21 72 27
Proficient 8 -0.57 58 38
Accelerated 31 0.32 36 58
Advanced 56 1.37 13 76
Geometry
Basic 11 -0.27 75 33
Proficient 24 0.47 59 48
Accelerated 43 1.32 38 63
Advanced 52 2.37 15 80
Integrated Math I
Basic 4 -1.20 72 27
Proficient 11 -0.57 58 40
Accelerated 30 0.32 36 58
Advanced 56 1.37 13 78
Integrated Math II
Basic 7 -0.14 72 39
Proficient 26 0.60 56 52
Accelerated 43 1.40 36 67
Advanced 53 2.46 13 81
Exhibit 9.2.3 Final Recommended Performance Standards for OST Assessments — Science
Test | Performance Level | Ordered-Item Booklet Page | Theta | Estimated Percentage of Students At or Above Performance Standard | Approximate Percentage of Raw Score Points
Grade 5
Basic 7 -0.92 88 30
Proficient 26 -0.04 62 48
Accelerated 41 0.57 38 63
Advanced 60 1.25 17 75
Grade 8
Basic 9 -1.14 82 25
Proficient 21 -0.51 60 38
Accelerated 39 0.09 37 52
Advanced 61 1.08 10 73
Physical Science
Basic 6 -1.56 87 20
Proficient 19 -0.94 63 29
Accelerated 45 0.02 22 48
Advanced 63 0.95 4 70
Biology
Basic 13 -1.19 79 21
Proficient 26 -0.67 60 30
Accelerated 49 0.18 27 50
Advanced 63 0.51 17 57
Exhibit 9.2.4 Final Recommended Performance Standards for OST Assessments — Social Studies
Test | Performance Level | Ordered-Item Booklet Page | Theta | Estimated Percentage of Students At or Above Performance Standard | Approximate Percentage of Raw Score Points
Grade 4
Basic 8 -0.92 88 33
Proficient 19 -0.40 70 44
Accelerated 41 0.57 29 64
Advanced 62 1.58 5 81
Grade 6
Basic 15 -0.22 77 44
Proficient 30 0.36 57 58
Accelerated 44 0.97 36 70
Advanced 60 1.71 13 83
American History
Basic 9 -0.98 88 31
Proficient 21 -0.37 71 42
Accelerated 43 0.60 35 64
Advanced 58 1.12 18 73
American Government
Basic 10 -1.11 90 27
Proficient 22 -0.41 67 39
Accelerated 49 0.92 18 69
Advanced 69 1.66 4 81
Exhibit 9.2.5 shows the estimated percentage of students classified at each performance level based on the final panelist-recommended standards for each OST assessment.
Exhibit 9.2.5 Estimated Percentage of Students Classified in Each OST Performance Level
Test Limited Basic Proficient Accelerated Advanced
ELA
Grade 3 25 20 19 19 17
Grade 4 27 19 20 19 14
Grade 5 22 21 21 21 15
Grade 6 20 22 21 19 18
Grade 7 24 22 22 18 14
Grade 8 28 17 26 16 13
ELA I 29 18 29 13 12
ELA II 28 20 27 14 11
Mathematics
Grade 3 18 16 29 25 11
Grade 4 22 13 28 29 9
Grade 5 22 14 31 24 10
Grade 6 21 17 31 18 13
Grade 7 25 14 25 25 11
Grade 8 26 11 30 20 13
Algebra 28 14 22 22 13
Geometry 25 16 21 23 15
Integrated Math I 28 14 22 22 13
Integrated Math II 28 17 20 23 13
Science
Grade 5 12 26 24 21 17
Grade 8 18 22 23 27 10
Physical Science 13 24 41 18 4
Biology 21 19 33 9 17
Social Studies
Grade 4 12 19 41 24 5
Grade 6 23 20 21 22 13
American History 12 18 36 17 18
American Government 10 23 49 14 4
As noted previously, the proficiency rates provided to standard-setting panelists were projected from Washington
State performance on the Smarter Balanced assessments. However, Smarter Balanced does not assess student
achievement in grades 9 and 10, and the grade 11 assessment is not a course-based test. Therefore, to estimate
the impacts for the Algebra I/IM I and Geometry/IM II scales, AIR psychometricians applied the vertical linking
constants in the underlying AIRCore scale to the grade 8 ability estimates to project student achievement in grades
9 and 10. Based on the Ohio results, however, it appeared that the vertical linking constants in the underlying scale
overestimated the growth rate between Algebra I and Geometry as observed in Ohio. Therefore, prior to the final
scoring of the spring 2016 tests, modified cut scores for Geometry/IM II were obtained by adjusting the vertical
linking constant to reflect the observed difference between Algebra I and Geometry. While the adjusted impact
rates are still lower than those projected, they do become consistent with rates observed for Algebra I/IM I. Exhibit
9.2.6 shows the percentage of students classified at each performance level in the spring 2016 administration of
the ELA and mathematics assessments based on final panelist-recommended standards.
Exhibit 9.2.6 Percentage of Students Classified in Each OST Performance Level — Spring 2016 ELA and
Mathematics
Test % Limited % Basic % Proficient % Accelerated % Advanced
ELA
Grade 3 14 20 14 21 20
Grade 4 20 18 18 21 18
Grade 5 20 19 19 22 19
Grade 6 21 15 16 23 15
Grade 7 25 21 16 17 21
Grade 8 20 24 12 11 24
ELA I 19 32 14 11 32
ELA II 25 28 15 11 28
Mathematics
Grade 3 22 11 19 20 28
Grade 4 21 10 18 28 23
Grade 5 23 14 27 18 17
Grade 6 26 18 25 14 18
Grade 7 30 15 23 20 12
Grade 8 32 15 34 14 4
Algebra 32 18 25 17 8
Geometry 23 28 27 17 5
Integrated Math I 35 19 22 17 7
Integrated Math II 40 25 19 12 5
Exhibit 9.2.7 shows the percentage of students classified at each performance level in the initial year of the test
administration, based on final panelist-recommended standards for the student population overall across grade
levels and courses for the science and social studies assessments.
Exhibit 9.2.7 Percentage of Students at Each Performance Level based on Final Recommended Performance
Standards — Spring 2016 Science and Social Studies
Test %Limited %Basic %Proficient %Accelerated %Advanced
Science
Grade 5 12 26 24 21 17
Grade 8 18 22 23 27 10
Physical Science 13 24 41 18 4
Biology 21 19 33 9 17
Social Studies
Grade 4 12 19 41 24 5
Grade 6 23 20 21 22 13
American History 12 18 36 17 18
American Government 10 23 49 14 4
9.3 OST TRANSFORMATIONS AND ROUNDING RULES
9.3.1 RULES FOR TRANSFORMING THE WITHIN-GRADE THETA TO THE OST SCALE
There are two milestone performance standards for OST assessments. The proficient performance standard
indicates that students have met expectations for achievement of Ohio’s Learning Standards in the relevant subject
area. The proficient standard is central to Ohio’s new end-of-course based graduation requirements. In the OST
assessment system, the Accelerated performance standard corresponds to the level of achievement indicating
readiness for post-secondary education without remediation. To support effective communication of these two
milestones, OST will be reporting on a scale that fixes both the proficient and accelerated performance standards.
In addition, ODE wishes to ensure that users do not inadvertently seek to compare performance on OST assessments
with performance on the previous OAA and OGT assessments. To distinguish test scores on OST assessments from
previous assessment results, ODE will adopt 700 as the new proficient cut score. OST assessments will therefore be
transformed from the within-grade theta estimate of ability to the reporting scale using the following
transformation:
Slope = (725 − 700) / (θ_Accel − θ_Prof)    (1)

OST Scale Score = (θ − θ_Prof) × Slope + 700    (2)
where 700 is the scaled score representing the proficient level performance standard, and the slope is the spread of
the OST assessments’ scale derived from fixing both the proficient performance standard at 700 and accelerated
performance standard at 725. The 𝜃 represents any level of student ability based on the MLE. The 𝜃𝑃𝑟𝑜𝑓 and 𝜃𝐴𝑐𝑐𝑒𝑙
represent the proficient and accelerated cut scores, respectively, adopted by the State Board.
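To make the transformation concrete, the following is a minimal Python sketch (illustrative only, not AIR's scoring code; the function and variable names are assumptions) of equations (1) and (2). The cut thetas in the example are the grade 3 ELA values reported in Exhibit 9.3.3.1.

```python
def ost_scale_score(theta, theta_prof, theta_accel):
    """Map a within-grade theta (MLE) to the OST reporting scale per equations (1)-(2)."""
    slope = (725 - 700) / (theta_accel - theta_prof)   # equation (1)
    return (theta - theta_prof) * slope + 700          # equation (2)

# Example: grade 3 ELA cut thetas from Exhibit 9.3.3.1 (Proficient -0.09, Accelerated 0.46)
print(round(ost_scale_score(-0.09, -0.09, 0.46)))  # 700, the Proficient cut
print(round(ost_scale_score(0.46, -0.09, 0.46)))   # 725, the Accelerated cut
print(round(ost_scale_score(1.06, -0.09, 0.46)))   # 752, matching the Advanced cut in Exhibit 10.2.1
```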
9.3.2 OST ROUNDING RULES
After transforming theta ability estimates to the OST assessments’ reporting scale, the observable scale scores
nearest each of the performance standard cut scores will be evaluated. If the observable scale score nearest the
performance standard is below the cut score, the scale score will be rounded up to be equal to the cut score. If the
observable scale score nearest the performance standard is above the cut score, no special rounding rules will be
applied. Thus, if the student’s scale score is SS0, and adding one raw score point results in a scale score of SS1, then where SS0 < SScut < SS1, if SScut − SS0 < SS1 − SScut, the student’s scale score is rounded up to the cut score. Note that SS0, SS1, and SScut are rounded to the nearest integer before the comparison.
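As an illustration only, the rounding rule can be expressed as the following Python sketch; the helper name and the example scale scores are hypothetical.

```python
def apply_cut_rounding(ss0, ss1, ss_cut):
    """Round a scale score up to the cut when it is the nearest observable score below the cut.

    ss0 is the (integer-rounded) scale score at the student's raw score, ss1 the scale score
    one raw score point higher, and ss_cut the cut score.
    """
    if ss0 < ss_cut < ss1 and (ss_cut - ss0) < (ss1 - ss_cut):
        return ss_cut   # nearest observable score sits below the cut, so report the cut score
    return ss0          # otherwise the scale score is reported unchanged

# Hypothetical observable scores straddling a 700 cut
print(apply_cut_rounding(698, 704, 700))  # 700: 698 is the observable score nearest the cut
print(apply_cut_rounding(696, 703, 700))  # 696: 703 is nearer to the cut, so no rounding up
```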
9.3.3 RULES FOR OVERALL PERFORMANCE LEVEL CLASSIFICATION
Overall scale scores for OST assessments are mapped into five performance levels. The performance level designations are: Level 1 (Limited), Level 2 (Basic), Level 3 (Proficient), Level 4 (Accelerated), and Level 5 (Advanced).
The within-grade performance standards upon which student achievement is classified are provided in Exhibits 9.3.3.1–9.3.3.4.
Exhibit 9.3.3.1: OST Performance Standards Thetas — ELA
OST ELA Basic Proficient Accelerated Advanced
Grade 3 -0.70 -0.09 0.46 1.06
Grade 4 -0.56 0.06 0.65 1.32
Grade 5 -0.74 0.00 0.59 1.29
Grade 6 -0.83 -0.07 0.52 1.14
Grade 7 -0.80 -0.01 0.65 1.29
Grade 8 -0.43 0.15 0.95 1.55
ELA I -0.71 -0.11 0.79 1.31
ELA II -0.77 -0.08 0.75 1.30
Exhibit 9.3.3.2: OST Performance Standards Thetas — Mathematics
OST Mathematics Basic Proficient Accelerated Advanced
Grade 3 -0.61 -0.08 0.68 1.53
Grade 4 -1.05 -0.61 0.15 1.19
Grade 5 -1.05 -0.54 0.43 1.35
Grade 6 -0.83 -0.12 0.89 1.65
Grade 7 -0.76 -0.19 0.68 1.74
Grade 8 -0.69 -0.18 1.06 2.00
Algebra -1.21 -0.57 0.32 1.37
Geometry -0.98 -0.24 0.61 1.66
Integrated Math I -1.20 -0.57 0.32 1.37
Integrated Math II -0.85 -0.11 0.69 1.75
Exhibit 9.3.3.3: OST Performance Standards Thetas — Science
OST Science Basic Proficient Accelerated Advanced
Grade 5 -0.91997 -0.04328 0.56923 1.24605
Grade 8 -1.13745 -0.50512 0.09217 1.07651
Physical Science -1.56235 -0.94268 0.02261 0.94759
Biology -1.18548 -0.67156 0.17575 0.50740
Exhibit 9.3.3.4: OST Performance Standards Thetas — Social Studies
OST Social Studies Basic Proficient Accelerated Advanced
Grade 4 -0.91623 -0.40271 0.57222 1.57510
Grade 6 -0.21536 0.36261 0.96849 1.70707
American History -0.97617 -0.36759 0.60310 1.12246
American Government -1.10964 -0.41063 0.91557 1.65763
9.3.4 OST SUBSCALE PERFORMANCE CLASSIFICATION
Subscale performance classifications are computed to classify student performance levels for each of the reporting category subscales with respect to the proficient performance standard.
For each subscale, a mid-range band is defined as extending one SEM below and above the proficient level performance standard. Where student subscale scores are more than one SEM below the proficient standard for the subscale, students are classified as scoring below the standard. Conversely, where student subscale scores are more than one SEM above the proficient standard for the subscale, students are classified as scoring above the standard. Students with subscale scores falling within the mid-range band are classified as scoring near the standard. The rules surrounding classification are described below:
• If 𝑆𝑆𝑜𝑏𝑠 < 𝑆𝑆𝑐𝑢𝑡 − 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as Below Proficient
• If 𝑆𝑆𝑜𝑏𝑠 > 𝑆𝑆𝑐𝑢𝑡 + 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as Above Proficient
• If 𝑆𝑆𝑐𝑢𝑡 − 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡 ≤ 𝑆𝑆𝑜𝑏𝑠 ≤ 𝑆𝑆𝑐𝑢𝑡 + 1 ∗ 𝑆𝐸𝑀𝑐𝑢𝑡, then performance is classified as At/Near
Proficient
Where SSobs is the student’s subscale score, SEMcut is the conditional standard error of measurement associated with the proficient standard for the subscale, and SScut is the proficient cut score. Zero and perfect scores on the subscale are always assigned Below Proficient and Above Proficient, respectively. Please note SSobs, SScut, and SEMcut need to be rounded to the nearest integer before the comparison.
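The subscale classification rules can be sketched in Python as follows; this is an illustrative sketch only, and the cut score and SEM values in the example are hypothetical.

```python
def classify_subscale(ss_obs, ss_cut, sem_cut, raw_score=None, max_raw=None):
    """Classify a subscale score relative to the proficient standard (all inputs integer-rounded)."""
    # Zero and perfect raw scores are always Below and Above Proficient, respectively.
    if raw_score is not None and max_raw is not None:
        if raw_score == 0:
            return "Below Proficient"
        if raw_score == max_raw:
            return "Above Proficient"
    if ss_obs < ss_cut - sem_cut:
        return "Below Proficient"
    if ss_obs > ss_cut + sem_cut:
        return "Above Proficient"
    return "At/Near Proficient"

# Hypothetical example: proficient cut of 700 with a conditional SEM of 12 at the cut
print(classify_subscale(685, 700, 12))  # Below Proficient
print(classify_subscale(706, 700, 12))  # At/Near Proficient
```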
10. SCALING AND EQUATING
OST assessments are fixed-form, online assessments, with paper-pencil forms available for schools that are not ready to transition to the online testing environment. For the science and social studies assessments, items in the online form that cannot be rendered for paper-based administration are replaced with items measuring the same standards at similar difficulty. In addition to the common operational base form administered to all students participating in a given grade and subject area assessment, each student is also administered a set of embedded field-test items. In the online environment, the field-test distribution engine randomly selects field-test items for administration. Embedded items include newly developed field-test items that do not contribute toward the student’s overall operational score. The paper-pencil forms also include an embedded field-test block that is used to field-test online items rendered for paper-based administration.
The paper-pencil forms are constructed to be as similar as possible to the online forms, and for ELA and mathematics the same items are administered in both modes. There are, however, some online items on the science and social studies assessments for which paper equivalents cannot be rendered. In these instances, replacement items are identified which allow the paper-pencil form to also meet the blueprint.
10.1 ITEM RESPONSE THEORY PROCEDURES
OST assessments in science and social studies were administered for the first time in spring 2015. Following test administration, item response theory (IRT) procedures were used to calibrate item parameter estimates and create the new OST scales for scoring and reporting.48 OST end-of-course assessments in ELA and mathematics, as well as grade 3 ELA, were administered for the first time in December 2015, followed by a comprehensive administration of all ELA and mathematics assessments in spring 2016. This section describes the procedures for calibration of operational item parameters. All calibration procedures are independently applied by AIR.
Within each test, students are able to skip items in both the online and paper-based test platforms. While omitted items are scored as incorrect for purposes of ability estimation, all omitted responses are treated as not administered for purposes of IRT analysis. All students who respond to at least five items or achieve five scores are considered to have attempted a test. All attempted records are included in the IRT analysis, with the exception of student records invalidated by test administrators (TAs).
10.1.1 CALIBRATION OF OST ITEM BANKS
WINSTEPS was used to estimate Rasch and Masters’ partial credit model item parameters for OST. WINSTEPS is publicly available and thoroughly documented software from Mesa Press. WINSTEPS employs joint maximum likelihood estimation (JMLE), which jointly estimates the person and item parameters. The Rasch model is fit to student responses to dichotomous (0/1 point) items. Masters’ (1982) partial credit model, an extension of the one-parameter Rasch model, allows for graded responses and is fit to estimate parameters for polytomous items.
In the base year of OST assessments in science and social studies, operational items for each test were freely calibrated, centering on the mean item difficulty of each operational test form, to establish the new OST reference scales for those assessments. Following the approval of final item parameter estimates for operational items,
48 Standard 4.10 – When a test developer evaluates the psychometric properties of items, the model used for that purpose (e.g., classical test theory, item response theory, or another model) should be documented. The sample used for estimating item properties should be described and should be of adequate size and diversity for the procedure. The process by which items are screened and the data used for screening, such as item difficulty, item discrimination, or differential item functioning (DIF) for major examinee groups, should also be documented. When model-based methods (e.g., IRT) are used to estimate item parameters in test development, the item response model, estimation procedures, and evidence of model fit should be documented.
parameter estimates for the operational items were anchored to their new OST bank values and parameter estimates for field-test and linking items were estimated under that constraint. This placed parameter estimates for all field-test and external linking items on the same OST scale defined by the operational item parameters.
Beginning with the fall 2015 administrations of OST EOC assessments, pre-equated item parameters were used to score student test records in science and social studies.
The first operational forms of OST assessments in ELA and mathematics were constructed using items in the AIRCore item bank. These items were developed to be aligned to the CCSS and had all been previously administered as part of statewide assessments in Arizona, Florida, Utah, and/or Oregon. Following administration in one or more of the statewide assessment systems and completion of the item review process, AIRCore items were calibrated using Rasch and Masters’ Partial Credit, and linked to a common scale. In December 2015, a standard setting workshop was conducted to recommend to the Ohio State Board of Education a set of performance standards on the AIRCore scale for reporting student achievement of Ohio’s Learning Standards in ELA and mathematics. Because the sample of students administered the fall EOC tests is small and unrepresentative of the state population, and because the grade 3 ELA assessment was administered to grade 3 students early in the school year and before they could be expected to achieve grade 3 learning standards, the fall 2015 ELA and mathematics tests were scored using the AIRCore bank item parameter estimates.
The first operational administration of the full system of OST assessments in ELA and mathematics took place in spring 2016. Item parameters for all the ELA and mathematics assessments were freely calibrated following the spring administration. The OST assessments’ scale for each of the ELA and mathematics tests was established by centering the operational test form item difficulties to zero. The mean-mean equating procedure was used to link the spring 2016 OST item parameters to the AIRCore scale on which performance standards were recommended, allowing those performance standards to be placed onto the new OST scale.
Because the high school end-of-course tests in mathematics include Integrated Mathematics I and Integrated Mathematics II assessments that are constructed from items in both the Algebra I and Geometry item banks, it is not possible to maintain separate banks for each of the EOC mathematics assessments. Following the spring 2016 administration, the decision was made to adopt the standard-setting scale, which was common for all high school mathematics items, as the reference scale for the mathematics EOC assessments. Thus, the linking constants identified following the spring 2016 administration were applied to the spring 2016 item parameter estimates to place them back to the standard-setting scale.
10.1.2 ESTIMATING STUDENT ABILITY USING MAXIMUM LIKELIHOOD ESTIMATION
OST is scored using maximum likelihood estimation.49 As described previously, parameter estimates are calibrated using the Rasch model for dichotomously scored items and Masters’ partial credit model for polytomous items.
LIKELIHOOD FUNCTION
The likelihood function for generating the MLEs is based on a mixture of item types and can therefore be expressed as:
L(\theta) = L(\theta)_{MC} \cdot L(\theta)_{CR}

49 Standard 5.0 – Test scores should be derived in a way that supports the interpretations of test scores for the proposed uses of tests. Test developers and users should document evidence of fairness, reliability, and validity of test scores for their proposed use. Standard 5.2 – The procedures for constructing scales used for reporting scores and the rationale for these procedures should be described clearly.

where:

L(\theta)_{MC} = \prod_{i=1}^{N} \left[ \frac{1}{1 + \exp[-D(\theta - b_i)]} \right]^{x_i} \left[ 1 - \frac{1}{1 + \exp[-D(\theta - b_i)]} \right]^{1 - x_i}

L(\theta)_{CR} = \prod_{i=1}^{N} \frac{\exp \sum_{k=1}^{x_i} D(\theta - \delta_{ki})}{1 + \sum_{j=1}^{m_i} \exp \sum_{k=1}^{j} D(\theta - \delta_{ki})}

and where b_i is the location parameter, x_i is the observed response to the item, i indexes items, and \delta_{ki} is the kth step for item i with m_i total categories.

We subsequently find \arg\max_{\theta} L(\theta) as the student’s theta (i.e., MLE) given the set of items administered to the student.

DERIVATIVES

Finding the maximum of the likelihood requires an iterative method, such as Newton-Raphson iterations. Since the log-likelihood is a monotonic function of the likelihood, the following derivatives based on the log-likelihood function (with Rasch constraints) are used:

\frac{\partial \ln L(\theta)_{MC}}{\partial \theta} = \sum_{i=1}^{N} \left( x_i - \frac{1}{1 + \exp[-(\theta - b_i)]} \right)

\frac{\partial \ln L(\theta)_{CR}}{\partial \theta} = \sum_{i=1}^{N} \left( x_i - \frac{\sum_{j=1}^{m_i} j \exp \sum_{k=1}^{j} (\theta - \delta_{ki})}{1 + \sum_{j=1}^{m_i} \exp \sum_{k=1}^{j} (\theta - \delta_{ki})} \right)

\frac{\partial^2 \ln L(\theta)_{MC}}{\partial \theta^2} = - \sum_{i=1}^{N} \left( 1 - \frac{1}{1 + \exp[-(\theta - b_i)]} \right) \left( \frac{1}{1 + \exp[-(\theta - b_i)]} \right)

\frac{\partial^2 \ln L(\theta)_{CR}}{\partial \theta^2} = \sum_{i=1}^{N} \left\{ \left[ \frac{\sum_{j=1}^{m_i} j \exp \sum_{k=1}^{j} (\theta - \delta_{ki})}{1 + \sum_{j=1}^{m_i} \exp \sum_{k=1}^{j} (\theta - \delta_{ki})} \right]^2 - \frac{\sum_{j=1}^{m_i} j^2 \exp \sum_{k=1}^{j} (\theta - \delta_{ki})}{1 + \sum_{j=1}^{m_i} \exp \sum_{k=1}^{j} (\theta - \delta_{ki})} \right\}

Hence, the estimated MLE is found via the following maximization routine:

\theta_{t+1} = \theta_t - \frac{\partial \ln L(\theta_t) / \partial \theta}{\partial^2 \ln L(\theta_t) / \partial \theta^2}

where

\frac{\partial \ln L(\theta)}{\partial \theta} = \frac{\partial \ln L(\theta)_{MC}}{\partial \theta} + \frac{\partial \ln L(\theta)_{CR}}{\partial \theta}

\frac{\partial^2 \ln L(\theta)}{\partial \theta^2} = \frac{\partial^2 \ln L(\theta)_{MC}}{\partial \theta^2} + \frac{\partial^2 \ln L(\theta)_{CR}}{\partial \theta^2}
and where θt denotes the estimated θ at iteration t.
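As an illustration of this routine, the following Python sketch (not AIR's production scoring code) evaluates the first and second derivatives for a mix of Rasch and partial credit items with D = 1 and iterates the Newton-Raphson update until convergence; the item parameters and responses in the example are hypothetical.

```python
import math

def derivatives(theta, mc_items, pc_items):
    """First and second derivatives of the log-likelihood at theta.

    mc_items: list of (b_i, x_i) for dichotomous Rasch items.
    pc_items: list of ([delta_1, ..., delta_m], x_i) for partial credit items.
    """
    d1 = d2 = 0.0
    for b, x in mc_items:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        d1 += x - p
        d2 -= p * (1.0 - p)
    for deltas, x in pc_items:
        terms, s = [], 0.0
        for d in deltas:                              # exp of cumulative sums over steps
            s += theta - d
            terms.append(math.exp(s))
        denom = 1.0 + sum(terms)
        e1 = sum((j + 1) * t for j, t in enumerate(terms)) / denom       # E[x]
        e2 = sum((j + 1) ** 2 * t for j, t in enumerate(terms)) / denom  # E[x^2]
        d1 += x - e1
        d2 += e1 ** 2 - e2
    return d1, d2

def mle_theta(mc_items, pc_items, theta=0.0, tol=1e-6, max_iter=50):
    """Newton-Raphson MLE of theta for a non-extreme response pattern."""
    for _ in range(max_iter):
        d1, d2 = derivatives(theta, mc_items, pc_items)
        step = d1 / d2
        theta -= step
        if abs(step) < tol:
            break
    return theta

# Hypothetical item parameters and responses
print(round(mle_theta([(-0.5, 1), (0.2, 1), (0.8, 0)], [([-0.3, 0.6], 1)]), 3))
```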
ESTIMATING ZERO AND PERFECT SCORES
In the event of zero or perfect scores, a procedure recommended by Berkson (as cited in Linacre, 2004) is implemented to add (or subtract) 0.5 to (from) the zero (perfect) score prior to estimating student ability. Thus, students responding incorrectly to all items in a scale or subscale are assigned a test raw score of 0.5. Conversely, for students responding correctly to all items in a scale or subscale, 0.5 is subtracted from the test raw score.
10.2 OST REPORTING SCALE (SCALE SCORES)
There are two milestone performance standards for OST assessments: proficient and accelerated. The proficient
performance standard indicates that students have met expectations for achievement of Ohio’s Learning Standards
in the relevant subject area. The proficient standard is central to Ohio’s new EOC-based graduation requirements
and accountability practices. In the OST assessment system, the accelerated performance standard corresponds to
the level of achievement indicating readiness for post-secondary education without remediation. To support
effective communication of these two milestones, OST assessments are reported on a scale that fixes both the
proficient and accelerated performance standards. In addition, ODE wishes to ensure that users do not inadvertently
seek to compare performance on OST assessments with performance on the previous OAA and OGT assessments.
To distinguish test scores on OST assessments from previous assessment results, ODE adopted 700 as the new
proficient cut score. OST assessments have therefore been transformed from the within-grade theta estimate of
ability to the reporting scale using the following transformation:50
Slope = (725 − 700) / (θ_Accel − θ_Prof)    (1)

OST Scale Score = (θ − θ_Prof) × Slope + 700    (2)
where 700 is the scaled score representing the proficient level performance standard, and the slope is the spread of
the OST assessments’ scale derived from fixing both the proficient and accelerated performance standards. The 𝜃
represents any level of student ability based on the MLE. The 𝜃𝑃𝑟𝑜𝑓 and 𝜃𝐴𝑐𝑐𝑒𝑙 represent the proficient and
accelerated cut scores, respectively, adopted by the State Board.
Overall scale scores for OST have been mapped into five performance levels per grade/course. The performance
level designations are: Limited, Basic, Proficient, Accelerated, and Advanced. The performance level is evaluated
using the rounded scale score. Exhibit 10.2.1 shows the scale score ranges for the performance levels for each of the
assessments.
50 Standard 5.2 – The procedures for constructing scales used for reporting scores and the rationale for these procedures should be described clearly.
Exhibit 10.2.1: Scale Score Ranges for Performance Levels
Grade / Course Limited Basic Proficient Accelerated Advanced
ELA
Grade 3 545–671 672–699 700–724 725–751 752–863
Grade 4 549–673 674–699 700–724 725–752 753–846
Grade 5 552–668 669–699 700–724 725–754 755–848
Grade 6 555–667 668–699 700–724 725–750 751–851
Grade 7 568–669 670–699 700–724 725–748 749–833
Grade 8 586–681 682–699 700–724 725–743 744–805
ELA I 606–682 683–699 700–724 725–738 739–800
ELA II 597–678 679–699 700–724 725–741 742–808
Mathematics
Grade 3 587–681 682–699 700–724 725–752 753–818
Grade 4 605–684 685–699 700–724 725–758 759–835
Grade 5 624–686 687–699 700–724 725–747 748–804
Grade 6 616–681 682–699 700–724 725–743 744–790
Grade 7 605–683 684–699 700–724 725–754 755–806
Grade 8 633–689 690–699 700–724 725–743 744–774
Algebra 618–681 682–699 700–724 725–753 754–814
Geometry 604–677 678–699 700–724 725–755 756–810
Integrated Math I 618–681 682–699 700–724 725–753 754–814
Integrated Math II 594–676 677–699 700–724 725–757 758–813
Science
Grade 5 559–663 664–699 700–724 725–752 753–845
Grade 8 575–673 674–699 700–724 725–765 766–868
Biology 617–684 685–699 700–724 725–734 735–823
Physical Science 634–683 684–699 700–724 725–748 749–815
Social Studies
Grade 4 621–686 687–699 700–724 725–750 751–800
Grade 6 541–675 676–699 700–724 725–754 755–829
American History 619–683 684–699 700–724 725–737 738–800
American Government 642–686 687–699 700–724 725–738 739–774
10.3 EQUATING PAPER-PENCIL AND ONLINE TEST SCORES
Prior to reporting test scores for OST assessments, a mode comparability study was performed to evaluate differences in test performance attributable to the mode of test administration, and to identify the linking constants necessary to place item parameter estimates across modes on a common scale for test scoring and reporting.51
A matched samples design (Way, Davis, and Fitzpatrick, 2006) was used to investigate mode comparability. A covariate regression approach was implemented to construct equivalent groups of students taking OST assessments in both modes of test administration. The regression analysis produced, for each student, a predicted score on the paper-based OST assessment from prior-year achievement, with demographic covariates including gender, ethnicity, Limited English Proficiency (LEP) status, and Individualized Education Program (IEP) status entering the prediction equation. A nearest neighbor search procedure was then applied to the predicted OST scores to select equivalent groups of students. This procedure resulted in two matched samples for each assessment for the mode comparability study.
Independent calibration of common items between the matched samples indicated that, while mean differences in item difficulty between the two modes were generally small, some items performed quite differently across modes: some items were much easier when administered online, while others appeared more difficult for online students.
Equating constants were computed to place the matched sample paper-based item parameters on the online scale. Because ODE does not intend to maintain separate item banks for the online and paper-based assessments, we compared the performance of the matched online and paper-pencil samples scoring the paper-pencil tests using both the online item parameters as well as adjusted online item parameters, which applied the common item equating constant to the online item parameters. Application of the equating constant to produce adjusted online item parameters generally brought the ability estimates of the matched samples more in line with the expectation of equivalent achievement between the two samples. The mean of item difficulty parameters for online and paper-pencil tests and the mode linking constants between two modes using common items are presented in Exhibit 10.3.1.
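For illustration, the sketch below shows one common way a mean-mean linking constant from common items, and its scale-score equivalent, could be computed; this is not AIR's implementation, and the item difficulties are hypothetical values chosen so that their means reproduce the grade 3 ELA row of Exhibit 10.3.1 (the cut thetas are from Exhibit 9.3.3.1).

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical common-item difficulties whose means match the grade 3 ELA row of
# Exhibit 10.3.1 (online mean -0.03, paper-pencil mean 0.10)
b_online = [-0.53, 0.00, 0.44]
b_paper = [-0.40, 0.13, 0.57]

# Mean-mean linking constant placing paper-pencil difficulties on the online scale
theta_shift = mean(b_online) - mean(b_paper)           # -0.13

# Express the shift in OST scale-score units using the grade 3 ELA slope
slope = (725 - 700) / (0.46 - (-0.09))                  # theta_Prof = -0.09, theta_Accel = 0.46
print(round(theta_shift, 2), round(theta_shift * slope, 2))   # -0.13 -5.91
```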
Because the equating constants were based only on the common items between the online and paper-pencil assessments, we also evaluated the results of the common item equating with an equipercentile equating approach. Comparison of the two approaches indicated that linked ability estimates were generally consistent between the methods. Although there was some slight divergence between methods for some assessments among estimates for low-ability students, the convergence between the two methods supports use of the common item approach for identifying a linking constant to adjust for any mode differences.
Following presentation of mode comparability results, ODE’s technical advisory committee recommended that rather than apply mode correction constants following each test administration, Ohio should focus on moving toward a fully online assessment system as quickly as possible.52
51 Standard 5.13 – When claims of form-to-form score equivalence are based on equating procedures, detailed technical information should be provided on the method by which equating functions were established and on the accuracy of the equating functions. 52 Standard 5.23
Exhibit 10.3.1: Mode Linking Constants
Test | Mean Item Difficulty (Online) | Mean Item Difficulty (Paper-Pencil) | Mode Linking Constant (Theta Score) | Mode Linking Constant (Scale Score) | % Taking Paper
ELA
Grade 3 -0.03 0.10 -0.13 -5.91 33%
Grade 4 -0.33 -0.44 0.11 4.66 24%
Grade 5 -0.34 -0.39 0.04 1.69 21%
Grade 6 -0.18 -0.26 0.07 2.97 19%
Grade 7 -0.21 -0.18 -0.03 -1.19 17%
Grade 8 -0.11 -0.19 0.08 2.47 18%
ELA I -0.15 -0.24 0.09 2.57 18%
ELA II -0.05 -0.15 0.10 3.01 18%
Mathematics
Grade 3 -0.63 -0.51 -0.13 -4.28 27%
Grade 4 -0.22 -0.14 -0.08 -2.63 24%
Grade 5 -0.01 0.15 -0.16 -4.12 20%
Grade 6 -0.27 -0.22 -0.06 -1.49 19%
Grade 7 -0.08 -0.09 0.00 0.00 17%
Grade 8 0.08 0.03 0.05 1.01 18%
Algebra 0.76 0.74 0.02 0.56 17%
Geometry 1.16 1.08 0.07 2.06 17%
11. CONSTRUCTED-RESPONSE SCORING
The OST assessments utilize a variety of item types to assess students’ mastery of Ohio’s Learning Standards. AIR uses item scoring technology to machine-score student responses to most items, including traditional selected-response item types (such as multiple-choice items) and machine-scored constructed-response (MSCR) item types. These item types are designed to capture and score a variety of response types, such as graphing, drawing or arranging graphic regions, selecting or rearranging sentences or phrases within passages, or entering equations or words, allowing OST items to assess a wide range of student knowledge and skills. In most cases, machine-scored constructed-response items that are developed for online administration are adapted for paper-based testing and responses are captured in a format that allows machine-scoring.
In addition, human raters score some constructed-response items. AIR subcontracts with Data Recognition Corp. (DRC) to fulfill all OST handscoring requirements. This section describes the process for configuring and validating machine rubrics and the process for handscoring, including rules, descriptions of scorer training and systems used, and mechanisms for ensuring reliability and validity of item scores.
11.1 MACHINE-SCORING
11.1.1 EXPLICIT RUBRICS
As part of the item development process for machine-scored item types other than multiple-choice, a rubric validation process is enacted to verify that rubrics are implemented as intended, and responses are scored correctly. This procedure is conducted following the initial administration of items, usually when the item is field tested, and allows test developers to review the intended performance of the rubric versus the rubric’s actual behavior. Students’ responses are reviewed by test development experts, along with resulting item scores, to ensure that the rubric is functioning as intended and awarding credit appropriately. If necessary, test developers can modify machine rubrics to address insufficiencies, and automatically rescore student responses for the item, repeating the process as necessary to finalize and approve the machine-scored rubrics. Test developers review a strategic sample of responses, including responses where high-achieving students scored poorly on the item, lower-achieving students scored well on the item, and randomly selected responses from the population.
11.1.2 ESSAY AUTOSCORING
As part of each OST ELA test administration, students were administered a writing task. In fall 2016, writing responses produced from online test administrations were machine-scored, and all writing tasks administered on paper-pencil tests were handscored following the procedures described in section 11.2. In spring 2017, each writing task was paired with a reading passage and randomly administered to students as an operational field-test item. All of the writing tasks were handscored by DRC.
For AIRCore writing tasks that had previously been administered online in Florida field tests (grades 8–10) or Utah SAGE summative assessments (grade 3), ODE adopted the scoring models generated from student responses in those test administrations. Because the scoring models are based on semantic and syntactic features of the text that discriminate high versus low scoring essays as determined by human raters, the models are highly generalizable.
To develop the scoring models, AIR drew a random sample of 2,000 responses to each of the writing tasks for use in building the statistical scoring models. Those responses were double scored by human readers, and any discrepancies were routed for resolution scoring. The resolution of all discrepancies is essential to ensure that the
human-assigned scores used to develop the statistical scoring model are highly refined and thus limit to the extent possible human error in the assignment of dimension scores that would be captured in the scoring models.53
The random sample of 2,000 responses was divided into a model-building sample of 1,500 responses and a cross-validation sample of 500 responses. Model performance was evaluated on the cross-validation sample to ensure that model fit indices were not based on the model building sample, which may inflate fit indicators.
The statistical rubrics used to develop the scoring models measure a broad set of features, some of which may be item specific and “learned” from a training set. During training, these features are related to handscores through a statistical model. The resulting estimates complete a prediction equation that predicts how a human would score a response with the measured features. Statistical rubrics are, effectively, proxy measures. Although they can directly measure some aspects of writing conventions (e.g., use of passive voice, misspellings, run-on sentences), they do not make direct measures of argument structure or content relevance. Hence, although statistical rubrics often prove useful for scoring essays and even for providing some diagnostic feedback in writing, they do not develop a sufficiently specific model of the correct semantic structure to score many propositional items. Further, they cannot provide the explanatory or diagnostic information available from an explicit rubric. For example, the frequency of incorrect spellings may predict whether a response to a factual item is correct—higher-performing students may also have better spelling skills. Spelling may prove useful in predicting the handscore, but it is not the “reason” that the handscorer deducts points. The statistical rubrics therefore are not about explanation or reason but rather about a prediction of how a human would score the response.
As noted, the engine employs a “training set,” a set of essay responses scored with maximally valid dimension scores, which we obtain by having all responses double-scored by expert scorers and a thorough adjudication process for any discrepant scores. The quality of the human-assigned scores is critical to the identification of a valid model and final performance of the scoring engine. Approximately 1500 essay responses were selected at random from the set of scored essay responses to serve as the training set.
For each dimension in the rubric, the system estimates an appropriate statistical model relating the measures to the score assigned by humans. This model, along with its final parameter estimates, is used to generate a predicted or “proxy” score.
In addition to the training set, an independent random sample of responses is drawn for cross-validation of the identified scoring rubric. As with the training set, student responses in the cross-validation study are handscored, and agreement between human- and machine-assigned scores is examined. The cross-validation process ensures that the rubric generalizes across all responses and that the statistical model identified during training does not capitalize on peculiarities in the training set.
For each of the responses of the validation set, whether or not the predicted score matches the score of record is coded with a new binary variable (match = 1, no match = 0). That variable is predicted with a probit model that has three predictors: word count, the probability of the assigned score under the regression model for predicting the scores, and the Mahalanobis distance between the response and the average of the training set on all the features used in the regression model as predictors (document qualities and LSA dimensions). The predicted probability of a match is used as a confidence measure for the validation data (and also used during operational scoring).
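As a rough illustration of this confidence model (not AIR's implementation; simulated data stand in for the actual response features), a probit regression of the match indicator on the three predictors could be fit as follows.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500                                    # hypothetical cross-validation responses

# Simulated stand-ins for the three predictors described above
word_count = rng.integers(50, 600, n)
score_prob = rng.uniform(0.2, 0.95, n)     # probability of the assigned score under the regression model
mahal_dist = rng.chisquare(10, n)          # Mahalanobis distance from the training-set centroid
match = (rng.uniform(size=n) < score_prob).astype(int)   # 1 = engine score matches the score of record

X = sm.add_constant(np.column_stack([word_count, score_prob, mahal_dist]))
probit = sm.Probit(match, X).fit(disp=0)

confidence = probit.predict(X)             # predicted probability of a match, used as a confidence index
print(confidence[:5].round(3))
```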
Table 1 in Appendix L presents agreement indicators between the two initial human raters, and between the validated final human score and the statistical rubric score.54 Indicators include Pearson’s correlation, percent exact agreement, a
53 Standard 4.19 – When automated algorithms are to be used to score complex examinee responses, characteristics of responses at each score level should be documented along with the theoretical and empirical bases for the use of the algorithms. 54 Standard 6.8 – Those responsible for test scoring should establish scoring protocols. Test scoring that involves human judgment should include rubrics, procedures, and criteria for scoring. When scoring of complex responses is done by computer, the accuracy of the algorithm and processes should be documented.
quadratic weighted kappa statistic, and the standardized mean difference (SMD) between the scores being compared. Although absolute values for evaluating these statistics have been advanced (Condon, 2013; Higgins, 2013), the focus of these comparisons is the degradation of agreement when moving from human-human agreement to machine-human agreement. Agreement between human raters is an indicator of how reliably the responses can be scored by human raters. Since the statistical rubrics attempt to reproduce human-assigned scores, machine-human agreement is evaluated with respect to observed human-human agreement. Neither human-assigned nor machine-assigned scores will be reliable when human-human agreement is poor. As seen in Table 1 in Appendix L, agreement rates between the machine-assigned scores and the validated final handscores are as high as or higher than agreement rates observed between independent human readers. Table 2 in Appendix L presents the intercorrelations among dimension scores.
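For reference, the agreement statistics named above can be computed as in the following Python sketch; the scores shown are hypothetical and are not the values reported in Appendix L.

```python
import numpy as np

def quadratic_weighted_kappa(a, b, min_score, max_score):
    """Quadratic weighted kappa between two sets of integer scores on the same responses."""
    a, b = np.asarray(a), np.asarray(b)
    k = max_score - min_score + 1
    observed = np.zeros((k, k))
    for x, y in zip(a, b):
        observed[x - min_score, y - min_score] += 1
    observed /= observed.sum()
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    weights = np.array([[(i - j) ** 2 for j in range(k)] for i in range(k)]) / (k - 1) ** 2
    return 1 - (weights * observed).sum() / (weights * expected).sum()

def standardized_mean_difference(a, b):
    """SMD of a relative to b, using the pooled standard deviation."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

human = [0, 1, 2, 2, 3, 4, 1, 2, 3, 0]     # hypothetical human scores of record
engine = [0, 1, 2, 3, 3, 4, 1, 2, 2, 1]    # hypothetical machine-assigned scores
print(np.mean(np.array(human) == np.array(engine)))             # exact agreement rate
print(round(quadratic_weighted_kappa(human, engine, 0, 4), 3))  # quadratic weighted kappa
print(round(standardized_mean_difference(engine, human), 3))    # standardized mean difference
```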
To improve overall scoring, in spring 2018, responses in the lowest 25% of the confidence index were
routed for human verification. In addition to the low confidence responses, all responses assigned the prompt
copy condition code were also sent for human scoring. Responses with low confidence scores were sent for an
independent human read; the reader was not aware of the model-based score. For responses assigned the prompt
copy condition code, the reader was informed that the response was flagged for copying of the prompt text, and
asked to either confirm that there is insufficient independent student work to support a writing score, or
determine that there is sufficient independent student writing to support a score and to assign the appropriate
score.
Exhibit 11.1.2.1 shows that human raters returned a Prompt Copy match on about 75% of verification cases or, on a
limited number of occasions, assigned zero scores. In the cases where the human rater identified sufficient
independent student writing to earn a score, the scores were low; for example, 20% of responses received a score of 1 in
Organization and in Elaboration.
Exhibit 11.1.2.1: Human Rater Judgments of Responses Assigned the Prompt Copy Condition Code
Distribution of Human-Assigned Scores
Dimension | Prompt Copy | 0 | 1 | 2 | 3 | 4
Convention 75% 1% 11% 13% NA NA
Elaboration 75% 1% 20% 4% 0 0
Organization 76% 1% 20% 3% 0 0
11.2 HANDSCORING
AIR subcontracted with DRC to fulfill all handscoring needs for Ohio’s State Tests. For items that were scored by human raters, each student response was scored by at least one reader (Reader 1). Ten percent of all paper-pencil form responses received a second reading (Reader 2) for the purpose of monitoring and maintaining sufficient inter-rater reliability. The Reader 1 score was the score of record.
11.2.1 RANGEFINDING
For embedded field-test items, DRC’s Content Specialists and Scoring Directors prepare for range finding meetings by using DRC’s Image Handscoring System to access student field-test responses. They select a representative sample spanning the full range of student performance. These responses are assembled into range finding sets for each item, and the range finding sets are duplicated for all range finding participants.
Range finding for each item begins with a discussion about the rubric with AIR, ODE, and the committee members. Once an understanding of the rubric has been established, participants score and discuss each response until a consensus is reached. Our facilitators move through each of the responses in the range finding set for that item until there are a sufficient number of responses to construct anchor and training sets. Only responses with a high level of agreement are used to train our scorers. DRC staff makes careful notes of scoring decisions for use in training the scorers.55
11.2.2 DEVELOPING TRAINING MATERIALS AFTER RANGEFINDING
Once range finding is complete, DRC uses the range finding responses to develop training materials for scoring field-test responses. DRC’s Content Specialists and Scoring Directors select anchor and training responses from the sets of range finding responses. Scoring notes generated during the range finding process remain with each response selected, either in the annotation (for anchor papers) or in the Scoring Director’s notes (for training papers). If requested, DRC submits copies of training materials to the state assessment staff for approval prior to their use. Any training material created by DRC can also be provided to ODE in PDF format for archival purposes.
11.2.3 SCORING GUIDES WITH ANCHOR RESPONSES
Each constructed-response item requires item-specific training materials, including a rubric comprising the item-specific scoring guidelines and 2 to 4 annotated anchor responses to illustrate and exemplify each score point. Anchor papers are selected to illustrate particular scoring concepts. These responses help ensure that scorers are able to make accurate and consistent scoring decisions. All anchor papers are annotated to explain how they exemplify each score point. The anchor sets serve as the scorers’ constant reference.
11.2.4 TRAINING SETS
For each field-test constructed-response item, DRC develops one training set of ten student responses for two-point items and two training sets for four-point items. These training papers hone each scorer’s ability to discern the different score-point levels in an accurate and consistent manner. When reviewing training responses from the front of the scoring room, the Scoring Director uses the notes generated during range finding to ensure that scorers reach scoring decisions in a manner consistent with the way the rubrics were applied during range finding.
11.2.5 OPERATIONAL TRAINING AND QUALIFYING MATERIALS
Prior to scoring operational items, DRC provides the field-test training materials (anchor and training sets) for each item selected for operational administration. DRC supplements these field-test training materials with one to two additional training sets of 10 student responses and two to three qualifying sets of 10 student responses. These supplemental responses are drawn from exemplar responses generated during field-test scoring. The Scoring Director reviews all exemplar responses, and the sets are sent to ODE for approval. The supplemental training and qualifying materials solidify scorers’ understanding of how the range finding and field-test responses were scored in order to ensure accurate and consistent scoring.56
55 Standard 4.20 – The process for selecting, training, qualifying, and monitoring scorers should be specified by the test developer. The training materials, such as the scoring rubrics and examples of students’ responses that illustrate the levels on the rubric score scale, and the procedures for training scorers should result in a degree of accuracy and agreement among scorers that allows the scores to be interpreted as originally intended by the test developer. Specifications should also describe processes for assessing scorer consistency and potential drift over time in raters’ scoring.
Further quality control measures, including validity and recalibration sets, were implemented for operational items. Cycling responses with known scores into the scoring queue allows DRC to continuously monitor the performance of scorers and intervene when necessary. Validity sets were sent to ODE for approval. Recalibration sets use live student responses and are archived with training materials.
11.2.6 HANDSCORING PROCEDURES
Pairs of scorers were seated in ergonomically adjustable chairs at long, rectangular tables. There were two imaging stations at each table. Each workstation included a large flat-screen monitor for clear image reproduction and easy viewing. Each scorer was assigned a unique ID number and password.
Team Leaders assisted the Scoring Directors with scorer training and monitoring. Teams consisted of approximately ten scorers.
The Scoring Director explained in detail the directions for use of the computerized handscoring system. All scorers followed along using the Imaging Handbook, created specifically for DRC scorers.
For each item, scorer training began with a room-wide presentation and discussion of the scoring guide (rubric and anchors) by the Scoring Director. Next, the scorers practiced by scoring the responses in the training set(s). Afterwards, the Scoring Director and/or Team Leaders led a thorough discussion of each set.
The student responses were routed to scorers by grade/course and item. Images of responses were sent to designated groups of scorers qualified to score that item. Only qualified scorers have access to student response images. The scorers read each response and entered the correct scores. After the scores were entered, a new response image appeared. Scorers could not tell if they were reading a response for the first or second time: all readings were, in effect, “blind.”
Ongoing quality control checks and procedures, as described later in this document, were employed to monitor and maintain the quality of the scoring sessions. If any unusual data were observed, DRC would investigate and resolve any issues.
Routing and scoring of images continued until all student responses received the prescribed number of readings (listed below). If the first two scores were equal or adjacent, the score from the first read (R1) was the score of record. If the first two scores (R1 and R2) were non-adjacent (e.g., a 0 and a 2), a resolution reading was performed by a Team Leader or Scoring Director. The third read (R3) was also independent and became the score of record.
• All operational item responses were handscored, with 10% receiving independent second reads.
• For previously field-tested items being recalibrated, DRC scored a random sample of 1,500 responses per item, with 10% second reads.
• For new field-test items, DRC scored a random sample of 1,500 responses per item, with 100% independent second reads.
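For illustration only, the following Python sketch expresses the score-of-record resolution rule described above; the function name, data values, and error handling are hypothetical and not part of DRC’s scoring system.

def score_of_record(r1, r2=None, r3=None):
    """Return the score of record under the resolution rule sketched above."""
    if r2 is None:
        return r1                      # responses with a single read keep the first read
    if abs(r1 - r2) <= 1:
        return r1                      # equal or adjacent reads: the first read stands
    if r3 is None:
        raise ValueError("non-adjacent reads require a resolution read")
    return r3                          # the resolution read becomes the score of record

# Example: a 0 and a 2 are non-adjacent, so the resolution read decides.
assert score_of_record(0, 2, 1) == 1
assert score_of_record(3, 4) == 3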
DRC’s Image Handscoring System allowed for on-demand retrieval of specified images (e.g., specific batch files, specific grades, specific students) should the need have arisen during or subsequent to the handscoring process.
56 Standard 6.8 – Those responsible for test scoring should establish scoring protocols. Test scoring that involves human judgment should include rubrics, procedures, and criteria for scoring. When scoring of complex responses is done by computer, the accuracy of the algorithm and processes should be documented.
Ohio Department of Education 147 American Institutes for Research
11.2.7 TRAINING OF SCORERS
DRC provides team leaders who assist the Scoring Directors with scorer training and monitoring. Teams consist of approximately ten scorers.
Scorer training begins with a room-wide presentation and discussion of the scoring guide (rubric and anchor responses) by the Scoring Director. Next, the scorers practice by scoring the responses in the training set(s). Afterwards, the Scoring Director and/or Team Leaders lead a thorough discussion of each set.
Once the scorers have become familiar with the rubric and the anchor set and received feedback from the training set(s), they begin scoring.
11.2.8 MONITORING AND MAINTAINING QUALITY CONTROL
Each room of scorers is divided into teams. Each team has approximately ten scorers and is assigned a Team Leader. The Team Leaders conduct routine read-behinds for all scorers. During read-behinds, the Team Leaders review responses and check the scores given by their team members. If a Team Leader disagrees with a reader’s score, the Team Leader will correct the score.
The Team Leaders use these read-behind responses to provide scorers with ongoing feedback and training. DRC’s imaging system allows a Team Leader to determine read-behind rates (frequency of monitoring) for each scorer. If the scorer needs clarification of the scoring guidelines, or is scoring tentatively, DRC typically monitors one out of five readings. Scorers requiring less feedback receive less frequent read-behinds. DRC’s imaging system randomly selects which images the Team Leader will read behind.
A number of handscoring quality control reports are run on a daily basis (or more often as needed). Throughout the handscoring process, the Scoring Directors meet with their Team Leaders each morning to review the reports generated from the previous day’s work. If problematic scoring patterns are apparent, the Team Leaders address any issues with scorers on an individual basis.
One key handscoring quality control report is DRC’s Scoring Summary Report, which includes inter-rater reliability and score point distributions by individual and item, both on a daily and a cumulative basis. To monitor scorer reliability and maintain an acceptable level of scoring accuracy, DRC closely reviews daily reports. The reports include item-level data as well as individual scorer data, including scorer number, number of responses scored, individual score point distributions, and exact, adjacent, and non-adjacent agreement rates. DRC investigates any issues and resolves any problems those reports identify. DRC can provide ODE with a copy of the reports on a daily basis or at the end of the project, depending on ODE’s preference.
DRC also studies the inter-rater agreement. Appendix H shows an item summary report of the Inter-Rater Reliability for each test. For operational assessments, DRC strives for 80% exact agreement on 0-2 point items and 70% exact agreement on 0-4 point items. DRC monitors the agreement rates with this in mind, and investigates outliers. There are generally three different causes of lower inter-rater reliability; each cause triggers a different response.
One cause may be scorers misapplying the scoring criteria defined by the rubric and exemplified by the anchor responses. In this case, scorers are re-trained (generally using responses from Team Leader read-behinds for feedback), and, if necessary, scores are erased so that the responses can be redistributed and rescored.
A second, less common cause may be some ambiguity in the rubric or the training materials. If this is uncovered, DRC will work with AIR and ODE to update the rubric and/or the training materials and rescore the responses (if time permits and everyone agrees to this solution).
A third, infrequent cause may be an item that inherently leads to lower reliability. If this is uncovered, DRC will work with AIR and ODE to see if there may be a way to improve the reliability by modifying the rubric or training materials.
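As a rough illustration of the agreement statistics and targets discussed in this subsection, the Python sketch below computes exact, adjacent, and non-adjacent agreement from paired reads and flags an item that falls below the applicable exact-agreement target; the data layout and functions are assumptions for the example, not DRC’s implementation.

def agreement_rates(paired_reads):
    """Compute exact, adjacent, and non-adjacent agreement from (read1, read2) score pairs."""
    n = len(paired_reads)
    exact = sum(1 for r1, r2 in paired_reads if r1 == r2) / n
    adjacent = sum(1 for r1, r2 in paired_reads if abs(r1 - r2) == 1) / n
    non_adjacent = sum(1 for r1, r2 in paired_reads if abs(r1 - r2) > 1) / n
    return {"exact": exact, "adjacent": adjacent, "non_adjacent": non_adjacent}

def below_target(paired_reads, max_points):
    """Flag an item whose exact agreement is below 80% (0-2 point items) or 70% (0-4 point items)."""
    target = 0.80 if max_points == 2 else 0.70
    return agreement_rates(paired_reads)["exact"] < target

# Made-up second-read pairs for a 0-4 point item.
pairs = [(2, 2), (3, 3), (1, 2), (4, 4), (0, 2)]
print(agreement_rates(pairs))             # {'exact': 0.6, 'adjacent': 0.2, 'non_adjacent': 0.2}
print(below_target(pairs, max_points=4))  # True: exact agreement is below the 70% target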
11.2.9 HANDLING UNUSUAL RESPONSES AND DISTURBING RESPONSES
Unusual or aberrant responses that cannot be assigned a score receive a nonscorable code following the rules in Exhibit 11.2.9.1. Note that all rubrics have a score point range that includes a score point of “0,” which is applied to incorrect responses. This limits the types of responses that can be deemed nonscorable due to falling outside of the criteria defined by the rubric.
Exhibit 11.2.9.1: OST Non-Scorable Codes
Non-Score Value | Meaning | Definition/Examples/Notes
B | Blank | The response is completely blank (nothing on the entire response).
F | Foreign Language | The response is written in a language other than English.
U | Unreadable | The response is unreadable. For example, an online response is unreadable if it only contains repeated/random keystrokes (e.g., “yyyyyyyyyyy”; “av:aeoiahvb;e”; “hhrrttuuvv”). An operational paper-pencil test may be unreadable if it simply contains random letters or drawings or is indecipherable for other reasons.
T | Off-Topic | The student writes to a subject that is unrelated to the prompt. (This non-score code is only used on the writing prompts.) DRC will score off-topic responses for conventions.
If a scorer assigns a nonscorable code other than Blank to a response, DRC’s Image Handscoring System automatically forwards the response to the Scoring Director. The Scoring Director reviews the response and makes the final determination. During scoring, DRC contacts the designated ODE representative to obtain a ruling on responses that cannot be assigned a score based on our understanding at that point.
To handle possible alert papers (student responses indicating potential issues related to the student’s safety and/or well-being that may require attention at the local level), DRC’s imaging system gives scorers the ability to alert questionable student responses. An alerted image is routed to the Scoring Director, who will print the response if he/she determines it to be alertable. Next, these alerts are reviewed by the Handscoring Project Advisor, who then sends copies of the responses to DRC’s Project Management Team. If they also conclude that the response warrants an alert, it is then sent to ODE. At no time during scoring do scorers have access to demographic information on any students participating in the assessment.
12. QUALITY CONTROL PROCEDURES
Quality assurance procedures are enforced through all stages of OST test development, administration, and scoring and reporting of results. This section describes quality assurance procedures associated with the following:
• Test construction
• Test production
• Answer document processing
• Data preparation
• Equating and scaling
• Scoring and reporting
Because quality assurance procedures pervade all aspects of test development, we note that discussion of quality assurance procedures is not limited to this section, but is also included in sections describing all phases of test development and implementation.
12.1 QUALITY ASSURANCE IN TEST CONSTRUCTION
Each form is built to exactly match the detailed test blueprint, and match the target distribution of item difficulty and test information. Together, these constitute the definition of the instrument. The blueprint describes the content to be covered, the depth of knowledge with which it will be covered, the type of items that will measure the constructs, and every other content-relevant aspect of the test. The statistical targets ensure that students will receive scores of similar precision, regardless of which form of the test they receive.
AIR’s test developers use the FormBuilder software to help construct operational forms. FormBuilder interfaces with AIR’s Item Authoring Tool (IAT) to extract test information and interactively creates test characteristic curves (TCCs), test information curves (TICs), and Standard Error of Measurement Curves (SEMCs) as test developers build a test map. This helps our content specialists ensure that the test forms are statistically parallel, in addition to ensuring content parallelism.
Immediately upon generation of a test form, the FormBuilder generates a blueprint match report to ensure that all elements of the test blueprint have been satisfied. In addition, the FormBuilder produces a statistical summary of form characteristics to ensure consistency of test characteristics across test forms. The summary report also flags items with low biserial correlations, as well as very easy and very difficult items. Although items in the operational pool have passed through data review, construction of fixed-form assessments allows another opportunity to ensure that poorly performing items are not included in operational test forms.
The FormBuilder also plots the distribution of item difficulties, both classical and IRT indices, to both flag extremely easy or difficult items and to ensure that the distribution of item difficulties is consistent across test forms. As test developers build forms, FormBuilder generates TCCs, TICs, and SEMCs for the reference (previously used) form and the target (new) form(s) on the screen. The TCCs and SEMCs are plotted using a different color trace line for each prototype form. Using FormBuilder, our content specialists select test items that match the blueprint and are of appropriate difficulty. Beginning with content considerations and supplementing those considerations with statistical considerations, AIR creates alternate, parallel test forms by comparing TCCs for the form that is being created with TCCs from previous forms. To the degree that the TCC for the total test is the same as for previous tests, the raw score required for meeting any performance standard will remain close to the same as it was on previous forms.
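To make the statistical matching concrete, the short Python sketch below compares the test characteristic curves of a reference form and a candidate form under a 3PL model; the item parameters are illustrative values for the example, not OST operational parameters.

import numpy as np

def p_correct(theta, a, b, c):
    """3PL probability of a correct response at ability theta."""
    return c + (1 - c) / (1 + np.exp(-1.7 * a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected raw score at each theta."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items)

theta = np.linspace(-4, 4, 81)
reference_form = [(1.0, -0.5, 0.20), (0.8, 0.0, 0.20), (1.2, 0.7, 0.25)]  # illustrative (a, b, c)
candidate_form = [(1.1, -0.4, 0.20), (0.9, 0.1, 0.20), (1.1, 0.6, 0.25)]

gap = np.max(np.abs(tcc(theta, reference_form) - tcc(theta, candidate_form)))
print(f"Largest TCC difference (expected raw-score units): {gap:.3f}")  # large gaps prompt further review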
When submitting test forms for review by ODE, AIR produces a form evaluation workbook that includes an evaluation summary checklist, as well as summary statistics and test characteristic graphs.
The mechanical features of a test—arrangement, directions and production—are just as important as the quality of the items. Many factors directly affect a student’s ability to demonstrate proficiency on the assessment, while
others relate to the ability to score the assessment accurately and efficiently. Still others affect the inferences made from the test results.
When the test developer is reviewing a test form for content, in addition to making sure all the benchmark/indicator item requirements are met, test developers must also make sure that the items on the form do not cue each other—that one item does not present material that indicates the answer to another item. This is important to ensure that a student’s response on any particular test item is unaffected by, and is statistically independent of, a response to any other test item. This is called “local independence.” Independence is most commonly violated when there is a hint in one item about the answer to another item. In that case, a student’s true ability on the second item is not being assessed.
Once the items and passages for the form have been selected and matched against the blueprint, the test developer reviews the form for a variety of additional content considerations, including the following:
• The items are sequentially ordered
• Each item of the same type is presented in a consistent manner
• The listing of the options for the multiple-choice items is consistent
• The answer options are lettered with A, B, C, and D
• All graphics are consistently presented
• All tables and charts have titles and are consistently formatted
• The number of times each answer choice letter is keyed is approximately equal across the form
• The answer key has been checked by the initial reviewer and one additional independent reviewer
• All stimuli have items associated with them
• The topics of items, passages, or stimuli are not too similar to one another
• There are no errors in spelling, grammar, or accuracy of graphics
• The wording, layout, and appearance of the item matches how the item was field-tested
• There is gender and ethnic balance
• The passage sets do not start with or end with a constructed-response item
• Each item and the form have been checked against the appropriate style guide
• The directions are consistent across items and are accurate
• All copyrighted materials have up-to-date permissions agreements
• Word counts are within documented ranges
After completing the initial build of the form, the test developer hands it off to another content specialist, who conducts a final review of the criteria listed above. If the reviewing content specialist finds any issues, the form is sent back for revisions. If the form meets the blueprint and complies with all specified criteria, the test developer sends it to the psychometric team for review. When the psychometric team approves the form, the test developer uploads the item list into FormBuilder. After operational forms were defined in FormBuilder, all bookmaps (test maps), key files, and conversion tables were produced directly from FormBuilder to eliminate the possibility of human error in the construction of these important files. Bookmaps, key files, conversion tables, and other critical documents were generated directly from information maintained in IAT. The information stored in IAT is rigorously reviewed by multiple skilled reviewers to protect against errors. Automated production of these critical files (such as key files) virtually eliminates opportunities for errors.
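A simplified sketch of generating a key file directly from bank records, in the spirit of the automated production described above, follows; the item IDs, keys, and data structures are hypothetical and stand in for the information maintained in IAT.

from collections import Counter

# Hypothetical bank records: each item's keyed response, stored once.
item_bank_keys = {"ITM001": "B", "ITM002": "D", "ITM003": "A", "ITM004": "C", "ITM005": "B"}

# Hypothetical form definition: ordered list of item IDs as placed on the form.
form_items = ["ITM001", "ITM002", "ITM003", "ITM004", "ITM005"]

def build_key_file(form_items, item_bank_keys):
    """Produce (position, item ID, key) rows straight from the bank so keys are never retyped by hand."""
    missing = [item for item in form_items if item not in item_bank_keys]
    if missing:
        raise ValueError(f"items on the form without a bank key: {missing}")
    return [(position + 1, item, item_bank_keys[item]) for position, item in enumerate(form_items)]

key_file = build_key_file(form_items, item_bank_keys)
print(key_file)
print(Counter(key for _, _, key in key_file))  # reviewers can also confirm the balance of keyed letters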
Bookmaps include any item attribute stored in IAT, so that in addition to form-level attributes such as test administration and item position, item attributes such as learning standard, benchmark, indicator, complexity, item release status, point value, weight, keyed response, and more are included in the bookmap. The bookmap feature in FormBuilder was customized to OST.
As a further layer of quality assurance for printed test booklets, both during the blueline production phase prior to printing and again following the final printing of all test forms, two AIR technical team staff members independently took all test forms. Responses to the test forms were compared to the answer keys for each form to
confirm the accuracy of scoring keys. In addition, the printed forms were compared against IAT and FormBuilder for content and item ordering to ensure that no changes to the form were introduced prior to printing.
12.2 QUALITY ASSURANCE IN TEST PRODUCTION
The production of computer-delivered assessments involves two distinct types of products, each of which follows an appropriate quality assurance process:
1. Content for online delivery shares some processes with paper-pencil versions, but also requires additional, unique steps.
2. Online test delivery software must deliver the content reliably, along with the appropriate tools, accommodations, layouts, and other supports.
OST assessments’ test delivery system also has a real-time quality-monitoring component built in. As students are administered assessments, data flow through the test delivery system’s Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.
QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem.
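The record-level integrity checks attributed to QM can be pictured with the following sketch; the record layout and field names are assumptions made for the example, not the production software.

def check_test_record(record, expected_items, invalidated_items):
    """Return a list of integrity problems for one test record (hypothetical layout)."""
    problems = []
    missing = set(expected_items) - set(record)
    if missing:
        problems.append(f"missing data for items: {sorted(missing)}")
    present_but_invalidated = set(record) & set(invalidated_items)
    if present_but_invalidated:
        problems.append(f"data present for invalidated items: {sorted(present_but_invalidated)}")
    return problems

record = {"ITM001": "A", "ITM002": "C", "ITM099": "B"}  # made-up captured responses
print(check_test_record(record,
                        expected_items=["ITM001", "ITM002", "ITM003"],
                        invalidated_items=["ITM099"]))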
12.2.1 PRODUCTION OF CONTENT
While the online workflow requires some additional steps, it actually removes a substantial amount of work from the time critical path, reducing the likelihood of errors. Like a test book, an online system can deliver a sequence of items; however, the online system makes the layout of that sequence algorithmic. A paper-pencil form must await final forms construction before blackline proofs can show how the item will look in the booklet. Online, the appearance of the item screen can be known with certainty before the final test form is ever constructed. This characteristic of online forms enables us to lock down the final presentation of each item well before forms are constructed. In turn, this moves the final blueline review of items much earlier in the process, removing it from the critical path.
The production of computer-based tests includes five key steps:
1. Final content is previewed and approved in a process called web approval. Web approval packages the item exactly as it will be displayed to the student.
2. Forms are finalized using the process described in Section 6.3, and final forms are approved in our FormBuilder software.
3. Complete test packages are created with our test packager, which gathers the content, form information, display information, and relevant scoring and psychometric information from the item bank and packages it for deployment.
4. Forms are initially deployed to a test site where they undergo platform review, a process during which we ensure that each item displays properly on a large number of platforms representative of those used in the field.
5. The final system is deployed to a staging environment accessible to ODE for user acceptance testing and final review.
12.2.2 WEB APPROVAL OF CONTENT DURING DEVELOPMENT
The Item Authoring Tool (IAT) integrates directly with the test delivery system (TDS) display module, and displays each item exactly as it will appear to the student. This process is called web preview, and web preview is tied to specific item review levels. Upon approval at those levels, the system locks content as it will be displayed to the student, transforming the item representation to the exact representation that will be rendered to the student. No change to the display content can occur without a subsequent web preview. This process freezes the display code that will present the item to the student.
Web approval functions as an item-by-item blueline review. It is the final rendering of the item as the student will see it. Layout changes can be made after this process in two ways:
1. Content can be revised and re-approved for web display.
2. Online style sheets can change to revise the layout of all items on the test.
Both of these processes are subject to strict change control protocols to ensure that accidental changes are not introduced. Below, we discuss automated quality control processes during content publication that raise warnings if item content has changed after the most recent web-approved content was generated. The web approval process offers the benefit of allowing final layout review much earlier in the process, reducing the work that must be done during the very busy period just before tests go live.
12.2.3 APPROVAL OF FINAL FORMS
Section 6.3 describes our process for constructing operational test forms, including the approval of test forms by ODE. The forms are built in FormBuilder (a component of our IAT), and upon approval, they are ready for preliminary publication.
12.2.4 PACKAGING
The test packaging system performs two simultaneous roles in the preparation of computer-based products: It compiles the form definitions and other information about how the test is to be administered (e.g., where any embedded field-test items might be inserted) and pulls together the content packaged during web approval.
The test packager assigns form identifiers to each form, evaluates the form against the blueprint, and performs a quality check against the content. The content quality check includes checks to see that every asset (e.g., graphics) referenced in the item is included in the package, confirms that the item has not changed since it was web approved, and ensures that the items have received all the approvals necessary for publication.
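The asset-presence and changed-since-web-approval checks might look something like the sketch below, assuming each item carries a list of referenced assets and a content hash captured at web approval (both hypothetical fields):

import hashlib

def package_checks(item, package_assets):
    """Return packaging problems for one item (hypothetical fields and layout)."""
    problems = []
    # Every asset (e.g., a graphic) referenced by the item must be present in the package.
    for asset in item["assets"]:
        if asset not in package_assets:
            problems.append(f"missing asset: {asset}")
    # The content being packaged must be identical to what was locked at web approval.
    current_hash = hashlib.sha256(item["rendered_content"].encode()).hexdigest()
    if current_hash != item["web_approved_hash"]:
        problems.append("content changed since web approval")
    # The item must carry every approval required for publication.
    if not item["approvals_complete"]:
        problems.append("missing required approvals")
    return problems

approved = "<div>item stem ...</div>"
item = {
    "assets": ["figure1.png"],
    "rendered_content": approved,
    "web_approved_hash": hashlib.sha256(approved.encode()).hexdigest(),
    "approvals_complete": True,
}
print(package_checks(item, package_assets={"figure1.png"}))  # [] means no problems found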
12.2.5 PLATFORM REVIEW
Platform review is a process in which each item is checked to ensure that it is displayed appropriately on each tested platform. A platform is a combination of a hardware device and an operating system. In recent years, the number of platforms has proliferated, and platform review now takes place on approximately 15 platforms that are significantly different from one another.
A team conducts platform review. The team leader projects the item as it was web approved in IAT, and team members, each behind a different platform, look at the same item to see that it renders as expected.
12.2.6 USER ACCEPTANCE TESTING AND FINAL REVIEW
Prior to deployment, the testing system and content are deployed to a staging server where they are subject to user acceptance testing (UAT). UAT of the test delivery system serves both a software evaluation and content approval role. The UAT period provides ODE with an opportunity to interact with the exact test with which the students will interact.
12.2.7 FUNCTIONALITY AND CONFIGURATION
The items, both in themselves and as configured onto the tests, form one type of online product. The delivery of that test can be thought of as an independent service. Here, we document quality assurance procedures for delivering the online assessments.
One area of quality unique to online delivery is the quality of the delivery system. Three activities provide for the predictable, reliable, quality performance of our system:
1. Testing on the system itself to ensure function, performance, and capacity
2. Capacity planning
3. Continuous monitoring
AIR statisticians examine the delivery demands, including the number of tests to be delivered, the length of the testing window, and the historic state-specific behaviors to model the likely peak loads. Using data from the load tests, these calculations indicate the number of each type of server necessary to provide continuous, responsive service, and AIR contracts for service in excess of this amount. Once deployed, our servers are monitored at the hardware, operating system, and software platform levels with monitoring software that alerts our engineers at the first signs that trouble may be ahead. Applications log not only errors and exceptions, but also latency (timing) information for critical database calls. This information enables us to know instantly whether the system is performing as designed, or if it is starting to slow down or experience a problem.
In addition, latency data is captured for each assessed student—data about how long it takes to load, view, or respond to an item. All of this information is logged as well, enabling us to automatically identify schools or districts experiencing unusual slowdowns, often before they even notice.
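One way to picture the automated slowdown detection is the sketch below, which aggregates per-student item-load latencies by school and flags averages well above a baseline; the thresholds, identifiers, and data layout are illustrative only, not production values.

from statistics import mean

def flag_slow_schools(latencies_by_school, baseline_ms=800, factor=2.0):
    """Flag schools whose average item-load latency is well above the baseline.

    latencies_by_school maps a school identifier to a list of item-load times in milliseconds;
    baseline_ms and factor are illustrative thresholds.
    """
    flagged = {}
    for school, times in latencies_by_school.items():
        avg = mean(times)
        if avg > factor * baseline_ms:
            flagged[school] = avg
    return flagged

sample = {"SCHOOL_A": [750, 820, 790], "SCHOOL_B": [2400, 2600, 2500]}  # made-up identifiers and timings
print(flag_slow_schools(sample))  # {'SCHOOL_B': 2500}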
12.3 QUALITY ASSURANCE IN DOCUMENT PROCESSING
12.3.1 SCANNING ACCURACY
Quality assurance procedures for the scanning process begin before the paper-pencil tests are ever shipped to districts.
The scanning process begins with a specifications document that defines the scanning, edit check, and other business rules that will be used to score responses, capture images and flag scans for human review and editing.
Once ODE gives approval, DRC programmers write the customized scanning programs meeting the specifications.
DRC next reviews the programming in systematic testing and code review, as it does with all of its software development.
DRC then tests the system using a test deck (mock answer documents) marked to cover all responses, blanks, multiple marks, imperfect gridding, and other markings to be defined in conjunction with ODE. These tests validate that the programs process each type of marking as intended. The checks to be conducted include
• readability of security, student and school bar codes;
• data capture of pregridded and bar code information;
• accurate capture of district and school codes;
• consistent data capture on all scanners;
• accurate scan positions on all documents and forms; and
• scanner calibration and hardware functionality.
Both AIR and DRC quality management staff confirm the results of the tests before the programs are approved for use.
In addition, once real answer forms arrive, DRC visually inspects a sample to verify the accuracy of the scoring and the correct implementation of the business rules.
Throughout the scanning process, batches are checked for quality and scanning accuracy by experienced document processing staff. All scanners are calibrated and cleaned on a regularly scheduled basis to ensure accurate and consistent scoring. DRC also has an on-site field service engineer to resolve any technical issues as they arise.
DRC’s scanning process produces comprehensive, detailed information, including
• student demographic data;
• student multiple-choice response data;
• TIFF images of complete documents; and
• identifiers to link the TIFF images to the student demographic data.
12.3.2 QUALITY ASSURANCE IN EDITING AND DATA INPUT
After each batch is scanned, the documents are processed through a computer-based editing program to detect potential errors caused by smudges, multiple marks, and omits in the specified response fields. Marks or omits that do not meet the predefined editing standards are routed to the document processing editing staff for resolution.
Using the unique serial number printed on the document during scanning, the editor compares the actual document to the online data. Corrections are then made to the data file according to predefined, Ohio-specific specifications. The editing staff follows strict quality control procedures to produce clean data files that can be submitted for scoring and reporting functions.
Post-Editing
A final edit is performed to confirm that all requirements for final processing have been met. Once the demographic information and multiple-choice data pass all the predefined editing processes, the images of the student responses to constructed-response (CR) items are extracted into files for scoring. The CR student response images are routed through the DRC Imaging Workflow System to handscoring terminals at DRC’s Scoring Centers for scoring by qualified readers. Images are stored so that they can be efficiently retrieved based on student and school identification information, scores, and item information. Upon completion of processing, scannable documents are boxed for security purposes and final storage.
Throughout the process, DRC operators maintain an issues log. The quality assurance staff will review the log to ensure that every issue has been adequately resolved before the final data are validated.
Data File Construction
DRC ensures that all student answer documents have been accounted for and processed through scanning, pre-editing and post-editing processes. After staff confirms that these processes are complete, final data collection processes begin. The original scanned multiple-choice data are converted into a master student file. Record counts are verified against the counts from the document processing staff to ensure that all students are accounted for in the file.
The data file includes scored data. AIR has developed reliable procedures for ensuring accurate answer keys. The answer to each item is maintained along with the item in the IAT. These keys go through extensive review during development, and after books are ready for print, two members of our technical team take each form of the test as a final confirmation of the key information stored in the IAT. From that point, all keys are automatically generated in machine-readable format from the IAT, virtually eliminating the possibility of version errors or other human errors in these critical documents. DRC systems read the key files generated by the IAT. DRC’s Software Quality Assurance staff compares their scoring file against the approved answer key source file to ensure that it is 100% accurate. AIR staff members independently conduct a review of the data received from DRC, providing an independent accuracy check.
12.4 QUALITY ASSURANCE IN DATA PREPARATION
AIR’s test delivery system has a real-time quality-monitoring component built in. As students test, data flow through our Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.
QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem. This automated process is similar to the sorts of checks that are done for data review, but (a) they are done on operational data and (b) they are conducted in real time so that our psychometricians can catch and correct any problems before they have an opportunity to do any harm.
Data pass directly from the QM to the database of record (DoR), which serves as the repository for all test information, and from which all test information for reporting is pulled. The data extract generator (DEG) is the
tool that is used to pull data from the DoR for delivery to ODE and their quality assurance contractor. AIR psychometricians ensure that data in the extract files matches the DoR prior to delivery to ODE.
12.5 QUALITY ASSURANCE IN TEST FORM EQUATING
Item information necessary for statistical and psychometric analyses is provided to ODE and Assessment and Evaluation Services (AES), ODE’s independent quality assurance contractor, prior to test administration. Item information is published as part of the configuration of the online assessment system that AIR employs for administering, scoring, and reporting test scores. Information contained in these workbooks includes, but is not limited to, unique item ID used for item tracking, test form ID, location on the test form, correct answer, item difficulty, and information about the strand, standard, and benchmark each item measures. These item files are used in quality control checks of the assessment data scoring and analysis.
To ensure security, all data is shared using AIR’s SFTP site.
Prior to operational work, AIR produces simulated datasets for the purpose of testing software and analysis procedures, including a dry run of calibration and post-equating activities in which results are compared. The practice runs serve two functions:
• To verify the accuracy of program code and procedures.
• To evaluate the communication and workflow among participants. If necessary, the team will reconcile differences and correct production or verification programs.
Following the completion of these activities and resolution of questions that arise, analysis specifications are finalized.
12.6 QUALITY ASSURANCE IN SCORING AND REPORTING
12.6.1 QUALITY ASSURANCE IN HANDSCORING
The entire scoring process is managed by DRC’s electronic scoring system, which implements many programmatic controls. The system enables team leaders to call up individual responses, monitor a variety of indicators and designate items for rescoring. Throughout the scoring, the following processes will monitor the validity and reliability of the scores assigned:
Backreading, in which team leaders continuously review the work of the scorers on their teams.
Validity testing, in which each scorer scores at least two validity responses each day. A validity response is a response that has been pre-scored by expert scorers and is placed in the queue of responses to be scored. These responses are “blind” to scorers; scorers do not realize they are assessing pre-scored responses. The data generated by these processes will be presented in a set of at least nine reports, as described in Deliverables 22 and 24.
When a reader does not provide scores that are sufficiently reliable or valid, DRC has a range of remediation options:
Individual coaching, which typically occurs when a team leader disagrees with a reader’s score. The reader and team leader can discuss the situation immediately, which has proved to be a very effective type of feedback because it is done with responses that were recently scored by a particular reader.
Retraining, which occurs if scorers show unacceptable rates of missed validity scores or high non-adjacency rates with other scorers. Errant scorers are retrained in scoring the item and do not score any further operational responses until they have re-qualified to score the item.
Dismissal, which occurs if scorers continue to assign inaccurate scores following retraining on an item. All scores of responses by a dismissed employee are erased and the responses are re-routed to accurate scorers.
DRC’s Image Scoring System maintains the information needed to identify and rescore all papers scored by errant readers so that bad scorers will not contaminate student data.
Routing Responses to Ensure “Blind” Second Reads
DRC’s Image Handscoring System separates responses by item and subject and routes them to qualified scorers, who read each response and enter the score. The process of routing and scoring responses continues until all responses have received the prescribed level of readings. Scorers cannot tell if they are conducting first or second readings; all readings are “blind.”
Monitoring Scorers
DRC generates handscoring reports on demand in order to monitor progress and maintain handscoring quality control. Reports are also automatically generated overnight. DRC provides copies of reports in PDF format to ODE daily or on demand.
During the handscoring process, the scoring directors meet with their team leaders each morning to review the statistics generated from the previous day’s work. If problematic scoring patterns are apparent among individual scorers, team leaders deal with these issues on an individual basis. DRC’s imaging system allows a team leader to determine read-behind rates (frequency of monitoring) for each scorer. If a scorer appears to need clarification of the scoring rules, or is scoring tentatively, DRC typically monitors one out of five readings. The imaging system randomly selects which images the team leader monitors.
DRC also monitors the inter-rater reliability. If a scorer falls below an acceptable rate of agreement, the team leader re-trains the scorer. If the scorer fails to improve after retraining and feedback, DRC may remove the scorer from the project. In this situation, DRC removes all scores assigned by the scorer in question. The responses are then re-dealt and rescored.
DRC does not report on scorer performance after the fact, as some contractors do. DRC believes that scorers with less-than-acceptable scoring patterns must be identified immediately and those patterns corrected. DRC has worked diligently to devise effective monitoring reports and procedures to accomplish both detection and correction. Accurate and consistent results are the backbone of all handscoring activities. The following methods used by DRC help ensure scoring quality:
• Rigorous training and qualifying for each item ensures a pool of scorers who will apply consistent and accurate scores.
• Recalibration sets re-focus scorers on the scoring standards by comparing the pre-determined score to that assigned by the scorer.
• Validity responses detect possible room drift and individual drift. Validity reports compare scorers’ scores to pre-determined scores. Validity responses are seeded to scorers. Scorers cannot distinguish validity responses from live responses, making this a powerful measure of quality control.
• Team leaders conduct routine read-behinds to observe, in real time, scorers’ performance. Team leaders utilize live, scored responses to provide ongoing feedback and, if necessary, retraining for scorers.
• Inter-rater reliability and score point distribution reports are generated daily or on demand to monitor scorer reliability and maintain an acceptable level of scoring accuracy. The reports compile individual scorer data, including the scorer identification number, the number of responses scored, individual score
point distributions, and exact agreement rates. DRC investigates any issues and resolves any problems identified by the reports.
Handscoring Quality Assurance Monitoring Reports
Because DRC produces these reports on demand, immediate action can be taken to resolve scoring discrepancies within minutes of the first and/or second reading, when necessary. DRC prepares a number of reports to monitor the quality and effectiveness of various aspects of the project. The reports are described in Exhibit 12.6.1.1.
Exhibit 12.6.1.1: QA Monitoring Reports
Report | Report Specifics
Scoring Summary Report | DRC’s Scoring Summary Report provides daily and cumulative inter-rater reliability results, score point distribution data, and production volumes for each reader and item.
Inter-rater Reliability | Monitors how often scorers are in exact agreement with each other and ensures that an acceptable agreement rate is maintained. This report provides daily and cumulative exact and adjacent inter-scorer agreement and the percentage of responses requiring resolution (only if required). The calculations for this report are as follows:
• Percent Exact—total number of responses by scorer where scores are equal divided by the number of responses that were scored twice.
• Percent Adjacent—total number of responses by scorer where scores are one point apart divided by the number of responses that were scored twice.
• Percent Non-Adjacent—total number of responses by scorer where scores are more than one score point apart divided by the number of responses that were scored twice.
Score Point Distribution | Monitors the percentage of responses given to each of the score points. For example, for items on a 0–4 point scale, this daily and cumulative report shows how many 0s, 1s, 2s, 3s, and 4s a scorer has given to all the responses he or she has scored at the time the report is produced. These percentages can be compared to room-wide percentages to detect individual scoring issues.
Production Volumes | This report also indicates the number of responses read by each scorer each day so that production rates can be monitored. Additionally, it includes totals for each item, so that progress toward completion can be monitored.
Item Status Report | Monitors the progress of handscoring. This report tracks each response and indicates the status (e.g., “needs a second reading,” “complete”). This report ensures that all discrepancies are resolved by the end of the project.
Responses Read by Reader Report | Identifies all responses scored by an individual scorer. This report is useful if any responses need rescoring due to potential scorer drift.
Read-Behind Log | Used by team leaders/scoring directors to monitor scorer reliability. Team leaders randomly select and read scored responses from each team member daily. If the team leader disagrees with the scorer’s score, remediation occurs, either with the team leader or the scoring director. This has proven to be a very effective form of feedback because it is implemented with items live-scored by individual scorers.
Validity Reports | These reports can be generated on demand throughout the scoring process. All validity reports compare pre-determined scores to scorers’ scores for validity responses. These reports can be run at the individual, team, or room level in order to detect individual, team, or room-wide scorer drift.
Identifying, Evaluating, and Informing the State on Alert Papers
DRC applies a nonscorable code to unusual or aberrant responses that cannot be assigned a score. Prior to scoring, DRC and AIR work closely with ODE to define what constitutes a nonscorable response. During handscoring, DRC contacts the designated ODE representative to obtain a ruling on any response that cannot be assigned a score or a nonscorable code based on current understanding. The image handscoring functionality forwards all potential nonscorable responses to the scoring director. Only the scoring director is able to assign the nonscorable code.
To handle possible alerts (responses indicating potential issues that may require attention at the state or local level, such as potential security breaches or concerns about a student’s safety), DRC’s imaging system gives scorers the ability to alert individual student responses. Alerted images are routed to the scoring director who will print any responses deemed to indicate a potential issue. At no time during scoring do scorers have access to the demographic information of any students participating in the assessment.
Next, these alerts are routed to a handscoring senior manager who reviews them and, if needed, sends copies of the student’s responses to DRC project management staff. A project manager forwards copies of the alerts to ODE.
12.6.2 QUALITY ASSURANCE FOR SCORE REPORTING
As test results come back from the Test Delivery System or scanning-scoring process, they are routed to our test integration system and Quality Monitor, and ultimately to our systems for reporting. Here, we summarize the quality checks that are implemented:
• The student response and score data to be reported are correct.
• The reporting software systems accurately report and aggregate the student scores.
• Paper reports contain accurate data correctly displayed, and any Ohio-specific programming is tested and replicated to ensure that it is error free.
• Print and packaging quality is maintained for paper reports.
Student Response and Score Data Are Correct
Data entering the reporting process, either from paper-based or online tests, flow through our Quality Monitor (QM) software. QM conducts a series of data integrity checks, ensuring, for example, that the record for each test contains information for each item that was supposed to be on the test, and that the test record contains no data from items that have been invalidated. QM scores the test, recalculates performance level designations, calculates subscores, compares item parameters to the reference item parameters in the bank, and conducts a host of other checks.
QM also aggregates data to detect problems that become apparent only in the aggregate. For example, QM monitors item fit and flags items that perform differently operationally than their item parameters predict. This functions as a sort of automated key or rubric check, flagging items where data suggest a potential problem. This automated process is similar to the sorts of checks that are done for data review, but (a) they are done on operational data; and (b) they are conducted in real time so that our psychometricians can catch and correct any problems before they have an opportunity to do any harm.
Reporting System Software Accurately Reports and Aggregates Student Scores
Although test scores in the base year could not be reported until after standard-setting activities were completed, online reports can be configured to appear in real time, and ODE may report test scores immediately in future test administrations. Therefore, quality assurance cannot rely on post-hoc reviews of reports. Instead, the accuracy of the reporting system rests on the quality of programming of the system, the implementation of Ohio’s business rules, and the quality of the algorithms used for aggregating scores.
In building our systems, we undergo an extensive software testing process and when we configure the systems for individual clients, we test them by running simulated or real data through the full system, allowing the system to generate reports. Our statistical programming team simultaneously implements the same rules and processes the same data. Statistics from the system are then compared to statistics produced by our statistical team. Discrepancies are tracked down and resolved.
The entire process is guided by a set of complete reporting and analysis specifications. Both the software team and the independent statistical programming team work from the same specifications, but work independently. This process provides an independent check on the immediate reporting system before it is deployed. The data is available for review by ODE upon request.
Quality Assurance—Statistical Programming
All custom programming is guided by detailed and precise specifications in our Reporting Specifications document. Upon approval of the specifications, analytic rules are programmed and each program is extensively tested on test decks and real data from other programs. Two senior statisticians and one senior programmer review the final programs to ensure that they implement agreed-upon procedures.
Custom programming is implemented independently by two statistical programming teams working from the specifications. Only when the output from both teams matches exactly are the scripts released for production. Quality control, however, does not stop there.
Much of the statistical processing is repeated and AIR has implemented a structured software development process to ensure that the repeated tasks are implemented correctly and identically each time. We write small programs (called macros) that take specified data as input and produce data sets containing derived variables as output. Approximately 30 such macros reside in our library for the grades 3–8 program score reports. Each macro is extensively tested and stored in a central development server. Once a macro is tested and stored, changes to the macro must be approved by the Director of Score Reporting and the Director of Psychometrics, as well as by the project directors for affected projects. A complete retesting with the entire collection of scenarios on which the macro was originally tested follows each change.
The main statistical program is mostly made up of calls to various macros, including macros that read in and verify the data and conversion tables and the macros that do the many complex calculations. This program is developed and tested using artificial data generated to test both typical and extreme cases. In addition, the program goes through a rigorous code review by a senior statistician.
Quality Assurance—Display Programming
The reports are programmed in a Xerox-developed language called VIPP. VIPP code is tested using both artificial and real data. AIR’s data generation utilities can read the output layout specifications and generate artificial data
for direct input into the VIPP programs. This allows the testing of these programs to begin before the statistical programming is complete. In later stages, artificial data are generated according to the input layout and run through the psychometric process and the score reporting statistical programs, and the output is formatted as VIPP input. This enables us to test the entire system.
Once we receive final data and VIPP programs, the AIR Score Reporting team reviews proofs that contain actual data based on quality assurance documentation that is provided by ODE. In addition, we compare data independently calculated by AIR psychometricians with data on the reports. A large sample of each type of report is reviewed by several AIR staff members to make sure that all data are correctly placed on reports. This rigorous review typically is conducted over several days and takes place in a secure location in the AIR building. All reports containing actual data are stored in a locked storage area. Staff from ODE and its Quality Assurance contractor are welcome to visit AIR at any point during this review process to oversee our procedures.
Prior to printing the reports, AIR provides a live data file and reports with sample districts as chosen by ODE for review. AIR works closely with the ODE to resolve questions and correct any problems. The reports will not be delivered unless ODE approves the sample reports and data file.
Print and Package Quality is Maintained
Automated tools help ensure quality and accuracy. PrintTracker tracks and manages the print and packaging process. AIR’s PrintTracker software is an online tool for managing the print/pack/ship process across multiple print sites and vendors. PrintTracker manages the workload at each printer, reallocating work in response to unforeseen downtime. It monitors the production of each shipment (e.g., reports going to a district) to ensure that the correct number of packets is packed in the expected number of boxes and shipped to the correct address. Any conflicting information entered by the print operator generates a discrepancy report, which is immediately emailed to the print lead and the on-site AIR representative. Each discrepancy is resolved before a shipment can be released. This ensures that the correct score reports make it to the correct destination every time.
Inline smart-press technology allows inline sorting, folding, stitching, trimming, and offset stacking of reports with varying page counts. This all occurs without human intervention, reducing the opportunities for errors. Materials go straight from offset stacked, collated sets into sealed school packages. The packaging can be heavy cardboard envelopes or boxes, depending on size. School packages are boxed for shipping. Customized packing lists and labels are created from the same data set used to generate the report. For quality assurance, each report is assigned a unique serial number, printed inline on the gutter of each signature as an additional human readable QA device.
PrintTracker has multiple built-in quality assurance features to ensure that reports are correctly packaged and shipped. For example, an operator at a print site must record the number of packages and boxes destined for each school or district. PrintTracker confirms this number against the database before indicating to the operator that the package may be sealed and allowing him or her to print shipping labels. Any discrepancies are reported to, and resolved by, AIR print management staff to ensure that every report is shipped to the correct location.
During printing, reports are checked for color against color samples and print site staff review reports as they are printed to make sure that graphics are printing properly, all pages are correctly printed, and there are no printing errors such as ink smudging or faulty lines. AIR will provide documentation of established quality checks upon request.
12.6.3 QUALITY ASSURANCE FOR TEST SCORING
AIR verifies the accuracy of the scoring engine using simulated test administrations. The simulator generates a sample of students with an ability distribution that matches that of the state. The ability of each simulated student is used to generate a sequence of item responses consistent with the underlying ability. Although the simulations were designed to provide a rigorous test of the adaptive algorithm for adaptively administered tests, they also provide a check of the full range of item responses and test scores on fixed-form tests. Simulations are always generated using the production item selection and scoring engine, so that verification of the scoring engine is based on a very wide range of student response patterns.
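A minimal sketch of how such a simulation could work is shown below; the normal ability distribution, the 3PL response model, and the function and parameter names are illustrative assumptions, not the production simulator.

```python
import numpy as np

def simulate_responses(item_params, n_students, mean_theta=0.0, sd_theta=1.0, D=1.7, seed=1):
    """Draw simulated abilities and generate dichotomous 3PL item responses (illustrative)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(mean_theta, sd_theta, size=n_students)       # simulated abilities
    responses = np.zeros((n_students, len(item_params)), dtype=int)
    for j, (a, b, c) in enumerate(item_params):                     # a, b, c: 3PL item parameters
        p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))    # P(correct | theta)
        responses[:, j] = (rng.random(n_students) < p).astype(int)
    return theta, responses

# Example: three items with assumed (a, b, c) parameters
thetas, resp = simulate_responses([(1.2, -0.5, 0.2), (0.9, 0.3, 0.25), (1.5, 1.0, 0.2)],
                                  n_students=1000)
```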
To verify the accuracy of the online reporting system, we merge the simulated item response data with demographic information drawn from the previous year's assessment data or, if current-year enrollment data are available by the time the simulated data files are created, from current-year enrollment records. By populating the simulated data files with real school information, it is possible to verify that special school types and special districts are handled properly in the reporting system.
Specifications for generating simulated data files are included in the Analysis Specifications document submitted to ODE each year. Although ODE does not currently provide immediate reporting, review of all simulated data is scheduled to be completed prior to the opening of the test administration, so that the integrity of item administration, data capture, item and test scoring and reporting can be verified before the system goes live.
To monitor the performance of the assessment system during the testing window, a series of Quality Assurance Reports can be generated at any time during the online assessment window. For example, item analysis reports allow psychometricians to ensure that items are performing as intended and serve as an empirical key check through the operational testing window. In the context of adaptive test administrations, other reports such as blueprint match and item exposure reports allow psychometricians to verify that test administrations conform to specifications.
An additional set of cheating analysis reports flags unlikely patterns of behavior in testing administrations aggregated at the test administration, test administrator, and school level. The quality assurance reports can be generated on any desired schedule. Item analysis and blueprint match reports are evaluated frequently at the opening of the testing window to ensure that test administrations conform to blueprint and items are performing as anticipated.
Each time the reports are generated, the lead psychometrician reviews the results. If any unexpected results are identified, the lead psychometrician alerts the project manager immediately to resolve any issues. Exhibit 12.6.3.1 presents an overview of the quality assurance (QA) reports.
Exhibit 12.6.3.1: Overview of Quality Assurance Reports
QA Report | Purpose | Rationale
Item Statistics | To confirm whether items work as expected | Early detection of errors (key errors for selected-response items and scoring errors for constructed-response, performance, or technology items)
Blueprint Match Rates | To monitor unexpectedly low blueprint match rates | Early detection of unexpected blueprint match issues
Item Exposure Rates | To monitor unexpectedly high exposure rates of items or passages, or unusually low item pool usage (a high proportion of unused items/passages) | Early detection of any oversight in the blueprint specification
Cheating Analysis | To monitor testing irregularities | Early detection of testing irregularities
Item Analysis Report
The item analysis report is used to monitor the performance of test items throughout the testing window and serves as a key check for the early detection of potential problems with item scoring, including incorrect designation of a keyed response or other scoring errors, as well as potential breaches of test security that may be indicated by changes in the difficulty of test items. To examine test items for changes in performance, this report generates classical item analysis indicators of difficulty and discrimination, including proportion correct and biserial/polyserial correlations, as well as IRT-based item fit statistics. The report is configurable: it can be produced so that only items with statistics falling outside a specified range are flagged for reporting, or it can report on all items in the pool.
Item p-Value. For multiple-choice items, the proportion of students selecting each response option is computed; for constructed-response, performance, and technology items, the proportion of student responses classified at each score point is computed. For multiple-choice items, if the keyed response is not the modal response, the item is flagged. Although the correct response is not always the modal response, keyed response options flagged for both low biserial correlations and a non-modal response are indicative of a miskeyed item.
Item Discrimination. Biserial correlations are computed for the keyed response of selected-response items, and polyserial correlations are computed for polytomous constructed-response, performance, and technology items. AIR psychometric staff evaluate all items with biserial correlations below a target level, even if the obtained values are consistent with past item performance.
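As an illustration of these checks, the sketch below computes per-option p-values, flags a keyed option that is not the modal response, and computes a point-biserial correlation for the keyed response. The function names, and the use of a simple point-biserial as a stand-in for the discrimination index, are assumptions for illustration only.

```python
import numpy as np

def option_p_values(choices, options):
    """Proportion of students selecting each response option."""
    choices = np.asarray(choices)
    return {opt: float(np.mean(choices == opt)) for opt in options}

def flag_nonmodal_key(choices, key, options):
    """Flag the item if the keyed response is not the modal response."""
    p = option_p_values(choices, options)
    return max(p, key=p.get) != key

def point_biserial(item_scores, total_scores):
    """Correlation between the 0/1 item score and the total score
    (used here as a simple stand-in for the biserial discrimination index)."""
    return float(np.corrcoef(item_scores, total_scores)[0, 1])

# Example with five students on a four-option item keyed 'B'
choices = ["B", "B", "C", "B", "A"]
scores = np.array([1, 1, 0, 1, 0])
totals = np.array([30, 25, 12, 28, 10])
print(option_p_values(choices, list("ABCD")))
print(flag_nonmodal_key(choices, "B", list("ABCD")))   # False: the key is the modal response
print(round(point_biserial(scores, totals), 3))
```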
Item Fit. In addition to the item difficulty and item discrimination indices, an item fit index is produced for each item. For each student, a residual between the observed and expected score, given the student's ability, is computed for each item. The residuals for each item are then averaged across all students, and the average residual is used to flag the item.
We begin by defining $P_{ij} = \Pr(z_{ij}=1)$, the probability that student $i$ responds correctly to item $j$ (the term $z_{ij}$ represents the student's score on the item). For selected-response items, we use the 3PL IRT model to calculate the expected score on item $j$ for student $i$ with estimated ability $\hat{\theta}_i$ as

$$E(z_{ij}) = c_j + (1 - c_j)\,\frac{\exp\left(D a_j (\hat{\theta}_i - b_j)\right)}{1 + \exp\left(D a_j (\hat{\theta}_i - b_j)\right)}$$
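A minimal numerical sketch of this expected-score calculation follows; the scaling constant D = 1.7, the function name, and the example parameters are assumptions for illustration.

```python
import math

def expected_score_3pl(theta_hat, a, b, c, D=1.7):
    """Expected score E(z) for a selected-response item under the 3PL model."""
    logit = D * a * (theta_hat - b)
    return c + (1.0 - c) * math.exp(logit) / (1.0 + math.exp(logit))

# Example: student ability 0.4 on an item with a = 1.1, b = -0.2, c = 0.2
print(round(expected_score_3pl(0.4, 1.1, -0.2, 0.2), 3))
```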
For constructed-response, performance, or technology items, using the Generalized Partial Credit model, the expected score for student $i$ with estimated ability $\hat{\theta}_i$ on an item $j$ with a maximum possible score of $K_j$ is calculated as

$$E(z_{ij}) = \sum_{l=1}^{K_j} l\,\frac{\exp\left(D a_j \sum_{k=1}^{l} (\hat{\theta}_i - b_{j,k})\right)}{1 + \sum_{m=1}^{K_j} \exp\left(D a_j \sum_{k=1}^{m} (\hat{\theta}_i - b_{j,k})\right)}$$
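The same expected-score calculation for a polytomous item can be sketched as follows; the function name, the value D = 1.7, and the example step parameters are illustrative assumptions.

```python
import math

def expected_score_gpc(theta_hat, a, step_b, D=1.7):
    """Expected score under the Generalized Partial Credit model.
    step_b holds the step parameters b_{j,1}, ..., b_{j,K}."""
    K = len(step_b)
    # exp(D * a * sum_{k<=l}(theta - b_k)) for l = 1..K
    terms = [math.exp(D * a * sum(theta_hat - bk for bk in step_b[:l])) for l in range(1, K + 1)]
    denom = 1.0 + sum(terms)
    return sum(l * t for l, t in zip(range(1, K + 1), terms)) / denom

# Example: a three-point item (K = 3) with step parameters (-0.8, 0.1, 0.9)
print(round(expected_score_gpc(0.4, 1.0, [-0.8, 0.1, 0.9]), 3))
```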
For each item $j$, the residual between the observed and expected score for each student is defined as

$$\delta_{ij} = z_{ij} - E(z_{ij})$$
The statistic is aggregated across students of different abilities for each item:

$$\bar{\delta}_j = \frac{1}{n}\sum_{i=1}^{n} \delta_{ij}$$
The report can be configured to report all items, or to flag and report only those items whose fit index exceeds a given threshold; for example, items could be flagged when

$$\frac{\bar{\delta}_j}{se(\bar{\delta}_j)} > 1.96, \quad \text{where } se(\bar{\delta}_j) = \frac{SD(\delta_{ij})}{\sqrt{n}}.$$
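Putting these pieces together, a hedged sketch of the item-fit flag might look like the following; the 1.96 threshold follows the text, while the function name, the use of the absolute standardized mean residual, and the toy data are assumptions.

```python
import numpy as np

def item_fit_flag(observed, expected, z_crit=1.96):
    """Flag an item when the standardized mean residual exceeds z_crit.
    observed, expected: per-student observed and expected item scores."""
    resid = np.asarray(observed, dtype=float) - np.asarray(expected, dtype=float)
    mean_resid = resid.mean()
    se = resid.std(ddof=1) / np.sqrt(len(resid))
    # Absolute value flags misfit in either direction (an assumption for illustration).
    return abs(mean_resid / se) > z_crit, mean_resid, se

# Example with toy observed scores and model-based expected scores
obs = [1, 0, 1, 1, 0, 1, 0, 1]
exp = [0.8, 0.3, 0.7, 0.9, 0.4, 0.6, 0.2, 0.75]
flagged, d_bar, se = item_fit_flag(obs, exp)
print(flagged, round(d_bar, 3), round(se, 3))
```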
12.6.4 REPORTING
Scores for online assessments are assigned by automated systems in real time. For machine scored portions of assessments, the machine rubrics are created and reviewed along with the items, then validated and finalized during rubric validation following field testing. The review process “locks down” the item and rubric when the item is approved for web display (Web Approval). During operational testing, actual item responses are compared to expected item responses (given the item response theory [IRT] parameters), which can detect miskeyed items, item drift, or other scoring problems. Potential issues are automatically flagged in reports available to our psychometricians.
The handscoring processes include rigorous training, validity and reliability monitoring, and backreading to ensure accurate scoring. Handscored items are merged with the machine-scored items by our Test Integration System (TIS). The integration is based on identifiers that are never separated from their data and is further checked by the quality monitor (QM) system, to which the integrated record is passed for scoring. Once the integrated scores are sent to the QM, the records are rescored in the test-scoring system, a mature, well-tested real-time system that applies client-specific scoring rules and assigns scores from the calibrated items, including performance-level indicators, subscale scores, and other features. These results then pass automatically to the reporting system and the Database of Record (DoR). The scoring system is tested extensively prior to deployment, including hand checks of scored tests and large-scale simulations to ensure that point estimates and standard errors are correct.
After passing through the series of validation checks in the QM system, data are passed to the DoR, which serves as the centralized location for all student scores and responses, ensuring that there is only one place where the “official” record is stored. Only after scores have passed the QM checks and are uploaded to the DoR are they passed to the Online Reporting System, which is responsible for presenting individual-level results and calculating and presenting aggregate results. Absolutely no score is reported in the Online Reporting System until it passes all of the QM system’s validation checks.
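The gating described above can be illustrated with a toy sketch; the class and function names are invented for illustration and do not correspond to AIR's actual systems.

```python
class DatabaseOfRecord:
    """Toy stand-in for the DoR: the single place the official record is stored."""
    def __init__(self):
        self.records = {}

    def store(self, student_id, record):
        self.records[student_id] = record

def release_to_reporting(student_id, record, validation_checks, dor):
    """Store and report a score only if every QM-style validation check passes."""
    if all(check(record) for check in validation_checks):
        dor.store(student_id, record)
        return True      # eligible for the Online Reporting System
    return False         # held back until the discrepancy is resolved

# Example: two simple checks on a toy record
checks = [lambda r: r["scale_score"] is not None,
          lambda r: r["items_expected"] == r["items_received"]]
dor = DatabaseOfRecord()
print(release_to_reporting("S1", {"scale_score": 702, "items_expected": 40,
                                  "items_received": 40}, checks, dor))
```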
Appendix A.1a Global Model Fit Indices of Measurement Invariance Tests – Grade 3 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 176272.607 700
Metric 176935.899 727 Configural 663.293 (27) < 0.01 0.001
Scalar 179591.433 754 Metric 2655.533 (27) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 148127.862 700
Metric 151127.568 727 Configural 2999.706 (27) < 0.01 0.000
Scalar 156938.554 754 Metric 5810.986 (27) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 126346.473 700
Metric 126734.042 727 Configural 387.569 (27) < 0.01 0.001
Scalar 128041.882 754 Metric 1307.840 (27) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 123990.014 700
Metric 124166.615 727 Configural 176.602 (27) < 0.01 0.001
Scalar 125882.150 754 Metric 1715.535 (27) < 0.01 0.001
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 119859.234 700
Metric 119867.039 727 Configural 7.805 (27) 0.99 0.001
Scalar 119901.028 754 Metric 33.989 (27) 0.17 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 132893.839 700
Metric 133180.748 727 Configural 286.909 (27) < 0.01 0.001
Scalar 133787.036 754 Metric 606.288 (27) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 173856.657 700
Metric 176665.321 727 Configural 2808.664 (27) < 0.01 0.001
Scalar 181061.496 754 Metric 4396.176 (27) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 173816.452 700
Metric 176408.711 727 Configural 2592.260 (27) < 0.01 0.001
Scalar 179987.338 754 Metric 3578.626 (27) < 0.01 0.001
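For reference, the difference-test columns in these appendix tables can be reproduced from the configural, metric, and scalar fits. The sketch below assumes the reported values are simple chi-square and degrees-of-freedom differences evaluated against a chi-square reference distribution, and is illustrative only.

```python
from scipy.stats import chi2

def chi_square_difference(chisq_restricted, df_restricted, chisq_baseline, df_baseline):
    """Chi-square difference test between nested models
    (e.g., metric vs. configural, scalar vs. metric)."""
    d_chisq = chisq_restricted - chisq_baseline
    d_df = df_restricted - df_baseline
    p_value = chi2.sf(d_chisq, d_df)
    return d_chisq, d_df, p_value

# Example using the Grade 3 ELA gender comparison (Model A) above:
# configural chi2 = 176272.607 (df = 700), metric chi2 = 176935.899 (df = 727)
print(chi_square_difference(176935.899, 727, 176272.607, 700))
# -> roughly (663.29, 27, p < 0.01), matching the table entry
```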
Appendix A.1b Global Model Fit Indices of Scalar Invariance Model – Grade 3 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 157017.616 736 < 0.01 0.970 0.058
Model B-1 134967.598 736 < 0.01 0.970 0.058
Model B-2 114552.858 736 < 0.01 0.970 0.059
Model B-3 114546.382 736 < 0.01 0.968 0.059
Model B-4 4953.975 626 < 0.01 0.991 0.013
Model B-5 120532.604 736 < 0.01 0.970 0.059
Model C 85146.485 680 < 0.01 0.931 0.044
Model D 150361.839 736 < 0.01 0.973 0.057
Appendix A.2a Global Model Fit Indices of Measurement Invariance Tests – Grade 4 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 130445.192 648
Metric 131108.330 674 Configural 663.137 (26) < 0.01 0.001
Scalar 134746.210 700 Metric 3637.880 (26) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 109597.394 648
Metric 111872.069 674 Configural 2274.675 (26) < 0.01 0.001
Scalar 117261.749 700 Metric 5389.679 (26) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 97368.768 648
Metric 97805.828 674 Configural 437.061 (26) < 0.01 0.001
Scalar 99165.715 700 Metric 1359.886 (26) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 96161.689 648
Metric 96444.259 674 Configural 282.569 (26) < 0.01 0.001
Scalar 97294.713 700 Metric 850.454 (26) < 0.01 0.001
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 92771.984 648
Metric 92811.511 674 Configural 39.527 (26) 0.04 0.001
Scalar 92836.238 700 Metric 24.727 (26) 0.53 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 101615.047 648
Metric 101912.014 674 Configural 296.967 (26) < 0.01 0.001
Scalar 102634.806 700 Metric 722.792 (26) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 121624.905 648
Metric 125745.093 674 Configural 4120.188 (26) < 0.01 0.000
Scalar 137650.795 700 Metric 11905.701 (26) < 0.01 0.002
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 127793.118 648
Metric 129153.828 674 Configural 1360.711 (26) < 0.01 0.001
Scalar 133310.002 700 Metric 4156.173 (26) < 0.01 0.000
Appendix A.2b Global Model Fit Indices of Scalar Invariance Model – Grade 4 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 79175.409 684 < 0.01 0.983 0.043
Model B-1 70429.082 684 < 0.01 0.982 0.043
Model B-2 61495.500 684 < 0.01 0.982 0.044
Model B-3 62589.438 684 < 0.01 0.981 0.045
Model B-4 41506.750 684 < 0.01 0.984 0.037
Model B-5 64156.587 684 < 0.01 0.982 0.044
Model C 80276.805 684 < 0.01 0.982 0.043
Model D 73622.215 684 < 0.01 0.985 0.041
Appendix A.3a Global Model Fit Indices of Measurement Invariance Tests – Grade 5 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 147634.619 700
Metric 148808.238 727 Configural 1173.620 (27) < 0.01 0.000
Scalar 154734.500 754 Metric 5926.262 (27) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 126498.633 700
Metric 132507.137 727 Configural 6008.504 (27) < 0.01 0.000
Scalar 135617.320 754 Metric 3110.183 (27) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 108369.039 700
Metric 109128.565 727 Configural 759.527 (27) < 0.01 0.001
Scalar 109808.786 754 Metric 680.221 (27) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 106198.026 700
Metric 106496.410 727 Configural 298.384 (27) < 0.01 0.001
Scalar 107252.901 754 Metric 756.491 (27) < 0.01 0.001
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 102537.283 700
Metric 102588.472 727 Configural 51.190 (27) < 0.01 0.001
Scalar 102626.133 754 Metric 37.661 (27) 0.08 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 112359.126 700
Metric 112956.809 727 Configural 597.683 (27) < 0.01 0.001
Scalar 113281.002 754 Metric 324.193 (27) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 147918.570 700
Metric 154410.491 727 Configural 6491.921 (27) < 0.01 0.001
Scalar 157882.846 754 Metric 3472.355 (27) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 146929.710 700
Metric 150847.327 727 Configural 3917.617 (27) < 0.01 0.000
Scalar 152039.293 754 Metric 1191.966 (27) < 0.01 0.001
Appendix A.3b Global Model Fit Indices of Scalar Invariance Model – Grade 5 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 132173.133 738 < 0.01 0.964 0.053
Model B-1 120039.546 738 < 0.01 0.963 0.054
Model B-2 102794.477 738 < 0.01 0.965 0.054
Model B-3 104288.496 738 < 0.01 0.961 0.055
Model B-4 5865.498 628 < 0.01 0.991 0.014
Model B-5 9662.025 628 < 0.01 0.988 0.017
Model C 139017.910 738 < 0.01 0.963 0.054
Model D 11845.152 628 < 0.01 0.991 0.017
Appendix A.4a Global Model Fit Indices of Measurement Invariance Tests – Grade 6 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 168282.645 1120
Metric 171059.092 1154 Configural 2776.448 (34) < 0.01 0.001
Scalar 182954.777 1188 Metric 11895.685 (34) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 143585.881 1120
Metric 150360.360 1154 Configural 6774.479 (34) < 0.01 0.001
Scalar 153361.720 1188 Metric 3001.360 (34) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 124176.961 1120
Metric 125036.105 1154 Configural 859.144 (34) < 0.01 0.000
Scalar 125638.033 1188 Metric 601.928 (34) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 121927.115 1120
Metric 122385.012 1154 Configural 457.898 (34) < 0.01 0.000
Scalar 123241.196 1188 Metric 856.184 (34) < 0.01 0.001
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 117833.007 1120
Metric 117622.219 1154 Configural NA (34) NA 0.000
Scalar 117645.099 1188 Metric 22.880 (34) 0.93 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 129166.104 1120
Metric 129821.143 1154 Configural 655.039 (34) < 0.01 0.001
Scalar 130101.567 1188 Metric 280.424 (34) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 163070.972 1120
Metric 174027.048 1154 Configural 10956.076 (34) < 0.01 0.001
Scalar 182002.462 1188 Metric 7975.415 (34) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 167842.566 1120
Metric 172389.688 1154 Configural 4547.123 (34) < 0.01 0.000
Scalar 174443.388 1188 Metric 2053.700 (34) < 0.01 0.001
Appendix A.4b Global Model Fit Indices of Scalar Invariance Model – Grade 6 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 102199.823 1168 < 0.01 0.969 0.037
Model B-1 82400.115 1168 < 0.01 0.964 0.036
Model B-2 70065.871 1168 < 0.01 0.965 0.035
Model B-3 71153.323 1168 < 0.01 0.961 0.036
Model B-4 6735.073 1030 < 0.01 0.992 0.011
Model B-5 72573.325 1168 < 0.01 0.965 0.035
Model C 94312.750 1168 < 0.01 0.962 0.036
Model D 89135.877 1168 < 0.01 0.981 0.035
Appendix A.5a Global Model Fit Indices of Measurement Invariance Tests – Grade 7 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 153106.579 1330
Metric 155280.064 1367 Configural 2173.485 (37) < 0.01 0.000
Scalar 165855.870 1404 Metric 10575.806 (37) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 134663.915 1330
Metric 139621.049 1367 Configural 4957.134 (37) < 0.01 0.000
Scalar 141723.904 1404 Metric 2102.855 (37) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 119830.630 1330
Metric 120820.856 1367 Configural 990.226 (37) < 0.01 0.001
Scalar 121210.813 1404 Metric 389.956 (37) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 118713.552 1330
Metric 119129.087 1367 Configural 415.535 (37) < 0.01 0.001
Scalar 120640.953 1404 Metric 1511.866 (37) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 114856.793 1330
Metric 114905.184 1367 Configural 48.391 (37) 0.10 0.001
Scalar 114946.565 1404 Metric 41.381 (37) 0.29 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 124523.031 1330
Metric 124984.272 1367 Configural 461.240 (37) < 0.01 0.001
Scalar 125309.279 1404 Metric 325.007 (37) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 153916.614 1330
Metric 159235.022 1367 Configural 5318.408 (37) < 0.01 0.000
Scalar 164816.218 1404 Metric 5581.196 (37) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 153929.117 1330
Metric 156906.356 1367 Configural 2977.240 (37) < 0.01 0.000
Scalar 158653.435 1404 Metric 1747.079 (37) < 0.01 0.001
Appendix A.5b Global Model Fit Indices of Scalar Invariance Model – Grade 7 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 153828.801 1379 < 0.01 0.975 0.042
Model B-1 130919.521 1379 < 0.01 0.974 0.042
Model B-2 116459.642 1379 < 0.01 0.974 0.042
Model B-3 117392.902 1379 < 0.01 0.970 0.043
Model B-4 66384.668 1379 < 0.01 0.979 0.032
Model B-5 121854.746 1379 < 0.01 0.973 0.042
Model C 143154.323 1379 < 0.01 0.972 0.041
Model D 125747.203 1379 < 0.01 0.980 0.038
Appendix A.6a Global Model Fit Indices of Measurement Invariance Tests – Grade 8 ELA
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 180383.360 1188
Metric 184712.267 1223 Configural 4328.907 (35) < 0.01 0.000
Scalar 199572.427 1258 Metric 14860.160 (35) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 158655.439 1188
Metric 164544.857 1223 Configural 5889.418 (35) < 0.01 0.000
Scalar 169750.937 1258 Metric 5206.080 (35) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 139622.510 1188
Metric 140694.608 1223 Configural 1072.099 (35) < 0.01 0.001
Scalar 141730.165 1258 Metric 1035.556 (35) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 137768.314 1188
Metric 138337.859 1223 Configural 569.545 (35) < 0.01 0.001
Scalar 139931.044 1258 Metric 1593.184 (35) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 133651.357 1188
Metric 133693.809 1223 Configural 42.452 (35) 0.18 0.001
Scalar 133755.628 1258 Metric 61.818 (35) < 0.01 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 145125.458 1188
Metric 145790.929 1223 Configural 665.471 (35) < 0.01 0.001
Scalar 146393.178 1258 Metric 602.249 (35) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 178864.671 1188
Metric 185984.935 1223 Configural 7120.263 (35) < 0.01 0.000
Scalar 195159.754 1258 Metric 9174.820 (35) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 181989.475 1188
Metric 185160.056 1223 Configural 3170.581 (35) < 0.01 0.000
Scalar 188164.706 1258 Metric 3004.650 (35) < 0.01 0.000
Appendix A.6b Global Model Fit Indices of Scalar Invariance Model – Grade 8 ELA
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 115362.862 1237 < 0.01 0.968 0.038
Model B-1 98674.468 1237 < 0.01 0.968 0.038
Model B-2 83198.288 1237 < 0.01 0.969 0.037
Model B-3 84753.871 1237 < 0.01 0.966 0.038
Model B-4 46742.399 1237 < 0.01 0.976 0.029
Model B-5 88122.358 1237 < 0.01 0.970 0.038
Model C 109656.528 1237 < 0.01 0.970 0.037
Model D 68275.316 1165 < 0.01 0.973 0.030
Appendix A.7a Global Model Fit Indices of Measurement Invariance Tests – High School ELA 1
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 139303.829 1188
Metric 142656.595 1223 Configural 3352.766 (35) < 0.01 0.000
Scalar 149070.781 1258 Metric 6414.186 (35) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 119251.442 1188
Metric 130055.542 1223 Configural 10804.099 (35) < 0.01 0.001
Scalar 132296.422 1258 Metric 2240.881 (35) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 100970.374 1188
Metric 103442.356 1223 Configural 2471.982 (35) < 0.01 0.000
Scalar 104232.537 1258 Metric 790.181 (35) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 99068.695 1188
Metric 99529.304 1223 Configural 460.609 (35) < 0.01 0.000
Scalar 101241.561 1258 Metric 1712.257 (35) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 96123.562 1188
Metric 96175.731 1223 Configural 52.169 (35) 0.03 0.000
Scalar 96212.336 1258 Metric 36.605 (35) 0.39 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 105233.921 1188
Metric 106538.728 1223 Configural 1304.807 (35) < 0.01 0.000
Scalar 107031.991 1258 Metric 493.264 (35) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 136531.110 1188
Metric 149438.981 1223 Configural 12907.871 (35) < 0.01 0.001
Scalar 152658.585 1258 Metric 3219.603 (35) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 137523.733 1188
Metric 144747.765 1223 Configural 7224.032 (35) < 0.01 0.000
Scalar 148020.572 1258 Metric 3272.807 (35) < 0.01 0.000
Appendix A.7b Global Model Fit Indices of Scalar Invariance Model – High School ELA 1
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 118863.527 1237 < 0.01 0.978 0.036
Model B-1 113871.033 1237 < 0.01 0.976 0.037
Model B-2 94548.376 1237 < 0.01 0.976 0.037
Model B-3 97930.200 1237 < 0.01 0.972 0.038
Model B-4 10269.973 1095 < 0.01 0.990 0.013
Model B-5 98736.587 1237 < 0.01 0.977 0.037
Model C 73975.406 1165 < 0.01 0.977 0.029
Model D 66457.321 1165 < 0.01 0.984 0.027
Appendix A.8a Global Model Fit Indices of Measurement Invariance Tests – High School ELA 2
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 144160.958 1258
Metric 149271.471 1294 Configural 5110.513 (36) < 0.01 0.000
Scalar 160177.179 1330 Metric 10905.709 (36) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 128065.864 1258
Metric 134338.698 1294 Configural 6272.833 (36) < 0.01 0.001
Scalar 138294.074 1330 Metric 3955.377 (36) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 116372.511 1258
Metric 117977.745 1294 Configural 1605.233 (36) < 0.01 0.000
Scalar 119302.131 1330 Metric 1324.386 (36) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 115083.545 1258
Metric 115706.794 1294 Configural 623.249 (36) < 0.01 0.000
Scalar 117326.031 1330 Metric 1619.237 (36) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 111535.374 1258
Metric 111584.779 1294 Configural 49.405 (36) 0.07 0.000
Scalar 111628.834 1330 Metric 44.055 (36) 0.17 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 119700.711 1258
Metric 120278.285 1294 Configural 577.574 (36) < 0.01 0.000
Scalar 120561.258 1330 Metric 282.973 (36) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 145117.348 1258
Metric 152335.693 1294 Configural 7218.346 (36) < 0.01 0.001
Scalar 158340.252 1330 Metric 6004.559 (36) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 145933.871 1258
Metric 148032.874 1294 Configural 2099.003 (36) < 0.01 0.000
Scalar 154366.269 1330 Metric 6333.395 (36) < 0.01 0.000
Appendix A.8b Global Model Fit Indices of Scalar Invariance Model – High School ELA 2
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 103674.106 1306 < 0.01 0.987 0.033
Model B-1 91303.503 1306 < 0.01 0.986 0.033
Model B-2 80680.422 1306 < 0.01 0.986 0.034
Model B-3 84810.984 1306 < 0.01 0.983 0.035
Model B-4 48677.650 1306 < 0.01 0.989 0.027
Model B-5 83255.971 1306 < 0.01 0.985 0.034
Model C 100077.934 1306 < 0.01 0.986 0.033
Model D 87238.839 1306 < 0.01 0.991 0.031
Appendix A.9a Global Model Fit Indices of Measurement Invariance Tests – Grade 3 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 190947.987 1720
Metric 193254.779 1762 Configural 2306.792 (42) < 0.01 0.001
Scalar 202057.114 1804 Metric 8802.334 (42) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 150540.004 1720
Metric 167722.482 1762 Configural 17182.479 (42) < 0.01 0.002
Scalar 174359.453 1804 Metric 6636.970 (42) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 119295.226 1720
Metric 121010.871 1762 Configural 1715.644 (42) < 0.01 0.000
Scalar 121474.981 1804 Metric 464.110 (42) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 114655.043 1720
Metric 115243.155 1762 Configural 588.112 (42) < 0.01 0.000
Scalar 116359.095 1804 Metric 1115.940 (42) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 110917.923 1720
Metric 110968.999 1762 Configural 51.076 (42) 0.16 0.000
Scalar 111005.438 1804 Metric 36.439 (42) 0.71 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 125372.296 1720
Metric 126856.240 1762 Configural 1483.944 (42) < 0.01 0.000
Scalar 127300.676 1804 Metric 444.436 (42) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 180820.553 1720
Metric 195174.653 1762 Configural 14354.101 (42) < 0.01 0.001
Scalar 199534.697 1804 Metric 4360.044 (42) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 190572.554 1720
Metric 194007.584 1762 Configural 3435.030 (42) < 0.01 0.001
Scalar 194914.246 1804 Metric 906.661 (42) < 0.01 0.000
Appendix A.9b Global Model Fit Indices of Scalar Invariance Model – Grade 3 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 178542.083 1767 < 0.01 0.957 0.040
Model B-1 146430.781 1767 < 0.01 0.950 0.039
Model B-2 100392.823 1767 < 0.01 0.959 0.035
Model B-3 91327.522 1767 < 0.01 0.956 0.034
Model B-4 48266.868 1767 < 0.01 0.974 0.025
Model B-5 108469.708 1767 < 0.01 0.957 0.036
Model C 170436.283 1767 < 0.01 0.952 0.039
Model D 161574.471 1767 < 0.01 0.961 0.038
Appendix A.10a Global Model Fit Indices of Measurement Invariance Tests – Grade 4 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 132124.195 2160
Metric 133819.876 2207 Configural 1695.681 (47) < 0.01 0.000
Scalar 142829.229 2254 Metric 9009.353 (47) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 110522.702 2160
Metric 120189.887 2207 Configural 9667.185 (47) < 0.01 0.000
Scalar 124533.299 2254 Metric 4343.412 (47) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 94331.712 2160
Metric 95246.599 2207 Configural 914.887 (47) < 0.01 0.000
Scalar 95852.257 2254 Metric 605.658 (47) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 92182.599 2160
Metric 92626.962 2207 Configural 444.363 (47) < 0.01 0.000
Scalar 93733.957 2254 Metric 1106.995 (47) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 89090.227 2160
Metric 89141.357 2207 Configural 51.130 (47) 0.31 0.000
Scalar 89192.200 2254 Metric 50.843 (47) 0.32 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 97943.909 2160
Metric 99141.934 2207 Configural 1198.025 (47) < 0.01 0.000
Scalar 99658.985 2254 Metric 517.051 (47) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 128864.507 2160
Metric 140727.090 2207 Configural 11862.583 (47) < 0.01 0.001
Scalar 145824.015 2254 Metric 5096.925 (47) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 132333.410 2160
Metric 134040.744 2207 Configural 1707.335 (47) < 0.01 0.000
Scalar 135509.068 2254 Metric 1468.324 (47) < 0.01 0.000
Appendix A.10b Global Model Fit Indices of Scalar Invariance Model – Grade 4 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 148317.365 2207 < 0.01 0.969 0.032
Model B-1 119688.757 2207 < 0.01 0.964 0.031
Model B-2 97381.534 2207 < 0.01 0.966 0.031
Model B-3 93061.605 2207 < 0.01 0.962 0.030
Model B-4 44760.357 2207 < 0.01 0.981 0.021
Model B-5 103741.708 2207 < 0.01 0.965 0.031
Model C 136891.966 2207 < 0.01 0.966 0.031
Model D 127289.872 2207 < 0.01 0.972 0.030
Appendix A.11a Global Model Fit Indices of Measurement Invariance Tests – Grade 5 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 167827.148 1978
Metric 169795.186 2023 Configural 1968.038 (45) < 0.01 0.000
Scalar 181780.169 2068 Metric 11984.982 (45) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 131834.412 1978
Metric 153988.791 2023 Configural 22154.379 (45) < 0.01 0.002
Scalar 160457.408 2068 Metric 6468.617 (45) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 113472.774 1978
Metric 115632.135 2023 Configural 2159.361 (45) < 0.01 0.000
Scalar 116093.143 2068 Metric 461.008 (45) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 110120.724 1978
Metric 110769.229 2023 Configural 648.506 (45) < 0.01 0.001
Scalar 112014.539 2068 Metric 1245.309 (45) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 106936.222 1978
Metric 107019.415 2023 Configural 83.193 (45) < 0.01 0.001
Scalar 107086.266 2068 Metric 66.851 (45) 0.02 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 118045.905 1978
Metric 120018.452 2023 Configural 1972.547 (45) < 0.01 0.000
Scalar 120510.974 2068 Metric 492.522 (45) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 162442.028 1978
Metric 178244.169 2023 Configural 15802.140 (45) < 0.01 0.001
Scalar 189410.778 2068 Metric 11166.610 (45) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 167756.026 1978
Metric 170854.256 2023 Configural 3098.230 (45) < 0.01 0.000
Scalar 172860.513 2068 Metric 2006.257 (45) < 0.01 0.000
Appendix A.11b Global Model Fit Indices of Scalar Invariance Model – Grade 5 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 130334.860 2026 < 0.01 0.975 0.032
Model B-1 101639.676 2026 < 0.01 0.972 0.030
Model B-2 81594.183 2026 < 0.01 0.975 0.029
Model B-3 80287.581 2026 < 0.01 0.972 0.029
Model B-4 41802.299 2026 < 0.01 0.984 0.021
Model B-5 88250.421 2026 < 0.01 0.974 0.030
Model C 124475.421 2026 < 0.01 0.972 0.031
Model D 108542.900 2026 < 0.01 0.978 0.029
Appendix A.12a Global Model Fit Indices of Measurement Invariance Tests – Grade 6 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 158114.805 1978
Metric 160690.681 2023 Configural 2575.876 (45) < 0.01 0.001
Scalar 176085.262 2068 Metric 15394.581 (45) < 0.01 0.002
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 121340.475 1978
Metric 144376.540 2023 Configural 23036.066 (45) < 0.01 0.003
Scalar 150521.986 2068 Metric 6145.446 (45) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 107722.322 1978
Metric 110019.726 2023 Configural 2297.403 (45) < 0.01 0.000
Scalar 110576.871 2068 Metric 557.145 (45) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 106152.536 1978
Metric 106917.021 2023 Configural 764.485 (45) < 0.01 0.000
Scalar 108965.289 2068 Metric 2048.268 (45) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 101947.142 1978
Metric 102084.007 2023 Configural 136.866 (45) < 0.01 0.000
Scalar 102136.836 2068 Metric 52.828 (45) 0.20 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 112226.072 1978
Metric 114389.647 2023 Configural 2163.576 (45) < 0.01 0.000
Scalar 114827.666 2068 Metric 438.019 (45) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 139877.131 1978
Metric 163047.776 2023 Configural 23170.645 (45) < 0.01 0.003
Scalar 178426.752 2068 Metric 15378.976 (45) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 156734.616 1978
Metric 159287.454 2023 Configural 2552.838 (45) < 0.01 0.000
Scalar 162529.372 2068 Metric 3241.917 (45) < 0.01 0.000
Appendix A.12b Global Model Fit Indices of Scalar Invariance Model – Grade 6 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 112375.374 2029 < 0.01 0.979 0.030
Model B-1 83800.161 2029 < 0.01 0.978 0.027
Model B-2 63884.428 2029 < 0.01 0.981 0.026
Model B-3 63945.406 2029 < 0.01 0.977 0.026
Model B-4 31920.301 2029 < 0.01 0.989 0.018
Model B-5 67946.129 2029 < 0.01 0.980 0.026
Model C 91468.840 2029 < 0.01 0.979 0.027
Model D 86364.284 2029 < 0.01 0.983 0.026
Appendix A.13a Global Model Fit Indices of Measurement Invariance Tests – Grade 7 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 90381.581 1804
Metric 92164.867 1847 Configural 1783.287 (43) < 0.01 0.000
Scalar 99662.188 1890 Metric 7497.321 (43) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 74000.204 1804
Metric 85387.158 1847 Configural 11386.954 (43) < 0.01 0.001
Scalar 87043.400 1890 Metric 1656.242 (43) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 65148.372 1804
Metric 66312.673 1847 Configural 1164.301 (43) < 0.01 0.000
Scalar 66604.358 1890 Metric 291.685 (43) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 64271.645 1804
Metric 64717.586 1847 Configural 445.941 (43) < 0.01 0.000
Scalar 66010.518 1890 Metric 1292.931 (43) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 62002.888 1804
Metric 62074.052 1847 Configural 71.164 (43) < 0.01 0.000
Scalar 62127.231 1890 Metric 53.179 (43) 0.14 0.001
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 67795.449 1804
Metric 68772.232 1847 Configural 976.783 (43) < 0.01 0.000
Scalar 69006.771 1890 Metric 234.539 (43) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 83292.021 1804
Metric 95684.228 1847 Configural 12392.207 (43) < 0.01 0.001
Scalar 106105.765 1890 Metric 10421.537 (43) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 89296.664 1804
Metric 90817.157 1847 Configural 1520.493 (43) < 0.01 0.001
Scalar 92614.653 1890 Metric 1797.496 (43) < 0.01 0.000
Appendix A.13b Global Model Fit Indices of Scalar Invariance Model – Grade 7 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 96462.431 1854 < 0.01 0.980 0.029
Model B-1 72340.614 1854 < 0.01 0.979 0.027
Model B-2 61077.288 1854 < 0.01 0.980 0.027
Model B-3 60574.534 1854 < 0.01 0.977 0.027
Model B-4 29630.985 1854 < 0.01 0.988 0.019
Model B-5 65535.622 1854 < 0.01 0.979 0.027
Model C 79157.480 1854 < 0.01 0.979 0.026
Model D 74019.021 1854 < 0.01 0.983 0.026
Appendix A.14a Global Model Fit Indices of Measurement Invariance Tests – Grade 8 Math
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 142717.934 2350
Metric 144758.678 2399 Configural 2040.743 (49) < 0.01 0.000
Scalar 150954.434 2448 Metric 6195.757 (49) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 115632.325 2350
Metric 125519.999 2399 Configural 9887.674 (49) < 0.01 0.001
Scalar 131613.357 2448 Metric 6093.358 (49) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 101885.450 2350
Metric 103040.908 2399 Configural 1155.459 (49) < 0.01 0.000
Scalar 103538.186 2448 Metric 497.277 (49) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 99494.111 2350
Metric 100633.078 2399 Configural 1138.967 (49) < 0.01 0.000
Scalar 102669.061 2448 Metric 2035.983 (49) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 96903.432 2350
Metric 96960.018 2399 Configural 56.586 (49) 0.21 0.000
Scalar 97024.289 2448 Metric 64.271 (49) 0.07 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 105622.368 2350
Metric 106627.952 2399 Configural 1005.584 (49) < 0.01 0.000
Scalar 107048.496 2448 Metric 420.544 (49) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 127209.091 2350
Metric 144619.498 2399 Configural 17410.407 (49) < 0.01 0.002
Scalar 153384.474 2448 Metric 8764.976 (49) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 141725.007 2350
Metric 143366.016 2399 Configural 1641.009 (49) < 0.01 0.000
Scalar 145242.165 2448 Metric 1876.149 (49) < 0.01 0.000
Appendix A.14b Global Model Fit Indices of Scalar Invariance Model – Grade 8 Math
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 108670.655 2401 < 0.01 0.967 0.030
Model B-1 87774.437 2401 < 0.01 0.963 0.029
Model B-2 70271.540 2401 < 0.01 0.966 0.028
Model B-3 69015.526 2401 < 0.01 0.961 0.028
Model B-4 29900.357 2401 < 0.01 0.979 0.018
Model B-5 75489.717 2401 < 0.01 0.965 0.029
Model C 93707.109 2401 < 0.01 0.964 0.028
Model D 88035.755 2401 < 0.01 0.970 0.027
Appendix A.15a Global Model Fit Indices of Measurement Invariance Tests – Algebra
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 145317.574 2068
Metric 147946.803 2114 Configural 2629.229 (46) < 0.01 0.000
Scalar 157550.783 2160 Metric 9603.980 (46) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 114668.900 2068
Metric 133651.132 2114 Configural 18982.231 (46) < 0.01 0.002
Scalar 141380.030 2160 Metric 7728.899 (46) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 100042.607 2068
Metric 103339.056 2114 Configural 3296.449 (46) < 0.01 0.000
Scalar 104948.891 2160 Metric 1609.835 (46) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 98295.100 2068
Metric 100556.063 2114 Configural 2260.963 (46) < 0.01 0.000
Scalar 103587.621 2160 Metric 3031.558 (46) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 95264.755 2068
Metric 95340.447 2114 Configural 75.692 (46) < 0.01 0.001
Scalar 95391.742 2160 Metric 51.295 (46) 0.27 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 103671.582 2068
Metric 104617.340 2114 Configural 945.757 (46) < 0.01 0.000
Scalar 105299.079 2160 Metric 681.740 (46) < 0.01 0.001
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 133596.131 2068
Metric 150353.151 2114 Configural 16757.020 (46) < 0.01 0.001
Scalar 167543.093 2160 Metric 17189.942 (46) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 143456.998 2068
Metric 145629.284 2114 Configural 2172.287 (46) < 0.01 0.000
Scalar 152056.332 2160 Metric 6427.048 (46) < 0.01 0.000
Appendix A.15b Global Model Fit Indices of Scalar Invariance Model – Algebra
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 120175.681 2120 < 0.01 0.976 0.028
Model B-1 89500.076 2120 < 0.01 0.976 0.025
Model B-2 69417.062 2120 < 0.01 0.979 0.024
Model B-3 71063.328 2120 < 0.01 0.975 0.025
Model B-4 35671.528 2120 < 0.01 0.986 0.018
Model B-5 74509.579 2120 < 0.01 0.978 0.025
Model C 97354.383 2120 < 0.01 0.976 0.025
Model D 90478.731 2120 < 0.01 0.980 0.024
Appendix A.16a Global Model Fit Indices of Measurement Invariance Tests – Geometry
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 179535.383 2448
Metric 182763.309 2498 Configural 3227.926 (50) < 0.01 0.000
Scalar 194081.235 2548 Metric 11317.926 (50) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 149911.287 2448
Metric 160144.751 2498 Configural 10233.464 (50) < 0.01 0.001
Scalar 169351.533 2548 Metric 9206.782 (50) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 134034.325 2448
Metric 135710.009 2498 Configural 1675.684 (50) < 0.01 0.000
Scalar 136991.025 2548 Metric 1281.016 (50) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 132014.972 2448
Metric 133460.759 2498 Configural 1445.787 (50) < 0.01 0.000
Scalar 135537.557 2548 Metric 2076.798 (50) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 128006.674 2448
Metric 128085.780 2498 Configural 79.106 (50) 0.01 0.000
Scalar 128146.794 2548 Metric 61.014 (50) 0.14 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 137247.895 2448
Metric 137846.970 2498 Configural 599.074 (50) < 0.01 0.000
Scalar 138446.519 2548 Metric 599.550 (50) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 168429.004 2448
Metric 178443.243 2498 Configural 10014.240 (50) < 0.01 0.000
Scalar 197663.061 2548 Metric 19219.817 (50) < 0.01 0.002
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 179174.789 2448
Metric 180846.286 2498 Configural 1671.497 (50) < 0.01 0.001
Scalar 185655.520 2548 Metric 4809.234 (50) < 0.01 0.001
Appendix A.16b Global Model Fit Indices of Scalar Invariance Model – Geometry
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 181322.936 2501 < 0.01 0.968 0.033
Model B-1 135153.027 2501 < 0.01 0.968 0.031
Model B-2 118898.258 2501 < 0.01 0.969 0.031
Model B-3 129396.628 2501 < 0.01 0.963 0.033
Model B-4 54553.445 2501 < 0.01 0.981 0.021
Model B-5 131293.202 2501 < 0.01 0.966 0.032
Model C 142562.024 2501 < 0.01 0.967 0.030
Model D 127841.920 2501 < 0.01 0.974 0.028
Appendix A.17a Global Model Fit Indices of Measurement Invariance Tests – Integrated Math 1
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 12533.996 2068
Metric 12833.254 2114 Configural 299.258 (46) < 0.01 0.000
Scalar 13653.808 2160 Metric 820.554 (46) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 9313.286 2068
Metric 12515.551 2114 Configural 3202.265 (46) < 0.01 0.005
Scalar 13016.987 2160 Metric 501.436 (46) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 7581.356 2068
Metric 7674.792 2114 Configural 93.435 (46) < 0.01 0.001
Scalar 7777.285 2160 Metric 102.493 (46) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 7699.410 2068
Metric 7858.907 2114 Configural 159.496 (46) < 0.01 0.000
Scalar 8483.634 2160 Metric 624.728 (46) < 0.01 0.001
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural NA NA
Metric 9545.293 1847 Configural NA (NA) NA NA
Scalar 9577.343 1890 Metric 32.050 (43) 0.89 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 8552.807 2068
Metric 9060.741 2114 Configural 507.933 (46) < 0.01 0.000
Scalar 9304.084 2160 Metric 243.343 (46) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 12186.557 2068
Metric 13655.944 2114 Configural 1469.386 (46) < 0.01 0.002
Scalar 14654.803 2160 Metric 998.859 (46) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 11763.555 2068
Metric 13278.360 2114 Configural 1514.806 (46) < 0.01 0.001
Scalar 14937.766 2160 Metric 1659.406 (46) < 0.01 0.002
Appendix A.17b Global Model Fit Indices of Scalar Invariance Model – Integrated Math 1
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 9882.116 2120 < 0.01 0.984 0.024
Model B-1 6869.263 2120 < 0.01 0.984 0.021
Model B-2 4197.159 2120 < 0.01 0.991 0.017
Model B-3 4964.188 2120 < 0.01 0.988 0.020
Model B-4 NA NA NA NA NA
Model B-5 5544.281 2120 < 0.01 0.987 0.021
Model C 7114.987 2120 < 0.01 0.988 0.019
Model D 8118.431 2120 < 0.01 0.986 0.021
Appendix A.18a Global Model Fit Indices of Measurement Invariance Tests – Integrated Math 2
Invariance Model | χ2 | df | Comparison | χ2 Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 15650.713 2548
Metric 15981.010 2599 Configural 330.298 (51) < 0.01 0.000
Scalar 16469.255 2650 Metric 488.245 (51) < 0.01 0.000
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 12767.571 2548
Metric 14756.416 2599 Configural 1988.845 (51) < 0.01 0.002
Scalar 15437.452 2650 Metric 681.036 (51) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 10263.565 2548
Metric 10355.810 2599 Configural 92.245 (51) < 0.01 0.001
Scalar 10411.967 2650 Metric 56.157 (51) 0.29 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 10695.507 2548
Metric 10868.321 2599 Configural 172.813 (51) < 0.01 0.000
Scalar 11165.357 2650 Metric 297.036 (51) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural NA 2350
Metric NA 2399 Configural 90.952 (49) < 0.01 0.000
Scalar NA 2448 Metric 67.720 (49) 0.04 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 11106.254 2548
Metric 11264.122 2599 Configural 157.868 (51) < 0.01 0.000
Scalar 11509.815 2650 Metric 245.693 (51) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 16042.711 2548
Metric 16570.857 2599 Configural 528.146 (51) < 0.01 0.000
Scalar 17884.401 2650 Metric 1313.544 (51) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 15704.517 2548
Metric 16734.883 2599 Configural 1030.366 (51) < 0.01 0.001
Scalar 17663.151 2650 Metric 928.268 (51) < 0.01 0.001
Appendix A.18b Global Model Fit Indices of Scalar Invariance Model – Integrated Math 2
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 12720.000 2600 < 0.01 0.969 0.027
Model B-1 13370.135 2600 < 0.01 0.951 0.030
Model B-2 6859.081 2600 < 0.01 0.973 0.023
Model B-3 7025.287 2600 < 0.01 0.973 0.023
Model B-4 3580.814 2399 < 0.01 0.982 0.013
Model B-5 7402.193 2600 < 0.01 0.973 0.023
Model C 8852.619 2600 < 0.01 0.973 0.021
Model D 11279.954 2600 < 0.01 0.968 0.025
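Note on the χ² difference test: in the measurement invariance tables in this appendix, the difference statistics compare two adjacent nested models and can be recovered directly from the model rows,

\Delta\chi^{2} = \chi^{2}_{\mathrm{more\ constrained}} - \chi^{2}_{\mathrm{less\ constrained}}, \qquad \Delta df = df_{\mathrm{more\ constrained}} - df_{\mathrm{less\ constrained}}.

For example, for Model A (gender) in Appendix A.18a, the metric-versus-configural comparison gives Δχ² = 15,981.010 - 15,650.713 ≈ 330.298 on Δdf = 2,599 - 2,548 = 51 degrees of freedom, matching the tabled values within rounding. The change in RMSEA is presumably the analogous difference between the two models' RMSEA estimates.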
Appendix A.19a Global Model Fit Indices of Measurement Invariance Tests – Grade 5 Science
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 63350.264 2160
Metric 65924.672 2207 Configural 2574.407 (47) < 0.01 0.000
Scalar 71076.019 2254 Metric 5151.347 (47) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 48975.307 2160
Metric 62668.052 2207 Configural 13692.745 (47) < 0.01 0.002
Scalar 66158.488 2254 Metric 3490.436 (47) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 43435.967 2160
Metric 44806.080 2207 Configural 1370.112 (47) < 0.01 0.000
Scalar 45290.173 2254 Metric 484.093 (47) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 43017.333 2160
Metric 43272.723 2207 Configural 255.390 (47) < 0.01 0.000
Scalar 44070.379 2254 Metric 797.656 (47) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 41471.972 2160
Metric 41552.751 2207 Configural 80.779 (47) < 0.01 0.000
Scalar 41605.053 2254 Metric 52.302 (47) 0.28 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 45495.444 2160
Metric 46819.459 2207 Configural 1324.015 (47) < 0.01 0.000
Scalar 47225.920 2254 Metric 406.461 (47) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 57906.681 2160
Metric 68633.867 2207 Configural 10727.185 (47) < 0.01 0.002
Scalar 72884.224 2254 Metric 4250.357 (47) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 62431.638 2160
Metric 64664.211 2207 Configural 2232.573 (47) < 0.01 0.000
Scalar 65615.472 2254 Metric 951.261 (47) < 0.01 0.000
Appendix A.19b Global Model Fit Indices of Scalar Invariance Model – Grade 5 Science
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 48465.973 2213 < 0.01 0.986 0.018
Model B-1 38704.713 2213 < 0.01 0.983 0.017
Model B-2 26621.863 2213 < 0.01 0.987 0.015
Model B-3 25118.109 2213 < 0.01 0.986 0.015
Model B-4 13685.571 2213 < 0.01 0.991 0.011
Model B-5 28481.220 2213 < 0.01 0.987 0.016
Model C 48502.598 2213 < 0.01 0.983 0.018
Model D 36286.565 2213 < 0.01 0.990 0.016
Appendix A.20a Global Model Fit Indices of Measurement Invariance Tests – Grade 8 Science
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 90393.529 2254
Metric 94554.790 2302 Configural 4161.262 (48) < 0.01 0.000
Scalar 107482.525 2350 Metric 12927.735 (48) < 0.01 0.002
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 70703.399 2254
Metric 83901.581 2302 Configural 13198.183 (48) < 0.01 0.002
Scalar 89892.739 2350 Metric 5991.158 (48) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 66182.166 2254
Metric 67700.300 2302 Configural 1518.134 (48) < 0.01 0.000
Scalar 68191.700 2350 Metric 491.400 (48) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 66438.873 2254
Metric 67121.267 2302 Configural 682.394 (48) < 0.01 0.001
Scalar 68468.267 2350 Metric 1347.000 (48) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 63812.204 2254
Metric 63871.191 2302 Configural 58.987 (48) 0.13 0.000
Scalar 63911.516 2350 Metric 40.325 (48) 0.78 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 68725.221 2254
Metric 69861.354 2302 Configural 1136.133 (48) < 0.01 0.000
Scalar 70296.196 2350 Metric 434.842 (48) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 82183.891 2254
Metric 94817.504 2302 Configural 12633.613 (48) < 0.01 0.001
Scalar 101895.282 2350 Metric 7077.778 (48) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 89535.891 2254
Metric 91556.651 2302 Configural 2020.760 (48) < 0.01 0.000
Scalar 93680.256 2350 Metric 2123.605 (48) < 0.01 0.000
Appendix A.20b Global Model Fit Indices of Scalar Invariance Model – Grade 8 Science
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 70504.219 2306 < 0.01 0.976 0.022
Model B-1 51594.463 2306 < 0.01 0.975 0.020
Model B-2 40574.909 2306 < 0.01 0.978 0.019
Model B-3 41410.612 2306 < 0.01 0.976 0.019
Model B-4 19676.823 2306 < 0.01 0.985 0.013
Model B-5 43588.756 2306 < 0.01 0.977 0.019
Model C 54941.635 2306 < 0.01 0.977 0.019
Model D 47452.805 2306 < 0.01 0.983 0.018
Appendix A.21a Global Model Fit Indices of Measurement Invariance Tests – Physical Science
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-1: Students’ Ethnicity (African American vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Appendix A.21b Global Model Fit Indices of Scalar Invariance Model – Physical Science
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A NA NA NA NA NA
Model B-1 NA NA NA NA NA
Model B-2 NA NA NA NA NA
Model B-3 NA NA NA NA NA
Model B-4 NA NA NA NA NA
Model B-5 NA NA NA NA NA
Model C NA NA NA NA NA
Model D NA NA NA NA NA
Appendix A.22a Global Model Fit Indices of Measurement Invariance Tests – Biology
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 56014.787 1558
Metric 64140.511 1598 Configural 8125.724 (40) < 0.01 0.002
Scalar 66899.524 1638 Metric 2759.013 (40) < 0.01 0.000
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 47235.358 1558
Metric 47282.315 1598 Configural 46.957 (40) 0.21 0.001
Scalar 47322.862 1638 Metric 40.547 (40) 0.45 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural NA NA
Metric NA NA Configural NA (NA) NA NA
Scalar NA NA Metric NA (NA) NA NA
Appendix A.22b Global Model Fit Indices of Scalar Invariance Model – Biology
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A NA NA NA NA NA
Model B-1 62644.623 1607 < 0.01 0.971 0.025
Model B-2 NA NA NA NA NA
Model B-3 NA NA NA NA NA
Model B-4 27145.815 1607 < 0.01 0.982 0.018
Model B-5 NA NA NA NA NA
Model C NA NA NA NA NA
Model D NA NA NA NA NA
Appendix A.23a Global Model Fit Indices of Measurement Invariance Tests – American Government
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 63772.064 1804
Metric 65254.577 1847 Configural 1482.513 (43) < 0.01 0.000
Scalar 69797.660 1890 Metric 4543.083 (43) < 0.01 0.001
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 53387.944 1804
Metric 59308.688 1847 Configural 5920.744 (43) < 0.01 0.001
Scalar 62407.716 1890 Metric 3099.028 (43) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 46495.040 1804
Metric 47917.832 1847 Configural 1422.791 (43) < 0.01 0.001
Scalar 48683.136 1890 Metric 765.304 (43) < 0.01 0.001
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 45846.218 1804
Metric 46075.553 1847 Configural 229.335 (43) < 0.01 0.001
Scalar 46876.882 1890 Metric 801.329 (43) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 44923.999 1804
Metric 44974.598 1847 Configural 50.598 (43) 0.20 0.001
Scalar 45032.853 1890 Metric 58.255 (43) 0.06 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 47941.150 1804
Metric 48707.382 1847 Configural 766.232 (43) < 0.01 0.000
Scalar 48997.369 1890 Metric 289.987 (43) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 58213.852 1804
Metric 64905.650 1847 Configural 6691.798 (43) < 0.01 0.001
Scalar 70018.085 1890 Metric 5112.435 (43) < 0.01 0.001
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 62807.811 1804
Metric 64161.129 1847 Configural 1353.317 (43) < 0.01 0.000
Scalar 66816.812 1890 Metric 2655.683 (43) < 0.01 0.000
Appendix A.23b Global Model Fit Indices of Scalar Invariance Model – American Government
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 65296.915 1865 < 0.01 0.963 0.028
Model B-1 54450.972 1865 < 0.01 0.959 0.027
Model B-2 42102.698 1865 < 0.01 0.963 0.026
Model B-3 39864.033 1865 < 0.01 0.961 0.025
Model B-4 20341.352 1865 < 0.01 0.973 0.018
Model B-5 43727.796 1865 < 0.01 0.963 0.026
Model C 58145.074 1865 < 0.01 0.960 0.026
Model D 51333.089 1865 < 0.01 0.970 0.025
Appendix A.24a Global Model Fit Indices of Measurement Invariance Tests – American History
Invariance Model | χ² | df | Comparison | χ² Difference (df) | p-value | Change in RMSEA
Model A: Students’ Gender (Female vs. Male)
Configural 43804.217 2350
Metric 45898.350 2399 Configural 2094.133 (49) < 0.01 0.000
Scalar 64720.766 2448 Metric 18822.416 (49) < 0.01 0.003
Model B-1: Students’ Ethnicity (African American vs. White)
Configural 39764.080 2350
Metric 46043.401 2399 Configural 6279.320 (49) < 0.01 0.001
Scalar 50107.218 2448 Metric 4063.818 (49) < 0.01 0.001
Model B-2: Students’ Ethnicity (Hispanics vs. White)
Configural 33965.739 2350
Metric 34952.402 2399 Configural 986.663 (49) < 0.01 0.000
Scalar 35813.482 2448 Metric 861.080 (49) < 0.01 0.000
Model B-3: Students’ Ethnicity (Asian vs. White)
Configural 33395.205 2350
Metric 33661.299 2399 Configural 266.094 (49) < 0.01 0.000
Scalar 34452.210 2448 Metric 790.911 (49) < 0.01 0.000
Model B-4: Students’ Ethnicity (American Indian vs. White)
Configural 32761.844 2350
Metric 32803.386 2399 Configural 41.542 (49) 0.77 0.000
Scalar 32860.184 2448 Metric 56.798 (49) 0.21 0.000
Model B-5: Students’ Ethnicity (Multi-Ethnics vs. White)
Configural 35200.084 2350
Metric 35763.361 2399 Configural 563.277 (49) < 0.01 0.000
Scalar 36159.538 2448 Metric 396.177 (49) < 0.01 0.000
Model C: Students’ IEP Status (Individualized Education Program vs. Non-IEP)
Configural 44510.740 2350
Metric 49693.172 2399 Configural 5182.432 (49) < 0.01 0.001
Scalar 54970.627 2448 Metric 5277.455 (49) < 0.01 0.000
Model D: Students’ LEP Status (Limited English Proficiency vs. Non-LEP)
Configural 46263.172 2350
Metric 47481.357 2399 Configural 1218.185 (49) < 0.01 0.000
Scalar 49707.893 2448 Metric 2226.537 (49) < 0.01 0.000
Appendix A.24b Global Model Fit Indices of Scalar Invariance Model – American History
Model | Chi-Square Value | df | P-Value | CFI | RMSEA
Model A 61853.159 2411 < 0.01 0.985 0.020
Model B-1 48290.257 2411 < 0.01 0.984 0.018
Model B-2 32825.971 2411 < 0.01 0.988 0.016
Model B-3 30701.853 2411 < 0.01 0.988 0.016
Model B-4 15249.497 2411 < 0.01 0.993 0.011
Model B-5 33960.287 2411 < 0.01 0.988 0.016
Model C 48883.897 2411 < 0.01 0.986 0.017
Model D 41688.775 2411 < 0.01 0.990 0.016
Table B.1 Number and Percentage of Flagged Aggregate Units for Test Integrity Forensic Studies –
English Language Arts
Test | Aggregate Unit | Total | Change in Student Performance (Number Flagged / Pct. Flagged) | Response Latency (Number Flagged / Pct. Flagged) | Person Fit (Number Flagged / Pct. Flagged)
Grade 3 ELA
Student 131,376 917 1% 327 0%
Session 19,307 280 1% 52 0%
Test Administrator 12,701 118 1% 54 0%
School 2,124 - 0% 104 5%
Grade 4 ELA
Student 131,355 2 0% 1,102 1% 305 0%
Session 17,954 2 0% 314 2% 35 0%
Test Administrator 12,396 2 0% 130 1% 41 0%
School 2,129 1 0% - 0% 105 5%
Grade 5 ELA
Student 131,589 717 1% 1,008 1% 1,440 1%
Session 17,127 187 1% 276 2% 105 1%
Test Administrator 11,847 129 1% 102 1% 128 1%
School 1,954 150 8% - 0% 151 8%
Grade 6 ELA
Student 130,098 760 1% 967 1% 1,551 1%
Session 15,034 231 2% 245 2% 665 4%
Test Administrator 10,337 182 2% 88 1% 739 7%
School 1,617 191 12% 1 0% 394 24%
Grade 7 ELA
Student 127,605 756 1% 918 1% 1,333 1%
Session 14,035 232 2% 212 2% 155 1%
Test Administrator 9,665 188 2% 77 1% 173 2%
School 1,443 186 13% 1 0% 115 8%
Grade 8 ELA
Student 128,537 693 1% 899 1% 871 1%
Session 14,174 144 1% 185 1% 284 2%
Test Administrator 9,635 113 1% 83 1% 349 4%
School 1,403 132 9% - 0% 279 20%
EOC ELA I
Student 155,905 135 0% 804 1% 2,487 2%
Session 17,840 44 0% 126 1% 531 3%
Test Administrator 9,803 31 0% 45 0% 564 6%
School 1,193 29 2% 1 0% 264 22%
EOC ELA II
Student 146,878 936 1% 780 1% 1,636 1%
Session 17,108 181 1% 117 1% 566 3%
Test Administrator 9,239 124 1% 39 0% 626 7%
School 1,074 136 13% - 0% 374 35%
Table B.2 Number and Percentage of Flagged Aggregate Units for Test Integrity Forensic Studies –
Mathematics
Test | Aggregate Unit | Total | Change in Student Performance (Number Flagged / Pct. Flagged) | Response Latency (Number Flagged / Pct. Flagged) | Person Fit (Number Flagged / Pct. Flagged)
Grade 3 Math
Student 132,658 987 1% 626 0%
Session 18,873 280 1% 24 0%
Test Administrator 12,662 124 1% 25 0%
School 2,152 - 0% 30 1%
Grade 4 Math
Student 130,792 647 0% 1,109 1% 331 0%
Session 17,961 714 4% 311 2% 17 0%
Test Administrator 12,366 677 5% 134 1% 18 0%
School 2,134 563 26% - 0% 34 2%
Grade 5 Math
Student 130,221 4 0% 1,472 1% 251 0%
Session 17,261 5 0% 424 2% 9 0%
Test Administrator 11,891 5 0% 188 2% 20 0%
School 1,978 5 0% - 0% 37 2%
Grade 6 Math
Student 128,486 1 0% 830 1% 494 0%
Session 15,381 1 0% 220 1% 26 0%
Test Administrator 10,524 1 0% 73 1% 26 0%
School 1,731 1 0% 1 0% 27 2%
Grade 7 Math
Student 122,942 - 0% 853 1% 276 0%
Session 14,402 - 0% 207 1% 7 0%
Test Administrator 9,887 - 0% 80 1% 8 0%
School 1,535 - 0% 1 0% 32 2%
Grade 8 Math
Student 100,264 438 0% 807 1% 392 0%
Session 13,411 322 2% 191 1% 38 0%
Test Administrator 9,134 314 3% 81 1% 37 0%
School 1,404 259 18% - 0% 56 4%
Algebra I
Student 153,027 177 0% 971 1% 348 0%
Session 20,231 71 0% 221 1% 120 1%
Test Administrator 11,406 55 0% 88 1% 134 1%
School 1,753 57 3% 2 0% 212 12%
Geometry
Student 134,970 526 0% 775 1% 348 0%
Session 17,407 249 1% 174 1% 40 0%
Test Administrator 9,519 211 2% 62 1% 48 1%
School 1,286 209 16% 3 0% 56 4%
Integrated Math I
Student 12,687 12 0% 99 1% 18 0%
Session 1,940 6 0% 21 1% 8 0%
Test Administrator 1,137 5 0% 11 1% 9 1%
School 308 - 0% 2 1% 9 3%
Integrated Math II
Student 11,240 50 0% 74 1% 36 0%
Session 1,820 28 2% 17 1% 5 0%
Test Administrator 1,039 20 2% 7 1% 8 1%
School 279 15 5% 1 0% 7 3%
Table B.3 Number and Percentage of Flagged Aggregate Units for Test Integrity Forensic Studies –
Science
Test | Aggregate Unit | Total | Response Latency (Number Flagged / Pct. Flagged) | Person Fit (Number Flagged / Pct. Flagged)
Grade 5 Science
Student 131,474 687 1% 365 0%
Session 16,530 174 1% 28 0%
Test Administrator 11,591 80 1% 26 0%
School 1,951 0 0% 46 2%
Grade 8 Science
Student 129,418 831 1% 244 0%
Session 13,790 185 1% 36 0%
Test Administrator 9,405 81 1% 40 0%
School 1,409 1 0% 68 5%
Biology
Student 141,089 1,287 1% 115 0%
Session 16,098 227 1% 4 0%
Test Administrator 8,918 98 1% 5 0%
School 1,089 0 0% 15 1%
Physical Science
Student 740 9 1% 1 0%
Session 393 3 1% 0 0%
Test Administrator 210 2 1% 0 0%
School 122 2 2% 0 0%
Table B.4 Number and Percentage of Flagged Aggregate Units for Test Integrity Forensic Studies –
Social Studies
Test | Aggregate Unit | Total | Response Latency (Number Flagged / Pct. Flagged) | Person Fit (Number Flagged / Pct. Flagged)
American Government
Student 98,804 1,807 2% 816 1%
Session 13,114 287 2% 124 1%
Test Administrator 6,864 133 2% 181 3%
School 1,026 2 0% 140 14%
American History
Student 131,062 1,729 1% 459 0%
Session 15,338 265 2% 122 1%
Test Administrator 8,641 123 1% 242 3%
School 1,064 1 0% 243 23%
Table B.5 Number and Percentage of Flagged Aggregate Units for Response Pattern Similarity Study –
English Language Arts
Mode | Test | Aggregate Unit | Total | Number of Flagged Units (Aggressive / Conservative) | Percentage of Flagged Units (Aggressive / Conservative)
Online
G3E School 430 353 14 82% 3%
Session 334 310 25 93% 7%
G4E School 340 290 15 85% 4%
Session 307 291 23 95% 7%
G5E School 288 247 3 86% 1%
Session 245 239 14 98% 6%
G6E School 415 314 10 76% 2%
Session 333 320 15 96% 5%
G7E School 301 235 7 78% 2%
Session 284 267 21 94% 7%
G8E School 644 421 17 65% 3%
Session 555 490 38 88% 7%
ELA I School 560 315 28 56% 5%
Session 533 474 61 89% 11%
ELA II School 1,063 492 48 46% 5%
Session 841 717 98 85% 12%
Paper
G3E School 2 2 0 100% 0%
G4E School 1 1 1 100% 100%
G5E School 2 2 0 100% 0%
G6E School 2 2 0 100% 0%
G7E School 1 1 0 100% 0%
G8E School 3 3 0 100% 0%
ELA I School 4 4 0 100% 0%
ELA II School 5 4 0 80% 0%
Note: The aggressive approach flags a school or session containing a flagged pair of response patterns using alpha = .05 and a Bonferroni adjustment factor of (n-1). The conservative approach flags a problematic school or session only when a pair is flagged using alpha = .01 and a Bonferroni adjustment factor of n(n-1)/2.
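As an illustration of the two criteria described in the note, the following minimal Python sketch computes the per-pair significance thresholds implied for a school or session containing n students. The function name and structure are assumptions made for illustration; this is not the operational forensic code.

def similarity_flag_thresholds(n: int) -> dict:
    """Per-pair alpha cut-offs for a school/session containing n students.

    Aggressive criterion:   alpha = .05 spread over (n - 1) comparisons.
    Conservative criterion: alpha = .01 spread over n(n - 1)/2 comparisons
                            (all unordered pairs within the unit).
    """
    if n < 2:
        raise ValueError("At least two students are needed to form a pair.")
    return {
        "aggressive": 0.05 / (n - 1),
        "conservative": 0.01 / (n * (n - 1) / 2),
    }

# Example: a 25-student session gives thresholds of roughly 0.00208
# (aggressive) and 0.0000333 (conservative).
print(similarity_flag_thresholds(25))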
Table B.6 Number and Percentage of Flagged Aggregate Units for Response Pattern Similarity Study –
Mathematics
Mode | Test | Aggregate Unit | Total | Number of Flagged Units (Aggressive / Conservative) | Percentage of Flagged Units (Aggressive / Conservative)
Online
G3M School 114 112 0 98% 0%
Session 137 136 3 99% 2%
G4M School 429 358 1 83% 0%
Session 409 383 6 94% 1%
G5M School 156 136 0 87% 0%
Session 192 182 4 95% 2%
G6M School 285 241 1 85% 0%
Session 318 301 2 95% 1%
G7M School 312 246 2 79% 1%
Session 314 283 10 90% 3%
G8M School 310 233 4 75% 1%
Session 253 239 15 94% 6%
Algebra School 2,616 770 80 29% 3%
Session 1,465 1,171 167 80% 11%
Geometry School 796 441 3 55% 0%
Session 775 675 32 87% 4%
IM I School 232 55 7 24% 3%
Session 125 93 9 74% 7%
IM II School 278 60 12 22% 4%
Session 134 110 21 82% 16%
Paper
G3M School 3 2 0 67% 0%
G4M School 3 3 0 100% 0%
G5M School 3 2 0 67% 0%
G6M School 3 3 0 100% 0%
G7M School 2 2 0 100% 0%
G8M School 13 6 2 46% 15%
Algebra School 4 4 0 100% 0%
Geometry School 1 1 0 100% 0%
IM I School 3 2 0 67% 0%
IM II School - - - - -
Note: The aggressive approach flags a school or session containing a flagged pair of response patterns using alpha = .05 and a Bonferroni adjustment factor of (n-1). The conservative approach flags a problematic school or session only when a pair is flagged using alpha = .01 and a Bonferroni adjustment factor of n(n-1)/2.
Table B.7 Number and Percentage of Flagged Aggregate Units for Response Pattern Similarity Study –
Science
Mode | Test | Aggregate Unit | Total | Number of Flagged Units (Aggressive / Conservative) | Percentage of Flagged Units (Aggressive / Conservative)
Online
G5Sci School 295 240 20 81% 7%
Session 271 246 37 91% 14%
G8Sci School 579 373 25 64% 4%
Session 576 521 60 90% 10%
Biology School 2,148 676 85 31% 4%
Session 1,537 1,190 203 77% 13%
Physical Science
School 6 5 0 83% 0%
Session 1 1 0 100% 0%
Paper
G5Sci School 17 4 3 24% 18%
G8Sci School 9 8 3 89% 33%
Biology School 7 6 2 86% 29%
Physical Science
School 1 1 1 100% 100%
Note: The aggressive approach flags a school or session containing a flagged pair of response patterns using alpha = .05 and a Bonferroni adjustment factor of (n-1). The conservative approach flags a problematic school or session only when a pair is flagged using alpha = .01 and a Bonferroni adjustment factor of n(n-1)/2.
Table B.8 Number and Percentage of Flagged Aggregate Units for Response Pattern Similarity Study –
Social Studies
Mode | Test | Aggregate Unit | Total | Number of Flagged Units (Aggressive / Conservative) | Percentage of Flagged Units (Aggressive / Conservative)
Online
AG School 464 288 25 62% 5%
Session 556 449 67 81% 12%
AH School 1,449 520 139 36% 10%
Session 1,212 862 234 71% 19%
Paper AG School 3 2 0 67% 0%
AH School 4 3 0 75% 0%
Note: The aggressive approach flags a school or session containing a flagged pair of response patterns using alpha = .05 and a Bonferroni adjustment factor of (n-1). The conservative approach flags a problematic school or session only when a pair is flagged using alpha = .01 and a Bonferroni adjustment factor of n(n-1)/2.
Table C.1. Number of Students Participating in Fall 2017 Online Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
ELA Grade 3 127,757 62,751 64,856 150 23,117 3,083 5,356 158 86,241 9,640 5,079 14,830
ELA I 41,904 18,094 23,036 774 14,167 764 2,207 59 21,694 2,762 2,843 9,906
ELA II 35,466 16,371 18,418 677 11,482 728 1,778 55 18,860 2,324 2,282 7,058
Math
Algebra 47,341 22,249 24,121 971 14,667 571 2,599 74 26,117 3,032 2,005 9,283
Geometry 34,514 16,700 17,113 701 10,499 593 1,826 46 19,168 2,120 1,515 6,346
Integrated Math I 4,822 2,342 2,441 39 2,326 213 76 14 1,625 565 812 664
Integrated Math II 4,389 2,092 2,266 31 1,988 184 69 8 1,672 466 619 534
Science
Biology 24,772 11,992 12,467 313 8,825 391 1,249 39 12,544 1,620 1,467 4,292
Physical Science 897 435 444 18 448 6 39 325 66 41 196
Social Studies
American Government 32,259 15,625 16,221 413 7,154 749 1,313 47 21,092 1,771 1,100 4,541
American History 20,912 10,329 10,258 325 7,534 348 1,130 28 10,431 1,348 1,404 4,145
Table C.2. Number of Students Participating in Fall 2017 Paper Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
ELA
G3 ELA 448 181 265 2 93 4 13 322 15 35 310
ELA I 135 42 89 4 28 7 88 10 6 83
ELA II 87 29 56 2 11 6 58 10 4 53
Math
Algebra 150 54 87 9 44 2 9 1 76 17 6 70
Geometry 117 42 61 14 42 3 2 54 15 4 57
Integrated Math I 10 4 6 1 8 1 7
Integrated Math II 12 6 6 10 1 11
Science
Biology 65 21 41 3 20 1 6 33 5 3 41
Physical Science 2 1 1 2 2
Social Studies
American Government 71 30 41 7 3 4 53 4 4 42
American History 57 20 34 3 17 3 2 29 5 4 28
Table C.3. Number of Students Participating in Spring 2018 ELA Online Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
G3 ELA 126,050 61,977 63,914 159 22,817 3,033 5,323 157 84,936 9,621 5,202 16,183
G4 ELA 125,854 61,651 64,033 170 20,643 3,115 5,201 162 87,542 9,060 4,044 17,185
G5 ELA 127,446 62,418 64,848 180 21,267 3,125 5,051 188 88,832 8,841 3,549 17,484
G6 ELA 125,943 61,650 64,106 187 20,465 3,011 5,088 173 88,720 8,349 3,231 16,888
G7 ELA 123,841 60,688 62,901 252 18,864 3,063 4,643 160 89,059 7,901 2,934 16,162
G8 ELA 124,880 60,910 63,779 191 19,126 2,978 4,537 158 89,992 7,900 2,966 16,349
ELA I 148,951 71,575 76,796 580 27,502 3,482 5,927 203 102,174 9,355 5,593 21,741
ELA II 139,572 68,420 70,584 568 23,659 3,363 5,294 192 98,465 8,336 4,511 18,913
Table C.4. Number of Students Participating in Spring 2018 ELA Paper Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
G3 ELA 490 204 284 2 96 6 12 356 19 37 317
G4 ELA 640 273 364 3 131 6 24 2 430 46 23 417
G5 ELA 511 212 297 2 98 4 20 354 35 16 320
G6 ELA 466 178 288 94 8 8 1 327 26 17 317
G7 ELA 475 209 265 1 83 5 9 359 18 17 303
G8 ELA 408 174 233 1 68 7 1 310 20 14 250
ELA I 443 143 295 5 109 5 21 1 265 30 20 295
ELA II 402 157 241 4 87 4 15 271 17 13 279
Table C.5. Number of Students Participating in Spring 2018 Mathematics Online Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
G3 Math 126,769 62,334 64,255 180 22,837 3,020 5,351 157 85,601 9,636 5,202 16,243
G4 Math 125,282 61,445 63,659 178 20,639 3,030 5,205 160 87,091 9,024 4,034 17,187
G5 Math 126,093 61,865 64,032 196 21,222 2,928 5,042 191 87,807 8,761 3,544 17,457
G6 Math 124,337 60,962 63,161 214 20,443 2,803 5,019 173 87,489 8,267 3,235 16,919
G7 Math 119,191 58,570 60,362 259 18,598 2,613 4,594 154 85,433 7,644 2,915 16,090
G8 Math 97,029 46,827 50,018 184 16,595 1,939 4,059 129 67,596 6,572 2,811 15,726
Algebra 144,091 69,666 73,795 630 24,933 3,119 5,817 206 101,262 8,470 4,025 20,161
Geometry 126,729 62,827 63,385 517 20,163 2,857 4,876 179 91,392 7,028 3,084 15,831
Integrated Math I 12,152 5,860 6,239 53 3,808 494 300 25 6,206 1,298 1,441 1,830
Integrated Math II 10,490 5,132 5,301 57 3,049 434 266 27 5,725 967 1,004 1,516
Table C.6. Number of Students Participating in Spring 2018 Mathematics Paper Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
G3 Math 653 279 370 4 186 7 19 414 25 42 406
G4 Math 640 273 364 3 142 5 22 1 424 45 27 418
G5 Math 521 210 309 2 99 4 19 361 38 21 328
G6 Math 484 183 301 97 11 8 1 336 29 23 334
G7 Math 502 211 290 1 94 5 9 374 19 24 325
G8 Math 437 187 249 1 76 7 1 327 25 23 285
Algebra 398 127 270 1 97 5 16 249 24 19 268
Geometry 289 118 166 5 62 3 12 1 183 17 8 178
Integrated Math I 76 27 48 1 15 2 2 49 7 6 63
Integrated Math II 46 13 31 2 8 1 2 32 3 4 41
Table C.7. Number of Students Participating in Spring 2018 Science and Social Studies Online Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
Science
G5 Science 127,349 62,404 64,751 194 21,166 3,135 5,063 188 88,823 8,831 3,550 17,438
G8 Science 125,778 61,534 64,043 201 19,078 2,995 4,589 155 90,895 7,879 2,967 16,291
Biology 135,109 66,887 67,803 419 21,928 3,225 4,870 185 96,774 7,872 3,878 17,412
Physical Science 478 243 231 4 195 4 20 1 211 40 25 91
Social Studies
American Government 86,861 42,931 43,521 409 14,322 1,565 2,924 121 62,730 5,009 2,149 10,331
American History 125,839 62,295 63,145 399 20,438 2,489 4,677 163 90,536 7,308 3,905 17,060
Table C.8. Number of Students Participating in Spring 2018 Science and Social Studies Paper Assessments
Assessment | Overall | Female | Male | Unknown | African American | Asian | Hispanic/Latino | American Indian | White | Multiple Ethnicities | LEP | IEP
Science
G5 Science 520 210 308 2 99 4 19 361 37 20 328
G8 Science 424 185 238 1 73 6 1 316 26 21 267
Biology 371 138 228 5 67 5 14 1 251 23 10 253
Physical Science 6 2 4 1 5 3
Social Studies
American Government 217 79 133 5 47 3 6 1 147 12 9 143
American History 369 136 230 3 93 2 14 232 22 14 254
Table D1. Scaled Score Frequency Distributions Fall 2017 – Grade 3 Reading
Reading Promotion Score | Frequency | Percent | Cumulative Frequency | Cumulative Percent
16 89 0.07 89 0.07 21 235 0.19 324 0.26 26 645 0.51 969 0.76 30 1,302 1.03 2,271 1.79 32 2,169 1.71 4,440 3.51 35 2,978 2.35 7,418 5.86 37 3,934 3.11 11,352 8.96 39 4,489 3.54 15,841 12.51 41 5,178 4.09 21,019 16.59 43 5,489 4.33 26,508 20.93 45 5,800 4.58 32,308 25.51 46 6,187 4.88 38,495 30.39 48 6,399 5.05 44,894 35.44 50 6,814 5.38 51,708 40.82 51 6,799 5.37 58,507 46.19 53 6,872 5.43 65,379 51.61 54 7,027 5.55 72,406 57.16 56 7,071 5.58 79,477 62.74 58 7,050 5.57 86,527 68.31 59 7,006 5.53 93,533 73.84 61 6,807 5.37 100,340 79.22 63 6,502 5.13 106,842 84.35 65 5,633 4.45 112,475 88.80 67 4,758 3.76 117,233 92.55 70 3,815 3.01 121,048 95.56 73 2,760 2.18 123,808 97.74 77 1,641 1.30 125,449 99.04 82 884 0.70 126,333 99.74 86 334 0.26 126,667 100.00
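The cumulative columns in these frequency distributions follow directly from the frequency column. A minimal Python sketch (illustrative only; the helper below is an assumption, not the report's production code) reproduces the Table D1 entries within rounding:

from itertools import accumulate

def score_distribution(scores, freqs, total=None):
    """Return (score, frequency, percent, cumulative frequency, cumulative percent) rows."""
    total = total if total is not None else sum(freqs)
    rows = []
    for score, freq, cum in zip(scores, freqs, accumulate(freqs)):
        rows.append((score, freq, round(100 * freq / total, 2),
                     cum, round(100 * cum / total, 2)))
    return rows

# First three score points of Table D1, using the full N of 126,667; prints
# (16, 89, 0.07, 89, 0.07), (21, 235, 0.19, 324, 0.26), (26, 645, 0.51, 969, 0.76).
for row in score_distribution([16, 21, 26], [89, 235, 645], total=126_667):
    print(row)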
Table D2. Scaled Score Frequency Distributions Fall 2017 – Grade 3 ELA
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
545 114 0.09 114 0.09 567 358 0.28 472 0.37 588 913 0.71 1,385 1.08 603 1,909 1.49 3,294 2.56 616 2,981 2.32 6,275 4.88 626 4,246 3.30 10,521 8.19 635 5,097 3.97 15,618 12.16 643 6,058 4.72 21,676 16.87 651 6,438 5.01 28,114 21.88 657 6,813 5.30 34,927 27.19 664 6,816 5.31 41,743 32.49 672 6,974 5.43 48,717 37.92 676 6,598 5.14 55,315 43.06 681 6,435 5.01 61,750 48.06 687 6,374 4.96 68,124 53.03 692 5,716 4.45 73,840 57.47 697 5,500 4.28 79,340 61.76 702 5,285 4.11 84,625 65.87 708 5,065 3.94 89,690 69.81 713 4,802 3.74 94,492 73.55 718 4,584 3.57 99,076 77.12 725 4,334 3.37 103,410 80.49 729 4,163 3.24 107,573 83.73 735 3,817 2.97 111,390 86.70 741 3,456 2.69 114,846 89.39 748 3,071 2.39 117,917 91.78 755 2,708 2.11 120,625 93.89 762 2,214 1.72 122,839 95.61 770 1,812 1.41 124,651 97.02 778 1,450 1.13 126,101 98.15 787 1,008 0.78 127,109 98.94 796 628 0.49 127,737 99.43 807 393 0.31 128,130 99.73 818 197 0.15 128,327 99.89 831 99 0.08 128,426 99.96 846 34 0.03 128,460 99.99 863 14 0.01 128,474 100.00
Table D3. Scaled Score Frequency Distributions Fall 2017 – High School ELA I
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
606 66 0.15 66 0.15 615 111 0.26 177 0.41 624 195 0.45 372 0.86 631 360 0.84 732 1.70 637 552 1.28 1,284 2.98 642 810 1.88 2,094 4.86 647 1,054 2.45 3,148 7.31 651 1,348 3.13 4,496 10.44 655 1,398 3.25 5,894 13.68 658 1,644 3.82 7,538 17.50 661 1,642 3.81 9,180 21.31 665 1,814 4.21 10,994 25.53 668 1,729 4.01 12,723 29.54 670 1,799 4.18 14,522 33.72 673 1,851 4.30 16,373 38.01 676 1,836 4.26 18,209 42.28 678 1,876 4.36 20,085 46.63 681 1,900 4.41 21,985 51.04 683 1,905 4.42 23,890 55.47 685 1,759 4.08 25,649 59.55 688 1,750 4.06 27,399 63.62 690 1,622 3.77 29,021 67.38 692 1,547 3.59 30,568 70.97 694 1,372 3.19 31,940 74.16 697 1,274 2.96 33,214 77.12 699 1,134 2.63 34,348 79.75 701 1,120 2.60 35,468 82.35 703 965 2.24 36,433 84.59 705 851 1.98 37,284 86.57 707 734 1.70 38,018 88.27 709 671 1.56 38,689 89.83 711 562 1.30 39,251 91.13 713 510 1.18 39,761 92.32 716 407 0.94 40,168 93.26 718 372 0.86 40,540 94.13 720 359 0.83 40,899 94.96 722 277 0.64 41,176 95.60 725 262 0.61 41,438 96.21 727 221 0.51 41,659 96.72 729 204 0.47 41,863 97.20 732 190 0.44 42,053 97.64 734 170 0.39 42,223 98.03 737 135 0.31 42,358 98.35
740 135 0.31 42,493 98.66 743 127 0.29 42,620 98.96 746 114 0.26 42,734 99.22 749 89 0.21 42,823 99.43 752 65 0.15 42,888 99.58 756 49 0.11 42,937 99.69 760 39 0.09 42,976 99.78 764 35 0.08 43,011 99.86 769 24 0.06 43,035 99.92 774 13 0.03 43,048 99.95 780 11 0.03 43,059 99.97 786 6 0.01 43,065 99.99 794 3 0.01 43,068 100.00 800 2 0.00 43,070 100.00
Table D4. Scaled Score Frequency Distributions Fall 2017 – High School ELA II
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
597 62 0.17 62 0.17 603 94 0.26 156 0.43 613 200 0.55 356 0.98 621 324 0.89 680 1.87 628 557 1.53 1,237 3.40 633 756 2.08 1,993 5.48 639 992 2.73 2,985 8.21 643 1,187 3.27 4,172 11.48 647 1,341 3.69 5,513 15.17 651 1,423 3.92 6,936 19.09 655 1,569 4.32 8,505 23.40 658 1,622 4.46 10,127 27.87 662 1,549 4.26 11,676 32.13 665 1,668 4.59 13,344 36.72 668 1,618 4.45 14,962 41.17 671 1,566 4.31 16,528 45.48 673 1,531 4.21 18,059 49.69 676 1,544 4.25 19,603 53.94 679 1,428 3.93 21,031 57.87 681 1,396 3.84 22,427 61.71 684 1,310 3.60 23,737 65.32 686 1,247 3.43 24,984 68.75 689 1,156 3.18 26,140 71.93 691 1,147 3.16 27,287 75.09 693 1,040 2.86 28,327 77.95 696 941 2.59 29,268 80.54 698 862 2.37 30,130 82.91 700 787 2.17 30,917 85.08 703 685 1.88 31,602 86.96 705 595 1.64 32,197 88.60 707 522 1.44 32,719 90.04 709 435 1.20 33,154 91.23 712 421 1.16 33,575 92.39 714 350 0.96 33,925 93.35 716 313 0.86 34,238 94.22 719 260 0.72 34,498 94.93 721 250 0.69 34,748 95.62 724 215 0.59 34,963 96.21 726 222 0.61 35,185 96.82 729 183 0.50 35,368 97.33 731 162 0.45 35,530 97.77 734 143 0.39 35,673 98.16 737 130 0.36 35,803 98.52
740 122 0.34 35,925 98.86 743 109 0.30 36,034 99.16 746 93 0.26 36,127 99.41 749 58 0.16 36,185 99.57 753 40 0.11 36,225 99.68 757 33 0.09 36,258 99.77 761 31 0.09 36,289 99.86 765 21 0.06 36,310 99.92 770 12 0.03 36,322 99.95 775 6 0.02 36,328 99.97 781 6 0.02 36,334 99.98 788 4 0.01 36,338 99.99 795 2 0.01 36,340 100.00
Table D5. Scaled Score Frequency Distributions Fall 2017 – Algebra
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
618 213 0.44 213 0.44 630 316 0.65 529 1.10 639 666 1.38 1,195 2.48 646 1,115 2.31 2,310 4.78 652 1,792 3.71 4,102 8.50 657 2,488 5.15 6,590 13.65 662 3,222 6.67 9,812 20.32 666 3,577 7.41 13,389 27.73 670 3,706 7.68 17,095 35.41 674 3,604 7.46 20,699 42.87 677 3,505 7.26 24,204 50.13 682 3,142 6.51 27,346 56.64 684 2,881 5.97 30,227 62.61 687 2,677 5.54 32,904 68.15 690 2,306 4.78 35,210 72.93 693 2,046 4.24 37,256 77.17 696 1,768 3.66 39,024 80.83 698 1,506 3.12 40,530 83.95 701 1,312 2.72 41,842 86.67 704 1,030 2.13 42,872 88.80 706 940 1.95 43,812 90.75 709 735 1.52 44,547 92.27 712 630 1.30 45,177 93.57 714 506 1.05 45,683 94.62 717 420 0.87 46,103 95.49 720 310 0.64 46,413 96.13 722 284 0.59 46,697 96.72 725 224 0.46 46,921 97.19 728 161 0.33 47,082 97.52 730 173 0.36 47,255 97.88 733 127 0.26 47,382 98.14 736 118 0.24 47,500 98.38 739 98 0.20 47,598 98.59 742 79 0.16 47,677 98.75 745 81 0.17 47,758 98.92 748 71 0.15 47,829 99.07 751 76 0.16 47,905 99.22 755 50 0.10 47,955 99.33 758 50 0.10 48,005 99.43 762 54 0.11 48,059 99.54 765 41 0.08 48,100 99.63 769 44 0.09 48,144 99.72 773 30 0.06 48,174 99.78
777 20 0.04 48,194 99.82 782 21 0.04 48,215 99.87 787 13 0.03 48,228 99.89 792 11 0.02 48,239 99.92 798 11 0.02 48,250 99.94 805 13 0.03 48,263 99.96 814 17 0.04 48,280 100.00
Table D6. Scaled Score Frequency Distributions Fall 2017 – Geometry
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
604 231 0.66 231 0.66 617 646 1.84 877 2.50 630 1,399 3.98 2,276 6.48 640 2,248 6.40 4,524 12.88 648 3,094 8.81 7,618 21.69 655 3,518 10.02 11,136 31.71 661 3,595 10.24 14,731 41.95 666 3,463 9.86 18,194 51.81 671 3,064 8.73 21,258 60.54 675 2,732 7.78 23,990 68.32 679 2,106 6.00 26,096 74.31 683 1,789 5.09 27,885 79.41 687 1,429 4.07 29,314 83.48 691 1,111 3.16 30,425 86.64 694 831 2.37 31,256 89.01 697 674 1.92 31,930 90.93 700 485 1.38 32,415 92.31 704 413 1.18 32,828 93.48 707 296 0.84 33,124 94.33 709 257 0.73 33,381 95.06 712 183 0.52 33,564 95.58 715 167 0.48 33,731 96.06 718 152 0.43 33,883 96.49 721 131 0.37 34,014 96.86 723 99 0.28 34,113 97.14 726 91 0.26 34,204 97.40 729 77 0.22 34,281 97.62 732 73 0.21 34,354 97.83 734 66 0.19 34,420 98.02 737 57 0.16 34,477 98.18 740 73 0.21 34,550 98.39 743 55 0.16 34,605 98.54 745 51 0.15 34,656 98.69 748 49 0.14 34,705 98.83 751 51 0.15 34,756 98.97 754 41 0.12 34,797 99.09 757 38 0.11 34,835 99.20 760 40 0.11 34,875 99.31 763 36 0.10 34,911 99.42 767 31 0.09 34,942 99.50 770 30 0.09 34,972 99.59 774 23 0.07 34,995 99.66 777 20 0.06 35,015 99.71
781 18 0.05 35,033 99.76 786 20 0.06 35,053 99.82 790 14 0.04 35,067 99.86 795 10 0.03 35,077 99.89 801 12 0.03 35,089 99.92 807 12 0.03 35,101 99.96 810 15 0.04 35,116 100.00
Table D7. Scaled Score Frequency Distributions Fall 2017 – Integrated Math I
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
618 32 0.64 32 0.64 626 55 1.10 87 1.74 635 85 1.70 172 3.45 642 143 2.87 315 6.31 649 223 4.47 538 10.78 654 296 5.93 834 16.71 659 335 6.71 1,169 23.42 663 361 7.23 1,530 30.66 667 373 7.47 1,903 38.13 671 363 7.27 2,266 45.40 675 313 6.27 2,579 51.67 678 278 5.57 2,857 57.24 682 230 4.61 3,087 61.85 685 253 5.07 3,340 66.92 688 233 4.67 3,573 71.59 690 206 4.13 3,779 75.72 693 176 3.53 3,955 79.24 696 145 2.91 4,100 82.15 699 139 2.79 4,239 84.93 701 109 2.18 4,348 87.12 704 85 1.70 4,433 88.82 707 81 1.62 4,514 90.44 709 70 1.40 4,584 91.85 712 47 0.94 4,631 92.79 714 47 0.94 4,678 93.73 717 38 0.76 4,716 94.49 720 34 0.68 4,750 95.17 722 25 0.50 4,775 95.67 725 23 0.46 4,798 96.13 728 31 0.62 4,829 96.75 730 16 0.32 4,845 97.07 733 18 0.36 4,863 97.44 736 23 0.46 4,886 97.90 739 14 0.28 4,900 98.18 742 13 0.26 4,913 98.44 745 16 0.32 4,929 98.76 748 5 0.10 4,934 98.86 751 12 0.24 4,946 99.10 755 6 0.12 4,952 99.22 758 8 0.16 4,960 99.38 762 7 0.14 4,967 99.52 766 9 0.18 4,976 99.70 770 1 0.02 4,977 99.72
774 4 0.08 4,981 99.80 778 3 0.06 4,984 99.86 783 4 0.08 4,988 99.94 795 1 0.02 4,989 99.96 811 2 0.04 4,991 100.00
Table D8. Scaled Score Frequency Distributions Fall 2017 – Integrated Math II
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
594 15 0.33 15 0.33 604 33 0.73 48 1.06 618 94 2.08 142 3.14 629 143 3.16 285 6.30 637 231 5.11 516 11.41 645 265 5.86 781 17.27 651 340 7.52 1,121 24.78 657 402 8.89 1,523 33.67 662 405 8.95 1,928 42.63 667 392 8.67 2,320 51.29 671 356 7.87 2,676 59.16 675 330 7.30 3,006 66.46 679 260 5.75 3,266 72.21 683 228 5.04 3,494 77.25 687 163 3.60 3,657 80.85 690 160 3.54 3,817 84.39 694 101 2.23 3,918 86.62 697 86 1.90 4,004 88.53 700 64 1.41 4,068 89.94 704 53 1.17 4,121 91.11 707 51 1.13 4,172 92.24 710 36 0.80 4,208 93.04 713 39 0.86 4,247 93.90 716 28 0.62 4,275 94.52 719 20 0.44 4,295 94.96 722 21 0.46 4,316 95.42 725 22 0.49 4,338 95.91 728 13 0.29 4,351 96.20 731 22 0.49 4,373 96.68 734 11 0.24 4,384 96.93 737 7 0.15 4,391 97.08 740 10 0.22 4,401 97.30 743 10 0.22 4,411 97.52 746 13 0.29 4,424 97.81 750 11 0.24 4,435 98.05 753 8 0.18 4,443 98.23 758 5 0.11 4,448 98.34 760 8 0.18 4,456 98.52 764 12 0.27 4,468 98.78 768 8 0.18 4,476 98.96 771 5 0.11 4,481 99.07 776 10 0.22 4,491 99.29 780 3 0.07 4,494 99.36
785 4 0.09 4,498 99.45 790 10 0.22 4,508 99.67 795 5 0.11 4,513 99.78 801 1 0.02 4,514 99.80 807 3 0.07 4,517 99.87 813 6 0.13 4,523 100.00
Table D9. Scaled Score Frequency Distributions Fall 2017 – Biology
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
617 17 0.07 17 0.07 626 49 0.19 66 0.26 638 97 0.38 163 0.65 647 230 0.91 393 1.56 654 412 1.64 805 3.20 655 1 0.00 806 3.20 660 808 3.21 1,614 6.41 665 1,195 4.74 2,809 11.15 670 1,571 6.24 4,380 17.38 674 1,888 7.49 6,268 24.88 678 2,203 8.74 8,471 33.62 681 2,106 8.36 10,577 41.98 685 2,055 8.16 12,632 50.14 687 1,867 7.41 14,499 57.55 690 1,614 6.41 16,113 63.95 692 1,319 5.24 17,432 69.19 695 1,081 4.29 18,513 73.48 697 862 3.42 19,375 76.90 700 747 2.96 20,122 79.87 702 624 2.48 20,746 82.34 704 528 2.10 21,274 84.44 707 433 1.72 21,707 86.16 709 339 1.35 22,046 87.50 711 288 1.14 22,334 88.64 713 250 0.99 22,584 89.64 715 212 0.84 22,796 90.48 717 214 0.85 23,010 91.33 719 150 0.60 23,160 91.92 722 159 0.63 23,319 92.55 724 146 0.58 23,465 93.13 726 150 0.60 23,615 93.73 728 125 0.50 23,740 94.23 730 125 0.50 23,865 94.72 732 122 0.48 23,987 95.21 735 106 0.42 24,093 95.63 737 136 0.54 24,229 96.17 739 121 0.48 24,350 96.65 742 85 0.34 24,435 96.98 744 121 0.48 24,556 97.46 747 93 0.37 24,649 97.83 749 84 0.33 24,733 98.17 752 84 0.33 24,817 98.50 755 62 0.25 24,879 98.75
758 64 0.25 24,943 99.00 761 54 0.21 24,997 99.21 764 44 0.17 25,041 99.39 768 36 0.14 25,077 99.53 771 36 0.14 25,113 99.67 776 27 0.11 25,140 99.78 780 23 0.09 25,163 99.87 786 13 0.05 25,176 99.92 792 9 0.04 25,185 99.96 799 6 0.02 25,191 99.98 809 1 0.00 25,192 99.99 822 3 0.01 25,195 100.00
Table D10. Scaled Score Frequency Distributions Fall 2017 – Physical Science
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
634 2 0.21 2 0.21 637 4 0.43 6 0.64 648 7 0.75 13 1.39 655 1 0.11 14 1.50 656 21 2.25 35 3.75 663 34 3.64 69 7.39 668 45 4.82 114 12.21 673 85 9.10 199 21.31 677 109 11.67 308 32.98 680 1 0.11 309 33.08 681 117 12.53 426 45.61 685 102 10.92 528 56.53 688 105 11.24 633 67.77 691 83 8.89 716 76.66 694 46 4.93 762 81.58 697 37 3.96 799 85.55 700 34 3.64 833 89.19 702 26 2.78 859 91.97 704 12 1.28 871 93.25 706 11 1.18 882 94.43 709 14 1.50 896 95.93 711 10 1.07 906 97.00 713 7 0.75 913 97.75 715 3 0.32 916 98.07 717 5 0.54 921 98.61 719 2 0.21 923 98.82 722 4 0.43 927 99.25 724 1 0.11 928 99.36 728 1 0.11 929 99.46 730 1 0.11 930 99.57 736 2 0.21 932 99.79 740 1 0.11 933 99.89 752 1 0.11 934 100.00
Table D11. Scaled Score Frequency Distributions Fall 2017 – American Government
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
642 3 0.01 3 0.01 645 17 0.05 20 0.06 654 20 0.06 40 0.12 659 47 0.14 87 0.27 664 107 0.33 194 0.59 668 176 0.54 370 1.13 671 272 0.83 642 1.97 674 404 1.24 1,046 3.21 677 523 1.60 1,569 4.81 679 660 2.02 2,229 6.83 681 714 2.19 2,943 9.02 683 839 2.57 3,782 11.59 685 899 2.76 4,681 14.35 687 989 3.03 5,670 17.38 688 931 2.85 6,601 20.23 690 924 2.83 7,525 23.06 691 895 2.74 8,420 25.81 693 895 2.74 9,315 28.55 694 833 2.55 10,148 31.10 695 858 2.63 11,006 33.73 697 773 2.37 11,779 36.10 698 733 2.25 12,512 38.35 699 693 2.12 13,205 40.47 700 714 2.19 13,919 42.66 701 711 2.18 14,630 44.84 703 651 2.00 15,281 46.84 704 641 1.96 15,922 48.80 705 632 1.94 16,554 50.74 706 613 1.88 17,167 52.62 707 617 1.89 17,784 54.51 708 641 1.96 18,425 56.47 709 588 1.80 19,013 58.28 710 631 1.93 19,644 60.21 711 596 1.83 20,240 62.04 713 631 1.93 20,871 63.97 714 631 1.93 21,502 65.90 715 608 1.86 22,110 67.77 716 617 1.89 22,727 69.66 717 586 1.80 23,313 71.46 718 629 1.93 23,942 73.38 719 611 1.87 24,553 75.26
721 621 1.90 25,174 77.16 722 553 1.69 25,727 78.85 723 616 1.89 26,343 80.74 725 574 1.76 26,917 82.50 726 564 1.73 27,481 84.23 728 534 1.64 28,015 85.87 729 545 1.67 28,560 87.54 731 493 1.51 29,053 89.05 733 467 1.43 29,520 90.48 734 462 1.42 29,982 91.90 736 449 1.38 30,431 93.27 739 406 1.24 30,837 94.52 741 359 1.10 31,196 95.62 744 331 1.01 31,527 96.63 746 314 0.96 31,841 97.59 750 270 0.83 32,111 98.42 754 198 0.61 32,309 99.03 758 130 0.40 32,439 99.43 764 104 0.32 32,543 99.75 773 56 0.17 32,599 99.92 774 27 0.08 32,626 100.00
Table D12. Scaled Score Frequency Distributions Fall 2017 – American History
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
619 4 0.02 4 0.02 622 10 0.05 14 0.07 633 20 0.09 34 0.16 641 44 0.21 78 0.37 648 79 0.37 157 0.74 653 130 0.61 287 1.34 658 236 1.11 523 2.45 662 341 1.60 864 4.05 665 460 2.15 1,324 6.20 669 633 2.97 1,957 9.17 672 728 3.41 2,685 12.58 675 872 4.08 3,557 16.66 678 903 4.23 4,460 20.89 680 1,038 4.86 5,498 25.76 683 1,093 5.12 6,591 30.88 685 1,073 5.03 7,664 35.90 687 1,043 4.89 8,707 40.79 689 1,044 4.89 9,751 45.68 691 978 4.58 10,729 50.26 693 967 4.53 11,696 54.79 695 885 4.15 12,581 58.94 697 870 4.08 13,451 63.01 699 787 3.69 14,238 66.70 701 733 3.43 14,971 70.13 702 655 3.07 15,626 73.20 704 574 2.69 16,200 75.89 706 498 2.33 16,698 78.22 707 468 2.19 17,166 80.41 709 426 2.00 17,592 82.41 711 384 1.80 17,976 84.21 712 359 1.68 18,335 85.89 714 316 1.48 18,651 87.37 716 250 1.17 18,901 88.54 717 221 1.04 19,122 89.58 719 210 0.98 19,332 90.56 721 155 0.73 19,487 91.29 722 141 0.66 19,628 91.95 724 165 0.77 19,793 92.72 726 144 0.67 19,937 93.39 727 158 0.74 20,095 94.14 729 124 0.58 20,219 94.72 731 110 0.52 20,329 95.23 733 121 0.57 20,450 95.80
734 91 0.43 20,541 96.22 736 88 0.41 20,629 96.64 738 78 0.37 20,707 97.00 740 79 0.37 20,786 97.37 742 81 0.38 20,867 97.75 744 78 0.37 20,945 98.12 747 52 0.24 20,997 98.36 749 50 0.23 21,047 98.59 752 58 0.27 21,105 98.87 754 43 0.20 21,148 99.07 757 53 0.25 21,201 99.32 760 43 0.20 21,244 99.52 764 34 0.16 21,278 99.68 767 22 0.10 21,300 99.78 772 18 0.08 21,318 99.86 777 15 0.07 21,333 99.93 783 10 0.05 21,343 99.98 791 2 0.01 21,345 99.99 800 2 0.01 21,347 100.00
Table D13. Scaled Score Frequency Distributions Spring 2018 – Grade 3 Reading
Reading Promotion Score | Frequency | Percent | Cumulative Frequency | Cumulative Percent
16 89 0.07 89 0.07 21 235 0.19 324 0.26 26 645 0.51 969 0.76 30 1,302 1.03 2,271 1.79 32 2,169 1.71 4,440 3.51 35 2,978 2.35 7,418 5.86 37 3,934 3.11 11,352 8.96 39 4,489 3.54 15,841 12.51 41 5,178 4.09 21,019 16.59 43 5,489 4.33 26,508 20.93 45 5,800 4.58 32,308 25.51 46 6,187 4.88 38,495 30.39 48 6,399 5.05 44,894 35.44 50 6,814 5.38 51,708 40.82 51 6,799 5.37 58,507 46.19 53 6,872 5.43 65,379 51.61 54 7,027 5.55 72,406 57.16 56 7,071 5.58 79,477 62.74 58 7,050 5.57 86,527 68.31 59 7,006 5.53 93,533 73.84 61 6,807 5.37 100,340 79.22 63 6,502 5.13 106,842 84.35 65 5,633 4.45 112,475 88.80 67 4,758 3.76 117,233 92.55 70 3,815 3.01 121,048 95.56 73 2,760 2.18 123,808 97.74 77 1,641 1.30 125,449 99.04 82 884 0.70 126,333 99.74 86 334 0.26 126,667 100.00
Table D14. Scaled Score Frequency Distributions Spring 2018 – Grade 3 ELA
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
545 78 0.06 78 0.06 562 191 0.15 269 0.21 583 513 0.40 782 0.62 599 1,040 0.82 1,822 1.44 612 1,732 1.37 3,554 2.81 623 2,285 1.80 5,839 4.61 633 2,961 2.34 8,800 6.95 642 3,377 2.67 12,177 9.61 650 3,938 3.11 16,115 12.72 657 4,181 3.30 20,296 16.02 664 4,399 3.47 24,695 19.50 672 4,726 3.73 29,421 23.23 678 4,954 3.91 34,375 27.14 684 5,004 3.95 39,379 31.09 690 5,476 4.32 44,855 35.41 696 5,622 4.44 50,477 39.85 702 5,669 4.48 56,146 44.33 708 5,639 4.45 61,785 48.78 714 5,861 4.63 67,646 53.40 719 6,026 4.76 73,672 58.16 725 5,782 4.56 79,454 62.73 731 5,645 4.46 85,099 67.18 737 5,687 4.49 90,786 71.67 743 5,437 4.29 96,223 75.97 752 5,259 4.15 101,482 80.12 756 4,859 3.84 106,341 83.95 763 4,404 3.48 110,745 87.43 770 3,988 3.15 114,733 90.58 777 3,385 2.67 118,118 93.25 785 2,658 2.10 120,776 95.35 794 2,143 1.69 122,919 97.04 803 1,569 1.24 124,488 98.28 813 1,009 0.80 125,497 99.08 824 631 0.50 126,128 99.57 837 326 0.26 126,454 99.83 851 144 0.11 126,598 99.95 863 69 0.05 126,667 100.00
Table D15. Scaled Score Frequency Distributions Spring 2018 – Grade 4 ELA
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
549 28 0.02 28 0.02 569 120 0.09 148 0.12 589 266 0.21 414 0.33 603 598 0.47 1,012 0.80 615 955 0.75 1,967 1.55 626 1,394 1.10 3,361 2.65 634 1,783 1.41 5,144 4.06 642 2,241 1.77 7,385 5.83 650 2,642 2.09 10,027 7.92 656 2,933 2.32 12,960 10.24 663 3,269 2.58 16,229 12.82 669 3,751 2.96 19,980 15.78 674 4,062 3.21 24,042 18.99 680 4,460 3.52 28,502 22.51 685 4,705 3.72 33,207 26.23 690 4,879 3.85 38,086 30.08 695 5,184 4.09 43,270 34.17 700 5,221 4.12 48,491 38.30 705 5,575 4.40 54,066 42.70 709 5,647 4.46 59,713 47.16 714 5,747 4.54 65,460 51.70 719 5,764 4.55 71,224 56.25 725 5,636 4.45 76,860 60.70 729 5,722 4.52 82,582 65.22 735 5,674 4.48 88,256 69.70 740 5,473 4.32 93,729 74.02 746 5,459 4.31 99,188 78.33 753 5,197 4.10 104,385 82.44 759 4,745 3.75 109,130 86.19 766 4,336 3.42 113,466 89.61 774 3,707 2.93 117,173 92.54 783 3,087 2.44 120,260 94.98 792 2,456 1.94 122,716 96.92 802 1,689 1.33 124,405 98.25 814 1,109 0.88 125,514 99.12 828 628 0.50 126,142 99.62 845 311 0.25 126,453 99.87 846 169 0.13 126,622 100.00
Table D16. Scaled Score Frequency Distributions Spring 2018 – Grade 5 ELA
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
552 28 0.02 28 0.02 557 64 0.05 92 0.07 576 159 0.12 251 0.20 590 314 0.25 565 0.44 602 495 0.39 1,060 0.83 612 860 0.67 1,920 1.50 621 1,079 0.84 2,999 2.34 629 1,450 1.13 4,449 3.47 636 1,817 1.42 6,266 4.89 643 2,082 1.63 8,348 6.52 650 2,510 1.96 10,858 8.48 656 2,621 2.05 13,479 10.52 662 2,912 2.27 16,391 12.80 669 3,121 2.44 19,512 15.23 673 3,365 2.63 22,877 17.86 678 3,485 2.72 26,362 20.58 683 3,873 3.02 30,235 23.61 689 4,041 3.16 34,276 26.76 694 4,211 3.29 38,487 30.05 700 4,462 3.48 42,949 33.53 704 4,712 3.68 47,661 37.21 708 5,064 3.95 52,725 41.17 713 5,270 4.11 57,995 45.28 718 5,439 4.25 63,434 49.53 725 5,612 4.38 69,046 53.91 729 5,983 4.67 75,029 58.58 734 6,042 4.72 81,071 63.30 740 6,133 4.79 87,204 68.09 745 6,221 4.86 93,425 72.94 751 5,981 4.67 99,406 77.61 758 5,912 4.62 105,318 82.23 765 5,357 4.18 110,675 86.41 773 4,811 3.76 115,486 90.17 782 4,109 3.21 119,595 93.38 791 3,329 2.60 122,924 95.98 802 2,288 1.79 125,212 97.76 815 1,493 1.17 126,705 98.93 830 811 0.63 127,516 99.56 847 376 0.29 127,892 99.86 848 184 0.14 128,076 100.00
Table D17. Scaled Score Frequency Distributions Spring 2018 – Grade 6 ELA
Scaled Score Frequency Percent Cumulative Frequency
Cumulative Percent
555 34 0.03 34 0.03 561 70 0.06 104 0.08 575 120 0.09 224 0.18 586 201 0.16 425 0.34 595 359 0.28 784 0.62 603 509 0.40 1,293 1.02 610 648 0.51 1,941 1.53 617 813 0.64 2,754 2.18 623 935 0.74 3,689 2.91 628 1,130 0.89 4,819 3.81 634 1,231 0.97 6,050 4.78 638 1,557 1.23 7,607 6.01 643 1,696 1.34 9,303 7.35 648 1,868 1.48 11,171 8.83 652 2,072 1.64 13,243 10.46 656 2,249 1.78 15,492 12.24 660 2,434 1.92 17,926 14.16 664 2,535 2.00 20,461 16.17 668 2,712 2.14 23,173 18.31 671 2,814 2.22 25,987 20.53 674 2,879 2.27 28,866 22.81 678 2,961 2.34 31,827 25.15 681 3,033 2.40 34,860 27.54 685 3,147 2.49 38,007 30.03 688 3,251 2.57 41,258 32.60 691 3,283 2.59 44,541 35.19 694 3,378 2.67 47,919 37.86 697 3,443 2.72 51,362 40.58 701 3,524 2.78 54,886 43.36 704 3,674 2.90 58,560 46.27 707 3,611 2.85 62,171 49.12 710 3,708 2.93 65,879 52.05 713 3,692 2.92 69,571 54.97 716 3,759 2.97 73,330 57.94 720 3,816 3.01 77,146 60.95 723 3,946 3.12 81,092 64.07 726 3,920 3.10 85,012 67.16 730 3,873 3.06 88,885 70.22 733 3,878 3.06 92,763 73.29 737 3,865 3.05 96,628 76.34
741 3,744 2.96 100,372 79.30 744 3,544 2.80 103,916 82.10 748 3,402 2.69 107,318 84.79 752 3,232 2.55 110,550 87.34 757 2,871 2.27 113,421 89.61 761 2,671 2.11 116,092 91.72 766 2,297 1.81 118,389 93.53 771 2,073 1.64 120,462 95.17 777 1,660 1.31 122,122 96.48 782 1,311 1.04 123,433 97.52 789 1,064 0.84 124,497 98.36 796 794 0.63 125,291 98.99 803 550 0.43 125,841 99.42 812 356 0.28 126,197 99.70 821 193 0.15 126,390 99.86 833 108 0.09 126,498 99.94 847 51 0.04 126,549 99.98 851 23 0.02 126,572 100.00
Table D18. Scaled Score Frequency Distributions Spring 2018 – Grade 7 ELA
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
568 40 0.03 40 0.03 569 76 0.06 116 0.09 581 178 0.14 294 0.24 592 257 0.21 551 0.44 600 331 0.27 882 0.71 608 546 0.44 1,428 1.15 615 624 0.50 2,052 1.65 621 780 0.63 2,832 2.27 627 847 0.68 3,679 2.95 632 1,002 0.80 4,681 3.76 638 1,197 0.96 5,878 4.72 642 1,437 1.15 7,315 5.87 647 1,601 1.29 8,916 7.16 651 1,715 1.38 10,631 8.54 656 1,809 1.45 12,440 9.99 660 2,093 1.68 14,533 11.67 663 2,235 1.79 16,768 13.46 667 2,294 1.84 19,062 15.31 671 2,448 1.97 21,510 17.27 674 2,582 2.07 24,092 19.35 678 2,713 2.18 26,805 21.52 681 2,947 2.37 29,752 23.89 684 2,912 2.34 32,664 26.23 688 3,097 2.49 35,761 28.72 691 3,164 2.54 38,925 31.26 694 3,197 2.57 42,122 33.82 697 3,358 2.70 45,480 36.52 700 3,471 2.79 48,951 39.31 703 3,536 2.84 52,487 42.15 706 3,576 2.87 56,063 45.02 709 3,686 2.96 59,749 47.98 712 3,793 3.05 63,542 51.02 715 3,787 3.04 67,329 54.06 718 3,914 3.14 71,243 57.21 721 3,907 3.14 75,150 60.34 725 3,954 3.18 79,104 63.52 727 3,820 3.07 82,924 66.59 730 3,951 3.17 86,875 69.76 734 3,924 3.15 90,799 72.91 737 3,737 3.00 94,536 75.91
740 3,597 2.89 98,133 78.80 744 3,507 2.82 101,640 81.62 749 3,296 2.65 104,936 84.26 751 3,089 2.48 108,025 86.74 755 2,924 2.35 110,949 89.09 759 2,702 2.17 113,651 91.26 764 2,372 1.90 116,023 93.17 768 1,999 1.61 118,022 94.77 773 1,665 1.34 119,687 96.11 779 1,283 1.03 120,970 97.14 784 1,069 0.86 122,039 98.00 791 832 0.67 122,871 98.66 797 633 0.51 123,504 99.17 805 419 0.34 123,923 99.51 814 270 0.22 124,193 99.73 824 159 0.13 124,352 99.85 833 182 0.15 124,534 100.00
Table D19. Scaled Score Frequency Distributions Spring 2018 – Grade 8 ELA
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
586 23 0.02 23 0.02 593 43 0.03 66 0.05 603 89 0.07 155 0.12 612 183 0.15 338 0.27 618 306 0.24 644 0.51 625 434 0.35 1,078 0.86 630 598 0.48 1,676 1.34 635 770 0.61 2,446 1.95 639 875 0.70 3,321 2.65 643 1,183 0.94 4,504 3.59 647 1,371 1.09 5,875 4.68 651 1,611 1.28 7,486 5.96 654 1,911 1.52 9,397 7.49 657 2,109 1.68 11,506 9.17 661 2,320 1.85 13,826 11.01 664 2,542 2.02 16,368 13.04 666 2,716 2.16 19,084 15.20 669 2,884 2.30 21,968 17.50 672 2,803 2.23 24,771 19.73 675 2,963 2.36 27,734 22.09 677 2,972 2.37 30,706 24.46 680 3,189 2.54 33,895 27.00 682 3,166 2.52 37,061 29.52 685 3,265 2.60 40,326 32.12 687 3,236 2.58 43,562 34.70 690 3,504 2.79 47,066 37.49 692 3,575 2.85 50,641 40.34 695 3,613 2.88 54,254 43.22 697 3,620 2.88 57,874 46.10 700 3,762 3.00 61,636 49.10 702 3,731 2.97 65,367 52.07 704 3,866 3.08 69,233 55.15 707 3,910 3.11 73,143 58.27 709 3,930 3.13 77,073 61.40 712 4,002 3.19 81,075 64.58 714 4,052 3.23 85,127 67.81 717 3,962 3.16 89,089 70.97 720 3,937 3.14 93,026 74.10 722 3,862 3.08 96,888 77.18 725 3,805 3.03 100,693 80.21
728 3,480 2.77 104,173 82.98 731 3,360 2.68 107,533 85.66 734 3,066 2.44 110,599 88.10 737 2,756 2.20 113,355 90.30 741 2,504 1.99 115,859 92.29 744 2,165 1.72 118,024 94.02 748 1,803 1.44 119,827 95.45 752 1,513 1.21 121,340 96.66 756 1,198 0.95 122,538 97.61 760 998 0.80 123,536 98.41 765 679 0.54 124,215 98.95 770 515 0.41 124,730 99.36 776 339 0.27 125,069 99.63 782 219 0.17 125,288 99.80 789 114 0.09 125,402 99.89 797 75 0.06 125,477 99.95 805 57 0.05 125,534 100.00
Table D20. Scaled Score Frequency Distributions Spring 2018 – High School ELA I
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
606 84 0.06 84 0.06 613 155 0.10 239 0.16 621 238 0.16 477 0.32 629 422 0.28 899 0.59 635 585 0.39 1,484 0.98 640 897 0.59 2,381 1.58 645 1,168 0.77 3,549 2.35 649 1,470 0.97 5,019 3.32 653 1,679 1.11 6,698 4.43 656 1,868 1.24 8,566 5.67 660 1,989 1.32 10,555 6.99 663 2,148 1.42 12,703 8.41 666 2,372 1.57 15,075 9.98 669 2,536 1.68 17,611 11.66 672 2,676 1.77 20,287 13.43 675 2,960 1.96 23,247 15.39 677 3,074 2.03 26,321 17.42 680 3,385 2.24 29,706 19.66 683 3,547 2.35 33,253 22.01 685 3,562 2.36 36,815 24.37 687 3,634 2.41 40,449 26.77 689 3,748 2.48 44,197 29.25 692 3,846 2.55 48,043 31.80 694 3,796 2.51 51,839 34.31 696 3,704 2.45 55,543 36.76 698 3,674 2.43 59,217 39.19 701 3,771 2.50 62,988 41.69 703 3,913 2.59 66,901 44.28 705 3,910 2.59 70,811 46.87 707 3,841 2.54 74,652 49.41 710 4,064 2.69 78,716 52.10 712 4,027 2.67 82,743 54.76 714 4,129 2.73 86,872 57.49 716 4,068 2.69 90,940 60.19 718 4,091 2.71 95,031 62.89 721 4,219 2.79 99,250 65.69 723 4,274 2.83 103,524 68.52 726 4,258 2.82 107,782 71.33 728 4,226 2.80 112,008 74.13 730 4,251 2.81 116,259 76.94
733 4,143 2.74 120,402 79.69 736 3,915 2.59 124,317 82.28 739 4,051 2.68 128,368 84.96 741 3,630 2.40 131,998 87.36 744 3,510 2.32 135,508 89.68 748 3,123 2.07 138,631 91.75 751 2,720 1.80 141,351 93.55 755 2,469 1.63 143,820 95.19 758 2,013 1.33 145,833 96.52 763 1,678 1.11 147,511 97.63 767 1,211 0.80 148,722 98.43 772 926 0.61 149,648 99.04 777 650 0.43 150,298 99.47 783 398 0.26 150,696 99.74 790 230 0.15 150,926 99.89 798 111 0.07 151,037 99.96 800 58 0.04 151,095 100.00
Table D21. Scaled Score Frequency Distributions Spring 2018 – High School ELA II
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
597 141 0.10 141 0.10 606 208 0.15 349 0.25 616 335 0.24 684 0.48 624 597 0.42 1,281 0.91 631 922 0.65 2,203 1.56 637 1,282 0.91 3,485 2.47 642 1,579 1.12 5,064 3.58 647 2,006 1.42 7,070 5.00 651 2,188 1.55 9,258 6.55 655 2,396 1.70 11,654 8.25 659 2,682 1.90 14,336 10.15 662 2,607 1.85 16,943 11.99 666 2,800 1.98 19,743 13.98 669 2,747 1.94 22,490 15.92 672 2,939 2.08 25,429 18.00 675 3,056 2.16 28,485 20.16 679 2,933 2.08 31,418 22.24 681 3,104 2.20 34,522 24.44 683 3,267 2.31 37,789 26.75 686 3,165 2.24 40,954 28.99 688 3,341 2.37 44,295 31.36 691 3,540 2.51 47,835 33.86 693 3,476 2.46 51,311 36.32 696 3,750 2.65 55,061 38.98 698 4,018 2.84 59,079 41.82 700 3,894 2.76 62,973 44.58 703 4,013 2.84 66,986 47.42 705 4,117 2.91 71,103 50.33 707 4,246 3.01 75,349 53.34 710 4,282 3.03 79,631 56.37 712 4,234 3.00 83,865 59.37 714 4,326 3.06 88,191 62.43 716 4,305 3.05 92,496 65.48 719 4,385 3.10 96,881 68.58 721 4,234 3.00 101,115 71.58 723 4,234 3.00 105,349 74.58 726 4,084 2.89 109,433 77.47 728 4,008 2.84 113,441 80.30 731 3,664 2.59 117,105 82.90 733 3,571 2.53 120,676 85.43
736 3,191 2.26 123,867 87.69 739 3,076 2.18 126,943 89.86 742 2,747 1.94 129,690 91.81 745 2,321 1.64 132,011 93.45 748 2,075 1.47 134,086 94.92 752 1,723 1.22 135,809 96.14 755 1,451 1.03 137,260 97.17 759 1,121 0.79 138,381 97.96 763 867 0.61 139,248 98.57 768 662 0.47 139,910 99.04 773 504 0.36 140,414 99.40 778 348 0.25 140,762 99.65 783 211 0.15 140,973 99.79 790 148 0.10 141,121 99.90 797 72 0.05 141,193 99.95 805 44 0.03 141,237 99.98 808 26 0.02 141,263 100.00
Table D22. Scaled Score Frequency Distributions Spring 2018 – Grade 3 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
587 235 0.18 235 0.18 589 337 0.26 572 0.45 600 493 0.39 1,065 0.83 601 16 0.01 1,081 0.85 609 651 0.51 1,732 1.36 610 5 0.00 1,737 1.36 617 869 0.68 2,606 2.04 618 18 0.01 2,624 2.06 624 1,072 0.84 3,696 2.90 625 20 0.02 3,716 2.91 630 1,136 0.89 4,852 3.80 631 16 0.01 4,868 3.82 636 1,351 1.06 6,219 4.88 637 18 0.01 6,237 4.89 642 1,500 1.18 7,737 6.07 643 15 0.01 7,752 6.08 647 1,669 1.31 9,421 7.39 648 23 0.02 9,444 7.40 651 1,838 1.44 11,282 8.84 653 15 0.01 11,297 8.86 656 2,114 1.66 13,411 10.51 657 24 0.02 13,435 10.53 660 2,204 1.73 15,639 12.26 662 12 0.01 15,651 12.27 664 2,446 1.92 18,097 14.19 666 13 0.01 18,110 14.20 669 2,508 1.97 20,618 16.16 670 17 0.01 20,635 16.18 673 2,687 2.11 23,322 18.28 674 10 0.01 23,332 18.29 677 2,751 2.16 26,083 20.45 678 19 0.01 26,102 20.46 680 2,979 2.34 29,081 22.80 683 17 0.01 29,098 22.81 684 3,007 2.36 32,105 25.17 686 27 0.02 32,132 25.19 688 3,164 2.48 35,296 27.67 690 25 0.02 35,321 27.69 692 3,315 2.60 38,636 30.29 694 19 0.01 38,655 30.30 696 3,379 2.65 42,034 32.95 698 15 0.01 42,049 32.96 700 3,347 2.62 45,396 35.59
702 19 0.01 45,415 35.60 703 3,518 2.76 48,933 38.36 706 15 0.01 48,948 38.37 707 3,735 2.93 52,683 41.30 710 18 0.01 52,701 41.32 711 3,718 2.91 56,419 44.23 714 29 0.02 56,448 44.25 715 3,804 2.98 60,252 47.23 717 18 0.01 60,270 47.25 719 3,955 3.10 64,225 50.35 721 20 0.02 64,245 50.37 723 3,845 3.01 68,090 53.38 725 15 0.01 68,105 53.39 727 4,034 3.16 72,139 56.55 729 12 0.01 72,151 56.56 731 4,107 3.22 76,258 59.78 733 21 0.02 76,279 59.80 735 4,027 3.16 80,306 62.96 738 19 0.01 80,325 62.97 739 4,140 3.25 84,465 66.22 742 4 0.00 84,469 66.22 744 4,152 3.25 88,621 69.48 746 11 0.01 88,632 69.48 748 4,180 3.28 92,812 72.76 750 16 0.01 92,828 72.77 753 4,092 3.21 96,920 75.98 755 15 0.01 96,935 75.99 758 4,084 3.20 101,019 79.19 760 9 0.01 101,028 79.20 763 3,907 3.06 104,935 82.26 764 9 0.01 104,944 82.27 768 3,807 2.98 108,751 85.26 769 10 0.01 108,761 85.26 774 3,631 2.85 112,392 88.11 775 5 0.00 112,397 88.11 780 3,191 2.50 115,588 90.62 781 8 0.01 115,596 90.62 787 2,968 2.33 118,564 92.95 794 6 0.00 118,570 92.95 795 2,546 2.00 121,116 94.95 803 4 0.00 121,120 94.95 804 2,257 1.77 123,377 96.72 814 1 0.00 123,378 96.72 815 1,804 1.41 125,182 98.14
818 2,376 1.86 127,558 100.00
Table D23. Scaled Score Frequency Distributions Spring 2018 – Grade 4 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
605 336 0.27 336 0.27 606 649 0.52 985 0.78 620 1,121 0.89 2,106 1.67 631 1,499 1.19 3,605 2.86 640 1,747 1.39 5,352 4.25 647 1,841 1.46 7,193 5.71 653 2,088 1.66 9,281 7.37 659 2,229 1.77 11,510 9.13 664 2,287 1.81 13,797 10.95 669 2,400 1.90 16,197 12.85 674 2,414 1.92 18,611 14.77 678 2,575 2.04 21,186 16.81 682 2,656 2.11 23,842 18.92 686 2,574 2.04 26,416 20.96 689 2,724 2.16 29,140 23.12 693 2,770 2.20 31,910 25.32 696 2,680 2.13 34,590 27.45 700 2,858 2.27 37,448 29.72 703 2,748 2.18 40,196 31.90 706 2,818 2.24 43,014 34.13 710 2,805 2.23 45,819 36.36 713 2,805 2.23 48,624 38.59 716 2,900 2.30 51,524 40.89 719 2,951 2.34 54,475 43.23 722 3,015 2.39 57,490 45.62 725 2,975 2.36 60,465 47.98 728 3,120 2.48 63,585 50.46 732 3,163 2.51 66,748 52.97 735 3,144 2.49 69,892 55.46 738 3,198 2.54 73,090 58.00 741 3,280 2.60 76,370 60.60 745 3,182 2.53 79,552 63.13 748 3,360 2.67 82,912 65.80 752 3,382 2.68 86,294 68.48 756 3,445 2.73 89,739 71.21 760 3,375 2.68 93,114 73.89 764 3,410 2.71 96,524 76.60 768 3,400 2.70 99,924 79.30 773 3,431 2.72 103,355 82.02 778 3,266 2.59 106,621 84.61 783 3,296 2.62 109,917 87.23 789 3,173 2.52 113,090 89.74 796 3,002 2.38 116,092 92.13
803 2,776 2.20 118,868 94.33 812 2,412 1.91 121,280 96.24 823 2,015 1.60 123,295 97.84 835 2,719 2.16 126,014 100.00
Table D24. Scaled Score Frequency Distributions Spring 2018 – Grade 5 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
624 1,562 1.23 1,562 1.23 631 1,634 1.29 3,196 2.52 639 1,934 1.53 5,130 4.05 646 2,334 1.84 7,464 5.89 652 2,639 2.08 10,103 7.97 658 2,765 2.18 12,868 10.15 662 2,899 2.29 15,767 12.44 667 2,957 2.33 18,724 14.78 671 3,109 2.45 21,833 17.23 674 3,105 2.45 24,938 19.68 678 3,101 2.45 28,039 22.13 681 3,057 2.41 31,096 24.54 685 3,150 2.49 34,246 27.02 688 3,249 2.56 37,495 29.59 691 3,256 2.57 40,751 32.16 694 3,277 2.59 44,028 34.74 696 3,287 2.59 47,315 37.34 700 3,406 2.69 50,721 40.03 702 3,367 2.66 54,088 42.68 705 3,370 2.66 57,458 45.34 707 3,312 2.61 60,770 47.96 710 3,456 2.73 64,226 50.68 713 3,361 2.65 67,587 53.34 715 3,252 2.57 70,839 55.90 718 3,249 2.56 74,088 58.47 720 3,264 2.58 77,352 61.04 723 3,240 2.56 80,592 63.60 726 3,139 2.48 83,731 66.08 728 3,090 2.44 86,821 68.51 731 2,980 2.35 89,801 70.87 734 2,826 2.23 92,627 73.10 737 2,806 2.21 95,433 75.31 739 2,725 2.15 98,158 77.46 742 2,615 2.06 100,773 79.52 745 2,506 1.98 103,279 81.50 749 2,460 1.94 105,739 83.44 752 2,406 1.90 108,145 85.34 755 2,317 1.83 110,462 87.17 759 2,275 1.80 112,737 88.97 762 2,116 1.67 114,853 90.64 767 2,052 1.62 116,905 92.25 771 1,926 1.52 118,831 93.77 776 1,727 1.36 120,558 95.14
782 1,684 1.33 122,242 96.47 788 1,463 1.15 123,705 97.62 797 1,228 0.97 124,933 98.59 804 1,787 1.41 126,720 100.00
Table D25. Scaled Score Frequency Distributions Spring 2018 – Grade 6 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
616 264 0.21 264 0.21 619 470 0.38 734 0.59 628 805 0.64 1,539 1.23 635 1,290 1.03 2,829 2.26 640 1,605 1.28 4,434 3.55 645 1,886 1.51 6,320 5.06 650 2,014 1.61 8,334 6.67 654 2,111 1.69 10,445 8.36 658 2,220 1.78 12,665 10.13 661 2,336 1.87 15,001 12.00 665 2,473 1.98 17,474 13.98 668 2,472 1.98 19,946 15.96 671 2,594 2.08 22,540 18.04 674 2,647 2.12 25,187 20.15 677 2,767 2.21 27,954 22.37 680 2,718 2.17 30,672 24.54 682 2,684 2.15 33,356 26.69 685 2,795 2.24 36,151 28.93 688 2,877 2.30 39,028 31.23 690 2,857 2.29 41,885 33.52 693 3,007 2.41 44,892 35.92 696 3,001 2.40 47,893 38.32 698 3,079 2.46 50,972 40.79 701 3,104 2.48 54,076 43.27 703 3,284 2.63 57,360 45.90 706 3,228 2.58 60,588 48.48 708 3,255 2.60 63,843 51.09 711 3,292 2.63 67,135 53.72 713 3,356 2.69 70,491 56.41 716 3,330 2.66 73,821 59.07 719 3,230 2.58 77,051 61.66 721 3,146 2.52 80,197 64.17 725 3,205 2.56 83,402 66.74 727 3,242 2.59 86,644 69.33 729 3,123 2.50 89,767 71.83 732 3,089 2.47 92,856 74.30 735 3,035 2.43 95,891 76.73 738 2,918 2.33 98,809 79.07 742 2,793 2.23 101,602 81.30 745 2,755 2.20 104,357 83.50 748 2,703 2.16 107,060 85.67 752 2,548 2.04 109,608 87.71 756 2,405 1.92 112,013 89.63
760 2,312 1.85 114,325 91.48 765 2,161 1.73 116,486 93.21 770 2,087 1.67 118,573 94.88 776 1,807 1.45 120,380 96.33 783 1,550 1.24 121,930 97.57 790 3,041 2.43 124,971 100.00
Table D26. Scaled Score Frequency Distributions Spring 2018 – Grade 7 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
605 168 0.14 168 0.14 606 477 0.40 645 0.54 619 977 0.81 1,622 1.35 629 1,476 1.23 3,098 2.58 636 2,050 1.71 5,148 4.29 643 2,411 2.01 7,559 6.30 648 2,583 2.15 10,142 8.46 653 2,712 2.26 12,854 10.72 658 2,715 2.26 15,569 12.98 662 2,786 2.32 18,355 15.31 666 2,751 2.29 21,106 17.60 670 2,725 2.27 23,831 19.87 673 2,751 2.29 26,582 22.17 677 2,717 2.27 29,299 24.44 680 2,834 2.36 32,133 26.80 684 2,851 2.38 34,984 29.18 686 2,835 2.36 37,819 31.54 689 2,753 2.30 40,572 33.84 692 2,818 2.35 43,390 36.19 695 2,885 2.41 46,275 38.59 697 2,771 2.31 49,046 40.90 700 2,831 2.36 51,877 43.27 703 2,887 2.41 54,764 45.67 706 2,838 2.37 57,602 48.04 708 2,874 2.40 60,476 50.44 711 2,979 2.48 63,455 52.92 713 2,864 2.39 66,319 55.31 716 2,869 2.39 69,188 57.70 719 2,971 2.48 72,159 60.18 721 2,885 2.41 75,044 62.59 725 2,946 2.46 77,990 65.04 727 2,886 2.41 80,876 67.45 730 2,932 2.45 83,808 69.90 732 2,884 2.41 86,692 72.30 735 2,741 2.29 89,433 74.59 738 2,809 2.34 92,242 76.93 741 2,638 2.20 94,880 79.13 744 2,648 2.21 97,528 81.34 747 2,505 2.09 100,033 83.43 751 2,458 2.05 102,491 85.48 755 2,477 2.07 104,968 87.54 758 2,354 1.96 107,322 89.51 762 2,204 1.84 109,526 91.34
767 2,085 1.74 111,611 93.08 771 1,826 1.52 113,437 94.61 777 1,709 1.43 115,146 96.03 784 1,471 1.23 116,617 97.26 791 1,237 1.03 117,854 98.29 801 964 0.80 118,818 99.09 806 1,087 0.91 119,905 100.00
Table D27. Scaled Score Frequency Distributions Spring 2018 – Grade 8 Math
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
633 548 0.56 548 0.56 639 637 0.65 1,185 1.21 644 995 1.02 2,180 2.23 649 1,288 1.32 3,468 3.55 654 1,567 1.60 5,035 5.15 658 1,751 1.79 6,786 6.95 661 1,957 2.00 8,743 8.95 665 2,081 2.13 10,824 11.08 668 2,136 2.19 12,960 13.27 671 2,266 2.32 15,226 15.59 674 2,326 2.38 17,552 17.97 676 2,438 2.50 19,990 20.46 679 2,479 2.54 22,469 23.00 682 2,553 2.61 25,022 25.62 684 2,720 2.78 27,742 28.40 687 2,744 2.81 30,486 31.21 690 2,805 2.87 33,291 34.08 692 2,922 2.99 36,213 37.07 694 2,941 3.01 39,154 40.08 696 3,002 3.07 42,156 43.16 699 3,020 3.09 45,176 46.25 701 2,981 3.05 48,157 49.30 703 3,089 3.16 51,246 52.46 705 2,992 3.06 54,238 55.53 708 3,041 3.11 57,279 58.64 710 2,960 3.03 60,239 61.67 712 2,990 3.06 63,229 64.73 714 2,962 3.03 66,191 67.76 717 2,925 2.99 69,116 70.76 719 2,766 2.83 71,882 73.59 721 2,637 2.70 74,519 76.29 723 2,578 2.64 77,097 78.93 726 2,495 2.55 79,592 81.48 728 2,191 2.24 81,783 83.72 730 2,115 2.17 83,898 85.89 733 1,944 1.99 85,842 87.88 735 1,817 1.86 87,659 89.74 738 1,648 1.69 89,307 91.43 741 1,416 1.45 90,723 92.88 744 1,279 1.31 92,002 94.19 747 1,145 1.17 93,147 95.36 750 1,031 1.06 94,178 96.41 753 816 0.84 94,994 97.25
757 700 0.72 95,694 97.97 761 584 0.60 96,278 98.56 766 462 0.47 96,740 99.04 771 377 0.39 97,117 99.42 774 564 0.58 97,681 100.00
Table D28. Scaled Score Frequency Distributions Spring 2018 – Algebra
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
618 435 0.30 435 0.30 629 780 0.54 1,215 0.83 638 1,504 1.03 2,719 1.87 645 2,481 1.70 5,200 3.57 652 3,558 2.44 8,758 6.02 657 4,321 2.97 13,079 8.99 662 5,011 3.44 18,090 12.43 666 5,299 3.64 23,389 16.07 670 5,260 3.61 28,649 19.68 674 5,198 3.57 33,847 23.25 677 4,914 3.38 38,761 26.63 682 4,747 3.26 43,508 29.89 684 4,630 3.18 48,138 33.07 687 4,447 3.06 52,585 36.13 690 4,553 3.13 57,138 39.26 693 4,267 2.93 61,405 42.19 695 4,201 2.89 65,606 45.07 698 3,965 2.72 69,571 47.80 701 4,025 2.77 73,596 50.56 703 4,064 2.79 77,660 53.35 706 3,859 2.65 81,519 56.01 708 3,965 2.72 85,484 58.73 711 3,682 2.53 89,166 61.26 713 3,718 2.55 92,884 63.81 716 3,551 2.44 96,435 66.25 718 3,546 2.44 99,981 68.69 721 3,449 2.37 103,430 71.06 723 3,353 2.30 106,783 73.36 726 3,170 2.18 109,953 75.54 728 3,243 2.23 113,196 77.77 731 2,989 2.05 116,185 79.82 733 2,831 1.94 119,016 81.77 736 2,773 1.91 121,789 83.67 739 2,569 1.76 124,358 85.44 741 2,370 1.63 126,728 87.07 744 2,388 1.64 129,116 88.71 747 2,111 1.45 131,227 90.16 750 2,029 1.39 133,256 91.55 754 1,822 1.25 135,078 92.80 757 1,663 1.14 136,741 93.95 760 1,553 1.07 138,294 95.01 764 1,329 0.91 139,623 95.93 768 1,163 0.80 140,786 96.72
772 1,077 0.74 141,863 97.46 777 917 0.63 142,780 98.09 782 753 0.52 143,533 98.61 788 626 0.43 144,159 99.04 795 552 0.38 144,711 99.42 803 354 0.24 145,065 99.66 813 260 0.18 145,325 99.84 814 229 0.16 145,554 100.00
Table D29. Scaled Score Frequency Distributions Spring 2018 – Geometry
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
604 628 0.49 628 0.49 609 1,465 1.15 2,093 1.64 622 2,929 2.29 5,022 3.93 632 4,164 3.26 9,186 7.19 640 5,069 3.97 14,255 11.16 647 5,505 4.31 19,760 15.46 653 5,654 4.43 25,414 19.89 658 5,505 4.31 30,919 24.20 663 5,351 4.19 36,270 28.39 668 4,935 3.86 41,205 32.25 672 4,738 3.71 45,943 35.96 676 4,404 3.45 50,347 39.40 679 4,092 3.20 54,439 42.61 683 3,826 2.99 58,265 45.60 686 3,704 2.90 61,969 48.50 689 3,585 2.81 65,554 51.30 693 3,317 2.60 68,871 53.90 696 3,154 2.47 72,025 56.37 700 3,068 2.40 75,093 58.77 702 2,947 2.31 78,040 61.08 705 2,922 2.29 80,962 63.36 707 2,766 2.16 83,728 65.53 710 2,658 2.08 86,386 67.61 713 2,631 2.06 89,017 69.67 716 2,453 1.92 91,470 71.59 719 2,459 1.92 93,929 73.51 721 2,223 1.74 96,152 75.25 725 2,280 1.78 98,432 77.04 727 2,142 1.68 100,574 78.71 730 2,077 1.63 102,651 80.34 732 2,047 1.60 104,698 81.94 735 2,017 1.58 106,715 83.52 738 1,858 1.45 108,573 84.97 741 1,825 1.43 110,398 86.40 744 1,713 1.34 112,111 87.74 747 1,507 1.18 113,618 88.92 750 1,558 1.22 115,176 90.14 753 1,524 1.19 116,700 91.33 756 1,421 1.11 118,121 92.45 759 1,334 1.04 119,455 93.49 763 1,197 0.94 120,652 94.43 766 1,103 0.86 121,755 95.29 770 1,063 0.83 122,818 96.12
774 930 0.73 123,748 96.85 778 807 0.63 124,555 97.48 783 739 0.58 125,294 98.06 788 612 0.48 125,906 98.54 793 548 0.43 126,454 98.97 799 452 0.35 126,906 99.32 806 291 0.23 127,197 99.55 810 577 0.45 127,774 100.00
Table D30. Scaled Score Frequency Distributions Spring 2018 – Integrated Math I
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
618 124 0.99 124 0.99 627 163 1.31 287 2.30 636 308 2.47 595 4.77 644 469 3.76 1,064 8.53 650 587 4.71 1,651 13.24 655 632 5.07 2,283 18.31 660 639 5.12 2,922 23.43 664 625 5.01 3,547 28.44 669 592 4.75 4,139 33.19 672 484 3.88 4,623 37.07 676 469 3.76 5,092 40.83 679 449 3.60 5,541 44.43 682 396 3.18 5,937 47.61 685 387 3.10 6,324 50.71 688 344 2.76 6,668 53.47 691 297 2.38 6,965 55.85 694 338 2.71 7,303 58.56 696 284 2.28 7,587 60.84 700 295 2.37 7,882 63.21 702 276 2.21 8,158 65.42 704 268 2.15 8,426 67.57 707 217 1.74 8,643 69.31 709 247 1.98 8,890 71.29 711 232 1.86 9,122 73.15 714 237 1.90 9,359 75.05 716 194 1.56 9,553 76.61 719 204 1.64 9,757 78.24 721 184 1.48 9,941 79.72 724 216 1.73 10,157 81.45 726 218 1.75 10,375 83.20 729 165 1.32 10,540 84.52 731 173 1.39 10,713 85.91 734 168 1.35 10,881 87.26 737 161 1.29 11,042 88.55 739 161 1.29 11,203 89.84 742 163 1.31 11,366 91.15 745 139 1.11 11,505 92.26 748 137 1.10 11,642 93.36 751 118 0.95 11,760 94.31 754 126 1.01 11,886 95.32 758 103 0.83 11,989 96.14 761 123 0.99 12,112 97.13 765 78 0.63 12,190 97.75
769 62 0.50 12,252 98.25 774 59 0.47 12,311 98.72 779 45 0.36 12,356 99.09 784 43 0.34 12,399 99.43 791 31 0.25 12,430 99.68 798 15 0.12 12,445 99.80 808 15 0.12 12,460 99.92 814 10 0.08 12,470 100.00
Table D31. Scaled Score Frequency Distributions Spring 2018 – Integrated Math II
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
594 17 0.16 17 0.16 600 50 0.47 67 0.63 614 113 1.06 180 1.68 624 191 1.78 371 3.46 632 313 2.92 684 6.39 640 420 3.92 1,104 10.31 646 552 5.15 1,656 15.46 651 617 5.76 2,273 21.23 657 618 5.77 2,891 27.00 661 612 5.71 3,503 32.71 666 563 5.26 4,066 37.97 670 536 5.01 4,602 42.97 674 462 4.31 5,064 47.29 678 457 4.27 5,521 51.55 681 411 3.84 5,932 55.39 685 372 3.47 6,304 58.87 688 330 3.08 6,634 61.95 691 321 3.00 6,955 64.95 695 301 2.81 7,256 67.76 698 253 2.36 7,509 70.12 701 231 2.16 7,740 72.28 704 207 1.93 7,947 74.21 707 183 1.71 8,130 75.92 710 190 1.77 8,320 77.69 713 179 1.67 8,499 79.36 716 197 1.84 8,696 81.20 719 164 1.53 8,860 82.73 722 142 1.33 9,002 84.06 725 129 1.20 9,131 85.26 728 136 1.27 9,267 86.53 731 148 1.38 9,415 87.92 734 108 1.01 9,523 88.93 738 147 1.37 9,670 90.30 741 102 0.95 9,772 91.25 744 112 1.05 9,884 92.30 747 112 1.05 9,996 93.34 751 74 0.69 10,070 94.03 754 81 0.76 10,151 94.79 758 78 0.73 10,229 95.52 762 73 0.68 10,302 96.20 765 67 0.63 10,369 96.83 769 70 0.65 10,439 97.48 774 57 0.53 10,496 98.01
778 36 0.34 10,532 98.35 783 44 0.41 10,576 98.76 788 38 0.35 10,614 99.11 793 31 0.29 10,645 99.40 799 20 0.19 10,665 99.59 806 14 0.13 10,679 99.72 813 30 0.28 10,709 100.00
Table D32. Scaled Score Frequency Distributions Spring 2018 – Grade 5 Science
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
559 12 0.01 12 0.01 569 27 0.02 39 0.03 582 67 0.05 106 0.08 593 162 0.13 268 0.21 602 297 0.23 565 0.44 610 449 0.35 1,014 0.79 617 733 0.57 1,747 1.36 623 945 0.74 2,692 2.10 629 1,114 0.87 3,806 2.97 635 1,362 1.06 5,168 4.04 640 1,571 1.23 6,739 5.26 645 1,707 1.33 8,446 6.60 649 1,883 1.47 10,329 8.07 650 15 0.01 10,344 8.08 654 1,960 1.53 12,304 9.61 658 2,029 1.59 14,333 11.20 659 16 0.01 14,349 11.21 664 2,088 1.63 16,437 12.84 667 2,293 1.79 18,730 14.63 671 2,327 1.82 21,057 16.45 675 2,416 1.89 23,473 18.34 678 2,515 1.96 25,988 20.30 679 18 0.01 26,006 20.32 682 2,610 2.04 28,616 22.36 683 17 0.01 28,633 22.37 686 2,748 2.15 31,381 24.52 687 13 0.01 31,394 24.53 690 2,856 2.23 34,250 26.76 691 19 0.01 34,269 26.77 693 2,916 2.28 37,185 29.05 694 24 0.02 37,209 29.07 697 3,181 2.49 40,390 31.55 698 14 0.01 40,404 31.56 701 3,216 2.51 43,620 34.08 702 18 0.01 43,638 34.09 704 3,324 2.60 46,962 36.69 705 13 0.01 46,975 36.70 708 3,540 2.77 50,515 39.46 709 24 0.02 50,539 39.48 712 3,583 2.80 54,122 42.28 713 16 0.01 54,138 42.29 715 3,823 2.99 57,961 45.28 716 18 0.01 57,979 45.29
719 3,920 3.06 61,899 48.36 720 15 0.01 61,914 48.37 723 4,112 3.21 66,026 51.58 725 13 0.01 66,039 51.59 726 4,230 3.30 70,269 54.90 728 12 0.01 70,281 54.91 730 4,334 3.39 74,615 58.29 731 6 0.00 74,621 58.30 734 4,379 3.42 79,000 61.72 735 12 0.01 79,012 61.73 738 4,402 3.44 83,414 65.17 739 11 0.01 83,425 65.17 742 4,427 3.46 87,852 68.63 743 14 0.01 87,866 68.64 746 4,478 3.50 92,344 72.14 748 13 0.01 92,357 72.15 751 4,251 3.32 96,608 75.47 753 8 0.01 96,616 75.48 755 4,170 3.26 100,786 78.74 756 10 0.01 100,796 78.74 760 4,136 3.23 104,932 81.98 761 9 0.01 104,941 81.98 765 3,821 2.99 108,762 84.97 766 6 0.00 108,768 84.97 770 3,584 2.80 112,352 87.77 771 4 0.00 112,356 87.78 776 3,256 2.54 115,612 90.32 777 3 0.00 115,615 90.32 782 2,902 2.27 118,517 92.59 783 3 0.00 118,520 92.59 789 2,490 1.95 121,010 94.54 790 5 0.00 121,015 94.54 797 2,117 1.65 123,132 96.19 805 1,666 1.30 124,798 97.50 806 1 0.00 124,799 97.50 815 1,263 0.99 126,062 98.48 827 883 0.69 126,945 99.17 841 598 0.47 127,543 99.64 845 461 0.36 128,004 100.00
Table D33. Scaled Score Frequency Distributions Spring 2018 – Grade 8 Science
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
575 17 0.01 17 0.01 579 55 0.04 72 0.06 593 109 0.09 181 0.14 598 3 0.00 184 0.15 604 246 0.19 430 0.34 610 1 0.00 431 0.34 613 456 0.36 887 0.70 619 1 0.00 888 0.70 622 694 0.55 1,582 1.25 627 4 0.00 1,586 1.25 629 1,082 0.86 2,668 2.11 634 8 0.01 2,676 2.12 636 1,451 1.15 4,127 3.26 641 12 0.01 4,139 3.27 642 1,773 1.40 5,912 4.68 647 10 0.01 5,922 4.68 648 2,064 1.63 7,986 6.32 653 2,487 1.97 10,473 8.28 658 18 0.01 10,491 8.30 659 2,633 2.08 13,124 10.38 663 16 0.01 13,140 10.39 664 2,839 2.25 15,979 12.64 668 19 0.02 15,998 12.65 669 3,097 2.45 19,095 15.10 674 3,361 2.66 22,456 17.76 677 27 0.02 22,483 17.78 678 3,421 2.71 25,904 20.49 681 14 0.01 25,918 20.50 682 3,540 2.80 29,458 23.30 686 3,720 2.94 33,178 26.24 690 23 0.02 33,201 26.26 691 3,891 3.08 37,092 29.33 694 16 0.01 37,108 29.35 695 4,018 3.18 41,126 32.53 697 14 0.01 41,140 32.54 700 4,079 3.23 45,219 35.76 701 14 0.01 45,233 35.77 703 4,066 3.22 49,299 38.99 705 16 0.01 49,315 39.00 707 4,173 3.30 53,488 42.30 708 17 0.01 53,505 42.32 710 4,337 3.43 57,842 45.75 712 7 0.01 57,849 45.75
714 4,287 3.39 62,136 49.14 716 8 0.01 62,144 49.15 718 4,245 3.36 66,389 52.50 719 8 0.01 66,397 52.51 722 4,252 3.36 70,649 55.87 723 8 0.01 70,657 55.88 726 4,291 3.39 74,948 59.27 730 4,238 3.35 79,186 62.63 733 6 0.00 79,192 62.63 734 4,044 3.20 83,236 65.83 737 4 0.00 83,240 65.83 738 3,863 3.06 87,103 68.89 741 9 0.01 87,112 68.89 742 3,851 3.05 90,963 71.94 744 6 0.00 90,969 71.94 746 3,645 2.88 94,614 74.83 748 4 0.00 94,618 74.83 750 3,509 2.78 98,127 77.61 752 4 0.00 98,131 77.61 754 3,387 2.68 101,518 80.29 756 6 0.00 101,524 80.29 758 3,205 2.53 104,729 82.83 760 9 0.01 104,738 82.83 763 2,916 2.31 107,654 85.14 766 8 0.01 107,662 85.15 768 2,691 2.13 110,353 87.27 769 6 0.00 110,359 87.28 772 2,604 2.06 112,963 89.34 773 2 0.00 112,965 89.34 777 2,325 1.84 115,290 91.18 778 2 0.00 115,292 91.18 782 2,077 1.64 117,369 92.82 783 1 0.00 117,370 92.82 788 1,885 1.49 119,255 94.31 789 1 0.00 119,256 94.32 793 1,593 1.26 120,849 95.58 794 2 0.00 120,851 95.58 799 1,328 1.05 122,179 96.63 800 1 0.00 122,180 96.63 806 1,174 0.93 123,354 97.56 813 950 0.75 124,304 98.31 820 747 0.59 125,051 98.90 822 1 0.00 125,052 98.90 828 580 0.46 125,632 99.36
832 3 0.00 125,635 99.36 838 340 0.27 125,975 99.63 849 247 0.20 126,222 99.82 863 140 0.11 126,362 99.94 868 82 0.06 126,444 100.00
Table D34. Scaled Score Frequency Distributions Spring 2018 – Biology
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
617 19 0.01 19 0.01 619 29 0.02 48 0.04 631 89 0.07 137 0.10 637 1 0.00 138 0.10 641 176 0.13 314 0.23 647 2 0.00 316 0.23 648 390 0.29 706 0.52 654 4 0.00 710 0.52 655 679 0.50 1,389 1.02 660 1,116 0.82 2,505 1.84 665 1,716 1.26 4,221 3.10 669 2,379 1.75 6,600 4.85 670 5 0.00 6,605 4.85 673 2,907 2.14 9,512 6.99 674 20 0.01 9,532 7.00 677 3,413 2.51 12,945 9.51 678 21 0.02 12,966 9.53 680 3,871 2.84 16,837 12.37 681 30 0.02 16,867 12.39 683 4,047 2.97 20,914 15.37 685 27 0.02 20,941 15.39 686 4,158 3.06 25,099 18.44 688 26 0.02 25,125 18.46 689 4,293 3.15 29,418 21.62 690 32 0.02 29,450 21.64 692 3,969 2.92 33,419 24.56 693 21 0.02 33,440 24.57 694 4,124 3.03 37,564 27.60 696 26 0.02 37,590 27.62 697 3,991 2.93 41,581 30.55 698 10 0.01 41,591 30.56 700 3,865 2.84 45,456 33.40 701 13 0.01 45,469 33.41 702 3,860 2.84 49,329 36.25 703 14 0.01 49,343 36.26 704 3,807 2.80 53,150 39.06 705 8 0.01 53,158 39.06 706 3,766 2.77 56,924 41.83 708 11 0.01 56,935 41.84 709 3,763 2.77 60,698 44.60 710 11 0.01 60,709 44.61 711 3,749 2.75 64,458 47.37 712 12 0.01 64,470 47.37
713 3,859 2.84 68,329 50.21 714 4 0.00 68,333 50.21 715 3,736 2.75 72,069 52.96 716 7 0.01 72,076 52.96 718 3,735 2.74 75,811 55.71 720 3,573 2.63 79,384 58.33 722 3,630 2.67 83,014 61.00 724 3,708 2.72 86,722 63.73 725 7 0.01 86,729 63.73 726 3,566 2.62 90,295 66.35 727 1 0.00 90,296 66.35 728 3,467 2.55 93,763 68.90 729 5 0.00 93,768 68.90 731 3,417 2.51 97,185 71.41 733 3,315 2.44 100,500 73.85 735 3,213 2.36 103,713 76.21 737 1 0.00 103,714 76.21 738 3,137 2.31 106,851 78.52 740 2,990 2.20 109,841 80.71 742 2,961 2.18 112,802 82.89 744 3 0.00 112,805 82.89 745 2,754 2.02 115,559 84.92 747 1 0.00 115,560 84.92 748 2,566 1.89 118,126 86.80 749 2 0.00 118,128 86.80 750 2,523 1.85 120,651 88.66 752 1 0.00 120,652 88.66 753 2,365 1.74 123,017 90.40 755 3 0.00 123,020 90.40 756 2,091 1.54 125,111 91.93 759 2,058 1.51 127,169 93.45 763 1,821 1.34 128,990 94.78 764 2 0.00 128,992 94.79 766 1,619 1.19 130,611 95.98 770 1,408 1.03 132,019 97.01 775 1,234 0.91 133,253 97.92 780 952 0.70 134,205 98.62 785 765 0.56 134,970 99.18 792 505 0.37 135,475 99.55 794 1 0.00 135,476 99.55 800 317 0.23 135,793 99.78 811 192 0.14 135,985 99.93 823 102 0.07 136,087 100.00
Table D35. Scaled Score Frequency Distributions Spring 2018 – Physical Science
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
634 4 0.78 4 0.78 635 12 2.33 16 3.10 646 15 2.91 31 6.01 654 23 4.46 54 10.47 661 31 6.01 85 16.47 667 44 8.53 129 25.00 671 58 11.24 187 36.24 676 49 9.50 236 45.74 680 46 8.91 282 54.65 684 51 9.88 333 64.53 685 1 0.19 334 64.73 687 41 7.95 375 72.67 688 2 0.39 377 73.06 690 29 5.62 406 78.68 691 1 0.19 407 78.88 693 24 4.65 431 83.53 696 23 4.46 454 87.98 698 13 2.52 467 90.50 701 5 0.97 472 91.47 702 1 0.19 473 91.67 703 5 0.97 478 92.64 706 5 0.97 483 93.60 708 5 0.97 488 94.57 710 6 1.16 494 95.74 712 1 0.19 495 95.93 714 3 0.58 498 96.51 717 7 1.36 505 97.87 719 3 0.58 508 98.45 723 2 0.39 510 98.84 729 1 0.19 511 99.03 731 2 0.39 513 99.42 743 1 0.19 514 99.61 745 1 0.19 515 99.81 754 1 0.19 516 100.00
Table D36. Scaled Score Frequency Distributions Spring 2018 – American Government
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
642 5 0.01 5 0.01 644 14 0.02 19 0.02 652 23 0.03 42 0.05 658 45 0.05 87 0.10 663 109 0.12 196 0.22 667 158 0.18 354 0.40 670 5 0.01 359 0.41 671 256 0.29 615 0.70 673 4 0.00 619 0.71 674 340 0.39 959 1.10 676 5 0.01 964 1.10 677 469 0.54 1,433 1.64 678 5 0.01 1,438 1.64 679 610 0.70 2,048 2.34 681 743 0.85 2,791 3.19 683 881 1.01 3,672 4.20 685 1,126 1.29 4,798 5.48 687 1,258 1.44 6,056 6.92 689 1,324 1.51 7,380 8.44 690 11 0.01 7,391 8.45 691 1,421 1.62 8,812 10.07 692 1,488 1.70 10,300 11.77 694 1,616 1.85 11,916 13.62 695 1,702 1.95 13,618 15.57 697 1,889 2.16 15,507 17.73 698 1,860 2.13 17,367 19.85 699 5 0.01 17,372 19.86 700 1,989 2.27 19,361 22.13 701 2,039 2.33 21,400 24.46 702 2,249 2.57 23,649 27.03 703 2,218 2.54 25,867 29.57 704 4 0.00 25,871 29.57 705 2,218 2.54 28,089 32.11 706 2,397 2.74 30,486 34.85 707 2,430 2.78 32,916 37.63 708 2,491 2.85 35,407 40.48 709 2,512 2.87 37,919 43.35 710 2,558 2.92 40,477 46.27 712 2,515 2.88 42,992 49.15 713 2,465 2.82 45,457 51.96 714 2,523 2.88 47,980 54.85 715 2,470 2.82 50,450 57.67 716 2,442 2.79 52,892 60.46
717 2,367 2.71 55,259 63.17 718 2,373 2.71 57,632 65.88 719 2,281 2.61 59,913 68.49 720 1 0.00 59,914 68.49 721 2,268 2.59 62,182 71.08 722 2,157 2.47 64,339 73.55 723 2,154 2.46 66,493 76.01 724 2,048 2.34 68,541 78.35 725 1,893 2.16 70,434 80.52 727 1,832 2.09 72,266 82.61 728 1,780 2.03 74,046 84.65 730 1,672 1.91 75,718 86.56 731 1,581 1.81 77,299 88.36 733 1,469 1.68 78,768 90.04 734 1,299 1.48 80,067 91.53 736 1,237 1.41 81,304 92.94 737 1 0.00 81,305 92.94 738 1,147 1.31 82,452 94.25 739 1 0.00 82,453 94.26 740 960 1.10 83,413 95.35 742 903 1.03 84,316 96.39 745 784 0.90 85,100 97.28 747 627 0.72 85,727 98.00 751 583 0.67 86,310 98.66 754 450 0.51 86,760 99.18 759 309 0.35 87,069 99.53 765 226 0.26 87,295 99.79 773 106 0.12 87,401 99.91 774 77 0.09 87,478 100.00
Table D37. Scaled Score Frequency Distributions Spring 2018 – American History
Scaled Score Frequency Percent Cumulative Frequency Cumulative Percent
619 24 0.02 24 0.02 622 1 0.00 25 0.02 630 19 0.01 44 0.03 639 40 0.03 84 0.07 641 1 0.00 85 0.07 645 75 0.06 160 0.13 651 123 0.10 283 0.22 653 1 0.00 284 0.22 655 208 0.16 492 0.39 658 2 0.00 494 0.39 659 360 0.28 854 0.67 662 6 0.00 860 0.68 663 586 0.46 1,446 1.14 665 13 0.01 1,459 1.15 667 911 0.72 2,370 1.87 669 12 0.01 2,382 1.88 670 1,232 0.97 3,614 2.85 672 10 0.01 3,624 2.86 673 1,585 1.25 5,209 4.11 675 12 0.01 5,221 4.12 676 1,989 1.57 7,210 5.69 678 2,184 1.72 9,394 7.41 680 17 0.01 9,411 7.42 681 2,407 1.90 11,818 9.32 682 25 0.02 11,843 9.34 683 2,590 2.04 14,433 11.39 685 2,667 2.10 17,100 13.49 687 2,747 2.17 19,847 15.66 689 2,768 2.18 22,615 17.84 691 14 0.01 22,629 17.85 692 2,733 2.16 25,362 20.01 693 2,708 2.14 28,070 22.14 695 2,711 2.14 30,781 24.28 697 2,586 2.04 33,367 26.32 699 2,536 2.00 35,903 28.32 700 11 0.01 35,914 28.33 701 2,574 2.03 38,488 30.36 702 4 0.00 38,492 30.36 703 2,710 2.14 41,202 32.50 704 2,740 2.16 43,942 34.66 706 2,643 2.08 46,585 36.75 707 3 0.00 46,588 36.75
708 2,731 2.15 49,319 38.91 709 2,702 2.13 52,021 41.04 710 6 0.00 52,027 41.04 711 2,716 2.14 54,743 43.18 712 9 0.01 54,752 43.19 713 2,790 2.20 57,542 45.39 714 2,833 2.23 60,375 47.63 715 1 0.00 60,376 47.63 716 2,853 2.25 63,229 49.88 717 2,976 2.35 66,205 52.23 718 4 0.00 66,209 52.23 719 2,946 2.32 69,155 54.55 720 4 0.00 69,159 54.56 721 2,734 2.16 71,893 56.71 722 2,760 2.18 74,653 58.89 723 7 0.01 74,660 58.90 724 2,915 2.30 77,575 61.19 725 7 0.01 77,582 61.20 726 2,954 2.33 80,536 63.53 728 2,994 2.36 83,530 65.89 729 3,158 2.49 86,688 68.38 730 3 0.00 86,691 68.39 731 3,062 2.42 89,753 70.80 733 2,970 2.34 92,723 73.14 735 3,015 2.38 95,738 75.52 737 3,036 2.39 98,774 77.92 739 2,998 2.36 101,772 80.28 741 2,919 2.30 104,691 82.59 743 2,892 2.28 107,583 84.87 746 2,745 2.17 110,328 87.03 748 2,748 2.17 113,076 89.20 751 2,454 1.94 115,530 91.14 754 2,303 1.82 117,833 92.95 757 2,098 1.66 119,931 94.61 760 1,797 1.42 121,728 96.02 764 1,561 1.23 123,289 97.26 768 1 0.00 123,290 97.26 769 1,226 0.97 124,516 98.22 774 903 0.71 125,419 98.94 780 608 0.48 126,027 99.42 788 390 0.31 126,417 99.72 799 233 0.18 126,650 99.91 800 117 0.09 126,767 100.00
Table E1. Operational Item Parameter Estimates – English Language Arts Grade 3
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
25862 multipleChoice -0.51358 -0.51358
24759 multipleChoice 0.28037 0.28037
25875 multipleChoice -0.58801 -0.58801
24772 multipleChoice 0.90872 0.90872
24783 multipleChoice 0.45596 0.45596
27066 multipleSelect 0.08976 0.08976
24756 multipleChoice 0.16383 0.16383
31212 multipleChoice -1.76431 -1.76431
31213 multipleChoice -0.7541 -0.7541
31217 multipleChoice -0.34516 -0.34516
31220 multipleChoice -1.68894 -1.68894
31223 multipleChoice, multipleSelect 1.28346 2.161 1.72223
31224 tableMatch 1.70256 1.70256
31664_E textEntryExtendedResponse -0.03933 1.29975 2.62324 3.95269 1.959088
31664_O textEntryExtendedResponse -0.00337 1.24574 2.52953 3.92214 1.92351
31664_C textEntryExtendedResponse -1.09795 0.48455 -0.3067
26907 multipleChoice -0.7537 -0.7537
26923 multipleChoice -0.54825 -0.54825
26938 multipleChoice 0.23929 0.23929
26912 multipleChoice -1.54273 -1.54273
26940 multipleChoice -0.54281 -0.54281
26935 multipleChoice -0.28401 -0.28401
26919 multipleChoice, multipleSelect -1.29207 2.74061 0.72427
30441 multipleChoice 0.37338 0.37338
30377 hotTextCustom 1.00733 1.00733
30401 multipleChoice -0.01883 -0.01883
30440 multipleChoice -0.10418 -0.10418
30450 multipleChoice -1.24834 -1.24834
30382 multipleChoice, multipleChoice 0.98573 -0.36036 0.312685
30374 tableMatch 1.13778 1.13778
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
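In Tables E1–E8, each row reports an item's Rasch step parameter estimates, and the reported values show that the Average Rasch Value column is the mean of those step estimates (for one-point items it simply repeats the single Rasch value). The short Python check below is a worked illustration of that relationship, not the operational calibration procedure; the step values are those reported for item 31664_E in Table E1.

```python
# Check that the Average Rasch Value column is the mean of an item's step estimates.
# Step values are those reported for item 31664_E in Table E1.
steps_31664_e = [-0.03933, 1.29975, 2.62324, 3.95269]

average_rasch = sum(steps_31664_e) / len(steps_31664_e)
print(f"{average_rasch:.4f}")  # 1.9591, consistent with the 1.959088 reported in the Average column
```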
Table E2. Operational Item Parameter Estimates – English Language Arts Grade 4
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
26101 multipleChoice -1.47924 -1.47924
26103 multipleChoice, multipleChoice 0.04984 0.64621 0.348025
26096 multipleChoice -0.64156 -0.64156
26098 multipleChoice -0.16945 -0.16945
26090 multipleChoice -2.31003 -2.31003
26091 hotTextCustom -0.28987 1.38096 0.545545
28240 multipleChoice 0.18033 0.18033
27699 multipleChoice 1.25277 1.25277
27685 multipleChoice -0.34283 -0.34283
27698 multipleChoice -1.30588 -1.30588
28245 multipleChoice 0.27098 0.27098
27704 multipleChoice -0.20895 -0.20895
27693 multipleSelect, multipleChoice -0.22254 0.48933 0.133395
31960_E textEntryExtendedResponse -0.15226 0.67673 2.47006 3.66758 1.665528
31960_O textEntryExtendedResponse -0.34222 0.5696 2.40274 3.78163 1.602938
31960_C textEntryExtendedResponse -0.82716 0.80742 -0.00987
27297 multipleChoice -0.44982 -0.44982
24671 multipleChoice -0.30015 -0.30015
26875 multipleChoice -0.291 -0.291
24661 multipleChoice -0.10594 -0.10594
24668 multipleChoice -1.30412 -1.30412
24672 multipleChoice 0.31595 0.31595
26874 tableMatch 1.58679 1.58679
26734 multipleChoice -0.00577 -0.00577
26733 multipleSelect 2.48263 2.48263
28370 multipleChoice -0.07815 -0.07815
26726 hotTextCustom -0.24864 -0.24864
26735 multipleSelect 1.71361 1.71361
26732 multipleChoice, multipleChoice 1.71089 -1.17365 0.26862
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E3. Operational Item Parameter Estimates – English Language Arts Grade 5
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
26656 multipleChoice -0.27554 -0.27554
26662 multipleChoice, multipleChoice 1.93508 -0.42745 0.753815
26667 multipleChoice -0.00665 -0.00665
26668 multipleChoice, multipleChoice 0.00075 0.37582 0.188285
26659 multipleChoice, multipleChoice -0.16897 -0.52732 -0.34815
30753 multipleChoice -0.5249 -0.5249
30757 multipleChoice -0.86081 -0.86081
30748 multipleChoice -0.12624 -0.12624
30756 multipleChoice -1.46763 -1.46763
30763 multipleChoice, multipleChoice 1.50975 -0.07645 0.71665
30751 multipleChoice 0.81347 0.81347
30755 tableMatch 2.48511 2.48511
30754 multipleChoice 1.04094 1.04094
32035_E textEntryExtendedResponse -1.87689 0.12241 2.73684 3.78664 1.19225
32035_O textEntryExtendedResponse -1.75119 0.05105 2.5082 3.93655 1.186153
32035_C textEntryExtendedResponse -2.39768 -1.10781 -1.75275
28309 multipleChoice -1.22741 -1.22741
28308 multipleChoice 0.00366 0.00366
26910 multipleChoice 0.46695 0.46695
26916 hotTextCustom -1.45276 -1.45276
26894 multipleChoice 0.52243 0.52243
26925 multipleSelect 0.34783 0.34783
26902 multipleSelect 0.31747 0.31747
27045 multipleChoice -1.15754 -1.15754
27050 multipleChoice -0.98601 -0.98601
27049 multipleChoice -0.31096 -0.31096
27047 multipleSelect 1.93112 1.93112
27046 multipleChoice 0.10667 0.10667
27040 multipleChoice -1.23815 -1.23815
27048 hotTextCustom -1.3071 0.41946 -0.44382
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E4. Operational Item Parameter Estimates – English Language Arts Grade 6
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
27397 multipleChoice 0.57743 0.57743
28200 multipleSelect 1.34037 1.34037
27406 multipleChoice 0.75512 0.75512
28189 multipleChoice 0.72681 0.72681
27376 multipleChoice -2.02554 -2.02554
27400 multipleChoice, multipleChoice 0.999 -1.07631 -0.03866
28266 multipleChoice -1.47034 -1.47034
27405 tableMatch 1.55259 1.55259
30424 multipleChoice -0.50231 -0.50231
30419 multipleSelect 1.31736 1.31736
30414 multipleSelect 1.67789 1.67789
30411 multipleChoice -0.48177 -0.48177
30443 multipleChoice -2.04217 -2.04217
30428 multipleChoice 0.07108 0.07108
30437 multipleChoice -0.09652 -0.09652
31711_E textEntryExtendedResponse -2.1755 -0.44258 1.24382 2.91468 0.385105
31711_O textEntryExtendedResponse -2.28712 -0.50726 1.03694 2.90727 0.287458
31711_C textEntryExtendedResponse -2.15883 -0.72987 -1.44435
31077 multipleChoice -1.40513 -1.40513
31071 multipleChoice, multipleChoice 0.86457 -1.99897 -0.5672
31079 multipleChoice -0.97851 -0.97851
31073 multipleChoice, multipleChoice 0.20393 1.22595 0.71494
31078 multipleChoice -0.45674 -0.45674
31083 multipleChoice -1.84572 -1.84572
31075 multipleChoice, multipleChoice 1.27041 -1.7974 -0.2635
30596 multipleChoice -0.19419 -0.19419
30793 multipleChoice 0.54423 0.54423
30588 multipleChoice -1.03902 -1.03902
30584 multipleChoice, multipleChoice 2.35609 -1.43596 0.460065
30593 multipleChoice, multipleChoice 1.70071 -0.47066 0.615025
31762 multipleChoice 0.15788 0.15788
31760 multipleChoice, multipleChoice 0.45428 0.6636 0.55894
31750 multipleChoice, multipleChoice 0.34763 0.03426 0.190945
31752 multipleSelect 0.81296 0.81296
31759 multipleChoice -0.15812 -0.15812
31754 multipleChoice -0.16302 -0.16302
31766_E textEntryExtendedResponse -1.01954 -0.93593 1.79549 2.3993 0.55983
31766_O textEntryExtendedResponse -1.46504 -1.1883 1.07088 2.42998 0.21188
31766_C textEntryExtendedResponse -2.07575 -1.31535 -1.69555
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E5. Operational Item Parameter Estimates – English Language Arts Grade 7
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
31023 multipleChoice -0.56586 -0.56586
31019 multipleChoice, multipleChoice 1.55504 -1.50591 0.024565
31032 multipleSelect 1.14036 1.14036
31021 multipleChoice -0.8639 -0.8639
31024 multipleChoice 0.62635 0.62635
31018 multipleChoice, multipleChoice 1.2891 -0.00621 0.641445
27064 multipleChoice 0.71433 0.71433
27062 hotTextCustom 0.48573 0.48573
28473 multipleChoice -2.23718 -2.23718
28474 multipleChoice -1.35959 -1.35959
27053 multipleSelect 1.75525 1.75525
27060 multipleChoice, multipleChoice 0.41915 0.01625 0.2177
27055 multipleSelect 1.21371 1.21371
31594 multipleChoice -0.86628 -0.86628
31599 multipleChoice -2.29263 -2.29263
31595 hotTextCustom 1.43375 1.72133 1.57754
31597 multipleChoice -0.11666 -0.11666
31593 multipleChoice -0.09674 -0.09674
31598 multipleChoice 0.3994 0.3994
31601 tableMatch 0.1095 0.1095
31604_E textEntryExtendedResponse -1.82723 0.17685 1.37555 4.62317 1.087085
31604_O textEntryExtendedResponse -2.13106 -0.18213 1.74263 4.53662 0.991515
31604_C textEntryExtendedResponse -1.47162 -1.15294 -1.31228
31444 multipleChoice -2.03737 -2.03737
31425 multipleChoice -0.06718 -0.06718
31446 multipleSelect 1.19092 1.19092
31428 multipleChoice -0.6831 -0.6831
31430 multipleChoice -1.45137 -1.45137
31423 multipleChoice, multipleSelect -0.20489 0.1525 -0.0262
31978_E textEntryExtendedResponse -2.29311 0.00093 1.91644 2.98783 0.653023
31978_O textEntryExtendedResponse -2.36305 -0.09609 1.58055 3.07265 0.548515
31978_C textEntryExtendedResponse -1.82623 0.1674 -0.82942
26956 multipleChoice -0.34722 -0.34722
26961 multipleChoice 0.31368 0.31368
27727 multipleSelect 0.3158 0.3158
26959 multipleChoice -1.87329 -1.87329
27998 multipleChoice 0.06576 0.06576
28023 hotTextCustom 0.61062 0.61062
28211 multipleChoice 0.19803 0.19803
28212 hotTextCustom 0.96254 -0.03328 0.46463
28028 multipleSelect 1.46287 1.46287
28206 multipleSelect 0.24186 0.24186
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E6. Operational Item Parameter Estimates – English Language Arts Grade 8
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
31383 multipleChoice -0.55067 -0.55067
31377 multipleChoice -0.32759 -0.32759
31380 multipleChoice -0.48801 -0.48801
31375 multipleChoice -0.1515 -0.1515
31376 multipleSelect 0.72229 0.72229
31390 multipleChoice -0.83735 -0.83735
26682 multipleChoice -0.58563 -0.58563
26689 multipleChoice -0.7992 -0.7992
27629 multipleChoice 0.15096 0.15096
26681 multipleChoice, multipleSelect 1.7132 2.07288 1.89304
26685 multipleChoice 1.047 1.047
26683 hotTextCustom 0.83429 0.59436 0.714325
26688 multipleSelect 1.32824 1.32824
30901 multipleChoice -2.01057 -2.01057
30897 multipleChoice 0.23526 0.23526
30900 multipleChoice 0.68989 0.68989
30895 multipleChoice 0.60201 0.60201
30896 multipleChoice, multipleChoice 0.54015 0.83416 0.687155
30898 multipleChoice -0.17447 -0.17447
30902 multipleChoice -0.17224 -0.17224
31406 multipleChoice, multipleChoice 0.03536 0.34237 0.188865
32037_E textEntryExtendedResponse -1.02401 0.3447 1.61684 2.62319 0.89018
32037_O textEntryExtendedResponse -2.4713 0.08525 1.47033 2.70651 0.447698
32037_C textEntryExtendedResponse -1.57328 -1.13021 -1.35175
31049 multipleChoice, multipleChoice 1.16051 -1.21849 -0.02899
31291 multipleChoice 0.36046 0.36046
31054 multipleChoice -0.58357 -0.58357
31050 multipleChoice, multipleChoice 1.38379 -1.7015 -0.15886
31056 multipleChoice -2.44654 -2.44654
31057 multipleSelect -0.89718 -0.89718
31053 multipleChoice 0.29311 0.29311
26225 multipleChoice -0.08456 -0.08456
28025 multipleChoice 1.04605 1.04605
26232 multipleChoice -1.33216 -1.33216
26233 multipleChoice -0.3111 -0.3111
27589 multipleChoice, multipleChoice -0.6218 0.46594 -0.07793
26229 multipleChoice, multipleSelect -0.58882 1.67072 0.54095
32110_E textEntryExtendedResponse -1.21001 0.0479 3.00048 4.25873 1.524275
32110_O textEntryExtendedResponse -2.23153 -0.14704 2.39801 3.92207 0.985378
32110_C textEntryExtendedResponse -1.08833 -1.29952 -1.19393
*Note: Rows whose item IDs end in _C, _E, or _O give the parameters for the one writing item that is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E7. Operational Item Parameter Estimates – English Language Arts High School I
Item Item Type Item Parameter Estimates (Step 1, Step 2, Step 3, Step 4) Average Rasch Value
30951 multipleChoice -0.13614 -0.13614
30957 multipleChoice 0.40764 0.40764
30950 multipleChoice 0.21227 0.21227
30946 multipleChoice 0.46709 0.46709
30953 multipleChoice -1.58898 -1.58898
30954 multipleChoice -1.12329 -1.12329
30958 hotTextCustom 0.93252 0.93252
30956 multipleChoice -0.48663 -0.48663
31556 multipleChoice -0.49819 -0.49819
31557 multipleChoice, multipleSelect
0.54825 0.66111 0.60468
31566 multipleChoice 1.4894 1.4894
31560 multipleChoice -0.23619 -0.23619
31567 multipleSelect 2.41602 2.41602
31573 multipleChoice, multipleChoice
-0.29548 0.56089 0.132705
31572 multipleChoice, multipleChoice
0.10151 1.23776 0.669635
31583_E textEntryExtendedResponse -1.5057 0.42964 1.78587 4.46892 1.294683
31583_O textEntryExtendedResponse -2.16508 0.01785 1.60777 5.16421 1.156188
31583_C textEntryExtendedResponse -1.90408 -1.29084 -1.59746
30547 multipleChoice, multipleChoice
1.54144 -2.73577 -0.59717
30550 multipleChoice -0.91463 -0.91463
30552 multipleChoice -1.17356 -1.17356
30555 multipleChoice -1.15727 -1.15727
30548 multipleChoice -1.77249 -1.77249
30549 multipleChoice, multipleChoice
1.86504 -1.19054 0.33725
26331 multipleChoice, multipleSelect
-0.23161 -0.35943 -0.29552
26334 multipleChoice 0.68089 0.68089
26329 hotTextCustom 0.80981 0.80981
26325 multipleChoice -0.79961 -0.79961
26330 multipleChoice -1.01538 -1.01538
26336 multipleSelect 1.51159 1.51159
30092 multipleChoice 0.58017 0.58017
30095 multipleChoice -0.1724 -0.1724
30097 multipleChoice 0.21786 0.21786
30090 multipleChoice, multipleChoice
1.25035 0.51591 0.88313
30093 multipleChoice -1.67816 -1.67816
30089 multipleChoice, multipleChoice
1.25707 0.88235 1.06971
30100 multipleChoice -0.27289 -0.27289
31555_E textEntryExtendedResponse -1.12096 0.42531 2.74762 3.41907 1.36776
31555_O textEntryExtendedResponse -1.74091 0.10791 2.04877 3.31284 0.932153
31555_C textEntryExtendedResponse -1.777 -0.73414 -1.25557
*Note: Items whose IDs include _C, _E, or _O are the parameters for the one writing item, which is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E8. Operational Item Parameter Estimates – English Language Arts High School II
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
25190 multipleChoice -1.38713 -1.38713
25194 multipleChoice, multipleChoice
1.60918 -1.00029 0.304445
25189 multipleChoice 0.67163 0.67163
25198 multipleChoice 1.32166 1.32166
26637 multipleSelect 1.12953 1.12953
26346 multipleChoice -1.22044 -1.22044
25191 multipleChoice -0.13953 -0.13953
25193 multipleSelect -0.08459 -0.08459
30934 multipleChoice -0.38673 -0.38673
30941 multipleChoice 0.49817 0.49817
30931 multipleChoice -1.26484 -1.26484
30933 multipleChoice -0.02776 -0.02776
30944 multipleChoice -0.48321 -0.48321
30929 multipleChoice, multipleChoice
0.7873 -0.33296 0.22717
30938 multipleChoice 0.33021 0.33021
31605_E textEntryExtendedResponse -0.89865 0.44504 2.27125 3.3033 1.280235
31605_O textEntryExtendedResponse -1.08579 0.48711 2.03362 3.38053 1.203868
31605_C textEntryExtendedResponse -0.98127 -0.96786 -0.97457
27282 multipleChoice -1.274 -1.274
27287 multipleSelect 1.85964 1.85964
27288 multipleChoice -0.45826 -0.45826
27286 multipleChoice 0.23141 0.23141
27280 multipleChoice -1.78291 -1.78291
27290 multipleChoice 0.12689 0.12689
27292 multipleSelect 0.53887 0.53887
25230 multipleChoice 0.41206 0.41206
25229 multipleSelect 2.05923 2.05923
25226 multipleChoice -0.62109 -0.62109
25225 multipleChoice, multipleChoice
2.90686 -1.7798 0.56353
25224 multipleChoice, multipleChoice
0.13315 -0.03889 0.04713
27595 multipleChoice -0.42206 -0.42206
31621 multipleChoice -0.2174 -0.2174
31613 multipleChoice -0.70043 -0.70043
31611 multipleChoice, multipleChoice
1.1151 0.00262 0.55886
31617 multipleChoice -2.13025 -2.13025
31615 multipleChoice, multipleChoice
0.17067 2.23294 1.201805
31616 multipleChoice -1.48749 -1.48749
31620 multipleChoice 0.75856 0.75856
31622_E textEntryExtendedResponse -1.19454 0.37515 1.58915 2.97187 0.935408
31622_O textEntryExtendedResponse -2.62289 0.23661 1.31857 3.08116 0.503363
31622_C textEntryExtendedResponse -0.97326 -0.63074 -0.802
*Note: Items whose IDs include _C, _E, or _O are the parameters for the one writing item, which is scored on three dimensions: C is Conventions, E is Elaboration, and O is Organization.
Table E9. Operational Item Parameter Estimates – Mathematics Grade 3
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
25661 equation -1.78259 -1.78259
29257 grid -0.70287 -0.70287
23577 multipleChoice -2.1213 -2.1213
28618 grid -0.54177 -0.54177
23836 equation 0.47113 0.47113
28529 grid 0.61526 0.61526
24465 multipleSelect 0.9128 0.9128
26279 multipleChoice -2.77373 -2.77373
28885 equation 1.3547 1.3547
24514 multipleChoice -1.6448 -1.6448
24373 multipleChoice 0.74207 0.74207
29321 grid 1.72593 1.72593
33654 grid, equation -1.23304 -0.24168 -0.73736
25470 equation -1.63217 -1.63217
23810 multipleSelect 0.9555 0.9555
29330 grid 1.89706 2.3401 2.11858
24813 tableInput -0.92575 -0.92575
23518 grid -0.84628 0.07584 -0.38522
24511 grid 2.34634 1.53813 1.942235
24369 tableInput -0.39255 -0.39255
25469 multipleChoice -0.15975 -0.15975
23585 equation -0.68999 -0.68999
25124 grid -1.77938 -1.77938
29339 equation -0.81774 -0.81774
23809 multipleChoice 0.2975 0.2975
26998 multipleChoice -2.61914 -2.61914
29005 multipleSelect -1.27174 -1.27174
23858 equation 1.41295 1.41295
23567 tableInput 0.84694 0.84694
27001 equation -0.55102 -0.55102
28551 multipleChoice -0.68254 -0.68254
29081 equation 0.84854 0.84854
25861 grid -1.41323 -1.41323
23600 equation -1.10248 -1.10248
28709 tableInput 0.67996 2.84826 1.76411
26971 multipleChoice -1.19719 -1.19719
28568 equation 0.9789 0.9789
33653 grid, equation, equation -0.46244 3.32124 1.4294
28526 equation 0.18229 0.18229
28867 grid 1.68554 1.68554
28530 equation 0.5425 0.5425
23597 tableInput -0.40709 -0.40709
23572 equation -1.83296 -1.83296
Table E10. Operational Item Parameter Estimates – Mathematics Grade 4
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
27445 multipleChoice -1.83466 -1.83466
29186 equation -1.74196 -1.74196
24180 equation -1.39733 -1.39733
24716 multipleSelect -0.79471 -0.79471
27538 multipleSelect -0.53704 -0.53704
29190 grid 0.70104 0.70104
28237 equation 1.03322 1.03322
26379 multipleSelect -0.4354 -0.4354
28762 tableInput 0.4339 0.4339
25755 multipleChoice -0.00197 -0.00197
29193 equation 2.22057 2.22057
27449 equation 0.28056 0.28056
29322 multipleChoice 0.25681 0.25681
29184 equation -0.60709 -0.60709
28774 tableInput 2.13184 2.13184
24349 equation 0.13152 0.13152
29289 multipleChoice 0.04224 0.04224
25518 tableMatch 1.3496 1.3496
24182 equation 0.36665 0.36665
25712 equation 0.83204 0.83204
26151 multipleSelect 0.10503 0.10503
29312 grid -0.41946 -0.41946
29291 equation -0.75822 -0.75822
27402 multipleChoice -0.80318 -0.80318
24020 equation -1.71258 -1.71258
27248 grid -0.86221 -0.86221
25109 equation -0.73885 -0.73885
25884 equation -0.17991 -0.17991
24714 multipleChoice 0.12111 0.12111
25763 equation 0.23493 0.23493
29448 multipleChoice 1.1636 1.1636
23525 multipleSelect -0.27031 -0.27031
29189 grid -0.64618 -0.64618
25711 grid 2.16269 2.16269
26376 equation 0.18182 0.18182
24510 grid 0.12348 0.46686 0.29517
24801 grid 0.42449 0.42449
29192 equation 1.77962 1.77962
24343 equation 0.08131 0.08131
24188 grid 0.0463 0.0463
27249 multipleChoice 0.16513 0.16513
27947 equation -0.53084 -0.53084
26629 tableMatch 0.80042 0.80042
27946 multipleSelect 0.24841 0.24841
26373 grid 0.00897 0.00897
28776 multipleChoice -0.5088 -0.5088
27399 multipleChoice -0.53706 -0.53706
29191 equation -1.14969 -1.14969
Table E11. Operational Item Parameter Estimates – Mathematics Grade 5
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
29393 equation -1.17303 -1.17303
29077 grid -2.43274 -2.43274
26469 equation 0.05173 0.05173
26470 equation 0.35295 0.35295
26575 multipleChoice -1.59654 -1.59654
28982 equation 0.75717 0.75717
26473 equation -0.11233 -0.11233
28015 multipleSelect 1.09796 1.09796
29057 equation 1.24738 1.24738
27937 grid -1.44051 -1.44051
33658 equation, equation,
equation 0.00026 1.94693 1.67589 1.207693
29053 equation 0.24494 0.24494
26844 multipleChoice -1.92524 -1.92524
29395 equation 0.10069 0.10069
28073 equation -0.38274 -0.38274
27782 grid -0.08631 0.79111 0.3524
26445 equation 1.27954 1.27954
29475 equation 0.77376 0.77376
26578 multipleSelect 1.30185 1.30185
29063 equation 1.55816 1.55816
29054 grid 0.1345 0.1345
28010 equation -1.86763 -0.31855 -1.09309
29055 equation -1.1243 -1.1243
26560 equation -0.92813 -0.92813
29469 tableInput -0.75977 -0.75977
27731 equation -0.68964 -0.68964
26256 tableInput -1.1864 -1.1864
27926 equation 1.00069 1.00069
28076 multipleChoice -1.33085 -1.33085
26447 multipleChoice -0.13752 -0.13752
28316 equation 2.5563 2.5563
26721 multipleChoice 0.09168 0.09168
26891 equation -0.19235 -0.19235
27973 grid 0.04739 0.04739
29059 equation 1.64473 1.64473
28314 multipleChoice 0.41186 0.41186
27348 equation 1.29193 1.29193
27236 multipleSelect -0.34515 -0.34515
27744 equation 1.37688 1.37688
29390 equation 1.07762 1.07762
28981 multipleSelect -0.38154 -0.38154
29394 equation -0.00655 -0.00655
28170 equation 0.25053 0.25053
28072 multipleChoice -2.22454 -2.22454
26454 grid -0.49473 -0.49473
27116 equation -0.62614 -0.62614
Table E12. Operational Item Parameter Estimates – Mathematics Grade 6
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
25154 equation -1.52018 -1.52018
26286 multipleChoice -2.5914 -2.5914
33662 multipleChoice, multipleChoice, multipleChoice
-1.9145 0.22474 2.41447 0.24157
24377 tableInput -0.8363 -0.8363
28983 equation -0.65832 -0.65832
29176 equation 0.23555 0.23555
28669 equation 1.96613 0.94358 1.454855
28883 equation -0.31261 -0.31261
28704 equation 1.0977 1.0977
25423 equation 1.87181 1.87181
28989 multipleChoice -0.24068 -0.24068
25426 grid -0.96464 -0.96464
26980 equation 0.63065 0.63065
25484 equation 2.00575 2.00575
29308 multipleSelect -0.40145 -0.40145
28513 multipleChoice 1.52327 1.52327
29045 multipleChoice -0.67372 -0.67372
26292 multipleChoice -1.00471 -1.00471
28578 equation 2.49015 2.49015
28833 multipleSelect -0.13065 -0.13065
26984 multipleSelect -0.4251 -0.4251
28565 equation -0.68866 -0.68866
28994 equation -0.70868 -0.70868
29042 multipleChoice -1.53868 -1.53868
24382 equation -1.62708 -1.62708
27095 equation -2.5287 -2.5287
25155 equation -0.47653 -0.47653
24473 multipleChoice -0.74595 -0.74595
23819 equation -0.00324 -0.00324
29265 multipleChoice 2.46168 2.46168
28528 equation -0.17241 -0.17241
25419 equation 0.21512 0.21512
28995 multipleChoice 0.63867 0.63867
24383 equation 0.67979 0.67979
25485 equation 2.3476 2.3476
27091 multipleChoice -1.76916 -1.76916
33661 equation, grid -0.5282 0.4976 -0.0153
24385 multipleSelect 2.465 2.465
24760 multipleChoice 0.00616 0.00616
23834 multipleSelect 2.39386 2.39386
33663 equation, tableInput,
equation 0.15628 0.82401 2.27284 1.084377
27925 equation -0.93773 -0.93773
28541 equation -1.24736 0.85404 -0.19666
28745 equation -0.88921 -0.88921
25417 equation -1.34438 -1.34438
28990 multipleChoice -0.27495 -0.27495
Table E13. Operational Item Parameter Estimates – Mathematics Grade 7
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
24031 multipleChoice -1.93977 -1.93977
27865 equation -0.82288 -0.82288
25773 multipleChoice -1.02455 -1.02455
33666 equation, grid, equation -0.30161 2.54453 -0.03032 0.737533
24729 multipleChoice -0.59856 -0.59856
29424 equation -0.70083 -0.70083
24448 equation 0.42042 0.42042
28744 equation 1.31492 1.31492
26384 multipleChoice 0.15613 0.15613
26594 equation 0.47133 0.47133
28806 multipleChoice 0.29465 0.29465
29284 grid 0.29375 0.29375
28823 equation -0.15624 -0.15624
28798 multipleChoice 0.02391 0.02391
29438 equation -0.16596 -0.16596
33667 multipleChoice, tableMatch,
equation -0.19441 1.55301 3.46423 1.60761
29414 multipleSelect 0.13742 0.13742
25765 multipleChoice -0.44495 -0.44495
29490 multipleChoice -0.15851 -0.15851
29136 equation 2.15782 2.15782
25691 multipleChoice -1.28556 -1.28556
28795 equation -1.87004 0.99348 -0.43828
27541 multipleChoice -1.42701 -1.42701
27597 equation -0.02817 -0.02817
33664 grid, multipleChoice,
equation -0.77251 0.83918 1.70761 0.591427
26391 equation -0.09251 -0.09251
27394 equation 0.25139 0.25139
25771 equation 1.3479 1.3479
24022 tableMatch 0.3224 0.3224
29607 equation 0.12659 0.12659
26383 equation -0.28778 -0.28778
26191 equation -0.14311 -0.14311
29474 multipleChoice -1.49796 -1.49796
24161 equation 0.13008 0.13008
29354 equation -0.56223 0.74649 0.09213
27483 equation -0.53548 -0.53548
27826 tableInput -1.04169 -1.04169
26271 equation 1.09318 1.09318
26440 equation 1.12143 1.12143
24725 equation 0.36639 0.36639
25770 multipleChoice -0.06662 -0.06662
26242 equation 0.52242 0.52242
29491 equation 0.81976 0.81976
29137 equation 1.92173 1.92173
Table E14. Operational Item Parameter Estimates – Mathematics Grade 8
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
28125 multipleChoice -2.04461 -2.04461
27942 equation -2.70256 -2.70256
26591 equation 0.29072 0.29072
26861 tableInput 0.1606 0.1606
26563 multipleChoice -1.81844 -1.81844
29623 multipleChoice -0.09442 -0.09442
27327 equation 1.39545 1.39545
25500 multipleChoice -1.14419 -1.14419
29540 equation 0.44009 0.44009
26900 tableInput 0.65227 0.65227
29646 multipleChoice 0.3008 0.3008
26463 multipleChoice 0.05963 0.05963
28101 multipleChoice -1.0249 -1.0249
29585 equation 2.36331 2.36331
29016 multipleChoice -0.09024 -0.09024
29850 equation 0.8892 0.8892
29582 multipleChoice -1.51651 -1.51651
27979 equation 1.82285 1.82285
28134 equation -0.78704 -0.78704
27954 equation -1.68922 2.72496 0.51787
29931 equation 2.37478 2.37478
29539 equation 1.31319 1.31319
29544 grid -1.12358 -1.12358
26474 multipleChoice -1.35419 -1.35419
29528 equation -1.14941 -1.14941
29580 equation -2.73721 -2.73721
27994 multipleChoice -0.59947 -0.59947
28225 equation 0.30136 0.30136
29937 multipleChoice -2.13658 -2.13658
27108 equation -1.16486 -1.16486
29108 grid 1.52522 2.03748 1.78135
26565 multipleChoice -0.02511 -0.02511
26459 equation 0.74127 0.74127
29583 multipleChoice -0.93428 -0.93428
27790 tableInput 1.52332 1.52332
30242 equation 1.66703 1.66703
29017 multipleChoice 0.34282 0.34282
27471 equation 1.05664 1.05664
26738 grid -1.38276 0.8594 -0.26168
28033 multipleSelect 0.90768 0.90768
26261 equation 2.36342 2.36342
28130 multipleChoice -2.41372 -2.41372
30245 equation 2.05957 2.05957
29549 equation 0.61948 0.61948
27997 equation 1.04529 1.04529
30235 multipleSelect 1.90899 1.90899
28224 multipleSelect 0.26013 0.26013
29164 equation 1.9191 1.9191
29542 multipleChoice -0.23869 -0.23869
29579 multipleChoice -2.21812 -2.21812
Table E15. Operational Item Parameter Estimates – Algebra
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
27243 equation -2.08738 -2.08738
27130 multipleChoice -0.67775 -0.67775
24154 multipleSelect -0.53029 -0.53029
29801 multipleChoice 0.072 0.072
26986 multipleChoice -0.26204 -0.26204
30264 tableInput 0.53619 0.53619
25159 multipleChoice 0.30613 0.30613
26181 tableInput -0.30715 2.32257 1.00771
24026 multipleChoice -0.63474 -0.63474
30273 equation 0.22863 0.22863
29708 equation 0.08013 0.21808 0.149105
29800 tableInput -0.16107 -0.87565 -0.51836
26418 multipleChoice 0.58758 0.58758
25440 equation -0.36289 -0.36289
25790 equation -0.2077 3.28148 1.53689
25442 multipleChoice 0.11279 0.11279
29778 multipleChoice -0.10838 -0.10838
25697 multipleChoice -0.7898 -0.7898
33646 hotTextSelectable, equation,
equation 0.41982 0.61361 0.516715
28633 equation 1.70328 1.70328
29328 grid 0.7869 0.7869
28997 multipleChoice -0.53511 -0.53511
25480 multipleChoice -0.62439 -0.62439
29277 multipleChoice -1.08298 -1.08298
26704 equation -1.54533 -1.54533
26876 multipleChoice -1.80846 -1.80846
26989 multipleChoice -1.59233 -1.59233
28856 equation -1.30973 -1.30973
28870 equation 1.86405 0.8173 1.340675
30269 equation -0.04757 -0.04757
29674 multipleChoice -0.04274 -0.04274
28118 equation 0.49438 0.49438
24396 equation 1.61637 1.61637
28643 multipleChoice 0.98869 0.98869
29207 multipleChoice -0.01165 -0.01165
29935 tableMatch -0.90445 -0.90445
29836 equation 2.33268 2.33268
30539 tableInput 1.08831 1.08831
29492 multipleChoice -0.2961 -0.2961
25886 multipleChoice -0.52299 -0.52299
28720 multipleChoice 0.28116 0.28116
24638 equation 1.94598 1.94598
27083 equation -0.82419 -0.82419
29287 equation -0.01199 2.69615 1.34208
26426 multipleChoice -0.50014 -0.50014
26985 multipleChoice -0.37111 -0.37111
28566 multipleChoice -0.28946 -0.28946
Table E16. Operational Item Parameter Estimates – Geometry
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
28646 hotTextCustom -1.99619 -1.99619
26436 multipleChoice -0.85378 -0.85378
29797 multipleChoice -0.70297 -0.70297
29769 multipleChoice -0.15812 -0.15812
24834 grid 2.56981 2.56981
29516 equation 0.49399 0.49399
27506 equation 0.98612 0.98612
26255 equation 0.7528 0.7528
26070 equation 2.56238 2.56238
29807 multipleChoice 0.19627 0.19627
27490 equation 1.86812 1.86812
29409 equation 0.39963 0.39963
27246 equation -0.92409 -0.92409
24457 multipleChoice 0.24553 0.24553
27703 equation 1.95523 1.95523
27621 multipleChoice 0.73628 0.73628
29288 equation 1.15982 1.15982
26363 equation 1.62716 1.62716
29753 equation -1.11056 -1.11056
26086 equation 1.15562 1.15562
29893 hotTextCustom 2.14979 2.14979
26709 equation 1.00788 1.00788
27466 equation -0.14625 -0.14625
28937 multipleChoice -0.49776 -0.49776
29841 multipleChoice -0.53051 -0.53051
24167 multipleChoice -0.48169 -0.48169
24737 hotTextCustom -0.75547 -0.75547
26444 multipleChoice -1.25539 -1.25539
27505 equation -0.01091 -0.01091
26254 equation 0.5252 0.5252
28722 multipleChoice -2.15188 -2.15188
27679 equation 0.53601 0.53601
29523 equation 0.06605 1.258 0.662025
30257 equation 2.18607 2.18607
26441 multipleChoice 0.36686 0.36686
28964 equation 0.977 0.977
27495 equation 0.57981 0.57981
25702 equation 1.36028 1.36028
29884 hotTextCustom 1.925 1.925
29500 equation 1.28158 1.28158
26083 equation 0.28851 0.28851
29770 equation 0.75335 0.75335
26397 equation 0.92528 0.92528
29359 hotTextCustom 0.64314 3.08517 1.864155
25784 equation 1.72064 3.50211 2.611375
27300 equation 1.24809 1.24809
28108 multipleSelect -0.11239 -0.11239
29669 tableInput 0.36336 0.36336
27070 multipleChoice -0.11106 -0.11106
28095 hotTextCustom -1.27425 0.13845 -0.5679
28685 multipleChoice -1.0984 -1.0984
Table E17. Operational Item Parameter Estimates – Integrated Mathematics I
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
27243 equation -2.08738 -2.08738
28646 hotTextCustom -1.99619 -1.99619
28856 equation -1.30973 -1.30973
27130 multipleChoice -0.67775 -0.67775
29674 multipleChoice -0.04274 -0.04274
25440 equation -0.36289 -0.36289
25159 multipleChoice 0.30613 0.30613
29801 multipleChoice 0.072 0.072
24451 multipleSelect 1.36135 1.36135
26418 multipleChoice 0.58758 0.58758
33646 hotTextSelectable, equation,
equation 0.41982 0.61361 0.516715
24396 equation 1.61637 1.61637
29328 grid 0.7869 0.7869
28929 multipleChoice -0.21343 -0.21343
28118 equation 0.49438 0.49438
29836 equation 2.33268 2.33268
27070 multipleChoice -0.11106 -0.11106
29287 equation -0.01199 2.69615 1.34208
26426 multipleChoice -0.50014 -0.50014
26986 multipleChoice -0.26204 -0.26204
24457 multipleChoice 0.24553 0.24553
24154 multipleSelect -0.53029 -0.53029
29800 tableInput -0.16107 -0.87565 -0.51836
26876 multipleChoice -1.80846 -1.80846
25697 multipleChoice -0.7898 -0.7898
26989 multipleChoice -1.59233 -1.59233
29935 tableMatch -0.90445 -0.90445
24737 hotTextCustom -0.75547 -0.75547
26086 equation 1.15562 1.15562
29277 multipleChoice -1.08298 -1.08298
29778 multipleChoice -0.10838 -0.10838
28095 hotTextCustom -1.27425 0.13845 -0.5679
30264 tableInput 0.53619 0.53619
29708 equation 0.08013 0.21808 0.149105
26181 tableInput -0.30715 2.32257 1.00771
30273 equation 0.22863 0.22863
28675 multipleChoice -0.00596 -0.00596
29718 equation 1.70761 1.70761
26985 multipleChoice -0.37111 -0.37111
27004 multipleChoice -0.04457 -0.04457
28633 equation 1.70328 1.70328
28870 equation 1.86405 0.8173 1.340675
28108 multipleSelect -0.11239 -0.11239
25886 multipleChoice -0.52299 -0.52299
28997 multipleChoice -0.53511 -0.53511
25480 multipleChoice -0.62439 -0.62439
27083 equation -0.82419 -0.82419
Table E18. Operational Item Parameter Estimates – Integrated Mathematics II
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
26704 equation -1.54533 -1.54533
26436 multipleChoice -0.85378 -0.85378
27506 equation 0.98612 0.98612
28720 multipleChoice 0.28116 0.28116
29409 equation 0.39963 0.39963
28685 multipleChoice -1.0984 -1.0984
28897 equation 0.92233 0.92233
28643 multipleChoice 0.98869 0.98869
26255 equation 0.7528 0.7528
29492 multipleChoice -0.2961 -0.2961
30539 tableInput 1.08831 1.08831
29516 equation 0.49399 0.49399
29529 multipleChoice 0.73545 0.73545
27703 equation 1.95523 1.95523
28566 multipleChoice -0.28946 -0.28946
29288 equation 1.15982 1.15982
24639 equation 2.74918 2.74918
29548 multipleChoice 1.01781 1.01781
29938 equation -1.44802 -1.44802
26363 equation 1.62716 1.62716
24394 multipleSelect 1.29338 1.29338
26709 equation 1.00788 1.00788
29207 multipleChoice -0.01165 -0.01165
23535 hotTextCustom 1.51424 1.51424
24647 multipleChoice -0.67907 -0.67907
28945 grid 0.1539 0.1539
29841 multipleChoice -0.53051 -0.53051
26251 equation -2.10336 -2.10336
28923 multipleChoice -0.4526 -0.4526
29875 multipleChoice -0.11267 -0.11267
29753 equation -1.11056 -1.11056
26444 multipleChoice -1.25539 -1.25539
26254 equation 0.5252 0.5252
25790 equation -0.2077 3.28148 1.53689
23844 multipleChoice -0.09876 -0.09876
29500 equation 1.28158 1.28158
27679 equation 0.53601 0.53601
24026 multipleChoice -0.63474 -0.63474
30257 equation 2.18607 2.18607
26441 multipleChoice 0.36686 0.36686
28850 multipleChoice -0.04539 -0.04539
29924 hotTextCustom 2.68437 2.68437
28676 equation 2.49616 2.49616
29769 multipleChoice -0.15812 -0.15812
29157 multipleChoice 0.57159 0.57159
23589 equation 3.60937 1.43799 2.52368
29807 multipleChoice 0.19627 0.19627
29770 equation 0.75335 0.75335
23534 grid 1.41159 1.41159
29669 tableInput 0.36336 0.36336
28677 multipleChoice 0.41899 0.41899
29457 multipleChoice -1.00169 -1.00169
Table E19. Operational Item Parameter Estimates – Science Grade 5
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
29203 multipleChoice -1.58109 -1.58109
21159 multipleChoice -1.6228 -1.6228
17690 multipleChoice -0.7568 -0.7568
17691 multipleChoice -1.00261 -1.00261
17752 multipleChoice -1.1029 -1.1029
21865 grid -0.94277 0.23491 -0.35393
18570 multipleChoice 0.04933 0.04933
22156 multipleChoice, multipleChoice
0.92422 -0.0561 0.43406
17761 multipleChoice -0.68586 -0.68586
15238 grid -0.1924 1.51032 0.65896
17704 multipleChoice -0.96725 -0.96725
19053 multipleChoice, multipleSelect
1.83093 1.83093
21279 multipleChoice -0.38782 -0.38782
20413 multipleChoice 0.15027 0.15027
14351 grid 2.88089 2.88089
16259 grid 1.61871 1.61871
15020 multipleChoice -0.93952 -0.93952
19935 tableMatch -0.56749 -0.56749
22141 multipleSelect 2.6437 2.6437
19079 multipleChoice 0.17869 0.17869
15789 grid 0.07965 0.07965
17730 multipleChoice -1.24531 -1.24531
16063 multipleChoice -0.77081 -0.77081
21043 multipleChoice, multipleChoice
0.67127 0.67127
21347 multipleSelect 2.30552 2.30552
20420 multipleChoice -0.30901 -0.30901
28835 grid -1.0238 0.44349 -0.29016
28832 tableMatch -0.45956 -0.45956
28834 multipleSelect -0.82699 -0.82699
29208 tableInput 0.21758 0.21758
22214 multipleChoice, multipleChoice
-0.2408 0.03736 -0.10172
21474 multipleChoice -1.54729 -1.54729
29372 multipleChoice, multipleChoice
2.4077 -0.35718 1.02526
21268 multipleChoice -0.79015 -0.79015
14372 grid 1.04096 1.04096
20411 multipleChoice -0.18911 -0.18911
21138 multipleChoice -0.29587 -0.29587
21820 multipleChoice, multipleChoice
1.55777 1.55777
21333 multipleChoice 1.01285 1.01285
14337 grid 0.21771 0.21771
29086 multipleChoice, multipleSelect
2.72563 2.72563
19652 multipleChoice 0.0764 0.0764
28757 multipleChoice, multipleSelect
0.92152 1.3126 1.11706
28763 multipleChoice -0.01389 -0.01389
15913 multipleChoice -1.69533 -1.69533
18576 multipleChoice -1.55166 -1.55166
17763 multipleChoice -1.6485 -1.6485
21128 multipleChoice -1.42229 -1.42229
Table E20. Operational Item Parameter Estimates – Science Grade 8
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
18566 multipleChoice -2.06775 -2.06775
17016 multipleChoice -2.04356 -2.04356
30279 multipleChoice -1.30959 -1.30959
16476 multipleChoice -1.71798 -1.71798
22205 grid -0.131 -0.78732 -0.45916
30369 multipleSelect -1.09285 -1.09285
17919 multipleChoice -0.05924 -0.05924
15821 grid -0.94453 -0.94453
19839 multipleChoice -0.69399 -0.69399
16397 grid 0.43831 0.43831
17925 multipleChoice -0.71846 -0.71846
16523 multipleChoice -0.7211 -0.7211
19013 tableMatch 0.56419 0.56419
15796 multipleChoice -0.47319 -0.47319
16747 grid -1.94559 -1.01263 -1.47911
29514 tableMatch 0.34591 0.34591
16521 multipleChoice 0.22712 0.22712
18523 multipleChoice 0.28258 0.28258
22294 grid -0.14883 -0.14883
17899 multipleChoice -0.87987 -0.87987
19802 multipleChoice 0.47772 0.47772
22226 grid -0.26399 0.87187 0.30394
17455 multipleChoice -0.40412 -0.40412
21227 multipleSelect 0.36309 0.36309
15274 grid 1.47057 -0.76461 0.35298
21235 multipleChoice -0.34254 -0.34254
29525 multipleSelect 1.2195 1.2195
16937 grid 2.16866 2.16866
19123 tableMatch 2.44517 2.44517
29902 multipleSelect 1.56687 1.56687
21248 multipleChoice -0.83731 -0.83731
22256 multipleChoice, multipleChoice
1.25318 1.25318
22204 grid 0.65711 0.65711
18526 multipleChoice -2.03158 -2.03158
18527 multipleChoice -0.64404 -0.64404
29946 tableInput 2.23124 2.23124
21226 multipleSelect 0.31853 0.31853
19040 tableInput 1.27997 1.27997
17843 multipleChoice -0.54066 -0.54066
17820 multipleChoice 0.17989 0.17989
19006 tableMatch 2.16307 2.16307
15192 grid 2.04999 1.71399 1.88199
22145 multipleSelect, multipleSelect
0.70019 0.70019
17436 multipleChoice 0.01653 0.01653
14436 grid 1.49465 1.49465
19991 multipleChoice, multipleSelect
1.91586 1.91586
15754 multipleChoice -0.95311 -0.95311
15873 multipleChoice -1.8218 -1.8218
18567 multipleChoice -1.85211 -1.85211
18543 multipleChoice -2.0248 -2.0248
Table E21. Operational Item Parameter Estimates – Biology
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
17928 multipleChoice -1.76541 -1.76541
22092 multipleChoice -0.97475 -0.97475
20226 multipleChoice -0.65732 -0.65732
22019 multipleChoice, multipleChoice
0.17693 -1.27173 -0.5474
16102 grid -0.77951 -0.77951
22082 multipleChoice, multipleChoice
-0.34335 -1.19265 -0.768
16505 multipleChoice 0.32597 0.32597
19797 multipleChoice, multipleChoice
0.26213 0.26213
16803 grid -0.72136 -0.72136
16812 multipleChoice 0.50776 0.50776
15525 grid 1.19633 0.43382 0.815075
15171 grid 1.40587 1.40587
16358 grid 0.17575 0.17575
21844 multipleChoice 0.34955 0.34955
16251 grid 1.36382 -0.77612 0.29385
20296 multipleChoice, multipleChoice
0.65389 0.00279 0.32834
16754 grid -0.26577 -0.26577
16741 multipleChoice -0.15714 -0.15714
16847 multipleChoice -0.79144 -0.79144
16842 grid 0.4727 0.4727
16961 multipleChoice 0.39063 0.39063
19913 grid -1.74706 -0.6623 -1.20468
19252 multipleSelect, multipleChoice
-0.98275 0.78479 -0.09898
29743 multipleChoice, multipleChoice
0.62346 0.06442 0.34394
20381 tableInput 0.65528 0.65528
22040 grid 0.82054 0.83554 0.82804
16849 multipleChoice 0.33521 0.33521
16851 textEntryNaturalLanguage -0.04618 -0.04618
17462 multipleChoice 0.46589 0.46589
21927 tableMatch 0.80675 0.80675
28540 multipleChoice, multipleChoice
0.39559 0.39559
17932 multipleChoice -0.32609 -0.32609
17933 multipleChoice -1.18548 -1.18548
17935 multipleChoice -0.86944 -0.86944
15237 grid -0.48652 -0.60252 -0.54452
22060 tableMatch 1.12187 1.12187
16388 grid 0.22765 0.22765
28630 multipleChoice 0.08316 0.08316
28628 multipleChoice 0.0436 0.0436
28830 multipleChoice 0.7999 0.7999
22055 grid -2.08387 4.22915 1.07264
17924 multipleChoice 0.46803 0.46803
20269 multipleChoice -0.17137 -0.17137
15761 multipleChoice -0.38141 -0.38141
16498 multipleChoice -0.47677 -0.47677
Table E22. Operational Item Parameter Estimates – Physical Science
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
17512 multipleChoice -1.37535 -1.37535
20672 multipleChoice -1.25898 -1.25898
18517 multipleChoice -1.38934 -1.38934
18583 multipleChoice -1.14114 -1.14114
18518 multipleChoice -0.93137 -0.93137
19072 multipleChoice -1.22237 -1.22237
18988 tableMatch 0.51899 0.51899
19155 multipleSelect, multipleChoice
1.98437 -0.26886 0.857755
21470 multipleSelect 0.94195 0.94195
20396 tableMatch 1.27228 1.27228
16809 grid 0.60826 1.21914 0.9137
14507 grid 0.80053 0.80053
20579 simulation -0.97698 -0.32934 -0.65316
20580 multipleChoice, multipleSelect
1.07005 1.34451 1.20728
21294 multipleSelect 1.4548 1.4548
17800 textEntrySimple 0.88271 0.27587 0.57929
21427 multipleSelect 1.04038 1.04038
20103 multipleChoice 0.51331 0.51331
21335 multipleChoice 0.06696 0.06696
16732 grid 0.7124 -0.57406 0.06917
21232 multipleChoice -0.3057 -0.3057
17425 multipleChoice -1.0752 -1.0752
16220 grid -1.49964 -1.49964
19466 multipleChoice -0.93302 -0.93302
21555 multipleChoice -0.62627 -0.62627
21410 multipleChoice -0.11986 -0.11986
21495 multipleChoice -0.0688 -0.0688
21186 multipleSelect 2.2813 2.2813
20488 tableInput 0.90325 0.90325
20996 multipleSelect -0.05922 -0.05922
18484 textEntrySimple 0.76281 -1.34063 -0.28891
19063 tableMatch 1.17735 1.17735
20998 multipleSelect 1.75409 1.75409
19173 textEntrySimple 0.83551 0.48433 0.65992
19926 multipleSelect 1.47008 1.47008
19530 textEntrySimple -0.13462 -0.20426 0.14279 1.2915 0.273853
15432 grid -0.98531 -0.98531
16751 grid -0.75315 -0.75315
14723 grid -0.98096 -0.40658 -0.69377
21000 multipleSelect 1.21358 1.21358
20723 multipleChoice -1.01753 -1.01753
21511 multipleChoice -0.90422 -0.90422
21092 multipleChoice -1.22478 -1.22478
15655 multipleChoice -1.29953 -1.29953
Table E25. Operational Item Parameter Estimates – American Government
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
20795 multipleChoice -1.18425 -1.18425
16468 multipleChoice -1.7603 -1.7603
16129 multipleChoice 0.14344 0.14344
20793 multipleChoice -0.0317 -0.0317
14957 grid 0.73866 0.33396 0.53631
18884 tableMatch -1.25458 -1.25458
19294 multipleChoice 0.18226 0.18226
21777 multipleChoice, multipleChoice
1.85196 -1.02566 0.41315
20964 multipleChoice -0.76606 -0.76606
20777 multipleChoice -0.99384 -0.99384
20801 multipleChoice 0.73048 0.73048
16345 multipleChoice 0.23582 0.23582
19289 tableMatch -0.7507 -0.7507
18810 multipleChoice, multipleSelect
1.42078 0.4145 0.91764
14573 grid 2.42041 -1.33447 0.54297
21964 hotTextCustom -0.26174 -0.26174
30716 multipleChoice -0.73916 -0.73916
30717 multipleChoice 1.0441 1.0441
30736 multipleChoice, multipleChoice
1.61641 0.16787 0.89214
21855 multipleChoice, multipleSelect
-0.89065 1.48073 0.29504
15741 grid 0.33478 0.43615 0.385465
19740 multipleChoice, multipleSelect
0.70423 -0.15814 0.273045
19765 multipleChoice, multipleChoice
-0.41538 -0.79806 -0.60672
21952 multipleChoice -0.11643 -0.11643
16821 grid -2.09801 2.92869 0.41534
19724 multipleSelect 1.36051 1.36051
21816 multipleChoice, multipleChoice
0.22403 -0.90413 -0.34005
21950 multipleChoice, multipleChoice
2.07563 -0.86449 0.60557
20949 multipleChoice -1.30796 -1.30796
21736 multipleChoice, multipleChoice
1.73655 -1.75449 -0.00897
22139 multipleChoice, multipleSelect
0.80401 1.71007 1.25704
20849 multipleChoice -0.47215 -0.47215
19745 multipleChoice, multipleChoice
0.51228 1.13658 0.82443
19045 multipleSelect 1.16235 1.16235
19061 multipleChoice, multipleChoice
0.58391 -0.21505 0.18443
19871 multipleChoice -0.84131 -0.84131
19812 multipleChoice 0.71513 0.71513
20868 multipleChoice 0.86757 0.86757
21867 multipleChoice, multipleChoice
1.26671 -1.24791 0.0094
14955 grid 0.15002 -2.40744 -1.12871
21784 multipleChoice, multipleChoice
2.6407 -0.92816 0.85627
21780 tableMatch 0.05067 0.05067
18902 tableMatch 0.65411 0.65411
19894 multipleChoice -1.96274 -1.96274
Table E26. Operational Item Parameter Estimates – American History
Item Item Type Item Parameter Estimates Average
Rasch Value Step 1 Step 2 Step 3 Step 4
20883 multipleChoice -0.67498 -0.67498
21460 multipleChoice 0.10604 0.10604
18036 multipleChoice -0.22318 -0.22318
18973 multipleSelect 0.95043 0.95043
21774 multipleChoice, multipleChoice
2.95645 -1.52593 0.71526
19326 multipleChoice 0.25066 0.25066
29911 multipleChoice -0.42785 -0.42785
29957 multipleChoice -0.36061 -0.36061
29958 multipleChoice, multipleSelect
-1.02121 1.83121 0.405
29964 multipleChoice 0.49904 0.49904
29961 multipleChoice -0.44507 -0.44507
21975 multipleChoice, multipleChoice
1.81701 1.13552 1.476265
21818 multipleChoice, multipleChoice
0.00044 -0.21772 -0.10864
14828 grid -2.04957 1.43645 -0.30656
19389 multipleSelect 0.98877 0.98877
21576 multipleChoice -0.55486 -0.55486
21141 multipleChoice -1.62106 -1.62106
20774 multipleChoice -0.19826 -0.19826
18003 multipleChoice -1.07938 -1.07938
21753 multipleChoice, multipleChoice
1.58199 -0.84331 0.36934
21441 multipleChoice -0.41929 -0.41929
21875 multipleChoice, multipleChoice
1.67627 -0.63645 0.51991
21136 multipleChoice -1.03587 -1.03587
15232 grid 0.35952 -0.87663 -0.25856
18016 multipleChoice -0.27497 -0.27497
19526 multipleSelect 0.72824 0.72824
28733 multipleChoice -0.1197 -0.1197
18139 multipleChoice -0.01626 -0.01626
18081 multipleChoice 0.19987 0.19987
14784 grid -0.23448 0.50972 0.13762
16047 multipleChoice 0.21402 0.21402
18156 multipleChoice -0.82391 -0.82391
18191 multipleChoice -0.76695 -0.76695
18093 multipleChoice -0.41319 -0.41319
18920 hotTextCustom -0.03893 -0.03893
18178 multipleChoice -0.22247 -0.22247
21384 multipleChoice 0.80694 0.80694
19091 multipleChoice, multipleSelect
-0.03477 1.91723 0.94123
18102 multipleChoice -0.28109 -0.28109
19769 multipleChoice, multipleChoice
0.80219 -0.27797 0.26211
16051 multipleChoice -0.41845 -0.41845
20900 multipleChoice -0.16332 -0.16332
16696 grid -1.35901 1.08607 -0.13647
18140 multipleChoice -0.63278 -0.63278
20749 multipleChoice 0.06319 0.06319
18065 multipleChoice -0.34193 -0.34193
20933 multipleChoice 0.83768 0.83768
29897 multipleChoice, multipleChoice
0.79864 0.243 0.52082
16783 grid 0.02495 0.02495
21326 multipleChoice -0.28605 -0.28605
Table F. OST Performance Standards – Spring 2018
Test Basic Theta Basic Scaled Score Proficient Theta Proficient Scaled Score Accelerated Theta Accelerated Scaled Score Advanced Theta Advanced Scaled Score
ELA
Grade 3 -0.70 672 -0.09 700 0.46 725 1.06 752
Grade 4 -0.56 674 0.06 700 0.65 725 1.32 753
Grade 5 -0.74 669 0.00 700 0.59 725 1.29 755
Grade 6 -0.83 668 -0.07 700 0.52 725 1.14 751
Grade 7 -0.80 670 -0.01 700 0.65 725 1.29 749
Grade 8 -0.43 682 0.15 700 0.95 725 1.55 744
EOC ELA I -0.71 683 -0.11 700 0.79 725 1.31 739
EOC ELA II -0.77 679 -0.08 700 0.75 725 1.30 742
Mathematics
Grade 3 -0.61 683 -0.08 700 0.68 725 1.53 753
Grade 4 -1.05 686 -0.61 700 0.15 725 1.19 759
Grade 5 -1.05 687 -0.54 700 0.43 725 1.35 749
Grade 6 -0.83 682 -0.12 700 0.89 725 1.65 744
Grade 7 -0.76 684 -0.19 700 0.68 725 1.74 755
Grade 8 -0.69 690 -0.18 700 1.06 725 2.00 744
Algebra -1.21 682 -0.57 700 0.32 725 1.37 754
Geometry -1.63 678 -0.89 700 -0.04 725 1.01 756
Int Math I -1.15 682 -0.52 700 0.37 725 1.42 754
Int Math II -1.37 677 -0.63 700 0.17 725 1.23 758
Science
Grade 5 -0.92 664 -0.04 700 0.57 725 1.25 753
Grade 8 -1.14 674 -0.51 700 0.09 725 1.08 766
Biology -1.19 685 -0.67 700 0.18 725 0.51 735
Physical Science -1.56 684 -0.94 700 0.02 725 0.95 749
Social Studies
American History -0.98 684 -0.37 700 0.60 725 1.12 738
American Government -1.11 687 -0.41 700 0.92 725 1.66 739
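To illustrate how the theta cuts in Table F are used, the sketch below classifies a theta estimate into a performance level using the Grade 3 ELA cuts shown above (Basic -0.70, Proficient -0.09, Accelerated 0.46, Advanced 1.06). This is an illustrative sketch only and is not the operational scoring procedure; the rounding, truncation, and theta-to-scaled-score conversion rules applied in reporting are documented in Appendix G.

# Illustrative sketch (not the operational scoring code): classify a theta
# estimate against the Grade 3 ELA theta cuts reported in Table F.
GRADE3_ELA_THETA_CUTS = [
    (-0.70, "Basic"),
    (-0.09, "Proficient"),
    (0.46, "Accelerated"),
    (1.06, "Advanced"),
]

def performance_level(theta, cuts=GRADE3_ELA_THETA_CUTS):
    # A student is classified at the highest level whose theta cut the
    # estimate meets or exceeds; an estimate below the lowest cut is Limited.
    level = "Limited"
    for cut, label in cuts:
        if theta >= cut:
            level = label
    return level

# Example: a Grade 3 ELA theta of 0.10 meets the Proficient cut (-0.09)
# but not the Accelerated cut (0.46).
print(performance_level(0.10))  # Proficient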
Table G1. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 Reading
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 545 545 Promotion No 1 ‐3.50 545 545 Promotion No 2 ‐3.01 567 567 Promotion No 3 ‐2.56 588 588 Promotion No 4 ‐2.22 603 603 Promotion No 5 ‐1.95 616 616 Promotion No 6 ‐1.72 626 626 Promotion No 7 ‐1.52 635 635 Promotion No 8 ‐1.34 643 643 Promotion No 9 ‐1.18 651 651 Promotion No 10 ‐1.03 657 657 Promotion No 11 ‐0.89 664 664 Promotion No 12 ‐0.75 670 672 Promotion Yes 13 ‐0.63 676 676 Promotion Yes 14 ‐0.50 681 681 Promotion Yes 15 ‐0.38 687 687 Promotion Yes 16 ‐0.27 692 692 Promotion Yes 17 ‐0.15 697 697 Promotion Yes 18 ‐0.04 702 702 Promotion Yes 19 0.08 708 708 Promotion Yes 20 0.19 713 713 Promotion Yes 21 0.31 718 718 Promotion Yes 22 0.43 724 725 Promotion Yes 23 0.56 729 729 Promotion Yes 24 0.69 735 735 Promotion Yes 25 0.82 741 741 Promotion Yes 26 0.96 748 748 Promotion Yes 27 1.11 755 755 Promotion Yes 28 1.27 762 762 Promotion Yes 29 1.44 770 770 Promotion Yes 30 1.62 778 778 Promotion Yes 31 1.82 787 787 Promotion Yes 32 2.03 796 796 Promotion Yes 33 2.26 807 807 Promotion Yes 34 2.51 818 818 Promotion Yes 35 2.80 831 831 Promotion Yes 36 3.12 846 846 Promotion Yes 37 3.50 863 863 Promotion Yes 38 3.50 863 863 Promotion Yes 39 3.50 863 863 Promotion Yes
Proficiency 40 3.50 863 863 Promotion Yes
Table G2. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 ELA Online
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.01 567 567 Limited 3 ‐2.56 588 588 Limited 4 ‐2.22 603 603 Limited 5 ‐1.95 616 616 Limited 6 ‐1.72 626 626 Limited 7 ‐1.52 635 635 Limited 8 ‐1.34 643 643 Limited 9 ‐1.18 651 651 Limited 10 ‐1.03 657 657 Limited 11 ‐0.89 664 664 Limited 12 ‐0.75 670 672 Basic 13 ‐0.63 676 676 Basic 14 ‐0.50 681 681 Basic 15 ‐0.38 687 687 Basic 16 ‐0.27 692 692 Basic 17 ‐0.15 697 697 Basic 18 ‐0.04 702 702 Proficient 19 0.08 708 708 Proficient 20 0.19 713 713 Proficient 21 0.31 718 718 Proficient 22 0.43 724 725 Accelerated 23 0.56 729 729 Accelerated 24 0.69 735 735 Accelerated 25 0.82 741 741 Accelerated 26 0.96 748 748 Accelerated 27 1.11 755 755 Advanced 28 1.27 762 762 Advanced 29 1.44 770 770 Advanced 30 1.62 778 778 Advanced 31 1.82 787 787 Advanced 32 2.03 796 796 Advanced 33 2.26 807 807 Advanced 34 2.51 818 818 Advanced 35 2.80 831 831 Advanced 36 3.12 846 846 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced
Proficiency 40 3.50 863 863 Advanced
Table G3. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Grade 3 ELA Paper
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.01 567 567 Limited 3 ‐2.56 588 588 Limited 4 ‐2.22 603 603 Limited 5 ‐1.95 616 616 Limited 6 ‐1.72 626 626 Limited 7 ‐1.52 635 635 Limited 8 ‐1.34 643 643 Limited 9 ‐1.18 651 651 Limited 10 ‐1.03 657 657 Limited 11 ‐0.89 664 664 Limited 12 ‐0.75 670 672 Basic 13 ‐0.63 676 676 Basic 14 ‐0.50 681 681 Basic 15 ‐0.38 687 687 Basic 16 ‐0.27 692 692 Basic 17 ‐0.15 697 697 Basic 18 ‐0.04 702 702 Proficient 19 0.08 708 708 Proficient 20 0.19 713 713 Proficient 21 0.31 718 718 Proficient 22 0.43 724 725 Accelerated 23 0.56 729 729 Accelerated 24 0.69 735 735 Accelerated 25 0.82 741 741 Accelerated 26 0.96 748 748 Accelerated 27 1.11 755 755 Advanced 28 1.27 762 762 Advanced 29 1.44 770 770 Advanced 30 1.62 778 778 Advanced 31 1.82 787 787 Advanced 32 2.03 796 796 Advanced 33 2.26 807 807 Advanced 34 2.51 818 818 Advanced 35 2.80 831 831 Advanced 36 3.12 846 846 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced
Proficiency 40 3.50 863 863 Advanced
Table G4. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA I Online
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 606 606 Limited 1 ‐3.50 606 606 Limited 2 ‐3.50 606 606 Limited 3 ‐3.17 615 615 Limited 4 ‐2.85 624 624 Limited 5 ‐2.60 631 631 Limited 6 ‐2.39 637 637 Limited 7 ‐2.20 642 642 Limited 8 ‐2.03 647 647 Limited 9 ‐1.88 651 651 Limited 10 ‐1.75 655 655 Limited 11 ‐1.62 658 658 Limited 12 ‐1.50 661 661 Limited 13 ‐1.39 665 665 Limited 14 ‐1.28 668 668 Limited 15 ‐1.18 670 670 Limited 16 ‐1.08 673 673 Limited 17 ‐0.98 676 676 Limited 18 ‐0.89 678 678 Limited 19 ‐0.80 681 681 Limited 20 ‐0.72 683 683 Basic 21 ‐0.63 685 685 Basic 22 ‐0.55 688 688 Basic 23 ‐0.47 690 690 Basic 24 ‐0.39 692 692 Basic 25 ‐0.31 694 694 Basic 26 ‐0.24 697 697 Basic 27 ‐0.16 699 699 Basic 28 ‐0.08 701 701 Proficient 29 ‐0.01 703 703 Proficient 30 0.07 705 705 Proficient 31 0.15 707 707 Proficient 32 0.22 709 709 Proficient 33 0.30 711 711 Proficient 34 0.37 713 713 Proficient 35 0.45 716 716 Proficient 36 0.53 718 718 Proficient 37 0.61 720 720 Proficient 38 0.69 722 722 Proficient 39 0.77 725 725 Accelerated
Proficiency 40 0.86 727 727 Accelerated 41 0.95 729 729 Accelerated 42 1.04 732 732 Accelerated 43 1.13 734 734 Accelerated 44 1.22 737 737 Accelerated 45 1.32 740 740 Advanced 46 1.43 743 743 Advanced 47 1.54 746 746 Advanced 48 1.65 749 749 Advanced 49 1.78 752 752 Advanced 50 1.91 756 756 Advanced 51 2.05 760 760 Advanced 52 2.20 764 764 Advanced 53 2.36 769 769 Advanced 54 2.55 774 774 Advanced 55 2.75 780 780 Advanced 56 2.99 786 786 Advanced 57 3.26 794 794 Advanced 58 3.50 800 800 Advanced 59 3.50 800 800 Advanced 60 3.50 800 800 Advanced 61 3.50 800 800 Advanced 62 3.50 800 800 Advanced
Table G5. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA I Paper
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 606 606 Limited 1 ‐3.50 606 606 Limited 2 ‐3.50 606 606 Limited 3 ‐3.17 615 615 Limited 4 ‐2.85 624 624 Limited 5 ‐2.60 631 631 Limited 6 ‐2.39 637 637 Limited 7 ‐2.20 642 642 Limited 8 ‐2.03 647 647 Limited 9 ‐1.88 651 651 Limited 10 ‐1.75 655 655 Limited 11 ‐1.62 658 658 Limited 12 ‐1.50 661 661 Limited 13 ‐1.39 665 665 Limited 14 ‐1.28 668 668 Limited 15 ‐1.18 670 670 Limited 16 ‐1.08 673 673 Limited 17 ‐0.98 676 676 Limited 18 ‐0.89 678 678 Limited 19 ‐0.80 681 681 Limited 20 ‐0.72 683 683 Basic 21 ‐0.63 685 685 Basic 22 ‐0.55 688 688 Basic 23 ‐0.47 690 690 Basic 24 ‐0.39 692 692 Basic 25 ‐0.31 694 694 Basic 26 ‐0.24 697 697 Basic 27 ‐0.16 699 699 Basic 28 ‐0.08 701 701 Proficient 29 ‐0.01 703 703 Proficient 30 0.07 705 705 Proficient 31 0.15 707 707 Proficient 32 0.22 709 709 Proficient 33 0.30 711 711 Proficient 34 0.37 713 713 Proficient 35 0.45 716 716 Proficient 36 0.53 718 718 Proficient 37 0.61 720 720 Proficient 38 0.69 722 722 Proficient 39 0.77 725 725 Accelerated
Proficiency 40 0.86 727 727 Accelerated 41 0.95 729 729 Accelerated 42 1.04 732 732 Accelerated 43 1.13 734 734 Accelerated 44 1.22 737 737 Accelerated 45 1.32 740 740 Advanced 46 1.43 743 743 Advanced 47 1.54 746 746 Advanced 48 1.65 749 749 Advanced 49 1.78 752 752 Advanced 50 1.91 756 756 Advanced 51 2.05 760 760 Advanced 52 2.20 764 764 Advanced 53 2.36 769 769 Advanced 54 2.55 774 774 Advanced 55 2.75 780 780 Advanced 56 2.99 786 786 Advanced 57 3.26 794 794 Advanced 58 3.50 800 800 Advanced 59 3.50 800 800 Advanced 60 3.50 800 800 Advanced 61 3.50 800 800 Advanced 62 3.50 800 800 Advanced
Table G6. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA II Online
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.30 603 603 Limited 4 ‐2.97 613 613 Limited 5 ‐2.70 621 621 Limited 6 ‐2.48 628 628 Limited 7 ‐2.29 633 633 Limited 8 ‐2.12 639 639 Limited 9 ‐1.97 643 643 Limited 10 ‐1.83 647 647 Limited 11 ‐1.70 651 651 Limited 12 ‐1.58 655 655 Limited 13 ‐1.46 658 658 Limited 14 ‐1.35 662 662 Limited 15 ‐1.25 665 665 Limited 16 ‐1.15 668 668 Limited 17 ‐1.06 671 671 Limited 18 ‐0.96 673 673 Limited 19 ‐0.88 676 676 Limited 20 ‐0.79 679 679 Basic 21 ‐0.70 681 681 Basic 22 ‐0.62 684 684 Basic 23 ‐0.54 686 686 Basic 24 ‐0.46 689 689 Basic 25 ‐0.38 691 691 Basic 26 ‐0.30 693 693 Basic 27 ‐0.22 696 696 Basic 28 ‐0.15 698 698 Basic 29 ‐0.07 700 700 Proficient 30 0.01 703 703 Proficient 31 0.08 705 705 Proficient 32 0.16 707 707 Proficient 33 0.23 709 709 Proficient 34 0.31 712 712 Proficient 35 0.39 714 714 Proficient 36 0.46 716 716 Proficient 37 0.54 719 719 Proficient 38 0.62 721 721 Proficient 39 0.70 724 724 Proficient
Proficiency 40 0.79 726 726 Accelerated 41 0.87 729 729 Accelerated 42 0.96 731 731 Accelerated 43 1.05 734 734 Accelerated 44 1.14 737 737 Accelerated 45 1.24 740 740 Accelerated 46 1.34 743 743 Advanced 47 1.45 746 746 Advanced 48 1.56 749 749 Advanced 49 1.68 753 753 Advanced 50 1.81 757 757 Advanced 51 1.94 761 761 Advanced 52 2.09 765 765 Advanced 53 2.25 770 770 Advanced 54 2.42 775 775 Advanced 55 2.62 781 781 Advanced 56 2.84 788 788 Advanced 57 3.09 795 795 Advanced 58 3.39 804 804 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced 62 3.50 808 808 Advanced
Table G7. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – High School ELA II Paper
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.30 603 603 Limited 4 ‐2.97 613 613 Limited 5 ‐2.70 621 621 Limited 6 ‐2.48 628 628 Limited 7 ‐2.29 633 633 Limited 8 ‐2.12 639 639 Limited 9 ‐1.97 643 643 Limited 10 ‐1.83 647 647 Limited 11 ‐1.70 651 651 Limited 12 ‐1.58 655 655 Limited 13 ‐1.46 658 658 Limited 14 ‐1.35 662 662 Limited 15 ‐1.25 665 665 Limited 16 ‐1.15 668 668 Limited 17 ‐1.06 671 671 Limited 18 ‐0.96 673 673 Limited 19 ‐0.88 676 676 Limited 20 ‐0.79 679 679 Basic 21 ‐0.70 681 681 Basic 22 ‐0.62 684 684 Basic 23 ‐0.54 686 686 Basic 24 ‐0.46 689 689 Basic 25 ‐0.38 691 691 Basic 26 ‐0.30 693 693 Basic 27 ‐0.22 696 696 Basic 28 ‐0.15 698 698 Basic 29 ‐0.07 700 700 Proficient 30 0.01 703 703 Proficient 31 0.08 705 705 Proficient 32 0.16 707 707 Proficient 33 0.23 709 709 Proficient 34 0.31 712 712 Proficient 35 0.39 714 714 Proficient 36 0.46 716 716 Proficient 37 0.54 719 719 Proficient 38 0.62 721 721 Proficient 39 0.70 724 724 Proficient
Proficiency 40 0.79 726 726 Accelerated 41 0.87 729 729 Accelerated 42 0.96 731 731 Accelerated 43 1.05 734 734 Accelerated 44 1.14 737 737 Accelerated 45 1.24 740 740 Accelerated 46 1.34 743 743 Advanced 47 1.45 746 746 Advanced 48 1.56 749 749 Advanced 49 1.68 753 753 Advanced 50 1.81 757 757 Advanced 51 1.94 761 761 Advanced 52 2.09 765 765 Advanced 53 2.25 770 770 Advanced 54 2.42 775 775 Advanced 55 2.62 781 781 Advanced 56 2.84 788 788 Advanced 57 3.09 795 795 Advanced 58 3.39 804 804 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced 62 3.50 808 808 Advanced
Table G8. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Algebra Online
Raw Score
Ohio Theta
Before Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Scaled Score
After Ohio Rounding/Truncation
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.08 630 630 Limited 4 ‐2.76 639 639 Limited 5 ‐2.50 646 646 Limited 6 ‐2.28 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.77 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.14 684 684 Basic 15 ‐1.03 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.73 696 696 Basic 19 ‐0.63 698 698 Basic 20 ‐0.53 701 701 Proficient 21 ‐0.44 704 704 Proficient 22 ‐0.34 706 706 Proficient 23 ‐0.25 709 709 Proficient 24 ‐0.15 712 712 Proficient 25 ‐0.06 714 714 Proficient 26 0.03 717 717 Proficient 27 0.13 720 720 Proficient 28 0.22 722 722 Proficient 29 0.32 725 725 Accelerated 30 0.41 728 728 Accelerated 31 0.51 730 730 Accelerated 32 0.61 733 733 Accelerated 33 0.71 736 736 Accelerated 34 0.82 739 739 Accelerated 35 0.92 742 742 Accelerated 36 1.03 745 745 Accelerated 37 1.14 748 748 Accelerated 38 1.26 751 751 Accelerated 39 1.37 755 755 Advanced
Proficiency 40 1.49 758 758 Advanced 41 1.62 762 762 Advanced 42 1.75 765 765 Advanced 43 1.89 769 769 Advanced 44 2.03 773 773 Advanced 45 2.18 777 777 Advanced 46 2.34 782 782 Advanced 47 2.51 787 787 Advanced 48 2.70 792 792 Advanced 49 2.92 798 798 Advanced 50 3.17 805 805 Advanced 51 3.49 814 814 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G9. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Algebra Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.08 630 630 Limited 4 ‐2.76 639 639 Limited 5 ‐2.50 646 646 Limited 6 ‐2.28 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.77 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.14 684 684 Basic 15 ‐1.03 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.73 696 696 Basic 19 ‐0.63 698 698 Basic 20 ‐0.53 701 701 Proficient 21 ‐0.44 704 704 Proficient 22 ‐0.34 706 706 Proficient 23 ‐0.25 709 709 Proficient 24 ‐0.15 712 712 Proficient 25 ‐0.06 714 714 Proficient 26 0.03 717 717 Proficient 27 0.13 720 720 Proficient 28 0.22 722 722 Proficient 29 0.32 725 725 Accelerated 30 0.41 728 728 Accelerated 31 0.51 730 730 Accelerated 32 0.61 733 733 Accelerated 33 0.71 736 736 Accelerated 34 0.82 739 739 Accelerated 35 0.92 742 742 Accelerated 36 1.03 745 745 Accelerated 37 1.14 748 748 Accelerated 38 1.26 751 751 Accelerated 39 1.37 755 755 Advanced
Proficiency 40 1.49 758 758 Advanced 41 1.62 762 762 Advanced 42 1.75 765 765 Advanced 43 1.89 769 769 Advanced 44 2.03 773 773 Advanced 45 2.18 777 777 Advanced 46 2.34 782 782 Advanced 47 2.51 787 787 Advanced 48 2.70 792 792 Advanced 49 2.92 798 798 Advanced 50 3.17 805 805 Advanced 51 3.49 814 814 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G10. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Geometry Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.08 617 617 Limited 3 ‐2.62 630 630 Limited 4 ‐2.28 640 640 Limited 5 ‐2.01 648 648 Limited 6 ‐1.78 655 655 Limited 7 ‐1.58 661 661 Limited 8 ‐1.40 666 666 Limited 9 ‐1.23 671 671 Limited 10 ‐1.08 675 675 Limited 11 ‐0.94 679 679 Basic 12 ‐0.81 683 683 Basic 13 ‐0.68 687 687 Basic 14 ‐0.56 691 691 Basic 15 ‐0.45 694 694 Basic 16 ‐0.34 697 697 Basic 17 ‐0.23 700 700 Proficient 18 ‐0.12 704 704 Proficient 19 ‐0.02 707 707 Proficient 20 0.08 709 709 Proficient 21 0.17 712 712 Proficient 22 0.27 715 715 Proficient 23 0.36 718 718 Proficient 24 0.46 721 721 Proficient 25 0.55 723 723 Proficient 26 0.64 726 726 Accelerated 27 0.74 729 729 Accelerated 28 0.83 732 732 Accelerated 29 0.92 734 734 Accelerated 30 1.01 737 737 Accelerated 31 1.11 740 740 Accelerated 32 1.20 743 743 Accelerated 33 1.30 745 745 Accelerated 34 1.39 748 748 Accelerated 35 1.49 751 751 Accelerated 36 1.59 754 754 Accelerated 37 1.70 757 757 Advanced 38 1.80 760 760 Advanced 39 1.91 763 763 Advanced
Proficiency 40 2.02 767 767 Advanced 41 2.14 770 770 Advanced 42 2.26 774 774 Advanced 43 2.39 777 777 Advanced 44 2.53 781 781 Advanced 45 2.67 786 786 Advanced 46 2.83 790 790 Advanced 47 3.00 795 795 Advanced 48 3.19 801 801 Advanced 49 3.41 807 807 Advanced 50 3.50 810 810 Advanced 51 3.50 810 810 Advanced 52 3.50 810 810 Advanced 53 3.50 810 810 Advanced 54 3.50 810 810 Advanced
Table G11. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Geometry Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.08 617 617 Limited 3 ‐2.62 630 630 Limited 4 ‐2.28 640 640 Limited 5 ‐2.01 648 648 Limited 6 ‐1.78 655 655 Limited 7 ‐1.58 661 661 Limited 8 ‐1.40 666 666 Limited 9 ‐1.23 671 671 Limited 10 ‐1.08 675 675 Limited 11 ‐0.94 679 679 Basic 12 ‐0.81 683 683 Basic 13 ‐0.68 687 687 Basic 14 ‐0.56 691 691 Basic 15 ‐0.45 694 694 Basic 16 ‐0.34 697 697 Basic 17 ‐0.23 700 700 Proficient 18 ‐0.12 704 704 Proficient 19 ‐0.02 707 707 Proficient 20 0.08 709 709 Proficient 21 0.17 712 712 Proficient 22 0.27 715 715 Proficient 23 0.36 718 718 Proficient 24 0.46 721 721 Proficient 25 0.55 723 723 Proficient 26 0.64 726 726 Accelerated 27 0.74 729 729 Accelerated 28 0.83 732 732 Accelerated 29 0.92 734 734 Accelerated 30 1.01 737 737 Accelerated 31 1.11 740 740 Accelerated 32 1.20 743 743 Accelerated 33 1.30 745 745 Accelerated 34 1.39 748 748 Accelerated 35 1.49 751 751 Accelerated 36 1.59 754 754 Accelerated 37 1.70 757 757 Advanced 38 1.80 760 760 Advanced 39 1.91 763 763 Advanced
Proficiency 40 2.02 767 767 Advanced 41 2.14 770 770 Advanced 42 2.26 774 774 Advanced 43 2.39 777 777 Advanced 44 2.53 781 781 Advanced 45 2.67 786 786 Advanced 46 2.83 790 790 Advanced 47 3.00 795 795 Advanced 48 3.19 801 801 Advanced 49 3.41 807 807 Advanced 50 3.50 810 810 Advanced 51 3.50 810 810 Advanced 52 3.50 810 810 Advanced 53 3.50 810 810 Advanced 54 3.50 810 810 Advanced
Table G12. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math I Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.21 626 626 Limited 4 ‐2.88 635 635 Limited 5 ‐2.62 642 642 Limited 6 ‐2.40 649 649 Limited 7 ‐2.20 654 654 Limited 8 ‐2.03 659 659 Limited 9 ‐1.87 663 663 Limited 10 ‐1.73 667 667 Limited 11 ‐1.59 671 671 Limited 12 ‐1.47 675 675 Limited 13 ‐1.35 678 678 Limited 14 ‐1.23 681 682 Basic 15 ‐1.12 685 685 Basic 16 ‐1.01 688 688 Basic 17 ‐0.91 690 690 Basic 18 ‐0.81 693 693 Basic 19 ‐0.71 696 696 Basic 20 ‐0.62 699 699 Basic 21 ‐0.52 701 701 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.33 707 707 Proficient 24 ‐0.24 709 709 Proficient 25 ‐0.15 712 712 Proficient 26 ‐0.06 714 714 Proficient 27 0.04 717 717 Proficient 28 0.13 720 720 Proficient 29 0.22 722 722 Proficient 30 0.32 725 725 Accelerated 31 0.41 728 728 Accelerated 32 0.51 730 730 Accelerated 33 0.61 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.92 742 742 Accelerated 37 1.03 745 745 Accelerated 38 1.14 748 748 Accelerated 39 1.26 751 751 Accelerated
Proficiency 40 1.38 755 755 Advanced 41 1.50 758 758 Advanced 42 1.63 762 762 Advanced 43 1.77 766 766 Advanced 44 1.91 770 770 Advanced 45 2.06 774 774 Advanced 46 2.22 778 778 Advanced 47 2.40 783 783 Advanced 48 2.60 789 789 Advanced 49 2.81 795 795 Advanced 50 3.07 802 802 Advanced 51 3.38 811 811 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G13. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math I Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.21 626 626 Limited 4 ‐2.88 635 635 Limited 5 ‐2.62 642 642 Limited 6 ‐2.40 649 649 Limited 7 ‐2.20 654 654 Limited 8 ‐2.03 659 659 Limited 9 ‐1.87 663 663 Limited 10 ‐1.73 667 667 Limited 11 ‐1.59 671 671 Limited 12 ‐1.47 675 675 Limited 13 ‐1.35 678 678 Limited 14 ‐1.23 681 682 Basic 15 ‐1.12 685 685 Basic 16 ‐1.01 688 688 Basic 17 ‐0.91 690 690 Basic 18 ‐0.81 693 693 Basic 19 ‐0.71 696 696 Basic 20 ‐0.62 699 699 Basic 21 ‐0.52 701 701 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.33 707 707 Proficient 24 ‐0.24 709 709 Proficient 25 ‐0.15 712 712 Proficient 26 ‐0.06 714 714 Proficient 27 0.04 717 717 Proficient 28 0.13 720 720 Proficient 29 0.22 722 722 Proficient 30 0.32 725 725 Accelerated 31 0.41 728 728 Accelerated 32 0.51 730 730 Accelerated 33 0.61 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.92 742 742 Accelerated 37 1.03 745 745 Accelerated 38 1.14 748 748 Accelerated 39 1.26 751 751 Accelerated
Proficiency 40 1.38 755 755 Advanced 41 1.50 758 758 Advanced 42 1.63 762 762 Advanced 43 1.77 766 766 Advanced 44 1.91 770 770 Advanced 45 2.06 774 774 Advanced 46 2.22 778 778 Advanced 47 2.40 783 783 Advanced 48 2.60 789 789 Advanced 49 2.81 795 795 Advanced 50 3.07 802 802 Advanced 51 3.38 811 811 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G14. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math II Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.18 604 604 Limited 3 ‐2.72 618 618 Limited 4 ‐2.39 629 629 Limited 5 ‐2.12 637 637 Limited 6 ‐1.88 645 645 Limited 7 ‐1.68 651 651 Limited 8 ‐1.50 657 657 Limited 9 ‐1.34 662 662 Limited 10 ‐1.18 667 667 Limited 11 ‐1.04 671 671 Limited 12 ‐0.91 675 675 Limited 13 ‐0.78 679 679 Basic 14 ‐0.65 683 683 Basic 15 ‐0.54 687 687 Basic 16 ‐0.42 690 690 Basic 17 ‐0.31 694 694 Basic 18 ‐0.21 697 697 Basic 19 ‐0.10 700 700 Proficient 20 0.00 704 704 Proficient 21 0.10 707 707 Proficient 22 0.20 710 710 Proficient 23 0.30 713 713 Proficient 24 0.40 716 716 Proficient 25 0.49 719 719 Proficient 26 0.59 722 722 Proficient 27 0.68 725 725 Accelerated 28 0.78 728 728 Accelerated 29 0.88 731 731 Accelerated 30 0.97 734 734 Accelerated 31 1.07 737 737 Accelerated 32 1.17 740 740 Accelerated 33 1.27 743 743 Accelerated 34 1.37 746 746 Accelerated 35 1.48 750 750 Accelerated 36 1.59 753 753 Accelerated 37 1.69 757 758 Advanced 38 1.81 760 760 Advanced 39 1.93 764 764 Advanced
Proficiency 40 2.05 768 768 Advanced 41 2.17 771 771 Advanced 42 2.31 776 776 Advanced 43 2.45 780 780 Advanced 44 2.60 785 785 Advanced 45 2.75 790 790 Advanced 46 2.93 795 795 Advanced 47 3.11 801 801 Advanced 48 3.32 807 807 Advanced 49 3.50 813 813 Advanced 50 3.50 813 813 Advanced 51 3.50 813 813 Advanced 52 3.50 813 813 Advanced 53 3.50 813 813 Advanced 54 3.50 813 813 Advanced
Table G15. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Integrated Math II Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.18 604 604 Limited 3 ‐2.72 618 618 Limited 4 ‐2.39 629 629 Limited 5 ‐2.12 637 637 Limited 6 ‐1.88 645 645 Limited 7 ‐1.68 651 651 Limited 8 ‐1.50 657 657 Limited 9 ‐1.34 662 662 Limited 10 ‐1.18 667 667 Limited 11 ‐1.04 671 671 Limited 12 ‐0.91 675 675 Limited 13 ‐0.78 679 679 Basic 14 ‐0.65 683 683 Basic 15 ‐0.54 687 687 Basic 16 ‐0.42 690 690 Basic 17 ‐0.31 694 694 Basic 18 ‐0.21 697 697 Basic 19 ‐0.10 700 700 Proficient 20 0.00 704 704 Proficient 21 0.10 707 707 Proficient 22 0.20 710 710 Proficient 23 0.30 713 713 Proficient 24 0.40 716 716 Proficient 25 0.49 719 719 Proficient 26 0.59 722 722 Proficient 27 0.68 725 725 Accelerated 28 0.78 728 728 Accelerated 29 0.88 731 731 Accelerated 30 0.97 734 734 Accelerated 31 1.07 737 737 Accelerated 32 1.17 740 740 Accelerated 33 1.27 743 743 Accelerated 34 1.37 746 746 Accelerated 35 1.48 750 750 Accelerated 36 1.59 753 753 Accelerated 37 1.69 757 758 Advanced 38 1.81 760 760 Advanced 39 1.93 764 764 Advanced
Proficiency 40 2.05 768 768 Advanced 41 2.17 771 771 Advanced 42 2.31 776 776 Advanced 43 2.45 780 780 Advanced 44 2.60 785 785 Advanced 45 2.75 790 790 Advanced 46 2.93 795 795 Advanced 47 3.11 801 801 Advanced 48 3.32 807 807 Advanced 49 3.50 813 813 Advanced 50 3.50 813 813 Advanced 51 3.50 813 813 Advanced 52 3.50 813 813 Advanced 53 3.50 813 813 Advanced 54 3.50 813 813 Advanced
Table G16. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Biology Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.18 626 626 Limited 3 ‐2.76 638 638 Limited 4 ‐2.46 647 647 Limited 5 ‐2.21 654 654 Limited 6 ‐2.02 660 660 Limited 7 ‐1.84 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.56 674 674 Limited 10 ‐1.43 678 678 Limited 11 ‐1.32 681 681 Limited 12 ‐1.21 684 685 Basic 13 ‐1.11 687 687 Basic 14 ‐1.02 690 690 Basic 15 ‐0.93 692 692 Basic 16 ‐0.84 695 695 Basic 17 ‐0.76 697 697 Basic 18 ‐0.68 700 700 Proficient 19 ‐0.60 702 702 Proficient 20 ‐0.52 704 704 Proficient 21 ‐0.45 707 707 Proficient 22 ‐0.37 709 709 Proficient 23 ‐0.30 711 711 Proficient 24 ‐0.23 713 713 Proficient 25 ‐0.16 715 715 Proficient 26 ‐0.08 717 717 Proficient 27 ‐0.01 719 719 Proficient 28 0.06 722 722 Proficient 29 0.13 724 724 Proficient 30 0.20 726 726 Accelerated 31 0.28 728 728 Accelerated 32 0.35 730 730 Accelerated 33 0.43 732 732 Accelerated 34 0.50 735 735 Advanced 35 0.58 737 737 Advanced 36 0.66 739 739 Advanced 37 0.74 742 742 Advanced 38 0.82 744 744 Advanced 39 0.91 747 747 Advanced
Proficiency 40 1.00 749 749 Advanced 41 1.09 752 752 Advanced 42 1.18 755 755 Advanced 43 1.28 758 758 Advanced 44 1.39 761 761 Advanced 45 1.50 764 764 Advanced 46 1.62 768 768 Advanced 47 1.75 771 771 Advanced 48 1.89 776 776 Advanced 49 2.05 780 780 Advanced 50 2.23 786 786 Advanced 51 2.44 792 792 Advanced 52 2.70 799 799 Advanced 53 3.02 809 809 Advanced 54 3.46 822 822 Advanced 55 3.50 823 823 Advanced 56 3.50 823 823 Advanced
Table G17. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Biology Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.18 626 626 Limited 3 ‐2.76 638 638 Limited 4 ‐2.45 647 647 Limited 5 ‐2.21 655 655 Limited 6 ‐2.01 660 660 Limited 7 ‐1.84 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.56 674 674 Limited 10 ‐1.43 678 678 Limited 11 ‐1.32 681 681 Limited 12 ‐1.21 684 685 Basic 13 ‐1.11 687 687 Basic 14 ‐1.02 690 690 Basic 15 ‐0.93 692 692 Basic 16 ‐0.84 695 695 Basic 17 ‐0.76 697 697 Basic 18 ‐0.68 700 700 Proficient 19 ‐0.60 702 702 Proficient 20 ‐0.52 704 704 Proficient 21 ‐0.45 707 707 Proficient 22 ‐0.37 709 709 Proficient 23 ‐0.30 711 711 Proficient 24 ‐0.23 713 713 Proficient 25 ‐0.16 715 715 Proficient 26 ‐0.09 717 717 Proficient 27 ‐0.02 719 719 Proficient 28 0.06 721 721 Proficient 29 0.13 724 724 Proficient 30 0.20 726 726 Accelerated 31 0.27 728 728 Accelerated 32 0.35 730 730 Accelerated 33 0.42 732 732 Accelerated 34 0.50 734 735 Advanced 35 0.57 737 737 Advanced 36 0.65 739 739 Advanced 37 0.73 741 741 Advanced 38 0.81 744 744 Advanced 39 0.90 746 746 Advanced
Proficiency 40 0.99 749 749 Advanced 41 1.08 752 752 Advanced 42 1.17 754 754 Advanced 43 1.27 757 757 Advanced 44 1.38 761 761 Advanced 45 1.49 764 764 Advanced 46 1.61 767 767 Advanced 47 1.74 771 771 Advanced 48 1.88 775 775 Advanced 49 2.04 780 780 Advanced 50 2.22 785 785 Advanced 51 2.43 792 792 Advanced 52 2.69 799 799 Advanced 53 3.01 809 809 Advanced 54 3.45 822 822 Advanced 55 3.50 823 823 Advanced 56 3.50 823 823 Advanced
Table G18. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Physical Science Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.39 637 637 Limited 3 ‐2.95 648 648 Limited 4 ‐2.63 656 656 Limited 5 ‐2.38 663 663 Limited 6 ‐2.16 668 668 Limited 7 ‐1.98 673 673 Limited 8 ‐1.81 677 677 Limited 9 ‐1.67 681 681 Limited 10 ‐1.53 685 685 Basic 11 ‐1.40 688 688 Basic 12 ‐1.29 691 691 Basic 13 ‐1.18 694 694 Basic 14 ‐1.07 697 697 Basic 15 ‐0.97 699 700 Proficient 16 ‐0.87 702 702 Proficient 17 ‐0.78 704 704 Proficient 18 ‐0.69 706 706 Proficient 19 ‐0.60 709 709 Proficient 20 ‐0.52 711 711 Proficient 21 ‐0.44 713 713 Proficient 22 ‐0.35 715 715 Proficient 23 ‐0.27 717 717 Proficient 24 ‐0.19 719 719 Proficient 25 ‐0.11 722 722 Proficient 26 ‐0.03 724 724 Proficient 27 0.05 726 726 Accelerated 28 0.13 728 728 Accelerated 29 0.21 730 730 Accelerated 30 0.29 732 732 Accelerated 31 0.37 734 734 Accelerated 32 0.45 736 736 Accelerated 33 0.53 738 738 Accelerated 34 0.62 740 740 Accelerated 35 0.70 743 743 Accelerated 36 0.79 745 745 Accelerated 37 0.88 747 747 Accelerated 38 0.98 750 750 Advanced 39 1.07 752 752 Advanced
Proficiency 40 1.17 755 755 Advanced 41 1.28 758 758 Advanced 42 1.39 760 760 Advanced 43 1.51 764 764 Advanced 44 1.64 767 767 Advanced 45 1.78 770 770 Advanced 46 1.93 774 774 Advanced 47 2.09 779 779 Advanced 48 2.28 783 783 Advanced 49 2.50 789 789 Advanced 50 2.76 796 796 Advanced 51 3.09 804 804 Advanced 52 3.50 815 815 Advanced 53 3.50 815 815 Advanced 54 3.50 815 815 Advanced
Table G19. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – Physical Science Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.46 635 635 Limited 3 ‐3.02 646 646 Limited 4 ‐2.69 655 655 Limited 5 ‐2.43 661 661 Limited 6 ‐2.22 667 667 Limited 7 ‐2.03 672 672 Limited 8 ‐1.86 676 676 Limited 9 ‐1.70 680 680 Limited 10 ‐1.56 684 684 Basic 11 ‐1.43 687 687 Basic 12 ‐1.31 691 691 Basic 13 ‐1.19 693 693 Basic 14 ‐1.08 696 696 Basic 15 ‐0.98 699 700 Proficient 16 ‐0.88 702 702 Proficient 17 ‐0.78 704 704 Proficient 18 ‐0.69 707 707 Proficient 19 ‐0.60 709 709 Proficient 20 ‐0.51 711 711 Proficient 21 ‐0.42 714 714 Proficient 22 ‐0.33 716 716 Proficient 23 ‐0.24 718 718 Proficient 24 ‐0.16 720 720 Proficient 25 ‐0.07 723 723 Proficient 26 0.01 725 725 Accelerated 27 0.10 727 727 Accelerated 28 0.18 729 729 Accelerated 29 0.27 731 731 Accelerated 30 0.35 734 734 Accelerated 31 0.44 736 736 Accelerated 32 0.53 738 738 Accelerated 33 0.62 741 741 Accelerated 34 0.72 743 743 Accelerated 35 0.81 745 745 Accelerated 36 0.91 748 749 Advanced 37 1.01 751 751 Advanced 38 1.12 753 753 Advanced 39 1.22 756 756 Advanced
Proficiency 40 1.34 759 759 Advanced 41 1.46 762 762 Advanced 42 1.59 766 766 Advanced 43 1.72 769 769 Advanced 44 1.87 773 773 Advanced 45 2.02 777 777 Advanced 46 2.19 781 781 Advanced 47 2.38 786 786 Advanced 48 2.59 792 792 Advanced 49 2.83 798 798 Advanced 50 3.10 805 805 Advanced 51 3.44 813 813 Advanced 52 3.50 815 815 Advanced 53 3.50 815 815 Advanced 54 3.50 815 815 Advanced
Table G20. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American Government Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.30 645 645 Limited 3 ‐2.88 654 654 Limited 4 ‐2.57 659 659 Limited 5 ‐2.32 664 664 Limited 6 ‐2.12 668 668 Limited 7 ‐1.95 671 671 Limited 8 ‐1.79 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.53 679 679 Limited 11 ‐1.42 681 681 Limited 12 ‐1.32 683 683 Limited 13 ‐1.22 685 685 Limited 14 ‐1.13 687 687 Basic 15 ‐1.04 688 688 Basic 16 ‐0.96 690 690 Basic 17 ‐0.88 691 691 Basic 18 ‐0.80 693 693 Basic 19 ‐0.73 694 694 Basic 20 ‐0.66 695 695 Basic 21 ‐0.59 697 697 Basic 22 ‐0.53 698 698 Basic 23 ‐0.46 699 699 Basic 24 ‐0.40 700 700 Proficient 25 ‐0.34 701 701 Proficient 26 ‐0.28 703 703 Proficient 27 ‐0.22 704 704 Proficient 28 ‐0.16 705 705 Proficient 29 ‐0.10 706 706 Proficient 30 ‐0.04 707 707 Proficient 31 0.02 708 708 Proficient 32 0.08 709 709 Proficient 33 0.14 710 710 Proficient 34 0.19 711 711 Proficient 35 0.25 713 713 Proficient 36 0.31 714 714 Proficient 37 0.37 715 715 Proficient 38 0.43 716 716 Proficient 39 0.50 717 717 Proficient
Proficiency 40 0.56 718 718 Proficient 41 0.62 719 719 Proficient 42 0.69 721 721 Proficient 43 0.76 722 722 Proficient 44 0.83 723 723 Proficient 45 0.90 725 725 Accelerated 46 0.98 726 726 Accelerated 47 1.05 728 728 Accelerated 48 1.14 729 729 Accelerated 49 1.22 731 731 Accelerated 50 1.32 733 733 Accelerated 51 1.42 734 734 Accelerated 52 1.52 736 736 Accelerated 53 1.64 739 739 Advanced 54 1.76 741 741 Advanced 55 1.90 744 744 Advanced 56 2.05 746 746 Advanced 57 2.23 750 750 Advanced 58 2.44 754 754 Advanced 59 2.69 758 758 Advanced 60 3.00 764 764 Advanced 61 3.44 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced
Table G21. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American Government Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.30 645 645 Limited 3 ‐2.88 654 654 Limited 4 ‐2.57 659 659 Limited 5 ‐2.32 664 664 Limited 6 ‐2.12 668 668 Limited 7 ‐1.95 671 671 Limited 8 ‐1.79 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.53 679 679 Limited 11 ‐1.42 681 681 Limited 12 ‐1.32 683 683 Limited 13 ‐1.22 685 685 Limited 14 ‐1.13 687 687 Basic 15 ‐1.04 688 688 Basic 16 ‐0.96 690 690 Basic 17 ‐0.88 691 691 Basic 18 ‐0.80 693 693 Basic 19 ‐0.73 694 694 Basic 20 ‐0.66 695 695 Basic 21 ‐0.59 697 697 Basic 22 ‐0.53 698 698 Basic 23 ‐0.46 699 699 Basic 24 ‐0.40 700 700 Proficient 25 ‐0.34 701 701 Proficient 26 ‐0.28 703 703 Proficient 27 ‐0.22 704 704 Proficient 28 ‐0.16 705 705 Proficient 29 ‐0.10 706 706 Proficient 30 ‐0.04 707 707 Proficient 31 0.02 708 708 Proficient 32 0.08 709 709 Proficient 33 0.14 710 710 Proficient 34 0.19 711 711 Proficient 35 0.25 713 713 Proficient 36 0.31 714 714 Proficient 37 0.37 715 715 Proficient 38 0.43 716 716 Proficient 39 0.50 717 717 Proficient
Proficiency 40 0.56 718 718 Proficient 41 0.62 719 719 Proficient 42 0.69 721 721 Proficient 43 0.76 722 722 Proficient 44 0.83 723 723 Proficient 45 0.90 725 725 Accelerated 46 0.98 726 726 Accelerated 47 1.05 728 728 Accelerated 48 1.14 729 729 Accelerated 49 1.22 731 731 Accelerated 50 1.32 733 733 Accelerated 51 1.42 734 734 Accelerated 52 1.52 736 736 Accelerated 53 1.64 739 739 Advanced 54 1.76 741 741 Advanced 55 1.90 744 744 Advanced 56 2.05 746 746 Advanced 57 2.23 750 750 Advanced 58 2.44 754 754 Advanced 59 2.69 758 758 Advanced 60 3.00 764 764 Advanced 61 3.44 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced
Table G22. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American History Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.39 622 622 Limited 3 ‐2.96 633 633 Limited 4 ‐2.65 641 641 Limited 5 ‐2.40 648 648 Limited 6 ‐2.19 653 653 Limited 7 ‐2.01 658 658 Limited 8 ‐1.85 662 662 Limited 9 ‐1.71 665 665 Limited 10 ‐1.58 669 669 Limited 11 ‐1.46 672 672 Limited 12 ‐1.34 675 675 Limited 13 ‐1.24 678 678 Limited 14 ‐1.14 680 680 Limited 15 ‐1.05 683 683 Limited 16 ‐0.96 685 685 Basic 17 ‐0.87 687 687 Basic 18 ‐0.79 689 689 Basic 19 ‐0.71 691 691 Basic 20 ‐0.63 693 693 Basic 21 ‐0.56 695 695 Basic 22 ‐0.48 697 697 Basic 23 ‐0.41 699 699 Basic 24 ‐0.34 701 701 Proficient 25 ‐0.27 702 702 Proficient 26 ‐0.21 704 704 Proficient 27 ‐0.14 706 706 Proficient 28 ‐0.08 707 707 Proficient 29 ‐0.01 709 709 Proficient 30 0.05 711 711 Proficient 31 0.12 712 712 Proficient 32 0.18 714 714 Proficient 33 0.24 716 716 Proficient 34 0.31 717 717 Proficient 35 0.37 719 719 Proficient 36 0.43 721 721 Proficient 37 0.50 722 722 Proficient 38 0.56 724 724 Proficient 39 0.63 726 726 Accelerated
Proficiency 40 0.69 727 727 Accelerated 41 0.76 729 729 Accelerated 42 0.83 731 731 Accelerated 43 0.90 733 733 Accelerated 44 0.97 734 734 Accelerated 45 1.04 736 736 Accelerated 46 1.11 738 738 Advanced 47 1.19 740 740 Advanced 48 1.27 742 742 Advanced 49 1.36 744 744 Advanced 50 1.44 747 747 Advanced 51 1.54 749 749 Advanced 52 1.63 752 752 Advanced 53 1.74 754 754 Advanced 54 1.85 757 757 Advanced 55 1.97 760 760 Advanced 56 2.10 764 764 Advanced 57 2.25 767 767 Advanced 58 2.42 772 772 Advanced 59 2.62 777 777 Advanced 60 2.85 783 783 Advanced 61 3.15 791 791 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced 64 3.50 800 800 Advanced
Table G23. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Fall 2017 – American History Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.39 622 622 Limited 3 ‐2.96 633 633 Limited 4 ‐2.65 641 641 Limited 5 ‐2.40 648 648 Limited 6 ‐2.19 653 653 Limited 7 ‐2.01 658 658 Limited 8 ‐1.85 662 662 Limited 9 ‐1.71 665 665 Limited 10 ‐1.58 669 669 Limited 11 ‐1.46 672 672 Limited 12 ‐1.34 675 675 Limited 13 ‐1.24 678 678 Limited 14 ‐1.14 680 680 Limited 15 ‐1.05 683 683 Limited 16 ‐0.96 685 685 Basic 17 ‐0.87 687 687 Basic 18 ‐0.79 689 689 Basic 19 ‐0.71 691 691 Basic 20 ‐0.63 693 693 Basic 21 ‐0.56 695 695 Basic 22 ‐0.48 697 697 Basic 23 ‐0.41 699 699 Basic 24 ‐0.34 701 701 Proficient 25 ‐0.27 702 702 Proficient 26 ‐0.21 704 704 Proficient 27 ‐0.14 706 706 Proficient 28 ‐0.08 707 707 Proficient 29 ‐0.01 709 709 Proficient 30 0.05 711 711 Proficient 31 0.12 712 712 Proficient 32 0.18 714 714 Proficient 33 0.24 716 716 Proficient 34 0.31 717 717 Proficient 35 0.37 719 719 Proficient 36 0.43 721 721 Proficient 37 0.50 722 722 Proficient 38 0.56 724 724 Proficient 39 0.63 726 726 Accelerated
Proficiency 40 0.69 727 727 Accelerated 41 0.76 729 729 Accelerated 42 0.83 731 731 Accelerated 43 0.90 733 733 Accelerated 44 0.97 734 734 Accelerated 45 1.04 736 736 Accelerated 46 1.11 738 738 Advanced 47 1.19 740 740 Advanced 48 1.27 742 742 Advanced 49 1.36 744 744 Advanced 50 1.44 747 747 Advanced 51 1.54 749 749 Advanced 52 1.63 752 752 Advanced 53 1.74 754 754 Advanced 54 1.85 757 757 Advanced 55 1.97 760 760 Advanced 56 2.10 764 764 Advanced 57 2.25 767 767 Advanced 58 2.42 772 772 Advanced 59 2.62 777 777 Advanced 60 2.85 783 783 Advanced 61 3.15 791 791 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced 64 3.50 800 800 Advanced
Table G24. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Reading
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the promotion status (Yes or No) after Ohio rounding/truncation.
Proficiency 0 ‐3.50 545 545 Promotion No 1 ‐3.50 545 545 Promotion No 2 ‐3.12 562 562 Promotion No 3 ‐2.65 583 583 Promotion No 4 ‐2.31 599 599 Promotion No 5 ‐2.03 612 612 Promotion No 6 ‐1.78 623 623 Promotion No 7 ‐1.57 633 633 Promotion No 8 ‐1.38 642 642 Promotion No 9 ‐1.20 650 650 Promotion No 10 ‐1.03 657 657 Promotion No 11 ‐0.87 664 664 Promotion No 12 ‐0.72 671 672 Promotion Yes 13 ‐0.58 678 678 Promotion Yes 14 ‐0.44 684 684 Promotion Yes 15 ‐0.31 690 690 Promotion Yes 16 ‐0.18 696 696 Promotion Yes 17 ‐0.05 702 702 Promotion Yes 18 0.08 708 708 Promotion Yes 19 0.21 714 714 Promotion Yes 20 0.34 719 719 Promotion Yes 21 0.46 725 725 Promotion Yes 22 0.59 731 731 Promotion Yes 23 0.72 737 737 Promotion Yes 24 0.86 743 743 Promotion Yes 25 1.00 749 752 Promotion Yes 26 1.14 756 756 Promotion Yes 27 1.29 763 763 Promotion Yes 28 1.45 770 770 Promotion Yes 29 1.61 777 777 Promotion Yes 30 1.79 785 785 Promotion Yes 31 1.97 794 794 Promotion Yes 32 2.18 803 803 Promotion Yes 33 2.40 813 813 Promotion Yes 34 2.64 824 824 Promotion Yes 35 2.92 837 837 Promotion Yes 36 3.24 851 851 Promotion Yes 37 3.50 863 863 Promotion Yes 38 3.50 863 863 Promotion Yes 39 3.50 863 863 Promotion Yes
Proficiency 40 3.50 863 863 Promotion Yes
Table G25. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.12 562 562 Limited 3 ‐2.65 583 583 Limited 4 ‐2.31 599 599 Limited 5 ‐2.03 612 612 Limited 6 ‐1.78 623 623 Limited 7 ‐1.57 633 633 Limited 8 ‐1.38 642 642 Limited 9 ‐1.20 650 650 Limited 10 ‐1.03 657 657 Limited 11 ‐0.87 664 664 Limited 12 ‐0.72 671 672 Basic 13 ‐0.58 678 678 Basic 14 ‐0.44 684 684 Basic 15 ‐0.31 690 690 Basic 16 ‐0.18 696 696 Basic 17 ‐0.05 702 702 Proficient 18 0.08 708 708 Proficient 19 0.21 714 714 Proficient 20 0.34 719 719 Proficient 21 0.46 725 725 Accelerated 22 0.59 731 731 Accelerated 23 0.72 737 737 Accelerated 24 0.86 743 743 Accelerated 25 1.00 749 752 Advanced 26 1.14 756 756 Advanced 27 1.29 763 763 Advanced 28 1.45 770 770 Advanced 29 1.61 777 777 Advanced 30 1.79 785 785 Advanced 31 1.97 794 794 Advanced 32 2.18 803 803 Advanced 33 2.40 813 813 Advanced 34 2.64 824 824 Advanced 35 2.92 837 837 Advanced 36 3.24 851 851 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced
Proficiency 40 3.50 863 863 Advanced
Table G26. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 ELA Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 545 545 Limited 1 ‐3.50 545 545 Limited 2 ‐3.12 562 562 Limited 3 ‐2.65 583 583 Limited 4 ‐2.31 599 599 Limited 5 ‐2.03 612 612 Limited 6 ‐1.78 623 623 Limited 7 ‐1.57 633 633 Limited 8 ‐1.38 642 642 Limited 9 ‐1.20 650 650 Limited 10 ‐1.03 657 657 Limited 11 ‐0.87 664 664 Limited 12 ‐0.72 671 672 Basic 13 ‐0.58 678 678 Basic 14 ‐0.44 684 684 Basic 15 ‐0.31 690 690 Basic 16 ‐0.18 696 696 Basic 17 ‐0.05 702 702 Proficient 18 0.08 708 708 Proficient 19 0.21 714 714 Proficient 20 0.34 719 719 Proficient 21 0.46 725 725 Accelerated 22 0.59 731 731 Accelerated 23 0.72 737 737 Accelerated 24 0.86 743 743 Accelerated 25 1.00 749 752 Advanced 26 1.14 756 756 Advanced 27 1.29 763 763 Advanced 28 1.45 770 770 Advanced 29 1.61 777 777 Advanced 30 1.79 785 785 Advanced 31 1.97 794 794 Advanced 32 2.18 803 803 Advanced 33 2.40 813 813 Advanced 34 2.64 824 824 Advanced 35 2.92 837 837 Advanced 36 3.24 851 851 Advanced 37 3.50 863 863 Advanced 38 3.50 863 863 Advanced 39 3.50 863 863 Advanced
Proficiency 40 3.50 863 863 Advanced
Table G27. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 4 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 549 549 Limited 1 ‐3.50 549 549 Limited 2 ‐3.04 569 569 Limited 3 ‐2.57 589 589 Limited 4 ‐2.22 603 603 Limited 5 ‐1.94 615 615 Limited 6 ‐1.70 626 626 Limited 7 ‐1.49 634 634 Limited 8 ‐1.30 642 642 Limited 9 ‐1.13 650 650 Limited 10 ‐0.97 656 656 Limited 11 ‐0.82 663 663 Limited 12 ‐0.68 669 669 Limited 13 ‐0.55 674 674 Basic 14 ‐0.42 680 680 Basic 15 ‐0.30 685 685 Basic 16 ‐0.18 690 690 Basic 17 ‐0.06 695 695 Basic 18 0.05 700 700 Proficient 19 0.17 705 705 Proficient 20 0.28 709 709 Proficient 21 0.40 714 714 Proficient 22 0.51 719 719 Proficient 23 0.63 724 725 Accelerated 24 0.76 729 729 Accelerated 25 0.88 735 735 Accelerated 26 1.01 740 740 Accelerated 27 1.15 746 746 Accelerated 28 1.30 753 753 Advanced 29 1.46 759 759 Advanced 30 1.63 766 766 Advanced 31 1.81 774 774 Advanced 32 2.01 783 783 Advanced 33 2.23 792 792 Advanced 34 2.48 802 802 Advanced 35 2.76 814 814 Advanced 36 3.08 828 828 Advanced 37 3.47 845 845 Advanced 38 3.50 846 846 Advanced 39 3.50 846 846 Advanced
Proficiency 40 3.50 846 846 Advanced
Table G28. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 4 ELA Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 549 549 Limited 1 ‐3.50 549 549 Limited 2 ‐3.04 569 569 Limited 3 ‐2.57 589 589 Limited 4 ‐2.22 603 603 Limited 5 ‐1.94 615 615 Limited 6 ‐1.70 626 626 Limited 7 ‐1.49 634 634 Limited 8 ‐1.30 642 642 Limited 9 ‐1.13 650 650 Limited 10 ‐0.97 656 656 Limited 11 ‐0.82 663 663 Limited 12 ‐0.68 669 669 Limited 13 ‐0.55 674 674 Basic 14 ‐0.42 680 680 Basic 15 ‐0.30 685 685 Basic 16 ‐0.18 690 690 Basic 17 ‐0.06 695 695 Basic 18 0.05 700 700 Proficient 19 0.17 705 705 Proficient 20 0.28 709 709 Proficient 21 0.40 714 714 Proficient 22 0.51 719 719 Proficient 23 0.63 724 725 Accelerated 24 0.76 729 729 Accelerated 25 0.88 735 735 Accelerated 26 1.01 740 740 Accelerated 27 1.15 746 746 Accelerated 28 1.30 753 753 Advanced 29 1.46 759 759 Advanced 30 1.63 766 766 Advanced 31 1.81 774 774 Advanced 32 2.01 783 783 Advanced 33 2.23 792 792 Advanced 34 2.48 802 802 Advanced 35 2.76 814 814 Advanced 36 3.08 828 828 Advanced 37 3.47 845 845 Advanced 38 3.50 846 846 Advanced 39 3.50 846 846 Advanced
Proficiency 40 3.50 846 846 Advanced
Table G29. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 552 552 Limited 1 ‐3.50 552 552 Limited 2 ‐3.39 557 557 Limited 3 ‐2.93 576 576 Limited 4 ‐2.59 590 590 Limited 5 ‐2.31 602 602 Limited 6 ‐2.07 612 612 Limited 7 ‐1.87 621 621 Limited 8 ‐1.68 629 629 Limited 9 ‐1.50 636 636 Limited 10 ‐1.34 643 643 Limited 11 ‐1.19 650 650 Limited 12 ‐1.04 656 656 Limited 13 ‐0.90 662 662 Limited 14 ‐0.77 667 669 Basic 15 ‐0.64 673 673 Basic 16 ‐0.51 678 678 Basic 17 ‐0.39 683 683 Basic 18 ‐0.27 689 689 Basic 19 ‐0.15 694 694 Basic 20 ‐0.03 699 700 Proficient 21 0.08 704 704 Proficient 22 0.20 708 708 Proficient 23 0.32 713 713 Proficient 24 0.44 718 718 Proficient 25 0.56 724 725 Accelerated 26 0.68 729 729 Accelerated 27 0.80 734 734 Accelerated 28 0.93 740 740 Accelerated 29 1.07 745 745 Accelerated 30 1.21 751 751 Accelerated 31 1.37 758 758 Advanced 32 1.54 765 765 Advanced 33 1.72 773 773 Advanced 34 1.92 782 782 Advanced 35 2.15 791 791 Advanced 36 2.41 802 802 Advanced 37 2.71 815 815 Advanced 38 3.06 830 830 Advanced 39 3.47 847 847 Advanced
Proficiency 40 3.50 848 848 Advanced 41 3.50 848 848 Advanced 42 3.50 848 848 Advanced
Table G30. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 ELA Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 552 552 Limited 1 ‐3.50 552 552 Limited 2 ‐3.39 557 557 Limited 3 ‐2.93 576 576 Limited 4 ‐2.59 590 590 Limited 5 ‐2.31 602 602 Limited 6 ‐2.07 612 612 Limited 7 ‐1.87 621 621 Limited 8 ‐1.68 629 629 Limited 9 ‐1.50 636 636 Limited 10 ‐1.34 643 643 Limited 11 ‐1.19 650 650 Limited 12 ‐1.04 656 656 Limited 13 ‐0.90 662 662 Limited 14 ‐0.77 667 669 Basic 15 ‐0.64 673 673 Basic 16 ‐0.51 678 678 Basic 17 ‐0.39 683 683 Basic 18 ‐0.27 689 689 Basic 19 ‐0.15 694 694 Basic 20 ‐0.03 699 700 Proficient 21 0.08 704 704 Proficient 22 0.20 708 708 Proficient 23 0.32 713 713 Proficient 24 0.44 718 718 Proficient 25 0.56 724 725 Accelerated 26 0.68 729 729 Accelerated 27 0.80 734 734 Accelerated 28 0.93 740 740 Accelerated 29 1.07 745 745 Accelerated 30 1.21 751 751 Accelerated 31 1.37 758 758 Advanced 32 1.54 765 765 Advanced 33 1.72 773 773 Advanced 34 1.92 782 782 Advanced 35 2.15 791 791 Advanced 36 2.41 802 802 Advanced 37 2.71 815 815 Advanced 38 3.06 830 830 Advanced 39 3.47 847 847 Advanced
Proficiency 40 3.50 848 848 Advanced 41 3.50 848 848 Advanced 42 3.50 848 848 Advanced
Table G31. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 6 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 555 555 Limited 1 ‐3.50 555 555 Limited 2 ‐3.50 555 555 Limited 3 ‐3.35 561 561 Limited 4 ‐3.03 575 575 Limited 5 ‐2.77 586 586 Limited 6 ‐2.55 595 595 Limited 7 ‐2.36 603 603 Limited 8 ‐2.19 610 610 Limited 9 ‐2.03 617 617 Limited 10 ‐1.89 623 623 Limited 11 ‐1.76 628 628 Limited 12 ‐1.64 634 634 Limited 13 ‐1.52 638 638 Limited 14 ‐1.41 643 643 Limited 15 ‐1.31 648 648 Limited 16 ‐1.21 652 652 Limited 17 ‐1.11 656 656 Limited 18 ‐1.02 660 660 Limited 19 ‐0.93 664 664 Limited 20 ‐0.84 667 668 Basic 21 ‐0.76 671 671 Basic 22 ‐0.67 674 674 Basic 23 ‐0.59 678 678 Basic 24 ‐0.51 681 681 Basic 25 ‐0.43 685 685 Basic 26 ‐0.36 688 688 Basic 27 ‐0.28 691 691 Basic 28 ‐0.21 694 694 Basic 29 ‐0.13 697 697 Basic 30 ‐0.06 701 701 Proficient 31 0.02 704 704 Proficient 32 0.09 707 707 Proficient 33 0.17 710 710 Proficient 34 0.24 713 713 Proficient 35 0.32 716 716 Proficient 36 0.40 720 720 Proficient 37 0.47 723 723 Proficient 38 0.55 726 726 Accelerated 39 0.63 730 730 Accelerated
Proficiency 40 0.72 733 733 Accelerated 41 0.80 737 737 Accelerated 42 0.89 741 741 Accelerated 43 0.98 744 744 Accelerated 44 1.07 748 748 Accelerated 45 1.17 752 752 Advanced 46 1.27 757 757 Advanced 47 1.38 761 761 Advanced 48 1.49 766 766 Advanced 49 1.61 771 771 Advanced 50 1.74 777 777 Advanced 51 1.88 782 782 Advanced 52 2.02 789 789 Advanced 53 2.19 796 796 Advanced 54 2.36 803 803 Advanced 55 2.56 812 812 Advanced 56 2.79 821 821 Advanced 57 3.06 833 833 Advanced 58 3.40 847 847 Advanced 59 3.50 851 851 Advanced 60 3.50 851 851 Advanced 61 3.50 851 851 Advanced
Table G32. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 6 ELA Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 555 555 Limited 1 ‐3.50 555 555 Limited 2 ‐3.50 555 555 Limited 3 ‐3.35 561 561 Limited 4 ‐3.03 575 575 Limited 5 ‐2.77 586 586 Limited 6 ‐2.55 595 595 Limited 7 ‐2.36 603 603 Limited 8 ‐2.19 610 610 Limited 9 ‐2.03 617 617 Limited 10 ‐1.89 623 623 Limited 11 ‐1.76 628 628 Limited 12 ‐1.64 634 634 Limited 13 ‐1.52 638 638 Limited 14 ‐1.41 643 643 Limited 15 ‐1.31 648 648 Limited 16 ‐1.21 652 652 Limited 17 ‐1.11 656 656 Limited 18 ‐1.02 660 660 Limited 19 ‐0.93 664 664 Limited 20 ‐0.84 667 668 Basic 21 ‐0.76 671 671 Basic 22 ‐0.67 674 674 Basic 23 ‐0.59 678 678 Basic 24 ‐0.51 681 681 Basic 25 ‐0.43 685 685 Basic 26 ‐0.36 688 688 Basic 27 ‐0.28 691 691 Basic 28 ‐0.21 694 694 Basic 29 ‐0.13 697 697 Basic 30 ‐0.06 701 701 Proficient 31 0.02 704 704 Proficient 32 0.09 707 707 Proficient 33 0.17 710 710 Proficient 34 0.24 713 713 Proficient 35 0.32 716 716 Proficient 36 0.40 720 720 Proficient 37 0.47 723 723 Proficient 38 0.55 726 726 Accelerated 39 0.63 730 730 Accelerated
Proficiency 40 0.72 733 733 Accelerated 41 0.80 737 737 Accelerated 42 0.89 741 741 Accelerated 43 0.98 744 744 Accelerated 44 1.07 748 748 Accelerated 45 1.17 752 752 Advanced 46 1.27 757 757 Advanced 47 1.38 761 761 Advanced 48 1.49 766 766 Advanced 49 1.61 771 771 Advanced 50 1.74 777 777 Advanced 51 1.88 782 782 Advanced 52 2.02 789 789 Advanced 53 2.19 796 796 Advanced 54 2.36 803 803 Advanced 55 2.56 812 812 Advanced 56 2.79 821 821 Advanced 57 3.06 833 833 Advanced 58 3.40 847 847 Advanced 59 3.50 851 851 Advanced 60 3.50 851 851 Advanced 61 3.50 851 851 Advanced
Table G33. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 7 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 568 568 Limited 1 ‐3.50 568 568 Limited 2 ‐3.50 568 568 Limited 3 ‐3.48 569 569 Limited 4 ‐3.14 581 581 Limited 5 ‐2.87 592 592 Limited 6 ‐2.64 600 600 Limited 7 ‐2.44 608 608 Limited 8 ‐2.26 615 615 Limited 9 ‐2.09 621 621 Limited 10 ‐1.94 627 627 Limited 11 ‐1.79 632 632 Limited 12 ‐1.66 638 638 Limited 13 ‐1.53 642 642 Limited 14 ‐1.41 647 647 Limited 15 ‐1.29 651 651 Limited 16 ‐1.18 656 656 Limited 17 ‐1.08 660 660 Limited 18 ‐0.97 663 663 Limited 19 ‐0.87 667 667 Limited 20 ‐0.78 671 671 Basic 21 ‐0.69 674 674 Basic 22 ‐0.59 678 678 Basic 23 ‐0.51 681 681 Basic 24 ‐0.42 684 684 Basic 25 ‐0.34 688 688 Basic 26 ‐0.25 691 691 Basic 27 ‐0.17 694 694 Basic 28 ‐0.09 697 697 Basic 29 ‐0.01 700 700 Proficient 30 0.07 703 703 Proficient 31 0.15 706 706 Proficient 32 0.23 709 709 Proficient 33 0.31 712 712 Proficient 34 0.38 715 715 Proficient 35 0.46 718 718 Proficient 36 0.54 721 721 Proficient 37 0.63 724 725 Accelerated 38 0.71 727 727 Accelerated 39 0.79 730 730 Accelerated
Proficiency 40 0.88 734 734 Accelerated 41 0.96 737 737 Accelerated 42 1.06 740 740 Accelerated 43 1.15 744 744 Accelerated 44 1.24 748 749 Advanced 45 1.35 751 751 Advanced 46 1.45 755 755 Advanced 47 1.56 759 759 Advanced 48 1.67 764 764 Advanced 49 1.80 768 768 Advanced 50 1.93 773 773 Advanced 51 2.07 779 779 Advanced 52 2.22 784 784 Advanced 53 2.38 791 791 Advanced 54 2.56 797 797 Advanced 55 2.77 805 805 Advanced 56 3.00 814 814 Advanced 57 3.27 824 824 Advanced 58 3.50 833 833 Advanced 59 3.50 833 833 Advanced 60 3.50 833 833 Advanced 61 3.50 833 833 Advanced 62 3.50 833 833 Advanced
Table G34. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 7 ELA Paper
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 568 568 Limited 1 ‐3.50 568 568 Limited 2 ‐3.50 568 568 Limited 3 ‐3.48 569 569 Limited 4 ‐3.14 581 581 Limited 5 ‐2.87 592 592 Limited 6 ‐2.64 600 600 Limited 7 ‐2.44 608 608 Limited 8 ‐2.26 615 615 Limited 9 ‐2.09 621 621 Limited 10 ‐1.94 627 627 Limited 11 ‐1.79 632 632 Limited 12 ‐1.66 638 638 Limited 13 ‐1.53 642 642 Limited 14 ‐1.41 647 647 Limited 15 ‐1.29 651 651 Limited 16 ‐1.18 656 656 Limited 17 ‐1.08 660 660 Limited 18 ‐0.97 663 663 Limited 19 ‐0.87 667 667 Limited 20 ‐0.78 671 671 Basic 21 ‐0.69 674 674 Basic 22 ‐0.59 678 678 Basic 23 ‐0.51 681 681 Basic 24 ‐0.42 684 684 Basic 25 ‐0.34 688 688 Basic 26 ‐0.25 691 691 Basic 27 ‐0.17 694 694 Basic 28 ‐0.09 697 697 Basic 29 ‐0.01 700 700 Proficient 30 0.07 703 703 Proficient 31 0.15 706 706 Proficient 32 0.23 709 709 Proficient 33 0.31 712 712 Proficient 34 0.38 715 715 Proficient 35 0.46 718 718 Proficient 36 0.54 721 721 Proficient 37 0.63 724 725 Accelerated 38 0.71 727 727 Accelerated 39 0.79 730 730 Accelerated
Proficiency 40 0.88 734 734 Accelerated 41 0.96 737 737 Accelerated 42 1.06 740 740 Accelerated 43 1.15 744 744 Accelerated 44 1.24 748 749 Advanced 45 1.35 751 751 Advanced 46 1.45 755 755 Advanced 47 1.56 759 759 Advanced 48 1.67 764 764 Advanced 49 1.80 768 768 Advanced 50 1.93 773 773 Advanced 51 2.07 779 779 Advanced 52 2.22 784 784 Advanced 53 2.38 791 791 Advanced 54 2.56 797 797 Advanced 55 2.77 805 805 Advanced 56 3.00 814 814 Advanced 57 3.27 824 824 Advanced 58 3.50 833 833 Advanced 59 3.50 833 833 Advanced 60 3.50 833 833 Advanced 61 3.50 833 833 Advanced 62 3.50 833 833 Advanced
Table G35. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 ELA Online
Each row lists the raw score, the Ohio theta, the scaled score before Ohio rounding/truncation, the scaled score after Ohio rounding/truncation, and the proficiency level after Ohio rounding/truncation.
Proficiency 0 ‐3.50 586 586 Limited 1 ‐3.50 586 586 Limited 2 ‐3.50 586 586 Limited 3 ‐3.27 593 593 Limited 4 ‐2.94 603 603 Limited 5 ‐2.68 612 612 Limited 6 ‐2.46 618 618 Limited 7 ‐2.27 625 625 Limited 8 ‐2.09 630 630 Limited 9 ‐1.94 635 635 Limited 10 ‐1.80 639 639 Limited 11 ‐1.67 643 643 Limited 12 ‐1.54 647 647 Limited 13 ‐1.43 651 651 Limited 14 ‐1.32 654 654 Limited 15 ‐1.21 657 657 Limited 16 ‐1.11 661 661 Limited 17 ‐1.02 664 664 Limited 18 ‐0.92 666 666 Limited 19 ‐0.83 669 669 Limited 20 ‐0.75 672 672 Limited 21 ‐0.66 675 675 Limited 22 ‐0.58 677 677 Limited 23 ‐0.49 680 680 Limited 24 ‐0.41 682 682 Basic 25 ‐0.33 685 685 Basic 26 ‐0.26 687 687 Basic 27 ‐0.18 690 690 Basic 28 ‐0.10 692 692 Basic 29 ‐0.02 695 695 Basic 30 0.05 697 697 Basic 31 0.13 699 700 Proficient 32 0.21 702 702 Proficient 33 0.29 704 704 Proficient 34 0.37 707 707 Proficient 35 0.44 709 709 Proficient 36 0.53 712 712 Proficient 37 0.61 714 714 Proficient 38 0.69 717 717 Proficient 39 0.78 720 720 Proficient
Proficiency 40 0.87 722 722 Proficient 41 0.96 725 725 Accelerated 42 1.05 728 728 Accelerated 43 1.15 731 731 Accelerated 44 1.24 734 734 Accelerated 45 1.35 737 737 Accelerated 46 1.46 741 741 Accelerated 47 1.57 744 744 Advanced 48 1.69 748 748 Advanced 49 1.81 752 752 Advanced 50 1.94 756 756 Advanced 51 2.09 760 760 Advanced 52 2.24 765 765 Advanced 53 2.40 770 770 Advanced 54 2.58 776 776 Advanced 55 2.78 782 782 Advanced 56 3.00 789 789 Advanced 57 3.26 797 797 Advanced 58 3.50 805 805 Advanced 59 3.50 805 805 Advanced 60 3.50 805 805 Advanced 61 3.50 805 805 Advanced 62 3.50 805 805 Advanced
Table G36. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 ELA Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 586 586 Limited 1 ‐3.50 586 586 Limited 2 ‐3.50 586 586 Limited 3 ‐3.27 593 593 Limited 4 ‐2.94 603 603 Limited 5 ‐2.68 612 612 Limited 6 ‐2.46 618 618 Limited 7 ‐2.27 625 625 Limited 8 ‐2.09 630 630 Limited 9 ‐1.94 635 635 Limited 10 ‐1.80 639 639 Limited 11 ‐1.67 643 643 Limited 12 ‐1.54 647 647 Limited 13 ‐1.43 651 651 Limited 14 ‐1.32 654 654 Limited 15 ‐1.21 657 657 Limited 16 ‐1.11 661 661 Limited 17 ‐1.02 664 664 Limited 18 ‐0.92 666 666 Limited 19 ‐0.83 669 669 Limited 20 ‐0.75 672 672 Limited 21 ‐0.66 675 675 Limited 22 ‐0.58 677 677 Limited 23 ‐0.49 680 680 Limited 24 ‐0.41 682 682 Basic 25 ‐0.33 685 685 Basic 26 ‐0.26 687 687 Basic 27 ‐0.18 690 690 Basic 28 ‐0.10 692 692 Basic 29 ‐0.02 695 695 Basic 30 0.05 697 697 Basic 31 0.13 699 700 Proficient 32 0.21 702 702 Proficient 33 0.29 704 704 Proficient 34 0.37 707 707 Proficient 35 0.44 709 709 Proficient 36 0.53 712 712 Proficient 37 0.61 714 714 Proficient 38 0.69 717 717 Proficient 39 0.78 720 720 Proficient
Proficiency 40 0.87 722 722 Proficient 41 0.96 725 725 Accelerated 42 1.05 728 728 Accelerated 43 1.15 731 731 Accelerated 44 1.24 734 734 Accelerated 45 1.35 737 737 Accelerated 46 1.46 741 741 Accelerated 47 1.57 744 744 Advanced 48 1.69 748 748 Advanced 49 1.81 752 752 Advanced 50 1.94 756 756 Advanced 51 2.09 760 760 Advanced 52 2.24 765 765 Advanced 53 2.40 770 770 Advanced 54 2.58 776 776 Advanced 55 2.78 782 782 Advanced 56 3.00 789 789 Advanced 57 3.26 797 797 Advanced 58 3.50 805 805 Advanced 59 3.50 805 805 Advanced 60 3.50 805 805 Advanced 61 3.50 805 805 Advanced 62 3.50 805 805 Advanced
Table G37. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA I Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 606 606 Limited 1 ‐3.50 606 606 Limited 2 ‐3.50 606 606 Limited 3 ‐3.26 613 613 Limited 4 ‐2.94 621 621 Limited 5 ‐2.68 629 629 Limited 6 ‐2.46 635 635 Limited 7 ‐2.27 640 640 Limited 8 ‐2.10 645 645 Limited 9 ‐1.95 649 649 Limited 10 ‐1.81 653 653 Limited 11 ‐1.68 656 656 Limited 12 ‐1.55 660 660 Limited 13 ‐1.44 663 663 Limited 14 ‐1.33 666 666 Limited 15 ‐1.22 669 669 Limited 16 ‐1.12 672 672 Limited 17 ‐1.02 675 675 Limited 18 ‐0.93 677 677 Limited 19 ‐0.84 680 680 Limited 20 ‐0.75 682 683 Basic 21 ‐0.66 685 685 Basic 22 ‐0.57 687 687 Basic 23 ‐0.49 689 689 Basic 24 ‐0.41 692 692 Basic 25 ‐0.33 694 694 Basic 26 ‐0.25 696 696 Basic 27 ‐0.16 698 698 Basic 28 ‐0.08 701 701 Proficient 29 ‐0.01 703 703 Proficient 30 0.07 705 705 Proficient 31 0.15 707 707 Proficient 32 0.23 710 710 Proficient 33 0.31 712 712 Proficient 34 0.39 714 714 Proficient 35 0.47 716 716 Proficient 36 0.56 718 718 Proficient 37 0.64 721 721 Proficient 38 0.72 723 723 Proficient 39 0.81 726 726 Accelerated
Proficiency 40 0.90 728 728 Accelerated 41 0.99 730 730 Accelerated 42 1.08 733 733 Accelerated 43 1.18 736 736 Accelerated 44 1.28 739 739 Advanced 45 1.38 741 741 Advanced 46 1.49 744 744 Advanced 47 1.60 748 748 Advanced 48 1.73 751 751 Advanced 49 1.85 755 755 Advanced 50 1.99 758 758 Advanced 51 2.14 763 763 Advanced 52 2.30 767 767 Advanced 53 2.48 772 772 Advanced 54 2.68 777 777 Advanced 55 2.89 783 783 Advanced 56 3.14 790 790 Advanced 57 3.43 798 798 Advanced 58 3.50 800 800 Advanced 59 3.50 800 800 Advanced 60 3.50 800 800 Advanced 61 3.50 800 800 Advanced 62 3.50 800 800 Advanced
Table G38. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA I Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 606 606 Limited 1 ‐3.50 606 606 Limited 2 ‐3.50 606 606 Limited 3 ‐3.26 613 613 Limited 4 ‐2.94 621 621 Limited 5 ‐2.68 629 629 Limited 6 ‐2.46 635 635 Limited 7 ‐2.27 640 640 Limited 8 ‐2.10 645 645 Limited 9 ‐1.95 649 649 Limited 10 ‐1.81 653 653 Limited 11 ‐1.68 656 656 Limited 12 ‐1.55 660 660 Limited 13 ‐1.44 663 663 Limited 14 ‐1.33 666 666 Limited 15 ‐1.22 669 669 Limited 16 ‐1.12 672 672 Limited 17 ‐1.02 675 675 Limited 18 ‐0.93 677 677 Limited 19 ‐0.84 680 680 Limited 20 ‐0.75 682 683 Basic 21 ‐0.66 685 685 Basic 22 ‐0.57 687 687 Basic 23 ‐0.49 689 689 Basic 24 ‐0.41 692 692 Basic 25 ‐0.33 694 694 Basic 26 ‐0.25 696 696 Basic 27 ‐0.16 698 698 Basic 28 ‐0.08 701 701 Proficient 29 ‐0.01 703 703 Proficient 30 0.07 705 705 Proficient 31 0.15 707 707 Proficient 32 0.23 710 710 Proficient 33 0.31 712 712 Proficient 34 0.39 714 714 Proficient 35 0.47 716 716 Proficient 36 0.56 718 718 Proficient 37 0.64 721 721 Proficient 38 0.72 723 723 Proficient 39 0.81 726 726 Accelerated
Proficiency 40 0.90 728 728 Accelerated 41 0.99 730 730 Accelerated 42 1.08 733 733 Accelerated 43 1.18 736 736 Accelerated 44 1.28 739 739 Advanced 45 1.38 741 741 Advanced 46 1.49 744 744 Advanced 47 1.60 748 748 Advanced 48 1.73 751 751 Advanced 49 1.85 755 755 Advanced 50 1.99 758 758 Advanced 51 2.14 763 763 Advanced 52 2.30 767 767 Advanced 53 2.48 772 772 Advanced 54 2.68 777 777 Advanced 55 2.89 783 783 Advanced 56 3.14 790 790 Advanced 57 3.43 798 798 Advanced 58 3.50 800 800 Advanced 59 3.50 800 800 Advanced 60 3.50 800 800 Advanced 61 3.50 800 800 Advanced 62 3.50 800 800 Advanced
Table G39. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA II Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.19 606 606 Limited 4 ‐2.86 616 616 Limited 5 ‐2.59 624 624 Limited 6 ‐2.37 631 631 Limited 7 ‐2.17 637 637 Limited 8 ‐2.00 642 642 Limited 9 ‐1.84 647 647 Limited 10 ‐1.70 651 651 Limited 11 ‐1.57 655 655 Limited 12 ‐1.44 659 659 Limited 13 ‐1.33 662 662 Limited 14 ‐1.21 666 666 Limited 15 ‐1.11 669 669 Limited 16 ‐1.01 672 672 Limited 17 ‐0.91 675 675 Limited 18 ‐0.81 678 679 Basic 19 ‐0.72 681 681 Basic 20 ‐0.63 683 683 Basic 21 ‐0.55 686 686 Basic 22 ‐0.46 688 688 Basic 23 ‐0.38 691 691 Basic 24 ‐0.30 693 693 Basic 25 ‐0.22 696 696 Basic 26 ‐0.14 698 698 Basic 27 ‐0.07 700 700 Proficient 28 0.01 703 703 Proficient 29 0.09 705 705 Proficient 30 0.16 707 707 Proficient 31 0.24 710 710 Proficient 32 0.31 712 712 Proficient 33 0.39 714 714 Proficient 34 0.46 716 716 Proficient 35 0.54 719 719 Proficient 36 0.62 721 721 Proficient 37 0.70 723 723 Proficient 38 0.78 726 726 Accelerated 39 0.86 728 728 Accelerated
Proficiency 40 0.94 731 731 Accelerated 41 1.03 733 733 Accelerated 42 1.12 736 736 Accelerated 43 1.21 739 739 Accelerated 44 1.31 742 742 Advanced 45 1.42 745 745 Advanced 46 1.52 748 748 Advanced 47 1.64 752 752 Advanced 48 1.76 755 755 Advanced 49 1.89 759 759 Advanced 50 2.02 763 763 Advanced 51 2.17 768 768 Advanced 52 2.33 773 773 Advanced 53 2.50 778 778 Advanced 54 2.69 783 783 Advanced 55 2.90 790 790 Advanced 56 3.14 797 797 Advanced 57 3.42 805 805 Advanced 58 3.50 808 808 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced
Table G40. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – High School ELA II Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 597 597 Limited 1 ‐3.50 597 597 Limited 2 ‐3.50 597 597 Limited 3 ‐3.19 606 606 Limited 4 ‐2.86 616 616 Limited 5 ‐2.59 624 624 Limited 6 ‐2.37 631 631 Limited 7 ‐2.17 637 637 Limited 8 ‐2.00 642 642 Limited 9 ‐1.84 647 647 Limited 10 ‐1.70 651 651 Limited 11 ‐1.57 655 655 Limited 12 ‐1.44 659 659 Limited 13 ‐1.33 662 662 Limited 14 ‐1.21 666 666 Limited 15 ‐1.11 669 669 Limited 16 ‐1.01 672 672 Limited 17 ‐0.91 675 675 Limited 18 ‐0.81 678 679 Basic 19 ‐0.72 681 681 Basic 20 ‐0.63 683 683 Basic 21 ‐0.55 686 686 Basic 22 ‐0.46 688 688 Basic 23 ‐0.38 691 691 Basic 24 ‐0.30 693 693 Basic 25 ‐0.22 696 696 Basic 26 ‐0.14 698 698 Basic 27 ‐0.07 700 700 Proficient 28 0.01 703 703 Proficient 29 0.09 705 705 Proficient 30 0.16 707 707 Proficient 31 0.24 710 710 Proficient 32 0.31 712 712 Proficient 33 0.39 714 714 Proficient 34 0.46 716 716 Proficient 35 0.54 719 719 Proficient 36 0.62 721 721 Proficient 37 0.70 723 723 Proficient 38 0.78 726 726 Accelerated 39 0.86 728 728 Accelerated
Proficiency 40 0.94 731 731 Accelerated 41 1.03 733 733 Accelerated 42 1.12 736 736 Accelerated 43 1.21 739 739 Accelerated 44 1.31 742 742 Advanced 45 1.42 745 745 Advanced 46 1.52 748 748 Advanced 47 1.64 752 752 Advanced 48 1.76 755 755 Advanced 49 1.89 759 759 Advanced 50 2.02 763 763 Advanced 51 2.17 768 768 Advanced 52 2.33 773 773 Advanced 53 2.50 778 778 Advanced 54 2.69 783 783 Advanced 55 2.90 790 790 Advanced 56 3.14 797 797 Advanced 57 3.42 805 805 Advanced 58 3.50 808 808 Advanced 59 3.50 808 808 Advanced 60 3.50 808 808 Advanced 61 3.50 808 808 Advanced
Table G41. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 587 587 Limited 1 ‐3.50 587 587 Limited 2 ‐3.50 587 587 Limited 3 ‐3.46 589 589 Limited 4 ‐3.11 600 600 Limited 5 ‐2.83 609 609 Limited 6 ‐2.59 617 617 Limited 7 ‐2.38 624 624 Limited 8 ‐2.19 630 630 Limited 9 ‐2.02 636 636 Limited 10 ‐1.86 642 642 Limited 11 ‐1.70 647 647 Limited 12 ‐1.56 651 651 Limited 13 ‐1.42 656 656 Limited 14 ‐1.29 660 660 Limited 15 ‐1.16 664 664 Limited 16 ‐1.04 669 669 Limited 17 ‐0.91 673 673 Limited 18 ‐0.79 677 677 Limited 19 ‐0.68 680 680 Limited 20 ‐0.56 684 684 Basic 21 ‐0.44 688 688 Basic 22 ‐0.33 692 692 Basic 23 ‐0.21 696 696 Basic 24 ‐0.10 699 700 Proficient 25 0.02 703 703 Proficient 26 0.13 707 707 Proficient 27 0.25 711 711 Proficient 28 0.37 715 715 Proficient 29 0.49 719 719 Proficient 30 0.61 723 723 Proficient 31 0.73 727 727 Accelerated 32 0.85 731 731 Accelerated 33 0.98 735 735 Accelerated 34 1.11 739 739 Accelerated 35 1.25 744 744 Accelerated 36 1.39 748 748 Accelerated 37 1.53 753 753 Advanced 38 1.68 758 758 Advanced 39 1.83 763 763 Advanced
Proficiency 40 1.99 768 768 Advanced 41 2.17 774 774 Advanced 42 2.35 780 780 Advanced 43 2.56 787 787 Advanced 44 2.79 795 795 Advanced 45 3.07 804 804 Advanced 46 3.41 815 815 Advanced 47 3.50 818 818 Advanced 48 3.50 818 818 Advanced 49 3.50 818 818 Advanced
Table G42. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 3 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 587 587 Limited 1 ‐3.50 587 587 Limited 2 ‐3.50 587 587 Limited 3 ‐3.44 589 589 Limited 4 ‐3.09 601 601 Limited 5 ‐2.81 610 610 Limited 6 ‐2.57 618 618 Limited 7 ‐2.36 625 625 Limited 8 ‐2.17 631 631 Limited 9 ‐1.99 637 637 Limited 10 ‐1.82 643 643 Limited 11 ‐1.67 648 648 Limited 12 ‐1.52 653 653 Limited 13 ‐1.38 657 657 Limited 14 ‐1.24 662 662 Limited 15 ‐1.11 666 666 Limited 16 ‐0.98 670 670 Limited 17 ‐0.86 674 674 Limited 18 ‐0.73 678 678 Limited 19 ‐0.61 682 683 Basic 20 ‐0.49 686 686 Basic 21 ‐0.37 690 690 Basic 22 ‐0.26 694 694 Basic 23 ‐0.14 698 698 Basic 24 ‐0.02 702 702 Proficient 25 0.10 706 706 Proficient 26 0.21 710 710 Proficient 27 0.33 714 714 Proficient 28 0.45 717 717 Proficient 29 0.57 721 721 Proficient 30 0.69 725 725 Accelerated 31 0.81 729 729 Accelerated 32 0.94 733 733 Accelerated 33 1.06 738 738 Accelerated 34 1.19 742 742 Accelerated 35 1.32 746 746 Accelerated 36 1.45 750 750 Accelerated 37 1.59 755 755 Advanced 38 1.73 760 760 Advanced 39 1.88 764 764 Advanced
Proficiency 40 2.03 769 769 Advanced 41 2.20 775 775 Advanced 42 2.37 781 781 Advanced 43 2.57 787 787 Advanced 44 2.79 794 794 Advanced 45 3.05 803 803 Advanced 46 3.38 814 814 Advanced 47 3.50 818 818 Advanced 48 3.50 818 818 Advanced 49 3.50 818 818 Advanced
Table G43. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 4 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.47 606 606 Limited 3 ‐3.03 620 620 Limited 4 ‐2.70 631 631 Limited 5 ‐2.44 640 640 Limited 6 ‐2.22 647 647 Limited 7 ‐2.02 653 653 Limited 8 ‐1.85 659 659 Limited 9 ‐1.69 664 664 Limited 10 ‐1.55 669 669 Limited 11 ‐1.41 674 674 Limited 12 ‐1.28 678 678 Limited 13 ‐1.16 682 682 Limited 14 ‐1.04 686 686 Basic 15 ‐0.93 689 689 Basic 16 ‐0.82 693 693 Basic 17 ‐0.72 696 696 Basic 18 ‐0.62 700 700 Proficient 19 ‐0.52 703 703 Proficient 20 ‐0.42 706 706 Proficient 21 ‐0.32 710 710 Proficient 22 ‐0.22 713 713 Proficient 23 ‐0.13 716 716 Proficient 24 ‐0.03 719 719 Proficient 25 0.06 722 722 Proficient 26 0.16 725 725 Accelerated 27 0.25 728 728 Accelerated 28 0.35 732 732 Accelerated 29 0.45 735 735 Accelerated 30 0.55 738 738 Accelerated 31 0.65 741 741 Accelerated 32 0.75 745 745 Accelerated 33 0.86 748 748 Accelerated 34 0.97 752 752 Accelerated 35 1.08 756 756 Accelerated 36 1.20 760 760 Advanced 37 1.33 764 764 Advanced 38 1.46 768 768 Advanced 39 1.60 773 773 Advanced
Proficiency 40 1.75 778 778 Advanced 41 1.91 783 783 Advanced 42 2.09 789 789 Advanced 43 2.30 796 796 Advanced 44 2.53 803 803 Advanced 45 2.80 812 812 Advanced 46 3.14 823 823 Advanced 47 3.50 835 835 Advanced 48 3.50 835 835 Advanced 49 3.50 835 835 Advanced
Table G44. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 4 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.47 606 606 Limited 3 ‐3.03 620 620 Limited 4 ‐2.70 631 631 Limited 5 ‐2.44 640 640 Limited 6 ‐2.22 647 647 Limited 7 ‐2.02 653 653 Limited 8 ‐1.85 659 659 Limited 9 ‐1.69 664 664 Limited 10 ‐1.55 669 669 Limited 11 ‐1.41 674 674 Limited 12 ‐1.28 678 678 Limited 13 ‐1.16 682 682 Limited 14 ‐1.04 686 686 Basic 15 ‐0.93 689 689 Basic 16 ‐0.82 693 693 Basic 17 ‐0.72 696 696 Basic 18 ‐0.62 700 700 Proficient 19 ‐0.52 703 703 Proficient 20 ‐0.42 706 706 Proficient 21 ‐0.32 710 710 Proficient 22 ‐0.22 713 713 Proficient 23 ‐0.13 716 716 Proficient 24 ‐0.03 719 719 Proficient 25 0.06 722 722 Proficient 26 0.16 725 725 Accelerated 27 0.25 728 728 Accelerated 28 0.35 732 732 Accelerated 29 0.45 735 735 Accelerated 30 0.55 738 738 Accelerated 31 0.65 741 741 Accelerated 32 0.75 745 745 Accelerated 33 0.86 748 748 Accelerated 34 0.97 752 752 Accelerated 35 1.08 756 756 Accelerated 36 1.20 760 760 Advanced 37 1.33 764 764 Advanced 38 1.46 768 768 Advanced 39 1.60 773 773 Advanced
Proficiency 40 1.75 778 778 Advanced 41 1.91 783 783 Advanced 42 2.09 789 789 Advanced 43 2.30 796 796 Advanced 44 2.53 803 803 Advanced 45 2.80 812 812 Advanced 46 3.14 823 823 Advanced 47 3.50 835 835 Advanced 48 3.50 835 835 Advanced 49 3.50 835 835 Advanced
Table G45. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 624 624 Limited 1 ‐3.50 624 624 Limited 2 ‐3.50 624 624 Limited 3 ‐3.23 631 631 Limited 4 ‐2.90 639 639 Limited 5 ‐2.62 646 646 Limited 6 ‐2.39 652 652 Limited 7 ‐2.19 658 658 Limited 8 ‐2.00 662 662 Limited 9 ‐1.83 667 667 Limited 10 ‐1.68 671 671 Limited 11 ‐1.53 674 674 Limited 12 ‐1.40 678 678 Limited 13 ‐1.26 681 681 Limited 14 ‐1.14 685 685 Limited 15 ‐1.02 688 688 Basic 16 ‐0.90 691 691 Basic 17 ‐0.79 694 694 Basic 18 ‐0.68 696 696 Basic 19 ‐0.57 699 700 Proficient 20 ‐0.46 702 702 Proficient 21 ‐0.36 705 705 Proficient 22 ‐0.26 707 707 Proficient 23 ‐0.15 710 710 Proficient 24 ‐0.05 713 713 Proficient 25 0.05 715 715 Proficient 26 0.15 718 718 Proficient 27 0.25 720 720 Proficient 28 0.35 723 723 Proficient 29 0.46 726 726 Accelerated 30 0.56 728 728 Accelerated 31 0.66 731 731 Accelerated 32 0.77 734 734 Accelerated 33 0.88 737 737 Accelerated 34 0.99 739 739 Accelerated 35 1.10 742 742 Accelerated 36 1.22 745 745 Accelerated 37 1.34 748 749 Advanced 38 1.47 752 752 Advanced 39 1.60 755 755 Advanced
Proficiency 40 1.74 759 759 Advanced 41 1.88 762 762 Advanced 42 2.04 767 767 Advanced 43 2.21 771 771 Advanced 44 2.41 776 776 Advanced 45 2.63 782 782 Advanced 46 2.89 788 788 Advanced 47 3.21 797 797 Advanced 48 3.50 804 804 Advanced 49 3.50 804 804 Advanced 50 3.50 804 804 Advanced
Table G46. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 624 624 Limited 1 ‐3.50 624 624 Limited 2 ‐3.50 624 624 Limited 3 ‐3.23 631 631 Limited 4 ‐2.90 639 639 Limited 5 ‐2.62 646 646 Limited 6 ‐2.39 652 652 Limited 7 ‐2.19 658 658 Limited 8 ‐2.00 662 662 Limited 9 ‐1.83 667 667 Limited 10 ‐1.68 671 671 Limited 11 ‐1.53 674 674 Limited 12 ‐1.40 678 678 Limited 13 ‐1.26 681 681 Limited 14 ‐1.14 685 685 Limited 15 ‐1.02 688 688 Basic 16 ‐0.90 691 691 Basic 17 ‐0.79 694 694 Basic 18 ‐0.68 696 696 Basic 19 ‐0.57 699 700 Proficient 20 ‐0.46 702 702 Proficient 21 ‐0.36 705 705 Proficient 22 ‐0.26 707 707 Proficient 23 ‐0.15 710 710 Proficient 24 ‐0.05 713 713 Proficient 25 0.05 715 715 Proficient 26 0.15 718 718 Proficient 27 0.25 720 720 Proficient 28 0.35 723 723 Proficient 29 0.46 726 726 Accelerated 30 0.56 728 728 Accelerated 31 0.66 731 731 Accelerated 32 0.77 734 734 Accelerated 33 0.88 737 737 Accelerated 34 0.99 739 739 Accelerated 35 1.10 742 742 Accelerated 36 1.22 745 745 Accelerated 37 1.34 748 749 Advanced 38 1.47 752 752 Advanced 39 1.60 755 755 Advanced
Proficiency 40 1.74 759 759 Advanced 41 1.88 762 762 Advanced 42 2.04 767 767 Advanced 43 2.21 771 771 Advanced 44 2.41 776 776 Advanced 45 2.63 782 782 Advanced 46 2.89 788 788 Advanced 47 3.21 797 797 Advanced 48 3.50 804 804 Advanced 49 3.50 804 804 Advanced 50 3.50 804 804 Advanced
Table G47. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 6 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 616 616 Limited 1 ‐3.50 616 616 Limited 2 ‐3.50 616 616 Limited 3 ‐3.38 619 619 Limited 4 ‐3.04 628 628 Limited 5 ‐2.77 635 635 Limited 6 ‐2.53 640 640 Limited 7 ‐2.33 645 645 Limited 8 ‐2.15 650 650 Limited 9 ‐1.98 654 654 Limited 10 ‐1.83 658 658 Limited 11 ‐1.68 661 661 Limited 12 ‐1.54 665 665 Limited 13 ‐1.41 668 668 Limited 14 ‐1.29 671 671 Limited 15 ‐1.17 674 674 Limited 16 ‐1.05 677 677 Limited 17 ‐0.94 680 680 Limited 18 ‐0.83 682 682 Basic 19 ‐0.72 685 685 Basic 20 ‐0.61 688 688 Basic 21 ‐0.51 690 690 Basic 22 ‐0.40 693 693 Basic 23 ‐0.30 696 696 Basic 24 ‐0.20 698 698 Basic 25 ‐0.10 701 701 Proficient 26 0.00 703 703 Proficient 27 0.11 706 706 Proficient 28 0.21 708 708 Proficient 29 0.31 711 711 Proficient 30 0.42 713 713 Proficient 31 0.52 716 716 Proficient 32 0.63 719 719 Proficient 33 0.74 721 721 Proficient 34 0.84 724 725 Accelerated 35 0.96 727 727 Accelerated 36 1.07 729 729 Accelerated 37 1.19 732 732 Accelerated 38 1.31 735 735 Accelerated 39 1.43 738 738 Accelerated
Proficiency 40 1.56 742 742 Accelerated 41 1.70 745 745 Advanced 42 1.84 748 748 Advanced 43 1.99 752 752 Advanced 44 2.15 756 756 Advanced 45 2.32 760 760 Advanced 46 2.51 765 765 Advanced 47 2.72 770 770 Advanced 48 2.96 776 776 Advanced 49 3.24 783 783 Advanced 50 3.50 790 790 Advanced 51 3.50 790 790 Advanced 52 3.50 790 790 Advanced 53 3.50 790 790 Advanced
Table G48. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 6 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 616 616 Limited 1 ‐3.50 616 616 Limited 2 ‐3.50 616 616 Limited 3 ‐3.38 619 619 Limited 4 ‐3.04 628 628 Limited 5 ‐2.77 635 635 Limited 6 ‐2.53 640 640 Limited 7 ‐2.33 645 645 Limited 8 ‐2.15 650 650 Limited 9 ‐1.98 654 654 Limited 10 ‐1.83 658 658 Limited 11 ‐1.68 661 661 Limited 12 ‐1.54 665 665 Limited 13 ‐1.41 668 668 Limited 14 ‐1.29 671 671 Limited 15 ‐1.17 674 674 Limited 16 ‐1.05 677 677 Limited 17 ‐0.94 680 680 Limited 18 ‐0.83 682 682 Basic 19 ‐0.72 685 685 Basic 20 ‐0.61 688 688 Basic 21 ‐0.51 690 690 Basic 22 ‐0.40 693 693 Basic 23 ‐0.30 696 696 Basic 24 ‐0.20 698 698 Basic 25 ‐0.10 701 701 Proficient 26 0.00 703 703 Proficient 27 0.11 706 706 Proficient 28 0.21 708 708 Proficient 29 0.31 711 711 Proficient 30 0.42 713 713 Proficient 31 0.52 716 716 Proficient 32 0.63 719 719 Proficient 33 0.74 721 721 Proficient 34 0.84 724 725 Accelerated 35 0.96 727 727 Accelerated 36 1.07 729 729 Accelerated 37 1.19 732 732 Accelerated 38 1.31 735 735 Accelerated 39 1.43 738 738 Accelerated
Proficiency 40 1.56 742 742 Accelerated 41 1.70 745 745 Advanced 42 1.84 748 748 Advanced 43 1.99 752 752 Advanced 44 2.15 756 756 Advanced 45 2.32 760 760 Advanced 46 2.51 765 765 Advanced 47 2.72 770 770 Advanced 48 2.96 776 776 Advanced 49 3.24 783 783 Advanced 50 3.50 790 790 Advanced 51 3.50 790 790 Advanced 52 3.50 790 790 Advanced 53 3.50 790 790 Advanced
Table G49. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 7 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.45 606 606 Limited 3 ‐3.00 619 619 Limited 4 ‐2.67 629 629 Limited 5 ‐2.41 636 636 Limited 6 ‐2.18 643 643 Limited 7 ‐1.99 648 648 Limited 8 ‐1.81 653 653 Limited 9 ‐1.65 658 658 Limited 10 ‐1.51 662 662 Limited 11 ‐1.37 666 666 Limited 12 ‐1.24 670 670 Limited 13 ‐1.12 673 673 Limited 14 ‐1.00 677 677 Limited 15 ‐0.89 680 680 Limited 16 ‐0.78 683 684 Basic 17 ‐0.67 686 686 Basic 18 ‐0.57 689 689 Basic 19 ‐0.47 692 692 Basic 20 ‐0.37 695 695 Basic 21 ‐0.28 697 697 Basic 22 ‐0.18 700 700 Proficient 23 ‐0.09 703 703 Proficient 24 0.00 706 706 Proficient 25 0.10 708 708 Proficient 26 0.19 711 711 Proficient 27 0.28 713 713 Proficient 28 0.37 716 716 Proficient 29 0.46 719 719 Proficient 30 0.56 721 721 Proficient 31 0.65 724 725 Accelerated 32 0.74 727 727 Accelerated 33 0.84 730 730 Accelerated 34 0.93 732 732 Accelerated 35 1.03 735 735 Accelerated 36 1.13 738 738 Accelerated 37 1.24 741 741 Accelerated 38 1.34 744 744 Accelerated 39 1.45 747 747 Accelerated
Proficiency 40 1.57 751 751 Accelerated 41 1.70 754 755 Advanced 42 1.83 758 758 Advanced 43 1.97 762 762 Advanced 44 2.12 767 767 Advanced 45 2.30 771 771 Advanced 46 2.49 777 777 Advanced 47 2.72 784 784 Advanced 48 2.99 791 791 Advanced 49 3.33 801 801 Advanced 50 3.50 806 806 Advanced 51 3.50 806 806 Advanced 52 3.50 806 806 Advanced
Table G50. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 7 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 605 605 Limited 1 ‐3.50 605 605 Limited 2 ‐3.45 606 606 Limited 3 ‐3.00 619 619 Limited 4 ‐2.67 629 629 Limited 5 ‐2.41 636 636 Limited 6 ‐2.18 643 643 Limited 7 ‐1.99 648 648 Limited 8 ‐1.81 653 653 Limited 9 ‐1.65 658 658 Limited 10 ‐1.51 662 662 Limited 11 ‐1.37 666 666 Limited 12 ‐1.24 670 670 Limited 13 ‐1.12 673 673 Limited 14 ‐1.00 677 677 Limited 15 ‐0.89 680 680 Limited 16 ‐0.78 683 684 Basic 17 ‐0.67 686 686 Basic 18 ‐0.57 689 689 Basic 19 ‐0.47 692 692 Basic 20 ‐0.37 695 695 Basic 21 ‐0.28 697 697 Basic 22 ‐0.18 700 700 Proficient 23 ‐0.09 703 703 Proficient 24 0.00 706 706 Proficient 25 0.10 708 708 Proficient 26 0.19 711 711 Proficient 27 0.28 713 713 Proficient 28 0.37 716 716 Proficient 29 0.46 719 719 Proficient 30 0.56 721 721 Proficient 31 0.65 724 725 Accelerated 32 0.74 727 727 Accelerated 33 0.84 730 730 Accelerated 34 0.93 732 732 Accelerated 35 1.03 735 735 Accelerated 36 1.13 738 738 Accelerated 37 1.24 741 741 Accelerated 38 1.34 744 744 Accelerated 39 1.45 747 747 Accelerated
Proficiency 40 1.57 751 751 Accelerated 41 1.70 754 755 Advanced 42 1.83 758 758 Advanced 43 1.97 762 762 Advanced 44 2.12 767 767 Advanced 45 2.30 771 771 Advanced 46 2.49 777 777 Advanced 47 2.72 784 784 Advanced 48 2.99 791 791 Advanced 49 3.33 801 801 Advanced 50 3.50 806 806 Advanced 51 3.50 806 806 Advanced 52 3.50 806 806 Advanced
Table G51. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Math Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 633 633 Limited 1 ‐3.50 633 633 Limited 2 ‐3.50 633 633 Limited 3 ‐3.50 633 633 Limited 4 ‐3.23 639 639 Limited 5 ‐2.94 644 644 Limited 6 ‐2.70 649 649 Limited 7 ‐2.48 654 654 Limited 8 ‐2.29 658 658 Limited 9 ‐2.11 661 661 Limited 10 ‐1.94 665 665 Limited 11 ‐1.78 668 668 Limited 12 ‐1.63 671 671 Limited 13 ‐1.48 674 674 Limited 14 ‐1.35 676 676 Limited 15 ‐1.21 679 679 Limited 16 ‐1.08 682 682 Limited 17 ‐0.96 684 684 Limited 18 ‐0.83 687 687 Limited 19 ‐0.71 689 690 Basic 20 ‐0.59 692 692 Basic 21 ‐0.48 694 694 Basic 22 ‐0.36 696 696 Basic 23 ‐0.25 699 699 Basic 24 ‐0.13 701 701 Proficient 25 ‐0.02 703 703 Proficient 26 0.09 705 705 Proficient 27 0.20 708 708 Proficient 28 0.31 710 710 Proficient 29 0.42 712 712 Proficient 30 0.53 714 714 Proficient 31 0.64 717 717 Proficient 32 0.75 719 719 Proficient 33 0.87 721 721 Proficient 34 0.98 723 723 Proficient 35 1.09 726 726 Accelerated 36 1.21 728 728 Accelerated 37 1.33 730 730 Accelerated 38 1.45 733 733 Accelerated 39 1.58 735 735 Accelerated
Proficiency 40 1.71 738 738 Accelerated 41 1.84 741 741 Accelerated 42 1.98 744 744 Advanced 43 2.13 747 747 Advanced 44 2.29 750 750 Advanced 45 2.46 753 753 Advanced 46 2.64 757 757 Advanced 47 2.84 761 761 Advanced 48 3.07 766 766 Advanced 49 3.35 771 771 Advanced 50 3.50 774 774 Advanced 51 3.50 774 774 Advanced 52 3.50 774 774 Advanced 53 3.50 774 774 Advanced
Table G52. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Math Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 633 633 Limited 1 ‐3.50 633 633 Limited 2 ‐3.50 633 633 Limited 3 ‐3.50 633 633 Limited 4 ‐3.23 639 639 Limited 5 ‐2.94 644 644 Limited 6 ‐2.70 649 649 Limited 7 ‐2.48 654 654 Limited 8 ‐2.29 658 658 Limited 9 ‐2.11 661 661 Limited 10 ‐1.94 665 665 Limited 11 ‐1.78 668 668 Limited 12 ‐1.63 671 671 Limited 13 ‐1.48 674 674 Limited 14 ‐1.35 676 676 Limited 15 ‐1.21 679 679 Limited 16 ‐1.08 682 682 Limited 17 ‐0.96 684 684 Limited 18 ‐0.83 687 687 Limited 19 ‐0.71 689 690 Basic 20 ‐0.59 692 692 Basic 21 ‐0.48 694 694 Basic 22 ‐0.36 696 696 Basic 23 ‐0.25 699 699 Basic 24 ‐0.13 701 701 Proficient 25 ‐0.02 703 703 Proficient 26 0.09 705 705 Proficient 27 0.20 708 708 Proficient 28 0.31 710 710 Proficient 29 0.42 712 712 Proficient 30 0.53 714 714 Proficient 31 0.64 717 717 Proficient 32 0.75 719 719 Proficient 33 0.87 721 721 Proficient 34 0.98 723 723 Proficient 35 1.09 726 726 Accelerated 36 1.21 728 728 Accelerated 37 1.33 730 730 Accelerated 38 1.45 733 733 Accelerated 39 1.58 735 735 Accelerated
Proficiency 40 1.71 738 738 Accelerated 41 1.84 741 741 Accelerated 42 1.98 744 744 Advanced 43 2.13 747 747 Advanced 44 2.29 750 750 Advanced 45 2.46 753 753 Advanced 46 2.64 757 757 Advanced 47 2.84 761 761 Advanced 48 3.07 766 766 Advanced 49 3.35 771 771 Advanced 50 3.50 774 774 Advanced 51 3.50 774 774 Advanced 52 3.50 774 774 Advanced 53 3.50 774 774 Advanced
Table G53. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Algebra Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.10 629 629 Limited 4 ‐2.78 638 638 Limited 5 ‐2.51 645 645 Limited 6 ‐2.29 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.78 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.15 684 684 Basic 15 ‐1.04 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.74 695 695 Basic 19 ‐0.64 698 698 Basic 20 ‐0.55 701 701 Proficient 21 ‐0.46 703 703 Proficient 22 ‐0.37 706 706 Proficient 23 ‐0.28 708 708 Proficient 24 ‐0.19 711 711 Proficient 25 ‐0.10 713 713 Proficient 26 ‐0.01 716 716 Proficient 27 0.07 718 718 Proficient 28 0.16 721 721 Proficient 29 0.25 723 723 Proficient 30 0.34 726 726 Accelerated 31 0.43 728 728 Accelerated 32 0.52 731 731 Accelerated 33 0.62 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.91 741 741 Accelerated 37 1.01 744 744 Accelerated 38 1.11 747 747 Accelerated 39 1.22 750 750 Accelerated
Proficiency 40 1.34 754 754 Advanced 41 1.46 757 757 Advanced 42 1.58 760 760 Advanced 43 1.71 764 764 Advanced 44 1.86 768 768 Advanced 45 2.01 772 772 Advanced 46 2.18 777 777 Advanced 47 2.36 782 782 Advanced 48 2.57 788 788 Advanced 49 2.81 795 795 Advanced 50 3.10 803 803 Advanced 51 3.45 813 813 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G54. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Algebra Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.10 629 629 Limited 4 ‐2.78 638 638 Limited 5 ‐2.51 645 645 Limited 6 ‐2.29 652 652 Limited 7 ‐2.10 657 657 Limited 8 ‐1.93 662 662 Limited 9 ‐1.78 666 666 Limited 10 ‐1.63 670 670 Limited 11 ‐1.50 674 674 Limited 12 ‐1.38 677 677 Limited 13 ‐1.26 681 682 Basic 14 ‐1.15 684 684 Basic 15 ‐1.04 687 687 Basic 16 ‐0.93 690 690 Basic 17 ‐0.83 693 693 Basic 18 ‐0.74 695 695 Basic 19 ‐0.64 698 698 Basic 20 ‐0.55 701 701 Proficient 21 ‐0.46 703 703 Proficient 22 ‐0.37 706 706 Proficient 23 ‐0.28 708 708 Proficient 24 ‐0.19 711 711 Proficient 25 ‐0.10 713 713 Proficient 26 ‐0.01 716 716 Proficient 27 0.07 718 718 Proficient 28 0.16 721 721 Proficient 29 0.25 723 723 Proficient 30 0.34 726 726 Accelerated 31 0.43 728 728 Accelerated 32 0.52 731 731 Accelerated 33 0.62 733 733 Accelerated 34 0.71 736 736 Accelerated 35 0.81 739 739 Accelerated 36 0.91 741 741 Accelerated 37 1.01 744 744 Accelerated 38 1.11 747 747 Accelerated 39 1.22 750 750 Accelerated
Proficiency 40 1.34 754 754 Advanced 41 1.46 757 757 Advanced 42 1.58 760 760 Advanced 43 1.71 764 764 Advanced 44 1.86 768 768 Advanced 45 2.01 772 772 Advanced 46 2.18 777 777 Advanced 47 2.36 782 782 Advanced 48 2.57 788 788 Advanced 49 2.81 795 795 Advanced 50 3.10 803 803 Advanced 51 3.45 813 813 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G55. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Geometry Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.34 609 609 Limited 3 ‐2.88 622 622 Limited 4 ‐2.55 632 632 Limited 5 ‐2.27 640 640 Limited 6 ‐2.04 647 647 Limited 7 ‐1.84 653 653 Limited 8 ‐1.66 658 658 Limited 9 ‐1.50 663 663 Limited 10 ‐1.35 668 668 Limited 11 ‐1.21 672 672 Limited 12 ‐1.07 676 676 Limited 13 ‐0.95 679 679 Basic 14 ‐0.83 683 683 Basic 15 ‐0.71 686 686 Basic 16 ‐0.60 689 689 Basic 17 ‐0.49 693 693 Basic 18 ‐0.39 696 696 Basic 19 ‐0.29 699 700 Proficient 20 ‐0.19 702 702 Proficient 21 ‐0.09 705 705 Proficient 22 0.01 707 707 Proficient 23 0.11 710 710 Proficient 24 0.20 713 713 Proficient 25 0.29 716 716 Proficient 26 0.39 719 719 Proficient 27 0.48 721 721 Proficient 28 0.57 724 725 Accelerated 29 0.67 727 727 Accelerated 30 0.76 730 730 Accelerated 31 0.85 732 732 Accelerated 32 0.95 735 735 Accelerated 33 1.05 738 738 Accelerated 34 1.14 741 741 Accelerated 35 1.24 744 744 Accelerated 36 1.34 747 747 Accelerated 37 1.45 750 750 Accelerated 38 1.55 753 753 Accelerated 39 1.66 756 756 Advanced
Proficiency 40 1.78 759 759 Advanced 41 1.89 763 763 Advanced 42 2.02 766 766 Advanced 43 2.14 770 770 Advanced 44 2.28 774 774 Advanced 45 2.42 778 778 Advanced 46 2.58 783 783 Advanced 47 2.74 788 788 Advanced 48 2.93 793 793 Advanced 49 3.13 799 799 Advanced 50 3.36 806 806 Advanced 51 3.50 810 810 Advanced 52 3.50 810 810 Advanced 53 3.50 810 810 Advanced 54 3.50 810 810 Advanced 55 3.50 810 810 Advanced
Table G56. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Geometry Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 604 604 Limited 1 ‐3.50 604 604 Limited 2 ‐3.34 609 609 Limited 3 ‐2.88 622 622 Limited 4 ‐2.55 632 632 Limited 5 ‐2.27 640 640 Limited 6 ‐2.04 647 647 Limited 7 ‐1.84 653 653 Limited 8 ‐1.66 658 658 Limited 9 ‐1.50 663 663 Limited 10 ‐1.35 668 668 Limited 11 ‐1.21 672 672 Limited 12 ‐1.07 676 676 Limited 13 ‐0.95 679 679 Basic 14 ‐0.83 683 683 Basic 15 ‐0.71 686 686 Basic 16 ‐0.60 689 689 Basic 17 ‐0.49 693 693 Basic 18 ‐0.39 696 696 Basic 19 ‐0.29 699 700 Proficient 20 ‐0.19 702 702 Proficient 21 ‐0.09 705 705 Proficient 22 0.01 707 707 Proficient 23 0.11 710 710 Proficient 24 0.20 713 713 Proficient 25 0.29 716 716 Proficient 26 0.39 719 719 Proficient 27 0.48 721 721 Proficient 28 0.57 724 725 Accelerated 29 0.67 727 727 Accelerated 30 0.76 730 730 Accelerated 31 0.85 732 732 Accelerated 32 0.95 735 735 Accelerated 33 1.05 738 738 Accelerated 34 1.14 741 741 Accelerated 35 1.24 744 744 Accelerated 36 1.34 747 747 Accelerated 37 1.45 750 750 Accelerated 38 1.55 753 753 Accelerated 39 1.66 756 756 Advanced
Proficiency 40 1.78 759 759 Advanced 41 1.89 763 763 Advanced 42 2.02 766 766 Advanced 43 2.14 770 770 Advanced 44 2.28 774 774 Advanced 45 2.42 778 778 Advanced 46 2.58 783 783 Advanced 47 2.74 788 788 Advanced 48 2.93 793 793 Advanced 49 3.13 799 799 Advanced 50 3.36 806 806 Advanced 51 3.50 810 810 Advanced 52 3.50 810 810 Advanced 53 3.50 810 810 Advanced 54 3.50 810 810 Advanced 55 3.50 810 810 Advanced
Table G57. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math I Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.16 627 627 Limited 4 ‐2.84 636 636 Limited 5 ‐2.58 644 644 Limited 6 ‐2.35 650 650 Limited 7 ‐2.16 655 655 Limited 8 ‐1.99 660 660 Limited 9 ‐1.83 664 664 Limited 10 ‐1.69 669 669 Limited 11 ‐1.56 672 672 Limited 12 ‐1.43 676 676 Limited 13 ‐1.32 679 679 Limited 14 ‐1.20 682 682 Basic 15 ‐1.10 685 685 Basic 16 ‐0.99 688 688 Basic 17 ‐0.89 691 691 Basic 18 ‐0.79 694 694 Basic 19 ‐0.70 696 696 Basic 20 ‐0.61 699 700 Proficient 21 ‐0.52 702 702 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.34 707 707 Proficient 24 ‐0.25 709 709 Proficient 25 ‐0.16 711 711 Proficient 26 ‐0.08 714 714 Proficient 27 0.01 716 716 Proficient 28 0.10 719 719 Proficient 29 0.19 721 721 Proficient 30 0.27 724 724 Proficient 31 0.36 726 726 Accelerated 32 0.45 729 729 Accelerated 33 0.54 731 731 Accelerated 34 0.64 734 734 Accelerated 35 0.73 737 737 Accelerated 36 0.83 739 739 Accelerated 37 0.93 742 742 Accelerated 38 1.03 745 745 Accelerated 39 1.14 748 748 Accelerated
Proficiency 40 1.25 751 751 Accelerated 41 1.36 754 754 Advanced 42 1.48 758 758 Advanced 43 1.61 761 761 Advanced 44 1.75 765 765 Advanced 45 1.90 769 769 Advanced 46 2.06 774 774 Advanced 47 2.23 779 779 Advanced 48 2.43 784 784 Advanced 49 2.66 791 791 Advanced 50 2.93 798 798 Advanced 51 3.27 808 808 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G58. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math I Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 618 618 Limited 1 ‐3.50 618 618 Limited 2 ‐3.50 618 618 Limited 3 ‐3.16 627 627 Limited 4 ‐2.84 636 636 Limited 5 ‐2.58 644 644 Limited 6 ‐2.35 650 650 Limited 7 ‐2.16 655 655 Limited 8 ‐1.99 660 660 Limited 9 ‐1.83 664 664 Limited 10 ‐1.69 669 669 Limited 11 ‐1.56 672 672 Limited 12 ‐1.43 676 676 Limited 13 ‐1.32 679 679 Limited 14 ‐1.20 682 682 Basic 15 ‐1.10 685 685 Basic 16 ‐0.99 688 688 Basic 17 ‐0.89 691 691 Basic 18 ‐0.79 694 694 Basic 19 ‐0.70 696 696 Basic 20 ‐0.61 699 700 Proficient 21 ‐0.52 702 702 Proficient 22 ‐0.43 704 704 Proficient 23 ‐0.34 707 707 Proficient 24 ‐0.25 709 709 Proficient 25 ‐0.16 711 711 Proficient 26 ‐0.08 714 714 Proficient 27 0.01 716 716 Proficient 28 0.10 719 719 Proficient 29 0.19 721 721 Proficient 30 0.27 724 724 Proficient 31 0.36 726 726 Accelerated 32 0.45 729 729 Accelerated 33 0.54 731 731 Accelerated 34 0.64 734 734 Accelerated 35 0.73 737 737 Accelerated 36 0.83 739 739 Accelerated 37 0.93 742 742 Accelerated 38 1.03 745 745 Accelerated 39 1.14 748 748 Accelerated
Proficiency 40 1.25 751 751 Accelerated 41 1.36 754 754 Advanced 42 1.48 758 758 Advanced 43 1.61 761 761 Advanced 44 1.75 765 765 Advanced 45 1.90 769 769 Advanced 46 2.06 774 774 Advanced 47 2.23 779 779 Advanced 48 2.43 784 784 Advanced 49 2.66 791 791 Advanced 50 2.93 798 798 Advanced 51 3.27 808 808 Advanced 52 3.50 814 814 Advanced 53 3.50 814 814 Advanced 54 3.50 814 814 Advanced
Table G59. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math II Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.33 600 600 Limited 3 ‐2.88 614 614 Limited 4 ‐2.54 624 624 Limited 5 ‐2.27 632 632 Limited 6 ‐2.05 640 640 Limited 7 ‐1.85 646 646 Limited 8 ‐1.67 651 651 Limited 9 ‐1.50 657 657 Limited 10 ‐1.35 661 661 Limited 11 ‐1.21 666 666 Limited 12 ‐1.08 670 670 Limited 13 ‐0.95 674 674 Limited 14 ‐0.83 678 678 Basic 15 ‐0.71 681 681 Basic 16 ‐0.60 685 685 Basic 17 ‐0.49 688 688 Basic 18 ‐0.39 691 691 Basic 19 ‐0.28 695 695 Basic 20 ‐0.18 698 698 Basic 21 ‐0.08 701 701 Proficient 22 0.02 704 704 Proficient 23 0.12 707 707 Proficient 24 0.21 710 710 Proficient 25 0.31 713 713 Proficient 26 0.40 716 716 Proficient 27 0.50 719 719 Proficient 28 0.60 722 722 Proficient 29 0.69 725 725 Accelerated 30 0.79 728 728 Accelerated 31 0.89 731 731 Accelerated 32 0.99 734 734 Accelerated 33 1.09 738 738 Accelerated 34 1.19 741 741 Accelerated 35 1.30 744 744 Accelerated 36 1.40 747 747 Accelerated 37 1.51 751 751 Accelerated 38 1.62 754 754 Accelerated 39 1.74 758 758 Advanced
Proficiency 40 1.86 762 762 Advanced 41 1.98 765 765 Advanced 42 2.11 769 769 Advanced 43 2.24 774 774 Advanced 44 2.38 778 778 Advanced 45 2.53 783 783 Advanced 46 2.69 788 788 Advanced 47 2.87 793 793 Advanced 48 3.06 799 799 Advanced 49 3.28 806 806 Advanced 50 3.50 813 813 Advanced 51 3.50 813 813 Advanced 52 3.50 813 813 Advanced 53 3.50 813 813 Advanced 54 3.50 813 813 Advanced
Table G60. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Integrated Math II Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 594 594 Limited 1 ‐3.50 594 594 Limited 2 ‐3.33 600 600 Limited 3 ‐2.88 614 614 Limited 4 ‐2.54 624 624 Limited 5 ‐2.27 632 632 Limited 6 ‐2.05 640 640 Limited 7 ‐1.85 646 646 Limited 8 ‐1.67 651 651 Limited 9 ‐1.50 657 657 Limited 10 ‐1.35 661 661 Limited 11 ‐1.21 666 666 Limited 12 ‐1.08 670 670 Limited 13 ‐0.95 674 674 Limited 14 ‐0.83 678 678 Basic 15 ‐0.71 681 681 Basic 16 ‐0.60 685 685 Basic 17 ‐0.49 688 688 Basic 18 ‐0.39 691 691 Basic 19 ‐0.28 695 695 Basic 20 ‐0.18 698 698 Basic 21 ‐0.08 701 701 Proficient 22 0.02 704 704 Proficient 23 0.12 707 707 Proficient 24 0.21 710 710 Proficient 25 0.31 713 713 Proficient 26 0.40 716 716 Proficient 27 0.50 719 719 Proficient 28 0.60 722 722 Proficient 29 0.69 725 725 Accelerated 30 0.79 728 728 Accelerated 31 0.89 731 731 Accelerated 32 0.99 734 734 Accelerated 33 1.09 738 738 Accelerated 34 1.19 741 741 Accelerated 35 1.30 744 744 Accelerated 36 1.40 747 747 Accelerated 37 1.51 751 751 Accelerated 38 1.62 754 754 Accelerated 39 1.74 758 758 Advanced
Proficiency 40 1.86 762 762 Advanced 41 1.98 765 765 Advanced 42 2.11 769 769 Advanced 43 2.24 774 774 Advanced 44 2.38 778 778 Advanced 45 2.53 783 783 Advanced 46 2.69 788 788 Advanced 47 2.87 793 793 Advanced 48 3.06 799 799 Advanced 49 3.28 806 806 Advanced 50 3.50 813 813 Advanced 51 3.50 813 813 Advanced 52 3.50 813 813 Advanced 53 3.50 813 813 Advanced 54 3.50 813 813 Advanced
Table G61. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American Government Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.39 644 644 Limited 3 ‐2.94 652 652 Limited 4 ‐2.62 658 658 Limited 5 ‐2.36 663 663 Limited 6 ‐2.15 667 667 Limited 7 ‐1.96 671 671 Limited 8 ‐1.80 674 674 Limited 9 ‐1.66 677 677 Limited 10 ‐1.52 679 679 Limited 11 ‐1.40 681 681 Limited 12 ‐1.29 683 683 Limited 13 ‐1.18 685 685 Limited 14 ‐1.08 687 687 Basic 15 ‐0.99 689 689 Basic 16 ‐0.90 691 691 Basic 17 ‐0.81 692 692 Basic 18 ‐0.73 694 694 Basic 19 ‐0.65 695 695 Basic 20 ‐0.58 697 697 Basic 21 ‐0.50 698 698 Basic 22 ‐0.43 700 700 Proficient 23 ‐0.36 701 701 Proficient 24 ‐0.30 702 702 Proficient 25 ‐0.23 703 703 Proficient 26 ‐0.17 705 705 Proficient 27 ‐0.10 706 706 Proficient 28 ‐0.04 707 707 Proficient 29 0.02 708 708 Proficient 30 0.08 709 709 Proficient 31 0.14 710 710 Proficient 32 0.20 712 712 Proficient 33 0.26 713 713 Proficient 34 0.32 714 714 Proficient 35 0.38 715 715 Proficient 36 0.44 716 716 Proficient 37 0.50 717 717 Proficient 38 0.56 718 718 Proficient 39 0.62 719 719 Proficient
Proficiency 40 0.68 721 721 Proficient 41 0.74 722 722 Proficient 42 0.81 723 723 Proficient 43 0.87 724 724 Proficient 44 0.94 725 725 Accelerated 45 1.01 727 727 Accelerated 46 1.08 728 728 Accelerated 47 1.16 730 730 Accelerated 48 1.23 731 731 Accelerated 49 1.32 733 733 Accelerated 50 1.40 734 734 Accelerated 51 1.49 736 736 Accelerated 52 1.59 738 738 Accelerated 53 1.70 740 740 Advanced 54 1.82 742 742 Advanced 55 1.95 745 745 Advanced 56 2.10 747 747 Advanced 57 2.27 751 751 Advanced 58 2.47 754 754 Advanced 59 2.71 759 759 Advanced 60 3.03 765 765 Advanced 61 3.47 773 773 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced
Table G62. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American Government Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation | Proficiency
Proficiency 0 ‐3.50 642 642 Limited 1 ‐3.50 642 642 Limited 2 ‐3.43 643 643 Limited 3 ‐2.98 651 651 Limited 4 ‐2.66 658 658 Limited 5 ‐2.40 662 662 Limited 6 ‐2.19 666 666 Limited 7 ‐2.00 670 670 Limited 8 ‐1.84 673 673 Limited 9 ‐1.69 676 676 Limited 10 ‐1.55 678 678 Limited 11 ‐1.43 681 681 Limited 12 ‐1.31 683 683 Limited 13 ‐1.21 685 685 Limited 14 ‐1.11 687 687 Basic 15 ‐1.01 689 689 Basic 16 ‐0.92 690 690 Basic 17 ‐0.83 692 692 Basic 18 ‐0.75 694 694 Basic 19 ‐0.67 695 695 Basic 20 ‐0.59 697 697 Basic 21 ‐0.52 698 698 Basic 22 ‐0.45 699 699 Basic 23 ‐0.38 701 701 Proficient 24 ‐0.31 702 702 Proficient 25 ‐0.24 703 703 Proficient 26 ‐0.17 704 704 Proficient 27 ‐0.11 706 706 Proficient 28 ‐0.05 707 707 Proficient 29 0.02 708 708 Proficient 30 0.08 709 709 Proficient 31 0.14 710 710 Proficient 32 0.20 712 712 Proficient 33 0.26 713 713 Proficient 34 0.32 714 714 Proficient 35 0.38 715 715 Proficient 36 0.45 716 716 Proficient 37 0.51 717 717 Proficient 38 0.57 718 718 Proficient 39 0.63 720 720 Proficient
Proficiency 40 0.70 721 721 Proficient 41 0.76 722 722 Proficient 42 0.83 723 723 Proficient 43 0.90 725 725 Accelerated 44 0.97 726 726 Accelerated 45 1.04 727 727 Accelerated 46 1.11 729 729 Accelerated 47 1.19 730 730 Accelerated 48 1.27 732 732 Accelerated 49 1.36 733 733 Accelerated 50 1.45 735 735 Accelerated 51 1.54 737 737 Accelerated 52 1.64 739 739 Advanced 53 1.75 741 741 Advanced 54 1.87 743 743 Advanced 55 2.01 746 746 Advanced 56 2.16 748 748 Advanced 57 2.33 752 752 Advanced 58 2.53 755 755 Advanced 59 2.77 760 760 Advanced 60 3.07 766 766 Advanced 61 3.50 774 774 Advanced 62 3.50 774 774 Advanced 63 3.50 774 774 Advanced
Table G63. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American History Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.50 619 619 Limited 3 ‐3.07 630 630 Limited 4 ‐2.75 639 639 Limited 5 ‐2.50 645 645 Limited 6 ‐2.29 651 651 Limited 7 ‐2.11 655 655 Limited 8 ‐1.94 659 659 Limited 9 ‐1.80 663 663 Limited 10 ‐1.66 667 667 Limited 11 ‐1.54 670 670 Limited 12 ‐1.43 673 673 Limited 13 ‐1.32 676 676 Limited 14 ‐1.22 678 678 Limited 15 ‐1.12 681 681 Limited 16 ‐1.03 683 683 Limited 17 ‐0.94 685 685 Basic 18 ‐0.86 687 687 Basic 19 ‐0.78 689 689 Basic 20 ‐0.70 692 692 Basic 21 ‐0.62 693 693 Basic 22 ‐0.55 695 695 Basic 23 ‐0.48 697 697 Basic 24 ‐0.40 699 699 Basic 25 ‐0.34 701 701 Proficient 26 ‐0.27 703 703 Proficient 27 ‐0.20 704 704 Proficient 28 ‐0.14 706 706 Proficient 29 ‐0.07 708 708 Proficient 30 ‐0.01 709 709 Proficient 31 0.06 711 711 Proficient 32 0.12 713 713 Proficient 33 0.18 714 714 Proficient 34 0.25 716 716 Proficient 35 0.31 717 717 Proficient 36 0.37 719 719 Proficient 37 0.44 721 721 Proficient 38 0.50 722 722 Proficient 39 0.57 724 724 Proficient
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 0.63 726 726 Accelerated 41 0.70 728 728 Accelerated 42 0.77 729 729 Accelerated 43 0.84 731 731 Accelerated 44 0.91 733 733 Accelerated 45 0.99 735 735 Accelerated 46 1.06 737 737 Accelerated 47 1.14 739 739 Advanced 48 1.23 741 741 Advanced 49 1.31 743 743 Advanced 50 1.41 746 746 Advanced 51 1.50 748 748 Advanced 52 1.61 751 751 Advanced 53 1.72 754 754 Advanced 54 1.84 757 757 Advanced 55 1.98 760 760 Advanced 56 2.13 764 764 Advanced 57 2.30 769 769 Advanced 58 2.50 774 774 Advanced 59 2.74 780 780 Advanced 60 3.04 788 788 Advanced 61 3.46 799 799 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced
Table G64. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – American History Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 619 619 Limited 1 ‐3.50 619 619 Limited 2 ‐3.39 622 622 Limited 3 ‐2.96 633 633 Limited 4 ‐2.65 641 641 Limited 5 ‐2.40 648 648 Limited 6 ‐2.19 653 653 Limited 7 ‐2.01 658 658 Limited 8 ‐1.85 662 662 Limited 9 ‐1.71 665 665 Limited 10 ‐1.58 669 669 Limited 11 ‐1.46 672 672 Limited 12 ‐1.35 675 675 Limited 13 ‐1.24 678 678 Limited 14 ‐1.14 680 680 Limited 15 ‐1.05 682 682 Limited 16 ‐0.96 685 685 Basic 17 ‐0.87 687 687 Basic 18 ‐0.79 689 689 Basic 19 ‐0.71 691 691 Basic 20 ‐0.64 693 693 Basic 21 ‐0.56 695 695 Basic 22 ‐0.49 697 697 Basic 23 ‐0.42 699 699 Basic 24 ‐0.35 700 700 Proficient 25 ‐0.28 702 702 Proficient 26 ‐0.22 704 704 Proficient 27 ‐0.15 706 706 Proficient 28 ‐0.09 707 707 Proficient 29 ‐0.03 709 709 Proficient 30 0.03 710 710 Proficient 31 0.10 712 712 Proficient 32 0.16 714 714 Proficient 33 0.22 715 715 Proficient 34 0.28 717 717 Proficient 35 0.34 718 718 Proficient 36 0.40 720 720 Proficient 37 0.46 721 721 Proficient 38 0.53 723 723 Proficient 39 0.59 725 725 Accelerated
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 0.65 726 726 Accelerated 41 0.72 728 728 Accelerated 42 0.79 730 730 Accelerated 43 0.85 731 731 Accelerated 44 0.92 733 733 Accelerated 45 1.00 735 735 Accelerated 46 1.07 737 737 Accelerated 47 1.15 739 739 Advanced 48 1.23 741 741 Advanced 49 1.31 743 743 Advanced 50 1.40 746 746 Advanced 51 1.50 748 748 Advanced 52 1.60 751 751 Advanced 53 1.71 754 754 Advanced 54 1.83 757 757 Advanced 55 1.96 760 760 Advanced 56 2.11 764 764 Advanced 57 2.27 768 768 Advanced 58 2.47 773 773 Advanced 59 2.70 779 779 Advanced 60 3.00 787 787 Advanced 61 3.42 798 798 Advanced 62 3.50 800 800 Advanced 63 3.50 800 800 Advanced
Table G65. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Science Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 559 559 Limited 1 ‐3.50 559 559 Limited 2 ‐3.50 559 559 Limited 3 ‐3.25 569 569 Limited 4 ‐2.92 582 582 Limited 5 ‐2.66 593 593 Limited 6 ‐2.44 602 602 Limited 7 ‐2.25 610 610 Limited 8 ‐2.08 617 617 Limited 9 ‐1.92 623 623 Limited 10 ‐1.78 629 629 Limited 11 ‐1.65 635 635 Limited 12 ‐1.52 640 640 Limited 13 ‐1.40 645 645 Limited 14 ‐1.28 649 649 Limited 15 ‐1.17 654 654 Limited 16 ‐1.06 658 658 Limited 17 ‐0.96 663 664 Basic 18 ‐0.86 667 667 Basic 19 ‐0.76 671 671 Basic 20 ‐0.67 675 675 Basic 21 ‐0.57 678 678 Basic 22 ‐0.48 682 682 Basic 23 ‐0.39 686 686 Basic 24 ‐0.30 690 690 Basic 25 ‐0.21 693 693 Basic 26 ‐0.12 697 697 Basic 27 ‐0.03 701 701 Proficient 28 0.06 704 704 Proficient 29 0.15 708 708 Proficient 30 0.24 712 712 Proficient 31 0.33 715 715 Proficient 32 0.42 719 719 Proficient 33 0.51 723 723 Proficient 34 0.60 726 726 Accelerated 35 0.70 730 730 Accelerated 36 0.79 734 734 Accelerated 37 0.89 738 738 Accelerated 38 0.99 742 742 Accelerated 39 1.09 746 746 Accelerated
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.20 751 751 Accelerated 41 1.31 755 755 Advanced 42 1.43 760 760 Advanced 43 1.55 765 765 Advanced 44 1.68 770 770 Advanced 45 1.82 776 776 Advanced 46 1.98 782 782 Advanced 47 2.14 789 789 Advanced 48 2.33 797 797 Advanced 49 2.54 805 805 Advanced 50 2.78 815 815 Advanced 51 3.06 827 827 Advanced 52 3.42 841 841 Advanced 53 3.50 845 845 Advanced 54 3.50 845 845 Advanced 55 3.50 845 845 Advanced
Table G66. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 5 Science Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 559 559 Limited 1 ‐3.50 559 559 Limited 2 ‐3.50 559 559 Limited 3 ‐3.26 569 569 Limited 4 ‐2.93 582 582 Limited 5 ‐2.67 593 593 Limited 6 ‐2.45 602 602 Limited 7 ‐2.26 610 610 Limited 8 ‐2.08 617 617 Limited 9 ‐1.92 623 623 Limited 10 ‐1.78 629 629 Limited 11 ‐1.64 635 635 Limited 12 ‐1.51 640 640 Limited 13 ‐1.39 645 645 Limited 14 ‐1.27 650 650 Limited 15 ‐1.16 654 654 Limited 16 ‐1.05 659 659 Limited 17 ‐0.95 663 664 Basic 18 ‐0.84 667 667 Basic 19 ‐0.74 671 671 Basic 20 ‐0.65 675 675 Basic 21 ‐0.55 679 679 Basic 22 ‐0.46 683 683 Basic 23 ‐0.36 687 687 Basic 24 ‐0.27 691 691 Basic 25 ‐0.18 694 694 Basic 26 ‐0.09 698 698 Basic 27 0.00 702 702 Proficient 28 0.09 705 705 Proficient 29 0.18 709 709 Proficient 30 0.27 713 713 Proficient 31 0.36 716 716 Proficient 32 0.45 720 720 Proficient 33 0.54 724 725 Accelerated 34 0.63 728 728 Accelerated 35 0.73 731 731 Accelerated 36 0.82 735 735 Accelerated 37 0.92 739 739 Accelerated 38 1.02 743 743 Accelerated 39 1.12 748 748 Accelerated
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.23 752 753 Advanced 41 1.34 756 756 Advanced 42 1.46 761 761 Advanced 43 1.58 766 766 Advanced 44 1.71 771 771 Advanced 45 1.85 777 777 Advanced 46 2.00 783 783 Advanced 47 2.16 790 790 Advanced 48 2.34 797 797 Advanced 49 2.55 806 806 Advanced 50 2.78 815 815 Advanced 51 3.06 827 827 Advanced 52 3.41 841 841 Advanced 53 3.50 845 845 Advanced 54 3.50 845 845 Advanced 55 3.50 845 845 Advanced
Table G67. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Science Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 575 575 Limited 1 ‐3.50 575 575 Limited 2 ‐3.50 575 575 Limited 3 ‐3.40 579 579 Limited 4 ‐3.07 593 593 Limited 5 ‐2.80 604 604 Limited 6 ‐2.58 613 613 Limited 7 ‐2.38 622 622 Limited 8 ‐2.20 629 629 Limited 9 ‐2.04 636 636 Limited 10 ‐1.89 642 642 Limited 11 ‐1.75 648 648 Limited 12 ‐1.62 653 653 Limited 13 ‐1.49 659 659 Limited 14 ‐1.37 664 664 Limited 15 ‐1.26 669 669 Limited 16 ‐1.14 673 674 Basic 17 ‐1.04 678 678 Basic 18 ‐0.93 682 682 Basic 19 ‐0.83 686 686 Basic 20 ‐0.73 691 691 Basic 21 ‐0.63 695 695 Basic 22 ‐0.54 699 700 Proficient 23 ‐0.44 703 703 Proficient 24 ‐0.35 707 707 Proficient 25 ‐0.25 710 710 Proficient 26 ‐0.16 714 714 Proficient 27 ‐0.07 718 718 Proficient 28 0.02 722 722 Proficient 29 0.11 726 726 Accelerated 30 0.21 730 730 Accelerated 31 0.30 734 734 Accelerated 32 0.40 738 738 Accelerated 33 0.49 742 742 Accelerated 34 0.59 746 746 Accelerated 35 0.69 750 750 Accelerated 36 0.79 754 754 Accelerated 37 0.89 758 758 Accelerated 38 1.00 763 763 Accelerated 39 1.11 768 768 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.22 772 772 Advanced 41 1.34 777 777 Advanced 42 1.46 782 782 Advanced 43 1.59 788 788 Advanced 44 1.73 793 793 Advanced 45 1.87 799 799 Advanced 46 2.02 806 806 Advanced 47 2.18 813 813 Advanced 48 2.36 820 820 Advanced 49 2.56 828 828 Advanced 50 2.79 838 838 Advanced 51 3.06 849 849 Advanced 52 3.39 863 863 Advanced 53 3.50 868 868 Advanced 54 3.50 868 868 Advanced 55 3.50 868 868 Advanced
Table G68. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Grade 8 Science Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 575 575 Limited 1 ‐3.50 575 575 Limited 2 ‐3.50 575 575 Limited 3 ‐3.26 585 585 Limited 4 ‐2.93 598 598 Limited 5 ‐2.67 610 610 Limited 6 ‐2.44 619 619 Limited 7 ‐2.25 627 627 Limited 8 ‐2.07 634 634 Limited 9 ‐1.91 641 641 Limited 10 ‐1.77 647 647 Limited 11 ‐1.63 653 653 Limited 12 ‐1.50 658 658 Limited 13 ‐1.38 663 663 Limited 14 ‐1.27 668 668 Limited 15 ‐1.16 673 674 Basic 16 ‐1.05 677 677 Basic 17 ‐0.95 681 681 Basic 18 ‐0.85 686 686 Basic 19 ‐0.75 690 690 Basic 20 ‐0.66 694 694 Basic 21 ‐0.57 697 697 Basic 22 ‐0.48 701 701 Proficient 23 ‐0.39 705 705 Proficient 24 ‐0.30 708 708 Proficient 25 ‐0.22 712 712 Proficient 26 ‐0.13 716 716 Proficient 27 ‐0.05 719 719 Proficient 28 0.04 723 723 Proficient 29 0.12 726 726 Accelerated 30 0.21 730 730 Accelerated 31 0.29 733 733 Accelerated 32 0.38 737 737 Accelerated 33 0.47 741 741 Accelerated 34 0.56 744 744 Accelerated 35 0.65 748 748 Accelerated 36 0.74 752 752 Accelerated 37 0.83 756 756 Accelerated 38 0.93 760 760 Accelerated 39 1.03 764 766 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.14 769 769 Advanced 41 1.25 773 773 Advanced 42 1.36 778 778 Advanced 43 1.48 783 783 Advanced 44 1.61 789 789 Advanced 45 1.75 794 794 Advanced 46 1.89 800 800 Advanced 47 2.05 807 807 Advanced 48 2.23 814 814 Advanced 49 2.42 822 822 Advanced 50 2.64 832 832 Advanced 51 2.91 843 843 Advanced 52 3.24 857 857 Advanced 53 3.50 868 868 Advanced 54 3.50 868 868 Advanced 55 3.50 868 868 Advanced
Table G69. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Biology Science Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.43 619 619 Limited 3 ‐3.00 631 631 Limited 4 ‐2.68 641 641 Limited 5 ‐2.42 648 648 Limited 6 ‐2.21 655 655 Limited 7 ‐2.03 660 660 Limited 8 ‐1.87 665 665 Limited 9 ‐1.72 669 669 Limited 10 ‐1.59 673 673 Limited 11 ‐1.47 677 677 Limited 12 ‐1.35 680 680 Limited 13 ‐1.25 683 683 Limited 14 ‐1.14 686 686 Basic 15 ‐1.05 689 689 Basic 16 ‐0.95 692 692 Basic 17 ‐0.86 694 694 Basic 18 ‐0.78 697 697 Basic 19 ‐0.69 699 700 Proficient 20 ‐0.61 702 702 Proficient 21 ‐0.53 704 704 Proficient 22 ‐0.45 706 706 Proficient 23 ‐0.38 709 709 Proficient 24 ‐0.30 711 711 Proficient 25 ‐0.23 713 713 Proficient 26 ‐0.15 715 715 Proficient 27 ‐0.08 718 718 Proficient 28 0.00 720 720 Proficient 29 0.07 722 722 Proficient 30 0.14 724 724 Proficient 31 0.22 726 726 Accelerated 32 0.29 728 728 Accelerated 33 0.37 731 731 Accelerated 34 0.45 733 733 Accelerated 35 0.52 735 735 Advanced 36 0.60 738 738 Advanced 37 0.68 740 740 Advanced 38 0.77 742 742 Advanced 39 0.85 745 745 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 0.94 748 748 Advanced 41 1.03 750 750 Advanced 42 1.13 753 753 Advanced 43 1.23 756 756 Advanced 44 1.34 759 759 Advanced 45 1.45 763 763 Advanced 46 1.57 766 766 Advanced 47 1.71 770 770 Advanced 48 1.86 775 775 Advanced 49 2.03 780 780 Advanced 50 2.22 785 785 Advanced 51 2.45 792 792 Advanced 52 2.73 800 800 Advanced 53 3.10 811 811 Advanced 54 3.50 823 823 Advanced 55 3.50 823 823 Advanced 56 3.50 823 823 Advanced
Table G70. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Biology Science Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 617 617 Limited 1 ‐3.50 617 617 Limited 2 ‐3.22 625 625 Limited 3 ‐2.79 637 637 Limited 4 ‐2.48 647 647 Limited 5 ‐2.23 654 654 Limited 6 ‐2.03 660 660 Limited 7 ‐1.85 665 665 Limited 8 ‐1.69 670 670 Limited 9 ‐1.55 674 674 Limited 10 ‐1.42 678 678 Limited 11 ‐1.31 681 681 Limited 12 ‐1.20 684 685 Basic 13 ‐1.09 688 688 Basic 14 ‐1.00 690 690 Basic 15 ‐0.90 693 693 Basic 16 ‐0.82 696 696 Basic 17 ‐0.73 698 698 Basic 18 ‐0.65 701 701 Proficient 19 ‐0.57 703 703 Proficient 20 ‐0.49 705 705 Proficient 21 ‐0.42 708 708 Proficient 22 ‐0.34 710 710 Proficient 23 ‐0.27 712 712 Proficient 24 ‐0.20 714 714 Proficient 25 ‐0.12 716 716 Proficient 26 ‐0.05 718 718 Proficient 27 0.02 720 720 Proficient 28 0.09 722 722 Proficient 29 0.16 724 725 Accelerated 30 0.23 727 727 Accelerated 31 0.30 729 729 Accelerated 32 0.37 731 731 Accelerated 33 0.44 733 733 Accelerated 34 0.52 735 735 Advanced 35 0.59 737 737 Advanced 36 0.67 740 740 Advanced 37 0.75 742 742 Advanced 38 0.83 744 744 Advanced 39 0.91 747 747 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.00 749 749 Advanced 41 1.09 752 752 Advanced 42 1.19 755 755 Advanced 43 1.29 758 758 Advanced 44 1.40 761 761 Advanced 45 1.51 764 764 Advanced 46 1.64 768 768 Advanced 47 1.78 772 772 Advanced 48 1.93 777 777 Advanced 49 2.10 782 782 Advanced 50 2.29 787 787 Advanced 51 2.52 794 794 Advanced 52 2.79 802 802 Advanced 53 3.14 813 813 Advanced 54 3.50 823 823 Advanced 55 3.50 823 823 Advanced 56 3.50 823 823 Advanced
Table G71. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Physical Science Online
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.47 635 635 Limited 3 ‐3.03 646 646 Limited 4 ‐2.71 654 654 Limited 5 ‐2.45 661 661 Limited 6 ‐2.23 667 667 Limited 7 ‐2.04 671 671 Limited 8 ‐1.88 676 676 Limited 9 ‐1.72 680 680 Limited 10 ‐1.58 683 684 Basic 11 ‐1.46 687 687 Basic 12 ‐1.33 690 690 Basic 13 ‐1.22 693 693 Basic 14 ‐1.11 696 696 Basic 15 ‐1.01 698 698 Basic 16 ‐0.91 701 701 Proficient 17 ‐0.82 703 703 Proficient 18 ‐0.72 706 706 Proficient 19 ‐0.64 708 708 Proficient 20 ‐0.55 710 710 Proficient 21 ‐0.47 712 712 Proficient 22 ‐0.38 714 714 Proficient 23 ‐0.30 717 717 Proficient 24 ‐0.22 719 719 Proficient 25 ‐0.15 721 721 Proficient 26 ‐0.07 723 723 Proficient 27 0.01 725 725 Accelerated 28 0.08 727 727 Accelerated 29 0.16 729 729 Accelerated 30 0.24 731 731 Accelerated 31 0.31 733 733 Accelerated 32 0.39 735 735 Accelerated 33 0.47 737 737 Accelerated 34 0.55 739 739 Accelerated 35 0.63 741 741 Accelerated 36 0.71 743 743 Accelerated 37 0.79 745 745 Accelerated 38 0.88 747 747 Accelerated 39 0.97 749 749 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.06 752 752 Advanced 41 1.15 754 754 Advanced 42 1.25 757 757 Advanced 43 1.35 759 759 Advanced 44 1.46 762 762 Advanced 45 1.57 765 765 Advanced 46 1.70 768 768 Advanced 47 1.83 772 772 Advanced 48 1.98 776 776 Advanced 49 2.14 780 780 Advanced 50 2.32 785 785 Advanced 51 2.54 790 790 Advanced 52 2.79 797 797 Advanced 53 3.11 805 805 Advanced 54 3.50 815 815 Advanced 55 3.50 815 815 Advanced 56 3.50 815 815 Advanced
Table G72. Comparison of Scaled Scores Before and After Application of Ohio Rounding and Truncation Rules Spring 2018 – Physical Science Paper
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 0 ‐3.50 634 634 Limited 1 ‐3.50 634 634 Limited 2 ‐3.50 634 634 Limited 3 ‐3.12 644 644 Limited 4 ‐2.80 652 652 Limited 5 ‐2.54 659 659 Limited 6 ‐2.32 664 664 Limited 7 ‐2.13 669 669 Limited 8 ‐1.96 674 674 Limited 9 ‐1.80 678 678 Limited 10 ‐1.66 681 681 Limited 11 ‐1.53 685 685 Basic 12 ‐1.40 688 688 Basic 13 ‐1.29 691 691 Basic 14 ‐1.17 694 694 Basic 15 ‐1.07 697 697 Basic 16 ‐0.96 699 700 Proficient 17 ‐0.87 702 702 Proficient 18 ‐0.77 704 704 Proficient 19 ‐0.68 707 707 Proficient 20 ‐0.59 709 709 Proficient 21 ‐0.50 711 711 Proficient 22 ‐0.42 714 714 Proficient 23 ‐0.33 716 716 Proficient 24 ‐0.25 718 718 Proficient 25 ‐0.17 720 720 Proficient 26 ‐0.09 722 722 Proficient 27 ‐0.01 724 724 Proficient 28 0.07 726 726 Accelerated 29 0.15 728 728 Accelerated 30 0.23 730 730 Accelerated 31 0.31 732 732 Accelerated 32 0.39 735 735 Accelerated 33 0.47 737 737 Accelerated 34 0.56 739 739 Accelerated 35 0.64 741 741 Accelerated 36 0.72 743 743 Accelerated 37 0.81 745 745 Accelerated 38 0.90 748 748 Accelerated 39 0.99 750 750 Advanced
Raw Score | Ohio Theta | Scaled Score Before Ohio Rounding/Truncation | Scaled Score After Ohio Rounding/Truncation |
Proficiency 40 1.09 753 753 Advanced 41 1.19 755 755 Advanced 42 1.29 758 758 Advanced 43 1.40 761 761 Advanced 44 1.51 764 764 Advanced 45 1.63 767 767 Advanced 46 1.76 770 770 Advanced 47 1.90 774 774 Advanced 48 2.06 778 778 Advanced 49 2.23 782 782 Advanced 50 2.42 787 787 Advanced 51 2.64 793 793 Advanced 52 2.90 800 800 Advanced 53 3.23 808 808 Advanced 54 3.50 815 815 Advanced 55 3.50 815 815 Advanced 56 3.50 815 815 Advanced
Table H.1 DRC Item Summary Report – Grade 3 ELA Fall 2017
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %F, %T, %U
31679 Conventions 8,834 84 16 0 37,914 9 39 47 0 0 0 0 0 6 Purpose/Organization 8,834 80 20 0 37,914 21 45 24 3 0 0 0 1 6 Evidence/Elaboration 8,834 80 20 0 37,914 21 47 23 3 0 0 0 1 6
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
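The note above defines the double-read agreement statistics reported in these tables. As a minimal illustration only (hypothetical code, not DRC's handscoring system), the percentages could be computed from paired reader scores as follows; the function name and the example data are made up.

# Minimal sketch of the inter-rater agreement statistics defined in the note:
# %EX counts identical reader scores, %AD counts scores one point apart, and
# %NA counts scores two or more points apart, each as a percent of double-scored responses.
def agreement_rates(reader1, reader2):
    pairs = list(zip(reader1, reader2))
    n = len(pairs)
    exact = sum(1 for a, b in pairs if a == b)
    adjacent = sum(1 for a, b in pairs if abs(a - b) == 1)
    nonadjacent = n - exact - adjacent
    return {"2X": n,
            "%EX": round(100 * exact / n),
            "%AD": round(100 * adjacent / n),
            "%NA": round(100 * nonadjacent / n)}

# Example with ten hypothetical double-scored Conventions responses (0-2 score points):
print(agreement_rates([2, 1, 0, 2, 1, 2, 0, 1, 2, 1],
                      [2, 1, 1, 2, 1, 2, 0, 0, 2, 1]))
# -> {'2X': 10, '%EX': 80, '%AD': 20, '%NA': 0}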
Table H.2 DRC Item Summary Report – High School ELA I Fall 2017
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %F, %T, %U
31578 Conventions 6,040 91 9 0 17,216 22 30 31 0 0 0 0 0 17 Purpose/Organization 6,040 90 10 0 17,216 17 41 18 5 1 0 0 1 17 Evidence/Elaboration 6,040 90 10 0 17,216 19 40 18 3 1 0 0 1 17
31588 Conventions 3,388 92 8 0 9,021 16 35 27 0 0 0 0 0 22 Purpose/Organization 3,388 94 6 0 9,021 4 45 19 8 1 0 0 0 22 Evidence/Elaboration 3,388 93 7 0 9,021 11 40 20 5 1 0 0 0 22
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.3 DRC Item Summary Report – High School ELA II Fall 2017
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %F, %T, %U
31513 Conventions 5,304 81 19 0 23,034 35 35 24 0 0 0 0 0 5 Purpose/Organization 5,304 82 18 0 23,034 31 44 16 3 0 0 0 1 5 Evidence/Elaboration 5,304 82 18 0 23,034 41 37 13 3 0 0 0 1 5
31662 Conventions 2,194 90 10 0 6,997 29 26 31 0 0 0 0 0 15 Purpose/Organization 2,194 90 10 0 6,997 9 44 22 9 1 0 0 0 15 Evidence/Elaboration 2,194 86 14 0 6,997 25 29 23 7 1 0 0 0 15
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.4 DRC Item Summary Report – Grade 3 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31664 First 600
Conventions 1,532 81 18 1 1,532 9 28 35 0 0 2 0 0 0 25 Purpose/Organization 1,532 81 18 1 1,532 22 35 12 2 1 2 0 0 1 25 Evidence/Elaboration 1,532 80 19 1 1,532 23 34 12 2 1 2 0 0 1 25
31664 Online
Conventions 124,592 83 17 0 124,592 4 26 36 0 0 0 34 0 0 0 Purpose/Organization 124,592 75 23 2 124,592 11 33 18 3 0 0 34 0 0 0 Evidence/Elaboration 124,592 77 21 1 124,592 11 37 15 2 0 0 34 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.5 DRC Item Summary Report – Grade 4 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31960 First 600
Conventions 1,490 84 16 0 1,490 7 37 41 0 0 3 0 0 0 12 Purpose/Organization 1,490 79 20 1 1,490 13 32 30 8 1 3 0 0 1 12 Evidence/Elaboration 1,490 80 19 0 1,490 10 33 31 8 1 3 0 0 1 12
31960 Online
Conventions 97,344 73 26 0 97,344 5 41 41 0 0 0 14 0 0 0 Purpose/Organization 97,344 65 33 2 97,344 6 29 34 14 2 0 14 0 0 0 Evidence/Elaboration 97,344 68 30 1 97,344 5 31 32 15 2 0 14 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.6 DRC Item Summary Report – Grade 5 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
32035 First 600
Conventions 1,200 82 17 1 1,200 5 28 57 0 0 2 0 0 0 7 Purpose/Organization 1,200 77 21 2 1,200 4 35 41 9 1 2 0 0 1 7 Evidence/Elaboration 1,200 78 20 2 1,200 5 39 40 5 1 2 0 0 1 7
32035 Online
Conventions 119,148 80 20 0 119,148 5 24 66 0 0 0 5 0 0 0 Purpose/Organization 119,148 73 26 0 119,148 2 32 45 14 1 0 5 0 0 0 Evidence/Elaboration 119,148 68 32 0 119,148 2 42 39 10 1 0 5 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.7 DRC Item Summary Report – Grade 6 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31711 First 600
Conventions 1,310 76 22 2 1,310 7 33 51 0 0 5 0 0 0 5 Purpose/Organization 1,310 77 22 1 1,310 9 35 33 11 1 5 0 0 1 5 Evidence/Elaboration 1,310 73 25 2 1,310 10 38 31 9 1 5 0 0 1 5
31711 Online
Conventions 128,596 82 18 0 128,596 5 21 70 0 0 0 4 0 0 0 Purpose/Organization 128,596 69 31 0 128,596 3 26 40 25 2 0 4 0 0 0 Evidence/Elaboration 128,596 69 31 0 128,596 3 30 40 21 2 0 4 0 0 0
31766 First 600
Conventions 1,310 80 20 1 1,310 4 27 59 0 0 2 0 0 0 8 Purpose/Organization 1,310 68 29 4 1,310 5 30 32 17 3 2 0 0 2 8 Evidence/Elaboration 1,310 67 30 3 1,310 13 23 38 12 2 2 0 0 2 8
31766 Online
Conventions 123,336 81 18 1 123,336 5 20 67 0 0 0 8 0 0 0 Purpose/Organization 123,336 63 35 2 123,336 3 26 30 30 3 0 8 0 0 0 Evidence/Elaboration 123,336 61 37 2 123,336 5 23 37 23 4 0 8 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.8 DRC Item Summary Report – Grade 7 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31604 First 600
Conventions 1,366 84 15 1 1,366 4 20 68 0 0 3 0 0 0 4 Purpose/Organization 1,366 74 25 1 1,366 3 33 40 14 2 3 0 0 1 4 Evidence/Elaboration 1,366 71 27 2 1,366 11 33 32 15 1 3 0 0 1 4
31604 Online
Conventions 95,876 84 15 0 95,874 3 18 76 0 0 0 3 0 0 0 Purpose/Organization 95,876 64 34 1 95,875 1 24 48 22 2 0 3 0 0 0 Evidence/Elaboration 95,876 66 33 1 95,875 2 32 38 22 2 0 3 0 0 0
31978 First 600
Conventions 1,366 83 17 0 1,366 6 23 61 0 0 4 0 0 0 6 Purpose/Organization 1,366 76 23 1 1,366 4 39 32 11 2 4 0 0 2 6 Evidence/Elaboration 1,366 72 25 2 1,366 9 37 30 11 2 4 0 0 2 6
31978 Online
Conventions 138,058 76 23 1 138,058 6 25 63 0 0 0 5 0 0 0 Purpose/Organization 138,058 68 32 1 138,058 1 36 37 18 1 0 5 0 1 0 Evidence/Elaboration 138,058 67 32 1 138,058 3 32 39 18 2 0 5 0 1 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.9 DRC Item Summary Report – Grade 8 ELA Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
32037 First 600
Conventions 1,196 86 12 2 1,196 10 15 64 0 0 6 0 0 0 5 Purpose/Organization 1,196 75 25 1 1,196 8 30 30 15 2 6 0 0 4 5 Evidence/Elaboration 1,196 70 28 2 1,196 17 29 28 9 1 6 0 0 4 5
32037 Online
Conventions 112,342 85 15 0 112,342 6 14 77 0 0 0 3 0 0 1 Purpose/Organization 112,342 62 38 0 112,342 3 23 42 23 4 0 3 0 1 1 Evidence/Elaboration 112,342 60 40 0 112,342 5 26 36 24 4 0 3 0 1 1
32110 First 600
Conventions 1,196 82 17 1 1,196 10 19 63 0 0 4 0 0 0 5 Purpose/Organization 1,196 75 24 1 1,196 3 30 45 12 1 4 0 0 1 5 Evidence/Elaboration 1,196 72 26 1 1,196 9 38 33 10 1 4 0 0 1 5
32110 Online
Conventions 154,432 88 12 0 154,432 4 11 83 0 0 0 3 0 0 0 Purpose/Organization 154,432 63 36 0 154,432 1 19 54 22 1 0 3 0 0 0 Evidence/Elaboration 154,432 66 34 0 154,432 2 22 53 19 1 0 3 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.10 DRC Item Summary Report – High School ELA I Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31555 First 600
Conventions 1,338 79 20 1 1,338 10 34 33 0 0 11 0 0 0 12 Purpose/Organization 1,338 82 18 0 1,338 8 47 17 4 0 11 0 0 2 12 Evidence/Elaboration 1,338 87 13 0 1,338 6 48 17 3 0 11 0 0 2 12
31555 Online
Conventions 155,198 79 21 0 155,198 9 24 58 0 0 0 8 0 0 1 Purpose/Organization 155,198 74 26 0 155,198 4 41 29 16 0 0 8 0 1 1 Evidence/Elaboration 155,198 75 25 0 155,198 3 42 29 15 1 0 8 0 1 1
31583 First 600
Conventions 1,338 73 25 1 1,338 12 37 41 0 0 5 0 0 0 4 Purpose/Organization 1,338 79 20 1 1,338 2 56 25 6 1 5 0 0 1 4 Evidence/Elaboration 1,338 77 22 1 1,338 16 39 30 5 0 5 0 0 1 4
31583 Online
Conventions 127,958 84 15 0 127,958 7 16 72 0 0 0 4 0 0 0 Purpose/Organization 127,958 64 35 1 127,958 1 33 44 17 0 0 4 0 0 0 Evidence/Elaboration 127,958 68 32 0 127,958 5 21 46 22 1 0 4 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Table H.11 DRC Item Summary Report – High School ELA II Spring 2018
Item ID | Dimension | Inter-Rater Reliability: 2X, %EX, %AD, %NA | Score Point Distribution: Total, %0, %1, %2, %3, %4, %B, %C, %F, %T, %U
31605 First 600
Conventions 1,164 79 19 2 1,164 17 32 33 0 0 8 0 0 0 10 Purpose/Organization 1,164 77 22 1 1,164 25 35 14 7 1 8 0 0 1 10 Evidence/Elaboration 1,164 80 19 1 1,164 28 33 14 5 1 8 0 0 1 10
31605 Online
Conventions 130,670 93 7 0 130,670 2 6 87 0 0 0 5 0 0 0 Purpose/Organization 130,670 66 33 1 130,670 2 22 47 20 4 0 5 0 0 0 Evidence/Elaboration 130,670 68 31 1 130,670 3 21 53 15 3 0 5 0 0 0
31622 First 600
Conventions 1,164 80 19 1 1,164 19 30 37 0 0 8 0 0 0 5 Purpose/Organization 1,164 88 12 1 1,164 4 48 25 8 1 8 0 0 0 5 Evidence/Elaboration 1,164 84 15 1 1,164 28 30 22 5 1 8 0 0 0 5
31622 Online
Conventions 118,230 91 8 0 118,230 6 8 83 0 0 0 3 0 0 0 Purpose/Organization 118,230 70 30 0 118,230 1 17 47 30 2 0 3 0 0 0 Evidence/Elaboration 118,230 67 33 1 118,230 7 18 46 22 3 0 3 0 0 0
Note: 2x = the number of student responses scored by two readers, %EX = the percent of student responses that reader 1 and reader 2 were in exact agreement (3,3), %AD = the percent of student responses that reader 1 and reader 2 scores were adjacent (1, 2), %NA = the percent of student responses that reader 1 and reader 2 were non‐adjacent (0, 2), %B = percent of Blank/No response, %U = percent of Unreadable responses, %F = percent of Foreign Language responses, %T = percent of Off Topic responses
Fall 2017 Grade 3 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G3ELA SP16 with OH Online G3ELA FA17 (difference = SP16 minus FA17).]
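The Appendix I panels plot, for each test, the test characteristic curve (expected score as a proportion of the maximum possible points), the conditional standard error of measurement, and the difference between the two TCCs across the theta scale. The sketch below shows how such curves can be computed from item parameters; it assumes dichotomous 2PL items with made-up parameters purely for illustration, whereas the operational curves are based on the OST item calibrations, not on these values.

# Illustrative sketch: TCC (expected proportion of maximum score) and CSEM at a
# given theta for a set of 2PL items, plus the TCC difference between two forms.
import math

def tcc_proportion(items, theta):
    # Expected proportion-correct score at theta for 2PL items given as (a, b) pairs.
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for a, b in items) / len(items)

def csem(items, theta):
    # Conditional SEM = 1 / sqrt(test information) at theta.
    info = 0.0
    for a, b in items:
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        info += a * a * p * (1.0 - p)
    return 1.0 / math.sqrt(info)

# Hypothetical item parameters for two forms of the same test (illustration only).
form_sp16 = [(1.0, -1.0), (1.2, 0.0), (0.8, 0.5), (1.1, 1.2)]
form_fa17 = [(0.9, -0.8), (1.3, 0.1), (0.7, 0.4), (1.0, 1.1)]

for theta in [-2, -1, 0, 1, 2]:
    diff = tcc_proportion(form_sp16, theta) - tcc_proportion(form_fa17, theta)
    print(theta, round(tcc_proportion(form_sp16, theta), 3),
          round(csem(form_sp16, theta), 3), round(diff, 3))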
Fall 2017 HS1 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G9ELA SP16 with OH Online G9ELA FA17 (difference = SP16 minus FA17).]
Fall 2017 HS2 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G10ELA SP16 with OH Online G10ELA FA17 (difference = SP16 minus FA17).]
Fall 2017 Algebra
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Alg SP16 with OH Online Alg FA17 (difference = SP16 minus FA17).]
Fall 2017 Geometry
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Geo SP16 with OH Online Geo FA17 (difference = SP16 minus FA17).]
Fall 2017 Integrated Math 1
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Int Math1 SP16 with OH Online Int Math1 FA17 (difference = SP16 minus FA17).]
Fall 2017 Integrated Math 2
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Int Math2 SP16 with OH Online Int Math2 FA17 (difference = SP16 minus FA17).]
Fall 2017 Biology
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online HS Biology SP16 with OH Online HS Biology FA17 (difference = SP16 minus FA17).]
Fall 2017 American Government
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online HS AG SP16 with OH Online HS AG FA17 (difference = SP16 minus FA17).]
Fall 2017 American History
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online HS AH SP16 with OH Online HS AH FA17 (difference = SP16 minus FA17).]
Spring 2018 Grade 3 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G3ELA SP16 with OH OL PP G3ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 4 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G4ELA SP16 with OH OL PP G4ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 5 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G5ELA SP16 with OH OL PP G5ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 6 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G6ELA SP16 with OH OL PP G6ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 7 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G7ELA SP16 with OH OL PP G7ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 8 ELA
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G8ELA SP16 with OH OL PP G8ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 High School ELA I
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G9ELA SP16 with OH OL PP G9ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 High School ELA II
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G10ELA SP16 with OH OL PP G10ELA SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 3 Math - Online
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G3M SP16 with OH Online G3M SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 3 Math ‐ Paper
Spring 2018 Grade 4 Math
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G4M SP16 with OH Online_Paper G4M SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 5 Math
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G5M SP16 with OH Online/Paper G5M SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 6 Math
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G6M SP16 with OH Online_Paper G6M SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 7 Math
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G7M SP16 with OH Online/Paper G7M SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 8 Math
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G8M SP16 with OH Online/Paper G8M SP18 (difference = SP16 minus SP18).]
Spring 2018 Algebra
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Alg SP16 with OH OLPP Alg SP18 (difference = SP16 minus SP18).]
Spring 2018 Geometry
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Geo SP16 with OH Online_PP Geo SP18 (difference = SP16 minus SP18).]
Spring 2018 Integrated Math 1
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Int Math1 SP16 with OH Online/Paper Int Math1 SP18 (difference = SP16 minus SP18).]
Spring 2018 Integrated Math 2
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online Int Math2 SP16 with OH Online/Paper Int Math2 SP18 (difference = SP16 minus SP18).]
Spring 2018 Grade 5 Science - Online
[Figure: Test Characteristic Curves, CSEM, and TCC proportional-score difference plotted against theta, comparing OH Online G5S SP16 with OH Online G5S SP18 (difference = SP16 minus SP18).]
Ohio’s State Tests — Fall 2017 Administration & Spring 2018 Administration Technical Report
I-31 American Institutes for Research
Spring 2018 Grade 5 Science - Paper
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online G5S SP16 vs. OH Paper G5S SP18]
Spring 2018 Grade 8 Science - Online
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online G8S SP16 vs. OH Online G8S SP18]
Spring 2018 Grade 8 Science - Paper
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online G8S SP16 vs. OH Paper G8S SP18]
Spring 2018 Biology - Online
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online HS Biology SP16 vs. OH Online HS Biology SP18]
Spring 2018 Biology - Paper
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online HS Biology SP16 vs. OH Paper HS Biology SP18]
Spring 2018 American Government - Online
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online HS AG SP16 vs. OH Online HS AG SP18]
Spring 2018 American Government - Paper
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online HS AG SP16 vs. OH Paper HS AG SP18]
Spring 2018 American History - Online
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; OH Online HS AH SP16 vs. OH Online HS AH SP18]
Spring 2018 American History - Paper
[Figure: Test Characteristic Curves, CSEM, and TCC Proportion Difference vs. theta; AH Online SP16 vs. AH Paper SP18]
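The quantities compared in these figures can be reproduced from item parameters. The sketch below is an illustration only: it assumes a simple two-parameter logistic (2PL) model and hypothetical item parameters rather than the operational OST calibration model and parameter files documented elsewhere in this report. The TCC proportional score at a given theta is the expected raw score divided by the maximum possible raw score, the CSEM is the reciprocal square root of the test information, and the TCC proportion difference is the difference between the curves computed from the two parameter sets.

```python
import numpy as np

def item_prob(theta, a, b, c=0.0, D=1.0):
    """Probability of a correct response for a dichotomous IRT item."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

def tcc_proportion(theta, items):
    """Expected raw score at theta divided by the maximum possible raw score."""
    expected = sum(item_prob(theta, **it) for it in items)
    return expected / len(items)  # one point per dichotomous item

def csem(theta, items, D=1.0):
    """Conditional SEM = 1 / sqrt(test information) at theta (2PL item information)."""
    info = 0.0
    for it in items:
        p = item_prob(theta, it["a"], it["b"], D=D)
        info += (D * it["a"]) ** 2 * p * (1.0 - p)
    return 1.0 / np.sqrt(info)

# Hypothetical item parameters for illustration only (not operational OST values).
old_items = [{"a": 1.0, "b": -0.5}, {"a": 0.8, "b": 0.0}, {"a": 1.2, "b": 0.7}]
new_items = [{"a": 1.0, "b": -0.4}, {"a": 0.9, "b": 0.1}, {"a": 1.2, "b": 0.6}]

for theta in (-2.0, 0.0, 2.0):
    diff = tcc_proportion(theta, old_items) - tcc_proportion(theta, new_items)
    print(f"theta={theta:+.1f}  TCC prop. diff={diff:+.4f}  CSEM(old)={csem(theta, old_items):.3f}")
```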
Descriptions of the operation of the Test Information Distribution Engine, Test Delivery System, and related systems are property of the American Institutes for Research (AIR) and are used with the permission of AIR.
Test Delivery System
Test Administrator User Guide
2017-2018
Published February 9, 2018
Prepared by the American Institutes for Research®
Table of Contents
Section I. Introduction to the User Guide ................................................................................................. 1
Organization of the User Guide ................................................................................................................ 1
Document Conventions ............................................................................................................................ 1
Intended Audience .................................................................................................................................... 2
Additional Resources ................................................................................................................................ 2
Section II. Overview of the Test Delivery System .................................................................................... 3
Description of the Test Delivery System’s Sites ....................................................................................... 3
User Roles and System Requirements .................................................................................................... 3
General Rules of Online Testing .............................................................................................................. 4
Test Setting Rules ................................................................................................................................ 4
Pause Rules ......................................................................................................................................... 4
Test Timeout Rules .............................................................................................................................. 4
Test Opportunity Expiration Rules ........................................................................................................ 4
Section III. Accessing the Test Administration Sites .............................................................................. 5
About Usernames and Passwords ........................................................................................................... 6
Section IV. Overview of the Test Administration Sites ........................................................................... 7
Test Administrator Site Layout ................................................................................................................. 7
TA Site Features ....................................................................................................................................... 8
Looking Up Students ............................................................................................................................ 9
Printing Session Information............................................................................................................... 10
Section V. Administering Online Tests ................................................................................................... 11
Starting a Test Session........................................................................................................................... 11
Approving Students for Testing .............................................................................................................. 12
Monitoring Students’ Testing Progress ................................................................................................... 15
About the Timer .................................................................................................................................. 16
Pausing a Student’s Test.................................................................................................................... 17
Stopping a Test Session and Logging Out ............................................................................................. 17
Stopping a Test Session ..................................................................................................................... 17
Logging Out of the Test Administrator Site ........................................................................................ 18
Accidentally Closing the Browser Window ..................................................................................... 18
Section VI. Signing in to the Student Testing Site ................................................................................ 19
Step 1: Signing Students In .................................................................................................................... 19
Common Student Sign-in Errors ........................................................................................................ 20
Enabling Settings from the Sign-in Page ............................................................................................ 20
Step 2: Verifying Student Information ..................................................................................................... 21
Step 3: Selecting a Test.......................................................................................................................... 22
Step 4: Verifying Test Information .......................................................................................................... 23
Step 5: Functionality Checks .................................................................................................................. 24
Step 5a: Text-to-Speech Check ............................................................................................................. 24
Step 5b: Audio Playback Check ............................................................................................................. 26
Troubleshooting Audio Issues ........................................................................................................ 26
Step 5c: Recording Device Check .......................................................................................................... 27
Troubleshooting Recording Device Issues .................................................................................... 27
Step 6: Viewing Instructions and Starting the Test ................................................................................. 29
Section VII. Overview of the Student Testing Site ................................................................................. 30
Test Layout ............................................................................................................................................. 30
Test Tools ............................................................................................................................................... 31
Using Menus and Tools ...................................................................................................................... 34
About the Global Menu .................................................................................................................. 34
About the Context Menus............................................................................................................... 34
Opening a Context Menu for Passages and Questions ................................................................. 35
Opening a Context Menu for Answer Options ............................................................................... 35
About the Masking Tool ................................................................................................................. 36
About Text-to-Speech (TTS) .......................................................................................................... 37
Selecting a Previous Response Version ........................................................................................ 38
Section VIII. Proceeding Through a Test ................................................................................................ 39
About Reading Passages ....................................................................................................................... 39
Responding to Test Questions ............................................................................................................... 39
Reviewing Questions in a Test ............................................................................................................... 40
Pausing Tests ......................................................................................................................................... 40
Submitting a Test .................................................................................................................................... 41
Reaching the End of a Test ................................................................................................................ 41
End Test Page .................................................................................................................................... 41
Your Results Page .............................................................................................................................. 42
Appendix A. About the Secure Browser ................................................................................................. 44
Additional Measures for Securing the Test Environment ....................................................................... 44
Forbidden Application Detection ........................................................................................................ 45
Configuring Tablets for Testing .............................................................................................................. 45
Closing the Student Testing Site on Tablets ...................................................................................... 45
About Permissive Mode .......................................................................................................................... 46
Troubleshooting ...................................................................................................................................... 47
Resolving Secure Browser Error Messages ....................................................................................... 47
Force-Quit Commands ........................................................................................................................... 48
Appendix B. Text Response Formatting Toolbar .................................................................................. 49
Spell Check ............................................................................................................................................. 50
Special Characters ................................................................................................................................. 50
Appendix C. Keyboard Navigation for Students .................................................................................... 51
Sign-In Pages and In-Test Pop-ups ....................................................................................................... 51
Keyboard Commands for Test Navigation .............................................................................................. 51
Keyboard Commands for Global and Context Menus ............................................................................ 52
Global Menu ....................................................................................................................................... 52
Context Menus ................................................................................................................................... 52
Highlighting Selected Regions of Text ............................................................................................... 52
Keyboard Commands for Grid Questions .......................................................................................... 53
Appendix D. Transferring a Test Session ............................................................................................... 54
Appendix E. User Support........................................................................................................................ 55
Appendix F. Change Log .......................................................................................................................... 56
Table of Figures
Figure 1. Portal User Cards .......................................................................................................................... 5
Figure 2. Card for Test Administrator Interface ............................................................................................. 5
Figure 3. Cards for Test Administrator Practice Site .................................................................................... 5
Figure 4. Login Page ..................................................................................................................................... 5
Figure 5. Test Administrator Site Layout ....................................................................................................... 7
Figure 6. Test Administrator Site Banner ...................................................................................................... 8
Figure 7. Student Lookup: Quick Search ...................................................................................................... 9
Figure 8. Student Lookup: Advanced Search ............................................................................................. 10
Figure 9. Test Selection Box ....................................................................................................................... 11
Figure 10. Students Awaiting Approval ....................................................................................................... 12
Figure 11. Approvals and Student Test Settings Window .......................................................................... 13
Figure 12. Test Settings Window for a Selected Student ........................................................................... 14
Figure 13. Student Sign-In Page ................................................................................................................. 19
Figure 14. Choose Settings Window ........................................................................................................... 20
Figure 15. Is This You? Page ..................................................................................................................... 21
Figure 16. Your Tests Page ........................................................................................................................ 22
Figure 17. Is This Your Test? Page ............................................................................................................ 23
Figure 18. Text-to-Speech Sound Check Page .......................................................................................... 24
Figure 19. Sound Check Page .................................................................................................................... 26
Figure 20. Recording Device Check Page .................................................................................................. 27
Figure 21. Recording Input Device Selection Page .................................................................................... 28
Figure 22. Instructions and Help Page ........................................................................................................ 29
Figure 23. Test Layout ................................................................................................................................ 30
Figure 24. Test Page ................................................................................................................................... 31
Figure 25. Global Menu ............................................................................................................................... 34
Figure 26. Context Menu for Questions ...................................................................................................... 35
Figure 27. Context Menu for Answer Options ............................................................................................. 35
Figure 28. Test Page with Masked Area ..................................................................................................... 36
Figure 29. Speak Tool Options for Questions ............................................................................................. 37
Figure 30. Select Previous Version Window ............................................................................................... 38
Figure 31. Reading Passage....................................................................................................................... 39
Figure 32. Question Marked for Review ..................................................................................................... 40
Figure 33. Global Menu with End Test Button ............................................................................................ 41
Figure 34. End Test Page ........................................................................................................................... 41
Figure 35. Your Results Page ..................................................................................................................... 42
Figure 36. Practice Test Summary Report .................................................................................................. 42
Figure 37. Text Response Question with Formatting Toolbar .................................................................... 49
Figure 38. Spell Check Tool ........................................................................................................................ 50
Figure 39. Special Characters Window ....................................................................................................... 50
Figure 40. Grid Question ............................................................................................................................. 53
List of Tables
Table 1. Key Symbols and Elements ............................................................................................................ 1
Table 2. Test Administrator Site Features .................................................................................................... 8
Table 3. Columns in the Students in Your Test Session Table .................................................................. 15
Table 4. Student Testing Statuses .............................................................................................................. 16
Table 5. Global Tools .................................................................................................................................. 31
Table 6. Context Menu Tools and Stimulus Tools ...................................................................................... 33
Table 7. Overview of the Practice Test Summary Report ........................................................................... 43
Table 8. Description of Formatting Tools .................................................................................................... 49
Table 9. Keyboard Commands for Sign-In Pages and Pop-Up Windows .................................................. 51
Table 10. Keyboard Commands for Test Navigation .................................................................................. 51
Section I. Introduction to the User Guide
This user guide supports test administrators who manage testing for students participating in Ohio's State Tests and Ohio English Language Proficiency Assessment practice tests and operational tests.
Organization of the User Guide
• Overview of the Test Delivery System provides an overview of online testing and general test rules.
• Accessing the Test Administration Sites explains how to log in to the test administrator sites.
• Overview of the Test Administration Sites describes the overall layout of the test administrator sites and highlights the important tasks and functions.
• Administering Online Tests outlines the process for creating a test session, approving students for testing, pausing tests, and logging out.
• Signing in to the Student Testing Site explains how students sign in to a test session.
• Overview of the Student Testing Site describes the layout of an online test, as well as the tools available to students.
• Proceeding Through a Test explains how students complete tests.
• The Appendices provide additional information about the secure browser, keyboard commands, transferring test sessions and user support.
Document Conventions
Table 1 describes the conventions appearing in this guide.
Table 1. Key Symbols and Elements
Element Description
Alert: This symbol accompanies important information regarding a task that may cause minor errors.
Note: This symbol accompanies additional information or instructions of which users must take note.
Policy: This symbol accompanies information regarding test administration policies.
Warning: This symbol accompanies important information regarding actions that may cause major errors.
Intended Audience
This user guide is intended for test administrators responsible for proctoring tests with the Test Delivery System. To use this system, you should be familiar with using a web browser to retrieve data and with filling out web forms. You should also be familiar with printing documents and adjusting a computer’s audio settings. If you or your students use Chromebooks, iPads, or other tablets for testing, then you should be familiar with operating these devices as well.
Additional Resources
The following publications provide additional information:
• For information about policies and procedures that govern secure and valid test administration, see the Test Administration Manual.
• For information about supported operating systems and browsers, see the Online System Requirements document.
• For information about student and user management, rosters, and test status requests, see the TIDE User Guide.
• For information about network and internet requirements, general peripheral and software requirements, and configuring text-to-speech settings, see the Technical Specifications Manual.
• For information about installing secure browsers, see the Secure Browser Installation Manual.
The above resources are available on the Ohio's State Tests Portal (www.ohiostatetests.org).
Section II. Overview of the Test Delivery System
The Test Delivery System delivers Ohio's online tests. The following sections describe highlights of online testing in general and the Test Delivery System in particular.
Description of the Test Delivery System’s Sites
The Test Delivery System consists of practice sites and operational testing sites. The practice sites function identically to the operational testing sites.
• Practice Sites
o Test Administrator Practice Site: Allows test administrators to practice administering tests.
o Student Practice Site: Allows students to practice taking tests online and using test tools.
• Operational Testing Sites
o Test Administrator Interface: Allows test administrators to administer operational tests.
o Student Testing Site: Allows students to take operational tests.
User Roles and System Requirements
Access to the practice and operational testing sites depends on your user role and browser.
• Test administrators can use any supported web browser to access either the Test Administrator Practice Site or the Test Administrator Interface. For a list of user roles that can access the Test Administrator Sites, see the User Role Matrix document available in the Resources section of the Ohio's State Tests Portal (www.ohiostatetests.org).
• Students, test administrators, and parents can use a supported web browser or secure browser to access the Student Practice Site as guests. Students can also sign in to a practice test session created by a test administrator.
• Students use a secure browser to access the Student Testing Site.
For information about supported operating systems and browsers, see the Online System Requirements document available on the Ohio's State Tests Portal (www.ohiostatetests.org).
General Rules of Online Testing
This section describes the rules for administering online tests.
Test Setting Rules
Students should not begin testing until they are assigned the correct test settings. You may have to update some test settings in the Test Information and Distribution Engine (TIDE).
Pause Rules
Test administrators and students can pause a test in order to temporarily log the student out of the test session. Students cannot access their test if it is paused for more than one day, even if they marked questions for review. The only exception to this rule is if the district test coordinator submits a test status request for reopen in TIDE.
These pause rules apply regardless of whether the student or the test administrator pauses the test or a technical issue logs the student out.
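A rough sketch of the one-day pause rule follows. The field names are assumptions for illustration; in practice, reopen requests are submitted by the district test coordinator in TIDE.

```python
from datetime import datetime, timedelta

PAUSE_LIMIT = timedelta(days=1)  # tests paused longer than one day become inaccessible

def can_resume(paused_at: datetime, now: datetime, reopen_approved: bool) -> bool:
    """Paused tests stay accessible for one day; after that, only an approved
    reopen request restores access to the test."""
    return reopen_approved or (now - paused_at) <= PAUSE_LIMIT
```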
Test Timeout Rules
A warning message displays after 20 minutes of test inactivity. Students who do not click OK within 30 seconds after this message appears are logged out. This timeout automatically pauses the test.
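The inactivity timeout can be pictured as the following simplified check, using the published values (20 minutes to the warning, 30 seconds to respond) and hypothetical state names.

```python
INACTIVITY_WARNING_SECONDS = 20 * 60  # warning message after 20 minutes of inactivity
WARNING_GRACE_SECONDS = 30            # time allowed to click OK before logout

def timeout_state(idle_seconds: float) -> str:
    """Classify a student's session by idle time; clicking OK resets the idle timer."""
    if idle_seconds < INACTIVITY_WARNING_SECONDS:
        return "active"
    if idle_seconds < INACTIVITY_WARNING_SECONDS + WARNING_GRACE_SECONDS:
        return "warning shown"
    return "logged out (test paused)"  # the timeout automatically pauses the test
```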
Test Opportunity Expiration Rules
Opportunities refer to the number of times a student can take a test within a range of dates. Ohio tests have one opportunity per test part during the test window. A student’s test opportunity remains active until the student submits the test or until the opportunity expires at the end of the test window. Once a test opportunity expires, the student cannot submit or review the test. Opportunities that have been started but not submitted by the student will be automatically submitted by the system.
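The single-opportunity rule can be summarized in a small status check. This is a sketch with assumed field names, not the system's actual logic.

```python
from datetime import date

def opportunity_status(started: bool, submitted: bool, today: date, window_end: date) -> str:
    """One opportunity per test part; started-but-unsubmitted tests are
    submitted automatically once the test window closes."""
    if submitted:
        return "submitted"
    if today > window_end:
        return "auto-submitted" if started else "expired"
    return "in progress" if started else "not started"
```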
Section III. Accessing the Test Administration Sites
This section describes how to access the Test Administrator Sites.
To access the Test Administrator Interface:
1. Navigate to the Ohio's State Tests Portal (www.ohiostatetests.org).
2. Select the Teachers/Test Administrators or Test Coordinators card (see Figure 1).
3. Select the appropriate TA Site:
o To access the Test Administrator Interface, click TA Interface (see Figure 2).
o To access the Test Administrator Practice Site, click TA Practice Site (see Figure 3).
4. The login page appears (see Figure 4). Enter your email address and password.
5. Click Secure Login. The selected TA Site appears.
Figure 1. Portal User Cards
Figure 2. Card for Test Administrator Interface
Figure 3. Cards for Test Administrator Practice
Site
Figure 4. Login Page
Note: For information about logging out of the TA Site, see the section Logging Out of the Test Administrator Site.
About Usernames and Passwords
Your username is the email address associated with your account in TIDE. When you are added to TIDE, you receive an email containing a temporary link to the Reset Your Password page. To activate your account, you must set up your password and set a security question within 15 minutes of receiving this email.
• If your first temporary link expired or you forgot your password:
On the login page, click Forgot Your Password? and then enter your email address in the Email Address field to reset your password. If your account is already set up, you need to answer your security question as well. You will receive an email with a new link to reset your password.
• If you did not receive the email containing the temporary link:
Check your spam folder to make sure your email program did not categorize it as junk mail. If you still do not have an email, contact your Building or District Test Coordinator to make sure you are listed in TIDE.
• Additional help:
If you are unable to log in, contact the Ohio Help Desk for assistance. You must provide your name and email address. Contact information is available in the User Support section of this user guide.
Section IV. Overview of the Test Administration Sites
This section describes the test administration sites for test administrators. Throughout the rest of this user guide, “Test Administrator Site” refers to both the Test Administrator Interface and Test Administrator Practice Site.
Warning: Do not use the Test Administrator Interface for practice. To practice administering tests, use the Test Administrator Practice Site. Both Test Administrator Sites have the same functionality, but the available tests are different. The table header will have “Practice” in the Test Administrator Practice Site and “Operational” in the Test Administrator Interface. Tests provided in the Test Administrator Interface are operational and will expend the students’ test opportunities.
Test Administrator Site Layout
Figure 5 displays the layout of the Test Administrator Site during an active test session.
Figure 5. Test Administrator Site Layout
Essential features in the Test Administrator Site:
1. Session ID
2. Select Tests button
3. Approvals button
4. Students in Your Test Session table
Table 2 provides an overview of the major features available in the Test Administrator Site.
Table 2. Test Administrator Site Features
Feature Description/More Information
Student Lookup button Searches for student information. See the section Looking Up Students.
Print Session button Prints your screen. See the section Printing Session Information.
Help Guide button Displays the online version of this user guide.
Log Out button Logs you out of the Test Administrator Site. See the section Stopping a Test Session and Logging Out.
Stop Session button* Ends the test session. See the section Stopping a Test Session and Logging Out.
Session ID* Displays the unique ID generated for the test session.
Select Tests button Opens the Test Selection window. See the section Starting a Test Session.
Approvals button* Opens the Approvals and Student Test Settings window. See the section Approving Students for Testing.
Refresh button* Updates the on-screen information.
Students in Your Test Session table** Displays the testing progress for students in your test session. See the section Monitoring Students’ Testing Progress.
*Feature appears after you start a test session.
**Feature appears after you approve students for testing.
TA Site Features
This section provides instructions for using the features available in the banner at the top of the Test Administrator Site (see Figure 6).
Figure 6. Test Administrator Site Banner
Looking Up Students
You can use the student lookup feature to perform a quick or advanced search for student information. This is useful if students signing in to your test session cannot remember their login information.
Warning: You must ensure that a student’s demographic information is correct before testing begins. If a student’s information is not correct, that student should not begin testing.
To perform a quick search:
1. In the banner, click Student Lookup.
2. Enter a student’s full SSID and click Submit SSID. Search results appear below the search field (see Figure 7).
Figure 7. Student Lookup: Quick Search
To perform an advanced search:
1. Click Student Lookup > Advanced Search.
a. Select the appropriate district and school from the drop-down lists.
b. Select the appropriate grade.
c. Optional: Enter a student’s exact first or last name. Partial names are not allowed.
2. Click Search. Search results appear below the search fields (see Figure 8).
Figure 8. Student Lookup: Advanced Search
3. To view a student’s information, click in the Details column.
Printing Session Information
You can print a snapshot of the Test Administrator Site as it currently appears if you wish to keep a hard-copy record of the Session ID or list of approved students.
To print a snapshot of the page:
1. In the banner, click Print Session. The computer’s print dialog window appears.
2. Click OK.
Policy Note: Federal law prohibits the release of students' personally identifiable information. All printouts must be securely stored and then destroyed when no longer needed.
Section V. Administering Online Tests
The basic workflow for administering online tests is as follows:
1. The test administrator selects tests and starts a test session.
2. Students sign in and request approval for tests.
3. The test administrator reviews students’ requests and approves them for testing.
4. Students complete and submit their tests.
5. The test administrator stops the test session and logs out.
For information about the testing process from a student’s perspective, see the sections Signing in to the Student Testing Site and Overview of the Student Testing Site.
Starting a Test Session
When you log in to the Test Administrator Site, the Test Selection window opens automatically (see Figure 9). This window allows you to select tests and start the session. Only the tests that you select will be available to students who join your session.
Figure 9. Test Selection Box
The Test Selection window color-codes tests and groups them into subjects. A test group may include one or more sub-groups. All test groups and sub-groups appear collapsed by default. To expand a test group, click the expand icon (or Expand All). To collapse an expanded test group, click the collapse icon (or Collapse All).
To create a new test session:
1. If the Test Selection window is not open, click Select Tests in the upper-right corner of the Test Administrator Site (otherwise skip to step 2).
2. To select tests for the session, do one of the following:
o To select individual tests, mark the checkbox for each test you want to include.
o To select all the tests in a test group, mark the checkbox for that group.
3. In the lower-left corner of the window, click Start Session (the exact label for this button may vary depending on whether you are starting a practice or operational session). The window closes and the Session ID appears on the Test Administrator Site.
4. Provide the Session ID to your students.
Note: Write down the Session ID in case you accidentally close the browser window and need to return to the active test session. You may have only one session open at a time. You cannot reopen closed sessions, but students can resume a test opportunity in a new session.
To add tests to an active test session:
1. In the upper-right corner of the Test Administrator Site, click Select Tests.
2. In the Test Selection window, mark the checkbox for the required test and click Add to Session in the lower-left corner.
3. A confirmation message asks if you are sure you want to modify the tests in your session. To continue, click Yes.
Note: You cannot remove tests from an active session.
Approving Students for Testing
After students sign in and select tests, you must verify that their test settings are correct before approving them for testing. When students are awaiting approval, the Approvals button next to the Session ID becomes active and shows you how many students are awaiting approval (see Figure 10).
Figure 10. Students Awaiting Approval
Note: The Approvals notification updates regularly, but you can also click the refresh button in the upper-right corner to update it manually.
To approve students for testing:
1. Click Approvals. The Approvals and Student Test Settings window appears, displaying a list of students grouped by test (see Figure 11).
Figure 11. Approvals and Student Test Settings Window
2. To check a student’s test settings, click the test settings button for that student. The student’s information appears in the Test Settings window (see Figure 12). This window groups test settings by their tool categories.
Figure 12. Test Settings Window for a Selected Student
a. If any settings are incorrect, update them as required. Students should not begin testing until their settings are correct.
Alert: When approving students for testing, you must update the editable settings in this window, rather than in TIDE.
b. Do one of the following:
o To confirm the settings, click Set. You must still approve the student for testing (see step 5).
o To confirm the settings and approve the student, click Set & Approve. Students can start testing once you approve them.
o To return to the Approvals and Student Test Settings window without confirming settings, click Cancel.
3. Repeat step 2 for each student in the Approvals and Student Test Settings list.
Note: The Approvals and Student Test Settings window does not automatically refresh. To update the list of students awaiting approval, click Refresh at the top of the window.
4. If you need to deny a student access to testing, do the following (otherwise skip to step 5):
a. Click the deny button for that student.
b. Optional: In the window that appears, enter a brief reason for denying the student.
c. Click Deny. The student is logged out and, if you entered a reason, receives a message explaining the denial.
Note: If you deny students entry for a test, they can still request access to that test again.
5. If you wish to approve students directly from the Approvals and Student Test Settings window, do the following:
o To approve individual students, click the approve button for each student.
o To approve all students displayed in the list, click Approve All Students for that subject.
Monitoring Students’ Testing Progress
After you approve students for testing, the Students in Your Test Session table appears (see Figure 5). This table displays the testing progress for each student logged in to your session. Table 3 describes the columns in this table. To sort the table by a given column, click that column header.
Table 3. Columns in the Students in Your Test Session Table
Column Description
Student Name Last and first name of the student in the session.
SSID SSID associated with the student.
Opp # Opportunity number for the student’s selected test.
Test Name of the test the student selected.
Time Indicates the approximate elapsed time (in minutes only) in the student’s test. There is an approximately one-minute delay between the elapsed time shown on the student’s test and the time displayed in this column.
Student Status
Current status for each student in the session. This column may also indicate how many questions the student has completed out of the total number of test questions. For more information about the statuses in this column, see Table 4.
Test Settings This column displays one of the following:
• Standard: Default test settings are applied for this test opportunity.
• Custom: One or more of the student’s test settings differ from the default settings.
To view the student’s settings for the current test opportunity, click the icon in this column.
Pause Test Pauses the student’s test. When a test pauses, this column displays an information button that opens a pop-up message explaining how the test became paused. For more information, see the section Pause Rules.
Table 4 describes the codes in the Student Status column of the Students in Your Test Session table.
Table 4. Student Testing Statuses
Status Description
Approved You approved the student, but the student did not yet start or resume the test.
Started Student started the test and is actively testing.
Review Student visited all questions and is currently reviewing answers before completing the test.
Completed Student submitted the test. The student can take no additional action at this point.
Submitted Test was submitted for quality assurance review and validation.
Reported Test passed quality assurance and is undergoing further processing.
Paused* Student’s test is paused. The time listed indicates how long the test has been paused.
Expired* Test was not completed by the end of the testing window and the opportunity expired.
Pending* Student is awaiting approval for a new test opportunity.
Suspended* Student is awaiting approval to resume a test opportunity.
*Appears when the student is not actively testing. The student’s row grays out in such cases.
Note: The Students in Your Test Session table refreshes at regular intervals, but you can also refresh it manually by clicking the refresh button in the upper-right corner.
About the Timer
You can view the approximate time a student has been actively testing via the time column on the TA Interface (see Figure 5). The time column updates approximately once per minute and will only display the approximate time in minutes, not seconds. If a student’s test is paused, the timer in the time column on the TA Interface will pause as well. The student status column will reflect how much time has passed since the student’s test was paused. If a student logs back in and resumes a test, the approximate time will resume counting up from the point the student’s test was paused previously. Elapsed time that a student has spent in the test will continue to accrue, even if the student resumes testing in a new session.
If the test clock setting is turned on for the student, the Student Interface displays a test clock at the top right of the screen (see Figure 23). The test clock displays in real-time the amount of time that a student has spent viewing item content. Students can choose to hide the displayed time by selecting the test clock button on the Student Interface. The Student Interface will continue to keep track of the time elapsed, even if the student hides the test clock time. Students can view the time elapsed display by selecting the test clock button again.
If the test clock setting is turned off for the student, the Student Interface will not display the test clock icon nor the displayed time. However, the time column on the TA Interface will still provide the approximate time the student has been actively testing.
Note: The time reflects only the time a student spends viewing test content. It does not include the time a student spends on the log-in pages, the review page or when the test is paused.
The TA Interface and Student Testing Site do not enforce a time limit. Test administrators are responsible for ensuring that students complete each part of their tests within the testing time published on the portal.
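A minimal sketch of this timer behavior (not AIR's implementation) is an accumulator that counts only time spent viewing item content, pauses with the test, carries over across sessions, and reports whole minutes to the TA Interface.

```python
class ActiveTestingTimer:
    """Accumulates time spent viewing test content only; login pages, the review
    page, and paused time are excluded. Elapsed time carries over to new sessions."""

    def __init__(self):
        self.total_seconds = 0.0
        self._viewing_since = None  # timestamp while item content is on screen

    def start_viewing(self, now: float):
        self._viewing_since = now

    def stop_viewing(self, now: float):
        # Called when the test is paused, submitted, or the student leaves item content.
        if self._viewing_since is not None:
            self.total_seconds += now - self._viewing_since
            self._viewing_since = None

    def elapsed_minutes(self, now: float) -> int:
        # The TA Interface column shows approximate minutes only, updated about once per minute.
        running = (now - self._viewing_since) if self._viewing_since is not None else 0.0
        return int((self.total_seconds + running) // 60)
```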
Pausing a Student’s Test
You can pause a student’s test via the Pause Test column in the Students in Your Test Session table (see Figure 5). For information about pause rules, see the section Pause Rules.
To pause an individual student’s test:
1. In the Pause Test column, click the pause button for that student.
2. Click Yes to confirm. The Test Delivery System logs the student out and an information button appears in the Pause Test column.
Stopping a Test Session and Logging Out
This section explains how to stop a test session and log out of the Test Administrator Site.
Stopping a Test Session
When students finish testing or the current testing time slot is over, you should stop the test session. Stopping a session automatically logs out all the students in the session and pauses their tests.
Once you stop a test session, you cannot resume it. To resume testing students, you must start a new session.
Warning: The Test Delivery System automatically logs you out after 20 minutes of both user and student inactivity in the session. This action automatically stops the test session.
To stop a test session:
1. In the upper-right corner, click the Stop Session button (see Figure 10). A confirmation message appears.
2. Click OK. The test session stops.
Logging Out of the Test Administrator Site
You should log out of the Test Administrator Site only after stopping a test session.
To log out of the Test Administrator Site:
1. In the banner, click Log Out. A warning message appears.
2. In the warning message, click Log Out. The Ohio's State Tests Portal appears.
Alert: Navigating away from the Test Administrator Site will also log you out. Logging out while a session is in progress stops the session. If you need to access another application while administering tests, open it in a separate browser window.
If you log out from another Ohio's State Tests system, such as TIDE, you will also log out of the TA Site.
Accidentally Closing the Browser Window
If you accidentally close the browser while students are testing, your session remains open until it times out. To return to the test session in the Test Administrator Site, you must enter the active Session ID.
If you do not return to the active session within 20 minutes and there is no student activity during that time, the Test Delivery System logs you out and pauses the students’ tests.
Section VI. Signing in to the Student Testing Site
This section describes the student sign-in process for the Student Testing Site. Students follow this procedure when starting a new test or resuming a paused test.
Note: Students must sign in to the appropriate testing site:
• For sessions created in the Test Administrator Interface, students sign in to the Student Testing Site on the secure browser.
• For sessions created in the Test Administrator Practice Site, students sign in to the Student Practice Site. Students can access the Student Practice Site on the Ohio's State Tests Portal.
Step 1: Signing Students In
To sign students in to a test session:
1. Launch the secure browser on the student’s testing device. The Student Sign-In page appears (see Figure 13).
Figure 13. Student Sign-In Page
2. Students enter the following information:
a. In the First Name field, students enter their first name as it appears in TIDE.
b. In the Student ID field, students enter their SSID as it appears in TIDE. Non-public students enter the non-public student ID. Home schooled students enter the home schooled student ID.
Note: If students do not know their exact information as it appears in TIDE, you can retrieve it in the Test Administrator Site (see the section Looking Up Students).
c. In the Session ID field, students enter the Session ID as it appears on the Test Administrator Site.
3. Students select Sign In. The Is This You? page appears.
Common Student Sign-in Errors
The Test Delivery System generates an error message if a student cannot sign in. The following are the most common student sign-in issues:
• Session does not exist: The student entered the Session ID incorrectly or signed in to the wrong site. Verify that the student correctly entered the active Session ID. Also, verify that both you and the student are using the correct sites. For example, students signed in to the Student Practice Site cannot access sessions created in the Test Administrator Interface.
• Student information is not entered correctly: Verify that the student correctly entered the SSID. If this does not resolve the error, use the Student Lookup tool to verify the student's information. See the section Looking Up Students.
• Session has expired: The Session ID corresponds to a closed session. Ensure that the student enters the correct Session ID and verify that your session is open. For more information about test sessions, see the section Starting a Test Session.
• Student is not associated with the school: The student is not associated with your school, or you are not associated with the student’s school. Contact your test coordinator to make the appropriate update in TIDE.
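Taken together, these checks behave roughly like the following sketch. The data structures and helper names are hypothetical; the real validation is performed server-side by the Test Delivery System.

```python
def check_sign_in(first_name: str, ssid: str, session_id: str,
                  sessions: dict, students: dict, ta_school_ids: set):
    """Return the matching sign-in error message, or None if the student may proceed."""
    session = sessions.get(session_id)
    if session is None:
        return "Session does not exist"
    if not session["open"]:
        return "Session has expired"
    student = students.get(ssid)
    if student is None or student["first_name"].lower() != first_name.lower():
        return "Student information is not entered correctly"
    if student["school_id"] not in ta_school_ids:
        return "Student is not associated with the school"
    return None  # sign-in succeeds; the Is This You? page appears next
```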
Enabling Settings from the Sign-in Page
On the Student Practice Site, students can modify the settings they want to use during the sign-in process.
Note: On the operational Student Testing Site, students cannot modify their settings during the sign-in process.
To edit settings:
1. Students select the cog wheel in the upper-right corner of the login page. The Choose Settings window appears (see Figure 14).
2. Students select their preferred options from the available drop-down lists. These settings persist until you set the actual test settings during the test administrator approval process.
Figure 14. Choose Settings Window
Step 2: Verifying Student Information
After students sign in, the Is This You? page appears (see Figure 15). On this page, students verify their personal information.
Figure 15. Is This You? Page
To verify personal information:
• If all the information is correct, students select Yes. The Your Tests page appears.
• If any of the information displayed is incorrect, the student must not proceed with testing. The student should select No. You must notify the building or district test coordinator that the student’s information is incorrect.
Warning: Incorrect student demographic information (including SSID) must be updated before the student begins testing.
Note: When signing in to the Student Practice Site as a guest, the Is This You? page displays a Student Grade Level drop-down list, from which students select the grade they wish to use for testing.
Step 3: Selecting a Test
The Your Tests page displays all the tests that a student is eligible to take (see Figure 16). Students can only select tests that are included in the session and still need to be completed.
Available tests are color-coded and grouped into categories, just like the tests listed in the Test Selection window of the TA Site (see Figure 9).
If the student has not started a test opportunity, the button for that test is labeled Start [Test Name]. If the student has started and paused a test opportunity, the button for that test is labeled Resume [Test Name].
Figure 16. Your Tests Page
To select an available test:
• Students select the required test name. The request is sent to the test administrator for approval and the Waiting for TA Approval message appears.
• If a student’s required test is inactive or not displayed, the student should click Back to Login. You should verify the test session includes the correct tests and add additional tests, if necessary.
Step 4: Verifying Test Information
After you approve the student for testing, the student should verify the test information and settings on the Is This Your Test? page (see Figure 17). At this point, the student’s actual test settings override any settings selected earlier in the sign-in process.
Figure 17. Is This Your Test? Page
To verify test information:
• If the settings are correct, students select Yes.
• If the settings are incorrect, students select No. After a student’s test settings are corrected, the student must sign in and request approval again.
Note: When signing in to the Student Practice Site, a Choose Settings page appears in place of the Is This Your Test? page. On this page, students can select the test settings they wish to use.
Step 5: Functionality Checks
Depending on the test content and the specified test settings, students may need to verify that their testing device is functioning properly. Any of the following verification pages may appear:
• Step 5a: Text-to-Speech Check
• Step 5b: Audio Playback Check
• Step 5c: Recording Device Check
Step 5a: Text-to-Speech Check
The Text-to-Speech Sound Check page appears if a student has the text-to-speech (TTS) setting (see Figure 18). On this page, students verify that text-to-speech is working properly on their device. Students can only use text-to-speech within a supported browser. This check can also be performed from the Diagnostics page. The Diagnostics page can be accessed from the homepage of the Student Practice Site. If the student has the Bilingual English-Spanish accommodation in addition to the TTS setting, a second Text-to-Speech Sound Check page appears to verify the Spanish voice.
Figure 18. Text-to-Speech Sound Check Page
To check text-to-speech functionality:
1. Students select the speaker icon and listen to the audio.
o If the voice is clearly audible, students select I heard the voice.
o If the voice is not clearly audible, students adjust the settings using the sliders and select the speaker icon again.
o If students still cannot hear the voice clearly, they select I did not hear the voice and close the browser. You can work with students to adjust their audio or headset settings (for more information, see the section Troubleshooting Audio Issues). They can sign in again when the issue is resolved.
Step 5b: Audio Playback Check
The Audio Playback Check page appears for tests with listening questions administered for the Ohio English Language Proficiency Assessment (OELPA) (see Figure 19). On this page, students verify that they can hear the sample audio. This check can also be performed from the Diagnostics page. The Diagnostics page can be accessed from the homepage of the Student Practice Site.
Figure 19. Sound Check Page
To check audio settings:
1. Students select the icon and listen to the audio.
2. Depending on the sound quality, students do one of the following:
o If the sound is audible, students select I heard the sound.
o If the sound is not audible, students select I did not hear the sound. The Sound Check: Audio Problem page appears, giving students two options:
Students can select Try Again. This returns them to the Audio Playback Check page.
Students can select Log Out. You should troubleshoot the device and headphones or move the student to another device with working audio.
Troubleshooting Audio Issues
Prior to testing, ensure that audio is enabled on each device and that headsets are functioning correctly. If audio issues occur, do the following:
• Ensure headphones are securely plugged in to the correct jack or USB port.
• If the headphones have a volume control, ensure the volume is not muted.
• Ensure that the audio on the device is not muted.
Step 5c: Recording Device Check
The Recording Device Check page appears for tests with speaking questions administered for the Ohio English Language Proficiency Assessment (OELPA) (see Figure 20). On this page, students record their voice and verify that they can hear the recorded audio.
Figure 20. Recording Device Check Page
To check recording device settings:
1. To begin recording, students select the icon.
2. Students speak into their recording device.
3. To stop recording, students select the icon.
4. To listen to their recorded audio, students select the icon.
5. Depending on the recorded audio quality, students do one of the following:
o If the recorded audio is audible, students select I heard my recording.
o If the recorded audio is not audible, students select I did not hear my recording. The Problem Recording Audio page appears.
Troubleshooting Recording Device Issues
The Problem Recording Audio page appears when students experience difficulties recording audio or playing back recorded audio. This page gives students up to three options:
• Try Again: This returns students to the Recording Device Check page.
• Log Out: This returns students to the sign-in page. You should troubleshoot the recording device or set up a new recording device.
• Select New Recording Device: This option only appears for students testing on computers or tablets with multiple recording devices. When students select this option, the Recording Input Device Selection page appears (see Figure 21), listing the available recording devices.
Figure 21. Recording Input Device Selection Page
a. Students speak into their recording device (for example, saying their name). The blue bar to the right of each recording device indicates the strength of the audio detected by that device.
b. Students select the recording device with the strongest audio detection.
c. Students select Yes.
Note: The Recording Input Device Selection page only allows students to change recording input devices. The audio output device does not change.
Step 6: Viewing Instructions and Starting the Test
The Instructions and Help page is the last step of the sign-in process (see Figure 22). Students may review this page to understand how to navigate the test and use test tools.
Figure 22. Instructions and Help Page
To proceed and begin the test:
• After reviewing this page, students select Begin Test Now. The test opportunity officially begins or resumes.
Section VII. Overview of the Student Testing Site
This section describes the layout of the Student Testing Site and the available testing tools.
Test Layout
Figure 23 shows the main sections of the layout for a test page that includes a stimulus. A stimulus is a reading passage or other testing material (such as a simulation or graphic) that students review in order to answer associated questions.
Figure 23. Test Layout
A test page can include the following sections:
• The Global Menu section displays the global navigation and tool buttons. The banner above the global menu displays the Questions drop-down list, test information, help button, system settings button and test clock.
Note: To hide the elapsed time, students can select the test clock in the upper-right corner. The TA Interface and the Student Testing Site continue to track the elapsed time while it is hidden. To display the hidden time, students can select the test clock again.
• The Stimulus section appears only for questions associated with a stimulus. This section contains the stimulus content (such as a reading passage or graphic), context menu and either the expand passage button or reading mode button.
• The Question section contains one or more test questions (also known as “items”). Each question includes a number, context menu, stem, and response area.
For more information about the global menu and context menus, see the section Using Menus and Tools.
Test Tools
This section provides an overview of the Test Delivery System’s available tools.
Figure 24 shows the primary features and tools available in the Student Testing Site.
Figure 24. Test Page
Note: Some tools are available for all tests, while others are only available for a particular subject, test setting, or type of question.
Table 5 and Table 6 list the Student Testing Site’s available global tools and context menu tools, respectively.
Table 5. Global Tools
Help: To view the on-screen Test Instructions and Help window, select the help button in the upper-right corner.
Test Clock: To hide the elapsed time, select the test clock in the upper-right corner. To unhide the elapsed time, select the test clock again. Note: The test clock and elapsed time will not display at all if the test clock setting is turned off for the student.
Calculator: To use the on-screen calculator, select Calculator. The graphing calculator is available on the following tests:
• Algebra I
• Geometry
• Integrated Mathematics I
• Integrated Mathematics II
The scientific calculator is available on the following tests:
• Physical Science parts 1 and 2
• Grade 6 Mathematics part 2
• Grade 7 Mathematics part 2
• Grade 8 Mathematics parts 1 and 2
Formula: To view the on-screen reference sheet, select Formula. Available on the following tests (sheet specifics vary by test):
• Grades 4 to 8 Mathematics
• Algebra I
• Geometry
• Integrated Mathematics I
• Integrated Mathematics II
• Physical Science
Line Reader: To highlight an individual line of text in a passage or question, select Line Reader. This tool is not available while the Highlighter tool is in use.
Masking: To temporarily cover a distracting area of the test page:
1. Select Masking.
2. Click and drag across the distracting area.
3. Release the mouse button.
To close the Masking tool, select Masking again. To remove a masked area, select X in the upper-right corner of that area.
Notes: To open the on-screen notepad, select Notes. Students cannot copy/paste text from the notepad into a response space.
Periodic Table: To view the on-screen periodic table, select Periodic Table. Available on the Physical Science tests.
System Settings: To adjust text-to-speech settings during the test, select the system settings button in the upper-right corner. Students testing on mobile devices cannot use this tool to adjust volume. To adjust audio volume on mobile devices, students must use the device's built-in volume control.
Zoom buttons: To enlarge the text and images on a test page, select Zoom In. You can zoom in up to four levels. To undo zooming, select Zoom Out.
Table 6. Context Menu Tools and Stimulus Tools
Expand Passage: To expand the passage section, select the double arrow icon. The section will expand and overlap the question section for easier readability. To collapse the expanded section, select the double arrow icon again.
Expand Buttons: You can expand the passage section or the question section for easier readability.
• To expand the passage section, select the right arrow icon below the global menu. To collapse the expanded passage section, select the left arrow icon in the upper-right corner.
• To expand the question section, select the left arrow icon below the global menu. To collapse the expanded question section, select the right arrow icon in the upper-left corner.
Highlighter: To highlight text, select the text on the screen and then select Highlight Selection from the context menu. To remove highlighting, select Reset Highlighting from the context menu. Text in images cannot be highlighted. This tool is not available while the Line Reader tool is in use.
Mark for Review: To mark a question for review, select Mark for Review from the context menu. The question number displays a flap in the upper-right corner. A flag icon appears next to the number on the test page. The Questions drop-down list displays "(marked)" for the selected question.
Reading Mode: Reading Mode opens a pop-up window that lets you view two pages of a reading passage at a time. To open Reading Mode, select the Reading Mode button below a reading passage. To exit Reading Mode, select the close button in the lower-right corner of the pop-up window.
Select Previous Version: To view and restore responses previously entered for a Text Response question, select the Select Previous Version option from the context menu. A list of saved responses appears. Select the appropriate response and click Select.
Strikethrough: For selected-response questions, you can cross out an answer option to focus on the options you think might be correct. There are two options for using this tool:
• Option A:
a. To activate Strikethrough mode, open the context menu and select Strikethrough.
b. Select each answer option you wish to strike out.
c. To deactivate Strikethrough mode, press Esc or click outside the question's response area.
• Option B:
a. Right-click an answer option and select Strikethrough.
Text-to-Speech: To listen to passages and questions, select a Speak option from the context menu.
Text-to-Speech Tracking: When this tool is enabled, words become highlighted as text-to-speech reads them aloud.
Tutorial: To view a short video demonstrating how to respond to a particular question type, select Tutorial from the context menu. The tutorials do not include audio.
Using Menus and Tools
This section describes how to use the global and context menus to access on-screen tools. This section also provides further details for using some of the Student Testing Site tools.
Note: Students can access tools using a mouse or keyboard commands. For information about keyboard commands, see Appendix C.
About the Global Menu
The global menu at the top of the test page contains navigation buttons on the left and tool buttons on the right (see Figure 25).
Figure 25. Global Menu
To open a test tool in the global menu:
1. Select the button for the tool. The selected test tool activates.
About the Context Menus
Each test page may include several elements, such as the question, answer options and stimulus (see Figure 23). The context menu for each element (including the stimulus) only contains tools that are applicable to that element (see Figure 26 and Figure 27).
Figure 26. Context Menu for Questions
Figure 27. Context Menu for Answer Options
Opening a Context Menu for Passages and Questions
Students can access context menus by right-clicking elements or by selecting elements and then clicking the context menu button.
To access the context menu for a passage or question:
1. Click the context menu button in the upper-right corner of the passage or question. The context menu opens.
2. Select a tool.
Opening a Context Menu for Answer Options
Students can use the context menu to access tools for answer options in a multiple-choice or multi-select question.
To access an answer option’s context menu:
1. To open the context menu, do one of the following:
o If you are using a two-button mouse, right-click an answer option.
o If you are using a single-button mouse, click an answer option while pressing Ctrl.
o If you are using a Chromebook, click an answer option while pressing Alt.
o If you are using a tablet, tap the answer option and then tap the context menu button (this selects the answer option until you select a different option).
2. Select a tool from the context menu.
About the Masking Tool
The Masking tool allows students to hide distracting areas of the test page (see Figure 28).
Figure 28. Test Page with Masked Area
To mask an area of a test page:
1. To activate the Masking tool, select Masking in the global menu. The button becomes orange.
2. Click and drag across the distracting area of the test page.
3. Release the mouse button. The selected area becomes dark gray. The tool remains active until you deactivate it.
To deactivate the Masking tool:
1. Select Masking in the global menu again. The button becomes green. Any masked areas remain on the screen until you remove them.
To remove a masked area from a test page:
1. Select X in the upper-right corner of a masked area.
About Text-to-Speech (TTS)
Students testing with text-to-speech can listen to passages, questions, and answer options (see Figure 29). If a student is using Text-to-Speech Tracking, the words become highlighted as they are read aloud. Text-to-speech is only available when using the secure browser or a supported Chrome or Firefox browser.
For information about setting up text-to-speech, see the Technical Specifications Manual.
Figure 29. Speak Tool Options for Questions
To listen to content with the Text-to-Speech tool:
• To listen to a passage, students open the passage context menu and select a Speak option. Students can also select a portion of text to listen to, such as a word or phrase. To do this, students select the text, open the passage context menu and select Speak Selection.
Note: When listening to passages, students can pause TTS and then resume it at the point where it was paused. However, this feature is not available on mobile devices. Students testing on mobile devices can resume a paused TTS passage by selecting the remaining text to be read aloud and selecting Speak Selection from the context menu.
• To listen to a question or answer options, students open the question context menu and select one of the following Speak options:
o To listen only to the question, students select Speak Question.
o To listen to a multiple-choice question and all answer options, students select Speak Question and Options.
o To listen only to an answer option, select Speak Option from the context menu and then select the answer option. Students could also right-click the answer option and select Speak Option.
Selecting a Previous Response Version
The Select Previous Version tool allows students to view and restore responses they previously entered for a Text Response question. For example, if students type a response, click Save, delete the text, and enter new text, they can use this tool to recover the original response.
To recover a previously entered response:
1. Select the Select Previous Version option from the context menu. The Select Previous Version window appears, listing all the saved responses for the question in the left panel (see Figure 30).
Figure 30. Select Previous Version Window
2. Select a response version from the left panel. The text associated with that response appears in the right panel.
3. Click Select. The selected response appears in the text box for the question.
Note: This tool is only available for Text Response questions. If the student or test administrator pauses the test, any responses entered prior to pausing will no longer appear in the Select Previous Version window.
Section VIII. Proceeding Through a Test
Students can view reading passages, respond to questions, review previously answered questions, pause a test, and submit a test. The following sections describe each of these tasks.
About Reading Passages
When a test question is associated with a reading passage, students should review the passage before responding to the question (see Figure 31). The content for a reading passage may be paginated. To move between the pages of a reading passage, students can select the page navigation buttons below the stimulus. Students can also select the Reading Mode button to open the Reading Mode window, which displays two pages at a time.
Note: If students want to highlight text that spans multiple pages in a reading passage, they must highlight the text on each page separately.
Figure 31. Reading Passage
Responding to Test Questions
Students answer test questions depending on the question type:
• Multiple-choice questions: Students select a single answer option.
• Multi-select questions: Students select one or more answer options.
• Technology-enhanced questions: Students follow the instructions given for each question. Technology-enhanced questions require students to do any of the following tasks:
o Use an on-screen keypad to generate an answer.
o Select an object or text excerpt on the screen.
o Place points, lines, or bars on a graph.
o Drag and drop text or graphic objects.
o Enter text in a text box or table.
o Match answer options together.
o Modify a highlighted word or phrase in a reading selection.
o Enter input parameters to run an on-screen simulation.
Some questions may consist of multiple parts that students must answer. After students respond to all the questions on a page, they select Next to proceed to the next page.
All responses are saved automatically. Students can also manually save their responses to questions by selecting Save in the global menu.
Questions grouped with the same stimulus are tabbed for individual viewing. Students select the tabs in the upper-right corner to proceed to the corresponding question. The navigation tabs may also include a stimulus icon that students can select to view the stimulus associated with the grouped questions.
Note: Students can use the Student Practice Site to familiarize themselves with the question types that may appear on tests.
Reviewing Questions in a Test
Students may return to a previous question and modify their response if the test was not paused for more than one day. See the Pause Rules section for more information.
Students can use the Back button or the Questions drop-down list to return to questions they want to review. The drop-down list displays "(marked)" for any questions marked for review (see Figure 32).
Figure 32. Question Marked for Review
Pausing Tests
Students can pause the test at any time. Pausing a test logs the student out. To resume testing, students must repeat the sign-in process (see the section Signing in to the Student Testing Site).
To pause a test:
1. The student selects Pause in the global menu. A confirmation message appears.
2. The student selects Yes. The Student Sign-In page appears.
Submitting a Test
This section describes how students submit a test when they are done answering questions.
Reaching the End of a Test
After students respond to the last test question, the End Test button appears in the global menu (see Figure 33).
Figure 33. Global Menu with End Test Button
To end a test:
1. Students select End Test. A confirmation message appears.
2. Students select OK.
End Test Page
When students end a test, the End Test page appears (see Figure 34). This page allows students to review answers and submit the test for scoring. A flag icon appears for any questions marked for review.
Figure 34. End Test Page
To review answers:
1. Students select a question number.
2. To return to the End Test page, students select End Test in the global menu.
To submit the test:
1. Students select Submit Test.
Warning: Once students select Submit Test, they cannot return to the test or modify answers.
Your Results Page
After students submit the test, the Your Results page appears, displaying the student’s name, the test name, and the completion date (see Figure 35).
Figure 35. Your Results Page
For some practice tests, this page also displays a summary report (see Figure 36).
Figure 36. Practice Test Summary Report
Table 7 provides an overview of the columns in the practice test summary report.
Table 7. Overview of the Practice Test Summary Report
Item Number: The link in this column opens the question page with the student's entered response.
Achieved: This column displays the student's achieved points for the item.
Max: This column displays the maximum number of points possible for the item.
Score Rationale: This column displays information about the correct answer to a part of or the whole item. A check mark is shown next to the score rationale for each item or part of an item that the student responded to correctly. An X is shown next to the score rationale for each item or part of an item that the student responded to incorrectly.
To exit the Student Testing Site:
1. Select Log Out.
2. In the upper-right corner, select Close Secure Browser. For information about exiting the Student Testing Site on mobile devices, see Appendix A.
Note: If you are testing with the Take a Test app on Windows 10, you must press Ctrl + Alt + Delete to exit the Student Testing Site. For more information about the Take a Test app, see the Technical Specifications Manual.
Appendix A. About the Secure Browser
This appendix includes the following sections:
• Additional Measures for Securing the Test Environment
• Configuring Tablets for Testing
• About Permissive Mode
• Troubleshooting
For more information about the secure browser, see the Secure Browser Installation Manual.
Additional Measures for Securing the Test Environment
The secure browser ensures test security by prohibiting access to external applications and navigation away from the test. This section provides additional measures you can implement to ensure the test environment is secure.
• Close External User Applications
Before launching the secure browser, or prior to administering the online tests, close all non-required applications on testing devices, such as word processors and web browsers.
• Avoid Testing with Dual Monitors
Students should not take online tests on computers connected to more than one monitor. Systems that use a dual monitor setup typically display an application on one screen while another application is accessible on the other screen.
• Disable Screen Savers and Timeout Features
On all testing devices, be sure to disable any features that display a screen saver or log users out after a period of inactivity. If such features activate while a student is testing, the secure browser logs the student out of the test.
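The following is a minimal, purely illustrative sketch of how an IT coordinator might apply the first and third measures above on Windows testing devices before a session. The application names and timeout values are assumptions, not requirements of the Test Delivery System, and screen saver settings (typically managed through Group Policy) are not shown; adapt any such script to your district's policies.

# prep_testing_device.py -- illustrative sketch only; adapt to district policy.
# Closes example non-required applications and sets display/sleep timeouts to
# "never" on a Windows testing device before the secure browser is launched.
import subprocess

# Example application names (assumptions); substitute the programs actually
# installed in your building.
NON_REQUIRED_APPS = ["winword.exe", "excel.exe", "chrome.exe"]

def close_non_required_apps():
    """Force-close each listed application if it is running."""
    for app in NON_REQUIRED_APPS:
        # taskkill returns a non-zero exit code if the app is not running,
        # so the return code is intentionally not checked here.
        subprocess.run(["taskkill", "/IM", app, "/F"], capture_output=True)

def disable_inactivity_timeouts():
    """Set display and sleep timeouts to 0 (never) while on AC power."""
    subprocess.run(["powercfg", "/change", "monitor-timeout-ac", "0"], check=True)
    subprocess.run(["powercfg", "/change", "standby-timeout-ac", "0"], check=True)

if __name__ == "__main__":
    close_non_required_apps()
    disable_inactivity_timeouts()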
Forbidden Application Detection
When the secure browser launches, it checks for other applications running on the device. If it detects a forbidden application, it displays a message listing the offending application and prevents the student from testing. This also occurs if a forbidden application launches while the student is already in a test.
In most cases, a detected forbidden application is a scheduled or background job, such as anti-virus scans or software updates. The best way to prevent forbidden applications from running during a test is to schedule such jobs outside of planned testing hours.
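As one hedged illustration of moving such jobs outside planned testing hours on Windows, the sketch below uses the built-in schtasks utility to change a scheduled task's start time. The task name shown is a placeholder, not a real task; the actual names of anti-virus or update tasks depend on the software installed, and districts should verify them (for example, with schtasks /Query) before making changes.

# reschedule_background_jobs.py -- illustrative sketch; verify real task names first.
# Moves a Windows scheduled task's start time so it runs after testing hours.
import subprocess

# Placeholder task name (assumption); replace with the actual scheduled task.
EXAMPLE_TASK = r"ExampleAV\Scheduled Scan"

def reschedule_outside_testing_hours(task_name: str, start_time: str = "18:00") -> None:
    """Change the scheduled task's start time (HH:MM, 24-hour clock)."""
    subprocess.run(
        ["schtasks", "/Change", "/TN", task_name, "/ST", start_time],
        check=True,
    )

if __name__ == "__main__":
    reschedule_outside_testing_hours(EXAMPLE_TASK)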
Configuring Tablets for Testing
Tablets and Chromebooks should be configured for testing before you provide them to students. For more information, see the Technical Specifications Manual on the Ohio's State Tests Portal.
To configure iOS devices:
1. Tap the AIRSecureTest secure browser icon.
To configure Android tablets:
1. Tap the AIRSecureTest secure browser icon.
2. If the secure browser keyboard is not selected, follow the prompts on the screen.
When the secure browser keyboard is selected, the secure browser app opens.
To configure Chromebooks:
1. From the Apps link on the Chrome OS login screen, select AIRSecureTest secure browser.
Closing the Student Testing Site on Tablets
After a test session ends, close the AIRSecureTest application on student tablets.
To close the Student Testing Site on iOS devices:
1. Double-tap the Home button. The multitasking bar appears.
2. Locate the AIRSecureTest app preview and slide it upward.
To close the Student Testing Site on Android tablets:
1. Tap the Menu icon in the upper-right corner.
2. Tap Exit. A confirmation message appears.
3. Tap Exit.
To close the Student Testing Site on Chromebooks:
1. Click Close Secure Browser in the upper-right corner.
About Permissive Mode
Permissive Mode is a test setting option that allows students to use accessibility software in addition to the secure browser.
Policy: Requests to use permissive mode for operational testing must be submitted in advance of testing. Districts may submit student-specific requests by contacting the Ohio Help Desk. All requests will be reviewed by the Ohio Department of Education.
Permissive Mode activates when the student is approved for testing. Students who have the Permissive Mode setting enabled should not continue with the sign-in process until their accessibility software is correctly configured.
To use accessibility software with the secure browser:
1. Open the required accessibility software.
2. Open the secure browser. Begin the normal sign-in process up to the test administrator approval step.
3. When a student is approved for testing, the secure browser allows the operating system’s menu and task bar to appear.
4. The student must immediately switch to the accessibility software that is already open on the computer so that it appears over the secure browser. The student cannot click within the secure browser until the accessibility software is configured.
o Windows: To switch to the accessibility software application, click the application in the task bar.
o Mac: To switch to the accessibility software application, click the application in the dock.
Note: When using Windows 8 and above, the task bar remains on-screen throughout the test after enabling accessibility software. However, forbidden applications are still prohibited.
5. The student configures the accessibility software settings as needed.
6. After configuring the accessibility software settings, the student returns to the secure browser. At this point, the student can no longer switch back to the accessibility software. If changes need to be made, the student must sign out and then sign in again.
7. The student continues with the sign-in process.
Permissive Mode is available only for computers running supported desktop Windows and Mac operating systems. For information about supported operating systems, see the Technical Specifications Manual.
Personnel should test the compatibility of assistive technology software with permissive mode enabled by accessing the Student Practice Site via the Secure Browser in advance of testing.
Forbidden applications will still not be allowed to run.
Troubleshooting
This section describes how to troubleshoot some situations in which a student cannot connect to a test.
Resolving Secure Browser Error Messages
This section provides possible resolutions for the following messages that students may receive when signing in.
• You cannot login with this browser: This message occurs when the student is not using the correct secure browser. To resolve this issue, ensure the latest version of the secure browser is installed, and that the student launched the secure browser instead of a standard web browser. If the latest version of the secure browser is already running, then log the student out, restart the computer, and try again.
• Looking for an internet connection: This message occurs when the secure browser cannot connect with the Test Delivery System. This can occur if there is a network-related problem. Make sure that either the network cable is plugged in (for wired connections) or the Wi-Fi connection is live (for wireless connections). Also check if the secure browser must use specific proxy settings; if so, those settings must be part of the command that launches the secure browser.
• Test Environment Is Not Secure: This message can occur when the secure browser detects a forbidden application running on the device (see the section Additional Measures for Securing the Test Environment). If this message appears on an iPad, ensure that either Autonomous Single App Mode or Automatic Assessment Configuration is enabled (see the section Configuring Tablets for Testing).
Force-Quit Commands
In the rare event that the secure browser or test becomes unresponsive, you can force-quit the secure browser.
To force the secure browser to close, use the keyboard command for your operating system as shown below. This action logs the student out of the test. When the secure browser is opened again, the student logs back in to resume testing.
Windows*: Ctrl + Alt + Shift + F10
Mac OS X*: Ctrl + Alt + Shift + F10 (the Ctrl key may appear as Control, Ctrl, or ^)
Linux: Ctrl + Alt + Shift + Esc
*If you are using a laptop or notebook, you may need to press Function before pressing F10.
Caution: Use of Force-Quit Commands
The secure browser hides features such as the Windows task bar or Mac OS X dock. If the secure browser is not closed correctly, then the task bar or dock may not reappear correctly, requiring you to reboot the device. Avoid using a force-quit command if possible.
Force-quit commands do not exist for the secure browser for iOS, Chrome OS, and Android devices.
• iOS: Double-tap the Home button, then close the app as you would any other iOS app.
• Chrome OS: To exit the secure browser, press Ctrl + Shift + S.
• Android: To close the secure browser, tap the menu button in the upper-right corner and select Exit.
Appendix B. Text Response Formatting Toolbar
In addition to the standard test tools described in the section Test Tools, students can use a formatting toolbar above the response field for some text response questions (see Figure 37). The formatting toolbar allows students to apply styling to text and use standard word-processing features.
Figure 37. Text Response Question with Formatting Toolbar
The lower-right corner of the response field displays the word count and character count for the student's response.
Table 8 provides an overview of the formatting tools available.
Table 8. Description of Formatting Tools
• Bold, italicize, or underline selected text.
• Remove formatting that was applied to the selected text.
• Insert a numbered or bulleted list.
• Indent a line of selected text.
• Decrease indent of text.
• Cut selected text.
• Copy selected text.
• Paste copied or cut text.
• Undo the last edit to text or formatting in the response field.
• Redo the last undo action.
• Use spell check (if available) to identify potentially misspelled words in the response field. Not available for English language arts tests.
• Add special characters in the response field.
Spell Check
The spell check tool identifies words in the response field that may be misspelled (see Figure 38).
Figure 38. Spell Check Tool
To use spell check:
1. In the toolbar, select the spell check tool.
2. Potentially incorrect words change color and become underlined.
3. Select a misspelled word. A list of suggestions appears.
4. Select a replacement word from the list. If none of the replacement words are correct, close the list by clicking anywhere outside it.
5. To exit spell check, select the spell check tool again.
Special Characters
Students can add mathematical symbols, accented characters, and other special characters.
To add a special character:
1. In the toolbar, select the special characters tool.
2. In the window that pops up, select the required character (see Figure 39).
Figure 39. Special Characters Window
Appendix C. Keyboard Navigation for Students
Students can use keyboard commands to navigate between test elements, features, and tools.
Keyboard commands require the use of the primary keyboard. Do not use keys in a numeric keypad.
Sign-In Pages and In-Test Pop-ups
Table 9 lists keyboard commands for selecting options on the sign-in pages or pop-up windows that appear during a test.
Table 9. Keyboard Commands for Sign-In Pages and Pop-Up Windows
Move to the next option: Tab
Move to the previous option: Shift + Tab
Select the active option: Enter
Mark checkbox: Space
Scroll through drop-down list options: Arrow Keys
Close pop-up window: Esc
Keyboard Commands for Test Navigation
Table 10 lists keyboard commands for navigating tests and responding to questions.
Table 10. Keyboard Commands for Test Navigation
Scroll up: Up Arrow
Scroll down: Down Arrow
Scroll to the right: Right Arrow
Scroll to the left: Left Arrow
Move to the next element: Tab
Move to the previous element: Shift + Tab
Select an answer option: Space
Go to the next test page: Ctrl + Right Arrow
Go to the previous test page: Ctrl + Left Arrow
Open the global menu: Ctrl + G
Open a context menu: Ctrl + M
Keyboard Commands for Global and Context Menus
Students can use keyboard commands to access tools in the global and context menus. For more information about tools in the global menu, see Table 5. For more information about tools in the context menu, see Table 6.
Global Menu
To access the global menu tools using keyboard commands:
1. Press Ctrl + G. The global menu list opens.
2. To move between options in the global menu, use the Up or Down arrow key.
3. To select an option, press Enter.
4. To close the global menu without selecting an option, press Esc.
Context Menus
To open the context menu for an element:
1. Navigate to the element using the Tab or Shift + Tab command.
2. Press Ctrl + M. The context menu for the selected element opens.
3. To move between options in the context menu, use the Up or Down arrow keys.
4. To select an option, press Enter.
5. To close the context menu without selecting an option, press Esc.
Highlighting Selected Regions of Text
This section explains how to use keyboard commands to select a text excerpt (such as a word in a passage) and highlight it. These instructions only apply to students using the secure browser.
To select text and highlight it:
1. Navigate to the element containing the text you want to select.
2. Press Ctrl + M to open the context menu and navigate to Enable Text Selection.
3. Press Enter. A flashing cursor appears at the upper-left corner of the active element.
4. To move the cursor to the beginning of the text you want to select, use the arrow keys.
5. Press Shift and an arrow key to select your text. The text you select appears shaded.
6. Press Ctrl + M and select Highlight Selection.
Keyboard Commands for Grid Questions
Questions with the grid response area (see Figure 40) may have up to three main sections:
• Answer Space: The grid area where students enter the response.
• Button Row: The following buttons may appear above the answer space: Delete, Add Point, Add Arrow, Add Line, Add Circle, Add Dashed Line, and Connect Line.
• Object Bank: A panel containing objects you can move to the answer space.
Figure 40. Grid Question
To move between the main sections:
1. To move clockwise, press Tab. To move counter-clockwise, press Shift + Tab.
To add an object to the answer space:
1. With the object bank active, use the arrow keys to move between objects. The active object has a blue background.
2. To add the active object to the answer space, press Space.
To use the action buttons:
1. With the button row active, use the left and right arrow keys to move between the buttons. The active button is white.
2. To select a button, press Enter.
3. Press Space to apply the point, arrow, or line to the answer space.
To move objects and graph elements in the answer space:
1. With the answer space active, press Enter to move between the objects. The active object displays a blue border.
2. Press Space.
3. Press an arrow key to move the object. To move the object in smaller increments, hold Shift while pressing an arrow key.
Appendix D. Transferring a Test Session
You can transfer an active test session from one device or browser to another without stopping the session or interrupting in-progress tests. This is useful in scenarios when your computer malfunctions while a session is in progress.
Warning: If you do not know the active Session ID, you cannot transfer the session.
The Test Delivery System ensures that you can only administer a test session from one browser at a time. If you move a test session to a new device, you cannot simultaneously administer the session from the original browser or device.
These instructions apply to both the Test Administrator Interface and Test Administrator Practice Site. However, you cannot transfer a session from the Test Administrator Interface to the Test Administrator Practice Site or vice versa.
To transfer a test session to a new device or browser:
1. While the session is still active on the original device or browser, log in to the Test Administrator Site on the new device or browser. A Session ID prompt appears.
2. Enter the active Session ID in the text box and click Enter. The Test Administrator Site appears, allowing you to continue monitoring your students’ progress. The test session on the previous computer or browser automatically closes.
The Session ID prompt appears any time you access the TA Site during an active session. If you do not wish to return to the active session, you can click Start a Different Session to create a new session or Logout to close the active session and log out of the TA Site.
Appendix E. User Support
For additional information and assistance in using the Test Delivery System, contact the Ohio Help Desk.
The Help Desk is open Monday-Friday 7 a.m. to 5 p.m. (except holidays or as otherwise indicated on the Ohio's State Tests portal).
Ohio Help Desk
Toll-Free Phone Support: 877-231-7809
Email Support: [email protected]
Please provide the Help Desk with a detailed description of your problem, as well as the following:
• Test Administrator name
• If the issue pertains to a student, provide the student’s SSID, test name, test part, and associated district or school. Do not provide the student’s name.
• If the issue pertains to a TIDE user, provide the user’s full name and email address.
• Any error messages and codes that appeared, if applicable.
• Affected test ID and question number, if applicable.
• Operating system and browser version information, including version numbers (for example, Windows 7 and Firefox 45 or Mac OS 10.10 and Safari 8)
• Information about your network configuration, if known:
o Secure browser installation (to individual devices or network)
o Wired or wireless internet network setup
Appendix F. Change Log
Date Section/Element Note
9/25/17 Initial version 2017-2018
02/06/18 All Updates made to reflect TA Interface time column and Student Interface test clock feature
FOURTH EDITION
JANUARY 2018
Ohio’s Accessibility Manual
Table of Contents
Section 1: Introduction ........................................................................................................................... 4
1.1 About this Manual ........................................................................................................................................4
1.2 About Accessibility Features on Ohio’s State Tests ...............................................................................4
1.3 General Testing Procedures .......................................................................................................................4
Section 2: Ohio’s Accessibility Features for Students Taking Ohio’s State Tests ........................ 4
2.1 Decision-Making Framework for Accessibility Features ........................................................................4
2.2 Ohio’s Accessibility Features .....................................................................................................................5
2.3 Administrative Considerations ...................................................................................................................6
2.4 Universal Tools ............................................................................................................................................7
2.5 Designated Supports ................................................................................................................................. 10
2.6 Accommodations for Students with Disabilities and English learners............................................... 13
2.7 Considerations for English Learner Accommodations ......................................................................... 23
2.8 Other Accommodations and Modifications ............................................................................................ 27
Section 3: Universal Design and Ohio’s State Tests ........................................................................ 28
Acknowledgements
The Ohio Department of Education would like to acknowledge the members of the Ohio AT Network for giving their time, insight and expertise to this manual.
Revision History
The revision history of this manual provides a means for readers to easily navigate to places in the relevant section where updates have occurred. Significant changes and updates are indicated with red text and underline for additions and strike-throughs for deletions. Minor changes, such as typos, formatting and grammar corrections or updates, are not highlighted.
Page Description
4 Noted that some universal features can be turned off.
4 Noted new Desmos calculator.
5 Changed description of Highlighter to match Test Administrator Manual (TAM).
5 Changed name of General Masking to Masking to match TAM.
5 Changed description of Line reader to match TAM.
5 Changed name of Flag items to Mark for review to align with TAM.
5 Changed description of Paginated stimuli and reading mode to match TAM.
6 Changed name of Eliminate answer choices to Strikethrough to align with TAM.
6 Added new feature, Test Timer.
6 Added reminder about voice packs for Text-to-speech.
6 Changed name of Magnification or enlargement to Zoom to align with TAM.
7 Noted that online features can be turned on and off in the student test settings.
7-8 Removed multiple features that can be disabled and consolidated under term Disable universal tool.
8 Added Fact charts to Calculator - handheld.
8 Added Line reader tool - handheld.
9 Added Spellchecker - handheld.
9 Added Tactile fidgets/Fidget devices.
9 Changed name Timer to Timer – external to differentiate from new universal Test timer feature.
10 Added note about documenting accommodations for different standardized tests.
11 Added note to not document accommodations for college and career readiness tests on IEPs or 504 plans.
11 Added that graphic organizers are not allowable on Ohio State Tests.
11 Added note about the Assistive Technology and Accessible Educational Materials Center.
12 Added reminder that reading only questions and answer options to students is not allowed.
15 Added reminder about voice packs for Text-to-speech.
15 Added note about the Assistive Technology and Accessible Educational Materials Center.
16 Noted new Desmos calculator.
16-17 Changed calculator policy for science tests.
17 Added rekenrek and removed limit on use to visually impaired.
17 Added comment about Mathematical tools.
18 Removed prohibition of calculators on science tests.
22 Added reminder about voice packs for Text-to-speech.
23 Added word-to-word glossaries and dictionaries approved by ACT and College Board to state allowed dictionaries.
23 Added additional information to section on emergency accommodations.
Section 1: Introduction
1.1 About this Manual
Ohio's Accessibility Manual is a comprehensive policy document providing information about the accessibility features of Ohio's State Tests for grades 3-8 and high school in English language arts, mathematics, science and social studies. The manual helps to define the specific accessibility features available for all students, students with disabilities, students who are English learners and students who are English learners with disabilities. The intended audience of the manual is district decision makers and teams who will determine the accessibility features for all students taking the tests.
1.2 About Accessibility Features on Ohio's State Tests
Ohio regards tests as tools for enhancing teaching and learning. Ohio is committed to providing all students, including but not limited to, students with disabilities, English learners, English learners with disabilities, and underserved populations, with equitable access to high-quality, 21st century assessments. By applying principles of universal design, leveraging technology, and embedding and allowing a broad range of accessibility features, Ohio's State Tests provide opportunities for the widest possible number of students to demonstrate their knowledge and skills. Ohio sets and maintains high expectations that all students will have access to the full range of grade-level and course content standards. Together, these elements will increase student access to Ohio's State Tests with fidelity of implementation. Ohio's goals for promoting student access include:
● Applying principles of universal design to the development of the assessments such that the assessments provide the greatest amount of accessibility and minimize test related barriers for all students;
● Measuring the full range of complexity of the standards;
● Leveraging technology for the accessible delivery of the assessments;
● Building accessibility throughout the test without sacrificing assessment validity; and
● Using a combination of accessible design and accessible technologies from the inception of items and tasks.
1.3 General Testing Procedures
For information about coordinating or administering Ohio's State Tests, including test security policies, administrative procedures and tasks to complete before, during and after testing, refer to the Test Administration Manual. Manuals are available on Ohio's State Tests Portal.
Section 2: Ohio’s Accessibility Features for Students Taking Ohio’s State Tests
2.1 Decision-Making Framework for Accessibility Features
Students should be familiar with accessibility features prior to testing and should have the opportunity to select, practice and use those features in instruction before test day. Students can become familiar with the computer-based features by accessing the practice items available on the Student Practice Site on Ohio's State Tests Portal. Appendix G provides a graphic to assist district testing accessibility decision makers in selecting appropriate features based on student needs. The graphic shows the various layers of features and provides guiding questions to support the district's selection process.
2.2 Ohio’s Accessibility Features Through a combination of universal design principles and computer-embedded accessibility features, Ohio has designed an inclusive assessment system by considering accessibility from initial design through item development, field-testing and implementation of the assessments for all students. Although accommodations may still be needed for some students with disabilities and English learners to assist in demonstrating what they know and can do, the computer-embedded accessibility features should minimize the need for accommodations during testing and ensure the inclusive, accessible and fair testing of the diverse students being assessed.
Ohio’s Accessibility System
Accommodations for students with disabilities must be documented on IEPs or 504 plans. Other accessibility features are not required to be documented to be provided. However, if there is an accessibility feature that a team wants to ensure a student receives, the team should document the feature on the student’s IEP or 504 plan as well.
For example, if a student with a disability needs to have the test administered in a small group setting or if a student must have color contrast for testing, these features also should be included on the IEP or 504 plan. If they are not included on a plan, they may still be provided, but documenting the student’s need ensures that the features are provided.
2.3 Administrative Considerations
Students are typically tested in their general education classrooms following the test administration schedule for the grade and content area being administered. However, the administrator has the authority to schedule students in testing spaces other than general education classrooms and at different scheduled times, as long as all requirements for testing conditions and test security are met as set forth in the Test Administration Manual. Decisions may be considered, for example, that benefit students who are easily distracted in large group settings by testing them in a small group or individual setting. In general, changes to the timing, setting or conditions of testing are left to the discretion of the principal or test coordinator. In accordance with principles of universal design for assessment, these administrative considerations are available to all students.
Familiar test administrator: The student knows the test administrator and/or interpreter.
Frequent breaks: All students may take breaks as needed. Frequent breaks refers to multiple, planned, short breaks during testing based on a specific student's needs (for example, the student fatigues easily). During each break, the testing clock is stopped. Students should pause their test when taking a break. Students may pause their test from the student testing site, or the test administrator may do so from the Test Administrator Interface. Pausing a student's test signs the student out of his or her test. A student who pauses his or her test and signs back into the test on the same school day will be able to revisit all the items on the test. A warning message displays after 20 minutes of test inactivity. If the student does not click OK within 30 seconds after this message appears, the test is paused and the student is signed out.
Separate or alternate location: The test is administered in a different location than the location where other students are testing (for example, a different classroom).
Small group: A small group is a subset of a larger testing group assessed in a separate location. There is no specific number defined for a small group, but two to eight students is typical. A "group" of one also is permissible. Small groups may be appropriate for human read-aloud and translated test administration or to reduce distractions for some students.
Specialized equipment or furniture: This includes equipment such as adjustable desks or chairs.
Specified area or seating: The student sits in a specific place in the test setting, such as by the window for natural light or beside the test administrator's desk.
Time of day: The student takes the test during the time of day most beneficial to his or her performance. Care must be taken to ensure that the student has all allowable time available for testing.
2.4 Universal Tools
On the Ohio computer-based assessments, universal tools are features or preferences that are either built into the assessment system or provided externally by test administrators. Universal tools are available for all students taking Ohio's State Tests. Since these features are available for all students, they are not classified as accommodations. Students should be familiar with these features prior to testing and should have the opportunity to select and practice using them in order to appropriately use these features on test day. Universal tools are intended to benefit a wide range of students and may be used by the student at his or her discretion during testing. Universal tools embedded in the test delivery system are on by default, but some may be turned off. See the Test Administration Manual for detailed information about turning features on and off in the student test settings.
Universal Tools
Blank paper: The test administrator provides blank scratch paper to students to take notes and/or work through items during testing. Blank paper is required for the English language arts tests. For mathematics, science and social studies, blank paper must be available upon request. Refer to the Test Administration Manual for more information about blank paper.
Calculator – Test Delivery System: The Test Delivery System provides a calculator for student use on calculator-allowable mathematics tests or parts of tests and the physical science test. Beginning with the 2017-2018 school year, Ohio's State Tests use Desmos as the online calculator. The previous calculator versions are no longer available on the tests or under the Student Practice Test resources. Practice tests that have the calculator tool have been updated to provide students with the Desmos calculator. The Desmos calculators are also available in the Student Practice Resources folder on the Ohio's State Tests portal. Additional calculator guidance is in the Test Administration Manual. A graphing calculator is available on the following tests:
• Algebra I
• Geometry
• Integrated Mathematics I
• Integrated Mathematics II
A scientific calculator is available on the following tests:
• Physical Science
• Grades 6 to 8 Mathematics
General directions: The test administrator must read the scripted general directions for starting all administrations and must not deviate from the script. After the test administrator has read the directions, students may ask for the directions to be repeated or clarified. General directions may be translated or signed (e.g., ASL). General directions include the scripted information for students that comes before the test starts. Once students have begun the test, nothing may be clarified.
General masking: The student electronically "covers" part of an item with a blank box as needed so he or she can focus on certain item elements. The student may uncover anything masked when ready. This setting can be changed in TIDE and the Test Administrator Interface.
Headphones: The student uses headphones or earbuds to access text-to-speech or media on the assessment. Students using text-to-speech must use headphones if tested in a group setting. At this time, there are no audio clips embedded in any content area test. Therefore, headphones are not required for testing unless a student is using the text-to-speech feature in a group setting. Students with hearing impairments may use personal FM systems. For more information on additional assistive technology devices and software for use on Ohio's State Tests, refer to Appendix D of this manual.
Highlighter: The student electronically highlights text as needed to recall and/or emphasize. This setting can be changed in TIDE and the Test Administrator Interface.
Line reader: The student uses an onscreen tool to highlight lines of text as they read. This setting can be changed in TIDE and the Test Administrator Interface.
Mark for review (Flag items): The student electronically "flags" or "bookmarks" items to review later.
Notepad: The student writes notes using the embedded notepad feature.
Paginated stimuli and reading mode: The student moves between pages of a reading passage by clicking the arrow keys below the passage, reading the passage by flipping pages, similar to a book or e-reader. This eliminates vertical scrolling on passages. The student also can open the reading mode window, which displays two pages of the reading passage at a time. Paginated stimuli and reading mode are available only for ELA and some social studies tests. This setting can be changed in TIDE and the Test Administrator Interface.
Redirect student to the test: The test administrator redirects the student's attention to the test without coaching or assisting the student in any way. Redirecting a student is not the same as cueing or prompting the student.
Spellcheck: This feature allows the student to check the spelling of words in student-generated responses. Spellcheck is available only for some science and social studies items that require a student to write/type a response. Unlike some word processing programs, the Student Testing Site does not automatically highlight misspelled words as the student types. Students must click the ABC button to check spelling. Spellcheck is not allowed on the English language arts tests. There are no type-written responses for mathematics.
Strikethrough (Eliminate answer choices): The student electronically crosses out possible answer choices on multiple-choice items. This setting can be changed in TIDE and the Test Administrator Interface.
Test timer: The student test timer displays the amount of time the student has been in the test. The timer runs only while the student is viewing test content. The test timer does not enforce a time limit. Test administrators are responsible for ensuring that students complete each part of their tests within the posted testing time or, when applicable, within the student's allotted extended time. The student can collapse or expand the test timer by clicking on it. This feature may be turned off.
Text-to-speech for mathematics, science and social studies: Text-to-speech as a universal tool will be turned on for mathematics, science and social studies. The text-to-speech feature reads the test aloud to the student when the student selects an available "speak" option. Students must use headphones if tested in a group setting. Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts. Students who use text-to-speech should use a voice pack they are familiar with and adjust the volume, pitch and rate prior to starting the test. Detailed information about text-to-speech functionality is in the Test Administration Manual. Manuals are available on Ohio's State Tests Portal.
Text-to-speech tracking for mathematics, science and social studies: The feature highlights words in test questions as the embedded text-to-speech feature reads the test aloud. Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.
Writing tools: Writing tools (cut and paste, copy, underline, bold and insert bullets) are available for select constructed-response items.
Zoom (Magnification or enlargement): Students use the zoom out and zoom in buttons to decrease and increase the size of text and graphics on the page. Maximum zoom is about 250 percent, depending on the device.
2.5 Designated Supports
A relatively small number of students will require additional features for their particular needs (for example, changing the background or font color or disabling text-to-speech for the mathematics assessments). Providing too many tools on screen might distract some students. Therefore, some designated features will be selected ahead of time based on the individual needs and preferences of the student. Students should practice using these features and understand when and how to use them. Students can decide whether or not to use a pre-selected support without any consequence to the student, school or district. Individualizing access needs on the test for each student provides increased opportunities to accurately demonstrate knowledge and skills. Designated supports are divided into two types: 1) embedded designated supports; and 2) non-embedded
designated supports. Embedded supports are those that are available as part of the technology platform. They
can be turned on three different ways:
1. By uploading a student settings file in TIDE;
2. By marking the features under the “Test Settings” section of the student’s record manually in TIDE; or
3. Test administrators can select the feature(s) under “Test Settings” in the Test Administrator Interface
when approving the student to test during the test session.
See the Test Administration Manual for detailed information about turning features on and off in the student
test settings. Non-embedded supports are not part of the technology platform so test administrators must
provide them locally.
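For districts that assign designated supports to many students at once, the settings file upload in option 1 can be generated programmatically before it is uploaded in TIDE. The sketch below is purely illustrative: the actual TIDE upload template, its field names and its accepted values are defined in the TIDE User Guide, so the column names (SSID, Subject, Tool, Value) and sample values used here are assumptions made for illustration only.

```python
# Illustrative sketch only: writes a bulk student test settings file.
# The real TIDE upload template and its field names are defined in the TIDE
# User Guide; the columns and values below are hypothetical placeholders,
# not the actual TIDE format.
import csv

# Hypothetical designated-support selections for three students.
settings = [
    {"SSID": "OH000000001", "Subject": "Mathematics", "Tool": "Text-to-Speech", "Value": "Off"},
    {"SSID": "OH000000002", "Subject": "ELA", "Tool": "Background/Font Color", "Value": "Black on Light Yellow"},
    {"SSID": "OH000000003", "Subject": "Science", "Tool": "Print Size", "Value": "Level 2 (1.75X)"},
]

with open("student_test_settings.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["SSID", "Subject", "Tool", "Value"])
    writer.writeheader()
    writer.writerows(settings)
```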
Designated Supports
Embedded Designated Supports
Background/font color choice: Alternate on-screen background and font color is enabled. Options include:
• Black on light yellow
• Black on light blue
• Black on light magenta
• White on black (inverted)
• White on navy blue
A note about color blindness: The Department follows accessibility color guidelines when developing test items. Items on state tests should not be color dependent. Graphs, maps, charts and other images may have color, but being able to distinguish the colors should not affect a student's ability to respond to a question. When using color-contrast options, the contrast may not transfer to some images or text in images. If a student comes to an item that he or she cannot answer, either because it is not universally accessible or because the color contrast does not work properly, it is allowable for the test administrator to describe to the student what he or she needs to know to be able to answer the question. The test administrator must be cautious not to provide any information that gives the answer to the student.
Disable universal tool: Some students may benefit from fewer tools in the Test Delivery System when testing. Many of the universal tools available in the Test Delivery System can be turned off. See the Test Administration Manual for details about turning student settings on and off.
Disable general masking: Turn off general masking to reduce student distraction.
Disable paginated stimuli and reading mode: Turn off paginated stimuli and reading mode to reduce student distraction.
Disable text-to-speech for mathematics, science and social studies: Turn off text-to-speech to reduce student distraction.
Disable text-to-speech tracking for mathematics, science and social studies: Turn off text-to-speech tracking to reduce student distraction.
Mouse pointer size and color: Adjust the size and color of the mouse cursor as it appears on the student's screen. Options include:
• Large/extra large black
• Large/extra large green
• Large/extra large red
• Large/extra large yellow
• Large/extra large white
Print size: The print size can be pre-set to one to four levels larger than the default.
• Level 0: 1X (default/no zoom)
• Level 1: 1.5X
• Level 2: 1.75X
• Level 3: 2.5X
• Level 4: 3X
Non-embedded Designated Supports
Calculator or fact charts - handheld: Students may use handheld calculators and fact charts (addition, subtraction, multiplication or division only) for calculator-allowable mathematics tests or parts of tests and the physical science test. Additional calculator guidance is in the Test Administration Manual.
External magnification or enlargement device: The student uses external magnification or enlargement devices to increase the font or graphic size (e.g., projector, closed-circuit television, eye-glass mounted or hand-held magnifiers, electronic magnification systems, etc.).
Line reader tool - handheld: The student uses a blank straight edge as he or she reads and follows along with the text on the screen.
Music and white noise: A student or group of students listens to background music during testing. The test administrator may play music to a student or group of students, or a student may use a teacher-provided device and earbuds. Music selections should be free of any test content-specific lyrics. Test security must be maintained. Students may not use a personal device (e.g., cell phone, MP3 player). Additional information about the electronic device policy is in the Test Administration Manual.
Noise buffers: The student uses headphones/earbuds or earplugs to minimize distraction or filter external noise during testing. If students use headphones/earbuds as noise buffers, they should not be plugged into a device.
Rulers, angled-rulers, compasses and protractors: Students may be familiar with these tools from instruction at various grade levels and want to use them on the test. While these tools are not required for testing, districts may choose to provide them to students or allow students to provide their own. The tools cannot contain any additional writing or information that may provide an unfair testing advantage. Examples of additional writing could include but are not limited to multiplication tables, formulas or conversion charts. A student with a visual impairment may need adapted mathematical tools such as a large print ruler, Braille ruler, tactile compass or Braille protractor.
Specialized paper: In addition to blank paper, students may use test administrator-provided grid paper, wide-ruled paper, Braille paper, raised-line paper, bold-line paper, raised-line grid paper, bold-line grid paper, colored paper, etc. The paper provided cannot contain any writing that may give the student an unfair testing advantage. Examples of additional writing that is prohibited can include but are not limited to number lines, two-column tables, fraction models and coordinate grids. Students also may use personal white boards.
Spellchecker - handheld: The student uses a handheld spellchecker during testing instead of the universal spellcheck embedded in the test delivery system. A handheld spellchecker may be used on all Ohio state tests except English language arts. Spellcheckers may not connect to the internet, store information or include definitions, phrases, sentences or pictures. The student should be familiar with the spellchecker he or she will use during testing.
Student reads test aloud to self: The student reads aloud to himself or herself. This feature includes the use of whisper phones. The student must be tested in a one-on-one setting so that he or she does not disturb other students, or in a setting in which students are separated enough from one another that they cannot hear or disturb each other.
Tactile fidgets/Fidget devices: The student uses a tool for self-regulation to help with focus, attention, calming and active listening (e.g., fidget spinner, squish ball, focus cube, pencil topper, etc.). The tool must be free of test content or anything else that may give an advantage during testing.
Timer - external: The student uses a timer. There are a variety of timers that students may use, ranging from basic kitchen timers to more complex wearable devices that vibrate or flash at preset intervals or timers with visual cues, such as a red covering that disappears as the timer counts down. Students may not use cell phones, and devices must not connect to the internet.
2.6 Accommodations for Students with Disabilities and English Learners
While all students potentially can benefit from the universal tools and designated supports embedded within the test, some students may still need further support to access the tests and show what they know. Those students may require testing accommodations. Accommodations for testing are supports that are already familiar to the student because they are being used in the classroom to support instruction. Four distinct groups of students may receive accommodations on Ohio's State Tests:
1. Students with disabilities who have an Individualized Education Program (IEP);
2. Students with a Section 504 plan who have physical or mental impairments that substantially limit
one or more major life activities, have records of such impairments, or are regarded as having such
impairments, but who do not qualify for special education services;
3. Students who are English learners (Guidelines for determining English learner status can be
found in the Ohio Statewide Assessments Rules Book.) Students who have exited English learner
status may not receive English learner accommodations on Ohio’s State Tests; and
4. Students who are English learners with disabilities who have IEPs or 504 plans are eligible for
both accommodations for students with disabilities and English learners. For additional guidance
and information about English learners with disabilities, access the About the Lau Resource Center
page of the Ohio Department of Education website.
For Ohio’s State Tests, accommodations are considered to be adjustments to the testing conditions, test format or test administration that provide equitable access during assessments for students with disabilities and students who are English learners. The administration of the assessment should never be the first occasion in which an accommodation is introduced to the student. Accommodations should:
• Provide equitable access during instruction and assessment;
• Mitigate the effects of a student’s disability or English learner status;
• Not reduce learning or performance expectations;
• Not change the construct being assessed; and
• Not compromise the integrity or validity of the assessment.
The guidelines provided in this manual are intended to ensure that valid and reliable scores are produced on Ohio's State Tests and that an unfair advantage is not given to students who receive accommodations. Outside of the guidance provided in this manual, changes to an accommodation or the conditions in which it is provided may change what the test is measuring, and will likely call into question the reliability and validity of the results regarding what a student knows and is able to do as measured by the test.
Accommodations should adhere to the following principles:
• Accommodations enable students to participate more fully and fairly on assessments and to demonstrate their knowledge and skills;
• Accommodations should be based upon an individual student’s needs rather than on the category of a student’s disability, level of English language proficiency alone, level of or access to grade-level instruction, amount of time spent in a general classroom, current program setting or availability of staff;
• Teams should base accommodations on a documented need in the instruction and assessment setting and educators should not provide accommodations in order to give the student an enhancement that others could view as an unfair advantage;
• IEP teams and 504 Plan coordinators should describe and document accommodations for students with disabilities in the student’s appropriate plan (i.e., either the IEP or 504 Plan);
• Ohio requires that districts develop district-wide educational plans for English learners that include testing accessibility features;
• Educators should not introduce accommodations to the student for the first time during testing;
• When allowable, students also should use accommodations used during instruction on district assessments and state tests.
• Policies about allowable accommodations sometimes differ between standardized tests. For example, an accommodation allowed on the state ELA test may not be allowed on a district ELA test. To help ensure that students only receive accommodations that result in a valid assessment, IEP teams and 504 plan coordinators should document student accommodations by test and content area, not content area alone.
• Tests that require application for and vendor approval of accommodations, such as college and career readiness tests, should not be documented on an IEP or 504 plan. Because students will have a valid score only if they use the accommodations the vendor approves, IEP teams and 504 plan coordinators cannot ensure that a student will be provided the accommodations documented in their plan.
The table below shows the allowable accommodations for Ohio’s State Tests. Note that some accommodations students use in the classroom will reduce the validity of a student’s test score and are not allowable, such as use of a thesaurus, graphic organizers or access to the Internet during testing.
Accommodations for Students with Disabilities
Presentation Accommodations
Presentation accommodations alter the method or format used to administer Ohio's State Tests to a student by changing the auditory, tactile or visual characteristics of the test, or a combination of these. Students who benefit most from presentation accommodations are those with disabilities that affect reading standard print, typically as a result of a physical, sensory, cognitive or specific learning disability.
Additional assistive technology regularly used in instruction
Students may use a range of assistive technologies (AT) on Ohio’s State Tests including devices that are compatible with the AIR Student Testing Site, and those that are used externally (i.e., on a separate device).
For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix D of this manual. For information about who needs AT, how to obtain AT and AT tools, visit the Assistive Technology & Accessible Educational Materials Center online (ataem.org).
Human reader for computer-based test
A test administrator or monitor reads from the student's computer screen to the student. For computer-based testing, most students should be able to use text-to-speech for a read-aloud. In some cases, a student's disability may prevent the student from using the text-to-speech feature and require a human reader.
If testing in a small group, test administrators should ensure that all students in the group have similar abilities so that the reader's pace meets all students' needs without being too slow or too fast for any student.
Refer to the TIDE User Guide for information about setting up groups for computer-based testing.
If a student needs this accommodation, the person providing the accommodation must read the entire test to the student. It cannot be "as needed" or "on demand."
Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.
Paper version of test instead of online
A paper version of the test is available for the small number of students who are unable to use a computer due to the impact of their disability. Before selecting a paper version of the test, IEP teams and 504 plan coordinators should first consider other accessibility features. Students who take a paper-based test should be unable, even with support, to use technology to produce and publish writing using keyboarding. Situations that may require this accommodation include:
● A student with a disability who cannot participate in the online assessment due to a health-related disability, neurological disorder or other complex disability and/or cannot meet the demands of a computer-based test administration even with other accessibility features such as extended time, frequent breaks or a scribe;
● A student with an emotional, behavioral or other disability who is unable to maintain sufficient concentration to participate in a computer-based test administration, even with other accessibility features such as a familiar test administrator, frequent breaks, small group, specified seating or a timer;
● A student with a disability who requires assistive technology that is not compatible with the testing platform.
If a student takes a paper version of a test, the student must take both parts of the test on paper.
Refer to Appendix A of this manual for additional information about paper-based testing.
Read-aloud on English language arts
“Read-aloud” as a general term is when a student is administered a test via text-to-speech, human reader, screen reader or sign language interpreter.
The read-aloud accommodation for the English language arts test is intended to provide access for a very small number of students to printed or written texts on the tests. These students have print-related disabilities and otherwise would be unable to participate in the state tests because their disabilities severely limit or prevent them from decoding, thus accessing printed text.
Because students who require this accommodation are unable to access printed text, they must have a read-aloud for the entire test, including the items, answer options, charts/graphs/figures and passages. This accommodation is not intended for students reading somewhat (only moderately) below grade level.
Reading only questions and answer options to a student is not allowable on the ELA test. If a student qualifies for this accommodation, then they must have the entire test read, including the passages.
In making decisions on whether to provide a student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has:
• A disability that severely limits or prevents him or her from accessing printed text, even after varied and repeated attempts to teach the student to do so (for example, the student is unable to decode printed text);
OR
• Blindness or a visual impairment and has not learned (or is unable to use) Braille;
OR
• Deafness or hearing loss and is severely limited or prevented from decoding text due to a documented history of early and prolonged language deprivation.
Before documenting the accommodation in the student’s IEP or 504 plan, teams/coordinators also should consider whether:
• The student has access to printed text during routine instruction through a reader, other spoken-text audio formats, accessible educational materials (AEM) or a sign language interpreter;
• The student’s inability to decode printed text or read Braille is documented in evaluation summaries from locally administered diagnostic assessments;
• The student receives ongoing, intensive instruction and/or interventions in foundational reading skills to continue attaining the important college- and career-ready skill of independent reading.
For information about who needs AEM, how to obtain AEM and tools to support AEM, visit the Assistive Technology & Accessible Educational Materials Center online (http://ataem.org/); IEP teams and 504 plan coordinators make decisions about who receives this accommodation. Schools should use a variety of sources as evidence (including state assessments, district assessments and one or more locally administered diagnostic assessments or other evaluation).
For students who receive this accommodation, no claims should be inferred regarding the student’s ability to demonstrate foundational reading skills.
Refer to the Test Administration Manual for more information about administering a test through a human reader.
Screen reader mode
Screen reader mode is for students with visual impairments who use
screen readers. Students who do not use screen readers should not
use screen reader mode. Screen reader mode changes the
presentation of items and removes some features. Students working
in this mode do not have the same access to tools. Additional
information about the screen reader and functionality is in the Test
Administration Manual, Practice Test Guidance Document and TIDE
User Guide.
Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts. Screen reader mode is not available for grade 8 science, biology or physical science. By design, screen reader mode does not render simulations and displays alternate text that describes the key information about the simulation needed to answer the associated items. Screen reader mode is not available for these tests because they contain simulations that cannot be adequately described due to the complexity of the simulations.
Sign language interpreter
Any student who is deaf or has hearing loss may have a sign language interpreter reflecting their IEP accommodations (American Sign Language, Signed English, Cued Speech) for mathematics, science and social studies.
For the purposes of statewide testing, sign language is considered a second language and should be treated the same as any other language from a translational standpoint. The test must be signed verbatim. The intent of the phrase “signed verbatim” does not mean a word-to-word translation, as this is not appropriate for any language translation. The expectation is that the interpreter should faithfully translate, to the greatest extent possible, all of the words on the test without changing or enhancing the meaning of the content, adding information or explaining concepts unknown to the student.
If a sign language interpreter perceives that a specific sign gives a student the answer or otherwise provides an unfair advantage, an alternate sign or finger spelling should be used.
Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.
Text-to-speech for English language arts
The text-to-speech feature reads aloud the test to the student when the student selects an available “speak” option.
Students must use headphones if tested in a group setting.
Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.
Students who use text-to-speech should use a voice pack they are familiar with and adjust the volume, pitch and rate prior to starting the test. Detailed information about text-to-speech functionality is in the Test Administration Manual. Manuals are available on Ohio’s State Tests Portal.
Text-to-speech tracking for English language arts
The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud.
Only students who meet the criteria to have a read-aloud accommodation on the English language arts test may use this feature for English language arts.
Response Accommodations
Response accommodations allow students to use alternative methods for providing responses to test items, such as through dictating to a scribe or using an assistive device.
Response accommodations can benefit students with physical, sensory or learning disabilities who have difficulties with memory, fine-motor skills, sequencing, directionality, alignment and organization.
Additional assistive technology regularly used in instruction
Students may use a range of assistive technologies (AT) on Ohio's State Tests, including devices that are compatible with the Student Testing Site and those that are used externally (i.e., on a separate device).
For more information on additional assistive technology devices and software for use on Ohio’s State Tests, refer to Appendix D.
For information about who needs AT, how to obtain AT and AT tools, visit the Assistive Technology & Accessible Educational Materials Center online (ataem.org).
Answers transcribed by test administrator
The student records his or her answers directly on paper and the test administrator/monitor transcribes the responses verbatim into the Student Testing Site.
Braille notetaker
A student who is blind or has visual impairments may use an electronic Braille notetaker. For Ohio’s State Tests, grammar checker, Internet and stored file functionalities must be turned off. The responses of a student who uses an electronic Braille notetaker during Ohio’s State Tests must be transcribed exactly as entered in the electronic Braille notetaker. Only transcribed responses will be scored. Transcription guidelines are available in Appendix C of this manual.
Braille writer
A student who is blind or has visual impairments may use an electronic Braille writer. A test administrator must transcribe into the computer the student’s responses exactly as entered in the electronic Braille writer.
Only transcribed responses will be scored. Transcription guidelines are available in Appendix C of this manual.
Calculator or fact charts on non-calculator mathematics test or part of test
The student uses a handheld or embedded calculator or fact chart (addition, subtraction, multiplication or division only) on a non-calculator mathematics test or part of test. Both parts of grades 3 through 5 mathematics tests and part 1 of grades 6 and 7 mathematics tests are non-calculator tests.
The accommodation would be permitted on test sections for which calculators are not allowed for other students. IEP teams and 504
plan coordinators should carefully review the following guidelines for identifying students to receive this accommodation.
This accommodation is for students with disabilities that severely limit or prevent their abilities to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division).
In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has a disability that severely limits or prevents the student’s ability to perform basic calculations (i.e., single-digit addition, subtraction, multiplication or division), even after varied and repeated attempts to teach the student to do so.
Before documenting the accommodation in the student’s IEP or 504 plan, teams also should consider whether:
● The student is unable to perform calculations without the use of a calculation device, arithmetic table or manipulative during routine instruction;
● The student’s inability to perform mathematical calculations is documented in evaluation summaries from locally administered diagnostic assessments;
● The student receives ongoing, intensive instruction and/or interventions to learn to calculate without using a calculation device, in order to ensure that the student continues to learn basic calculation and fluency.
If students in grades 3-5 will use the embedded Desmos calculator within the Student Testing Site for a math test, the test administrator must turn on this accommodation when approving the student to test for part 1 and part 2. If students in grades 6 and 7 will use the embedded Desmos calculator within the Student Testing Site as an accommodation, the test administrator must turn on this accommodation when approving the student to test for part 1. An embedded calculator for non-calculator math tests or parts of math tests cannot be turned on ahead of testing in TIDE.
Calculators are not allowed on the grades 5 and 8 science tests and the biology end-of-course test for students with disabilities. However, there are no mathematical calculations on these Ohio science tests and a calculator should not be needed. An embedded calculator is not available for these tests.
Calculator guidance is in the Test Administration Manual.
Mathematical tools – allowable tools as accommodation include:
• 100s chart
• Abacus/rekenrek and other specialized tools for students with visual impairments
• Algebra Tiles
• Base 10 blocks
• Counters and counting chips
• Cubes
• Fraction tiles and pies without numerical labels
• Square tiles
• Two-colored chips
Student uses these tools and manipulatives to assist mathematical problem solving. These manipulatives allow the flexibility of grouping, representing or counting without numeric labels.
Tools that give students answers (e.g., fraction tiles with numerical labels) or lead a student to use a specific strategy (e.g., number lines) are not allowed. These types of tools can be effective for instruction, and while students may create their own during testing as a strategy, they may not be provided to students on Ohio state tests.
For information about fact charts, see calculation device or fact charts on non-calculator mathematics test or part of test in this section.
Information about rulers, angled-rulers, compasses and protractors is located in the non-embedded designated supports section of this manual.
The Department will review and revise this list annually as needed.
Allowable for mathematics and physical science tests only.
Scribe
The student dictates responses either verbally, using a speech-to text device, augmentative or assistive communication device (e.g., picture or word board), or by signing, gesturing, pointing or eye gazing. Grammar checker, Internet and stored files functionalities must be turned off. Word prediction must also be turned off for students who do not receive this accommodation. The student must test in a separate setting.
In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators should consider whether the student has:
● A physical disability that severely limits or prevents the student’s motor process of writing through keyboarding;
OR
● A disability that severely limits or prevents the student from expressing written language, even after varied and repeated attempts to teach the student to do so.
Before documenting the accommodation in the student’s IEP or 504 plan, teams/coordinators should also consider whether:
● The student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments;
● The student routinely uses a scribe for written assignments; and
● The student receives ongoing, intensive instruction and/or interventions to learn written expression, as deemed appropriate by the IEP team or 504 plan coordinator.
Student’s responses must be transcribed exactly as dictated.
Information about the scribing process is available in Appendix C of this manual.
Specialized calculation device
A student uses a specialized calculation device (for example, a large key, talking or other adapted calculator) on the calculator part of the mathematics assessments. If a talking calculator is used, the student must use headphones or test in a separate setting.
The student must qualify for the calculation device or fact charts on non-calculator mathematics test or part of test accommodation to use a specialized calculator in those tests.
Calculators are not allowed on science tests except physical science.
Word prediction external device
The student uses an external word prediction device that provides a bank of frequently or recently used words on screen as a result of the student entering the first few letters of a word.
The student must be familiar with the use of the external device prior to assessment administration. The device cannot connect to the Internet or save information.
In making decisions whether to provide the student with this accommodation, IEP teams and 504 plan coordinators are instructed to consider whether the student has:
● A physical disability that severely limits or prevents the student from writing or keyboarding responses;
OR
● A disability that severely limits or prevents the student from recalling, processing and expressing written language, even after varied and repeated attempts to teach the student to do so.
Before documenting the accommodation in the student’s IEP/504 plan, teams/coordinators are instructed to consider whether:
● The student’s inability to express in writing is documented in evaluation summaries from locally administered diagnostic assessments; and
● The student receives ongoing, intensive instruction and/or intervention in language processing and writing, as deemed appropriate by the IEP team/504 plan coordinator.
Timing Accommodations
Timing and scheduling accommodations are changes in the allowable length of time in which a student may complete the test.
The extended-time accommodation is most beneficial for students who routinely need more time than is generally allowed to complete activities, assignments and tests. Extra time may be needed to:
● Process written text (for a student who processes information slowly or has a human reader);
● Write (for a student with limited dexterity); ● Use other accommodations or devices.
Extended time
Student is allowed more time than allotted for each test part.
In most cases, the Department recommends that extended time be defined for students and not open-ended. This accommodation is usually expressed as one and one-half time (1.5x) or double time (2x). A student who has one and one-half time on a test that normally takes 90 minutes may be allowed 135 minutes. Extended time may not exceed one school day; students must complete each test part on the same day that part is started.
Decisions about how much extended time is provided must be made on a case-by-case basis for each individual student, not for any category of students or group. Teams should keep in mind the purposes of different accommodations as they relate to disability characteristic or language barrier. Typically, if a student needs extended time, one and one-half time is sufficient. For some accommodations, such as use of a human reader or scribe, double time may be appropriate. Rarely is unlimited time (an entire school day) applicable.
Schools may choose to test students with the extended-time accommodation in a separate setting to minimize distractions. The Department recommends scheduling these students for testing in the morning to allow adequate time for completion of a test part by the end of the school day.
2.7 Considerations for English Learner Accommodations
While all English learners have in common that they are acquiring English language proficiency, they are not a homogenous group. Similar to students with disabilities, English learners should not be assigned accommodations using a one-size-fits-all approach. Knowing the student is key. When considering accommodations for English learners, it is important to focus on the effectiveness of each accommodation for each individual student. Not only does an English learner's English language proficiency influence accommodation effectiveness, but so do other factors, including their literacy development in English and their native language, grade, age, affective needs and time in U.S. schools. Keep in mind that the purpose of English language assessment accommodations is not to improve an English learner's rate of passing state assessments, but to allow more accurate demonstration of their knowledge of the content being assessed. All students who have been identified as English learners may receive accommodations for English learners even if they do not participate in the district English learner program. Schools should monitor how English learners in the classroom benefit from English learner-specific accommodations when determining accommodations for state tests.
Accommodations for English Learners
Accommodations for English learners are intended to reduce and/or eliminate the effects of a student's lack of English language proficiency. When making decisions about accommodations for English learners, teams should consider the effectiveness of the accommodation based on the English language proficiency level of the student.
Extended time
Student is allowed more time than allotted for each test part.
In most cases, the Department recommends that extended time be defined for students and not open-ended. This accommodation is usually expressed as one and one-half time (1.5x) or double time (2x). A student who has one and one-half time on a test that normally takes 60 minutes may be allowed 90 minutes. Extended time may not exceed one school day; students must complete each test part on the same day that part is started.
Decisions about how much extended time is provided must be made on a case-by-case basis for each individual student, not for any category of students or group. Teams should keep in mind the purposes of different accommodations as they relate to disability characteristic or language barrier. Typically, if a student needs extended time, one and one-half time is sufficient. For some accommodations, such as an oral translation, double time may be appropriate. Rarely is unlimited time (an entire school day) applicable.
Schools may choose to test students with the extended-time
accommodation in a separate setting to minimize distractions. The
department recommends scheduling these students for testing in the
morning to allow adequate time for completion of a test part by the
end of the school day.
Appropriate for all English language proficiency levels.
Human reader for computer-based test
Not allowed for English learners on the English language arts test.
A test administrator reads in English from the student’s computer
screen to the student. For computer-based testing, most students
should be able to use text-to-speech for a read-aloud.
Test administrators must administer the read-aloud accommodation
in a separate setting. This feature can be provided in small groups if
set up as a small group administration in the Student Testing Site. If
testing in a small group, test administrators should ensure that all
students in the group have similar abilities so that the reader’s pace
meets all student’s needs without being too slow or too fast for some
students.
If a student need this accommodation, the person providing the
accommodation must read the entire test to the student. It cannot be
“as needed” or “on demand.”
Appropriate for students who regularly have a human reader in the
classroom and who have had very little or no prior experience or
familiarity with computer-based testing technology.
Refer to the Test Administration Manual for more information about
administering a test through a human reader.
Oral translation of the test
Not allowed for English language arts test.
Note: The general directions for all tests, including English language
arts, may be translated. The general directions are the scripted
directions the test administrator reads to all students before the test
begins. The Department will not reimburse translators for translating
general directions only.
A translator reads aloud the test to a student in his or her native
language. Translators will translate the test from the student’s device.
Student responses must be recorded in the Student Testing Site in
English. Responses submitted in a language other than English will
not be scored.
Refer to the Test Administration Manual for additional information
about how to administer an oral translation.
A translator must administer an oral translation of the test in a
separate setting.
Appropriate for beginning and some intermediate-level English
learners but may not be appropriate for advanced-level English
learners.
Refer to the Test Administration Manual for more information about
administering an oral translation.
Scribe (In English)
Not allowed for the English language arts test.
The student dictates responses in English. The test administrator or
monitor must test the student in a separate setting.
May be appropriate for beginning-level English learners who do not
have translators and who have better spoken than written English
language proficiency. Typically, not appropriate for intermediate- or
advanced-level English learners.
Stacked Spanish/English bilingual
form of the test
Not allowed for the English language arts test.
Test items presented with Spanish on the top and English on the
bottom. Only responses in English will be scored.
Appropriate for students who have content knowledge in both
Spanish and English. Not appropriate for students who have not been
instructed in tested content in Spanish.
Text-to-speech Spanish/English
Not allowed for the English language arts test.
The text-to-speech feature reads aloud the test to the student.
Students who use text-to-speech should use a voice pack they are
familiar with and adjust the volume, pitch and rate prior to starting the
test. Detailed information about text-to-speech functionality is in the
Test Administration Manual. Manuals are available on Ohio’s State
Tests Portal.
Recommended for beginning and some intermediate English learners
but may not be appropriate for advanced-level English learners.
Text-to-speech tracking
Not allowed for the English language arts test.
The feature will highlight words in test questions as the embedded text-to-speech feature reads the test aloud. May help some students who use text-to-speech.
Word-to-word dictionary
(English/Native Language)
The student uses an allowable bilingual, word-to-word dictionary.
Dictionaries that include definitions, phrases, sentences or pictures
are not allowed. The student should be familiar with the dictionary he
or she will use during testing. An electronic translator may be used
instead of a paper dictionary. An electronic translator cannot connect
to the Internet or store information.
Recommended for intermediate and advanced English learners but
may not be appropriate for beginning-level English learners.
The Massachusetts Department of Elementary and Secondary
Education has released a list of dictionaries that are known to meet
the criteria for allowable dictionaries for statewide testing. This list
may be accessed at doe.mass.edu/mcas/testadmin/lep-bilingual-dictionary.pdf.
Word-to-word glossaries and dictionaries approved by ACT or the
College Board are allowable.
Assessment scores for students who qualify and receive any of the accommodations listed in this manual will be aggregated with the scores of other students and those of relevant groups and will be included for accountability purposes.
2.8 Other Accommodations and Modifications
Emergency Accommodations
An emergency accommodation may be appropriate for a student who incurs a temporary disabling condition
that interferes with test performance shortly before or during the assessment window (e.g. the student has a
recently fractured limb that affects physical access to the test, a student whose only pair of eyeglasses has
broken or a student returning after a serious or prolonged illness or injury). Scribe is the most common
emergency accommodation for the examples given. Extended time may also be considered when providing a
scribe but it is not required. For a student with a concussion, a paper test may be an appropriate emergency
accommodation, alternately, frequent breaks and a human reader may provide needed access for a student in
this situation.
If the principal (or designee) determines that a student requires an emergency accommodation, the optional Emergency Accommodation form found in Appendix E may be completed and maintained in the student's file. The Department recommends that the school notify the parent or guardian that an emergency accommodation was provided. If appropriate, the form also may be submitted to the district testing coordinator to be retained in the student's central office file.
Accommodation Irregularities
In the event that a student was provided a test accommodation the student was not entitled to, or if a student was not provided a test accommodation the student was entitled to, the school should refer to the Test Incident Guidance Document located in the Test Administration Manual to determine next steps.
Modifications on Assessments
Modifications are not permitted on Ohio's State Tests. Modifications, as contrasted with accessibility features, involve changes in the standards being measured on the test, or in the conditions in which a student takes the test, that would result in changes in what the assessment is designed to measure (e.g., reducing or changing expectations for students) or provide an unfair advantage to a student. Examples of modifications the Department does not permit on Ohio's State Tests include:
● Allowing a student to be assessed off grade level;
● Instructing a student to skip selected items, reducing the scope of assessments so a student needs to complete only a limited number of problems or items;
● Modifying the complexity of assessments to make them easier (e.g., deleting response choices on a multiple-choice assessment so that a student selects from two or three options instead of four);
● Providing hints, clues or other coaching that directs the student to correct responses;
● Defining vocabulary on the assessment, for non-glossed words, or explaining assessment items;
● Allowing the student to complete an assessment of English language arts in a language other than English; and
● Using a dictionary that provides definitions (rather than an acceptable word-to-word dual language dictionary).
Providing a student with modifications during Ohio's State Tests may constitute a test irregularity and will result in an invalidated score (the score will not be counted) and/or an investigation by the state into the school's or district's testing practices. Moreover, providing modifications to students during statewide tests may have the unintended consequence of reducing their opportunities to learn critical content and may result in adverse effects on the students throughout their educational careers.
Section 3: Universal Design and Ohio's State Tests
The Department designed Ohio's State Tests to ensure all students have the tools and supports to
demonstrate what they know. Using universal design approaches, the test makers ensure that all students
have an equal opportunity to show what they have learned. All students benefit from the flexibility universal
design can bring to assessment design and administration, including students who need accommodations.
Universally designed assessment aims to create multiple alternatives and approaches, so a maximum number
of students can take the assessment without accommodations.
Ohio has included the following universal-design requirements for item development for Ohio's State Tests:

● The item or task takes into consideration the diversity of the assessment population and the need to allow the full range of eligible students to respond to the item/stimulus.
● Constructs have been precisely defined and the item or task measures what is intended.
● Assessments contain accessible, non-biased items.
● Assessments are designed to be amenable to accommodations.
● Instructions and procedures are simple, clear and intuitive.
● Assessments are designed for maximum readability, comprehensibility and legibility.
● The item or task material uses a clear and accessible text format.
● The item or task material uses clear and accessible visual elements (when essential to the item).
● The item or task material uses text appropriate for the intended grade level.
● Decisions will be made to ensure that items and tasks measure what they are intended to measure for English language learner students with different levels of English language proficiency and/or first language proficiency.
● All accessibility features have been considered that may increase access while preserving the targeted construct.
● Test developers considered multiple means of item presentation, expression and student engagement with regard to items/tasks for both students with disabilities and English learners.
Table L1. Summary of Human and Machine Scores for Fall 2017 Writing Prompts

Columns, left to right: Mean (Human, Engine); Standard Deviation (Human, Engine); Human1-Human2 Agreement (Pearson r, % Exact, Weighted κ*, SMD*); Machine-Human Agreement (Pearson r, % Exact, Weighted κ*, SMD*).

Grade 3, Item 31679
  Conventions   1.42  1.47  0.64  0.61 | 0.70  0.74  0.62  0.00 | 0.73  0.79  0.66  0.07
  Evidence      1.33  1.30  0.81  0.70 | 0.72  0.71  0.61  0.00 | 0.67  0.67  0.53  0.03
  Purpose       1.42  1.36  0.78  0.64 | 0.73  0.72  0.63  0.01 | 0.66  0.67  0.52  0.08

Grade 9, Item 31578
  Conventions   1.50  1.53  0.68  0.65 | 0.79  0.81  0.72  0.00 | 0.82  0.84  0.75  0.04
  Evidence      1.66  1.64  0.90  0.81 | 0.86  0.77  0.76  0.01 | 0.82  0.75  0.71  0.03
  Purpose       1.81  1.82  0.91  0.81 | 0.85  0.75  0.75  0.01 | 0.85  0.78  0.76  0.02

Grade 9, Item 31588
  Conventions   1.35  1.38  0.74  0.72 | 0.74  0.74  0.65  0.04 | 0.80  0.79  0.71  0.03
  Evidence      1.54  1.56  0.85  0.80 | 0.83  0.76  0.74  0.04 | 0.81  0.76  0.71  0.03
  Purpose       1.76  1.71  0.81  0.73 | 0.81  0.77  0.73  0.01 | 0.79  0.78  0.71  0.07

Grade 10, Item 31662
  Conventions   1.54  1.59  0.68  0.63 | 0.81  0.84  0.74  0.01 | 0.79  0.83  0.72  0.07
  Evidence      1.72  1.76  0.88  0.81 | 0.86  0.79  0.77  0.01 | 0.81  0.74  0.70  0.04
  Purpose       1.82  1.82  0.84  0.78 | 0.87  0.82  0.80  0.00 | 0.84  0.80  0.76  0.00

Grade 10, Item 31513
  Conventions   1.53  1.54  0.68  0.68 | 0.83  0.83  0.76  0.02 | 0.81  0.82  0.73  0.01
  Evidence      1.59  1.60  0.90  0.84 | 0.88  0.80  0.80  0.02 | 0.85  0.77  0.76  0.01
  Purpose       1.85  1.83  0.89  0.81 | 0.87  0.79  0.79  0.01 | 0.84  0.77  0.74  0.03

*Weighted κ = quadratic weighted kappa; SMD = Standardized Mean Difference
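The agreement indices reported in Tables L1 and L3 (percent exact agreement, quadratic weighted kappa, standardized mean difference, and Pearson r) are standard rater-agreement statistics. The following Python sketch is an illustration only, not the operational scoring-engine code; the function names, the 0-2 score scale, and the pooled-standard-deviation form of the SMD are assumptions introduced for the example.

import numpy as np

def exact_agreement(a, b):
    # Proportion of responses on which the two score vectors agree exactly.
    a, b = np.asarray(a), np.asarray(b)
    return float(np.mean(a == b))

def quadratic_weighted_kappa(a, b, min_score=0, max_score=2):
    # Quadratic weighted kappa for two integer score vectors on the same scale.
    a = np.asarray(a, dtype=int) - min_score
    b = np.asarray(b, dtype=int) - min_score
    k = max_score - min_score + 1
    observed = np.zeros((k, k))
    for i, j in zip(a, b):                      # observed score-by-score table
        observed[i, j] += 1
    # Expected table under independence, scaled to the same total count.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()
    idx = np.arange(k)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (k - 1) ** 2  # quadratic disagreement weights
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

def standardized_mean_difference(a, b):
    # (mean(a) - mean(b)) over a pooled standard deviation; one common SMD form, assumed here.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return (a.mean() - b.mean()) / pooled_sd

# Hypothetical engine and human scores on a 0-2 rubric dimension.
human = [2, 1, 0, 2, 1, 1, 2, 0]
engine = [2, 1, 1, 2, 1, 0, 2, 0]
print(np.corrcoef(human, engine)[0, 1])             # Pearson r
print(exact_agreement(human, engine))               # proportion exact agreement
print(quadratic_weighted_kappa(human, engine))      # quadratic weighted kappa
print(standardized_mean_difference(engine, human))  # SMD (engine minus human)

Under this reading, the % Exact values in the tables are proportions of responses receiving identical scores (e.g., 0.74 corresponds to 74 percent exact agreement); the direction of subtraction and the exact pooling used for the reported SMD are not specified here, so the function above should be read as one plausible definition rather than the documented procedure.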
Table L2. Summary of Dimension Intercorrelations for Fall 2017 Writing Prompts

Each cell is the correlation between the row dimension and the column dimension, reported separately for Human1 vs. Human2 scores and for Machine vs. Final Human scores. (The Evidence row reports only its correlation with Conventions; the Purpose row reports correlations with both Conventions and Evidence.)

                                Human1 vs. Human2         Machine vs. Final Human
Grade  ITS ID   Dimension       Conventions   Evidence    Conventions   Evidence
3      31664    Evidence        0.52                      0.49
                Purpose         0.66          0.96        0.46          0.85
9      31578    Evidence        0.75                      0.81
                Purpose         0.79          0.90        0.80          0.89
9      31588    Evidence        0.72                      0.86
                Purpose         0.72          0.81        0.74          0.88
10     31662    Evidence        0.55                      0.68
                Purpose         0.56          0.87        0.67          0.87
10     31513    Evidence        0.71                      0.77
                Purpose         0.65          0.88        0.72          0.90
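The intercorrelations in Tables L2 and L4 are Pearson correlations between the scores assigned to different rubric dimensions of the same responses. As a minimal Python sketch only, assuming a simple array of dimension scores (the data layout and score scale are hypothetical), the correlations could be computed as follows.

import numpy as np

# Hypothetical rubric scores, one row per student response.
# Columns: Conventions, Evidence, Purpose.
scores = np.array([
    [2, 3, 3],
    [1, 2, 2],
    [2, 4, 4],
    [0, 1, 1],
    [1, 2, 3],
    [2, 3, 4],
], dtype=float)

dimensions = ["Conventions", "Evidence", "Purpose"]
corr = np.corrcoef(scores, rowvar=False)  # 3 x 3 Pearson correlation matrix

# Print the lower triangle, mirroring the row/column layout of Tables L2 and L4.
for i in range(1, len(dimensions)):
    for j in range(i):
        print(f"{dimensions[i]} vs. {dimensions[j]}: r = {corr[i, j]:.2f}")

Computing this matrix once from the Human1 vs. Human2 score data and once from the machine vs. final human score data would yield the two column groups shown in the tables; this is an assumed reading of the table layout rather than a documented procedure.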
Table L3. Summary of Human and Machine Scores for Spring 2018 Writing Prompts

Columns, left to right: Mean (Human, Engine); Standard Deviation (Human, Engine); Human1-Human2 Agreement (Pearson r, % Exact, Weighted κ*, SMD*); Machine-Human Agreement (Pearson r, % Exact, Weighted κ*, SMD*).

Grade 3, Item 31664
  Conventions   1.35  1.35  0.63  0.57 | 0.71  0.75  0.64  0.02 | 0.63  0.73  0.56  0.00
  Evidence      0.98  0.96  0.86  0.75 | 0.76  0.72  0.66  0.03 | 0.73  0.66  0.60  0.03
  Purpose       1.03  0.97  0.86  0.80 | 0.76  0.70  0.65  0.03 | 0.74  0.65  0.60  0.08

Grade 4, Item 31960
  Conventions   1.21  1.22  0.66  0.59 | 0.71  0.75  0.63  0.00 | 0.62  0.73  0.56  0.02
  Evidence      1.46  1.46  0.99  0.95 | 0.83  0.71  0.72  0.03 | 0.84  0.71  0.72  0.01
  Purpose       1.45  1.48  1.05  0.97 | 0.82  0.69  0.70  0.00 | 0.81  0.68  0.69  0.03

Grade 5, Item 32035
  Conventions   1.61  1.65  0.59  0.56 | 0.62  0.75  0.56  0.01 | 0.70  0.80  0.63  0.06
  Evidence      1.55  1.53  0.71  0.61 | 0.75  0.73  0.65  0.03 | 0.75  0.77  0.67  0.04
  Purpose       1.72  1.74  0.77  0.68 | 0.76  0.75  0.67  0.02 | 0.78  0.77  0.69  0.03

Grade 6, Item 31711
  Conventions   1.54  1.59  0.67  0.63 | 0.72  0.80  0.65  0.04 | 0.73  0.80  0.66  0.07
  Evidence      1.72  1.73  0.86  0.73 | 0.81  0.75  0.71  0.01 | 0.80  0.75  0.70  0.00
  Purpose       1.86  1.87  0.85  0.80 | 0.81  0.74  0.71  0.01 | 0.82  0.77  0.73  0.01

Grade 6, Item 31766
  Conventions   1.52  1.57  0.67  0.68 | 0.77  0.81  0.70  0.00 | 0.78  0.82  0.70  0.06
  Evidence      1.72  1.72  0.89  0.75 | 0.84  0.78  0.75  0.01 | 0.77  0.70  0.63  0.01
  Purpose       1.94  1.97  0.83  0.82 | 0.84  0.77  0.75  0.02 | 0.81  0.75  0.71  0.03

Grade 7, Item 31604
  Conventions   1.45  1.46  0.69  0.68 | 0.75  0.78  0.68  0.03 | 0.80  0.82  0.74  0.02
  Evidence      1.58  1.60  0.87  0.78 | 0.82  0.74  0.72  0.02 | 0.79  0.72  0.68  0.01
  Purpose       1.69  1.66  0.89  0.77 | 0.80  0.75  0.71  0.03 | 0.80  0.72  0.68  0.05

Grade 7, Item 31978
  Conventions   1.44  1.47  0.66  0.64 | 0.74  0.77  0.66  0.01 | 0.74  0.79  0.67  0.04
  Evidence      1.68  1.68  0.84  0.73 | 0.81  0.75  0.71  0.01 | 0.79  0.76  0.70  0.02
  Purpose       1.78  1.77  0.83  0.77 | 0.81  0.74  0.71  0.01 | 0.83  0.78  0.74  0.03
Grade 8, Item 32110
  Conventions   1.61  1.63  0.58  0.57 | 0.69  0.78  0.62  0.02 | 0.78  0.86  0.73  0.04
  Evidence      1.67  1.68  0.74  0.67 | 0.76  0.74  0.66  0.00 | 0.79  0.79  0.71  0.01
  Purpose       1.82  1.82  0.77  0.77 | 0.78  0.74  0.68  0.00 | 0.80  0.76  0.71  0.00

Grade 8, Item 32037
  Conventions   1.49  1.49  0.60  0.60 | 0.74  0.81  0.69  0.02 | 0.75  0.83  0.71  0.00
  Evidence      1.69  1.69  0.84  0.76 | 0.82  0.77  0.73  0.00 | 0.83  0.78  0.73  0.03
  Purpose       1.76  1.76  0.88  0.78 | 0.82  0.76  0.73  0.00 | 0.84  0.79  0.75  0.02

Grade 9, Item 31583
  Conventions   1.48  1.50  0.66  0.64 | 0.76  0.79  0.69  0.01 | 0.78  0.82  0.72  0.03
  Evidence      1.56  1.58  0.83  0.73 | 0.84  0.79  0.75  0.01 | 0.80  0.76  0.70  0.02
  Purpose       1.59  1.60  0.80  0.69 | 0.78  0.77  0.70  0.02 | 0.73  0.71  0.62  0.00

Grade 9, Item 31555
  Conventions   1.48  1.50  0.66  0.62 | 0.74  0.78  0.67  0.02 | 0.78  0.82  0.72  0.02
  Evidence      1.55  1.56  0.84  0.73 | 0.78  0.72  0.67  0.04 | 0.79  0.75  0.69  0.01
  Purpose       1.75  1.78  0.87  0.75 | 0.81  0.73  0.70  0.03 | 0.84  0.79  0.75  0.03

Grade 10, Item 31605
  Conventions   1.59  1.61  0.62  0.62 | 0.80  0.82  0.73  0.00 | 0.79  0.84  0.72  0.04
  Evidence      1.36  1.34  0.95  0.91 | 0.85  0.74  0.75  0.01 | 0.84  0.73  0.73  0.02
  Purpose       1.40  1.37  1.02  0.94 | 0.84  0.73  0.73  0.02 | 0.83  0.68  0.69  0.04

Grade 10, Item 31622
  Conventions   1.55  1.59  0.67  0.65 | 0.83  0.85  0.77  0.00 | 0.75  0.80  0.67  0.06
  Evidence      1.53  1.56  1.02  0.92 | 0.85  0.73  0.74  0.00 | 0.81  0.70  0.70  0.03
  Purpose       1.84  1.83  0.83  0.80 | 0.85  0.78  0.77  0.00 | 0.81  0.76  0.72  0.01

*Weighted κ = quadratic weighted kappa; SMD = Standardized Mean Difference
Table L4. Summary of Dimension Intercorrelations for Spring 2018 Writing Prompts

Each cell is the correlation between the row dimension and the column dimension, reported separately for Human1 vs. Human2 scores and for Machine vs. Final Human scores.

                                Human1 vs. Human2         Machine vs. Final Human
Grade  ITS ID   Dimension       Conventions   Evidence    Conventions   Evidence
3      31664    Evidence        0.54                      0.62
                Purpose         0.49          0.94        0.61          0.84
4      31960    Evidence        0.59                      0.70
                Purpose         0.64          0.92        0.73          0.88
5      32035    Evidence        0.55                      0.56
                Purpose         0.58          0.80        0.58          0.82
6      31711    Evidence        0.53                      0.61
                Purpose         0.56          0.95        0.58          0.89
6      31766    Evidence        0.58                      0.65
                Purpose         0.63          0.90        0.62          0.87
7      31604    Evidence        0.62                      0.65
                Purpose         0.59          0.89        0.64          0.92
7      31978    Evidence        0.60                      0.59
                Purpose         0.59          0.90        0.61          0.87
8      32110    Evidence        0.64                      0.64
                Purpose         0.59          0.84        0.60          0.84
8      32037    Evidence        0.64                      0.67
                Purpose         0.69          0.90        0.69          0.90
9      31583    Evidence        0.66                      0.68
                Purpose         0.58          0.71        0.66          0.85
9      31555    Evidence        0.57                      0.69
                Purpose         0.61          0.87        0.68          0.87
10     31605    Evidence        0.65                      0.70
                Purpose         0.61          0.89        0.64          0.93
10     31622    Evidence        0.66                      0.66
                Purpose         0.63          0.87        0.65          0.85