six steps for avoiding misinterpretations
INTERPRETING TEST SCORES AND NORMS
Presented To:
Dr. Muhammad Ramzan
Presented By:
Abdul Majid (RNo.02) MPhil Education
Interpreting Test Scores and Norms
• Criterion-Referenced or Standards-Based Test
• True Zero Point
(The point at which there is no height at all or no weight at all)
• Equal units provide uniform meaning
Methods of expressing test scores
METHODS OF INTERPRETING TEST SCORES
• Raw scores
• Criterion-referenced and standards-based interpretations
• Norm-referenced interpretation
• Grade norms
• Percentile rank
• Standard scores
• Profiles
• Judging the adequacy of norms
• Using local norms
Raw Scores
Numerical summary of student’s test performance
Criterion Referenced and Standard Based interpretations
o Describe individual’s test performance
o Specify levels of performance
o Percentage score used
o Measures clearly stated learning tasks
o Criterion-referenced interpretation of standardized tests
o Analyzing each student response
o Expectancy tables
Standardized Tests – What’s the Difference?
Criterion-Referenced Test
Criterion-referenced tests, also called mastery tests, compare a person's performance to a set of objectives. Anyone who meets the criterion can get a high score.
Everyone knows what the benchmarks / objectives are and can attain mastery to meet them.
It is possible for ALL the test takers to achieve 100% mastery.
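A criterion-referenced reading of a raw score can be sketched as a percentage score checked against a fixed mastery criterion. The 80% cutoff and the function name below are assumptions for illustration only; the slides do not specify a cutoff.

```python
def mastery_report(items_correct, items_total, cutoff_percent=80.0):
    """Return (percentage score, mastered?) against a fixed criterion.

    The 80% default is a hypothetical cutoff, not one given in the slides.
    """
    if items_total <= 0:
        raise ValueError("items_total must be positive")
    percent = 100.0 * items_correct / items_total
    return percent, percent >= cutoff_percent


print(mastery_report(18, 20))  # (90.0, True)
print(mastery_report(15, 20))  # (75.0, False)
```

Because each student is compared with the criterion and not with other students, every student in a group can reach mastery.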
“Adjusting” The Raw Score
We have already noted that the most immediate result from an
assessment or test is the raw score. Sometimes, before we proceed
to discuss the meaning of the score (i.e., interpret it from either a
norm or criterion perspective), the raw score is adjusted. Usually this
is done by researchers, not classroom teachers. Two special
considerations:
◦ Correction for Guessing
Only for selected-response items
Use has faded
◦ Factoring in Item Difficulty
Students get a higher “Theta” Score based on doing well on the
more difficult items of a test.
In fact, you may be looking at a test score report given in
percentiles or standard scores and not realize you are looking
at a transformation from a Theta score rather than the
traditional raw score.
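The correction for guessing mentioned above is commonly written as R − W / (k − 1), where R is the number right, W the number wrong and k the number of options per item. The slides do not give the formula, so the sketch below assumes this classic form; the function name is illustrative.

```python
def corrected_score(num_right, num_wrong, options_per_item):
    """Classic correction for guessing: R - W / (k - 1).

    Omitted items count as neither right nor wrong.
    Applies only to selected-response items with k >= 2 options.
    """
    if options_per_item < 2:
        raise ValueError("selected-response items need at least 2 options")
    return num_right - num_wrong / (options_per_item - 1)


# 40 right, 9 wrong, 4-option multiple choice: 40 - 9/3
print(corrected_score(40, 9, 4))  # 37.0
```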
Interpreting Student Performance: Norms or Criteria . . . .
Intelligent interpretation of student performance is crucial for
the use of educational assessment information. We are
building toward this with previous discussions of
◦ Building / choosing good tests
◦ Determining reliability
◦ Determining validity
So now we are set to explore some methods of interpretation.
These methods fall into two basic categories or approaches:
◦ Norm-referenced
Compare this student with others.
◦ Criterion-referenced
Compare this student with some judgment regarding
expected performance level irrespective of others.
Standardized Tests – What’s the Difference?
Norm-Referenced Test
Norm-referenced tests compare an individual's performance with the performance of others. They are designed to spread scores out, typically along a normal curve, with 50% of test takers scoring above the 50th percentile and 50% scoring below it. By construction, half the test takers MUST fall below the median, however well the group as a whole performs.
Test makers favor questions that separate stronger from weaker test takers, not questions that nearly everyone answers correctly.
If too many people get a question correct, or too many score well, questions are revised or “thrown out” until the scores again approximate a normal curve.
Norm Referenced Interpretation
How an individual compares with other persons
Rank the scores from highest to lowest
Derived scores
o Obtain a general framework for norm
referenced interpretation
o Raw scores converted to derived
scores
o Numerical report of test performance
Interpreting Test Scores (some definitions)
Raw score. This is the number of items the student answered correctly. It is used to calculate the other, more useful scores.
Stanine. One of nine bands of the normal curve, numbered 1 to 9; the middle bands are each half a standard deviation wide. Stanines can be easily averaged and compared from test to test, but are less precise than other scores.
Normal curve equivalent (NCE). For these scores, the normal curve is divided into equal units ranging from 1 to 99, with an average of 50. These can be averaged and compared from test to test or year to year.
Norms and Their Types
o Based on students' performance
a) Grade Equivalents or Grade Norms
b) Percentile Ranks
c) Standard Scores
• The first two indicate an individual's relative standing within a group
• Standard scores include T-scores, normal curve equivalents and standard age scores
• These differ only in numerical values
Six Misinterpretations (Assumptions about Grade Equivalents)
1) Assuming norms are standards
2) Assuming grade equivalents indicate the appropriate grade placement
3) Assuming all students are expected to gain one grade equivalent each year
4) Assuming units are equal across the scale
5) Assuming grade equivalents from different tests are comparable
6) Assuming extreme scores, which are based on extrapolation, are dependable
Six Steps For Avoiding Misinterpretations
1) Don't confuse norms with standards of what should be.
2) Don't interpret a grade equivalent as an estimate of the grade where a student should be placed.
3) Don't expect all students to gain 1.0 grade equivalent each year.
4) Don't assume that the units are equal at different parts of the scale.
5) Don't assume that scores on different tests are comparable.
6) Don't interpret extreme scores as dependable estimates of a student's performance level.
Percentile Rank: at or below . . .
Percentiles and Percentile Rank
◦ Definition: % of cases “at or below”
◦ As we noted earlier, the two terms differ conceptually; in practice, however, they are often used interchangeably.
◦ Strengths:
Easy to describe
Easy to compute
◦ Weaknesses
Confusion with a “percentage-right score”
Inequality of units [see next slide]
Percentile Rank
Widely used and easily understood.
Indicates a student's relative position in a group.
Interpreted as the percentage of individuals scoring at or below a given score.
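The “at or below” definition can be sketched directly. The class scores are made up for illustration. Some textbooks instead count the cases below the score plus half of the cases at the score, which this sketch does not do.

```python
def percentile_rank(scores, score):
    """Percentage of cases in `scores` that fall at or below `score`."""
    if not scores:
        raise ValueError("scores must not be empty")
    at_or_below = sum(1 for s in scores if s <= score)
    return 100.0 * at_or_below / len(scores)


# Hypothetical raw scores for a class of ten students.
class_scores = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]

# Six of the ten scores are at or below 22.
print(percentile_rank(class_scores, 22))  # 60.0
```

A percentile rank of 60 says nothing about the percentage of items answered correctly, which is the confusion listed under Weaknesses.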
Standard scores
Show how far the raw score is above or below average.
Expressed in standard deviation units from the mean.
SD is a measure of the spread of scores.
Normal Curve & Standard Deviation Unit
Symmetrical and bell shaped.
The curve contains fixed percentages of cases within each standard deviation interval.
Types of standard scores
Z-score
Z-score = (X – M) / SD
X = any raw score.
M = Arithmetic Mean.
SD = Standard deviation of raw scores.
Z-score is always negative when raw score is smaller than the mean.
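A minimal sketch of the z-score calculation, with the subtraction done before the division. The mean of 50 and SD of 10 are made-up values for illustration.

```python
def z_score(x, m, sd):
    """z = (X - M) / SD: distance of a raw score from the mean in SD units."""
    if sd <= 0:
        raise ValueError("SD must be positive")
    return (x - m) / sd


# Hypothetical test with M = 50 and SD = 10.
print(z_score(65, 50, 10))  # 1.5, above the mean
print(z_score(40, 50, 10))  # -1.0, below the mean, so negative
```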
Pros & Cons of Standard Scores
Strengths
◦ Wide applicability
◦ Nice statistical properties
◦ Teachers often build their narrative reports on these standard scores using the “accepted descriptive words” rather than the numbers.
Weaknesses
◦ May be hard to explain to laypersons
◦ Need to know M and SD of original test
More Standard Scores of Interest . . . .
◦ T-scores, SATs, GREs
◦ NCEs (Normal Curve Equivalent)
Recall that the percentile rank scale is not an equal-interval scale; NCEs solve this problem by converting percentile ranks to an equal-interval scale. NCEs range from 1 to 99 with a mean of 50. The major advantage of NCEs over percentile ranks is that NCEs can be averaged.
Used almost exclusively for federal reporting requirements for achievement testing.
◦ Stanines
Widely used in schools so we will look at them in more detail in the next slide.
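The conversion from percentile rank to NCE can be sketched as follows. It assumes the usual definition NCE = 50 + 21.06 × z, where z is the normal-curve z-score for the percentile rank; the constant 21.06 is not given in the slides, and the function name is illustrative. `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    # z-score below which the given proportion of a normal curve falls
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return 50 + 21.06 * z


# NCEs match percentile ranks at 1, 50 and 99 but differ in between.
for pr in (1, 25, 50, 75, 99):
    print(pr, round(percentile_to_nce(pr), 1))
```

Because the resulting scale has equal intervals, NCEs can be averaged, whereas averaging the percentile ranks themselves would distort the result.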
T-Score
A type of standard score with a mean of 50 and a standard deviation of 10.
T-score = 50 + 10 (z)
Normalized standard scores
can be calculated by
1. Converting the distribution of raw scores into percentile ranks.
2. Looking up the z-score that corresponds to each percentile rank in a normal curve table.
3. Converting the z-scores to T-scores.
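The three steps can be sketched as below. One assumption is added: the percentile rank is computed as the cases below a score plus half the cases at that score, so that the top score does not land at exactly 100% (which has no finite z-score). `NormalDist().inv_cdf` stands in for the normal curve table; the function name is illustrative.

```python
from statistics import NormalDist


def normalized_t_scores(scores):
    """Map each distinct raw score to a normalized T-score."""
    n = len(scores)
    result = {}
    for x in set(scores):
        below = sum(1 for s in scores if s < x)
        at = sum(1 for s in scores if s == x)
        # Step 1: percentile rank as a proportion (midpoint convention, assumed)
        proportion = (below + 0.5 * at) / n
        # Step 2: z-score corresponding to that proportion of the normal curve
        z = NormalDist().inv_cdf(proportion)
        # Step 3: T-score = 50 + 10 (z)
        result[x] = 50 + 10 * z
    return result


# Hypothetical raw scores; the middle score lands at a T-score of about 50.
t = normalized_t_scores([10, 20, 30])
print({raw: round(value, 1) for raw, value in sorted(t.items())})
```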
Stanines
Pronounced (stay-nines)
Single digit scores.
Strengths of stanine scores.
1. Nine-point scale: 9 is high, 1 is low and 5 is average.
2. Possible to compare a student’s performance on different tests.
3. Easy to combine diverse types of data (test scores, rating, ranked data)
4. Uses single digit score, easily recorded and takes less space.
Limitation
1. Growth can’t be shown from 1 year to the next.
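Stanines are conventionally assigned so that the nine bands hold 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7% and 4% of cases. The slides do not list these percentages, so the sketch below assumes that standard convention; handling of scores falling exactly on a boundary varies between publishers.

```python
from bisect import bisect_left

# Cumulative upper percentile bounds for stanines 1 through 8
# (4, 4+7, 4+7+12, ...); anything above 96 falls in stanine 9.
STANINE_UPPER_BOUNDS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Return the stanine (1 to 9) for a percentile rank from 0 to 100."""
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile rank must be between 0 and 100")
    return bisect_left(STANINE_UPPER_BOUNDS, percentile_rank) + 1


print(stanine_from_percentile(2))   # 1
print(stanine_from_percentile(50))  # 5
print(stanine_from_percentile(85))  # 7
print(stanine_from_percentile(99))  # 9
```

Because stanine 5 alone covers percentile ranks 41 through 60, two fairly different performances can share a stanine, which is the loss of precision noted earlier.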
Judging the adequacy of norms
The main purpose of norms is to make it possible to interpret students' test performance.
Qualities in test norms.
1. Test norms should be relevant.
2. Test norms should be representative.
3. Test norms should be up to date.
4. Test norms should be comparable.
5. Test norms should be adequately described.
Using local norms
Compare students with local norms.
Useful when local students differ from the published norm group in aptitude, educational experience or cultural background.
Prepared using percentile ranks or stanines.
Cautions in interpreting test scores
A test score should be interpreted in terms of the specific test from which it was derived.
A test score should be interpreted in light of all of the student's relevant characteristics.
A test score should be interpreted according to the type of decision to be made.
A test score should be interpreted as a band of scores rather than as a specific value.
A test score should be verified by supplementary evidence.
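One common way to report a score as a band is to add and subtract the standard error of measurement, SEM = SD × √(1 − reliability). The slides do not state this formula, so the sketch below is an assumption about how the band might be built; the numbers are made up.

```python
import math


def score_band(score, sd, reliability, n_sem=1.0):
    """Return (low, high) as score -/+ n_sem standard errors of measurement."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    sem = sd * math.sqrt(1.0 - reliability)
    return score - n_sem * sem, score + n_sem * sem


# Hypothetical test: SD = 15, reliability = 0.91, so SEM is about 4.5.
low, high = score_band(100, 15, 0.91)
print(round(low, 1), round(high, 1))  # roughly 95.5 and 104.5
```

The lower the reliability, the wider the band, so small differences between two students' scores should not be over-interpreted.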