error organizer
TRANSCRIPT
-
8/14/2019 Error Organizer
The STANDARD ERROR OF MEASUREMENT (which we will
discuss in detail shortly) ESTIMATES the RANGE of scores
within which an individual's true score, or true level of ability, lies.
If Student A gets a 75 on a test, we can only hope that A's TRUE SCORE, her actual level of ability, is somewhere around 75.
The closer the reliability of the test is to perfect (r = 1.00), the more
likely it is that the true score is very close to 75.
-
ERROR
If your obtained scores do not always reflect your true ability (if
they underestimate or overestimate your true ability), then they
are associated with some error.
In other words, your OBTAINED SCORE has a TRUE SCORE
component (actual level of ability, skill, knowledge), and an
ERROR component (which acts to raise or lower the obtained
score).
Obtained score = True score +/- error score
-
The Standard Error of Measurement (Sm)
The standard error of measurement is the STANDARD
DEVIATION of the ERROR scores of a test.
Although we can never know the error scores, we can ESTIMATE
the standard error of measurement by using the following formula:
$S_m = SD\sqrt{1 - r}$
where SD is the standard deviation of the test scores and r is the reliability of the test.
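The formula (SD times the square root of 1 minus r) can be sketched in a few lines of Python. The function name and the sample values (SD = 10, r = .91) are illustrative assumptions, not taken from any real test.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Estimate the SEM as SD * sqrt(1 - r)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)

# Assumed illustrative values: SD = 10, r = 0.91
sem = standard_error_of_measurement(10.0, 0.91)
print(round(sem, 2))  # 3.0
```

With these assumed values, a test with a 10-point SD and a reliability of .91 would have an SEM of about 3 points.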
-
Using the Standard Error of Measurement
The distribution of error scores approximates the normal
distribution.
We can extend this information to construct a band around any obtained score to identify the range of scores that, at a certain level
of confidence, will capture or span an individual's true score.
-
The SEM can be used to estimate the value of a person's true score. In other words, we can use it to predict what would happen if a person
took additional equivalent tests:
68% of the scores would fall within plus or minus 1 SEM of the true score.
95% of the scores would fall within plus or minus 2 SEM of the true score.
99.7% of the scores would fall within plus or minus 3 SEM of the true score.
-
Thus, if a person achieved a score of 80 on a math test, and the SEM for that test was 5, then we could state the following:
68% of the scores would fall between ____ and ____
95% of the scores would fall between ____ and ____
99.7% of the scores would fall between ____ and ____
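One way to check your answers to this exercise is to compute the bands directly. The sketch below uses the score of 80 and SEM of 5 from the slide; the function name `score_bands` is a hypothetical helper, not part of the original material.

```python
def score_bands(score, sem):
    """Return bands of plus or minus 1, 2 and 3 SEM around a score."""
    levels = {"68%": 1, "95%": 2, "99.7%": 3}
    return {label: (score - k * sem, score + k * sem) for label, k in levels.items()}

for label, (low, high) in score_bands(80, 5).items():
    print(f"{label}: {low} to {high}")
# 68%: 75 to 85
# 95%: 70 to 90
# 99.7%: 65 to 95
```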
-
When applied to the prediction of future test performance,
these ranges are known as CONFIDENCE INTERVALS.
That is, we can be:
68% confident that the scores would fall within plus or minus 1 SEM of the true score.
95% confident that the scores would fall within plus or minus 2 SEM of the true score.
99.7% confident that the scores would fall within plus or minus 3 SEM of the true score.
-
Confidence Intervals
Finally, the SEM can be used to determine whether a score is significantly different from a particular criterion such as a cutoff score.
If a person received a score of 105 on the WAIS, which has an SD of 15, a reliability of .97, and an SEM of 2.5, how confident can we be that repeated testing would not place this person in the gifted range (130 or above)?
68% confident that the true score lies between ____ and ____
95% confident that the true score lies between ____ and ____
99.7% confident that the true score lies between ____ and ____
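The WAIS question can be worked through numerically. This is a minimal sketch using only the figures given on the slide (score 105, SEM 2.5, cutoff 130); the variable names are illustrative assumptions.

```python
SCORE = 105
SEM = 2.5
GIFTED_CUTOFF = 130

# Upper and lower limits of the 1-, 2- and 3-SEM bands around the obtained score.
intervals = {
    label: (SCORE - k * SEM, SCORE + k * SEM)
    for label, k in (("68%", 1), ("95%", 2), ("99.7%", 3))
}
for label, (low, high) in intervals.items():
    print(f"{label}: {low} to {high} (reaches cutoff: {high >= GIFTED_CUTOFF})")

# How many SEMs separate the obtained score from the cutoff?
sems_to_cutoff = (GIFTED_CUTOFF - SCORE) / SEM
print(sems_to_cutoff)  # 10.0
```

Even the widest band tops out at 112.5, and the cutoff sits 10 SEMs above the obtained score, so repeated testing is very unlikely to place this person in the gifted range.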
-
Confidence Intervals
In conclusion, the SEM is a statistic that estimates for us just
how fallible, or error-prone, tests are.
-
Confidence Interval
In education, we have long had a tendency to
OVERINTERPRET small differences in test scores since we
too often consider obtained scores to be completely accurate.
-
Reliability and the SEM
If a test is perfectly reliable (r = 1.00), then a student will always get
exactly the same score; there will be no error and the SEM will be 0.
If the test has very low reliability (r close to 0), the SEM will be almost as big as the SD.
Remember:
The SD is the variability of raw scores; the SEM is the variability of
error scores.
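The relationship between reliability and the SEM can be illustrated by holding the SD fixed and varying r. The sketch below assumes the estimate SEM = SD times the square root of (1 - r) and an SD of 15 (the WAIS value quoted earlier); the other reliability values are arbitrary illustrations.

```python
import math

def sem_for(sd, reliability):
    """SEM estimate: SD * sqrt(1 - r)."""
    return sd * math.sqrt(1.0 - reliability)

SD = 15.0  # assumed SD, matching the WAIS example
for r in (1.00, 0.97, 0.75, 0.50, 0.10, 0.00):
    print(f"r = {r:.2f}  SEM = {sem_for(SD, r):.2f}")
```

At r = 1.00 the SEM is 0, and as r drops toward 0 the SEM climbs toward the full SD of 15.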
-
Sources of Error
Error Within Test-Takers (Intra-Individual Error)
These include any within-student factors that would result
in obtained scores being lower or higher than true scores.
Error Within the Test
This error is within-test and can include: trick questions;
reading level too high; ambiguous questions; grammatical cues in the items; items too easy or too difficult; and poorly written items.
-
Error in Test Administration
This error includes the following:
Physical comfort
Instructions & explanations - Different test
administrators provide different amounts of instruction and explanation to test takers.
Test administrator attitudes - Administrators differ in
the notions they convey about the importance of the
test, the extent to which they are emotionally supportive of students, and the way in which they monitor the test.
-
Error in Test Administration
Error in Scoring
Computer scoring has decreased this source of error.
But teachers and administrators can still make mistakes on answer keys; students don't use #2 pencils or make stray marks;
and hand scoring can lead to error.
-
Sources of Error Influencing Various Reliability
Coefficients
Test-Retest Reliability
If test-retest coefficients are determined over a short time, the effects
of within-student error should be small.
What about sources of:
within-test error?
error in administration?
error in scoring?
-
Sources of Error
Alternate-forms reliability
Since this form of reliability is determined by administering two
different forms of the test to the same group close together in
time, the effects of within-student error should be small.
-
Sources of Error
Internal consistency
With this type of reliability, neither within-student nor within-test
sources of error will exert an influence, since only 1 test is given one time.
The same goes for administration and scoring error.
-
Band Interpretation
John's scores on end-of-year achievement test
Sub-test          Score
Reading           103
Listening         104
Writing           105
Social Studies    98
Science           100
Math              91
-
Band Interpretation
How large a difference do we need between test scores to conclude that the differences represent real and not chance
differences?
We can use the SEM to answer these questions, using a technique
called BAND INTERPRETATION.
1. First, determine the SEM for each sub-test.
2. Add the SEM to, and subtract it from, each sub-test score to get the band for that sub-test.
-
Band Interpretation
3. Graph each scale - Shade in the bands to represent the
range of scores that has a 68% (or 95%) chance of capturing
John's true score.
4. Interpret the bands - Interpret the profile of bands by
visually inspecting the bars to see which bands overlap and
which do not.
-
Band Interpretation
Using the 68% band: those bands that overlap probably
represent differences that occurred by chance.
In John's case, the differences between Math and the
other sub-tests, and between Social Studies and Writing, represent
real differences (because there is no overlap).
-
Band Interpretation
What happens if we take a more conservative approach by
using the 95% band?
Since the bands are larger with the 95% approach, the only real
differences we find at the 95% level are between John's math
achievement and his achievement in listening and writing.
-
Band Interpretation
All the other bands overlap, suggesting that at the 95% level the
differences in obtained scores are due to chance.
If we employ the more conservative approach, we would conclude that even though the difference between John's obtained reading and
math scores is 12 points (103-91= 12), the difference is due to
chance, not to a real difference in achievement.
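The band comparison can also be done without graphing. The sketch below uses John's six sub-test scores from the slides. The slides never state the SEM, so the value of 3 for every sub-test is an assumption; it was chosen because it reproduces the conclusions stated above when bands that merely touch at an endpoint are counted as overlapping. The function names are hypothetical helpers.

```python
from itertools import combinations

# John's sub-test scores from the slides.
SCORES = {
    "Reading": 103, "Listening": 104, "Writing": 105,
    "Social Studies": 98, "Science": 100, "Math": 91,
}
ASSUMED_SEM = 3  # assumption: the slides do not give the SEM

def band(score, sem, n_sems):
    """Band of plus or minus n_sems SEMs around a score."""
    return (score - n_sems * sem, score + n_sems * sem)

def real_differences(scores, sem, n_sems):
    """Return pairs of sub-tests whose bands do not overlap."""
    bands = {name: band(s, sem, n_sems) for name, s in scores.items()}
    pairs = []
    for a, b in combinations(bands, 2):
        (a_low, a_high), (b_low, b_high) = bands[a], bands[b]
        # Bands that share an endpoint are treated as overlapping.
        if a_high < b_low or b_high < a_low:
            pairs.append((a, b))
    return pairs

print(real_differences(SCORES, ASSUMED_SEM, 1))  # 68% bands (1 SEM)
print(real_differences(SCORES, ASSUMED_SEM, 2))  # 95% bands (2 SEM)
```

Under this assumed SEM, the 68% bands separate Math from every other sub-test and Social Studies from Writing, while the 95% bands separate Math only from Listening and Writing. Reading (97 to 109) and Math (85 to 97) touch at 97, which is why the 12-point difference is attributed to chance at the 95% level.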
-
Band Interpretation
To make it simpler, let differences at the 68% level be a signal to
you. Let differences at the 95% level be a signal to the school and
to parents.
Also, use the 95% approach when determining real differences
between a student's potential for achievement (aptitude) and
actual achievement.