error organizer

8/14/2019 Error Organizer

1/24

The STANDARD ERROR OF MEASUREMENT (which we will discuss in detail shortly) ESTIMATES the RANGE of scores within which an individual's true score or true level of ability lies.

If Student A gets a 75 on a test, we can only hope that A's TRUE SCORE (her actual level of ability) is somewhere around 75.

The closer the reliability of the test is to perfect (r = 1.00), the more

likely it is that the true score is very close to 75.


2/24

ERROR

If your obtained scores do not always reflect your true ability (if they underestimate or overestimate it), then they are associated with some error.

In other words, your OBTAINED SCORE has a TRUE SCORE

component (actual level of ability, skill, knowledge), and an

ERROR component (which acts to raise or lower the obtained

score).

Obtained score = True score +/- error score


3/24

The Standard Error of Measurement (Sm)

The standard error of measurement is the STANDARD

DEVIATION of the ERROR scores of a test.

Although we can never know the error scores, we can ESTIMATE the standard error of measurement by using the following formula:

$S_m = SD \sqrt{1 - r}$

where SD is the standard deviation of the test scores and r is the reliability of the test.
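As a minimal sketch of the formula above, the following Python function computes the SEM from a test's standard deviation and reliability. The function name `standard_error_of_measurement`, its parameter names, and the sample values (SD = 10, r = 0.91) are illustrative assumptions, not part of the slides.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Estimate the SEM as SD * sqrt(1 - r)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Illustrative values (assumed for this example): SD = 10, r = 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
print(round(sem, 2))  # roughly 3 score points
```

With these assumed values, a test whose scores have a standard deviation of 10 points and a reliability of .91 has an SEM of about 3 points.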


4/24

Using the Standard Error of Measurement

The distribution of error scores approximates the normal

distribution.

We can extend this information to construct a band around any obtained score to identify the range of scores that, at a certain level of confidence, will capture or span an individual's true score.


5/24

The SEM can be used to provide the following:

To estimate the value of a person's true score. In other words, we can use it to predict what would happen if a person took additional equivalent tests.

68% of the scores would fall within ±1 SEM of the true score.

95% of the scores would fall within ±2 SEM of the true score.

99.7% of the scores would fall within ±3 SEM of the true score.


6/24

Thus, if a person achieved a score of 80 on a math test, and the SEM for that test was 5, then we could state the following:

68% of the scores would fall between ____ and ____

95% of the scores would fall between ____ and ____

99.7% of the scores would fall between ____ and ____
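The blanks can be filled in by adding and subtracting one, two, and three SEMs from the score of 80. The following is a minimal sketch of that arithmetic; the helper name `score_bands` is hypothetical and chosen only for this example.

```python
def score_bands(center: float, sem: float) -> dict[str, tuple[float, float]]:
    """Return the +/-1, +/-2 and +/-3 SEM bands around a score."""
    levels = {"68%": 1, "95%": 2, "99.7%": 3}
    return {label: (center - k * sem, center + k * sem)
            for label, k in levels.items()}


# Values from the slide: score of 80, SEM of 5.
bands = score_bands(80, 5)
for label, (low, high) in bands.items():
    print(f"{label}: {low} to {high}")
```

For a score of 80 and an SEM of 5, this gives 75 to 85 (68%), 70 to 90 (95%), and 65 to 95 (99.7%).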


7/24

When applied to the prediction of future test performance, these ranges are known as CONFIDENCE INTERVALS.

That is, we can be:

68% confident that the scores would fall within ±1 SEM of the true score.

95% confident that the scores would fall within ±2 SEM of the true score.

99.7% confident that the scores would fall within ±3 SEM of the true score.


8/24

Confidence Intervals

Finally, the SEM can be used to determine whether a score is significantly different from a particular criterion such as a cutoff score.

If a person received a score of 105 on the WAIS, which has an SD of 15, a reliability of .97, and an SEM of 2.5, how confident can we be that repeated testing would not place this person in the gifted range (130 or above)?

68% confident that the true score lies between ____ and ____

95% confident that the true score lies between ____ and ____

99.7% confident that the true score lies between ____ and ____
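One way to work this exercise is to build the three bands around 105 and check whether any of them reaches the cutoff of 130. The sketch below uses the values given on the slide; the helper `sems_from_cutoff` is a hypothetical name introduced for illustration.

```python
def sems_from_cutoff(obtained: float, cutoff: float, sem: float) -> float:
    """Distance between an obtained score and a cutoff, in SEM units."""
    return abs(cutoff - obtained) / sem


# Values from the slide: WAIS score 105, SEM 2.5, gifted cutoff 130.
obtained, sem, cutoff = 105, 2.5, 130

for label, k in (("68%", 1), ("95%", 2), ("99.7%", 3)):
    low, high = obtained - k * sem, obtained + k * sem
    print(f"{label}: {low} to {high}; reaches cutoff: {high >= cutoff}")

print(sems_from_cutoff(obtained, cutoff, sem))
```

The bands are 102.5 to 107.5, 100 to 110, and 97.5 to 112.5. Even the widest band stops well short of 130, which lies 10 SEMs above the obtained score. Note that SD × sqrt(1 − r) with SD = 15 and r = .97 gives about 2.6, so the stated SEM of 2.5 is a rounded value.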


9/24

Confidence Intervals

In conclusion, the SEM is a statistic that estimates for us just how fallible, or error-prone, tests are.


10/24

Confidence Intervals

In education, we have long had a tendency to

OVERINTERPRET small differences in test scores since we

too often consider obtained scores to be completely accurate.


11/24

Reliability and the SEM

If a test is perfectly reliable (r = 1.00), then a student will always get exactly the same score; there will be no error, and the SEM will be 0.

If the test is not reliable, the SEM will be almost as big as the SD.

Remember:

The SD is the variability of raw scores; the SEM is the variability of

error scores.
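Both extremes follow from the SEM formula, $S_m = SD\sqrt{1 - r}$. As a short worked check (the value r = .10 is an assumed example of a very unreliable test):

$$
\begin{aligned}
r = 1.00 &: \quad S_m = SD\sqrt{1 - 1.00} = 0 \\
r = .10 &: \quad S_m = SD\sqrt{1 - .10} \approx 0.95\,SD \\
r = 0 &: \quad S_m = SD\sqrt{1 - 0} = SD
\end{aligned}
$$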


12/24

Sources of Error

Error Within Test-Takers (Intra-Individual Error)

These include any within-student factors that would result

in obtained scores being lower or higher than true scores.

Error Within the Test

This error is within-test and can include: trick questions; reading level too high; ambiguous questions; grammatical cues in the items; items too easy or too difficult; and poorly written items.


13/24

Error in Test Administration

This error includes the following:

Physical comfort

Instructions & explanations - Different test administrators provide different amounts of instruction and explanation to test takers.

Test administrator attitudes - Administrators differ in the notions they convey about the importance of the test, the extent to which they are emotionally supportive of students, and the way in which they monitor the test.


14/24

Error in Test Administration

Error in Scoring

Computer scoring has decreased this source of error.

But teachers and administrators can still make mistakes on answer keys; students may not use #2 pencils or may make stray marks; and hand scoring can lead to error.


15/24

Sources of Error Influencing Various Reliability

Coefficients

Test-Retest Reliability

If test-retest coefficients are determined over a short time, the effects

of within-student error should be small.

What about sources of:

within-test error?

error in administration?

error in scoring?


16/24

Sources of Error

Alternate-forms reliability

Since this form of reliability is determined by administering two

different forms of the test to the same group close together in

time, the effects of within-student error should be small.


17/24

Sources of Error

Internal consistency

With this type of reliability, neither within-student nor within-test sources of error will exert an influence, since only one test is given one time. The same goes for administration and scoring error.


18/24

Band Interpretation

John's scores on an end-of-year achievement test

Sub-test          Score
Reading           103
Listening         104
Writing           105
Social Studies     98
Science           100
Math               91


19/24

Band Interpretation

How large a difference do we need between test scores to conclude that the differences represent real and not chance differences?

We can use the SEM to answer this question, using a technique called BAND INTERPRETATION.

1. First, determine the SEM for each sub-test.

2. Add and subtract the SEM for each sub-test score.


20/24

Band Interpretation

3. Graph each scale - Shade in the bands to represent the range of scores that has a 68% (or 95%) chance of capturing John's true score.

4. Interpret the bands - Interpret the profile of bands by visually inspecting the bars to see which bands overlap and which do not.
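The four steps can be sketched in Python as follows. The slides do not state the SEM for John's sub-tests, so the value `SEM = 3` is an assumption made purely for illustration, as are the function names `bands` and `real_differences`. Bands that only touch at an endpoint are treated here as overlapping, which is also an assumption.

```python
from itertools import combinations

# John's obtained scores, taken from the slides.
scores = {
    "Reading": 103,
    "Listening": 104,
    "Writing": 105,
    "Social Studies": 98,
    "Science": 100,
    "Math": 91,
}

SEM = 3  # ASSUMPTION: hypothetical SEM; the source does not give one.


def bands(scores, sem, k=1):
    """Step 2: add and subtract k SEMs (k=1 for 68%, k=2 for 95%)."""
    return {name: (s - k * sem, s + k * sem) for name, s in scores.items()}


def real_differences(band_map):
    """Step 4: list the pairs of sub-tests whose bands do not overlap."""
    pairs = []
    for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in combinations(band_map.items(), 2):
        # Bands that share an endpoint are counted as overlapping.
        if a_hi < b_lo or b_hi < a_lo:
            pairs.append((a, b))
    return pairs


# Step 3, as a text stand-in for the shaded graph: print each 68% band.
for name, (lo, hi) in bands(scores, SEM).items():
    print(f"{name:15s} {lo:>3} - {hi:>3}")

print(real_differences(bands(scores, SEM, k=1)))  # 68% level
print(real_differences(bands(scores, SEM, k=2)))  # 95% level
```

Under the assumed SEM of 3, the 68% bands separate Math from every other sub-test and Social Studies from Writing, while the 95% bands separate Math only from Listening and Writing.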


21/24

Band Interpretation

Using the 68% band, those bands that overlap probably represent differences that occurred by chance.

In John's case, the differences between Math and the other sub-tests, and between Social Studies and Writing, represent real differences (because there is no overlap).


22/24

Band Interpretation

What happens if we take a more conservative approach by

using the 95% band?

Since the bands are larger with the 95% approach, the only real differences we find at the 95% level are between John's math achievement and his achievement in listening and writing.


23/24

Band Interpretation

All the other bands overlap, suggesting that at the 95% level the

differences in obtained scores are due to chance.

If we employ the more conservative approach, we would conclude that even though the difference between John's obtained reading and math scores is 12 points (103 - 91 = 12), the difference is due to chance, not to a real difference in achievement.
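To see how a 12-point gap can still fall within overlapping bands, here is a worked example that assumes an SEM of 3 points for both sub-tests (the slides do not state the SEM; this value is hypothetical). At the 95% level each band extends 2 SEMs in either direction:

$$
\begin{aligned}
\text{Reading:} \quad & 103 \pm 2(3) = [97,\ 109] \\
\text{Math:} \quad & 91 \pm 2(3) = [85,\ 97]
\end{aligned}
$$

Under that assumption the two bands meet at 97, so they overlap (barely), and the difference is treated as chance.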


24/24

Band Interpretation

To make it simpler, let differences at the 68% level be a signal to

you. Let differences at the 95% level be a signal to the school and

to parents.

Also, use the 95% approach when determining real differences between a student's potential for achievement (aptitude) and actual achievement.
