Reliability for Testing and Assessment

Reliability
Presenter: Erlwinmer Reyes Mangmang


Posted on 23-Jan-2017


TRANSCRIPT

Reliability

Presenter: Erlwinmer Reyes Mangmang

Outline

THE CONCEPT OF RELIABILITY
- Sources of Error Variance

RELIABILITY ESTIMATES
- Test-Retest Reliability Estimates
- Parallel-Forms & Alternate-Forms Reliability Estimates
- Split-Half Reliability Estimates
- Other Methods of Estimating Internal Consistency
- Measures of Inter-Scorer Reliability

Outline (continued)

USING & INTERPRETING A COEFFICIENT OF RELIABILITY
- The Purpose of the Reliability Coefficient
- The Nature of the Test
- The True Score Model of Measurement & Alternatives to It

Definition of Reliability

- Layman

- Psychometrics

- Reliability coefficient

Simple analogy

X = T + E

where X represents an observed score, T represents the true score, and E represents error.
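The true score model above can be sketched in a few lines of code. The true score of 100 and the error standard deviation below are hypothetical values chosen only to illustrate the point: random error averages out over many measurements, so the mean observed score converges on the true score.

```python
import random

random.seed(0)

def observed_score(true_score, error_sd=2.0):
    # Observed score X = true score T + random error E,
    # with E drawn from a normal distribution centered on zero.
    return true_score + random.gauss(0, error_sd)

# Over many repeated administrations, random error cancels out,
# so the mean of the observed scores approaches the true score.
scores = [observed_score(100) for _ in range(10_000)]
mean_x = sum(scores) / len(scores)
```

Systematic error, by contrast, would shift every observation in the same direction and would not cancel out this way.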

Categories of error

• Random error

• Systematic error

Sources of Error Variance

• Test Construction
• Test Administration
• Scoring
• Interpretation

Test Construction

ITEM SAMPLING or CONTENT SAMPLING

Test Administration

• Test Environment
• Test-taker Variables
• Examiner-related Variables

Test Scoring and Interpretation

• Computer Scoring

• Subjective tests

• Assessment purposes

Other Sources of Error

• Methodological error

“Reliability is not the ultimate fact in the book of the recording angel”

-Stanley (1971)

Reliability Estimates

Test-Retest Reliability Estimates

• Using the same instrument to measure the same thing at two points in time
• Coefficient of stability
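The coefficient of stability is simply the correlation between the two administrations. A minimal sketch, using hypothetical scores for five test takers and a hand-rolled Pearson correlation:

```python
# Test-retest reliability: correlate scores from the same test
# given to the same group at two points in time.
# Hypothetical scores for five test takers (time 1 vs. time 2).
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]

def pearson_r(x, y):
    # Pearson correlation: covariance divided by the
    # product of the standard deviations.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

coefficient_of_stability = pearson_r(time1, time2)
```

With these made-up scores the correlation comes out high, as expected when the trait being measured is stable between administrations.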

Parallel-Forms & Alternate-Forms Reliability Estimates

Parallel forms (of a test): for each form of the test, the means and the variances of observed test scores are equal.

Alternate forms (of a test): simply different versions of a test that have been constructed so as to be parallel.

Example: variables such as CONTENT & LEVEL OF DIFFICULTY

Split-Half Reliability Estimates

• Obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
• SPEARMAN-BROWN FORMULA
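Because each half contains only half the items, the half-test correlation underestimates the reliability of the full test; the Spearman-Brown formula steps it up. A short sketch, with a hypothetical odd-even half correlation of .70:

```python
# Spearman-Brown: r_full = n * r / (1 + (n - 1) * r), where n is
# the factor by which the test is lengthened. n = 2 recovers the
# classic split-half correction r_full = 2r / (1 + r).
def spearman_brown(half_test_r, n=2):
    return n * half_test_r / (1 + (n - 1) * half_test_r)

r_halves = 0.70                        # hypothetical odd-even correlation
full_length = spearman_brown(r_halves)  # roughly .82
```

The same formula can also estimate how reliability would change if a test were lengthened or shortened by any factor n.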

Other Methods of Estimating Internal Consistency

• KUDER-RICHARDSON FORMULAS
• CRONBACH ALPHA
• AVERAGE PROPORTIONAL DISTANCE (APD)
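Coefficient alpha can be computed directly from item-level data: alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). A minimal sketch with hypothetical dichotomous (0/1) responses, for which alpha reduces to KR-20:

```python
def variance(xs):
    # Population variance of a list of scores.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    # items: one list of scores per item, each over the same test takers.
    k = len(items)
    total_scores = [sum(per_person) for per_person in zip(*items)]
    sum_item_var = sum(variance(it) for it in items)
    return (k / (k - 1)) * (1 - sum_item_var / variance(total_scores))

# Hypothetical 0/1 responses: 3 items scored for 4 test takers.
items = [[1, 1, 0, 1],
         [1, 0, 0, 1],
         [1, 1, 0, 0]]
alpha = cronbach_alpha(items)
```

In effect, alpha is the mean of all possible split-half coefficients, which is why it is the most widely reported internal-consistency estimate.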

Measures of Inter-Scorer Reliability

• The degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure

USING AND INTERPRETING A COEFFICIENT OF RELIABILITY

Guide question…

How high should the coefficient of reliability be?

Answer:

1. "On a continuum relative to the purpose and importance of the decisions to be made on the basis of scores on the test"

2. As a rough letter-grade guide:
A: .95 or higher (important decisions)
B: .85 to .90
B-: .75 to .80
F: .74 and below (barely passing)
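The letter-grade guide above can be encoded as a small helper. The function name is my own, and the boundaries between the slide's bands (e.g., .90 to .95) are assumptions, resolved here by extending each grade down to the next cutoff:

```python
# Hypothetical helper encoding the letter-grade guide for reliability
# coefficients; cutoffs follow the slide, boundary handling is assumed.
def grade_reliability(r):
    if r >= 0.95:
        return "A"   # high enough for important decisions
    if r >= 0.85:
        return "B"
    if r >= 0.75:
        return "B-"
    return "F"       # .74 and below: barely passing

print(grade_reliability(0.96))  # a coefficient fit for important decisions
```

The point of the continuum, though, is that no single cutoff applies to every test: a coefficient too low for clinical placement may be perfectly adequate for low-stakes classroom use.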

The Purpose of the Reliability Coefficient

The Nature of the Test

• Homogeneous or heterogeneous
• Dynamic or static
• Restriction or inflation of range
• Speed test or power test
• Criterion-referenced tests

The True Score Model of Measurement and Alternatives to It

• Classical test theory
• Domain sampling theory and generalizability theory
• Item response theory

End of slide

XU Advanced Psych Testing and Assessment 16-17

Thank you