measurement: reliability lu ann aday, ph.d. the university of texas school of public health
Post on 14-Dec-2015
MEASUREMENT: RELIABILITY
Lu Ann Aday, Ph.D.
The University of Texas School of Public Health
RELIABILITY: Definition
Extent of random variation in answers to questions as a function of when they are asked (test-retest), who asked them (inter-rater), and the fact that a given question is one of a number of questions that could have been asked to measure the concept of interest (internal consistency).
RELIABILITY: Types
  Test-retest reliability
  Inter-rater reliability
  Internal consistency reliability
RELIABILITY: Computation
Requires repeated measures to estimate stability over time (test-retest) or equivalence across data gatherers (inter-rater) or across questions/items intended to measure the same underlying concept (internal consistency).
RELIABILITY: Test-retest
Definition: correlation between answers to same question by same respondent at two different points in time
RELIABILITY: Test-retest
Factors affecting:
  Vague question wording
  Transient personal states, e.g., physical or mental
  Situational factors, e.g., presence of other people
RELIABILITY: Test-retest
Computation: Compute correlation coefficient between answers to same question by same respondent at two different points in time:
Respondent   Q1, Time 1   Q1, Time 2
1            Agree        Agree
2            Agree        Agree
3            Agree        Agree
4            Agree        Disagree
5            Agree        Agree
RELIABILITY: Test-retest
Correlation coefficients:
  Interval: Pearson r
  Ordinal: Spearman rho
  Nominal: Chi-square-based measures of association
Correlation desired: .70+
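As a rough illustration of the interval and ordinal cases, the sketch below computes Pearson r and Spearman rho for one question asked at two points in time. The 5-point scores and the helper names (pearson_r, average_ranks, spearman_rho) are hypothetical and are not taken from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rho computed as the Pearson r of the ranked values."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical answers (1 = strongly disagree ... 5 = strongly agree)
# from five respondents to the same question at Time 1 and Time 2.
time1 = [4, 5, 3, 4, 2]
time2 = [4, 5, 3, 2, 2]

r = pearson_r(time1, time2)
rho = spearman_rho(time1, time2)
print(round(r, 2), round(rho, 2))
```

Either coefficient would then be compared with the .70 threshold. Note that a correlation is undefined when every respondent gives the same answer at one time point, because that column has no variance.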
RELIABILITY: Test-retest
Comparisons of means:
  Interval: paired t-test, repeated measures analysis of variance
Advantages:
  More accurately take into account that the first and second measurements are not independent
  More directly compare the actual answers at the two points in time
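A minimal sketch of the paired t-test for interval-level answers follows. It works on the within-respondent differences, which is how the test accounts for the two measurements not being independent. The scores and the function name paired_t are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(time1, time2):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    # Each respondent contributes one difference score.
    diffs = [b - a for a, b in zip(time1, time2)]
    n = len(diffs)
    # Standard error of the mean difference (stdev uses n - 1).
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical 5-point answers from the same five respondents.
time1 = [4, 5, 3, 4, 2]
time2 = [4, 5, 3, 2, 2]

t_stat, df = paired_t(time1, time2)
print(round(t_stat, 2), df)  # -1.0 4
```

The p-value would be read from a t distribution with df degrees of freedom; a statistics package such as SciPy (scipy.stats.ttest_rel) reports it directly.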
RELIABILITY: Inter-rater
Definition: correlation between answers to same question by same respondent obtained by different data gatherers at (approximately) the same point in time
RELIABILITY: Inter-rater
Factors affecting:
  Lack of adequate interviewer training
  Lack of standardization of data collection protocols and procedures
RELIABILITY: Inter-rater
Computation: Compute correlation coefficient between answers to same question by same respondent obtained by different data gatherers:
Respondent   Q1, Int. A    Q1, Int. B
1            BP=140/90     BP=140/90
2            BP=150/80     BP=150/80
3            BP=145/95     BP=145/95
4            BP=145/95     BP=120/80
5            BP=140/90     BP=140/90
RELIABILITY: Inter-rater
Correlation coefficients (coefficients for 3+ data gatherers noted in parentheses):
  Interval: Pearson r (eta)
  Ordinal: Spearman rho (chi-square)
  Nominal: Kappa (chi-square)
Correlation desired: .80+
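For the nominal case, a sketch of Cohen's kappa for two data gatherers is shown below. Kappa corrects the observed agreement for the agreement expected by chance. The ten classifications and the function name cohen_kappa are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same respondents."""
    n = len(rater_a)
    # Proportion of respondents on whom the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical blood pressure classifications by interviewers A and B.
int_a = ["high", "high", "normal", "normal", "high",
         "normal", "high", "normal", "normal", "high"]
int_b = ["high", "high", "normal", "normal", "high",
         "normal", "normal", "normal", "normal", "high"]

kappa = cohen_kappa(int_a, int_b)
print(round(kappa, 2))  # 0.8
```

Here the raters agree on 9 of 10 respondents, chance agreement is .50, and kappa is (.90 - .50) / (1 - .50) = .80, which just meets the desired level.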
RELIABILITY: Internal Consistency
Definition: correlation between answers by same respondent to different questions about the same underlying concept (usually summarized in scales)
RELIABILITY: Internal Consistency
Factors affecting:
  Number of different questions asked to capture the underlying concept
  Level of association (correlation) between answers the same respondents give to different questions about the concept
RELIABILITY: Internal Consistency
Computation: Compute internal consistency (underlying correlation) coefficients between answers by same respondent to different questions about the same concept:
Respondent   Q1      Q2         Q3
1            Agree   Disagree   Agree
2            Agree   Disagree   Agree
3            Agree   Disagree   Agree
4            Agree   Agree      Agree
5            Agree   Disagree   Agree
RELIABILITY: Internal Consistency
Internal consistency coefficients:
  Corrected item-total correlation
  Split-half reliability coefficient
  Cronbach alpha coefficient
Coefficient desired:
  .70+ (group-level comparisons)
  .90+ (individual-level comparisons)
  .40+ (corrected item-total correlation)
RELIABILITY: Internal Consistency
Computation: Corrected item-total correlation
  Add up the scores for answers to different questions about the same concept to create a total score
  Subtract the score for the answer to a given question from the total score to create item-specific “corrected” total scores
  Compute Pearson correlation coefficients between the score for each of the items and the corresponding “corrected” total score
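The three steps above can be sketched as follows. The item scores (rows are respondents, columns are questions) and the function names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corrected_item_total(scores):
    """Return one corrected item-total correlation per item."""
    n_items = len(scores[0])
    # Step 1: total score per respondent across all items.
    totals = [sum(row) for row in scores]
    correlations = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        # Step 2: remove the item's own score from the total.
        corrected = [total - value for total, value in zip(totals, item)]
        # Step 3: correlate the item with its corrected total.
        correlations.append(pearson_r(item, corrected))
    return correlations

# Hypothetical 5-point answers: five respondents, three questions.
scores = [
    [4, 5, 4],
    [2, 1, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]

item_total = corrected_item_total(scores)
print([round(r, 2) for r in item_total])
```

Items whose corrected correlation falls below .40 would be candidates for revision or removal from the scale.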
RELIABILITY: Internal Consistency
Computation: Split-half reliability coefficient
  Randomly divide a series of questions about the same concept into halves and add up the scores for answers to the questions in the respective halves
  Compute Spearman-Brown prophecy coefficient for correlation between the scores for each half, adjusting for the fact that the respective scores are based on only half the original number of items
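A minimal sketch of the split-half procedure follows, assuming a random split driven by a fixed seed so the result is repeatable. The scores, the seed parameter, and the function names are hypothetical.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_half_reliability(scores, seed=0):
    """Return (half-test correlation, Spearman-Brown adjusted coefficient)."""
    n_items = len(scores[0])
    item_ids = list(range(n_items))
    # Randomly assign the items to two halves.
    random.Random(seed).shuffle(item_ids)
    half_a = item_ids[: n_items // 2]
    half_b = item_ids[n_items // 2:]
    # Add up each respondent's scores within each half.
    sum_a = [sum(row[i] for i in half_a) for row in scores]
    sum_b = [sum(row[i] for i in half_b) for row in scores]
    r_half = pearson_r(sum_a, sum_b)
    # Spearman-Brown adjustment for doubling the length (k = 2).
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical 5-point answers: six respondents, four questions.
scores = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 4],
    [1, 2, 1, 1],
    [3, 4, 3, 3],
]

r_half, r_full = split_half_reliability(scores)
print(round(r_half, 2), round(r_full, 2))
```

Because the result depends on which items land in which half, different seeds give somewhat different coefficients.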
RELIABILITY: Spearman-Brown prophecy adjustments
Original   Scale length decreased (-) or increased (+)
alpha      -.75   -.67   -.50   2x     3x     4x
.50        .20    .25    .33    .67    .75    .80
.60        .27    .33    .43    .75    .82    .86
.70        .37    .44    .54    .82    .88    .90
.80        .50    .57    .67    .89    .92    .94
.90        .69    .75    .82    .95    .96    .97
RELIABILITY: Spearman-Brown prophecy formula
Computation: (k * ro) / (1 + [(k-1) * ro])
where,
  k = factor by which scale is increased or decreased
  ro = alpha based on original length
Example: (2 * .70) / (1 + [(2-1) * .70]) = .82
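The formula translates directly into a short function. The sketch below reproduces the worked example for a doubled scale and shows two other adjustments; the function name spearman_brown is an assumption for illustration.

```python
def spearman_brown(r_original, k):
    """Predicted reliability when scale length changes by a factor of k.

    r_original: alpha based on the original length (ro in the formula)
    k: factor by which the scale is lengthened (k > 1) or shortened (k < 1)
    """
    return (k * r_original) / (1 + (k - 1) * r_original)

doubled = spearman_brown(0.70, 2)     # scale twice as long
halved = spearman_brown(0.70, 0.5)    # scale cut by 50%
quadrupled = spearman_brown(0.80, 4)  # scale four times as long

print(round(doubled, 2))     # 0.82
print(round(halved, 2))      # 0.54
print(round(quadrupled, 2))  # 0.94
```

A reduction of 75% corresponds to k = .25 and a reduction of 67% to k = 1/3.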
RELIABILITY: Cronbach alpha coefficient
Computation: (k * ra) / (1 + [(k-1) * ra])
where,
  k = number of items in the scale
  ra = average Pearson r between items
Example: (10 * .32) / (1 + [(10-1) * .32]) = .82
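A sketch of this computation follows. It first checks the worked example (10 items, average r of .32) and then derives the average inter-item correlation from raw item scores. This form, based on the average correlation, is usually called standardized alpha. The scores and function names are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def standardized_alpha(k, mean_r):
    """Alpha from the number of items k and the average inter-item r."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

def alpha_from_scores(scores):
    """Average all pairwise item correlations, then apply the formula."""
    k = len(scores[0])
    columns = [[row[j] for row in scores] for j in range(k)]
    pair_rs = [pearson_r(columns[a], columns[b])
               for a, b in combinations(range(k), 2)]
    mean_r = sum(pair_rs) / len(pair_rs)
    return standardized_alpha(k, mean_r)

# Worked example: 10 items with an average inter-item r of .32.
example = standardized_alpha(10, 0.32)
print(round(example, 2))  # 0.82

# Hypothetical 5-point answers: five respondents, three questions.
scores = [
    [4, 5, 4],
    [2, 1, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = alpha_from_scores(scores)
print(round(alpha, 2))
```

The example shows how alpha depends on both factors named earlier: more items or a higher average correlation each raise the coefficient.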
WHEN TO UNDERTAKE RELIABILITY ANALYSIS
RELIABILITY DIMENSIONS: TEST-RETEST, INTER-RATER, INTERNAL CONSISTENCY
QUESTIONS
  Test-retest: Concerned about stability of wording
  Inter-rater: Concerned about equivalence of data gatherers
  Internal consistency: Constructing summary scales of attitudes or other abstract concepts
STUDIES
  Test-retest: Especially important in longitudinal or experimental designs
  Inter-rater: Monitored, but not usually measured directly in surveys
  Internal consistency: Especially used in attitudinal surveys
STAGES
  Test-retest: Pilot test or pretest
  Inter-rater: Pretest plus monitor in final study
  Internal consistency: Pretest or final study