validity and reliability of measurements

88
Validity and reliability of measurements

Upload: others

Post on 18-Dec-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Validity and reliability of measurements

Validity and reliability of measurements

Page 2: Validity and reliability of measurements

2

Page 3: Validity and reliability of measurements

3

Page 4: Validity and reliability of measurements

4

Page 5: Validity and reliability of measurements

Components in a dataset

5

Page 6: Validity and reliability of measurements

Random error

• The difference, due to chance alone, of an observation on a sample from the true population value

• Leading to lack of precision in the measurement (of association)

• Three major sources

• Individual biological variation

• Sampling variation

• Measurement variation

Page 7: Validity and reliability of measurements

Random error………..

• Random error can never be completely eliminated since we can study only a sample of the population.

• Random error can be reduced by the careful measurement

• Sampling error can be reduced by optimal sampling procedures and by increasing sample size

Page 8: Validity and reliability of measurements

Why bother

8

Page 9: Validity and reliability of measurements

• Unreliable histological score – the score is not measuring what

we were planning to measure.

• → wrong conclusion about effectiveness of treatment?

• Imprecise or Inaccurate protein measurement –

• too big variation to be able to draw conclusion

• too low/ high detection in our samples to be able to compare

differences.

• Incomplete quantity and quality of a DNA sample analysed

• Unreliable laboratory equipment or technique in analysing a

(DNA) sample.

9

Examples

Page 10: Validity and reliability of measurements

Examples

• Unreliable weighing scale

Is my intervention that aimed at weight loss really unsuccessful, or

was the weighing scale unreliable?

• Imprecise heart rate monitor

Did the medication really change heart rate ?

• Inaccurate observation

Did I really observe an improvement in scaring?

Does the computer based evaluation of calcium scores really

perform worse then a semi-automatic grading

• Instable depression questionnaire

Did these patients really get worse after counselling?

10

Page 11: Validity and reliability of measurements

What is reliability?

11

Page 12: Validity and reliability of measurements

Components in a dataset

12

Page 13: Validity and reliability of measurements

• Reliable measures (instruments) maximize the true

score component and minimize error

• Terms:

• Stability

• Consistency

• Accuracy

• Dependability

13

Page 14: Validity and reliability of measurements

Reliability

”Reliability refers to the degree of consistency or

accuracy with which it measures the target

attribute.

Page 15: Validity and reliability of measurements

• The idea behind reliability is that any

significant results must be more than a one-

off finding and be repeatable.

• Other people must be able to perform exactly

the same test, under the same conditions

and generate the same results

Page 16: Validity and reliability of measurements

Reliability

• Stability

• Internal consistency

• Equivalence (intra observer –inter observer)

16

Page 17: Validity and reliability of measurements

Humans

………….. and animals

17

Page 18: Validity and reliability of measurements

18

Page 19: Validity and reliability of measurements

Test- retest (stability)

Can be used in

• Observational

• Physiologic

• Questionnaires

Page 20: Validity and reliability of measurements

20

Page 21: Validity and reliability of measurements

21

Page 22: Validity and reliability of measurements

22

Page 23: Validity and reliability of measurements

Questions with stability

• What is the best time frame between test-retest

• How many people should do the test-retest

• How to ask /prepare people

• EPN?

• Same sample /new sample?

23

Page 24: Validity and reliability of measurements

Reliability

• Stability

• Internal consistency

• Equivalence

24

Page 25: Validity and reliability of measurements

Internal consistency

• Ensure that each part of the test generates similar

results, and that each part of a test measures the

correct construct.

• For example, a test of anxiety should measure anxiety

only, and every single question must also contribute.

• If people’s responses to Item 1 are completely

unrelated to their responses to Item 2, then it does not

seem reasonable to think that these two items are both

measuring anxiety … so they probably should not be

on the same test

Page 26: Validity and reliability of measurements

Cronbach’s alpha

Page 27: Validity and reliability of measurements

Or in plain english…

• The most common way to assessing and reporting internal

consistency for multi item scales

• A function of two things:

• The average inter-item correlation

• The number of items

• A common interpretation of the α coefficient

• <0.7 Unsatisfactory

• ≥0.7 Acceptable

• ≥0.8 Good

• ≥0.9 Excellent

Page 28: Validity and reliability of measurements

Internal consistency

• Item-item correlation

• Item-total correlation (corrected)

• Split-half

Page 29: Validity and reliability of measurements

Internal consistency

Page 30: Validity and reliability of measurements

Internal consistency

Page 31: Validity and reliability of measurements
Page 32: Validity and reliability of measurements

Reliability

• Stability

• Internal consistency

• Equivalence : the degree to which two or more independent

observers or coders agree about scoring

• - interrater and intrarater

33

Page 33: Validity and reliability of measurements

Inter-Rater or Inter-Observer Reliability

SIMPLE

Number of agreement

----------------------------

Number of agreement + disagreement

Cohens Kappa

Adjusts for chance agreement

Page 34: Validity and reliability of measurements

Intra Class Correlation (ICC)

• the assessment of consistency or reproducibility of quantitative

measurements made by different observers measuring the

same quantity

• Since the intraclass correlation coefficient gives a composite of

intra-observer and inter-observer variability

• Example: How correct can ER doctors predict the severity of

disease based on echo results: how consistent the scores are

to each other?

35

Page 35: Validity and reliability of measurements

36

Page 36: Validity and reliability of measurements

2021-02-26 37

Day 0

Week 1 Post-op evaluation

Day 14

Day 13

Methods (Timeline)

From a student from Sci-meth HT20

Week 2 Post-op evaluation

1st biopsy

Surgical operation

Treatment formulation

Cell expansion

2nd biopsy

Page 37: Validity and reliability of measurements

Kappa, weighted kappa and ICC

Methods used for assessment of agreement between observers

depending on the type of variable measured and the number of

observers

38

Page 38: Validity and reliability of measurements

39

Page 39: Validity and reliability of measurements

Not only in humans

• Intra essay and inter assay coefficient of variablitiy (in stored

serum samples, e,g with ELISA

40

Page 41: Validity and reliability of measurements

Validity

Test validity is the degree to which an instrument

measures what it is supposed to measure

Page 42: Validity and reliability of measurements

Does this ‘instrument’ measure what I want to

measure?

• What is the digoxin level in the blood?

Not valid

…. To measure digoxin level

Page 43: Validity and reliability of measurements

Does this instrument measure what I want to

measure?• How is self care

Not valid

…. To measure self care

Page 44: Validity and reliability of measurements

Does this instrument measure what I want to

measure?

• What is her temperature?

Not valid

…. To measure temperature

Page 45: Validity and reliability of measurements

Does this ‘histological count’ really reflect

recovery

48

Page 46: Validity and reliability of measurements

Experimental an other examples of Validity

• Adapting a new methodology for a new set of data or samples

• Golden standard?

• Species dependent?

• Ability to distinguish between who has a disease and who does

not

-Two components:

1) Sensitivity

2) Specificity

49

Page 47: Validity and reliability of measurements

Validity

Content validity

Criterion-Related Validity

Construct Validity

Page 48: Validity and reliability of measurements

51

Page 49: Validity and reliability of measurements

Content validity

How much a measure represents the construct

Face validity

Page 50: Validity and reliability of measurements

Content validity• Based on judgement

• Expert panels of users or Patient interviews

Page 51: Validity and reliability of measurements

Criterion-related validity

The degree to which scores on an instrument

are correlated with some external criterion

Page 52: Validity and reliability of measurements

Compare to other criteria

For example

- Measures on a newly developed personality

test

Compare to

- Behavioral criterion on which psychologists

agree

Page 53: Validity and reliability of measurements

Construct validity

The degree to which an instrument measures

the construct under investigation

Constructs are abstractions that are deliberately created by researchers

in order to conceptualize (e.g. happiness, quality of life, infection)

Page 54: Validity and reliability of measurements

Factor analyses

• Based on classical test theory (CTT) /traditional test theory

(TTT)

• Are used to explore or confirm dimensionality in data

• Different forms of factor analyses

• Principal component analysis

• Exploratory factor analysis

• Confirmatory factor analysis

Page 55: Validity and reliability of measurements

58

Page 56: Validity and reliability of measurements

HADS

Page 57: Validity and reliability of measurements

• Please write in the chat if there were questions or things

unclear related to VALIDITY / RELIABILITY

62

Page 59: Validity and reliability of measurements

Poll

• How will you deal with missing values

• - I will not have missing values

• I will ignore them

• I will impute them

• I do not know yet

64

Page 60: Validity and reliability of measurements

4. How should I treat missing values?

Reasons for missing data

65

Page 61: Validity and reliability of measurements

4. How should I treat missing values?

Reasons for missing data

• Participants can fail to respond to questions

• Equipment malfunction

• Power failure

• Sample loss

• Data collecting or recording mechanisms can malfunction

• Subjects can withdraw from studies before they are completed

• Incomplete registry data

• Data entry errors

66

Page 62: Validity and reliability of measurements

• Attrition due to social/natural processes

• Example:

• Accident of the patients, snowfall during datacollection,

• Patients I not able to drive to the clinic, death

• Power failure in the lab, sick lab technician

• Skip pattern in survey

• Example: Certain questions only asked to respondents who

indicate they have symptoms (see SCFHI)

67

Page 63: Validity and reliability of measurements

• Attrition due to social/natural processes

• Example: accident, not able to drive to the clinic, death

• Skip pattern in survey

• Example: Certain questions only asked to respondents who

indicate they have symptoms (see SCFHI)

• Random data collection issues

• Respondent refusal/Non-response

• Is it the same patients who misses all

• Are there certain item? (e.g. MLwHFQ)

68

Page 64: Validity and reliability of measurements

Missing Data Mechanisms

• Missing Completely at Random (MCAR)(uniform non-response)

• Missing at Random (MAR)

• Missing not at Random (NMAR) non-ignorable

69

Page 65: Validity and reliability of measurements

Missing Data Mechanisms

• Missing Completely at Random (MCAR)(uniform non-response)

• Data missing are independent both of observable variables and of

unobservable parameters of interest

• Example: postal questionnaire is missing, sample is dropped

70

Page 66: Validity and reliability of measurements

Missing Data Mechanisms

• Missing Completely at Random (MCAR)(uniform non-response)

• Missing at Random (MAR)

• Missingness is related to a particular variable, but it is not related to

the value of the variable that has missing data

• Example: male participants are more likely to refuse to fill out the

depression survey, but it does not depend on the level of their

depression.

• Example: technical error of a machine or handling of sample during

measurements

71

Page 67: Validity and reliability of measurements

Missing Data Mechanisms

• Missing Completely at Random (MCAR)(uniform non-response)

• Missing at Random (MAR)

• Missing not at Random (NMAR) non-ignorable

• The probability of a missing value depends on the variable that is

missing

Example:

• Respondents with high income less likely to report income

• a particular treatment causes discomfort, a patient is more likely to

drop out of the study. This missingness is not at random

• Missing lab value in high range values

72

Page 68: Validity and reliability of measurements

4. How to treat missing values/data?

• Deletion Methods

• Listwise deletion, pairwise deletion

• Imputation?

Methods

• Mean/mode substitution, dummy variable method, single regression

• Model-Based Methods

• Maximum Likelihood, Multiple imputation

73

Page 69: Validity and reliability of measurements

Listwise deletion

• Only analyzes cases with data on each variable

• Simple + comparability in all analysis

But

• Losing power

• Losing information

• Bias if data are not MCAR

74

Page 70: Validity and reliability of measurements

Pairwise deletion

• Only cases relating to each pair of variables with missing data

involved in an analysis are deleted.

• Analysis of all cases in wich the variables of interest are

present

• Keeps as many cases as possible

• Keeps many information

But

• Different ’n’ per analysis

75

Page 71: Validity and reliability of measurements

IMPUTATION

Rather than removing variables or observations with missing data,

another approach is to fill in or “impute” missing values.

Examples

• Last value carried forward

• Mean imputation

• Imputation based on logical rules

• Using regression predictions to perform deterministic imputation

76

Page 72: Validity and reliability of measurements

77

Page 73: Validity and reliability of measurements

5. I cannot find an instrument: should I develop

my own, translate, adapt?

Discuss

Advantages

Disadvantages

Consequences

78

Page 74: Validity and reliability of measurements

Developing a new questionnaire

79

Phase 1

• Decision on theoretical framework

Phase 2

• Item development

• Systematic review or qualitative analysis of the literature

• Qualitative interviews of patients, physician, nurses

Phase 3

• Pilot testing

• Initial validity and reliability assessment

• Reduction or revision of items or scale structure

• Scoring

Phase 4

• Further field testing, more advanced validity and

reliability estimation

Page 75: Validity and reliability of measurements

Poll

• What would you do if you cannot find a good instrument to

measure your main concept of your study?

• 1. Change main concept

• 2. Measure a related concept

• 3. Translate an instrument

• 4. Develop an instrument

• 5. Cry

80

Page 76: Validity and reliability of measurements

Translating

Page 77: Validity and reliability of measurements

Culture and language

• Translation = translation?

• Translation = cultural?

• Translation = validation?

Page 78: Validity and reliability of measurements

Concerns in translation

• Use of expressions

• E.g. Thirst scale:

• ’feels like cotton in the mouth’

83

Page 79: Validity and reliability of measurements

84

Page 80: Validity and reliability of measurements

Can I add /delete, change procedures?

85

Page 81: Validity and reliability of measurements
Page 82: Validity and reliability of measurements

87

Page 83: Validity and reliability of measurements

88

Page 84: Validity and reliability of measurements

Changing scoring method?

89

Page 85: Validity and reliability of measurements

Adding items / changing an instrument

Yes you can do that

• Consider ’revised version’

• Validity and reliability

• Legal rights / copy rights

Consequences:

• Comparability?

• Reliability, validity

90

Page 86: Validity and reliability of measurements

91

Page 87: Validity and reliability of measurements

92

Page 88: Validity and reliability of measurements

“Every line is the perfect length if you don't measure it.”

Marty Rubin

93