Download - ERRORS IN EPIDEMIOLOGICAL STUDIES Assoc. Prof. Pratap Singhasivanon Department of Tropical Hygiene

ERRORS IN ERRORS IN

EPIDEMIOLOGICAL EPIDEMIOLOGICAL

STUDIESSTUDIES

Assoc. Prof. Pratap SinghasivanonAssoc. Prof. Pratap SinghasivanonDepartment of Tropical HygieneDepartment of Tropical Hygiene

ERRORERROR

Is defined as a false or mistaken result obtained in a study or experiment

Consists of 2 components

Systematic error

Random errorRandom error

RANDOM ERRORRANDOM ERROR

Refers to fluctuations around a true value because of Sampling variability

SYSTEMATIC ERRORSYSTEMATIC ERRORAny difference between the true value and

that actually obtained that is the result

of all causes other than Sampling variability.

A false or mistaken result obtained in

a study or experiment

BIAS BIAS BIAS BIAS Fluctuation of and

estimate around thepopulation value

(RANDOM VARIABILITY)

Error due to factorsthat inherent in the

design, conduct and analysis

Result obtained in sample differs from result that would

be obtained if the entire population were studies

ERRORERRORERRORERROR SYSTEMATIC ERRORSYSTEMATIC ERROR RANDOM ERRORRANDOM ERROR== ++

Sources of ErrorSources of Error

Observers Subjects

ResearchersAdministering The measure

Bias Random RandomRandomBias Bias

SOURCES AND SOURCES AND TYPES OF MEASUREMENT ERRORTYPES OF MEASUREMENT ERROR

SYSTEMATIC ERROR :SYSTEMATIC ERROR :

SELECTION BIAS

INFORMATION BIAS

CONFOUNDING

RANDOM ERRORRANDOM ERROR

IsIs the divergence, due to chance alone, of

an observation on an sample from the true

population value

Different combinations of high and low Different combinations of high and low reliability and validityreliability and validity

VALIDITYVALIDITYHighHigh

HighHigh

RE

LIA

BIL

ITY

RE

LIA

BIL

ITY

LowLow

LowLow

HighHigh

LowLow

Internal and Internal and EExternal Validityxternal Validity

ExternalExternalPopulationPopulation

TargetPopulation

StudySample

VALIDITYVALIDITYVALIDITYVALIDITY

INT.INT. EXT.EXT.

VALIDITY AND RELIABILITYVALIDITY AND RELIABILITY

HIGH LOWVALIDITYVALIDITYVALIDITYVALIDITY

HIGH

RELIABILITYRELIABILITYRELIABILITYRELIABILITY

LOW

FR

EQ

UE

NC

YA B

C D

A study is valid if its results corresponds

to the truth, no systematic error or

should be as small as possible

VALIDITY :VALIDITY :

IsIs the expression of the degree to which a

test is capable of measuring what it is

intended to measure

AA study is valid if its results corresponds to

the truth, no systematic error and random

error should be as small as possible

VALIDITYVALIDITY

TRUE BLOODPRESSURE

(INTRA-ARTERIAL CANULA)

BLOOD PRESSUREMEASUREMENT

(SPHYGMOMANOMETER)

DIASTOLIC BLOOD PRESSURE (mmHg)

80 90

BIASBIAS

NO

. O

F

OB

SE

RV

AT

ION

RELATION SHIP BETWEEN BIAS AND CHANCERELATION SHIP BETWEEN BIAS AND CHANCE

CHANCECHANCECHANCECHANCE

SOURCES OF VARIATIONSOURCES OF VARIATION

CONDITIONS OF DISTRIBUTION OF SOURCE OF MEASUREMENT MEASUREMENT VARIATION

One Patient,One Observer

Repeated observations

One Patient,Many Observer,

At one time

One Patient,One observer,

Many Times of Day

Many PatientsMany Patients

MEASUREMENTMEASUREMENT

BIOLOGICBIOLOGIC

++MEASUREMENTMEASUREMENT

FRAMEWORK FOR THE INTERPRETATIONFRAMEWORK FOR THE INTERPRETATIONOF AN EPIDEMIOLOGIC STUDYOF AN EPIDEMIOLOGIC STUDY

IS THERE A VALID STATISTICAL ASSOCIATIONIS THERE A VALID STATISTICAL ASSOCIATION??

Is the association likely to be due chance?

Is the association likely to be due bias?

Is the association likely to be due confounding?

CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED AS CAUSE AND EFFECTAS CAUSE AND EFFECT??

Is there a strong association?

Is there biologic credibility to the hypothesis?

Is there consistency with other studies?

Is the time sequence compatible?

Is there evidence of a dose-response relationship?

IIs the quality of being sharply defined through

exact detail.

TThe repeated assay of a single test specimen

typically gives rise to a set of results that differ to a greater or lesser extent from each order. The smaller the differences, the greater the precision of the assay method.

Precision :Precision :

MeasurementMeasurement

The procedure of applying a standard scale to a variable or a set of values. (Last, 1988)

Terms used to describe properties of measurement:

- Accuracy- Validity- Precision- Reliability- Repeatability- Reproducibility

is a distorsion in the estimate of effect resulting from the manner in which subject are selected for the study population

MAJOR SOUREC OFMAJOR SOUREC OF SELECTION BIASSELECTION BIAS

1) flaws in the choice of groups to be compared2) choice of sampling frame3) loss to follow up or nonresponse during data collection4) selective survival

SELECTION BIASSELECTION BIAS

INFORMATION BIAS INFORMATION BIAS is a distortion in the measur

ement error or misclassification of subject on one or

more variables

MAJOR SOURCES OF INFORMATION BIASMAJOR SOURCES OF INFORMATION BIAS 1) invalid measurement

2) incorrect diagnostic criteria

3) omissions imprecisions

4) other inadequacies in previously recorded data

0

1

2

3

4

5

6

7

8

9

<20 20-24 25-29 30-34 35-39 40+

Maternal AgeMaternal Age

Aff

ecte

d B

abie

s p

er 1

000

Liv

e A

ffec

ted

Bab

ies

per

100

0 L

ive

Bir

ths

Bir

ths

Prevalence of Down syndrome at Maternal AgePrevalence of Down syndrome at Maternal Age

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

1 2 3 4 5+

Birth OrderBirth Order

Aff

ecte

d

Aff

ecte

d B

abie

sB

abie

s p

er 1

000

Liv

e B

irth

s p

er 1

000

Liv

e B

irth

sPrevalence of Down syndrome at birth by birth orderPrevalence of Down syndrome at birth by birth order

Example No.

Type of Confounding

Unadjusted Relative Risk

Adjusted Relative Risk

1 Positive 3.5 1.0

2 Positive 3.5 2.1

3 Positive 0.3 0.7

4 Negative 1.0 3.2

5 Negative 1.5 3.2

6 Negative 0.8 0.2

7 Qualitative 2.0 0.7

8 Qualitative 0.6 1.8

Hypothetical Examples of Unadjusted and Adjusted Relative Risks

According to Type of confounding (Positive or Negative)

CONFOUNDING CONFOUNDING

MIXING OF EFFECTSMIXING OF EFFECTS

The estimate of the effect of The exposure

of interest is distorted because it is mixed

With the effect of an Extraneous factor

COFFEE DRINKING, CIGARETTE SMOKINGCOFFEE DRINKING, CIGARETTE SMOKINGAND CORONARY HEART DISEASEAND CORONARY HEART DISEASE

EXPOSURE(coffee drinking)

DISEASE(heart disease)

CONFOUNDINGVARIABLE

(cigarette smoking)

CONFOUNDING CONFOUNDING

TThe distortion introduced by a confounding factor can lead to overestimation or under estimation of an effect depending on the direction of the association that the confounding factor has with exposure and disease.

CConfoundingonfounding can even change the apparent direction of an effect.

ExampleExample : : AlcoholAlcohol

Smoking Smoking

Oral cancerOral cancer

E =E = E = E =

D D

D D

EE EEEE

DD

EE

DD

E =E = E =E =

D D

D D

E E E E E E E E

D D

D D

Situation in which Situation in which FF is a is a confounder confounder for a for a DD - - EE association. association.

Situation in which Situation in which FF is notis not a a confounder confounder for a for a DD - - EE association. association.

EE

FF

DD

EE

FF

DD

EE

FF

DD

EE

FF

DD

EE

FF

DD

EE

FF

DD

EE

FF

DD

To be confounding, the extraneous variable must To be confounding, the extraneous variable must have the following characteristicshave the following characteristics

AA confounding variable confounding variable must be a risk factor for the disease.

AA confounding variableconfounding variable must be associated with the exposure under study (in the population from which the case derive).

AA confounding variableconfounding variable must not be an intermediate step in the causal path between the exposure and the disease.

TThe data-based criterion for establishing the

presence or absence of confounding involve the comparison of a crude effect measure with an adjusted effect measure that corrects for distortions due to extraneous variables.

CConfounding is acknowledged to be present

when the crude and adjusted effect measure differ in value.

- RESTRICTION

- MATCHING

- STRATIFICATION

- MATHEMATICAL MODEL

(Multivariate analysis)

DESIGNDESIGN

ANALYSISANALYSIS

CONTROL OF CONFOUNDINGCONTROL OF CONFOUNDING

AGE *MI (%) CONTROLS(%) **OC USE (%)

25-29 3 16 29

30-34 9 14 10

35-39 16 20 8

40-44 30 21 4

45-49 42 18 3

*MI : MYOCARDIAL INFARCTION**OC : ORAL CONTRACEPTIVE

Relation of Confounder to Relation of Confounder to Disease and Exposure Disease and Exposure

DISEASE EXPOSURE

CRUDE RRCRUDE RR

D

D

+

-

E E-

RR = 4RR = 4

CRR1000

• Collapsed• Collapsed in 1 table without separation into subgroup.

CRUDERR

4==

1000 2000

E =E =

D

D

EE EE

D

D

EE EE

D

D

SMOKERSSMOKERSSMOKERSSMOKERSNON-SMOKERSNON-SMOKERSNON-SMOKERSNON-SMOKERS

6 29

871

-+

+

-

CIR = 1.86 SM

^

ADJUSTED

CIR = 1.13^

CIR = 1.02 SM

^

EXPOSUREEXPOSURE

194 21

706 79

-+

DISEASEDISEASE

9494

+

-

200200

800800

5050

950950

ALC ALC

250

1750DISEASEDISEASE

1000 1000 2000

CRUDE CIR = 4.0^

EXPOSUREEXPOSURE

OCaOCa

+

- OCaOCa

Degree of ConfoundingDegree of Confounding

mmeasures the amount of confoundingrather than mere presence or absence

degree of confoundingdegree of confounding = crude measure adjusted measure

= 4.00 = 3.531.13

Crude = 1.68Adjusted = 3.97

d.c. = 1.68 = 0.42 3.97

over estimationover estimation

under estimationunder estimation

AGERecent

Use of OC MI Controls OR

Yes 4 6225-29

No 2 2447.2

Yes 9 3330-34

No 12 3908.9

Yes 4 26 1.535-39

No 33 330

Yes 6 9 3.740-44

No 65 362

Yes 6 5 3.945-49

No 93 301

Yes 29 135 1.7TOTAL

No 205 160765

AGERecent

Use of OC MI Controls OR

Yes 4 6225-29

No 2 2447.2

Yes 9 3330-34

No 12 3908.9

Yes 4 26 1.535-39

No 33 330

Yes 6 9 3.740-44

No 65 362

Yes 6 5 3.945-49

No 93 301

Yes 29 135 1.7TOTAL

No 205 160765

97.3)MH(ORa

97.3)MH(ORa

4 fold risk of 4 fold risk of MIMI among recent of among recent of OCOC users as compared to non-users.users as compared to non-users.

TYPES OF ASSOCIATIONTYPES OF ASSOCIATION

A. Not statistically associated (Independent)

B. Statistically associated

1. Noncausally associated 1. Noncausally associated (Secondarily)

2. Causally associated 2. Causally associated a. Indirectly associated b. Directly causal

AssociationAssociation refers to the statistical dependence

between two variables that is ..

The degree to which the rate of disease in person

with a specific exposure is either higher or lower

than the rate of disease among those without that

exposure.

The presence of an association, does not imply

that the observed association is one of causecause and

effecteffect.

YESYESYESYES NONONONO

YES NO

OKOK

YES NO

RESEARCHRESEARCH

Clinical / Public Health

significance

Clinical / Public Health

significanceSample size big enoughSample size big enough

STATISTICAL SIGNIFICANCESTATISTICAL SIGNIFICANCESTATISTICAL SIGNIFICANCESTATISTICAL SIGNIFICANCE

Advantages Disadvantages

- May study several outcomes- Control over selection of subjects- Control over measurements- Relatively short duration- A good first step for a cohort study- Yield prevalence, relative prevalence

- Dose not establish sequence of events- Potential bias in measuring predictors- Potential survival bias- Not feasible for rare conditions- Does not yield incidence or true relative risk

Advantages and disadvantages of the major observAdvantages and disadvantages of the major observational designs. ational designs. (cont.)(cont.)

2. CROSS-SECTIONAL2. CROSS-SECTIONAL

4. NESTED CASE-CONTROL4. NESTED CASE-CONTROL (Prospective or retrospective)(Prospective or retrospective)

* All of these observational designs have the disadvantages (compare to experiment) of being susceptible to the influence of confounding variables

Advantages Disadvantage

Scientific advantages of cohort Requires bank of outcomes design samples stored until occur

Relatively inexpensive


Advantages and disadvantages of the major Advantages and disadvantages of the major observational designsobservational designs

1. COHORT1. COHORT

AdvantagesAdvantages Disadvantages Disadvantages

Establishes sequence of events Often requires large

sample sizes

Avoid bias in measuring predictors Not feasible for rare

outcomes

Avoid survival bias

Can study; several outcomes

Number of outcome events grows over time

3. CASE-CONTROL3. CASE-CONTROL

Advantage Disadvantages

Useful for studying rare conditions Potential bias from sampling two population

Short duration Does not establish sequence of events

Relatively inexpensive Yield odds ratio(usually a good predictors

Potential bias in measuring Potential survival bias approximation of relative risk)

Limited to one outcome variable

Does not yield prevalence, incidence, or excess risk


CHARCTERISTICS OF INCIDENCE AND PREVALENCE

INCIDENCEINCIDENCE PREVALENCEPREVALENCENNew cases occurring during a period of time among a group initially free of disease

AAll susceptible people present at the beginning of the period

DDuration of the period

CCohort study

AAll cases counted on a single survey or examination of a group

AAll people examined including cases and new cases

SSingle point

PPrevalence (cross-sectional)study

NUMERATORNUMERATOR

DENOMINATORDENOMINATOR

TIMETIME

HOWMEASURED

HOWMEASURED

SENSITIVITY AND SPECIFICITY REMAIN

CONSTANT IRRESPECTIVE OF THE VALUES

OF THE OTHER VARIABLE :

Nondifferential MisclassificationNondifferential Misclassification

““BIAS TOWARD THE NULL”BIAS TOWARD THE NULL”

““BIAS TOWARD OR AWAY FROM NULL VALUE”BIAS TOWARD OR AWAY FROM NULL VALUE”

Differential MisclassificationDifferential Misclassification

WHEN THE MAGNITUDE OF ERROR FOR ONE

VARIABLE DIFFERS ACC. TOTHE ACTUAL VALUE OF

ANOTHER VARIABLE

(DIFF. SENSITIVITY & SPECIFICITY) EG.

EXPOSURE TO CONGENITAL

RADIATION MALFORMATION

EMPHYSEMA SMOKING

FRAMEWORK FOR THE INTERPRETATIONFRAMEWORK FOR THE INTERPRETATIONOF AN EPIDEMIOLOGIC STUDYOF AN EPIDEMIOLOGIC STUDY

IS THERE A VALID STATISTICAL ASSOCIATIONIS THERE A VALID STATISTICAL ASSOCIATION??

Is the association likely to be due chance?

Is the association likely to be due bias?

Is the association likely to be due confounding?

CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED CAN THIS VALID STATISTICAL ASSOCIATION BE JUDGED AS CAUSE AND EFFECTAS CAUSE AND EFFECT??

Is there a strong association?

Is there biologic credibility to the hypothesis?

Is there consistency with other studies?

Is the time sequence compatible?

Is there evidence of a dose-response relationship?

Prevalence of DyslipidemiaPrevalence of Dyslipidemia

Source pop.Prevalence = 25%

1.2.3.4.+5.6.+7.8.9.10.11.12.+13.14.15.+16.17.1819.20.+

1.2.3.4.+5.6.+7.8.9.10.11.12.+13.14.15.+16.17.1819.20.+

Sample 1

Sample 2

Sample 3

4+6+8914

4+6+8914

Prevalence = 40%

12+7171619

12+7171619

Prevalence = 20%

181411105

181411105

Prevalence = 0%


AdvantagesAdvantages Disadvantages Disadvantages

Yields incidence, relative risk, excess risk

- Prospective More controls over More expensive selection of Longer duration

- Retrospective Less expensive Less controls over Shorter duration selection of subjects

Less controls over measurements

- Double cohort Useful when distinct Potential bias from cohort different or sampling two rare exposures populations

MISCLASSIFICATION WITH REGARD TO DISEASEMISCLASSIFICATION WITH REGARD TO DISEASE(NONDIFFERENTIAL MISCLASSIFICATION)(NONDIFFERENTIAL MISCLASSIFICATION)

Exposed Unexposed Relative RiskExposed Unexposed Relative Risk

Number of Individuals 2,000 8,000 2.0

Number of Cases 20 40

Under DiagnosisUnder Diagnosis (Sens. = 0.05; Spec.=1.0)(Sens. = 0.05; Spec.=1.0) Number Identified as cases 10Number Identified as cases 10 20 20 2.02.0

Over DiagnosisOver Diagnosis (Sens.= 1.0 ; Spec.=0.99) :(Sens.= 1.0 ; Spec.=0.99) : Number Identified as cases 40Number Identified as cases 40 120 120 1.31.3

1. Chance (Random error)

2. Bias (Systematic error)

Selection

Information

Confounding

3. Effect of propanolol

Explanation for the observed difference in survival Explanation for the observed difference in survival between propanolol and control group:between propanolol and control group:

AA high reliabilityhigh reliability means that in repeated measurements the results fall very close to each other; conversely,

AA low reliabilitylow reliability means that they are scattered.

250250

20002000

10001000

Download - ERRORS IN EPIDEMIOLOGICAL STUDIES Assoc. Prof. Pratap Singhasivanon Department of Tropical Hygiene

Top Related