case control & cohort study

CASE CONTROL AND COHORT STUDY

Dr. Bhumika Bhatt

Junior Resident

CONTENTS:

• Definition
• Types of study
• Analytical studies
• Case control study
• Variants of case control study
• Summary
• Cohort study
• Difference
• Summary
• Reference

EPIDEMIOLOGY

The most conventional definition of epidemiology is "the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to control health problems" (John M. Last, 1988).

TYPES OF STUDY

Experimental: RCT, Non-RCT

Observational: Analytical, Descriptive

Analytical: Ecological, Cross-sectional, Case-control, Cohort

ANALYTICAL STUDIES

In analytical studies, the subject of interest is the individual within the population.

The object is not to formulate a hypothesis but to test it.

The aim is to evaluate an association between exposure and disease.

Analytical studies focus on the magnitude of the association between the exposure and the health problem under study.

CASE-CONTROL STUDY (RETROSPECTIVE STUDY)

Unit of study: Cases/controls (individuals)

Study question: What has happened?

Direction of inquiry: E ← O (from outcome back to exposure)

Study design:

Cases → Exposed / Not exposed

Controls → Exposed / Not exposed

CASE CONTROL STUDY

A case–control study is an observational study in which subjects are sampled based upon presence or absence of disease and then their prior exposure status is determined.

Distinctive features:

a. Both exposure and outcome (disease) have occurred before the start of the study.

b. The study proceeds backwards from effect to cause.

c. It uses a control or comparison group to support or refute an inference.

BASIC DESIGN

Risk factor    Cases (disease present)    Controls (disease absent)
Present        a                          b
Absent         c                          d
Total          a + c                      b + d

BASIC STEPS

1. Selection of cases and controls.
2. Matching.
3. Measurement of exposure.
4. Analysis and interpretation.

CASES SELECTION

The study begins with cases, i.e. patients in whom the disease has already occurred.

Patients with the disease in question (cases) are asked for full details of their exposure to the suspected cause.

New cases that are similar clinically, histologically, pathologically and in their duration of exposure (stage) are chosen to avoid error and to allow better comparison.

CASES SELECTION

Definition of a case involves two specifications:

(i) Diagnostic criteria: State clear-cut diagnostic criteria for the disease of interest. As far as possible, use criteria given by expert bodies.

(ii) Eligibility criteria: It is advisable to take incident cases, since prevalent cases may have changed their exposure status because of medical advice etc.

Sources of cases: hospitals; general population.

SELECTION OF CONTROLS

Controls must be free from the disease under study.

The usual principle to observe while selecting controls is that "like should be compared with like", to avoid errors and to allow better comparison.

Sources of controls: hospital controls; general population; relatives/neighbourhood.

To do:
• Select controls from various diagnostic groups so that no particular risk factor will be overrepresented.
• Select controls from patients with acute conditions so that earlier exposures could not have been influenced by the condition.

To avoid:
• Do not select patients who have multiple concurrent conditions.
• Do not select patients with diagnoses known to be related to the risk factor of interest.

- Source of controls (healthy population-based or hospital-based)
- Number of controls
- Number of control groups
- Method of sampling the controls
- Matching, if considered

STRENGTHS OF POPULATION-BASED AND HOSPITAL-BASED CASE-CONTROL STUDIES

Population-based:
• Source population is better defined.
• Easier to make certain that cases and controls derive from the same source population.
• Exposure histories of controls are more likely to reflect those of persons without the disease of interest.

Hospital-based:
• Subjects are more accessible.
• Subjects tend to be more cooperative.
• Easier to collect exposure information from medical records and biological specimens.

CONFOUNDING FACTOR

A confounding factor is one that is associated with both exposure and disease and is distributed unequally in the study and control groups.

[Diagram: the confounder is linked to both the exposure and the outcome]

A confounder is:

(i) Associated with the exposure of interest.
(ii) Related to the outcome of interest.
(iii) Not in the direct chain or link between the exposure and the outcome.

Hypothesis: Is consumption of alcohol a risk factor for oral cancer (CA)?

100 cases of oral CA and 100 healthy subjects were asked about their history of alcohol consumption during the past 15 years.

History of alcohol    Oral cancer present    Oral cancer absent    Total
Present               80 (a)                 20 (b)                100
Absent                20 (c)                 80 (d)                100
Total                 100                    100                   200

Odds ratio = (a × d) / (b × c) = (80 × 80) / (20 × 20) = 16

Taken at face value, the odds of oral cancer are 16 times higher in a person who drinks alcohol.

The findings may be false because of the "hidden" effect of tobacco use: people who drink alcohol are often also tobacco users, and tobacco use is itself a direct cause of oral cancer, whether one drinks or not.

Dissecting the hypothetical data into two strata:

Tobacco users: stratum OR = (60 × 5) / (20 × 15) = 1

Non-tobacco users: stratum OR = (5 × 60) / (15 × 20) = 1

Conclusion: Both stratum ORs fall to 1, i.e. there is no excess risk of cancer from alcohol after adjusting for the effect of tobacco.
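The arithmetic in this confounding example can be checked with a short Python sketch. It is an illustration only, not part of the original slides: the function name `odds_ratio` is hypothetical, and because the slides print only the cross-products for each stratum, the placement of 60, 20, 15 and 5 in individual cells is an assumption.

```python
def odds_ratio(a, b, c, d):
    """Cross-products ratio ad/bc for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    return (a * d) / (b * c)


# Crude table: history of alcohol versus oral cancer.
crude_or = odds_ratio(80, 20, 20, 80)

# Strata by tobacco use. Cell placement is ASSUMED; only the cross-products
# (60 x 5)/(20 x 15) and (5 x 60)/(15 x 20) are given in the slides.
tobacco_or = odds_ratio(60, 20, 15, 5)
non_tobacco_or = odds_ratio(5, 15, 20, 60)

print(crude_or, tobacco_or, non_tobacco_or)
```

A crude odds ratio well above 1 alongside stratum odds ratios of 1 is the pattern that points to confounding by the stratifying variable.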

METHODS FOR CONTROLLING CONFOUNDING

Randomisation: If a group of subjects is divided into two using random allocation (syn. randomisation), the two groups will tend to be similar to each other in all respects.

Restriction: Subjects having the particular confounding variable(s) are not taken into the study at all.

Matching

MATCHING

Matching is the process by which we select controls so that they are similar to cases with regard to certain pertinent variables (e.g. age, sex, occupation, social status etc.) which are known to influence the outcome of the disease.

MATCHING

Advantages:
• May increase the precision of case-control comparisons and thus allow a smaller study.
• The sampling process is easy to understand and explain.
• If analyzed correctly, provides reassurance that matched variables cannot explain case-control differences in the risk factor of interest.

Disadvantages:
• May be time-consuming and expensive to perform.
• Some potential cases and controls may be excluded because matches cannot be made.
• The matched variables cannot be evaluated as risk factors in the study population.

MEASUREMENT OF THE EXPOSURE

Information about the exposure should be obtained in precisely the same manner for both cases and controls.

This may be obtained by interviews, by questionnaires, or by studying past records of cases, such as hospital records and employment records.

ANALYSIS

The final step is analysis:

a. Exposure rates among cases and controls to the suspected factor.

b. Estimation of the disease risk associated with exposure (odds ratio).

CASE CONTROL STUDY OF SMOKING AND LUNG CANCER

               Cases (with lung cancer)    Controls (without lung cancer)
Smokers        33 (a)                      55 (b)
Non-smokers    2 (c)                       27 (d)
Total          35 (a + c)                  82 (b + d)

ANALYSIS: Exposure rates

A. Cases = a/(a + c) = 33/35 = 94.3%

B. Controls = b/(b + d) = 55/82 = 67.1%

This shows that the frequency of smoking is definitely higher among lung cancer cases than among controls.
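The exposure rates above can be reproduced with a small Python sketch. The helper name `exposure_rate` is an assumption made for illustration. Values are rounded to one decimal place, so they may differ in the last digit from truncated figures.

```python
def exposure_rate(exposed, total):
    """Percentage of a group that was exposed to the suspected factor."""
    return 100 * exposed / total


cases_rate = exposure_rate(33, 35)     # a / (a + c)
controls_rate = exposure_rate(55, 82)  # b / (b + d)

print(f"Cases: {cases_rate:.1f}%  Controls: {controls_rate:.1f}%")
```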

The chance of something happening can be expressed as a risk and/or as an odds.

Risk = (the chances of something happening) / (the chances of all things happening)

Odds = (the chances of something happening) / (the chances of it not happening)

Example 1: If we choose a student at random from your class of, say, 9, how likely is it that you will be chosen?

Risk (probability) = 1/9 = 0.111

Odds = 1/8 = 0.125

Example 2: Among 100 people at baseline, 20 develop influenza over a year.

The risk is 1 in 5 (i.e. 20 among 100) = 0.2

The odds is 1 to 4 (i.e. 20 compared to 80) = 0.25
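The two worked examples can be sketched in Python as follows. The function names `risk`, `odds` and `odds_from_risk` are hypothetical helpers, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction


def risk(events, total):
    """Events divided by everyone who could have had the event."""
    return Fraction(events, total)


def odds(events, total):
    """Events divided by non-events."""
    return Fraction(events, total - events)


def odds_from_risk(p):
    """Convert a risk (probability) p into odds: p / (1 - p)."""
    return p / (1 - p)


# Example 1: one student chosen from a class of 9.
print(risk(1, 9), odds(1, 9))

# Example 2: 20 of 100 people develop influenza.
print(risk(20, 100), odds(20, 100))
```

The conversion `odds_from_risk` gives the same answer as counting events against non-events, which is why a risk of 1/5 corresponds to odds of 1/4.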

ODDS RATIO

Measure of strength of association between risk factors and outcome.

Odds = P / (1 - P), where P = probability; the odds ratio is the ratio of two such odds.

The odds ratio is also known as the cross-products ratio.

It is based on three assumptions:

1. The disease being investigated must be relatively rare. In fact, the majority of chronic diseases have a low incidence in the general population.

2. The cases must be representative of those with the disease.

3. The controls must be representative of those without the disease.

ODDS RATIO

[Figure: odds ratio in a cohort study and in a case control study]

ODDS RATIO

Odds ratio = ad / bc = (33 × 27) / (55 × 2) = 8.1

The odds ratio is a key parameter in the analysis of case control studies.

It is interpreted as the odds of cases being exposed being so many times higher than the odds of controls being exposed.

In our example, smokers have an estimated risk of lung cancer 8.1 times that of non-smokers.
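To make the interpretation concrete, the sketch below computes the odds of exposure separately in cases and in controls and then takes their ratio. It uses the smoking and lung cancer counts from the table; the variable names are illustrative only.

```python
from fractions import Fraction

a, b, c, d = 33, 55, 2, 27  # cells of the smoking / lung cancer table

odds_exposure_cases = Fraction(a, c)     # smokers : non-smokers among cases
odds_exposure_controls = Fraction(b, d)  # smokers : non-smokers among controls

# Ratio of the two exposure odds.
odds_ratio = odds_exposure_cases / odds_exposure_controls

# The same value obtained as the cross-products ratio ad / bc.
cross_products = Fraction(a * d, b * c)

print(float(odds_ratio), float(cross_products))
```

Both routes give the same number, which is why the odds ratio is also called the cross-products ratio.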

BIAS IN CASE CONTROL STUDIES

Selection biases

Berksonian bias: The probability of admission to hospital or detection of the outcome (disease) may be higher among the cases simply because of the exposure.

Selection of inappropriate cases or controls: Cases or controls who do not have an adequate chance of exposure.

Self-selection bias: Patients admitted to a particular hospital, and hence taken as cases, may differ systematically in exposure status from most patients with the disease who are not admitted to that hospital.

Survivorship bias: A case control study generally takes patients who are living. Cases who have died are generally not included, and they may differ systematically from living cases as regards exposure status.

Selection of wrong control group: Controls who are not from the same source population as the cases; for example, close friends of cases, who would in general have the same behavioural factors as the cases ("birds of a feather flock together"), as in studies of condom use and STDs.

Information (measurement) biases

Recall bias: Cases who are suffering from a disease are likely to recall much more about their exposure (e.g. congenital malformation and exposure to X-rays).

Observer bias: If the observer is aware of case-control status, he/she may subconsciously tend to ask much more of cases.

Confounding bias

VARIANTS OF CASE CONTROL STUDIES

NESTED CASE CONTROL STUDY

Combines the advantages of a cohort study and a case control study.

Firstly, the study becomes inexpensive and logistically manageable.

Secondly, we can calculate the incidence of the disease, which would not have been possible in a usual case control study.

Thirdly, the problems of recall bias and of controls coming from a different source population than cases (which occur in case control studies) are prevented.

Hypothesis: High serum lithium levels are a cause of subsequent mental illness.

1. Take a cohort of, say, 1000 persons who are free of mental disease, collect their blood samples and preserve them in cold storage.
2. Watch for 15 - 20 years; the rest of the cohort is continuously followed.
3. 20 cases of mental disease occur (cases).
4. Take 20 randomly selected samples from those who have not developed mental illness (controls).
5. Analyse these 40 samples for serum lithium and make comparisons between the two groups.
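The sampling logic of this nested design can be sketched in Python. Everything here is hypothetical: the participant IDs, the fixed random seed, and in particular which 20 members become cases, since in a real study the cases arise during follow-up rather than by random choice.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# 1000 disease-free persons whose blood samples are stored at baseline.
cohort_ids = list(range(1, 1001))

# HYPOTHETICAL: the 20 members who develop the disease during follow-up.
case_ids = rng.sample(cohort_ids, 20)

# Controls: 20 randomly chosen from members who did not develop the disease.
case_set = set(case_ids)
non_cases = [pid for pid in cohort_ids if pid not in case_set]
control_ids = rng.sample(non_cases, 20)

# Only these 40 stored samples need to be analysed in the laboratory.
samples_to_analyse = case_ids + control_ids
print(len(samples_to_analyse))
```

Because controls are drawn from the same cohort as the cases, both groups share one source population, and only 40 of the 1000 stored samples need laboratory analysis.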

CASE-COHORT STUDIES

Advantages:

• Recall bias is eliminated.
• If abnormalities in biologic characteristics such as laboratory values are found, because the specimens were obtained years before the development of clinical disease, it is more likely that these findings represent risk factors or other premorbid characteristics than a manifestation of early, subclinical disease. When such abnormalities are found in the traditional case-control study, we do not know whether they preceded the disease or were a result of the disease.
• More economical to conduct.
• It is possible to study different diseases (different sets of cases) in the same case-cohort study using the same cohort for controls.

ADVANTAGES AND DISADVANTAGES OF CASE-CONTROL STUDIES

Advantages:
• Efficient for the study of rare diseases.
• Efficient for the study of chronic diseases.
• Tend to require a smaller sample size than other designs.
• Less expensive than alternative designs.
• May be completed more rapidly than alternative designs.

Disadvantages:
• Risk of disease cannot be estimated directly.
• Not efficient for the study of rare exposures.
• More susceptible to selection bias than alternative designs.
• Information on exposure may be less accurate than that available in alternative designs.

SUMMARY

Review the research question and confirm that a case-control study is the right design.

Specify the total population and the actual (study) population.

Specify the major study variables (exposure, outcome, confounding factors) and their 'scales' of measurement (dichotomous etc.).

Calculate the sample size.

Specify the selection criteria of cases

• Well suited for diseases which have a long latent period (e.g. cancers, AIDS, MI, CVA etc.).
• Well suited for an outcome which is 'rare'.
• Well suited for conditions in which medical care is usually sought.
• Helps in examining multiple etiologic factors: once we have the cases of the disease, we can take a history of all the factors that we feel may be risk factors.
• Reasonably good for diseases that have a "relatively rapid onset" and are usually hospitalised (e.g. most of the acute infections, injuries etc.).

SUMMARY

Specify the selection procedure for controls.

Specify the procedures of measurement, taking special care to ensure validity and reliability.

Do a pilot study on 5 to 10 cases and controls.

Conduct the study

Analysis of data

COHORT STUDY

Also called a forward-looking, incidence, longitudinal, prospective or follow-up study.

Cohort = a group of people who share a common characteristic or experience within a defined time period (age, occupation, exposure etc.).

Cohort study: Cohort studies are observational studies in which the investigator determines the exposure status of subjects and then follows them for subsequent outcomes.

Quantified with relative risk, incidence rates and attributable risk.

Cohorts are identified prior to the appearance of the disease under investigation.

BASIC FRAMEWORK

In a cohort study the exposure has occurred, but the disease has not.

Cohort                         With disease    Without disease    Total
Exposed (etiologic factor)     a               b                  a + b
Non-exposed                    c               d                  c + d

a/(a + b) = incidence of disease in exposed

c/(c + d) = incidence of disease in non-exposed

If a/(a + b) > c/(c + d), it would suggest that the disease and the suspected cause are associated.

GENERAL CONSIDERATIONS

• Cohorts must be free from the disease under study.
• Study and control groups must be equally susceptible to the disease under study.
• Both groups must be comparable in respect of all the possible variables which may influence the frequency of the disease.
• The diagnostic and eligibility criteria of the disease must be defined beforehand.
• Groups are then followed, under identical conditions, over a period of time to determine the outcome of the exposure.

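TYPES OF COHORT STUDIES

[Figure: design of a cohort study. A defined population is divided, without randomisation, into exposed and non-exposed groups; each group is followed up and classified as diseased or not diseased. Timelines are shown for the three types: prospective, retrospective and combined.]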

ELEMENTS OF COHORT STUDY

1. Selection of study subjects.
2. Obtaining data on the exposure.
3. Selection of the comparison group.
4. Follow-up.
5. Analysis.

SELECTION OF STUDY POPULATION

Selection of study cohort:

• Special exposure groups (e.g. radiologists for studies on the effect of radiation; ANC cases having PIH for studying the outcome of pregnancy, etc.).
• Cohort defined on the basis of geographical or administrative boundaries (e.g. people living in a given state or district, as in the Framingham heart study). The special advantage of such a cohort is that the same group will give an exposed as well as an unexposed (comparison) cohort.
• Groups offering special resources (e.g. all registered doctors can be followed up for development of IHD after recording their physical activity levels).

OBTAINING DATA ON THE EXPOSURE

Exposure:
• External sources: hospital records.
• Internal sources: questionnaires, physical examinations, and/or blood tests, other diagnostic tests.

Event:
• External sources: disease registries, death certificates, physician and hospital records.
• Internal sources: questionnaires, physical examinations, and/or blood tests, other diagnostic tests.

Confounder:
• External sources: hospital records, registries.
• Internal sources: questionnaires, physical examinations.

SELECTION OF THE COMPARISON POPULATION

Internal control group: Exposed and non-exposed are drawn from the same study population (e.g. Framingham study). This minimises the differences between exposed and non-exposed.

External control group: When information on degree of exposure is not available, choose another group, i.e. another cohort (e.g. smokers and non-smokers).

General population: If none of the above comparisons is available, then the mortality experience of the exposed group is compared with the mortality experience of the general population in the same geographic area as the exposed people. E.g. comparison of the frequency of cancer among uranium mine workers with the rate in the general population in the same geographic area.

FOLLOW UP

One of the problems in cohort studies is the regular follow-up of the participants.

Therefore, at the start of the study, methods should be devised, depending upon the outcome to be determined (morbidity or death), to obtain the data for assessing the outcome.

Procedures:
• Routine surveillance of death records.
• Review of physician and hospital records.
• Mailed questionnaires, telephone calls, periodic home visits.
• Periodic medical examination of each member of the cohort.

Losses to follow-up may arise from: death, change of residence, migration, withdrawal from occupation etc.

EPIDEMIOLOGIC MEASURES OF ASSOCIATION

Absolute comparison: Risk difference = I exposed - I unexposed. Measures the public health problem caused by the exposure.

Relative comparison: Relative risk (RR), odds ratio. RR = I exposed / I unexposed. Measures the strength of an association.

ANALYSIS

Data are analysed in terms of:

a) Incidence rates of outcome among exposed and non-exposed.

b) Estimation of risk:

(i) relative risk

(ii) attributable risk

INCIDENCE RATES

Cigarette smoking    Developed CHD    Did not develop CHD    Total           Incidence
Yes                  70 (a)           6930 (b)               7000 (a + b)    70/7000 = 10 per 1000
No                   3 (c)            2997 (d)               3000 (c + d)    3/3000 = 1 per 1000

ESTIMATION OF RISK

Relative risk (RR) = incidence of disease (or death) among exposed / incidence of disease (or death) among non-exposed

Cigarette smoking    Developed CHD    Did not develop CHD    Total           Incidence
Yes                  70 (a)           6930 (b)               7000 (a + b)    70/7000 = 0.01
No                   3 (c)            2997 (d)               3000 (c + d)    3/3000 = 0.001

RR = [a/(a + b)] / [c/(c + d)] = (70/7000) / (3/3000) = 10
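The incidence rates and relative risk for the smoking and CHD table can be reproduced with the sketch below. The helper name `incidence` is an assumption for illustration, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def incidence(diseased, total):
    """Proportion of a group that developed the disease during follow-up."""
    return Fraction(diseased, total)


i_exposed = incidence(70, 7000)    # smokers: a / (a + b)
i_unexposed = incidence(3, 3000)   # non-smokers: c / (c + d)

relative_risk = i_exposed / i_unexposed

print("Incidence per 1000 (exposed):", float(i_exposed * 1000))
print("Incidence per 1000 (unexposed):", float(i_unexposed * 1000))
print("Relative risk:", float(relative_risk))
```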

INTERPRETATION OF RELATIVE RISK AND RATE RATIO

RR = 1: No association between exposure and disease; incidence rates are identical between groups.

RR > 1: Positive association; the exposed group has a higher incidence than the non-exposed group.

RR < 1: Negative association or protective effect; the exposed group has a lower incidence than the non-exposed group.

E.g. (with purely illustrative figures) RR = 10% / 20% = 0.5 would indicate that if one smokes, the risk of getting IHD is 10%, whereas if one does not smoke, the risk is 20%. In such a case smoking would reduce the risk of getting IHD by half.

MEASURES OF DIFFERENCE

• Risk difference = I exposed - I non-exposed
• Attributable risk percent
• Population attributable risk percent

ATTRIBUTABLE RISK PERCENT

[Figure: incidence in the exposed and unexposed groups; the difference between them is I exposed - I unexposed]

Attributable risk percent = (I exposed - I unexposed) / I exposed × 100, where I = incidence

ATTRIBUTABLE RISK PERCENT

Attributable risk in our example: AR = (0.01 - 0.001) / 0.01 × 100 = 90%

It indicates to what extent the disease under study can be attributed to the exposure. If smoking is given up, there will be a 90% reduction in CHD among smokers.
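A short Python sketch of the attributable risk percent calculation, using the incidences from the smoking and CHD table. The function name `attributable_risk_percent` is hypothetical.

```python
from fractions import Fraction


def attributable_risk_percent(i_exposed, i_unexposed):
    """(I exposed - I unexposed) / I exposed, expressed as a percentage."""
    return (i_exposed - i_unexposed) / i_exposed * 100


i_exposed = Fraction(70, 7000)    # incidence among smokers = 0.01
i_unexposed = Fraction(3, 3000)   # incidence among non-smokers = 0.001

ar_percent = attributable_risk_percent(i_exposed, i_unexposed)
print(float(ar_percent))
```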


The limitation of AR% is that it tells us the quantum of reduction in the disease that would be achieved in the "exposed" group if "exposure" was given up by them. However, it does not tell us about the reduction that will occur in the "total population".

POPULATION ATTRIBUTABLE RISK

Population attributable risk percent (PAR%): the proportion of disease in the study population that could be eliminated if the exposure were removed.

PAR% = (Incidence in total population - Incidence in unexposed) / Incidence in total population × 100

PAR% = {(73/10,000) - (3/3000)} / (73/10,000) × 100 = 86%
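The population attributable risk percent can be checked in the same way. This is a sketch with a hypothetical function name; the total-population incidence pools all 73 CHD cases over all 10,000 subjects.

```python
from fractions import Fraction


def population_attributable_risk_percent(i_total, i_unexposed):
    """(I total - I unexposed) / I total, expressed as a percentage."""
    return (i_total - i_unexposed) / i_total * 100


i_total = Fraction(70 + 3, 7000 + 3000)  # 73 cases among 10,000 subjects
i_unexposed = Fraction(3, 3000)          # incidence among non-smokers

par_percent = population_attributable_risk_percent(i_total, i_unexposed)
print(round(float(par_percent), 1))
```

PAR% comes out lower than AR% because it is diluted by the unexposed part of the population, who would gain nothing from removing the exposure.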

BIASES IN COHORT STUDY

Measurement (ascertainment) bias: To obviate this, inform all subjects of both groups well in advance of the dates and timings of medical examination, and ensure that both groups are examined by observers who have similar training and who use similar instruments and techniques.

Observer bias: This occurs because the investigator is aware of which subject is 'exposed' and which is not. To obviate this, if possible, 'blind' the observer to the exposure status, the details of exposure being known only to another co-worker who is not making any observation regarding ascertainment of outcome.

Cross-over bias: This may happen because those having the exposure (e.g. smokers) may cross over to the non-exposed group (i.e. become non-smokers) and vice versa. Periodic evaluation of both groups as regards level of exposure, making record entries and subsequent adjustments in the data analysis can help to overcome this problem.

'Loss to follow up' bias: Some subjects are in any case likely to be lost to follow-up or to drop out.

ADVANTAGES

• Incidence can be calculated.
• Several possible outcomes related to exposure can be studied simultaneously.
• Cohort studies provide a direct estimate of RR.
• Dose-response ratio can also be calculated.

DISADVANTAGES

• Involves a large number of people.
• Very lengthy: takes a very long time to complete.
• Certain administrative problems: loss of experienced staff, loss of funding, extensive record keeping.
• Selection of comparison group is a limiting factor.
• There may be changes in study methods or diagnostic criteria of the disease over the prolonged period.
• Cohort studies are expensive.
• The study may itself alter the patients' behaviour.

FRAMINGHAM HEART STUDY

One of the best-known cohort studies is the Framingham Study of cardiovascular disease.

Started in 1948.

Framingham is a town in Massachusetts, about 20 miles from Boston.

Residents between 30 and 62 years of age were considered eligible for the study.

In 1971, a second generation of participants was enrolled.

In April 2002, a third generation was enrolled in the core study.

FRAMINGHAM HEART STUDY

Hypotheses:

• Incidence of CHD increases with age.
• Persons with hypertension develop CHD more often.
• Elevated cholesterol is associated with increased CHD.
• Tobacco smoking and habitual use of alcohol are associated with increased CHD.
• Increased physical activity is associated with decreased incidence of CHD.
• Increased body weight increases incidence of CHD.
• Diabetes increases incidence of CHD.

New coronary events were identified by examining the study population every 2 years and by daily surveillance of hospitalizations at the only hospital in Framingham.


FRAMINGHAM HEART STUDY

Results:

1960s: Cigarette smoking, increased cholesterol, elevated blood pressure and obesity increase the risk of heart disease. Exercise decreases the risk of heart disease.

1970s: Elevated blood pressure increases the risk of stroke. Postmenopausal women have an increased risk of heart disease compared with premenopausal women.

1980s: High levels of HDL cholesterol reduce the risk of heart disease.

1990s: Elevated blood pressure can progress to heart failure. At 40 years of age, the lifetime risk for CHD is 50% for men and 33% for women.


FRAMINGHAM HEART STUDY

2000s: "High normal blood pressure" increases the risk of cardiovascular disease (high normal blood pressure is called prehypertension in medicine; it is defined as a systolic pressure of 120–139 mm Hg and/or a diastolic pressure of 80–89 mm Hg). Lifetime risk of developing elevated blood pressure is 90%. Serum aldosterone levels predict risk of elevated blood pressure. Lifetime risk for obesity is approximately 50%.


DIFFERENCES

SUMMARY

Specify the research question, objectives and background significance, and confirm that a cohort study is the right design.

Specify the variables of interest and their scales of measurement (exposure variable, outcome variable, confounders).

Specify the exclusion criteria (e.g. restricting the study to males).

Calculate the sample size.

Select the study cohort (special exposure groups, or on the basis of geographical or administrative boundaries).

• When there is good evidence of association between exposure and disease, as derived from clinical observation and supported by descriptive and case-control studies.

• When exposure is rare, but the incidence of disease is high among the exposed.

• When attrition of the study population can be minimised, e.g. follow-up is easy and the cohort is stable.

• When ample funds are available.


Select the comparison cohort (external group or internal group).

Specify the sampling procedure (simple random or systematic random sampling).

Exclude the disease or outcome of interest in both the exposed and unexposed cohort groups.

Obtain data on exposure level.

Obtain data on all potential confounding factors.

Consider matching (matching is not important; if feasible, use frequency matching).

Follow up and ascertainment of ‘outcome’ of interest

Analysis

REFERENCES

1. Text book of PSM, 19th ed., by K. Park.
2. Lange Medical Epidemiology, 4th ed., by Raymond S. Greenberg, Stephen R. Daniels, John William Eley.
3. Epidemiology by Leon Gordis.
4. Textbook of Public Health and Community Medicine by Rajvir Bhalwar, Rajesh Vaidya, Reena Tilak.
5. http://en.wikipedia.org/wiki/Cohort_(statistics)

THANK YOU
