Division of Geriatrics Using Secondary Data Analysis for Outcomes Research Epi 211 April 2011 Michael Steinman, MD


Division of Geriatrics

Using Secondary Data Analysis for Outcomes Research

Epi 211 April 2011

Michael Steinman, MD

Disclosures and acknowledgements

Disclosures: None

Acknowledgements:
• J. Michael McWilliams
• Ann Nattinger
• SGIM Research Committee

Shameless plug for CER: http://ctsi.ucsf.edu/research/cer

Question:

• You are a fellow / junior faculty member interested in studying...
– Impact of nurse-led HTN clinics on clinical outcomes in patients with HTN
– Impact of implementing EMRs on appropriate prescribing in ambulatory surgical patients
– Whether quality measures of asthma control in children correlate with actual clinical outcomes in this population


Question:

• Here’s your choice:

– A. Get a multimillion dollar grant to conduct a multi-center, multi-year RCT

– B. Analyze existing data

Learning objectives

• Appreciate key conceptual and methodologic issues involved in outcomes research employing secondary data analysis
• Identify and use online tools for locating and learning about datasets relevant to your research

Overview

• Working with secondary data
– Conceptual and methodologic issues
• Overview of high-value datasets and web-based resources
• Q&A


Working with Secondary Data

Key Take-Home Points

• Secondary data analysis is rigorous research
– Not throwing data on a wall and seeing what sticks
• RQ must meet FINER criteria and be interesting a priori
• Know the data as if it were your own
– How was it collected; limitations (including validity)
• Read the codebooks and any/all documentation; validation studies; speak with PIs.
– Perfect enemy of good (but so is crap)

Conceiving a Project

• Which comes first: question or dataset?
a. Research question first
b. Dataset first
• Hybrid approach
1. Identify research focus, broad question
2. Consider candidate datasets
3. Hone question
4. Iterate between 2 and 3

Types of Secondary Data

• Data that have been collected but not for you
• Survey
• Administrative (claims)
• Discharge
• Medical chart / EMR
• Disease registries
• Aggregate (ARF, US Census)
• Combinations and linkages

Selecting a Database

• Compatibility with research question(s)
• Availability and expense
• Sample: representativeness, power
• Measures of interest present and valid
– Predictors, outcomes, confounders
• Messiness and missingness
• Local expertise
• Linkages

Challenges and Pitfalls

1. Causal inference
• Inherently limited with observational data
• Does not preclude quasi-experimental designs to recover causal effects
• Core of comparative effectiveness research
• Value of these approaches highly dependent on expected confounders
• For example, study of medical management vs. catheterization for AMI

Challenges and Pitfalls

2. Validity of measures
– Beware of assumptions
– Problems: coding, reporting, recall biases
– Carefully read the codebooks and documentation about the study
• How variables measured
• (Who was included in study)
– Solutions: direct validation in subgroup or another data source, literature review, sensitivity analyses
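One way to picture "direct validation in subgroup" is to compare the measure in the dataset against a reference standard (for example, chart review) for a small number of patients. The sketch below is only an illustration: the function name `validate_measure` and the counts are made up and do not come from any study.

```python
def validate_measure(pairs):
    """Compare a dataset-derived flag with a reference standard.

    pairs: iterable of (dataset_flag, reference_flag) booleans, one per
    patient in the validation subgroup.
    """
    tp = fp = fn = tn = 0
    for dataset_flag, reference_flag in pairs:
        if dataset_flag and reference_flag:
            tp += 1
        elif dataset_flag:
            fp += 1
        elif reference_flag:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Return None rather than dividing by zero in tiny subgroups.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical validation subgroup of 200 patients (invented counts).
pairs = ([(True, True)] * 40 + [(True, False)] * 10
         + [(False, True)] * 20 + [(False, False)] * 130)
result = validate_measure(pairs)
print(result)
```

Rerunning the main analysis under different plausible values of these quantities is one simple form of sensitivity analysis.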

What You Want and What You Have

• Want to measure financial resources
– Explanation for underuse of health services, poor outcomes?
• Have measures of income.
• Are the two equivalent?
• Might financial resources also depend on:
– Other assets – especially retired persons?
– Family and community resources

What You Want and What You Have

• Want to measure presence of a chronic disease
• Have ICD9 codes from Medicare billing claims.
• Will this work?
• Accuracy of ICD9 claims may depend on:
– Type of disease: specificity of symptoms, “dominance” in clinical visit, accuracy of clinician diagnosis
– Coding incentives: upcoding in Medicare, undercoding in VA
– How codes operationalized: which codes to use; require 1 or 2 separate codes; what time period; etc.
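To make "how codes operationalized" concrete, here is a minimal sketch of one possible case definition. Everything in it is an assumption for illustration: the function name `has_condition`, the 30-day minimum gap, the 730-day window, and the toy claims. The slides do not prescribe any of these values. ICD-9 250.xx (diabetes mellitus) is used as the example code family.

```python
from datetime import date


def has_condition(claims, code_prefixes, require_two=True,
                  min_gap_days=30, window_days=730):
    """Flag a patient as having a condition based on claims.

    claims: list of (service_date, icd9_code) tuples for one patient.
    require_two: if True, need two qualifying claims on dates that are
    at least min_gap_days apart and no more than window_days apart.
    """
    dates = sorted({d for d, code in claims
                    if code.startswith(tuple(code_prefixes))})
    if not require_two:
        return len(dates) >= 1
    for i, first in enumerate(dates):
        for later in dates[i + 1:]:
            gap = (later - first).days
            if min_gap_days <= gap <= window_days:
                return True
    return False


# Toy claims histories (invented).
patient_a = [(date(2009, 1, 10), "250.00"), (date(2009, 1, 12), "250.00")]
patient_b = [(date(2009, 1, 10), "250.00"), (date(2009, 6, 1), "250.02")]

for name, claims in [("A", patient_a), ("B", patient_b)]:
    print(name,
          has_condition(claims, ["250"], require_two=False),
          has_condition(claims, ["250"], require_two=True))
```

Patient A qualifies under the one-code rule but not the two-code rule, which is the kind of difference worth checking in a sensitivity analysis.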

Challenges and Pitfalls

3. Complexity of file structure
– Row in dataset may not be unit of analysis
– Skip patterns, proxy respondents
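A small sketch of why the row may not be the unit of analysis: in a visit-level file, patients with more visits count more unless rows are first collapsed to one record per patient. The field names (`patient_id`, `visit_date`, `sbp`) and values are hypothetical.

```python
from collections import defaultdict

# Hypothetical visit-level file: one row per visit, not per patient.
visits = [
    {"patient_id": "A", "visit_date": "2010-01-05", "sbp": 150},
    {"patient_id": "A", "visit_date": "2010-03-02", "sbp": 140},
    {"patient_id": "A", "visit_date": "2010-06-11", "sbp": 130},
    {"patient_id": "B", "visit_date": "2010-02-20", "sbp": 120},
]


def to_patient_level(rows):
    """Collapse visit rows into one summary record per patient."""
    by_patient = defaultdict(list)
    for row in rows:
        by_patient[row["patient_id"]].append(row)
    summary = {}
    for pid, patient_rows in by_patient.items():
        # ISO-formatted dates sort correctly as strings.
        patient_rows.sort(key=lambda r: r["visit_date"])
        values = [r["sbp"] for r in patient_rows]
        summary[pid] = {
            "n_visits": len(values),
            "mean_sbp": sum(values) / len(values),
            "last_sbp": values[-1],
        }
    return summary


patients = to_patient_level(visits)
visit_level_mean = sum(r["sbp"] for r in visits) / len(visits)
patient_level_mean = (sum(p["mean_sbp"] for p in patients.values())
                      / len(patients))
print(visit_level_mean, patient_level_mean)
```

The two means differ because patient A contributes three rows; which one is appropriate depends on the research question.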

A Simple Question?

Ask: IF ((piRTab1X007AFinFam = FAMILYR) OR (piRTab1X007AFinFam = FINANCIAL_FAMILYR)) AND ((ACTIVELANGUAGE <> EXTENG) AND (ACTIVELANGUAGE <> EXTSPN)) AND (piInitA106_NumContactKids > 0) AND (piInitA100_NumNRKids > 0)

JE012 CHILDREN LIVE WITHIN 10 MILES
Section: E   Level: Household   Type: Numeric   Width: 1   Decimals: 0
CAI Reference: SecE.KidStatus.E012_
2000 Link: G1980   2002 Link: HE012

IF {R DOES NOT HAVE SPOUSE/PARTNER and DOES NOT STILL HAVE HOME OUTSIDE NURSING HOME {(CS11/A028=1) and (CS26/A070 NOT 1)}} or {R & SPOUSE/PARTNER} LIVE IN SAME NURSING HOME (CS11/A028=1 and CS12/A030=1): [Do any of your children who do not live with you/Does CHILD NAME] live within 10 miles of you (in R's NURSING HOME CITY, STATE (CS25b/A067))?

OTHERWISE: [Do any of your children (who do not live with you)/Does CHILD NAME] live within 10 miles of you (in MAIN RESIDENCE [CITY/CITY, STATE STATE])?

6802   1. YES
4720   5. NO
  32   8. DK (Don't Know); NA (Not Ascertained)
   4   9. RF (Refused)
2087   Blank. INAP (Inapplicable)

* From the Health and Retirement Study
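The frequency table for JE012 shows why the codebook matters: "no" is coded 5 rather than 0, and blank means the question was skipped, not that the answer was no. The sketch below uses the counts from the codebook excerpt; the function names `recode` and `share_yes` are made up for illustration.

```python
# Counts copied from the JE012 codebook excerpt; "" stands for Blank (INAP).
counts = {"1": 6802, "5": 4720, "8": 32, "9": 4, "": 2087}


def recode(raw):
    """Map a raw JE012 value to True / False / None.

    1 -> True (yes), 5 -> False (no); don't know, refused, and
    inapplicable (blank) are all returned as None here.
    """
    raw = raw.strip()
    if raw == "1":
        return True
    if raw == "5":
        return False
    return None


def share_yes(counts, denominator_codes):
    """Proportion answering yes among the chosen set of codes."""
    return counts["1"] / sum(counts[c] for c in denominator_codes)


valid_only = share_yes(counts, ["1", "5"])
all_rows = share_yes(counts, ["1", "5", "8", "9", ""])
print(round(valid_only, 3), round(all_rows, 3))
```

Treating blank rows as "no" would move the estimate from roughly 59% to roughly 50%, even though those households were never asked the question.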

Challenges and Pitfalls

4. Data mining / overfitting
• Is urine cortisol associated with Catholicism?
• But…
• “Just because you were too stupid to think of the question in advance doesn’t mean it’s not important”
- Warren Browner
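A quick back-of-the-envelope sketch of why unplanned data mining produces spurious findings. It assumes independent tests with no true associations, which is a simplification; the function names are illustrative only.

```python
def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance of at least one 'significant' result when no true
    associations exist, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


for n in (1, 5, 20, 100):
    print(n, round(prob_any_false_positive(n), 3),
          bonferroni_threshold(n))
```

With 20 unplanned comparisons at the 0.05 level, the chance of at least one false positive is about 64%. That is the argument for specifying the question in advance, or at least for reporting how many associations were examined.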

Challenges and Pitfalls

5. Representativeness of Sample
• External validity (generalizability)
• Internal validity (selection bias)
• Example: comparing outcomes for insured and uninsured patients using hospital discharge data
– Must be hospitalized to enter sample
– Not only limits generalizability (to outpatients)
– But inferences about the sample may be wrong
– Sample would need to include uninsured who would have been hospitalized if insured
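The discharge-data example can be illustrated with expected-value arithmetic. All numbers below are invented for the sketch: insurance is assumed to have no effect on mortality, and the two groups differ only in how likely mildly ill patients are to be admitted.

```python
# Invented population: same severity mix in both insurance groups.
SEVERITY_MIX = {"mild": 800, "severe": 200}
# Assumed death risk depends on severity only, not on insurance.
DEATH_RISK = {"mild": 0.02, "severe": 0.20}
# Assumed admission probabilities: mildly ill uninsured are rarely admitted.
ADMIT_PROB = {
    "insured": {"mild": 0.50, "severe": 1.0},
    "uninsured": {"mild": 0.10, "severe": 1.0},
}


def mortality(group, hospitalized_only):
    """Expected mortality in a group, overall or among those admitted."""
    people = deaths = 0.0
    for severity, n in SEVERITY_MIX.items():
        weight = ADMIT_PROB[group][severity] if hospitalized_only else 1.0
        people += n * weight
        deaths += n * weight * DEATH_RISK[severity]
    return deaths / people


for group in ("insured", "uninsured"):
    print(group,
          round(mortality(group, hospitalized_only=False), 3),
          round(mortality(group, hospitalized_only=True), 3))
```

In the full population the two groups have identical mortality, but among hospitalized patients the uninsured look worse simply because only their sickest members enter the sample.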

Finding the Right Dataset

• Contain variables of interest: predictor, outcome, confounders
• Relevant time frame
– Cross-sectional, longitudinal
• Feasible
– Access: time, bureaucracy, cost
– Usable
• No perfect datasets -> hybrid approach of developing research question

Administrative Data (VA)

• VA has multiple high-value administrative databases
– Outpatient visit information
• Visit date, type of clinic, provider, ICD9 diagnoses
– Inpatient information
• Admitting dx(s), discharge dx(s), CPT codes, bed section, meds administered
– Lab data
• >40 labs
– Pharmacy data
• All inpatient and outpatient fills
– Academic affiliation
– etc


Administrative Data (VA)

• Huge bureaucracy and paperwork

Administrative Data (VA)

• Messy data
• Huge size
– 2 TB server
• Data analyst

Survey Data (NHANES)

• National Health and Nutrition Examination Survey (NHANES)
– Nationally representative sample of >10K patients every 2 years
– Extensive interview data on clinical history (including diseases, behaviors, psychosocial parameters, etc.)
– Physical exam information (e.g. VS)
– Labs, biomarkers

Survey Data (NHANES)

• Free and easy to download
• (Relatively) easy to use
– Although requires careful reading of documentation
• Serial cross-sectional
• Disease data self-report
• Very limited information about providers and systems of care


Survey Data (NAMCS)

• National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS)

• Nationally representative sample of ~70K outpatient and ED visits per year

• Physician-completed form about office visit

Survey Data (NAMCS)

• Data more from physician perspective (diagnoses, treatments Rx’ed, etc) and some info on providers (e.g., clinic organization, use of EMRs, etc)
• Serial cross-sectional
– Visit-focused
– Not comprehensive, ? value for chronic diseases

Discharge Data (NIS)

• National Inpatient Sample (NIS)
– Database of inpatient hospital stays collected from ~20% of US community hospitals by AHRQ
– Diagnoses and procedures, severity adjustment elements, payment source, hospital organizational characteristics
– Hospital and county identifiers that allow linkage to the American Hospital Association Annual Survey and Area Resource File

Discharge Data (NIS)

• Relatively easy to access (DUA, $200/yr)
• Relatively easy to use
– Though need close attention to documentation
• Limited data elements
• Huge data files

Web-Based Resources

• Society of General Internal Medicine (SGIM) Research Dataset Compendium
– www.sgim.org/go/datasets
• UCSF CELDAC
– http://ctsi.ucsf.edu/research/celdac
• UCSF K-12 Data Resource Center
– http://www.epibiostat.ucsf.edu/courses/RoadmapK12/PublicDataSetResources/

Finding Additional Resources

• National Information Center on Health Services Research and Health Care Technology (NICHSR)
• Inter-University Consortium for Political and Social Research (ICPSR)
• Partners in Information Access for the Public Health Workforce
• Roadmap K-12 Data Resource Center (UCSF)
• List of datasets from the American Sociological Association
• Canadian Research Data Centers – Data Sets and Research Tools (Canada)
• Directory of Health and Human Services Data Resources
• Publicly Available Databases from National Institute on Aging (NIA)
• Publicly Available Databases from National Heart, Lung, & Blood Institute (NHLBI)
• National Center for Health Statistics (NCHS) Data Warehouse
• Medicare Research Data Assistance Center (RESDAC); and Centers for Medicare and Medicaid Services (CMS) Research, Statistics, Data & Systems
• Veterans Affairs (VA) data

(all available at www.sgim.org/go/datasets)

National Information Center on Health Services Research and Health Care Technology (NICHSR)

• Databases, data repositories, health statistics
• Fellowship and funding opportunities
• Glossaries, research and clinical guidelines
• Evidence-based practice and health technology assessment
• Specialized PubMed searches on healthcare quality and costs

http://www.nlm.nih.gov/hsrinfo/datasites.html

Inter-University Consortium for Political and Social Research (ICPSR)

• World’s largest archive of social science data
• Searchable
• Many sub-archives relevant to HSR
– Health and Medical Care Archive
– National Archive of Computerized Data on Aging

http://www.icpsr.umich.edu/icpsrweb/ICPSR/partners/archives.jsp

Conclusions

• Secondary data has lots of advantages
– Relatively quick, tremendous power, high-profile work
• Approach with a high level of detail and care
– Conceptual background and RQ
– Validity and use of measures
• Explore range of options available – but also take advantage of resources at hand