making the case for anonymization
TRANSCRIPT
Making the case for anonymization
Craig Earle, MD MSc FRCPC
Ontario Institute for Cancer ResearchCancer Care OntarioInstitute for Clinical Evaluative SciencesSunnybrook-Odette Cancer CentreUniversity of Toronto
‹#›
PATIENT
Drugs
Ministry of
Health
OUTCOMES
Cancer Control Continuum
Lab
End of
Life
Cancer
Care
Ontario
HospitalPCPSpecialist
HEALTH CARE SYSTEM
Prevention
Screening
Diagnosis
Survivorship
Treatment
COSTQUALITY
ACCESS
Health Services Research (HSR)
3
Ontario strengths for Health Services Research
• Universal healthcare
– Small private insurance sector
• Drug coverage
– > 65 yo
• Centralization of some services
– radiation
Derived chronic conditions (IKN)
People &
Geography (IKN)
Special Collections
(IKN)
Registries
(Cancer, Stoke,
CCN, *Birth outcomes)
Federal immigration register
*First Nations/Metis
(using linked data)
Diabetes
Hypertension
COPD
CHF
AMI
Asthma
IBD
Health Services (IKN)
Physician claims
In-patient hospital discharge abstracts
Emergency and ambulatory care
abstracts
Prescription drug claims
(65 and over)
Home care claimsRehab claims
Long-term care visits
In-patient mental health
data
Providers/Facilities
PhysiciansHospitals
Complex careLong-term
care homesHome care
Real-time (IKN)
*Patient functional
status*Peritoneal
Dialysis
Population Estimates
Canada Census Profiles
Linked de-identified data set for analysisPrimary
Collected Data
(IKN)
Postal CodeConversion/Geographical
Ontario Linkable Data at ICES
People in Ontario since
1985
Unique individual
anonymous #IKN
*HIV
Developmental
Disabilities
Death register
IKN=unique algorithm based on Ontario health card number
5
The Power of Data Linkage
Create cohorts focusing on continuity of care paths, diseases, outcomes of care, including some determinants of health information
Example:
• OCR gives details about cancer diagnoses
• RPDB provides demographic information
• OHIP provides information about physician service utilization
• DAD/NACRS from CIHI assessment of disease, hospital-based procedures and health service outcomes
• Census permits population estimates and impact of SES
• Survey data may add self-reported disease prevalence, health service use and lifestyle information
HSR Across the Cancer Control Continuum
Screening/
Detection
Diagnosis/
Treatment
Survivorship End of life
care
Access Colon Cancer
Check
Referral patterns
for stage IV
NSCLC
Surveillance
after breast
cancer
Aggressiveness
of cancer care
near the end of
life
Quality Colon cancer
screening rates
Effect of Volume
& MD specialty
on ovarian
cancer outcomes
Concordance
with
surveillance
guidelines
Hospice
utilization
patterns
Cost CEA of PET
staging in
esophageal
cancer
Influenza
vaccination
Resource
utilization in
follow up
Case Costing
Outcomes Screening for
2nd cancers in
Hodgkin’s
survivors
Lung cancer
surgery
regionalization
Late effects
of cancer
treatment
Complications
from continuing
chemotherapy
too near death
7
ICES
The Ontario Cancer Data
Linkage Project (‘cd-link’)
9
cd-link goals
1. To make linkages of existing data
sources available as a resource for
cancer health services researchers
2. To put de-identified linked data
directly into the hands of
researchers
10
cd-link
Procedures
Proposal
11
Data Request Form
• Define Cohort
• Datasets
– Active selection of variables
12
Data Use Agreement (DUA)
• Purpose limitation• Confidentiality/re-identification/linkage/re-contact• Security: password protection, encryption, public access,
removable media• Research ethics approval• Limitation on onward transfer/sharing with 3rd parties• Cell size suppression• Pre-publication review• Acknowledgement (not co-authorship or endorsement)• Ownership of data• Returning/destroying data• Breach notification enforcement • Responsibility to educate anyone touching the data• Signed confidentiality agreement with anyone touching
the data• Threat of surprise audits
13
HIPAA 18 restricted variables
1. Name
2. MRN
3. HIC
4. Geographic units < 20,000
5. Dates (except year)
• 6. phone #
• 7. fax #
• 8. e-mail address
• 9. SSN
• 10. license #
• 11. account #
• 12. VIN
• 13. device serial #
• 14. URL
• 15. IP address
• 16. Biometrics
• 17. photos
• 18. any other unique identifying code
“safe harbor method’
14
PARAT
cd-link de-identification
Name OHIP DOB Sex Dx DoDx Adm dt MD Census
med
income
DoD
Lynn
Foma
123456 1/7/46 F NHL 3/9/07 7/4/07 35429 61,435 9/9/07
- 95135 1946* F NHL 2007 117 5384 61,000 184
*or 5-year age groups with high and low limits
No longer PHI. Not human subjects research.
19
Assess feasibility for
cd-link
Presentation to Cancer
Program
ICES Privacy approval
DCP discussions and dataset creation
PARATBurn to DVD and courier
Client receives DVD and completes analysesData
destruction and project closure
Request Form
submittedConsultation phase
Confirmation of Feasibility
issued
Services Agreement issued; DCP discussion and dataset creation
PARAT and IDAVE
onboarding
Dataset posted to
IDAVERequestor analyses data
ICES Data & Analytic Services
DAS
cd-link
- Solves the problem of data availability outside of Ontario- Two private sector pilots underway (Medtronic & Brogan), $100K access fee
- ‘Pilot tested’ 2 releases through DAS- In the process of modifying ICES-CCO collaboration agreement to facilitate shift
to VPN
20
Future challenges
• EHR data
– Misspellings, typographic errors, variations in nomenclature
– Possibly identifying processes
• Knowledge of inclusion in data set?
• Incremental privacy breach?
• ‘De-facing’ imaging data
21
22
Future challenges
• EHR data
– Misspellings, typographic errors, variations in nomenclature
– Possibly identifying processes
• Knowledge of inclusion in data set?
• Incremental privacy breach?
• ‘De-facing’ imaging data
• Genomics
23
Conclusion
• Privacy and research are both public goods
• With the proper safeguards in place, both can be
optimized
“Positive sum (win-win) paradigm”Dr. Ann Cavoukian
(former Ontario Information and Privacy Commissioner)