Privacy Risk and Big Data:
The Promise and Peril of
Re-Identification Research
Michelle N. Meyer, JD, PhD Assistant Professor and Associate Director, Research Ethics,
Geisinger Health System
Human Subject Protection: SPEAKER Changes Thursday, October 6, 2016
MICHELLE MEYER, JD, PhD, (www.michellenmeyer.com) is an Assistant
Professor and Associate Director, Research Ethics, in the Center for
Translational Bioethics and Health Care Policy at Geisinger Health
System. Before joining Geisinger in August 2016, she was an Assistant
Professor of Bioethics at Clarkson University and Director of Bioethics
Policy in the Clarkson–Icahn School of Medicine at Mt. Sinai Bioethics
Program. Previously, she was an Academic Fellow at Harvard Law
School’s Petrie-Flom Center, a Greenwall Fellow at Johns Hopkins and
Georgetown, and a Research Fellow at Harvard’s Kennedy School of Government. Her research focuses on
research ethics and regulation, including learning healthcare systems; genetics and reproduction;
applications of behavioral science to law and policy; and the law, ethics, and psychology of evidence-based
policy and practice. Her writing has appeared in leading journals of law, science, and bioethics, in popular
media outlets including the New York Times, the L.A. Times, Slate, and Wired, and at Forbes, where she is a
contributor. She is on the Advisory Boards of the Social Science Genetic Association Consortium and Making
Science Less WEIRD and a member of the Board of Directors of Open Humans Foundation, as well as a
participant in its Personal Genome Project. She earned a Ph.D. in religious studies, with a focus on applied
ethics, from the University of Virginia, a J.D. from Harvard Law School, where she was an editor of the
Harvard Law Review, and an A.B. in religious studies, summa cum laude, from Dartmouth College.
BIG DATA & PRIVACY:
The Promise & Peril of
Re-Identification Research
Michelle N. Meyer, PhD, JD
Assistant Professor
Associate Director, Research Ethics
Center for Translational Bioethics & Health Care Policy
Geisinger Health System
Agenda
• Why re-identification research?
• Transition: One research participant’s experience as a re-identification target
• Ethical & regulatory issues in re-identification research
  • Consent
  • Risk-benefit analysis
  • Debriefing
• Best practices for data holders, re-ID researchers & IRBs
Why Re-Identification
Research?
Data Quality-Privacy Tradeoff
Source: Daniel Barth-Jones
Governor Weld (1998)
Sweeney, L. k-anonymity: a model for protecting privacy. International Journal on Uncertainty, Fuzziness and
Knowledge-based Systems, 10 (5), 2002; 557-570.
Barth-Jones, D.C. The 'Re-Identification' of Governor William Weld's Medical Information: A Critical Re-Examination
of Health Data Identification Risks and Privacy Protections, Then and Now (July 2012),
http://ssrn.com/abstract=2076397
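The k-anonymity model cited on this slide asks that every record share its combination of quasi-identifiers with at least k-1 other records. The sketch below illustrates that check on a toy table. The records, the field names (zip, birth_year, sex), and the helper functions are hypothetical and invented purely for illustration; they are not taken from the talk or from any real data set.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )


def k_anonymity(records, quasi_identifiers):
    """Return k: the size of the smallest group of records that look alike
    on the quasi-identifiers. Returns 0 for an empty table."""
    sizes = group_sizes(records, quasi_identifiers)
    return min(sizes.values()) if sizes else 0


def unique_records(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination is unique,
    i.e. the ones most exposed to linkage with an outside data set."""
    sizes = group_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] == 1
    ]


# Made-up toy records (assumption: not real data).
records = [
    {"zip": "02138", "birth_year": 1945, "sex": "M", "diagnosis": "A"},
    {"zip": "02138", "birth_year": 1945, "sex": "M", "diagnosis": "B"},
    {"zip": "02139", "birth_year": 1962, "sex": "F", "diagnosis": "C"},
    {"zip": "02139", "birth_year": 1962, "sex": "F", "diagnosis": "D"},
    {"zip": "02140", "birth_year": 1980, "sex": "F", "diagnosis": "E"},
]
QUASI = ["zip", "birth_year", "sex"]

k = k_anonymity(records, QUASI)
exposed = unique_records(records, QUASI)
print(f"k = {k}; records unique on {QUASI}: {len(exposed)}")
```

In this toy table one record is unique on the three quasi-identifiers, so the table is only 1-anonymous. Coarsening or suppressing fields raises k but lowers data quality, which is the tradeoff named on the earlier slide.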
AOL (2006)
Netflix (2008)
Y-STR DNA Sequences (2013)
Cell Phone Data (2013)
NYC Taxi Data (2014)
Credit Card Data (2015)
PGP (2013): One Participant’s Experience
Ethical & Regulatory Issues
Ethical & Regulatory Issues: Consent
Other things PGP participants “consented” to?
• Could reveal your data to your health care provider or insurer or
include it in your medical record
• Could plant your DNA at a crime scene
• Could be discriminated against in employment, insurance
• Data could be publicly disclosed via breaches or hacking
• Could change and republish your data to falsely suggest you have a
propensity for a disease or other detrimental trait
• Could use your cell lines for new or unexpected reproductive or
other purposes, including cloning
Ethical & Regulatory Issues: Consent
≠
Ethical & Regulatory Issues: Consent
Does broad consent for de-identified data to be used
for any research purpose
= consent to be re-identified or to have personally
identifiable data created?
Ethical & Regulatory Issues: Consent
Alternatives to nonconsensual re-identification research
• Explicit consent
• Simulated data sets
Ethical & Regulatory Issues: Risk-Benefit Analysis
Distinguish
Risk-Benefit of releasing de-identified data
from
Risk-Benefit of conducting a re-identification study
Ethical & Regulatory Issues: Risk-Benefit Analysis
Q: Are the expected benefits of re-identification research
reasonable in relation to the risks to participants?
A: It depends. (Sorry.)
Ethical & Regulatory Issues: Risk-Benefit Analysis
Value of white-hat research is bounded by risk of black-hat
re-identification
Risk =
• Feasibility
• Incentives
• Magnitude of harm
Ethical & Regulatory Issues: Risk-Benefit Analysis
NPRM, p. 53940
Expected benefits
Ethical & Regulatory Issues: Risk-Benefit Analysis
Expected benefits
Sweeney et al. (2013)
Ethical & Regulatory Issues: Risk-Benefit Analysis
Risks: sensitivity of data
Ethical & Regulatory Issues: Risk-Benefit Analysis
Risks: identity of researchers
Ethical & Regulatory Issues: Debriefing
• Who?
  • Data holders
  • Data providers
• What?
  • Give data holders a chance to plug leak
  • Warn data providers of nature & extent of leak, planned release of results/press, any fixes
• Can this duty of researchers to data providers be met by warning data holders?
Ethical & Regulatory Issues: Science Communication
Sweeney et al. (2013)
Ethical & Regulatory Issues: Best Practices
Data holders
• Warn participants during the consent process of the possibility of re-identification
• Have a transparent data use policy about cooperating with/encouraging re-identification researchers
  • E.g., ban on re-identifying any data sources or their contacts (e.g., relatives, household members)
  • E.g., ban on linking other data elements to the data without obtaining certification that data remains de-identified
• Notify participants ASAP about
  • Re-identification attacks
  • Re-identification researchers’ plans to publicize results
  • Any available security updates
  • Contact information for questions/concerns
Thank you