Protecting Genomic Data

Prof. Jean-Pierre Hubaux

Academic Director of the Center for Digital Trust

School of Computer and Communication Sciences

EPFL

With gratitude to the biomedical and CS researchers I have the privilege to work with

Integrating genomics into personalised healthcare: a science-for-policy perspective

Brussels, 13 February 2019




“WannaCry” Ransomware Virus (May 2017)

The Guardian, 14 May 2017


Growing Concern: Medical Data Breaches

Around 5 declared breaches per week, each affecting 500+ people

https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf

Workshop on Artificial Intelligence for Health - JL Raisaro, 11/10/18, NYC


• Lin et al. 2004 Science: 75 or more SNPs (Single Nucleotide Polymorphisms) are sufficient to identify a single person

• Homer et al. 2008 PLOS Genetics: aggregated genomic data (i.e., allele frequencies) can be used for re-identifying an individual in a case group with a certain disease

• Gymrek et al. 2013 Science: surnames can be recovered from personal genomes, linking “anonymous” genomes and public genetic genealogy databases

• Lippert et al. 2017 PNAS: anonymous genomes can also be identified by inferring physical traits and demographic information

• Many more to come…

De-identification of *omic data is impossible


Security / Privacy Requirements for Personalized Health

• Pragmatic approach, gradual introduction of new protection tools

• Different sensitivity levels of the data

• Different access rights

• Exploit existing data (electronic health records) and tools

• Be future-proof (no short-sighted “bricolage”)

• Awareness and enforcement of patient consent


Technologies for Privacy and Security Protection

• Traditional Encryption

• Homomorphic Encryption

• Secure Multiparty Computation

• Trusted Execution Environments

• Differential Privacy

• Distributed Ledger Technologies (Blockchains)


Homomorphic Encryption

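Plaintext domain: $x,\ y \;\longmapsto\; x \circ y$ (compute $\circ$)

Ciphertext domain: $E(x),\ E(y) \;\longmapsto\; E(x) \divideontimes E(y) = E(x \circ y)$ (compute $\divideontimes$ after encrypting $x$ and $y$)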

Homomorphic encryption enables computations directly on encrypted data.
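To make the property concrete, here is a minimal sketch of an additively homomorphic scheme (textbook Paillier) in Python. It is an illustration only, not the scheme used by any system in this talk: the primes are toy-sized and insecure, and the function names (`keygen`, `encrypt`, `decrypt`, `add`) and the per-site counts are hypothetical. Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, so an untrusted party could total per-hospital patient counts without seeing any of them.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy Paillier key generation (tiny primes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add(n, c1, c2):
    """Homomorphic addition: E(x) * E(y) mod n^2 decrypts to x + y."""
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    n, priv = keygen()
    site_counts = [17, 5, 20]  # hypothetical per-site patient counts
    total = encrypt(n, 0)
    for count in site_counts:
        total = add(n, total, encrypt(n, count))
    print(decrypt(n, priv, total))  # 42
```

Here the plaintext operation is addition and the ciphertext operation is multiplication modulo n squared. Encryption is randomized, so two encryptions of the same count look unrelated.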


DPPH – Data Protection in Personalized Health

• 5 research groups across the ETH domain + SDSC (Swiss Data Science Center)

• Funding: CHF 3 million

• Duration: 3 years (4/2018 - 3/2021)

• Funding Program: ETH PHRT (Personalized Health and Related Technologies)

https://dpph.ch

Participating groups:

• LCA1 (JP Hubaux): Systems for privacy-conscious data sharing

• DEDIS (B. Ford): Distributed and Decentralized Trust

• GR-JET (D. Jetchev): Fundamental cryptography

• Fellay Group (J. Fellay): Medical application

• SDSC (O. Verscheure): Data Science Infrastructure and Deployment

• Health Ethics and Policy (E. Vayena): Legal and Ethical analysis

Project goals:

• Address the main privacy, security, scalability, and ethical challenges of data sharing for enabling effective P4 medicine

• Define an optimal balance between usability, scalability and data protection

• Deploy an appropriate set of computing tools


Envisioned Nation-Wide Deployment

Example queries:

• Q1: How many patients with BRCA1 and breast cancer?

• Q2: What is the survival rate for cancer patients undergoing a given chemotherapy?

Roadmap:

• Year 1: small-scale prototype, simple queries

• Year 2 & 3: tiered deployment, extra functionality


MedCo: Combining the best of Information Security and Medical Informatics

Components:

• Data model

• Interoperability layer

• Meta API

• Privacy-preserving computing framework (UnLynx)

• Modern GUI

DISCLAIMER: MedCo is a generic concept. It is not fundamentally tied to these technologies and can be adapted to, and integrated with, other ones.


Conclusion on MedCo demo

• Current state: cohort exploration under homomorphic encryption

  • Fully decentralized architecture

  • Data stay with each data provider

  • Resistance against colluding, malicious adversaries

• Ongoing work: fully decentralized computation under homomorphic encryption

  • Linear regression

  • Logistic regression

  • Neural networks


DPPH – The Role of the Blockchain

[Diagram: several IT nodes, a DCC (Data Coordination Center), the DPPH Blockchain, and a Big Data Platform]

Diagram labels:

• Distributed Access Control

• Distributed Privacy-conscious Processing

• Inference resistance

• Provenance and Reproducibility

• Immutable Log

We use a private blockchain, unlike Bitcoin, which uses a public blockchain.
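The "immutable log" idea can be illustrated with a hash chain, the basic data structure underlying blockchains. The sketch below is an assumption-laden toy, not the DPPH blockchain: it runs in a single process, has no replication or consensus among nodes, and the record fields (`who`, `query`) are hypothetical. It only shows why altering an earlier entry becomes detectable once each entry commits to the hash of its predecessor.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def entry_hash(prev_hash, record):
    """Hash the record together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, record):
    """Append a record, chaining it to the current tail of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })


def verify(log):
    """Return True only if every entry still matches its stored hash and link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append(log, {"who": "researcher-A", "query": "count(BRCA1 AND breast cancer)"})
    append(log, {"who": "researcher-B", "query": "survival rate by chemotherapy"})
    print(verify(log))  # True
    log[0]["record"]["who"] = "someone-else"  # tamper with history
    print(verify(log))  # False
```

In a private (permissioned) blockchain, a known set of nodes would each hold a copy of such a chain and agree on new entries, so a single party could not silently rewrite the log.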


Data Protection for Personalized Health

Swiss Personalized Health Network

At the international level: GA4GH has its own workstream on data security


Events on Genome Privacy and Security

• Dagstuhl seminars on genome privacy and security: 2013, 2015

• Conference on Genome and Patient Privacy (GaPP)

  • March 2016: Stanford School of Medicine

• GenoPri: International Workshop on Genome Privacy and Security

  • July 2014: Amsterdam (co-located with PETS)

  • May 2015: San Jose (co-located with IEEE S&P)

  • November 12, 2016: Chicago (co-located with AMIA)

  • October 15, 2017: Orlando (co-located with Am. Society for Human Genetics (ASHG) and GA4GH)

  • October 3, 2018: Basel (co-located with GA4GH)

• iDash: integrating Data for Analysis, Anonymization and sHaring (already in previous years)

  • October 14, 2017: Orlando

• Inst. for Pure and Applied Mathematics (IPAM, UCLA): Algorithmic Challenges in Protecting Privacy for Biomed Data, 10-12 January 2018

• DPPH Workshop, 15 February 2018

  • Lots of material online: DPPH18.epfl.ch


“genomeprivacy.org”

Community website

• Searchable list of publications on genome privacy and security

• News from major media (from Science, Nature, GenomeWeb, etc.)

• Research groups and companies involved

• Tutorial and tools

• Events (past & future)


Conclusion

• Worldwide, the confidentiality of health data is in jeopardy

• With the advent of genomics:

  • Risk is increasing

  • Conventional medical data protection techniques (de-identification, …) do not work anymore

• The Data Protection for Personalized Health project is a response to these concerns

• Data security is one of the top priorities of GA4GH

https://dpph.ch
