Linked2Safety Project (FP7-ICT-2011-7 – 5.3)
A NEXT-GENERATION, SECURE LINKED DATA MEDICAL INFORMATION SPACE FOR SEMANTICALLY-INTERCONNECTING ELECTRONIC HEALTH RECORDS AND CLINICAL TRIALS SYSTEMS ADVANCING PATIENTS SAFETY IN CLINICAL RESEARCH

12th International Conference on Bioinformatics and Bioengineering, Larnaka

The effects of applying cell-suppression and perturbation to aggregated genetic data

Athos Antoniades, John Keane, Aristos Aristodimou, Christa Philipou, Andreas Constantinou, Christos Georgousopoulos, Federica Tozzi, Kyriacos Kyriacou, Andreas Hadjisavas, Maria Loizidou, Christiana Demitriou and Constantinos Pattichis

Introduction

• Why Share Data?
• What are the current legal and ethical limitations?
• How have scientists shared medical data so far?
• Key Problems
• Perturbation
• Cell Suppression

The Problem

Why share data:
• Replication Testing
• Statistical Power
• Multiple Testing Problem

Legal and Ethical Issues:
• Anonymization vs Pseudonymization
• Limitations derived from the consent form signed by subjects
• Other regional, study, or subject-specific issues

How have scientists shared medical data

Contingency Table and Data Cube example

          aa     aA     AA
Case      U00    U01    U02
Control   U10    U11    U12
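The contingency table above can be produced by counting subjects per (group, genotype) cell. The following is a minimal sketch, assuming subject-level records are available as (group, genotype) pairs; the function name `contingency_table` and the sample records are hypothetical and made up purely for illustration.

```python
from collections import Counter

GROUPS = ("Case", "Control")
GENOTYPES = ("aa", "aA", "AA")

def contingency_table(records):
    """Count subjects per (group, genotype) cell; rows follow GROUPS, columns GENOTYPES."""
    counts = Counter(records)
    return [[counts[(group, genotype)] for genotype in GENOTYPES] for group in GROUPS]

# Hypothetical subject-level records, invented for this example only.
records = [
    ("Case", "aa"), ("Case", "aA"), ("Case", "aA"), ("Case", "AA"),
    ("Control", "aa"), ("Control", "aa"), ("Control", "aA"),
]

table = contingency_table(records)
for group, row in zip(GROUPS, table):
    print(group, row)
```

Only the aggregated counts (the U00 to U12 cells) would be shared, not the subject-level records.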

16-year-old widow problem

A paper that analyzes data from a specific study reports:

                 Marital Status
Age       Married   Widowed   Single
0-16      0         1         50
18-24     10        5         50
25-34     40        7         40
35~       60        15        20

Categorization Differences

Paper 1 that analyzes data from a specific study reports:

                 Marital Status
Age       Married   Widowed   Single
0-16      NA        NA        50
18-24     10        7         50
25-34     40        7         40
35~       60        15        20

Paper 2 that analyzes data from the same study reports:

                 Marital Status
Age       Married   Widowed   Single
0-16      NA        NA        50
18-25     10        8         50
26-35     45        7         40
36~       55        14        20
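One possible reading of this slide (an interpretation, not stated in the source) is that two papers binning the same subjects differently allow single-age counts to be recovered by subtraction. The sketch below assumes integer ages and that both papers cover exactly the same subjects; the helper `subtract` is a hypothetical name.

```python
# Columns are (Married, Widowed, Single), copied from the two tables above.
paper1 = {"18-24": (10, 7, 50), "25-34": (40, 7, 40), "35~": (60, 15, 20)}
paper2 = {"18-25": (10, 8, 50), "26-35": (45, 7, 40), "36~": (55, 14, 20)}

def subtract(a, b):
    """Cell-wise difference of two rows of counts."""
    return tuple(x - y for x, y in zip(a, b))

# 18-25 minus 18-24 leaves only subjects aged exactly 25.
age_25 = subtract(paper2["18-25"], paper1["18-24"])

# 35~ minus 36~ leaves only subjects aged exactly 35.
age_35 = subtract(paper1["35~"], paper2["36~"])

# Consistency check: 26-35 should equal 25-34 without age 25 plus age 35.
rebuilt_26_35 = tuple(
    c - a + b for c, a, b in zip(paper1["25-34"], age_25, age_35)
)

print("Age 25:", age_25)
print("Age 35:", age_35)
print("Consistent with Paper 2:", rebuilt_26_35 == paper2["26-35"])
```

Under these assumptions the difference isolates a single widowed 25-year-old, even though neither paper publishes a cell that small for that age band.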

Perturbation and Cell Suppression

Original Data

                 Marital Status
Age       Married   Widowed   Single
0-16      0         1         50
18-24     10        7         50
25-34     40        7         40
35~       60        15        20

Perturbation (±1) and Cell Suppression (<5)

                 Marital Status
Age       Married   Widowed   Single
0-16      NA        NA        51
18-24     9         8         49
25-34     40        7         41
35~       61        14        21
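The transformation shown above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the suppression decision is made on the original count, and that the noise is a uniformly drawn integer in [-max_noise, max_noise]; the slides do not specify either. The function name `perturb_and_suppress` and its parameters are hypothetical.

```python
import random

def perturb_and_suppress(table, max_noise=1, threshold=5, rng=None):
    """Suppress cells with original count < threshold (None stands for NA),
    then add integer noise in [-max_noise, max_noise] to the remaining cells."""
    rng = rng or random.Random()
    released = []
    for row in table:
        new_row = []
        for count in row:
            if count < threshold:
                new_row.append(None)  # suppressed cell, shown as NA in the slide
            else:
                noise = rng.randint(-max_noise, max_noise)  # assumed uniform noise
                new_row.append(max(0, count + noise))       # counts stay non-negative
        released.append(new_row)
    return released

# Original data from the slide: columns are Married, Widowed, Single.
original = [
    [0, 1, 50],    # 0-16
    [10, 7, 50],   # 18-24
    [40, 7, 40],   # 25-34
    [60, 15, 20],  # 35~
]

released = perturb_and_suppress(original, max_noise=1, threshold=5,
                                rng=random.Random(0))
for row in released:
    print(["NA" if cell is None else cell for cell in row])
```

Because the noise is random, the released values will generally differ from the ones on the slide, but the same cells (0 and 1 in the 0-16 row) are suppressed.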

Evaluation

• Most common parameters tested:
  Perturbation: [0], [-1,1], [-3,3], [-5,5], [-10,10]
  Cell Suppression: <0, <=1, <=3, <=5, <=10

• Standard main effect test using Chi Square.

• Pearson’s Correlation Coefficient used to evaluate the deviation of each parameter combination from the original results.

• A priori defined threshold for Pearson’s correlation coefficient: <=0.95.
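The evaluation steps above can be sketched as follows, under stated assumptions: the chi-square statistic is computed per marker on a case/control by genotype table, and Pearson's correlation is taken across markers between the statistics from the original and the perturbed tables. The slides do not say which quantity was correlated, so that choice is an assumption, and all counts below are hypothetical.

```python
import math

THRESHOLD = 0.95  # a priori threshold quoted on the slide

def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical tables for three markers (rows: Case, Control; columns: aa, aA, AA).
originals = [
    [[30, 50, 20], [40, 45, 15]],
    [[10, 40, 50], [25, 45, 30]],
    [[33, 34, 33], [32, 35, 33]],
]
# The same markers after a hypothetical +-1 perturbation.
perturbed = [
    [[31, 49, 20], [40, 46, 14]],
    [[9, 41, 50], [26, 44, 30]],
    [[33, 35, 32], [31, 35, 34]],
]

r = pearson_r([chi_square(t) for t in originals],
              [chi_square(t) for t in perturbed])
print("Pearson r:", r)
print("At or below the 0.95 threshold:", r <= THRESHOLD)
```

Repeating this for every combination of perturbation range and suppression threshold gives one correlation value per parameter setting, which can then be compared against the threshold.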

Evaluating Parameters with a matrix of graphs

Conclusion and Future Work

For this dataset, we were able to identify the maximum noise that can be added to the data without significantly affecting the outcomes.

The results are relevant only to MASTOS; for any other dataset, the analytical approach described here needs to be repeated.

Further investigation is necessary to identify the minimum parameter settings to satisfy legal and ethical requirements.

Who to Contact

Athos Antoniades
University of Cyprus

email: [email protected]