Turning Big Data into Precision Medicine: Real-life Experiences Dr. Matthieu-P. Schapranow Festival of Genomics, Boston, MA June 24, 2015


Page 1: Turning Big Data into Precision Medicine

Turning Big Data into Precision Medicine: Real-life Experiences

Dr. Matthieu-P. Schapranow Festival of Genomics, Boston, MA

June 24, 2015

Page 2: Turning Big Data into Precision Medicine

Important things first: Where do you find additional information?

■  Online: Visit we.analyzegenomes.com for the latest research results, tools, and news

■  Offline: Read more about it, e.g. High-Performance In-Memory Genome Data Analysis: How In-Memory Database Technology Accelerates Personalized Medicine, In-Memory Data Management Research, Springer, ISBN: 978-3-319-03034-0, 2014

■  In Person: Join us for “Big Data in Medicine”, July 1-2, 2015, in Potsdam, Germany

Schapranow, Festival of Genomics, Boston, MA, June 24, 2015

Turning Big Data into Precision Medicine


Page 3: Turning Big Data into Precision Medicine

What is the Hasso Plattner Institute, Potsdam, Germany?


Page 4: Turning Big Data into Precision Medicine

Who are you dealing with?

■  Since 2009: Program Manager E-Health & Life Sciences

■  2006-2014: Strategic Projects SAP HANA

■  Visiting Scientist at Charité, Berlin and V.A., Boston, MA

■  Software Engineer by training (PhD, M.Sc., B.Sc.)


Page 5: Turning Big Data into Precision Medicine

The Setting: Actors in Oncology

■  Patients

□  Individual anamnesis, family history, and background

□  Require fast access to individualized therapy

■  Clinicians

□  Identify root and extent of disease using laboratory tests

□  Evaluate therapy alternatives, adapt existing therapy

■  Researchers

□  Conduct laboratory work, e.g. analyze patient samples

□  Create new research findings and come up with treatment alternatives


Page 6: Turning Big Data into Precision Medicine

IT Challenges: Distributed Heterogeneous Data Sources

■  Human genome/biological data: 600 GB per full genome, 15 PB+ in databases of leading institutes

■  Prescription data: 1.5B records from 10,000 doctors and 10M patients (100 GB)

■  Clinical trials: currently more than 30k recruiting on ClinicalTrials.gov

■  Human proteome: 160M data points (2.4 GB) per sample, >3 TB raw proteome data in ProteomicsDB

■  PubMed database: >23M articles

■  Hospital information systems: often more than 50 GB

■  Medical sensor data: scan of a single organ in 1 s creates 10 GB of raw data

■  Cancer patient records: >160k records at NCT


Page 7: Turning Big Data into Precision Medicine

Our Approach

Analyze Genomes: Real-time Analysis of Big Medical Data

Architecture diagram labels, in extraction order:

■  In-Memory Database; Extensions for Life Sciences; Data Exchange, App Store; Access Control, Data Protection; Fair Use; Statistical Tools; Real-time Analysis; App-spanning User Profiles; Combined and Linked Data

■  Genome Data; Cellular Pathways; Genome Metadata; Research Publications; Pipeline and Analysis Models; Drugs and Interactions

■  Drug Response Analysis; Pathway Topology Analysis; Medical Knowledge Cockpit; Oncolyzer; Clinical Trial Assessment; Cohort Analysis; ...

Page 8: Turning Big Data into Precision Medicine

Case Vignette I

■  Patient: 48 years, female, non-smoker, smoke-free environment

■  Diagnosis: Non-Small Cell Lung Cancer (NSCLC), stage IV

■  Markers: KRAS, EGFR, BRAF, NRAS, (ERBB2)

■  Initial treatment: Surgery

■  Therapy: Palliative chemotherapy


Medical Knowledge Cockpit

Page 9: Turning Big Data into Precision Medicine

Medical Knowledge Cockpit for Patients and Clinicians: Linking Patient Specifics with International Knowledge

■  Query-oriented search interface

■  Seamless integration of patient specifics, e.g. from EMR

■  Parallel search in international knowledge bases, e.g. for biomarkers, literature, cellular pathways, and clinical trials
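The idea of searching several knowledge bases in parallel can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the three `search_*` functions are hypothetical stand-ins that return canned strings, and the talk does not describe how the cockpit actually dispatches its queries.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for remote knowledge bases; each returns canned text.
def search_literature(gene):
    return [f"publication mentioning {gene}"]

def search_pathways(gene):
    return [f"pathway containing {gene}"]

def search_trials(gene):
    return [f"trial mentioning {gene}"]

SOURCES = {
    "literature": search_literature,
    "pathways": search_pathways,
    "trials": search_trials,
}

def parallel_search(gene):
    """Query every source concurrently and collect the results per source."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {name: pool.submit(fn, gene) for name, fn in SOURCES.items()}
        return {name: future.result() for name, future in futures.items()}

results = parallel_search("EGFR")
```

With real sources, the slowest knowledge base would bound the total response time instead of the sum of all lookups.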


Page 10: Turning Big Data into Precision Medicine

Medical Knowledge Cockpit for Patients and Clinicians

■  Search for affected genes in distributed and heterogeneous data sources

■  Immediate exploration of relevant information, such as

□  Gene descriptions,

□  Molecular impact and related pathways,

□  Scientific publications, and

□  Suitable clinical trials.

■  No manual searching for hours or days: In-memory technology translates searching into interactive finding!

Automatic clinical trial matching built on text analysis features

Unified access to structured and unstructured data sources
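A very reduced picture of text-based trial matching is keyword overlap between patient markers and trial descriptions. The sketch below is an assumption, not the method used in the cockpit: the trial IDs and descriptions are made-up placeholders, and real text analysis would need far more than token matching.

```python
import re

def tokens(text):
    """Lower-case a text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def match_trials(patient_terms, trials):
    """Return (trial_id, matched_terms) pairs, best match first."""
    wanted = {term.lower() for term in patient_terms}
    scored = []
    for trial_id, description in trials.items():
        hits = wanted & tokens(description)
        if hits:
            scored.append((trial_id, sorted(hits)))
    return sorted(scored, key=lambda item: len(item[1]), reverse=True)

# Made-up trial descriptions, for illustration only.
trials = {
    "TRIAL-A": "Recruiting adults with stage IV NSCLC and an EGFR mutation",
    "TRIAL-B": "Study of KRAS-mutant colorectal cancer",
    "TRIAL-C": "Observational study of seasonal allergies",
}

matches = match_trials(["NSCLC", "EGFR", "KRAS"], trials)
```

Here `TRIAL-A` ranks first because two patient terms occur in its description, while `TRIAL-C` is dropped because none do.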


Page 11: Turning Big Data into Precision Medicine


Medical Knowledge Cockpit for Patients and Clinicians: Pathway Topology Analysis

■  Today, search in pathways is limited to “is a certain element contained?”

■  Integrated >1.5k pathways from international sources, e.g. KEGG, HumanCyc, and WikiPathways, into HANA

■  Implemented graph-based topology exploration and ranking based on patient specifics

■  Enables interactive identification of possible dysfunctions affecting the course of a therapy before its start

Unified access to multiple formerly disjoint data sources

Pathway analysis of genetic variants with graph engine
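One way to go beyond "is a certain element contained" is to follow the pathway graph downstream from a patient's affected genes and rank pathways by how much of each is reached. The sketch below is only an assumption about what such a ranking could look like: the two toy pathways, their names, and the scoring rule (share of nodes that are affected or downstream of an affected gene) are hypothetical and not taken from the talk.

```python
from collections import deque

def all_nodes(graph):
    """Collect every node that appears as a source or a target."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    return nodes

def impacted(graph, affected_genes):
    """Breadth-first search: affected genes plus everything downstream."""
    nodes = all_nodes(graph)
    seen = {gene for gene in affected_genes if gene in nodes}
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for target in graph.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def rank_pathways(pathways, affected_genes):
    """Score each pathway by the share of impacted nodes, highest first."""
    scores = {
        name: len(impacted(graph, affected_genes)) / len(all_nodes(graph))
        for name, graph in pathways.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy directed pathways (edges point downstream); simplified for illustration.
pathways = {
    "toy_mapk_cascade": {
        "EGFR": ["KRAS"], "KRAS": ["BRAF"],
        "BRAF": ["MAP2K1"], "MAP2K1": ["MAPK1"],
    },
    "toy_cell_cycle": {"TP53": ["CDKN1A"], "CDKN1A": ["CDK2"]},
}

ranking = rank_pathways(pathways, {"KRAS"})
```

For a KRAS variant, four of the five nodes in the toy cascade are impacted, so it ranks above the unrelated toy pathway.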


Page 12: Turning Big Data into Precision Medicine

Case Vignette II

■  Patient: 67 years, male, smoker, frequently consumes alcohol

■  Diagnosis: Squamous cell carcinoma of the oropharynx, T2N2bM0, stage IVa

■  Initial treatment: Surgery

■  After one year: Relapse with multiple metastatic nodules in the lung

■  Therapy: Palliative chemotherapy


Drug Response Analysis

Page 13: Turning Big Data into Precision Medicine

Drug Response Analysis: Data Sources

Real-time Data Analysis and Interactive Exploration

■  Patient-specific Data: smoking status, tumor classification, and age (1 MB - 100 MB)

■  Tumor-specific Data: raw DNA data and genetic variants (100 MB - 1 TB)

■  Compound Interaction Data: medication efficiency and wet lab results (10 MB - 1 GB)

Page 14: Turning Big Data into Precision Medicine


Page 15: Turning Big Data into Precision Medicine

Showcase

Calculating Drug Response… Predict Drug Response

Page 16: Turning Big Data into Precision Medicine

Cetuximab might be more beneficial for the current case

Page 17: Turning Big Data into Precision Medicine

Our Methodology: Design Thinking

Page 18: Turning Big Data into Precision Medicine

Our Methodology: Design Thinking

Desirability

■  Portfolio of integrated services for clinicians, researchers, and patients

■  Include latest treatment options, e.g. most effective therapies

Viability

■  Enable precision medicine also in far-off regions and developing countries

■  Involve worldwide experts (cost-saving)

■  Combine latest international data (publications, annotations, genome data)

Feasibility

■  HiSeq 2500 enables high-coverage whole genome sequencing in 20h

■  IMDB enables allele frequency determination of 12B records within <1s

■  Cloud-based data processing services reduce TCO
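Allele frequency determination is, at its core, a grouped aggregation over genotype records, which is why it suits a column-oriented store. The sketch below shows the calculation on two parallel columns. It rests on assumptions: the variant IDs `rs1` and `rs2` and the genotype values are placeholders, samples are taken to be diploid, and nothing here reflects how the in-memory database executes the query.

```python
from collections import defaultdict

def allele_frequencies(variant_ids, alt_allele_counts, ploidy=2):
    """Return the alternate-allele frequency per variant.

    variant_ids and alt_allele_counts are parallel columns: one entry per
    genotype call, with the number of alternate alleles (0, 1, or 2).
    """
    alt_total = defaultdict(int)
    call_total = defaultdict(int)
    for variant_id, alt_count in zip(variant_ids, alt_allele_counts):
        alt_total[variant_id] += alt_count
        call_total[variant_id] += 1
    return {
        variant_id: alt_total[variant_id] / (ploidy * call_total[variant_id])
        for variant_id in call_total
    }

# Placeholder genotype calls for three samples at two variants.
variant_ids = ["rs1", "rs1", "rs1", "rs2", "rs2", "rs2"]
alt_allele_counts = [0, 1, 2, 0, 0, 1]

freqs = allele_frequencies(variant_ids, alt_allele_counts)
```

For `rs1`, three alternate alleles out of six observed alleles give a frequency of 0.5.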

Page 19: Turning Big Data into Precision Medicine

Our Technology: In-Memory Database Technology

Combined column and row store

Map/Reduce

Single and multi-tenancy

Lightweight compression

Insert only for time travel

Real-time replication

Working on integers

SQL interface on columns and rows

Active/passive data store

Minimal projections

Group key

Reduction of software layers

Dynamic multi-threading

Bulk load of data

Object-relational mapping

Text retrieval and extraction engine

No aggregate tables

Data partitioning

Any attribute as index

No disk

On-the-fly extensibility

Analytics on historical data

Multi-core/parallelization

Page 20: Turning Big Data into Precision Medicine

In-Memory Database Technology: Hardware Characteristics at HPI FSOC Lab

■  1,000-core cluster at Hasso Plattner Institute with 25 TB main memory

■  25 nodes, each consisting of:

□  40 cores

□  1 TB main memory

□  Intel® Xeon® E7-4870

□  2.40 GHz

□  30 MB cache


Page 21: Turning Big Data into Precision Medicine

Lightweight Compression

■  Main memory access is the new bottleneck

■  Lightweight compression can reduce this bottleneck, i.e.

□  Lossless

□  Improved usage of data bus capacity

□  Work directly on compressed data

Table

RecId   …       …        …
1       C18.0   Colon    091487
2       C32.0   Larynx   357982
3       C00.9   Lip      123489
4       C18.0   Colon    998711
5       C20.0   Rectum   215678
6       C20.0   Rectum   647912
7       C50.9   Mama     167898
8       C18.0   Colon    646470
…

Data Dictionary

ValueId   Value
1         Larynx
2         Lip
3         Rectum
4         Colon
5         Mama

Attribute Vector

RecId   ValueId
1       C18.0
2       C32.0
3       C00.9
4       C18.0
5       C20.0
6       C20.0
7       C50.9
8       C18.0

Inverted Index

ValueId   RecIdList
1         2
2         3
3         5,6
4         1,4,8
5         7

•  Typical compression factor of 10:1 for enterprise software

•  In financial applications up to 50:1
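The dictionary, attribute vector, and inverted index from this slide can be reproduced in a few lines. This is a minimal sketch of dictionary encoding in general, not the database's implementation: the function names are hypothetical, and the attribute vector here holds the integer ValueIds from the data dictionary.

```python
from collections import defaultdict

# Column values by RecId 1..8, as in the slide's table.
column = ["Colon", "Larynx", "Lip", "Colon", "Rectum", "Rectum", "Mama", "Colon"]

# Data dictionary with the ValueIds used on the slide.
dictionary = {1: "Larynx", 2: "Lip", 3: "Rectum", 4: "Colon", 5: "Mama"}
value_to_id = {value: value_id for value_id, value in dictionary.items()}

def encode(values):
    """Replace each value by its integer ValueId (the attribute vector)."""
    return [value_to_id[value] for value in values]

def build_inverted_index(attribute_vector):
    """Map each ValueId to the 1-based RecIds where it occurs."""
    index = defaultdict(list)
    for rec_id, value_id in enumerate(attribute_vector, start=1):
        index[value_id].append(rec_id)
    return dict(index)

def select_equals(value, inverted_index):
    """Answer `column = value` via the index, without decoding any row."""
    return inverted_index.get(value_to_id.get(value), [])

attribute_vector = encode(column)
inverted_index = build_inverted_index(attribute_vector)
decoded = [dictionary[value_id] for value_id in attribute_vector]
```

Each repeated string is stored once in the dictionary and referenced by a small integer, which is where the compression comes from. The lookup for "Colon" touches only integers, which corresponds to working directly on compressed data.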

Page 22: Turning Big Data into Precision Medicine

What to Take Home? Test it Yourself: AnalyzeGenomes.com

■  For patients

□  Identify relevant clinical trials and medical experts

□  Become an informed patient

■  For clinicians

□  Identify pharmacokinetic correlations

□  Scan for similar patient cases, e.g. to evaluate therapy efficiency

■  For researchers

□  Enable real-time analysis of medical data, e.g. assess pathways to identify impact of detected variants

□  Combined mining in structured and unstructured data, e.g. publications, diagnosis, and EMR data


Page 23: Turning Big Data into Precision Medicine

Keep in contact with us!

Hasso Plattner Institute, Enterprise Platform & Integration Concepts (EPIC)

Dr. Matthieu-P. Schapranow, Program Manager E-Health

August-Bebel-Str. 88, 14482 Potsdam, Germany

[email protected]

http://we.analyzegenomes.com/