TRANSCRIPT
Big Data in Translational Science
Albert Wang
Associate Director, Translational R&D IT
Bristol-Myers Squibb
2015 AAPS Annual Meeting
Agenda
• Perspectives on Big Data
• Big Data in Translational R&D
• Selected Initiatives at Bristol-Myers Squibb
Why are we talking about Big Data?
#1: Because our capability for generating data is growing - everywhere.
The Five ‘V’s of Big Data
IBM Institute for Business Value Analytics: The real-world use of big data
e.g., sentiment, social media, weather conditions, etc.
Value: Deriving value from data
Until we can turn data into value, it is useless.
Why are we talking about Big Data?
#2: Because new technologies have proven applicable to big data problems across a variety of domains.
MapReduce
Virtualized/Distributed Storage
Artificial Intelligence (machine learning, NLP, etc.)
NoSQL Databases
Why are we talking about Big Data?
#3: Because there are still significant problems to solve.
Data Agility (Real-Time)
Data privacy/security
Usability/Accessibility
Development Discovery
What does this mean for Translational R&D?
Translational Research
Translational Medicine
Bench
Bedside
Translational R&D leverages a variety of data types
Pharmacology data
Clinical Trial Data
Epidemiological data
Genomic Data
Molecular Data
Medical/Hospital Data
Tissue Data
Real World Data
Patient Genotype
TR&D Analytics
Increased Need for Data Sharing Across Partners
• Efficient information sharing across multiple partners
• Easy access to the data
• Transparency with collaborators
Institute
Academia
Hospital
Foundation
CRO Partner
6 Areas of Pharma and Healthcare Related Big Data
Life Science
Data Owner: Academic, Government
Example datasets:
• NGS data
• Imaging data
• Literature and conference text
• Signalling Pathway Data / Models
• Epidemiology
Business Intelligence
Data Owner: Pharma / Biotech / Academia
Example datasets:
• News, Blogs, Social Media, Patents, Literature, Web pages, Financial Reports
Doing What We Can Now - Building for the Future
[Figure: benefits plotted against time. Labels: Data Entry and Storage; Data Access/Retrieval; Data Integration; Improved Query and Navigation; Enhanced Analysis and Visualization]
How do we grow an infrastructure to support TR&D Big Data?
Data mining, Analysis, Modeling, and Interpretation
Decision Support
Future State Infrastructure
Data Sources:
• Sample properties, availability, location, …
• Clinical: subject demographics, treatment, response, …
• Biomarker data
• Real World data, EMR, Claims
• Discovery data
Data feeds: Structured and unstructured data
Integration: Data curation, standardization, integration
Analysis: analysis, mining, modeling, interpretation
Outputs: Decision Support (query, analysis, knowledge capture)
Users:
• Informatician / Data Scientist
• Clinician
• External Collaborator
• Biomarker Scientist
• Project team
• Discovery Scientist
Data → Information → Knowledge → Insights → Decisions
Case Study: Sensor data in atrial fibrillation
ENABLED STUDY EXECUTION
Medtronic SEEQ patch vs. Holter monitor
Medtronic SEEQ patch:
• Wear each patch up to one week
• Patches replaced by subjects at home
• Mobile base station transmits ECG data to cloud in real time
• Alerts if safety event detected or device not reapplied correctly
Holter monitor:
• Wear up to two days
• Replaced by ECG technician at clinical site
• Base station stores data; needs to be docked
• No real-time alerts
CHALLENGED DATA MANAGEMENT
• Patients carry wireless transmitters
• ECG data streamed to the cloud
• Reports generated
• Clinical database could not accept reports: conversion process required
INFORMED STUDY DESIGN
BMS used real-world pacemaker data analysis to revise the required number of subjects and the observation period for this study.
• Subjects needed before querying sensor dataset: 160
• Subjects needed after querying sensor dataset: 80
• Observation period reduced from 4 weeks to 2 weeks of continuous monitoring while on therapy.
ANALYSIS = W.I.P.
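As a rough illustration of what the revised design means for monitoring effort, the quoted figures can be combined into total subject-weeks of continuous monitoring. This is only a sketch: the subject-weeks measure and the assumption that every subject completes the full observation period are illustrative and do not come from the slides.

```python
# Figures quoted in the case study.
SUBJECTS_BEFORE, SUBJECTS_AFTER = 160, 80
WEEKS_BEFORE, WEEKS_AFTER = 4, 2


def subject_weeks(subjects: int, weeks: int) -> int:
    """Total monitoring effort, assuming (hypothetically) that every
    subject is monitored for the whole observation period."""
    return subjects * weeks


before = subject_weeks(SUBJECTS_BEFORE, WEEKS_BEFORE)
after = subject_weeks(SUBJECTS_AFTER, WEEKS_AFTER)
reduction = 1 - after / before

print(f"Subject-weeks before: {before}")
print(f"Subject-weeks after:  {after}")
print(f"Relative reduction:   {reduction:.0%}")
```

Under these assumptions, halving both the subject count and the observation period cuts the monitoring effort to a quarter of the original plan.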
Biomarker-driven Translational Research
TR Biology
GBS
GCR
DM Physician Clinical Biomarker
Biomarker Tech Matrix Bix
DWG Lead
DM Physician
Translational Research Scientist
Clinical Biomarker
Biomarker Tech Matrix
Discovery Teams
Clinical Teams
Purchased Tissue Collections
Specimen Repositories
Clinical Trials
IHC
Flow Cytometry
Gene Expression
Genetics
Cytokine Profiling
Viral Sequencing
Metabolomics
Proteomics
Other assays…
Patient Information
Sample Information
Patient Information
Biomarker Repository: Conceptual Architecture
Biomarker Repository
Biomarker File Repository
IHC, Flow, Genomics, Protein, Clinical, Samples
TR&D Data Hub (Hadoop)
Specimen Biorepository
Clinical Data (CDMS, SAS environment, etc.)
Visualization
Analytics
Ad hoc biomarker data
Genomic Data
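The linkage this architecture implies, assay results tied to specimens and specimens tied to clinical records, can be sketched in a few lines. This is a minimal illustration in plain Python, not the actual repository design: all record layouts, field names, identifiers, and values below are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical specimen records keyed by sample ID.
SAMPLES: Dict[str, dict] = {
    "S1": {"patient_id": "P1", "specimen_type": "tumor biopsy"},
    "S2": {"patient_id": "P2", "specimen_type": "whole blood"},
}

# Hypothetical clinical records keyed by patient ID.
CLINICAL: Dict[str, dict] = {
    "P1": {"treatment": "arm A", "response": "responder"},
    "P2": {"treatment": "arm B", "response": "non-responder"},
}

# Hypothetical assay results; "S9" has no matching specimen record.
ASSAYS: List[dict] = [
    {"sample_id": "S1", "assay": "IHC", "analyte": "marker_x", "value": 0.82},
    {"sample_id": "S2", "assay": "Flow", "analyte": "marker_y", "value": 0.35},
    {"sample_id": "S9", "assay": "Genomics", "analyte": "gene_z", "value": 1.0},
]


def integrate(
    assays: List[dict], samples: Dict[str, dict], clinical: Dict[str, dict]
) -> Tuple[List[dict], List[dict]]:
    """Join each assay result to its specimen and patient record.

    Results that cannot be linked are returned separately so they can be
    curated rather than silently dropped.
    """
    linked, unlinked = [], []
    for record in assays:
        sample = samples.get(record["sample_id"])
        patient = clinical.get(sample["patient_id"]) if sample else None
        if sample is None or patient is None:
            unlinked.append(record)
            continue
        linked.append({**record, **sample, **patient})
    return linked, unlinked


linked, unlinked = integrate(ASSAYS, SAMPLES, CLINICAL)
for row in linked:
    print(row["assay"], row["analyte"], row["value"], row["response"])
print("Needs curation:", [r["sample_id"] for r in unlinked])
```

Keeping unlinked results in a separate list reflects the curation step that any such integration needs; at scale the same join would run in the data hub rather than in memory.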