TRANSFORMING THE PRACTICE OF MEDICINE

June 7-9th 2015, UBC, Vancouver, BC

Be part of the Personalized Medicine Revolution

personalizedmedsummit.com #PMSummit2015


Connecting Precision Medicine to Precision Wellness towards a systems understanding of health and disease

Joel Dudley, PhD
Director of Biomedical Informatics & Assistant Professor of Genetics and Genomic Sciences
Icahn School of Medicine at Mount Sinai
@IcahnInstitute

Still bitter after all these years?

Alain Vigneault

Mount Sinai Health System

>6,000 physicians

7 member hospital campuses

>3,500 hospital beds

>3,100,000 patient visits

There are rarely smoking guns in human health

No such thing as a “simple” disease

Cutting GR. Nature Reviews Genetics. (2014) doi:10.1038/nrg3849

Figure labels: HEART, VASCULATURE, KIDNEY, IMMUNE SYSTEM, GI TRACT, BRAIN; transcriptional network, protein network, metabolite network, non-coding RNA network; ENVIRONMENT

That promise to enable the construction of molecular networks that define the biological processes that comprise living systems

We must embrace complexity to fully understand patient physiology and disease

“A complex adaptive system has three characteristics. The first is that the system consists of a number of heterogeneous agents, and each of those agents makes decisions about how to behave. The most important dimension here is that those decisions will evolve over time. The second characteristic is that the agents interact with one another. That interaction leads to the third—something that scientists call emergence: In a very real way, the whole becomes greater than the sum of the parts. The key issue is that you can’t really understand the whole system by simply looking at its individual parts”.

- Michael J. Mauboussin (investment banker)

Although our ability to embrace complexity will bump up against our want to tell stories

Zeus, the sky god; when he is angry he throws lightning bolts out of the sky

Ptolemaic astronomy: the earth is the center of the universe

The earth is flat

Biological processes are driven by simple linearly ordered pathways (e.g. TGF-beta signaling)

Who wants to hear actors talk?
– H.M. Warner, Warner Brothers, 1927

There is no reason anyone would want a computer in their home
– Ken Olsen, founder and president of Digital Equipment Corp., 1977

Anyone who thinks the ANC is going to run South Africa is living in cloud-cuckoo-land
– Margaret Thatcher, British Prime Minister, 1987

I think there is a world market for maybe five computers
– Thomas Watson, Chairman of IBM, 1943

There will never be a bigger plane built
– A Boeing engineer after first flight of the 247, a 10-seater

Biological processes are organized into simple pathways
– Life and Biomedical Sciences

Icahn Institute for Genomics and Multiscale Biology

Being masters of really big data is now critical for biomedical research (TB→PB→EB→ZB)

Organisms, tissues, single cells

Single cell, real-time, continuous?

We can measure more than we know

Exploring the transcriptional landscape of human disease

~300 diseases and conditions

20k+ genes

Blue: gene goes down in disease

Yellow: gene goes up in disease

Suthram S, Dudley J et al. Network-based elucidation of human disease similarities reveals common functional modules enriched for pluripotent drug targets. PLoS Computational Biology 6(2): e1000662 (February 2010). www.ploscompbiol.org

Figure 2. Significant disease-disease similarities. (A) Hierarchical clustering of the disease correlations. The distance between two diseases was defined to be (1-correlation coefficient) of the two diseases. The tree was constructed using the average method of hierarchical clustering. The red line corresponds to a p-value of 0.01 and FDR of 10.37%, and disease correlations below this line are considered significant. The different colors represent the various categories of significant disease correlations. (B) The network of all the 138 significant disease correlations. The colors correspond to significant disease correlation categories in (A). The nodes colored in grey are not marked in (A). doi:10.1371/journal.pcbi.1000662.g002
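The caption above defines the distance between two diseases as 1 minus their correlation coefficient and builds the tree with average-linkage hierarchical clustering. A minimal sketch of that idea in plain Python follows. The disease names and expression values are hypothetical, Pearson correlation is assumed (the caption does not name the coefficient), and the significance cut-off (the red line) is not modelled.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering with distance = 1 - correlation.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of disease names.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def avg(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical differential-expression profiles (one value per gene).
profiles = {
    "disease_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "disease_B": [1.1, 2.1, 2.9, 4.2, 4.8],
    "disease_C": [5.0, 4.0, 3.0, 2.0, 1.0],
    "disease_D": [4.0, 5.0, 2.0, 3.0, 1.0],
}
merges = average_linkage(profiles)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

With these made-up profiles, the two most similar diseases merge first and the most dissimilar groups merge last, which is the ordering a dendrogram draws from bottom to top.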

Redefining disease with data

Data Driven Approach to Connect Drugs and Disease Using Molecular Profiles

Sirota, M., Dudley, J. T., et al. (2011). Discovery and Preclinical Validation of Drug Indications Using Compendia of Public Gene Expression Data. Science Translational Medicine, 3(96).
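One way to read "connecting drugs and disease using molecular profiles" is to look for drugs whose gene expression signature runs opposite to the disease signature. The sketch below illustrates that reading only: it uses a plain Pearson correlation as a simplified stand-in for whatever scoring the cited paper uses, and all drug names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_drugs(disease_signature, drug_signatures):
    """Rank drugs from most opposing (most negative correlation) to most similar."""
    scores = {
        drug: pearson(disease_signature, signature)
        for drug, signature in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log fold changes over the same five genes.
disease = [2.0, 1.5, -1.0, -2.0, 0.5]
drugs = {
    "drug_X": [-1.8, -1.2, 0.9, 2.1, -0.4],  # roughly reverses the disease profile
    "drug_Y": [1.9, 1.4, -0.8, -1.7, 0.6],   # roughly mimics the disease profile
    "drug_Z": [0.1, -0.2, 0.3, -0.1, 0.0],   # little relationship either way
}
ranking = rank_drugs(disease, drugs)
for drug, score in ranking:
    print(drug, round(score, 3))
```

Under this simplified scoring, drugs at the top of the ranking are candidates for follow-up; any real use would need a significance estimate and experimental validation.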

Topiramate Reduces IBD Severity in a TNBS Rodent Model of IBD

• TNBS chemically induced rat model of IBD

• Animals treated with 80 mg/kg oral topiramate after sensitization

• Prednisolone positive control (approved for IBD in humans)

Dudley, J. T., Sirota, M., et al. (2011). Computational Repositioning of the Anticonvulsant Topiramate for Inflammatory Bowel Disease. Science Translational Medicine, 3(96).

Approved compound for non-cancer indication prevents formation of SCLC tumors in a genetic model of SCLC

p53/Rb/p130 triple knockout model of SCLC

Mice dosed after tumor formation

Figure labels: Control, Imipramine

Supplementary Fig. 2 | Inhibitory effects of Imipramine, Promethazine, and Bepridil on SCLC allografts and xenografts. a, Strategy used for the treatment of mice growing SCLC tumors under their skin. NSG immunocompromised mice were subcutaneously implanted with 2 different mouse SCLC cell lines (Kp1 and Kp3) (b) and one human SCLC cell line (H187) (c) and tumor volume was measured at the times indicated of daily IP injections with vehicle control (Saline and corn oil; n=10 in (b) and n=4 in (c)), Imipramine (25mg/kg; n=7 in (b) and n=4 in (c)), Promethazine (25mg/kg; n=7 in (b) and n=4 in (c)), and Bepridil (10mg/kg; n=7 in (b) and n=3 in (c)) (3 independent experiments in (b) and 1 experiment in (c)). Values are shown as mean ± s.e.m. The unpaired t-test was used to calculate the p-values of treated versus control tumors at different days of treatment. *P<0.05, **P<0.01, and ***P<0.001. Values that are not significant are not indicated. d, Representative images of SCLC xenografts (H187) collected 14 days after daily treatment with Saline, Imipramine, and Promethazine. e, MTT survival assay of Cisplatin- and saline-treated SCLC cells cultured in 2% serum (n=3 independent experiments) for 48 hours with increasing doses of Imipramine. ns, not significant. f, Representative images of Cisplatin- and saline-treated SCLC allografts collected 17 days after daily treatment with Saline, and Imipramine.

Figure panels b and c: fold change of tumor volume versus days of treatment for Saline, Imipramine, Promethazine, and Bepridil.
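The Supplementary Fig. 2 caption reports values as mean ± s.e.m. and compares treated with control tumors using an unpaired t-test. A minimal sketch of both calculations follows. The fold-change numbers are made up for illustration, and the pooled-variance (Student) form of the test is an assumption, since the caption does not say whether equal variances were assumed. The p-value lookup is left out because the standard library has no t-distribution.

```python
import math
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def unpaired_t(a, b):
    """Student's unpaired t statistic with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up fold changes in tumor volume at one time point.
control = [5.1, 6.0, 5.5, 6.3]
treated = [2.0, 2.4, 1.8, 2.6]

t_stat, dof = unpaired_t(control, treated)
print(f"control: {mean(control):.2f} ± {sem(control):.2f}")
print(f"treated: {mean(treated):.2f} ± {sem(treated):.2f}")
print(f"t = {t_stat:.2f}, df = {dof}")
```

The t statistic and degrees of freedom would then be compared against a t-distribution to obtain the p-values that the caption marks with asterisks.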

Fig. 2. Panel a: immunemod score (-1 to +1) for drugs (1,309) by cell states (304), with stim-specific and unstim-specific drugs and immature and mature subsets marked; annotations: subset, tissue, genetic, time, perturbation. Panels b and c: counts by category (H V B P M L G R A D J S N C), with cell types including NKT, mono, DC, preB, T4, Tgd, mac, B, stromal, and preT. Panel d: number of targets versus number of cell subset changes.

Kidd BA, Wroblewska A, Agudo J, Merad M, Brown BD, Dudley JT. Systematic integrative analysis of immune pharmacology. In review.

How do approved drugs modulate networks in immune cells?

Fig. 4. Panels a and b: scores (0.0 to 0.4) for cell-subset tissue transitions (for example GN BM -> Bl, DC Th -> MLN, T8_Nve LN -> Sp) under Clioquinol and Amantadine. Panel c: CD11b versus Gr-1 in blood, bone marrow, and peritoneal cavity, Clioquinol versus Control. Panel d: % CD45+ live for PEG400, Clioquinol, PBS, and Amantadine; reported significance levels are p < 2 × 10^-3, p < 2 × 10^-5, p < 1 × 10^-7, and ns.

Antifungal activates neutrophil migration

Kidd BA, Wroblewska A, Agudo J, Merad M, Brown BD, Dudley JT. Systematic integrative analysis of immune pharmacology. In review.

Figure labels: Population; Sample acquisition; Drug A, Drug B; Predictive Network Model

What we are about: Integrating big data across many domains to build predictive models that improve how we diagnose and treat disease

Slide courtesy of Eric Schadt

Problem: How do you make sense of 163 loci to understand a complex disease like IBD?

Organizing 163 genetic loci for IBD

A key driver network for many diseases: obesity, diabetes, heart disease, COPD, asthma, fibrosis, stroke and so on

(Created with iCAVE from Gumus Lab, 2013)

Alzheimer’s neuro-immune and microglia network

Zhang B et al. Tracing Multi-System Failure in Alzheimer Disease to Causal Genes. Cell 2013 (Figure 5)

Understanding the multiscale complexity of patient populations

Genomic, Environment, Clinical

Capturing rich multiscale data on patients through the Mount Sinai Biobank

Drugs

Diagnoses

DNA

RNA

Labs

Procedures

Microbiome

Immune

Image credit: Li Li (ISMMS)

Type 2, Type 3, and Type 4 diabetes?

Figure labels: Female, Male

“The future is already here – it's just not very evenly distributed”.

- William Gibson

200 data feeds

5 Gb data/lap

300 sensors

3,000 variables/0.1 sec

>200 Gb data/day

We are on the crest of a tsunami in consumer sensor technologies

Printable tattoo biosensor

Consumer health tools as a driver of data-driven healthcare

Where will most of the health data be in 5-10 years?

Electronic Consent and Patient Engagement

A Learning Digital Health Platform for Chronic Lung Disease

Inhaler

Activity/Vitals

Spirometry

Real-time learning on patient population data

Environmental/Geospatial data

Actionable insights delivered to EMR at point-of-care

Th17, Th1

0:00 min, 0:05 min, 0:10 min

DNA, Cell-specific RNA, Cytokines, Clinical labs, Mobile devices, Microbiome, Physiometrics

Personalized multiscale networks to model dynamics of complex disease

DNA sequence, transcriptome, proteome, metabolome, epigenome, microbiome, and exposome. Going forward, I will use the term “panoromic” to denote the multiple biologic omic technologies. This term closely resembles and is adopted from panoramic, which refers to a wide-angle view or comprehensive representation across multiple applications and repositories. Or more simply, according to the Merriam-Webster definition of panoramic, it “includes a lot of information and covers many topics.” Thus the term panoromic may be well suited for portraying the concept of big biological data.

The first individual who had a human GIS-like construct was Michael Snyder. Not only was his whole genome sequenced, he also collected serial gene expression, autoantibody, proteomic, and metabolomic (Chen et al., 2012) samples. A portion of the data deluge that was generated is represented in the Circos plot of Figure 2 or an adoption of the London Tube map (Shendure and Lieberman Aiden, 2012). The integrated personal omics profiling (iPOP) or “Snyderome,” as it became known, proved to be useful for connecting viral infections to markedly elevated glucose levels. With this integrated analysis in hand, Michael Snyder changed his lifestyle, eventually restoring normal glucose homeostasis. Since that report in 2012, Snyder and his team have proceeded to obtain further omic data, including whole-genome DNA methylation data at multiple time points, serial microbiome (gut, urine, nasal, skin, and tongue) sampling, and the use of biosensors for activity tracking and heart rhythm. Snyder also discovered that several extended family members had smoldering, unrecognized glucose intolerance, thereby changing medical care for multiple individuals.

Of note, to obtain the data and process this first panoromic study, it required an armada of 40 experienced coauthors and countless hours of bioinformatics and analytical work. To give context to the digital data burden, it took 1 terabyte (TB) for DNA sequence, 2 TB for the epigenomic data, 0.7 TB for the transcriptome, and 3 TB for the microbiome. Accordingly, this first human GIS can be considered a remarkable academic feat and yielded key diagnostic medical information for the individual. But, it can hardly be considered practical or scalable at this juncture. With the cost of storing information continuing to drop substantially, the bottleneck for scalability will likely be automating the analysis. On the other hand, each omic technology can readily be undertaken now and has the potential of providing meaningful medical information for an individual.

The Omic Tools

Whole-Genome and Exome Sequencing

Perhaps the greatest technologic achievement in the biomedical domain has been the extraordinary progress in our ability to sequence a human genome over the past decade. Far exceeding the pace of Moore’s Law for the relentless improvement in transistor capacity, there has been a >4 log order (or 0.00007th) reduction in cost of sequencing (Butte, 2013), with a cost in 2004 of ~$28.8 million compared with the cost as low as $1,000 in 2014 (Hayden, 2014). However, despite this incomparable progress, there are still major limitations to how rapid, accurate, and complete sequencing can be accomplished. High-throughput sequencing involves chopping the DNA into small fragments, which are then amplified by PCR. Currently, it takes 3 to 4 days in our lab to do the sample preparation and sequencing at 30× to 40× coverage of a human genome. The read length of the fragments is now ~250 base pairs for the most cost-effective sequencing methods, but this is still suboptimal in determining maternal versus paternal alleles, or what is known as phasing. Because so much of understanding diseases involves compound heterozygote mutations, cis-acting sequence variant combinations, and allele-specific effects, phasing the diploid genome, or what we have called “diplomics” (Tewhey et al., 2011), is quite important. Recently, Moleculo introduced a method for synthetically stitching together DNA sequencing reads yielding fragments as long as 10,000 base pairs. These synthetic long reads are well suited for phasing. Unfortunately, the term “whole-genome sequencing” is far from complete because ~900 genes, or 3%–4% of the genome, are not accessible (Marx, 2013). These regions are typically in centromeres or telomeres. Other technical issues that detract from accuracy include long sequences of repeated bases (homopolymers) and regions rich in guanine and cytosine. Furthermore, the accuracy for medical grade sequencing still needs to be improved. A missed call rate of 1 in 10,000, which may not seem high, translates into a substantial number of errors when considering the 6 billion bases in a diploid genome. These errors obfuscate rare but potentially functional variants. Beyond this issue, the accurate determination of insertions, deletions, and structural variants is impaired, in part due to the relatively short reads that are typically obtained. The Clinical Sequencing Exploratory Research (CSER) program at the National Institutes of Health is aimed at improving the accuracy of sequencing for medical applications (National Human Genome Research Institute, 2013).
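The excerpt quotes several figures that are easy to sanity-check with a few lines of arithmetic: the storage used per data type, the 2004 and 2014 sequencing costs, and the missed-call rate applied to a diploid genome. The sketch below uses only the numbers quoted in the excerpt; the variable names are illustrative.

```python
import math

# Storage per data type, in terabytes, as quoted in the excerpt.
storage_tb = {
    "DNA sequence": 1.0,
    "epigenome": 2.0,
    "transcriptome": 0.7,
    "microbiome": 3.0,
}
total_tb = sum(storage_tb.values())

# Sequencing cost per genome, in US dollars, as quoted in the excerpt.
cost_2004 = 28.8e6
cost_2014 = 1_000
fold_reduction = cost_2004 / cost_2014
log_orders = math.log10(fold_reduction)

# Expected number of missed calls across a diploid genome.
diploid_bases = 6e9
missed_call_rate = 1 / 10_000
expected_errors = diploid_bases * missed_call_rate

print(f"total storage: {total_tb:.1f} TB")
print(f"cost reduction: {fold_reduction:,.0f}-fold ({log_orders:.2f} log orders)")
print(f"expected missed calls: {expected_errors:,.0f}")
```

The last line makes the excerpt's point concrete: a rate of 1 in 10,000 over 6 billion bases is on the order of hundreds of thousands of errors per genome.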

Figure 1. Geographic Information System of a Human Being. The ability to digitize the medical essence of a human being is predicated on the integration of multiscale data, akin to a Google map, which consists of superimposed layers of data such as street, traffic, and satellite views. For a human being, these layers include demographics and the social graph, biosensors to capture the individual’s physiome, imaging to depict the anatomy (often along with physiologic data), and the biology from the various omics (genome-DNA sequence, transcriptome, proteome, metabolome, microbiome, and epigenome). In addition to all of these layers, there is one’s important environmental exposure data, known as the “exposome.”

242 Cell 157, March 27, 2014 ©2014 Elsevier Inc.

In the future you will have coordinates instead of a diagnosis

Topol EJ. Individualized Medicine from Prewomb to Tomb Cell 157, March 27, 2014

We can embrace digital health and sensors to map the human phenome and envirome

462 VOLUME 33 NUMBER 5 MAY 2015 NATURE BIOTECHNOLOGY

COMMENTARY

The digital phenotype

Sachin H Jain, Brian W Powers, Jared B Hawkins & John S Brownstein

In the coming years, patient phenotypes captured to enhance health and wellness will extend to human interactions with digital technology.

In 1982, the evolutionary biologist Richard Dawkins introduced the concept of the “extended phenotype” (ref. 1), the idea that phenotypes should not be limited just to biological processes, such as protein biosynthesis or tissue growth, but extended to include all effects that a gene has on its environment inside or outside of the body of the individual organism. Dawkins stressed that many delineations of phenotypes are arbitrary. Animals and humans can modify their environments, and these modifications and associated behaviors are expressions of one’s genome and, thus, part of their extended phenotype. In the animal kingdom, he cites dam building by beavers as an example of the beaver’s extended phenotype (ref. 1).

As personal technology becomes increasingly embedded in human lives, we think there is an important extension of Dawkins’s theory: the notion of a ‘digital phenotype’. Can aspects of our interface with technology be somehow diagnostic and/or prognostic for certain conditions? Can one’s clinical data be linked and analyzed together with online activity and behavior data to create a unified, nuanced view of human disease? Here, we describe the concept of the digital phenotype. Although several disparate studies have touched on this notion, the framework for how digital technologies will be integrated into the patient journey and play a role in precision medicine has yet to be described. We attempt to define digital phenotype and further describe the opportunities and challenges in incorporating these data into healthcare.

Defining the digital phenotype

The growth and evolution of digital products and their application to health supports this interpretation of the extended phenotype. Through social media, forums and online communities, wearable technologies and mobile devices, there is a growing body of health-related data that can shape our assessment of human illness. Such data have substantial value above and beyond the physical exam, laboratory values and clinical imaging data, our traditional approaches to characterizing a disease phenotype. When gathered and analyzed appropriately, these data have the potential to fundamentally alter our notion of the manifestations of disease by providing a more comprehensive and nuanced view of the experience of illness. Through the lens of the digital phenotype, an individual’s interaction with digital technologies affects the full spectrum of human disease from diagnosis, to treatment, to chronic disease management. Early examples of digital tracking include the use of cell phone activity to measure one’s activity levels and the association with depression by the Boston-based startup company Ginger.io. There are, of course, limitations to what can be measured and by whom when considered in the context of personal privacy.

Exploiting the digital phenotype

As a corollary to traditional forms of disease expression, digital phenotypes can expand our ability to identify and diagnose health conditions. Some of the earliest and most

Sachin H. Jain is at Caremore Health System, Cerritos, California, USA; Brian W. Powers, Jared B. Hawkins and John S. Brownstein are at Harvard Medical School, Boston, Massachusetts, USA; Sachin H. Jain is at Boston-Virginia Medical Center, Boston, Massachusetts, USA, and Merck and Co., Inc., Boston, Massachusetts, USA; and John S. Brownstein and Jared B. Hawkins are at Children’s Hospital Boston, Boston, Massachusetts, USA. e-mail: [email protected]

Figure 1 Timeline of insomnia-related tweets from representative individuals. Density distributions (probability density functions) are shown for seven individual users over a two-year period. Density on the y axis highlights periods of relative activity for each user. A representative tweet from each user is shown as an example. Axes: Date (Jan. 2013 to July 2014) versus Density (0.000 to 0.006); series: User 1 to User 7.
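The Figure 1 caption above describes per-user density distributions of tweet activity over time. The excerpt does not say how those densities were estimated; a Gaussian kernel density estimate is one common choice and is sketched here as an assumption. The tweet dates and the bandwidth are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density at x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical tweet dates for one user, in days since 1 Jan. 2013.
tweet_days = [10, 12, 15, 200, 205, 207, 210, 400]
density = gaussian_kde(tweet_days, bandwidth=20.0)

# Riemann sum on a daily grid; a probability density should integrate to ~1.
area = sum(density(day) for day in range(-100, 831))

print(f"density near the burst (day 205): {density(205):.5f}")
print(f"density in a quiet period (day 300): {density(300):.5f}")
print(f"area under the curve: {area:.4f}")
```

Peaks in the estimated density correspond to bursts of activity, which is how a plot of this kind highlights periods of relative activity for each user.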

It’s all about context

The emergence of science- and data-driven wellness

[email protected]

Digital Health

Molecular Profiling

Data Science

Clinical Medicine

The Harris Center for Precision Wellness

www.precisionwellness.org

Who are the resilient among us?

http://resilienceproject.me

Web-scale deep computing as the future of medicine and research

http://www.bbc.com/news/technology-18595351

Web-scale deep computing as the future of medicine and research

Xiong et al. Science 9 January 2015

Thank you for your attention

Email: [email protected] Twitter: @jdudley Web: dudleylab.org precisionwellness.org

Icahn School of Medicine at Mount Sinai

It’s all about context

• eQTL

• What?

• Where?

• When?
