systems genetics of cancer - big data and all that

Florian Markowetz CRUK Cambridge Institute www.markowetzlab.org Systems genetics of cancer Big data and all that

Upload: florian-markowetz

Post on 16-Jul-2015


TRANSCRIPT

Florian Markowetz

CRUK Cambridge Institute

www.markowetzlab.org

Systems genetics of cancer

Big data and all that

http://econ.st/XYyAyi

“With enough data and the ability to crunch it, virtually any challenge facing humanity today can be solved.”

Eric Schmidt et al, How Google Works, 2014

Prof Atul Butte (Genomic Medicine, Stanford) at TEDMED 2012

“Who needs the scientific method? Vast stores of available data (…) are simply waiting for the right questions.”

Chris Anderson, WIRED.com, 2008

http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory

The End of Theory: The Data Deluge Makes the Scientific Method Obsolete

‘Petabytes allow us to say: “Correlation is enough.” (…) We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.’

Karl Popper, 1902-1994

RIP, inductivism!

Conclusion: The hype about Big Data is built on zombie ideas

The data never speak for themselves

Theories, hypotheses and models make data speak

Bigger data don’t change this

Encyclopedia of DNA Elements

www.nature.com

[The ENCODE] data enabled us to assign biochemical functions for 80% of the genome.

Function = showed up in the data

Graur et al, Genome Biol Evol (2013)

“This claim flies in the face of current estimates according to which the fraction of the genome that is evolutionarily conserved (…) is under 10%.”

Function = evolutionarily conserved

What is biological function?

A conceptual question independent of the size of your data set

Always the same science.

Always the same questions.

Big Data is a technical challenge, not a conceptual one

Systems Genetics of Cancer

Genetic variation

• In people

• In tumours

• In clones

Phenotypic variation

• Tumour subtypes

• Aggressiveness

• Survival

Cancer genome

Evolution

Cancer tissue

Context

Cancer genome

Function

Ines, Wei, Edith, Geoff, Ke, Anne, Joe, Leon, Andy, Amanda

Cancer genome

Heterogeneity and evolution

Clonal evolution

Intra-patient heterogeneity in HGSOC

Schwarz et al, PLoS Comp Bio 2014

Schwarz et al, PLoS Medicine, 2015

Structural differences indicate resistance

[Figure: sample groups labelled "sensitive" and "resistant"]

Heterogeneity predicts survival

Mixture model

1. How many clones are there in the sample?

2. How are they related in a tree?
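The first question can be illustrated with a deliberately simple stand-in: a finite one-dimensional Gaussian mixture fitted by EM, with the number of components chosen by BIC. This is a minimal sketch and not the BitPhylogeny model; the function names, the toy "allele frequency" data and the floor `MIN_SD` are all assumptions made for illustration, and the sketch ignores the tree question entirely.

```python
import math
from statistics import NormalDist

MIN_SD = 1e-3  # assumed floor on component spread, avoids degenerate components


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(xs, k, n_iter=200):
    """Fit a k-component 1D Gaussian mixture by EM.

    Returns (weights, means, sds, log_likelihood).
    """
    n = len(xs)
    ordered = sorted(xs)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    spread = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n)
    sds = [max(spread / k, MIN_SD)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sds[j] = max(math.sqrt(var), MIN_SD)
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)) or 1e-300)
        for x in xs
    )
    return weights, means, sds, loglik


def choose_n_clones(xs, max_k=4):
    """Pick the number of components with the lowest BIC."""
    n = len(xs)
    best = None
    for k in range(1, max_k + 1):
        weights, means, sds, loglik = fit_mixture(xs, k)
        n_params = 3 * k - 1  # k means, k spreads, k - 1 free weights
        bic = n_params * math.log(n) - 2.0 * loglik
        if best is None or bic < best[0]:
            best = (bic, k, weights, means, sds)
    return best[1:]


def toy_frequencies():
    """Hypothetical toy data: two 'clones' centred at 0.2 and 0.5."""
    offsets = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
    return [0.2 + 0.02 * o for o in offsets] + [0.5 + 0.02 * o for o in offsets]


if __name__ == "__main__":
    k, weights, means, sds = choose_n_clones(toy_frequencies(), max_k=3)
    print("estimated number of clones:", k)
    print("clone sizes (mixture weights):", weights)
```

In this sketch the mixture weights play the role of "size of clone" and the component spreads the role of "variability inside clone"; a real analysis would need a model matched to the data type and to the tree structure.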

Data

Parameters: number of clones, size of each clone, variability inside each clone

Graphical Model behind BitPhylogeny

Phylogeny prior

Prior on local parameters

Likelihood

The BitPhylogeny model
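One way to read the three components listed above is as the factors of a posterior distribution. The notation below is an assumption made for illustration and is not taken from the slides: $T$ stands for the clone tree, $\theta$ for the local (per-clone) parameters and $X$ for the observed data.

```latex
p(T, \theta \mid X) \;\propto\;
  \underbrace{p(X \mid \theta, T)}_{\text{likelihood}}
  \;\underbrace{p(\theta \mid T)}_{\text{prior on local parameters}}
  \;\underbrace{p(T)}_{\text{phylogeny prior}}
```

Under this reading, inference explores trees and parameters jointly, so the number of clones and their relationships are estimated together rather than in two separate steps.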

FUTURE

• ICGC pan-cancer analysis: 2500 genomes => 2500 trees

• Characterize the 2500 trees

• Correlate trees with clinical data

• Infer onco-genetic progression models across the 2500 trees
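As a rough idea of what "characterizing" a tree could mean in practice, the sketch below computes a few simple summaries from a clone tree stored as a child-to-parent mapping. The representation, the function name `tree_summary` and the choice of summaries are assumptions for illustration; the slides do not say which tree features will be used.

```python
def tree_summary(parent):
    """Summarise a clone tree.

    parent: dict mapping each clone to its parent clone; the root maps to None.
    Every parent must itself be a key of the dict.
    """
    children = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    def depth(node):
        # Number of edges between this clone and the root.
        d = 0
        while parent[node] is not None:
            node = parent[node]
            d += 1
        return d

    leaves = [node for node, kids in children.items() if not kids]
    return {
        "n_clones": len(parent),
        "n_leaves": len(leaves),
        "max_depth": max(depth(node) for node in parent),
    }


if __name__ == "__main__":
    # Hypothetical four-clone tree: root -> A -> C, root -> B
    print(tree_summary({"root": None, "A": "root", "B": "root", "C": "A"}))
```

Per-tree summaries of this kind give one row per patient, which is the shape needed to correlate trees with clinical variables.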

Cancer tissue

Context

DNA RNA Protein ChIP

Van’t Veer et al (2002); http://ms.lbl.gov; Ross-Innes et al (2012)

Tumours are complex tissues

Automated image analysis

Supervised classification

Spatial smoothing

Cell types and location

H&E

Quantitative analysis of tumour composition
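The smoothing and quantification steps can be sketched in a few lines. The example assumes that a classifier has already assigned a cell type to every detected cell; smoothing is done here by majority vote among the nearest cells, which is only one possible choice and not necessarily the method behind these slides. The function names and the labels "cancer" and "lymphocyte" are hypothetical.

```python
from collections import Counter


def smooth_labels(cells, k=5):
    """Spatially smooth per-cell labels by majority vote.

    cells: list of (x, y, label) tuples from an upstream classifier.
    Each cell takes the most common label among its k nearest cells
    (the cell itself is included in the vote).
    """
    smoothed = []
    for xi, yi, label_i in cells:
        by_distance = sorted(cells, key=lambda c: (c[0] - xi) ** 2 + (c[1] - yi) ** 2)
        votes = Counter(label for _, _, label in by_distance[:k])
        top_count = max(votes.values())
        winners = [label for label, count in votes.items() if count == top_count]
        # On a tie, keep the original label if it is among the winners.
        smoothed.append(label_i if label_i in winners else winners[0])
    return smoothed


def composition(labels):
    """Fraction of cells of each type."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical toy slide: a cancer nest with one misclassified cell,
    # plus a separate lymphocyte cluster.
    cells = [(x, y, "cancer") for x in range(3) for y in range(3)]
    cells[4] = (1, 1, "lymphocyte")
    cells += [(x, y, "lymphocyte") for x in range(10, 13) for y in range(10, 13)]
    print(composition(smooth_labels(cells, k=5)))
```

The brute-force neighbour search is quadratic in the number of cells; whole-slide images with many thousands of cells would need a spatial index instead.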

Spatial features of tumour tissue

FUTURE

• ICGC pan-cancer analysis: 2500 genomes

• Collect tissues for as many samples as possible.

• Correlate tissue architecture with clinical data

• Correlate tissue architecture with evolution.

Florian Markowetz

CRUK Cambridge Institute

www.markowetzlab.org

Big Data in Genomics

Thank you!