Can we measure everything? Professor Jacques Corbeil, Canada Research Chair in Medical Genomics


Post on 19-Jan-2016



Can we measure everything?

Professor Jacques Corbeil, Canada Research Chair in Medical Genomics


Take-home messages for Big Data

•Can we measure everything?

•Yes

•Be careful what you wish for!


Take-home message

•Can we measure everything?

•Yes

•Be careful what you wish for!

Massive amounts of unstructured data


Take-home message

•Can we measure everything?

•Yes

•Be careful what you wish for!

Big data need analysis pipelines


Plan of the presentation

•Sequencing for genomics and metagenomics.

•Mass spectrometry for metabolomics.

•Biological computing and machine learning will be interspersed throughout.


Genomics and Metagenomics

•Next-generation sequencing.

Frédéric Raymond, Maxime Déraspe, Pier-Luc Plante & Alexandre Drouin


Metagenomic analysis with Ray Meta

[Figure: Ray Meta workflow. Panel labels: reads; de Bruijn graph; assembly; reference genomes and taxonomy; colored k-mers; colored de Bruijn graph; colored assembly; profiling.]

Example profiling output (taxon frequencies):

Bacteroidaceae 48 %

Rikenellaceae 15 %

Clostridiaceae 6 %

…

(Boisvert et al. 2012)
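The profiling step on this slide can be pictured with a toy sketch. It assumes that each k-mer from the reference genomes is "colored" with the taxa it occurs in, and that only read k-mers carrying exactly one color are counted. The function names, the value of k, and the sequences are hypothetical; this is an illustration of the idea, not Ray Meta's actual implementation.

```python
from collections import Counter, defaultdict

def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]

def color_kmers(references, k):
    """Map each reference k-mer to the set of taxa (colors) that contain it."""
    colors = defaultdict(set)
    for taxon, genome in references.items():
        for kmer in kmers(genome, k):
            colors[kmer].add(taxon)
    return colors

def profile(reads, colors, k):
    """Return taxon frequencies from read k-mers that carry exactly one color."""
    counts = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            taxa = colors.get(kmer, ())
            if len(taxa) == 1:  # ambiguous or unknown k-mers are skipped
                counts[next(iter(taxa))] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}

# Hypothetical toy data (not from the slides).
references = {"taxon_A": "ACGTACGTGA", "taxon_B": "TTGACCATGC"}
reads = ["ACGTACG", "GACCAT", "CGTGA"]
freqs = profile(reads, color_kmers(references, k=4), k=4)
```

With this toy input, six read k-mers match taxon_A and three match taxon_B, so the frequencies are two thirds and one third.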


Microbiome and antibiotics

124 million bp/sample

70 samples


Molecular epidemiology using whole genomes


Ray Surveyor on 1600 bacterial genomes

A new way to do phylogeny!

• Can be adapted to the whole genome, the core genome, or other subsets.

• Superfast

• Precise

• Insanely great!


Resistance of P. aeruginosa using whole genomes


Ray Surveyor on S. pneumoniae genomes (normalized)

Lots of k-mers: 10 242 551

Whole genome

3h30 on 408 CPUs (17 computers)


Big data epidemiology with Ray Surveyor

C. difficile

Whole genome


Comparing whole genomes with specific genes

Whole genome vs. CRISPR only

Developed an algorithm to calculate similarity.
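The slides do not say which similarity measure is used. As a minimal sketch, assume similarity is the Jaccard index over k-mer sets, optionally restricted to a chosen subset of k-mers (for example, those from CRISPR regions or resistance genes). The function names, the `keep` parameter, and the sequences are hypothetical.

```python
from itertools import combinations

def kmer_set(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def similarity_matrix(genomes, k, keep=None):
    """Pairwise Jaccard similarity of k-mer sets.

    `keep` is an optional set of k-mers; when given, only those k-mers
    are compared (assumption: this mimics a gene-restricted comparison).
    """
    names = sorted(genomes)
    sets = {n: kmer_set(genomes[n], k) for n in names}
    if keep is not None:
        sets = {n: s & keep for n, s in sets.items()}
    sim = {(n, n): 1.0 for n in names}
    for a, b in combinations(names, 2):
        union = sets[a] | sets[b]
        score = len(sets[a] & sets[b]) / len(union) if union else 0.0
        sim[(a, b)] = sim[(b, a)] = score
    return sim

# Hypothetical toy genomes (not from the slides).
genomes = {"g1": "ACGTACGT", "g2": "ACGTTCGT", "g3": "GGGGCCCC"}
whole = similarity_matrix(genomes, k=4)
subset = similarity_matrix(genomes, k=4, keep={"ACGT"})
```

Restricting the comparison changes the picture: g1 and g2 share one of eight k-mers overall, but they are identical on the kept subset.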


Comparing whole genomes with specific genes

Whole genome vs. resistance genes

We will have a number for homology!


Datasets

Clostridium difficile (source: Dr. Vivian Loo, McGill University): 32 823 803, m = 470

Pseudomonas aeruginosa (source: PMID 25367914): 132 487 288, m = 393

Mycobacterium tuberculosis (source: PMID 25599400): 11 255 033, m = 154

Streptococcus pneumoniae (source: PMID 23644493): 10 542 251, m = 680


Benchmark

The Set Covering Machine (SCM) outperforms other state-of-the-art biomarker discovery methods in terms of sparsity (in parentheses) and compares favorably in terms of accuracy.


Results

On most datasets, the obtained models are highly accurate and rely on very few k-mers.


Example Models


Metabolomics

•Mass spectrometry and metabolomics.

Pier-Luc Plante, Alexandre Drouin, Francis Brochu, Prudencio Toussou, Nancy Boucher


Mass spectrometry

The paradigm


We love the hay.


High-throughput mass spectrometry: Laser Diode Thermal Desorption (LDTD)

One sample every 10 s on average

Big Data approaches


Aims of the research program

•Better quality control procedures

•Diagnostic tools in health and disease states.

•Ultimately, predict paths and assist in decision-making.


The Set Covering Machine

• Supervised learning algorithm

• 3 interesting properties in our context:

• Accurate: state-of-the-art predictive error

• Interpretable: sparse models

• Scalable: optimal algorithmic complexity

• Marchand, M., & Shawe-Taylor, J. (2003). The set covering machine. The Journal of Machine Learning Research, 3, 723-746.
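To make the idea of a sparse, interpretable model concrete, here is a simplified sketch of the greedy conjunction case of a set covering machine, applied to k-mer presence/absence rules. At each step it picks the rule that rejects the most remaining negative examples, minus a penalty `p` for each positive example it would wrongly reject. The function names, the parameter defaults, and the toy data are assumptions for illustration; this is not the authors' implementation.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, bool]  # (k-mer, True = must be present, False = must be absent)

def rule_holds(rule: Rule, kmers: Set[str]) -> bool:
    kmer, must_be_present = rule
    return (kmer in kmers) == must_be_present

def train_conjunction(samples: List[Set[str]], labels: List[bool],
                      p: float = 1.0, max_rules: int = 3) -> List[Rule]:
    """Greedily build a conjunction of presence/absence rules."""
    all_kmers = sorted(set().union(*samples))
    candidates = [(k, flag) for k in all_kmers for flag in (True, False)]
    negatives = {i for i, y in enumerate(labels) if not y}
    positives = {i for i, y in enumerate(labels) if y}
    model: List[Rule] = []
    while negatives and len(model) < max_rules:
        best = None
        for rule in candidates:
            # A rule "covers" a negative example when it fails on it.
            covered = {i for i in negatives if not rule_holds(rule, samples[i])}
            lost = {i for i in positives if not rule_holds(rule, samples[i])}
            utility = len(covered) - p * len(lost)
            if best is None or utility > best[0]:
                best = (utility, rule, covered, lost)
        if best is None or best[0] <= 0:
            break  # no remaining rule is worth adding
        _, rule, covered, lost = best
        model.append(rule)
        negatives -= covered
        positives -= lost
    return model

def predict(model: List[Rule], kmers: Set[str]) -> bool:
    """Predict positive only if every rule in the conjunction holds."""
    return all(rule_holds(rule, kmers) for rule in model)

# Hypothetical toy data: positives (e.g., resistant isolates) carry "AAA".
samples = [{"AAA", "CCC"}, {"AAA", "GGG"}, {"CCC", "GGG"}, {"CCC"}]
labels = [True, True, False, False]
model = train_conjunction(samples, labels)
```

On this toy data a single rule, presence of "AAA", separates the two classes, which is the kind of sparsity the slide refers to.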


Sensitive detection

Approximately 40,000 peaks per spectrum.


[Figure: MS spectra for Patients 1 to 5; x-axis: m/z, y-axis: intensity.]

Each peak is a potential biomarker


New problem = new algorithms

Problem: we have many mass spectra, but the m/z values are not identical from one sample to another.

• Reference-free mass spectrum alignment (RF-MSA)

• Virtual lock mass (VLM)

RF-MSA: corrects peak shape variation (ion distribution)

VLM: homologous peak distance correction
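The slides do not describe how either correction is computed. One way to picture a lock-mass style correction is the sketch below, which assumes that a few reference peaks are observed in a spectrum at slightly shifted positions and that the shift between two reference points varies linearly. The function name, parameters, and numbers are hypothetical, and linear interpolation is an assumption rather than the presented algorithm.

```python
from bisect import bisect_right

def correct_mz(peaks, observed_refs, target_refs):
    """Shift each peak by a correction interpolated between reference points.

    observed_refs: sorted m/z of reference peaks as seen in this spectrum.
    target_refs:   the m/z those reference peaks should have.
    Peaks outside the reference range reuse the nearest reference's shift.
    """
    shifts = [t - o for o, t in zip(observed_refs, target_refs)]
    corrected = []
    for mz in peaks:
        j = bisect_right(observed_refs, mz)
        if j == 0:
            shift = shifts[0]
        elif j == len(observed_refs):
            shift = shifts[-1]
        else:
            lo, hi = observed_refs[j - 1], observed_refs[j]
            w = (mz - lo) / (hi - lo)  # position between the two references
            shift = shifts[j - 1] + w * (shifts[j] - shifts[j - 1])
        corrected.append(mz + shift)
    return corrected

# Hypothetical example: references drift by 0.01 at m/z 100 and 0.03 at m/z 200.
corrected = correct_mz([100.0, 150.0, 200.0, 250.0],
                       observed_refs=[100.0, 200.0],
                       target_refs=[100.01, 200.03])
```

After such a correction, homologous peaks from different samples land on closer m/z values, which is what makes the later consensus step possible.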


Aligned spectra

A consensus peak must appear in all 4 spectra, but the threshold can be relaxed (e.g., 3 of 4).
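The consensus rule can be sketched as a simple count over aligned spectra. This example assumes the spectra are already aligned, so that homologous peaks fall in the same m/z bin of width `tolerance`; the function name, the parameters, and the peak lists are hypothetical.

```python
from collections import defaultdict

def consensus_peaks(spectra, min_fraction=1.0, tolerance=0.01):
    """Return m/z values present in at least `min_fraction` of the spectra.

    Peaks are grouped by rounding m/z to a grid of width `tolerance`.
    """
    seen = defaultdict(set)  # bin index -> indices of spectra containing it
    for idx, spectrum in enumerate(spectra):
        for mz in spectrum:
            seen[round(mz / tolerance)].add(idx)
    needed = min_fraction * len(spectra)
    return sorted(round(b * tolerance, 6)
                  for b, owners in seen.items() if len(owners) >= needed)

# Hypothetical aligned peak lists for 4 spectra.
spectra = [
    [100.12, 250.40, 300.05],
    [100.12, 250.40],
    [100.12, 250.40, 410.77],
    [100.12, 300.05],
]
strict = consensus_peaks(spectra)                      # peak in all 4 spectra
relaxed = consensus_peaks(spectra, min_fraction=0.75)  # peak in at least 3 of 4
```

Relaxing the threshold trades strictness for coverage: the strict consensus keeps one peak here, while the 3-of-4 consensus keeps two.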


50 of the 192 aligned spectra

Consensus (502 matches)

Can do clustering and multifactorial analyses!


100 of the 1000 aligned spectra

At this stage, one needs more help!


Plasma samples (male and female)


Can machine learning do better?

48 samples and 12 832 peaks in the dataset


Perspectives

• We are in a position to derive signatures for specific states of complex biological matrices.

• Useful for diagnostics and for monitoring industrial processes.

• Relatively cheap compared to sequencing and immunodetection systems.

• More in line with a clinical or industrial setting, since you process one sample at a time for the same cost.


Mass spectrometry

A new paradigm!!!


Acknowledgements

• Nancy Boucher

• Francis Brochu

• Alexandre Drouin

• Sébastien Giguère

• Pier-Luc Plante

• Frédéric Raymond

• Lynda Robitaille

• Prudencio Toussou

Héma-Québec: Louis Thibault

AstraZeneca: Veronica Kos, Humphrey Gardner

Phytronix: Serge Auger, Jean Lacoursière, Pierre Picard

Royal Victoria: Vivian Loo

INSPQ: Cécile Tremblay

Waters Corp.: Keith Fadgen, Geoff Gerhardt

Big Data Centre: François Laviolette

U. Laval: Mario Marchand