Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545


Upload: rosemary-mccloskey

Post on 20-Jun-2015


DESCRIPTION

Presentation for BIOF 501 at the University of British Columbia. ManChon, U., et al. "Prediction and Prioritization of Rare Oncogenic Mutations in the Cancer Kinome Using Novel Features and Multiple Classifiers." PLoS computational biology 10.4 (2014): e1003545.

TRANSCRIPT

Page 1: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Prediction and prioritization of rare oncogenic mutations in the cancer kinome using novel features and multiple classifiers

Presentation by Rosemary McCloskey

ManChon U1, Eric Talevich2, Samiksha Katiyar3, Khaled Rasheed1, Natarajan Kannan3,4

1Department of Computer Science, University of Georgia

2Department of Dermatology, University of California San Francisco

3Department of Biochemistry and Molecular Biology, University of Georgia

4Institute of Bioinformatics, University of Georgia

September 4, 2014

U et al. Oncogenic kinome mutations September 4, 2014 1 / 11

Page 2: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Driver vs. passenger mutations

driver passengers

How to distinguish driver from passenger mutations?

Page 10: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Protein kinases

Enzymes that transfer a phosphate group from ATP to a protein substrate.

Regulate cellular processes.

Kinase mutations implicated in cancer.

518 human protein kinase genes (Manning et al. 2002).

8 groups (plus “other” and “atypical”), 133 families.

11 structurally conserved regions among all kinases (“subdomains”).


Page 11: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Training and validation data

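[Figure: supervised learning illustrated with labelled examples of birds and people; an unlabelled example ("what is this?") is classified as a bird.]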

Supervised machine learning approach.

Known oncogenic: Catalogue of Somatic Mutations in Cancer (COSMIC) database.

  - Filtered for protein kinases.
  - Two sets, one excluding mutations reported only once.

Known benign: protein kinase SNPs (SNP@Domain).
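
The data preparation described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function name `build_training_sets`, the `(gene, mutation)` record layout, and the gene and mutation labels in the example are all hypothetical.

```python
from collections import Counter

def build_training_sets(cosmic_records, snp_records, kinase_genes):
    """Return (all_oncogenic, recurrent_oncogenic, benign).

    Each record is a hypothetical (gene, mutation) tuple; a mutation
    reported several times simply appears several times in the input.
    """
    # Keep only records that fall in protein kinase genes.
    cosmic = [(g, m) for g, m in cosmic_records if g in kinase_genes]
    counts = Counter(cosmic)
    # First oncogenic set: every distinct kinase mutation.
    all_oncogenic = sorted(counts)
    # Second oncogenic set: exclude mutations reported only once.
    recurrent_oncogenic = sorted(k for k, n in counts.items() if n > 1)
    # Benign set: kinase SNPs, de-duplicated.
    benign = sorted({(g, m) for g, m in snp_records if g in kinase_genes})
    return all_oncogenic, recurrent_oncogenic, benign

# Toy input with made-up gene and mutation names.
cosmic = [("KIN1", "A10V"), ("KIN1", "A10V"), ("KIN2", "G20D"), ("NOTK", "P5L")]
snps = [("KIN1", "S30T"), ("NOTK", "Q7R")]
all_onc, recurrent, benign = build_training_sets(cosmic, snps, {"KIN1", "KIN2"})
```

Keeping both oncogenic sets makes it possible to compare classifiers trained with and without singly reported mutations.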

Page 17: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Mutation features

Identified 29 mutation features.

Kinase group and family classification.

Conservation of position within all kinases, family, and group.

Amino acid properties: charge, polarity, mass, etc.

Structural and functional: kinase sub-domain, binding site, posttranslational modification.
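
As a rough illustration of how amino acid property features might be derived from a mutation label, the sketch below parses labels such as L861R and computes a charge change and a polarity change. The charge table (histidine treated as neutral), the coarse polarity classes, and the names `parse_mutation` and `encode` are simplifying assumptions, not the feature definitions used in the paper.

```python
import re

# Side-chain charge near neutral pH (assumption: histidine counted as neutral).
CHARGE = {"R": 1, "K": 1, "D": -1, "E": -1}
# Coarse set of polar residues (a simplified, assumed classification).
POLAR = set("STNQYCHRKDE")

def parse_mutation(label):
    """Split a label like 'L861R' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant

def encode(label):
    """Return a small dictionary of illustrative property features."""
    wild_type, position, mutant = parse_mutation(label)
    return {
        "position": position,
        "charge_change": CHARGE.get(mutant, 0) - CHARGE.get(wild_type, 0),
        "polarity_change": (mutant in POLAR) != (wild_type in POLAR),
    }

features = encode("L861R")
```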

Which of these is important?

Page 24: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Feature selection

5 feature selection algorithms.

10-fold “cross-validation” for each algorithm.

  - Ran 10 times, each time excluding 10% of data.
  - Averaged results over all 10 folds.

17 features chosen by 3/5 selectors were retained.

Kinase family and group were most important by a wide margin.
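
The voting step can be sketched as below, assuming "chosen by 3/5 selectors" means selected by at least three of the five algorithms. The function name `vote_features` and the toy feature names are hypothetical, and the selectors themselves are not modelled here.

```python
from collections import Counter

def vote_features(selections, min_votes=3):
    """Keep features picked by at least `min_votes` of the selectors.

    `selections` holds one collection of feature names per selector.
    """
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)

# Toy output of five hypothetical selectors.
selections = [
    {"family", "group", "charge"},
    {"family", "group"},
    {"family", "mass"},
    {"group", "charge"},
    {"family", "charge"},
]
retained = vote_features(selections)
```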

Page 28: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Training and cross-validation

11 machine learning methods.

10-fold cross-validation for each (train on 90%, test on 10%).

Quantify accuracy of each method by

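\[
F\text{-measure} = \frac{2 \times \text{recall} \times \text{precision}}{\text{recall} + \text{precision}},
\]

where

\[
\text{precision} = \frac{\text{identified true oncogenic}}{\text{all identified oncogenic}},
\qquad
\text{recall} = \frac{\text{identified true oncogenic}}{\text{all true oncogenic}}.
\]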
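
A minimal sketch of the evaluation above, in plain Python. The F-measure follows the definitions on this slide; the interleaved fold assignment in `k_fold_indices` is an assumption made for brevity (the slides do not say how folds were drawn), and both function names are hypothetical.

```python
def f_measure(true_labels, predicted_labels, positive="oncogenic"):
    """F-measure = 2 * recall * precision / (recall + precision)."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    if true_pos == 0:
        return 0.0  # avoids division by zero when nothing is recovered
    precision = true_pos / sum(p == positive for _, p in pairs)
    recall = true_pos / sum(t == positive for t, _ in pairs)
    return 2 * recall * precision / (recall + precision)

def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; fold i tests items i, i+k, i+2k, ..."""
    for i in range(k):
        test = list(range(i, n_items, k))
        held_out = set(test)
        train = [j for j in range(n_items) if j not in held_out]
        yield train, test

truth = ["oncogenic", "oncogenic", "benign", "benign", "oncogenic"]
guess = ["oncogenic", "benign", "oncogenic", "benign", "oncogenic"]
score = f_measure(truth, guess)  # precision = recall = 2/3 here
folds = list(k_fold_indices(20, k=10))
```

With 10 folds each model is trained on 90% of the items and tested on the remaining 10%, and the per-fold F-measures can then be averaged.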

Page 31: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

EGFR mutations

Combined trained classifiers to analyse rare mutations in epidermal growth factor receptor.

L861R, G724S most likely oncogenic.

T725M, L858Q middle-ranked but unknown functional impact.

E746K low ranked.
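
The slides do not say how the classifiers were combined. One simple possibility, shown here purely as an assumption, is to average each mutation's score across classifiers and rank by the mean. The function name `rank_by_mean_score`, the mutation labels, and the scores below are made up and are not results from the paper.

```python
from statistics import mean

def rank_by_mean_score(scores):
    """Rank mutations by mean classifier score, highest first.

    `scores` maps a mutation label to one score per classifier.
    """
    averaged = {mutation: mean(values) for mutation, values in scores.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores from three classifiers for three placeholder mutations.
toy_scores = {
    "mut_a": [0.9, 0.8, 0.7],
    "mut_b": [0.2, 0.4, 0.3],
    "mut_c": [0.5, 0.6, 0.7],
}
ranking = rank_by_mean_score(toy_scores)
```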

Page 35: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

In vitro verification

Effect of each mutation on EGFR autophosphorylation and Akt phosphorylation.

L861R, T725M, E746K showed increased EGFR autophosphorylation.

  - Up-regulation of cell proliferation.

G724S, L858Q showed increased Akt (protein kinase B) phosphorylation.

  - Blocks apoptosis.

Page 38: Seminar: U et al. 2014 PLoS Comp. Biol. 10(4):e1003545

Conclusions

Used machine learning to classify rare EGFR mutations.

Kinase group and family were most important predictors.

Identified T725M, L861R as likely cancer-associated with an obvious mechanism (activating EGFR).

L858Q, G724S also likely oncogenic, but less obvious mechanism (Akt?).
