Global Annotation of the Protein Kinase Family Michael Gribskov University of California, San Diego


Post on 29-Jan-2016


Page 1:

Global Annotation of the Protein Kinase Family

Michael Gribskov

University of California, San Diego

Page 2:

Signaling Cascades

Page 3:

Page 4:

Statistics

• Arabidopsis
  • 1028 putative kinases
  • 58 potentially alternatively spliced
  • 82% confirmed by full-length cDNA
  • Fewer than 100 experimentally investigated

• Rice
  • 1565 putative kinases

• What is the function of each protein kinase?
  • Functional groupings
  • Substrate prediction
  • Pathway analysis and modeling

Page 5:

Targets

• Protein kinase
• Protein phosphatase
• Membrane transporters
• Proteasome complex

Page 6:

Some Receptor Kinases

Class I (EGF receptor)

Class II (Insulin receptor)

Class III (FGF receptor)

Page 7:

Requirements for Functional Clustering

• Must handle a very large number of objects (over 1200 for plants, over 9000 for all species)

• Must deal sensibly with paralogs from a functional point of view

• Must be based on the entire sequence, not just the kinase catalytic domain

• Must be tolerant of sequence errors and omissions

Page 8:

Orthology vs Paralogy

• Relationships between genes in multigene families are complex

• Multiple genes may exist before speciation
• Genes may be lost and replaced along lineages
• “Function space” must be filled

[Figure labels: Species A, Species B]

Page 9:

Maximal Linkage Clustering

Clustering

Page 10:

Clustering

[Figure: clustering example on eight points, a to h, with panels A, B, and C. Panel captions: "Average linkage" and "Maximum linkage". Leaf orders as extracted: a b c e f g h d; a b c d e f g h.]
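
As a rough illustration of the two criteria named on this slide, the sketch below computes the distance between two groups under average linkage (the mean of all cross-group pairwise distances) and under maximum linkage, assuming the latter means what is usually called complete linkage (the largest cross-group pairwise distance). The points, their coordinates, and the function names are made up for the example and do not come from the slides.

```python
from itertools import product


def average_linkage(group_a, group_b, dist):
    """Mean of all pairwise distances between members of the two groups."""
    pairs = list(product(group_a, group_b))
    return sum(dist(x, y) for x, y in pairs) / len(pairs)


def maximum_linkage(group_a, group_b, dist):
    """Largest pairwise distance between members of the two groups."""
    return max(dist(x, y) for x, y in product(group_a, group_b))


# Hypothetical 1-D positions, chosen only to make the difference visible.
points = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 10.0}


def dist(x, y):
    return abs(points[x] - points[y])


left, right = ["a", "b"], ["c", "d"]
avg = average_linkage(left, right, dist)  # (2 + 10 + 1 + 9) / 4
mx = maximum_linkage(left, right, dist)   # dominated by the outlier d
print(avg, mx)
```

Under maximum linkage a single distant member (here `d`) is enough to keep two groups apart, which is why the two criteria can produce different trees from the same points.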

Page 11:

Clustering/Classification

Maximum linkage

Page 12:

Clustering/Classification

• Pairwise distances
• All-against-all BLAST

Uses entire sequence

Alignments not required

Longer matches, i.e. more domains, give better score

[Figure: histogram. x-axis: -log(E-value), 0 to 180; y-axis: Number, 0 to 25000.]
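
A minimal sketch of turning all-against-all BLAST output into pairwise scores on the -log(E-value) scale used in the histogram. It assumes BLAST+ tabular output (`-outfmt 6`), in which the E-value is the 11th column. The cap of 180 for E-values reported as 0, the choice to keep the better of the two search directions, and the sequence names are assumptions made for the example, not details taken from the slides.

```python
import math

MAX_SCORE = 180.0  # assumed cap, used when BLAST reports an E-value of 0


def neg_log_evalue(evalue, cap=MAX_SCORE):
    """Return -log10(E-value), clamped to the range [0, cap]."""
    if evalue <= 0.0:
        return cap
    return max(0.0, min(cap, -math.log10(evalue)))


def pairwise_scores(blast_lines):
    """Map each unordered sequence pair to its best -log10(E-value)."""
    scores = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject:
            continue  # self-hits carry no pairwise information
        key = frozenset((query, subject))
        score = neg_log_evalue(evalue)
        # Assumption: keep the stronger of the A-vs-B and B-vs-A hits.
        scores[key] = max(score, scores.get(key, 0.0))
    return scores


# Made-up hits in -outfmt 6 column order (not real data).
example = [
    "kinA\tkinB\t50.0\t300\t150\t0\t1\t300\t1\t300\t1e-80\t290",
    "kinB\tkinA\t50.0\t300\t150\t0\t1\t300\t1\t300\t1e-75\t280",
    "kinA\tkinA\t100.0\t300\t0\t0\t1\t300\t1\t300\t0.0\t600",
]
scores = pairwise_scores(example)
print(scores)
```

Because the score comes from the whole-sequence hit rather than a trimmed alignment, a pair sharing more domains gets a smaller E-value and therefore a larger score.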

Page 13:

Basic Approach

• Maximum linkage clustering up to “natural” limit
• Recalculate average distances between groups
• Repeat until tree is complete
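
The steps above can be sketched as follows. This is one possible reading, not the author's implementation: the slide does not define the "natural" limit, so the sketch takes a hypothetical list of increasing distance cutoffs (`limits`), one per round. Maximum linkage is assumed to mean complete linkage, and all names and distances are invented for the example.

```python
from itertools import combinations
from statistics import mean


def max_linkage_groups(items, dist, limit):
    """Merge groups greedily while the largest cross-group distance stays <= limit."""
    groups = [[item] for item in items]
    while True:
        best = None
        for a, b in combinations(range(len(groups)), 2):
            d = max(dist[frozenset((x, y))] for x in groups[a] for y in groups[b])
            if d <= limit and (best is None or d < best[0]):
                best = (d, a, b)
        if best is None:
            return groups
        _, a, b = best
        groups[a] += groups[b]
        del groups[b]


def build_tree(leaves, leaf_dist, limits):
    """Alternate maximum-linkage rounds with average-distance recalculation.

    `limits` is a hypothetical schedule of cutoffs standing in for the
    slide's "natural" limit, which the source does not define.
    """
    # Each node is (subtree, tuple of member leaves).
    nodes = [(leaf, (leaf,)) for leaf in leaves]
    for limit in limits:
        if len(nodes) == 1:
            break
        ids = list(range(len(nodes)))
        # Average leaf-to-leaf distance between every pair of current nodes.
        node_dist = {
            frozenset((i, j)): mean(
                leaf_dist[frozenset((x, y))]
                for x in nodes[i][1] for y in nodes[j][1]
            )
            for i, j in combinations(ids, 2)
        }
        merged = []
        for group in max_linkage_groups(ids, node_dist, limit):
            if len(group) == 1:
                merged.append(nodes[group[0]])
            else:
                subtree = tuple(nodes[i][0] for i in group)
                members = tuple(m for i in group for m in nodes[i][1])
                merged.append((subtree, members))
        nodes = merged
    # Anything still unmerged after the last limit is joined at the root.
    if len(nodes) == 1:
        return nodes[0][0]
    return tuple(node[0] for node in nodes)


# Made-up distances: a/b are close, c/d are close, the two pairs are far apart.
raw = {("a", "b"): 1, ("c", "d"): 1, ("a", "c"): 5,
       ("a", "d"): 5, ("b", "c"): 5, ("b", "d"): 5}
leaf_dist = {frozenset(pair): d for pair, d in raw.items()}
tree = build_tree(["a", "b", "c", "d"], leaf_dist, limits=[2, 10])
print(tree)
```

The first round forms tight groups only; later rounds treat each group as a single object whose distance to other groups is the average over their members, so the upper levels of the tree are built from group-level rather than sequence-level distances.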

Page 14:

Complete Kinase Clustering

Page 15:

Statistics

• Class 1: RLKs (transmembrane) and RLCKs
• Class 2: “Raf-like”
• Class 3: Casein Kinase and CLK
• Class 4: Non-TM, Non-Receptor

Page 16:

BLAST Distance

Entire Sequence

Page 17:

BLAST Distance

Non-Kinase Domain

Page 18:

Yeast Signaling (MAPK)

Page 19:

Validating Transgenomic Predictions

Page 20:

SnRK

• At AKIN10 and AKIN11
• Rescue yeast SNF1 deletion
• Functional homolog

GIN4/ERC47/CLA6/D9719.13/YDR507C KCC4/YCL024W HSL1/(SEL2)/NIK1/YKL453/YKL101W SNF1/CAT1/GLC2/CCR1/PAS14/HAF3/D8035.20/YDR477W

At5g39440 At3g29160/AKIN11 At3g01090/AKIN10 At5g58380 At5g07070 At5g01810 At5g45820 At4g30960 At5g25110 At5g10930 At2g25090 At2g30360 At5g01820/AtSR1 At2g38490 At3g23000/AtSR2 At4g14580 At1g01140 At1g30270 At2g26980 At4g24400 At5g35410/SOS2 At1g48260 At3g17510 At5g57630 At1g60940 At1g10940 At5g08590 At5g63650 At2g23030 At1g78290 At3g50500 At5g66880 At4g33950 At4g40010 At1g29230 At2g34180 At4g18700 At5g45810

KIN1/YD9727.17/YDR122W KIN2/L8004.3/L2546/YLR096W KIN4/KIN31/(KIN3)/O5220/YOR233W YPL141C/LPI5 YPL150W/P2597

[Tree figure: scale bar 50; E = 10^-80; see Fig. 2]

Page 21:

MAPK Erk1/Human/CMGC MAPK ERK Erk2/Human/CMGC MAPK ERK rl/Fruit Fly/CMGC MAPK ERK mpk 1/Nematode worm/CMGC MAPK ERK Erk5/Human/CMGC MAPK ERK FUS3/YBL016W/Bakers Yeast/CMGC MAPK ERK KSS1/YGR040W/Bakers Yeast/CMGC MAPK ERK HOG1/YLR113W/Bakers Yeast/CMGC MAPK p38 At1g10210.1 4 5 1 AtMPK1 MAP kinase 1 At1g59580.1 4 5 1 AtMPK2 MAP kinase 2 At1g59580.2 AtMPK2 MAP kinase 2 9630.m00469/protein MAP kinase MAPK2 9634.m04729/protein MAP kinase 2 At2g18170.1 4 5 1 AtMPK7 MAP kinase 7 At4g36450.1 4 5 1 MPK14 MAP kinase 14 At2g43790.1 4 5 1 AtMPK6 MAP kinase 6 9634.m00532/protein Protein kinase domain putative At3g45640.1 4 5 1 AtMPK3 MAP kinase 3 9631.m01739/protein Protein kinase domain putative At3g59790.1 4 5 1 AtMPK10 MAP kinase 10 At2g46070.1 4 5 1 AtMPK12 MAP kinase 12 At4g01370.1 4 5 1 AtMPK4 MAP kinase 4 9636.m00537/protein mitogen activated protein kinase MMK2 <EC 9638.m03508/protein putative serine/threonine protein kinase

SLT2/YHR030C/Bakers Yeast/CMGC MAPK ERK YKL161C/Bakers Yeast/CMGC MAPK ERK At4g11330.1 4 5 1 AtMPK5 MAP kinase 5 NLK/Human/CMGC MAPK nmo nmo/Fruit Fly/CMGC MAPK nmo lit 1/Nematode worm/CMGC MAPK nmo At1g01560.1 4 5 1 MPK11 MAP kinase 11 At1g07880.1 4 5 1 AtMPK13 MAP kinase 13 At1g18150.1 4 5 1 AtMPK8 MAP kinase 8 At3g18040.2 AtMPK9 MAP kinase 9 At1g18150.2 AtMPK8 MAP kinase 8 At1g53510.1 4 5 1 AtMPK18 MAP kinase 18 At2g42880.1 4 5 1 AtMPK20 MAP kinase 20 9630.m00329/protein MAP kinase homolog 9633.m04760/protein MAP kinase putative 9633.m00448/protein expressed protein 9633.m04713/protein ATMPK9 9634.m04815/protein MAP kinase homolog At3g14720.1 4 5 1 AtMPK19 MAP kinase 19 At2g01450.1 4 5 1 AtMPK17 MAP kinase 17 9629.m04359/protein blast and wounding induced At3g18040.1 4 5 1 AtMPK9 MAP kinase 9 At5g19010.1 4 5 1 AtMPK16 MAP kinase 16 9629.m04231/protein Protein kinase domain putative 9634.m02609/protein mitogen activated protein kinase homologue 9629.m04560/protein Protein kinase domain putative At1g73670.1 4 5 1 AtMPK15 MAP kinase 15 9633.m04602/protein Protein kinase domain putative Erk7/Human/CMGC MAPK Erk7 CG2309/Fruit Fly/CMGC MAPK Erk7 C05D10.2/Nematode worm/CMGC MAPK Erk7 SMK1/YPR054W/Bakers Yeast/CMGC MAPK ERK JNK1/Human/CMGC MAPK JNK JNK3/Human/CMGC MAPK JNK JNK2/Human/CMGC MAPK JNK bsk/Fruit Fly/CMGC MAPK JNK jnk 1/Nematode worm/CMGC MAPK JNK T07A9.3/Nematode worm/CMGC MAPK JNK ZC416.4/Nematode worm/CMGC MAPK JNK p38a/Human/CMGC MAPK p38 p38b/Human/CMGC MAPK p38 Mpk2/Fruit Fly/CMGC MAPK p38 p38b/Fruit Fly/CMGC MAPK p38 p38d/Human/CMGC MAPK p38 p38g/Human/CMGC MAPK p38 pmk 1/Nematode worm/CMGC MAPK p38 pmk 2/Nematode worm/CMGC MAPK p38 P38c/Fruit Fly/CMGC MAPK p38 pmk 3/Nematode worm/CMGC MAPK p38 C04G6.1/Nematode worm/CMGC MAPK F09C12.2/Nematode worm/CMGC MAPK W06B3.2/Nematode worm/CMGC MAPK

[Tree figure: scale bar 50]

Page 22:

MEME PSSM

Page 23:

PPC4.2.6 MEME Motifs

Page 24:

Summary

• Functional groups by clustering
• Functional assignment by transgenomic comparison
• Directed search for functional motifs by motif comparison
• Construction of public data resources

Page 25:

Bioinformatics Group

• Michael Gribskov
• Fariba Fana
• Degeng Wang
• Sheila Podell
• Tobey Tam *
• Jason Tchieu *
• Hannes Niedner

• Douglas Smith
• Guangfa Zhang *

• Jeff Harper

• Major Contributors
  • Catherine Chan
  • Alice Harmon
  • Estelle Hrabak
  • David Kerk
  • Shinhan Shiu