CASP 13 Predicting Contacts - predictioncenter.org
TRANSCRIPT
![Page 1: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/1.jpg)
CASP13
Predicting Contacts
Assessor: András Fiser
Department of Systems and Computational Biology
Department of Biochemistry
![Page 2: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/2.jpg)
Possible questions
• Does contact prediction accuracy correlate with that of structure modeling?
• How well did you do among yourselves?
• How well did you do compared to previous CASPs?
• Some insight analysis:
  – Are you capturing the same set of contacts?
  – Are there particular types of contacts that you are getting accurately?
  – How important is the quality of sequence information?
![Page 3: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/3.jpg)
Best structure prediction (out of 98) vs. Best contact predictions (out of 46)
FM and TBM/FM: G043 G322 G089 G145 G224 G261 G354 G498 G197 G460 G324 G135 G196 G055 G418 G117 G208 G274 G086 G192 G071 G222 G044
FM: G043 G322 G089 G145 G224 G498 G261 G354 G197 G324 G196 G208 G460 G135 G055 G117 G418 G366 G192 G274 G086 G457 G044
Contacts: X X G089(20) X G224(11) G498(1) X X X X X X X X X X X X X X X X X
Contacts only: G498(6) G032* G180* G323* G491* G106* G164(46) G189* G352* G125* G224(5) G036* G392* G351(54) G122(67) G386* G475* G154* G292* G089(3) G430* G041(63) G091*
![Page 4: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/4.jpg)
Best structure prediction (out of 98) vs. Best contact predictions (out of 46)
FM and TBM/FM: G043 G322 G089 G145 G224 G261 G354 G498 G197 G460 G324 G135 G196 G055 G418 G117 G208 G274 G086 G192 G071 G222 G044
FM: G043 G322 G089 G145 G224 G498 G261 G354 G197 G324 G196 G208 G460 G135 G055 G117 G418 G366 G192 G274 G086 G457 G044
Contacts: X X(G036)(12) G089(20) X(G032)(2) G224(11) G498(1) X(G180,G32)(3,2) X(G229)(39) X X(G498) X X X X X X(G491) X X X X X X X
Contacts only: G498(6); G032* (2) G322; G180* (2) G322; G323* (2) G322; G491* (16) G117; G106*; G164(46); G189*; G352*; G125*; G224(5); G036* (2) or (50) G116; G392*; G351(54); G122(67); G386*; G475*; G154*; G292*; G089(3); G430*; G041(63); G091*
![Page 5: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/5.jpg)
Best structure prediction (out of 98) vs. Best contact predictions (out of 46)
FM and TBM/FM: G043 G322 G089 G145 G224 G261 G354 G498 G197 G460 G324 G135 G196 G055 G418 G117 G208 G274 G086 G192 G071 G222 G044
FM: G043 G322 G089 G145 G224 G498 G261 G354 G197 G324 G196 G208 G460 G135 G055 G117 G418 G366 G192 G274 G086 G457 G044
Contacts: X X G089(20) X G224(11) G498(1) X X X X X X X X X X X X X X X X X
Contacts only: G498(6) G032* G180* G323* G491* G106* G164(46) G189* G352* G125* G224(5) G036* G392* G351(54) G122(67) G386* G475* G154* G292* G089(3) G430* G041(63) G091*
Difficult to establish clear relation between contact and structure prediction -> we do not know how well one could perform with a “top” contact prediction
Some top performing structure prediction groups did not submit contact prediction -> we do not know if they have a better contact prediction than others
![Page 6: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/6.jpg)
Among groups that have submitted both structure and contact prediction:
Surprising inconsistencies!!
It is important to know how to use contact information!
And/Or
Contact information is not as important as one thought
Best structure prediction (out of 98) vs. Best contact predictions (out of 46)
Contacts: X X 089(20) X 224(11) 498(1) X X X X X X X X X X X X X X X X X
Contacts only: 498(6) 032* 180* 323* 491* 106* 164(46) 189* 352* 125* 224(5) 036* 392* 351(54) 122(67) 386* 475* 154* 292* 089(3) 430* 041(63) 091*
89 submitted: 30/31 targets…)
![Page 7: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/7.jpg)
Are we predicting different contacts? Jaccard distance (“1-Intersection over Union”)
$$d_J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}$$
0 (same) < $d_J$ < 1 (different)
Top L/5 number of contacts, L is the length of sequence
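The Jaccard distance above can be sketched in a few lines of Python. This is a minimal illustration, assuming each group's prediction has already been reduced to its top L/5 contacts as residue-index pairs; the function name and the two toy lists are hypothetical and not taken from the assessment.

```python
def jaccard_distance(a, b):
    """Return (|A ∪ B| - |A ∩ B|) / |A ∪ B| for two collections of contacts."""
    # Store each contact as an unordered residue pair so (i, j) equals (j, i).
    set_a = {frozenset(pair) for pair in a}
    set_b = {frozenset(pair) for pair in b}
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty lists are treated as identical
    return (len(union) - len(set_a & set_b)) / len(union)


# Hypothetical top-L/5 lists from two groups (residue index pairs).
group_1 = [(3, 40), (5, 61), (10, 52), (12, 70)]
group_2 = [(40, 3), (5, 61), (11, 52), (14, 80)]

print(jaccard_distance(group_1, group_2))
```

Two of the six distinct contacts are shared here, so the two toy lists are closer to "different" than to "same".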
Contacts only: 498(6) 032* 180* 323* 491* 106* 164(46) 189* 352* 125* 224(5) 036* 392* 351(54) 122(67) 386* 475* 154* 292* 089(3) 430* 041(63) 091*
(RRMD)
(RRMD-plus)
Deltacontact
Gammacontact
Tripletres
Tripletres_AT
![Page 8: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/8.jpg)
Performance using different criteria
Models (FM or FM+TBM) × contacts (top 10 or L/5 or L/2 or L or FL) × probability (0 or 0.5) × contact definition (medium/long; long; extra long) => 60 combinations, evaluated by either: F1; Precision/Recall; Z-score sum or Z-score average, etc.
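The count of 60 follows from 2 × 5 × 2 × 3. A small Python sketch that enumerates the settings is given below; the string labels are shorthand chosen for this example, and the `f1_score` helper is just the standard harmonic mean of precision and recall, not the assessor's code.

```python
from itertools import product

# Labels are shorthand for the options listed on the slide.
models = ["FM", "FM+TBM"]
list_sizes = ["top10", "L/5", "L/2", "L", "FL"]
probability_cutoffs = [0.0, 0.5]
contact_ranges = ["medium/long", "long", "extra long"]

combinations = list(product(models, list_sizes, probability_cutoffs, contact_ranges))
print(len(combinations))  # 2 * 5 * 2 * 3 = 60


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```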
![Page 9: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/9.jpg)
Long/medium contacts, FM only, Zscore >0
![Page 10: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/10.jpg)
Long/medium contacts (FM only), sum Zscore (>0)
Panels: Long/medium: top 10; Long/medium: L/5; Long/medium: L/2; Long/medium: L
032 and 323 are the same
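One common reading of "sum Zscore (>0)" is: convert each target's scores to z-scores across groups, then add up only the positive z-scores for each group. The sketch below follows that reading. The use of the population standard deviation, the function name, and the toy scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def positive_zscore_sums(scores_by_target):
    """Map {target: {group: score}} to {group: sum of positive z-scores}."""
    totals = {}
    for scores in scores_by_target.values():
        values = list(scores.values())
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        for group, score in scores.items():
            z = (score - mu) / sigma if sigma > 0 else 0.0
            # Negative z-scores contribute nothing to the sum.
            totals[group] = totals.get(group, 0.0) + max(z, 0.0)
    return totals


# Invented per-target scores for three hypothetical groups.
toy_scores = {
    "target_1": {"group_a": 60.0, "group_b": 40.0, "group_c": 20.0},
    "target_2": {"group_a": 30.0, "group_b": 50.0, "group_c": 40.0},
}
totals = positive_zscore_sums(toy_scores)
print(sorted(totals.items(), key=lambda item: item[1], reverse=True))
```

Dropping negative z-scores means a group is not penalized for a poor result on a single target.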
![Page 11: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/11.jpg)
Long contacts, (FM only), sum Zscore (>0)
Panels: Long: L; Long: L/2; Long: L/5; Long: top 10
![Page 12: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/12.jpg)
Extra long contacts only, (FM only), Zscore>0
Panels: Extra long: top 10; Extra long: L/5; Extra long: L/2; Extra long: L
![Page 13: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/13.jpg)
Improvement in contact prediction accuracy over CASP10-13 meetings
Figure: average precision (y-axis, 0 to 70), long contacts, L/5 lists. Series: CASP10.
CASP10: 23 groups, 15 non-redundant
![Page 14: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/14.jpg)
Improvement in contact prediction accuracy over CASP10-13 meetings
Figure: average precision (y-axis, 0 to 70), long contacts, L/5 lists. Series: CASP11, CASP10.
CASP10: 23 groups, 15 non-redundant
CASP11: 28 groups, 22 non-redundant
![Page 15: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/15.jpg)
Improvement in contact prediction accuracy over CASP10-13 meetings
Figure: average precision (y-axis, 0 to 70), long contacts, L/5 lists. Series: CASP12, CASP11, CASP10.
CASP10: 23 groups, 15 non-redundant
CASP11: 28 groups, 22 non-redundant
CASP12: 31 groups, 22 non-redundant
![Page 16: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/16.jpg)
Improvement in contact prediction accuracy over CASP10-13 meetings
Figure: average precision (y-axis, 0 to 70), long contacts, L/5 lists. Series: CASP13, CASP12, CASP11, CASP10.
CASP10: 23 groups, 15 non-redundant
CASP11: 28 groups, 22 non-redundant
CASP12: 31 groups, 24 non-redundant
CASP13: 44 groups, 34 non-redundant
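The quantity plotted here is precision on L/5 lists of long contacts. A minimal Python sketch of that measure for one target is shown below. The slides do not state the sequence-separation cutoff, so the default of 24 residues (the conventional long-range threshold) is an assumption, as are the function name and the toy data; the set of native contacts is assumed to be given.

```python
def top_l5_precision(predicted, native_contacts, seq_len, min_separation=24):
    """Precision (percent) of the top L/5 predicted long-range contacts.

    predicted: iterable of (i, j, probability)
    native_contacts: set of (i, j) pairs with i < j
    min_separation: assumed long-range cutoff, not taken from the slides
    """
    # Keep only pairs at or beyond the assumed long-range separation.
    long_range = [(min(i, j), max(i, j), p) for i, j, p in predicted
                  if abs(i - j) >= min_separation]
    # Rank by predicted probability, highest first, and keep L/5 of them.
    long_range.sort(key=lambda contact: contact[2], reverse=True)
    top = long_range[: max(1, seq_len // 5)]
    if not top:
        return 0.0
    hits = sum((i, j) in native_contacts for i, j, _ in top)
    return 100.0 * hits / len(top)


# Invented example: a 40-residue target, so L/5 = 8 contacts are scored.
predicted = [(i, i + 29, 1.0 - 0.05 * i) for i in range(1, 10)] + [(10, 15, 0.98)]
native = {(1, 30), (2, 31), (3, 32), (4, 33), (6, 35), (9, 38)}
print(top_l5_precision(predicted, native, seq_len=40))
```

Averaging this value over targets and groups would give a per-CASP number comparable in spirit to the plotted bars, though the exact averaging used in the assessment is not stated in the slides.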
![Page 17: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/17.jpg)
T0953s1d1
Good F-score: 63.4
Poor F-score: 14.64
Best TS model (G43), cyan, GDT_TS 54.48
Contact model (G164), green, GDT_TS 41.05
![Page 18: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/18.jpg)
Relationship between sequence profile depth and success (F-score) of predicting contacts
• Less reliant on sequence profiles.
Figure: scatter plot of number of hits against F-score. Series: Psiblast, HHBlits.
![Page 19: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/19.jpg)
Limited signal coming from sequence
20.7 23 10 36032.9 252 25 13845.7 14 1 4917 46 3728 50 17 33
31.6 40 22 3739.5 46 37 3621.9 669 37 4034.4 591 4131 89 46 168
33.3 38 20 6020.5 3021 1905 51132 172 172 457
25.4 609 183 46725 6130 129 465
34.7 132 31 20134.7 91 111 20116 30 6 917 30 6 55
35.7 278 17 34723 19 9 35
20.6 38 23 16218.3 38 23 16936 194 1 300
24.4 194 1 2746.4 58 31 45
18 58 31 5425 58 31 51
53.7 1 1 029 14 14 36
24.2 1266 1028 47843.7 1 1 219.1 3752 50032.1 4 3 923 584 110 43036 1 1 019 7 4 18
51.4 21 6 12351.4 77 13 12330 231 126 267
64.3 302 85 44164.3 545 343 44127.2 1730 68 47018.2 629 1163 5018.2 3755 38 5032 1380 53 465
Fscore e-5 e-20 Neff Fscore e-5 e-20 Neff
Blast Blast+HHblits Blast Blast+HHblits
![Page 20: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/20.jpg)
What is what?
Green: parallel (parallel with diagonal) + diffuse (helical)
Blue: anti-parallel (orthogonal with diagonal) + compact (strand)
![Page 21: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/21.jpg)
Performance vs. secondary structure interactions
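Figure: F-score (y-axis, 0 to 70) by secondary structure interaction type: E-E (β-β), H-H (α-α), E-H (β-α), C-C (coil-coil), E/H-C (β/α-coil).
Labels: β-β, α-α, coil-coil, β-α, β/α-coil
Random model: coil-coil, β-α, α-α, β-β, β/α-coil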
![Page 22: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/22.jpg)
Topology dependence of success rates, Class level
Figure: mean F-score (y-axis, 0 to 30) by class: E (all-β), H (all-α), M (α/β).
![Page 23: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/23.jpg)
Correlation with size
Figure: F-score*100 accuracy (y-axis, 0 to 60) against protein length (x-axis, 0 to 500).
R = 0.32
Without this single point: R = 0.19
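The sensitivity of R to one point can be checked by recomputing the Pearson correlation with that point left out. The sketch below shows the idea. The lengths and F-scores are invented for illustration and do not reproduce the R values on the slide; the helper names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_without(xs, ys, index):
    """Pearson R after dropping the point at position `index`."""
    return pearson_r(xs[:index] + xs[index + 1:], ys[:index] + ys[index + 1:])


# Invented data: the last point is a long protein with a high F-score.
lengths = [80, 120, 150, 200, 260, 450]
f_scores = [30.0, 22.0, 35.0, 25.0, 28.0, 55.0]

print(pearson_r(lengths, f_scores))
print(r_without(lengths, f_scores, index=5))
```

With so few points, a single outlier can account for most of the apparent correlation, which is the caveat the slide raises.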
![Page 24: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/24.jpg)
Conclusions
• Contact prediction methods made a major advance over the last two years
• A lot of different subsets of correct contacts can be made and used successfully in 3D modeling
• It is difficult to directly correlate predicted contacts with 3D predictions because of ambiguity and lack of overlap between categories, but:
  – Best 3D predictors have either even superior contact predictions or better ways to use contact information
  – From the few examples when both contacts and 3D structures were predicted we see strong inconsistencies: it is important to know how to use contact information
• Often very few homologous sequences were available, but very good contact predictions were made
  – Less emphasis on co-variance based methods (supported by the abstract of invited groups)
![Page 25: CASP 13 Predicting Contacts - predictioncenter.org · • It is difficult to directly correlate predicted contacts with 3D predictions because of ambiquity and lack of overlap between](https://reader033.vdocument.in/reader033/viewer/2022050120/5f50451617011475f135031b/html5/thumbnails/25.jpg)
Acknowledgement
CASP and Predictioncenter at UC Davis, Davis, USA:
Andriy Kryshtafovych
Bohdan Monastyrskyy
Krzysztof Fidelis
CASP organizers
Albert Einstein College of Medicine, New York, USA:
Rojan Shrestha
Eduardo Fajardo
Nelson Gil