finding the beta helix motif by marcin mejran. papers predicting the -helix fold from protein...
Post on 19-Dec-2015
![Page 1: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/1.jpg)
Finding the Beta Helix Motif By Marcin Mejran
![Page 2: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/2.jpg)
Papers
Predicting the β-Helix Fold from Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke, Jonathan King, Bonnie Berger
Segmentation Conditional Random Fields (SCRFs): A New Approach for Protein Fold Recognition by Yan Liu, Jaime Carbonell, Peter Weigele, and Vanathi Gopalakrishnan
![Page 3: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/3.jpg)
Secondary Structure
Beta Strand
• Forms β-sheets
Alpha Helix
• Stands alone
Can combine into more complex structures:
• Beta sheets
• Beta helixes
Images from: http://www.people.virginia.edu/~rjh9u/prot2ndstruct.html
![Page 4: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/4.jpg)
β-sheet
![Page 5: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/5.jpg)
Second and a half Structure
beta helix
beta barrel
beta trefoil
![Page 6: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/6.jpg)
β-Helix
![Page 7: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/7.jpg)
β-Helix
Helix composed of three parallel β-sheets
Three β-strands per “rung”
Connecting “loops”
Not in Eukaryotes
Secreted by various bacteria
Right and left handed
![Page 8: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/8.jpg)
β-Helix
Few solved structures
9 SCOP superfamilies
14 RH (right-handed) solved structures in PDB
Solved structures differ widely
[Figure labels: B1, B2, T2, B3]
![Page 9: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/9.jpg)
β-Helix
T2 turn: unique two-residue loop
β-strands are 3 to 5 residues long
T1 and T3 vary in size and may contain secondary structures
β-strands interact between rungs
![Page 10: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/10.jpg)
β-Helix
Good choice from a computational point of view
“Nice” structure
Repeating parallel β-strands
Rungs have similar structure
Stacking is predictable
Well-conserved β-strands across superfamilies
![Page 11: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/11.jpg)
β-Helix
Long term interactions: close in 3D but not in 1D
“Non-unique” features: B2-T2-B3 segment
Unique features not clearly shown in sequence
Usual methods don’t work
Image from: http://www.cryst.bbk.ac.uk/PPS2/course/section10/all_beta.html
![Page 12: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/12.jpg)
BetaWrap
“Wraps” sequences around helix
Finds best “wrap”
Uses B2, B3 strands and T2 turn
Rest of rung varies greatly in size
Decomposes into sub-problems:
• Rungs
• Find multiple rungs
• Find B1 by local optimization
![Page 13: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/13.jpg)
Hydrophobic/charged
Hydrophobic: dislikes water
Hydrophilic: likes water
Charged: on outside
[Figure labels: B1, B2, T2, B3]
Image from: http://betawrap.lcs.mit.edu/BetaTalk.ppt
![Page 14: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/14.jpg)
BetaWrap: Rungs
Given a T2 turn, find the next T2 turn
[Figure labels: B2, T2, B3, candidate rung]
Image from: http://betawrap.lcs.mit.edu/BetaTalk.ppt
![Page 15: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/15.jpg)
BetaWrap: Rungs
More weight given to inward pairs
Certain stacked amino acids preferred
Penalty for highly charged inward residues
Penalizes too few or too many residues
[Figure labels: B1, B2, T2, B3]
Image from: http://betawrap.lcs.mit.edu/BetaTalk.ppt
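The rung-scoring ideas on this slide can be illustrated with a toy score. This is only a sketch under stated assumptions, not BetaWrap's actual scoring function: `pair_score`, `rung_score`, the weights, and the choice of which positions count as inward are all hypothetical, and the residue sets are the hydrophobic and charged groups listed later in the talk.

```python
from typing import Set

HYDROPHOBIC = set("AFILMVWY")
CHARGED = set("DERK")


def pair_score(a: str, b: str) -> float:
    """Toy stacking preference (hypothetical values, not learned from data)."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return 1.0
    if a == b:
        return 0.5
    return 0.0


def rung_score(upper: str, lower: str, inward: Set[int],
               inward_weight: float = 2.0, charge_penalty: float = 1.5) -> float:
    """Score two stacked, equal-length segments position by position.

    Pairs at inward positions get a larger weight, and every charged
    residue at an inward position incurs a penalty.
    """
    if len(upper) != len(lower):
        raise ValueError("stacked segments must have equal length")
    total = 0.0
    for i, (a, b) in enumerate(zip(upper, lower)):
        weight = inward_weight if i in inward else 1.0
        total += weight * pair_score(a, b)
        if i in inward:
            total -= charge_penalty * ((a in CHARGED) + (b in CHARGED))
    return total
```

The sketch leaves out the length penalty for too few or too many residues between rungs; that would be a separate term added per candidate rung.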
![Page 16: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/16.jpg)
BetaWrap: Multiple Rungs
Find multiple initial B2-T2-B3 segments
Match pattern based on hydrophobic residues (appear on the inside)
Φ – A,F,I,L,M,V,W,Y
– D,E,R,K
X - Any
AFDEMVRKYE FIFDDEAK EDEMVMVFD
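As a rough illustration of the residue classes above, the sketch below labels each residue of a sequence as hydrophobic, charged, or other. It assumes the two residue sets exactly as listed on the slide; the one-letter labels `h`, `c`, and `x` and the function names are hypothetical, and the actual BetaWrap pattern is not reproduced here.

```python
HYDROPHOBIC = set("AFILMVWY")  # the slide's Φ class
CHARGED = set("DERK")          # the slide's D,E,R,K class


def classify(sequence: str) -> str:
    """Map each residue to 'h' (hydrophobic), 'c' (charged) or 'x' (other)."""
    labels = []
    for residue in sequence.upper():
        if residue in HYDROPHOBIC:
            labels.append("h")
        elif residue in CHARGED:
            labels.append("c")
        else:
            labels.append("x")
    return "".join(labels)


def hydrophobic_fraction(window: str) -> float:
    """Fraction of residues in a non-empty window that are hydrophobic."""
    return sum(r in HYDROPHOBIC for r in window.upper()) / len(window)


# The three example segments from the slide.
for segment in ["AFDEMVRKYE", "FIFDDEAK", "EDEMVMVFD"]:
    print(segment, classify(segment))
```

A label string like this is a convenient input for a regular-expression search for candidate B2-T2-B3 segments.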
![Page 17: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/17.jpg)
BetaWrap: Multiple Rungs
Dynamic programming (DP) is used to find 5 rungs in either direction from initial positions
α-helix filtering
Take average score of top 10 remaining wraps
Image from: http://betawrap.lcs.mit.edu/BetaTalk.ppt
![Page 18: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/18.jpg)
BetaWrap: Completing
Find B1 positions
• Highest scoring parse
• Does not affect wrap score
Further filtering on hydrophobic residues in T1 and T2
![Page 19: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/19.jpg)
Training
Seven-fold cross-validation
Partitioned based on families
Scores calculated for:
• α-helix filtering threshold
• B1-score threshold
• Hydrophobic count threshold
• Distribution of unmatched residues between rungs
Image from: http://www.ornl.gov/info/ornlreview/v37_1_04/article_21.shtml
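Partitioning cross-validation folds by family can be sketched as follows. The assumption here is that each fold holds out exactly one family, so seven families give seven folds; the slide does not spell this out. The function name and the protein and family identifiers in the usage note are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple


def family_folds(protein_family: Dict[str, str]) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_family, train_ids, test_ids), one fold per family."""
    by_family = defaultdict(list)
    for protein_id, family in protein_family.items():
        by_family[family].append(protein_id)
    for held_out in sorted(by_family):
        test_ids = sorted(by_family[held_out])
        train_ids = sorted(
            protein_id
            for family, ids in by_family.items()
            if family != held_out
            for protein_id in ids
        )
        yield held_out, train_ids, test_ids
```

For example, `list(family_folds({"p1": "famA", "p2": "famA", "p3": "famB"}))` gives two folds, and no family ever appears on both the training and the test side of a fold, which is what makes the test a cross-family prediction.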
![Page 20: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/20.jpg)
BetaWrap: Results
![Page 21: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/21.jpg)
BetaWrap: Results
Correctly identifies beta-helixes
Correctly separates helixes and non-helixes
Can predict β-helixes across families
![Page 22: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/22.jpg)
BetaWrap: Summary
Pros:
• Finds beta-helixes
• Accurate
Cons:
• Still makes errors
• Rung placement
• Hard-coded information
• Over-fitting
• Hard to generalize
![Page 23: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/23.jpg)
Conditional Random Fields (CRFs)
[Figure: graphical models labeled HMM and CRF, each over states y1 to y6 and observations x1 to x6]
![Page 24: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/24.jpg)
Hidden Markov Model
Set of states
Transition probabilities
Emission probabilities
Only given sequence of emitted residues
Find sequence of true states
Generative
[Figure: emission table, shown three times: Res A, Prob .2; Res B, Prob .8]
![Page 25: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/25.jpg)
Hidden Markov Model
HMM: Maximize P(x,y|θ) = P(y|x,θ)P(x|θ)
x: emitted state/given sequence
y: “hidden”/true state
P(x,y|θ): joint probability of x and y
P(y|x,θ): probability of y given x
P(x|θ): probability of x
Need to make assumptions about the distribution of x
![Page 26: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/26.jpg)
Viterbi Algorithm HMM
Find most likely path/most likely sequence of hidden states
[Figure: trellis of emission terms e1(xi), e2(xi), e3(xi) for observations x1 to x4]
![Page 27: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/27.jpg)
Viterbi Algorithm HMM
[Figure: trellis of emission terms e1(xi), e2(xi), e3(xi) for observations x1 to x4]
$$v(i,j) = \max\bigl(v(i-1,1)\,t_{1,j}\,e_j(x_i),\; v(i-1,2)\,t_{2,j}\,e_j(x_i),\; \ldots,\; v(i-1,k)\,t_{k,j}\,e_j(x_i)\bigr)$$
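The recurrence can be turned into a short program. The following is a minimal sketch of the standard Viterbi algorithm for an HMM, with v holding the best path probability per position and state and a back-pointer table for recovering the path. The dictionary-based model layout and the explicit `start` distribution are choices made for this example, not something the slides specify.

```python
from typing import Dict, List, Sequence, Tuple


def viterbi(observations: Sequence[str],
            states: Sequence[str],
            start: Dict[str, float],
            trans: Dict[str, Dict[str, float]],
            emit: Dict[str, Dict[str, float]]) -> Tuple[float, List[str]]:
    """Return (probability, path) of the most likely hidden-state sequence."""
    # v[i][j]: probability of the best path that ends in state j at position i.
    v = [{s: start[s] * emit[s][observations[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for i in range(1, len(observations)):
        v.append({})
        back.append({})
        for j in states:
            # Best predecessor k maximizes v(i-1, k) * t(k, j).
            prev = max(states, key=lambda k: v[i - 1][k] * trans[k][j])
            v[i][j] = v[i - 1][prev] * trans[prev][j] * emit[j][observations[i]]
            back[i][j] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: v[-1][s])
    path = [last]
    for i in range(len(observations) - 1, 0, -1):
        path.append(back[i][path[-1]])
    path.reverse()
    return v[-1][last], path
```

For long sequences the products underflow, so practical implementations usually work with sums of log-probabilities instead.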
![Page 28: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/28.jpg)
HMM Disadvantages
There is a strong independence assumption
Long term interactions are difficult to model
Overlapping features are difficult to model
![Page 29: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/29.jpg)
Conditional Random Fields (CRFs)
Replace transition and emission probabilities with a set of feature functions f(i,j,k)
Feature functions based on all xs, not just one
Not generative
[Figure: trellis of feature terms f(1,0,1), f(2,0,1), f(3,0,1) at the first position and f(1,i,k), f(2,i,k), f(3,i,k) at positions k = 2 to 4, over observations x1 to x4]
![Page 30: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/30.jpg)
Conditional Random Fields (CRFs)
HMM: Maximize P(x,y|θ) = P(y|x,θ)P(x|θ)
CRF: Maximize P(y|x,θ)
Do not make assumptions about underlying distribution
![Page 31: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/31.jpg)
Viterbi: CRFs
Same method as for HMM
[Figure: trellis of feature terms f(1,0,1), f(2,0,1), f(3,0,1) at the first position and f(1,i,k), f(2,i,k), f(3,i,k) at positions k = 2 to 4, over observations x1 to x4]
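To make "same method as for HMM" concrete, here is a sketch of Viterbi decoding for a linear-chain CRF. The products of transition and emission probabilities are replaced by sums of weighted feature-function values, and each feature may inspect the whole observation sequence. The feature signature `f(prev, cur, x, pos)`, the `START` marker, and the function name are assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A feature sees the previous state, the current state, the whole
# observation sequence and the current position.
Feature = Callable[[str, str, Sequence[str], int], float]

START = "<start>"  # hypothetical marker meaning "no previous state"


def crf_viterbi(x: Sequence[str], states: Sequence[str],
                features: Sequence[Feature],
                weights: Sequence[float]) -> Tuple[float, List[str]]:
    """Return (score, path) of the highest-scoring state sequence."""

    def score(prev: str, cur: str, pos: int) -> float:
        # Weighted sum of all feature functions at this position.
        return sum(w * f(prev, cur, x, pos) for f, w in zip(features, weights))

    v = [{s: score(START, s, 0) for s in states}]
    back: List[Dict[str, str]] = [{}]
    for pos in range(1, len(x)):
        v.append({})
        back.append({})
        for cur in states:
            prev = max(states, key=lambda p: v[pos - 1][p] + score(p, cur, pos))
            v[pos][cur] = v[pos - 1][prev] + score(prev, cur, pos)
            back[pos][cur] = prev
    last = max(states, key=lambda s: v[-1][s])
    path = [last]
    for pos in range(len(x) - 1, 0, -1):
        path.append(back[pos][path[-1]])
    path.reverse()
    return v[-1][last], path
```

The returned score is unnormalized. Decoding does not need the normalization constant, because it is the same for every candidate path of a given sequence.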
![Page 32: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/32.jpg)
Conditional Random Fields (CRFs)
States should form a chain
Likelihood function is convex for chain
Z0 = normalization factor (sum over all possible state sequences)
λk = weights
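The formula these symbols belong to did not survive in the transcript. For reference, the conditional probability of a linear-chain CRF is usually written as below; this is the standard textbook form, assumed rather than taken from the slide, with Z0 acting as the normalization factor so that the probabilities of all state sequences y′ sum to one.

```latex
P(y \mid x, \theta) = \frac{1}{Z_0}
  \exp\Bigl( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(i, y_{i-1}, y_i, x) \Bigr),
\qquad
Z_0 = \sum_{y'} \exp\Bigl( \sum_{i=1}^{n} \sum_{k} \lambda_k \, f_k(i, y'_{i-1}, y'_i, x) \Bigr)
```

Here θ stands for the weights λk, and each feature function f_k may depend on the entire observation sequence x.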
![Page 33: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/33.jpg)
Segmented CRFs
Each state corresponds to a structure
Represented as a graph G
States represent secondary structures
Edges represent interactions
Chains are nicer than graphs
![Page 34: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/34.jpg)
Segmented CRFs
G = <V, E1, E2>
E1: Edges between neighbors
E2: Edges for long-term interactions
E1 edges can be implied in model
![Page 35: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/35.jpg)
Only E2 needs to be explicitly considered
However, graph needs to be a chain for E2
Deterministic state transitions
![Page 36: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/36.jpg)
Beta-Helix CRF
![Page 37: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/37.jpg)
Beta-Helix CRF
Combined states
B23: B2, B3, T2
Size assumptions:
• B23: 8 residues
• B1: 3 residues
• T1, T3: 1 to 80 residues
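The size assumptions and the deterministic state transitions can be written down as a small validity check. The length bounds are the ones listed on the slide. The state cycle B23 → T3 → B1 → T1 → B23 is an assumption inferred from the strand and turn naming, and the function and table names are hypothetical.

```python
from typing import Dict, List, Tuple

# Length bounds (min, max) in residues, as listed on the slide.
LENGTH_BOUNDS: Dict[str, Tuple[int, int]] = {
    "B23": (8, 8),
    "B1": (3, 3),
    "T1": (1, 80),
    "T3": (1, 80),
}

# Assumed deterministic cycle of states within and across rungs.
NEXT_STATE: Dict[str, str] = {"B23": "T3", "T3": "B1", "B1": "T1", "T1": "B23"}


def is_valid_segmentation(segments: List[Tuple[str, int]]) -> bool:
    """Check state order and segment lengths for (state, length) pairs."""
    for index, (state, length) in enumerate(segments):
        low, high = LENGTH_BOUNDS.get(state, (0, -1))  # unknown state: always invalid
        if not low <= length <= high:
            return False
        if index > 0 and NEXT_STATE[segments[index - 1][0]] != state:
            return False
    return True
```

Fixing the lengths of B23 and B1 and bounding T1 and T3 keeps the number of candidate segmentations small enough to search.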
![Page 38: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/38.jpg)
Intra-Node Features
Regular Expression Template for B23
FIFDDEAK
Φ – A,F,I,L,M,V,W,Y
– D,E,R,K
X - Any
![Page 39: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/39.jpg)
Intra-Node Features
Probabilistic motif profiles for B23 and B1
Use HMMER to generate profiles from known B23 and B1 sequences
![Page 40: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/40.jpg)
Intra-Node Features
Secondary structure prediction
• PSIPRED
• Helps locate T1 and T3
• 76 to 78% accuracy for α-helixes and coils
Segment length for T1 and T3
• Estimated as density function
![Page 41: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/41.jpg)
Inter-Node Features
Side chain alignment scores
• Alignment between B23 regions
• More weight given to inward pairs
[Figure labels: B2, T2, B3]
![Page 42: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/42.jpg)
Inter-Node Features
Parallel Beta-sheet alignment scores
Distance between adjacent B23 segments
![Page 43: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/43.jpg)
SCRF: Results
![Page 44: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/44.jpg)
SCRF: Results
![Page 45: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/45.jpg)
Summary
Discovered new beta-helix protein: Sf6 gp14
Detected beta-helixes in plants; none were known before
More robust than BetaWrap
![Page 46: Finding the Beta Helix Motif By Marcin Mejran. Papers Predicting The -Helix Fold From Protein Sequence Data by Phil Bradley, Lenore Cowen, Matthew Menke,](https://reader030.vdocument.in/reader030/viewer/2022033105/56649d395503460f94a12ecc/html5/thumbnails/46.jpg)
Questions