Centre for Integrative Bioinformatics VU
Anton Feenstra, 23 Nov 2006
Sequence Entropy
Sequence Analysis
Significance of Alignment Positions
• An observed occurrence of amino acids at some position in an alignment that deviates from what is expected may indicate some (functional) significance
• What does ‘deviates from expected’ mean?
• unlikely occurrences
• What is unlikely?
• only (relatively) few possibilities to obtain observed result
Counting…
• Number of possibilities for finding given numbers $N_x$ of amino acid types x:
$$\frac{\left(\sum_x N_x\right)!}{\prod_x N_x!}$$
• Problem: evaluates to huge numbers even for modest numbers of sequences…
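To get a feeling for how quickly this count grows, the sketch below evaluates it both exactly and in logarithmic form. The function names and the example column (100 sequences, 5 of each amino acid type) are illustrative assumptions, not part of the slides.

```python
import math

def arrangements(counts):
    """Exact value of (sum N_x)! / prod_x(N_x!) using Python's big integers."""
    result = math.factorial(sum(counts))
    for n in counts:
        result //= math.factorial(n)
    return result

def log_arrangements(counts):
    """Natural log of the same quantity via lgamma, avoiding huge intermediates."""
    total = sum(counts)
    return math.lgamma(total + 1) - sum(math.lgamma(n + 1) for n in counts)

# Hypothetical alignment column: 100 sequences, 5 of each of the 20 amino acid types.
counts = [5] * 20
print("decimal digits:", len(str(arrangements(counts))))
print("ln(count):", round(log_arrangements(counts), 2))
```

Working with the logarithm keeps the numbers manageable, which is the step the entropy definition takes next.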
Sequence Entropy
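• Entropy:
$$S = \ln\left(\sum_x N_x\right)! - \ln\prod_x N_x!$$
• Using Stirling's approximation, $\ln M! \approx M(\ln M - 1)$ for $M \gg 1$:
$$S \approx -\sum_x p_x \log p_x$$
with $p_x = N_x / \sum_x N_x$ and S taken per sequence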
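A quick numerical check of this approximation, as a sketch: it compares the exact log-count per sequence with $-\sum_x p_x \ln p_x$ for hypothetical residue counts in the ratio 3:2:1. The function names and counts are assumptions made for illustration.

```python
import math

def exact_entropy_per_sequence(counts):
    """(1/N) * ln[(sum N_x)! / prod N_x!], evaluated via lgamma."""
    total = sum(counts)
    log_w = math.lgamma(total + 1) - sum(math.lgamma(n + 1) for n in counts)
    return log_w / total

def shannon_entropy(counts):
    """-sum_x p_x ln p_x with p_x = N_x / N; zero counts contribute nothing."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)

for scale in (1, 10, 1000):
    counts = [3 * scale, 2 * scale, 1 * scale]  # hypothetical residue counts
    print(scale, exact_entropy_per_sequence(counts), shannon_entropy(counts))
```

The two values approach each other as the counts grow, in line with the condition $M \gg 1$.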
Shannon’s ‘Information Entropy’:
• ‘A Mathematical Theory of Communication’, The Bell System Technical Journal, Vol. 27, 1948.
“ Can we define a quantity which will measure, in some sense, how much information is ‘produced’ by such a process, or better, at what rate information is produced? ”
• He was thinking about the Transmission of Information, i.e., from a Source through some Channel to a Destination.
Choice, Uncertainty and Entropy
• A set of ‘events’ with probabilities p1, p2, …, pn
• Is there a measure that indicates how much ‘choice’ is possible, given those probabilities?
• If there is, it should be:
• continuous in the pi
• monotonically increasing in n if all probabilities are equal (pi = 1/n)
• additive for ‘sub-events’
Additivity:
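$$H\left(\tfrac12, \tfrac13, \tfrac16\right) = H\left(\tfrac12, \tfrac12\right) + \tfrac12\, H\left(\tfrac23, \tfrac13\right)$$
[Figure: a single choice among three outcomes with probabilities 1/2, 1/3 and 1/6 is equivalent to a choice between two branches of probability 1/2 each, one of which is followed by a second choice with probabilities 2/3 and 1/3.]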
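The additivity example can be checked numerically. This is a minimal sketch that assumes H is the Shannon entropy (here in bits); the helper name `H` is chosen only to mirror the notation above.

```python
import math

def H(*probs):
    """Shannon entropy in bits of the given probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

direct = H(1/2, 1/3, 1/6)                      # one three-way choice
staged = H(1/2, 1/2) + 1/2 * H(2/3, 1/3)       # two-way choice, then a second choice in one branch
print(direct, staged)
```

Both expressions give the same value up to floating-point rounding.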
Solution: Entropy
• the entropy of a set of probabilities $p_i$:
$$H = -\sum_{i=1}^{n} p_i \log p_i$$
• measures information, choice and uncertainty
• zero only if only one $p_i$ is not zero
• there is only one choice
• maximal if all $p_i$ are equal
• most ‘uncertain’ situation: all options are possible
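The two limiting cases can be illustrated with a short sketch. The function name and the example distributions are assumptions; logarithms are taken base 2 so that the result is in bits.

```python
import math

def entropy(probs):
    """H = -sum_i p_i log2 p_i, written as sum p log2(1/p); p_i = 0 terms count as 0."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

certain = [1.0, 0.0, 0.0, 0.0]   # only one choice
skewed = [0.7, 0.1, 0.1, 0.1]
uniform = [0.25] * 4             # all options equally likely

print(entropy(certain), entropy(skewed), entropy(uniform))
print(entropy([1 / 20] * 20))    # maximum for 20 amino acid types: log2(20)
```

The entropy is zero for the certain case and largest for the uniform one, with the skewed distribution in between.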
Information Content
• Shannon was thinking about the Transmission of Information, i.e., from a Source through some Channel to a Destination.
• …but it applies equally well to any type of ‘message’
• We can use it to measure the level of conservation in columns in an alignment
Simple Example: Sequence Entropy
A A A A A A L
A A A A A L L
A A A A L L L
A A A L L L L
A A L L L L L
A L L L L L L
[Plot: H (vertical axis, 0 to 1) against p (horizontal axis, 0 to 1), with p1 = f(‘L’) and p2 = f(‘A’). Annotated points: p1 = 0, p2 = 0, and p1 = p2 = ½.]
$$H = -\sum_{i=1}^{n} p_i \log p_i$$
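A sketch of how such a curve can be computed from residue counts. The columns used here (six sequences with zero to six copies of ‘L’) are hypothetical and chosen only to sweep f(‘L’) from 0 to 1; the function name is likewise an assumption.

```python
import math
from collections import Counter

def column_entropy(residues):
    """Shannon entropy (bits) of the residue frequencies in one alignment column."""
    counts = Counter(residues)
    total = len(residues)
    return sum((n / total) * math.log2(total / n) for n in counts.values())

# Hypothetical columns of six sequences each, with 0 to 6 copies of 'L'.
for n_leu in range(7):
    column = "L" * n_leu + "A" * (6 - n_leu)
    print(column, round(column_entropy(column), 3))
```

The entropy is zero for a fully conserved column and reaches one bit when ‘A’ and ‘L’ are equally frequent.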
Conditional Probability / Prior Knowledge
• Suppose we observe two things (A and B), and we suspect a relation (A causes B)
• The fundamental question is then: how likely is A when we know B?
• (n.b., this is Bayesian statistics…)
• Or, what is the uncertainty (entropy) of A knowing B?
Conditional Entropy
• Entropy of joint occurrence of value i for event x and value j for event y:
$$H(x,y) = -\sum_{i,j} p(i,j) \log p(i,j)$$
• and
$$H(x) = -\sum_{i,j} p(i,j) \log \sum_j p(i,j)$$
$$H(y) = -\sum_{i,j} p(i,j) \log \sum_i p(i,j)$$
• so that
$$H(x,y) \le H(x) + H(y)$$
• i.e., the entropy of a joint event is less than or equal to the sum of the individual entropies
• it is equal only if the events are independent
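The inequality can be verified on small joint distributions. The sketch below uses two hypothetical 2×2 tables, one with dependent events and one with independent events; the function names are assumptions.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

def joint_and_marginal_entropies(joint):
    """joint[i][j] = p(i, j); returns (H(x, y), H(x), H(y)) in bits."""
    p_x = [sum(row) for row in joint]          # sum over j
    p_y = [sum(col) for col in zip(*joint)]    # sum over i
    h_xy = entropy(p for row in joint for p in row)
    return h_xy, entropy(p_x), entropy(p_y)

dependent = [[0.4, 0.1],
             [0.1, 0.4]]
independent = [[0.25, 0.25],
               [0.25, 0.25]]

h_dep = joint_and_marginal_entropies(dependent)
h_ind = joint_and_marginal_entropies(independent)
print(h_dep, h_ind)
```

For the dependent table the joint entropy is smaller than the sum of the marginal entropies; for the independent table the two are equal.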
Co-occurrence in practice
• Measure of ‘mutual information’, by relative entropy:
$$H_y(x) = -\sum_{i,j} p(i,j) \log \frac{p(i,j)}{\sum_i p(i,j)}$$
• more often written like:
$$H(x\|y) = \sum_x p(x) \log \frac{p(x)}{p(y)}$$
• i.e., what is the entropy in x, given y
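As an illustration of ‘the entropy in x, given y’, the sketch below assumes the first formula is Shannon's conditional entropy, computed from a joint table p(i, j) with i indexing x and j indexing y. The function name and the example tables are assumptions.

```python
import math

def conditional_entropy_x_given_y(joint):
    """H_y(x) = -sum_{i,j} p(i,j) log2[ p(i,j) / sum_i p(i,j) ], in bits."""
    p_y = [sum(col) for col in zip(*joint)]   # marginal of y: sum over i
    total = 0.0
    for row in joint:
        for j, p in enumerate(row):
            if p > 0:
                total -= p * math.log2(p / p_y[j])
    return total

determined = [[0.5, 0.0],
              [0.0, 0.5]]      # knowing y fixes x completely
independent = [[0.25, 0.25],
               [0.25, 0.25]]   # knowing y tells nothing about x
dependent = [[0.4, 0.1],
             [0.1, 0.4]]

for table in (determined, dependent, independent):
    print(conditional_entropy_x_given_y(table))
```

The value drops to zero when y determines x and equals the full entropy of x when the two are independent.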
Relative Entropy in Sequence Analysis
• Many biological problems relate to questions like:
“Why do these proteins do this, and those proteins not?”
• or
“Why do these patients get sick, and those not?”
• The answer can be related to similarities and differences between sequences
• Similarities (conservation) relate to functionally critical positions
• Differences can explain functional differences
Comparing groups of Sequences
• For each position i in an alignment, we calculate the relative entropy for group A vs. B from the frequencies p (observed probabilities) of all amino acid types x, as follows:
$$H_i^{A/B} = \sum_{x=1}^{n} p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{B}_{i,x}}$$
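A minimal sketch of this calculation for a single position, taking the residues of each group as a string. The function names, the base-2 logarithm and the example columns are assumptions; residues absent from group B but present in A are treated as giving an infinite value.

```python
import math
from collections import Counter

def frequencies(column):
    """Observed residue frequencies in one alignment column of one group."""
    counts = Counter(column)
    return {aa: n / len(column) for aa, n in counts.items()}

def relative_entropy(column_a, column_b):
    """sum_x pA(x) log2[pA(x) / pB(x)]; infinite if B lacks a residue seen in A."""
    p_a, p_b = frequencies(column_a), frequencies(column_b)
    total = 0.0
    for aa, pa in p_a.items():
        pb = p_b.get(aa, 0.0)
        if pb == 0.0:
            return math.inf
        total += pa * math.log2(pa / pb)
    return total

print(relative_entropy("AAAL", "AAAL"))   # identical compositions
print(relative_entropy("AAAL", "AALL"))   # partly different compositions
```

In an alignment, the same function would be applied to each position i in turn.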
Simple Example: Relative Entropy ‘A/B’
[Plot: ‘Entropy / Harmony’ axis from 0.0 to 3.0, for the example alignment below (group labels A and B). Plotted quantities:]
$$\sum_x p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{B}_{i,x}} \qquad \sum_x p^{B}_{i,x} \log \frac{p^{B}_{i,x}}{p^{A}_{i,x}}$$
A A A A A A L A A A A A A L
A A A A A L L A A A A A L L
A A A A L L L A A A A L L L
A A A L L L L A A A L L L L
A A L L L L L A A L L L L L
A L L L L L L A L L L L L L
A A A A A A A A A A A A A A
A A A A A A A A A A A A A A
A A A A A A A L L L L L L L
Relative Entropy
• Captures similarities and differences
• Is infinite for completely dissimilar positions
• e.g., A vs. L
• Not symmetrical: $H_y(x) \neq H_x(y)$
• May not be easy to use for selecting dissimilar positions
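Both properties, the infinite value and the lack of symmetry, show up directly in a small sketch. The distributions below are hypothetical, and the function name `kl` is an assumption.

```python
import math

def kl(p, q):
    """Relative entropy sum_x p(x) log2[p(x) / q(x)] for dicts of probabilities."""
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue
        qx = q.get(x, 0.0)
        if qx == 0:
            return math.inf   # q gives zero probability to something p contains
        total += px * math.log2(px / qx)
    return total

mostly_a = {"A": 0.9, "L": 0.1}
balanced = {"A": 0.5, "L": 0.5}
only_a = {"A": 1.0}
only_l = {"L": 1.0}

print(kl(only_a, only_l))                               # completely dissimilar
print(kl(mostly_a, balanced), kl(balanced, mostly_a))   # direction matters
print(kl(only_a, balanced), kl(balanced, only_a))       # finite one way, infinite the other
```

Because the result depends on the direction of comparison and can be unbounded, ranking positions by it needs extra care.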
Simple Example: Relative Entropy ‘A/AB’
[Plot: ‘Entropy / Harmony’ axis from 0.0 to 3.0, for the example alignment below (group labels A and B). Plotted quantities:]
$$\sum_x p^{B}_{i,x} \log \frac{p^{B}_{i,x}}{p^{AB}_{i,x}} \qquad \sum_x p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{AB}_{i,x}}$$
A A A A A A L A A A A A A L
A A A A A L L A A A A A L L
A A A A L L L A A A A L L L
A A A L L L L A A A L L L L
A A L L L L L A A L L L L L
A L L L L L L A L L L L L L
A A A A A A A A A A A A A A
A A A A A A A A A A A A A A
A A A A A A A L L L L L L L
Relative Entropy (A/AB)
• Also captures similarities and differences
• Is no longer infinite for completely dissimilar positions
• Only symmetrical for equal size groups
• in practice, not symmetrical
• May still not be easy to use for selecting dissimilar positions
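The sketch below illustrates the A/AB variant under the assumption that $p^{AB}$ is the frequency in the pooled sequences of both groups, so larger groups weigh more. The function name and example columns are hypothetical.

```python
import math
from collections import Counter

def relative_entropy_vs_pooled(group, other):
    """sum_x pG(x) log2[pG(x) / pAB(x)], with pAB taken from the pooled residues."""
    pooled = Counter(group + other)
    n_pooled = len(group) + len(other)
    total = 0.0
    for aa, n in Counter(group).items():
        p_g = n / len(group)
        p_ab = pooled[aa] / n_pooled   # never zero: the residue occurs in `group`
        total += p_g * math.log2(p_g / p_ab)
    return total

# Completely dissimilar position, equal group sizes: finite and equal both ways.
print(relative_entropy_vs_pooled("AAAA", "LLLL"), relative_entropy_vs_pooled("LLLL", "AAAA"))
# Same position with unequal group sizes: the two directions differ.
print(relative_entropy_vs_pooled("AAAAAA", "LL"), relative_entropy_vs_pooled("LL", "AAAAAA"))
```

Since every residue in a group also occurs in the pooled set, the logarithm never diverges, but the value now depends on the relative group sizes.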
Measuring Overlapping Distributions
• Weigh both groups equally; take $p^A + p^B$ instead of $p^{AB}$:
$$SH_i^{A/B} = -\sum_x p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{A}_{i,x} + p^{B}_{i,x}}$$
• Fixed interval [0,1], but not completely symmetrical
Entropy vs. Sequence Harmony: Example
[Plot: ‘Entropy / Harmony’ axis from 0.0 to 3.0, for the example alignment below (group labels A and B). Plotted quantities:]
$$-\sum_x p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{A}_{i,x} + p^{B}_{i,x}} \qquad -\sum_x p^{B}_{i,x} \log \frac{p^{B}_{i,x}}{p^{A}_{i,x} + p^{B}_{i,x}}$$
A A A A A A L A A A A A A L
A A A A A L L A A A A A L L
A A A A L L L A A A A L L L
A A A L L L L A A A L L L L
A A L L L L L A A L L L L L
A L L L L L L A L L L L L L
A A A A A A A A A A A A A A
A A A A A A A A A A A A A A
A A A A A A A L L L L L L L
Sequence Harmony
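• Introduce symmetry by averaging:
$$SH_i^{A/B} = \tfrac12\left(SH_i^{A/AB} + SH_i^{B/AB}\right)$$
• May seem a trivial choice, but:
$$\begin{aligned}
SH_i^{A/B} &= -\tfrac12\left[\sum_x p^{A}_{i,x} \log \frac{p^{A}_{i,x}}{p^{A}_{i,x} + p^{B}_{i,x}} + \sum_x p^{B}_{i,x} \log \frac{p^{B}_{i,x}}{p^{A}_{i,x} + p^{B}_{i,x}}\right] \\
&= -\tfrac12 \sum_x \left[p^{A}_{i,x} \log p^{A}_{i,x} + p^{B}_{i,x} \log p^{B}_{i,x} - \left(p^{A}_{i,x} + p^{B}_{i,x}\right) \log\left(p^{A}_{i,x} + p^{B}_{i,x}\right)\right] \\
&= \tfrac12\left[H(A) + H(B) + \sum_x \left(p^{A}_{i,x} + p^{B}_{i,x}\right) \log\left(p^{A}_{i,x} + p^{B}_{i,x}\right)\right] \\
&= \tfrac12\left[H(A) + H(B)\right] - H(A+B) + \log 2
\end{aligned}$$
where $H(A+B)$ is the entropy of the equal-weight mixture $\tfrac12\left(p^{A} + p^{B}\right)$.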
Pirovano, Feenstra & Heringa. “Sequence Comparison by Sequence Harmony Identifies Subtype Specific Functional Sites”,
Nucleic Acids Res., in press (2006).
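The last step of the derivation can be checked numerically. The sketch below assumes base-2 logarithms and the sign convention under which the values lie in [0, 1]; the function names and example distributions are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a dict of probabilities."""
    return sum(v * math.log2(1 / v) for v in p.values() if v > 0)

def harmony_one_sided(p_a, p_b):
    """sum_x pA log2[(pA + pB) / pA], i.e. -sum_x pA log2[pA / (pA + pB)]."""
    return sum(pa * math.log2((pa + p_b.get(x, 0.0)) / pa)
               for x, pa in p_a.items() if pa > 0)

def sequence_harmony(p_a, p_b):
    """Symmetric average of the two one-sided values."""
    return 0.5 * (harmony_one_sided(p_a, p_b) + harmony_one_sided(p_b, p_a))

def harmony_from_entropies(p_a, p_b):
    """(1/2)[H(A) + H(B)] - H(A+B) + log2(2), with A+B the equal-weight mixture."""
    mix = {x: 0.5 * (p_a.get(x, 0.0) + p_b.get(x, 0.0)) for x in set(p_a) | set(p_b)}
    return 0.5 * (entropy(p_a) + entropy(p_b)) - entropy(mix) + 1.0

p_a = {"A": 0.9, "L": 0.1}
p_b = {"A": 0.5, "L": 0.5}
print(sequence_harmony(p_a, p_b), harmony_from_entropies(p_a, p_b))
print(sequence_harmony(p_a, p_a), sequence_harmony({"A": 1.0}, {"L": 1.0}))
```

Under these assumptions identical compositions give 1 and non-overlapping compositions give 0, and the direct sum agrees with the entropy form.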
Entropy vs. Sequence Harmony: Example
[Plot: ‘Entropy / Harmony’ axis from 0.0 to 3.0, for the example alignment below (group labels A and B). Plotted quantity:]
$$\tfrac12\left[H(A) + H(B)\right] - H(A+B) + \log 2$$
A A A A A A L A A A A A A L
A A A A A L L A A A A A L L
A A A A L L L A A A A L L L
A A A L L L L A A A L L L L
A A L L L L L A A L L L L L
A L L L L L L A L L L L L L
A A A A A A A A A A A A A A
A A A A A A A A A A A A A A
A A A A A A A L L L L L L L
Analyzing multiple groups
• Relative Entropy and Sequence Harmony are defined to compare two groups (N=2)
• problem for multiple groups (N>2)
• A solution: Entropy-variability plots
• variability is the number of different amino acid types at a given alignment position
• problem: variability always tends towards the maximum (20) for larger numbers of sequences
• Another solution: Two-entropy plots
• Total family entropy vs. sum of sub-family entropy
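A sketch of the quantities behind both kinds of plot, for a single alignment position split into subfamilies. The function names and the toy columns are assumptions; the second value is the plain sum of subfamily entropies, as stated in the bullet above.

```python
import math
from collections import Counter

def entropy(column):
    """Shannon entropy (bits) of the residue frequencies in a column."""
    counts = Counter(column)
    n = len(column)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def variability(column):
    """Number of different residue types observed at the position."""
    return len(set(column))

def two_entropies(subfamilies):
    """subfamilies: one residue string per subfamily, all for the same position.
    Returns (entropy over the whole family, sum of the per-subfamily entropies)."""
    whole_family = "".join(subfamilies)
    return entropy(whole_family), sum(entropy(s) for s in subfamilies)

print(two_entropies(["AAAA", "AAAA"]))   # conserved across the whole family
print(two_entropies(["AAAA", "LLLL"]))   # conserved within, different between subfamilies
print(two_entropies(["ALAL", "LALA"]))   # variable within every subfamily
print(variability("AALLV"))
```

Positions that are conserved within subfamilies but differ between them stand out by a high total entropy combined with a low sum of subfamily entropies.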
Two-Entropies analysis of GPCRs
• Goal: to identify the function of individual positions
Ye, Lameijer, Beukers & IJzerman. “A Two-Entropies Analysis to Identify Functional Positions in the Transmembrane Region of Class A G Protein-Coupled Receptors”, Proteins 63, pp1018–1030 (2006)
G-protein Coupled Receptors (GPCRs)
• Huge family of integral cell-membrane proteins
• crucial in signal transduction
• 70 subfamilies
• 1935 Class A GPCRs
• Three main regions:
• extracellular side
• transmembrane (TM)
• cytoplasmic side
Two-entropies vs. Entropy-variability: upper vs. lower domain
Red: upper domain Blue: lower domain
Two-entropies vs. Entropy-variability: relative solvent accessibility (RSA)
Red: RSA < 15% Blue: RSA > 15%
Two-entropies vs. Entropy-variability: ligand binding site vs. other positions
Red: <4 Å from retinal Blue: >4 Å from retinal
Subfamily specific binding
Common activation mechanism
GPCR Two-entropies analysis
Red: ligand binding Green: coupling & activation Blue: others
GPCR Functional Site Prediction: Method Comparison
Finding known sites for ligand binding in Bovine Rhodopsin (left) and Aminergic Receptors (right)
Smad-MH2 Alignment & Sequence Harmony
• Walter Pirovano*, K. Anton Feenstra* and Jaap Heringa. “Sequence Comparison by Sequence Harmony Identifies Subtype Specific Functional Sites”, Nucleic Acids Res., in press (2006).
• K. Anton Feenstra, Walter Pirovano and Jaap Heringa. “Sub-type Specific Sites for SMAD Receptor Binding Identified by Sequence Comparison using ‘Sequence Harmony’ ”. in: From Computational Biophysics to Systems Biology. pp. 73-78. Eds. U.H.E. Hansmann, J. Meinke, S. Mohanty and O. Zimmermann, Jülich, NIC Series, Vol. 34, 2006.
• Elena Marchiori*, Walter Pirovano, Jaap Heringa and K. Anton Feenstra*. “A Feature Selection Algorithm for Detecting Subtype Specific Sites for Smad Receptor Binding”, Bio-ICMLA06, accepted (2006).
Smad2 H.sapiens D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.melanogaster D A A P V M Y H E P A F W C S I S Y Y E L N T R V G E T F H A S Q P S I T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 C.auratus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 R.norvegicus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 M.musculus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L C L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 X.laevis D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 H.sapiens D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 M.musculus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R L C L G L
Smad3 C.auratus D L Q P V T Y C E S A F W C S I S Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N A E - R F C L G L
Smad3 G.gallus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 R.norvegicus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad1 S.mansoni T M H P V N Y Q E P K Y W C S I V Y Y E L N N R V G E A F N A S Q L S I I I D G F T D P S N N S D R F C L G L
Smad1 M.musculus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 H.sapiens D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 S.scrofa D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 R.norvegicus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 X.tropicalis D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 G.gallus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad1 D.rerio D V H P V A Y Q E P K H W C S I V Y Y E L N N R V G E A F L A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 C.coturnix D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad5 H.sapiens D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 M.musculus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 R.norvegicus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P A N N K S R F C L G L
Smad5 G.gallus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad5 D.rerio D V Q P V E Y Q E P S H W C S I V Y Y E L N N R V G E A Y H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad8 M.musculus D F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 R.norvegicus D F R P V C Y E E P L H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 G.gallus N F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S I L I D G F T D P S N N K N R F C L G L
[Position ruler for the block above: 10, 20, 30, 40, 50. The next block continues the same 32 sequences in the same order.]
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N E V V E Q T R R H I G K G V R L Y Y I G G E V F A E C L S D S S I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y D W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D N A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H N F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D T S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
[Position ruler for the block above: 60, 70, 80, 90, 100, 110. The next block continues the same 32 sequences in the same order.]
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L S Q S V S Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y R L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C S L K I F S N Q E F A H - - - - L L S R T V H H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T L R M S F V K G W G A E Y H R Q D V
I P S R C S L K I F N N Q E F A E - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A K Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T L R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K V F N N Q L F A Q - - - - L L A Q S V H H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K V F N N Q L F A Q L L A Q L L A Q S V H H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q L F A Q - - - - P L A Q S V N H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
[Alignment column ruler: 120, 130, 140, 150, 160, 170]
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D R V L T Q M G S P R L P C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P N L R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W V E I H L N G P L Q W L D R V L T Q M G T P R N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E V H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
[Alignment column ruler: 180, 190, 200, 210]
[Plot axes: x-axis ticks from 260 to 460 in steps of 20; y-axis from 0 to 1]
![Page 33: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/33.jpg)
Smad2 H.sapiens D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.melanogaster D A A P V M Y H E P A F W C S I S Y Y E L N T R V G E T F H A S Q P S I T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 C.auratus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 R.norvegicus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 M.musculus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L C L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 X.laevis D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 H.sapiens D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 M.musculus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R L C L G L
Smad3 C.auratus D L Q P V T Y C E S A F W C S I S Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N A E - R F C L G L
Smad3 G.gallus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 R.norvegicus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad1 S.mansoni T M H P V N Y Q E P K Y W C S I V Y Y E L N N R V G E A F N A S Q L S I I I D G F T D P S N N S D R F C L G L
Smad1 M.musculus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 H.sapiens D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 S.scrofa D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 R.norvegicus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 X.tropicalis D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 G.gallus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad1 D.rerio D V H P V A Y Q E P K H W C S I V Y Y E L N N R V G E A F L A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 C.coturnix D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad5 H.sapiens D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 M.musculus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 R.norvegicus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P A N N K S R F C L G L
Smad5 G.gallus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad5 D.rerio D V Q P V E Y Q E P S H W C S I V Y Y E L N N R V G E A Y H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad8 M.musculus D F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 R.norvegicus D F R P V C Y E E P L H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 G.gallus N F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S I L I D G F T D P S N N K N R F C L G L
[Residue-number ruler: 262, 270, 280, 290, 300, 310]
[Group labels: AR, BR]
Smads: Comparing two Groups
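As a rough illustration of what comparing the two groups means at a single alignment column, the sketch below tallies residue types per group for the second column of the alignment above (position L263 in Smad2 numbering). It assumes that the Smad2 and Smad3 rows form the AR group and the Smad1, Smad5 and Smad8 rows the BR group. The residue strings are transcribed by hand from the rows above, and `column_by_family`, `groups` and `group_counts` are hypothetical names introduced for this example.

```python
from collections import Counter

# Residues in alignment column 2, transcribed from the slide, one letter per row.
# Assumption: Smad2/Smad3 rows are the AR group, Smad1/Smad5/Smad8 rows the BR group.
column_by_family = {
    "Smad2": "LALLLLL",
    "Smad3": "LLLLLLLL",
    "Smad1": "MVVVVVVVV",
    "Smad5": "VVVVV",
    "Smad8": "FFF",
}
groups = {"AR": ("Smad2", "Smad3"), "BR": ("Smad1", "Smad5", "Smad8")}


def group_counts(families):
    """Count residue types over all rows of the given Smad families."""
    counts = Counter()
    for family in families:
        counts.update(column_by_family[family])
    return counts


ar_counts = group_counts(groups["AR"])
br_counts = group_counts(groups["BR"])

for name, counts in (("AR", ar_counts), ("BR", br_counts)):
    # most_common() lists residue types from most to least frequent
    print(name, counts.most_common())

# Residue types present in both groups; an empty set means no overlap at this column.
shared = set(ar_counts) & set(br_counts)
print("shared:", sorted(shared))
```

At this column the two groups share no residue type, which fits the AR entry "La" and the BR entry "Vfm" listed for L263 in the site table later in the deck.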
![Page 34: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/34.jpg)
Smad-MH2 Alignment & Functionally Specific Sites
• 29 known sites of functional specificity
• based mostly on site-specific mutants, characterized by their binding affinity for BMPR-I vs. TβR-I receptor types
Smad2 H.sapiens D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.melanogaster D A A P V M Y H E P A F W C S I S Y Y E L N T R V G E T F H A S Q P S I T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 C.auratus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 R.norvegicus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 M.musculus D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L G L
Smad2 D.rerio D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N S E - R F C L C L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 X.laevis D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 H.sapiens D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 M.musculus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R L C L G L
Smad3 C.auratus D L Q P V T Y C E S A F W C S I S Y Y E L N Q R V G E T F H A S Q P S L T V D G F T D P S N A E - R F C L G L
Smad3 G.gallus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 S.scrofa D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad3 R.norvegicus D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G F T D P S N S E - R F C L G L
Smad1 S.mansoni T M H P V N Y Q E P K Y W C S I V Y Y E L N N R V G E A F N A S Q L S I I I D G F T D P S N N S D R F C L G L
Smad1 M.musculus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 H.sapiens D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 S.scrofa D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 R.norvegicus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad1 X.tropicalis D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 G.gallus D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad1 D.rerio D V H P V A Y Q E P K H W C S I V Y Y E L N N R V G E A F L A S S T S V L V D G F T D P S N N R N R F C L G L
Smad1 C.coturnix D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G F T D P S N N K N R F C L G L
Smad5 H.sapiens D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 M.musculus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K S R F C L G L
Smad5 R.norvegicus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P A N N K S R F C L G L
Smad5 G.gallus D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad5 D.rerio D V Q P V E Y Q E P S H W C S I V Y Y E L N N R V G E A Y H A S S T S V L V D G F T D P S N N K N R F C L G L
Smad8 M.musculus D F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 R.norvegicus D F R P V C Y E E P L H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G F T D P S N N R N R F C L G L
Smad8 G.gallus N F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S I L I D G F T D P S N N K N R F C L G L
[Alignment column ruler: 10, 20, 30, 40, 50]
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N E V V E Q T R R H I G K G V R L Y Y I G G E V F A E C L S D S S I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y D W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A T V E M T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D N A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N A A V E L T R R H I G R G V R L Y Y I G G E V F A E C L S D S A I F V Q S P N C N Q R Y G W H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H N F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N F H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D S S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C L S D T S I F V Q S R N C N Y H H G F H P T T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
L S N V N R N S T I E N T R R H I G K G V H L Y Y V G G E V Y A E C V S D S S I F V Q S R N C N Y Q H G F H P A T V C K
[Alignment column ruler: 60, 70, 80, 90, 100, 110]
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L S Q S V S Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y R L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C N L K I F N N Q E F A A - - - - L L A Q S V N Q G F E A V Y Q L T R M C T I R M S F V K G W G A E Y R R Q T V
I P P G C S L K I F S N Q E F A H - - - - L L S R T V H H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T L R M S F V K G W G A E Y H R Q D V
I P S R C S L K I F N N Q E F A E - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A K Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E T V Y E L T K M C T L R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S S C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q E F A Q - - - - L L A Q S V N H G F E A V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K V F N N Q L F A Q - - - - L L A Q S V H H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K V F N N Q L F A Q L L A Q L L A Q S V H H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
I P S G C S L K I F N N Q L F A Q - - - - P L A Q S V N H G F E V V Y E L T K M C T I R M S F V K G W G A E Y H R Q D V
[Alignment column ruler: 120, 130, 140, 150, 160, 170]
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D R V L T Q M G S P R L P C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S V R C S S M S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P N L R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W I E L H L N G P L Q W L D K V L T Q M G S P S I R C S S V S
T S T P C W V E I H L N G P L Q W L D R V L T Q M G T P R N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E V H L H G P L Q W L D K V L T Q M G S P L N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
T S T P C W I E I H L H G P L Q W L D K V L T Q M G S P H N P I S S V S
[Alignment column ruler: 180, 190, 200, 210]
| Method | Predict | Specificity | Error |
|---|---|---|---|
| AMAS | 6 | 21% | 3% |
| TreeDet | 21 | 52% | 21% |
| SDPpred | 12 | 31% | 10% |
![Page 35: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/35.jpg)
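Finding Low-harmony sites in Smad-MH2

| Pos. | Sec.str. | SH | AR | BR | Interaction |
|---|---|---|---|---|---|
| L263 | B1’ | 0 | La | Vfm | SARA |
| T267 | B1’ | 0 | Tm | Acen | SARA |
| S269 | loop | 0 | CSh | Eq | ?? (putative) |
| A272 | loop | 0 | A | Kqls | ?? (putative) |
| F273 | loop | 0 | F | Hy | ?? (putative) |
| Q284 | B2 | 0 | Qt | N | TβR-I |
| Q294 | loop | 0.16 | Q | Sq | c-Ski/SnoN |
| P295 | B3 | 0 | P | Trl | c-Ski/SnoN |
| L297 | B3 | 0.11 | LMi | Vi | c-Ski/SnoN |
| T298 | B3 | 0 | T | Li | c-Ski/SnoN |
| S308 | L1 | 0 | Sa | N | c-Ski/SnoN |
| – | L1 | 0 | – | Nsd | c-Ski/SnoN |
| E309 | L1 | 0 | E | Krs | c-Ski/SnoN |
| A323 | H1 | 0 | Ae | S | ALK1/2 |
| V325 | H1 | 0 | V | I | ?? (putative) |
| M327 | H1 | 0 | LMq | N | ALK1/2 |
| R334 | loop | 0.18 | Rk | K | ?? (putative) |
| R337 | B5 | 0 | R | H | ?? (putative) |

[Residue-number ruler: 270, 280, 290]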
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D A A P V M Y H E P A F W C S I S Y Y E L N T R V G E T F H A S Q P S I T V D G
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y S E P A F W C S I A Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E S A F W C S I S Y Y E L N Q R V G E T F H A S Q P S L T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
D L Q P V T Y C E P A F W C S I S Y Y E L N Q R V G E T F H A S Q P S M T V D G
T M H P V N Y Q E P K Y W C S I V Y Y E L N N R V G E A F N A S Q L S I I I D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G
D V H P V A Y Q E P K H W C S I V Y Y E L N N R V G E A F L A S S T S V L V D G
D V Q A V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S I L V D G
D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q P V A Y E E P K H W C S I V Y Y E L N N R V G E A F H A S S T S V L V D G
D V Q P V E Y Q E P S H W C S I V Y Y E L N N R V G E A Y H A S S T S V L V D G
D F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G
D F R P V C Y E E P L H W C S V A Y Y E L N N R V G E T F Q A S S R S V L I D G
N F R P V C Y E E P Q H W C S V A Y Y E L N N R V G E T F Q A S S R S I L I D G
[Residue-number ruler, continued: 300]
![Page 36: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/36.jpg)
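Finding Low-harmony sites in Smad-MH2

| Pos. | Sec.str. | SH | AR | BR | Interaction |
|---|---|---|---|---|---|
| L263 | B1’ | 0 | La | Vfm | SARA |
| T267 | B1’ | 0 | Tm | Acen | SARA |
| S269 | loop | 0 | CSh | Eq | ?? (putative) |
| A272 | loop | 0 | A | Kqls | ?? (putative) |
| F273 | loop | 0 | F | Hy | ?? (putative) |
| Q284 | B2 | 0 | Qt | N | TβR-I |
| Q294 | loop | 0.16 | Q | Sq | c-Ski/SnoN |
| P295 | B3 | 0 | P | Trl | c-Ski/SnoN |
| L297 | B3 | 0.11 | LMi | Vi | c-Ski/SnoN |
| T298 | B3 | 0 | T | Li | c-Ski/SnoN |
| S308 | L1 | 0 | Sa | N | c-Ski/SnoN |
| – | L1 | 0 | – | Nsd | c-Ski/SnoN |
| E309 | L1 | 0 | E | Krs | c-Ski/SnoN |
| A323 | H1 | 0 | Ae | S | ALK1/2 |
| V325 | H1 | 0 | V | I | ?? (putative) |
| M327 | H1 | 0 | LMq | N | ALK1/2 |
| R334 | loop | 0.18 | Rk | K | ?? (putative) |
| R337 | B5 | 0 | R | H | ?? (putative) |

| Method | Predict | Specificity | Error |
|---|---|---|---|
| AMAS | 6 | 21% | 3% |
| TreeDet | 21 | 52% | 21% |
| SDPpred | 12 | 31% | 10% |
| Sequence Harmony (SH=0) | 32 | 79% | 28% |
| Sequence Harmony (SH<0.2) | 40 | 93% | 33% |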
Pirovano, Feenstra & Heringa. “Sequence Comparison by Sequence Harmony Identifies Subtype Specific Functional Sites”, Nucleic Acids Res., in press (2006).
www.few.vu.nl/~feenstra/articles/NAR 2006 Sequence Harmony.pdf
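The SH values in the site table can be checked with a few lines of code. The sketch below assumes that Sequence Harmony for one alignment column is SH = ½ Σ_x [(pA+pB) log2(pA+pB) − pA log2 pA − pB log2 pB], where pA and pB are the residue frequencies in the two groups. The slides in this part of the deck do not restate the formula, so treat that definition as an assumption. The function names are hypothetical, and the residue counts for position Q294 are read by hand from the alignment (15 AR rows, 17 BR rows).

```python
import math
from collections import Counter


def frequencies(column):
    """Return residue -> relative frequency for one alignment column of one group."""
    counts = Counter(column)
    total = sum(counts.values())
    return {res: n / total for res, n in counts.items()}


def sequence_harmony(column_a, column_b):
    """Sequence Harmony of one column for two groups (assumed formula, log base 2).

    SH = 1/2 * sum_x [(pA+pB) log2(pA+pB) - pA log2 pA - pB log2 pB]
    0 means the groups share no residue type; 1 means identical composition.
    """
    p_a = frequencies(column_a)
    p_b = frequencies(column_b)

    def xlog2x(p):
        # p * log2(p), with the usual convention that 0 * log2(0) = 0
        return p * math.log2(p) if p > 0 else 0.0

    total = 0.0
    for res in set(p_a) | set(p_b):
        a = p_a.get(res, 0.0)
        b = p_b.get(res, 0.0)
        total += xlog2x(a + b) - xlog2x(a) - xlog2x(b)
    return 0.5 * total


# Position Q294: all 15 AR rows carry Q; the 17 BR rows carry S except for one Q.
ar_q294 = "Q" * 15
br_q294 = "Q" + "S" * 16

sh_q294 = sequence_harmony(ar_q294, br_q294)
print(round(sh_q294, 2))
```

Under these assumptions the Q294 column evaluates to about 0.16, the value listed in the table. Columns where the two groups have no residue type in common evaluate to 0. Gap characters are not treated specially in this sketch.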
![Page 37: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/37.jpg)
Smad-MH2: Low Harmony Patches
![Page 38: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/38.jpg)
Smad-MH2: Functional Clusters
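[Figure: Smad-MH2 structure with labelled residues grouped into functional clusters]
Labelled residues: R462, C463, Q400, R410, W368, Y366, A392, S269, F273, N443, Q294, Q309, L297, L440, N381, A354, V461, S460, Q407, Q364, P360, R365, T267, A272, I341, P295, S308, T298, R337, F346, P378, Q284, V325, A323, R427, M327, T430, R334
Cluster annotations: FAST1, Mixer, SARA; c-Ski/SnoN; SARA; TβR-I/ALK1/2; TβR-I/BMPR-I; ?; SARA/Mixer; TβR-I/BMPR-I/ALK1/2; ?
Functional classes: receptor-binding; retention & transcription factors; co-repressors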
![Page 39: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/39.jpg)
Conclusions Smad-MH2
• 40 sites of low Sequence Harmony in Smad-MH2
• these differ between the AR (TGF-β) and BR (BMP) sub-type Smads
• Low Harmony sites in Smad-MH2 are functionally relevant
• Other methods cannot select all known sites!
Functional sites are interaction surfaces on the protein surface. Next: analyze interaction partners in the pathway
• 14 Low Harmony sites in Smad-MH2 of unknown function
• 11 putative functions from structural considerations
• promising candidates that determine TGF-β/BMP specificity
• confirm (or refute) putative functions?
![Page 40: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/40.jpg)
Sequence Harmony Webserver
http://www.ibi.vu.nl/programs/seqharmwww1-b/
![Page 41: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/41.jpg)
Sequence Harmony Webserver: Groups
![Page 42: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/42.jpg)
Sequence Harmony Webserver: Reference
![Page 43: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/43.jpg)
Sequence Harmony Webserver: Structure
![Page 44: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/44.jpg)
SMAD Sequence Harmony: Raw Table
![Page 45: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/45.jpg)
SMAD Sequence Harmony: Results
![Page 46: C E N T R F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U E Anton Feenstra 23 nov 2006 Sequence Entropy Sequence Analysis](https://reader038.vdocument.in/reader038/viewer/2022103123/56649d3b5503460f94a15d17/html5/thumbnails/46.jpg)
SMAD Sequence Harmony: Structure