Post on 30-Jan-2016

Tropical Geometry for Biology

Lior Pachter and Bernd Sturmfels

Department of Mathematics, U.C. Berkeley

Tropical arithmetic
• Annotation is sequence labeling
• Annotation is important for biology
• Annotation is tropical arithmetic

Tropical geometry
• Tree basics
• Tree reconstruction is important for biology
• Tree space is the tropical Grassmannian

Back to the data

What is annotation?

INPUT: ..t..r…o..p..i..c..a..a..l...g..e..e..t..r..y..
OUTPUT: ..t..r…o..p..i..c..a..a..l...g..e..e..t..r..y..

Annotation is the labeling of the input sequence, in this case with 3 colors:

ome

Biology example: gene annotation

Input: TAATATGTCCACGGGTATTGAGCATTGTACACGGGGTATTGAGCATGTAATGAA

Output: TAAT ATGTCCACGG TTGTACACGGCA G GTATTGAGGTATTGAG ATGTAAC TGAA

Leucine

Finding a good annotation with tropical arithmetic

Example: assign “scores”, say x, y, z, to each color regardless of letter.

Best annotation for TAAT is obtained by evaluating
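The expression evaluated on the slide is not preserved in this transcript, so the following is only a minimal sketch of the idea under stated assumptions. It uses the max-plus convention (tropical "addition" is max, tropical "multiplication" is ordinary +; min-plus is equally common), assumes the three colors are called red, green, and blue, and assumes hypothetical numeric scores that depend on the color only, with no constraints between neighboring positions. Under those assumptions the score of a labeling is the tropical product of its per-position scores, and the best annotation is the tropical sum over all labelings.

```python
from itertools import product


def trop_add(a, b):
    """Tropical addition in the max-plus convention."""
    return max(a, b)


def trop_mul(a, b):
    """Tropical multiplication is ordinary addition."""
    return a + b


def best_annotation_brute_force(sequence, scores):
    """Enumerate every labeling and keep the one with the highest score."""
    best_score, best_labels = float("-inf"), None
    for labels in product(scores, repeat=len(sequence)):
        total = 0.0  # 0 is the identity for tropical multiplication
        for color in labels:
            total = trop_mul(total, scores[color])
        if total > best_score:
            best_score, best_labels = total, labels
    return best_score, best_labels


def best_score_factored(sequence, scores):
    """Same value via the distributive law: one tropical sum per position."""
    per_position = float("-inf")  # -inf is the identity for tropical addition
    for score in scores.values():
        per_position = trop_add(per_position, score)
    total = 0.0
    for _ in sequence:
        total = trop_mul(total, per_position)
    return total


# Hypothetical values standing in for the slide's x, y, z.
scores = {"red": 1.0, "green": 2.5, "blue": 0.5}
brute_score, brute_labels = best_annotation_brute_force("TAAT", scores)
factored_score = best_score_factored("TAAT", scores)
print(brute_score, brute_labels, factored_score)
```

The brute-force version touches all 3^4 labelings, while the factored version does a constant amount of work per position. That saving, which comes from the distributive law holding in tropical arithmetic, is what dynamic programming exploits in more realistic models where scores also depend on the letter and on neighboring labels.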

What is a phylogenetic X-tree?

In Darwin’s example, X = {A,B,C,D,1}

Tree basics

[Figure: three trees on the leaf set {1, 2, 3, 4}]

In general, the number of unrooted binary trees with n labeled leaves is the Schröder number (2n-5)!! = (2n-5)*(2n-7)*…*3*1
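As a quick check of the count above, the double factorial can be computed directly. This is a small sketch; the function name is chosen here for illustration and does not come from the talk.

```python
def num_unrooted_binary_trees(n):
    """Return (2n-5)!!, the number of unrooted binary trees on n labeled leaves."""
    if n < 3:
        raise ValueError("the formula applies for n >= 3 leaves")
    count = 1
    # Multiply the odd numbers 2n-5, 2n-7, ..., 3, 1.
    for k in range(2 * n - 5, 0, -2):
        count *= k
    return count


for n in (4, 5, 10):
    print(n, num_unrooted_binary_trees(n))
```

For n = 4 this gives the 3 trees on leaves 1, 2, 3, 4, and for n = 5 it gives 15. The count grows so quickly that exhaustive search over trees becomes impractical for even moderately many species.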

Metrics and trees

[Figure: a tree on leaves 1, 2, 3, 4 with branch lengths 0.1, 0.2, 0.4, 0.2, 0.3]

Data: [ d_ij ], the distance between species i and j
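To make the link between a weighted tree and the matrix [ d_ij ] concrete, here is a minimal sketch that sums branch lengths along the path between each pair of leaves. The topology and the assignment of the five branch lengths to edges are assumptions made for illustration (the figure's layout is not recoverable from the transcript), and the internal node names "u" and "v" are hypothetical.

```python
def tree_distances(edges, leaves):
    """Return {(i, j): path length} for all ordered pairs of leaves."""
    # Build an adjacency list for the undirected weighted tree.
    adjacency = {}
    for a, b, weight in edges:
        adjacency.setdefault(a, []).append((b, weight))
        adjacency.setdefault(b, []).append((a, weight))

    distances = {}
    for source in leaves:
        # Depth-first traversal; in a tree each node is reached by a unique path.
        reached = {source: 0.0}
        stack = [source]
        while stack:
            node = stack.pop()
            for neighbor, weight in adjacency[node]:
                if neighbor not in reached:
                    reached[neighbor] = reached[node] + weight
                    stack.append(neighbor)
        for target in leaves:
            distances[(source, target)] = reached[target]
    return distances


# Assumed topology: leaves 1 and 2 hang off "u", leaves 3 and 4 hang off "v".
edges = [(1, "u", 0.1), (2, "u", 0.2), ("u", "v", 0.4), ("v", 3, 0.2), ("v", 4, 0.3)]
d = tree_distances(edges, [1, 2, 3, 4])
for i in (1, 2, 3, 4):
    print([round(d[(i, j)], 2) for j in (1, 2, 3, 4)])
```

Tree reconstruction runs in the opposite direction: the distances are estimated from data, and the task is to find a tree and branch lengths that explain them.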

A primate tree from genome sequences

Tree space is the tropical Grassmannian

Example: X={1,2,3,4,5}

[Figure with labels 1, 2, 3, 4, 5]
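One way to see the connection is the four-point condition: a metric comes from a tree exactly when, for every four taxa i, j, k, l, the maximum of d_ij + d_kl, d_ik + d_jl, d_il + d_jk is attained at least twice. Up to a sign convention this is the tropical form of the three-term Plücker relations that define the tropical Grassmannian of Speyer and Sturmfels. The sketch below checks the condition on a hypothetical metric for X = {1,2,3,4,5}; the distances are an assumption, taken from a tree with all branch lengths equal to 1 in which leaves 1, 2 and leaves 4, 5 form the two cherries.

```python
from itertools import combinations


def satisfies_four_point(d, taxa, tol=1e-9):
    """Check that the largest of the three pairings is attained twice in every quartet."""
    def dist(i, j):
        return d[(min(i, j), max(i, j))]

    for i, j, k, l in combinations(taxa, 4):
        sums = sorted([
            dist(i, j) + dist(k, l),
            dist(i, k) + dist(j, l),
            dist(i, l) + dist(j, k),
        ])
        # The two largest sums must agree (up to a numerical tolerance).
        if sums[2] - sums[1] > tol:
            return False
    return True


# Hypothetical tree metric on five taxa (unit branch lengths).
tree_metric = {
    (1, 2): 2, (1, 3): 3, (1, 4): 4, (1, 5): 4,
    (2, 3): 3, (2, 4): 4, (2, 5): 4,
    (3, 4): 3, (3, 5): 3,
    (4, 5): 2,
}
# Perturbing a single entry moves the point off tree space.
perturbed = dict(tree_metric)
perturbed[(1, 2)] = 5

print(satisfies_four_point(tree_metric, [1, 2, 3, 4, 5]))
print(satisfies_four_point(perturbed, [1, 2, 3, 4, 5]))
```

Distances estimated from real sequence data rarely satisfy the condition exactly, which is why reconstruction methods look for a nearby point of tree space rather than testing membership.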

Back to the data

Graphical Models

[Figure labels: Alignment, Phylogeny, Annotation; Multi HMM, Generalized HMM, Tree Markov models, Generalized Multi HMM, Evol. HMM, Generalized hidden Markov Phylogeny]

Final message: Tropical mathematics is important for comparative genomics.

For more on mathematics and tropical geometry (and combinatorics and algebra and statistics…):

• L. Pachter and B. Sturmfels, Tropical Geometry of Statistical Models, PNAS 101, 2004.
• L. Pachter and B. Sturmfels, Parametric Inference for Biological Sequence Analysis, PNAS 101, 2004.
• D. Speyer and B. Sturmfels, The Tropical Grassmannian, Advances in Geometry 4, 2004.
• L. Pachter and B. Sturmfels, Mathematics of Phylogenomics, arXiv math.ST/0409132, 2004.

and coming soon:

Book (to be published by Cambridge University Press)

Algebraic Statistics for Computational Biology, edited by Pachter and Sturmfels
