

Page 1: 1 Dendroscope – An interactive viewer for large phylogenetic trees Daniel H. Huson Phylogenetics Programme, Newton Institute, September 2007 - and networks

Dendroscope – An interactive viewer for large phylogenetic trees - and networks

Daniel H. Huson

Phylogenetics Programme, Newton Institute, September 2007

Page 2:

Overview

Dendroscope and large trees
Phylogenetic networks
– cluster networks
Dendroscope 2 and phylogenetic networks

Page 3:

Yet Another Tree Viewer?

http://evolution.genetics.washington.edu/phylip/software.html:

Yet… no existing program “does it all”

Page 4:

Requirements

Provide all standard visualizations
Allow interactive setting of line widths, colors and fonts
Allow rerooting, reordering, hiding, deletion and subtree extraction
Open and save in different formats, including standard graphics formats
Run on large files with many trees or large trees (with a million nodes)
Run on all major operating systems

Page 5:

Eight Different Views

Page 6:

Multiple Trees

List of trees can be loaded and edited

Page 7:

Large Trees

NCBI taxonomy, ~325,000 taxa

Page 8:

Finding Taxa in Large Trees

Page 9:

Subtree Extraction

Select a set of taxa and extract the induced subtree

Page 10:

Overview

Dendroscope and large trees
Phylogenetic networks
– cluster networks
Dendroscope 2 and phylogenetic networks

Page 11:

The Splits of a Tree

Every edge of a tree defines a split of the taxon set X:

x1,x3,x4,x6,x7 vs x2,x5,x8

[Figure: tree on the taxa x1, ..., x8; the marked edge e separates x2, x5, x8 from the remaining taxa]
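As a minimal sketch of the statement above, the following Python code lists the split defined by every edge of a small tree. The nested-tuple tree encoding, the function names `leaves` and `splits`, and the topology of `toy_tree` are assumptions made for illustration; the tree on the slide is not recoverable from the transcript, so the toy tree was only chosen so that one of its edges yields the split shown.

```python
def leaves(tree):
    """Return the leaf labels below a node (a label or a tuple of subtrees)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the set of splits {A, X - A}, one for every edge of the tree."""
    taxa = leaves(tree)
    found = set()

    def visit(node):
        below = leaves(node)
        if below != taxa:  # the root has no edge above it
            # The edge above this node separates `below` from all other taxa.
            found.add(frozenset([below, taxa - below]))
        if not isinstance(node, str):
            for child in node:
                visit(child)

    visit(tree)
    return found


# Hypothetical topology, not the tree drawn on the slide.
toy_tree = ((("x1", "x3"), ("x4", ("x6", "x7"))), ("x2", ("x5", "x8")))
tree_splits = splits(toy_tree)
```

The two edges at the root of a rooted binary tree define the same split, which is why splits are stored in a set; for this toy tree that leaves 13 distinct splits, the number of edges of the corresponding unrooted binary tree on eight taxa.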

Page 12:

Trees and Compatible Splits

The set of all splits obtained from T is called the split encoding Σ(T) of T

Theorem: An arbitrary set of splits Σ is the split encoding of some unique tree T if and only if any two splits in Σ are compatible.

How to represent incompatible splits?
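The slide does not spell out what "compatible" means. Under the standard definition, two splits A|A' and B|B' are compatible when at least one of the four intersections A∩B, A∩B', A'∩B, A'∩B' is empty. The sketch below checks this pairwise; the helper names and the five-taxon example are illustrative assumptions.

```python
from itertools import combinations


def split(part, taxa):
    """Build the split part | (taxa - part) as a pair of frozensets."""
    part = frozenset(part)
    return (part, frozenset(taxa) - part)


def compatible(s1, s2):
    """True if at least one of the four pairwise intersections is empty."""
    return any(not (a & b) for a in s1 for b in s2)


def all_compatible(split_set):
    """True if every pair of splits is compatible (the condition in the theorem)."""
    return all(compatible(s, t) for s, t in combinations(split_set, 2))


taxa = "ABCDE"  # each character is one taxon
ab = split("AB", taxa)
de = split("DE", taxa)
bc = split("BC", taxa)
```

Here `ab` and `de` can sit on the same tree, while `ab` and `bc` overlap in B without either containing the other, so no single tree carries both.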

Page 13:

Split Networks

Display incompatible splits using bands of parallel edges (Bandelt & Dress, 1992)
Boxes are artifacts of this; non-intuitive for users?
Size of network can be exponential in the number of splits
Only drawn in unrooted radial layout
Different from reticulate networks

Find a new way to represent incompatible splits?

Page 14:

Hasse Diagram

Stefan Gruenewald (MPI Shanghai): why not use a “Hasse diagram” or “cover digraph”?

Because clusters are then represented by nodes, not edges

Clusters (“rooted splits”): {A} {B} {C} {D} {E} {A,B} {B,C} {D,E} {C,D,E} {A,B,C,D,E}

[Figure: Hasse diagram of the clusters, drawn in levels]
{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {C} {D} {E}
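For readers unfamiliar with the term, the Hasse diagram (cover digraph) of a set of clusters has one node per cluster and an edge from a cluster to each cluster directly contained in it, with no third cluster strictly in between. The following sketch computes those edges for the clusters listed on the slide; the function name `hasse_edges` and the one-character-per-taxon encoding are assumptions for illustration.

```python
def hasse_edges(clusters):
    """Return the cover relation as a set of (larger, smaller) cluster pairs."""
    cs = [frozenset(c) for c in clusters]
    edges = set()
    for big in cs:
        for small in cs:
            # `small` is covered by `big` if it is a proper subset and
            # no other cluster lies strictly between the two.
            if small < big and not any(small < mid < big for mid in cs):
                edges.add((big, small))
    return edges


# Clusters from the slide; each character stands for one taxon.
slide_clusters = ["A", "B", "C", "D", "E", "AB", "BC", "DE", "CDE", "ABCDE"]
cover = hasse_edges(slide_clusters)

# Number of edges entering each cluster.
in_degree = {}
for _, small in cover:
    in_degree[small] = in_degree.get(small, 0) + 1
```

In this example the clusters {B} and {C} each have two incoming edges, because {B} lies in both {A,B} and {B,C}, and {C} lies in both {B,C} and {C,D,E}.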

Page 15:

Idea: Extend the Hasse Diagram

Represent every cluster by its in-edge:

{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {C} {D} {E}

?

Page 16:

Idea: Extend the Hasse Diagram

If in-degree >1, insert new edge:

{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {D} {E} {C}
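One way to read the step above: whenever a node of the Hasse diagram has more than one incoming edge, a new node is placed directly above it, all incoming edges are redirected to the new node, and a single new edge connects the new node to the original one, so that the cluster is again represented by exactly one in-edge. The sketch below implements this reading; it is an interpretation of the slide rather than a confirmed description of the Dendroscope implementation, and the node names such as `("ret", "B")` are hypothetical.

```python
from collections import defaultdict


def extend_hasse(edges):
    """Give every node with in-degree > 1 a single in-edge via a new node."""
    parents = defaultdict(list)
    for parent, child in edges:
        parents[child].append(parent)

    network = []
    for child, ps in parents.items():
        if len(ps) > 1:
            ret = ("ret", child)                  # hypothetical new node
            network.extend((p, ret) for p in ps)  # redirected ("blue") edges
            network.append((ret, child))          # new in-edge for the cluster
        else:
            network.append((ps[0], child))
    return network


# Hasse diagram of the slide's clusters, nodes named by their taxa.
hasse = [
    ("ABCDE", "AB"), ("ABCDE", "BC"), ("ABCDE", "CDE"),
    ("CDE", "DE"), ("CDE", "C"),
    ("AB", "A"), ("AB", "B"), ("BC", "B"), ("BC", "C"),
    ("DE", "D"), ("DE", "E"),
]
cluster_network = extend_hasse(hasse)
```

For the slide's clusters, only {B} and {C} receive a new node, so the 11 edges of the Hasse diagram become 13 edges in the resulting network.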

Page 17:

“Cluster Network”

A new type of network?

{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {D} {E} {B} {C}

Page 18:

Split Network vs Cluster Network

Data: (Kumar, 1998)

Split network Cluster network

Page 19:

Cluster Network vs Reticulate Network

Cluster network: “Hard-wired”: blue edges always on
– Canonical network, computationally easy

Reticulate network: “Soft-wired”: for any split, any blue edge can be on or off
– Minimum reticulate network, computationally hard
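The difference between the two readings can be made concrete on a small network. In the sketch below, the hard-wired clusters are read off with all edges present, while the soft-wired clusters are collected per "switching", assuming the usual convention that exactly one in-edge of each reticulate node is switched on at a time; that convention is an assumption about what the slide means by "on or off". The network, the node names `rB` and `rC`, and the function names are illustrative.

```python
from itertools import product


def clusters(edges, leaf_names):
    """Leaf sets below every edge, using all given edges (hard-wired reading)."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    def below(node):
        if node in leaf_names:
            return frozenset([node])
        return frozenset().union(*(below(k) for k in children.get(node, [])))

    found = {below(child) for _, child in edges}
    found.discard(frozenset())  # inner nodes that lost all their leaves
    return found


def softwired_cluster_sets(edges, leaf_names):
    """One cluster set per switching: a single in-edge kept per reticulate node."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, []).append(parent)
    reticulations = [v for v, ps in parents.items() if len(ps) > 1]
    tree_edges = [(p, c) for p, c in edges if len(parents[c]) == 1]
    result = []
    for choice in product(*(parents[v] for v in reticulations)):
        switched = tree_edges + list(zip(choice, reticulations))
        result.append(clusters(switched, leaf_names))
    return result


# Illustrative network on five taxa; rB and rC are reticulate nodes.
network = [
    ("ABCDE", "AB"), ("ABCDE", "BC"), ("ABCDE", "CDE"), ("CDE", "DE"),
    ("AB", "A"), ("DE", "D"), ("DE", "E"),
    ("AB", "rB"), ("BC", "rB"), ("rB", "B"),
    ("BC", "rC"), ("CDE", "rC"), ("rC", "C"),
]
taxa = set("ABCDE")
hard = clusters(network, taxa)
soft = softwired_cluster_sets(network, taxa)
```

In the hard-wired reading, {A,B} and {B,C} are both present at once. In the soft-wired reading each of them appears in some switching, but no single switching contains both, since B hangs below only one of its two possible parents at a time.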

Page 20:

Overview

Dendroscope and large trees
Phylogenetic networks
– cluster networks
Dendroscope 2 and phylogenetic networks

Page 21:

Dendroscope 2

Computation of different consensus trees and super trees
Computation of different consensus networks and super networks
Use “extended Newick” format to support cluster networks and reticulate networks
All features of Dendroscope 1 will also apply to networks
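The slides do not show the “extended Newick” syntax. As a hedged illustration, the sketch below writes a small rooted network in one widely used convention, in which a reticulate node is tagged `#H1`, `#H2`, ... and written out in full at its first occurrence and by tag alone afterwards. Whether Dendroscope 2 uses exactly this syntax is an assumption; the function name and the example network are hypothetical.

```python
def to_extended_newick(edges, root):
    """Serialize a rooted network; reticulate nodes are tagged #H1, #H2, ..."""
    children, in_degree = {}, {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
        in_degree[child] = in_degree.get(child, 0) + 1

    tags = {}

    def write(node):
        if in_degree.get(node, 0) > 1:
            if node in tags:          # already written once: emit the tag only
                return tags[node]
            tags[node] = f"#H{len(tags) + 1}"
        kids = children.get(node, [])
        if kids:
            text = "(" + ",".join(write(k) for k in kids) + ")"
        else:
            text = node               # leaves are written by name
        return text + tags.get(node, "")

    return write(root) + ";"


# Hypothetical network: taxon B sits below a reticulate node h
# that has the two parents u and v.
example = [("r", "u"), ("r", "v"), ("u", "A"), ("u", "h"),
           ("v", "h"), ("v", "C"), ("h", "B")]
newick = to_extended_newick(example, "r")
# newick == "((A,(B)#H1),(#H1,C));"
```

With the reticulate node removed from the input, the same function produces an ordinary Newick string, which is the sense in which the format extends Newick.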

Page 22:

Example: Five Fungal Trees

Five fungal trees (Pryor 2000, 2003):
ITS (two trees)
SSU (two trees)
Gpd (one tree)

Number of taxa: 29-46, total is 63

Page 23:

“Strict Consensus Tree”

Page 24:

“Majority Consensus Tree”
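As a rough sketch of the two consensus notions named on these slides: the strict consensus keeps the clusters found in every input tree, and the majority consensus keeps those found in more than half of them. The code assumes all trees share one taxon set, which is not true of the five fungal trees (those need super tree or super network methods), and it works on cluster sets rather than on drawn trees. Function names and the three toy trees are illustrative.

```python
from collections import Counter


def strict_consensus(cluster_sets):
    """Clusters present in every input tree."""
    return set.intersection(*(set(cs) for cs in cluster_sets))


def threshold_consensus(cluster_sets, threshold=0.5):
    """Clusters present in more than `threshold` (a fraction) of the trees."""
    counts = Counter(c for cs in cluster_sets for c in set(cs))
    n = len(cluster_sets)
    return {c for c, k in counts.items() if k / n > threshold}


# Nontrivial clusters of three hypothetical trees on the taxa A, B, C, D.
AB, ABC, CD = frozenset("AB"), frozenset("ABC"), frozenset("CD")
trees = [{AB, ABC}, {AB, CD}, {AB, ABC}]

strict = strict_consensus(trees)            # {AB}
majority = threshold_consensus(trees, 0.5)  # {AB, ABC}
low = threshold_consensus(trees, 0.2)       # {AB, ABC, CD}
```

Clusters that pass a threshold above 50% are always pairwise compatible and so fit on one tree. At lower thresholds, incompatible clusters such as {A,B,C} and {C,D} can both pass, and a network is needed to display them.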

Page 25:

Consensus Super Network, >20% support

Page 26:

Super Network, All Splits

Page 27:

Summary

Dendroscope 1: new interactive tool for visualizing & editing phylogenetic trees
Cluster networks: new type of phylogenetic network that is easy to compute and “looks more like a tree”
Dendroscope 2: will contain consensus methods and will read, write and draw cluster and reticulate networks.

Dendroscope 1 is freely available from:

www-ab.informatik.uni-tuebingen.de/software.dendroscope

Page 28:

Credits

Contributions to Dendroscope from:
– Tobias Dezulian, Markus Franz, Christian Rausch, Daniel Richter & Regula Rupp

Super network algorithm (Z-closure) joint work with:
– Tobias Dezulian, Tobias Klöpper and Mike Steel

Filtered super network joint work with:
– Mike Steel and Jim Whitfield