1
Dendroscope – An interactive viewer for large phylogenetic trees
- and networks
Daniel H. Huson
Phylogenetics Programme, Newton Institute, September 2007
2
Overview
Dendroscope and large trees
Phylogenetic networks – cluster networks
Dendroscope 2 and phylogenetic networks
3
Yet Another Tree Viewer?
http://evolution.genetics.washington.edu/phylip/software.html:
Yet… no existing program “does it all”
4
Requirements
Provide all standard visualizations
Allow interactive setting of line widths, colors and fonts
Allow rerooting, reordering, hiding, deletion and subtree extraction
Open and save in different formats, including standard graphics formats
Run on large files with many trees or large trees (with a million nodes)
Run on all major operating systems
5
Eight Different Views
6
Multiple Trees
List of trees can be loaded and edited
7
Large Trees
NCBI taxonomy, ~325,000 taxa
8
Finding Taxa in Large Trees
9
Subtree Extraction
Select a set of taxa and extract the induced subtree
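
The induced subtree for a selected set of taxa can be sketched as follows. This is a minimal illustration, not Dendroscope's own code: it assumes a rooted tree stored as nested tuples with string leaves, and the function name `induced_subtree` is hypothetical.

```python
def induced_subtree(tree, keep):
    """Return the subtree induced by the taxa in `keep`, or None if none remain.

    Assumed representation: a leaf is a string, an internal node is a tuple
    of child subtrees.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    # Restrict every child, then drop the children that lost all their taxa.
    children = [induced_subtree(child, keep) for child in tree]
    children = [child for child in children if child is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]  # suppress a node left with a single child
    return tuple(children)


tree = ((("A", "B"), "C"), ("D", ("E", "F")))
selected = induced_subtree(tree, {"A", "C", "E", "F"})
print(selected)
```

Nodes that are left with only one child are suppressed, so the result contains no unbranched paths.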
10
Overview
Dendroscope and large trees
Phylogenetic networks – cluster networks
Dendroscope 2 and phylogenetic networks
11
The Splits of a Tree
Every edge of a tree defines a split of the taxon set X:
x1,x3,x4,x6,x7 vs x2,x5,x8
(Figure: tree on the taxa x1, ..., x8 with the edge e that defines this split.)
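
A small sketch of how the splits of a tree could be enumerated. The tree topology below is an assumption made up for illustration (the slide's figure is not recoverable); it is only chosen so that it contains the split x1,x3,x4,x6,x7 vs x2,x5,x8. The nested-tuple representation and the name `splits` are also hypothetical.

```python
def splits(tree):
    """Return all splits of a tree given as nested tuples with string leaves.

    Each split is a frozenset holding its two sides (both frozensets).
    """
    clusters = []

    def leaves_below(node):
        if isinstance(node, str):
            below = frozenset([node])
        else:
            below = frozenset().union(*(leaves_below(child) for child in node))
        clusters.append(below)
        return below

    taxa = leaves_below(tree)
    # The edge above each non-root node separates its leaves from the rest.
    return {frozenset([c, taxa - c]) for c in clusters if c != taxa}


# Assumed example topology on x1..x8.
tree = ((("x2", "x5"), "x8"), (("x1", "x3"), ("x4", ("x6", "x7"))))
all_splits = splits(tree)
example = frozenset([frozenset({"x2", "x5", "x8"}),
                     frozenset({"x1", "x3", "x4", "x6", "x7"})])
print(example in all_splits, len(all_splits))
```

The two edges at the root describe the same split, which is why splits are collected in a set; a binary tree on 8 taxa then yields 2·8 − 3 = 13 splits.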
12
Trees and Compatible Splits
The set of all splits obtained from T is called the split encoding Σ(T) of T
Theorem: An arbitrary set of splits Σ is the split encoding of some unique tree T if and only if any two splits in Σ are compatible.
How to represent incompatible splits?
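
Compatibility of two splits can be checked directly: two splits A1|B1 and A2|B2 are compatible if at least one of the four intersections A1∩A2, A1∩B2, B1∩A2, B1∩B2 is empty. A minimal sketch, with splits stored as pairs of Python sets (the function names are hypothetical):

```python
from itertools import combinations


def compatible(s1, s2):
    """Two splits are compatible if some pair of their sides is disjoint."""
    return any(not (x & y) for x in s1 for y in s2)


def pairwise_compatible(splits):
    """True if every two splits in the collection are compatible."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))


s1 = ({"A", "B"}, {"C", "D", "E"})
s2 = ({"D", "E"}, {"A", "B", "C"})
s3 = ({"B", "C"}, {"A", "D", "E"})

print(compatible(s1, s2))                 # {A,B} and {D,E} are disjoint
print(compatible(s1, s3))                 # all four intersections are non-empty
print(pairwise_compatible([s1, s2, s3]))
```

By the theorem above, the set {s1, s2} fits on a tree, while adding s3 makes the set impossible to display on any single tree.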
13
Split Networks
Display incompatible splits using bands of parallel edges (Bandelt & Dress, 1992)
Boxes are artifacts of this; non-intuitive for users?
Size of network can be exponential in # of splits
Only drawn in unrooted radial layout
Different from reticulate networks
Find a new way to represent incompatible splits?
14
Hasse Diagram
Stefan Gruenewald (MPI Shanghai): why not use a “Hasse diagram” or “cover digraph”?
Because clusters are then represented by nodes, not edges
Clusters (“rooted splits”): {A} {B} {C} {D} {E} {A,B} {B,C} {D,E} {C,D,E} {A,B,C,D,E}
{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {C} {D} {E}
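
The Hasse diagram (cover digraph) of these clusters can be sketched as follows: there is an edge from X to Y whenever Y is a proper subset of X and no other cluster lies strictly between them. The function name `hasse_edges` is hypothetical, and the quadratic search is only meant for small examples.

```python
def hasse_edges(clusters):
    """Return the cover relation of a set of clusters as (parent, child) pairs."""
    clusters = [frozenset(c) for c in clusters]
    edges = set()
    for parent in clusters:
        for child in clusters:
            # child < parent tests for a proper subset.
            if child < parent and not any(child < mid < parent for mid in clusters):
                edges.add((parent, child))
    return edges


# The clusters from the slide; each string lists the taxa of one cluster.
clusters = ["A", "B", "C", "D", "E", "AB", "BC", "DE", "CDE", "ABCDE"]
edges = hasse_edges(clusters)

in_degree = {}
for parent, child in edges:
    in_degree[child] = in_degree.get(child, 0) + 1
multi = sorted("".join(sorted(c)) for c, d in in_degree.items() if d > 1)
print(len(edges), multi)
```

In this example the clusters {B} and {C} each have two parents, which is exactly where the incompatibility of {A,B}, {B,C} and {C,D,E} shows up.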
15
Idea: Extend the Hasse Diagram
Represent every cluster by its in-edge:
{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {C} {D} {E}
?
16
Idea: Extend the Hasse Diagram
If in-degree >1, insert new edge:
{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {B} {D} {E} {C}
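
The extension step can be sketched as below, assuming the construction is: build the cover digraph of the clusters, then give every node with in-degree > 1 a new parent node that collects all of its former in-edges, so that the cluster is again represented by a single in-edge. The name `cluster_network` and the `("ret", cluster)` label for the inserted nodes are hypothetical.

```python
def cluster_network(clusters):
    """Return the edges of the extended Hasse diagram as (parent, child) pairs."""
    clusters = [frozenset(c) for c in clusters]
    # Cover relation: proper subset with no cluster strictly in between.
    cover = [(p, c) for p in clusters for c in clusters
             if c < p and not any(c < m < p for m in clusters)]
    network = []
    for child in clusters:
        parents = [p for p, c in cover if c == child]
        if len(parents) > 1:
            ret = ("ret", child)  # hypothetical label for the inserted node
            network += [(p, ret) for p in parents]
            network.append((ret, child))  # the one in-edge representing the cluster
        else:
            network += [(p, child) for p in parents]
    return network


clusters = ["A", "B", "C", "D", "E", "AB", "BC", "DE", "CDE", "ABCDE"]
net = cluster_network(clusters)
print(len(net))
```

For the example clusters, new nodes are inserted above {B} and {C}, and every cluster node ends up with at most one in-edge.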
17
“Cluster Network”
A new type of network?
{A,B,C,D,E}
{C,D,E}
{A,B} {B,C} {D,E}
{A} {D} {E} {B} {C}
18
Split Network vs Cluster Network
Data: (Kumar, 1998)
Split network Cluster network
19
Cluster Network vs Reticulate Network
Cluster network: “Hard-wired”: blue edges always on
– Canonical network, computationally easy
Reticulate network: “Soft-wired”: for any split, any blue edge can be on or off
– Minimum reticulate network, computationally hard
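
The difference between the two readings can be illustrated on a toy network. This sketch makes two assumptions that are not stated on the slide: the network is a dict from each node to its children, and "soft-wired" is modelled in the usual way by switching on exactly one in-edge per reticulate node. The network and all names are hypothetical.

```python
from itertools import product


def clusters(children, root, active=None):
    """Leaf sets below every node; `active` maps a reticulate node to the one
    parent whose edge is switched on (None means all edges are on)."""
    found = set()

    def below(node):
        kids = children.get(node, [])
        if not kids:
            leaves = frozenset([node])
        else:
            on = [k for k in kids if active is None or active.get(k, node) == node]
            leaves = frozenset().union(*(below(k) for k in on))
        if leaves:
            found.add(leaves)
        return leaves

    below(root)
    return found


def softwired_clusters(children, root):
    """Union of the clusters over all ways of keeping one in-edge per reticulate node."""
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    reticulate = [n for n, ps in parents.items() if len(ps) > 1]
    result = set()
    for choice in product(*(parents[r] for r in reticulate)):
        result |= clusters(children, root, dict(zip(reticulate, choice)))
    return result


# Toy network: r is a reticulate node with parents u and v.
network = {"root": ["p", "v"], "p": ["u", "e"], "u": ["a", "r"],
           "v": ["r", "c"], "r": ["b"]}
hard = clusters(network, "root")
soft = softwired_clusters(network, "root")
print(sorted("".join(sorted(c)) for c in soft - hard))
```

In this toy case the hard-wired reading is a single traversal, while the soft-wired reading enumerates one tree per combination of switched-on edges, which grows exponentially with the number of reticulate nodes.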
20
Overview
Dendroscope and large trees
Phylogenetic networks – cluster networks
Dendroscope 2 and phylogenetic networks
21
Dendroscope 2
Computation of different consensus trees and super trees
Computation of different consensus networks and super networks
Use “extended Newick” format to support cluster networks and reticulate networks
All features of Dendroscope 1 will also apply to networks
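
As an illustration of the consensus-tree computations mentioned above, here is a minimal sketch of strict and majority consensus on the cluster level. It assumes that all input trees are rooted, given as nested tuples, and on the same taxon set; it is not Dendroscope's implementation, and the function names are hypothetical.

```python
from collections import Counter


def clusters_of(tree):
    """Return the clusters (leaf sets below each node) of a nested-tuple tree."""
    found = set()

    def below(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(below(child) for child in node))
        found.add(leaves)
        return leaves

    below(tree)
    return found


def strict_consensus(trees):
    """Clusters that occur in every input tree."""
    return set.intersection(*(clusters_of(t) for t in trees))


def majority_consensus(trees):
    """Clusters that occur in more than half of the input trees."""
    counts = Counter(c for t in trees for c in clusters_of(t))
    return {c for c, n in counts.items() if 2 * n > len(trees)}


trees = [((("A", "B"), "C"), "D"),
         ((("A", "B"), "D"), "C"),
         ((("A", "C"), "B"), "D")]
strict = strict_consensus(trees)
majority = majority_consensus(trees)
print(sorted("".join(sorted(c)) for c in majority - strict))
```

Here the strict consensus keeps only the single taxa and the full taxon set, while the majority consensus additionally keeps {A,B} and {A,B,C}, each found in two of the three trees.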
22
Example: Five Fungal Trees
Five fungal trees (Pryor 2000, 2003):
ITS (two trees)
SSU (two trees)
Gpd (one tree)
Number of taxa: 29-46, total is 63
23
“Strict Consensus Tree”
24
“Majority Consensus Tree”
25
Consensus Super Network, >20% support
26
Super Network, All Splits
27
Summary
Dendroscope 1: new interactive tool for visualizing & editing phylogenetic trees
Cluster networks: new type of phylogenetic networks that are easy to compute and “look more like trees”
Dendroscope 2: will contain consensus methods and will read, write and draw cluster and reticulate networks.
Dendroscope 1 is freely available from:
www-ab.informatik.uni-tuebingen.de/software.dendroscope
28
Credits
Contributions to Dendroscope from:
– Tobias Dezulian, Markus Franz, Christian Rausch, Daniel Richter & Regula Rupp
Super network algorithm (Z-closure) joint work with:
– Tobias Dezulian, Tobias Klöpper and Mike Steel
Filtered super network joint work with:
– Mike Steel and Jim Whitfield