Post on 20-Dec-2015
THE PROBLEM OF RECONSTRUCTING K-ARTICULATED PHYLOGENETIC NETWORK
Supervisor: Dr. Yiu Siu Ming
Second Examiner: Professor Francis Y.L. Chin
Student: Vu Thi Quynh Hoa
CONTENTS
1. Introduction
Motivation
Related Work
Project Plan
2. Problem Definitions
3. Algorithms
1-articulated Network Algorithm
2-articulated Network Algorithm
INTRODUCTION – MOTIVATION
Phylogenetic networks are a powerful way to model the evolutionary history of species because they can represent articulation events
Level-x networks: the running time of all existing algorithms grows exponentially as x increases
A k-articulated network is a more biologically natural model that can capture complex scenarios of articulation events with a smaller value of k
E.g. level-4 network vs. 2-articulated network
RELATED WORK
The problem of constructing phylogenetic networks has been studied under many approaches using different input types
Nakhleh et al. proposed a polynomial-time algorithm that constructs a level-1 network from two trees
Huynh et al. gave a polynomial-time algorithm that builds a galled network from a set of trees
Bryant and Moulton developed the NeighborNet method to construct a network from a distance matrix
Jansson, Nguyen and Sung gave an O(n^3) algorithm that constructs a galled network from a set of triplets
Extending to level-2 networks, Van Iersel et al. provided an O(n^8) algorithm
SCHEDULES – PROJECT PLAN
Objectives and deadlines
Round 1
1. Reconstructing a restricted 1-articulated network from a set of binary phylogenetic trees: 30 Sep 2011
2. Reconstructing a restricted 2-articulated network from a set of binary phylogenetic trees: 15 Nov 2011
Round 2
3. Reconstructing a 1-articulated network from a distance matrix: 15 Feb 2012
4. Implementation of one of the three problems: 31 Mar 2012
DEFINITIONS
Phylogenetic Tree
A rooted, unordered tree with distinctly labeled leaves representing
each strain of the species
Phylogenetic Network
A rooted, directed acyclic graph in which:
One node has indegree 0 (the root), and all other nodes have indegree 1 or 2
All nodes with indegree 2 must have outdegree 1 (hybrid nodes)
All other nodes with indegree 1 have outdegree 0 or 2
Nodes with outdegree 0 are leaves, which are distinctly labeled
Node s is called a split node of a hybrid node h if h can be reached from s by two disjoint paths, one through each child of s
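The degree conditions above can be checked mechanically. The following is a minimal sketch, assuming the network is given as a list of directed (parent, child) edges with string node names; the function name `check_network` and this representation are illustrative choices, not part of the original slides. It checks only the degree conditions and does not test acyclicity or leaf labeling.

```python
from collections import defaultdict

def check_network(edges):
    """Return a list of degree-condition violations (empty if there are none)."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    nodes = set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        nodes.update((u, v))

    problems = []
    roots = [v for v in nodes if indeg[v] == 0]
    if len(roots) != 1:
        problems.append(f"expected exactly one root, found {len(roots)}")
    for v in sorted(nodes):
        if indeg[v] > 2:
            problems.append(f"{v}: indegree {indeg[v]} exceeds 2")
        elif indeg[v] == 2 and outdeg[v] != 1:
            problems.append(f"{v}: hybrid node must have outdegree 1")
        elif indeg[v] == 1 and outdeg[v] not in (0, 2):
            problems.append(f"{v}: indegree-1 node must have outdegree 0 or 2")
    return problems

# Hypothetical example: root r, hybrid node h with parents p and q, leaves a, b, c.
example = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
           ("q", "h"), ("q", "b"), ("h", "c")]
print(check_network(example))
```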
DEFINITIONS
k-articulated network
A phylogenetic network in which every split node corresponds to at most k hybrid nodes
A level-k network is a k-articulated network
A k-articulated network can model a level-x network (x > k)
[Figure: example labeled "Level-2 network" and "1-articulated network"]
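To make the definition concrete, here is a brute-force sketch that finds the split nodes of each hybrid node and reports the smallest k for which a small network is k-articulated. It assumes that "disjoint paths" means two paths, one starting at each child of s, that share no node other than h, and that a split node "corresponds to" every hybrid node it is a split node of. Both readings, the edge-list representation, and the function names are assumptions. Path enumeration is exponential in the worst case, so this is only suitable for toy inputs.

```python
from collections import defaultdict

def paths(children, src, dst):
    """Yield every directed path from src to dst as a list of nodes."""
    if src == dst:
        yield [src]
        return
    for c in children.get(src, ()):
        for rest in paths(children, c, dst):
            yield [src] + rest

def split_nodes(edges):
    """Map each hybrid node h to the set of its split nodes."""
    children, indeg = {}, defaultdict(int)
    for u, v in edges:
        children.setdefault(u, []).append(v)
        indeg[v] += 1
    hybrids = [v for v, d in indeg.items() if d == 2]
    result = {h: set() for h in hybrids}
    for s, kids in children.items():
        if len(kids) != 2:
            continue
        a, b = kids
        for h in hybrids:
            # s is a split node of h if some pair of paths meets only at h.
            if any(set(pa) & set(pb) == {h}
                   for pa in paths(children, a, h)
                   for pb in paths(children, b, h)):
                result[h].add(s)
    return result

def articulation_number(edges):
    """Largest number of hybrid nodes that share one split node (0 for a tree)."""
    per_split = defaultdict(set)
    for h, splits in split_nodes(edges).items():
        for s in splits:
            per_split[s].add(h)
    return max((len(hs) for hs in per_split.values()), default=0)

# Hypothetical example: one hybrid node h whose only split node is the root r.
example = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
           ("q", "h"), ("q", "b"), ("h", "c")]
print(split_nodes(example), articulation_number(example))
```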
DEFINITIONS
A network is non-skew if all paths from any split node to its hybrid node have a length ≥ 2
A network is safe if the siblings of all hybrid nodes are not hybrid nodes
A network is restricted if it is non-skew and safe
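The safe condition is the easier of the two to test directly. A minimal sketch, assuming the same hypothetical edge-list representation as a list of (parent, child) pairs and taking "siblings" to mean nodes that share a parent: a network is unsafe exactly when some node has two hybrid children. The non-skew condition is not checked here because it requires the split nodes to be known first.

```python
from collections import defaultdict

def is_safe(edges):
    """True if no hybrid node has a sibling that is also a hybrid node."""
    indeg = defaultdict(int)
    children = defaultdict(list)
    for u, v in edges:
        indeg[v] += 1
        children[u].append(v)
    hybrids = {v for v, d in indeg.items() if d == 2}
    # Two hybrid children of the same parent are hybrid siblings.
    return all(sum(c in hybrids for c in kids) <= 1
               for kids in children.values())

# Hypothetical example: h1 and h2 are both children of p, so the network is unsafe.
unsafe = [("r", "p"), ("r", "q"), ("p", "h1"), ("p", "h2"),
          ("q", "h1"), ("q", "h2"), ("h1", "a"), ("h2", "b")]
print(is_safe(unsafe))
```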
DEFINITIONS
Given a hybrid node h and its parents p and q, a cut on edge (p, h)
means removing the edge (p, h) from the network, and then for
every node with indegree 1 and outdegree less than 2, contracting
its outgoing edge
A network N is compatible with phylogenetic tree T if N can be
converted to T by performing a series of cuts one by one.
[Figure: a hybrid node h with parents p and q, shown before and after a cut]
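The cut operation and the compatibility test can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: networks and trees are sets of (parent, child) edges without parallel edges, contraction is applied to nodes left with indegree 1 and outdegree exactly 1, and two trees are compared by their clusters (the leaf set below each internal node). The names `cut`, `clusters` and `is_compatible` are hypothetical. The search tries both parent edges of every hybrid node, so it is exponential in the number of hybrid nodes.

```python
def cut(edges, p, h):
    """Remove edge (p, h), then contract nodes left with indegree 1 and outdegree 1."""
    edges = set(edges) - {(p, h)}
    while True:
        for v in {u for u, _ in edges}:
            parents = [a for a, b in edges if b == v]
            kids = [b for a, b in edges if a == v]
            if len(parents) == 1 and len(kids) == 1:
                # Bypass v: connect its parent directly to its child.
                edges -= {(parents[0], v), (v, kids[0])}
                edges.add((parents[0], kids[0]))
                break
        else:
            return edges

def clusters(edges):
    """Set of leaf sets, one per internal node of a rooted tree."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)

    def leaves(v):
        if v not in children:
            return frozenset([v])
        return frozenset().union(*(leaves(c) for c in children[v]))

    return {leaves(v) for v in children}

def is_compatible(edges, tree_edges):
    """True if some series of cuts turns the network into the given tree."""
    edges = set(edges)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)
    hybrids = [v for v, ps in parents.items() if len(ps) == 2]
    if not hybrids:
        return clusters(edges) == clusters(tree_edges)
    h = hybrids[0]
    return any(is_compatible(cut(edges, p, h), tree_edges) for p in parents[h])

# Hypothetical example: cutting (p, h) leaves the tree (a, (b, c)).
network = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
           ("q", "h"), ("q", "b"), ("h", "c")]
tree = [("t", "a"), ("t", "x"), ("x", "b"), ("x", "c")]
print(sorted(cut(network, "p", "h")), is_compatible(network, tree))
```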
PROBLEM DEFINITION
Reconstructing a restricted k-articulated network (where k = 1, 2) from a set of binary trees
Given a set of binary phylogenetic trees Ti, i = 1, 2, …, k, with the same leaf label set, construct a restricted k-articulated network N (where k = 1, 2) that is compatible with each tree Ti and has the minimum number of hybrid nodes
ALGORITHM
Divide and Conquer Technique
Dividing: bipartition, tripartition, quadripartition
Conquering: ?
1-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case
Case 2: Input tree set admits a leaf set bipartition
Case 3: Input tree set admits a leaf set tripartition
1-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case – O(1)
Return a network which is a single node with the same label
1-ARTICULATED NETWORK ALGORITHM
Case 2: Input tree set admits a leaf set bipartition – O(kn)
[Figure: input trees T1, T2, …, Tk; combination of subnetworks N1 and N2 under a root r to form the network N]
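The bipartition test in Case 2 can be sketched as below. The slides do not define "admits a leaf set bipartition", so this assumes it means that the two subtrees below the root split the leaves into the same two sets in every input tree. Trees are written as nested 2-tuples with string leaves; the representation and the function names are illustrative. Each tree is traversed once, which is in line with the O(kn) bound quoted for this case.

```python
def leaf_set(tree):
    """Leaves below a tree given as nested tuples with string leaves."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def root_bipartition(tree):
    """The unordered pair of leaf sets below the root, or None for a single node."""
    if not isinstance(tree, tuple):
        return None
    left, right = tree
    return frozenset([leaf_set(left), leaf_set(right)])

def common_bipartition(trees):
    """Return the shared root bipartition of all trees, or None if they disagree."""
    parts = {root_bipartition(t) for t in trees}
    if len(parts) == 1:
        return parts.pop()
    return None

# Hypothetical example: both trees split the leaves into {a, b, c} and {d, e}.
t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = (("a", ("b", "c")), ("e", "d"))
print(common_bipartition([t1, t2]))
```

When a common bipartition (A, B) exists, the recursion would continue on the input trees restricted to A and to B, and the two resulting subnetworks would be joined under a new root.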
1-ARTICULATED NETWORK ALGORITHM
Case 3: Input tree set admits a leaf set tripartition – O(kn)
[Figure: input trees T1, T2, …, Tk; subnetworks N1, N2 and Nh, with node x in N1 and node y in N2]
It takes O(kn) to find nodes x in N1 and y in N2
2-ARTICULATED NETWORK ALGORITHM
Case 1: Each input tree is a single node – Base case
Case 2: Input tree set admits a leaf set bipartition
Case 3: Input tree set admits a leaf set tripartition
Case 4: Input tree set admits a leaf set quadripartition
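2-ARTICULATED NETWORK ALGORITHM
Case 4: Input tree set admits a leaf set quadripartition – O(kn)
[Figure: input trees T1, T2, …, Tk; subnetworks N1 and N2 under root r, with hybrid subnetworks (labeled Nh1) attached at nodes x1, y1 and x2, y2]
It takes O(kn) to find nodes x1 and x2 in N1 and y1 and y2 in N2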
TIME COMPLEXITY
Time complexity of the algorithms for reconstructing a restricted k-articulated network, for both k = 1 and k = 2:
Each recursive step takes O(kn) running time to check whether the input tree set admits a leaf set bipartition or tripartition, and then to combine the subnetworks returned
The number of nodes in the restricted 1-articulated network is O(n), so the recursion performs O(n) steps
Therefore, the total time complexity is O(kn^2)
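The bound can also be read off a recurrence. The sketch below covers only the bipartition case and assumes that each step splits the n leaves into two non-empty parts of sizes n_1 and n_2; the symbol T(n) for the running time on n leaves is introduced here for illustration. The tripartition and quadripartition cases add more subproblems per step but follow the same pattern.

```latex
\begin{align*}
T(1) &= O(1) \\
T(n) &= T(n_1) + T(n_2) + O(kn), \qquad n_1 + n_2 = n,\quad n_1, n_2 \ge 1 \\
T(n) &\le \sum_{i=1}^{n} O(ki) = O(kn^2) \qquad \text{(worst case: } n_1 = n - 1,\ n_2 = 1\text{)}
\end{align*}
```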