
Post on 20-Dec-2015


THE PROBLEM OF RECONSTRUCTING K-ARTICULATED PHYLOGENETIC NETWORK

Supervisor: Dr. Yiu Siu Ming

Second Examiner: Professor Francis Y.L. Chin

Student: Vu Thi Quynh Hoa

CONTENTS

1. Introduction

Motivation

Related Work

Project Plan

2. Problem Definitions

3. Algorithms

1-articulated Network Algorithm

2-articulated Network Algorithm

INTRODUCTION – MOTIVATION

A phylogenetic network is a powerful way to model the evolutionary history of species, because it can represent articulation events

Level-x networks: the running time of all existing algorithms grows exponentially as x increases

A k-articulated network is a more biologically natural model that can capture complex scenarios of articulation events with a smaller value of k

E.g. a level-4 network vs. a 2-articulated network

RELATED WORK

The problem of constructing phylogenetic networks has been studied under many approaches using different input types:

Nakhleh et al. proposed a polynomial-time algorithm that constructs a level-1 network from two trees

Huynh et al. gave a polynomial-time algorithm that builds a galled network from a set of trees

Bryant and Moulton developed the NeighborNet method to construct a network from a distance matrix

Jansson, Nguyen and Sung gave an O(n^3)-time algorithm that constructs a galled network from a set of triplets

Extending to level-2 networks, Van Iersel et al. provided an O(n^8) algorithm

SCHEDULES – PROJECT PLAN

Objectives and target dates

Round 1

1. Reconstructing a restricted 1-articulated network from a set of binary phylogenetic trees: 30 Sep 2011

2. Reconstructing a restricted 2-articulated network from a set of binary phylogenetic trees: 15 Nov 2011

Round 2

3. Reconstructing a 1-articulated network from a distance matrix: 15 Feb 2012

4. Implementing one of the three problems: 31 Mar 2012

DEFINITIONS

Phylogenetic Tree

A rooted, unordered tree with distinctly labeled leaves representing each strain of the species

Phylogenetic Network

A rooted, directed acyclic graph in which:

One node has indegree 0 (the root), and all other nodes have indegree 1 or 2

All nodes with indegree 2 must have outdegree 1 (hybrid nodes)

All nodes with indegree 1 have outdegree 0 or 2

Nodes with outdegree 0 are leaves, which are distinctly labeled

Node s is called a split node of a hybrid node h if h can be reached from the children of s using two disjoint paths
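The degree conditions above can be checked mechanically. The following is a minimal sketch, assuming the network is given as a list of (parent, child) edges; the function name `classify_nodes` and the example network are illustrative and not taken from the slides. Acyclicity and distinct leaf labels are not checked.

```python
from collections import defaultdict

def classify_nodes(edges):
    """Label each node as root, hybrid, tree or leaf from its degrees.

    `edges` is a list of (parent, child) pairs (an assumed representation).
    Raises ValueError when a degree condition of the definition fails.
    A single-node network has no edges and cannot be expressed this way.
    """
    indeg, outdeg = defaultdict(int), defaultdict(int)
    nodes = set()
    for parent, child in edges:
        outdeg[parent] += 1
        indeg[child] += 1
        nodes.update((parent, child))

    kinds = {}
    for v in nodes:
        i, o = indeg[v], outdeg[v]
        if i == 0:
            kinds[v] = "root"
        elif i == 2 and o == 1:
            kinds[v] = "hybrid"   # indegree 2 forces outdegree 1
        elif i == 1 and o == 2:
            kinds[v] = "tree"     # ordinary internal node
        elif i == 1 and o == 0:
            kinds[v] = "leaf"
        else:
            raise ValueError(f"node {v!r}: indegree {i}, outdegree {o}")
    if sum(k == "root" for k in kinds.values()) != 1:
        raise ValueError("expected exactly one node with indegree 0")
    return kinds

# Hypothetical example: hybrid node h with parents p and q.
example_edges = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
                 ("q", "h"), ("q", "b"), ("h", "c")]
kinds = classify_nodes(example_edges)
```

In this example `h` is classified as a hybrid node, and `r` is a split node of `h` because `h` is reachable from both children of `r` along disjoint paths.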

PHYLOGENETIC NETWORK

DEFINITIONS

k-articulated network

A phylogenetic network in which every split node corresponds to at most k hybrid nodes

A level-k network is a k-articulated network

A k-articulated network can model a level-x network (x > k)

[Figure: example network labeled "Level-2 network" and "1-articulated network"]

DEFINITIONS

A network is non-skew if every path from a split node to a corresponding hybrid node has length ≥ 2

A network is safe if no sibling of a hybrid node is itself a hybrid node

A network is restricted if it is non-skew and safe
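As an illustration of the "safe" condition only, here is a minimal sketch, assuming an edge-list representation and taking "sibling" to mean a node that shares a parent with the hybrid node. The function name `is_safe` and both example networks are hypothetical. The non-skew condition needs split nodes and path lengths and is not covered here.

```python
from collections import defaultdict

def is_safe(edges):
    """Return True if no hybrid node has a sibling that is also a hybrid node.

    `edges` is a list of (parent, child) pairs (an assumed representation).
    A hybrid node is a node with two parents.
    """
    parents, children = defaultdict(list), defaultdict(list)
    for p, c in edges:
        parents[c].append(p)
        children[p].append(c)

    hybrids = {v for v, ps in parents.items() if len(ps) == 2}
    for h in hybrids:
        for p in parents[h]:
            for sibling in children[p]:
                if sibling != h and sibling in hybrids:
                    return False
    return True

# Hypothetical example 1: the sibling of hybrid h under p is leaf a.
safe_edges = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
              ("q", "h"), ("q", "b"), ("h", "c")]

# Hypothetical example 2: p and q both have two hybrid children h1 and h2.
unsafe_edges = [("r", "p"), ("r", "q"), ("p", "h1"), ("p", "h2"),
                ("q", "h1"), ("q", "h2"), ("h1", "a"), ("h2", "b")]
```

Calling `is_safe(safe_edges)` gives `True`, while `is_safe(unsafe_edges)` gives `False` because `h1` and `h2` are siblings.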

DEFINITIONS

Given a hybrid node h and its parents p and q, a cut on edge (p, h) means removing the edge (p, h) from the network, and then, for every node with indegree 1 and outdegree less than 2, contracting its outgoing edge

A network N is compatible with a phylogenetic tree T if N can be converted to T by performing a series of cuts one by one

[Figure: two drawings of a hybrid node h with its parents p and q, illustrating a cut]
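The cut operation can be sketched on an edge set as follows. This is a minimal illustration under stated assumptions: the network is a set of (parent, child) pairs, only nodes left with indegree 1 and outdegree 1 are contracted (a node with outdegree 0 is a leaf and has no outgoing edge to contract), and no parallel edges arise during contraction. The function name `cut` and the example network are hypothetical.

```python
def cut(edges, p, h):
    """Remove edge (p, h), then contract every node with indegree 1 and outdegree 1.

    `edges` is an iterable of (parent, child) pairs; a new set is returned.
    Because a set is used, the sketch assumes contraction never creates
    two copies of the same edge.
    """
    edges = set(edges)
    edges.remove((p, h))
    while True:
        indeg, outdeg = {}, {}
        for a, b in edges:
            outdeg[a] = outdeg.get(a, 0) + 1
            indeg[b] = indeg.get(b, 0) + 1
        # Pick any node that now lies on a simple path segment.
        v = next((x for x in outdeg
                  if indeg.get(x, 0) == 1 and outdeg[x] == 1), None)
        if v is None:
            return edges
        parent = next(a for a, b in edges if b == v)
        child = next(b for a, b in edges if a == v)
        # Contract v: replace parent -> v -> child by parent -> child.
        edges -= {(parent, v), (v, child)}
        edges.add((parent, child))

# Hypothetical example: hybrid node h with parents p and q.
example_edges = [("r", "p"), ("r", "q"), ("p", "a"), ("p", "h"),
                 ("q", "h"), ("q", "b"), ("h", "c")]
tree_after_cut_p = cut(example_edges, "p", "h")
tree_after_cut_q = cut(example_edges, "q", "h")
```

In this example, cutting (p, h) leaves the tree in which b and c are siblings, and cutting (q, h) leaves the tree in which a and c are siblings, so the example network is compatible with both trees.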

PROBLEM DEFINITION

Reconstructing a restricted k-articulated network (where k = 1, 2) from a set of binary trees

Given a set of binary phylogenetic trees Ti, i = 1, 2, …, k, with the same leaf label set, construct a restricted k-articulated network N (where k = 1, 2) with the minimum number of hybrid nodes that is compatible with each tree Ti

ALGORITHM

Divide and Conquer Technique

Dividing

Bipartition Tripartition Quadripartition

Conquering

?

1-ARTICULATED NETWORK ALGORITHM

Case 1: Each input tree is a single node – Base case

Case 2: Input tree set admits a leaf set bipartition

Case 3: Input tree set admits a leaf set tripartition

1-ARTICULATED NETWORK ALGORITHM

Case 1: Each input tree is a single node – Base case – O(1)

Return a network which is a single node with the same label

1-ARTICULATED NETWORK ALGORITHM

Case 2: Input tree set admits a leaf set bipartition – O(kn)

[Figure: input trees T1, T2, …, Tk; subnetworks N1 and N2 are combined under a root r into the network N]
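The slides do not spell out the bipartition test. One plausible reading, taken here as an assumption, is that the tree set admits a leaf set bipartition when every tree splits the leaves into the same two sets at its root. A minimal sketch under that reading, with trees given as nested 2-tuples of string labels (the representation and the function names are hypothetical), and without trying to match the O(kn) bound:

```python
def leaves(tree):
    """Leaf labels of a binary tree given as nested 2-tuples with string leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return leaves(left) | leaves(right)

def root_bipartition(tree):
    """The unordered pair of leaf sets below the two children of the root."""
    left, right = tree
    return frozenset([leaves(left), leaves(right)])

def common_bipartition(trees):
    """Return the bipartition shared by all trees at the root, or None.

    Assumed reading of "admits a leaf set bipartition"; single-node
    trees belong to the base case and yield None here.
    """
    if any(isinstance(t, str) for t in trees):
        return None
    first = root_bipartition(trees[0])
    if all(root_bipartition(t) == first for t in trees[1:]):
        return first
    return None

# Hypothetical input trees over the leaf set {a, b, c, d}.
t1 = (("a", "b"), ("c", "d"))
t2 = (("b", "a"), ("d", "c"))   # same root split as t1
t3 = (("a", "c"), ("b", "d"))   # different root split

shared = common_bipartition([t1, t2])
conflict = common_bipartition([t1, t3])
```

Here `shared` is the split {a, b} | {c, d}, so under this reading each side could be solved recursively and the two subnetworks joined under a new root, while `conflict` is `None` and a later case would have to apply.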

1-ARTICULATED NETWORK ALGORITHM

Case 3: Input tree set admits a leaf set tripartition – O(kn)

[Figure: input trees T1, T2, …, Tk and subnetworks N1, N2 and Nh, with node x in N1 and node y in N2]

It takes O(kn) to find nodes x in N1 and y in N2

2-ARTICULATED NETWORK ALGORITHM

Case 1: Each input tree is a single node – Base case

Case 2: Input tree set admits a leaf set bipartition

Case 3: Input tree set admits a leaf set tripartition

Case 4: Input tree set admits a leaf set quadripartition

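2-ARTICULATED NETWORK ALGORITHM

Case 4: Input tree set admits a leaf set quadripartition – O(kn)

[Figure: input trees T1, T2, …, Tk and subnetworks N1, N2 and Nh1 under a root r, with nodes x1 and x2 in N1 and nodes y1 and y2 in N2]

It takes O(kn) to find nodes x1 and x2 in N1 and y1 and y2 in N2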


TIME COMPLEXITY

Time complexity of the algorithms for reconstructing a restricted k-articulated network, in both cases k = 1, 2:

Each recursive step takes O(kn) running time to check whether the input tree set admits a leaf set bipartition or tripartition, and then to combine the subnetworks returned

The number of nodes in the restricted 1-articulated network is O(n)

Therefore, the total time complexity is O(kn^2)
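The bound can be written out as a short calculation. It assumes, as the slide suggests but does not state, that the recursion performs m = O(n) steps (at most one per node of the output network) and that each step costs at most c·k·n for some constant c; the symbols m and c are introduced here only for illustration.

```latex
T \;\le\; \sum_{i=1}^{m} c \, k \, n \;=\; m \cdot c \, k \, n,
\qquad m = O(n)
\;\;\Longrightarrow\;\;
T = O(n) \cdot O(kn) = O(kn^{2})
```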

THANK YOU!