![Page 1: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/1.jpg)
Inferring Phylogeny using Permutation Patterns on Genomic Data
Md Enamul Karim (1), Laxmi Parida (2), Arun Lakhotia (1)
(1) University of Louisiana at Lafayette; (2) IBM T. J. Watson Research Center
![Page 2: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/2.jpg)
Phylogeny
Reconstruction of the evolutionary relationship of a collection of organisms, usually in the form of a tree.
![Page 3: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/3.jpg)
Phylogenetic data
• Behavioral, morphological, metabolic, etc.
• Molecular data: sequence data, gene-order data, etc.
Focus here: gene-order data
![Page 4: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/4.jpg)
Why gene order data?
• Low error rate.
• Rearrangements are rare evolutionary events and are unlikely to cause "silent" changes, so they can help infer relationships spanning millions of years.
![Page 5: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/5.jpg)
Genome rearrangements
Original gene order:
1 2 3 4 5 6 7 8 9 10
• Inversion
1 2 3 -8 -7 -6 -5 -4 9 10
• Transposition
1 2 3 9 4 5 6 7 8 10
• Inverted transposition
1 2 3 9 -8 -7 -6 -5 -4 10
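The three operations on this slide can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names and the half-open index convention are assumptions made for the example, not part of the original talk.

```python
def inversion(genome, i, j):
    """Reverse the segment genome[i:j] and flip the sign of each gene in it."""
    return genome[:i] + [-g for g in reversed(genome[i:j])] + genome[j:]


def transposition(genome, i, j, k):
    """Move the segment genome[i:j] so that it sits just before index k (k >= j)."""
    return genome[:i] + genome[j:k] + genome[i:j] + genome[k:]


def inverted_transposition(genome, i, j, k):
    """Move genome[i:j] to just before index k and insert it inverted."""
    segment = [-g for g in reversed(genome[i:j])]
    return genome[:i] + genome[j:k] + segment + genome[k:]


original = list(range(1, 11))           # 1 2 3 4 5 6 7 8 9 10
inv = inversion(original, 3, 8)         # genes 4..8 inverted in place
tra = transposition(original, 3, 8, 9)  # genes 4..8 moved after gene 9
itr = inverted_transposition(original, 3, 8, 9)
print(inv, tra, itr, sep="\n")
```

Each call acts on the segment 4..8 of the original order and yields the corresponding gene order shown on the slide.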
![Page 6: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/6.jpg)
Breakpoint distance
The breakpoint distance is the number of adjacencies present in one genome but not in the other.
1 2 3 4 5 6 7 8 9 10
1 –3 –2 4 5 9 6 7 8 10
For some datasets, a close-to-linear relationship between the number of breakpoints and the number of evolutionary events may exist.
Can be used for building phylogeny (Blanchette et al.).
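As a minimal sketch, the breakpoint distance between the two gene orders above can be computed as follows. It assumes linear genomes with equal gene content and no end caps, and treats the adjacency (a, b) as identical to (-b, -a), since both describe the same neighbouring genes read from opposite strands. The function names are illustrative.

```python
def adjacencies(genome):
    """Adjacencies of a signed linear genome; (a, b) and (-b, -a) are identified."""
    return {min((a, b), (-b, -a)) for a, b in zip(genome, genome[1:])}


def breakpoint_distance(g1, g2):
    """Number of adjacencies of g1 that do not occur in g2."""
    return len(adjacencies(g1) - adjacencies(g2))


g1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
g2 = [1, -3, -2, 4, 5, 9, 6, 7, 8, 10]
d = breakpoint_distance(g1, g2)
print(d)
```

For this pair, four adjacencies survive (2 3, 4 5, 6 7, 7 8) and the remaining five are breakpoints.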
![Page 7: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/7.jpg)
Limitations of breakpoint distance
• The number of breakpoints created by a given number of inversions may vary.
• Transpositions generally create more breakpoints than inversions.
• Computing the breakpoint phylogeny is NP-hard.
![Page 8: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/8.jpg)
MPBE (Maximum Parsimony on Binary Encoding)
• A heuristic for the breakpoint phylogeny (Cosner et al.).
• All ordered pairs of signed genes appearing consecutively are coded as binary features.
• Exponential time complexity, but much faster than BPAnalysis.
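The encoding step alone (not the parsimony search) can be sketched as below. This is an assumption-laden illustration rather than the MPBE implementation: the helper names are invented, the example genomes are hypothetical inputs, and (a, b) is identified with (-b, -a) as one possible convention.

```python
def adjacency_features(genome):
    """Consecutive signed gene pairs, with (a, b) and (-b, -a) treated as one feature."""
    return {min((a, b), (-b, -a)) for a, b in zip(genome, genome[1:])}


def binary_encoding(genomes):
    """Map each genome name to a 0/1 vector over all adjacencies seen in any genome."""
    feature_sets = {name: adjacency_features(g) for name, g in genomes.items()}
    features = sorted(set().union(*feature_sets.values()))
    matrix = {name: [int(f in fs) for f in features]
              for name, fs in feature_sets.items()}
    return features, matrix


# Hypothetical input: three small signed gene orders.
genomes = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "B": [1, 2, 3, -8, -7, -6, -5, -4, 9, 10],
    "C": [1, 8, -3, -2, -7, -6, -5, -4, 9, 10],
}
features, matrix = binary_encoding(genomes)
for name, row in matrix.items():
    print(name, row)
```

The resulting binary character matrix is the kind of input a maximum parsimony program would then search over.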
![Page 9: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/9.jpg)
Limitations
May fail to find feasible solutions to the breakpoint phylogeny problem.
![Page 10: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/10.jpg)
Observation: the closer the evolutionary histories of two genomes, the more permutations (at different granularities) they have in common.
1 2 3 4 5 6 7 8 9 10
1 2 3 –8 –7 –6 –5 –4 9 10
1 8 –3 –2 –7 –6 –5 –4 9 10
![Page 11: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/11.jpg)
Maximal pi-pattern (Eres et al.)
Matches permutations at different granularity.
Polynomial time complexity.
![Page 12: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/12.jpg)
pi-pattern
A pattern whose permutations occur at least k times.
Example:
For S = acbcab and k = 2
All pi-patterns are: ac, bc, abc, abcc
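The example can be checked with a brute-force enumeration. This is not the polynomial-time algorithm of Eres et al.; it is a simple sketch that assumes S = acbcab (as given on the following slides), represents a pattern by its sorted multiset of characters, and ignores single-character patterns (`min_len=2`) to match the list on the slide. All names are illustrative.

```python
from collections import defaultdict


def pi_pattern_occurrences(s, k, min_len=2):
    """Return {pattern: [(start, end), ...]} for patterns seen in at least k windows.

    A pattern is the sorted multiset of characters of a window, so every
    permutation of the same characters maps to the same key.
    """
    occ = defaultdict(list)
    n = len(s)
    for length in range(min_len, n + 1):
        for start in range(n - length + 1):
            key = "".join(sorted(s[start:start + length]))
            occ[key].append((start, start + length))
    return {p: locs for p, locs in occ.items() if len(locs) >= k}


patterns = pi_pattern_occurrences("acbcab", k=2)
print(sorted(patterns, key=lambda p: (len(p), p)))
```

For instance, abc is found three times here (as acb, bca and cab), while ab occurs only once and is therefore not a pi-pattern for k = 2.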
![Page 13: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/13.jpg)
Cover
P1 covers P2 => every occurrence of P1 contains an occurrence of P2, and every occurrence of P2 lies within an occurrence of P1.
Example: in S = acbcab, abc covers ac.
![Page 14: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/14.jpg)
Maximal pi-pattern
A pi-pattern which is not covered by any other pi-pattern.
Example: in S = acbcab
pi-patterns: ac, bc, abc, abcc
Maximal pi-patterns: abc, abcc
(abc is not covered by abcc)
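The cover relation and the maximality filter can be sketched by brute force on the same example. As before, this is an illustration under stated assumptions (S = acbcab, k = 2, patterns of length at least 2, occurrences as half-open index intervals), not the algorithm of Eres et al.; the function names are invented.

```python
from collections import defaultdict


def pi_pattern_occurrences(s, k, min_len=2):
    """Brute force: {sorted-multiset pattern: [(start, end), ...]} with >= k windows."""
    occ = defaultdict(list)
    for length in range(min_len, len(s) + 1):
        for start in range(len(s) - length + 1):
            occ["".join(sorted(s[start:start + length]))].append((start, start + length))
    return {p: locs for p, locs in occ.items() if len(locs) >= k}


def covers(occ1, occ2):
    """P1 covers P2: every P1 window contains a P2 window, and every P2 window lies in a P1 window."""
    def inside(inner, outer):
        return outer[0] <= inner[0] and inner[1] <= outer[1]
    every_p1_has_p2 = all(any(inside(o2, o1) for o2 in occ2) for o1 in occ1)
    every_p2_in_p1 = all(any(inside(o2, o1) for o1 in occ1) for o2 in occ2)
    return every_p1_has_p2 and every_p2_in_p1


def maximal_pi_patterns(s, k):
    """pi-patterns that are not covered by any other pi-pattern."""
    occ = pi_pattern_occurrences(s, k)
    return {p for p in occ
            if not any(q != p and covers(occ[q], occ[p]) for q in occ)}


occ = pi_pattern_occurrences("acbcab", 2)
maximal = maximal_pi_patterns("acbcab", 2)
print(sorted(maximal))
```

Here ac is covered by abc and bc is covered by abcc, while the occurrence cab of abc lies outside both occurrences of abcc, so abc stays maximal.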
![Page 15: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/15.jpg)
Results
![Page 16: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/16.jpg)
Phylogeny for simulated evolution on synthetic data
[Figure: phylogenetic tree with leaves a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p]
![Page 17: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/17.jpg)
12 genera of Campanulaceae and the outgroup tobacco
![Page 18: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/18.jpg)
Tree1: MPBE tree
![Page 19: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/19.jpg)
Tree2: Neighbor joining tree (using a few different distances)
[Figure: tree with leaves, in order: Tra, Sym, Cam, Ade, Wah, Mer, Leg, Asy, Tri, Cod, Cya, Pla, Tob]
![Page 20: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/20.jpg)
Tree3: Neighbor joining tree using permutation patterns
[Figure: tree with leaves, in order: Tra, Sym, Cam, Ade, Wah, Mer, Asy, Leg, Tri, Cod, Cya, Pla, Tob]
• 167 maximal pi-patterns (out of 10769 pi-patterns) used as binary features
• XOR distance measure
• A distance/similarity matrix is created from these and used to find the neighbor joining tree
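One plausible reading of the XOR distance is the number of binary features on which two taxa differ (the Hamming distance); that interpretation is an assumption here, as are the function names and the toy feature vectors, which are not the Campanulaceae data. A minimal sketch of the distance matrix step:

```python
def xor_distance(u, v):
    """Count the positions where two equal-length 0/1 feature vectors differ."""
    return sum(a ^ b for a, b in zip(u, v))


def distance_matrix(features):
    """features: {taxon: 0/1 list}. Returns (taxon names, square distance matrix)."""
    names = list(features)
    matrix = [[xor_distance(features[x], features[y]) for y in names] for x in names]
    return names, matrix


# Hypothetical presence/absence of five maximal pi-patterns in three taxa.
toy = {
    "X": [1, 1, 0, 1, 0],
    "Y": [1, 0, 0, 1, 1],
    "Z": [0, 0, 1, 0, 1],
}
names, dist = distance_matrix(toy)
for name, row in zip(names, dist):
    print(name, row)
```

The resulting symmetric matrix is what a neighbor joining implementation would take as input.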
![Page 21: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/21.jpg)
Tree3 vs Tree2
![Page 22: Inferring Phylogeny using Permutation Patterns on Genomic Data](https://reader035.vdocument.in/reader035/viewer/2022070412/568149ab550346895db6e821/html5/thumbnails/22.jpg)
Conclusion
• Permutation patterns may preserve more evolutionary information.
• Evolutionary events could be counted within permuted segments to develop a hybrid scheme.
• Current approaches remain unable to handle unequal gene content, which could be addressed using maximal pi-patterns.