another problem with variants on either side of p vs. np...
TRANSCRIPT
![Page 1: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/1.jpg)
Another problem with variants on either side
of P vs. NP divide
Leonid Chindelevitch
28 March 2019
Theory Seminar, Spring 2019, Simon Fraser University
![Page 2: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/2.jpg)
Background
![Page 3: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/3.jpg)
From mice to men through genome rearrangements
Mouse X-Chromosome
Human X-Chromosome 1
![Page 4: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/4.jpg)
Ancestral Reconstruction
M3
A B
M1
C D
M2
Input: Tree and genomes A,B,C ,D
Output: Ancestral genomes M1,M2,M3
2
![Page 5: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/5.jpg)
The Median of Three
A
B C
M
Input: Genomes A,B,C
Output: Genome M (the median, AKA the lowest common ancestor)
which minimizes
d(A,M) + d(B,M) + d(C ,M)
3
![Page 6: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/6.jpg)
Genome Elements
Adjacencies: {ah, bh}, {bt , ct}; telomeres: at , ch
4
![Page 7: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/7.jpg)
Genome Representations
atah btbh ctch
at ah bt bh ct ch
at 1 0 0 0 0 0
ah 0 0 0 1 0 0
bt 0 0 0 0 1 0
bh 0 1 0 0 0 0
ct 0 0 1 0 0 0
ch 0 0 0 0 0 1
This is a genome matrix.
Genome matrices can be represented by involutions: (ah bh)(bt ct).
5
![Page 8: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/8.jpg)
Genome Representations
atah btbh ctch
at ah bt bh ct ch
at 1 0 0 0 0 0
ah 0 0 0 1 0 0
bt 0 0 0 0 1 0
bh 0 1 0 0 0 0
ct 0 0 1 0 0 0
ch 0 0 0 0 0 1
This is a genome matrix.
Genome matrices can be represented by involutions: (ah bh)(bt ct).
5
![Page 9: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/9.jpg)
Genome Representations
atah btbh ctch
at ah bt bh ct ch
at 1 0 0 0 0 0
ah 0 0 0 1 0 0
bt 0 0 0 0 1 0
bh 0 1 0 0 0 0
ct 0 0 1 0 0 0
ch 0 0 0 0 0 1
This is a genome matrix.
Genome matrices can be represented by involutions: (ah bh)(bt ct).
5
![Page 10: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/10.jpg)
Properties of Genome Matrices
• binary matrices that satisfy A = AT = A−1
• even dimension n (but we can relax this assumption)
orthogonal
binary symmetric
6
![Page 11: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/11.jpg)
Rank Distance
The rank distance between two genome matrices is the rank of their
difference
d(A,B) = r(A− B)
Properties
• d(A,B) ≥ 0; d(A,B) = 0 if and only if A = B
• d(A,B) = d(B,A)
• d(A,C ) ≤ d(A,B) + d(B,C )
This is a metric on the space of genome matrices (and matrices in
general).
7
![Page 12: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/12.jpg)
Rank Distance
The rank distance between two genome matrices is the rank of their
difference
d(A,B) = r(A− B)
Properties
• d(A,B) ≥ 0; d(A,B) = 0 if and only if A = B
• d(A,B) = d(B,A)
• d(A,C ) ≤ d(A,B) + d(B,C )
This is a metric on the space of genome matrices (and matrices in
general).
7
![Page 13: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/13.jpg)
Equivalence of Rank Distance and the Cayley Distance
Lemma
Consider permutations matrices P,Q, with permutation representations
π, τ ∈ Sn, respectively. Then
d(P,Q) = ||τπ−1||
where || · || is the minimum number of cycles in a 2-cycle decomposition.
Remark
|| · || is a metric on permutations, also referred to as the Cayley distance.
Remark
We may as well work with involutions in Sn instead of genome matrices.
8
![Page 14: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/14.jpg)
Equivalence of Rank Distance and the Cayley Distance
Lemma
Consider permutations matrices P,Q, with permutation representations
π, τ ∈ Sn, respectively. Then
d(P,Q) = ||τπ−1||
where || · || is the minimum number of cycles in a 2-cycle decomposition.
Remark
|| · || is a metric on permutations, also referred to as the Cayley distance.
Remark
We may as well work with involutions in Sn instead of genome matrices.
8
![Page 15: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/15.jpg)
Equivalence of Rank Distance and the Cayley Distance
Lemma
Consider permutations matrices P,Q, with permutation representations
π, τ ∈ Sn, respectively. Then
d(P,Q) = ||τπ−1||
where || · || is the minimum number of cycles in a 2-cycle decomposition.
Remark
|| · || is a metric on permutations, also referred to as the Cayley distance.
Remark
We may as well work with involutions in Sn instead of genome matrices.
8
![Page 16: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/16.jpg)
The Rank Median Problem
A
B C
M
Input: Genome Matrices A,B,C
Output: Matrix M (the median) which minimizes
s(M;A,B,C ) = d(A,M) + d(B,M) + d(C ,M)
What kind of matrix should M be?
9
![Page 17: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/17.jpg)
The Rank Median Problem
A
B C
M
Input: Genome Matrices A,B,C
Output: Matrix M (the median) which minimizes
s(M;A,B,C ) = d(A,M) + d(B,M) + d(C ,M)
What kind of matrix should M be?
9
![Page 18: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/18.jpg)
Types of medians
[0 1 0 0
1 0 0 0
0 0 0 1
0 0 1 0
][0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0
][0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
Generalized median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
real valued matrices[− 1
212
12
12
12
− 12
12
12
12
12
− 12
12
12
12
12
− 12
]
Genome median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
genome matrices[0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
10
![Page 19: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/19.jpg)
Types of medians
[0 1 0 0
1 0 0 0
0 0 0 1
0 0 1 0
][0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0
][0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
Generalized median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
real valued matrices[− 1
212
12
12
12
− 12
12
12
12
12
− 12
12
12
12
12
− 12
]
Genome median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
genome matrices[0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
10
![Page 20: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/20.jpg)
Types of medians
[0 1 0 0
1 0 0 0
0 0 0 1
0 0 1 0
][0 0 0 1
0 0 1 0
0 1 0 0
1 0 0 0
][0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
Generalized median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
real valued matrices[− 1
212
12
12
12
− 12
12
12
12
12
− 12
12
12
12
12
− 12
]
Genome median: minimizer of d(A,M) + d(B,M) + d(C ,M) over all
genome matrices[0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
]
10
![Page 21: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/21.jpg)
P, NP, and NP-hard
11
![Page 22: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/22.jpg)
Problems with variants on both sides of the P vs. NP divide
Problem type P variant NP-hard variant
Cover Edge cover Vertex cover
Satisfiability 2-CNF-SAT 3-CNF-SAT
Graph mapping Graph isomorphism Subgraph isomorphism
Optimization Linear programming Integer programming
Median-of-three Generalized median Genome median
12
![Page 23: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/23.jpg)
NP-hard
NP-hard is the set of problems which are ”at least as hard as hardest
problems in NP”.
i.e. there is a polynomial time reduction from any problem L ∈ NP to
H ∈ NP-hard.
H problem solver
solution to I ′
instance I ′of Hfast L-to-H converter
fast H-to-L converter
instance I of L
solution to I
L problem solver
13
![Page 24: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/24.jpg)
APX-hard
APX is the set of problems which have polynomial time constant-factor
approximation algorithms.
APX-hard is the set of problems where there exists a polynomial time
approximation scheme reduction from any problem L ∈ APX to any
problem H ∈ APX-hard.
H approximater
ε-approximation of I ′
instance I ′of Hfast L-to-H converter
fast H-to-L converter
instance I of L
βε-approximation of I
L approximater
14
![Page 25: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/25.jpg)
Computational Complexity
15
![Page 26: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/26.jpg)
The Generalized Median problem
![Page 27: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/27.jpg)
Properties of the Median
• Lower Bound
d(M,A) + d(M,B) + d(M,C ) ≥ d(A,B) + d(B,C ) + d(C ,A)
2:= β
• At least one of the “corners” (input genomes) is a 43 approximation
of the median
• The lower bound is achieved if and only if
d(M,A) =d(A,B) + d(C ,A)− d(B,C )
2
and likewise for d(M,B) and d(M,C ).
• Not every A,B,C can achieve the lower bound β, e.g.:
A =
(−1 0
0 −1
),B =
(0 0
0 0
),C =
(1 0
0 1
).
16
![Page 28: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/28.jpg)
Approximating Matrix Medians
• Interesting Property
Theorem
For any three n× n matrices A, B, and C there is a median M satisfying:
for all vectors v ∈ Rn such that Av = Bv = Cv , we have Mv = Av .
• We define the invariant α := dim({v |Av = Bv = Cv}).
• For permutations, this can be computed in O(n) time via graph
union.
• Can we say the same if we have Av = Bv? We don’t know [yes for
orthogonal A,B,C ].
• However, we can act on this idea.
17
![Page 29: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/29.jpg)
Subspace decomposition
18
![Page 30: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/30.jpg)
Approximation Algorithm
V1
S1
P1
V2
S2
P2
V3
S3
P3
V4
S4
P4
V5
S5
P5
Subspaces
Orthonormal Bases
Projection Matrices
Median Candidates
MA = AP1 + AP2 + BP3 + AP4 + AP5
MB = BP1 + BP2 + BP3 + AP4 + BP5
MC = CP1 + BP2 + CP3 + CP4 + CP5
• 43 approximation factor for genome matrices
• if V5 = {0} then each candidate is a median (its score is β)
• In general, dim(V5) := 2δ, where δ := α + β − n is called the
“deficiency” of the triplet A,B,C .
19
![Page 31: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/31.jpg)
Some recently proven theorems
MI := AP1 + AP2 + BP3 + AP4 + P5
Theorem: MI is a median for any genomic inputs A,B,C .
Theorem: MI = I + ([AV1,AV2,BV3,AV4]− V14)(V T14V14)−1V T
14.
Corollary: It is possible to compute MI in O(nω) time, where ω is
Strassen’s exponent, in exact or floating-point arithmetic.
Theorem: The matrix MI is always symmetric and orthogonal for
genomic inputs A,B,C .
Special case: If A = I , then δ = 0, so MA = MB = MC = MI and each
one has a score of β.
20
![Page 32: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/32.jpg)
An even faster, O(n2), algorithm when δ = 0
Theorem: If a matrix M satisfies
d(A,M) + d(M,B) = d(A,B),
then there exists a projection matrix P such that
M = A + P(B − A).
• We can ignore the condition that P is a projection matrix.
• This yields the system
M = A + P(B − A) = B + Q(C − B) = C + R(A− C ),
from which we eliminate M and any redundancies.
• It splits into n linear systems with the same left-hand side.
• If A,B,C are permutations, δ = 0, each equation has 2 variables;
the Aspvall-Shiloach algorithm solves such systems in O(n) time.
21
![Page 33: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/33.jpg)
Rarity of the special case δ = 0
Theorem: The fraction of triples with δ = 0 goes to 0 as n→∞.
Proof: This follows directly from a result in analytic combinatorics.
0.5
0.6
0.7
0.8
4 6 8 10
n
Fra
ctio
n
Upper bounds and exact fractions of triples with deficiency 0
22
![Page 34: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/34.jpg)
Challenges with computing V5
Observation: A basis for the space im(A− B) ∩ im(B − C ) can be
computed in O(n log n).
Proof sketch: Let P,Q be the cycle partitions of A−1B,C−1B.
Create a multigraph G with vertices P ∪ Q and edges for all i ∈ [n].
Each parallel edge i , j gives a vector ei − ej ∈ im(A− B) ∩ im(B − C ).
Removing those to get a connected graph G ′ whose cycle basis B can be
computed from a spanning tree.
Since G ′ is bipartite, each cycle C ∈ B gives rise to the vector
χ(C+)− χ(C−) ∈ im(A− B) ∩ im(B − C ).
Difficulty: V5 = im(A− B) ∩ im(B − C ) ∩ im(C − A) may have no nice
basis; this construction fails when generalized to hypergraphs.
23
![Page 35: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/35.jpg)
A quartic algorithm for orthogonal matrices
while d(A,B) + d(B,C ) > d(A,C )
find u ∈ im(A− B) ∩ im(B − C );
B ←(I − 2
uuT
uTu
)B.
Remark
The transformation which multiplies a matrix on the left by I − 2 uuT
uT uis
called a Householder reflection, and is frequently used in numerical
analysis.
24
![Page 36: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/36.jpg)
Complexity of Rank Median Problems
PGeneralized Rank
Median Problem
NP
NP-hardGenome Rank
Median Problem
????
25
![Page 37: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/37.jpg)
The Genome Median Problem
![Page 38: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/38.jpg)
Genome Rank Median Problem is NP-hard and APX-hard
Theorem
The genome rank median problem of three genomes (GMP) is NP-hard
and APX-hard.
Proof.
By reduction from the breakpoint graph decomposition problem
(BGD).
26
![Page 39: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/39.jpg)
Breakpoint Graph Decomposition Problem
Objective (NP-hard): find a maximum alternating cycle decomposition Cof a balanced bicolored graph G .
Objective (APX-hard): find an alternating cycle decomposition C of a
balanced bicolored graph G which minimizes |B| − |C|.
x
v
w
s
y
t
z
27
![Page 40: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/40.jpg)
BGD Reduction Plan
GMP problem solver
solution to I ′
instance I ′of GMPfast BGD-to-GMP converter
fast GMP-to-BGD converter
instance G of BGD
max cycle decomp C
BGD problem solver
28
![Page 41: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/41.jpg)
Transforming BGD into GMP
x
v
w
s
y
t
z
−x
−v
−w
−s
−y
−t
−z
G −G
x
v ′
w
s
y
t
z
−x
−v ′
−w
−s
−y
−t
−z
v ′′ −v ′′
29
![Page 42: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/42.jpg)
Transforming BGD into GMP
x
v
w
s
y
t
z
−x
−v
−w
−s
−y
−t
−z
G −G
x
v ′
w
s
y
t
z
−x
−v ′
−w
−s
−y
−t
−z
v ′′ −v ′′
29
![Page 43: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/43.jpg)
Transforming BGD into GMP
x
v ′
w
s
y
t
z
−x
−v ′
−w
−s
−y
−t
−z
v ′′ −v ′′
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
30
![Page 44: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/44.jpg)
Aside: Canonical medians
A canonical median mc is a median of π1, π2, and π3 which contains
cycles only from π2.
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
mc = (v ′ v ′′)
Lemma
Medians of π1, π2, π3 can be transformed into canonical medians in
polynomial time.
Lemma
Canonical medians are in bijection to maximum cycle decompositions of
G .
31
![Page 45: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/45.jpg)
Aside: Canonical medians
mc = m(−m)
x
v
w
s
y
t
z
−x
−v
−w
−s
−y
−t
−z
G −G
m ⇔ max cycle decomp C −m ⇔ max cycle decomp -C
32
![Page 46: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/46.jpg)
Transforming BGD into GMP
Γ = (v ′ − v ′)(v ′′ − v ′′)(s − s)(t − t)(w −w)(x − x)(z − z)(y − y).
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
Γ
π1Γ = Γ
π2Γ = (v ′ − v ′′)(−v ′ v ′′)(s − s)(t − t)(w − w)(x − x)(z − z)(y − y)
π3Γ = (v ′ − x)(x − s) . . .
π1Γ, π2Γ, π3Γ are involutions, i.e. they are an instance of GMP, with
genome rank median m′Γ.
33
![Page 47: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/47.jpg)
Transforming BGD into GMP
Γ = (v ′ − v ′)(v ′′ − v ′′)(s − s)(t − t)(w −w)(x − x)(z − z)(y − y).
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
Γ
π1Γ = Γ
π2Γ = (v ′ − v ′′)(−v ′ v ′′)(s − s)(t − t)(w − w)(x − x)(z − z)(y − y)
π3Γ = (v ′ − x)(x − s) . . .
π1Γ, π2Γ, π3Γ are involutions, i.e. they are an instance of GMP, with
genome rank median m′Γ.
33
![Page 48: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/48.jpg)
Transforming BGD into GMP
Γ = (v ′ − v ′)(v ′′ − v ′′)(s − s)(t − t)(w −w)(x − x)(z − z)(y − y).
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
Γ
π1Γ = Γ
π2Γ = (v ′ − v ′′)(−v ′ v ′′)(s − s)(t − t)(w − w)(x − x)(z − z)(y − y)
π3Γ = (v ′ − x)(x − s) . . .
π1Γ, π2Γ, π3Γ are involutions, i.e. they are an instance of GMP, with
genome rank median m′Γ.
33
![Page 49: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/49.jpg)
Transforming BGD into GMP
Γ = (v ′ − v ′)(v ′′ − v ′′)(s − s)(t − t)(w −w)(x − x)(z − z)(y − y).
π1 = id
π2 = (v ′ v ′′)(−v ′ − v ′′)
π3 = (v ′ x s w)(t y v ′′ z)(−w − s − x − v ′)(−z − v ′′ − y − t)
Γ
π1Γ = Γ
π2Γ = (v ′ − v ′′)(−v ′ v ′′)(s − s)(t − t)(w − w)(x − x)(z − z)(y − y)
π3Γ = (v ′ − x)(x − s) . . .
π1Γ, π2Γ, π3Γ are involutions, i.e. they are an instance of GMP, with
genome rank median m′Γ.
33
![Page 50: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/50.jpg)
Transforming BGD into GMP
Proposition
The rank distance is right-multiplication invariant; that is, for
σ, π, τ ∈ Sn,
d(σ, π) = d(στ, πτ)
Corollary
s(m′Γ;π1Γ, π2Γ, π3Γ) = s(m′;π1, π2, π3)
Corollary
m′Γ is a genome median of π1Γ, π2Γ, π3Γ if and only if m′ is a
permutation median of π1, π2, π3.
34
![Page 51: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/51.jpg)
Transforming BGD into GMP
Proposition
The rank distance is right-multiplication invariant; that is, for
σ, π, τ ∈ Sn,
d(σ, π) = d(στ, πτ)
Corollary
s(m′Γ;π1Γ, π2Γ, π3Γ) = s(m′;π1, π2, π3)
Corollary
m′Γ is a genome median of π1Γ, π2Γ, π3Γ if and only if m′ is a
permutation median of π1, π2, π3.
34
![Page 52: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/52.jpg)
Transforming BGD into GMP
Proposition
The rank distance is right-multiplication invariant; that is, for
σ, π, τ ∈ Sn,
d(σ, π) = d(στ, πτ)
Corollary
s(m′Γ;π1Γ, π2Γ, π3Γ) = s(m′;π1, π2, π3)
Corollary
m′Γ is a genome median of π1Γ, π2Γ, π3Γ if and only if m′ is a
permutation median of π1, π2, π3.
34
![Page 53: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/53.jpg)
NP-hardness proof sketch
Theorem
The genome rank median problem of three genomes is NP-hard.
Proof.
instance G of BGD
BGD problem solver
GMP problem solver
m′Γ
π1Γ, π2Γ, π3Γ
Γ
Γ
m′
mc = m(−m)
mMax cycle decomp. C
H = G ∪ −G
π1, π2, π3
Canonical Machine
35
![Page 54: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/54.jpg)
NP-hardness proof sketch
Theorem
The genome rank median problem of three genomes is NP-hard.
Proof.
instance G of BGD
BGD problem solver
GMP problem solver
m′Γ
π1Γ, π2Γ, π3Γ
Γ
Γ
m′
mc = m(−m)
mMax cycle decomp. C
H = G ∪ −G
π1, π2, π3
Canonical Machine
35
![Page 55: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/55.jpg)
APX-hardness proof sketch
Theorem
The genome rank median problem of three genomes is APX-hard.
Proof.
instance G of BGD
BGD approximater
GMP ε-approximater
m′Γ
π1Γ, π2Γ, π3Γ
Γ
Γ
m′
mc = m(−m)
m8ε-approximation C
H = G ∪ −G
π1, π2, π3
Canonical Machine
36
![Page 56: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/56.jpg)
APX-hardness proof sketch
Theorem
The genome rank median problem of three genomes is APX-hard.
Proof.
instance G of BGD
BGD approximater
GMP ε-approximater
m′Γ
π1Γ, π2Γ, π3Γ
Γ
Γ
m′
mc = m(−m)
m8ε-approximation C
H = G ∪ −G
π1, π2, π3
Canonical Machine
36
![Page 57: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/57.jpg)
Conclusion and open problems
• We have a general O(nω+1) algorithm for orthogonal matrices.
• We have a specialized O(nω) algorithm for symmetric orthogonal
matrices.
• We have a O(n2) algorithm for permutations with δ = 0.
• What properties of input matrices are inherited by medians?
• Partial answer: we know that not all generalized medians are
symmetric or orthogonal!
• Can we use convex optimization to find better approximations?
What is the best possible ratio for approximating the genome
median problem?
• Is there a fast exponential or sub-exponential algorithm for solving
this problem?
37
![Page 59: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/59.jpg)
Acknowledgments
Joao Meidanis
Cedric Chauve
Pedro Feijao
Sean La
39
![Page 60: Another problem with variants on either side of P vs. NP divideskoroth/cstheorysem/files/slides/03_28_2019/... · 2019. 9. 20. · Problems with variants on both sides of the P vs](https://reader033.vdocument.in/reader033/viewer/2022052103/603de5fc507a423389136d9e/html5/thumbnails/60.jpg)
References
• J. P. Pereira Zanetti, P. Biller, J. Meidanis. Median Approximations
for Genomes Modeled as Matrices. Bulletin of Math Biology, 78(4),
2016.
• L. Chindelevitch and J. Meidanis. On the Rank-Distance Median of
3 Permutations. Proc. 15th RECOMB Comparative Genomics
Satellite Workshop. LNCS, vol. 10562, pp. 256-276. Springer,
Heidelberg (2017). Journal version in BMC Bioinformatics.
• L. Chindelevitch, S. La and J. Meidanis. A cubic algorithm for the
generalized rank median of three genomes. Proc. 16th RECOMB
Comparative Genomics Satellite Workshop. LNCS, vol. 11183, pp.
3-27 (2018). Journal version in BMC Algorithms for Molecular
Biology.
• R. Sarkis, S. La, P. Feijao, L. Chindelevitch, H. Hatami. Computing
the Cayley median for permutations and the rank median for
genomes is NP-hard. [Submitted to WABI 2019]
40