Longest Increasing Subsequences in Windows
Based on Canonical Antichain Partition
Erdong Chen
(Joint work with Linji Yang & Hao Yuan)
Shanghai Jiao Tong Univ.
Outline
– Problem Definition
– Canonical Antichain Partition
– Sweep Algorithm
– Complexity Analysis
– Conclusion
Longest Increasing Subsequence (LIS)
Input sequence: 6 9 8 2 3 5 1 4 7
All longest increasing subsequences:
2 3 4 7
2 3 5 7
LIS in a Window
Sequence: 6 9 8 2 3 5 1 4 7
A window: 6 9 8
The length of the LIS within the window is 2 (for example, 6 9).
Longest Increasing Subsequences in a Set of Variable-size Windows (LISSET)
Sequence: 6 9 8 2 3 5 1 4 7
Window 6 9 8: LIS 6 9, length of LIS = 2
Window 9 8 2 3 5: LIS 2 3 5, length of LIS = 3
Window 3 5 1 4: LIS 3 5, length of LIS = 2
Window 2 3 5 1 4 7: LIS 2 3 5 7, length of LIS = 4
OUTPUT = 2 + 3 + 2 + 4 = 11
Related Works
Knuth proposed an O(n log n) algorithm for the LIS problem.
Fredman proved an Ω(n log n) lower bound under the decision tree model.
An O(n log log n) algorithm is possible on a permutation by using a van Emde Boas tree.
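The O(n log n) bound is commonly reached with a binary search over the smallest possible tail values. The following is a minimal sketch of that standard technique, not necessarily the formulation given by Knuth; the function name `lis_length` is an assumption made for illustration.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of a longest strictly increasing subsequence, in O(n log n)."""
    # tails[k] holds the smallest last value of any increasing
    # subsequence of length k + 1 found so far.
    tails = []
    for value in seq:
        pos = bisect_left(tails, value)
        if pos == len(tails):
            tails.append(value)   # value extends the longest subsequence
        else:
            tails[pos] = value    # value gives a smaller tail for length pos + 1
    return len(tails)


print(lis_length([6, 9, 8, 2, 3, 5, 1, 4, 7]))  # 4
```

On the example sequence from the problem definition this returns 4, the length of 2 3 4 7 and 2 3 5 7.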
Related Works (Cont.)
Longest Increasing Subsequences in Sliding Windows (the LISW problem), by Michael H. Albert et al.; time complexity O(OUTPUT + n log log n).
We call it Longest Increasing Subsequences in Fixed-size Windows.
LISW Problem
Sequence: 6 9 8 2 3 5 (n = 6), window size w = 3, so there are n+w-1 = 8 windows.
Window    LIS      Length of LIS
6         6        1
6 9       6 9      2
6 9 8     6 9      2
9 8 2     9        1
8 2 3     2 3      2
2 3 5     2 3 5    3
3 5       3 5      2
5         5        1
OUTPUT = 1 + 2 + 2 + 1 + 2 + 3 + 2 + 1 = 14
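The window count and the OUTPUT value of this example can be checked with a brute-force sketch. It enumerates the n+w-1 windows directly and is not the algorithm of the talk; the helper names `lis_length` and `sliding_windows` are assumptions.

```python
def lis_length(window):
    """Quadratic DP: best[i] is the LIS length ending at window[i]."""
    best = []
    for i, v in enumerate(window):
        best.append(1 + max((best[j] for j in range(i) if window[j] < v), default=0))
    return max(best, default=0)


def sliding_windows(seq, w):
    """Yield the n + w - 1 windows, including the partial ones at both ends."""
    n = len(seq)
    for right in range(1, n + w):      # exclusive right end before clipping
        left = max(0, right - w)
        yield seq[left:min(right, n)]


windows = list(sliding_windows([6, 9, 8, 2, 3, 5], 3))
lengths = [lis_length(win) for win in windows]
print(len(windows))   # 8
print(lengths)        # [1, 2, 2, 1, 2, 3, 2, 1]
print(sum(lengths))   # 14
```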
Our Contribution
An algorithm for the generalized problem, LISSET.
For the special case of the LISW problem, our algorithm runs in O(OUTPUT) time.
The best previous result for LISW is O(OUTPUT + n log log n).
Canonical Antichain Partition
The sequence: 6, 9, 8, 2, 3, 5, 1, 4, 7
[Figure: the sequence plotted as points pi = (i, πi) on a grid from 0 to 10.]
p1(1,6) p2(2,9) p3(3,8) p4(4,2) p5(5,3) p6(6,5) p7(7,1) p8(8,4) p9(9,7)
Dominance Order
[Figure: the points p1 to p9 on the same grid, with two shaded regions: the points that dominate p1 and the points that dominate p4.]
$a < b$ iff $x_a < x_b$ and $y_a < y_b$
Height of points by Dominance Order
[Figure, shown in three steps: the points p1 to p9 on the same grid, labeled first with Height = 1, then with Height = 2, then with Height = 3 and Height = 4.]
The height of a point is the length of the longest chain that has the point as its maximum element.
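A small sketch can make the height labels concrete: it computes each point's height as the length of the longest chain ending at that point, then groups points of equal height. This is a quadratic illustration under that assumed definition, not the data structure used in the talk; `heights` and `antichain_partition` are hypothetical names.

```python
def heights(seq):
    """heights[i] = length of the longest increasing subsequence ending at seq[i]."""
    h = []
    for i, v in enumerate(seq):
        # A point dominates every earlier point with a smaller value.
        h.append(1 + max((h[j] for j in range(i) if seq[j] < v), default=0))
    return h


def antichain_partition(seq):
    """Group the points p1, p2, ... by height; layer k plays the role of L(k)."""
    layers = {}
    for i, k in enumerate(heights(seq), start=1):
        layers.setdefault(k, []).append(f"p{i}")
    return layers


seq = [6, 9, 8, 2, 3, 5, 1, 4, 7]
print(heights(seq))              # [1, 2, 2, 1, 2, 3, 1, 3, 4]
print(antichain_partition(seq))
# {1: ['p1', 'p4', 'p7'], 2: ['p2', 'p3', 'p5'], 3: ['p6', 'p8'], 4: ['p9']}
```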
Antichain and Chain
[Figure: the points grouped into the antichains L(1), L(2), L(3), L(4).]
L(1): p1 = HEAD(1), p4, p7 = TAIL(1)
L(2): p2 = HEAD(2), p3, p5 = TAIL(2)
L(3): p6 = HEAD(3), p8 = TAIL(3)
L(4): p9 = HEAD(4) = TAIL(4)
Antichain: $x_i < x_{i+1}$ and $y_i \ge y_{i+1}$
Antichain and Chain
[Figure: the same partition L(1), L(2), L(3), L(4), with a chain highlighted and its min element and max element marked.]
Chain: $x_i < x_{i+1}$ and $y_i < y_{i+1}$
The longest chain <p4, p5, p6, p9> corresponds to the LIS <2, 3, 5, 7>.
The sequence: 6, 9, 8, 2, 3, 5, 1, 4, 7
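To illustrate how a longest chain can be read off the heights, the sketch below starts at a point of maximum height and repeatedly steps to a dominated point one layer lower. It scans linearly at each step, so it does not achieve the output-sensitive cost of the talk's QUERY operation; `longest_chain` is a hypothetical name and the tie-breaking rule (leftmost candidate) is an assumption.

```python
def heights(seq):
    """heights[i] = length of the longest increasing subsequence ending at seq[i]."""
    h = []
    for i, v in enumerate(seq):
        h.append(1 + max((h[j] for j in range(i) if seq[j] < v), default=0))
    return h


def longest_chain(seq):
    """Return 0-based indices of one longest chain, from min to max element."""
    if not seq:
        return []
    h = heights(seq)
    cur = max(range(len(seq)), key=lambda i: h[i])  # first point of maximum height
    chain = [cur]
    while h[cur] > 1:
        # By the definition of height, some earlier, smaller point sits exactly
        # one layer below; take the leftmost such point.
        cur = next(j for j in range(cur) if h[j] == h[cur] - 1 and seq[j] < seq[cur])
        chain.append(cur)
    chain.reverse()
    return chain


seq = [6, 9, 8, 2, 3, 5, 1, 4, 7]
chain = longest_chain(seq)
print([f"p{i + 1}" for i in chain])  # ['p4', 'p5', 'p6', 'p9']
print([seq[i] for i in chain])       # [2, 3, 5, 7]
```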
Sweep Algorithm
Sweep from left to right.
Operations:
– DELETE operation (delete at left), e.g., delete 6
– INSERT operation (insert at right), e.g., insert 2
– QUERY operation
Algorithm Flow
Example: the window 6 9 8 slides to 9 8 2 (delete 6 at the left, insert 2 at the right).
DELETE operation
[Figure: eight points p1 to p8 partitioned into the antichains L(1), L(2), L(3); p1 = pdel is the point being deleted.]
D(1) = {p1}
D(2) = {p2}
D(3) = {p4, p5}
DELETE operation (Cont.)
[Figure: the same eight points with p1 = pdel; the regions are labeled L(1)/D(1), D(2), L(2)/D(2), D(3), and L(3)/D(3).]
After the Delete operation
[Figure: the same points regrouped into the new antichains L'(1), L'(2), L'(3).]
Analysis of Delete operation
Theorem 2. The cost of one DELETE operation equals the total number of points whose height decreases, i.e., $O(|D|)$, where $D = \bigcup_i D(i)$.
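The set of points whose height decreases can be observed with a naive sketch that deletes the leftmost element and recomputes every height from scratch. This costs quadratic time and only shows which points move down a layer; it is not the $O(|D|)$ procedure of the theorem. The names `heights` and `delete_leftmost` are assumptions.

```python
def heights(seq):
    """heights[i] = length of the longest increasing subsequence ending at seq[i]."""
    h = []
    for i, v in enumerate(seq):
        h.append(1 + max((h[j] for j in range(i) if seq[j] < v), default=0))
    return h


def delete_leftmost(seq):
    """Naive DELETE: drop the leftmost element and recompute all heights.

    Returns the remaining sequence and the values whose height decreased.
    """
    before = heights(seq)
    rest = seq[1:]
    after = heights(rest)
    decreased = [v for v, old, new in zip(rest, before[1:], after) if new < old]
    return rest, decreased


rest, decreased = delete_leftmost([6, 9, 8, 2, 3, 5, 1, 4, 7])
print(decreased)  # [9, 8]: both drop from height 2 to height 1
```

In this run on the nine-element example sequence, deleting 6 lowers only 9 and 8; every other point keeps its height.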
INSERT & QUERY operations
Theorem 3. The cost of an INSERT operation equals the length of the longest increasing subsequence that has pINS as its maximum element.
Theorem 4. The cost of outputting a longest chain equals the length of the output subsequence.
Algorithm Flow
Step 1: Sort the windows Wi by their left endpoints (if two windows share the same left endpoint, the longer window comes first), in O(n+m) time.
Step 2: Initialize the current window to ∅.
Step 3: Slide the window from Wj to Wj+1 (j = 1, 2, …, m-1).
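One way to obtain the ordering of Step 1 in O(n+m) time is a two-pass bucket sort, sketched below. The representation of a window as a 1-based inclusive `(left, right)` pair, the function name `sort_windows`, and the sample windows are all assumptions made for illustration; the talk does not specify how the sort is implemented.

```python
def sort_windows(windows, n):
    """Order windows by left endpoint; on ties the longer window comes first.

    windows: list of (left, right) pairs with 1 <= left <= right <= n.
    Runs in O(n + m) using two stable bucket passes.
    """
    # Pass 1: bucket by right endpoint.
    by_right = [[] for _ in range(n + 1)]
    for win in windows:
        by_right[win[1]].append(win)
    # Pass 2: visit right endpoints from largest to smallest and bucket by
    # left endpoint, so each left bucket lists longer windows first.
    by_left = [[] for _ in range(n + 1)]
    for bucket in reversed(by_right):
        for win in bucket:
            by_left[win[0]].append(win)
    return [win for bucket in by_left for win in bucket]


sample = [(4, 9), (1, 3), (2, 6), (1, 5), (5, 8)]  # hypothetical windows
print(sort_windows(sample, 9))
# [(1, 5), (1, 3), (2, 6), (4, 9), (5, 8)]
```

For small inputs, `sorted(windows, key=lambda w: (w[0], -w[1]))` gives the same order at O(m log m) cost.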
Details of Step 3
Move from Wj to Wj+1, where Wj = (a1, b1) and Wj+1 = (a2, b2). Cases:
– Disjoint
– Overlap
– Contain
  – Same left endpoint
  – Different left endpoints
QUERY(r2) to output a LIS within Wj+1
Amortized Complexity Analysis
Given a sequence $\pi = \pi_1 \pi_2 \ldots \pi_n$, $\mathrm{depth}_i$ is defined to be the largest height that $\pi_i$ achieves in the m windows. In other words, among all increasing subsequences in the m windows, $\mathrm{depth}_i$ is the length of the longest one with $\pi_i$ as its maximal element.
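Complexity of each operation by Amortized Analysis
QUERY operation: $O(\mathrm{OUTPUT})$
INSERT operation: $O\left(\sum_{i=1}^{n} \mathrm{depth}_i\right)$
DELETE operations: $O\left(\sum_{i=1}^{n} \mathrm{depth}_i\right)$ (the height of a point $p_i$ can decrease at most $\mathrm{depth}_i$ times)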
Complexity Analysis of LISSET
Theorem 5 (LISSET Problem). The algorithm described above computes the m longest increasing subsequences, one for each window, in total time:
$$O\left(n + \mathrm{OUTPUT} + \sum_{i=1}^{n} \mathrm{depth}_i\right)$$
Complexity Analysis of LISW
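$\mathrm{depth}_i$ is the height of $\pi_i$ in the window $\pi_{i-w+1}, \pi_{i-w+2}, \ldots, \pi_i$, which is at most the length of the output for that window.
So, $\sum_{i=1}^{n} \mathrm{depth}_i \le \mathrm{OUTPUT}$.
And, $n \le \mathrm{OUTPUT}$.
Thus, $O\left(n + \mathrm{OUTPUT} + \sum_{i=1}^{n} \mathrm{depth}_i\right) = O(\mathrm{OUTPUT})$.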
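The two bounds used in this argument can be checked numerically on the small LISW example (sequence 6 9 8 2 3 5, w = 3). The sketch below computes every depth value and OUTPUT by brute force over all n+w-1 windows; it is a sanity check under the definitions above, not part of the algorithm, and `depth_and_output` is a hypothetical name.

```python
def heights(window):
    """heights[i] = length of the longest increasing subsequence ending at window[i]."""
    h = []
    for i, v in enumerate(window):
        h.append(1 + max((h[j] for j in range(i) if window[j] < v), default=0))
    return h


def depth_and_output(seq, w):
    """Brute force: depth[i] is the largest height of seq[i] over all windows,
    and output is the sum of the LIS lengths of the n + w - 1 windows."""
    n = len(seq)
    depth = [0] * n
    output = 0
    for right in range(1, n + w):          # exclusive right end before clipping
        left = max(0, right - w)
        h = heights(seq[left:min(right, n)])
        output += max(h)                   # LIS length of this window
        for offset, k in enumerate(h):
            depth[left + offset] = max(depth[left + offset], k)
    return depth, output


depth, output = depth_and_output([6, 9, 8, 2, 3, 5], 3)
print(depth)               # [1, 2, 2, 1, 2, 3]
print(sum(depth), output)  # 11 14
```

Here the depth values sum to 11 and n = 6, both bounded by OUTPUT = 14.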
Complexity Analysis of LISW
Theorem 6 (LISW Problem). Our algorithm finds the longest increasing subsequence in a sliding window over a sequence of n elements in O(OUTPUT) time.
Future Works…
Questions?