Date post: | 27-Mar-2015 |
Upload: | audrey-king |
The Communication and Streaming Complexity of
Computing the Longest Common and Increasing Subsequences
Xiaoming Sun Tsinghua University
David Woodruff MIT
The Problem
• Stream of elements a_1, …, a_n ∈ Σ
• Algorithm given one pass over stream
• Problem: Compute the longest increasing subsequence (LIS) – in this case answer is (3,7)
0113734
Previous Work
• Let k be the length of the LIS of the stream
• There exists an algorithm which computes the LIS with O(k^2 log |Σ|) space [LNVZ05]
• Trivial Ω(k) lower bound
• Our first result: Improve both bounds to a tight Θ(k^2 log(|Σ|/k))
Our Lower Bound
Reduction from the indexing function:
Alice holds x ∈ {0,1}^n; Bob holds i ∈ [n] = {1, 2, …, n}
Question: what is x_i?
Randomized 1-way communication is Ω(n)
Alice holds x ∈ {0,1}^n and constructs a stream A; Bob holds i ∈ [n] = {1, 2, …, n} and constructs a stream B
Question: what is x_i?
1. From LIS(A, B), Bob can get x_i
2. |LIS(A, B)| = k, where k is an input parameter
Alice
Alice uses x to create k-1 increasing sequences A_1, …, A_{k-1}
For each j, A_j has length j. Each bit of x is encoded in some sequence A_j
Every element in A_{k-1} is larger than every element in A_{k-2}, every element in A_{k-2} is larger than every element in A_{k-3}, etc.
Set A = A_{k-1}, …, A_2, A_1
[Figure: Alice's stream A, built from x ∈ {0,1}^n, plotted as value against position in stream, showing the blocks A_{k-1}, …, A_2, A_1]
Bob
i ∈ [n]
Bob uses i to identify A_j, the sequence encoding x_i
Bob creates an increasing sequence B of length k-j
Every element in B is greater than every element of A_r if r < j, and less than every element of A_r if r > j
[Figure: value against position in stream, showing A_{j+1}, A_j, A_{j-1} and Bob's sequence B]
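Alice holds x ∈ {0,1}^n; Bob holds i ∈ [n]
Question: what is x_i?
Stream: A = A_{k-1}, …, A_2, A_1, followed by B
[Figure: value against position in stream, showing A_{j+1}, A_j, A_{j-1}, then B]
LIS(A, B) = A_j, B, and |LIS(A, B)| = k
But x_i is encoded in A_j, so Bob recovers x_i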
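The reduction can be simulated end to end. The following is a minimal sketch under a simplified encoding that is an assumption of this example, not the construction from the talk: block j of the alphabet holds 2j values so that A_j carries exactly j bits of x, followed by a gap of k-j values reserved for Bob's sequence B. The helper names (layout, alice, bob, recover) are hypothetical.

```python
from bisect import bisect_left

def lis(stream):
    """Return one longest strictly increasing subsequence (patience sorting)."""
    tails, tails_idx, prev = [], [], [None] * len(stream)
    for idx, v in enumerate(stream):
        p = bisect_left(tails, v)
        prev[idx] = tails_idx[p - 1] if p > 0 else None
        if p == len(tails):
            tails.append(v)
            tails_idx.append(idx)
        else:
            tails[p] = v
            tails_idx[p] = idx
    out, idx = [], (tails_idx[-1] if tails_idx else None)
    while idx is not None:
        out.append(stream[idx])
        idx = prev[idx]
    return out[::-1]

def layout(k):
    """Assumed alphabet layout: block j (2*j values), then a gap of k-j values."""
    start, pos = {}, 0
    for j in range(1, k):
        start[j] = pos
        pos += 2 * j + (k - j)
    return start

def alice(x, k):
    """Encode the k*(k-1)/2 bits of x; A_j stores j bits, one per element."""
    start, blocks, bit = layout(k), {}, 0
    for j in range(1, k):
        blocks[j] = [start[j] + 2 * t + x[bit + t] for t in range(j)]
        bit += j
    # Stream order is A_{k-1}, ..., A_2, A_1.
    return [v for j in range(k - 1, 0, -1) for v in blocks[j]]

def bob(i, k):
    """Find the block j and offset t holding bit i (0-based), and build B."""
    start, j, bit = layout(k), 1, 0
    while i >= bit + j:
        bit += j
        j += 1
    t = i - bit
    gap = start[j] + 2 * j            # values above block j, below block j+1
    return [gap + s for s in range(k - j)], t, start[j]

def recover(x, i, k):
    """Bob reads x_i off the LIS of the combined stream A, B."""
    b_seq, t, base = bob(i, k)
    best = lis(alice(x, k) + b_seq)
    assert len(best) == k             # |LIS(A, B)| = k
    return best[t] - (base + 2 * t)
```

In this toy layout n = k(k-1)/2, so it illustrates the structure of the reduction but not the log(|Σ|/k) factor, which needs the denser encoding described next.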
• Thus, any streaming algorithm must use Ω(n) space.
• But what is n? We need to construct k increasing sequences that are different for different x in {0,1}^n
• Assume |Σ| is large. Divide Σ into k-1 blocks of size |Σ|/(k-1)
• Let A_j be a random increasing sequence of length j in block j.
• The space to represent A_j is Ω(k log(|Σ|/k)) for j > k/2
• Set n = Ω(k^2 log(|Σ|/k)).
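As a rough numerical check of the counting step above, the sketch below sums log2 of the number of increasing sequences of length j in a block of size |Σ|/(k-1), over j > k/2, and compares the total with k^2 log2(|Σ|/k). The function name and the sample parameters are assumptions chosen for illustration only.

```python
from math import comb, log2

def encodable_bits(sigma_size, k):
    """Bits needed to name one increasing sequence A_j per block, for j > k/2."""
    block = sigma_size // (k - 1)     # size of each of the k-1 blocks
    # An increasing sequence of length j in a block is a j-subset of the block.
    return sum(log2(comb(block, j)) for j in range(k // 2 + 1, k))

def target_bound(sigma_size, k):
    """The quantity k^2 * log2(|Sigma| / k) from the slides."""
    return k * k * log2(sigma_size / k)
```

For example, comparing encodable_bits(2**20, 16) with target_bound(2**20, 16) shows the two quantities within a constant factor of each other, which is all the Ω(·) statement claims.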
Our Upper Bound
• When processing the stream, keep lists A[1], A[2], …, A[k].
• A[j] is an increasing subsequence of length j in the stream seen so far with minimal last element.
• Let L[1], L[2], …, L[k] be the last elements of A[1], A[2], …, A[k]
• To process item x, find i for which L[i] < x < L[i+1], and replace A[i+1] with A[i] followed by x
• So we have k arrays A[1], …, A[k], each of length at most k.
• Naively, this takes O(k^2 log |Σ|) space.
• But each A[i] is increasing, so we can compress it by storing differences between consecutive elements.
• Total space is O(k^2 log(|Σ|/k)).
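The update rule above can be sketched as follows. This is an illustrative version that stores the lists explicitly and shows difference encoding as a separate helper; the function names are assumptions, and the sketch does not implement the bit-level packing needed to actually reach the stated space bound.

```python
from bisect import bisect_left
from itertools import accumulate

def streaming_lis(stream):
    """One pass; lists[j-1] plays the role of A[j], lasts[j-1] the role of L[j]."""
    lists, lasts = [], []
    for x in stream:
        i = bisect_left(lasts, x)               # number of L[j] strictly below x
        new = (lists[i - 1] if i > 0 else []) + [x]
        if i == len(lists):                     # x extends the longest list
            lists.append(new)
            lasts.append(x)
        else:                                   # x gives a smaller last element
            lists[i] = new
            lasts[i] = x
    return lists[-1] if lists else []

def delta_encode(seq):
    """Store the first element and then consecutive differences."""
    if not seq:
        return []
    return [seq[0]] + [b - a for a, b in zip(seq, seq[1:])]

def delta_decode(deltas):
    """Invert delta_encode by taking prefix sums."""
    return list(accumulate(deltas))
```

For instance, streaming_lis([5, 1, 4, 2, 3, 9, 6]) returns [1, 2, 3, 6], whose difference encoding is [1, 1, 1, 3]. Because the differences of an increasing list over Σ sum to at most |Σ|, their total encoded length is about k log(|Σ|/k) bits per list rather than k log |Σ|.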
This talk
• First result: a tight space bound for the LIS problem
• Second result: tight bounds for longest common subsequence (LCS)
LCS Bounds
• Problem: Alice has a permutation σ of [N], Bob has a permutation π of [N]. Decide if |LCS(σ, π)| ≥ k.
• Previous space bound: Ω(k) [LNVZ05]
• Our space bound: Ω(N) for 3 ≤ k ≤ N/2
(holds for randomized O(1)-pass algorithms)
LCS Bounds
• Why can we only prove Ω(N) for 3 ≤ k ≤ N/2?
• If k = 2, the problem reduces to an equality test.
• If k is large, there are at most O(N^{2(N-k)}) permutations π with |LCS(σ, π)| > k, so just use an equality test with error O(1/N^{2(N-k)})
Our Lower Bound
• Padding lemma: if for k = 3 the randomized communication complexity is Ω(N), then it's Ω(N) for all k ≤ N/2
• Proof: just pad each of the inputs by some common subsequence of length k-3
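The padding step can be illustrated with a small sketch. It assumes the padding consists of k-3 fresh symbols appended in the same order to both permutations, which is one natural way to realize the lemma and not necessarily the one used in the talk; lcs_length and pad are hypothetical helper names.

```python
def lcs_length(a, b):
    """Classic dynamic program for the length of a longest common subsequence."""
    prev = [0] * (len(b) + 1)
    for u in a:
        cur = [0]
        for t, v in enumerate(b, start=1):
            cur.append(prev[t - 1] + 1 if u == v else max(prev[t], cur[t - 1]))
        prev = cur
    return prev[-1]

def pad(perm, n, k):
    """Append the k-3 fresh symbols n+1, ..., n+k-3 in increasing order."""
    return list(perm) + list(range(n + 1, n + k - 2))
```

With this padding, the LCS of the padded permutations is exactly k-3 longer than the LCS of the originals, so deciding |LCS| ≥ k on the padded inputs decides |LCS| ≥ 3 on the original ones.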
It remains to show high complexity for k = 3. We reduce from disjointness
Alice holds x ∈ {0,1}^n; Bob holds y ∈ {0,1}^n
Question: is there an i such that x_i = y_i = 1?
Randomized multi-way communication is Ω(n)
Alice holds x ∈ {0,1}^{N/3} and constructs σ; Bob holds y ∈ {0,1}^{N/3} and constructs π
Question: is there an i such that x_i = y_i = 1?
Want |LCS(σ, π)| ≥ 3 iff x and y intersect, i.e., iff they are not disjoint
Alice
x ∈ {0,1}^{N/3}
Divide 1, …, N into N/3 groups G_1 = (1, 2, 3), G_2 = (4, 5, 6), …, G_{N/3} = (N-2, N-1, N).
Use x to choose σ_1, …, σ_{N/3}
σ_i acts on G_i = (m+1, m+2, m+3)
If x_i = 0, σ_i(m+1, m+2, m+3) = (m+1, m+2, m+3).
If x_i = 1, σ_i(m+1, m+2, m+3) = (m+1, m+3, m+2).
σ = σ_1, σ_2, …, σ_{N/3}
Bob
y ∈ {0,1}^{N/3}
Divide 1, …, N into N/3 groups G_1 = (1, 2, 3), G_2 = (4, 5, 6), …, G_{N/3} = (N-2, N-1, N).
Use y to choose π_1, …, π_{N/3}
π_i acts on G_i = (m+1, m+2, m+3)
If y_i = 0, π_i(m+1, m+2, m+3) = (m+3, m+2, m+1).
If y_i = 1, π_i(m+1, m+2, m+3) = (m+1, m+3, m+2).
π = π_{N/3}, …, π_1
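[Figure: Alice's permutation lists σ_1(G_1), σ_2(G_2), σ_3(G_3), …, σ_{N/3}(G_{N/3}); Bob's permutation lists π_{N/3}(G_{N/3}), …, π_3(G_3), π_2(G_2), π_1(G_1)]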
Claim: |LCS(σ, π)| ≤ 3.
Proof: Use the fact that LCS(σ, π) intersects at most one G_i
Claim: |LCS(σ, π)| = 3 iff there is some i with x_i = y_i = 1
Proof: Use the way we defined σ_i and π_i
Thus, we can decide disjointness, so Ω(N) communication is required.
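Both claims can be checked by brute force on small inputs. The sketch below builds the two permutations from the group rules on the Alice and Bob slides and computes the LCS with a standard dynamic program; the function names are hypothetical and groups are indexed from 0 for convenience.

```python
def alice_perm(x):
    """Groups in order G_1, ..., G_{N/3}; group i is reordered according to x_i."""
    out = []
    for i, bit in enumerate(x):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if bit else [m + 1, m + 2, m + 3]
    return out

def bob_perm(y):
    """Groups in reverse order G_{N/3}, ..., G_1; group i depends on y_i."""
    out = []
    for i in reversed(range(len(y))):
        m = 3 * i
        out += [m + 1, m + 3, m + 2] if y[i] else [m + 3, m + 2, m + 1]
    return out

def lcs_length(a, b):
    """Classic dynamic program for the length of a longest common subsequence."""
    prev = [0] * (len(b) + 1)
    for u in a:
        cur = [0]
        for t, v in enumerate(b, start=1):
            cur.append(prev[t - 1] + 1 if u == v else max(prev[t], cur[t - 1]))
        prev = cur
    return prev[-1]

def intersects(x, y):
    """True iff some index i has x_i = y_i = 1."""
    return any(a and b for a, b in zip(x, y))
```

Since Bob reverses the order of the groups, a common subsequence can use elements from only one group, and within a group the two orderings agree completely only when x_i = y_i = 1.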
Other results
• Tight space bounds for computing the LIS length.
• Generalization to approximate LIS and LCS. Still many gaps here.
• Example: for approximate LIS length, we have Ω(1/ε) and O(k log |Σ|). Recent work [GJKK07] has shown O(sqrt(N/ε) log |Σ|), but there is still a large gap.
Conclusion
• First result: a tight bound for the LIS
• Second result: an Ω(N) space bound for the LCS k-decision problem for 3 ≤ k ≤ N/2
• Other results for approximation problems
• Another open question: extend our lower bound for LIS to randomized multi-round