Tight Bounds for Distributed Functional Monitoring
David Woodruff
IBM Almaden
Qin Zhang
Aarhus University
MADALGO
Distributed Functional Monitoring
[Figure: a coordinator C connected to k sites $P_1, P_2, P_3, \ldots, P_k$. Site $P_i$ holds input $x_i$. As time passes, updates of the form $x_i \leftarrow x_i + e_j$ arrive at the sites, and the sites communicate with C.]
• Static case vs. dynamic case
• Problems on $x_1 + x_2 + \cdots + x_k$: sampling, p-norms, heavy hitters, compressed sensing, quantiles, entropy
• Authors: Can, Cormode, Huang, Muthukrishnan, Patt-Shamir, Shafrir, Tirthapura, Wang, Yi, Zhao, many others
Motivation
• Data distributed and stored in the cloud
– Impractical to put data on a single device
• Sensor networks
– Communication very power-intensive
• Network routers
– Bandwidth limitations
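Problems
• Which functions $f(x_1, \ldots, x_k)$ do we care about?
• $x_1, \ldots, x_k$ are non-negative length-$n$ vectors
• $x = \sum_{i=1}^{k} x_i$
• $f(x_1, \ldots, x_k) = |x|_p = \left(\sum_{i=1}^{n} x_i^p\right)^{1/p}$ (inside this sum, $x_i$ is the $i$-th coordinate of $x$)
• $|x|_0$ is the number of non-zero coordinates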
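The definitions above can be made concrete with a small sketch. The helper names `combine` and `p_norm` and the three toy site vectors are illustrative assumptions, not part of the talk.

```python
def combine(site_vectors):
    """Coordinate-wise sum x = x_1 + ... + x_k of the sites' vectors."""
    return [sum(column) for column in zip(*site_vectors)]


def p_norm(x, p):
    """|x|_p for p > 0; for p = 0, the number of non-zero coordinates."""
    if p == 0:
        return sum(1 for v in x if v != 0)
    return sum(v ** p for v in x) ** (1.0 / p)


# Three hypothetical sites, each holding a non-negative length-4 vector.
sites = [[1, 0, 2, 0], [0, 0, 1, 0], [3, 0, 0, 0]]
x = combine(sites)          # [4, 0, 3, 0]
euclidean = p_norm(x, 2)    # square root of 16 + 9
distinct = p_norm(x, 0)     # two non-zero coordinates
```

No single site can compute these values alone, since each one sees only its own row of `sites`.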
What is the randomized communication cost of these problems?
I.e., the minimal cost of a protocol which, for every input, fails with probability < 1/3
Static case, dynamic case
Exact Answers
• An $\Omega(n)$ communication bound for computing $|x|_p$, $p \neq 1$
• Reduction from 2-Player Set-Disjointness (DISJ)
• Alice has a set $S \subseteq [n]$ of size $n/4$
• Bob has a set $T \subseteq [n]$ of size $n/4$ with either $|S \cap T| = 0$ or $|S \cap T| = 1$
• Is $S \cap T = \emptyset$?
• $|X \cap Y| = 1 \Rightarrow \mathrm{DISJ}(X, Y) = 1$, $|X \cap Y| = 0 \Rightarrow \mathrm{DISJ}(X, Y) = 0$
• [KS, R] $\Omega(n)$ communication
• Prohibitive for applications
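The slide does not spell out the reduction, so the following is only one assumed construction: Alice and Bob each contribute the 0/1 indicator vector of their set, and an exact value of the moment of the sum reveals whether the sets intersect. The function names are hypothetical.

```python
def indicator(s, n):
    """0/1 vector of length n with ones at the (1-based) elements of s."""
    return [1 if j in s else 0 for j in range(1, n + 1)]


def moment(x, p):
    """|x|_p^p for p > 0, and the number of non-zero coordinates for p = 0."""
    if p == 0:
        return sum(1 for v in x if v != 0)
    return sum(v ** p for v in x)


def disj_from_exact_moment(S, T, n, p):
    """Return 1 if S and T intersect, 0 otherwise, using only the exact moment."""
    x = [a + b for a, b in zip(indicator(S, n), indicator(T, n))]
    # If the sets are disjoint, every non-zero coordinate of x equals 1.
    value_if_disjoint = len(S) + len(T)
    return 1 if moment(x, p) != value_if_disjoint else 0
```

For p = 1 the moment is always |S| + |T|, so this test cannot separate the two cases, which is consistent with the bound excluding p = 1.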
Approximate Answers
$f(x_1, \ldots, x_k) = (1 \pm \varepsilon)\,|x|_p$
What is the randomized communication cost as a function of $k$, $\varepsilon$, and $n$?
Ignore $\log(nk/\varepsilon)$ factors
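Previous Results
Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative)
• $|x|_0$: $\Omega(k + \varepsilon^{-2})$ and $O(k \cdot \varepsilon^{-2})$
• $|x|_p$: $\Omega(k + \varepsilon^{-2})$
• $|x|_2$: $O(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$
• $|x|_p$, $p > 2$: $O(k^{2p+1} n^{1-2/p} \cdot \mathrm{poly}(1/\varepsilon))$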
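Our Results
Lower bounds in static model, upper bounds in dynamic model (underlying vectors are non-negative)
• $|x|_0$: previously $\Omega(k + \varepsilon^{-2})$ and $O(k \cdot \varepsilon^{-2})$; lower bound improved to $\Omega(k \cdot \varepsilon^{-2})$
• $|x|_p$: previously $\Omega(k + \varepsilon^{-2})$; improved to $\Omega(k^{p-1} \cdot \varepsilon^{-2})$. Talk will focus on $p = 2$
• $|x|_2$: previously $O(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$; improved to $O(k \cdot \mathrm{poly}(1/\varepsilon))$
• $|x|_p$, $p > 2$: previously $O(k^{2p+1} n^{1-2/p} \cdot \mathrm{poly}(1/\varepsilon))$; improved to $O(k^{p-1} \cdot \mathrm{poly}(1/\varepsilon))$
First lower bounds to depend on product of $k$ and $\varepsilon^{-2}$
Upper bound doesn’t depend polynomially on $n$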
Talk Outline
• Lower Bounds
– Non-zero elements
– Euclidean norm
• Upper Bounds
– p-norm
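Previous Lower Bounds
• Lower bounds for any p-norm, $p \neq 1$
• [CMY] $\Omega(k)$
• [ABC] $\Omega(\varepsilon^{-2})$
• Reduction from Gap-Orthogonality (GAP-ORT)
• Alice, Bob have $u, v \in \{0,1\}^{\varepsilon^{-2}}$, respectively
• $|\Delta(u, v) - 1/(2\varepsilon^2)| < 1/\varepsilon$ or $|\Delta(u, v) - 1/(2\varepsilon^2)| > 2/\varepsilon$
• [CR, S] $\Omega(\varepsilon^{-2})$ communication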
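The promise in the gap condition can be checked mechanically. This sketch assumes that Δ(u, v) denotes the Hamming distance, which the slide does not state explicitly. The function names and the case labels are illustrative.

```python
def hamming(u, v):
    """Number of positions where the bit vectors u and v differ (assumed meaning of Delta)."""
    return sum(a != b for a, b in zip(u, v))


def gap_case(u, v, eps):
    """Classify (u, v) by how far Delta(u, v) is from 1/(2 eps^2)."""
    deviation = abs(hamming(u, v) - 1.0 / (2 * eps ** 2))
    if deviation < 1.0 / eps:
        return "near"
    if deviation > 2.0 / eps:
        return "far"
    return "outside promise"


# With eps = 0.1 the vectors have length 1/eps^2 = 100 and the center is 50.
zeros = [0] * 100
half = [1] * 50 + [0] * 50
ones = [1] * 100
```

Inputs that fall between the two thresholds are not part of the promise, so a protocol may answer arbitrarily on them.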
Lower Bound for Distinct Elements
• Improve bound to optimal $\Omega(k \cdot \varepsilon^{-2})$
• Simpler problem: k-GAP-THRESH
– Each site $P_i$ holds a bit $Z_i$
– $Z_i$ are i.i.d. Bernoulli($\beta$)
– Decide if $\sum_{i=1}^{k} Z_i > \beta k + (\beta k)^{1/2}$ or $\sum_{i=1}^{k} Z_i < \beta k - (\beta k)^{1/2}$; otherwise don’t care
• Rectangle property: for any correct protocol transcript $\tau$, $Z_1, Z_2, \ldots, Z_k$ are independent conditioned on $\tau$
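A quick simulation shows why k-GAP-THRESH is not trivial: under the i.i.d. Bernoulli(β) input distribution, both the high and the low case arise with constant probability. This is only an empirical sketch. The function names, the seed, and the parameter values are assumptions chosen for illustration.

```python
import random


def gap_thresh_case(bits, beta):
    """Classify an input by comparing the sum of bits with beta*k +/- sqrt(beta*k)."""
    k = len(bits)
    total = sum(bits)
    mean, gap = beta * k, (beta * k) ** 0.5
    if total > mean + gap:
        return "high"
    if total < mean - gap:
        return "low"
    return "don't care"


def case_frequencies(k, beta, trials, seed=0):
    """Empirical frequency of each case over random i.i.d. Bernoulli(beta) inputs."""
    rng = random.Random(seed)
    counts = {"high": 0, "low": 0, "don't care": 0}
    for _ in range(trials):
        bits = [1 if rng.random() < beta else 0 for _ in range(k)]
        counts[gap_thresh_case(bits, beta)] += 1
    return {case: count / trials for case, count in counts.items()}


frequencies = case_frequencies(k=400, beta=0.5, trials=2000)
```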
A Key Lemma
• Lemma: For any protocol $\Pi$ which succeeds w.pr. > .9999, the transcript $\tau$ is such that w.pr. > 1/2, for at least $k/2$ different $i$, $H(Z_i \mid \tau) < H(.01\beta)$
• Proof: Suppose $\tau$ does not satisfy this
– With large probability, $\beta k - O((\beta k)^{1/2}) < \mathbb{E}\left[\sum_{i=1}^{k} Z_i \mid \tau\right] < \beta k + O((\beta k)^{1/2})$
– Since the $Z_i$ are independent given $\tau$, $\sum_{i=1}^{k} Z_i \mid \tau$ is a sum of independent Bernoullis
– Since most $H(Z_i \mid \tau)$ are large, by anti-concentration, both events occur with constant probability: $\sum_{i=1}^{k} Z_i \mid \tau > \beta k + (\beta k)^{1/2}$, $\sum_{i=1}^{k} Z_i \mid \tau < \beta k - (\beta k)^{1/2}$
– So $\Pi$ can’t succeed with large probability
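To get a feel for the gap between H(β) and H(.01β) in the lemma, the sketch below evaluates the binary entropy function. It assumes H(q) denotes the entropy in bits of a Bernoulli(q) variable. The value β = 0.5 is only an example.

```python
import math


def binary_entropy(q):
    """Entropy in bits of a Bernoulli(q) random variable."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


beta = 0.5
prior_entropy = binary_entropy(beta)            # H(beta): uncertainty before the protocol
posterior_bound = binary_entropy(0.01 * beta)   # H(.01 beta): bound from the lemma
```

With these numbers the bound is a small fraction of the prior entropy, so the transcript must reveal almost everything about most of the bits.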
Composition Idea
[Figure: coordinator C above sites $P_1, P_2, P_3, \ldots, P_k$; each site $P_i$ is linked to C by a DISJ instance whose output is $Z_i$.]
• The input to $P_i$ in k-GAP-THRESH, denoted $Z_i$, is the output of a 2-party Disjointness (DISJ) instance between C and site $P_i$
– Let $X$ be a random set of size $1/(4\varepsilon^2)$ from $\{1, 2, \ldots, 1/\varepsilon^2\}$
– For each $i$, if $Z_i = 1$, then choose $Y_i$ so that $\mathrm{DISJ}(X, Y_i) = 1$, else choose $Y_i$ so that $\mathrm{DISJ}(X, Y_i) = 0$
– Distributional complexity $\Omega(1/\varepsilon^2)$ [Razborov]
• Can think of C as a player
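The composed input distribution can be sketched as a sampler. Several details are assumptions: each $Y_i$ is given the same size as $X$ (mirroring the equal set sizes in the DISJ slide), intersecting pairs share exactly one element, and all names are hypothetical.

```python
import random


def disj(X, Y):
    """Convention from the slides: 1 if the sets intersect, 0 if they are disjoint."""
    return 1 if X & Y else 0


def composed_instance(k, eps, beta, seed=0):
    """Sample X for the coordinator and (Y_i, Z_i) for each of the k sites."""
    rng = random.Random(seed)
    n = round(1 / eps ** 2)
    assert n % 4 == 0, "sketch assumes 1/eps^2 is a multiple of 4"
    size = n // 4
    universe = list(range(1, n + 1))
    X = set(rng.sample(universe, size))
    inside = sorted(X)
    outside = [j for j in universe if j not in X]
    Y, Z = [], []
    for _ in range(k):
        z = 1 if rng.random() < beta else 0
        if z == 1:
            # Exactly one shared element, the rest drawn from outside X.
            y = {rng.choice(inside)} | set(rng.sample(outside, size - 1))
        else:
            y = set(rng.sample(outside, size))
        Y.append(y)
        Z.append(z)
    return X, Y, Z


X, Y, Z = composed_instance(k=10, eps=0.25, beta=0.5, seed=1)
```

Every site's bit is then recoverable only by solving its own DISJ instance against the shared set `X`.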
Putting it All Together
• Key Lemma $\Rightarrow$ for most $i$, $H(Z_i \mid \tau) < H(.01\beta)$
• Since $H(Z_i) = H(\beta)$ for all $i$, for most $i$ protocol $\Pi$ solves $\mathrm{DISJ}(X, Y_i)$ with constant probability
• Since the $Z_i \mid \tau$ are independent, solving DISJ requires communication $\Omega(\varepsilon^{-2})$ on each of $k/2$ copies
• Total communication is $\Omega(k \cdot \varepsilon^{-2})$
• Can show a reduction:
– $|x|_0 > 1/(2\varepsilon^2) + 1/\varepsilon$ if $\sum_{i=1}^{k} Z_i > \beta k + (\beta k)^{1/2}$
– $|x|_0 < 1/(2\varepsilon^2) - 1/\varepsilon$ if $\sum_{i=1}^{k} Z_i < \beta k - (\beta k)^{1/2}$
Lower Bound for Euclidean Norm
• Improve $\Omega(k + \varepsilon^{-2})$ bound to optimal $\Omega(k \cdot \varepsilon^{-2})$
• Base problem: Gap-Orthogonality (GAP-ORT(X, Y))
– Consider uniform distribution on (X, Y)
• We observe an information lower bound for GAP-ORT
• Sherstov’s lower bound for GAP-ORT holds for uniform distribution on (X, Y)
• [BBCR] + [Sherstov] $\Rightarrow$ for any protocol $\Pi$ and $t > 0$, $I(X, Y; \Pi) = \Omega(1/(\varepsilon^2 \log t))$ or $\Pi$ uses $t$ communication
Information Implications
• By chain rule, $I(X, Y; \Pi) = \sum_{i=1}^{1/\varepsilon^2} I(X_i, Y_i; \Pi \mid X_{<i}, Y_{<i}) = \Omega(\varepsilon^{-2})$
• For most $i$, $I(X_i, Y_i; \Pi \mid X_{<i}, Y_{<i}) = \Omega(1)$
• Maximum Likelihood Principle: non-trivial advantage in guessing $(X_i, Y_i)$
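The step from the chain rule to the per-coordinate bound is an averaging argument. The sketch below is one way to fill it in, not necessarily the authors' argument. It assumes the sum is at least $cN$ for a constant $c > 0$, writes $N = 1/\varepsilon^2$ and $I_i = I(X_i, Y_i; \Pi \mid X_{<i}, Y_{<i})$, and uses that each $I_i \le 2$ because $(X_i, Y_i)$ is a pair of bits. It yields a constant fraction of good indices, a weaker statement than a literal majority.

```latex
% m = number of indices i with I_i >= c/2 (the "good" indices).
% Good indices contribute at most 2 each; the others contribute less than c/2 each.
\[
  cN \;\le\; \sum_{i=1}^{N} I_i
     \;\le\; 2m + \frac{c}{2}\,(N - m)
     \;\le\; 2m + \frac{c}{2}\,N
  \quad\Longrightarrow\quad
  m \;\ge\; \frac{c}{4}\,N .
\]
```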
2-BIT k-Party DISJ
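• Choose a random $j \in [k^2]$; one of four cases holds:
– $j$ doesn’t occur in any $T_i$
– $j$ occurs only in $T_1, \ldots, T_{k/2}$
– $j$ occurs only in $T_{k/2+1}, \ldots, T_k$
– $j$ occurs in $T_1, \ldots, T_k$
• All $j' \neq j$ occur in at most one set $T_i$ (assume $k \geq 4$)
• We show $\Omega(k)$ information cost
[Figure: sites $P_1, P_2, P_3, \ldots, P_k$ hold sets $T_1, T_2, T_3, \ldots, T_k \subseteq [k^2]$.]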
We compose GAP-ORT with a variant of k-Party DISJ
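An instance of the 2-BIT problem can be sketched as follows. The two bits `a` and `b` select one of the four cases. How the other elements are spread over the sets is not specified on the slide, so the uniform assignment here (including a "no site" option) is purely an assumption, as are the function names.

```python
import random


def two_bit_disj_instance(k, a, b, seed=0):
    """Build sets T_1..T_k over [k^2]; a (resp. b) says whether the special
    element j occurs in the first (resp. second) half of the sets."""
    assert k >= 4 and k % 2 == 0
    rng = random.Random(seed)
    j = rng.randint(1, k * k)
    T = [set() for _ in range(k)]
    for other in range(1, k * k + 1):
        if other == j:
            continue
        owner = rng.randint(0, k)  # value k means the element is in no set (assumption)
        if owner < k:
            T[owner].add(other)
    if a:
        for i in range(k // 2):
            T[i].add(j)
    if b:
        for i in range(k // 2, k):
            T[i].add(j)
    return j, T


def decode_two_bits(j, T):
    """Recover (a, b) by looking at one set from each half."""
    return (1 if j in T[0] else 0, 1 if j in T[-1] else 0)
```

A single site sees only its own set, so it learns at most one of the two bits without communication.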
Rough Composition Idea
[Figure: a GAP-ORT instance composed with $1/\varepsilon^2$ stacked copies of a 2-BIT k-party DISJ instance.]
• Bits $X_i$ and $Y_i$ in GAP-ORT determine output of $i$-th 2-BIT k-party DISJ instance
• An algorithm for approximating Euclidean norm solves GAP-ORT, therefore solves most 2-BIT k-party DISJ instances
• Information adds (if we condition on enough “helper” variables)
• $P_i$ participates in all instances
• Show $\Omega(k/\varepsilon^2)$ overall information is revealed
Algorithm for p-norm
• We get $k^{p-1}\,\mathrm{poly}(1/\varepsilon)$, improving $k^{2p+1} n^{1-2/p}\,\mathrm{poly}(1/\varepsilon)$ for general $p$ and $O(k^2/\varepsilon + k^{1.5}/\varepsilon^3)$ for $p = 2$
• Our protocol is the first 1-way protocol, that is, all communication is from sites to coordinator
• Focus on Euclidean norm (p = 2) in talk
• Non-negative vectors
• Just determine if Euclidean norm exceeds a threshold θ
The Most Naïve Thing to Do
• $x_i$ is Site $i$’s current vector
• $x = \sum_{i=1}^{k} x_i$
• Suppose Site $i$ sees an update $x_i \leftarrow x_i + e_j$
• Send $j$ to Coordinator with a certain probability that only depends on $k$ and $\theta$?
Sample and Send
[Figure: sites $P_1, P_2, P_3, \ldots, P_k$ send to coordinator C. The site vectors form the rows of a 0/1 matrix in which each row has ones only in its own block of coordinates, annotated $|x|_2^2 = k^2$; the same matrix with one extra coordinate that is 1 on every site is annotated $|x|_2^2 = 2k^2$.]
• Send each update with probability at least $1/k$
• Communication = $O(k)$, so okay
• Suppose $x$ has $k^4$ coordinates that are 1, and may have a unique coordinate which is $k^2$, occurring $k$ times on each site
– Send update with probability $1/k^2$
– Will find the large coordinate
– But communication is $\Omega(k^2)$
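The cost of plain sampling on this hard instance is simple arithmetic, sketched below with exact fractions. The function name is hypothetical, and the code counts only the expected number of messages, with one message per sampled update.

```python
from fractions import Fraction


def expected_messages_on_hard_instance(k):
    """Expected messages from light and heavy updates when every update
    is forwarded independently with probability 1/k^2."""
    prob = Fraction(1, k * k)
    light_updates = k ** 4   # k^4 coordinates, each updated once
    heavy_updates = k * k    # one coordinate, updated k times on each of k sites
    return light_updates * prob, heavy_updates * prob


light, heavy = expected_messages_on_hard_instance(10)
```

About one sample of the heavy coordinate is bought at the price of roughly $k^2$ messages about light coordinates.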
What Is Happening?
• Sampling with probability $\approx 1/k^2$ is good to get a few samples from heavy item
• But all the light coordinates are in the way, making the communication $\Omega(k^2)$
• Suppose we put a barrier of $k$, that is, sample with probability $\approx 1/k^2$ but only send an item if it has occurred at least $k$ times on a site
• Now communication is $O(1)$ and found heavy coordinate
• But light coordinates also contribute to overall $|x|_2$ value
• Sample at different scales with different barriers
Algorithm for Euclidean Norm
• Use public coin to create $O(\log n)$ groups $T_1, \ldots, T_{\log n}$ of the $n$ input coordinates
• $T_z$ contains $n/2^z$ random coordinates
• Suppose Site $i$ sees the update $x_i \leftarrow x_i + e_j$
• For each $T_z$ containing $j$:
– If $(x_i)_j > (\theta/2^z)^{1/2}/k$, then with probability $(2^z/\theta)^{1/2} \cdot \mathrm{poly}(\varepsilon^{-1} \log n)$, send $(j, z)$ to the coordinator
• Expected communication $\tilde{O}(k)$
• If a group of coordinates contributes to $|x|_2$, there is a $z$ for which a few coordinates in the group are sampled multiple times
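The site-side rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the unspecified $\mathrm{poly}(\varepsilon^{-1} \log n)$ factor is replaced by a free parameter `boost`, the public coin is modeled as a shared seed, coordinates are 0-indexed, the sending probability is capped at 1, and the coordinator's side is omitted because the slides do not describe it.

```python
import math
import random


def make_groups(n, seed):
    """Public coin: all parties derive the same groups T_1..T_{log n} from the seed.
    Group T_z holds n / 2^z random coordinates."""
    rng = random.Random(seed)
    levels = max(1, int(math.log2(n)))
    return [set(rng.sample(range(n), max(1, n >> z))) for z in range(1, levels + 1)]


class Site:
    def __init__(self, n, k, theta, groups, boost, seed):
        self.x = [0] * n          # this site's current vector
        self.k = k
        self.theta = theta
        self.groups = groups
        self.boost = boost        # hypothetical stand-in for poly(1/eps * log n)
        self.rng = random.Random(seed)

    def update(self, j):
        """Apply x <- x + e_j and return the (j, z) messages sent to the coordinator."""
        self.x[j] += 1
        sent = []
        for z, group in enumerate(self.groups, start=1):
            if j not in group:
                continue
            barrier = math.sqrt(self.theta / 2 ** z) / self.k
            prob = min(1.0, math.sqrt(2 ** z / self.theta) * self.boost)
            if self.x[j] > barrier and self.rng.random() < prob:
                sent.append((j, z))
        return sent


groups = make_groups(n=16, seed=7)
site = Site(n=16, k=2, theta=64, groups=groups, boost=1e9, seed=1)
```

With `boost` set this high every eligible update is sent, which makes the barrier visible: for a coordinate in $T_1$, nothing is sent on its first update, and $(j, 1)$ appears once its count exceeds $(64/2)^{1/2}/2 \approx 2.83$.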
Conclusions
• Improved communication lower and upper bounds for estimating $|x|_p$
• Implies tight lower bounds for estimating entropy, heavy hitters, quantiles
• Implications for data stream model
– First lower bound for $|x|_0$ without Gap-Hamming
– Useful information cost lower bound for Gap-Hamming, or protocol has very large communication
– Improve $\Omega(n^{1-2/p}/\varepsilon^{2/p})$ bound for estimating $|x|_p$ in a stream to $\Omega(n^{1-2/p}/\varepsilon^{4/p})$