
One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Date post: 18-Jan-2016
Upload: blake-lane
Transcript
Page 1: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

One algorithm to rule them all
One join query at a time

Atri Rudra
University at Buffalo

Page 2: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

A brief history of this talk

L2/L2 foreach sparse recovery/compressed sensing

http://www-stat.stanford.edu/~candes/stats330/index.shtml

Page 3: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The key technical problem

Given the three shadows, what is the largest size of the original set of points?


Page 4: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The key technical problem

Highly trivial: $4^3 = 64$

Still trivial: $4^2 = 16$

Correct answer: $4^{1.5} = 8$
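As a quick sanity check on the arithmetic above, here is a minimal sketch. It assumes each of the three shadows has 4 points (an assumption read off from the base 4 in the bounds; the slide's picture is not in the transcript). A 2×2×2 cube then has shadows of size 4 each and contains 8 points, matching the correct answer. The helper name `shadows` is made up for illustration.

```python
from itertools import product

def shadows(points):
    """Project a 3D point set onto the (A,B), (B,C) and (C,A) planes."""
    ab = {(a, b) for a, b, c in points}
    bc = {(b, c) for a, b, c in points}
    ca = {(c, a) for a, b, c in points}
    return ab, bc, ca

# Assumed example: the 2 x 2 x 2 cube, not taken from the slides.
cube = set(product(range(2), repeat=3))
ab, bc, ca = shadows(cube)
k = len(ab)

print("points:", len(cube))
print("shadow sizes:", len(ab), len(bc), len(ca))
print("k ** 1.5 =", k ** 1.5)
```

The cube is only one instance where the bound is met with equality; it does not by itself prove that no larger point set exists.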

Page 5: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The key technical problem

[Figure: a set of points in the space with axes A, B, C and its three shadows R, S, T]

|R| = k, |S| = k, |T| = k

Loomis-Whitney: $k^{3/2}$

Algorithmic Loomis-Whitney?

Page 6: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

An equivalent view

[Figure: the three shadows R, S, T on axes A, B, C, next to a triangle on vertices A, B, C with edges R, S, T]

Output all (a,b,c) s.t. (a,b) in R, (b,c) in S and (c,a) in T
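The query above can be sketched directly in code. This is a plain pairwise plan (join R with S, then filter by T), written only to pin down the semantics; it is not the algorithm developed later in the talk, and it can do on the order of |R|·|S| work. The function name and the sample tables are made up.

```python
def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S and (c, a) in T."""
    # Index S by its first attribute so each (a, b) finds matching c values.
    S_by_b = {}
    for b, c in S:
        S_by_b.setdefault(b, []).append(c)
    T_set = set(T)
    return sorted(
        (a, b, c)
        for a, b in R
        for c in S_by_b.get(b, [])
        if (c, a) in T_set
    )

# Made-up sample tables.
R = [(1, 2), (1, 3)]
S = [(2, 3), (3, 1)]
T = [(3, 1), (1, 1)]
print(triangle_join(R, S, T))
```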

Page 7: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Overview of the talk

[Figure: triangle on vertices A, B, C with edges R, S, T]

Page 8: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The take-away message

Join algo

http://welovetumblr.blogspot.com/2012/07/thor-is.html

Page 9: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Overview of the talk

[Figure: triangle on vertices A, B, C with edges R, S, T]

Page 10: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

(Database) Joins

Codd

Attributes/Nodes: [n]

Relations/Hyperedges: $e_1, \dots, e_m \subseteq [n]$

[Figure: hypergraph on nodes 1 to 5]

Tables/Projections: $R_1, \dots, R_m$

Output all $a = (a_1, \dots, a_n)$ s.t. a projected down to $e_i$ is in $R_i$ for every i in [m]
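To make the definition concrete, the following is a brute-force sketch that checks every candidate tuple against every table. It assumes a small explicit `domain` shared by all attributes and uses 0-based attribute indices instead of [n]; both are illustration choices, and the function name is hypothetical. It is exponential in n and is meant only as a specification of the output.

```python
from itertools import product

def brute_force_join(n, domain, edges, tables):
    """Enumerate all a in domain^n whose projection onto each e_i lies in R_i.

    edges[i] is a tuple of 0-based attribute indices (the hyperedge e_i);
    tables[i] is the set of tuples R_i over those attributes.
    """
    out = []
    for a in product(domain, repeat=n):
        if all(tuple(a[j] for j in e) in R for e, R in zip(edges, tables)):
            out.append(a)
    return out

# The triangle query as a special case: attributes (A, B, C) = (0, 1, 2).
edges = [(0, 1), (1, 2), (2, 0)]
tables = [{(1, 2), (1, 3)}, {(2, 3), (3, 1)}, {(3, 1), (1, 1)}]
result = brute_force_join(3, [1, 2, 3], edges, tables)
print(result)
```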

Page 11: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The triangle join query

[Figure: the shadows R, S, T on axes A, B, C, and the triangle on vertices A, B, C with edges R, S, T]

Output all (a,b,c) s.t. (a,b) in R, (b,c) in S and (c,a) in T

Page 12: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Bounding the output size

Atserias Grohe Marx

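[Figure: triangle on vertices A, B, C with edges R, S, T; the edges carry weights x, y, z (½ each for Loomis-Whitney)]

Highly trivial bound: $|R| \cdot |S| \cdot |T|$

Still trivial bound: $|R| \cdot |S|$

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

AGM bound: $|R|^x |S|^y |T|^z$

A: x + z ≥ 1
B: x + y ≥ 1
C: y + z ≥ 1
x, y, z ≥ 0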
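The bounds on this slide are all instances of one formula with different edge weights. A minimal sketch, assuming table sizes of 100 each (made-up numbers) and hypothetical helper names, checks that a weight vector covers every vertex and evaluates the corresponding bound:

```python
import math

def is_fractional_cover(n, edges, weights, tol=1e-9):
    """True if all weights are >= 0 and every vertex gets total weight >= 1."""
    if any(w < 0 for w in weights):
        return False
    return all(
        sum(w for e, w in zip(edges, weights) if v in e) >= 1 - tol
        for v in range(n)
    )

def agm_bound(sizes, weights):
    """Product of |R_i| ** x_i over all tables."""
    return math.prod(s ** w for s, w in zip(sizes, weights))

# Triangle query: R on (A, B), S on (B, C), T on (C, A), with A, B, C = 0, 1, 2.
edges = [(0, 1), (1, 2), (2, 0)]
sizes = [100, 100, 100]  # assumed |R|, |S|, |T|

highly_trivial = agm_bound(sizes, [1, 1, 1])
still_trivial = agm_bound(sizes, [1, 1, 0])
loomis_whitney = agm_bound(sizes, [0.5, 0.5, 0.5])
print(highly_trivial, still_trivial, loomis_whitney)
```

With equal table sizes the weights (½, ½, ½) give the smallest of the three values; for very unbalanced sizes a different cover can be better, which is why the AGM bound leaves x, y, z free.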

Page 13: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Loomis Whitney

?

Page 14: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Algorithmic Loomis-Whitney

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

[Figure: triangle on A, B, C with edges R, S, T and weight ½ on each edge; a vertex c in C with $d_S(c)$ neighbors in B and $d_T(c)$ neighbors in A]

Goal: Count number of triangles

There are $|R|$ choices for edges in R

There are $d_S(c)\, d_T(c)$ choices for pairs of neighbors of c

http://agilitrix.com/2011/03/red-pill-blue-pill/

Page 15: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Algorithmic Loomis-Whitney

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

Goal: Count number of triangles

There are $|R|$ choices for edges in R

There are $d_S(c)\, d_T(c)$ choices for pairs of neighbors of c

Make this choice for every c in C

Run time of algo = $\sum_c \min\big(|R|,\ d_S(c)\, d_T(c)\big)$

[Figure: edge R between A and B, and a vertex c in C with its S-neighbors in B and T-neighbors in A]
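The per-c choice described above can be sketched as follows. This is one reading of the slide, not the authors' implementation: for each c it either scans R or enumerates the pairs of neighbors of c, whichever is cheaper, and it tracks a `work` counter equal to the sum of min(|R|, d_S(c) d_T(c)). Hashing is assumed to give constant-time membership tests; all names and sample data are made up.

```python
from collections import defaultdict

def count_triangles(R, S, T):
    """Count (a, b, c) with (a, b) in R, (b, c) in S, (c, a) in T.

    Returns (count, work), where work = sum over c of
    min(|R|, d_S(c) * d_T(c)).
    """
    R_set = set(R)
    S_nbrs = defaultdict(set)  # c -> {b : (b, c) in S}
    T_nbrs = defaultdict(set)  # c -> {a : (c, a) in T}
    for b, c in set(S):
        S_nbrs[c].add(b)
    for c, a in set(T):
        T_nbrs[c].add(a)

    count = work = 0
    # A value c with no S-neighbor or no T-neighbor contributes nothing.
    for c in S_nbrs.keys() & T_nbrs.keys():
        bs, as_ = S_nbrs[c], T_nbrs[c]
        if len(R_set) <= len(bs) * len(as_):
            # Cheaper to scan R and test both endpoints against c's neighbors.
            work += len(R_set)
            count += sum(1 for a, b in R_set if b in bs and a in as_)
        else:
            # Cheaper to enumerate pairs of neighbors of c and test R.
            work += len(bs) * len(as_)
            count += sum(1 for a in as_ for b in bs if (a, b) in R_set)
    return count, work

# Made-up sample tables.
R = [(1, 2), (1, 3)]
S = [(2, 3), (3, 1)]
T = [(3, 1), (1, 1)]
print(count_triangles(R, S, T))
```

On a complete instance over four values (every pair present in R, S and T) the counter reaches 64 = 16^{3/2}, so the work matches the Loomis-Whitney bound exactly there.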

Page 16: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

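Analyzing the algorithm

Loomis-Whitney bound: $|R|^{1/2} |S|^{1/2} |T|^{1/2}$

$$
\begin{aligned}
\sum_c \min\big(|R|,\ d_S(c)\, d_T(c)\big)
&\le \sum_c \big(|R|\, d_S(c)\, d_T(c)\big)^{1/2}
  && \text{since } \min(E,F) \le (EF)^{1/2} \\
&= |R|^{1/2} \sum_c d_S(c)^{1/2}\, d_T(c)^{1/2} \\
&\le |R|^{1/2} \Big(\sum_c d_S(c)\Big)^{1/2} \Big(\sum_c d_T(c)\Big)^{1/2}
  && \text{Cauchy-Schwarz} \\
&= |R|^{1/2} |S|^{1/2} |T|^{1/2}
\end{aligned}
$$

[Figure: edge R between A and B, and a vertex c in C with its S-neighbors in B and T-neighbors in A]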

Page 17: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Atserias Grohe Marx

?

Page 18: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

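Same algorithm!

AGM bound: $|R|^x |S|^y |T|^z$

$$
\begin{aligned}
\sum_c \min\big(|R|,\ d_S(c)\, d_T(c)\big)
&\le \sum_c |R|^x \big(d_S(c)\, d_T(c)\big)^{1-x}
  && \text{since } \min(E,F) \le E^x F^{1-x} \\
&\le |R|^x \sum_c d_S(c)^y\, d_T(c)^z \\
&\le |R|^x \Big(\sum_c d_S(c)\Big)^y \Big(\sum_c d_T(c)\Big)^z
  && \text{Hölder} \\
&= |R|^x |S|^y |T|^z
\end{aligned}
$$

A: x + z ≥ 1
B: x + y ≥ 1
C: y + z ≥ 1

[Figure: edge R between A and B, and a vertex c in C with its S-neighbors in B and T-neighbors in A]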
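The chain of inequalities can be checked numerically on a small case. The sketch below uses made-up degree sequences and a made-up value for |R|, and compares the algorithm's work against $|R|^x |S|^y |T|^z$ for a few covers with x between 0 and 1. It is a spot check under those assumptions, not a proof.

```python
def algo_work(r, d_s, d_t):
    """Sum over c of min(|R|, d_S(c) * d_T(c)); r stands for |R|."""
    return sum(min(r, s * t) for s, t in zip(d_s, d_t))

def agm(r, d_s, d_t, x, y, z):
    """|R|^x |S|^y |T|^z, using |S| = sum of d_S(c) and |T| = sum of d_T(c)."""
    return r ** x * sum(d_s) ** y * sum(d_t) ** z

# Made-up degrees for five values of c, and a made-up |R|.
d_s = [1, 4, 9, 2, 16]
d_t = [3, 4, 1, 8, 16]
r = 20

# Each triple satisfies x + z >= 1, x + y >= 1, y + z >= 1.
covers = [(0.5, 0.5, 0.5), (1, 1, 0), (1, 0, 1), (0, 1, 1)]
for x, y, z in covers:
    print((x, y, z), algo_work(r, d_s, d_t), "<=", agm(r, d_s, d_t, x, y, z))
```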

Page 19: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

General Join Result

Attributes/Nodes: [n]

Relations/Hyperedges: $e_1, \dots, e_m \subseteq [n]$

[Figure: hypergraph on nodes 1 to 5 with edge weights $x_1, x_2, x_3, x_4$]

Tables/Projections: $R_1, \dots, R_m$

$x_1, \dots, x_m$ be a fractional cover

AGM bound: $|R_1|^{x_1} \cdots |R_m|^{x_m}$

Our result: O(AGM + Input size)

Provably worst-case optimal join algorithm

Page 20: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

List recovery

[Figure: positions 1 to n, each with an input list $S_1, S_2, S_3, \dots, S_n$, $S_i$ subset of [q], above codeword symbols $c_1, c_2, c_3, \dots, c_n$]

Code C subset of $[q]^n$

Applications in expanders

Page 21: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

An alternate view of joins

[Figure: triangle on vertices A, B, C with edges R, S, T, next to three input lists R, S, T]

Msg in $[q]^3$

Codeword in $[q^2]^3$

Constant dimension
Constant block length

Large alphabet size
Large input list size

Page 22: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Overview of the talk

[Figure: triangle on vertices A, B, C with edges R, S, T]

Page 23: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Sparse Recovery/Compressed Sensing

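[Figure: the matrix (to be designed) applied to the unknown vector gives the observed vector; Decode maps the observed vector to the output. Example with k=2, with entries marked Heavy Hitter and Tail]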

Page 24: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Quantifying the approximation

L2 norm of (unknown - output) ≤ C · L2 norm of the tail

Page 25: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

(Most of) rest of the talk

Page 26: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Designing the matrix

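[Figure: the matrix (to be designed) applied to the unknown vector gives the observed vector; Decode maps the observed vector to the output. Example with k=2]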

Page 27: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Designing the matrix

k=2

[Figure: bipartite k-expander between N left nodes and m right nodes]

< ¼ (neighborhood)

Measurement = heavy hitter value + noise

Heavy tail noise < ¼ (neighborhood)

> ½ of the neighbors of a heavy hitter have the “correct” value

Page 28: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Count-Sketch style algo

k=2

[Figure: bipartite graph between N left nodes and m right nodes]

Estimate = median of O(log N) values

Output the top O(k) estimates

O(N log N) decoding

Indyk Ružić
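The median-and-top-k decoding above can be illustrated with a textbook Count-Sketch. Note the substitution: this sketch uses random buckets and random signs rather than the k-expander matrix from the slides, so it shows the decoding style only. Sizes, seed, the test vector and all names are made up, and "top O(k)" is taken to be exactly k.

```python
import random
from statistics import median

def make_hashes(n, rows, buckets, seed=0):
    """Random bucket and sign tables, one row per repetition (assumed setup)."""
    rng = random.Random(seed)
    bucket = [[rng.randrange(buckets) for _ in range(n)] for _ in range(rows)]
    sign = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(rows)]
    return bucket, sign

def measure(x, bucket, sign, buckets):
    """Each row hashes every coordinate into one bucket with a random sign."""
    y = [[0.0] * buckets for _ in bucket]
    for r in range(len(bucket)):
        for i, xi in enumerate(x):
            y[r][bucket[r][i]] += sign[r][i] * xi
    return y

def decode(y, bucket, sign, n, k):
    """Estimate every coordinate by a median over rows, keep the top k."""
    rows = len(y)
    est = [
        median(sign[r][i] * y[r][bucket[r][i]] for r in range(rows))
        for i in range(n)
    ]
    top = sorted(range(n), key=lambda i: abs(est[i]), reverse=True)[:k]
    return {i: est[i] for i in top}

# Made-up input: two heavy hitters plus a small tail.
n, k, rows, buckets = 100, 2, 11, 64
x = [0.001 * ((i % 5) - 2) for i in range(n)]
x[7], x[62] = 100.0, -80.0

bucket, sign = make_hashes(n, rows, buckets)
decoded = decode(measure(x, bucket, sign, buckets), bucket, sign, n, k)
print(decoded)
```

Decoding touches all n coordinates once per row, which is the linear-in-N cost that the next slides set out to avoid.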

Page 29: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

We need a faster algorithm…

Page 30: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Towards a sub-linear time algo

[Figure: the set S]

Estimate = median value

Output the top O(k) estimates in S

O(|S| log N) decoding

All we need to do is to compute a small S quickly

Page 31: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Porat-Strauss Idea: Recursion!

[N] = $\{0,1\}^{\log N}$

[√N]: Solve in ~ √N time
[√N]: Solve in ~ √N time

Page 32: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The problem we now need to solve

Elements of S

Geometrically…

[Figure: k by k grid with a "?"]

Output size ~ $k^2$

Overall running time ~ √N + $k^2$

Not sub-linear for k > √N

Use a table look-up to decrease the run time
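One way to read the two-way split: each index in [N] is written as a (high, low) pair over [√N], each half is solved separately, and the candidate set S is every combination of a recovered high part with a recovered low part. The sketch below assumes the two sub-solvers return exactly the halves of the heavy hitters (an idealization); names and numbers are made up. It shows where the ~k² candidates come from.

```python
def split_index(i, sqrt_n):
    """Write i in [N] as (high, low) with both parts in [sqrt_n]."""
    return divmod(i, sqrt_n)

def candidate_set(high_cands, low_cands, sqrt_n):
    """All indices whose high part and low part were both recovered."""
    return {h * sqrt_n + l for h in high_cands for l in low_cands}

# Made-up example: N = 10_000, so sqrt_n = 100, with k = 2 heavy hitters.
sqrt_n = 100
heavy = [1234, 5678]
highs = {split_index(i, sqrt_n)[0] for i in heavy}
lows = {split_index(i, sqrt_n)[1] for i in heavy}

S = candidate_set(highs, lows, sqrt_n)
print(sorted(S))
```

The set contains the two true heavy hitters plus two spurious combinations, so it has k² = 4 elements even though only k = 2 of them matter.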

Page 33: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Finally…

Page 34: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Slightly different recursion

[Figure: [N], with log N bits, above three pieces $[N^{2/3}]$, $[N^{2/3}]$, $[N^{2/3}]$]

Geometric problem to solve

Overall runtime: $k^{3/2} + N^{2/3}$
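A plausible reading of this slide, stated here as an assumption: each index is split into three blocks (a, b, c), each sub-problem over $[N^{2/3}]$ recovers the heavy hitters projected onto two of the three blocks, and the candidates are the triples consistent with all three projections, which is the triangle join from earlier in the talk. The sketch uses base-10 digits and a simple pairwise join for brevity, not a worst-case optimal one; all names and data are made up.

```python
def thirds(i, base):
    """Split i in [base**3] into three blocks (a, b, c)."""
    a, rest = divmod(i, base * base)
    b, c = divmod(rest, base)
    return a, b, c

def candidates_from_shadows(R, S, T, base):
    """Indices (a, b, c) with (a, b) in R, (b, c) in S and (c, a) in T."""
    S_by_b = {}
    for b, c in S:
        S_by_b.setdefault(b, []).append(c)
    T_set = set(T)
    return {
        a * base * base + b * base + c
        for a, b in R
        for c in S_by_b.get(b, [])
        if (c, a) in T_set
    }

# Made-up example: N = 1000 (base 10), heavy hitters 123 and 456.
base = 10
heavy = [123, 456]
blocks = [thirds(i, base) for i in heavy]
R = {(a, b) for a, b, c in blocks}
S = {(b, c) for a, b, c in blocks}
T = {(c, a) for a, b, c in blocks}

print(sorted(candidates_from_shadows(R, S, T, base)))
```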

Page 35: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Our Results

L2/L2 sparse recovery with failure prob p

Optimal k log(N/k) measurements*

$k^{1+\varepsilon}$ poly-log N decoding + space

p ~ $(N/k)^{-k/\text{poly-log } k}$

Also prove tight lower bound of k log(N/k) + log(1/p)

Page 36: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

One algorithm to rule them all
One join query at a time

Atri Rudra
University at Buffalo

Page 37: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Only two problems so far…

[Figure: triangle on vertices A, B, C with edges R, S, T]

Page 38: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

Albert Meyer (via Dick Lipton)

"Prove it for n=3 and then let 3 go to infinity"

Page 39: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The 3rd problem…

Big (hyper)graph G

http://pigeonsandplanes.com/2010/12/thoughts-on-net-neutrality.html

[Figure: small hypergraph on nodes 1 to 5]

Small (hyper)graph H

Compute all copies of H in G

Our join algorithm gives a worst-case optimal algorithm for any constant-sized H

Joins model many more problems, e.g. CSPs

Page 40: One algorithm to rule them all One join query at a time Atri Rudra University at Buffalo.

The take-away message

Join algo

http://welovetumblr.blogspot.com/2012/07/thor-is.html

