geometric embeddings and graph expansion
James R. Lee
Institute for Advanced Study (Princeton)
University of Washington (Seattle)
outline
1. Philosophy of geometric embeddings
2. Example: Finding balanced cuts in graphs
3. Four important open problems
in the talk:
not in the talk:
No proofs (one slide). Mathematics borrows from high-dimensional convex geometry, functional analysis, harmonic analysis, differential geometry... (see other talks on my web page)
so you should ask questions if something is confusing!
geometric embeddings in CS
combinatorial problem
geometric representation
embedding
nicer geometric space
combinatorial solution
connections in CS
geometric search
clustering
dimension reduction
machine learning
computational biology
approximation algorithms
divide and conquer
network design
graph layout
tree decompositions
geometric optimization
semi-definite programming
PCPs, unique games
Fourier analysis of boolean functions
graph expansion and the sparsest cut
Input: A graph G=(V,E).
For a cut (S, S̄), let E(S, S̄) denote the edges crossing the cut.
The sparsity of S is the value
The SPARSEST CUT problem is to find the cut that minimizes the sparsity of S.
This problem is NP-hard, so we try to find approximately optimal cuts. (approximation algorithms)
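To make the objective concrete, here is a minimal brute-force sketch. The sparsity formula itself is not reproduced in the text, so the code assumes the edge-expansion normalization |E(S, S̄)| / min(|S|, |S̄|). The function names and the example graph are illustrative only. The search enumerates exponentially many subsets, which is why approximation algorithms are needed for large graphs.

```python
from itertools import combinations

def sparsity(edges, side, n):
    """Assumed normalization: crossing edges divided by the smaller side."""
    crossing = sum(1 for u, v in edges if (u in side) != (v in side))
    return crossing / min(len(side), n - len(side))

def brute_force_sparsest_cut(n, edges):
    """Try every nonempty side of size at most n/2 (exponential time)."""
    best_side, best_value = None, float("inf")
    for k in range(1, n // 2 + 1):
        for subset in combinations(range(n), k):
            side = set(subset)
            value = sparsity(edges, side, n)
            if value < best_value:
                best_side, best_value = side, value
    return best_side, best_value

# Hypothetical example: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
best_side, best_value = brute_force_sparsest_cut(6, edges)
```

On this example the search separates the two triangles, cutting only the bridge edge.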
graph expansion and the sparsest cut
Given a graph G=(V,E), we want to
Clustering Divide & conquer algorithms
This is actually the EDGE EXPANSION problem. The full SPARSEST CUT problem is a weighted version.
where is the geometry?
Leighton-Rao (1988) approach via LP duality
d is a metric on V if d(x,y) = d(y,x) and d(x,y) ≤ d(x,z) + d(z,y) ∀ x,y,z ∈ V
“cut metric”
d(x,y) = 1 if x,y are on different sides of S
d(x,y) = 0 otherwise
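A small sketch of the cut metric, assuming vertices are integers and S is given as a set. The helper names are hypothetical. It checks the two conditions stated above (symmetry and the triangle inequality) by exhaustive enumeration, and shows that summing d over the edges counts the edges crossing the cut.

```python
from itertools import product

def cut_metric(vertices, side):
    """d(x, y) = 1 if x and y lie on different sides of the cut, else 0."""
    return {(x, y): int((x in side) != (y in side))
            for x, y in product(vertices, repeat=2)}

def satisfies_metric_conditions(vertices, d):
    """Check symmetry and the triangle inequality over all pairs and triples."""
    symmetric = all(d[x, y] == d[y, x]
                    for x, y in product(vertices, repeat=2))
    triangle = all(d[x, y] <= d[x, z] + d[z, y]
                   for x, y, z in product(vertices, repeat=3))
    return symmetric and triangle

# Hypothetical example: a 5-cycle with S = {0, 1}.
vertices = range(5)
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
d = cut_metric(vertices, {0, 1})
crossing = sum(d[u, v] for u, v in cycle_edges)
```

Note that two distinct vertices on the same side are at distance 0, so this is a metric only in the relaxed sense used on the slide.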
can minimize with a linear program
dual of the multi-commodity flow LP
- every edge has capacity 1
- send 1 unit of flow from x → y for every x,y ∈ V
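The objective of the linear program is not reproduced in the text. A standard way to write the metric relaxation for the unit-capacity, all-pairs setting described here (stated as an assumption, not necessarily in the form used in the talk) is:

```latex
\begin{aligned}
\text{minimize}\quad & \sum_{\{u,v\} \in E} d(u,v) \\
\text{subject to}\quad & \sum_{\{u,v\} \subseteq V} d(u,v) = 1, \\
& d(x,y) \le d(x,z) + d(z,y) \quad \forall\, x,y,z \in V, \\
& d(x,y) = d(y,x) \ge 0 \quad \forall\, x,y \in V.
\end{aligned}
```

The cut metric of any S, scaled by 1/(|S||S̄|), is feasible and has objective value |E(S, S̄)|/(|S||S̄|), so the LP optimum is a lower bound on that ratio over all cuts. Since max(|S|, |S̄|) lies between n/2 and n, this ratio times n is within a factor of 2 of |E(S, S̄)| / min(|S|, |S̄|).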
finding cuts using embeddings
Now we find a cut using LP relaxation + embeddings [Linial London Rabinovich 1992]
[Figure labels: cut metric d for a cut (S, S̄); LP relaxation; embedding into R^n; cut (S, S̄) recovered from the embedding (?)]
1. Want to find a good cut in G.
2. Solve a linear program to get a metric d.
3. Embed the metric into a Euclidean space.
4. Use a geometric algorithm to find S. (random hyperplane cut)
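Steps 3 and 4 can be sketched as follows, assuming the embedded points are already given. The talk only names a "random hyperplane cut", so the rounding here (project onto a random Gaussian direction, then try every threshold along it) is an assumed stand-in, not the analysis from the talk. The normalization by min(|S|, |S̄|) and all names are also assumptions.

```python
import random

def sparsity(edges, side, n):
    """Assumed normalization: crossing edges divided by the smaller side."""
    crossing = sum(1 for u, v in edges if (u in side) != (v in side))
    return crossing / min(len(side), n - len(side))

def hyperplane_sweep_cut(points, edges, rng):
    """Project onto a random direction and return the best threshold cut."""
    n = len(points)
    dim = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    proj = [sum(c * g for c, g in zip(p, direction)) for p in points]
    order = sorted(range(n), key=lambda v: proj[v])
    best_side, best_value = None, float("inf")
    for k in range(1, n):
        # Each prefix of the sorted order is one side of a hyperplane
        # orthogonal to the random direction.
        side = set(order[:k])
        value = sparsity(edges, side, n)
        if value < best_value:
            best_side, best_value = side, value
    return best_side, best_value

# Hypothetical embedding: two triangles joined by edge (2, 3), placed as
# two well-separated groups on a line in R^2.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (10.0, 0.0), (11.0, 0.0), (12.0, 0.0)]
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
side, value = hyperplane_sweep_cut(points, edges, random.Random(0))
```

Because the embedding keeps the two triangles far apart, any direction that is not orthogonal to the line recovers the cut between them.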
embeddings and distortion
Given a metric space (X,d), a Euclidean embedding of X is a mapping f : X → R^n.
distortion measures how well f preserves the structure of X
The distortion of f is the smallest number D such that for all x,y ∈ X: d(x,y) ≤ ‖f(x) − f(y)‖ ≤ D · d(x,y)
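A minimal sketch of computing the distortion of a given embedding of a finite metric space. It assumes the usual convention d(x,y) ≤ ‖f(x) − f(y)‖ ≤ D · d(x,y) up to rescaling f, so D is the largest stretch divided by the smallest stretch. The function name and the example are illustrative, and the code assumes f maps distinct points to distinct images.

```python
import math
from itertools import combinations

def distortion(points, dist, embedding):
    """Largest stretch divided by smallest stretch over all pairs (L2 norm)."""
    ratios = [math.dist(embedding[x], embedding[y]) / dist(x, y)
              for x, y in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Hypothetical example: the 4-cycle with its shortest-path metric,
# embedded as the corners of the unit square.
def cycle_dist(u, v):
    return min(abs(u - v), 4 - abs(u - v))

square = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
D = distortion(range(4), cycle_dist, square)
```

Adjacent vertices keep distance 1, while opposite vertices at distance 2 land at Euclidean distance √2, so this embedding has distortion √2.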
Depending on the application, sometimes we consider the L1 norm or the L2 norm.
- Embeddings into L2 are stronger than L1 embeddings
- L1 embeddings are good enough for finding sparse cuts
- We have many fewer techniques for analyzing L1 embeddings
first results
[Bourgain 1985] Every n-point metric space has a Euclidean embedding (L2 norm) with distortion O(log n).
[Linial-London-Rabinovich, Aumann-Rabani STOC’92]
- Can use this to get an O(log n)-approximation for the SPARSEST CUT problem.
- Bourgain’s result is tight (using expander graphs)
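The talk gives no construction, so the following is only a sketch in the spirit of Bourgain's proof: each coordinate is the distance from a point to a random subset, with subset sizes doubling from scale to scale. The subset sizes, the `repeats` parameter, and the function name are assumptions, and the code does not certify the O(log n) distortion bound. It only produces the map.

```python
import random

def frechet_embedding(points, dist, rng, repeats=3):
    """Map each point to its distances from random subsets of sizes 1, 2, 4, ..."""
    n = len(points)
    scales = n.bit_length() - 1  # floor(log2(n))
    subsets = []
    for j in range(scales + 1):
        for _ in range(repeats):
            subsets.append(rng.sample(points, 2 ** j))
    return {x: [min(dist(x, a) for a in subset) for subset in subsets]
            for x in points}

# Hypothetical example: the 8-cycle with its shortest-path metric.
def cycle_dist(u, v):
    return min(abs(u - v), 8 - abs(u - v))

points = list(range(8))
f = frechet_embedding(points, cycle_dist, random.Random(1))
```

Every coordinate of this kind is 1-Lipschitz by the triangle inequality, so no single coordinate stretches a distance. The difficult part of Bourgain's theorem is showing that the coordinates together do not shrink distances by more than an O(log n) factor.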
new results
semi-definite programming
special family of metric spaces
“negative type”
A metric space (X,d) is said to be of negative type if we can write d(u,v) = ‖x_u − x_v‖² for all u,v ∈ X, where x_u ∈ R^n for every u ∈ X.
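A short sketch of what the definition demands, assuming it reads d(u,v) = ‖x_u − x_v‖² with the L2 norm. Not every point set qualifies, because the squared distances must themselves satisfy the triangle inequality. The function names and examples are illustrative.

```python
from itertools import permutations

def squared_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def squared_distances_form_metric(vectors, tol=1e-12):
    """Check the triangle inequality for d(u, v) = |x_u - x_v|^2."""
    for x, y, z in permutations(vectors, 3):
        if squared_dist(x, y) > squared_dist(x, z) + squared_dist(z, y) + tol:
            return False
    return True

# Hypercube corners: the squared distance equals the Hamming distance.
cube = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
# Three collinear points: squared distances are 1, 1, 4, and 4 > 1 + 1.
line = [(0.0,), (1.0,), (2.0,)]
```

The hypercube passes the check and the collinear points fail it, which shows that negative-type vectors cannot contain three points forming an obtuse angle.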
embedding overview
metric spaces have various scales
Measured descent: New multi-scale embedding technique [Krauthgamer-L-Mendel-Naor FOCS’04]
exploit non-trivial interaction between scales
O(√log n)-approximation algorithm for EDGE EXPANSION [Arora-Rao-Vazirani STOC’04]: new techniques in high-dimensional convex geometry
single-scale analysis via geometric chaining argument
Gluing embeddings with “partitions of unity” [L SODA’05]
Improvements to the ARV geometric structure theorems [Chawla-Gupta-Racke SODA’05, L 05]
upper bound [CGR 05]
O(√log n · log log n)-approximation for SPARSEST CUT [Arora-L-Naor STOC’05, L 06], based on new Euclidean embedding theorems for “negative type” spaces
important problems: negative-type metrics
analyze this semi-definite program
- Analysis is equivalent to finding the best distortion of n-point “negative type” metrics into Euclidean space with the L1 norm
Upper bound: [Arora-L-Naor STOC’05, L 06]
Lower bound: [Khot-Vishnoi FOCS’05]
- Related to Fourier analysis of boolean functions, probabilistically checkable proofs (PCPs), unique games conjecture, geometric analysis...
important problems: edit distance
Example strings: AAGCT, AACT, ACTA
For two strings s,t ∈ {A,C,G,T}^d,
d_EDIT(s,t) = minimum number of insert/delete character operations needed to change s → t
- What is the distortion needed to embed d_EDIT into a Euclidean space (with the L1 norm)? (Applications to nearest-neighbor search, sketching, fast distance computations...)
Upper bound: [Ostrovsky-Rabani STOC’05]
Lower bound: [Krauthgamer-Rabani SODA’06]
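For reference, the insert/delete edit distance defined on this slide can be computed exactly by dynamic programming. The sketch below uses the identity d(s,t) = |s| + |t| − 2·LCS(s,t), where LCS is the length of the longest common subsequence. The function name is illustrative. This is the quadratic-time baseline that embeddings and sketches aim to beat, not an embedding itself.

```python
def indel_distance(s, t):
    """Minimum number of single-character insertions and deletions
    needed to turn s into t, via the longest common subsequence."""
    prev = [0] * (len(t) + 1)
    for a in s:
        cur = [0]
        for j, b in enumerate(t, start=1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    lcs = prev[-1]
    return len(s) + len(t) - 2 * lcs

# The strings shown on the slide.
d1 = indel_distance("AAGCT", "AACT")   # delete G
d2 = indel_distance("AACT", "ACTA")    # delete one A, insert one A
d3 = indel_distance("AAGCT", "ACTA")
```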
important problems: vertex separators
Earlier, we talked about edge cuts. We can also consider vertex cuts.
- Most important application: Finding low-treewidth decompositions (useful as a basic step in many algorithms)
- Best approximation algorithms are from [Feige-Hajiaghayi-L STOC’05]. They require a stronger kind of embedding, and we can only extend some of the known techniques.
important problems: planar multi-flows
Max-flow / Min-cut theorem: In any graph G, for any two nodes s and t, the value of the minimum s-t cut = the value of the maximum s-t flow.
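The single-commodity theorem can be checked numerically on a small example. The sketch below computes a maximum flow with the Edmonds-Karp augmenting-path method (a standard algorithm, not one discussed in the talk) and compares it with a brute-force minimum cut. The graph, capacities, and function names are hypothetical.

```python
from collections import deque
from itertools import combinations

def max_flow(n, cap_edges, s, t):
    """Edmonds-Karp on an undirected graph given as (u, v, capacity) triples."""
    cap = [[0] * n for _ in range(n)]
    for u, v, c in cap_edges:
        cap[u][v] += c
        cap[v][u] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow
        # Find the bottleneck capacity, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

def min_cut(n, cap_edges, s, t):
    """Brute force over all vertex sets that contain s but not t."""
    others = [v for v in range(n) if v not in (s, t)]
    best = float("inf")
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            side = {s, *extra}
            value = sum(c for u, v, c in cap_edges
                        if (u in side) != (v in side))
            best = min(best, value)
    return best

graph = [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]
flow_value = max_flow(4, graph, 0, 3)
cut_value = min_cut(4, graph, 0, 3)
```

Both values are 5 on this graph, as the theorem predicts.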
What about multi-commodity flows?
[Figure: graph G with terminals s1, s2, s3 and t1, t2, t3]
- In general graphs, there is no max-flow/min-cut theorem for multi-flows. The gap can be log(k), where k = # of flows
- What about planar graphs?
Conjecture: The max-flow/min-cut gap is only O(1) for multi-flows on planar graphs.
This conjecture is equivalent to the question: If d(u,v) is the shortest-path metric on a planar graph G, does the metric space (G,d) embed into a Euclidean space (with the L1 norm) with O(1) distortion?
http://www.cs.berkeley.edu/~jrl
conclusion
- Embeddings are a fundamental tool in Computer Science
- Very rich, exciting mathematics
- Lots of important open problems at various levels of difficulty
- Many applications to other parts of science