Date post: 06-Jun-2020
Data Streams & Communication Complexity
Lecture 2: Graph Spanners, Sparsifiers, & Sketches
Andrew McGregor, UMass Amherst

Graph Streams

- Consider a stream of m edges

  $\langle e_1, e_2, \ldots, e_m \rangle$

  defining a graph G with nodes $V = [n]$ and $E = \{e_1, \ldots, e_m\}$.

- Semi-streaming: What can we compute with $O(n \cdot \operatorname{polylog} n)$ space?


Outline

Spanners and Distances

Sparsifiers and Cuts

Sketches and Dynamic Graphs
- Connectivity
- k-Connectivity
- Minimum Cut


Graph Distances

- Goal: Approximate the length $d_G(u, v)$ of the shortest path between a pair of nodes $u, v \in G$.

Definition: An α-spanner of a graph G is a subgraph H such that for any nodes u, v,

$d_G(u, v) \le d_H(u, v) \le \alpha \, d_G(u, v)$.
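The definition can be checked directly on small unweighted graphs. The following is a minimal sketch, assuming H is given as a subset of G's edges (so the lower bound holds automatically); the helper names `to_adj`, `bfs_distances`, and `stretch` are illustrative and not from the slides.

```python
from collections import deque


def to_adj(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, src):
    """Hop distances from src to every node reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def stretch(g_edges, h_edges):
    """Smallest alpha with d_H(u, v) <= alpha * d_G(u, v) for all pairs
    connected in G; infinity if H disconnects such a pair."""
    g, h = to_adj(g_edges), to_adj(h_edges)
    worst = 1.0
    for u in g:
        dg, dh = bfs_distances(g, u), bfs_distances(h, u)
        for v, d in dg.items():
            if v == u:
                continue
            if v not in dh:
                return float("inf")
            worst = max(worst, dh[v] / d)
    return worst
```

For example, dropping one edge of a 4-cycle leaves a path whose endpoints are at distance 3 instead of 1, so the path is a 3-spanner of the cycle.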


Warm-Up: Connectivity

- Goal: Compute the number of connected components.

- Algorithm: Maintain a spanning forest F.
  - $F \leftarrow \emptyset$
  - For each edge (u, v), if u and v aren't connected in F, set $F \leftarrow F \cup \{(u, v)\}$.

- Analysis:
  - F has the same number of connected components as G.
  - F has at most n − 1 edges.

- Thm: Can count connected components in $O(n \log n)$ space.
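The warm-up algorithm can be sketched in a few lines. This version tests "connected in F" with a union-find structure, which is one common choice and an assumption here (the slides do not say how connectivity in F is tested); nodes are numbered 0..n-1 rather than 1..n.

```python
from typing import Iterable, List, Tuple


def spanning_forest(n: int, stream: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """One pass over the edge stream; keeps only edges joining two components."""
    parent = list(range(n))  # union-find over nodes 0..n-1

    def find(x: int) -> int:
        # Path halving keeps the trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v in stream:
        ru, rv = find(u), find(v)
        if ru != rv:  # u and v are not yet connected in F
            parent[ru] = rv
            forest.append((u, v))
    return forest


def count_components(n: int, stream: Iterable[Tuple[int, int]]) -> int:
    # A forest with k edges on n nodes has exactly n - k components.
    return n - len(spanning_forest(n, stream))
```

The state is the `parent` array plus at most n − 1 forest edges, each an $O(\log n)$-bit node label, which matches the $O(n \log n)$ space bound.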


Spanners

- Algorithm:
  - $H \leftarrow \emptyset$.
  - For each edge (u, v), if $d_H(u, v) \ge 2t$, set $H \leftarrow H \cup \{(u, v)\}$.

- Analysis:
  - Distances increase by at most a factor 2t − 1, since an edge (u, v) is only forgotten if there's already a detour of length at most 2t − 1.
  - Lemma: H has $O(n^{1+1/t})$ edges, since all cycles have length ≥ 2t + 1.

Theorem: Can (2t − 1)-approximate all distances using only $O(n^{1+1/t})$ space.
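The greedy rule above translates almost line for line into code. The sketch below assumes an unweighted, undirected stream and tests the condition $d_H(u, v) \ge 2t$ with a depth-limited BFS; the function names are illustrative.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def bounded_distance(adj: Dict[int, Set[int]], src: int, dst: int, limit: int) -> int:
    """BFS distance from src to dst in H, or limit + 1 if it exceeds limit."""
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        if dist[x] == limit:
            continue  # do not explore beyond the depth limit
        for y in adj.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1
                if y == dst:
                    return dist[y]
                queue.append(y)
    return limit + 1


def greedy_spanner(stream: Iterable[Tuple[int, int]], t: int) -> List[Tuple[int, int]]:
    """Keep edge (u, v) only if H has no u-v path of length <= 2t - 1."""
    adj: Dict[int, Set[int]] = {}
    kept = []
    for u, v in stream:
        if bounded_distance(adj, u, v, 2 * t - 1) >= 2 * t:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
            kept.append((u, v))
    return kept
```

With t = 2, the closing edge of a 4-cycle is dropped because a detour of length 3 already exists, while the closing edge of a 5-cycle is kept because the only detour has length 4.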


Proof of Lemma

Lemma: A graph H on n nodes with no cycles of length ≤ 2t has $O(n^{1+1/t})$ edges.

- Let $d = 2m/n$ be the average degree of H.

- Let J be the graph formed by removing all nodes with degree less than d/2. Note $J \ne \emptyset$ because fewer than $n(d/2) = m$ edges are removed.

- Grow a BFS of depth t from an arbitrary node in J.

- Because (a) there are no cycles of length less than 2t + 1 and (b) all degrees in J are at least d/2, the number of nodes at the t-th level of the BFS is at least

  $(d/2 - 1)^t = (m/n - 1)^t$.

- But $(m/n - 1)^t \le |J| \le n$, and therefore $m \le n + n^{1+1/t}$.
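The final implication is stated without the intermediate algebra. Spelled out, assuming $m \ge n$ so that $m/n - 1 \ge 0$ (if $m < n$ the bound is immediate):

```latex
\[
\left(\frac{m}{n} - 1\right)^{t} \le n
\;\Longrightarrow\;
\frac{m}{n} - 1 \le n^{1/t}
\;\Longrightarrow\;
m \le n + n \cdot n^{1/t} = n + n^{1+1/t} = O\!\left(n^{1+1/t}\right).
\]
```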


Cuts and Sparsifiers

- Goal: Approximate the capacity $C_G(S)$ of any cut $(S, V \setminus S)$ in G.

Definition: An α-sparsifier of a graph G is a weighted subgraph H such that for any cut $(S, V \setminus S)$,

$C_G(S) \le C_H(S) \le \alpha \, C_G(S)$,

where $C_G(S)$ and $C_H(S)$ are the capacities of the cut in G and H respectively.

Theorem (Batson, Spielman, Srivastava): There exists an offline algorithm A returning a (1 + ε)-sparsifier with $O(n \varepsilon^{-2})$ edges.

- Idea: Use A as a black box to recursively sparsify the graph stream.
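To make the definition concrete, the sketch below computes cut capacities and checks the sparsifier inequality by enumerating every cut. This brute force takes exponential time and is only meant for tiny examples; it is not the Batson-Spielman-Srivastava construction. The representation (a dict from edge pairs to weights) and the function names are assumptions.

```python
from itertools import combinations
from typing import Dict, Iterable, Tuple

WeightedGraph = Dict[Tuple[int, int], float]


def cut_capacity(graph: WeightedGraph, side: Iterable[int]) -> float:
    """Total weight of edges with exactly one endpoint in `side`."""
    side = set(side)
    return sum(w for (u, v), w in graph.items() if (u in side) != (v in side))


def is_sparsifier(g: WeightedGraph, h: WeightedGraph,
                  nodes: Iterable[int], alpha: float) -> bool:
    """Check C_G(S) <= C_H(S) <= alpha * C_G(S) for every proper cut S."""
    nodes = list(nodes)
    for size in range(1, len(nodes)):
        for side in combinations(nodes, size):
            cg, ch = cut_capacity(g, side), cut_capacity(h, side)
            if not (cg <= ch <= alpha * cg):
                return False
    return True
```

For a unit-weight triangle G, the two-edge path H with both weights doubled satisfies the inequality for α = 2 but not for α = 1.5, since the cut isolating the middle node has capacity 4 in H and 2 in G.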


Basic Properties of Sparsifiers

Lemma: If $H_1$ and $H_2$ are α-sparsifiers of $G_1$ and $G_2$ respectively, then $H_1 \cup H_2$ is an α-sparsifier of $G_1 \cup G_2$.

Lemma: If J is an α-sparsifier of H and H is an α-sparsifier of G, then J is an $\alpha^2$-sparsifier of G.
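Both lemmas follow by chaining the defining inequality. A short sketch, assuming that a union adds capacities, i.e. $C_{G_1 \cup G_2}(S) = C_{G_1}(S) + C_{G_2}(S)$ (weights of shared edges are summed):

```latex
\begin{align*}
C_{G_1 \cup G_2}(S)
  &= C_{G_1}(S) + C_{G_2}(S)
   \le C_{H_1}(S) + C_{H_2}(S)
   = C_{H_1 \cup H_2}(S) \\
  &\le \alpha\, C_{G_1}(S) + \alpha\, C_{G_2}(S)
   = \alpha\, C_{G_1 \cup G_2}(S), \\[4pt]
C_G(S)
  &\le C_H(S) \le C_J(S) \le \alpha\, C_H(S) \le \alpha^2\, C_G(S).
\end{align*}
```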


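Stream Sparsification

- Divide the stream into segments $G_1, G_2, \ldots$, each of $t = O(n \varepsilon^{-2})$ edges.

- Consider a binary tree over the segments:

  $G_1 \quad G_2 \quad G_3 \quad G_4 \quad G_5 \quad G_6 \quad G_7 \quad G_8$

  $G_1 \cup G_2 \quad G_3 \cup G_4 \quad G_5 \cup G_6 \quad G_7 \cup G_8$

  $G_1 \cup G_2 \cup G_3 \cup G_4 \quad G_5 \cup G_6 \cup G_7 \cup G_8$

  $G = G_1 \cup G_2 \cup G_3 \cup G_4 \cup G_5 \cup G_6 \cup G_7 \cup G_8$

- Recursively use A with parameter 1 + γ:
  - Read in $G_1$: compute $A(G_1)$ and forget $G_1$
  - Read in $G_2$: compute $A(G_2)$ and forget $G_2$
  - Compute $A(A(G_1) \cup A(G_2))$ and forget $A(G_1)$ and $A(G_2)$
  - Read in $G_3$: compute $A(G_3)$ and forget $G_3$
  - Read in $G_4$: compute $A(G_4)$ and forget $G_4$
  - Compute $A(A(G_3) \cup A(G_4))$ and forget $A(G_3)$ and $A(G_4)$
  - Compute $A(A(A(G_1) \cup A(G_2)) \cup A(A(G_3) \cup A(G_4)))$ and forget . . .

- Results in a $(1 + \gamma)^{\log m}$-sparsifier for G in $O(n \gamma^{-2} \log m)$ space.

- If $\gamma = O(\varepsilon / \log m)$, we get a (1 + ε)-sparsifier in $O(n \varepsilon^{-2} \log^3 m)$ space.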
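The recursion above is a merge-and-reduce scheme and can be organized like a binary counter: keep at most one sparsifier per tree level and combine two whenever a level is already occupied. The sketch below takes the offline algorithm A as a caller-supplied function `sparsify`; that callback, the dict-of-weights representation, and the choice to simply union the leftover levels at the end are all assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Edge = Tuple[int, int]
Weighted = Dict[Edge, float]


def merge(a: Weighted, b: Weighted) -> Weighted:
    """Union of two weighted graphs; weights of repeated edges add."""
    out = dict(a)
    for e, w in b.items():
        out[e] = out.get(e, 0.0) + w
    return out


def stream_sparsify(stream: Iterable[Edge], segment_size: int,
                    sparsify: Callable[[Weighted], Weighted]) -> Weighted:
    # levels[i] holds a sparsifier covering 2^i consecutive segments, or None.
    levels: List[Optional[Weighted]] = []

    def push(graph: Weighted) -> None:
        carry, i = sparsify(graph), 0
        # Like binary addition: merge upward while the level is occupied.
        while i < len(levels) and levels[i] is not None:
            carry = sparsify(merge(levels[i], carry))
            levels[i] = None
            i += 1
        if i == len(levels):
            levels.append(None)
        levels[i] = carry

    segment: Weighted = {}
    count = 0
    for u, v in stream:
        e = (min(u, v), max(u, v))
        segment[e] = segment.get(e, 0.0) + 1.0
        count += 1
        if count == segment_size:
            push(segment)
            segment, count = {}, 0
    if count:
        push(segment)  # final, possibly short, segment

    result: Weighted = {}
    for level in levels:
        if level is not None:
            result = merge(result, level)
    return result
```

At any time only one graph per level is stored, which is where the $\log m$ factor in the space bound comes from; each stored graph has passed through at most one call to `sparsify` per level beneath it, which gives the $(1 + \gamma)^{\log m}$ approximation factor.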

12/25

Page 38: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 39: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:

I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 40: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 41: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 42: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)

I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 43: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 44: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 45: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)

I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.

12/25

Page 46: Data Streams & Communication Complexitymcgregor/slides/epit-2.pdf · Proof of Lemma Lemma A graph H on n nodes with no cycles of length 2t has O(n1+1=t) edges. I Let d = 2m=n be average

Stream SparsificationI Divide stream into segments G1,G2, . . . each of t = O(nε−2) edges.I Consider binary tree over segments

G1 G2 G3 G4 G5 G6 G7 G8

G1∪G2 G3∪G4 G5∪G6 G7∪G8

G1∪G2∪G3∪G4 G5∪G6∪G7∪G8

G=G1∪G2∪G3∪G4∪G5∪G6∪G7∪G8

I Recursively use A with parameter 1 + γ:I Read in G1: compute A(G1) and forget G1

I Read in G2: compute A(G2) and forget G2

I Compute A(A(G1) ∪ A(G2)) and forget A(G1) and A(G2)I Read in G3: compute A(G3) and forget G3

I Read in G4: compute A(G4) and forget G4

I Compute A(A(G3) ∪ A(G4)) and forget A(G3) and A(G4)I Compute A(A(A(G1) ∪A(G2)) ∪A(A(G3) ∪A(G4))) and forget . . .

I Results in a (1 + γ)logm-sparsifier for G in O(nγ−2 log m) space.I If γ = O(ε/ log m), we get (1 + ε)-sparsifier in O(nε−2 log3 m) space.



Dynamic Graph Streams

• Consider a stream of edge insertions and deletions, e.g.,

〈add(1, 2), add(1, 4), add(2, 3), add(1, 3), add(4, 5), add(3, 4), del(1, 4)〉

would result in the following graph:

[Figure: graph on nodes 1 to 5 with edges {1,2}, {1,3}, {2,3}, {3,4}, {4,5}]

• Dynamic semi-streaming: What can we compute about a dynamic graph with only $O(n \cdot \mathrm{polylog}\, n)$ space?
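To make the semantics of the update stream concrete, here is a minimal sketch that replays the example stream and reports the final edge set. The function name `apply_stream` and the tuple encoding of updates are assumptions made for illustration. Note that this stores every edge explicitly, which is exactly what a semi-streaming algorithm cannot afford on dense graphs; it only shows what graph the stream defines.

```python
def apply_stream(updates):
    """Replay add/del updates and return the resulting set of undirected edges."""
    edges = set()
    for op, u, v in updates:
        e = (min(u, v), max(u, v))  # store each undirected edge as (smaller, larger)
        if op == "add":
            edges.add(e)
        elif op == "del":
            edges.discard(e)
        else:
            raise ValueError(f"unknown operation: {op}")
    return edges


stream = [
    ("add", 1, 2), ("add", 1, 4), ("add", 2, 3), ("add", 1, 3),
    ("add", 4, 5), ("add", 3, 4), ("del", 1, 4),
]
final_edges = apply_stream(stream)
print(sorted(final_edges))
```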



Connectivity

• Goal: Test whether $G$ is connected.

• Our algorithm will actually return a spanning forest of $G$.

Lemma. Consider the offline algorithm:

1. For each node, select an incident edge.

2. Contract the selected edges.

3. Repeat until no edges remain.

After at most $\log n$ rounds, the number of remaining nodes equals the number of connected components of $G$. Furthermore, the set of selected edges contains a spanning forest.

• Idea: Emulate the above algorithm in a single pass using $\ell_0$-sampling of a particular vector representation of $G$.
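The offline algorithm in the lemma can be sketched as follows. This is an illustrative implementation, not the streaming algorithm: it uses a union-find structure to represent contraction, and the names `spanning_forest`, `find`, and `selected` are assumptions introduced here.

```python
def spanning_forest(n, edges):
    """Offline contraction algorithm on nodes 1..n; returns (forest, components, rounds)."""
    parent = list(range(n + 1))

    def find(x):
        # Representative of the supernode containing x (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    rounds = 0
    while True:
        # Step 1: every current supernode selects one edge leaving it.
        selected = {}
        for u, v in edges:
            ru, rv = find(u), find(v)
            if ru != rv:
                selected.setdefault(ru, (u, v))
                selected.setdefault(rv, (u, v))
        if not selected:
            break  # Step 3: no edges remain between supernodes.
        rounds += 1
        # Step 2: contract the selected edges, keeping those that merge two supernodes.
        for u, v in selected.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                forest.append((u, v))
    components = len({find(x) for x in range(1, n + 1)})
    return forest, components, rounds


edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
forest, components, rounds = spanning_forest(5, edges)
print(forest, components, rounds)
```

Each round at least halves the number of supernodes that still have an outgoing edge, which is where the logarithmic bound on the number of rounds comes from.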


Useful Graph Representation

• Represent a graph on $[n]$ with edges $E \subset [n] \times [n]$ as a matrix

$G \in \{-1, 0, 1\}^{n \times \binom{n}{2}}$

with non-zero entries $G_{j,(j,k)} = 1$ and $G_{k,(j,k)} = -1$ if $(j,k) \in E$ (with $j < k$). E.g.,

[Figure: graph on nodes 1 to 5 with edges {1,2}, {1,3}, {2,3}, {3,4}, {4,5}]

becomes

     (1,2) (1,3) (1,4) (1,5) (2,3) (2,4) (2,5) (3,4) (3,5) (4,5)
1      1     1     0     0     0     0     0     0     0     0
2     -1     0     0     0     1     0     0     0     0     0
3      0    -1     0     0    -1     0     0     1     0     0
4      0     0     0     0     0     0     0    -1     0     1
5      0     0     0     0     0     0     0     0     0    -1

• Lemma: For $S \subset [n]$, $\mathrm{support}\left(\sum_{i \in S} a_i\right) = E(S)$, where $a_i$ is the $i$th row of the matrix above and $E(S)$ is the set of edges across the cut $(S, V \setminus S)$.
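The lemma is easy to check by brute force on the five-node example. The sketch below builds the rows $a_i$ explicitly and compares the support of their sum with the cut edges for every subset $S$; the helper names (`incidence_rows`, `support_of_sum`, `cut_edges`) are assumptions made for illustration.

```python
from itertools import combinations


def incidence_rows(n, edges):
    """Rows a_1..a_n of the signed node-by-pair matrix, plus the column order."""
    cols = list(combinations(range(1, n + 1), 2))  # (1,2), (1,3), ..., (n-1,n)
    index = {c: i for i, c in enumerate(cols)}
    rows = {v: [0] * len(cols) for v in range(1, n + 1)}
    for u, v in edges:
        j, k = min(u, v), max(u, v)
        rows[j][index[(j, k)]] = 1    # smaller endpoint gets +1
        rows[k][index[(j, k)]] = -1   # larger endpoint gets -1
    return rows, cols


def support_of_sum(rows, cols, S):
    """Pairs whose coordinate is non-zero in the sum of the rows indexed by S."""
    total = [sum(rows[i][c] for i in S) for c in range(len(cols))]
    return {cols[c] for c, x in enumerate(total) if x != 0}


def cut_edges(edges, S):
    """Edges with exactly one endpoint in S."""
    return {(min(u, v), max(u, v)) for u, v in edges if (u in S) != (v in S)}


edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
rows, cols = incidence_rows(5, edges)
print(support_of_sum(rows, cols, {1, 2, 3}))  # edges leaving {1, 2, 3}
```

Edges with both endpoints inside $S$ contribute $+1$ and $-1$ to the same coordinate and cancel, so only the crossing edges survive in the sum.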


Connectivity Algorithm

• Let $A(a_1), A(a_2), \ldots, A(a_n)$ be sketches for $\ell_0$-sampling. We can post-process each sketch to find an edge incident on each node.

• Suppose the edges we found connect, e.g., the nodes $S = \{1, 2, 3\}$. How can we find an edge $e \in E(S)$ without taking another pass?

• Linearity: Because the sketches are linear, we can just add them:

$A(a_1) + A(a_2) + A(a_3) = A(a_1 + a_2 + a_3) \longrightarrow e \in E(S)$

• Under-the-rug: Actually, we need to use $\log n$ independent sketch matrices $B, C, D, \ldots$, one to emulate each round of the algorithm. But this is fine: we can compute each $B(a_i), C(a_i), D(a_i), \ldots$ during the same pass.
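The linearity step can be illustrated with a toy linear sketch. The code below is not an $\ell_0$-sampler: it is a 1-sparse recovery sketch (two sums plus a polynomial fingerprint), used here as a stand-in because the summed vector in the running example happens to have a single non-zero entry. The constants `P` and `R` and all function names are assumptions made for illustration; a real sketch would pick `R` at random and would typically combine such recovery with subsampling.

```python
P = 2_147_483_647  # prime modulus for the fingerprint (2^31 - 1)
R = 123_456_789    # fixed evaluation point for reproducibility; drawn at random in practice


def sketch(x):
    """Linear sketch of an integer vector: (sum x_j, sum j*x_j, sum x_j * R^j mod P)."""
    s0 = sum(x)
    s1 = sum(j * xj for j, xj in enumerate(x))
    s2 = sum(xj * pow(R, j, P) for j, xj in enumerate(x)) % P
    return (s0, s1, s2)


def add_sketches(*sketches):
    """Coordinate-wise sum of sketches; equals the sketch of the summed vectors."""
    s0 = sum(s[0] for s in sketches)
    s1 = sum(s[1] for s in sketches)
    s2 = sum(s[2] for s in sketches) % P
    return (s0, s1, s2)


def recover_one_sparse(s):
    """Return (index, value) if the sketched vector passes the 1-sparse test, else None."""
    s0, s1, s2 = s
    if s0 == 0 or s1 % s0 != 0:
        return None
    j = s1 // s0
    if j < 0 or (s0 * pow(R, j, P)) % P != s2:
        return None
    return (j, s0)


# Columns in the order (1,2),(1,3),(1,4),(1,5),(2,3),(2,4),(2,5),(3,4),(3,5),(4,5).
cols = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]
a1 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
a2 = [-1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
a3 = [0, -1, 0, 0, -1, 0, 0, 1, 0, 0]

combined = add_sketches(sketch(a1), sketch(a2), sketch(a3))
total = [x + y + z for x, y, z in zip(a1, a2, a3)]
index, value = recover_one_sparse(combined)
print(cols[index])  # the edge leaving S = {1, 2, 3}
```

The sketches of the three rows were computed independently, yet their sum is the sketch of $a_1 + a_2 + a_3$, so the crossing edge is recovered without revisiting the stream.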



k-Connectivity

• Goal: Test whether all cuts of $G$ have size at least $k$.

• Our algorithm actually returns a certificate of $k$-connectivity.

Definition. We say a subgraph $H$ is a $k$-certificate for $G$ if

$\forall \text{ cuts } (S, V \setminus S): \; C_H(S) \ge \min(C_G(S), k)$.

Lemma. Let $F_1$ be a spanning forest of $G$ and, for $i \ge 2$, let $F_i$ be a spanning forest of $G \setminus (F_1 \cup \ldots \cup F_{i-1})$. Then $F_1 \cup \ldots \cup F_k$ is a $k$-certificate for $G$.

• Idea: Emulate the above algorithm in a single pass by exploiting the linearity of the connectivity algorithm.
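The peeling construction in the lemma can be tried offline on a small graph. The sketch below assumes a simple graph (no parallel edges) and reads $C_H(S)$ as the number of edges of $H$ crossing the cut; the function names are assumptions made for illustration, and the certificate check enumerates all cuts, so it only suits tiny inputs.

```python
from itertools import combinations


def spanning_forest(n, edges):
    """Greedy spanning forest on nodes 1..n via union-find."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


def k_certificate(n, edges, k):
    """Union of F_1..F_k, where F_i is a spanning forest of G minus earlier forests."""
    remaining = list(edges)
    certificate = []
    for _ in range(k):
        forest = spanning_forest(n, remaining)
        certificate.extend(forest)
        used = set(forest)
        remaining = [e for e in remaining if e not in used]
    return certificate


def cut_size(edges, S):
    return sum((u in S) != (v in S) for u, v in edges)


def is_k_certificate(n, H, G, k):
    """Check C_H(S) >= min(C_G(S), k) for every non-trivial cut (brute force)."""
    nodes = range(1, n + 1)
    for r in range(1, n):
        for S in combinations(nodes, r):
            S = set(S)
            if cut_size(H, S) < min(cut_size(G, S), k):
                return False
    return True


K5 = list(combinations(range(1, 6), 2))  # complete graph on 5 nodes
H = k_certificate(5, K5, 2)
print(len(H), is_k_certificate(5, H, K5, 2))
```

The certificate has at most $k(n-1)$ edges, which is why storing it fits in the semi-streaming budget for small $k$.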


k-Connectivity Algorithm

• Can find $F_1$ using the connectivity algorithm.

• But how can we find $F_2$ without taking another pass over the data?

• Linearity: Suppose we have independent connectivity sketches $A(G)$ and $B(G)$ of the graph $G$.

1. Construct $F_1$ from $A(G)$.
2. Construct $B(F_1)$.
3. Then $B(G) - B(F_1) = B(G \setminus F_1)$ can be used to construct $F_2$.

• Given $A(G), B(G), C(G)$ we would find $F_1$ and $F_2$ as above. We then find $F_3$ from

$C(G) - C(F_1) - C(F_2) = C(G \setminus (F_1 \cup F_2))$.

• And so on. . . The resulting algorithm, connectivity_k, requires one pass and uses $O(k \cdot n \cdot \mathrm{polylog}\, n)$ space.
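The subtraction identity $B(G) - B(F_1) = B(G \setminus F_1)$ holds for any linear map, which the following toy check demonstrates. It is only a sketch under simplifying assumptions: the graph is encoded as a 0/1 edge-indicator vector rather than the signed node-by-pair matrix, and `make_sketch` builds a small random integer matrix rather than an actual connectivity sketch. All names are hypothetical.

```python
import random


def edge_vector(n, edges):
    """0/1 indicator vector of an edge set over all pairs (j, k) with j < k."""
    cols = [(j, k) for j in range(1, n + 1) for k in range(j + 1, n + 1)]
    present = {(min(u, v), max(u, v)) for u, v in edges}
    return [1 if c in present else 0 for c in cols]


def make_sketch(out_dim, in_dim, seed):
    """Return a linear map x -> Mx for a fixed random integer matrix M."""
    rng = random.Random(seed)
    M = [[rng.randint(-5, 5) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]

    return apply


n = 5
G = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
F1 = [(1, 2), (1, 3), (3, 4), (4, 5)]   # a spanning forest of G
G_minus_F1 = [(2, 3)]                   # what remains after removing F1

B = make_sketch(out_dim=4, in_dim=n * (n - 1) // 2, seed=1)
lhs = [g - f for g, f in zip(B(edge_vector(n, G)), B(edge_vector(n, F1)))]
rhs = B(edge_vector(n, G_minus_F1))
print(lhs == rhs)
```

The point is that $B(G)$ was computed during the pass, while $B(F_1)$ is computed afterwards from the forest already in memory, so no second pass is needed.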



Estimating Minimum Cut

• Goal: Estimate the size of the min-cut up to a $(1+\varepsilon)$ factor.

• If the min-cut size is $O(\varepsilon^{-2} \cdot \mathrm{polylog}\, n)$, then the connectivity_k algorithm can find the min-cut exactly in $O(\varepsilon^{-2} \cdot n \cdot \mathrm{polylog}\, n)$ space.

• What can be done if the min-cut is large?

Theorem (Karger). Let $G = (V, E)$ be an unweighted graph with min-cut value $\lambda$. If we sample each edge with probability

$p \ge p^* := 6\lambda^{-1}\varepsilon^{-2}\log n$

and assign weight $1/p$ to sampled edges, then the resulting graph is a $(1+\varepsilon)$-sparsification of $G$ with high probability.

• Idea: Subsample the input graph at different rates and use connectivity_k to compute the min-cut size if it's small enough.
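The sampling step in Karger's theorem can be sketched as below. The function names are assumptions made for illustration, the logarithm is taken to be natural (the slide does not specify a base), and the rate is capped at 1 since it is used as a probability.

```python
import math
import random


def karger_rate(lam, eps, n):
    """Sampling rate p* = 6 * log(n) / (lam * eps^2), capped at 1 (natural log assumed)."""
    return min(1.0, 6.0 * math.log(n) / (lam * eps ** 2))


def sample_edges(edges, p, seed=0):
    """Keep each edge independently with probability p and give it weight 1/p."""
    rng = random.Random(seed)
    weight = 1.0 / p
    return [(u, v, weight) for (u, v) in edges if rng.random() < p]


def weighted_cut(weighted_edges, S):
    """Total weight of edges with exactly one endpoint in S."""
    return sum(w for u, v, w in weighted_edges if (u in S) != (v in S))


edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
p = karger_rate(lam=1, eps=0.5, n=5)   # tiny min-cut, so the rate is capped at 1
sampled = sample_edges(edges, p)
print(p, weighted_cut(sampled, {1, 2, 3}))
```

Reweighting by $1/p$ makes every cut value correct in expectation; the theorem says that once $p \ge p^*$ all cuts are simultaneously close to their expectation with high probability.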


Min-Cut Algorithm

• Let $h_i$ be a hash function such that for each $e \in [n] \times [n]$,

$\Pr[h_i(e) = 1] = 1/2^i$

• Let $G_i = (V, E_i)$ where $E_i = \{e \in E : h_i(e) = 1\}$

• Let $H_i = \mathrm{connectivity}_k(G_i)$ where $k := 24\varepsilon^{-2}\log n$

• Post-Processing: Let $\mu_i$ be the min-cut size of $H_i$. Return

$2^j \cdot \mu_j$ where $j = \min\{i : \mu_i < k\}$

• Analysis:
  ◦ Let $\lambda_i$ be the size of the min-cut of $G_i$.
  ◦ Karger's result implies $2^i \lambda_i = (1 \pm \varepsilon)\lambda$ for all $i = 0, 1, \ldots, \lfloor \lg 1/p^* \rfloor$.
  ◦ If $\lambda_i < k$, the connectivity_k algorithm guarantees $\lambda_i = \mu_i$.
  ◦ Lemma: $j \le \lfloor \lg 1/p^* \rfloor$
  ◦ Total space is $O(k \cdot n \cdot \mathrm{polylog}\, n) = O(\varepsilon^{-2} \cdot n \cdot \mathrm{polylog}\, n)$.

• Can extend these ideas to get a $(1+\varepsilon)$-sparsification of a dynamic graph in a single pass and $O(\varepsilon^{-2} \cdot n \cdot \mathrm{polylog}\, n)$ space.
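The post-processing rule can be emulated offline on a tiny graph. This is a sketch under several assumptions, not the streaming algorithm: a seeded pseudo-random generator stands in for each hash function $h_i$, the min-cut of $G_i$ is computed by brute force in place of the min-cut of the certificate $H_i$ (the two agree whenever the value is below $k$, so the chosen $j$ is the same), and the logarithm is natural. All names are hypothetical.

```python
import math
import random
from itertools import combinations


def min_cut(n, edges):
    """Brute-force min-cut size over all non-trivial cuts; only for tiny n."""
    nodes = list(range(1, n + 1))
    best = len(edges)
    for r in range(1, n // 2 + 1):
        for S in combinations(nodes, r):
            S = set(S)
            best = min(best, sum((u in S) != (v in S) for u, v in edges))
    return best


def estimate_min_cut(n, edges, eps, seed=0, max_levels=32):
    """Return (j, 2^j * mu_j) for the first level j whose min-cut is below k."""
    k = 24 * math.log(n) / eps ** 2
    for i in range(max_levels):
        rng = random.Random(f"{seed}-{i}")  # stand-in for the hash function h_i
        E_i = [e for e in edges if rng.random() < 2.0 ** -i]  # keep with prob. 1/2^i
        mu_i = min_cut(n, E_i)
        if mu_i < k:
            return i, (2 ** i) * mu_i
    raise RuntimeError("no level had a min-cut below k")


K5 = list(combinations(range(1, 6), 2))  # complete graph on 5 nodes, min-cut 4
j, estimate = estimate_min_cut(5, K5, eps=0.5)
print(j, estimate)
```

On this input the min-cut is already below $k$ at level 0, where nothing is subsampled, so the exact value is returned; larger min-cuts push $j$ up until the subsampled min-cut drops below $k$.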


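Proof of Lemma

- Consider $i = \lfloor \lg 1/p^* \rfloor$, so the sampling probability for $G_i$ is

  $2^{-i} < 2p^* = 12\lambda^{-1}\varepsilon^{-2} \log n$

- Consider a cut in $G$ of size $\lambda$. The expected number of edges across the same cut in $G_i$ is at most

  $2p^* \cdot \lambda = 12\varepsilon^{-2} \log n$

  and the actual number is $< 24\varepsilon^{-2} \log n = k$ with high probability. Hence $\lambda_i < k$, so $\mu_i = \lambda_i < k$ and therefore $j \le i$.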
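One way to justify the "with high probability" step is a standard multiplicative Chernoff bound. The derivation below is a sketch under assumptions not stated on the slide: the events $h_i(e) = 1$ are treated as mutually independent, $\log$ is taken to be the natural logarithm, and the symbols $X$ and $\mu_H$ are introduced here for illustration.

```latex
% X = number of edges of the fixed size-\lambda cut that survive in G_i.
% \mu_H is an upper bound on E[X]; the Chernoff bound
% Pr[X >= (1+\delta)\mu_H] <= exp(-\delta^2 \mu_H / 3) is applied with \delta = 1.
\begin{align*}
\mathbb{E}[X] &= 2^{-i}\lambda < 2p^*\lambda = 12\varepsilon^{-2}\log n =: \mu_H, \\
\Pr[X \ge k] &= \Pr[X \ge 2\mu_H] \le \exp(-\mu_H/3)
  = \exp(-4\varepsilon^{-2}\log n) = n^{-4/\varepsilon^{2}}.
\end{align*}
```

For $\varepsilon \le 1$ the failure probability is therefore at most $n^{-4}$ under these assumptions.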
