
Lecture 24: Disjoint Sets CSE 373: Data Structures and ... · Disjoint Set is honestly a very...

Date post: 05-Aug-2020
Lecture 24: Disjoint Sets CSE 373: Data Structures and Algorithms CSE 373 19 SP – ZACHARY CHUN 1
Transcript
Page 1: Lecture 24: Disjoint Sets CSE 373: Data Structures and ... · Disjoint Set is honestly a very specific ADT/Data structure that has pretty limited realistic uses … but it’s exciting

Lecture 24: Disjoint Sets CSE 373: Data Structures and Algorithms

CSE 373 19 SP – ZACHARY CHUN 1


Warmup

CSE 373 SP 18 - KASEY CHAMPION 2

Run Kruskal’s algorithm on the following graph to find its MST (minimum spanning tree). Recall the definition of a minimum spanning tree: a minimum-weight set of edges such that you can get from any vertex of the graph to any other using only those edges. These edges form a valid tree in the graph. Below is the provided pseudocode for Kruskal’s algorithm to choose the edges.

KruskalMST(Graph G)
    initialize each vertex to be an independent component
    sort the edges by weight
    foreach (edge (u, v) in sorted order) {
        if (u and v are in different components) {
            add (u, v) to the MST
            update u and v to be in the same component
        }
    }

PollEv.com/373lecture


Announcements

- Kasey out today (no Kasey 2:30 office hours)
- Hw6 released, due next Wednesday
- Hw7 partner form out now, due Monday 11:59pm.
  - If you do not fill out the partner form on time, Brian will be sad because he has to do more work unnecessarily to fix it
- No office hours Monday (Memorial Day)


What are we doing today?

- Disjoint Set ADT

- Implementing Disjoint Set

Disjoint Set is honestly a very specific ADT/data structure that has pretty limited realistic uses, but it’s exciting because:

- it is a cool recap of topics and touches on a bunch of different things we’ve seen in this course (trees, arrays, graphs, optimizing runtime, etc.)
- it has a lot of details and is fairly complex. That doesn’t seem like a plus at first, but while you’re learning this, notice that you’ve come a long way since lists, and being able to learn new complex data structures is a great skill to have built.


The Disjoint Set ADT


Disjoint Sets in mathematics

- “In mathematics, two sets are said to be disjoint sets if they have no element in common.” - Wikipedia
- disjoint = not overlapping

[Figure: Kevin, Vivian, Blarry, Sherdil, and Velocity are split across Set #1 and Set #2, which are disjoint sets. Meredith, Matt, and Brian are in Set #3 and Set #4, which are not disjoint sets: Matt appears in both.]


Disjoint Sets in computer science

In computer science, a disjoint set keeps track of multiple “mini” disjoint sets (confusing naming, I know)

[Figure: a grey blob containing two mini sets, Set #1 and Set #2, which hold Kevin, Vivian, Blarry, Sherdil, and Velocity.]

This overall grey blob thing is the actual disjoint set, and it’s keeping track of any number of mini sets, which are all disjoint (the mini sets have no overlapping values).

Note: this might feel really different from the ADTs we’ve run into before. The ADTs we’ve seen before (dictionaries, lists, sets, etc.) just store values directly. But the Disjoint Set ADT is particularly interested in letting you group your values into sets and keep track of which particular set your values are in.

New ADT!

What methods does the DisjointSet ADT have?

Just 3 methods (and one of them is pretty simple!)

- findSet(value)
- union(valueA, valueB)
- makeSet(value)


findSet(value)

findSet(value) returns some indicator for which particular set the value is in. You can think of this as an ID. For Disjoint Sets, we often call this the representative.

Examples:

findSet(Brian)

findSet(Sherdil)

findSet(Velocity)

findSet(Kevin) == findSet(Blarry)

[Figure: four mini sets, Set #1 through Set #4, holding Kevin, Vivian, Blarry, Sherdil, Velocity, Brian, Keanu, and Kasey. Answers shown on the slide: 3, 3, 2, 2, true.]


union(valueA, valueB)

union(valueA, valueB) merges the set that A is in with the set that B is in (basically, add the two sets together into one).

Example: union(Blarry,Brian)

[Figure: the mini sets before and after union(Blarry, Brian). Before: Set #1 through Set #4. After: Set #1 and Set #3 have been merged into a single set labeled Set #1, while Set #2 and Set #4 are unchanged.]


makeSet(value)

makeSet(value) makes a new mini set that just has the value parameter in it.

Examples:

makeSet(Cherie)

makeSet(Anish)

[Figure: the four existing mini sets, Set #1 through Set #4, plus two new single-element sets: Set #5 holding Cherie and Set #6 holding Anish.]

Disjoint Set ADT Summary

Disjoint-Set ADT

state

Set of Sets
- Mini sets are disjoint: elements must be unique across mini sets
- No required order
- Each set has an id/representative

behavior

makeSet(value) – creates a new set within the disjoint set where the only member is the value. Picks id/representative for set

findSet(value) – looks up the set containing the value, returns id/representative of that set

union(x, y) – looks up set containing x and set containing y, combines two sets into one. All of the values of one set are added to the other, and the now empty set goes away.


Why are we doing this again?


Kruskal’s Algorithm Implementation

KruskalMST(Graph G)
    initialize each vertex to be an independent component
    sort the edges by weight
    foreach (edge (u, v) in sorted order) {
        if (u and v are in different components) {
            update u and v to be in the same component
            add (u, v) to the MST
        }
    }

With a disjoint set, each step maps onto an ADT call:

KruskalMST(Graph G)
    foreach (v : G.vertices) {
        makeSet(v);
    }
    sort the edges by weight
    foreach (edge (u, v) in sorted order) {
        if (findSet(v) is not the same as findSet(u)) {
            union(u, v)
            add (u, v) to the MST
        }
    }
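The translation from components to disjoint-set calls can be sketched in Python. This is a minimal sketch, not the course's reference implementation: the `DisjointSet` class is a simple dictionary-backed stand-in, and the `(weight, u, v)` edge format and the name `kruskal_mst` are assumptions made for illustration.

```python
class DisjointSet:
    """Minimal stand-in: maps each value to the ID of its mini set."""

    def __init__(self):
        self.set_id = {}

    def make_set(self, value):
        # A new mini set uses its only member as its representative.
        self.set_id[value] = value

    def find_set(self, value):
        return self.set_id[value]

    def union(self, a, b):
        # Relabel every member of b's set with a's representative.
        keep, drop = self.set_id[a], self.set_id[b]
        for value, rep in self.set_id.items():
            if rep == drop:
                self.set_id[value] = keep


def kruskal_mst(vertices, edges):
    """edges is a list of (weight, u, v) tuples; returns the chosen edges."""
    ds = DisjointSet()
    for v in vertices:
        ds.make_set(v)
    mst = []
    for weight, u, v in sorted(edges):  # sort the edges by weight
        if ds.find_set(u) != ds.find_set(v):
            ds.union(u, v)
            mst.append((weight, u, v))
    return mst


edges = [(1, "a", "b"), (3, "a", "c"), (2, "b", "c"), (4, "c", "d")]
mst = kruskal_mst(["a", "b", "c", "d"], edges)
print(mst)
```

In this small example the edge of weight 3 is skipped, because by then a and c already share a representative and adding it would create a cycle.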


Kruskal’s with disjoint sets on the side example

KruskalMST(Graph G)
    foreach (v : G.vertices) {
        makeSet(v);
    }
    sort the edges by weight
    foreach (edge (u, v) in sorted order) {
        if (findSet(v) is not the same as findSet(u)) {
            union(u, v)
        }
    }

Why are we doing this again? (continued)

Disjoint Sets help us manage groups of distinct values.

This is a common idea in graphs, where we want to keep track of different connected components of a graph.

In Kruskal’s, if each connected-so-far-island of the graph is its own mini set in our disjoint set, we can easily check that we don’t introduce cycles. If we’re considering a new edge, we just check that the two vertices of that edge are in different mini sets by calling findSet.


1 min break

Take a second to review notes with your neighbors, ask questions, try to clear up any confusions you have… we’ll group back up and see if there are still any unanswered questions then!


Implementing Disjoint Set


Implementing Disjoint Set with Dictionaries

Approach 1: dictionary of value -> set ID/representative

    Sherdil -> 1
    Robbie  -> 2
    Sarah   -> 1

Approach 2: dictionary of ID/representative of set -> all the values in that set

    1 -> Sarah, Sherdil
    2 -> Robbie

Exercise (1.5 min)

Calculate the worst case Big O runtimes for each of the methods (makeSet, findSet, union) for both approaches.

Approach 1: dictionary of value -> set ID/representative

Approach 2: dictionary of ID/representative of set -> all the values in that set

                         approach 1    approach 2
makeSet(value)           O(1)          O(1)
findSet(value)           O(1)          O(n)
union(valueA, valueB)    O(n)          O(n)
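To see where those runtimes come from, here is a rough Python sketch of both approaches using the Sherdil/Robbie/Sarah example. The function names (`find_set_v1`, `union_v2`, and so on) are made up for illustration, and plain `dict` and `set` objects stand in for the course's dictionary and set implementations.

```python
# Approach 1: value -> set ID
by_value = {"Sherdil": 1, "Robbie": 2, "Sarah": 1}

def find_set_v1(value):
    # One dictionary lookup: O(1).
    return by_value[value]

def union_v1(a, b):
    # Any entry might need relabeling, so this scans all n values: O(n).
    keep, drop = by_value[a], by_value[b]
    for value in by_value:
        if by_value[value] == drop:
            by_value[value] = keep

# Approach 2: set ID -> all the values in that set
by_id = {1: {"Sarah", "Sherdil"}, 2: {"Robbie"}}

def find_set_v2(value):
    # There is no index from value to ID, so search the mini sets.
    # With many single-element sets this is O(n) in the worst case.
    for set_id, members in by_id.items():
        if value in members:
            return set_id
    raise KeyError(value)

def union_v2(a, b):
    # Two finds, then move every member of one set into the other.
    keep, drop = find_set_v2(a), find_set_v2(b)
    if keep != drop:
        by_id[keep].update(by_id.pop(drop))

union_v1("Sherdil", "Robbie")
union_v2("Sherdil", "Robbie")
print(by_value)
print(by_id)
```

Either way, union ends up touching a number of values proportional to the size of the sets, which is the problem the tree representation is meant to address.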


Implementing Disjoint Set with Trees (and dictionaries) (1)

Each mini-set is now represented as a separate tree.

(Note: using letters/numbers from now on as the values because they’re easier to fit inside the nodes)

[Figure: a dictionary mapping a, b, c, d, e to set IDs 1 and 2, next to the same mini sets drawn as two trees. A second dictionary keyed by a, b, c, d, e lets you jump to nodes in the tree.]

Implementing Disjoint Set with Trees (and dictionaries) (2)

union(valueA, valueB), the method with the problem runtime from before, should look a lot easier in terms of updating the data structure: all we have to do is change one link so the two trees are connected.

What should we change? If we change the root of one tree to point to the other tree, then every node below that root is updated along with it. It turns out it will be most efficient if we have the root point to the other tree’s root.
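A minimal Python sketch of this idea follows. It is an illustration under assumptions rather than the course's implementation: the class name `TreeDisjointSet` is hypothetical, a single `parent` dictionary plays the role of both the "jump to node" dictionary and the upward links, and a root is marked by being its own parent.

```python
class TreeDisjointSet:
    def __init__(self):
        # value -> parent value; a root is its own parent.
        self.parent = {}

    def make_set(self, value):
        self.parent[value] = value

    def find_set(self, value):
        # Follow parent links upward until reaching a root.
        while self.parent[value] != value:
            value = self.parent[value]
        return value

    def union(self, a, b):
        root_a, root_b = self.find_set(a), self.find_set(b)
        if root_a != root_b:
            # Change one link: b's root now points at a's root.
            self.parent[root_b] = root_a


ds = TreeDisjointSet()
for v in range(1, 5):
    ds.make_set(v)
ds.union(1, 2)  # 2 now points at 1
ds.union(3, 4)  # 4 now points at 3
ds.union(2, 4)  # root 3 now points at root 1
print(ds.find_set(4))  # prints 1
```

Note that union itself only rewrites one link; the cost is in the two find_set calls it makes to locate the roots.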

[Figure: two trees, one holding nodes 1 through 10 and one holding nodes 11 through 18.]

Implementing Disjoint Set with Trees (and dictionaries) (3)

findSet has to be different though …

Every node has access to the root node because all the links point up, so we can use the root node as our id / representative. For example:

findSet(5)

findSet(9)

they’re in the same set because they have the same representative!

[Figure: one tree holding nodes 1 through 10.]

Seems great so far but let’s abuse some stuff

makeSet(a)

makeSet(b)

makeSet(c)

makeSet(d)

makeSet(e)

union(a, b)

union(a, c)

union(a, d)

union(a, e)

findSet(a) – how long will this take? The tree could turn into a linked list, where you might have to start at the bottom and loop all the way to the top.
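The sketch below replays that sequence in Python and counts the links findSet follows. It assumes one particular naive rule, chosen here only to show the bad case: the root of the first argument always points at the root of the second. The names `naive_union` and `find_set` are hypothetical.

```python
parent = {}

def make_set(value):
    parent[value] = value  # a root is its own parent

def find_set(value):
    """Return (root, number of links followed)."""
    steps = 0
    while parent[value] != value:
        value = parent[value]
        steps += 1
    return value, steps

def naive_union(a, b):
    # Assumed rule for illustration: a's root always points at b's root.
    root_a, _ = find_set(a)
    root_b, _ = find_set(b)
    if root_a != root_b:
        parent[root_a] = root_b

for v in "abcde":
    make_set(v)
for v in "bcde":
    naive_union("a", v)

root, steps = find_set("a")
print(root, steps)  # prints: e 4
```

With n values the same pattern gives a chain of n - 1 links, so findSet(a) is O(n).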


Improving union

Problem: Trees can be unbalanced (and look linked-list-like), so our findSet runtime becomes closer to O(n).

Solution: Union-by-rank!
- Let rank(x) be a number representing an upper bound on the height of x, so rank(x) >= height(x)
- Keep track of the rank of all trees
- When unioning, make the tree with the larger rank the root
- If it’s a tie, pick one randomly to be the root and increase its rank by one

[Figure: example trees over nodes 0 through 5, labeled with their ranks (rank = 0, rank = 1, rank = 2).]
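Union-by-rank can be sketched in Python as follows. This is a hedged illustration, not the course's code: the class name `RankedDisjointSet` is hypothetical, ranks are kept in a dictionary next to the parent links, and ties are broken by keeping the first argument's root instead of picking randomly.

```python
class RankedDisjointSet:
    def __init__(self):
        self.parent = {}  # value -> parent value; a root is its own parent
        self.rank = {}    # only meaningful for roots

    def make_set(self, value):
        self.parent[value] = value
        self.rank[value] = 0

    def find_set(self, value):
        while self.parent[value] != value:
            value = self.parent[value]
        return value

    def union(self, a, b):
        root_a, root_b = self.find_set(a), self.find_set(b)
        if root_a == root_b:
            return
        if self.rank[root_a] < self.rank[root_b]:
            root_a, root_b = root_b, root_a
        # root_a now has rank >= root_b's rank, so it becomes the root.
        self.parent[root_b] = root_a
        if self.rank[root_a] == self.rank[root_b]:
            # Tie: the merged tree may be one level taller.
            self.rank[root_a] += 1


ds = RankedDisjointSet()
for v in "abcde":
    ds.make_set(v)
for v in "bcde":
    ds.union("a", v)
print(ds.parent)
```

On the same makeSet/union sequence that produced a chain earlier, every value now points directly at a, and the rank of a is 1.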


Practice

Given the following disjoint-set, what would be the result of the following calls on union if we add the “union-by-rank” optimization? Draw the forest at each stage with corresponding ranks for each tree.

[Figure: a forest of four trees holding nodes 0 through 13, with ranks 2, 0, 2, and 1.]

union(2, 13)
union(4, 12)
union(2, 8)

[Figure: the forest after each call.
After union(2, 13): three trees remain, with ranks 2, 2, and 1.
After union(4, 12): two trees remain, with ranks 3 and 1.
After union(2, 8): one tree remains, with rank 3.]

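Does this improve the worst case runtimes?

Yes. With union-by-rank, a tree of rank r always has at least 2^r nodes, so no tree is taller than log(n), and findSet is O(log(n)) in the worst case instead of O(n).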


Exercise (2 min)

Given the following disjoint-set, what would be the result of the following calls on union if we add the “union-by-rank” optimization? Draw the forest at each stage with corresponding ranks for each tree.

[Figure: a forest of four trees holding nodes 0 through 13, with ranks 2, 0, 2, and 1.]

union(5, 8)
union(1, 2)
union(7, 3)


Improving findSet()

Problem: Every time we call findSet(), we must traverse all the levels of the tree to find the representative. If there are a lot of levels (big height), this is more inefficient than it needs to be.

Solution: Path Compression
- Collapse the tree into fewer levels by updating the parent pointer of each node you visit
- Whenever you call findSet(), update the parent pointer of each node you touch to point directly to overallRoot

[Figure: a tree of rank 3 before and after calling findSet(5) and findSet(4). Afterwards, the nodes visited by those calls hang directly off the root.]

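Does this improve the worst case runtimes?

A single findSet call can still take O(log(n)), but every call flattens the path it walks, so repeated calls are more likely to be O(1) than O(log(n)).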
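Path compression can be sketched in Python with a two-pass findSet. This is a minimal sketch under assumptions: the class name `CompressedDisjointSet` is hypothetical, union and ranks are left out to keep the focus on findSet, and the example builds a chain by writing parent links directly.

```python
class CompressedDisjointSet:
    def __init__(self):
        self.parent = {}  # value -> parent value; a root is its own parent

    def make_set(self, value):
        self.parent[value] = value

    def find_set(self, value):
        # First pass: walk up to the root.
        root = value
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[value] != root:
            next_value = self.parent[value]
            self.parent[value] = root
            value = next_value
        return root


ds = CompressedDisjointSet()
for v in "abcd":
    ds.make_set(v)
# Build the chain a -> b -> c -> d by hand for the demonstration.
ds.parent["a"] = "b"
ds.parent["b"] = "c"
ds.parent["c"] = "d"

print(ds.find_set("a"))  # prints d
print(ds.parent)
```

After the first call, a, b, and c all point directly at d, so the next findSet on any of them follows a single link.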


Exercise if time?

Using the union-by-rank and path-compression optimized implementations of disjoint-sets, draw the resulting forest caused by these calls:

1. makeSet(a)
2. makeSet(b)
3. makeSet(c)
4. makeSet(d)
5. makeSet(e)
6. makeSet(f)
7. makeSet(g)
8. makeSet(h)
9. union(c, e)
10. union(d, e)
11. union(a, c)
12. union(g, h)
13. union(b, f)
14. union(g, f)
15. union(b, c)


Optimized Disjoint Set Runtime

               Without Optimizations    With Optimizations
makeSet(x)     O(1)                     O(1)
findSet(x)     O(n)                     Best case: O(1), worst case: O(log n)
union(x, y)    O(n)                     Best case: O(1), worst case: O(log n)

Next time

- union should call findSet to get access to the root of the trees

- why rank is an approximation of height

- array representation instead of tree

- more practice!
