
Post on 26-Aug-2020


Union-Find Disjoint Set
• An ADT (Abstract Data Type) used to manage disjoint sets of elements.
• Operations supported
  • Union: Given 2 elements, if they belong to different sets, merge the 2 sets into 1
  • Find: Given an element, find the set it belongs to

9/7/15 2

[Figure: elements 1 to 7 grouped into disjoint sets, shown before and after Union(1, 2), and for Find(5)]

Implementing Union-Find Disjoint Set
• Data structure for Disjoint Set
• Algorithms for Union and Find operations
• Run time analysis & reflection

Version 1 (Quick Find)
• Data Structure
  • Each set is represented as a list, and the head of the list is used as the representative (identifier) of the set
  • To facilitate Find, each node contains a pointer to its representative
  • The representative records the number of elements in the set

Version 1 (Quick Find)
• Union(x, y)
  • Merge all elements of the smaller set into the larger set
  • Update the number of elements in the combined set
  • Redirect the representative pointer of every element of the smaller set
• Find(x)
  • If the representative pointer of x is null, return x itself
  • Else, return its representative pointer
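A minimal sketch of Version 1 in Python, to make the Union and Find steps concrete. The class name `QuickFind` and the fields `rep` and `members` are hypothetical names chosen for illustration. Two simplifications are assumed: a representative points to itself instead of holding a null pointer, and the set size is read from the length of the member list rather than stored separately.

```python
class QuickFind:
    """Sketch of Version 1: each set keeps a member list, each element its representative."""

    def __init__(self, elements):
        # rep[x] is the representative of x's set (assumption: a representative maps to itself).
        self.rep = {x: x for x in elements}
        # members[r] is the list of elements in the set represented by r.
        self.members = {x: [x] for x in elements}

    def find(self, x):
        # One lookup of the stored representative: O(1).
        return self.rep[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # already in the same set, nothing to merge
        # Make rx the representative of the larger set.
        if len(self.members[rx]) < len(self.members[ry]):
            rx, ry = ry, rx
        # Redirect every element of the smaller set, then merge the lists.
        for e in self.members[ry]:
            self.rep[e] = rx
        self.members[rx].extend(self.members.pop(ry))
        return rx
```

Find is a single dictionary lookup, while Union pays one pointer update per element of the smaller set, which is the cost the analysis of Version 1 accounts for.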

Example of Version 1
[Figure: elements 1 to 9 start as singletons (initial state); the sets are shown after Union(2,6), Union(3,8), Union(3,9), and Union(2,3)]

Analysis of Version 1
• Amortized run time of Union
  • If an element is still contained in a singleton, we say it is untouched by the Union operations; otherwise, it is touched
  • k Union operations touch at most 2*k elements
    • If a Union operation merges 2 singletons, it touches 2 previously untouched elements
    • If it merges a singleton into a non-singleton, it touches 1 previously untouched element
    • If it merges 2 non-singletons, it touches 0 previously untouched elements

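Analysis of Version 1
• Amortized run time of Union (cont'd)
  • Each touched element has its representative pointer redirected at most log n times, because every redirect moves it into a set at least twice as large as its old one
  • The total run time of k Union operations is therefore at most 2 * k * log n
  • The amortized run time = 2 * k * log n / k = O(log n)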
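The amortized bound can be written out as a short derivation. This is a sketch; the symbol r(e), the number of times element e has its representative pointer redirected, is introduced here for illustration and does not appear in the slides.

```latex
% r(e): number of redirects of element e's representative pointer (illustrative symbol).
% Each redirect moves e into a set at least twice as large, and no set exceeds n elements:
2^{r(e)} \le n \quad\Longrightarrow\quad r(e) \le \log_2 n
% Only touched elements are ever redirected, and k Unions touch at most 2k elements:
\sum_{e \text{ touched}} r(e) \;\le\; 2k \log_2 n
% Amortized cost per Union:
\frac{2k \log_2 n}{k} \;=\; 2 \log_2 n \;=\; O(\log n)
```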

Reflection on Version 1
• Union is slow because we have to update the representative pointers of all elements in the smaller set
• What if we only update the representative pointer of the representative of the smaller set?

Version 2 (Quick Union)
• Data Structure
  • Each element has a pointer to the representative of the set it was merged into (if any)
  • Each element records the size of the set of which it was the representative
  • The next pointer is no longer needed, because we do not have to update the representative pointer of every element in the smaller set
  • The data representation becomes a tree structure

Version 2 (Quick Union)
• Union(x, y)
  • Find the representatives of x and y using Find
  • Redirect the representative pointer of the smaller set's representative to the representative of the larger set
• Find(x)
  • Follow the representative pointer chain until the root, and return the root
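The Version 2 operations can be sketched in Python as follows. The names `QuickUnion`, `parent`, and `size` are hypothetical, chosen for illustration. A root is marked by a `None` pointer, matching the "pointer to the representative it was merged into (if any)" description.

```python
class QuickUnion:
    """Sketch of Version 2: a forest of trees, merged by size."""

    def __init__(self, elements):
        # parent[x] is None while x is the representative (root) of its set.
        self.parent = {x: None for x in elements}
        # size[x] is the size of the set x represents (meaningful for roots only).
        self.size = {x: 1 for x in elements}

    def find(self, x):
        # Follow the pointer chain up to the root.
        while self.parent[x] is not None:
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx  # already in the same set
        # Make rx the root of the larger tree.
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        # Only one pointer changes: the root of the smaller tree.
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return rx
```

Compared with Version 1, Union changes a single pointer after the two Finds, and the cost moves into Find, which now walks a chain instead of doing one lookup.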

Example of Version 2/Union
[Figure: elements 1 to 9 start as singletons (initial state); the trees are shown after Union(2,6), Union(3,8), Union(3,9), and Union(2,3)]

Analysis of Version 2
[Figure: trees T1 and T2 merged into a single tree T]

Analysis of Version 2
• If the size of the set is n, the height of its tree representation is at most log n + 1
  • Proof by contradiction using the previous characteristic
• The worst-case run time of both Union and Find is O(log n)
• The amortized run time is better
[Figure labels: Find, log n]
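The height bound can also be justified by induction. The sketch below is an alternative to the proof by contradiction mentioned on the slide, not a reconstruction of it. It assumes height is counted in nodes (a singleton has height 1), and h, h_1, h_2 are symbols introduced here for the heights of T, T1, and T2, where T2 is the smaller tree attached under the root of T1.

```latex
% Claim: a tree of height h built by union-by-size has at least 2^{h-1} nodes.
% Base case: a singleton has h = 1 and 1 = 2^{0} nodes.
% Merge step: |T_2| \le |T_1|, T_2 is hung under the root of T_1, so
h = \max(h_1,\; h_2 + 1)
% Case h = h_1:
|T| \;\ge\; |T_1| \;\ge\; 2^{h_1 - 1} = 2^{h-1}
% Case h = h_2 + 1:
|T| = |T_1| + |T_2| \;\ge\; 2\,|T_2| \;\ge\; 2 \cdot 2^{h_2 - 1} = 2^{h-1}
% With |T| = n:
n \ge 2^{h-1} \quad\Longrightarrow\quad h \le \log_2 n + 1
```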

Reflection on Version 2
• If we do m Finds on element e, the total time can be as large as m * log n; if the first Find redirects the representative pointer of e to the root, the remaining Finds are cheaper and the total time becomes shorter.

Version 3 (Path Compression)
• Data Structure
  • The same as Version 2
• Union
  • The same as Version 2
• Find(x)
  • For every node on the path from x to the root, redirect its representative pointer to the root
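A minimal Python sketch of Version 3, assuming a two-pass Find: the first pass locates the root, the second redirects every node on the path. The names `CompressedUnionFind`, `parent`, and `size` are hypothetical, and roots are marked with `None` as an assumption of this sketch.

```python
class CompressedUnionFind:
    """Sketch of Version 3: union by size plus path compression in Find."""

    def __init__(self, elements):
        self.parent = {x: None for x in elements}  # None marks a root
        self.size = {x: 1 for x in elements}

    def find(self, x):
        # Pass 1: walk up to the root.
        root = x
        while self.parent[root] is not None:
            root = self.parent[root]
        # Pass 2: redirect every node on the path from x straight to the root.
        while x != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, x, y):
        # Same as Version 2: hang the smaller tree under the larger one.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return rx
```

After one Find on a deep element, that element and all of its ancestors point directly at the root, so repeated Finds on them follow a single pointer as long as the root does not change.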

Example
[Figure: a tree over elements 1 to 15, shown before and after Find(4), and then before and after Find(12)]

Run time analysis of Version 3
• Amortized run time per operation: nearly O(1) (O(α(n)), where α is the inverse Ackermann function)

Review of non-recursive algorithm analysis

S = 0;
for (j = 1; j <= n; j++) {
    for (k = j; k <= n; k++) {
        S++;
    }
}

n + (n-1) + ... + 1 = n(n+1)/2 = Θ(n^2)
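The count for the nested loop can be checked directly. The sketch below transcribes the loop into Python and compares the result with the closed form n(n+1)/2; the function name `count_increments` is a hypothetical name for illustration.

```python
def count_increments(n):
    """Run the nested loop and return the final value of S."""
    s = 0
    for j in range(1, n + 1):        # j = 1 .. n
        for k in range(j, n + 1):    # k = j .. n, i.e. n - j + 1 iterations
            s += 1
    return s


# The inner loop runs n, n-1, ..., 1 times, which sums to n(n+1)/2.
for n in (1, 5, 10):
    print(n, count_increments(n), n * (n + 1) // 2)
```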

11/15/2015 22

Analyzing recursive algorithms

• F1(A, k1, k2):
    m = (k2-k1+1)/2;
    if (m <= 0): return;
    B = new vector of size m;
    for (j = 1; j < m; j++):
        B[j] = A[k1+2*j] - A[k1+2*j-1];
    F1(A, k1, k1+m);
    F1(A, k1+m+1, k2);
    F1(B, 1, m);

Recurrence (with n = k2-k1+1, each recursive call works on about n/2 elements):
T(n) = Θ(1) + Θ(n) + T(n/2) + T(n/2) + T(n/2) = 3T(n/2) + Θ(n)
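As a worked example of the recursion-tree method, here is a sketch for a recurrence of the form T(n) = 3T(n/2) + cn. This form is an assumption read off the F1 pseudocode (three recursive calls on roughly half the input, plus a linear loop); the recurrence printed on the original slide was not recovered.

```latex
% Level i of the recursion tree has 3^i subproblems, each of size n / 2^i,
% so the non-recursive work at level i is
3^{i} \cdot c\,\frac{n}{2^{i}} \;=\; c\,n \left(\frac{3}{2}\right)^{i}
% The tree has \log_2 n + 1 levels, so
T(n) \;=\; c\,n \sum_{i=0}^{\log_2 n} \left(\frac{3}{2}\right)^{i}
      \;=\; \Theta\!\left( n \left(\frac{3}{2}\right)^{\log_2 n} \right)
% Since (3/2)^{\log_2 n} = 3^{\log_2 n} / n and 3^{\log_2 n} = n^{\log_2 3}:
T(n) \;=\; \Theta\!\left( n^{\log_2 3} \right)
```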

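Recursion tree (1)
[Figure: recursion tree]

Important properties of logarithm
[Formulas on this slide were not recovered]

Recursion tree (2)
[Figure: recursion tree, simplified using Property 2]

Recursion tree (3)
[Figure: recursion tree, simplified using Property 2]

Recursion tree (4)
[Figure: recursion tree]

Example
[Figure: example recurrence]

Recursion tree
[Figure: recursion tree for the example]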

Many recurrence relations arising from divide-and-conquer algorithms have the form:

T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1 are constants and f is asymptotically positive

• We create a problem instances (the constant a above), each of size n/b

• Setting up the problem instances to recurse on and combining the sub-solutions returned takes f(n) work

 


Solving T(n) = aT(n/b) + f(n)

Compare f(n) with g(n) = n^{log_b a}:

• f(n) grows polynomially slower than g(n)
  ● Does a polynomial n^ε separate f(n) and g(n)?

• f(n) grows polynomially faster than g(n)
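For reference, the three cases of the Master Theorem in their standard textbook form are given below. The statements on the original slides were not recovered, so this is the usual formulation and not necessarily the slides' wording.

```latex
% T(n) = a\,T(n/b) + f(n), with a \ge 1, b > 1, and g(n) = n^{\log_b a}.
% Case 1: f grows polynomially slower than g.
f(n) = O\!\left(n^{\log_b a - \varepsilon}\right) \text{ for some } \varepsilon > 0
  \;\Longrightarrow\; T(n) = \Theta\!\left(n^{\log_b a}\right)
% Case 2: f and g grow at the same rate.
f(n) = \Theta\!\left(n^{\log_b a}\right)
  \;\Longrightarrow\; T(n) = \Theta\!\left(n^{\log_b a} \log n\right)
% Case 3: f grows polynomially faster than g, and the regularity condition holds.
f(n) = \Omega\!\left(n^{\log_b a + \varepsilon}\right) \text{ for some } \varepsilon > 0,
  \quad a\,f(n/b) \le c\,f(n) \text{ for some } c < 1 \text{ and all large } n
  \;\Longrightarrow\; T(n) = \Theta\!\left(f(n)\right)
```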

 

T(n) = T(n/2) + n log n = Θ(n log n)

a = 1, b = 2, n^{log_b a} = n^0 = Θ(1), f(n) = n log n

regularity condition: a f(n/b) = 1 * (n/2) * log(n/2) <= (1/2) n log n

we have c = 1/2 < 1

T(n) = 4T(n/2) + n^3 = Θ(n^3)

a = 4, b = 2, n^{log_b a} = n^2, f(n) = n^3

regularity condition: a f(n/b) = 4(n/2)^3 = n^3/2; we have c = 1/2 < 1
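The regularity condition a·f(n/b) ≤ c·f(n) for these two examples can be spot-checked numerically. The sketch below is only a sanity check over a finite range, not a proof. The helper name `regularity_holds` is hypothetical, and the small tolerance is an assumption to absorb floating-point rounding.

```python
import math


def regularity_holds(a, b, f, c, ns, tol=1e-9):
    """Check a * f(n / b) <= c * f(n) for every n in ns (finite spot check only)."""
    return all(a * f(n / b) <= c * f(n) + tol for n in ns)


def f1(n):
    # f(n) = n log n, from T(n) = T(n/2) + n log n
    return n * math.log2(n)


def f2(n):
    # f(n) = n^3, from T(n) = 4T(n/2) + n^3
    return n ** 3


ns = range(2, 1000)
print(regularity_holds(1, 2, f1, 0.5, ns))  # (n/2) log(n/2) <= (1/2) n log n
print(regularity_holds(4, 2, f2, 0.5, ns))  # 4 (n/2)^3 = n^3 / 2
```

For f(n) = n^3 the condition holds with equality at c = 1/2, so no smaller constant works for that example.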

Outside the Master Theorem

T(n) = T(√n) + c = O(log log n)
T(n) = T(n/4) + T(n/2) + n^2 = O(n^2)
T(n) = 2T(n-1) + 1 = O(2^n)
f(n) = f(n-1) + f(n-2)
T(n) = 4T(n/2) + n^2/log n
● Has the right form, but ...
● compare n^2 and n^2/log n: f(n) is smaller by a factor of log n, not by a polynomial factor
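To see where the log log n in the first recurrence comes from, the sketch below counts how many times the square root can be taken before the argument drops to 2. The function name `sqrt_steps` and the stopping threshold of 2 are assumptions for illustration. For n = 2^(2^k) the count is exactly k = log2(log2 n).

```python
import math


def sqrt_steps(n):
    """Count the recursion depth of T(n) = T(sqrt(n)) + c, stopping once n <= 2."""
    steps = 0
    while n > 2:
        n = math.isqrt(n)  # integer square root, exact for the perfect squares used below
        steps += 1
    return steps


# n = 2^(2^k) needs exactly k square roots to reach 2.
for k in range(6):
    print(k, sqrt_steps(2 ** (2 ** k)))
```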