Date post: 21-Jan-2016
9-1 9 Set ADTs Set concepts. Set applications. A set ADT: requirements, contract. Implementations of sets: using arrays, linked lists, boolean arrays. Sets in the Java class library. © 2001, D.A. Watt and D.F. Brown

9-1

9 Set ADTs

• Set concepts.

• Set applications.

• A set ADT: requirements, contract.

• Implementations of sets: using arrays, linked lists, boolean arrays.

• Sets in the Java class library.

© 2001, D.A. Watt and D.F. Brown

9-2

Set concepts (1)

• A set is a collection of distinct members (values or objects), whose order is insignificant.

• Notation for sets: {a, b, …, z}. The empty set is { }. Set notation is used here, but not supported by Java.

9-3

Set concepts (2)

• Examples of sets:

evens = {0, 2, 4, 6, 8} (a set of integers)

punct = {‘.’, ‘!’, ‘?’, ‘:’, ‘;’, ‘,’} (a set of characters)

EU = {AT, BE, DE, DK, ES, FI, FR, GR, IE, IT, LU, NL, PT, SE, UK}

NAFTA = {CA, MX, US}

NATO = {BE, CA, CZ, DE, DK, ES, FR, GR, HU, IS, IT, LU, NL, NO, PL, PT, TR, UK, US}

(EU, NAFTA, and NATO are sets of countries.)

9-4

Set concepts (3)

• The cardinality of a set s is the number of members of s. This is written #s. E.g.:

#EU = 15

#{red, white, red} = 2 (Duplicate members aren’t counted.)

• An empty set has cardinality zero.

• We can test whether x is a member of set s (i.e., s contains x). This is the membership test, written x ∈ s. E.g.:

UK ∈ EU

SE ∈ EU

SE ∉ NATO (SE is not a member of NATO.)
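
As a rough illustration in Java, cardinality corresponds to size() and the membership test to contains(). This sketch uses java.util.HashSet from the Java class library and modern generics; the class name CardinalityDemo and the helper setOf are made up for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CardinalityDemo {
    // Build a set from the given members; a duplicate is stored only once.
    static Set<String> setOf(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    public static void main(String[] args) {
        Set<String> colours = setOf("red", "white", "red");
        System.out.println(colours.size());            // cardinality: 2
        System.out.println(colours.contains("red"));   // membership test: true
        System.out.println(colours.contains("blue"));  // false
        System.out.println(setOf().isEmpty());         // empty set: true
    }
}
```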

9-5

Set concepts (4)

• Two sets are equal if they contain exactly the same members. E.g.:

NAFTA = {US, CA, MX} (Order of members doesn’t matter.)

NAFTA ≠ {CA, US} (These two sets are unequal.)

• Set s1 subsumes (is a superset of) set s2 if every member of s2 is also a member of s1. This is written s1 ⊇ s2. E.g.:

NATO ⊇ {CA, US}

NATO ⊉ EU (NATO does not subsume EU.)
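
The equality and superset tests can be tried out with the Java class library. This is only a sketch: SubsumeDemo, setOf, and subsumes are names invented here, and java.util.HashSet stands in for whichever set representation is chosen.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SubsumeDemo {
    static Set<String> setOf(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    // s1 subsumes s2 if every member of s2 is also a member of s1.
    static boolean subsumes(Set<String> s1, Set<String> s2) {
        return s1.containsAll(s2);
    }

    public static void main(String[] args) {
        Set<String> nafta = setOf("CA", "MX", "US");
        System.out.println(nafta.equals(setOf("US", "CA", "MX"))); // true: order is irrelevant
        System.out.println(nafta.equals(setOf("CA", "US")));       // false: MX is missing
        System.out.println(subsumes(nafta, setOf("CA", "US")));    // true
        System.out.println(subsumes(setOf("CA", "US"), nafta));    // false
    }
}
```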

9-6

Set concepts (5)

• The union of sets s1 and s2 is a set containing just those values that are members of s1 or s2 or both. This is written s1 ∪ s2. E.g.:

{DK, NO, SE} ∪ {FI, IS} = {DK, FI, IS, NO, SE}

{DK, NO, SE} ∪ {IS, NO} = {DK, IS, NO, SE}

9-7

Set concepts (6)

• The intersection of sets s1 and s2 is a set containing just those values that are members of both s1 and s2. This is written s1 ∩ s2. E.g.:

NAFTA ∩ NATO = {CA, US}

NAFTA ∩ EU = {}

• Two sets are disjoint if they have no common member, i.e., if their intersection is empty. E.g.:

NAFTA and EU are disjoint.

NATO and EU are not disjoint.

9-8

Set concepts (7)

• The difference of sets s1 and s2 is a set containing just those values that are members of s1 but not of s2. This is written s1 – s2. E.g.:

NATO – EU = {CA, CZ, HU, IS, NO, PL, TR, US}

EU – NATO = {AT, FI, IE, SE}
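
Union, intersection, disjointness, and difference can be sketched with java.util.TreeSet, whose addAll, retainAll, and removeAll methods modify a set in place. The helper methods below are hypothetical; each one copies its first argument so that the examples leave their operands unchanged.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetAlgebraDemo {
    // Build a sorted set so that printed results have a predictable order.
    static Set<String> setOf(String... members) {
        return new TreeSet<>(Arrays.asList(members));
    }

    static Set<String> union(Set<String> s1, Set<String> s2) {
        Set<String> result = new TreeSet<>(s1);
        result.addAll(s2);
        return result;
    }

    static Set<String> intersection(Set<String> s1, Set<String> s2) {
        Set<String> result = new TreeSet<>(s1);
        result.retainAll(s2);
        return result;
    }

    static Set<String> difference(Set<String> s1, Set<String> s2) {
        Set<String> result = new TreeSet<>(s1);
        result.removeAll(s2);
        return result;
    }

    static boolean disjoint(Set<String> s1, Set<String> s2) {
        return intersection(s1, s2).isEmpty();
    }

    public static void main(String[] args) {
        Set<String> eu = setOf("AT", "BE", "DE", "DK", "ES", "FI", "FR", "GR",
                               "IE", "IT", "LU", "NL", "PT", "SE", "UK");
        Set<String> nafta = setOf("CA", "MX", "US");
        Set<String> nato = setOf("BE", "CA", "CZ", "DE", "DK", "ES", "FR", "GR", "HU", "IS",
                                 "IT", "LU", "NL", "NO", "PL", "PT", "TR", "UK", "US");
        System.out.println(union(setOf("DK", "NO", "SE"), setOf("IS", "NO"))); // [DK, IS, NO, SE]
        System.out.println(intersection(nafta, nato));                         // [CA, US]
        System.out.println(disjoint(nafta, eu));                               // true
        System.out.println(difference(eu, nato));                              // [AT, FI, IE, SE]
    }
}
```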

9-9

Set applications

• Spelling checker: A spelling checker’s dictionary is a set of words.

The spelling checker highlights any words in the document that are not in the dictionary.

The spelling checker might allow the user to add words to the dictionary.

• Relational database system: A relation is essentially a set of tuples.

Each tuple is distinct.

The tuples are in no particular order.

9-10

Example 1: prime numbers

• A prime number is an integer greater than 1 that is divisible only by itself and 1. E.g.: 2, 7, 11, 13 are prime numbers.

• Eratosthenes’ sieve algorithm:

To compute the set of prime numbers less than m (where m > 0):

1. Set sieve = {2, 3, …, m–1}.

2. For i = 2, 3, …, while i² < m, repeat:

    2.1. If i is a member of sieve:

        2.1.1. Remove all multiples of i from sieve.

3. Terminate with answer sieve.

Step 1 can be refined to:

1.1. Set sieve = { }.

1.2. For i = 2, ..., m–1, repeat:

    1.2.1. Add i to sieve.

Step 2.1.1 can be refined to:

2.1.1. For mult = 2i, 3i, ..., while mult < m, repeat:

    2.1.1.1. Remove mult from sieve.
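
The sieve algorithm translates fairly directly into Java. The following is a minimal sketch, assuming a java.util.TreeSet of Integer as the set representation and a value of m small enough that i * i does not overflow an int; the names Sieve and primesBelow are invented for this example.

```java
import java.util.Set;
import java.util.TreeSet;

class Sieve {
    // Return the set of prime numbers less than m (where m > 0).
    static Set<Integer> primesBelow(int m) {
        // Step 1: sieve = {2, 3, ..., m-1}.
        Set<Integer> sieve = new TreeSet<>();
        for (int i = 2; i < m; i++) {
            sieve.add(i);
        }
        // Step 2: for i = 2, 3, ..., while i*i < m.
        for (int i = 2; i * i < m; i++) {
            if (sieve.contains(i)) {
                // Step 2.1.1: remove the multiples 2i, 3i, ... that are less than m.
                for (int mult = 2 * i; mult < m; mult += i) {
                    sieve.remove(mult);
                }
            }
        }
        // Step 3: the remaining members are the primes.
        return sieve;
    }

    public static void main(String[] args) {
        System.out.println(primesBelow(30)); // [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    }
}
```

A boolean array would serve equally well here, since the members are small integers.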

9-11

Set ADT: requirements

• Requirements:

1) It must be possible to make a set empty.

2) It must be possible to test whether a set is empty.

3) It must be possible to obtain the cardinality of a set.

4) It must be possible to perform a membership test.

5) It must be possible to add or remove a member of a set.

6) It must be possible to test whether two sets are equal.

7) It must be possible to test whether one set subsumes another.

8) It must be possible to compute the union, intersection, or difference of two sets.

9) It must be possible to traverse a set.

9-12

Set ADT: contract (1)

• Possible contract:

public interface Set {

    // Each Set object is a set whose members are objects.

    //////////// Accessors ////////////

    public boolean isEmpty ();
    // Return true if and only if this set is empty.

    public int size ();
    // Return the cardinality of this set.

    public boolean contains (Object obj);
    // Return true if and only if obj is a member of this set.

9-13

Set ADT: contract (2)

• Possible contract (continued):

    public boolean equals (Set that);
    // Return true if and only if this set is equal to that.

    public boolean containsAll (Set that);
    // Return true if and only if this set subsumes that.

9-14

Set ADT: contract (3)

• Possible contract (continued):

    //////////// Transformers ////////////

    public void clear ();
    // Make this set empty.

    public void add (Object obj);
    // Add obj as a member of this set.

    public void remove (Object obj);
    // Remove obj from this set.

    public void addAll (Set that);
    // Make this set the union of itself and that.

9-15

Set ADT: contract (4)

• Possible contract (continued):

    public void removeAll (Set that);
    // Make this set the difference of itself and that.

    public void retainAll (Set that);
    // Make this set the intersection of itself and that.

    //////////// Iterator ////////////

    public Iterator iterator();
    // Return an iterator that will visit all members of this set, in no
    // particular order.

}

9-16

Implementation of sets using arrays (1)

• Represent a bounded set (cardinality ≤ maxcard) by a variable card, containing the current cardinality, and an array members of length maxcard, containing the set members in members[0…card–1].

• Keep the array sorted, and avoid storing duplicates.

Invariant: members[0…card–1] hold the members in ascending order (least member at index 0, greatest member at index card–1); members[card…maxcard–1] are unoccupied.

Empty set: card = 0; members[0…maxcard–1] are unoccupied.

Illustration (maxcard = 6): card = 3; members[0] = CA, members[1] = MX, members[2] = US; members[3…5] are unoccupied.

9-17

Implementation using arrays (2)

• Summary of algorithms and time complexities:

Operation      Algorithm                        Time complexity
contains       binary search                    O(log n)
add            binary search + insertion        O(n)
remove         binary search + deletion         O(n)
equals         pairwise comparison              O(n₂)
containsAll    variant of pairwise comparison   O(n₂)
addAll         array merge                      O(n₁+n₂)
removeAll      variant of array merge           O(n₁+n₂)
retainAll      variant of array merge           O(n₁+n₂)

(n is the cardinality of the set; n₁ and n₂ are the cardinalities of this set and that set.)
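
To make the first few rows concrete, here is a minimal sketch of a sorted-array set of strings. It is an assumption-laden illustration rather than the course's own implementation: the class name ArraySet is invented, members are restricted to String, and only size, contains, add, and equals are shown.

```java
import java.util.Arrays;

class ArraySet {
    private final String[] members; // members[0..card-1] are sorted, with no duplicates
    private int card = 0;           // current cardinality

    ArraySet(int maxcard) {
        members = new String[maxcard];
    }

    int size() {
        return card;
    }

    // O(log n): binary search over the occupied part of the array.
    boolean contains(String obj) {
        return Arrays.binarySearch(members, 0, card, obj) >= 0;
    }

    // O(n): binary search finds the position, then later members shift right.
    void add(String obj) {
        int pos = Arrays.binarySearch(members, 0, card, obj);
        if (pos >= 0) return;                 // already a member
        if (card == members.length) throw new IllegalStateException("set is full");
        int ins = -pos - 1;                   // insertion point reported by binarySearch
        System.arraycopy(members, ins, members, ins + 1, card - ins);
        members[ins] = obj;
        card++;
    }

    // Linear time: because both arrays are sorted, equal sets match position by position.
    boolean equals(ArraySet that) {
        if (card != that.card) return false;
        for (int i = 0; i < card; i++) {
            if (!members[i].equals(that.members[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        ArraySet s = new ArraySet(6);
        s.add("US"); s.add("CA"); s.add("MX"); s.add("CA");
        System.out.println(s.size());         // 3
        System.out.println(s.contains("MX")); // true
    }
}
```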

9-18

Implementation of sets using SLLs (1)

• Represent an (unbounded) set by a variable card, containing the current cardinality, and an SLL, containing one member per node.

• Keep the SLL sorted, and avoid storing duplicates.

Invariant: the SLL's nodes hold the members in ascending order, from the least member in the first node to the greatest member in the last node.

Empty set: the SLL has no nodes.

Illustration: the SLL CA → MX → US represents the set {CA, US, MX}.

9-19

Implementation using SLLs (2)

• Summary of algorithms and time complexities:

Operation      Algorithm                        Time complexity
contains       SLL linear search                O(n)
add            SLL linear search + insertion    O(n)
remove         SLL linear search + deletion     O(n)
equals         pairwise comparison              O(n₂)
containsAll    variant of pairwise comparison   O(n₂)
addAll         SLL merge                        O(n₁+n₂)
removeAll      variant of SLL merge             O(n₁+n₂)
retainAll      variant of SLL merge             O(n₁+n₂)
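
The SLL merge behind addAll can be sketched as follows. This is one possible formulation under stated assumptions, not necessarily the one used in the course: the class SLLSet, its Node class, and the field names are invented here, and members are restricted to String.

```java
class SLLSet {
    // One member per node; the list is kept sorted, with no duplicates.
    private static class Node {
        final String member;
        Node succ;
        Node(String member, Node succ) { this.member = member; this.succ = succ; }
    }

    private Node first = null;
    private int card = 0;

    int size() { return card; }

    // O(n): linear search for the insertion position, then insertion.
    void add(String obj) {
        Node prev = null, curr = first;
        while (curr != null && curr.member.compareTo(obj) < 0) { prev = curr; curr = curr.succ; }
        if (curr != null && curr.member.equals(obj)) return; // already a member
        Node ins = new Node(obj, curr);
        if (prev == null) first = ins; else prev.succ = ins;
        card++;
    }

    // O(n1+n2): merge that into this; both lists are traversed once, in step.
    void addAll(SLLSet that) {
        Node prev = null, curr = first;
        for (Node other = that.first; other != null; other = other.succ) {
            // Advance in this list to the first node not less than other.member.
            while (curr != null && curr.member.compareTo(other.member) < 0) { prev = curr; curr = curr.succ; }
            if (curr != null && curr.member.equals(other.member)) continue; // common member
            Node ins = new Node(other.member, curr);
            if (prev == null) first = ins; else prev.succ = ins;
            prev = ins;
            card++;
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        for (Node n = first; n != null; n = n.succ) {
            if (n != first) sb.append(", ");
            sb.append(n.member);
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        SLLSet a = new SLLSet();
        a.add("SE"); a.add("DK"); a.add("NO");
        SLLSet b = new SLLSet();
        b.add("NO"); b.add("IS");
        a.addAll(b);
        System.out.println(a); // {DK, IS, NO, SE}
    }
}
```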

9-20

Implementation of small-integer sets using boolean arrays (1)

• If the members are known to be small integers, in the range 0…m–1, represent the set by a boolean array b of length m, such that b[i] is true if and only if i is a member of the set.

Invariant: each of b[0…m–1] holds a boolean.

Empty set: b[0…m–1] are all false.

Illustration (m = 10): b[2], b[3], b[5], and b[7] are true; all other components are false. This represents the set {2, 3, 5, 7}.

9-21

Implementation using boolean arrays (2)

• Summary of algorithms and time complexities:

Operation      Algorithm                          Time complexity
contains       test array component               O(1)
add            set array component to true        O(1)
remove         set array component to false       O(1)
equals         pairwise equality test             O(m)
containsAll    pairwise implication test          O(m)
addAll         pairwise disjunction               O(m)
removeAll      pairwise negation + conjunction    O(m)
retainAll      pairwise conjunction               O(m)
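
Each row of this table is a one-line loop body in Java. The sketch below assumes that both sets involved in a binary operation were created with the same range m; the class name SmallIntSet is invented for the example.

```java
class SmallIntSet {
    private final boolean[] b; // b[i] is true if and only if i is a member

    SmallIntSet(int m) {
        b = new boolean[m];    // all components start as false: the empty set
    }

    boolean contains(int i) { return b[i]; }   // O(1)
    void add(int i)         { b[i] = true; }   // O(1)
    void remove(int i)      { b[i] = false; }  // O(1)

    // O(m): count the true components.
    int size() {
        int card = 0;
        for (boolean present : b) if (present) card++;
        return card;
    }

    // O(m): pairwise implication test (that.b[i] implies this.b[i]).
    boolean containsAll(SmallIntSet that) {
        for (int i = 0; i < b.length; i++) if (that.b[i] && !b[i]) return false;
        return true;
    }

    // O(m): pairwise disjunction.
    void addAll(SmallIntSet that) {
        for (int i = 0; i < b.length; i++) b[i] = b[i] || that.b[i];
    }

    // O(m): pairwise negation + conjunction.
    void removeAll(SmallIntSet that) {
        for (int i = 0; i < b.length; i++) b[i] = b[i] && !that.b[i];
    }

    // O(m): pairwise conjunction.
    void retainAll(SmallIntSet that) {
        for (int i = 0; i < b.length; i++) b[i] = b[i] && that.b[i];
    }

    public static void main(String[] args) {
        SmallIntSet primes = new SmallIntSet(10);
        for (int p : new int[] {2, 3, 5, 7}) primes.add(p);
        SmallIntSet evens = new SmallIntSet(10);
        for (int e : new int[] {0, 2, 4, 6, 8}) evens.add(e);
        primes.retainAll(evens);           // primes is now {2}
        System.out.println(primes.size()); // 1
    }
}
```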

9-22

Summary of set implementations (1)

Operation      Array representation   SLL representation   Boolean array representation
contains       O(log n)               O(n)                 O(1)
add            O(n)                   O(n)                 O(1)
remove         O(n)                   O(n)                 O(1)
equals         O(n₂)                  O(n₂)                O(m)
containsAll    O(n₂)                  O(n₂)                O(m)
addAll         O(n₁+n₂)               O(n₁+n₂)             O(m)
removeAll      O(n₁+n₂)               O(n₁+n₂)             O(m)
retainAll      O(n₁+n₂)               O(n₁+n₂)             O(m)

9-23

Summary of set implementations (2)

• The array representation is suitable only for small or static sets. A static set is one in which members are never or infrequently added or removed.

• The SLL representation is suitable only for small sets.

• The boolean-array representation is suitable only for dense sets of small integers. A dense set is one where most potential members are actually present.

• For general applications, we need a more efficient set representation: search tree (see §10) or hash table (see §12).

9-24

Sets in the Java class library

• The java.util.Set interface is similar to the Set interface above.

• The java.util.TreeSet class implements the java.util.Set interface, representing each set by a search tree (see §10).

• The java.util.HashSet class implements the java.util.Set interface, representing each set by an open-bucket hash table (see §12).
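
A short sketch of the two library classes side by side. The class name LibrarySetsDemo and the helpers treeOf and hashOf are invented for this example; the code uses modern generics.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class LibrarySetsDemo {
    static Set<String> treeOf(String... members) {
        return new TreeSet<>(Arrays.asList(members));
    }

    static Set<String> hashOf(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    public static void main(String[] args) {
        Set<String> tree = treeOf("US", "CA", "MX");
        Set<String> hash = hashOf("US", "CA", "MX");
        // A TreeSet iterates in ascending order of its members.
        System.out.println(tree);              // [CA, MX, US]
        // A HashSet makes no promise about iteration order, so print only its size.
        System.out.println(hash.size());       // 3
        // Equality depends on the members, not on the representation.
        System.out.println(tree.equals(hash)); // true
    }
}
```

Code that is written against the java.util.Set interface can switch between the two classes by changing only the constructor call.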

9-25

Example 2: information retrieval (1)

• Consider a very simple information retrieval system.

• A query is a set of key words.

• Each document in the document base is viewed as a set of words. The order of words in a document is of no significance.

• In response to a query, the system identifies each document that contains all or some of the key words.

9-26

Example 2 (2)

• Outline of implementation:

public static final int NONE = 0, SOME = 1, ALL = 2;

public static int score (String name, Set keywords) throws IOException {
    // Return a score reflecting whether the document named name
    // contains all, some, or none of the words in keywords.
    Set docwords = readAllWords(name);
    if (docwords.containsAll(keywords))
        return ALL;
    else if (disjoint(docwords, keywords))
        return NONE;
    else
        return SOME;
}

9-27

Example 2 (3)

• Outline of implementation (continued):

private static boolean disjoint (Set docwords, Set keywords) {
    // Return true if and only if the sets docwords and keywords
    // have no common words.
    Iterator iter = keywords.iterator();
    while (iter.hasNext()) {
        String keyword = (String) iter.next();
        if (docwords.contains(keyword))
            return false;
    }
    return true;
}

9-28

Example 2 (4)

• Outline of implementation (continued):

private static Set readAllWords (String name) throws IOException {
    // Return the set of all words occurring in the document name.
    BufferedReader doc = new BufferedReader(
            new InputStreamReader(new FileInputStream(name)));
    Set words = new TreeSet();   // or: new HashSet()
    for (;;) {
        String word = readWord(doc);
        if (word == null) break;   // end of document
        words.add(word.toLowerCase());
    }
    doc.close();
    return words;
}

9-29

Example 2 (5)

• Outline of implementation (continued):

private static String readWord (BufferedReader doc) throws IOException {
    // Read and return the next word from doc, skipping any preceding
    // white space or punctuation. Return null if no word remains to be
    // read.
    …
}

