Data Structures
September 22
Topics
• Organizational remarks
• Revisited ADTs: definitions
• ADT List and ADT ListDrozdek
• Compare these two ADTs
• We are going to study Chapter 3 through the glasses of ADTs:
• The different implementations of ADT List Drozdek
– Array Implementation
– Singly Linked List Implementation
– Doubly Linked List Implementation
– Circular Singly Linked List Implementation
– Circular Doubly Linked List Implementation
– Skip List implementation
– Implementation with Self-organizing
• ADT ListWithSuccessor and its implementations
• Sparse Tables
• Binary Trees
– Trees, Binary Trees,
Org Remarks
• Send your Assignments (of course, zipped – since your work will usually consist of several files) to Simon Zaaijer (email: [email protected])
• In addition to the above turn in a paper copy of your work in class on the day of the deadline
• Let your zip file have a name which can be traced to your team in the following standard way: username1andUsername2No1.zip, where username1 is the LIACS username of the first member and username2 is the LIACS username of the second member
• Moreover: each file you are turning in should contain the names of the team members, date turned in, due date, Assignment No; all this in a conspicuous manner at the top of the files
• Furthermore, a README file should describe which files you are turning in and give instructions for compilation and execution, if necessary
ADT through the Glasses of a Modern Language (C++)
• Client uses class as abstraction
– Invokes public operations only
– Internal implementation not relevant!
• Client can't and shouldn't muck with internals
– Class data should be private (supported by modern languages)
• Imagine a "wall" between client and implementer
– Wall prevents either from getting involved in other's business
– Interface is the "chink" in the wall
– Conduit allows controlled access between the two
• Consider Lexicon
– Abstraction is a word list, operations to verify word/prefix
– How does it store list? using array? vector? set? does it matter to client?
Why ADTs?
• Abstraction
– Client insulated from details, works at higher level
• Encapsulation
– (In a modern language one can make) internals private to ADT, not accessible by client (accidentally or on purpose)
• Independence
– Separate tasks for each side (once agreed on interface)
• Flexibility
– ADT implementation can be changed without affecting client (provided the implementation adheres to the contract: that is, syntax and semantics of functions do not change)
• Once and Only Once
• Strongly supports Modularity
Data Abstraction
Data abstraction – a way of processing/manipulating data according to the black-box principle. The data is processed by high-level functions which invoke lower-level functions. This approach is usually used in OOP (object-oriented programming), which enables one to work with objects without penetrating into the particulars of their realization.
ADTs (narrative)
Abstract Data Type – a data type which provides a set of operations for working with the elements of this type, and also provides the possibility of creating elements of this type by means of special functions. All internal structure of such a type is hidden from the client programmer – this is the crux of abstraction. An ADT defines a set of functions to process/manipulate its values; these functions are independent of any concrete realization of the type. Concrete realizations of an ADT are called data structures.
In programming, ADTs are normally represented in the form of interfaces which hide the corresponding implementations of the types. Client programmers work with ADTs solely by means of the interfaces, as the implementation may change in the future. This approach is in harmony with the principle of encapsulation in OOP. The strength of this approach is exactly this hiding of the implementation. Since only the interface is published, each program working with the given ADT will continue to work correctly as long as the data structure supports this interface. The implementers of a data structure can incrementally fine-tune the realizations, improving the algorithms with respect to speed, reliability, and use of memory, without changing the published interface and without changing the semantics of the functions.
The difference between ADTs and data structures, which realize the abstract types, can be explained by the following example. ADT List can be implemented by means of an array or by some linear linked list which uses some technique of dynamic storage allocation. However, each realization defines the same set of functions, which must work the same way for all implementations (according to functionality but not as far as speed is concerned).
ADTs strongly support the creation of modular programming products and usually give rise to several exchangeable (equivalent) implementations of a single module.
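The idea of exchangeable implementations behind one interface can be sketched in C++ as follows. This is only an illustration under assumptions: the names IntList, ArrayIntList, LinkedIntList and sumAll are hypothetical and do not come from the slides or from Drozdek, and the standard containers std::vector and std::list stand in for a hand-written array and linked list.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical interface: the only thing the client sees.
class IntList {
public:
    virtual ~IntList() = default;
    virtual std::size_t length() const = 0;
    virtual void append(int item) = 0;
    virtual int retrieve(std::size_t pos) const = 0;  // positions are 1-based
};

// Realization 1: contiguous (array-like) storage.
class ArrayIntList : public IntList {
public:
    std::size_t length() const override { return data_.size(); }
    void append(int item) override { data_.push_back(item); }
    int retrieve(std::size_t pos) const override {
        assert(pos >= 1 && pos <= data_.size());
        return data_[pos - 1];                        // constant time
    }
private:
    std::vector<int> data_;
};

// Realization 2: linked storage; same behaviour, different speed.
class LinkedIntList : public IntList {
public:
    std::size_t length() const override { return data_.size(); }
    void append(int item) override { data_.push_back(item); }
    int retrieve(std::size_t pos) const override {
        assert(pos >= 1 && pos <= data_.size());
        // Walks pos - 1 links from the front: linear time.
        return *std::next(data_.begin(), static_cast<std::ptrdiff_t>(pos - 1));
    }
private:
    std::list<int> data_;
};

// Client code: written against the interface only.
int sumAll(const IntList& list) {
    int total = 0;
    for (std::size_t pos = 1; pos <= list.length(); ++pos)
        total += list.retrieve(pos);
    return total;
}
```

The client function sumAll gives the same result for either realization; only the running time differs, which matches the remark that implementations must agree in functionality but not necessarily in speed.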
(See, for instance, the several Wikipedias (English, Dutch, German, French, Spanish, Russian, ...) or NIST (National Institute of Standards and Technology) for different definitions. ADTs are defined uniformly across the board; the terms data structure and data type show some variation. See also pages 1-3, section 1.9, page 137 of Drozdek.)
Lists
• Lists contain items of the same type
– List of grocery items
– List of phone numbers
• What can you do to the items of a list?
• Determine the length of the list
• Add an item to the list
• Remove an item from the list
• Retrieve an item from the list
• Where do you want to add a new item and which item do you want to look at?
• Various answers give various ADTs List
ADT List (there are some variants)
createList()
// Creates an empty list.
destroyList()
// Destroys a list.
listIsEmpty()
// Determines whether the list is empty.
listLength()
// Returns the number of items in the list.
listInsert(newPosition, newItem, success)
// Inserts newItem at position newPosition of a list, if 1 ≤ newPosition ≤ listLength() + 1.
// If newPosition ≤ listLength(), the items are shifted as follows: the item at newPosition
// becomes the item at newPosition + 1, the item at newPosition + 1 becomes the item
// at newPosition + 2, and so on. Success indicates whether the insertion was successful.
listDelete(pos, success)
// Deletes the item at position pos of a list, if 1 ≤ pos ≤ listLength(). If pos < listLength(), the items are
// shifted as follows: the item at pos + 1 becomes the item at pos, the item at pos + 2 becomes the item at pos + 1, and
// so on. Success indicates whether the deletion was successful.
listRetrieve(pos, dataItem, success)
// Sets dataItem to the item at position pos of a list, if 1 ≤ pos ≤ listLength(). The list is left unchanged by
// this operation. Success indicates whether the retrieval was successful.
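The position-based operations above could be realized with an array roughly as follows. This is a minimal sketch, assuming int items and a fixed capacity; the class name ArrayList and the constant MAX_LIST are hypothetical, and createList()/destroyList() are mapped onto the constructor and the (implicit) destructor.

```cpp
#include <cassert>

class ArrayList {
public:
    static constexpr int MAX_LIST = 100;    // assumed fixed capacity

    ArrayList() : size_(0) {}               // plays the role of createList()

    bool listIsEmpty() const { return size_ == 0; }
    int listLength() const { return size_; }

    void listInsert(int newPosition, int newItem, bool& success) {
        success = newPosition >= 1 && newPosition <= size_ + 1 && size_ < MAX_LIST;
        if (!success) return;
        // Shift positions newPosition..size_ one place to the right,
        // starting at the end (position p lives at index p - 1).
        for (int pos = size_; pos >= newPosition; --pos)
            items_[pos] = items_[pos - 1];
        items_[newPosition - 1] = newItem;
        ++size_;
    }

    void listDelete(int pos, bool& success) {
        success = pos >= 1 && pos <= size_;
        if (!success) return;
        // Shift positions pos+1..size_ one place to the left.
        for (int p = pos; p < size_; ++p)
            items_[p - 1] = items_[p];
        --size_;
    }

    void listRetrieve(int pos, int& dataItem, bool& success) const {
        success = pos >= 1 && pos <= size_;
        if (success) dataItem = items_[pos - 1];   // list is left unchanged
    }

private:
    int items_[MAX_LIST];
    int size_;
};
```

In this sketch insertion and deletion cost time proportional to the number of shifted items, while retrieval by position takes constant time.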
This variant is not discussed in Drozdek.
Recall: Self-referential Classes
Self-referential class objects can be linked together to form useful data structures (such as lists, queues, stacks, or trees).
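A self-referential class is simply a class with a pointer member to an object of its own type. The sketch below is illustrative only; the names IntNode and countNodes are assumptions, not taken from the slides.

```cpp
#include <cstddef>

// A node that refers to another node of the same class.
struct IntNode {
    int info;
    IntNode* next;   // the self-reference: link to the next node, or nullptr

    explicit IntNode(int value, IntNode* link = nullptr)
        : info(value), next(link) {}
};

// Follows the links from head and counts the nodes that are reachable.
std::size_t countNodes(const IntNode* head) {
    std::size_t count = 0;
    for (const IntNode* p = head; p != nullptr; p = p->next)
        ++count;
    return count;
}
```

Linking three such nodes together already gives a small singly linked list; the other linked structures differ only in how many links each node carries and where they point.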
Compare Two ADTs
ADT List
ADT ListDrozdek
Ch 3 organized around the ADT concept: ADT ListDrozdek and its Various Implementations
– Array Implementation
– Singly Linked List Implementation
– Doubly Linked List Implementation
– Circular Singly Linked List Implementation
– Circular Doubly Linked List Implementation
– Skip List implementation
– Implementation with Self-organization
ADT List Drozdek: array implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
ADT List Drozdek: singly linked list implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
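As a starting point for the analysis, here is a sketch of a singly linked list with head and tail pointers, with the expected cost noted on each operation. The operation names follow the ADT ListDrozdek operations used in these slides; the class name IntSLList, the bool result of deleteNode and the internal details are assumptions of this sketch, not a reproduction of the book's code.

```cpp
#include <cassert>

class IntSLList {
    struct Node {
        int info;
        Node* next;
        explicit Node(int el, Node* n = nullptr) : info(el), next(n) {}
    };
    Node* head;
    Node* tail;

public:
    IntSLList() : head(nullptr), tail(nullptr) {}
    ~IntSLList() { while (!isEmpty()) deleteFromHead(); }
    IntSLList(const IntSLList&) = delete;
    IntSLList& operator=(const IntSLList&) = delete;

    bool isEmpty() const { return head == nullptr; }

    void addToHead(int el) {                    // O(1)
        head = new Node(el, head);
        if (tail == nullptr) tail = head;
    }

    void addToTail(int el) {                    // O(1) thanks to the tail pointer
        Node* n = new Node(el);
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
    }

    int deleteFromHead() {                      // O(1); list must not be empty
        assert(!isEmpty());
        Node* old = head;
        int el = old->info;
        head = head->next;
        if (head == nullptr) tail = nullptr;
        delete old;
        return el;
    }

    int deleteFromTail() {                      // O(n): must find tail's predecessor
        assert(!isEmpty());
        int el = tail->info;
        if (head == tail) {
            delete head;
            head = tail = nullptr;
            return el;
        }
        Node* pred = head;
        while (pred->next != tail) pred = pred->next;
        delete tail;
        tail = pred;
        tail->next = nullptr;
        return el;
    }

    bool deleteNode(int el) {                   // O(n): deletion by value
        Node* pred = nullptr;
        Node* cur = head;
        while (cur != nullptr && cur->info != el) { pred = cur; cur = cur->next; }
        if (cur == nullptr) return false;       // el not in the list
        if (pred == nullptr) head = cur->next; else pred->next = cur->next;
        if (cur == tail) tail = pred;
        delete cur;
        return true;
    }

    bool isInList(int el) const {               // O(n): sequential search
        for (const Node* p = head; p != nullptr; p = p->next)
            if (p->info == el) return true;
        return false;
    }
};
```

The only head or tail operation that is not constant time here is deleteFromTail, because a node has no link back to its predecessor.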
ADT List Drozdek: doubly linked list implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
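The main change compared with the singly linked version is that each node also stores a link to its predecessor, so deleting at the tail no longer needs a traversal. A minimal sketch of just the tail operations, with the hypothetical class name IntDLList:

```cpp
#include <cassert>

class IntDLList {
    struct Node {
        int info;
        Node* prev;
        Node* next;
    };
    Node* head;
    Node* tail;

public:
    IntDLList() : head(nullptr), tail(nullptr) {}
    ~IntDLList() { while (!isEmpty()) deleteFromTail(); }
    IntDLList(const IntDLList&) = delete;
    IntDLList& operator=(const IntDLList&) = delete;

    bool isEmpty() const { return head == nullptr; }

    void addToTail(int el) {                    // O(1)
        Node* n = new Node{el, tail, nullptr};
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
    }

    int deleteFromTail() {                      // O(1): predecessor is tail->prev
        assert(!isEmpty());
        Node* old = tail;
        int el = old->info;
        tail = old->prev;
        if (tail != nullptr) tail->next = nullptr; else head = nullptr;
        delete old;
        return el;
    }
};
```

Deleting by value and searching still require a sequential scan, so those two operations remain linear in the length of the list.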
ADT List Drozdek: circular singly linked list implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
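In a circular singly linked list one pointer to the tail is enough, because the head is always tail->next. The following sketch, with the hypothetical class name IntCSLList, shows the constant-time operations under that assumption.

```cpp
#include <cassert>

class IntCSLList {
    struct Node {
        int info;
        Node* next;
    };
    Node* tail;   // tail->next is the head; nullptr means the list is empty

public:
    IntCSLList() : tail(nullptr) {}
    ~IntCSLList() { while (!isEmpty()) deleteFromHead(); }
    IntCSLList(const IntCSLList&) = delete;
    IntCSLList& operator=(const IntCSLList&) = delete;

    bool isEmpty() const { return tail == nullptr; }

    void addToHead(int el) {                    // O(1)
        if (tail == nullptr) {
            tail = new Node{el, nullptr};
            tail->next = tail;                  // a single node points to itself
        } else {
            tail->next = new Node{el, tail->next};
        }
    }

    void addToTail(int el) {                    // O(1)
        addToHead(el);                          // new node sits right after the old tail
        tail = tail->next;                      // ...so it can simply become the tail
    }

    int deleteFromHead() {                      // O(1); list must not be empty
        assert(!isEmpty());
        Node* head = tail->next;
        int el = head->info;
        if (head == tail) tail = nullptr; else tail->next = head->next;
        delete head;
        return el;
    }
};
```

Deleting at the tail is still linear in this representation, since the new tail is the predecessor of the old one and can only be found by walking around the ring.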
ADT List Drozdek: circular doubly linked list implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
ADT List Drozdek: skip list implementation
• Analyze each of the operations and determine the cost of each
– Insert at the head
– Delete at the head
– Insert at the tail
– Delete at the tail
– Delete int
– Determine whether int is in the list.
ADT List Drozdek: self-organizing list implementation
There are four methods for organizing lists:
• Move-to-front method – after the desired element is located, put it at the beginning of the list (Figure 3.18a)
• Transpose method – after the desired element is located, swap it with its predecessor unless it is at the head of the list (Figure 3.18b)
• Count method – order the list by the number of times elements are being accessed (Figure 3.18c)
• Ordering method – order the list using certain criteria natural for the information under scrutiny (Figure 3.18d)
• Optimal static ordering – all the data are already ordered by the frequency of their occurrence in the body of data so that the list is used only for searching, not for inserting new items
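The move-to-front and transpose methods can be sketched as follows. To keep the example short it operates on a std::vector<int> instead of a linked list, which is a substitution made for this sketch; the function names accessMoveToFront and accessTranspose are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Move-to-front: after the element is located, put it at the beginning.
// Returns false if key is not in the list.
bool accessMoveToFront(std::vector<int>& list, int key) {
    auto it = std::find(list.begin(), list.end(), key);
    if (it == list.end()) return false;
    // Rotating [begin, it] by one brings *it to the front and shifts
    // the elements before it one place to the right.
    std::rotate(list.begin(), it, it + 1);
    return true;
}

// Transpose: after the element is located, swap it with its predecessor
// unless it is already at the head of the list.
bool accessTranspose(std::vector<int>& list, int key) {
    auto it = std::find(list.begin(), list.end(), key);
    if (it == list.end()) return false;
    if (it != list.begin()) std::iter_swap(it, it - 1);
    return true;
}
```

Starting from 1 2 3 4, accessing 3 with move-to-front yields 3 1 2 4, and then accessing 4 with transpose yields 3 1 4 2. On a linked list the reordering step would relink nodes instead of shifting elements.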
Self-organizing lists
ADT List DrozdekWithSuccessor
• Has new operation: successor(item)
• Implementations:
– Array
– Singly linked list
– Circular singly linked list
– Circular doubly linked list
– Etc.
Sparse tables: example student grades
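The point of a sparse table is to store only the cells that actually hold data. The sketch below illustrates this for student grades using a std::map keyed by (student, course); this is a stand-in chosen for brevity and is not the linked-list representation from the textbook. The class name SparseGradeTable, its member names and the numeric student and course identifiers are all assumptions.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

class SparseGradeTable {
public:
    // Stores (or overwrites) the grade of one student in one course.
    void setGrade(int student, int course, const std::string& grade) {
        cells_[std::make_pair(student, course)] = grade;
    }

    // Sets grade and returns true if the cell is populated;
    // returns false for an empty cell and leaves grade unchanged.
    bool getGrade(int student, int course, std::string& grade) const {
        auto it = cells_.find(std::make_pair(student, course));
        if (it == cells_.end()) return false;
        grade = it->second;
        return true;
    }

    // Memory use grows with the populated cells only,
    // not with (#students x #courses).
    std::size_t populatedCells() const { return cells_.size(); }

private:
    std::map<std::pair<int, int>, std::string> cells_;
};
```

With thousands of students and hundreds of courses, a full two-dimensional array would consist mostly of empty cells, whereas this table holds one entry per grade actually given.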
Summary
• A linked structure is a collection of nodes storing data and links to other nodes.
• A linked list is a data structure composed of nodes, each node holding some information and a reference to another node in the list.
• We viewed each of the linked structures in Ch3 as a different implementation of ADT ListDrozdek.
• A singly linked list is a linked list in which each node has a link only to its successor in the sequence.
• A circular list is a list in which the nodes form a ring: the list is finite and each node has a successor.
• A skip list is a variant of the ordered linked list that makes a nonsequential search possible.
• There are four methods for organizing lists: move-to-front method, transpose method, count method, and ordering method.
• Optimal static ordering - all the data are already ordered by the frequency of their occurrence in the body of data so that the list is used only for searching, not for inserting new items.
• A sparse table refers to a table that is populated sparsely by data and most of its cells are empty.
• Linked lists allow easy insertion and deletion of information because such operations have a local impact on the list.
• The advantage of arrays over linked lists is that they allow random accessing.
Linear Structures
• ADT List (with variations such as ADT ListDrozdek)
• ADT Queue
• ADT Stack
• ADT Deque
• ADT HSQ (non-standard)
Graphical Summary of Linear, Standard ADTs
ADT Stack (LIFO discipline)
– Push = insert at top
– Pop = remove top
– Retrieve top
ADT Queue (FIFO discipline)
– enqueue
– dequeue
– Retrieve first element
ADT Deque
– Insert at end1, Insert at end2
– Remove end1, Remove end2
– Retrieve end1, Retrieve end2
ADT ListDrozdek
– addToHead, addToTail
– deleteFromHead, deleteFromTail
– deleteNode(int or T) // deletion done by value, not position
– isInList(int or T)
– The deletes also return info (do retrieve as well)
ADT List (positions 1 2 ... n)
– listInsert(pos, int or T)
– listRetrieve(pos, int& or T&)
– listDelete(pos)
– posType=int; could be more general
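The difference between the stack and queue disciplines can be illustrated by restricting one and the same underlying container in two ways. This is a sketch under assumptions: std::deque is used as the storage, and the class names IntStack and IntQueue and the member name first are hypothetical.

```cpp
#include <cassert>
#include <deque>

// ADT Stack: insert, remove and retrieve all happen at the top (LIFO).
class IntStack {
public:
    bool isEmpty() const { return items_.empty(); }
    void push(int x) { items_.push_back(x); }                      // insert at top
    int top() const { assert(!isEmpty()); return items_.back(); }  // retrieve top
    int pop() {                                                    // remove top
        int x = top();
        items_.pop_back();
        return x;
    }
private:
    std::deque<int> items_;
};

// ADT Queue: insert at the rear, remove and retrieve at the front (FIFO).
class IntQueue {
public:
    bool isEmpty() const { return items_.empty(); }
    void enqueue(int x) { items_.push_back(x); }
    int first() const { assert(!isEmpty()); return items_.front(); }  // retrieve first element
    int dequeue() {
        int x = first();
        items_.pop_front();
        return x;
    }
private:
    std::deque<int> items_;
};
```

After inserting 1, 2, 3 in that order, the stack hands them back as 3, 2, 1 and the queue as 1, 2, 3. An ADT Deque would expose insert, remove and retrieve at both ends.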
Hierarchical structures: Trees
Objectives
Discuss the following topics:
• Trees, Binary Trees, and Binary Search Trees
• Implementing Binary Trees
• Searching a Binary Search Tree
• Tree Traversal
• Insertion
• Deletion
• Balancing a Tree
• Self-Adjusting Trees
• Heaps
Trees, Binary Trees, and Binary Search Trees
• A tree is a data type that consists of nodes and arcs
• These trees are depicted upside down with the root at the top and the leaves (terminal nodes) at the bottom
• The root is a node that has no parent; it can have only child nodes
• Leaves have no children (their children are null)
• Each node has to be reachable from the root through a unique sequence of arcs, called a path
• The number of arcs in a path is called the length of the path
• The level of a node is the length of the path from the root to the node plus 1, which is the number of nodes in the path
• The height of a nonempty tree is the maximum level of a node in the tree
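With levels counted from 1 at the root, the height of a tree can be computed recursively: an empty tree has height 0, and a nonempty tree has height 1 plus the largest height among its subtrees. The sketch below assumes a general tree whose nodes keep a vector of child pointers; the names TreeNode and height are illustrative.

```cpp
#include <algorithm>
#include <vector>

struct TreeNode {
    int info;
    std::vector<TreeNode*> children;   // empty for a leaf

    explicit TreeNode(int value) : info(value) {}
};

// Height = maximum level of a node, where the root is at level 1.
// The empty tree (nullptr) has height 0.
int height(const TreeNode* root) {
    if (root == nullptr) return 0;
    int deepest = 0;
    for (const TreeNode* child : root->children)
        deepest = std::max(deepest, height(child));
    return 1 + deepest;
}
```

A tree consisting of a root, two children and one grandchild has height 3: the path from the root to the grandchild has length 2, so the grandchild is at level 3.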
Figure 6-1 Examples of trees
Figure 6-2 Hierarchical structure of a university shown as a tree
Recall Definition of tree
1. An empty structure is a tree
2. If t1, ..., tk are trees, then the structure whose root has as its children the roots of t1, ..., tk is also a tree
3. Only structures generated by rules 1 and 2 are trees
Alternatively: a connected graph which contains no cycles (circuits) is a tree
Equivalent statements (see φ1)
• Let T be a graph with n vertices; then the following are equivalent:
– T is a tree
– T contains no circuits, and has n-1 edges
– T is connected, and has n-1 edges
– T is connected, and every edge is a bridge
– Any two vertices are connected by exactly one path
– T contains no circuits, but the addition of any new edge creates exactly one circuit.
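One of these characterizations, "T is connected, and has n-1 edges", translates directly into a test. The sketch below assumes n ≥ 1, an undirected graph given as a list of edges, and vertices numbered 0 to n-1; the function name isTree is hypothetical.

```cpp
#include <utility>
#include <vector>

// Returns true if the undirected graph with vertices 0..n-1 and the given
// edges is a tree, using the characterization "connected with n-1 edges".
bool isTree(int n, const std::vector<std::pair<int, int>>& edges) {
    if (n <= 0 || static_cast<int>(edges.size()) != n - 1) return false;

    // Build adjacency lists.
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    // Depth-first search from vertex 0, counting the vertices reached.
    std::vector<bool> seen(n, false);
    std::vector<int> stack{0};
    seen[0] = true;
    int reached = 1;
    while (!stack.empty()) {
        int v = stack.back();
        stack.pop_back();
        for (int w : adj[v]) {
            if (!seen[w]) {
                seen[w] = true;
                ++reached;
                stack.push_back(w);
            }
        }
    }
    return reached == n;   // connected if and only if every vertex was reached
}
```

A triangle on vertices 0, 1, 2 plus an isolated vertex 3 has the required 3 edges for n = 4 but is rejected because vertex 3 is never reached, which shows that the edge count alone is not enough.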
• An orderly tree is a tree in which all elements are stored according to some predetermined criterion of ordering
Figure 6-3 Transforming (a) a linked list into (b) a tree
• A binary tree is a tree whose nodes have two children (possibly empty), and each child is designated as either a left child or a right child
Figure 6-4 Examples of binary trees
• In a complete binary tree, all nonterminal nodes have both their children, and all leaves are at the same level
• A decision tree is a binary tree in which all nodes have either zero or two nonempty children
Figure labels: complete binary tree; complete decision tree; incomplete decision tree; binary tree
• At level i in a binary tree there are at most 2^(i-1) nodes
• In a nonempty binary tree whose nonterminal nodes have exactly two nonempty children, # of leaves = 1 + # of nonterminal nodes
• In a complete binary decision tree, # of nodes = 2^height - 1; one way to show this is to use the statement # of leaves = 1 + # of nonterminal nodes; another way is to count how many nodes there are in each level and then sum the geometric series
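Both routes to the node count can be written out as a short derivation. This is a sketch assuming a complete binary tree with levels numbered from 1, where level i holds exactly 2^(i-1) nodes; the symbol h for the height is introduced here for brevity.

```latex
\begin{align*}
\#\text{nodes} &= \sum_{i=1}^{h} 2^{i-1} = \frac{2^{h}-1}{2-1} = 2^{h}-1
  && \text{(sum of the geometric series over the levels)} \\
\#\text{leaves} &= 2^{h-1}, \qquad
\#\text{nonterminal nodes} = \sum_{i=1}^{h-1} 2^{i-1} = 2^{h-1}-1
  && \text{(last level versus all levels above it)} \\
\#\text{nodes} &= \#\text{leaves} + \#\text{nonterminal nodes}
  = 2^{h-1} + \left(2^{h-1}-1\right) = 2^{h}-1
  && \text{(using } \#\text{leaves} = 1 + \#\text{nonterminal nodes)}
\end{align*}
```

For example, with h = 3 the levels hold 1, 2 and 4 nodes, giving 7 = 2^3 - 1 nodes, of which 4 are leaves and 3 are nonterminal.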
Figure 6-5 Adding a leaf to tree (a), preserving the relation of the number of leaves to the number of nonterminal nodes (b)