Data Structures 2008

1

Data Structures

Mostafa M. Aref

Ain Shams University

Faculty of Computer & Information Sciences

2

Problem-Solving

Problem - Reasoning - Solution - Test
Analytic Approach
Algorithmic Approach
– Input, Process, Output
– An algorithm is a sequence of executable instructions
  no ambiguity (instruction or sequence)
  finite (steps or execution)
– Sequence: I/O, variable assignment
– Selection: If . . . Else
– Repetition: looping
– Condition Looping: while or until

3

Software Engineering

– Requirement Specification
– Analysis (Input, Output, Formula, Units)
– Design (algorithm, verification)
– Implementation (Language)
– Testing

Waterfall Model

Specification

Analysis

Design

Implementation

Testing

4

Evolution of Object-Oriented Programming

Variables
– A variable is a value that can change, depending on conditions or on information passed to the program.

Data types
– A set of values from which a variable, constant, function, or other expression may take its value. A type is a classification of data that tells the compiler or interpreter how the programmer intends to use it.

User defined data types

Abstract Data types
– A type whose internal form is hidden behind a set of access functions. Objects of the type are created and inspected only by calls to the access functions. This allows the implementation of the type to be changed without requiring any changes outside the module in which it is defined.
– Abstraction
– Encapsulation

Object-Oriented Programming

5

Object-Oriented Programming

Advantages of Object-Oriented Programming
– Simplicity
– Modularity
– Modifiability
– Extensibility
– Flexibility
– Maintainability
– Reusability

6

Object-Oriented Features

Abstraction: the process of capturing the essential features and ignoring the detail.
Encapsulation: an information hiding mechanism.
– Object: an entity which has both variables and methods.
  Variables comprise the state of the object.
  Methods are mechanisms for accessing or changing the state of the object.
  Public variables or methods
  Private variables and methods
– Class: a template that can be used to create many objects, each with a different name and identity, yet sharing the method code and shared variables.
  Instantiation: the process of creating objects using the description of a class
  Myclass m1 = new Myclass ();
– Message Passing: an object sends a message to another requesting that a service be performed.
Inheritance
Polymorphism: the same message is sent to a collection of objects and each object is able to respond in its own way.

7

Abstraction

Object-Oriented Strategies

Properties of a Good Abstraction
– Well named
– Coherent
– Accurate
– Minimal
– Complete

8

Separation

Software Interface: the external, visible aspects of a software artifact by which the behavior of the artifact is elicited.

Software Implementation: The programmed mechanism that realizes the behavior implied by an interface.

Separation: In software systems, the independent specification of a software interface and one or more software implementations of that interface.

What: Goals, Policy, Product (interface, specification, requirement), Ends

How: Plans, Mechanism, Process (implementation), Means

9

Data Structures

– The study of data structures is concerned with the representation and manipulation of data.
– All programs manipulate data.
– So, all programs represent data in some way.
– Data manipulation requires an algorithm.
– Algorithm design methods are needed to develop programs that do the data manipulation.
– The study of data structures and algorithms is fundamental to Computer Science.

10

Data Structures

Data object: set or collection of instances
– integer = {0, +1, -1, +2, -2, +3, -3, …}
– daysOfWeek = {S, M, T, W, Th, F, Sa}

Instances may or may not be related
– myDataObject = {apple, chair, 2, 5.2, red, green, Jack}

Relationships exist among instances and among the elements that comprise an instance.

Among instances of integer
– 369 < 370 or 280 + 4 = 284

Among elements that comprise the instance 369
– 3 is more significant than 6
– 3 is immediately to the left of 6
– 9 is immediately to the right of 6

The relationships are usually specified by specifying operations on one or more instances: add, subtract, predecessor, multiply.

11

Linear (or Ordered) Lists

Instances are of the form $(e_0, e_1, e_2, \dots, e_{n-1})$, where
– $e_i$ denotes a list element
– $n \ge 0$ is finite
– list size is n

$L = (e_0, e_1, e_2, e_3, \dots, e_{n-1})$ relationships
– $e_0$ is the zero'th (or front) element
– $e_{n-1}$ is the last element
– $e_i$ immediately precedes $e_{i+1}$

Linear List Examples
– Students in CS502 = (Jack, Jill, Abe, Henry, Mary, …, Judy)

– Quizzes in CS502 = (Quiz1, Quiz2, Quiz3)

– Days of Week = (S, M, T, W, Th, F, Sa)

– Months = (Jan, Feb, Mar, Apr, …, Nov, Dec)

12

Linear List Operations

– size( ): determine list size
  L = (a,b,c,d,e)
  size = 5
– get(theIndex): get element with given index
  L = (a,b,c,d,e)
  get(0) = a, get(2) = c, get(4) = e
  get(-1) = error, get(9) = error
– indexOf(theElement): determine the index of an element
  L = (a,b,d,b,a)
  indexOf(d) = 2, indexOf(a) = 0, indexOf(z) = -1
– remove(theIndex): remove and return element with given index
  L = (a,b,c,d,e,f,g)
  remove(2) returns c and L becomes (a,b,d,e,f,g)
  index of d, e, f, and g decreases by 1
  remove(-1) => error
  remove(20) => error
– add(theIndex, theElement): add an element so that the new element has a specified index
  L = (a,b,c,d,e,f,g)
  add(0,h) => L = (h,a,b,c,d,e,f,g); index of a, b, c, d, e, f, and g increases by 1
  add(2,h) => L = (a,b,h,c,d,e,f,g); index of c, d, e, f, and g increases by 1
  add(10,h) => error
  add(-6,h) => error

13

Data Structure Specification

Language independent: Abstract Data Type

Linear List Abstract Data Type

AbstractDataType LinearList
{
   instances
      ordered finite collections of zero or more elements
   operations
      isEmpty(): return true iff the list is empty, false otherwise
      size(): return the list size (number of elements in the list)
      get(index): return the element at position index of the list
      indexOf(x): return the index of the first occurrence of x in the list, return -1 if x is not in the list
      remove(index): remove and return the element at position index; elements with higher index have their index reduced by 1
      add(theIndex, x): insert x as the element at position theIndex; elements with index >= theIndex have their index increased by 1
      output(): output the list elements from left to right
}

14

Linear List as C++ Class

class LinearList
{
public:
   virtual bool isEmpty() = 0;
   virtual int size() = 0;
   virtual Object get(int index) = 0;
   virtual int indexOf(Object elem) = 0;
   virtual Object remove(int index) = 0;
   virtual void add(int index, Object obj) = 0;
   virtual String toString() = 0;
};

Extending A C++ Class

class ArrayLinearList : public LinearList
{
protected:
   Object *element;   // array of elements
   int size;          // number of elements in array
   // code for all Array implementation must come here
};

15

Linear List Array Representation

– Use a one-dimensional array element[].
– L = (a, b, c, d, e). Store element i of the list in element[i].

Alternative mappings
– Right To Left Mapping
– Mapping That Skips Every Other Position
– Wrap Around Mapping

Add/Remove An Element
– size = 5: element[] holds a b c d e
– add(1,g), size = 6: element[] holds a g b c d e

[Figures: layouts of L in element[0..6] under each mapping, and before and after add(1,g)]

16

Array Representation

Data Type Of Array element[]
– Data type of list elements is unknown.
– Define element[] to be of data type Object.
– Cannot put elements of primitive data types (int, float, double, char, etc.) into our linear lists.

Length of Array element[]
– Don't know how many elements will be in the list.
– Must pick an initial length and dynamically increase it as needed.

Create An Empty List
ArrayLinearList a(100), b, c;   // b and c use the default capacity
LinearList *p = new ArrayLinearList(100), *q = new ArrayLinearList();

Using A Linear List
a.size();
a.add(0, 2);
b.remove(0);
if (a.isEmpty())
   a.add(0, 5);

17

Class ArrayLinearList

ArrayLinearList(int initialCapacity)
{  if (initialCapacity < 1)
      cout << "initialCapacity must be >= 1";
   else
      element = new Object[initialCapacity];
}

ArrayLinearList() : ArrayLinearList(10) { }   // use default capacity of 10

bool isEmpty()   /** return true iff list is empty */
   {return size == 0;}

int size()   /** return current number of elements in list */
   {return size;}

Object get(int index)
{  if (index < 0 || index >= size)
   {  cout << "index = " << index << " size = " << size;
      return Object();   // invalid index: report it, return a default element
   }
   return element[index];
}

int indexOf(Object theElement)
{  for (int i = 0; i < size; i++)   // search element[] for theElement
      if (element[i] == theElement)
         return i;
   return -1;   // theElement not found
}

18

The Class ArrayLinearList

Object remove(int index)
{  if (index < 0 || index >= size)
   {  cout << "index = " << index << " size = " << size;
      return Object();   // invalid index: nothing is removed
   }
   // valid index, shift elements with higher index
   Object removedElement = element[index];
   for (int i = index + 1; i < size; i++)
      element[i-1] = element[i];
   element[--size] = null;
   return removedElement;
}

void add(int index, Object theElement)
{  if (index < 0 || index > size)   // invalid list position
   {  cout << "index = " << index << " size = " << size;
      return;   // do not shift or insert on a bad index
   }
   // valid index, make sure we have space
   if (size == element.length)   // no space, double capacity
      element = ChangeArrayLength.changeLength1D(element, 2 * size);
   for (int i = size - 1; i >= index; i--)   // shift elements right one position
      element[i + 1] = element[i];
   element[index] = theElement;
   size++;
}
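The slide code mixes Java and C++ idioms, so it reads best as pseudocode. The following is a minimal compilable sketch of the same array-based list in standard C++. It rests on assumptions that are not in the slides: the class name `ArrayList`, a template parameter `T` in place of `Object`, exceptions in place of printed error messages, and a private `grow()` helper standing in for `changeLength1D`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical templated stand-in for the slides' ArrayLinearList.
template <typename T>
class ArrayList {
public:
    explicit ArrayList(std::size_t initialCapacity = 10)
        : capacity_(initialCapacity < 1 ? 1 : initialCapacity),
          element_(new T[capacity_]) {}
    ~ArrayList() { delete[] element_; }
    ArrayList(const ArrayList&) = delete;             // copying omitted for brevity
    ArrayList& operator=(const ArrayList&) = delete;

    bool isEmpty() const { return size_ == 0; }
    std::size_t size() const { return size_; }

    const T& get(std::size_t index) const {
        if (index >= size_) throw std::out_of_range("get: bad index");
        return element_[index];
    }

    // Index of the first occurrence of x, or -1 if x is absent.
    long indexOf(const T& x) const {
        for (std::size_t i = 0; i < size_; ++i)
            if (element_[i] == x) return static_cast<long>(i);
        return -1;
    }

    T remove(std::size_t index) {
        if (index >= size_) throw std::out_of_range("remove: bad index");
        T removed = std::move(element_[index]);
        for (std::size_t i = index + 1; i < size_; ++i)   // shift left
            element_[i - 1] = std::move(element_[i]);
        --size_;
        return removed;
    }

    void add(std::size_t index, const T& x) {
        if (index > size_) throw std::out_of_range("add: bad index");
        if (size_ == capacity_) grow();                   // no space: double capacity
        for (std::size_t i = size_; i > index; --i)       // shift right
            element_[i] = std::move(element_[i - 1]);
        element_[index] = x;
        ++size_;
    }

private:
    void grow() {
        T* bigger = new T[2 * capacity_];
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = std::move(element_[i]);
        delete[] element_;
        element_ = bigger;
        capacity_ *= 2;
    }

    std::size_t capacity_;
    T* element_;
    std::size_t size_ = 0;
};
```

With this sketch, building L = (a,b,c,d,e,f,g) and calling `remove(2)` hands back c and leaves d at index 2, as in the Linear List Operations slide.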

19

Linked Representation

– List elements are stored, in memory, in an arbitrary order.
– Explicit information (called a link) is used to go from one element to the next.
– Layout of L = (a,b,c,d,e) using an array representation: a b c d e
– A linked representation uses an arbitrary layout, for example: c a e d b
– The pointer (or link) in e is null.
– Use a variable firstNode to get to the first element a.

Normal Way To Draw A Linked List

[Figure: firstNode points to a; links run a to b to c to d to e; the link in e is null]

20

Chain

– A chain is a linked list in which each node represents one element.
– There is a link or pointer from one element to the next.
– The last node has a null pointer.

Node Representation
class ChainNode
{
public:
   Object element;
   ChainNode *next;
   // constructors come here
   ChainNode( ) {next = nullptr;}
   ChainNode(Object element) {this->element = element; next = nullptr;}
   ChainNode(Object element, ChainNode *next)
      {this->element = element; this->next = next;}
};

– get(0): desiredNode = firstNode;               // gets you to first node
– get(1): desiredNode = firstNode->next;         // gets the second node
– get(2): desiredNode = firstNode->next->next;   // gets the third node
– get(5): desiredNode = firstNode->next->next->next->next->next;

Remove An Element
– remove(0)
  ChainNode *temp = firstNode;
  firstNode = firstNode->next;
  delete temp;
– remove(2)
  first get to the node just before the node to be removed
  beforeNode = firstNode->next;
  now change the pointer in beforeNode
  ChainNode *temp = beforeNode->next;
  beforeNode->next = beforeNode->next->next;
  delete temp;

[Figure: chain a, b, c, d, e ending in null; firstNode points to a and beforeNode points to b]

21

Add an Element

– add(0,'f')
  get a node, set its data and link fields
  ChainNode *newNode = new ChainNode('f', firstNode);
  update firstNode
  firstNode = newNode;
  or, in one step:
  firstNode = new ChainNode('f', firstNode);

– Add an element in the middle: add(3,'f')
  first find the node whose index is 2
  beforeNode = firstNode->next->next;
  next create a node and set its data and link fields
  ChainNode *newNode = new ChainNode('f', beforeNode->next);
  finally link beforeNode to newNode
  beforeNode->next = newNode;
  or, in two steps:
  beforeNode = firstNode->next->next;
  beforeNode->next = new ChainNode('f', beforeNode->next);

[Figures: chain a, b, c, d, e ending in null with newNode f inserted at the front, and with newNode f inserted after beforeNode c]

22

The Class Chain

/** linked implementation of LinearList */
class Chain : public LinearList
{
public:
   ChainNode *firstNode = nullptr;   // data members
   int size = 0;

   Chain(int initialCapacity)   /** create a list that is empty */
   {  // initial values of firstNode and size are null and 0, respectively
   }
   Chain( ) : Chain(0) { }

   bool isEmpty( )   /** @return true iff list is empty */
      {return size == 0;}

   int size( )   /** @return current number of elements in list */
      {return size;}

   Object get(int index)
   {  if (index < 0 || index >= size)
      {  cout << "index = " << index << " size = " << size;
         return Object();
      }
      ChainNode *currentNode = firstNode;   // move to desired node
      for (int i = 0; i < index; i++)
         currentNode = currentNode->next;
      return currentNode->element;
   }

   int indexOf(Object theElement)   // search the chain for theElement
   {  ChainNode *currentNode = firstNode;
      int index = 0;   // index of currentNode
      while (currentNode != nullptr && currentNode->element != theElement)
      {  currentNode = currentNode->next;   // move to next node
         index++;
      }
      if (currentNode == nullptr)   // make sure we found matching element
         return -1;
      else
         return index;
   }

23

The Class Chain (2)

   Object remove(int index)
   {  if (index < 0 || index >= size)
      {  cout << "index = " << index << " size = " << size;
         return Object();
      }
      Object removedElement;
      ChainNode *removedNode;
      if (index == 0)   // remove first node
      {  removedNode = firstNode;
         firstNode = firstNode->next;
      }
      else
      {  ChainNode *beforeNode = firstNode;   // get before node
         for (int i = 0; i < index - 1; i++)
            beforeNode = beforeNode->next;
         removedNode = beforeNode->next;
         beforeNode->next = removedNode->next;   // remove desired node
      }
      removedElement = removedNode->element;
      delete removedNode;   // free the unlinked node
      size--;
      return removedElement;
   }

   void add(int index, Object theElement)
   {  if (index < 0 || index > size)   // invalid list position
      {  cout << "index = " << index << " size = " << size;
         return;
      }
      if (index == 0)   // insert at front
         firstNode = new ChainNode(theElement, firstNode);
      else
      {  ChainNode *beforeNode = firstNode;   // find before node
         for (int i = 0; i < index - 1; i++)
            beforeNode = beforeNode->next;
         // insert after beforeNode
         beforeNode->next = new ChainNode(theElement, beforeNode->next);
      }
      size++;
   }
};

24

Other types of Linked List

– Chain With Header Node
– Circular List
– Doubly Linked Circular List With Header Node

[Figures: each variant drawn for the list (a, b, c, d, e), entered through headerNode or firstNode]

25

Stacks

– Linear list.
– One end is called top.
– Other end is called bottom.
– Additions to and removals from the top end only.

Add a cup to the stack. Remove a cup from the new stack. A stack is a LIFO list.

The class Stack
class Stack
{
public:
   bool empty();
   Object peek();
   void push(Object theObject);
   Object pop();
};

[Figure: stack of cups A, B, C, D, E with bottom and top marked]

26

Stacks Applications

Parentheses Matching
– (((a+b)*c+d-e)/(f+g)-(h+j)*(k-l))/(m-n)
– Output pairs (u,v) such that the left parenthesis at position u is matched with the right parenthesis at v.
  (2,6) (1,13) (15,19) (21,25) (27,31) (0,32) (34,38)
– (a+b))*((c+d)
  (0,4)
  right parenthesis at 5 has no matching left parenthesis
  (8,12)
  left parenthesis at 7 has no matching right parenthesis

Method
– scan the expression from left to right
– when a left parenthesis is encountered, add its position to the stack
– when a right parenthesis is encountered, remove the matching position from the stack
– Example: (((a+b)*c+d-e)/(f+g)-(h+j)*(k-l))/(m-n)
– (2,6) (1,13) (15,19) (21,25) (27,31) (0,32) (34,38)

[Figure: stack contents at successive points of the scan]
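The scan described above can be sketched in C++ with `std::stack`. The names `MatchResult` and `matchParentheses` are illustrative choices, not from the slides, and the sketch reports unmatched positions in two separate vectors rather than printing them.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: matched pairs plus unmatched positions.
struct MatchResult {
    std::vector<std::pair<int, int>> pairs;  // (left position, right position)
    std::vector<int> unmatchedRight;         // ')' with no earlier '('
    std::vector<int> unmatchedLeft;          // '(' never closed
};

MatchResult matchParentheses(const std::string& expr) {
    MatchResult result;
    std::stack<int> open;  // positions of left parentheses not yet matched
    for (int i = 0; i < static_cast<int>(expr.size()); ++i) {
        if (expr[i] == '(') {
            open.push(i);
        } else if (expr[i] == ')') {
            if (open.empty()) {
                result.unmatchedRight.push_back(i);
            } else {
                result.pairs.emplace_back(open.top(), i);
                open.pop();
            }
        }
    }
    while (!open.empty()) {  // whatever is left has no matching ')'
        result.unmatchedLeft.push_back(open.top());
        open.pop();
    }
    return result;
}
```

Pairs come out in the order their right parentheses are met, which is the order the slide lists them in.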

27

Stacks Applications (2)

Towers Of Hanoi
– 64 gold disks to be moved from tower A to tower C
– each tower operates as a stack
– cannot place a big disk on top of a smaller one

3-disk Towers Of Hanoi
– 7 disk moves

Recursive Solution
– n > 0 gold disks to be moved from A to C using B
– move top n-1 disks from A to B using C
– move top disk from A to C
– move top n-1 disks from B to C using A
– moves(n) = 0 when n = 0
– moves(n) = 2*moves(n-1) + 1 = $2^n - 1$ when n > 0

moves(64) = $1.8 \times 10^{19}$ (approximately). Performing $10^9$ moves/second, a computer would take about 570 years to complete.

[Figure: towers A, B, C with disks numbered 1 to 4]
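As a rough illustration of the recursive solution and its move count, the sketch below records each move as a two-letter string (source tower, then destination tower). The function names `hanoi` and `moveCount` and the string encoding of moves are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Move n disks from tower `from` to tower `to`, using `via` as the spare.
// Each move is appended to `moves` as e.g. "AC" (top disk of A goes to C).
void hanoi(int n, char from, char to, char via, std::vector<std::string>& moves) {
    if (n == 0) return;
    hanoi(n - 1, from, via, to, moves);        // top n-1 disks out of the way
    moves.push_back(std::string(1, from) + to);
    hanoi(n - 1, via, to, from, moves);        // bring the n-1 disks back on top
}

// moves(n) = 2*moves(n-1) + 1 with moves(0) = 0; fits in 64 bits for n <= 64.
std::uint64_t moveCount(int n) {
    return n == 0 ? 0 : 2 * moveCount(n - 1) + 1;
}
```

For 3 disks, `hanoi(3, 'A', 'C', 'B', moves)` records 7 moves, and `moveCount(64)` evaluates to 18446744073709551615, which is the slide's approximate $1.8 \times 10^{19}$.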

28

Derive From A Linear List Class

Chess Story
– 1 grain of rice on the first square, 2 for the next, 4 for the next, 8 for the next, and so on.
– Surface area needed exceeds the surface area of the earth.

Method Invocation And Return
– public void a( ) { …; b(); …}
– public void b( ) { …; c(); …}
– public void c( ) { …; d(); …}
– public void d( ) { …; e(); …}
– public void e( ) { …; c(); …}
– Stack of return addresses, top to bottom: return address in e(), in d(), in c(), in b(), in a()

Derive From ArrayLinearList
– stack top is either the left end or the right end of the linear list; when top is the right end of the linear list:
– empty() => isEmpty()
– peek() => get(size() - 1)
– push(theObject) => add(size(), theObject)
– pop() => remove(size() - 1)

Derive From Chain
– stack top is either the left end or the right end of the linear list; when top is the left end of the linear list:
– empty() => isEmpty()
– peek() => get(0)
– push(theObject) => add(0, theObject)
– pop() => remove(0)

29

Deriving from ArrayLinearList

class DerivedArrayStack : public ArrayLinearList
{  // constructors come here
public:
   bool empty( )
      {return isEmpty();}
   Object peek( )
   {  if (empty())
      {  cout << "Empty Stack Exception"; return Object(); }
      return get(size() - 1);
   }
   void push(Object theElement)
      {add(size(), theElement);}
   Object pop()
   {  if (empty())
      {  cout << "Empty Stack Exception"; return Object(); }
      return remove(size() - 1);
   }
};

Merits of deriving from ArrayLinearList
– Code for the derived class is quite simple and easy to develop.
– Code is expected to require little debugging.
– Code for other stack implementations, such as a linked implementation, is easily obtained.
– Just replace the base class ArrayLinearList with Chain.
– For efficiency reasons we must also make changes to use the left end of the list as the stack top rather than the right end.

30

Evaluation of deriving from ArrayLinearList

Demerits
– All public methods of ArrayLinearList may be performed on a stack.
  get(0) … get bottom element
  remove(5)
  add(3, x)
  So we do not have a true stack implementation.
  Must override undesired methods.
– Unnecessary work is done by the code.
  peek() verifies that the stack is not empty before get is invoked. The index check done by get is, therefore, not needed.
  add(size(), theElement) does an index check and a for loop that is not entered. Neither is needed.
  pop() verifies that the stack is not empty before remove is invoked. remove does an index check and a for loop that is not entered. Neither is needed.
  So the derived code runs slower than necessary.

Evaluation
– Code developed from scratch will run faster but will take more time (cost) to develop.
– Tradeoff between software development cost and performance.
– Tradeoff between time to market and performance.
– Could develop easy code first and later refine it to improve performance.

31

Code From Scratch

– Use an int variable top.
– Stack elements are in stack[0:top].
– Top element is in stack[top].
– Bottom element is in stack[0].
– Stack is empty iff top = -1.
– Number of elements in stack is top+1.

class ArrayStack
{  int top;         // current top of stack
   Object *stack;   // element array
   // constructors come here
public:
   Object pop()
   {  if (empty())
      {  cout << "Empty Stack Exception"; return Object(); }
      Object topElement = stack[top];
      top--;         // the element below becomes the new top
      return topElement;
   }
};

32

Queues

– Linear list.
– One end is called front.
– Other end is called rear.
– Additions are done at the rear only.
– Removals are made from the front only.

Queue class
class Queue
{
public:
   bool isEmpty();
   Object getFrontElement();
   Object getRearElement();
   void put(Object theObject);
   Object remove();
};

Derive From ArrayLinearList
– when front is the left end of the list and rear is the right end
– Queue.isEmpty() => ArrayLinearList.isEmpty()
– getFrontElement() => get(0)
– getRearElement() => get(size() - 1)
– put(theObject) => add(size(), theObject)
– remove() => remove(0)

Derive From ExtendedChain

[Figure: queue at a bus stop, with front and rear marked; rear moves as people join]

33

Custom Array Queue

Custom Linked Code
– Develop a linked class for Queue from scratch to get better performance than obtainable by deriving from ExtendedChain.

Custom Array Queue
– Use a 1D array queue[].
– Circular view of the array.
– Use integer variables front and rear.
– front is one position counterclockwise from the first element
– rear gives the position of the last element

Add An Element
– Move rear one clockwise.
– Then put into queue[rear].

Remove An Element
– Move front one clockwise.
– Then extract from queue[front].

Moving Clockwise
– rear++; if (rear == queue.length) rear = 0;
– or, equivalently: rear = (rear + 1) % queue.length;

Empty That Queue
– When a series of removals causes the queue to become empty, front = rear.
– When a queue is constructed, it is empty.
– So initialize front = rear = 0.

[Figures: circular array with positions [0] to [5], empty and holding A, B, C, with front and rear marked]

34

Problems with Queues

A Full Tank Please
– When a series of adds causes the queue to become full, front = rear.
– So we cannot distinguish between a full queue and an empty queue!

Remedies
– Don't let the queue get full.
  When the addition of an element will cause the queue to be full, increase the array size.
  This is what the text does.
– Define a boolean variable lastOperationIsPut.
  Following each put set this variable to true.
  Following each remove set it to false.
  Queue is empty iff (front == rear) && !lastOperationIsPut
  Queue is full iff (front == rear) && lastOperationIsPut
– Performance is slightly better when the first strategy is used.
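A minimal sketch of the circular array queue using the first remedy (grow the array just before an add would make it full) might look as follows. The class name `ArrayQueue`, the template parameter, the use of `std::vector` for storage, and throwing on an empty queue are assumptions for illustration; `front_` and `rear_` follow the slides' conventions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical circular-array queue. front_ is one position counterclockwise
// from the first element; rear_ is the position of the last element.
template <typename T>
class ArrayQueue {
public:
    explicit ArrayQueue(std::size_t initialCapacity = 10)
        : queue_(initialCapacity < 2 ? 2 : initialCapacity) {}

    bool isEmpty() const { return front_ == rear_; }

    const T& getFrontElement() const {
        if (isEmpty()) throw std::runtime_error("empty queue");
        return queue_[(front_ + 1) % queue_.size()];
    }

    void put(const T& x) {
        if ((rear_ + 1) % queue_.size() == front_) grow();  // add would fill the array
        rear_ = (rear_ + 1) % queue_.size();                // move rear one clockwise
        queue_[rear_] = x;
    }

    T remove() {
        if (isEmpty()) throw std::runtime_error("empty queue");
        front_ = (front_ + 1) % queue_.size();              // move front one clockwise
        return queue_[front_];
    }

private:
    // Double the array and unwrap the elements into positions 1..count.
    void grow() {
        const std::size_t oldLen = queue_.size();
        const std::size_t count = (rear_ + oldLen - front_) % oldLen;
        std::vector<T> bigger(2 * oldLen);
        for (std::size_t k = 1; k <= count; ++k)
            bigger[k] = queue_[(front_ + k) % oldLen];
        queue_.swap(bigger);
        front_ = 0;
        rear_ = count;
    }

    std::vector<T> queue_;
    std::size_t front_ = 0;
    std::size_t rear_ = 0;
};
```

Because one array position is always left unused, `front_ == rear_` can only mean empty, so no extra flag such as lastOperationIsPut is needed.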

35

Trees

Computer Scientist's View

[Figure: tree drawn with the root at the top; root, branches, nodes, and leaves labeled]

Linear Lists And Trees
– Linear lists are useful for serially ordered data.
  $(e_0, e_1, e_2, \dots, e_{n-1})$
  Days of week.
  Months in a year.
  Students in this class.
– Trees are useful for hierarchically ordered data.
  Employees of a corporation: president, vice presidents, managers, and so on.
  Java's classes: Object is at the top of the hierarchy; subclasses of Object are next, and so on.

36

Hierarchical Data And Trees

– The element at the top of the hierarchy is the root.
– Elements next in the hierarchy are the children of the root.
– Elements next in the hierarchy are the grandchildren of the root, and so on.
– Elements at the lowest level of the hierarchy are the leaves.

Java's Classes
– root: Object
– children of root: Number, Throwable, OutputStream
– grandchildren of root: Integer and Double (children of Number), Exception (child of Throwable), FileOutputStream (child of OutputStream)
– great grandchild of root: RuntimeException (child of Exception)

37

Tree Definition

– A tree t is a finite nonempty set of elements.
– One of these elements is called the root.
– The remaining elements, if any, are partitioned into trees, which are called the subtrees of t.

Terminology: Subtrees, Root, Leaves, Parent, Grandparent, Siblings, Ancestors, Descendants

Levels – Caution
– Some texts start level numbers at 0 rather than at 1.
– In those texts the root is at level 0, its children are at level 1, the grandchildren of the root are at level 2, and so on.
– We shall number levels with the root at level 1.

height = depth = number of levels
Node Degree = Number Of Children
Tree Degree = Max Node Degree
– Degree of the Java class tree (Object with children Number, Throwable, OutputStream) = 3

Binary Tree

38

Binary Tree

– Finite (possibly empty) collection of elements.
– A nonempty binary tree has a root element.
– The remaining elements (if any) are partitioned into two binary trees.
– These are called the left and right subtrees of the binary tree.

Differences Between A Tree & A Binary Tree
– No node in a binary tree may have a degree more than 2, whereas there is no limit on the degree of a node in a tree.
– A binary tree may be empty; a tree cannot be empty.
– The subtrees of a binary tree are ordered; those of a tree are not ordered.

Example: root a with b as its left child, and root a with b as its right child
– Are different when viewed as binary trees.
– Are the same when viewed as trees.

39

Arithmetic Expressions

– (a + b) * (c + d) + e - f/g*h + 3.25
– Expressions comprise three kinds of entities.
  Operators (+, -, /, *).
  Operands (a, b, c, d, e, f, g, h, 3.25, (a + b), (c + d), etc.).
  Delimiters ((, )).
– Operator Degree
  Number of operands that the operator requires.
  Binary operator requires two operands: a + b, c / d, e - f
  Unary operator requires one operand: + g, - h
– Infix Form
  Normal way to write an expression.
  Binary operators come in between their left and right operands.
  a * b
  a + b * c
  a * b / c
  (a + b) * (c + d) + e - f/g*h + 3.25

40

Arithmetic Expressions (2)

Operator Priorities
– How do you figure out the operands of an operator?
  a + b * c
  a * b + c / d
– This is done by assigning operator priorities.
  priority(*) = priority(/) > priority(+) = priority(-)
– When an operand lies between two operators, the operand associates with the operator that has higher priority.

Tie Breaker
– When an operand lies between two operators that have the same priority, the operand associates with the operator on the left.
  a + b - c
  a * b / c / d

Delimiters
– A subexpression within delimiters is treated as a single operand, independent from the remainder of the expression.
  (a + b) * (c - d) / (e - f)

41

Arithmetic Expressions (3)

Infix Expression Is Hard To Parse
– Need operator priorities, tie breaker, and delimiters.
– This makes computer evaluation more difficult than is necessary.
– Postfix and prefix expression forms do not rely on operator priorities, a tie breaker, or delimiters.
– So it is easier for a computer to evaluate expressions that are in these forms.

Postfix Form
– The postfix form of a variable or constant is the same as its infix form: a, b, 3.25
– The relative order of operands is the same in infix and postfix forms.
– Operators come immediately after the postfix form of their operands.
  Infix = a + b
  Postfix = a b +

Postfix Examples
– Infix = a + b * c          Postfix = a b c * +
– Infix = a * b + c          Postfix = a b * c +
– Infix = (a + b) * (c - d) / (e + f)          Postfix = a b + c d - * e f + /

Unary Operators
– Replace with new symbols.
  + a => a @
  + a + b => a @ b +
  - a => a ?
  - a - b => a ? b -

42

Postfix Evaluation

– Scan the postfix expression from left to right, pushing operands on to a stack.
– When an operator is encountered, pop as many operands as this operator needs; evaluate the operator; push the result on to the stack.
– This works because, in postfix, operators come immediately after their operands.
– Example: (a + b) * (c - d) / (e + f)
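The evaluation loop can be sketched as below for the four binary operators only. The function name `evaluatePostfix`, the use of whitespace-separated tokens, and numeric operands in place of the variables a to f are assumptions of this example; unary operators are not handled.

```cpp
#include <cassert>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluates a whitespace-separated postfix expression over + - * /.
double evaluatePostfix(const std::string& expr) {
    std::stack<double> operands;
    std::istringstream in(expr);
    std::string token;
    while (in >> token) {
        const bool isOperator =
            token.size() == 1 && std::string("+-*/").find(token[0]) != std::string::npos;
        if (isOperator) {
            if (operands.size() < 2) throw std::invalid_argument("too few operands");
            double right = operands.top(); operands.pop();  // popped first: right operand
            double left = operands.top(); operands.pop();
            switch (token[0]) {
                case '+': operands.push(left + right); break;
                case '-': operands.push(left - right); break;
                case '*': operands.push(left * right); break;
                case '/': operands.push(left / right); break;
            }
        } else {
            operands.push(std::stod(token));                // operand: push its value
        }
    }
    if (operands.size() != 1) throw std::invalid_argument("malformed expression");
    return operands.top();
}
```

Taking a = 1, b = 2, c = 7, d = 3, e = 2, f = 4, the slide's example has postfix form `1 2 + 7 3 - * 2 4 + /`, which corresponds to (1 + 2) * (7 - 3) / (2 + 4).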

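Prefix Form
– The prefix form of a variable or constant is the same as its infix form: a, b, 3.25
– The relative order of operands is the same in infix and prefix forms.
– Operators come immediately before the prefix form of their operands.
  Infix = a + b
  Postfix = a b +
  Prefix = + a b

Binary Tree Form
– a + b: root + with left child a and right child b
– - a: root - with the single operand a as its child
– (a + b) * (c - d) / (e + f): root / with left subtree * and right subtree + (over e, f); the * node has left subtree + (over a, b) and right subtree - (over c, d)

[Figures: the three expression trees]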

43

Binary Tree Properties & Representation

Merits Of Binary Tree Form
– Left and right operands are easy to visualize.
– Code optimization algorithms work with the binary tree form of an expression.
– Simple recursive evaluation of expression.

Minimum Number Of Nodes
– Minimum number of nodes in a binary tree whose height is h.
– At least one node at each of the first h levels.
– Minimum number of nodes is h.

Maximum Number Of Nodes
– All possible nodes at the first h levels are present.
– Maximum # of nodes = $1 + 2 + 4 + 8 + \dots + 2^{h-1} = 2^h - 1$

Number Of Nodes & Height
– Let n be the # of nodes in a binary tree whose height is h.
– $h \le n \le 2^h - 1$
– $\log_2(n+1) \le h \le n$

44

Full Binary Tree

– A full binary tree of a given height h has $2^h - 1$ nodes.

Numbering Nodes In A Full Binary Tree
– Number the nodes 1 through $2^h - 1$.
– Number by levels from top to bottom.
– Within a level number from left to right.

Node Number Properties
– Parent of node i is node $\lfloor i/2 \rfloor$, unless i = 1. Node 1 is the root and has no parent.
– Left child of node i is node 2i, unless 2i > n, where n is the # of nodes. If 2i > n, node i has no left child.
– Right child of node i is node 2i+1, unless 2i+1 > n, where n is the # of nodes. If 2i+1 > n, node i has no right child.

Binary Tree Representation
– Array Representation
  Number the nodes using the numbering scheme for a full binary tree.
  The node that is numbered i is stored in tree[i].

[Figure: full binary tree of height 4 with nodes numbered 1 to 15, level by level]
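The node number properties translate directly into index arithmetic on the array representation. In this small sketch the function names and the convention of returning 0 for "no such node" are assumptions; node numbers start at 1 as in the slides.

```cpp
#include <cassert>

// Node numbers start at 1; a return value of 0 means "no such node".
// n is the number of nodes in the tree.
int parentOf(int i) { return i == 1 ? 0 : i / 2; }  // integer division floors i/2
int leftChildOf(int i, int n) { return 2 * i <= n ? 2 * i : 0; }
int rightChildOf(int i, int n) { return 2 * i + 1 <= n ? 2 * i + 1 : 0; }
```

In the 15-node full tree, node 7 has children 14 and 15, while node 8 is a leaf.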

45

Array Representation

– Example: a 10-node binary tree with nodes a to j numbered 1 to 10 is stored as tree[1..10] = a b c d e f g h i j; tree[0] is unused.

Right-Skewed Binary Tree
– Example: nodes a, b, c, d are numbered 1, 3, 7, 15, so tree[1] = a, tree[3] = b, tree[7] = c, tree[15] = d, and every other position is unused.
– An n node binary tree needs an array whose length is between n+1 and $2^n$.

Linked Representation
– Each binary tree node is represented as an object whose data type is BinaryTreeNode.
– The space required by an n node binary tree is n * (space required by one node).

The Class BinaryTreeNode
class BinaryTreeNode
{
public:
   Object element;
   BinaryTreeNode *leftChild;    // left subtree
   BinaryTreeNode *rightChild;   // right subtree
   // constructors and any other methods come here
};

[Figures: the two example trees with their node numbers and tree[] arrays]

46

Binary Tree Operations

– Determine the height.
– Determine the number of nodes.
– Make a clone.
– Determine if two binary trees are clones.
– Display the binary tree.
– Evaluate the arithmetic expression represented by a binary tree.
– Obtain the infix form of an expression.
– Obtain the prefix form of an expression.
– Obtain the postfix form of an expression.

Binary Tree Traversal
– Many binary tree operations are done by performing a traversal of the binary tree.
– In a traversal, each element of the binary tree is visited exactly once.
– During the visit of an element, all action (make a clone, display, evaluate the operator, etc.) with respect to this element is taken.
– Preorder, Inorder, Postorder or Level order

47

Binary Tree Traversal

Preorder Traversal
void preOrder(BinaryTreeNode *t)
{  if (t != nullptr)
   {  visit(t);
      preOrder(t->leftChild);
      preOrder(t->rightChild);
   }
}
– Preorder Example (visit = print): a b d g h e i c f j
– Preorder Of Expression Tree: / * + a b - c d + e f
– Gives prefix form of expression!

Inorder Traversal
void inOrder(BinaryTreeNode *t)
{  if (t != nullptr)
   {  inOrder(t->leftChild);
      visit(t);
      inOrder(t->rightChild);
   }
}
– Inorder Example (visit = print): g d h b e i a f j c
– Inorder By Projection (Squishing)
– Inorder Of Expression Tree: a + b * c - d / e + f
– Gives infix form of expression (sans parentheses)!

[Figures: example tree with root a, children b and c, grandchildren d, e, f, and leaves g, h, i, j; expression tree for (a + b) * (c - d) / (e + f); the same expression tree projected onto a line]

48

Postorder Traversal

void postOrder(BinaryTreeNode *t)
{  if (t != nullptr)
   {  postOrder(t->leftChild);
      postOrder(t->rightChild);
      visit(t);
   }
}
– Postorder Example (visit = print): g h d i e b j f c a
– Postorder Of Expression Tree: a b + c d - * e f + /
– Gives postfix form of expression!

Traversal Applications
– Make a clone. Determine height. Determine # of nodes.

Level Order
Let t be the tree root.
while (t != null)
{  visit t and put its children on a FIFO queue;
   remove a node from the FIFO queue and call it t;
   // remove returns null when queue is empty
}
– Level-Order Example (visit = print): a b c d e f g h i j
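The level-order pseudocode can be made concrete with `std::queue`. In this sketch `Node` is a hypothetical stand-in for BinaryTreeNode with a `char` element, "visit" appends the element to a string, and `height` is included as one of the traversal applications the slide mentions.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <string>

// Hypothetical stand-in for BinaryTreeNode, holding a single character.
struct Node {
    char element;
    Node* leftChild;
    Node* rightChild;
    Node(char e, Node* l = nullptr, Node* r = nullptr)
        : element(e), leftChild(l), rightChild(r) {}
};

// Visits nodes level by level, left to right; "visit" appends to the result.
std::string levelOrder(const Node* root) {
    std::string visited;
    std::queue<const Node*> pending;  // FIFO queue of nodes still to visit
    if (root) pending.push(root);
    while (!pending.empty()) {
        const Node* t = pending.front();
        pending.pop();
        visited += t->element;
        if (t->leftChild) pending.push(t->leftChild);
        if (t->rightChild) pending.push(t->rightChild);
    }
    return visited;
}

// Height = number of levels, with the root at level 1 (0 for an empty tree).
int height(const Node* t) {
    if (!t) return 0;
    return 1 + std::max(height(t->leftChild), height(t->rightChild));
}
```

Unlike the pseudocode, this version never places null children on the queue, so the loop simply ends when the queue is empty.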

49

Binary Tree Construction
– Suppose that the elements in a binary tree are distinct.
– Can you construct the binary tree from which a given traversal sequence came?
   When a traversal sequence has more than one element, the binary tree is not uniquely defined.
   Therefore, the tree from which the sequence was obtained cannot be reconstructed uniquely.
– Can you construct the binary tree, given two traversal sequences?
   It depends on which two sequences are given.
– Preorder and postorder
   preorder = ab, postorder = ba
   Preorder and postorder do not uniquely define a binary tree.
   Nor do preorder and level order (same example).
   Nor do postorder and level order (same example).

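[Figure: two different binary trees with preorder ab and postorder ba: root a with b as its left child, and root a with b as its right child.]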

50

Binary Tree Construction
– Inorder and preorder
   inorder = g d h b e i a f j c
   preorder = a b d g h e i c f j
   Scan the preorder from left to right, using the inorder to separate left and right subtrees.
   a is the root of the tree; gdhbei are in the left subtree; fjc are in the right subtree.
   b is the next root; gdh are in the left subtree; ei are in the right subtree.
   d is the next root; g is in the left subtree; h is in the right subtree.
– Inorder and postorder
   Scan the postorder from right to left, using the inorder to separate left and right subtrees.
   inorder = g d h b e i a f j c
   postorder = g h d i e b j f c a
   The tree root is a; gdhbei are in the left subtree; fjc are in the right subtree.
– Inorder and level order
   Scan the level order from left to right, using the inorder to separate left and right subtrees.
   inorder = g d h b e i a f j c
   level order = a b c d e f g h i j
   The tree root is a; gdhbei are in the left subtree; fjc are in the right subtree.

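[Figure: three stages of the reconstruction: root a with subtrees gdhbei and fjc; then b as root of the left subtree with gdh and ei below it; then d with children g and h.]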
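The inorder-plus-preorder procedure can be sketched recursively in C: take the next preorder element as the root, find it in the inorder range, and recurse on the two halves. This is a sketch under the slides' assumption that all elements are distinct; TreeNode, build, from_preorder_inorder, and post_order are hypothetical names introduced for the example.

```c
#include <stdlib.h>

/* Hypothetical node type for this sketch (mirrors the slides' BinaryTreeNode). */
typedef struct TreeNode {
    char element;
    struct TreeNode *leftChild;
    struct TreeNode *rightChild;
} TreeNode;

/* Builds the subtree whose inorder sequence is in[lo..hi].
   *next is the index of the next unused preorder element. */
static TreeNode *build(const char *pre, const char *in, int lo, int hi, int *next)
{
    if (lo > hi)
        return NULL;                      /* empty subtree */
    TreeNode *t = malloc(sizeof *t);
    if (t == NULL)
        abort();                          /* out of memory: give up in this sketch */
    t->element = pre[(*next)++];          /* next preorder element is the root */
    int k = lo;
    while (in[k] != t->element)           /* distinct elements: found exactly once */
        k++;
    t->leftChild = build(pre, in, lo, k - 1, next);   /* left of the root in inorder */
    t->rightChild = build(pre, in, k + 1, hi, next);  /* right of the root in inorder */
    return t;
}

/* Reconstructs a tree of n nodes from its preorder and inorder sequences. */
TreeNode *from_preorder_inorder(const char *pre, const char *in, int n)
{
    int next = 0;
    return build(pre, in, 0, n - 1, &next);
}

/* Appends the postorder sequence of t to out; used to check the result. */
void post_order(const TreeNode *t, char *out, int *count)
{
    if (t != NULL) {
        post_order(t->leftChild, out, count);
        post_order(t->rightChild, out, count);
        out[(*count)++] = t->element;
    }
}
```

With preorder "abdgheicfj" and inorder "gdhbeiafjc", the postorder of the rebuilt tree should be g h d i e b j f c a, matching the sequence on the slide. The sketch does not free the nodes it allocates.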

51

Graphs
– G = (V,E)
– V is the vertex set.
– Vertices are also called nodes and points.
– E is the edge set.
– Each edge connects two different vertices.
– Edges are also called arcs and lines.
– A directed edge has an orientation (u,v).
– An undirected edge has no orientation (u,v).
– Undirected graph => no oriented edge.
– Directed graph => every edge has an orientation.

Applications
– Communication network
   Vertex = city, edge = communication link.
– Driving distance/time map
   Vertex = city, edge weight = driving distance/time.
– Street map
   Some streets are one way.

[Figures: example graphs on vertices 1 to 11; an undirected and a directed edge between u and v; complete undirected graphs for n = 1, n = 2, and n = 4.]

52

Complete Undirected Graph
Has all possible edges.

– Number of edges, undirected graph
   Each edge is of the form (u,v), u != v.
   The number of such pairs in an n-vertex graph is n(n-1).
   Since edge (u,v) is the same as edge (v,u), the number of edges in a complete undirected graph is n(n-1)/2.
   The number of edges in an undirected graph is <= n(n-1)/2.
– Number of edges, directed graph
   Each edge is of the form (u,v), u != v.
   The number of such pairs in an n-vertex graph is n(n-1).
   Since edge (u,v) is not the same as edge (v,u), the number of edges in a complete directed graph is n(n-1).
   The number of edges in a directed graph is <= n(n-1).

Vertex Degree
– The degree of a vertex is the number of edges incident to it.
– Sum of degrees = 2e (e is the number of edges).
– In a directed graph, the in-degree is the number of incoming edges and the out-degree is the number of outbound edges.
– Each edge contributes 1 to the in-degree of some vertex and 1 to the out-degree of some other vertex.
– Sum of in-degrees = sum of out-degrees = e, where e is the number of edges in the digraph.
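The edge-count and degree-sum identities for undirected graphs can be checked on a small adjacency matrix. The C sketch below assumes a symmetric 0/1 matrix with a zero diagonal; MAX_VERTICES, vertex_degree, edge_count, and degree_sum are hypothetical names chosen for the illustration.

```c
#define MAX_VERTICES 8  /* assumed bound on the number of vertices */

/* Degree of vertex v: the number of edges incident to v. */
int vertex_degree(int adj[][MAX_VERTICES], int n, int v)
{
    int degree = 0;
    for (int u = 0; u < n; u++)
        degree += adj[v][u];
    return degree;
}

/* Number of edges: count each unordered pair {u, v} once (u < v). */
int edge_count(int adj[][MAX_VERTICES], int n)
{
    int e = 0;
    for (int u = 0; u < n; u++)
        for (int v = u + 1; v < n; v++)
            e += adj[u][v];
    return e;
}

/* Sum of all vertex degrees; equals 2e because each edge has two endpoints. */
int degree_sum(int adj[][MAX_VERTICES], int n)
{
    int sum = 0;
    for (int v = 0; v < n; v++)
        sum += vertex_degree(adj, n, v);
    return sum;
}
```

For the complete undirected graph on n = 4 vertices, edge_count should return 4 * 3 / 2 = 6 and degree_sum should return 12.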

53

Searching Algorithm
Linear Search: unsorted data

– Using an array
int lin_search1(L_TYPE *list, long int value)
{
    int loc;

    /* Stop at the first match or at the end of the array. */
    for (loc = 0; loc < list->size && list->info[loc].id != value; ++loc)
        ;
    if (loc < list->size)    /* test the bound: info[size] is out of range */
        return (loc);
    else
        return (-1);
}

– Using a linked list
N_PTR lin_search2(L_TYPE *list, long int value)
{
    N_PTR loc;

    for (loc = list; loc != NULL && loc->info.id != value; loc = loc->next)
        ;
    return (loc);            /* NULL when value is not found */
}

– Using recursion
N_PTR lin_search3(L_TYPE *list, long int value)
{
    if (list == NULL)
        return (NULL);
    else if (list->info.id == value)
        return (list);
    else
        return (lin_search3(list->next, value));
}

54

Binary Search

/* Iterative version: list->info[low..high] must be sorted by id. */
int bin_search1(L_TYPE *list, long int value, int low, int high)
{
    int mid;

    while (low <= high) {
        mid = (low + high) / 2;
        if (list->info[mid].id == value)
            return (mid);
        else if (list->info[mid].id < value)
            low = mid + 1;
        else
            high = mid - 1;
    }
    return (-1);
}

/* Recursive version. */
int bin_search2(L_TYPE *list, long int value, int low, int high)
{
    int mid;

    if (low > high)
        return (-1);
    mid = (low + high) / 2;
    if (list->info[mid].id == value)
        return (mid);
    else if (list->info[mid].id < value)
        return (bin_search2(list, value, mid + 1, high));
    else
        return (bin_search2(list, value, low, mid - 1));
}

55

Sorting Algorithms

Bubble Sort
void bubble_sort(int list[], int size)
{
    int i, temp, sorted, pass = 1;

    do {
        sorted = 1;
        for (i = 0; i < size - pass; i++)
            if (list[i] > list[i + 1]) {
                temp = list[i];
                list[i] = list[i + 1];
                list[i + 1] = temp;
                sorted = 0;
            }
        pass++;
    } while (!sorted);
}

Selection Sort
– Basic idea: make a number of passes through the list or a part of the list and, on each pass, select one element to be correctly positioned.
   67, 33, 21, 84, 49, 50, 75 => 21, 33, 67, 84, 49, 50, 75

void selection_sort(int list[], int size)
{
    int i, j, min_pos, temp;

    for (i = 0; i < size - 1; i++) {
        min_pos = i;
        for (j = i + 1; j < size; j++)
            if (list[j] < list[min_pos])
                min_pos = j;
        if (min_pos != i) {
            temp = list[i];
            list[i] = list[min_pos];
            list[min_pos] = temp;
        }
    }
}

56

Insertion Sort
Initial list: 28 81 03 47 17 13 55 65 23 18 67 38 36
Each row shows the sorted prefix, followed by the next element to insert:
03 28 81 | 47
03 28 47 81 | 17
03 17 28 47 81 | 13
03 13 17 28 47 81 | 55
03 13 17 28 47 55 81 | 65
03 13 17 28 47 55 65 81 | 23
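The insertion sort slide gives only a trace, so a short C sketch in the style of the neighbouring sort functions may help. The function name and signature are assumptions chosen to match bubble_sort and selection_sort.

```c
/* Sorts list[0..size-1] into ascending order.
   Before each pass, list[0..i-1] is already sorted; the pass inserts
   list[i] into that prefix by shifting larger elements one place right. */
void insertion_sort(int list[], int size)
{
    for (int i = 1; i < size; i++) {
        int value = list[i];          /* next element to insert */
        int j = i - 1;
        while (j >= 0 && list[j] > value) {
            list[j + 1] = list[j];    /* shift a larger element right */
            j--;
        }
        list[j + 1] = value;          /* drop the element into the gap */
    }
}
```

Applied to the 13-element list from the trace, it should produce 3 13 17 18 23 28 36 38 47 55 65 67 81.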

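Shell Sort
28 81 03 47 17 13 55 65 23 18 67 38 36   (initial list)
13 38 03 23 17 28 55 36 47 18 67 81 65   (after sorting elements 5 apart)
03 18 13 23 17 28 47 36 55 38 65 81 67   (after sorting elements 2 apart)
03 13 17 18 23 28 36 38 47 55 65 67 81   (after sorting elements 1 apart)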
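The rows of the Shell sort trace are consistent with gap-wise insertion sorts using gaps 5, 2, and 1. That gap sequence is inferred from the numbers, not stated on the slide. A minimal C sketch with the gap sequence passed in as a parameter (gap_insertion_sort and shell_sort are hypothetical names):

```c
/* Insertion sort restricted to elements that are gap positions apart. */
void gap_insertion_sort(int list[], int size, int gap)
{
    for (int i = gap; i < size; i++) {
        int value = list[i];
        int j = i - gap;
        while (j >= 0 && list[j] > value) {
            list[j + gap] = list[j];   /* shift within the same gap-chain */
            j -= gap;
        }
        list[j + gap] = value;
    }
}

/* Runs one gap-wise insertion sort per entry of gaps[].
   The last gap must be 1 for the result to be fully sorted. */
void shell_sort(int list[], int size, const int gaps[], int gap_count)
{
    for (int g = 0; g < gap_count; g++)
        gap_insertion_sort(list, size, gaps[g]);
}
```

Calling gap_insertion_sort on the initial list with gap 5, then 2, then 1 should reproduce the second, third, and fourth rows of the trace in turn.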

57

Quicksort
Quicksort uses a divide-and-conquer strategy: a recursive approach to problem-solving in which the original problem is partitioned into simpler subproblems, and each subproblem is considered independently. Subdivision continues until the subproblems obtained are simple enough to be solved directly.

Choose some element, called the pivot, and perform a sequence of exchanges so that
– all elements that are less than the pivot are to its left, and
– all elements that are greater than the pivot are to its right.
This divides the (sub)list into two smaller sublists,
– each of which may then be sorted independently in the same way.

1. If the list has 0 or 1 elements, return. // the list is sorted
2. Else do:
   Pick an element in the list to use as the pivot.
   Split the remaining elements into two disjoint groups:
      SmallerThanPivot = {all elements < pivot}
      LargerThanPivot = {all elements >= pivot}
   Return the list rearranged as: Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).

58

Quick Sort
Successive states of the list as the sublists are partitioned:
28 81 03 47 17 13 55 65 23 18 67 38 36
36 03 47 17 13 28 23 18 38 55 67 81 65
03 13 47 17 36 28 23 18 38 55 65 67 81
03 13 18 17 23 28 36 47 38 55 65 67 81
03 13 17 18 23 28 36 38 47 55 65 67 81

void quicksort(int list[], int left, int right)
{
    int pivot, p_value, i, mid, temp;

    if (left < right) {
        /* Use the middle element as the pivot and park it at the left end. */
        mid = (left + right) / 2;
        p_value = list[mid];
        list[mid] = list[left];
        list[left] = p_value;
        pivot = left;
        /* Move every element smaller than the pivot into list[left+1..pivot]. */
        for (i = left + 1; i <= right; i++)
            if (list[i] < p_value) {
                temp = list[++pivot];
                list[pivot] = list[i];
                list[i] = temp;
            }
        /* Put the pivot into its final position. */
        temp = list[pivot];
        list[pivot] = list[left];
        list[left] = temp;
        quicksort(list, left, pivot - 1);
        quicksort(list, pivot + 1, right);
    }
}

59

Bucket Sort
Assumes the input is generated by a random process that distributes elements uniformly over [0, 100).
Idea:
– Divide [0, 100) into n equal-sized buckets.
– Distribute the n input values into the buckets.
– Sort each bucket.
– Then go through the buckets in order, listing the elements in each one.
Example:
– 89, 88, 21, 17, 37, 65, 44, 53, 23, 54, 87, 77
– Buckets in order: .. 17 .. 21 23 .. 37 .. 44 .. 53 54 .. 65 .. 77 .. 87 88 89
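The four steps of the bucket sort idea can be sketched in C as follows. Two details are assumptions of this sketch and not of the slide: the bucket of a value v is computed as v * n / 100, and each bucket is sorted with insertion sort. bucket_sort and RANGE are hypothetical names, and the exact bucket boundaries may differ from those drawn on the slide.

```c
#include <stdlib.h>
#include <string.h>

#define RANGE 100  /* values are assumed to lie in [0, RANGE) */

/* Sorts n integers from [0, RANGE) using n equal-width buckets. */
void bucket_sort(int list[], int n)
{
    if (n <= 1)
        return;
    int *begin = calloc((size_t)n + 1, sizeof *begin);  /* begin[b]: start of bucket b */
    int *fill = malloc((size_t)n * sizeof *fill);       /* next free slot per bucket */
    int *tmp = malloc((size_t)n * sizeof *tmp);         /* buckets stored back to back */
    if (begin == NULL || fill == NULL || tmp == NULL)
        abort();                                        /* out of memory: give up */

    for (int i = 0; i < n; i++)            /* count the values in each bucket */
        begin[list[i] * n / RANGE + 1]++;
    for (int b = 0; b < n; b++)            /* prefix sums give bucket start indices */
        begin[b + 1] += begin[b];
    for (int b = 0; b < n; b++)
        fill[b] = begin[b];
    for (int i = 0; i < n; i++) {          /* distribute the values into buckets */
        int b = list[i] * n / RANGE;
        tmp[fill[b]++] = list[i];
    }
    for (int b = 0; b < n; b++)            /* insertion-sort each bucket */
        for (int i = begin[b] + 1; i < begin[b + 1]; i++) {
            int value = tmp[i];
            int j = i - 1;
            while (j >= begin[b] && tmp[j] > value) {
                tmp[j + 1] = tmp[j];
                j--;
            }
            tmp[j + 1] = value;
        }
    memcpy(list, tmp, (size_t)n * sizeof *tmp);  /* buckets in order = sorted list */
    free(begin);
    free(fill);
    free(tmp);
}
```

Because the buckets are laid out in increasing order of value range, copying them back in sequence performs the final "go through the buckets in order" step.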

60

Hash Tables
– Worst-case time for get, put, and remove is O(size).
– Expected time is O(1).

Ideal Hashing
– Uses a 1D array (or table) table[0:b-1].
   Each position of this array is a bucket.
   A bucket can normally hold only one dictionary pair.
– Uses a hash function f that converts each key k into an index in the range [0, b-1].
   f(k) is the home bucket for key k.
– Every dictionary pair (key, element) is stored in its home bucket table[f(key)].
– Pairs are: (22,a), (33,c), (3,d), (73,e), (85,f).
– Hash table is table[0:7], b = 8.
– Hash function is key/11 (integer division).
– Pairs are stored in the table as below:

[0]     [1]     [2]     [3]     [4]     [5]     [6]     [7]
(3,d)           (22,a)  (33,c)                  (73,e)  (85,f)

61

Hash Table Issues
– With ideal hashing, get, put, and remove take O(1) time.
– What can go wrong?
   Where does (26,g) go?
   Keys that have the same home bucket are synonyms.
   – 22 and 26 are synonyms with respect to the hash function that is in use.
   The home bucket for (26,g) is already occupied.
– A collision occurs when the home bucket for a new pair is occupied by a pair with a different key.
– An overflow occurs when there is no space in the home bucket for the new pair.
– When a bucket can hold only one pair, collisions and overflows occur together.
– Need a method to handle overflows.

Hash Table Issues
– Choice of hash function.
– Overflow handling method.
– Size (number of buckets) of hash table.

Hash Functions
– Two parts:
   Convert the key into an integer in case the key is not an integer.
   – Done by the method hashCode().
   Map an integer into a home bucket.
   – f(k) is an integer in the range [0, b-1], where b is the number of buckets in the table.

Map Into A Home Bucket
– The most common method is by division.
   homeBucket = Math.abs(theKey.hashCode()) % divisor;
– divisor equals the number of buckets b.
– 0 <= homeBucket < divisor = b

62

Uniform Hash Function
– Let keySpace be the set of all possible keys.
– A uniform hash function maps the keys in keySpace into buckets such that approximately the same number of keys get mapped into each bucket.
– Equivalently, the probability that a randomly selected key has bucket i as its home bucket is 1/b, 0 <= i < b.
– A uniform hash function minimizes the likelihood of an overflow when keys are selected at random.

Hashing By Division
– keySpace = all ints.
– For every b, the number of ints that get mapped (hashed) into bucket i is approximately 2^32/b.
– Therefore, the division method results in a uniform hash function when keySpace = all ints.
– In practice, keys tend to be correlated.
– So, the choice of the divisor b affects the distribution of home buckets.

63

Selecting The Divisor
– Because of this correlation, applications tend to have a bias towards keys that map into odd integers (or into even ones).
– When the divisor is an even number, odd integers hash into odd home buckets and even integers into even home buckets.
   20%14 = 6, 30%14 = 2, 8%14 = 8
   15%14 = 1, 3%14 = 3, 23%14 = 9
– The bias in the keys results in a bias toward either the odd or even home buckets.
– When the divisor is an odd number, odd (even) integers may hash into any home bucket.
   20%15 = 5, 30%15 = 0, 8%15 = 8
   15%15 = 0, 3%15 = 3, 23%15 = 8
– The bias in the keys does not result in a bias toward either the odd or even home buckets.
– Better chance of uniformly distributed home buckets.
– So do not use an even divisor.
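The parity argument can be checked mechanically. The C sketch below assumes non-negative int keys, so no absolute value is needed; home_bucket and parity_preserved are hypothetical helper names. It applies the division method and tests whether every key lands in a bucket of the same parity as the key.

```c
/* Home bucket by division for a non-negative integer key. */
int home_bucket(int key, int divisor)
{
    return key % divisor;
}

/* Returns 1 if every key hashes to a bucket with the same parity
   (odd or even) as the key itself, and 0 otherwise. */
int parity_preserved(const int keys[], int count, int divisor)
{
    for (int i = 0; i < count; i++)
        if (home_bucket(keys[i], divisor) % 2 != keys[i] % 2)
            return 0;
    return 1;
}
```

With the keys 20, 30, 8, 15, 3, 23 from the slide, divisor 14 preserves parity for every key, while divisor 15 does not (for example, 20 % 15 = 5).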

64

Selecting The Divisor (2)
– A similar biased distribution of home buckets is seen, in practice, when the divisor is a multiple of a small prime number such as 3, 5, 7, …
– The effect of each prime divisor p of b decreases as p gets larger.
– Ideally, choose b so that it is a prime number.
– Alternatively, choose b so that it has no prime factor smaller than 20.

java.util.Hashtable
– Simply uses a divisor that is an odd number.
– This simplifies implementation because we must be able to resize the hash table as more pairs are put into the dictionary.
   Array doubling, for example, requires you to go from a 1D array table whose length is b (which is odd) to an array whose length is 2b+1 (which is also odd).

65

Overflow Handling
– An overflow occurs when the home bucket for a new pair (key, element) is full.
– We may handle overflows by:
   Searching the hash table in some systematic fashion for a bucket that is not full.
   – Linear probing (linear open addressing).
   – Quadratic probing.
   – Random probing.
   Eliminating overflows by permitting each bucket to keep a list of all pairs for which it is the home bucket.
   – Array linear list.
   – Chain.

Linear Probing – Get And Put
– divisor = b (number of buckets) = 17.
– Home bucket = key % 17.

66

Linear Probing
– Put in pairs whose keys are 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45.

bucket:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
key:     34   0  45   .   .   .   6  23   7   .   .  28  12  29  11  30  33

Remove
– remove(0)
   Search the cluster for a pair (if any) to fill the vacated bucket.
   Starting from the table above, bucket 1 is vacated; 45 (home bucket 11) moves from bucket 2 into bucket 1.
– remove(34)
   Starting from the table above, bucket 0 is vacated; 0 moves from bucket 1 into bucket 0, and 45 then moves from bucket 2 into bucket 1.

67

Linear Probing (2)
– remove(29)
   Starting from the original table, bucket 13 is vacated; 11 moves from bucket 14 to 13, 30 from bucket 15 to 14, and 45 from bucket 2 to 15.

bucket:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
key:     34   0   .   .   .   .   6  23   7   .   .  28  12  11  30  45  33

Performance Of Linear Probing
– Worst-case get/put/remove time is Theta(n), where n is the number of pairs in the table.
– This happens when all pairs are in the same cluster.
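Linear probing with b = 17 and home bucket key % 17 can be sketched in C as below. The sketch stores keys only (no elements), assumes keys are non-negative so that -1 can mark an empty bucket, and uses hypothetical names (ProbeTable, table_init, table_find, table_put, table_remove). The remove step walks the rest of the cluster and moves a pair back whenever the vacated bucket lies on that pair's probe path.

```c
#define B 17        /* number of buckets (divisor) */
#define EMPTY (-1)  /* marks an empty bucket; keys are assumed non-negative */

typedef struct { int keys[B]; } ProbeTable;

void table_init(ProbeTable *t)
{
    for (int i = 0; i < B; i++)
        t->keys[i] = EMPTY;
}

/* Returns the bucket holding key, or -1 if the key is absent. */
int table_find(const ProbeTable *t, int key)
{
    for (int n = 0; n < B; n++) {
        int i = (key % B + n) % B;        /* probe home, home+1, ... cyclically */
        if (t->keys[i] == EMPTY)
            return -1;                    /* end of cluster: key is not present */
        if (t->keys[i] == key)
            return i;
    }
    return -1;                            /* table full and key not present */
}

/* Inserts key; returns the bucket used, or -1 if the table is full. */
int table_put(ProbeTable *t, int key)
{
    for (int n = 0; n < B; n++) {
        int i = (key % B + n) % B;
        if (t->keys[i] == EMPTY || t->keys[i] == key) {
            t->keys[i] = key;
            return i;
        }
    }
    return -1;
}

/* Removes key, then refills the vacated bucket from the rest of the cluster. */
void table_remove(ProbeTable *t, int key)
{
    int hole = table_find(t, key);
    if (hole < 0)
        return;
    t->keys[hole] = EMPTY;
    for (int i = (hole + 1) % B; t->keys[i] != EMPTY; i = (i + 1) % B) {
        int from_home = (i - t->keys[i] % B + B) % B;  /* probes from home to i */
        int from_hole = (i - hole + B) % B;            /* probes from hole to i */
        if (from_home >= from_hole) {     /* hole lies on this pair's probe path */
            t->keys[hole] = t->keys[i];
            t->keys[i] = EMPTY;
            hole = i;
        }
    }
}
```

Inserting the twelve keys from the slide in order should place 34, 0, and 45 in buckets 0, 1, and 2, and the three removals described above can each be reproduced by rebuilding the table and calling table_remove.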
