Data Structures Unit-1

Contents

• Introduction to Data Structures

• Classification and Operations on Data Structures

• Preliminaries of Algorithm

• Algorithm Analysis and Complexity

• Recursive Algorithms

• Searching Techniques - Linear, Binary, Fibonacci

• Sorting Techniques - Bubble, Selection, Insertion, Quick and Merge Sort

• Comparison of Sorting Algorithms

Introduction to Data Structures

• A data structure is a way of storing data in a computer so that it can be used efficiently and so that the most efficient algorithms can be applied to it.

• A data structure should be seen as a logical concept that must address two fundamental concerns:

I. First, how the data will be stored, and

II. Second, what operations will be performed on it.

Classification of Data Structures

• Data structures can be classified as:

i. Simple data structure

ii. Compound data structure

iii. Linear data structure

iv. Non-linear data structure

Fig. Classification of Data Structures

Simple and Compound Data Structures

• Simple Data Structure: A simple data structure can be constructed with the help of primitive data structures. A primitive data structure is used to represent the standard data types of a programming language. Variables, arrays, pointers, structures, unions, etc. are examples of primitive data structures.

• Compound Data Structure: A compound data structure can be constructed with the help of primitive data structures and has a specific functionality. It can be designed by the user. It can be classified as:

i. Linear data structure

ii. Non-linear data structure

Linear and Non-linear Data Structures

• Linear Data Structure: A linear data structure is a continuous arrangement of data elements in memory; it can be constructed by using the array data type. In a linear data structure the relationship of adjacency is maintained between the data elements.

• Non-linear Data Structure: A non-linear data structure is a collection of randomly distributed data items joined together by using special pointers (tags). In a non-linear data structure the relationship of adjacency is not maintained between the data items.

Operations on Data Structures

i. Add an element

ii. Delete an element

iii. Traverse

iv. Sort the list of elements

v. Search for a data element


Abstract Data Type

• An abstract data type, sometimes abbreviated

ADT, is a logical description of how we view

the data and the operations that are allowed

without regard to how they will be

implemented.

• By providing this level of abstraction, we are

creating an encapsulation around the data. The

idea is that by encapsulating the details of the

implementation, we are hiding them from the

user's view. This is called information hiding.
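
As an illustration (not from these slides), the following is a minimal Python sketch of a stack ADT: the user works only with the push/pop/peek/is_empty operations, while the underlying list is an encapsulated implementation detail.

class Stack:
    # A stack ADT: the interface is public, the storage is hidden.
    def __init__(self):
        self._items = []              # hidden representation

    def push(self, item):
        self._items.append(item)      # add item on top

    def pop(self):
        return self._items.pop()      # remove and return the top item

    def peek(self):
        return self._items[-1]        # inspect the top item without removing it

    def is_empty(self):
        return len(self._items) == 0

The calling code never touches _items directly, so the representation could later be changed (for example, to a linked list) without affecting users of the ADT.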


Algorithm Definition

• An Algorithm may be defined as a finite

sequence of instructions each of which has a clear

meaning and can be performed with a finite

amount of effort in a finite length of time.

• The word algorithm originates from the word algorism, which is derived from the name of the mathematician Al-Khwarizmi.

• Al-Khwarizmi is considered to be the first algorithm designer for adding numbers.

Structure of an Algorithm

• An algorithm has the following structure:

– Input Step

– Assignment Step

– Decision Step

– Repetitive Step

– Output Step

Properties of an Algorithm

• Finiteness:- An algorithm must terminate after a finite number of steps.

• Definiteness:- The steps of the algorithm must be precisely defined.

• Generality:- An algorithm must be generic enough to solve all problems of a particular class.

• Effectiveness:- The operations of the algorithm must be basic enough to be carried out, in principle, with pencil and paper.

• Input-Output:- The algorithm must have certain initial and precise inputs, and outputs that may be generated both at its intermediate and final steps.

Algorithm Analysis and Complexity

• The performance of an algorithm can be measured on the scales of Time and Space.

• The Time Complexity of an algorithm or a

program is a function of the running time of

the algorithm or a program.

• The Space Complexity of an algorithm or a

program is a function of the space needed by

the algorithm or program to run to completion.


Algorithm Analysis and Complexity

• The Time Complexity of an algorithm can be computed by either an:

– Empirical or Posteriori Testing approach

– Theoretical or Apriori Approach

• The Empirical or Posteriori Testing approach calls for implementing the complete algorithm and executing it on a computer for various instances of the problem.

• The Theoretical or Apriori Approach calls for mathematically determining the resources, such as time and space, needed by the algorithm as a function of parameters related to the instances of the problem considered.

Algorithm Analysis and Complexity

• Apriori analysis computes the efficiency of the program as a function of the total frequency count of the statements comprising the program.

• Example: Let us estimate the frequency count of the statement x = x + 2 occurring in the following three program segments A, B and C.

Total Frequency Count of Program

Segment A

Program Statements              Frequency Count
…………………
x = x + 2                       1
…………………
Total Frequency Count           1

Time Complexity of Program Segment A is O(1).

Total Frequency Count of Program

Segment B

Program Statements              Frequency Count
…………………
for k = 1 to n do               (n + 1)
    x = x + 2;                  n
end                             n
…………………
Total Frequency Count           3n + 1

Time Complexity of Program Segment B is O(n).

Total Frequency Count of Program

Segment C

Program Statements              Frequency Count
…………………
for j = 1 to n do               (n + 1)
    for k = 1 to n do           n(n + 1)
        x = x + 2;              n²
    end                         n²
end                             n
…………………
Total Frequency Count           3n² + 3n + 1

Time Complexity of Program Segment C is O(n²).

Asymptotic Notations

• Big oh(O): f(n) = O(g(n)) ( read as f of n is big

oh of g of n), if there exists a positive integer

n0 and a positive number c such that |f(n)| ≤ c

|g(n)| for all n ≥ n0.

• Here g(n) is the upper bound of the function

f(n).

f(n)                    g(n)
16n³ + 45n² + 12n       n³          f(n) = O(n³)
34n − 40                n           f(n) = O(n)
50                      1           f(n) = O(1)

Asymptotic Notations

• Omega(Ω): f(n) = Ω(g(n)) ( read as f of n is

omega of g of n), if there exists a positive

integer n0 and a positive number c such that

|f(n)| ≥ c |g(n)| for all n ≥ n0.

• Here g(n) is the lower bound of the function

f(n).

f(n)                    g(n)
16n³ + 8n² + 2          n³          f(n) = Ω(n³)
24n + 9                 n           f(n) = Ω(n)

Asymptotic Notations

• Theta(Θ): f(n) = Θ(g(n)) (read as f of n is

theta of g of n), if there exists a positive

integer n0 and two positive constants c1 and c2

such that c1 |g(n)| ≤ |f(n)| ≤ c2 |g(n)| for all n ≥

n0.

• The function g(n) is both an upper bound and a lower bound for the function f(n) for all values of n, n ≥ n0.

f(n)                    g(n)
16n³ + 30n² − 90        n³          f(n) = Θ(n³)
7·2ⁿ + 30n              2ⁿ          f(n) = Θ(2ⁿ)

Asymptotic Notations

• Little oh(o): f(n) = o(g(n)) (read as f of n is little oh of g of n), if f(n) = O(g(n)) and f(n) ≠ Ω(g(n)).

f(n)        g(n)
18n + 9     n²      f(n) = o(n²), since f(n) = O(n²) and f(n) ≠ Ω(n²); however f(n) ≠ o(n).

Time Complexity

Complexity      Notation                Description
Constant        O(1)                    Constant number of operations, not depending on the input data size.
Logarithmic     O(log n)                Number of operations proportional to log(n), where n is the size of the input data.
Linear          O(n)                    Number of operations proportional to the input data size.
Quadratic       O(n²)                   Number of operations proportional to the square of the size of the input data.
Cubic           O(n³)                   Number of operations proportional to the cube of the size of the input data.
Exponential     O(2ⁿ), O(kⁿ), O(n!)     Exponential number of operations, fast growing.

Time Complexities of Various Algorithms

[Figure: comparison of the growth rates of the common time complexities]

Recursion Examples

Factorial Function

Factorial(Fact, N)

1. If N = 0, then set Fact :=1, and return.

2. Call Factorial(Fact, N-1)

3. Set Fact := N* Fact

4. Return
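
A direct Python rendering of this recursive definition (a sketch; the function name is ours):

def factorial(n):
    # Base case: 0! = 1
    if n == 0:
        return 1
    # Recursive case: n! = n * (n-1)!
    return n * factorial(n - 1)

print(factorial(5))   # 120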


Fibonacci Sequence

Fibonacci(Fib, N)

1. If N=0 or N=1, then: Set Fib:=N, and return.

2. Call Fibonacci(FibA, N-2)

3. Call Fibonacci(FibB, N-1)

4. Set Fib:=FibA + FibB

5. Return
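
The same recursion in Python (a sketch; the function name is ours). Note that this naive version recomputes the same subproblems repeatedly, so it runs in exponential time.

def fibonacci(n):
    # Base cases: Fib(0) = 0, Fib(1) = 1
    if n == 0 or n == 1:
        return n
    # Recursive case: Fib(n) = Fib(n-2) + Fib(n-1)
    return fibonacci(n - 2) + fibonacci(n - 1)

print([fibonacci(i) for i in range(8)])   # [0, 1, 1, 2, 3, 5, 8, 13]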


Towers of Hanoi

Tower(N, Beg, Aux, End)

1. If N=1, then:

(a) Write: Beg -> End

(b) Return

2. [Move N-1 disks from peg Beg to peg Aux]

Call Tower(N-1, Beg, End, Aux)

3. Write: Beg -> End

4. [Move N-1 disks from peg Aux to peg End]

Call Tower(N-1, Aux, Beg, End)

5. Return
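
A Python sketch of the same procedure, printing one move per line (peg labels are arbitrary):

def tower(n, beg, aux, end):
    # Move n disks from peg beg to peg end, using peg aux as a helper.
    if n == 1:
        print(beg, "->", end)
        return
    tower(n - 1, beg, end, aux)   # move n-1 disks from beg to aux
    print(beg, "->", end)         # move the largest disk
    tower(n - 1, aux, beg, end)   # move n-1 disks from aux to end

tower(3, "A", "B", "C")           # prints the 7 moves for 3 disks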


Basic Searching Methods

• Search: A search algorithm is a method of

locating a specific item of information in a

larger collection of data.

• There are three primary algorithms used for

searching the contents of an array:

1. Linear or Sequential Search

2. Binary Search

3. Fibonacci Search


Linear Search

• Begins the search at the first item in the list and continues searching sequentially (item by item) through the list, until the desired item (key) is found, or until the end of the list is reached.

• Also called sequential or serial search.

• Obviously not an efficient method for searching ordered lists like a phone directory (which is ordered alphabetically).

Linear Search contd..

• Advantages

1. Algorithm is simple.

2. List need not be ordered in any particular

way.

• Time Complexity of Linear Search is O(n).


Recursive Linear Search Algorithm

def linear_Search(l, key, index=0):
    # Recursively search list l for key; return its index, or False if absent.
    if l:
        if l[0] == key:
            return index
        s = linear_Search(l[1:], key, index + 1)
        if s is not False:
            return s
    return False
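
For example (values chosen arbitrarily):

print(linear_Search([7, 3, 9, 4], 9))   # 2
print(linear_Search([7, 3, 9, 4], 5))   # False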


Binary Search

• The list must be in sorted order to begin with.

• Compare the key with the middle entry of the list. For lists with an even number of entries, either of the two middle entries can be used.

• Three possibilities for the result of the comparison:

– Key matches the middle entry: terminate the search with success.

– Key is greater than the middle entry: the matching entry (if it exists) must be in the upper part of the list (the lower part of the list can be discarded from the search).

Binary Search contd…

– Key is less than the middle entry: the matching entry (if it exists) must be in the lower part of the list (the upper part of the list can be discarded from the search).

• Keep applying the above two steps to the progressively "reduced" lists, until a match is found or until no further list reduction can be done.

• Time Complexity of Binary Search is O(log n).
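
An iterative Python sketch of the procedure described above (the names are ours); it assumes the list is already sorted in ascending order.

def binary_search(lst, key):
    # Repeatedly compare key with the middle entry of the remaining sub-list.
    low, high = 0, len(lst) - 1
    while low <= high:
        mid = (low + high) // 2
        if lst[mid] == key:
            return mid          # match: search succeeds
        elif key > lst[mid]:
            low = mid + 1       # discard the lower part of the list
        else:
            high = mid - 1      # discard the upper part of the list
    return -1                   # no further reduction possible: not found

print(binary_search([2, 5, 8, 12, 16, 23], 16))   # 4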


Fibonacci Search

• Fibonacci Search Technique is a method of

searching a sorted array using a divide and

conquer algorithm that narrows down possible

locations with the aid of Fibonacci numbers.

• Fibonacci search examines locations whose

addresses have lower dispersion, therefore it

has an advantage over binary search in slightly

reducing the average time needed to access a

storage location.


Fibonacci Search contd…

• Fibonacci search has a complexity of O(log n).

• Fibonacci search was first devised by Kiefer (1953) as a minimax search for the maximum (minimum) of a unimodal function in an interval.

Fibonacci Search Algorithm

• Let k be defined as an element in F, the array of Fibonacci numbers. n = Fm is the array size. If the array size is not a Fibonacci number, let Fm be the smallest number in F that is greater than n.

• The array of Fibonacci numbers is defined by Fk+2 = Fk+1 + Fk for k ≥ 0, with F1 = 1 and F0 = 0.

• To test whether an item is in the list of ordered numbers, follow these steps:

1. Set k = m.

2. If k = 0, stop. There is no match; the item is not in the array.

Fibonacci Search Algorithm Contd…

3. Compare the item against the element in position Fk−1.

4. If the item matches, stop.

5. If the item is less than entry Fk−1, discard the elements from positions Fk−1 + 1 to n. Set k = k − 1 and return to step 2.

6. If the item is greater than entry Fk−1, discard the elements from positions 1 to Fk−1. Renumber the remaining elements from 1 to Fk−2, set k = k − 2, and return to step 2.
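
A Python sketch of the same idea in 0-based indexing (a common formulation, not a transcription of the slide's steps): probe positions are offsets given by Fibonacci numbers, and each comparison discards the part of the list that cannot contain the key.

def fibonacci_search(arr, key):
    # Search a sorted list using Fibonacci numbers; return an index or -1.
    n = len(arr)
    fib2, fib1 = 0, 1                 # F(m-2), F(m-1)
    fib = fib2 + fib1                 # F(m): smallest Fibonacci number >= n
    while fib < n:
        fib2, fib1 = fib1, fib
        fib = fib2 + fib1
    offset = -1                       # end of the discarded prefix
    while fib > 1:
        i = min(offset + fib2, n - 1) # probe position
        if arr[i] < key:              # key lies beyond position i
            fib, fib1, fib2 = fib1, fib2, fib1 - fib2
            offset = i
        elif arr[i] > key:            # key lies before position i
            fib, fib1, fib2 = fib2, fib1 - fib2, 2 * fib2 - fib1
        else:
            return i
    if fib1 and offset + 1 < n and arr[offset + 1] == key:
        return offset + 1             # one element may remain to check
    return -1

print(fibonacci_search([10, 22, 35, 40, 45, 50, 80, 82, 85, 90, 100], 85))   # 8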


Basic Sorting Methods: Bubble Sort

First Level Considerations

• To sort a list of n elements in ascending order:

Pass 1: make the nth element the largest.

Pass 2: if needed, make the (n−1)th element the 2nd largest.

Pass 3: if needed, make the (n−2)th element the 3rd largest.

…

Pass n−2: if needed, make the 3rd (i.e. the (n−(n−3))th) element the (n−2)th largest.

Pass n−1: if needed, make the 2nd (i.e. the (n−(n−2))th) element the (n−1)th largest.

• Maximum number of passes is (n−1).

Bubble Sort

Second Level Considerations

• Pass 1: Make the nth element the largest. Compare each successive pair of elements, beginning with the 1st and 2nd and ending with the (n−1)th and nth, and swap the elements if necessary.

• Pass 2: Make the (n−1)th element the 2nd largest. Compare each successive pair of elements, beginning with the 1st and 2nd and ending with the (n−2)th and (n−1)th, and swap the elements if necessary.

…

• Pass n−1: Make the 2nd (i.e. the (n−(n−2))th) element the (n−1)th largest. Compare the single remaining pair, the 1st and 2nd (i.e. the (n−(n−1))th and (n−(n−2))th) elements, and swap them if necessary.

• The list is sorted when either of the following occurs:

– No swapping is involved in any pass.

– Pass n−1, the last pass, has been executed.
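
A Python sketch of this scheme (our own rendering): each pass bubbles the next largest element into place, and the early-exit flag implements the "no swapping in a pass" stopping rule.

def bubble_sort(a):
    # Sort list a in ascending order, in place.
    n = len(a)
    for p in range(n - 1):             # at most n-1 passes
        swapped = False
        for i in range(n - 1 - p):     # compare successive pairs
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:                # no swaps in this pass: already sorted
            break
    return a

print(bubble_sort([5, 1, 4, 2, 8]))    # [1, 2, 4, 5, 8]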

Bubble Sort Example


Selection Sort

First Level Considerations

• To sort list of n elements in ascending order

• Pass 1: make 1st element the smallest

• Pass 2: make 2nd element the 2nd smallest

• Pass 3: make 3rd element the 3rd smallest

• Pass n-2: make (n-2)th element the (n-2)th smallest

• Pass n-1: make (n-1)th element the (n-1)th smallest

• Number of passes is (n-1).


Selection Sort

Second Level Considerations

• Pass 1: Make the 1st element the smallest. Examine the list from the 1st to the last element; locate the element with the smallest value and swap it with the 1st element where appropriate.

• Pass 2: Make the 2nd element the 2nd smallest. Examine the list from the 2nd to the last element; locate the element with the smallest value and swap it with the 2nd element where appropriate.

…

• Pass n−1: Make the (n−1)th element the (n−1)th smallest. Examine the list from the (n−1)th to the last element; locate the element with the smallest value and swap it with the (n−1)th element where appropriate.
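
A Python sketch following the passes described above (our own rendering):

def selection_sort(a):
    # Sort list a in ascending order, in place.
    n = len(a)
    for p in range(n - 1):            # pass p+1 fixes position p
        smallest = p
        for i in range(p + 1, n):     # examine the rest of the list
            if a[i] < a[smallest]:
                smallest = i
        if smallest != p:             # swap only where appropriate
            a[p], a[smallest] = a[smallest], a[p]
    return a

print(selection_sort([29, 10, 14, 37, 13]))   # [10, 13, 14, 29, 37]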

Selection Sort Example


Insertion Sort

First Level Considerations

To sort a list of n items (stored as a 1-D array) in ascending order:

• NOTE: the 1-element sub-array (1st) is always sorted.

• Pass 1: make the 2-element sub-array (1st to 2nd) sorted.

• Pass 2: make the 3-element sub-array (1st to 3rd) sorted.

• Pass 3: make the 4-element sub-array (1st to 4th) sorted.

…

• Pass n−2: make the (n−1)-element sub-array (1st to (n−1)th) sorted.

• Pass n−1: make the entire n-element array (1st to nth) sorted.

• Number of passes is (n−1).
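
A Python sketch of this pass structure (our own rendering): pass p takes element a[p] and inserts it into the already sorted sub-array a[0..p-1].

def insertion_sort(a):
    # Sort list a in ascending order, in place.
    for p in range(1, len(a)):        # a[0..0] is already sorted
        key = a[p]
        j = p - 1
        while j >= 0 and a[j] > key:  # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                # insert key into its place
    return a

print(insertion_sort([12, 11, 13, 5, 6]))   # [5, 6, 11, 12, 13]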

Insertion Sort Example


Quick Sort

It uses the Divide-and-Conquer approach:

1. Divide: partition A[p..r] into two sub-arrays A[p..q-1] and A[q+1..r] such that each element of A[p..q-1] is ≤ A[q], and each element of A[q+1..r] is ≥ A[q]. Compute q as part of this partitioning.

2. Conquer: sort the sub-arrays A[p..q-1] and

A[q+1..r] by recursive calls to QUICKSORT.

3. Combine: the partitioning and recursive sorting leave us with a sorted A[p..r] – no work needed here.

The Pseudo-Code of Quick Sort
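
A minimal Python sketch of the Divide/Conquer/Combine scheme described above (our own rendering, using the last element as the pivot with a Lomuto-style partition; the names are ours):

def partition(a, p, r):
    # Rearrange a[p..r] around pivot a[r]; return the pivot's final index q.
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort(a, p, r):
    # Sort a[p..r] in place by recursive partitioning.
    if p < r:
        q = partition(a, p, r)        # Divide
        quicksort(a, p, q - 1)        # Conquer: left sub-array
        quicksort(a, q + 1, r)        # Conquer: right sub-array
                                      # Combine: nothing to do

data = [2, 8, 7, 1, 3, 5, 6, 4]
quicksort(data, 0, len(data) - 1)
print(data)                           # [1, 2, 3, 4, 5, 6, 7, 8]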


Quick sort Analysis

• Best case running time: O(n log₂ n)

• Worst case running time: O(n²)

Quick Sort Example


Merge Sort

• Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm.

• Like heap-sort:

– It uses a comparator.

– It has O(n log n) running time.

• Unlike heap-sort:

– It does not use an auxiliary priority queue.

– It accesses data in a sequential manner (suitable for sorting data on a disk).

• Merge-sort on an input sequence S with n elements consists of three steps:

– Divide: partition S into two sequences S1 and S2 of about n/2 elements each.

– Recur: recursively sort S1 and S2.

– Conquer: merge S1 and S2 into a unique sorted sequence.


Merge Sort Algorithm

Algorithm mergeSort(S, C)
    Input: sequence S with n elements, comparator C
    Output: sequence S sorted according to C
    if S.size() > 1
        (S1, S2) ← partition(S, n/2)
        mergeSort(S1, C)
        mergeSort(S2, C)
        S ← merge(S1, S2)
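
A Python sketch of the same algorithm (our own rendering, with the comparator fixed to ascending order):

def merge_sort(s):
    # Return a new list with the elements of s in sorted order.
    if len(s) <= 1:
        return s
    mid = len(s) // 2
    s1 = merge_sort(s[:mid])          # Divide + Recur
    s2 = merge_sort(s[mid:])
    return merge(s1, s2)              # Conquer

def merge(s1, s2):
    # Merge two sorted lists into one sorted list.
    result, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:
            result.append(s1[i])
            i += 1
        else:
            result.append(s2[j])
            j += 1
    result.extend(s1[i:])             # append whichever tail remains
    result.extend(s2[j:])
    return result

print(merge_sort([85, 24, 63, 45, 17, 31, 96, 50]))   # [17, 24, 31, 45, 50, 63, 85, 96]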

Merge Sort Example


Analysis of Merge Sort

• The height h of the merge-sort tree is O(log n)

– at each recursive call we divide the sequence in half.

• The overall amount of work done at the nodes of depth i is O(n)

– we partition and merge 2ⁱ sequences of size n/2ⁱ

– we make 2ⁱ⁺¹ recursive calls.

• Thus, the total running time of merge-sort is O(n log n).
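
The same bound can also be read off the divide-and-conquer recurrence (a worked sketch, assuming the divide and merge work at each level is at most c·n for some constant c):

T(n) = 2·T(n/2) + c·n,  with T(1) = c
     = 2ⁱ·T(n/2ⁱ) + i·c·n      (after unfolding i levels)
     = c·n + c·n·log₂ n        (taking i = log₂ n)
     = O(n log n)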

Comparison of Sorting Algorithms
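
A standard summary of the sorting algorithms covered in this unit (our own tabulation, not the slide's original table):

Algorithm        Best case     Average case    Worst case     Extra space
Bubble Sort      O(n)          O(n²)           O(n²)          O(1)
Selection Sort   O(n²)         O(n²)           O(n²)          O(1)
Insertion Sort   O(n)          O(n²)           O(n²)          O(1)
Quick Sort       O(n log n)    O(n log n)      O(n²)          O(log n)
Merge Sort       O(n log n)    O(n log n)      O(n log n)     O(n)

(Bubble sort's O(n) best case assumes the early-exit check for a pass with no swaps.)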
