
Unit First - Government Degree College, Bemina

Date post: 03-May-2018
Upload: vuongquynh

Unit First

A) Measure of central value:

Introduction:

Averages are the typical values around which the other items of a distribution congregate. They lie between the two extreme observations (i.e. the smallest and the largest observations).

Definition: The word average, or the term measure of central tendency, has been defined by various authors in their own ways.

A.E. Waugh: “An average is a single value selected from a group of values to represent them in some way, a value which is supposed to stand for the whole group of which it is a part, as typical of all the values in the group”.

Lawrence J. Kaplan: “One of the most widely used sets of summary figures is known as measures of location, which are often referred to as averages, central tendency or central location. The purpose for computing an average value for a set of observations is to obtain a single value which is representative of all the items and which the mind can grasp in a simple manner quickly. The single value is the point of location around which the individual items cluster”.

Various Measures of central tendency:

The following five measures of central tendency are commonly used to obtain an average:

1) Arithmetic Mean or Mean

2) Median

3) Mode

4) Geometric Mean

5) Harmonic Mean

Arithmetic Mean:

Calculation of the arithmetic mean in an individual series:

A.M. = Sum of values / number of values

$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum X}{n}$$

In the case of a frequency distribution:

$$\bar{X} = \frac{\sum fX}{\sum f}$$


If we take an assumed mean A, the formula is

$$\bar{X} = A + \frac{\sum fd}{\sum f}, \qquad d = X - A$$
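The mean formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function names and the sample figures are hypothetical and chosen only for illustration.

```python
def arithmetic_mean(values):
    """Mean of an individual series: sum of values / number of values."""
    return sum(values) / len(values)


def frequency_mean(values, freqs):
    """Mean of a frequency distribution: sum(f * X) / sum(f)."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def assumed_mean_method(values, freqs, assumed):
    """Mean via an assumed mean A: A + sum(f * d) / sum(f), with d = X - A."""
    total_fd = sum(f * (x - assumed) for x, f in zip(values, freqs))
    return assumed + total_fd / sum(freqs)


# Hypothetical data, used only to illustrate the formulas.
individual = [7, 10, 34, 40, 50]
values = [10, 20, 30, 40]
freqs = [1, 2, 3, 4]

print(arithmetic_mean(individual))             # 141 / 5
print(frequency_mean(values, freqs))           # 300 / 10
print(assumed_mean_method(values, freqs, 20))  # same mean, computed from A = 20
```

Whatever assumed mean A is chosen, the result agrees with the direct frequency formula; the assumed mean only makes the hand arithmetic smaller.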

For further details and problems, refer to the book.

Median

“The median is that value of the variable which divides the group into two equal parts, one part comprising all the values greater than, and the other all the values less than, the median”.

Calculation of Median:

Case 1) Ungrouped data: If the number of observations is odd, then the median is the middle value after the observations have been arranged in ascending or descending order of magnitude. For example:

a) 34, 10, 40, 7, 50 can be arranged as 7, 10, 34, 40, 50 or 50, 40, 34, 10, 7. The middle value, 34, is the median.

Case 2) Frequency distribution: In the case of a frequency distribution where the variable takes the values X1, X2, X3, ..., Xn with frequencies f1, f2, f3, ..., fn and ∑f = N, the median is the value of the ((N+1)/2)th item or observation.

The steps to follow are:

I) Prepare the ‘less than’ cumulative frequency (c.f.) distribution

II) Find N/2

III) Find the first c.f. greater than N/2

IV) The corresponding value of the variable gives the median
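The two cases can be sketched in Python as follows. This is an illustrative sketch only: the function names and the frequency table are hypothetical, and the even-length branch uses the usual convention of averaging the two middle values, which is an assumption not spelled out in these notes.

```python
def median_ungrouped(values):
    """Median of ungrouped data: the middle value of the sorted observations."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Assumption: for an even count, average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_frequency(values, freqs):
    """Median of a frequency distribution, following steps I to IV."""
    half = sum(freqs) / 2                          # step II: N/2
    cumulative = 0
    for value, freq in sorted(zip(values, freqs)):
        cumulative += freq                         # step I: 'less than' c.f.
        if cumulative > half:                      # step III: first c.f. > N/2
            return value                           # step IV
    raise ValueError("empty distribution")


print(median_ungrouped([34, 10, 40, 7, 50]))                   # 34
print(median_frequency([1, 2, 3, 4, 5], [2, 3, 5, 4, 1]))      # 3
```

In the second call N = 15, so N/2 = 7.5; the cumulative frequencies are 2, 5, 10, 14, 15, and the first one above 7.5 belongs to the value 3.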

For further details and problems, refer to the book.

Mode

Mode is the value which occurs most frequently in a set of observations and around which the other items of the set cluster densely. In other words, mode is the value of a series which is predominant in it.

A.M. Tuttle defines it as follows: “Mode is the value which has the greatest frequency density in its immediate neighborhood”.

Computation of Mode:

Individual and discrete series: Mode cannot be calculated in a series of individual observations unless it is converted into a discrete or continuous series.


Question

Calculate the mode of the following weights (in pounds) of 10 students:

S. No   Weight (pounds)   S. No   Weight (pounds)
1       120               6       130
2       130               7       132
3       135               8       132
4       130               9       135
5       140               10      141

Solution

Weight (pounds)   Number of students
120               1
130               3
132               2
135               2
140               1
141               1

In the above example, 130 has the maximum frequency (3), so it is the modal value.
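The conversion of an individual series into a discrete series, and the reading-off of the mode, can be sketched with Python's `collections.Counter`. This is an illustration rather than the notes' own method; the first weight is read as 120, following the solution table, and the function name is hypothetical.

```python
from collections import Counter


def mode_of_series(values):
    """Tally an individual series into a discrete series and return its mode(s)."""
    counts = Counter(values)                  # value -> frequency
    top = max(counts.values())                # the highest frequency
    modes = sorted(v for v, c in counts.items() if c == top)
    return counts, modes


# Weights from the question above (first value assumed to be 120).
weights = [120, 130, 135, 130, 140, 130, 132, 132, 135, 141]
counts, modes = mode_of_series(weights)

for weight in sorted(counts):
    print(weight, counts[weight])
print("Mode:", modes)
```

The function returns a list because a series can have more than one value with the highest frequency; here the list contains only 130.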

Grouping Method:

In the case of a continuous frequency distribution, the class corresponding to the maximum frequency is called the modal class, and the mode is obtained by the interpolation formula

$$\text{Mode} = L + \frac{h(f_1 - f_0)}{(f_1 - f_0) - (f_2 - f_1)} = L + \frac{h(f_1 - f_0)}{2f_1 - f_0 - f_2}$$

where

$L$ is the lower limit of the modal class,

$f_1$ is the frequency of the modal class,

$f_0$ is the frequency of the class preceding the modal class,

$f_2$ is the frequency of the class succeeding the modal class,

$h$ is the magnitude (width) of the modal class.
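A short Python sketch of the interpolation formula is given below. The class limits and frequencies are hypothetical, all classes are assumed to have the same width, and a missing neighbouring class is treated as having frequency 0; none of these choices comes from the notes.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode = L + h * (f1 - f0) / (2 * f1 - f0 - f2) for equal-width classes."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # preceding class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # succeeding class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined when f1 = f0 = f2")
    return lower_limits[i] + width * (f1 - f0) / denominator


# Hypothetical distribution: classes 0-10, 10-20, 20-30, 30-40, 40-50.
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 7, 3]
print(grouped_mode(lower_limits, 10, freqs))   # 20 + 10 * 4 / 9
```

Here the modal class is 20-30, so L = 20, h = 10, f1 = 12, f0 = 8 and f2 = 7.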

For further details and problems, refer to the book.

B) Measures of dispersion

Definition: The term dispersion is used to indicate the fact that, within a given group, the items differ from one another in size; in other words, there is a lack of uniformity in their sizes.

The various measures of dispersion are:

i. Range

ii. Quartile deviation or semi-interquartile range

iii. Mean deviation

Range:

The range is the simplest of all the measures of dispersion. It is defined as the difference between the two extreme observations of the distribution. In other words, the range is the difference between the greatest and the smallest observations of the distribution.

Formula: $\text{Range} = X_{\max} - X_{\min}$

Quartile deviation or semi-interquartile range

It is a measure of dispersion based on the upper quartile $Q_3$ and the lower quartile $Q_1$.

Interquartile range = $Q_3 - Q_1$

Quartile deviation (Q.D.) = $\frac{Q_3 - Q_1}{2}$

Percentile deviation

This is a measure of dispersion based on the difference between certain percentiles. If $P_i$ is the ith percentile and $P_j$ is the jth percentile, then the so-called i-j percentile range is given by

i-j percentile range = $P_j - P_i$ (i < j)
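The range and the quartile deviation can be sketched in Python as below. The notes do not say how quartiles are located in ungrouped data, so this sketch assumes the (n+1)-position convention, which is what `statistics.quantiles` implements with `method="exclusive"`; the data and function names are hypothetical.

```python
import statistics


def value_range(values):
    """Range = largest observation - smallest observation."""
    return max(values) - min(values)


def quartile_deviation(values):
    """Q.D. = (Q3 - Q1) / 2, with quartiles from the (n+1)-position rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / 2


# Hypothetical data, used only for illustration.
data = [7, 10, 34, 40, 50]
print(value_range(data))          # 50 - 7
print(quartile_deviation(data))   # (45 - 8.5) / 2
```

Other quartile conventions give slightly different values of Q1 and Q3 for small samples, so the quartile deviation should always be reported together with the convention used.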

For further details and problems, refer to the book.


Mean Deviation or Average Deviation

Computation of Mean deviation :

If X1, X2, X3, ..., Xn are n given observations, then the mean deviation (M.D.) about an average A is given by

$$\text{M.D. (about } A) = \frac{1}{n}\sum |X - A| = \frac{1}{n}\sum |d|$$

Steps for computation of mean deviation:

1. Calculate the average A of the distribution by the usual methods.

2. Take the deviation d = X − A of each observation from the average A.

3. Ignore the negative signs of the deviations, taking all the deviations to be positive, to obtain the absolute deviations |d| = |X − A|.

4. Obtain the sum of the absolute deviations obtained in step 3.

5. Divide the total obtained in step 4 by n, the number of observations.

The result gives the mean deviation about A.

In the case of a frequency distribution, or a grouped or continuous frequency distribution, the mean deviation about an average A is given by:

$$\text{M.D. (about } A) = \frac{1}{N}\sum f|X - A| = \frac{1}{N}\sum f|d|, \qquad N = \sum f$$
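The five steps translate almost line by line into Python. The sketch below is illustrative only: the function names and data are hypothetical, and the average A defaults to the arithmetic mean when none is supplied.

```python
def mean_deviation(values, about=None):
    """Mean deviation about an average A (defaults to the arithmetic mean)."""
    n = len(values)
    if about is None:
        about = sum(values) / n                        # step 1
    absolute = [abs(x - about) for x in values]        # steps 2 and 3
    return sum(absolute) / n                           # steps 4 and 5


def mean_deviation_freq(values, freqs, about):
    """Mean deviation for a frequency distribution: (1/N) * sum(f * |X - A|)."""
    total = sum(freqs)
    return sum(f * abs(x - about) for x, f in zip(values, freqs)) / total


# Hypothetical data, used only for illustration.
print(mean_deviation([2, 4, 6, 8, 10]))               # mean 6, deviations 4, 2, 0, 2, 4
print(mean_deviation_freq([1, 2, 3], [1, 2, 1], 2))   # (1 + 0 + 1) / 4
```

Passing the median as `about` gives the mean deviation about the median with no other change.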

For further details and problems, refer to the book.

C). Standard Deviation :

Standard deviation, usually denoted by the letter σ (small sigma) of the Greek alphabet, was first suggested by Karl Pearson as a measure of dispersion in 1893. It is defined as the positive square root of the arithmetic mean of the squares of the deviations of the given observations from their arithmetic mean. Thus, if X1, X2, ..., Xn is the set of observations, then the standard deviation is given by

$$\sigma = \sqrt{\frac{1}{n}\sum (X - \bar{X})^2}$$

where $\bar{X} = \frac{1}{n}\sum X$ is the arithmetic mean of the given values.

Steps for computation of standard deviation:

1. Compute the arithmetic mean $\bar{X}$ by the formula above.

2. Compute the deviation $(X - \bar{X})$ of each observation from the arithmetic mean, i.e., obtain $X_1 - \bar{X}, X_2 - \bar{X}, \dots, X_n - \bar{X}$.

3. Square each of the deviations obtained in step 2, i.e., compute $(X_1 - \bar{X})^2, (X_2 - \bar{X})^2, \dots, (X_n - \bar{X})^2$.

4. Find the sum of the squared deviations in step 3, given by

$$\sum (X - \bar{X})^2 = (X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2$$

5. Divide the sum in step 4 by n to obtain $\frac{1}{n}\sum (X - \bar{X})^2$.

6. Take the positive square root of the value obtained in step 5.

7. The resulting value gives the standard deviation of the distribution.

In the case of a frequency distribution, the standard deviation is given by:

$$\sigma = \sqrt{\frac{1}{N}\sum f(X - \bar{X})^2}, \qquad N = \sum f$$
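The steps can be followed directly in Python. This is a minimal sketch with hypothetical data and function names; it divides by n (not n − 1), as in the definition above, which is the same quantity that `statistics.pstdev` computes.

```python
import math
import statistics


def standard_deviation(values):
    """Positive square root of the mean of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n                                 # step 1
    squared = [(x - mean) ** 2 for x in values]            # steps 2 and 3
    return math.sqrt(sum(squared) / n)                     # steps 4 to 6


def standard_deviation_freq(values, freqs):
    """Standard deviation of a frequency distribution, with N = sum(f)."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total)


# Hypothetical data, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data))                    # mean 5, sum of squares 32
print(statistics.pstdev(data))                     # library cross-check
print(standard_deviation_freq([1, 2, 3], [1, 2, 1]))
```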

For further details and problems, refer to the book.


Unit Second

A) Correlation Analysis

Introduction:

The term correlation is used by the common man without knowing that he is making use of it. For example, when parents advise their children to work hard so that they may get good marks, they are correlating good marks with hard work. The study of the characteristics of only one variable, such as height, weight, age, marks or wages, is known as univariate analysis. The statistical analysis of the relationship between two variables is known as bivariate analysis. Sometimes the variables may be inter-related. In health sciences we study the relationship between blood pressure and age, consumption level of some nutrient and weight gain, total income and medical expenditure, etc. The nature and strength of the relationship may be examined by correlation and regression analysis. Thus correlation refers to the relationship between two or more variables, e.g., the relation between the heights of father and son, yield and rainfall, wage and price index, shares and debentures. Correlation is a statistical analysis which measures and analyses the degree or extent to which two variables fluctuate with reference to each other.

Methods of Studying correlation:

i. Scatter diagram method

ii. Karl Pearson’s coefficient of correlation (covariance method)

iii. Rank Method

Scatter diagram method

It is the simplest method of studying the relationship between two variables diagrammatically. One variable is represented along the horizontal axis and the second variable along the vertical axis. For each pair of observations of the two variables, we put a dot in the plane. There are as many dots in the plane as the number of paired observations. The direction of the dots shows the scatter or concentration of the various points, and this shows the type of correlation.

1. If all the plotted points form a straight line from the lower left-hand corner to the upper right-hand corner, then there is perfect positive correlation. We denote this as r = +1.

[Figure: scatter diagrams plotted on X and Y axes; panel title: Perfect positive correlation]

Karl Pearson’s coefficient of correlation:

Karl Pearson, a great biometrician and statistician, suggested a mathematical method for measuring the magnitude of the linear relationship between two variables. It is the most widely used method in practice and is known as the Pearsonian coefficient of correlation. It is denoted by ‘r’.

The formula for calculating ‘r’ is

$$r = \frac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}}, \qquad X = x - \bar{x}, \quad Y = y - \bar{y}$$

or, writing the same deviations from the means as $d_x$ and $d_y$,

$$r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \cdot \sum d_y^2}}$$

Where the deviations $d_x$ and $d_y$ are taken from assumed means,

$$r = \frac{n\sum d_x d_y - \sum d_x \sum d_y}{n^2 \sigma_x \sigma_y}$$

Steps:

1. Find the means of the two series x and y.

2. Take deviations of the two series from their means: $X = x - \bar{x}$, $Y = y - \bar{y}$.

3. Square the deviations and get the totals of the respective squares of deviations of x and y, denoted by $\sum X^2$ and $\sum Y^2$ respectively.

4. Multiply the deviations of x and y, get the total $\sum XY$ and divide by n. This is the covariance.

Substitute the values in the formula.
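The computation can be sketched in Python as follows. The data and function names are hypothetical. In the assumed-mean version the denominator n²σxσy is expanded as √(n∑dx² − (∑dx)²) · √(n∑dy² − (∑dy)²), which is the standard identity for deviations from any origin and is an addition to what the notes state.

```python
import math


def pearson_r(xs, ys):
    """r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)), deviations taken from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


def pearson_r_assumed(xs, ys, assumed_x, assumed_y):
    """Same coefficient, with deviations taken from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    numerator = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
    # n^2 * sigma_x * sigma_y, expanded in terms of the deviations
    n_sigma_x = math.sqrt(n * sum(a * a for a in dx) - sum(dx) ** 2)
    n_sigma_y = math.sqrt(n * sum(b * b for b in dy) - sum(dy) ** 2)
    return numerator / (n_sigma_x * n_sigma_y)


# Hypothetical paired observations, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(pearson_r(x, y))                 # 6 / sqrt(10 * 6)
print(pearson_r_assumed(x, y, 2, 4))   # same value from assumed means 2 and 4
```

Both functions return the same coefficient, which illustrates that r does not depend on the choice of assumed mean.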

For further details and problems, refer to the book.

B) Rank Correlation method

It is studied when no assumption about the parameters of the population is made. This method is based on ranks. It is useful for studying qualitative attributes like honesty, colour, beauty, intelligence, character, morality, etc. The individuals in the group can be arranged in order, thereby obtaining for each individual a number showing his/her rank in the group. This method was developed by Charles Edward Spearman in 1904. It is defined as

$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

r = rank correlation coefficient.

Note: Some authors use the symbol ρ (rho) for rank correlation.

$\sum d^2$ = sum of squares of differences between the pairs of ranks.

n = number of pairs of observations.

The value of r lies between −1 and +1. If r = +1, there is complete agreement in the order of ranks and the ranks are in the same direction. If r = −1, there is complete disagreement in the order of ranks and they are in opposite directions.
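The rank formula is short enough to sketch directly in Python. The rankings below are hypothetical, the function name is illustrative, and the sketch assumes there are no tied ranks (ties need a correction that is not covered here).

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), assuming no tied ranks."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical rankings of five individuals by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
print(spearman_rank_correlation(judge_1, judge_2))           # sum(d^2) = 4
print(spearman_rank_correlation(judge_1, judge_1))           # complete agreement
print(spearman_rank_correlation(judge_1, [5, 4, 3, 2, 1]))   # complete disagreement
```

The last two calls reproduce the limiting cases r = +1 and r = −1 described above.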

For further details and problems, refer to the book.

C) Regression Analysis:

Introduction:

After knowing the relationship between two variables we may be interested in estimating (predicting) the value of one variable given the value of another. The variable predicted on the basis of other variables is called the ‘dependent’ or ‘explained’ variable and the other the ‘independent’ or ‘predicting’ variable. The prediction is based on the average relationship derived statistically by regression analysis. The equation, linear or otherwise, is called the regression equation or the explaining equation. For example, if we know that advertising and sales are correlated, we may find the expected amount of sales for a given advertising expenditure, or the amount of expenditure required for attaining a given amount of sales. The relationship between two variables can be considered between, say, rainfall and agricultural production, price of an input and the overall cost of a product, or consumer expenditure and disposable income. Thus, regression analysis reveals the average relationship between two variables, and this makes estimation or prediction possible.


Definition:

Regression is the measure of the average relationship between two or more variables in terms of

the original units of the data.

Types Of Regression:

The regression analysis can be classified into:

a) Simple and Multiple

b) Linear and Non –Linear

c) Total and Partial

a) Simple and Multiple:

In the case of a simple relationship only two variables are considered, for example, the influence of advertising expenditure on sales turnover. In the case of a multiple relationship, more than two variables are involved. Here, while one variable is the dependent variable, the remaining variables are independent ones. For example, the turnover (y) may depend on advertising expenditure (x) and the income of the people (z). Then the functional relationship can be expressed as y = f(x, z).

b) Linear and Non-linear:

Linear relationships are based on a straight-line trend, the equation of which has no power higher than one. Remember that a linear relationship can be either simple or multiple. Normally a linear relationship is taken into account because, besides its simplicity, it has better predictive value: a linear trend can be easily projected into the future. In the case of a non-linear relationship, curved trend lines are derived. The equations of these have powers higher than one, for example parabolic equations.

c) Total and Partial:

In the case of total relationships all the important variables are considered. Normally, they take the form of a multiple relationship because most economic and business phenomena are affected by a multiplicity of causes. In the case of a partial relationship one or more variables are considered, but not all, thus excluding the influence of those not found relevant for a given purpose.

Linear Regression Equation:

If two variables have a linear relationship then, as the independent variable (X) changes, the dependent variable (Y) also changes. If the different values of X and Y are plotted, then two straight lines of best fit can be made to pass through the plotted points. These two lines are known as regression lines. These regression lines are based on two equations known as regression equations, which show the best estimate of one variable for a known value of the other. The equations are linear. The linear regression equation of Y on X is

Y = a + bX ……. (1)

and that of X on Y is

X = a + bY ……. (2)

a and b are constants, which in general take different values in (1) and (2).

From (1) we can estimate Y for a known value of X; from (2) we can estimate X for a known value of Y.

Regression Lines:

For regression analysis of two variables there are two regression lines, namely Y on X and X on Y. The two regression lines show the average relationship between the two variables. For perfect correlation, positive or negative, i.e., r = ±1, the two lines coincide, i.e., we find only one straight line. If r = 0, i.e., the variables are uncorrelated, then the two lines cut each other at right angles. In this case the two lines are parallel to the X and Y axes.

[Figure: regression lines on X and Y axes for r = +1, r = −1 and r = 0, with a point labelled ($\bar{X}$, $\bar{Y}$)]


(ii) Regression Co-efficient:

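The regression equation of Y on X is

$$Y_e = \bar{Y} + r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$

The regression equation of X on Y is

$$X_e = \bar{X} + r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$$

The regression equation of Y on X can also be written as

$$(Y - \bar{Y}) = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$

where $r\frac{\sigma_y}{\sigma_x}$ and $r\frac{\sigma_x}{\sigma_y}$ are called the regression coefficients of Y on X and of X on Y respectively. Their values are $\frac{\sum xy}{\sum x^2}$ and $\frac{\sum xy}{\sum y^2}$, where $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are deviations from the means.

For further details and problems, refer to the book.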
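The two regression equations can be sketched in Python using the deviation form of the regression coefficients. The data, the function names and the labels `b_yx` and `b_xy` are hypothetical and are introduced only for this illustration.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) computed from deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    b_yx = sum_xy / sum_x2      # regression coefficient of Y on X
    b_xy = sum_xy / sum_y2      # regression coefficient of X on Y
    return mean_x, mean_y, b_yx, b_xy


def estimate_y(x, mean_x, mean_y, b_yx):
    """Y on X: Y_e = mean_y + b_yx * (x - mean_x)."""
    return mean_y + b_yx * (x - mean_x)


def estimate_x(y, mean_x, mean_y, b_xy):
    """X on Y: X_e = mean_x + b_xy * (y - mean_y)."""
    return mean_x + b_xy * (y - mean_y)


# Hypothetical paired observations, used only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(x, y)
print(b_yx, b_xy)                              # 6/10 and 6/6
print(estimate_y(6, mean_x, mean_y, b_yx))     # estimate of Y when X = 6
print(estimate_x(5, mean_x, mean_y, b_xy))     # estimate of X when Y = 5
```

The product of the two coefficients equals r², which gives a quick consistency check against the correlation coefficient of the same data.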


Unit Third

Analysis of Time Series

A time series is an arrangement of statistical data in chronological order, i.e., in accordance with its time of occurrence. It reflects the dynamic pace of movements of a phenomenon over a period of time. Most of the series relating to Economics, Business and Commerce are time series, e.g., the series relating to prices, production and consumption of various commodities; agricultural and industrial production; national income and foreign exchange reserves; investments, sales and profits of business houses; and various other aspects where fluctuations are possible and visible.

Ya-Lun Chou defines a time series as follows: “A time series may be defined as a collection of readings belonging to different time periods of some economic variable or composite of variables”.

Mathematical Models for Time Series

The following two models are commonly used for the decomposition of a time series into its components:

1. Additive Model or Decomposition by Additive Hypothesis

2. Multiplicative Model or Decomposition by Multiplicative Hypothesis

1. Additive Model or Decomposition by Additive Hypothesis

Y = T + S + C + I

$Y_t = T_t + S_t + C_t + I_t$

where Y ($Y_t$) is the value of the time series at time t, and $T_t$, $S_t$, $C_t$ and $I_t$ represent the trend, seasonal, cyclical and random variations at time t.

2. Multiplicative Model or Decomposition by Multiplicative Hypothesis

Y = T × S × C × I

$Y_t = T_t \times S_t \times C_t \times I_t$
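The difference between the two models is easiest to see with numbers. The following Python sketch builds a short series from hypothetical components under each model and then removes the trend again; every figure and name in it is invented for illustration.

```python
def additive_series(trend, seasonal, cyclical, irregular):
    """Y_t = T_t + S_t + C_t + I_t for each period t."""
    return [t + s + c + i for t, s, c, i in zip(trend, seasonal, cyclical, irregular)]


def multiplicative_series(trend, seasonal, cyclical, irregular):
    """Y_t = T_t * S_t * C_t * I_t for each period t."""
    return [t * s * c * i for t, s, c, i in zip(trend, seasonal, cyclical, irregular)]


# Hypothetical components for four periods.
trend = [100, 102, 104, 106]
add_y = additive_series(trend, [5, -5, 5, -5], [0, 0, 0, 0], [1, -1, 0, 0])
mul_y = multiplicative_series(trend, [1.05, 0.95, 1.05, 0.95], [1, 1, 1, 1], [1, 1, 1, 1])

# Removing the trend: subtract under the additive model, divide under the multiplicative one.
add_detrended = [y - t for y, t in zip(add_y, trend)]
mul_detrended = [y / t for y, t in zip(mul_y, trend)]

print(add_y, add_detrended)
print(mul_y, mul_detrended)
```

In the additive model the seasonal, cyclical and random components are expressed in the original units of the data; in the multiplicative model they act as ratios (indices) applied to the trend.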


Measurement of Trend

The following are the four methods which are generally used for the study and

measurement of the trend component in a time series

i) Graphic (or Free hand Curve Fitting) Method

ii) Method of Semi- Averages

iii) Method of Curve Fitting by the Principle of Least Squares

iv) Method of Moving Averages
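Two of the listed methods, semi-averages and moving averages, are simple enough to sketch in Python. The notes do not give their details, so the conventions below are assumptions: for an odd number of observations the middle value is left out of both halves, and the moving average is the plain (uncentred) one, which suits odd periods. The data and function names are hypothetical.

```python
def semi_averages(values):
    """Average of the first half and of the second half of the series."""
    half = len(values) // 2
    first = values[:half]
    second = values[-half:]        # for an odd count the middle value is omitted
    return sum(first) / half, sum(second) / half


def moving_average(values, period):
    """Simple moving averages of the given period (no centring step)."""
    return [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]


# Hypothetical yearly figures, used only for illustration.
series = [10, 12, 14, 16, 18, 20]
print(semi_averages(series))               # averages of (10, 12, 14) and (16, 18, 20)
print(moving_average(series[:5], 3))       # 3-yearly moving averages
```

Each semi-average is plotted against the middle of its half and the two points are joined to give the trend line; each moving average is placed against the middle period of its window.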

For further details and problems, refer to the book.


Unit Fourth Probability

The term probability refers to the chance of the happening or not happening of an event. Today, the theory of probability has been very extensively developed and there is hardly any discipline, physical or social, where it is not being extensively used. In the field of Business and Economics, it is very widely used for the quantitative analysis of various problems, and it forms the very basis of the modern theory of decision-making, i.e., decision-making under conditions of uncertainty with calculated risks.

Probability Defined:

Ordinarily speaking, the probability of an event denotes the likelihood of its happening. The value of a probability ranges between 0 and 1. If an event is certain to happen, its probability is 1, and if it is certain that the event will not take place, then the probability of its happening is 0.

Theorems of Probability

There are two important theorems of probability, viz

(i) The addition theorem, and

(ii) The multiplication theorem

The addition theorem

If A and B are any two events, then the probability that at least one of them occurs is denoted by P(A ∪ B) and is given by

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

where

P(A) = probability of the occurrence of event A

P(B) = probability of the occurrence of event B

P(A ∩ B) = probability of the simultaneous occurrence of events A and B

In the case of mutually exclusive events

P(A ∪ B) = P(A) + P(B)

If there are three events A, B and C, the probability of the occurrence of at least one of them is given by

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(A ∩ C) + P(A ∩ B ∩ C)

If the events are mutually exclusive, then

P(A ∪ B ∪ C) = P(A) + P(B) + P(C)

In the case of a finite number of mutually exclusive events

P(A1 ∪ A2 ∪ A3 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)

Multiplication Theorem

The probability of the simultaneous occurrence of the events A and B is denoted by P(A ∩ B) or P(AB) and is given by

P(A ∩ B) = P(A) × P(B/A) = P(B) × P(A/B)

where P(B/A) is the conditional probability of B given that A has occurred.

If the events are independent, P(B/A) = P(B) and P(A/B) = P(A), so for independent events

P(A ∩ B) = P(A) × P(B)

For three events A, B and C

P(A ∩ B ∩ C) = P(A) × P(B/A) × P(C/A ∩ B)

If the events are independent

P(A ∩ B ∩ C) = P(A) × P(B) × P(C)
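Both theorems can be verified by direct enumeration on a small sample space. The Python sketch below uses a single throw of a fair die; the events A, B and C and the helper names are hypothetical, chosen only for illustration, and exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

# Sample space: one throw of a fair six-sided die (equally likely outcomes).
SAMPLE_SPACE = frozenset(range(1, 7))


def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


def conditional(event, given):
    """P(event / given) = P(event and given) / P(given)."""
    return prob(event & given) / prob(given)


A = frozenset({2, 4, 6})       # an even number
B = frozenset({4, 5, 6})       # a number greater than 3
C = frozenset({1, 2, 3, 4})    # a number not greater than 4

# Addition theorem: P(A or B) = P(A) + P(B) - P(A and B)
addition_lhs = prob(A | B)
addition_rhs = prob(A) + prob(B) - prob(A & B)

# Multiplication theorem: P(A and B) = P(A) * P(B/A)
multiplication_lhs = prob(A & B)
multiplication_rhs = prob(A) * conditional(B, A)

# A and C are independent here: P(A and C) = P(A) * P(C)
independent = prob(A & C) == prob(A) * prob(C)

print(addition_lhs, addition_rhs)
print(multiplication_lhs, multiplication_rhs)
print(independent)
```

A and B are not independent in this example (P(B/A) differs from P(B)), which is why the conditional probability is needed in the multiplication theorem.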

For further details and problems, refer to the book.

