
Correlation by Neeraj Bhandari ( Surkhet.Nepal )

Date post: 21-Jan-2015
Upload: neeraj-bhandari
Page 1: Correlation by Neeraj Bhandari ( Surkhet.Nepal )

CORRELATION

Correlation is a statistical measure of the relationship between two variables. If a change in one variable is accompanied by a change in the other, the two variables are said to be correlated.

Correlation analysis is thus a mathematical tool for measuring the degree to which one variable is linearly related to another.

DIRECT OR POSITIVE CORRELATION

If an increase (or decrease) in one variable is accompanied by a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive.

INVERSE OR NEGATIVE CORRELATION

If an increase (or decrease) in one variable is accompanied by a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative.

For example, the correlation between income and expenditure is positive, and the correlation between the volume and pressure of a perfect gas (at constant temperature) is negative.

LINEAR CORRELATION

A relation in which a change in one variable bears a constant ratio to the corresponding change in the other is called linear correlation (perfect correlation when the relation holds exactly).

NON-LINEAR CORRELATION

A relation in which the changes in the two variables do not bear a constant ratio is called non-linear correlation.

Karl Pearson's Coefficient of Correlation

The correlation coefficient between two variables x and y is denoted by r(x, y) and is a numerical measure of the linear relationship between them.

$$ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{n \, \sigma_x \sigma_y} $$

where r = correlation coefficient between x and y, $\sigma_x$ = standard deviation of x, $\sigma_y$ = standard deviation of y, and n = number of observations.

Properties of the coefficient of correlation

(i) It measures the degree of correlation.
(ii) The value of r(x, y) lies between -1 and 1.
(iii) If r = 1, the correlation is perfect positive.
(iv) If r = -1, the correlation is perfect negative.
(v) If r = 0, the variables are uncorrelated, i.e. there is no linear relationship. Independent variables have r = 0, but r = 0 does not by itself imply independence.
(vi) The correlation coefficient is independent of change of origin and scale. If X and Y are random variables and a, b, c, d are any numbers with a ≠ 0 and c ≠ 0, then

$$ r(aX + b, \; cY + d) = \frac{ac}{|ac|} \, r(X, Y) $$

so that r(aX + b, cY + d) = r(X, Y) whenever a and c have the same sign.
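
Properties (v) and (vi) can be checked numerically. The following is a minimal sketch in plain Python; the data sets and the helper name `pearson_r` are illustrative assumptions, not taken from the slides. It shows that r is unchanged by a shift of origin and a same-sign change of scale, that it flips sign when the two scale factors have opposite signs, and that r can be 0 even when y is completely determined by x.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]

r = pearson_r(xs, ys)

# Same-sign scale factors (a = 3, c = 2): r is unchanged.
r_same = pearson_r([3 * x + 7 for x in xs], [2 * y - 5 for y in ys])

# Opposite-sign scale factors (a = -3, c = 2): r changes sign.
r_flip = pearson_r([-3 * x + 7 for x in xs], [2 * y - 5 for y in ys])

# y = x^2 on symmetric x: y depends entirely on x, yet r is 0.
sym_x = [-2, -1, 0, 1, 2]
r_zero = pearson_r(sym_x, [x * x for x in sym_x])

print(r, r_same, r_flip, r_zero)
```

The last case is why r = 0 is best read as "no linear relationship" and not as independence.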

Example: Calculate the correlation coefficient of the following heights (in inches) of fathers (X) and their sons (Y):

X : 65 66 67 67 68 69 70 72
Y : 67 68 65 68 72 72 69 71

            X      Y     X^2     Y^2      XY
           65     67    4225    4489    4355
           66     68    4356    4624    4488
           67     65    4489    4225    4355
           67     68    4489    4624    4556
           68     72    4624    5184    4896
           69     72    4761    5184    4968
           70     69    4900    4761    4830
           72     71    5184    5041    5112
Total     544    552   37028   38132   37560

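$$ \bar{X} = \frac{\sum X}{n} = \frac{544}{8} = 68, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{552}{8} = 69 $$

$$ r(X, Y) = \frac{\frac{1}{n}\sum XY - \bar{X}\bar{Y}}{\sqrt{\left(\frac{1}{n}\sum X^2 - \bar{X}^2\right)\left(\frac{1}{n}\sum Y^2 - \bar{Y}^2\right)}} $$

On putting in all the values, we get

$$ r = \frac{4695 - 4692}{\sqrt{(4628.5 - 4624)(4766.5 - 4761)}} = \frac{3}{\sqrt{24.75}} \approx 0.603 $$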
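
The same calculation can be reproduced from the raw data. This is a minimal sketch in plain Python, using the equivalent sums form of Pearson's formula; the variable names are illustrative.

```python
import math

fathers = [65, 66, 67, 67, 68, 69, 70, 72]  # X, heights in inches
sons = [67, 68, 65, 68, 72, 72, 69, 71]     # Y, heights in inches

n = len(fathers)
sum_x = sum(fathers)
sum_y = sum(sons)
sum_x2 = sum(x * x for x in fathers)
sum_y2 = sum(y * y for y in sons)
sum_xy = sum(x * y for x, y in zip(fathers, sons))

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))
```

The intermediate sums match the totals row of the table (544, 552, 37028, 38132, 37560), and r rounds to 0.603.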

SOLUTION: SHORT-CUT METHOD

            X      Y   U=X-68   V=Y-69    U^2    V^2     UV
           65     67       -3       -2      9      4      6
           66     68       -2       -1      4      1      2
           67     65       -1       -4      1     16      4
           67     68       -1       -1      1      1      1
           68     72        0        3      0      9      0
           69     72        1        3      1      9      3
           70     69        2        0      4      0      0
           72     71        4        2     16      4      8
Total                       0        0     36     44     24

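$$ \bar{U} = \frac{\sum U}{n} = 0, \qquad \bar{V} = \frac{\sum V}{n} = 0 $$

Since both means are zero,

$$ r(U, V) = \frac{\sum UV}{\sqrt{\sum U^2 \, \sum V^2}} = \frac{24}{\sqrt{36 \times 44}} \approx 0.603 $$

This is the same value as r(X, Y), because the correlation coefficient does not depend on the choice of origin.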

RANK CORRELATION

Let $(x_i, y_i)$, i = 1, 2, 3, ..., n, be the ranks of n individuals in a group for the characteristics A and B respectively. The coefficient of correlation between the ranks is called the rank correlation coefficient between the characteristics A and B for that group of individuals.

$$ r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} $$

where $d_i = x_i - y_i$ denotes the difference in ranks of the i-th individual.

EXAMPLE: Compute the rank correlation coefficient for the following data:

Person          :  A   B   C   D   E   F   G   H   I   J
Rank in Maths   :  9  10   6   5   7   2   4   8   1   3
Rank in Physics :  1   2   3   4   5   6   7   8   9  10

Person    R1    R2    d=R1-R2    d^2
A          9     1       8        64
B         10     2       8        64
C          6     3       3         9
D          5     4       1         1
E          7     5       2         4
F          2     6      -4        16
G          4     7      -3         9
H          8     8       0         0
I          1     9      -8        64
J          3    10      -7        49
TOTAL                            280

$$ r = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 280}{10(100 - 1)} = 1 - 1.697 = -0.697 $$
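
As a quick check of the arithmetic, here is a minimal sketch of the rank correlation formula for untied ranks. The function name `spearman_no_ties` is an illustrative choice, and the function assumes its inputs are already ranks with no repeats.

```python
# Ranks of persons A..J, copied from the example.
maths = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
physics = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def spearman_no_ties(rank_a, rank_b):
    """Rank correlation r = 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid without ties."""
    n = len(rank_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

sum_d2 = sum((a - b) ** 2 for a, b in zip(maths, physics))
rho = spearman_no_ties(maths, physics)

print(sum_d2, round(rho, 3))
```

The negative value indicates that, in this group, a high rank in Maths tends to go with a low rank in Physics.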

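Repeated Ranks

When some values are tied, each tied value receives the average of the ranks it would otherwise occupy, and a correction term is added for every group of m tied values:

$$ r = 1 - \frac{6\left[\sum d^2 + \frac{1}{12} m_1 (m_1^2 - 1) + \frac{1}{12} m_2 (m_2^2 - 1) + \cdots + \frac{1}{12} m_k (m_k^2 - 1)\right]}{n(n^2 - 1)} $$

Example: Obtain the rank correlation coefficient for the following data:

X : 68 64 75 50 64 80 75 40 55 64
Y : 62 58 68 45 81 60 68 48 50 70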

X             68    64    75    50    64    80    75    40    55    64   Total
Y             62    58    68    45    81    60    68    48    50    70
Ranks in X     4     6   2.5     9     6     1   2.5    10     8     6
Ranks in Y     5     7   3.5    10     1     6   3.5     9     8     2
d = x - y     -1    -1    -1    -1     5    -5    -1     1     0     4       0
d^2            1     1     1     1    25    25     1     1     0    16      72

In X, the value 75 occurs twice ($m_1 = 2$) and 64 occurs three times ($m_2 = 3$); in Y, the value 68 occurs twice ($m_3 = 2$). Hence

$$ r = 1 - \frac{6\left[72 + \frac{1}{12} \cdot 2(2^2 - 1) + \frac{1}{12} \cdot 3(3^2 - 1) + \frac{1}{12} \cdot 2(2^2 - 1)\right]}{10(10^2 - 1)} $$

$$ r = 1 - \frac{6 \times 75}{990} = \frac{6}{11} \approx 0.545 $$
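
The tied-rank procedure can be sketched in plain Python as follows. The helper names (`average_ranks`, `tie_correction`, `spearman_with_ties`) are illustrative assumptions. The sketch follows the correction-term formula used in this example, ranking in descending order so that the largest value gets rank 1; it is not the only way to handle ties.

```python
from collections import Counter

def average_ranks(values):
    """Rank in descending order; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(ordered, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of m*(m^2 - 1)/12 over every group of m > 1 tied values."""
    counts = Counter(values)
    return sum(m * (m * m - 1) / 12 for m in counts.values() if m > 1)

def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n * n - 1))

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]

ranks_x = average_ranks(x)
ranks_y = average_ranks(y)
rho = spearman_with_ties(x, y)

print(ranks_x, ranks_y, round(rho, 3))
```

For this data the total correction is 0.5 + 2 + 0.5 = 3, which is added to the sum of squared rank differences (72) before applying the formula.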

REGRESSION ANALYSIS

The term regression refers to a functional relationship between two or more variables.

Regression describes the nature and extent of the relationship between correlated variables. It is the estimation or prediction of unknown values of one variable from known values of another variable.

CURVE OF REGRESSION AND REGRESSION EQUATION

If two variates x and y are correlated, the points of the scatter diagram will be more or less concentrated around a curve. This curve is called the curve of regression. The mathematical equation of the regression curve is called the regression equation.

LINEAR REGRESSION

When the points of the scatter diagram concentrate around a straight line, the regression is called linear, and this straight line is known as the line of regression.

LINES OF REGRESSION

Given n pairs (x, y), either x or y may be taken as the independent variable and the other as the dependent variable, so either of the two may be estimated for given values of the other. If we want to estimate y for given values of x, we use a regression equation of the form y = a + bx, called the regression line of y on x. If we wish to estimate x from given values of y, we use a regression line of the form x = A + By, called the regression line of x on y. Thus, in general, there are always two lines of regression.

LINE OF REGRESSION OF Y ON X:

$$ y - \bar{y} = b_{yx} (x - \bar{x}) $$

where $b_{yx}$ is the regression coefficient of y on x:

$$ b_{yx} = r \frac{\sigma_y}{\sigma_x} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} $$

LINE OF REGRESSION OF X ON Y:

$$ x - \bar{x} = b_{xy} (y - \bar{y}) $$

where $b_{xy}$ is the regression coefficient of x on y:

$$ b_{xy} = r \frac{\sigma_x}{\sigma_y} = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} $$

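Theorem: The correlation coefficient is the geometric mean of the regression coefficients.

The coefficients of regression are

$$ b_{yx} = r \frac{\sigma_y}{\sigma_x} \quad \text{and} \quad b_{xy} = r \frac{\sigma_x}{\sigma_y} $$

Then

$$ b_{yx} \, b_{xy} = r \frac{\sigma_y}{\sigma_x} \cdot r \frac{\sigma_x}{\sigma_y} = r^2 $$

so that

$$ r = \pm \sqrt{b_{yx} \, b_{xy}} $$

that is, the geometric mean of the regression coefficients equals the coefficient of correlation, with r taking the common sign of the two regression coefficients.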

EXAMPLE

Find the line of regression of y on x for the data given below:

X: 1.53 1.78 2.60 2.95 3.42
Y: 33.50 36.30 40.00 45.80 53.50

Solution:

          x        y       x^2        xy
       1.53    33.50    2.3409    51.255
       1.78    36.30    3.1684    64.614
       2.60    40.00    6.7600   104.000
       2.95    45.80    8.7025   135.110
       3.42    53.50   11.6964   182.970

$$ \sum x = 12.28, \quad \sum y = 209.1, \quad \sum x^2 \approx 32.67, \quad \sum xy \approx 537.95 $$

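Here n = 5, so

$$ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = 9.726 $$

(using the unrounded sums from the table). Then the line of regression of y on x,

$$ y - \bar{y} = b_{yx} (x - \bar{x}) $$

with $\bar{x} = 12.28/5 = 2.456$ and $\bar{y} = 209.1/5 = 41.82$, becomes

y = 17.932 + 9.726x

which is the required line of regression of y on x.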
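
The regression line of this example can be recomputed directly from the data. This is a minimal sketch in plain Python; the function name `regression_y_on_x` is illustrative, and the last x value is taken as 3.42, the value with which the worked table and the total of 12.28 are consistent.

```python
xs = [1.53, 1.78, 2.60, 2.95, 3.42]
ys = [33.50, 36.30, 40.00, 45.80, 53.50]

def regression_y_on_x(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope

a, b = regression_y_on_x(xs, ys)
print(f"y = {a:.4f} + {b:.4f}x")
```

Working with unrounded sums gives a slope close to 9.7266 and an intercept close to 17.9315, in agreement with the rounded coefficients quoted in the solution.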

Question: For 10 observations on price (x) and supply (y), the following data were obtained:

$$ \sum x = 130, \quad \sum y = 220, \quad \sum x^2 = 2288, \quad \sum y^2 = 5506, \quad \sum xy = 3467 $$

Obtain the two lines of regression and estimate the supply when the price is 16 units.

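Solution:

$$ n = 10, \quad \bar{x} = \frac{\sum x}{n} = 13, \quad \bar{y} = \frac{\sum y}{n} = 22 $$

Regression coefficient of y on x:

$$ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{34670 - 28600}{22880 - 16900} = \frac{6070}{5980} = 1.015 $$

Regression line of y on x is

$$ y - \bar{y} = b_{yx} (x - \bar{x}) $$

that is,

y = 1.015x + 8.805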

Since we are to estimate supply (y) when price (x) is given, we use the regression line of y on x.

When x = 16 units, y = 1.015(16) + 8.805 = 25.045.
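
The question asks for both lines of regression. A minimal sketch in plain Python that derives the two lines from the given sums is shown below; the variable names are illustrative, and the x-on-y line is computed with the same kind of formula, using the sum of y squared in the denominator.

```python
n = 10
sum_x, sum_y = 130, 220
sum_x2, sum_y2, sum_xy = 2288, 5506, 3467

mean_x, mean_y = sum_x / n, sum_y / n

# Shared numerator of both regression coefficients.
cov_term = n * sum_xy - sum_x * sum_y

b_yx = cov_term / (n * sum_x2 - sum_x ** 2)  # slope of y on x
b_xy = cov_term / (n * sum_y2 - sum_y ** 2)  # slope of x on y

a_yx = mean_y - b_yx * mean_x  # intercept of y = a_yx + b_yx * x
a_xy = mean_x - b_xy * mean_y  # intercept of x = a_xy + b_xy * y

# Supply is estimated from price, so the y-on-x line is used.
supply_at_16 = a_yx + b_yx * 16

print(f"y = {b_yx:.3f}x + {a_yx:.3f}")
print(f"x = {b_xy:.3f}y + {a_xy:.3f}")
print(f"estimated supply at price 16: {supply_at_16:.3f}")
```

With unrounded coefficients the estimate agrees with 25.045 to three decimal places, and the x-on-y line comes out as roughly x = 0.911y - 7.051.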

Ques: From the following data, find the most likely value of y when x = 24:

Mean (x) = 18.1, Mean (y) = 985.8, S.D. (x) = 2, S.D. (y) = 36.4, r = 0.58
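
One possible way to approach this question is to use the regression line of y on x with $b_{yx} = r \, \sigma_y / \sigma_x$. The sketch below is an illustrative working in plain Python, not a solution given in the slides; the function name `predict_y` is an assumption.

```python
mean_x, mean_y = 18.1, 985.8
sd_x, sd_y = 2.0, 36.4
r = 0.58

# Regression coefficient of y on x: b_yx = r * sd_y / sd_x
b_yx = r * sd_y / sd_x

def predict_y(x):
    """Estimate y from the regression line y - mean_y = b_yx * (x - mean_x)."""
    return mean_y + b_yx * (x - mean_x)

y_at_24 = predict_y(24)
print(round(b_yx, 3), round(y_at_24, 2))
```

Under these assumptions the regression coefficient is 10.556 and the most likely value of y at x = 24 is about 1048.08.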

Ex. In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible:

Variance of x = 9

Regression equations:

$$ 8x - 10y + 66 = 0, \qquad 40x - 18y = 214 $$

What were (a) the mean values of x and y, (b) the standard deviations of x and y, and (c) the coefficient of correlation between x and y?

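(i) Since both the lines of regression pass through the point $(\bar{x}, \bar{y})$, we have

$$ 8\bar{x} - 10\bar{y} + 66 = 0 $$

$$ 40\bar{x} - 18\bar{y} - 214 = 0 $$

Solving these equations gives $\bar{x} = 13$ and $\bar{y} = 17$.

(ii) Variance of x $= \sigma_x^2 = 9$, so $\sigma_x = 3$.

The equations of the lines of regression can be written as

$$ y = 0.8x + 6.6 \quad \text{and} \quad x = 0.45y + 5.35 $$

so that $b_{yx} = 0.8$ and $b_{xy} = 0.45$. Then

$$ r^2 = b_{yx} \, b_{xy} = 0.8 \times 0.45 = 0.36, \qquad r = 0.6 $$

Finally, from $b_{yx} = r \, \sigma_y / \sigma_x$,

$$ \sigma_y = \frac{b_{yx} \, \sigma_x}{r} = \frac{0.8 \times 3}{0.6} = 4 $$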
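
The steps of this solution can be verified with a short sketch in plain Python. The tuple representation of a line and the function name `intersection` are illustrative assumptions. The first equation is treated as the line of y on x and the second as the line of x on y; the opposite assignment would give a product of regression coefficients greater than 1, which is impossible because that product equals r squared.

```python
import math

# Each line is stored as (a, b, c), meaning a*x + b*y + c = 0.
line_y_on_x = (8, -10, 66)     # 8x - 10y + 66 = 0
line_x_on_y = (40, -18, -214)  # 40x - 18y - 214 = 0

def intersection(l1, l2):
    """Solve the two linear equations by Cramer's rule."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    x = (b1 * c2 - b2 * c1) / det
    y = (a2 * c1 - a1 * c2) / det
    return x, y

# Both regression lines pass through the point of means.
mean_x, mean_y = intersection(line_y_on_x, line_x_on_y)

# Slopes after solving each line for its dependent variable.
b_yx = -line_y_on_x[0] / line_y_on_x[1]  # y = 0.8x + 6.6
b_xy = -line_x_on_y[1] / line_x_on_y[0]  # x = 0.45y + 5.35

# Both coefficients are positive, so r is the positive root.
r = math.sqrt(b_yx * b_xy)

sd_x = math.sqrt(9)     # variance of x is given as 9
sd_y = b_yx * sd_x / r  # from b_yx = r * sd_y / sd_x

print(mean_x, mean_y, round(r, 3), round(sd_y, 3))
```

The results agree with the worked values: means 13 and 17, r = 0.6, and a standard deviation of y equal to 4.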

Ques.: If the regression coefficients are 0.8 and 0.2, what is the value of the coefficient of correlation?

Ques.: The equations of the two lines of regression obtained in a correlation analysis of 60 observations are 5x = 6y + 24 and 1000y = 768x - 3608.

What is the coefficient of correlation? What are the mean values of x and y? What is the ratio of the variances of x and y?

