Date post: | 03-Apr-2018 |
Category: | Documents |
Upload: | unitedworld-school-of-business |
7/28/2019 Correlation & Regression - 2 Presentation - Unitedworld School of Business
Correlation-Regression
Correlation deals with the association between two or more variables.
Correlation analysis deals with covariation between two or more variables.
Types
1. Positive or negative
2. Simple or multiple
3. Linear or non-linear
Karl Pearson's Coefficient of Correlation
$$ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \; \sum dy^2}} = \frac{\sum dx\,dy}{N \sigma_x \sigma_y} $$
where $dx = x - \bar{x}$ and $dy = y - \bar{y}$
$\sum dx\,dy$ = sum of products of deviations from the respective arithmetic means of both series
Karl Pearson's Coefficient of Correlation
After calculating assumed or working means $A_x$ and $A_y$:
$$ r = \frac{N \sum dx\,dy - (\sum dx)(\sum dy)}{\sqrt{\left[N \sum dx^2 - (\sum dx)^2\right] \left[N \sum dy^2 - (\sum dy)^2\right]}} $$
$\sum dx\,dy$ = total of products of deviations from the assumed means of the x and y series
$\sum dx$ = total of deviations of the x series
$\sum dy$ = total of deviations of the y series
$\sum dx^2$ = total of squared deviations of the x series
$\sum dy^2$ = total of squared deviations of the y series
N = number of items (number of paired items)
Karl Pearson's Coefficient of Correlation
After calculating assumed or working means $A_x$ and $A_y$, an equivalent form is:
$$ r = \frac{\sum dx\,dy - \dfrac{(\sum dx)(\sum dy)}{N}}{\sqrt{\left[\sum dx^2 - \dfrac{(\sum dx)^2}{N}\right] \left[\sum dy^2 - \dfrac{(\sum dy)^2}{N}\right]}} $$
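The assumed-mean form gives the same result as the direct form whatever working means are chosen. A minimal Python sketch of that check follows; the data values, the working means (15 and 7), and the function names are hypothetical and chosen only for illustration.

```python
from math import sqrt

def pearson_direct(x, y):
    """Pearson's r using deviations from the actual means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [xi - mx for xi in x]
    dy = [yi - my for yi in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den

def pearson_assumed_mean(x, y, ax, ay):
    """Pearson's r using deviations from assumed (working) means ax, ay."""
    n = len(x)
    dx = [xi - ax for xi in x]
    dy = [yi - ay for yi in y]
    s_dx, s_dy = sum(dx), sum(dy)
    num = sum(a * b for a, b in zip(dx, dy)) - s_dx * s_dy / n
    den = sqrt((sum(a * a for a in dx) - s_dx ** 2 / n)
               * (sum(b * b for b in dy) - s_dy ** 2 / n))
    return num / den

# Hypothetical paired data, not taken from the slides.
x = [10, 12, 15, 19, 24]
y = [4, 6, 7, 11, 12]

r_direct = pearson_direct(x, y)
r_assumed = pearson_assumed_mean(x, y, ax=15, ay=7)
print(r_direct, r_assumed)
```

The two printed values should agree up to floating-point rounding, which is why a convenient round number can be used as the working mean in hand calculation.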
Assumptions of Karl Pearson's Coefficient of Correlation
1. A linear relationship exists between the variables
Properties of Karl Pearson's Coefficient of Correlation
1. Its value lies between +1 and -1
2. Zero means no linear correlation
3. $r = \pm\sqrt{b_{xy} \times b_{yx}}$, where $b_{xy}$ and $b_{yx}$ are the regression coefficients and r takes their common sign
Merit
Convenient for accurate interpretation, as it gives the degree and direction of the relationship between two variables
Limitations
1. Assumes a linear relationship, even though the actual relationship may not be linear
2. The method and process of calculation are difficult and time consuming
3. Affected by extreme values in the distribution
Probable Error of Karl Pearson's Coefficient of Correlation
$$ \text{Probable error of } r = 0.6745 \times \frac{1 - r^2}{\sqrt{N}} $$
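As a small sketch of the probable-error formula, the function below evaluates 0.6745 (1 - r^2) / sqrt(N). The function name and the inputs r = 0.70 and N = 12 are illustrative assumptions.

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of Pearson's r for n paired observations."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)

# Illustrative inputs: r = 0.70 from 12 paired items.
pe = probable_error(0.70, 12)
print(round(pe, 4))
```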
Q7. Calculate the coefficient of correlation for the following data
X: 65 63 67 64 68 62 70 66 68 67 69 71
Y: 68 66 68 65 69 66 68 65 71 67 68 70
Ans.
$$ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \; \sum dy^2}} = \frac{\sum dx\,dy}{N \sigma_x \sigma_y} $$
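| No. | X | Y | dx = x - xbar | dy = y - ybar | dx2 | dy2 | dx.dy |
|---|---|---|---|---|---|---|---|
| 1 | 65 | 68 | -1.67 | 0.42 | 2.78 | 0.17 | -0.69 |
| 2 | 63 | 66 | -3.67 | -1.58 | 13.44 | 2.51 | 5.81 |
| 3 | 67 | 68 | 0.33 | 0.42 | 0.11 | 0.17 | 0.14 |
| 4 | 64 | 65 | -2.67 | -2.58 | 7.11 | 6.67 | 6.89 |
| 5 | 68 | 69 | 1.33 | 1.42 | 1.78 | 2.01 | 1.89 |
| 6 | 62 | 66 | -4.67 | -1.58 | 21.78 | 2.51 | 7.39 |
| 7 | 70 | 68 | 3.33 | 0.42 | 11.11 | 0.17 | 1.39 |
| 8 | 66 | 65 | -0.67 | -2.58 | 0.44 | 6.67 | 1.72 |
| 9 | 68 | 71 | 1.33 | 3.42 | 1.78 | 11.67 | 4.56 |
| 10 | 67 | 67 | 0.33 | -0.58 | 0.11 | 0.34 | -0.19 |
| 11 | 69 | 68 | 2.33 | 0.42 | 5.44 | 0.17 | 0.97 |
| 12 | 71 | 70 | 4.33 | 2.42 | 18.78 | 5.84 | 10.47 |
| Sum | 800 | 811 | | | 84.67 | 38.92 | 40.33 |
xbar = 66.67, ybar = 67.58
Sum dx2 x Sum dy2 = 3294.9
Square root of (Sum dx2 x Sum dy2) = 57.40
Coefficient of correlation = 40.33 / 57.40 = 0.70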
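The table above can be reproduced with a few lines of Python. This is a sketch of the deviation method applied to the Q7 data; the variable names are my own.

```python
from math import sqrt

x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Deviations from the arithmetic means.
dx = [xi - x_bar for xi in x]
dy = [yi - y_bar for yi in y]

sum_dx2 = sum(d * d for d in dx)
sum_dy2 = sum(d * d for d in dy)
sum_dxdy = sum(a * b for a, b in zip(dx, dy))

r = sum_dxdy / sqrt(sum_dx2 * sum_dy2)
print(round(sum_dx2, 2), round(sum_dy2, 2), round(sum_dxdy, 2), round(r, 2))
```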
Q8. The following information gives the ages of husbands and wives. Find the correlation coefficient.
Husband: 23 27 28 29 30 31 33 35 36 39
Wife: 18 22 23 24 25 26 28 29 30 32
Ans. r = 0.99
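| No. | X | Y | dx = x - xbar | dy = y - ybar | dx2 | dy2 | dx.dy |
|---|---|---|---|---|---|---|---|
| 1 | 23 | 18 | -8.10 | -7.70 | 65.61 | 59.29 | 62.37 |
| 2 | 27 | 22 | -4.10 | -3.70 | 16.81 | 13.69 | 15.17 |
| 3 | 28 | 23 | -3.10 | -2.70 | 9.61 | 7.29 | 8.37 |
| 4 | 29 | 24 | -2.10 | -1.70 | 4.41 | 2.89 | 3.57 |
| 5 | 30 | 25 | -1.10 | -0.70 | 1.21 | 0.49 | 0.77 |
| 6 | 31 | 26 | -0.10 | 0.30 | 0.01 | 0.09 | -0.03 |
| 7 | 33 | 28 | 1.90 | 2.30 | 3.61 | 5.29 | 4.37 |
| 8 | 35 | 29 | 3.90 | 3.30 | 15.21 | 10.89 | 12.87 |
| 9 | 36 | 30 | 4.90 | 4.30 | 24.01 | 18.49 | 21.07 |
| 10 | 39 | 32 | 7.90 | 6.30 | 62.41 | 39.69 | 49.77 |
| Sum | 311 | 257 | | | 202.9 | 158.1 | 178.3 |
xbar = 31.10, ybar = 25.70
Sum dx2 x Sum dy2 = 32078.49
Square root of (Sum dx2 x Sum dy2) = 179.10
Coefficient of correlation = 178.3 / 179.10 = 0.9955, or 1.00 rounded to two decimals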
Rank Correlation: sometimes variables are not quantitative in nature but can be arranged in serial order.
This arises especially when dealing with attributes like honesty, beauty, character, morality, etc.
To deal with such situations, Charles Edward Spearman, in 1904, developed a formula for obtaining the correlation coefficient between the ranks of n individuals in two attributes under study, or between the ranks given by two or three judges.
Rank coefficient of correlation
$$ \rho = 1 - \frac{6 \sum d^2}{N^3 - N} = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} $$
$\sum d^2$ = total of squared differences between ranks
N = number of items
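Q9. Ten competitors in a cooking competition are ranked by three judges in the following way. Using the rank correlation method, find out which pair of judges has the nearest approach.
| Competitor | P | Q | R |
|---|---|---|---|
| 1 | 1 | 3 | 6 |
| 2 | 6 | 5 | 4 |
| 3 | 5 | 8 | 9 |
| 4 | 10 | 4 | 8 |
| 5 | 3 | 7 | 1 |
| 6 | 2 | 10 | 2 |
| 7 | 4 | 2 | 3 |
| 8 | 9 | 1 | 10 |
| 9 | 7 | 6 | 5 |
| 10 | 8 | 9 | 7 |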
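| Competitor | P | Q | R | Rp-Rq | dpq2 | Rq-Rr | dqr2 | Rp-Rr | dpr2 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 3 | 6 | -2 | 4 | -3 | 9 | -5 | 25 |
| 2 | 6 | 5 | 4 | 1 | 1 | 1 | 1 | 2 | 4 |
| 3 | 5 | 8 | 9 | -3 | 9 | -1 | 1 | -4 | 16 |
| 4 | 10 | 4 | 8 | 6 | 36 | -4 | 16 | 2 | 4 |
| 5 | 3 | 7 | 1 | -4 | 16 | 6 | 36 | 2 | 4 |
| 6 | 2 | 10 | 2 | -8 | 64 | 8 | 64 | 0 | 0 |
| 7 | 4 | 2 | 3 | 2 | 4 | -1 | 1 | 1 | 1 |
| 8 | 9 | 1 | 10 | 8 | 64 | -9 | 81 | -1 | 1 |
| 9 | 7 | 6 | 5 | 1 | 1 | 1 | 1 | 2 | 4 |
| 10 | 8 | 9 | 7 | -1 | 1 | 2 | 4 | 1 | 1 |
| Sum | | | | | 200 | | 214 | 0 | 60 |
With N = 10: N3 = 1000 and N3 - N = 990.
| | P and Q | Q and R | P and R |
|---|---|---|---|
| Sigma d2 | 200 | 214 | 60 |
| 6 Sigma d2 | 1200 | 1284 | 360 |
| 6 Sigma d2 / (N3 - N) | 1.21 | 1.297 | 0.3636 |
| rho | -0.21 | -0.297 | 0.636364 |
Judges P and R have the highest rank correlation, so this pair has the nearest approach.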
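The pairwise comparison can be sketched in Python as follows. The ranks are those of Q9; the function name `spearman_rho` and the dictionary layout are my own choices, and the formula assumes no tied ranks.

```python
from itertools import combinations

# Ranks given by judges P, Q, R to competitors 1..10 (from Q9).
ranks = {
    "P": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "Q": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "R": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

def spearman_rho(a, b):
    """Spearman's rank correlation for two rank lists without ties."""
    n = len(a)
    sum_d2 = sum((ra - rb) ** 2 for ra, rb in zip(a, b))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# rho for every pair of judges.
rho = {pair: spearman_rho(ranks[pair[0]], ranks[pair[1]])
       for pair in combinations(ranks, 2)}

best_pair = max(rho, key=rho.get)
for pair, value in rho.items():
    print(pair, round(value, 4))
print("nearest approach:", best_pair)
```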
Regression Analysis is the process of developing a statistical model which is used to predict the value of a dependent variable from an independent variable.
Application
Advertising v/s sales revenue
First used by Sir Francis Galton in 1877 for a study of the heights of sons with respect to the heights of their fathers
Regression Analysis
Regression means going back, reverting to the former condition, or returning.
It refers to the functional relationship between x and y, and to estimates of the value of the dependent variable y for given values of the independent variable x.
Example: the relationship between the income of employees and their savings.
Regression coefficients can be used to calculate the correlation coefficient: $r = \pm\sqrt{b_{xy} \times b_{yx}}$
Types of Regression
1. Simple and multiple regression
2. Total or partial
3. Linear / non-linear
Methods of Regression Analysis
1. Scatter diagram
2. Regression equations
3. Regression lines
Regression of y on x: y = a + bx
Regression of x on y: x = a + by
Regression coefficients
Coefficient of regression of x on y:
$$ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{\sum dx\,dy}{\sum dy^2} $$
Coefficient of regression of y on x:
$$ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum dx\,dy}{\sum dx^2} $$
Q2. From the data given below, find
1. the two regression coefficients
2. the two regression equations
3. the coefficient of correlation between marks in Economics and Statistics
4. the most likely marks in Statistics when marks in Economics are 30
Let marks in Economics be x and marks in Statistics be y.
Marks in Eco (x): 25 28 35 32 31 36 29 38 34 32
Marks in Stat (y): 43 46 49 41 36 32 31 30 33 39
Marks in Eco (x): 25 28 35 32 31 36 29 38 34 32; Sum x = 320, xbar = 32
Marks in Stat (y): 43 46 49 41 36 32 31 30 33 39; Sum y = 380, ybar = 38
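| No. | Eco (x) | Stat (y) | dx = x - 32 | dy = y - 38 | dx2 | dy2 | dx dy |
|---|---|---|---|---|---|---|---|
| 1 | 25 | 43 | -7 | 5 | 49 | 25 | -35 |
| 2 | 28 | 46 | -4 | 8 | 16 | 64 | -32 |
| 3 | 35 | 49 | 3 | 11 | 9 | 121 | 33 |
| 4 | 32 | 41 | 0 | 3 | 0 | 9 | 0 |
| 5 | 31 | 36 | -1 | -2 | 1 | 4 | 2 |
| 6 | 36 | 32 | 4 | -6 | 16 | 36 | -24 |
| 7 | 29 | 31 | -3 | -7 | 9 | 49 | 21 |
| 8 | 38 | 30 | 6 | -8 | 36 | 64 | -48 |
| 9 | 34 | 33 | 2 | -5 | 4 | 25 | -10 |
| 10 | 32 | 39 | 0 | 1 | 0 | 1 | 0 |
| Sum | 320 | 380 | 0 | 0 | 140 | 398 | -93 |
xbar = 32, ybar = 38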
Regression coefficients
Coefficient of regression of x on y:
$$ b_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2} = \frac{\sum dx\,dy}{\sum dy^2} = \frac{-93}{398} = -0.2337 $$
Coefficient of regression of y on x:
$$ b_{yx} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{\sum dx\,dy}{\sum dx^2} = \frac{-93}{140} = -0.6643 $$
Regression of x on y
$$ x - \bar{x} = b_{xy}(y - \bar{y}) $$
x - 32 = -0.2337(y - 38)
x - 32 = -0.2337y + 0.2337 * 38 = -0.2337y + 8.8806
x = -0.2337y + 32 + 8.8806
x = -0.2337y + 40.8806
Correlation coefficient: $r = \pm\sqrt{b_{xy} \times b_{yx}}$
$$ r = -\sqrt{(-0.2337)(-0.6643)} = -\sqrt{0.1552} = -0.394 $$
The negative root is taken since $b_{yx}$ and $b_{xy}$ are both negative.
Regression of y on x
$$ y - \bar{y} = b_{yx}(x - \bar{x}) $$
y - 38 = -0.6643(x - 32)
y - 38 = -0.6643x + 0.6643 * 32
y = -0.6643x + 38 + 0.6643 * 32
y = -0.6643x + 38 + 21.2576
y = -0.6643x + 59.2576
To estimate the most likely marks in Statistics (y) when marks in Economics (x) are 30, we use the regression line of y on x, viz. y = -0.6643x + 59.2576.
The required estimate is given by
y = -0.6643 * 30 + 59.2576 = -19.929 + 59.2576 = 39.3286
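The whole of Q2 can be checked with a short Python sketch. It follows the deviation formulas used above; the variable and function names are my own, and small differences from the slide values come only from rounding the coefficients to four decimals.

```python
from math import sqrt, copysign

x = [25, 28, 35, 32, 31, 36, 29, 38, 34, 32]  # marks in Economics
y = [43, 46, 49, 41, 36, 32, 31, 30, 33, 39]  # marks in Statistics

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
dx = [xi - x_bar for xi in x]
dy = [yi - y_bar for yi in y]

sum_dxdy = sum(a * b for a, b in zip(dx, dy))
sum_dx2 = sum(a * a for a in dx)
sum_dy2 = sum(b * b for b in dy)

b_xy = sum_dxdy / sum_dy2  # regression coefficient of x on y
b_yx = sum_dxdy / sum_dx2  # regression coefficient of y on x

# r is the square root of the product, carrying the sign of the coefficients.
r = copysign(sqrt(b_xy * b_yx), b_yx)

def predict_y(x0):
    """Estimate y from the regression line of y on x."""
    return y_bar + b_yx * (x0 - x_bar)

y_at_30 = predict_y(30)
print(round(b_xy, 4), round(b_yx, 4), round(r, 3), round(y_at_30, 4))
```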
Sum of squares, x and y:
$$ SS_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n} $$
Sum of squares, xx:
$$ SS_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n} $$
Sales and advertising expenses in Rs. 1000. Develop a regression model.
| advt (x) | sales (y) |
|---|---|
| 92 | 930 |
| 94 | 900 |
| 97 | 1020 |
| 98 | 990 |
| 100 | 1100 |
| 102 | 1050 |
| 104 | 1150 |
| 105 | 1120 |
| 105 | 1130 |
| 107 | 1200 |
| 107 | 1250 |
| 110 | 1220 |
$$ b = \frac{SS_{xy}}{SS_{xx}} $$
y = a + bx
Summing over all n observations:
$$ \sum y = n a + b \sum x $$
$$ n a = \sum y - b \sum x $$
$$ a = \frac{\sum y - b \sum x}{n} = \frac{\sum y}{n} - \frac{b \sum x}{n} $$
Here xi = advt, yi = sales, y^ = predicted (fitted) value and yi - y^ = residual.
| advt (xi) | sales (yi) | x2 | xy | yi - ybar | (yi - ybar)2 | y^ = fits | yi - y^ | (yi - y^)2 | y^ - ybar | (y^ - ybar)2 |
|---|---|---|---|---|---|---|---|---|---|---|
| 92 | 930 | 8464 | 85560 | -158.33 | 25069.44 | 902.4 | 27.6 | 761.76 | -185.93 | 34571.20 |
| 94 | 900 | 8836 | 84600 | -188.33 | 35469.44 | 940.54 | -40.54 | 1643.49 | -147.79 | 21842.87 |
| 97 | 1020 | 9409 | 98940 | -68.33 | 4669.44 | 997.75 | 22.25 | 495.06 | -90.58 | 8205.34 |
| 98 | 990 | 9604 | 97020 | -98.33 | 9669.44 | 1016.82 | -26.82 | 719.31 | -71.51 | 5114.16 |
| 100 | 1100 | 10000 | 110000 | 11.67 | 136.11 | 1054.96 | 45.04 | 2028.60 | -33.37 | 1113.78 |
| 102 | 1050 | 10404 | 107100 | -38.33 | 1469.44 | 1093.1 | -43.1 | 1857.61 | 4.77 | 22.72 |
| 104 | 1150 | 10816 | 119600 | 61.67 | 3802.78 | 1131.24 | 18.76 | 351.94 | 42.91 | 1840.98 |
| 105 | 1120 | 11025 | 117600 | 31.67 | 1002.78 | 1150.31 | -30.31 | 918.70 | 61.98 | 3841.11 |
| 105 | 1130 | 11025 | 118650 | 41.67 | 1736.11 | 1150.31 | -20.31 | 412.50 | 61.98 | 3841.11 |
| 107 | 1200 | 11449 | 128400 | 111.67 | 12469.44 | 1188.45 | 11.55 | 133.40 | 100.12 | 10023.35 |
| 107 | 1250 | 11449 | 133750 | 161.67 | 26136.11 | 1188.45 | 61.55 | 3788.40 | 100.12 | 10023.35 |
| 110 | 1220 | 12100 | 134200 | 131.67 | 17336.11 | 1245.66 | -25.66 | 658.44 | 157.33 | 24751.68 |
| Sum: 1221 | 13060 | 124581 | 1335420 | 0.00 | 138966.667 | 13059.99 | 0.01 | 13769.21 | -0.01 | 125191.6 |
$$ \bar{x} = \frac{1221}{12} = 101.75 $$
$$ SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 1335420 - \frac{1221 \times 13060}{12} = 6565 $$
$$ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 124581 - \frac{(1221)^2}{12} = 344.25 $$
$$ b = \frac{SS_{xy}}{SS_{xx}} = \frac{6565}{344.25} = 19.0704 $$
y = a + bx
$$ \sum y = n a + b \sum x $$
$$ n a = \sum y - b \sum x $$
$$ a = \frac{\sum y - b \sum x}{n} = \frac{\sum y}{n} - \frac{b \sum x}{n} = \frac{13060}{12} - \frac{19.0704 \times 1221}{12} = -852.08 $$
Equation of the simple regression line
y = a + bx
y = -852.08 + 19.0704x
for the regression of y on x
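The slope and intercept can be reproduced from the raw data with the sum-of-squares shortcuts. The following Python sketch uses the advertising and sales figures from the table; the variable names are my own.

```python
advt = [92, 94, 97, 98, 100, 102, 104, 105, 105, 107, 107, 110]
sales = [930, 900, 1020, 990, 1100, 1050, 1150, 1120, 1130, 1200, 1250, 1220]

n = len(advt)
sum_x, sum_y = sum(advt), sum(sales)
sum_x2 = sum(x * x for x in advt)
sum_xy = sum(x * y for x, y in zip(advt, sales))

# Shortcut forms of the sums of squares.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b = ss_xy / ss_xx                # slope
a = sum_y / n - b * sum_x / n    # intercept

print(ss_xy, ss_xx, round(b, 4), round(a, 2))
```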
For testing the fit
$y_i$ = recorded value of y in the given data
$\bar{y}$ = mean (average) of y
$\hat{y}$ = predicted value from the regression line
Deviation = $(y_i - \bar{y})$ = difference between the actual value of y and the mean
Residual = $(y_i - \hat{y})$ = gap (error, difference) between the actual value of y and the predicted value calculated from the regression line
Deviation of predicted value from mean = $(\hat{y} - \bar{y})$
a = intercept on the y-axis
b = slope of the regression line
Total sum of squares: $SST = \sum (y_i - \bar{y})^2$
Regression sum of squares: $SSR = \sum (\hat{y} - \bar{y})^2$
Error sum of squares: $SSE = \sum (y_i - \hat{y})^2$
Coefficient of determination:
$$ r^2 = \frac{SSR}{SST} $$
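The three sums of squares can be computed for the advertising and sales data with the sketch below. It refits the line from the raw data, so its SSR differs slightly from the table total of 125191.6, which was built from rounded coefficients; the variable names are my own.

```python
advt = [92, 94, 97, 98, 100, 102, 104, 105, 105, 107, 107, 110]
sales = [930, 900, 1020, 990, 1100, 1050, 1150, 1120, 1130, 1200, 1250, 1220]

n = len(advt)
x_bar, y_bar = sum(advt) / n, sum(sales) / n

# Least-squares slope and intercept from deviations about the means.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(advt, sales))
ss_xx = sum((x - x_bar) ** 2 for x in advt)
b = ss_xy / ss_xx
a = y_bar - b * x_bar

fitted = [a + b * x for x in advt]

sst = sum((y - y_bar) ** 2 for y in sales)                  # total
ssr = sum((f - y_bar) ** 2 for f in fitted)                 # regression
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))      # error

r_squared = ssr / sst
print(round(sst, 2), round(ssr, 2), round(sse, 2), round(r_squared, 4))
```

For a least-squares line, SST = SSR + SSE, so the coefficient of determination can equivalently be written as 1 - SSE/SST.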
Standard error of estimate:
$$ S_{yx} = \sqrt{\frac{SSE}{n - 2}} $$
To determine whether a significant linear relationship exists between the independent variable x and the dependent variable y, we test whether the population slope $\beta$ is zero:
$$ t = \frac{b - \beta}{S_b} $$
where $S_b$ is the standard error of b:
$$ S_b = \frac{S_{yx}}{\sqrt{SS_{xx}}} $$
H0: Slope of the regression line is zero
H1: Slope of the regression line is not zero
Standard error of estimate:
$$ S_{yx} = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (y_i - \hat{y})^2}{n - 2}} = \sqrt{\frac{13769.21}{12 - 2}} = \sqrt{1376.92} = 37.1068 $$
$$ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 124581 - \frac{(1221)^2}{12} = 344.25 $$
Standard error of b:
$$ S_b = \frac{S_{yx}}{\sqrt{SS_{xx}}} $$
Standard error of b:
$$ S_b = \frac{S_{yx}}{\sqrt{SS_{xx}}} $$
$$ t = \frac{b - \beta}{S_b} = \frac{19.07 - 0}{37.1068 / \sqrt{344.25}} = 9.53 $$
As the calculated value of t is more than the table value of t for 12 - 2 = 10 degrees of freedom, the null hypothesis is rejected.
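The slope test can be sketched in Python as follows. The slides do not state a significance level or the table value; the sketch assumes a two-tailed test at the 5% level, for which the standard t-table value at 10 degrees of freedom is 2.228. Variable names are my own.

```python
from math import sqrt

advt = [92, 94, 97, 98, 100, 102, 104, 105, 105, 107, 107, 110]
sales = [930, 900, 1020, 990, 1100, 1050, 1150, 1120, 1130, 1200, 1250, 1220]

n = len(advt)
x_bar, y_bar = sum(advt) / n, sum(sales) / n
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(advt, sales))
ss_xx = sum((x - x_bar) ** 2 for x in advt)
b = ss_xy / ss_xx
a = y_bar - b * x_bar

# Error sum of squares about the fitted line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(advt, sales))

s_yx = sqrt(sse / (n - 2))   # standard error of estimate
s_b = s_yx / sqrt(ss_xx)     # standard error of the slope
t = (b - 0) / s_b            # H0: population slope is zero

# Assumed critical value: two-tailed, 5% level, n - 2 = 10 degrees of freedom.
T_CRITICAL = 2.228
reject_h0 = abs(t) > T_CRITICAL
print(round(s_yx, 4), round(s_b, 4), round(t, 2), reject_h0)
```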
Coefficient of Determination Definition
The Coefficient of Determination, also known as R Squared, is interpreted as the goodness of fit of a regression.
The higher the coefficient of determination, the larger the share of the variance in the dependent variable that is explained by the independent variable.
The coefficient of determination is the overall measure of the usefulness of a regression.
For example, suppose r2 is 0.95. This means that 95% of the variation in the dependent variable is explained by the independent variable. That is a good regression.
The Coefficient of Determination can be calculated as the regression sum of squares, SSR, divided by the total sum of squares, SST:
$$ r^2 = \frac{SSR}{SST} $$
Campus Overview
Ahmedabad: 907/A Uvarshad, Gandhinagar Highway, Ahmedabad 382422.
Kolkata: Infinity Benchmark, 10th Floor, Plot G1, Block EP & GP, Sector V, Salt-Lake, Kolkata 700091.
Mumbai: Goldline Business Centre, Linkway Estate, Next to Chincholi Fire Brigade, Malad (West), Mumbai 400 064.
Thank You