Date post: | 16-Jul-2015 |
Category: |
Education |
Upload: | mary-grace |
SIMPLE REGRESSION AND CORRELATION
Prepared by: WET SOCIETY :D
DEFINITION OF TERMS
CORRELATION: The term correlation is used when:
1) Both variables are random variables,
2) The end goal is simply to find a number that expresses the relation between the
variables
REGRESSION: The term regression is used when:
1) One of the variables is a fixed variable,
2) The end goal is to use the measure of relation to predict values of the random
variable based on values of the fixed variable
WET SOCIETY \m/
CORRELATION
Correlations range from -1
(perfect negative relation)
through 0 (no relation) to +1
(perfect positive relation)
CORRELATION = -1.0
CORRELATION = 0.0
CORRELATION = +1.0
CALCULATING THE COVARIANCE:
The first step in calculating a correlation coefficient is to quantify the covariance between
two variables:
covXY = Σ(X - X̄)(Y - Ȳ) / (N - 1)
Alternative formula:
covXY = (ΣXY - (ΣX)(ΣY)/N) / (N - 1)
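A minimal Python sketch of the two ways of computing a covariance. It assumes the usual sample covariance (N - 1 in the denominator) and borrows the 12 height and weight values listed in Example 2 of these slides. The function names are illustrative, not part of the original material.

```python
# Heights (X) and weights (Y) of the 12 subjects listed in Example 2.
heights = [69, 61, 68, 66, 66, 63, 72, 62, 62, 67, 66, 63]
weights = [108, 130, 135, 135, 120, 115, 150, 105, 115, 145, 132, 120]


def covariance_definitional(xs, ys):
    """Sum of cross-products of deviations from the means, over N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1)


def covariance_computational(xs, ys):
    """Same quantity from raw sums: (sum(XY) - sum(X) * sum(Y) / N) / (N - 1)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


cov_def = covariance_definitional(heights, weights)
cov_comp = covariance_computational(heights, weights)
print(round(cov_def, 4), round(cov_comp, 4))
```

Both functions should print the same value, which is the point of the alternative formula: it avoids computing each deviation from the mean.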
THE PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT (R)
The Pearson Product-Moment Correlation Coefficient, r, is computed simply by standardizing the covariance estimate as follows:
r = covXY / (sX · sY)
where sX and sY are the standard deviations of X and Y.
This results in r values ranging from -1.0 to +1.0, as discussed earlier
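Standardizing the covariance can be sketched in Python as follows. This assumes sample standard deviations (N - 1 in the denominator) and uses the five (X, Y) pairs from the regression example later in these slides. The helper names are hypothetical.

```python
import math

# The five (X, Y) pairs from the regression example in these slides.
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4, 4.1]


def sample_sd(values):
    """Sample standard deviation (N - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def pearson_r(a, b):
    """Covariance divided by the product of the two standard deviations."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)
    return cov / (sample_sd(a) * sample_sd(b))


r = pearson_r(xs, ys)
print(round(r, 2))
```

Because the covariance is divided by both standard deviations, the result is unit-free and cannot fall outside the -1.0 to +1.0 range.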
There is another way to represent this formula:
r = SPXY / √(SSX · SSY)
where SPXY is the sum of the products of X and Y, SSX is the
sum of squares for X, and SSY is the sum of squares for Y
SUMS OF SQUARES AND SUMS OF PRODUCTS
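As a hedged sketch, the standard computational definitions of these quantities are assumed below. They are consistent with the raw sums reported in Example 2 (ΣX, ΣY, ΣX², ΣY², ΣXY), but the exact notation is an assumption.

```latex
\begin{aligned}
SS_X    &= \sum X^2 - \frac{\left(\sum X\right)^2}{N} \\
SS_Y    &= \sum Y^2 - \frac{\left(\sum Y\right)^2}{N} \\
SP_{XY} &= \sum XY - \frac{\left(\sum X\right)\left(\sum Y\right)}{N} \\
r       &= \frac{SP_{XY}}{\sqrt{SS_X \, SS_Y}}
\end{aligned}
```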
ADJUSTED R
EXAMPLE 1
In this class, height and ratings of physical attractiveness vary
across individuals. What is the correlation between height and
these ratings in our class?
Subject  Height  Phy (attractiveness rating)
1        69      7
2        61      8
3        68      6
4        66      5
5        66      8
....
48       71      10
We can create a scatter plot of these data by simply plotting
one variable against the other:
correlation = 0.146235 or +0.15
EXAMPLE 2
Consider the height and weight variables from our class dataset ...
Subject  Height (X)  Weight (Y)
1        69          108
2        61          130
3        68          135
4        66          135
5        66          120
6        63          115
7        72          150
8        62          105
9        62          115
10       67          145
11       66          132
12       63          120
Mean     65.42       125.83
Sum(X) = 785      Sum(Y) = 1510
Sum(X²) = 51473   Sum(Y²) = 192238
Sum(XY) = 99064
So, based on the 12 subjects we examined,
the correlation between height and weight
was +0.55
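The +0.55 figure can be checked from the sums reported for Example 2. The sketch below assumes the usual computational forms for SSX, SSY and SPXY. The variable names are illustrative.

```python
import math

# Summary values reported for the 12 subjects in Example 2.
n = 12
sum_x, sum_y = 785, 1510
sum_x2, sum_y2 = 51473, 192238
sum_xy = 99064

# Sums of squares and sum of products (assumed computational forms).
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
sp_xy = sum_xy - sum_x * sum_y / n

r = sp_xy / math.sqrt(ss_x * ss_y)
print(round(ss_x, 2), round(ss_y, 2), round(sp_xy, 2), round(r, 2))
```

Rounded to two decimals, r comes out at 0.55, matching the value stated for the 12 subjects.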
Unfortunately, the r we measure using our sample
is not an unbiased estimator of the population
correlation coefficient (rho)
We can correct for this using the adjusted
correlation coefficient which is computed as
follows:
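The formula itself is not given in the text here. A commonly used adjustment, assumed for this sketch, is r_adj = √(1 - (1 - r²)(N - 1)/(N - 2)). The function name and the clamping at zero are illustrative choices, not taken from the slides.

```python
import math


def adjusted_r(r, n):
    """Adjusted correlation under the ASSUMED formula
    sqrt(1 - (1 - r^2) * (N - 1) / (N - 2))."""
    shrunk = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # For small r and small N the term can be negative; treat that as zero.
    return math.sqrt(max(shrunk, 0.0))


# r = +0.55 from the 12 subjects in Example 2.
r_adj = adjusted_r(0.55, 12)
print(round(r_adj, 2))
```

Under this assumed formula the adjusted value is somewhat smaller than the sample r, and the gap shrinks as N grows.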
THE REGRESSION LINE
The regression line represents
the best prediction of the
variable on the Y axis for each
point along the X axis.
COMPUTING THE REGRESSION LINE
Ŷ = bX + a
where Ŷ = the predicted value of Y
b = the slope of the line (the change in Y as a function of X)
X = the various values of X
a = the intercept of the line (the point where the line hits the Y
axis)
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
Intercept(a) = (ΣY - b(ΣX)) / N
where
X and Y are the variables.
N = Number of values or elements
X = First Score
Y = Second Score
ΣXY = Sum of the products of First and Second Scores
ΣX = Sum of First Scores
ΣY = Sum of Second Scores
ΣX² = Sum of squared First Scores
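The slope and intercept formulas translate directly into a short Python function. This is a minimal sketch: the function name and the test points are hypothetical, chosen to lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def slope_intercept(xs, ys):
    """Return (b, a) using the sum-based slope and intercept formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return b, a


# Hypothetical points on the line y = 2x + 1.
b, a = slope_intercept([0, 1, 2, 3], [1, 3, 5, 7])
print(b, a)
```

Note that the denominator is zero when every X value is the same, so a real implementation would want to guard against that case.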
REGRESSION EXAMPLE
To find the simple linear regression equation for the data below, we first find the slope and the intercept and then use them to form the regression equation.
X Values Y Values
60 3.1
61 3.6
62 3.8
63 4
65 4.1
Step 1: Count the number of values.
N = 5
Step 2: Find XY and X² for each pair of values.
See the table below.
X Value  Y Value  X*Y                 X*X
60       3.1      60 * 3.1 = 186      60 * 60 = 3600
61       3.6      61 * 3.6 = 219.6    61 * 61 = 3721
62       3.8      62 * 3.8 = 235.6    62 * 62 = 3844
63       4        63 * 4 = 252        63 * 63 = 3969
65       4.1      65 * 4.1 = 266.5    65 * 65 = 4225
Step 3: Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311
ΣY = 18.6
ΣXY = 1159.7
ΣX² = 19359
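Steps 1 to 3 can be reproduced with a few lines of Python. This is a sketch using the five (X, Y) pairs from this example. The variable names are illustrative.

```python
# The five (X, Y) pairs from the regression example.
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4, 4.1]

n = len(xs)                                   # Step 1: count the values
sum_x = sum(xs)                               # sum of X
sum_y = sum(ys)                               # sum of Y
sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of X*Y
sum_x2 = sum(x * x for x in xs)               # sum of X squared

# Round the float sums only for display.
print(n, sum_x, round(sum_y, 1), round(sum_xy, 1), sum_x2)
```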
Step 4: Substitute into the slope formula given above.
Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
= ((5)(1159.7) - (311)(18.6)) / ((5)(19359) - (311)²)
= (5798.5 - 5784.6) / (96795 - 96721)
= 13.9 / 74
≈ 0.19
Step 5: Substitute into the intercept formula given above.
Intercept(a) = (ΣY - b(ΣX)) / N
= (18.6 - 0.19(311))/5
= (18.6 - 59.09)/5
= -40.49/5
= -8.098
Step 6: Substitute these values into the regression equation.
Regression Equation(y) = a + bx
= -8.098 + 0.19x.
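The worked steps round the slope to 0.19 before computing the intercept. The sketch below repeats the calculation from the Step 3 sums both ways, which suggests how much that early rounding moves the intercept. The variable names are illustrative.

```python
# Sums from Step 3 of the worked example.
n, sum_x, sum_y, sum_xy, sum_x2 = 5, 311, 18.6, 1159.7, 19359

# Slope and intercept carried at full precision.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# The worked example's approach: round the slope first, then compute the intercept.
b_rounded = round(b, 2)
a_from_rounded = (sum_y - b_rounded * sum_x) / n

print(round(b, 4), round(a, 3))
print(b_rounded, round(a_from_rounded, 3))
```

With the unrounded slope of about 0.1878, the intercept is about -7.964 rather than -8.098. It is generally safer to keep full precision until the final step.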
Suppose we want to know the approximate y
value for x = 64. We can
substitute that value into the equation above.
Regression Equation(y) = a + bx
= -8.098 + 0.19(64).
= -8.098 + 12.16
= 4.06
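The prediction step can be sketched as a small Python function. The name predict is hypothetical. The second call uses unrounded coefficients derived from the Step 3 sums, as a cross-check on the rounded ones.

```python
def predict(x, a, b):
    """Predicted y from the regression equation y = a + b * x."""
    return a + b * x


# Coefficients as rounded in the worked example.
y_slides = predict(64, a=-8.098, b=0.19)

# Unrounded coefficients from the Step 3 sums (13.9 / 74 is the slope).
b_exact = 13.9 / 74
a_exact = (18.6 - b_exact * 311) / 5
y_exact = predict(64, a_exact, b_exact)

print(round(y_slides, 2), round(y_exact, 2))
```

Both versions give about 4.06 at x = 64, which lies inside the observed X range of 60 to 65. Predictions far outside that range would be less trustworthy.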
This example shows how to find the relationship
between two variables by calculating the
regression line using the steps above.