Linear Regression
A method of calculating a linear equation for the relationship between two or more variables using multiple data points.
Linear Regression
General form: Y = a + bX + e
“Y” is the dependent variable
“X” is the explanatory variable
“a” is the intercept parameter
“b” is the slope parameter
“e” is an error term or residual
Regression Results
Y and X come from data
A computer program calculates estimates of a and b
e is the difference between a + bX and the actual value of Y corresponding to X
OLS estimates of a and b minimize the sum of the squared residuals, ∑e^2
“OLS” is Ordinary Least Squares
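As a rough illustration of what the computer program does, the sketch below computes the OLS estimates of a and b for one explanatory variable using the standard closed-form formulas. The data points and the function names `ols_fit` and `sum_squared_residuals` are made up for illustration; they do not come from the electricity data.

```python
def ols_fit(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals for Y = a + bX + e."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = y_bar - b * x_bar
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of e^2, where e = Y - (a + bX)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data points, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = ols_fit(xs, ys)
sse = sum_squared_residuals(xs, ys, a, b)
print(f"a = {a:.3f}, b = {b:.3f}, sum of squared residuals = {sse:.4f}")
```

Moving a or b away from the fitted values in either direction raises the sum of squared residuals, which is what "least squares" means.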
The Regression Line and the Residual
Electricity Demand Example
Data for residential customers in U.S. states
Y is millions of kilowatt-hours sold
X is population
Other data include per capita income, price of electricity (cents/kWh), and price of natural gas
Variable Means 2004
                    Milkwh       Pop          Pkwh         Pgas         Income
Mean                25364.4902   5756576.549  8.997647059  11.39823529  32344.62745
Standard Deviation  25605.09342  6502776.218  2.411107288  3.107310545  5307.123526
Minimum             1834         506529       6.13         4.88         24379
Maximum             120330       35893799     18.13        27.15        52101
Count               51           51           51           51           51
Variable Means 2012
               Milkwh     Pop         Pkwh    Pgas     Income
Mean           26,951     6,198,605   10.35   12.14    42,492
Standard Dev.  27,327.10  7,036,162   4.1633  6.40641  7,605.86
Minimum        2,003      582,658     6.9     7.43     33,073
Maximum        137,412    38,332,521  34      52.86    74,710
Observations   51         51          51      51       51
Excel Regression: Milkwh
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.868539405
R Square           0.754360698
Adjusted R Square  0.749347651
Standard Error     12819.23929
Observations       51

ANOVA
            df  SS           MS        F         Significance F
Regression  1   24728728564  2.47E+10  150.4795  1.49762E-16
Residual    49  8052311901   1.64E+08
Total       50  32781040465

           Coefficients  Standard Error  t Stat    P-value
Intercept  5677.407437   2407.873624     2.357851  0.022417
Pop        0.003419929   0.000278791     12.26701  1.5E-16
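Several figures in the Excel output are tied together by simple arithmetic. The sketch below checks three of them using numbers copied from the output above. The variable names are illustrative, and the relationships used (R Square = SS Regression / SS Total, F = MS Regression / MS Residual, t Stat = Coefficient / Standard Error) are the standard definitions rather than anything stated on the slide.

```python
# Values copied from the Excel output for the regression of Milkwh on Pop.
ss_regression = 24728728564
ss_residual = 8052311901
ss_total = 32781040465
df_regression = 1
df_residual = 49

coef_pop = 0.003419929
se_pop = 0.000278791

# R Square: share of total variation explained by the regression.
r_square = ss_regression / ss_total

# F: mean square of the regression over mean square of the residual.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# t Stat: estimated coefficient divided by its standard error.
t_pop = coef_pop / se_pop

print(f"R Square = {r_square:.6f}")
print(f"F        = {f_stat:.4f}")
print(f"t (Pop)  = {t_pop:.4f}")
```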
Actual vs. Predicted Milkwh
Useful Statistics and Tests
t-statistic: is the estimated coefficient significantly different from zero?
Coefficient of determination, or R-square: % of variation explained
F-statistic: statistical significance of the entire regression equation; OR is the R-square significantly different from zero?
Find these on the Excel regression output.
Confidence and Significance Levels
99% Confidence = 1% Significance: P-value of 0.01 or less
95% Confidence = 5% Significance: P-value of 0.05 or less
90% Confidence = 10% Significance: P-value of 0.10 or less
Smaller Significance Levels Are Better
Find P-values for t and F statistics on the Excel regression output
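The three thresholds can be applied mechanically to any P-value read off the output. A minimal sketch follows; the function name `significance_label` is hypothetical, and the P-values are the ones reported for the regression of Milkwh on Pop.

```python
def significance_label(p_value):
    """Map a P-value to the strictest of the three levels it meets."""
    if p_value <= 0.01:
        return "significant at the 1% level (99% confidence)"
    if p_value <= 0.05:
        return "significant at the 5% level (95% confidence)"
    if p_value <= 0.10:
        return "significant at the 10% level (90% confidence)"
    return "not significant at the 10% level"


# P-values taken from the Excel output for the regression of Milkwh on Pop.
for name, p in [("Intercept", 0.022417), ("Pop", 1.5e-16)]:
    print(f"{name}: P-value {p:g} is {significance_label(p)}")
```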
Multiple & Nonlinear Regression
Multiple Regression
  Y = a + bX + cW + dZ
Nonlinear Regression
  Quadratic: Y = a + bX + cX^2
  Log-Linear: Y = aX^b Z^c
  Or ln Y = (ln a) + b(ln X) + c(ln Z)
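Taking natural logs turns the multiplicative log-linear form into an equation that is linear in ln X and ln Z, which is why it can be estimated with the same regression tool. The sketch below checks the two forms against each other at one point. The parameter values and the data point are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical parameter values and data point, chosen only for illustration.
a, b, c = 2.0, 1.5, -0.5
X, Z = 10.0, 4.0

# Original multiplicative form: Y = a * X^b * Z^c
Y = a * X**b * Z**c

# Log-linear form: ln Y = ln a + b ln X + c ln Z
ln_Y = math.log(a) + b * math.log(X) + c * math.log(Z)

print(f"ln of multiplicative form: {math.log(Y):.6f}")
print(f"log-linear form:           {ln_Y:.6f}")
```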
Multiple Regression: Milkwh
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.886907555
R Square           0.786605011
Adjusted R Square  0.777713553
Standard Error     12072.10091
Observations       51

ANOVA
            df  SS           MS        F         Significance F
Regression  2   25785730685  1.29E+10  88.4675   7.95099E-17
Residual    48  6995309780   1.46E+08
Total       50  32781040465

           Coefficients  Standard Error  t Stat    P-value
Intercept  22354.84026   6594.71082      3.389814  0.001407
Pop        0.003568436   0.000268271     13.30163  1.02E-17
Pkwh       -1948.545245  723.5281324     -2.69312  0.009721
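The slides do not define Adjusted R Square. Assuming Excel uses the usual formula, 1 - (1 - R^2)(n - 1)/(n - k - 1) with n observations and k explanatory variables, the sketch below reproduces the reported values for the one-variable and two-variable regressions. The function name `adjusted_r_square` is hypothetical.

```python
def adjusted_r_square(r_square, n_obs, n_explanatory):
    """Usual adjustment: penalizes R Square for each added explanatory variable."""
    return 1 - (1 - r_square) * (n_obs - 1) / (n_obs - n_explanatory - 1)


# R Square values copied from the two Excel outputs; 51 observations in each.
one_variable = adjusted_r_square(0.754360698, 51, 1)    # Pop only
two_variables = adjusted_r_square(0.786605011, 51, 2)   # Pop and Pkwh

print(f"Pop only:     {one_variable:.6f}")
print(f"Pop and Pkwh: {two_variables:.6f}")
```

Because the adjustment grows with the number of explanatory variables, Adjusted R Square is the fairer figure when comparing equations with different numbers of variables.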
Quadratic Regression: Milkwh
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.955685683
R Square           0.913335125
Adjusted R Square  0.89682753
Standard Error     8224.476787
Observations       51

ANOVA
            df  SS           MS        F         Significance F
Regression  8   29940075691  3.74E+09  55.32818  7.73456E-20
Residual    42  2840964773   67642018
Total       50  32781040465

           Coefficients  Standard Error  t Stat    P-value
Intercept  -22436.17995  40605.45259     -0.55254  0.583507
Pop        0.005502909   0.000520656     10.56919  2.09E-13
Popsq      -6.71958E-11  1.69781E-11     -3.9578   0.000286
Pkwh       14663.0069    4863.649065     3.014816  0.004349
Pkwhsq     -850.7008024  242.651212      -3.50586  0.001097
PGas       -2752.477572  2070.841636     -1.32916  0.190971
PGsq       196.1586297   77.84468361     2.519872  0.015629
Income     -1.417855478  2.096191368     -0.6764   0.502497
Incsq      1.19331E-05   2.84886E-05     0.418872  0.677444
Log Linear Regression: LnMilkwh
SUMMARY OUTPUT

Regression Statistics
Multiple R         0.9883461
R Square           0.976828014
Adjusted R Square  0.974813058
Standard Error     0.175309427
Observations       51

ANOVA
            df  SS           MS        F         Significance F
Regression  4   59.59683702  14.89921  484.7889  5.81585E-37
Residual    46  1.413736172  0.030733
Total       50  61.01057319

           Coefficients  Standard Error  t Stat    P-value
Intercept  0.033793671   1.792943436     0.018848  0.985044
LnPop      1.01392625    0.023922385     42.384    1.63E-38
LnPkwh     -0.9191971    0.123573131     -7.43849  2.01E-09
LnPgas     0.416254503   0.113418083     3.670089  0.000629
LnInc      -0.45117695   0.179298162     -2.51635  0.015409
Demand Regression Project
DATA: ElecDemandData2012.xls under Project Materials on D2L
1. Using the data file above, run a linear regression of
   Dependent Variable: Milkwh
   Explanatory Variables: Pop, Pkwh, PGas, Income.
   Which coefficients (including the constant) are statistically significant at the 10% level or better?
   Which are not significant?
   How much of the variation in the dependent variable is explained by the estimated equation?
   Is the equation as a whole statistically significant? At what level?
Finding the Marginal Revenue Equation: Overview
Evaluate estimated demand at means of all explanatory variables except price
Calculate average effect of non-price variables to get demand equation in this form
  Q = A - b(P)
Rearrange to find Inverse Demand equation
  P = (A/b) - (1/b)Q
MR has twice the slope of inverse demand
  MR = (A/b) - (2/b)Q
The end result is an equation, not a number
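The "twice the slope" rule follows from differentiating total revenue with respect to quantity. The step is not shown on the slide; a short derivation in the slide's notation, with TR introduced here as a label for total revenue, might read:

```latex
\begin{align*}
TR &= P \cdot Q = \left(\frac{A}{b} - \frac{1}{b}Q\right) Q = \frac{A}{b}Q - \frac{1}{b}Q^2 \\
MR &= \frac{d\,TR}{dQ} = \frac{A}{b} - \frac{2}{b}Q
\end{align*}
```

The intercept A/b is unchanged, and only the coefficient on Q doubles.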
Finding the Marginal Revenue Equation: Example
Write your regression equation in this form
  Milkwh = 11,000 - 3600Pkwh + 0.0041Pop + 2150PGas
  11,000 is the intercept or constant coefficient
  -3600, 0.0041, and 2150 are estimated coefficients
  These are made-up numbers for this example
Use the mean values of the non-electricity-price variables
  Pop = 5,756,577
  PGas = 11.4
Substitute into your regression equation and simplify
  Milkwh = 11,000 - 3600Pkwh + 0.0041(5,756,577) + 2150(11.4)
  Milkwh = [11,000 + 23,601.97 + 24,510] - 3600(Pkwh)
  Milkwh = 59,111.97 - 3600(Pkwh)
  This is Q = A - bP from the previous slide
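The substitution step can be sketched in a few lines. The coefficients are the slide's made-up numbers, not real estimates, and the variable names are illustrative.

```python
# Made-up coefficients from the slide's example (not real estimates).
intercept = 11_000
coef_pkwh = -3600
coef_pop = 0.0041
coef_pgas = 2150

# Mean values of the non-electricity-price variables, as given on the slide.
mean_pop = 5_756_577
mean_pgas = 11.4

# Fold the intercept and the non-price terms into a single constant A.
A = intercept + coef_pop * mean_pop + coef_pgas * mean_pgas
# b is the positive size of the price coefficient, so that Q = A - b*P.
b = -coef_pkwh

print(f"Milkwh = {A:,.2f} - {b}(Pkwh)")
```

With your own regression, replace the coefficients and means and include every non-price variable in the sum for A.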
Finding Marginal Revenue Example, Continued
Milkwh = 59,111.97 - 3600(Pkwh)
  From end of previous slide
Rearrange to find Inverse Demand
  P = (A/b) - (1/b)Q = (A - Q)/b
  Pkwh = (59,111.97 - Milkwh)/(3600)
  Pkwh = 16.42 - 0.00028(Milkwh)
  This is the inverse demand equation
Marginal Revenue has twice the slope
  MR = 16.42 - 0.00056(Milkwh)
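The rearrangement from demand to inverse demand to marginal revenue can be sketched as a small function. The name `inverse_demand_and_mr` is hypothetical, and the inputs are the made-up A and b from the example.

```python
def inverse_demand_and_mr(A, b):
    """From Q = A - b*P, return (intercept, inverse-demand slope, MR slope)."""
    intercept = A / b          # price at which quantity demanded is zero
    slope = -1 / b             # inverse demand: P = A/b - (1/b)Q
    mr_slope = 2 * slope       # marginal revenue: MR = A/b - (2/b)Q
    return intercept, slope, mr_slope


# A and b from the slide's made-up example: Milkwh = 59,111.97 - 3600(Pkwh).
intercept, slope, mr_slope = inverse_demand_and_mr(59_111.97, 3600)

print(f"Pkwh = {intercept:.2f} - {abs(slope):.5f}(Milkwh)")
print(f"MR   = {intercept:.2f} - {abs(mr_slope):.5f}(Milkwh)")
```

The result is a pair of equations in Milkwh, consistent with the point that the end result is an equation, not a number.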