regression assumption by Ammara Aftab

REGRESSION ANALYSIS AND ITS ASSUMPTIONS

AMMARA AFTAB ammara.aftab63@gmail.com

MSC (FINAL) ECONOMETRICS

THE SHINING STAR FOR STATISTICS… MY IDEALS…

REGRESSION seems like: MSC (FINAL) OF UOK DEPENDS ON QUALIFIED ECONOMIST MR. ZOHAIB AZIZ

SIR ZOHAIB

By Ammara Aftab

INTRODUCTION

HISTORICAL ORIGIN OF THE TERM REGRESSION

Historical ORIGIN BY FRANCIS GALTON

GALTON'S law of universal regression was confirmed by his friend KARL PEARSON

KARL PEARSON

Galton's universal regression law

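Galton found that, although there was a tendency for tall parents to have tall children and for short parents to have short children, the average height of children born of parents of a given height tended to move, or "regress", toward the average height in the population as a whole.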

KARL PEARSON: He is speaking in the average sense. The average height of the sons of tall fathers (not the height of a single son, which may be above or below his father's) is less than the fathers' height, which means a tendency toward the middle.

Similarly, the average height of the sons of short fathers (again, not the height of a single son) is greater than the fathers' height, which also means a tendency toward the middle.

Conclusion: Tall parents have tall children, but the average height of their children will be less than their own; similarly, the children of short parents are on average taller than their parents. That is why Karl Pearson's findings endorse Galton's theory.

MODERN INTERPRETATION OF REGRESSION

HISTORICAL

MODERN

Reconsider Galton's law of universal regression. Galton was interested in finding out why there was a stability in the distribution of heights in a population.

But in the modern view our concern is not with this explanation but rather with finding out how the average height of sons changes, given the fathers' height. In other words, our concern is with predicting the average height of sons knowing the height of their fathers.

EXAMPLE

How to use the regression data

PUT THE REGRESSION DATA ON EXCEL SHEET

The slope indicates the steepness of a line and the intercept indicates the location where it intersects an axis.
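As a rough illustration of slope and intercept, the sketch below fits a least-squares line in plain Python. The data and the helper name `fit_line` are hypothetical and are not taken from the slides; the points were chosen to lie exactly on y = 1 + 2x so the result is easy to check by hand. Excel's SLOPE and INTERCEPT worksheet functions compute the same two quantities.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = b1 + b2 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx                      # steepness of the line
    intercept = mean_y - slope * mean_x    # value of y where x = 0
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

intercept, slope = fit_line(x, y)
print(intercept, slope)  # 1.0 2.0
```

The slope says how much Y changes for a one-unit change in X, and the intercept is where the line crosses the Y axis.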

GRAPHICAL REPRESENTATION

REGRESSION ASSUMPTIONS

WHY ARE THE ASSUMPTIONS NEEDED?

LOOK AT THE PRF (population regression function): Yi = β1 + β2 Xi + ui

It shows that Yi depends on both Xi and ui. Therefore, unless we are specific about how Xi and ui are created or generated, there is no way we can make any statistical inference about the Yi and also, as we shall see, about β1 and β2. Thus, the assumptions made about the Xi variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.

THE GAUSSIAN, OR CLASSICAL, LINEAR REGRESSION MODEL (CLRM) ASSUMPTIONS

ASSUMPTION # 01

LINEAR REGRESSION MODEL: The regression model is linear with respect to the parameters.

1) Yi = β1 + β2 Xi + ui
In the above equation the model is linear with respect to the parameters.

2) Yi = β1 + (β2^2) Xi + ui
In the above equation the model is non-linear with respect to the parameters.

Linear with respect to (PARAMETER)

• Yi = β1 + β2 Xi + ui

• Yi = β1 + β2 Xi^2 + ui

Non-linear with respect to (PARAMETER)

• Yi = β1 + (β2^2) Xi + ui

β is the parameter. If a β is raised to a power other than one, as β2^2 is here, the model is non-linear with respect to the parameters; if not, it is in linear form, even when X itself is squared.
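To see why a squared X still leaves the model linear in the parameters, the sketch below treats Z = X^2 as the regressor and fits an ordinary straight line of Y on Z. The data and the helper name `fit_line` are hypothetical; the points were generated from Y = 1 + 2 X^2 with no disturbance so that the recovered β1 and β2 can be checked by hand.

```python
def fit_line(z, y):
    """Least-squares (intercept, slope) for y = b1 + b2 * z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    b2 = szy / szz
    b1 = mean_y - b2 * mean_z
    return b1, b2


# Hypothetical data from Y = 1 + 2 * X^2 (no disturbance term).
x = [0, 1, 2, 3]
y = [1, 3, 9, 19]

# The model is non-linear in X but linear in beta1 and beta2,
# so squaring X first turns it into an ordinary linear regression.
z = [xi ** 2 for xi in x]
b1, b2 = fit_line(z, y)
print(b1, b2)  # 1.0 2.0
```

No such transformation of X rescues Yi = β1 + (β2^2) Xi + ui, because there the non-linearity sits in the parameter itself.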

Simple linear regression describes the linear relationship between a predictor variable, plotted on the x-axis, and a response variable, plotted on the y-axis.

Independent Variable (X)

ASSUMPTION # 02

X values are fixed in repeated sampling:

Values taken by the regressor X are considered fixed in repeated samples. More technically, X is assumed to be nonstochastic.

INCOME (X)   20$   40$   60$   80$   100$

JAPAN        18    23    47    78    89

CHINA        09    28    37    34    34

USA          16    35    48    45    67

RUSSIA       17    38    50    23    69

INCOME & CONSUMPTION

Here X (INCOME) is the fixed variable and Y (CONSUMPTION) depends on X.

The X variable is measured without error

X is fixed and Y is dependent

ASSUMPTION # 03

Zero mean value of disturbance ui. Given the value of X, the mean, or expected, value of the random disturbance term ui is zero. Technically, the conditional mean value of ui is zero. Symbolically, we have

E(ui | Xi) = 0

EXAMPLE: If X = 3, 6, 9, then the mean = 6. Summing the deviations (mean − X) gives (6−3) + (6−6) + (6−9) = 0, so the 1st moment about the mean is always zero.
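The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the sample analogy used on the slide (deviations from a mean sum to zero); the assumption E(ui | Xi) = 0 itself concerns the unobserved disturbances and cannot be verified this way. The variable names are illustrative.

```python
x = [3, 6, 9]

mean_x = sum(x) / len(x)
# Deviation of each value from the mean, written as (mean - X) as on the slide.
deviations = [mean_x - xi for xi in x]
# First moment about the mean: the average deviation.
first_moment = sum(deviations) / len(x)

print(mean_x)        # 6.0
print(deviations)    # [3.0, 0.0, -3.0]
print(first_moment)  # 0.0
```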

Mean = 0, variance = constant

ASSUMPTION # 04

Homoscedasticity or equal variance of ui. Given the value of X, the variance of ui is the same for all observations. That is, the conditional variances of ui are identical. Symbolically, we have

var(ui | Xi) = E[ui − E(ui | Xi)]^2
             = E(ui^2 | Xi)    (because of Assumption 3)
             = σ^2

Homo = same; scedasticity = spread
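A small simulation can make "same spread" concrete. The sketch below draws disturbances at each fixed X level with one common standard deviation and then compares the sample mean and variance across levels. The value SIGMA = 2.0, the number of draws, and the use of a normal distribution are all assumptions made for illustration; the X levels reuse the income values from the earlier table.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

SIGMA = 2.0                       # assumed standard deviation of u (hypothetical)
x_levels = [20, 40, 60, 80, 100]  # fixed X values
draws_per_level = 20000


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


summary = {}
for x in x_levels:
    # The spread of u does not depend on x: that is homoscedasticity.
    u = [random.gauss(0.0, SIGMA) for _ in range(draws_per_level)]
    summary[x] = (mean(u), variance(u))

for x, (m, v) in summary.items():
    print(f"X = {x:>3}: mean(u) = {m:+.3f}, var(u) = {v:.3f}")
```

At every X level the sample mean should be close to 0 and the sample variance close to SIGMA^2 = 4. Under heteroscedasticity the variance would instead change with X.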

ASSUMPTION # 05

No autocorrelation between the disturbances. Given any two X values, Xi and Xj (i ≠ j), the correlation between any two ui and uj (i ≠ j) is zero.

Symbolically,

cov(ui, uj | Xi, Xj) = E{[ui − E(ui)] | Xi}{[uj − E(uj)] | Xj}
                     = E(ui | Xi)(uj | Xj)    (why? because E(ui) = E(uj) = 0 by Assumption 3)
                     = 0

No Auto-Correlation

Positive correlation

Negative correlation

No correlation
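One common way to look for the positive and negative patterns named above is the lag-1 sample autocorrelation, which compares each value with the one before it. The sketch below is a simplified illustration on made-up series, not a formal test; the function name and the data are hypothetical, and in practice the calculation would be applied to regression residuals.

```python
def lag1_autocorrelation(u):
    """Sample correlation between u[t] and u[t-1]."""
    n = len(u)
    mean_u = sum(u) / n
    # Cross-products of neighbouring deviations from the mean.
    num = sum((u[t] - mean_u) * (u[t - 1] - mean_u) for t in range(1, n))
    den = sum((ut - mean_u) ** 2 for ut in u)
    return num / den


# Hypothetical series: neighbours move together (positive pattern).
trending = [1, 2, 3, 4, 5, 6]
# Hypothetical series: neighbours flip sign each time (negative pattern).
alternating = [1, -1, 1, -1, 1, -1]

r_trending = lag1_autocorrelation(trending)
r_alternating = lag1_autocorrelation(alternating)
print(r_trending)     # 0.5
print(r_alternating)  # negative
```

A value near zero is what the no-autocorrelation assumption leads us to expect; clearly positive or negative values suggest the disturbances are related across observations.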

ASSUMPTION # 06

Zero covariance between ui and Xi, or E(ui Xi) = 0.

We assume that X and u have separate effects on Y. But if X and u are correlated, it is not possible to assess their individual effects on Y; if they are positively correlated, for example, u rises whenever X rises (X ↑, u ↑).

Formally,

cov(ui, Xi) = E[ui − E(ui)][Xi − E(Xi)]
            = E[ui (Xi − E(Xi))]        since E(ui) = 0
            = E(ui Xi) − E(Xi) E(ui)    since E(Xi) is nonstochastic (NON RANDOM)
            = E(ui Xi)                  since E(ui) = 0
            = 0                         by assumption

ASSUMPTION # 07

The number of observations n must be greater than the number of parameters to be estimated.

Example: Yi = β1 + β2 X1i + β3 X2i + ui
As you can see, we have 3 parameters here, so the number of observations must be greater than 3.

Alternatively, the number of observations n must be greater than the number of regressors. In the two-variable model Yi = β1 + β2 Xi + ui, for instance, a single observation (one pair of Y and X) gives no way to estimate the two unknowns, β1 and β2. We need at least two pairs of observations to estimate the two unknowns.

ASSUMPTION # 08

Variability in X values.

The X values in a given sample must not all be the same. If all the X values are the same, the variance of X is equal to zero and the regression cannot be run.

Technically, var(X) must be a finite positive number.

If we have the same values of X:

X = 2, 2, 2

Then, according to the variance formula,

σ^2 = Σ(x − µ)^2 / n
σ^2 = [(2−2)^2 + (2−2)^2 + (2−2)^2] / 3
σ^2 = 0

As the variance is 0, the regression cannot be run.

Positive finite means:

• The variance cannot be negative, because it is built from squared deviations.
• The variance must not be 0, or the regression cannot be run.
• The variance should be positive and finite.
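The reason the regression cannot be run is that the slope formula divides by the sum of squared deviations of X, which is zero when every X is the same. The sketch below shows this with the values X = 2, 2, 2; the Y values and the function names are hypothetical and serve only as an illustration.

```python
def population_variance(x):
    """Variance with divisor n, as in the formula above."""
    mean_x = sum(x) / len(x)
    return sum((xi - mean_x) ** 2 for xi in x) / len(x)


def slope(x, y):
    """Least-squares slope; fails when X has no variability."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # The slope would be (something) / 0, which is undefined.
        raise ValueError("all X values are equal: var(X) = 0, slope undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


x = [2, 2, 2]
y = [1, 2, 3]  # hypothetical Y values

print(population_variance(x))  # 0.0

try:
    slope(x, y)
except ValueError as err:
    print("Regression cannot be run:", err)
```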

ASSUMPTION # 09

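The regression model is correctly specified.

Alternatively, there is no specification bias or error in the model used in empirical analysis.

For example, these are two different specifications of the same relationship:

Yi = α1 + α2 Xi + ui
Yi = β1 + β2 (1/Xi) + ui

If the second is the true model, fitting the first one introduces specification bias.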
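The cost of choosing the wrong form can be sketched numerically. Below, hypothetical data are generated exactly from Y = 1 + 4(1/X), and both specifications are fitted by least squares. The data, the helper names, and the use of the residual sum of squares as the yardstick are assumptions for illustration only.

```python
def fit_line(z, y):
    """Least-squares (intercept, slope) for y = a + b * z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_y = sum(y) / n
    szy = sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    b = szy / szz
    return mean_y - b * mean_z, b


def residual_sum_of_squares(z, y):
    a, b = fit_line(z, y)
    return sum((yi - (a + b * zi)) ** 2 for zi, yi in zip(z, y))


# Hypothetical data generated from the reciprocal model Y = 1 + 4 * (1/X).
x = [1, 2, 4, 8]
y = [1 + 4 / xi for xi in x]

# Specification 1: Y on X (wrong form for these data).
rss_linear = residual_sum_of_squares(x, y)
# Specification 2: Y on 1/X (the form that generated the data).
rss_reciprocal = residual_sum_of_squares([1 / xi for xi in x], y)

print(rss_linear, rss_reciprocal)
```

The reciprocal specification reproduces the data essentially exactly, while the straight line in X leaves systematic residuals. With real data the true form is unknown, so this comparison is only a rough guide.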

Yi = β1 + β2 (1/Xi) + ui

ASSUMPTION # 10

There is no perfect multicollinearity. Perfect multicollinearity = an exact linear relationship between regressors (strong but imperfect correlation does not violate this assumption).

That is, there are no perfect linear relationships between the regressors.

Yi = β1 + β2 X2i + β3 X3i + ui

where Y is the dependent variable and X2 and X3 are the explanatory variables (or regressors), which are non-random.
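In the three-variable model the least-squares slopes share the denominator S22·S33 − S23^2, where S22, S33 and S23 are the sums of squares and cross-products of X2 and X3 in deviations from their means. The sketch below shows that this denominator is zero when X3 is an exact multiple of X2, so β2 and β3 cannot be estimated separately. The data and the function names are hypothetical.

```python
def centered_cross_products(x2, x3):
    """Return (S22, S33, S23) in deviations from the means."""
    n = len(x2)
    m2 = sum(x2) / n
    m3 = sum(x3) / n
    s22 = sum((a - m2) ** 2 for a in x2)
    s33 = sum((b - m3) ** 2 for b in x3)
    s23 = sum((a - m2) * (b - m3) for a, b in zip(x2, x3))
    return s22, s33, s23


def slope_denominator(x2, x3):
    """Common denominator of the OLS slope formulas for two regressors."""
    s22, s33, s23 = centered_cross_products(x2, x3)
    return s22 * s33 - s23 ** 2


x2 = [1, 2, 3, 4, 5]

# Perfect multicollinearity: X3 is exactly 2 * X2.
x3_collinear = [2 * v for v in x2]
# Correlated with X2, but not an exact linear function of it.
x3_other = [2, 1, 4, 3, 6]

det_collinear = slope_denominator(x2, x3_collinear)
det_other = slope_denominator(x2, x3_other)
print(det_collinear)  # 0.0 -> slopes undefined
print(det_other)      # positive -> slopes can be computed
```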
