
Linear Regression Model

  • THE LINEAR REGRESSION MODEL

    Lectures 1 and 2

    Francis Kramarz and Michael Visser

    MASTER 1 EPP

    2012

  • THE SIMPLE LINEAR REGRESSION MODEL

  • Introduction

    Definition
    A simple linear regression model is a regression model where the dependent variable is continuous, explained by a single exogenous variable, and linear in the parameters.

    Theoretical model: $Y = \beta_0 + \beta_1 X$. The model is linear in the parameters $\beta_0$ and $\beta_1$. The single explanatory variable can be continuous or discrete.

    Assumption (1) (Random sampling)

    $\{(X_i, Y_i);\ i = 1, ..., n\}$ is a random sample of size n from the population.

    The sample is randomly drawn from the population of interest. For each individual of the sample, we observe $X_i$ and $Y_i$; we want to estimate the simple linear model $Y_i = \beta_0 + \beta_1 X_i + u_i$.

  • The error term captures all relevant variables not included in the model because they are not observed in the data set (for example ability, dynamism, ...).

    Assumption (2) (Sample variation)

    $\exists\, j$ and $k$ such that $X_j \neq X_k$. If all the $X_i$ in the sample take the same value, the slope parameter $\beta_1$ cannot be identified.

    Assumption (3) (zero mean)

    E (ui ) = 0

    The average value of the error term is 0. This is not a restrictive assumption: if $E(u) = \mu$, then we can rewrite the model as $Y = (\beta_0 + \mu) + \beta_1 X + e$, where $e = u - \mu$.

    Assumption (4) (zero conditional mean)

    $E(u_i \mid X_i) = 0$

    This is a crucial and strong assumption: it requires the unobserved factors in u to be mean-independent of X.

  • The Ordinary Least Squares estimator

    Definition
    The OLS estimators $\hat\beta_0$ and $\hat\beta_1$ minimize the sum of squared residuals

    $S(\beta_0, \beta_1) = \sum_{i=1}^n u_i^2 = \sum_{i=1}^n (Y_i - \beta_0 - \beta_1 X_i)^2$

    First order conditions

    $\dfrac{\partial S(\hat\beta_0, \hat\beta_1)}{\partial \beta_0} = -2 \sum_{i=1}^n (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = 0 \quad (1)$

    $\dfrac{\partial S(\hat\beta_0, \hat\beta_1)}{\partial \beta_1} = -2 \sum_{i=1}^n X_i (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = 0 \quad (2)$

    Proposition

    $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$ and $\hat\beta_1 = \dfrac{\sum_{i=1}^n (Y_i - \bar Y)(X_i - \bar X)}{\sum_{i=1}^n (X_i - \bar X)^2}$

  • Proof

    Using equation (1), equation (2) turns into $\hat\beta_1 = \dfrac{\sum_{i=1}^n (Y_i - \bar Y) X_i}{\sum_{i=1}^n (X_i - \bar X) X_i}$.

    $\sum_{i=1}^n X_i (X_i - \bar X) = \sum_{i=1}^n (X_i - \bar X)^2$ and $\sum_{i=1}^n X_i (Y_i - \bar Y) = \sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y)$, which gives the result.
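
    As a numerical check of the closed-form expressions above, here is a minimal Python/NumPy sketch on simulated data (illustrative sample size and parameter values; the course itself uses SAS):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    beta0, beta1 = 2.0, 0.5                 # true parameters (illustrative)
    X = rng.normal(10, 3, size=n)
    u = rng.normal(0, 1, size=n)            # error term with E(u | X) = 0
    Y = beta0 + beta1 * X + u

    # OLS estimates from the closed-form formulas of the Proposition
    beta1_hat = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
    beta0_hat = Y.mean() - beta1_hat * X.mean()
    print(beta0_hat, beta1_hat)             # close to 2.0 and 0.5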

    Remarks

    - $\hat\beta_0$ is the constant (or intercept): $\hat Y_i = \hat\beta_0$ if $X_i = 0$
    - $\hat\beta_1$ is the slope estimate and measures the effect of $X_i$ on $Y_i$

    Definition

    - $\hat Y_i = \hat\beta_0 + \hat\beta_1 X_i$ is the predicted value of Y for individual i

    - $\hat u_i = Y_i - \hat Y_i$ is the residual for individual i. The OLS estimates minimize the sum of squared residuals.

    - SST $= \sum_{i=1}^n (Y_i - \bar Y)^2$ is the total sum of squares (the total variation in Y)
    - SSE $= \sum_{i=1}^n (\hat Y_i - \bar Y)^2$ is the explained sum of squares
    - SSR $= \sum_{i=1}^n \hat u_i^2$ is the residual sum of squares

  • How well does the regression line fit the sample data?

    Proposition

    SST = SSE + SSR (3)

    This suggests a goodness-of-fit criterion.

    Definition
    Let $R^2 = \dfrac{SSE}{SST} = 1 - \dfrac{SSR}{SST}$. $R^2$ measures the proportion of the variation in Y explained by variation in X.

    Remarks

    1. $0 \le R^2 \le 1$
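
    As an illustration of decomposition (3) and of the two expressions for $R^2$, a minimal Python/NumPy sketch on simulated data (illustrative values, not the ECHP wage data):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    X = rng.normal(0, 1, n)
    Y = 1.0 + 2.0 * X + rng.normal(0, 1, n)          # simulated data, illustrative betas

    b1 = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Y_hat = b0 + b1 * X
    u_hat = Y - Y_hat

    SST = np.sum((Y - Y.mean()) ** 2)                # total sum of squares
    SSE = np.sum((Y_hat - Y.mean()) ** 2)            # explained sum of squares
    SSR = np.sum(u_hat ** 2)                         # residual sum of squares
    print(SST, SSE + SSR)                            # equation (3): SST = SSE + SSR
    print(SSE / SST, 1 - SSR / SST)                  # both expressions give the same R^2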

  • Example: Determinants of the monthly wage
    - Data set: French sample of the European Community Household Panel
    - Dependent variable: wage = monthly wage
    - Explanatory variables:
      1. age (continuous variable)
      2. sup = college graduate (discrete 0/1 variable)
    - The sample is restricted to the individuals aged 20-60 and employed in 2000 (n = 5010)
    - Descriptive statistics: mean wage = 1432.3 euros, mean age = 38.87 years, college graduates = 30.64 %

    SAS Program
    Proc reg data=c ;
    model wage = age ;
    model wage = sup ;
    run ;

  • Figure: Effect of age on wage

  • Figure: Effect of education on wage

  • Finite-sample properties of the OLS estimator
    - Other estimation methods exist and could be used (e.g. maximum likelihood, ...). How to choose?
    - By comparing their properties in terms of
      1. unbiasedness
      2. precision (minimization of the variance)

    1. Unbiasedness

    Proposition

    Let Assumptions (1) to (4) be verified. Then $E(\hat\beta_0) = \beta_0$ and $E(\hat\beta_1) = \beta_1$. The OLS estimators are unbiased estimators of the parameters.

    The sampling distribution of the estimators is centered around the true parameters. If we could draw an infinite number of samples of size n from the population and take the average of the resulting OLS estimates, we would obtain the true values of $\beta_0$ and $\beta_1$. BUT this does not mean that the particular OLS estimates obtained from a given sample of size n are equal to the true values of the parameters.
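
    This thought experiment can be mimicked by a small Monte Carlo simulation; the Python sketch below uses made-up sample sizes and parameter values:

    import numpy as np

    rng = np.random.default_rng(2)
    beta0, beta1 = 1.0, 0.3                 # "true" parameters of the simulated population
    n, n_samples = 100, 5000

    estimates = np.empty(n_samples)
    for s in range(n_samples):
        X = rng.normal(5, 2, n)
        u = rng.normal(0, 1, n)             # E(u | X) = 0
        Y = beta0 + beta1 * X + u
        estimates[s] = np.sum((Y - Y.mean()) * (X - X.mean())) / np.sum((X - X.mean()) ** 2)

    print(estimates.mean())                 # close to 0.3: the distribution is centered on beta1
    print(estimates[0])                     # a single estimate generally differs from 0.3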

  • 2. Precision
    The question is now: are our OLS estimates far from the true values of $\beta_0$ and $\beta_1$?

    Assumption (5) (Homoscedasticity and non autocorrelation)

    $V(u_i \mid X_i) = \sigma^2$ and $corr(u_i, u_j \mid X_i, X_j) = 0$

    Remarks

    1. $\sigma$ is the standard deviation of the error term
    2. $\sigma$ is unknown, since u represents all the unobserved explanatory variables
    3. Under assumption (5), $V(Y_i \mid X_i) = \sigma^2$
    4. Thus the variance of Y, given X, does not vary with X (strong assumption)
    5. Under assumptions (4) and (5), $E(u_i^2 \mid X_i) = \sigma^2$

  • Proposition

    Let assumptions (1) to (5) be verified. Then

    $V(\hat\beta_0 \mid X) = \dfrac{\sigma^2\, n^{-1} \sum_{i=1}^n X_i^2}{\sum_{i=1}^n (X_i - \bar X)^2} \quad (4)$

    $V(\hat\beta_1 \mid X) = \dfrac{\sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2} \quad (5)$

    Remarks

    1. $V(\hat\beta_0)$ and $V(\hat\beta_1)$ increase with $\sigma^2$ (the higher the variance of the error term, the more difficult it is to estimate the parameters with precision).
    2. $V(\hat\beta_0)$ and $V(\hat\beta_1)$ decrease with $\sum_{i=1}^n (X_i - \bar X)^2$
    3. $V(\hat\beta_0)$ and $V(\hat\beta_1)$ decrease with n
    4. As $\sigma^2$ is unknown, $V(\hat\beta_0)$ and $V(\hat\beta_1)$ are also unknown
    5. BUT $\sigma^2$ can be estimated using the sum of squared residuals $\sum_{i=1}^n \hat u_i^2$

  • Proposition

    $\hat\sigma^2 = \dfrac{\sum_{i=1}^n \hat u_i^2}{n-2} = \dfrac{SSR}{n-2}$ is an unbiased estimator of $\sigma^2$: $E(\hat\sigma^2) = \sigma^2$.

    Replacing $\sigma^2$ by $\hat\sigma^2$ in the variances of the estimators gives unbiased estimators of $V(\hat\beta_0)$ and $V(\hat\beta_1)$.

    We want to find the best linear unbiased estimator (BLUE) of $\beta_0$ and $\beta_1$.

    How is the best estimator defined? It is, among the linear unbiased estimators, the one with the smallest variance.

    Theorem (Gauss-Markov)

    Under assumptions (1) to (5), the OLS estimators $\hat\beta_0$ and $\hat\beta_1$ are the best linear unbiased estimators of $\beta_0$ and $\beta_1$, respectively.

  • The effect of omitting relevant variables

    - Assume the true model is $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$, while the estimated model is $Y = \beta_0 + \beta_1 X_1 + e$.
      $X_2$ is omitted and $e = \beta_2 X_2 + u$.
    - Example: both age and education affect the wage.
    - Let $\tilde\beta_0$ and $\tilde\beta_1$ denote the OLS estimators of $Y = \beta_0 + \beta_1 X_1 + e$.
    - Are they biased estimators of $\beta_0$ and $\beta_1$? There are 2 cases:
      1. If $Cov(X_1, X_2) = 0$ or $\beta_2 = 0$, then $\tilde\beta_0$ and $\tilde\beta_1$ are unbiased.
      2. If $Cov(X_1, X_2) \neq 0$ and $\beta_2 \neq 0$, then $\tilde\beta_0$ and $\tilde\beta_1$ are biased.
    - The bias in $\tilde\beta_1$ is equal to $\beta_2 \dfrac{Cov(X_1, X_2)}{V(X_1)}$
    - Sign of the bias:
      1. If $Cov(X_1, X_2) > 0$ (resp. < 0) and $\beta_2 > 0$ (resp. < 0), then $\tilde\beta_1$ is upward biased.
      2. If $Cov(X_1, X_2) > 0$ (resp. < 0) and $\beta_2 < 0$ (resp. > 0), then $\tilde\beta_1$ is downward biased.
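
    A small simulation (Python, illustrative values) makes the bias formula tangible: X2 is built to be positively correlated with X1 and $\beta_2 > 0$, so the short regression of Y on X1 alone is upward biased by approximately $\beta_2\, Cov(X_1, X_2)/V(X_1)$:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000
    beta1, beta2 = 1.0, 2.0                       # illustrative true coefficients

    X1 = rng.normal(0, 1, n)
    X2 = 0.5 * X1 + rng.normal(0, 1, n)           # Cov(X1, X2) > 0 by construction
    Y = beta1 * X1 + beta2 * X2 + rng.normal(0, 1, n)

    # Short regression of Y on X1 only (with a constant)
    b1_short = np.sum((Y - Y.mean()) * (X1 - X1.mean())) / np.sum((X1 - X1.mean()) ** 2)
    bias = beta2 * np.cov(X1, X2)[0, 1] / np.var(X1)
    print(b1_short)                               # roughly beta1 + bias = 1 + 2 * 0.5 = 2
    print(beta1 + bias)                           # the omitted-variable bias formula agrees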

  • Example: age and education in a wage equation
    - We estimate the true model

      $wage = \beta_0 + \beta_1\, age + \beta_2\, sup + u \quad (6)$

      $\hat\beta_2 > 0$: ceteris paribus, education has a positive effect on the wage.
    - We estimate the false model

      $wage = \beta_0 + \beta_1\, age + e \quad (7)$

    - Comparison: $\tilde\beta_1 = 26.24 < \hat\beta_1 = 30.47$; the effect of age is under-estimated in the false model.
    - We estimate education as a function of age

      $sup = \delta_0 + \delta_1\, age + v \quad (8)$

      $\hat\delta_1 < 0$: age has a negative effect on education.
    - As $\hat\delta_1 < 0$ and $\hat\beta_2 > 0$, the estimator is logically downward biased.

  • Figure: Effect of age and education on wage

  • Figure: Effect of age on education

  • THE MULTIPLE LINEAR REGRESSION MODEL

  • Introduction

    Definition
    A multiple linear regression model is a regression model where the dependent variable is continuous, explained by several exogenous variables, and linear in the parameters.

    Example

    $Y = \beta_0 + \sum_{k=1}^K \beta_k X_k + u = X\beta + u \quad (9)$

    where $\beta$ is a vector of K + 1 parameters $(\beta_0, \beta_1, ..., \beta_K)$ and X is a matrix $(1, X_1, ..., X_K)$ with K + 1 columns.

    - Linearity of the model in the parameters $\beta_k$, k = 0, ..., K
    - $\beta_0$ is the constant
    - $\beta_k$ is the slope parameter and measures the ceteris paribus effect of $X_k$ on Y.
    - In the multiple case, the matrix notation is more convenient.

  • Assumptions

    Assumption

    1. Random sampling: $\{(X_{1i}, X_{2i}, ..., X_{Ki}, Y_i);\ i = 1, ..., n\}$ is a random sample of size n from the population.
    2. Sample variation and no collinearity: the explanatory variables are not linearly related and none is constant $\Rightarrow$ the rank of $X'X$ is K + 1.
    3. Zero mean: $E(u_i) = 0$
    4. Zero conditional mean: $E(u_i \mid X_{1i}, ..., X_{Ki}) = 0$
    5. Homoscedasticity and non-autocorrelation: $V(u_i \mid X_{1i}, ..., X_{Ki}) = \sigma^2$ and $corr(u_i, u_j \mid X_{1i}, ..., X_{Ki}, X_{1j}, ..., X_{Kj}) = 0$

  • Remarks

    - Assumptions (1), (3), (4) and (5) are similar to the simple case.
    - Assumption (2) is an extension of the simple case.
    - Assumption (2) is required for the identification of the parameters. WHY?
    - Assume $X_{ki} = c$ for all i (the variable is constant). Then $\beta_0$ and $\beta_k$ cannot be separately identified.

    - Similarly, assume that the variables $X_1$ and $X_2$ are collinear: $X_1 = \lambda X_2$. The model can be rewritten $Y = \beta_0 + (\lambda\beta_1 + \beta_2) X_2 + \sum_{k=3}^K \beta_k X_k + u$; $\beta_1$ and $\beta_2$ cannot be separately identified.

  • contd

    Remarks

    - Other example: dummy variables. If all the dummy variables and the constant are included in the model, then the constant and the dummy parameters cannot be identified separately. One of the dummy variables must be dropped (the reference category), e.g. one of the education dummies.
    - The rank of a matrix X is equal to the number of nonzero characteristic roots of $X'X$.
    - $rank(X) \le \min(\text{number of rows}, \text{number of columns})$.

  • The Ordinary Least Squares estimator

    Definition
    The OLS estimates $\hat\beta_k$, k = 0, ..., K, minimize the sum of squared residuals

    $S(\beta_0, ..., \beta_K) = \sum_{i=1}^n u_i^2 = \sum_{i=1}^n \Big(Y_i - \beta_0 - \sum_{k=1}^K \beta_k X_{ki}\Big)^2 = (Y - X\beta)'(Y - X\beta) \quad (10)$

    First order conditions

    $-2 \sum_{i=1}^n \Big(Y_i - \hat\beta_0 - \sum_{k=1}^K \hat\beta_k X_{ki}\Big) = 0$
    $-2 \sum_{i=1}^n X_{ki} \Big(Y_i - \hat\beta_0 - \sum_{k=1}^K \hat\beta_k X_{ki}\Big) = 0, \quad k = 1, ..., K$

    OR, using matrix notation,

    $-2 \sum_{i=1}^n X_i'(Y_i - X_i\hat\beta) = -2 X'(Y - X\hat\beta) = 0 \quad (11)$

  • Proposition

    $\hat\beta_k = \dfrac{\sum_{i=1}^n (Y_i - \bar Y)\big[(X_{ki} - \hat X_{ki}) - (\bar X_k - \bar{\hat X}_k)\big]}{\sum_{i=1}^n \big[(X_{ki} - \hat X_{ki}) - (\bar X_k - \bar{\hat X}_k)\big]^2}$

    where $\hat X_{ki}$ is the predicted value of $X_{ki}$ obtained from a regression of $X_{ki}$ on a constant and all the other covariates.

    $\hat\beta_0 = \bar Y - \sum_{k=1}^K \hat\beta_k \bar X_k$

    OR, using matrix notation,

    $\hat\beta = (X'X)^{-1} X'Y \quad (12)$

    Comparison with the simple case: $\hat\beta_k$ is equal to the slope estimate in the regression of Y on a constant and $(X_{ki} - \hat X_{ki})$ $\Rightarrow$ ceteris paribus effect.
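
    Equation (12) translates directly into code; a minimal Python/NumPy sketch (illustrative, not the SAS used in the course) builds the design matrix with a column of ones and solves the normal equations $X'X\hat\beta = X'Y$:

    import numpy as np

    rng = np.random.default_rng(4)
    n, K = 300, 2
    beta_true = np.array([1.0, 0.5, -0.3])                       # [beta0, beta1, beta2], illustrative
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])   # constant + K regressors
    Y = X @ beta_true + rng.normal(0, 1, n)

    # OLS via the normal equations: beta_hat = (X'X)^{-1} X'Y
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    print(beta_hat)                                              # close to [1.0, 0.5, -0.3]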

  • How well does the regression line fit the sample data?

    Definition

    Let SST $= \sum_{i=1}^n (Y_i - \bar Y)^2$ denote the total sum of squares (the total variation in Y),

    SSE $= \sum_{i=1}^n (\hat Y_i - \bar Y)^2$ denote the explained sum of squares,

    and SSR $= \sum_{i=1}^n \hat u_i^2$ denote the residual sum of squares.

    Then

    $R^2 = \dfrac{SSE}{SST} = 1 - \dfrac{SSR}{SST} = 1 - \dfrac{\sum_{i=1}^n \hat u_i^2}{\sum_{i=1}^n (Y_i - \bar Y)^2} \quad (13)$

    Remarks

    - $R^2$ does not decrease when one more variable is included.
    - The $R^2$ is useful to compare two models with the same number of explanatory variables, but not useful if the number of variables is different.
    - Use the adjusted $R^2$: $\bar R^2 = 1 - \dfrac{(n-1)(1-R^2)}{n-K-1}$.

  • Example: Determinants of the monthly wage

    - Data set: French sample of the European Community Household Panel
    - Dependent variable: wage = monthly wage
    - Explanatory variables:
      1. age (continuous variable or discrete variable)
      2. diplo0, diplo1, ... = educational level (discrete 0/1 variables)
      3. sex, children
    - The sample is restricted to the individuals aged 20-60 and employed in 2000 (n = 5010)

    SAS Program
    Proc reg data=c ;
    model wage = sex1 age age2 diplo1 diplo2 diplo3 diplo4 nbenf1 nbenf2 nbenf3 ;
    model wage = sex1 ag1 ag2 ag4 ag5 diplo1 diplo2 diplo3 diplo4 nbenf1 nbenf2 nbenf3 ;
    run ;

  • Figure: Determinants of wage - model 1

  • Figure: Determinants of wage - model 2

  • Finite-sample properties of the OLS estimator

    Proposition

    Let Assumptions (1) to (4) be verified. Then

    $E(\hat\beta) = \beta \quad (14)$

    Proposition

    Let Assumptions (1) to (5) be verified. Then

    $V(\hat\beta_k \mid X_1, ..., X_K) = \dfrac{\sigma^2}{\sum_{i=1}^n (X_{ki} - \bar X_k)^2 (1 - R_k^2)} \quad (15)$

    where $R_k^2$ is the $R^2$ obtained from the regression of $X_k$ on the other covariates.

    OR, using matrix notation,

    $V(\hat\beta \mid X) = \sigma^2 (X'X)^{-1} \quad (16)$

  • Remarks
    $V(\hat\beta_k \mid X)$ increases with $\sigma^2$ and $R_k^2$, and decreases with $\sum_{i=1}^n (X_{ki} - \bar X_k)^2$.

    Estimating the error variance

    - We don't know what the error variance $\sigma^2$ is, because we don't observe the error term u.
    - BUT we observe the residuals $\hat u$ and we can use the residuals to find an estimate of the error variance.
    - Replacing $\sigma^2$ by $\hat\sigma^2$ in the variance of the estimators gives unbiased estimators of $V(\hat\beta)$.

    Proposition

    $\hat\sigma^2 = \dfrac{\sum_{i=1}^n \hat u_i^2}{n - K - 1} = \dfrac{SSR}{\text{degrees of freedom}}$ is an unbiased estimator of $\sigma^2$.

    Theorem (Gauss-Markov)

    Under assumptions (1) to (5), $\hat\beta$ is the best linear unbiased estimator (BLUE) of $\beta$.
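
    In code, $\hat\sigma^2$ and the estimated covariance matrix $\hat\sigma^2 (X'X)^{-1}$ can be obtained as follows (a Python/NumPy sketch on simulated data, illustrative values); the square roots of its diagonal elements are the standard errors of the $\hat\beta_k$:

    import numpy as np

    rng = np.random.default_rng(5)
    n, K = 300, 2
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])   # constant + K regressors
    Y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(0, 1, n)     # illustrative betas, sigma = 1

    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    u_hat = Y - X @ beta_hat
    sigma2_hat = u_hat @ u_hat / (n - K - 1)          # SSR / degrees of freedom
    V_hat = sigma2_hat * np.linalg.inv(X.T @ X)       # estimated V(beta_hat | X)
    se = np.sqrt(np.diag(V_hat))                      # standard errors of the estimates
    print(sigma2_hat, se)                             # sigma2_hat close to 1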

  • Asymptotic properties

    We now study the properties of the OLS estimator when $n \to +\infty$. First we study the consistency of the OLS estimator. Consistency is stronger than unbiasedness.

    Theorem
    Under assumptions (1) to (4), $\text{plim}(\hat\beta) = \beta$.
    Under assumptions (1) to (5), $\text{plim}(\hat\sigma^2) = \sigma^2$.

    Next we study the asymptotic distribution of the OLS estimator.

    Theorem (asymptotic normality)

    Under assumptions (1) to (5),

    $\sqrt{n}\,(\hat\beta - \beta) \xrightarrow{d} N(0, \sigma^2 Q^{-1}) \quad (17)$

    with $Q^{-1} = \text{plim}\big((\tfrac{X'X}{n})^{-1}\big)$

  • The maximum likelihood method

    The likelihood is the probability of observing the sample $\{(Y_1, X_1), ..., (Y_n, X_n)\}$.

    Definition
    The contribution of individual i to the likelihood is the function $L_i$ defined by: $L_i(Y_i, X_i; \theta) = f(Y_i, X_i; \theta)$.

    The likelihood function of the sample is the function $L(Y_i, X_i, i = 1, ..., n; \theta)$ defined as the product of the individual contributions:

    $L(Y_i, X_i, i = 1, ..., n; \theta) = \prod_{i=1}^n L_i(Y_i, X_i; \theta)$

    If the dependent variable is continuous, $L(Y, X; \theta)$ is the product of the density functions associated with each couple $(Y_i, X_i)$.

  • Assumption (6) (Normality of the error term)

    The error term is independent of $X_i$ and normally distributed with zero mean and variance $\sigma^2$: $u_i \mid X_i \sim N(0, \sigma^2)$.

    Under this assumption, we can use the Maximum Likelihood method.

    Theorem
    Under assumptions (1) to (6), the Maximum Likelihood estimator of $\beta$ is the OLS estimator.

    MOREOVER

    Theorem
    Under assumptions (1) to (6), the OLS estimator and the ML estimator are the minimum variance unbiased estimators of $\beta$.
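
    To see why the two estimators coincide, one can write the log-likelihood of the sample under Assumption (6) (a sketch of the standard argument, conditional on the $X_i$):

    $\log L(\beta, \sigma^2) = -\dfrac{n}{2}\log(2\pi\sigma^2) - \dfrac{1}{2\sigma^2}\sum_{i=1}^n (Y_i - X_i\beta)^2$

    For any value of $\sigma^2$, maximizing $\log L$ over $\beta$ amounts to minimizing $\sum_{i=1}^n (Y_i - X_i\beta)^2$, i.e. the OLS criterion (10), so the ML estimator of $\beta$ is the OLS estimator. (Maximizing over $\sigma^2$ gives $\tilde\sigma^2 = SSR/n$, which differs from the unbiased estimator $\hat\sigma^2 = SSR/(n-K-1)$.)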

  • Tests and inference - finite samples

    - We want to test hypotheses about the parameters of the model.
    - Example: wage equation
      - Are men significantly better paid than women?
      - Are the 45-55 significantly better paid than the 35-45 (the reference category)?
      - Are the 45-55 significantly better paid than the 25-35?
    - In order to perform statistical tests in finite samples, we need to add an assumption on the distribution of the error term.

    Assumption (6) (Normality of the error term)

    The error term is independent of $X_i$ and normally distributed with zero mean and variance $\sigma^2$: $u_i \mid X_i \sim N(0, \sigma^2)$.
    $\Rightarrow$ the distribution of the error term conditional on the vector of explanatory variables is normal.

  • Remarks

    - Assumption (6) implies assumptions (3), (4) and (5).
    - Conditionally on the explanatory variables, the dependent variable is normally distributed with mean $X_i\beta$ and variance $\sigma^2$: $Y_i \mid X \sim N(\beta_0 + \sum_{k=1}^K \beta_k X_{ki},\ \sigma^2)$

    Consequence for the distribution of $\hat\beta$

    Theorem
    Under assumptions (1) to (6), $\hat\beta_k$ is normally distributed with mean $\beta_k$ and variance $V(\hat\beta_k)$:
    $\hat\beta_k - \beta_k \sim N(0, V(\hat\beta_k))$, i.e. $\dfrac{\hat\beta_k - \beta_k}{\sqrt{V(\hat\beta_k)}} \sim N(0, 1)$

    Proof: $\hat\beta$ is a linear combination of the error terms.

    We cannot use this property directly since $V(\hat\beta)$ is unknown. BUT we can replace the variance by its estimate, which gives:

  • Theorem
    Under assumptions (1) to (6),

    $\dfrac{\hat\beta_k - \beta_k}{\sqrt{\hat V(\hat\beta_k)}} \sim T_{n-K-1}$

    Idea

    Assume we want to test the null hypothesis H0.

    We need

    (i) a test statistic (t), i.e. a decision function whose value determines which hypothesis is retained
    (ii) a decision rule that determines when H0 is rejected $\Rightarrow$ choose $\alpha = \Pr(\text{reject } H_0 \mid H_0 \text{ is true})$; $\alpha$ is the significance level (usually $\alpha$ = 5%)
    (iii) a critical region, i.e. the set of values of the test statistic for which the null hypothesis is rejected $\Rightarrow$ we want to find the critical value c that verifies $\Pr(\text{reject } H_0 \mid H_0 \text{ is true}) = \Pr(|t| > c) = \alpha$ $\Rightarrow$ c is the $(1 - \frac{\alpha}{2})$-th percentile of the t distribution with $n - K - 1$ degrees of freedom

  • The t-test: is $\beta_k$ significantly different from 0?

    $H_0: \beta_k = 0$ vs. $H_1: \beta_k \neq 0$

    The null hypothesis means that $X_k$ has no effect on Y.

    (i) Under the null hypothesis, $t = \dfrac{\hat\beta_k}{\sqrt{\hat V(\hat\beta_k)}}$.
        The test statistic follows a t distribution with $n - K - 1$ degrees of freedom.
    (ii) $\alpha$ = 5%
    (iii) $\Pr(|t| > c) = 5\%$
        c is the 97.5-th percentile of the t distribution with $n - K - 1$ degrees of freedom. When $n - K - 1$ is large, c = 1.96.

    Decision
    - If $|t| = \dfrac{|\hat\beta_k|}{\sqrt{\hat V(\hat\beta_k)}} < 1.96$, then H0 is accepted at the 5% significance level.
    - If $|t| \ge 1.96$, then H0 is rejected at the 5% significance level.
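
    The mechanics of the two-sided t-test can be sketched in a few lines of Python (SciPy assumed available for the t distribution; the data are simulated rather than the ECHP wage data):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n, K = 200, 2
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])
    Y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(0, 1, n)   # beta_2 = 0: H0 is true for k = 2

    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    u_hat = Y - X @ beta_hat
    sigma2_hat = u_hat @ u_hat / (n - K - 1)
    se = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))

    k = 2                                     # test H0: beta_k = 0 against H1: beta_k != 0
    t_stat = beta_hat[k] / se[k]
    c = stats.t.ppf(0.975, df=n - K - 1)      # 97.5-th percentile, close to 1.96 here
    print(t_stat, c, abs(t_stat) >= c)        # True would mean rejecting H0 at the 5% level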

  • Figure: Determinants of wage - model 1

  • The t-test: is $\beta_k$ significantly greater than 0?

    $H_0: \beta_k = 0$ vs. $H_1: \beta_k > 0$

    The null hypothesis means that $X_k$ has no effect on Y.

    (i) Under the null hypothesis, $t = \dfrac{\hat\beta_k}{\sqrt{\hat V(\hat\beta_k)}}$.
        The test statistic follows a t distribution with $n - K - 1$ degrees of freedom.
    (ii) $\alpha$ = 5%
    (iii) $\Pr(t > c) = 5\%$
        c is the 95-th percentile of the t distribution with $n - K - 1$ degrees of freedom. When $n - K - 1$ is large, c = 1.645.

    Decision
    - If $t = \dfrac{\hat\beta_k}{\sqrt{\hat V(\hat\beta_k)}} < 1.645$, then H0 is accepted at the 5% significance level.
    - If $t \ge 1.645$, then H0 is rejected at the 5% significance level.

  • The t-test: is $\beta_k$ significantly different from a?

    $H_0: \beta_k = a$ vs. $H_1: \beta_k \neq a$

    (i) Under the null hypothesis, $t = \dfrac{\hat\beta_k - a}{\sqrt{\hat V(\hat\beta_k)}}$.
        The test statistic follows a t distribution with $n - K - 1$ degrees of freedom.
    (ii) $\alpha$ = 1%
    (iii) $\Pr(|t| > c) = 1\%$
        c is the 99.5-th percentile of the t distribution with $n - K - 1$ degrees of freedom. When $n - K - 1$ is large, c = 2.576.

    Decision
    - If $|t| < 2.576$, then H0 is accepted at the 1% significance level.
    - If $|t| \ge 2.576$, then H0 is rejected at the 1% significance level.

  • The t-test: is $\beta_k$ significantly different from $\beta_j$?

    $H_0: \beta_k = \beta_j$ vs. $H_1: \beta_k \neq \beta_j$

    The null hypothesis means that $X_k$ and $X_j$ have the same effect on Y.

    (i) Under the null hypothesis,

        $t = \dfrac{\hat\beta_k - \hat\beta_j}{\sqrt{\hat V(\hat\beta_k) + \hat V(\hat\beta_j) - 2\,\widehat{Cov}(\hat\beta_k, \hat\beta_j)}}$

        The test statistic follows a t distribution with $n - K - 1$ degrees of freedom.
    (ii) $\alpha$ = 10%
    (iii) $\Pr(|t| > c) = 10\%$
        c is the 95-th percentile of the t distribution with $n - K - 1$ degrees of freedom. When $n - K - 1$ is large, c = 1.645.

    Decision
    - If $|t| < 1.645$, then H0 is accepted at the 10% significance level.
    - If $|t| \ge 1.645$, then H0 is rejected at the 10% significance level.

  • The F-test : exclusion restrictions

    We test q linear restrictions on the parameters:

    $H_0: \beta_{K+1-q} = \beta_{K+2-q} = ... = \beta_K = 0$ vs. $H_1$: $H_0$ is false

    (i) Under the null hypothesis, the model becomes $Y = \beta_0 + \beta_1 X_1 + ... + \beta_{K-q} X_{K-q} + u$.
    (ii) The test statistic is $F = \dfrac{(R^2 - R_c^2)/q}{(1 - R^2)/(n - K - 1)}$, where $R_c^2$ denotes the $R^2$ of the constrained model.
        The F-statistic follows a Fisher distribution with $(q, n - K - 1)$ degrees of freedom.
    (iii) $\Pr(F > c) = \alpha$

    Decision
    - If $F < c$, then H0 is accepted at the $\alpha$ significance level.
    - If $F \ge c$, then H0 is rejected.
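
    A Python sketch of the exclusion-restriction F-test (SciPy assumed available, simulated data); it compares the $R^2$ of the unrestricted and constrained models exactly as in the formula above:

    import numpy as np
    from scipy import stats

    def r_squared(Y, X):
        # R^2 of an OLS regression of Y on X (X must contain the constant)
        b = np.linalg.solve(X.T @ X, X.T @ Y)
        u = Y - X @ b
        return 1 - u @ u / np.sum((Y - Y.mean()) ** 2)

    rng = np.random.default_rng(7)
    n, K, q = 500, 4, 2                       # test that the last q = 2 coefficients are zero
    X = np.column_stack([np.ones(n), rng.normal(size=(n, K))])
    Y = X @ np.array([1.0, 0.5, -0.5, 0.0, 0.0]) + rng.normal(0, 1, n)   # H0 is true here

    R2_u = r_squared(Y, X)                    # unrestricted model
    R2_c = r_squared(Y, X[:, : K + 1 - q])    # constrained model drops the last q regressors
    F = ((R2_u - R2_c) / q) / ((1 - R2_u) / (n - K - 1))
    c = stats.f.ppf(0.95, q, n - K - 1)       # 5% critical value of F(q, n - K - 1)
    print(F, c, F >= c)                       # True would mean rejecting H0 at the 5% level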

  • Figure: Determinants of wage - model 3

  • The F-test : Overall significance

    Question: is the model completely false?

    $H_0: \beta_1 = \beta_2 = ... = \beta_K = 0$ vs. $H_1$: $H_0$ is false

    K is the number of restrictions.

    (i) Under the null hypothesis, $F = \dfrac{R^2/K}{(1 - R^2)/(n - K - 1)}$.
        The F-statistic follows a Fisher distribution with $(K, n - K - 1)$ degrees of freedom.
    (ii) $\alpha$ = 1%
    (iii) $\Pr(F > c) = 1\%$
        c is the 99-th percentile of the F distribution with $(K, n - K - 1)$ degrees of freedom.

    Decision
    - If $F < c$, then H0 is accepted at the 1% significance level.
    - If $F \ge c$, then H0 is rejected at the 1% significance level.

  • Heteroscedasticity

    Assumption (5) (homoscedasticity) is restrictive.

    How to proceed when this assumption is dropped?

    Assumption (5')

    $V(u_i \mid X_{1i}, ..., X_{Ki}) = \sigma_i^2$ and $corr(u_i, u_j \mid X_{1i}, ..., X_{Ki}, X_{1j}, ..., X_{Kj}) = 0$

    Remarks

    - The OLS estimator remains unbiased
    - The OLS estimator remains consistent

    BUT

    - $V(\hat\beta \mid X)$ is no longer equal to $\sigma^2 (X'X)^{-1}$
    - The asymptotic normality result (17) is no longer valid
    - The OLS estimator is no longer the BLUE

  • Testing for heteroscedasticity

    - There exist two tests: Breusch-Pagan and White. Both tests are based on the residuals of the fitted model.
    - IDEA

      $E(u_i^2 \mid X_{1i}, ..., X_{Ki}) = \sigma_i^2 \Rightarrow \hat u_i^2 = \sigma_i^2 + e_i$

    - The variance of the error term is of the general form

      $\sigma_i^2 = \omega_i = \omega(X_{1i}, ..., X_{Ki})$

    The Breusch-Pagan test

    - More restrictive assumption on the form of $\omega_i$:

      $\omega(X_{1i}, ..., X_{Ki}) = \delta_0 + \sum_{k=1}^K \delta_k X_{ki}$

      $\Rightarrow \hat u_i^2 = \delta_0 + \sum_{k=1}^K \delta_k X_{ki} + e_i$

    - $H_0: \delta_1 = \delta_2 = ... = \delta_K = 0$ vs. $H_1$: $\exists\, k$ such that $\delta_k \neq 0$

  • - Under the null hypothesis, $F = \dfrac{R^2/K}{(1 - R^2)/(n - K - 1)}$ follows a Fisher distribution with $(K, n - K - 1)$ degrees of freedom, where $R^2$ is that of the auxiliary regression of $\hat u_i^2$ on the regressors.
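
    A minimal Breusch-Pagan sketch in Python (SciPy assumed available), implementing the F-version described above on simulated heteroscedastic data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    n, K = 500, 2
    X = np.column_stack([np.ones(n), np.abs(rng.normal(size=(n, K)))])
    sd_i = 0.5 + 1.5 * X[:, 1]                          # error sd increases with X1: heteroscedastic
    Y = X @ np.array([1.0, 0.5, -0.3]) + sd_i * rng.normal(size=n)

    # Step 1: OLS residuals of the original model
    b = np.linalg.solve(X.T @ X, X.T @ Y)
    u2 = (Y - X @ b) ** 2

    # Step 2: auxiliary regression of u_hat^2 on the regressors, and its R^2
    g = np.linalg.solve(X.T @ X, X.T @ u2)
    e = u2 - X @ g
    R2 = 1 - e @ e / np.sum((u2 - u2.mean()) ** 2)

    # Step 3: F statistic, distributed F(K, n - K - 1) under H0 (homoscedasticity)
    F = (R2 / K) / ((1 - R2) / (n - K - 1))
    print(F, stats.f.ppf(0.95, K, n - K - 1))           # F is well above the critical value here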

    The White test

    - Less restrictive assumption on the form of $\omega_i$:

      $\omega(X_{1i}, ..., X_{Ki}) = \delta_0 + \sum_{k=1}^K \delta_k X_{ki} + \sum_{k,j=1}^K \delta_{kj} X_{ki} X_{ji}$

      $\Rightarrow \hat u_i^2 = \delta_0 + \sum_{k=1}^K \delta_k X_{ki} + \sum_{k,j=1}^K \delta_{kj} X_{ki} X_{ji} + e_i$

    - $H_0: \delta_k = \delta_{kj} = 0 \ \forall\, k, j$ vs. $H_1$: $\exists\, k$ (or $k, j$) such that $\delta_k \neq 0$ or $\delta_{kj} \neq 0$

    - Under the null hypothesis, $F = \dfrac{R^2/q}{(1 - R^2)/(n - q - 1)}$ follows a Fisher distribution with $(q, n - q - 1)$ degrees of freedom, where $q = K + \dfrac{K(K+1)}{2}$.

  • Example: Determinants of the monthly wage
    - Data set: French sample of the European Community Household Panel
    - The sample is restricted to the individuals aged 20-60 and employed in 2000 (n = 5010)
    - Dependent variable: wage = monthly wage
    - Explanatory variables: age (continuous variable), diplo0, diplo1, ... = educational level (discrete 0/1 variables), sex, children

    SAS Program
    Proc model data=c ;
    Parms ac asex1 aage aage2 adiplo1 adiplo2 adiplo3 adiplo4 anbenf1 anbenf2 anbenf3 ;
    wage = ac + asex1*sex1 + aage*age + aage2*age2 + adiplo1*diplo1 + adiplo2*diplo2 + adiplo3*diplo3 + adiplo4*diplo4 + anbenf1*nbenf1 + anbenf2*nbenf2 + anbenf3*nbenf3 ;
    fit wage / white breusch=(1 sex1 age age2 diplo1 diplo2 diplo3 diplo4 nbenf1 nbenf2 nbenf3) ;
    run ;

  • Figure: Example : Determinants of wage

  • Correcting for heteroscedasticity

    There are two methods to improve the efficiency of the estimation in the presence of heteroscedastic errors:

    1. Use the Feasible Generalized Least Squares (FGLS) method if the function $\omega(X_{1i}, ..., X_{Ki})$ is known
    2. Use the OLS method but compute a heteroscedasticity-consistent covariance matrix estimator (White (1980), Davidson and MacKinnon (1993))

    1. The Feasible Generalized Least Squares (FGLS) method

    - Assumption: $\sigma_i^2 = \omega_i = g(\delta_0 + \sum_{k=1}^K \delta_k X_{ki})$ where $g(\cdot)$ is known.
    - IDEA: apply a transformation to the initial model $Y_i = X_i\beta + u_i$ that makes the error terms of the transformed model homoscedastic.

      $Y_i$, $X_i$, and $u_i$ are divided by $\sigma_i = \sqrt{\omega_i}$

  • The transformed model is

    $\dfrac{Y_i}{\sigma_i} = \dfrac{\beta_0}{\sigma_i} + \sum_{k=1}^K \beta_k \dfrac{X_{ki}}{\sigma_i} + \dfrac{u_i}{\sigma_i}$

    Remarks

    - there is no constant in the transformed model
    - the parameters $\beta_k$ are the same as in the initial model
    - $\dfrac{u_i}{\sigma_i}$ is homoscedastic

    The transformed model can then be estimated by OLS

    EXCEPT that we do not know the parameters $\delta_0$ and $\delta_k$.

  • $\Rightarrow$ we proceed in the following way:
    1. estimate the initial model by OLS and compute the residuals $\hat u_i$
    2. estimate $\hat u_i^2 = \delta_0 + \sum_{k=1}^K \delta_k X_{ki} + e_i$ and compute $\hat\sigma_i^2 = \hat\delta_0 + \sum_{k=1}^K \hat\delta_k X_{ki}$
    3. divide the initial model by $\hat\sigma_i$
    4. estimate the transformed model by OLS

    $\Rightarrow$ if $\omega(\cdot)$ is well specified, the FGLS estimator is unbiased, consistent, and asymptotically efficient.
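
    The four steps can be sketched as follows in Python (illustrative values, with g(x) = x; the course implements the same idea in SAS):

    import numpy as np

    def ols(Y, X):
        # OLS coefficients of a regression of Y on X
        return np.linalg.solve(X.T @ X, X.T @ Y)

    rng = np.random.default_rng(9)
    n = 1000
    X = np.column_stack([np.ones(n), np.abs(rng.normal(size=n)) + 0.1])
    sigma2_i = 0.2 + 2.0 * X[:, 1]                       # true skedastic function (illustrative)
    Y = X @ np.array([1.0, 0.5]) + np.sqrt(sigma2_i) * rng.normal(size=n)

    b_ols = ols(Y, X)                                    # step 1: OLS on the initial model
    u2 = (Y - X @ b_ols) ** 2
    delta_hat = ols(u2, X)                               # step 2: regress u_hat^2 on the regressors
    sigma2_hat = X @ delta_hat                           # fitted variances
    sigma_hat = np.sqrt(np.clip(sigma2_hat, 1e-6, None)) # guard against negative fitted values

    Y_t = Y / sigma_hat                                  # step 3: divide the model by sigma_hat_i
    X_t = X / sigma_hat[:, None]
    b_fgls = ols(Y_t, X_t)                               # step 4: OLS on the transformed model
    print(b_ols, b_fgls)                                 # both near (1.0, 0.5); FGLS is more precise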

    2. Compute a heteroscedasticity-consistent covariance matrix estimator

    The idea is to use the OLS method but compute a heteroscedasticity-consistent covariance matrix estimator (cf. White (1980), Davidson and MacKinnon (1993)).
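
    For the second method, a sketch of the White (1980) heteroscedasticity-consistent ("sandwich") covariance estimator in Python (illustrative; statistical software provides this directly through robust standard-error options):

    import numpy as np

    rng = np.random.default_rng(10)
    n = 1000
    X = np.column_stack([np.ones(n), np.abs(rng.normal(size=n))])
    Y = X @ np.array([1.0, 0.5]) + (0.3 + X[:, 1]) * rng.normal(size=n)   # heteroscedastic errors

    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    u_hat = Y - X @ beta_hat
    XtX_inv = np.linalg.inv(X.T @ X)

    # White estimator: (X'X)^{-1} (sum_i u_hat_i^2 x_i x_i') (X'X)^{-1}
    meat = (X * u_hat[:, None] ** 2).T @ X
    V_white = XtX_inv @ meat @ XtX_inv
    se_white = np.sqrt(np.diag(V_white))

    # Usual OLS variance estimate, invalid under heteroscedasticity
    sigma2_hat = u_hat @ u_hat / (n - X.shape[1])
    se_ols = np.sqrt(np.diag(sigma2_hat * XtX_inv))
    print(se_ols, se_white)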

  • Example

    SAS Program

    Proc Reg Data=outc ;
    Model UHat2 = sex1 age age2 diplo1 diplo2 diplo3 diplo4 nbenf1 nbenf2 nbenf3 ;
    Output OUT=MyData PREDICTED=Sig2hat ;
    Run ;

    /* Create weights as inverse of root of variance */

    Data MyData ; Set MyData ;
    OmegaInv = SQRT(1/Sig2hat) ;
    Run ;

    Proc Reg Data=MyData ;
    Model wage = sex1 age age2 diplo1 diplo2 diplo3 diplo4 nbenf1 nbenf2 nbenf3 ;
    Weight OmegaInv ;
    Run ;

  • Figure: Estimation of u2 when g(x) = x

  • Figure: Estimation of u2 when g(x) = exp(x)

  • Figure: Determinants of wage correcting for heteroscedasticity

    The simple linear regression model
      Introduction
      The Ordinary Least Squares estimation method
      Example
      Finite-sample properties
      Example

    The multiple linear regression model
      Introduction
      The Ordinary Least Squares estimation method
      Example
      Properties
      Comparison between OLS and ML
      Tests and inference
      Heteroscedasticity

