
Computer-Aided Introduction to Econometrics

Juan M. Rodriguez-Poo

    In cooperation with

Ignacio Moral, M. Teresa Aparicio, Inmaculada Villanúa, Pavel Čížek, Yingcun Xia, Pilar González, M. Paz Moral, Rong Chen, Rainer Schulz, Sabine Stephan, Pilar Olave, J. Tomás Alcalá and Lenka Čížková

    July 24, 2002


    Contents

    1 Univariate Linear Regression Model 1

    Ignacio Moral and Juan M. Rodriguez-Poo

    1.1 Probability and Data Generating Process . . . . . . . . . . . . 1

    1.1.1 Random Variable and Probability Distribution . . . . . 2

    1.1.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    1.1.3 Data Generating Process . . . . . . . . . . . . . . . . . 8

    1.1.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    1.2 Estimators and Properties . . . . . . . . . . . . . . . . . . . . . 12

    1.2.1 Regression Parameters and their Estimation. . . . . . . 14

    1.2.2 Least Squares Method . . . . . . . . . . . . . . . . . . . 16

    1.2.3 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 19

    1.2.4 Goodness of Fit Measures . . . . . . . . . . . . . . . . . 20

    1.2.5 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 23

1.2.6 Properties of the OLS Estimates of α, β and σ² . . . . . 23

    1.2.7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    1.3 Inference. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

1.3.1 Hypothesis Testing about β . . . . . . . . . . . . . . . . 31

    1.3.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    1.3.3 Testing Hypothesis Based on the Regression Fit . . . . 35


    1.3.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 37

1.3.5 Hypothesis Testing about α . . . . . . . . . . . . . . . . 37

    1.3.6 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 38

1.3.7 Hypothesis Testing about σ² . . . . . . . . . . . . . . . 38

    1.4 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    1.4.1 Confidence Interval for the Point Forecast . . . . . . . . 40

    1.4.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 41

    1.4.3 Confidence Interval for the Mean Predictor . . . . . . . 41

    2 Multivariate Linear Regression Model 45

Teresa Aparicio and Inmaculada Villanúa

    2.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    2.2 Classical assumptions of the MLRM . . . . . . . . . . . . . . . 46

    2.2.1 The systematic component assumptions . . . . . . . . . 47

    2.2.2 The random component assumptions . . . . . . . . . . . 48

    2.3 Estimation Procedures . . . . . . . . . . . . . . . . . . . . . . . 49

    2.3.1 The Least Squares estimation . . . . . . . . . . . . . . . 50

    2.3.2 The Maximum Likelihood Estimation . . . . . . . . . . 55

    2.3.3 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 57

    2.4 Properties of the estimators . . . . . . . . . . . . . . . . . . . . 59

2.4.1 Finite sample properties of the OLS and ML estimates of β 59

2.4.2 Finite sample properties of the OLS and ML estimates of σ² . . 63

2.4.3 Asymptotic properties of the OLS and ML estimators of β 66

2.4.4 Asymptotic properties of the OLS and ML estimators of σ² . . 70

    2.4.5 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 72

    2.5 Interval Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 72


    2.5.1 Interval Estimation of the coefficients of the MLRM . . 73

2.5.2 Interval Estimation of σ² . . . . . . . . . . . . . . . . . 74

    2.5.3 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 74

    2.6 Goodness of fit measures. . . . . . . . . . . . . . . . . . . . . . 74

    2.7 Linear Hypothesis testing . . . . . . . . . . . . . . . . . . . . . 77

    2.7.1 Hypothesis testing about the coefficients . . . . . . . . . 78

    2.7.2 Hypothesis testing about a coefficient of the MLRM . . 81

    2.7.3 Testing the overall significance of the model . . . . . . . 83

2.7.4 Testing hypothesis about σ² . . . . . . . . . . . . . . . . 84

2.7.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

    2.8 Restricted and unrestricted regression . . . . . . . . . . . . . . 85

    2.8.1 Restricted Least Squares and Restricted Maximum Like-lihood Estimators . . . . . . . . . . . . . . . . . . . . . 86

    2.8.2 Finite sample properties of the restricted estimator vector 89

    2.8.3 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 91

    2.9 Three general test procedures . . . . . . . . . . . . . . . . . . . 92

    2.9.1 Likelihood Ratio test (LR) . . . . . . . . . . . . . . . . 92

    2.9.2 The Wald test (W) . . . . . . . . . . . . . . . . . . . . . 93

    2.9.3 Lagrange Multiplier test (LM) . . . . . . . . . . . . . . 94

    2.9.4 Relationships and properties of the three general testingprocedures . . . . . . . . . . . . . . . . . . . . . . . . . 96

    2.9.5 The three general testing procedures in the MLRM context 97

    2.9.6 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    2.10 Dummy variables . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    2.10.1 Models with changes in the intercept . . . . . . . . . . . 103

    2.10.2 Models with changes in some slope parameters . . . . . 108

    2.10.3 Models with changes in all the coefficients . . . . . . . . 109


    2.10.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 111

    2.11 Forecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

    2.11.1 Point prediction . . . . . . . . . . . . . . . . . . . . . . 113

    2.11.2 Interval prediction . . . . . . . . . . . . . . . . . . . . . 115

    2.11.3 Measures of the accuracy of forecast . . . . . . . . . . . 117

    2.11.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 118

    3 Dimension Reduction and Its Applications 123

Pavel Čížek and Yingcun Xia

    3.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

    3.1.1 Real data sets. . . . . . . . . . . . . . . . . . . . . . . . 123

    3.1.2 Theoretical consideration . . . . . . . . . . . . . . . . . 126

    3.2 Average outer product of gradients and its estimation . . . . . 130

    3.2.1 The simple case. . . . . . . . . . . . . . . . . . . . . . . 130

    3.2.2 The varying-coefficient model . . . . . . . . . . . . . . . 132

    3.3 A Unified Estimation Method . . . . . . . . . . . . . . . . . . . 132

    3.3.1 The simple case. . . . . . . . . . . . . . . . . . . . . . . 133

    3.3.2 The varying-coefficient model . . . . . . . . . . . . . . . 142

    3.4 Number of e.d.r. Directions . . . . . . . . . . . . . . . . . . . . 144

    3.5 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

    3.6 Simulation results . . . . . . . . . . . . . . . . . . . . . . . . . 149

    3.7 Applications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

    3.8 Conclusions and further discussion . . . . . . . . . . . . . . . . 159

    3.9 Appendix. Assumptions and remarks. . . . . . . . . . . . . . . 160

    4 Univariate Time Series Modelling 167

Paz Moral and Pilar González

    http://alcib.bs.ehu.es/~pg/index.htmhttp://alcib.bs.ehu.es/~pm/pm.htm

    4.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

    4.2 Linear Stationary Models for Time Series . . . . . . . . . . . . 170

    4.2.1 White noise process . . . . . . . . . . . . . . . . . . . . 174

    4.2.2 Moving Average model. . . . . . . . . . . . . . . . . . . 175

    4.2.3 Autoregressive model . . . . . . . . . . . . . . . . . . . 178

    4.2.4 Autoregressive Moving Average model . . . . . . . . . . 182

    4.3 Nonstationary Models for Time Series . . . . . . . . . . . . . . 184

4.3.1 Nonstationarity in the variance . . . . . . . . . . . . . . 185

4.3.2 Nonstationarity in the mean . . . . . . . . . . . . . . . . 186

4.3.3 Testing for unit roots and stationarity . . . . . . . . . . 191

    4.4 Forecasting with ARIMA Models . . . . . . . . . . . . . . . . . 196

    4.4.1 The optimal forecast . . . . . . . . . . . . . . . . . . . . 196

    4.4.2 Computation of forecasts . . . . . . . . . . . . . . . . . 197

    4.4.3 Eventual forecast functions . . . . . . . . . . . . . . . . 198

    4.5 ARIMA model building . . . . . . . . . . . . . . . . . . . . . . 202

    4.5.1 Inference for the moments of stationary processes. . . . 202

    4.5.2 Identification of ARIMA models . . . . . . . . . . . . . 204

    4.5.3 Parameter estimation . . . . . . . . . . . . . . . . . . . 207

    4.5.4 Diagnostic checking . . . . . . . . . . . . . . . . . . . . 212

    4.5.5 Model selection criteria . . . . . . . . . . . . . . . . . . 215

    4.5.6 Example: European Union G.D.P. . . . . . . . . . . . . 217

    4.6 Regression Models for Time Series . . . . . . . . . . . . . . . . 221

    4.6.1 Cointegration . . . . . . . . . . . . . . . . . . . . . . . . 222

    4.6.2 Error correction models . . . . . . . . . . . . . . . . . . 225

5 Multiplicative SARIMA models 231

Rong Chen, Rainer Schulz and Sabine Stephan


    5.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231

    5.2 Modeling seasonal time series . . . . . . . . . . . . . . . . . . . 233

    5.2.1 Seasonal ARIMA models . . . . . . . . . . . . . . . . . 233

    5.2.2 Multiplicative SARIMA models . . . . . . . . . . . . . . 238

    5.2.3 The expanded model . . . . . . . . . . . . . . . . . . . . 239

    5.3 Identification of multiplicative SARIMA models. . . . . . . . . 240

    5.4 Estimation of multiplicative SARIMA models . . . . . . . . . . 246

    5.4.1 Maximum likelihood estimation . . . . . . . . . . . . . . 247

5.4.2 Setting the multiplicative SARIMA model . . . . . . . . 250

5.4.3 Setting the expanded model . . . . . . . . . . . . . . . . 252

    5.4.4 The conditional sum of squares . . . . . . . . . . . . . . 253

    5.4.5 The extended ACF . . . . . . . . . . . . . . . . . . . . . 256

    5.4.6 The exact likelihood . . . . . . . . . . . . . . . . . . . . 259

    References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

    6 AutoRegressive Conditional Heteroscedastic Models 263

Pilar Olave and José T. Alcalá

    6.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

    6.2 ARCH(1) model . . . . . . . . . . . . . . . . . . . . . . . . . . 268

    6.2.1 Conditional and unconditional moments of the ARCH(1) 268

    6.2.2 Estimation for ARCH(1) process . . . . . . . . . . . . . 271

    6.3 ARCH(q) model . . . . . . . . . . . . . . . . . . . . . . . . . . 275

    6.4 Testing heteroscedasticity and ARCH(1) disturbances . . . . . 277

    6.4.1 The Breusch-Pagan test . . . . . . . . . . . . . . . . . . 278

    6.4.2 ARCH(1) disturbance test. . . . . . . . . . . . . . . . . 279

6.5 ARCH(1) regression model . . . . . . . . . . . . . . . . . . 281

6.6 GARCH(p,q) model . . . . . . . . . . . . . . . . . . . . . . 283


    6.6.1 GARCH(1,1) model . . . . . . . . . . . . . . . . . . . . 285

    6.7 Extensions of ARCH models. . . . . . . . . . . . . . . . . . . . 287

    6.8 Two Examples of Spanish Financial Markets . . . . . . . . . . 289

    6.8.1 Ibex35 Data. . . . . . . . . . . . . . . . . . . . . . . . . 289

6.8.2 Exchange Rate US Dollar/Spanish Peseta data (continued) . 292

    7 Numerical Optimization Methods in Econometrics 297

Lenka Čížková

    7.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

    7.2 Solving a Nonlinear Equation . . . . . . . . . . . . . . . . . . . 297

    7.2.1 Termination of Iterative Methods. . . . . . . . . . . . . 298

    7.2.2 Newton-Raphson Method . . . . . . . . . . . . . . . . . 298

    7.3 Solving a System of Nonlinear Equations. . . . . . . . . . . . . 300

    7.3.1 Newton-Raphson Method for Systems . . . . . . . . . . 300

    7.3.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 301

    7.3.3 Modified Newton-Raphson Method for Systems . . . . . 303

    7.3.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 304

    7.4 Minimization of a Function: One-dimensional Case . . . . . . . 306

    7.4.1 Minimum Bracketing. . . . . . . . . . . . . . . . . . . . 306

    7.4.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 306

    7.4.3 Parabolic Interpolation . . . . . . . . . . . . . . . . . . 307

    7.4.4 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 309

    7.4.5 Golden Section Search . . . . . . . . . . . . . . . . . . . 310

    7.4.6 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 311

7.4.7 Brent's Method . . . . . . . . . . . . . . . . . . . . . . . 312

7.4.8 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

7.4.9 Brent's Method Using First Derivative of a Function . . 315


    7.4.10 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 315

    7.5 Minimization of a Function: Multidimensional Case . . . . . . 317

7.5.1 Nelder and Mead's Downhill Simplex Method (Amoeba) 317

    7.5.2 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 317

    7.5.3 Conjugate Gradient Methods . . . . . . . . . . . . . . . 318

    7.5.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 319

    7.5.5 Quasi-Newton Methods . . . . . . . . . . . . . . . . . . 322

    7.5.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 323

7.5.7 Line Minimization . . . . . . . . . . . . . . . . . . . . . 326

7.5.8 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 326

    7.6 Auxiliary Routines for Numerical Optimization . . . . . . . . . 330

    7.6.1 Gradient. . . . . . . . . . . . . . . . . . . . . . . . . . . 330

    7.6.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 330

    7.6.3 Jacobian. . . . . . . . . . . . . . . . . . . . . . . . . . . 333

    7.6.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 333

    7.6.5 Hessian . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

    7.6.6 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 335

    7.6.7 Restriction of a Function to a Line . . . . . . . . . . . . 336

    7.6.8 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 336

    7.6.9 Derivative of a Restricted Function . . . . . . . . . . . . 337

    7.6.10 Example. . . . . . . . . . . . . . . . . . . . . . . . . . . 337

    Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339


    Preface

This book is designed for undergraduate students, applied researchers and practitioners who want to develop professional skills in econometrics. The contents of the book are designed to satisfy the requirements of an undergraduate econometrics course of about 90 hours. Although the book presents a clear and serious theoretical treatment, its main strength is that it incorporates an interactive, internet-based computing method that allows the reader to practice all the techniques he is learning theoretically throughout the different chapters of the book. It provides a comprehensive treatment of the theoretical issues related to linear regression analysis and univariate time series modelling, along with some interesting extensions such as ARCH models and dimensionality reduction techniques, all of them illustrated through the same interactive computing method. Although the course assumes only a modest background, it moves quickly between different fields of application, and in the end the reader can expect to have theoretical and computational tools that are deep enough and rich enough to be relied on throughout future professional careers.

The computer-inexperienced user of this book is gently introduced to the interactive book concept and will certainly enjoy the various practical examples. The e-book is designed as an interactive document: a stream of text and information with various hints and links to additional tools and features. Our e-book design also offers a complete PDF and HTML file with links to worldwide computing servers. The reader of this book may therefore, without downloading or purchasing software, use all the presented examples and methods via a local XploRe Quantlet Server (XQS). Such servers may also be installed in a department or addressed freely on the web; see www.xplore-stat.de and www.quantlet.com.

Computer-Aided Introduction to Econometrics consists of three main parts: Linear Regression Analysis, Univariate Time Series Modelling and Computational Methods. In the first part, Moral and Rodriguez-Poo provide the basic background for univariate linear regression models: specification, estimation, testing and forecasting. Moreover, they provide some basic concepts of probability and inference that are required to fruitfully study further concepts in regression analysis. Aparicio and Villanúa provide a deep treatment of the multivariate linear regression model: basic assumptions, estimation methods and properties. Linear hypothesis testing and general test procedures (likelihood ratio test, Wald test and Lagrange multiplier test) are also developed. Finally, they consider some standard extensions in regression analysis such as dummy variables and restricted regression. Čížek and Xia close this part with a chapter devoted to dimension reduction techniques and applications. Since the techniques developed in that chapter are rather new, it is of a higher level of difficulty than the preceding ones.

The second part starts with an introduction to univariate time series analysis by Moral and González. Starting from the analysis of linear stationary processes, they move on to some particular cases of nonstationarity, such as nonstationarity in mean and variance, and they also provide some statistical tools for testing for unit roots. Furthermore, within the class of linear stationary processes they focus their attention on the subclass of ARIMA models. Finally, as a natural extension of the previous concepts to regression analysis, cointegration and error correction models are considered. Departing from the class of ARIMA models, Chen, Schulz and Stephan propose a way to deal with seasonal time series. Olave and Alcalá end this part with an introduction to autoregressive conditional heteroskedastic models, which appear as a natural extension of ARIMA modelling to econometric models with a time-varying conditional variance. In their work, they provide an interesting battery of tests for ARCH disturbances that serves as a nice example of the testing tools already introduced by Aparicio and Villanúa in a previous chapter.

In the last part of the book, Čížková develops several nonlinear optimization techniques that are of common use in econometrics. The special structure of the e-book, relying on an interactive, internet-based computing method, makes it an ideal tool for understanding optimization problems.

I gratefully acknowledge the support of the Deutsche Forschungsgemeinschaft, SFB 373 Quantifikation und Simulation Ökonomischer Prozesse, and the Dirección General de Investigación del Ministerio de Ciencia y Tecnología under research grant BEC2001-1121. For the technical production of the e-book I would like to thank Zdeněk Hlávka and Rodrigo Witzel.

    J. M. Rodriguez-Poo


    Santander, May 2002


    Contributors

Ignacio Moral, Departamento de Economía, Universidad de Cantabria

Juan M. Rodriguez-Poo, Departamento de Economía, Universidad de Cantabria

Teresa Aparicio, Departamento de Análisis Económico, Universidad de Zaragoza

Inmaculada Villanúa, Departamento de Análisis Económico, Universidad de Zaragoza

Pavel Čížek, Humboldt-Universität zu Berlin, CASE, Center of Applied Statistics and Economics

Yingcun Xia, Department of Statistics and Actuarial Science, The University of Hong Kong

Paz Moral, Departamento de Econometría y Estadística, Universidad del País Vasco

Pilar González, Departamento de Econometría y Estadística, Universidad del País Vasco

Rong Chen, Department of Information and Decision Sciences, University of Illinois at Chicago

Rainer Schulz, Humboldt-Universität zu Berlin, CASE, Center of Applied Statistics and Economics

Sabine Stephan, German Institute for Economic Research

Pilar Olave, Departamento de Métodos Estadísticos, Universidad de Zaragoza

Juan T. Alcalá, Departamento de Métodos Estadísticos, Universidad de Zaragoza

Lenka Čížková, Humboldt-Universität zu Berlin, CASE, Center of Applied Statistics and Economics


1 Univariate Linear Regression Model

    Ignacio Moral and Juan M. Rodriguez-Poo

In this section we concentrate our attention on the univariate linear regression model. In economics, we can find innumerable discussions of relationships between variables in pairs: consumption and real disposable income, labor supply and real wages, and many more. However, the main interest in the study of this model is not its real applicability but the fact that the mathematical and statistical tools developed for the two-variable model are the foundations of other, more complicated models.

An econometric study begins with a theoretical proposition about the relationship between two variables. Then, given a data set, the empirical investigation provides estimates of the unknown parameters in the model, and often attempts to measure the validity of the propositions against the behavior of observable data. It is not our aim to include here a detailed discussion of econometric model building; this type of discussion can be found in Intriligator (1978). However, in the subsequent subsections we will introduce, using Monte Carlo simulations, the main results related to estimation and inference in univariate linear regression models. The next chapters of the book develop more elaborate specifications and various problems that arise in the study and application of these techniques.

    1.1 Probability and Data Generating Process

In this section we review some concepts that are necessary to understand further developments in the chapter. The purpose is to highlight some of the more important theoretical results in probability, in particular the concept of a random variable, the probability distribution, and some related results. Note, however, that we try to maintain the exposition at an introductory level. For a more formal and detailed exposition of these concepts see Härdle and Simar (1999), Mantzapoulus (1995), Newbold (1996) and Wonnacott and Wonnacott (1990).

    1.1.1 Random Variable and Probability Distribution

A random variable is a function that assigns (real) numbers to the results of an experiment. Each possible outcome of the experiment (i.e. value of the corresponding random variable) occurs with a certain probability. This outcome variable, $X$, is a random variable because, until the experiment is performed, it is uncertain what value $X$ will take. Probabilities are associated with outcomes to quantify this uncertainty.

A random variable is called discrete if the set of all possible outcomes $x_1, x_2, \ldots$ is finite or countable. For a discrete random variable $X$, a probability density function is defined to be the function $f(x_i)$ such that for any real number $x_i$, which is a value that $X$ can take, $f$ gives the probability that the random variable $X$ is equal to $x_i$. If $x_i$ is not one of the values that $X$ can take, then $f(x_i) = 0$.

$$P(X = x_i) = f(x_i), \quad i = 1, 2, \ldots$$

$$f(x_i) \geq 0, \qquad \sum_i f(x_i) = 1$$

A continuous random variable $X$ can take any value in at least one interval on the real number line. Assume $X$ can take values $c \leq x \leq d$. Since the possible values of $X$ are uncountable, the probability associated with any particular point is zero. Unlike the situation for discrete random variables, the density function of a continuous random variable will not give the probability that $X$ takes the value $x_i$. Instead, the density function of a continuous random variable $X$ will be such that areas under $f(x)$ give probabilities associated with the corresponding intervals. The probability density function is defined so that $f(x) \geq 0$ and

$$P(a < X \leq b) = \int_a^b f(x)\,dx, \qquad a \leq b \qquad (1.1)$$


This is the area under $f(x)$ in the range from $a$ to $b$. For a continuous variable,

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1 \qquad (1.2)$$

Cumulative Distribution Function

A function closely related to the probability density function of a random variable is the corresponding cumulative distribution function. This function of a discrete random variable $X$ is defined as follows:

$$F(x) = P(X \leq x) = \sum_{x_i \leq x} f(x_i) \qquad (1.3)$$

That is, $F(x)$ is the probability that the random variable $X$ takes a value less than or equal to $x$.

The cumulative distribution function for a continuous random variable $X$ is given by

$$F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\,dt \qquad (1.4)$$

where $f(t)$ is the probability density function. In both the continuous and the discrete case, $F(x)$ must satisfy the following properties:

- $0 \leq F(x) \leq 1$.
- If $x_2 > x_1$ then $F(x_2) \geq F(x_1)$.
- $F(+\infty) = 1$ and $F(-\infty) = 0$.

Expectations of Random Variables

The expected value of a random variable $X$ is the value that we, on average, expect to obtain as an outcome of the experiment. It is not necessarily a value actually taken by the random variable. The expected value, denoted by $E(X)$ or $\mu$, is a weighted average of the values taken by the random variable $X$, where the weights are the respective probabilities.


Let us consider the discrete random variable $X$ with outcomes $x_1, \ldots, x_n$ and corresponding probabilities $f(x_i)$. Then, the expression

$$E(X) = \mu = \sum_{i=1}^{n} x_i f(x_i) \qquad (1.5)$$

defines the expected value of the discrete random variable. For a continuous random variable $X$ with density $f(x)$, we define the expected value as

$$E(X) = \mu = \int_{-\infty}^{+\infty} x f(x)\,dx \qquad (1.6)$$

Joint Distribution Function

We consider an experiment that consists of two parts, each of which leads to the occurrence of specified events. We could study both events separately; however, we might be interested in analyzing them jointly. The probability function defined over a pair of random variables is called the joint probability distribution. Consider two random variables $X$ and $Y$; their joint probability distribution function is defined as the probability that $X$ is equal to $x_i$ at the same time that $Y$ is equal to $y_j$:

$$P(\{X = x_i\} \cap \{Y = y_j\}) = P(X = x_i, Y = y_j) = f(x_i, y_j), \quad i, j = 1, 2, \ldots \qquad (1.7)$$

If $X$ and $Y$ are continuous random variables, then the bivariate probability density function is:

$$P(a < X \leq b;\; c < Y \leq d) = \int_c^d \int_a^b f(x, y)\,dx\,dy \qquad (1.8)$$

The counterparts of the requirements for a probability density function are:

$$\sum_i \sum_j f(x_i, y_j) = 1, \qquad \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1 \qquad (1.9)$$

The cumulative joint distribution function, in the case that both $X$ and $Y$ are discrete random variables, is

$$F(x, y) = P(X \leq x, Y \leq y) = \sum_{x_i \leq x} \sum_{y_j \leq y} f(x_i, y_j) \qquad (1.10)$$

and if both $X$ and $Y$ are continuous random variables, then

$$F(x, y) = P(X \leq x, Y \leq y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(t, v)\,dt\,dv \qquad (1.11)$$

Marginal Distribution Function

Consider now that we know a bivariate random variable $(X, Y)$ and its probability distribution, and suppose we simply want to study the probability distribution of $X$, say $f(x)$. How can we use the joint probability density function for $(X, Y)$ to obtain $f(x)$?

The marginal distribution, $f(x)$, of a discrete random variable $X$ provides the probability that the variable $X$ is equal to $x$ in the joint probability $f(X, Y)$, without considering the variable $Y$. Thus, if we want to obtain the marginal distribution of $X$ from the joint density, it is necessary to sum out the other variable $Y$. The marginal distribution for the random variable $Y$, $f(y)$, is defined analogously.

$$P(X = x) = f(x) = \sum_{Y} f(x, Y) \qquad (1.12)$$

$$P(Y = y) = f(y) = \sum_{X} f(X, y) \qquad (1.13)$$

The resulting marginal distributions are one-dimensional.
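As a small numerical illustration (the probabilities here are invented for the example), let $X$ and $Y$ each take the values 0 and 1, with joint probabilities

$$f(0, 0) = 0.2, \quad f(0, 1) = 0.3, \quad f(1, 0) = 0.1, \quad f(1, 1) = 0.4.$$

Summing out $Y$ gives the marginal distribution of $X$: $f(x = 0) = 0.2 + 0.3 = 0.5$ and $f(x = 1) = 0.1 + 0.4 = 0.5$. Summing out $X$ gives the marginal distribution of $Y$: $f(y = 0) = 0.2 + 0.1 = 0.3$ and $f(y = 1) = 0.3 + 0.4 = 0.7$. Each marginal sums to one, consistent with (1.9).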


Similarly, we obtain the marginal densities for a pair of continuous random variables $X$ and $Y$:

$$f(x) = \int_{-\infty}^{+\infty} f(x, y)\,dy \qquad (1.14)$$

$$f(y) = \int_{-\infty}^{+\infty} f(x, y)\,dx \qquad (1.15)$$

Conditional Probability Distribution Function

In the setting of a joint bivariate distribution $f(X, Y)$, consider the case when we have partial information about $X$. More concretely, we know that the random variable $X$ has taken some value $x$. We would like to know the conditional behavior of $Y$ given that $X$ has taken the value $x$. The resulting probability distribution of $Y$ given $X = x$ is called the conditional probability distribution function of $Y$ given $X$, $F_{Y|X=x}(y)$. In the discrete case it is defined as

$$F_{Y|X=x}(y) = P(Y \leq y \mid X = x) = \frac{\sum_{y_j \leq y} f(x, y_j)}{f(x)} = \sum_{y_j \leq y} f(y_j \mid x) \qquad (1.16)$$

where $f(y \mid x)$ is the conditional probability density function, and $x$ must be such that $f(x) > 0$. In the continuous case $F_{Y|X=x}(y)$ is defined as

$$F_{Y|X=x}(y) = P(Y \leq y \mid X = x) = \int_{-\infty}^{y} f(y \mid x)\,dy = \int_{-\infty}^{y} \frac{f(x, y)}{f(x)}\,dy \qquad (1.17)$$

where $f(y \mid x)$ is the conditional probability density function and $x$ must be such that $f(x) > 0$.

Conditional Expectation

The concept of mathematical expectation can be applied regardless of the kind of probability distribution; thus, for a pair of random variables $(X, Y)$ with conditional probability density function $f(y \mid x)$, the conditional expectation is defined as the expected value of the conditional distribution, i.e.

$$E(Y \mid X = x) = \begin{cases} \sum_{j=1}^{n} y_j f(Y = y_j \mid X = x) & \text{if } Y \text{ discrete} \\[4pt] \int_{-\infty}^{+\infty} y f(y \mid x)\,dy & \text{if } Y \text{ continuous} \end{cases} \qquad (1.18)$$

Note that for the discrete case, $y_1, \ldots, y_n$ are values such that $f(Y = y_j \mid X = x) > 0$.

The Regression Function

Let us define a pair of random variables $(X, Y)$ with a range of possible values such that the conditional expectation of $Y$ given $X$ is well defined for several values $X = x_1, \ldots, x_n$. Then, a regression is just a function that relates the different values of $X$, say $x_1, \ldots, x_n$, and their corresponding values in terms of the conditional expectation $E(Y|X = x_1), \ldots, E(Y|X = x_n)$.

The main objective of regression analysis is to estimate and predict the mean value (expectation) of the dependent variable $Y$ based on the given (fixed) values of the explanatory variable. The regression function describes the dependence of a quantity $Y$ on the quantity $X$; a one-directional dependence is assumed. The random variable $X$ is referred to as the regressor, explanatory variable or independent variable; the random variable $Y$ is referred to as the regressand or dependent variable.

    1.1.2 Example

In the following quantlet, we take a two-dimensional random variable $(X, Y)$, calculate the conditional expectation $E(Y|X = x)$, and generate a line by merging the values of the conditional expectation at each value of $x$. The result is identical to the regression of $y$ on $x$.

Let us consider 54 households as the whole population. We want to know the relationship between net income and household expenditure; that is, we want a prediction of the expected expenditure given the level of net income of the household. In order to do so, we separate the 54 households into 9 groups with the same income; then we calculate the mean expenditure for every level of income (XEGlinreg01.xpl).

This program produces the output presented in Figure 1.1.

Figure 1.1: Conditional Expectation E(Y|X = x) (axes: x = net income, y = expenditure)

The function $E(Y|X = x)$ is called a regression function. This function expresses only the fact that the (population) mean of the distribution of $Y$ given $X$ has a functional relationship with respect to $X$.
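The computation above is performed by the XploRe quantlet XEGlinreg01.xpl. For readers who prefer a self-contained illustration, the following Python sketch reproduces the idea; the income levels, group sizes and expenditure rule below are invented for the example and are not the book's data set.

import numpy as np

# Illustrative stand-in for XEGlinreg01: group households by income level
# and take the group mean as the conditional expectation E(Y | X = x).
rng = np.random.default_rng(0)

income = np.repeat(np.arange(10, 100, 10), 6)    # 9 income levels, 6 households each: 54 in total
expenditure = 20 + 0.6 * income + rng.normal(0, 5, income.size)

for x in np.unique(income):
    print(f"E(Y | X = {x:2d}) = {expenditure[income == x].mean():6.2f}")

Merging the nine conditional means, one per income level, traces out a regression line like the one shown in Figure 1.1.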

    1.1.3 Data Generating Process

One of the major tasks of statistics is to obtain information about populations. A population is defined as the set of all elements that are of interest for a statistical analysis, and it must be defined precisely and comprehensively, so that one can immediately determine whether an element belongs to the population or not. We denote by $N$ the population size. In fact, in most cases the population is unknown, and for the sake of analysis we suppose that it is characterized by a joint probability distribution function. What is known to the researcher is a finite subset of observations drawn from this population. This is called a sample, and we will denote the sample size by $n$. The main aim of the statistical analysis is to obtain information about the population (its joint probability distribution) through the analysis of the sample.

Unfortunately, in many situations the aim of obtaining information about the whole joint probability distribution is too complicated, and we have to orient our objective towards more modest proposals. Instead of characterizing the whole joint distribution function, one can be more interested in investigating one particular feature of this distribution, such as the regression function. In this case we will denote it as the Population Regression Function (PRF), a statistical object that has already been defined in Sections 1.1.1 and 1.1.2.

Since very little information is known about the population characteristics, one has to establish some assumptions about the behavior of this unknown quantity. Then, if we consider the observations in Figure 1.1 as the whole population, we can state that the PRF is a linear function of the different values of $X$, i.e.

$$E(Y|X = x) = \alpha + \beta x \qquad (1.19)$$

where $\alpha$ and $\beta$ are fixed unknown parameters, denoted as regression coefficients. Note the crucial point that once we have determined the functional form of the regression function, estimation of the parameter values is tantamount to estimation of the entire regression function. Therefore, once a sample is available, our task is considerably simplified: in order to analyze the whole population, we only need to give correct estimates of the regression parameters.

One important issue related to the Population Regression Function is the so-called error term in the regression equation. For a pair of realizations $(x_i, y_i)$ of the random variable $(X, Y)$, we note that $y_i$ will not coincide with $E(Y|X = x_i)$. We define

$$u_i = y_i - E(Y|X = x_i) \qquad (1.20)$$

as the error term in the regression function; it indicates the divergence between an individual value $y_i$ and its conditional mean, $E(Y|X = x_i)$. Taking into account equations (1.19) and (1.20) we can write the following equalities:

$$y_i = E(Y|X = x_i) + u_i = \alpha + \beta x_i + u_i \qquad (1.21)$$

and

$$E(u|X = x_i) = 0$$


This result implies that for $X = x_i$, the divergences of all values of $Y$ with respect to the conditional expectation $E(Y|X = x_i)$ are averaged out. There are several reasons for the existence of the error term in the regression:

- The error term accounts for variables that are not in the model, since we do not know whether every potential regressor has an influence on the endogenous variable.
- We do not have complete confidence in the correctness of the model.
- There may be measurement errors in the variables.

The PRF is a feature of the so-called Data Generating Process (DGP). This is the joint probability distribution that is supposed to characterize the entire population from which the data set has been drawn. Now, assume that from the population of $N$ elements characterized by a bivariate random variable $(X, Y)$, a sample of $n$ elements, $(x_1, y_1), \ldots, (x_n, y_n)$, is selected. If we assume that the Population Regression Function (PRF) that generates the data is

$$y_i = \alpha + \beta x_i + u_i, \quad i = 1, \ldots, n, \qquad (1.22)$$

then, given any estimators of $\alpha$ and $\beta$, namely $\hat{\alpha}$ and $\hat{\beta}$, we can substitute these estimators into the regression function,

$$\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i, \quad i = 1, \ldots, n, \qquad (1.23)$$

obtaining the sample regression function (SRF). The relationship between the PRF and the SRF is:

$$y_i = \hat{y}_i + \hat{u}_i, \quad i = 1, \ldots, n, \qquad (1.24)$$

where $\hat{u}_i$ is denoted the residual.

To illustrate the difference between the Sample Regression Function and the Population Regression Function, consider the data shown in Figure 1.1 (the whole population of the experiment). Let us draw a sample of 9 observations from this population (XEGlinreg02.xpl).


This is shown in Figure 1.2. If we assume that the model which generates the data is $y_i = \alpha + \beta x_i + u_i$, then using the sample we can estimate the parameters $\alpha$ and $\beta$ (XEGlinreg03.xpl).

In Figure 1.3 we present the sample, the population regression function (thick line), and the sample regression function (thin line). For fixed values of $x$ in the sample, the sample regression function depends on the sample, whereas the population regression function always takes the same values regardless of the sample.

Figure 1.2: Sample n = 9 of (X, Y) (axes: x = net income, y = expenditure)

With a data generating process (DGP) at hand, it is possible to create new simulated data. If $\alpha$, $\beta$ and the vector of exogenous variables $X$ are known (fixed), a sample of size $n$ is created by obtaining $n$ values of the random variable $u$ and then using these values, in conjunction with the rest of the model, to generate $n$ values of $Y$. This yields one complete sample of size $n$. Note that this artificially generated set of sample data could be viewed as an example of the real-world data that a researcher would be faced with when dealing with the kind of estimation problem this model represents. Note especially that the set of data obtained depends crucially on the particular set of error terms drawn. A different set of error terms would create a different data set of $Y$ for the same problem (see for more details Kennedy (1998)).


Figure 1.3: Sample and Population Regression Function (axes: x = net income, y = expenditure)

    1.1.4 Example

In order to show how a DGP works, we implement the following experiment: we generate three replicates of a sample of size $n = 10$ from the data generating process $y_i = 2 + 0.5 x_i + u_i$, where $X$ is generated by a uniform distribution, $X \sim U[0, 1]$ (XEGlinreg04.xpl).

This code produces the values of $X$, which are the same for the three samples, and the corresponding values of $Y$, which of course differ from one sample to the other.
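A hedged sketch of this experiment in Python (the book's own implementation is the XploRe quantlet XEGlinreg04.xpl; the N(0, 1) error distribution below is an assumption made for the illustration, since the text does not specify it):

import numpy as np

# Three replicate samples of size n = 10 from y = 2 + 0.5 x + u.
rng = np.random.default_rng(42)

n = 10
x = rng.uniform(0, 1, n)        # X ~ U[0, 1], held fixed across the replicates
for r in range(3):
    u = rng.normal(0, 1, n)     # a fresh draw of the error term for each replicate
    y = 2 + 0.5 * x + u         # same DGP, different errors -> different sample of Y
    print(f"replicate {r + 1}:", np.round(y, 2))

Each run of the loop plays the role of one researcher's data set: the systematic part 2 + 0.5x is identical, and only the drawn errors differ.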

    1.2 Estimators and Properties

If we have available a sample of $n$ observations from the population represented by $(X, Y)$, $(x_1, y_1), \ldots, (x_n, y_n)$, and we assume the Population Regression Function is both linear in variables and parameters,

$$y_i = E(Y|X = x_i) + u_i = \alpha + \beta x_i + u_i, \quad i = 1, \ldots, n, \qquad (1.25)$$

we can now face the task of estimating the unknown parameters $\alpha$ and $\beta$. Unfortunately, the sampling design and the linearity assumption in the PRF are not sufficient conditions to ensure that there exists a precise statistical relationship between the estimators and their true corresponding values (see Section 1.2.6 for more details). In order to do so, we need to know some additional features of the PRF. Since we do not know them, we establish some assumptions, making clear that in any case the statistical properties of the estimators depend crucially on these assumptions. The basic set of assumptions that comprises the classical linear regression model is as follows:

(A.1) The explanatory variable, $X$, is fixed.

(A.2) For any $n > 1$, $\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 > 0$.

(A.3) $\lim_{n \to \infty} \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = m > 0$.

(A.4) Zero mean disturbances: $E(u) = 0$.

(A.5) Homoscedasticity: $Var(u_i) = \sigma^2 < \infty$ is constant for all $i$.

(A.6) Nonautocorrelation: $Cov(u_i, u_j) = 0$ if $i \neq j$.

Finally, an additional assumption that is usually employed to ease the inference is

(A.7) The error term has a Gaussian distribution, $u_i \sim N(0, \sigma^2)$.

For a more detailed explanation of and comments on the different assumptions see Gujarati (1995). Assumption (A.1) is quite strong, and it is in fact very difficult to accept when dealing with economic data. However, most of the statistical results obtained under this hypothesis also hold under weaker conditions, such as $X$ random but independent of $u$ (see Amemiya (1985) for the fixed design case, and Newey and McFadden (1994) for the random design).


    1.2.1 Regression Parameters and their Estimation

In the univariate linear regression setting introduced in the previous section, the following parameters need to be estimated:

- $\alpha$: intercept term. It gives the value of the conditional expectation of $Y$ given $X = x$ for $x = 0$.

- $\beta$: linear slope coefficient. It represents the sensitivity of $E(Y|X = x)$ to changes in $x$.

- $\sigma^2$: measure of dispersion of the error term. Large values of the variance mean that the error term $u$ is likely to vary in a large neighborhood around its expected value; smaller values indicate that the values of $u$ will be concentrated around the expected value.

    Regression Estimation

From a given population described as

$$y = 3 + 2.5x + u \qquad (1.26)$$

with $X \sim U[0, 1]$ and $u \sim N(0, 1)$, a random sample of $n = 100$ elements is generated (XEGlinreg05.xpl). We show the scatter plot in Figure 1.4.

Following the same reasoning as in the previous sections, the PRF is unknown to the researcher; he has available only the data and some information about the PRF. For example, he may know that the relationship between $E(Y|X = x)$ and $x$ is linear, but he does not know the exact parameter values. In Figure 1.5 we represent the sample and several possible regression functions according to different values of $\alpha$ and $\beta$ (XEGlinreg06.xpl).

In order to estimate $\alpha$ and $\beta$, many estimation procedures are available. One of the most famous criteria chooses $\hat{\alpha}$ and $\hat{\beta}$ so as to minimize the sum of the squared deviations of the regression values from their real corresponding values; this is the so-called least squares method.


Figure 1.4: Sample n = 100 of (X, Y)

Figure 1.5: Sample of (X, Y) and possible linear regression functions

Applying this procedure to the previous sample (XEGlinreg07.xpl), we show in Figure 1.6, for the sake of comparison, the least squares regression line together with the other sample regression lines.


Figure 1.6: Ordinary Least Squares Estimation

We now describe more precisely how the least squares method is implemented and, under a Population Regression Function that incorporates assumptions (A.1) to (A.6), what its statistical properties are.

    1.2.2 Least Squares Method

We begin by establishing a formal estimation criterion. Let $\tilde{\alpha}$ and $\tilde{\beta}$ be possible estimators (some functions of the sample observations) of $\alpha$ and $\beta$. Then, the fitted value of the endogenous variable is:

$$\tilde{y}_i = \tilde{\alpha} + \tilde{\beta} x_i, \quad i = 1, \ldots, n \qquad (1.27)$$

The residual between the real and the fitted value is given by

$$\tilde{u}_i = y_i - \tilde{y}_i, \quad i = 1, \ldots, n \qquad (1.28)$$

The least squares method minimizes the sum of squared deviations of the regression values ($\tilde{y}_i = \tilde{\alpha} + \tilde{\beta} x_i$) from the observed values ($y_i$), that is, the residual sum of squares, RSS:

$$\sum_{i=1}^{n} (y_i - \tilde{y}_i)^2 \to \min \qquad (1.29)$$

This criterion function has two variables with respect to which we minimize, $\tilde{\alpha}$ and $\tilde{\beta}$:

$$S(\tilde{\alpha}, \tilde{\beta}) = \sum_{i=1}^{n} (y_i - \tilde{\alpha} - \tilde{\beta} x_i)^2. \qquad (1.30)$$

Then, we define as Ordinary Least Squares (OLS) estimators, denoted by $\hat{\alpha}$ and $\hat{\beta}$, the values of $\tilde{\alpha}$ and $\tilde{\beta}$ that solve the following optimization problem:

$$(\hat{\alpha}, \hat{\beta}) = \arg\min_{\tilde{\alpha}, \tilde{\beta}} S(\tilde{\alpha}, \tilde{\beta}) \qquad (1.31)$$

In order to solve it, that is, to find the minimum, the first-order conditions require the first partial derivatives to be equal to zero:

$$\frac{\partial S(\tilde{\alpha}, \tilde{\beta})}{\partial \tilde{\alpha}} = -2 \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \qquad (1.32)$$

$$\frac{\partial S(\tilde{\alpha}, \tilde{\beta})}{\partial \tilde{\beta}} = -2 \sum_{i=1}^{n} (y_i - \hat{\alpha} - \hat{\beta} x_i) x_i = 0$$

To verify that the solution is really a minimum, the matrix of second-order derivatives of (1.32), the Hessian matrix, must be positive definite. It is easy to show that

$$H(\tilde{\alpha}, \tilde{\beta}) = 2 \begin{pmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{pmatrix}, \qquad (1.33)$$

and this expression is positive definite if and only if $\sum_i (x_i - \bar{x})^2 > 0$. But this is implied by assumption (A.2). Note that this requirement is not strong at all. Without it, we might consider regression problems where no variation at all is present in the values of $X$; condition (A.2) rules out this degenerate case.

The first derivatives set equal to zero lead to the so-called (least squares) normal equations, from which the estimated regression parameters can be computed:

$$n\hat{\alpha} + \hat{\beta} \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \qquad (1.34)$$

$$\hat{\alpha} \sum_{i=1}^{n} x_i + \hat{\beta} \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \qquad (1.35)$$

Dividing the original equations by $n$, we get a simplified formula suitable for the computation of the regression parameters:

$$\hat{\alpha} + \hat{\beta}\bar{x} = \bar{y}$$

$$\hat{\alpha}\bar{x} + \hat{\beta} \frac{1}{n} \sum_{i=1}^{n} x_i^2 = \frac{1}{n} \sum_{i=1}^{n} x_i y_i$$

For the estimated intercept $\hat{\alpha}$, we get:

$$\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} \qquad (1.36)$$

For the estimated linear slope coefficient $\hat{\beta}$, we get:

$$(\bar{y} - \hat{\beta}\bar{x})\bar{x} + \hat{\beta} \frac{1}{n} \sum_{i=1}^{n} x_i^2 = \frac{1}{n} \sum_{i=1}^{n} x_i y_i$$

$$\hat{\beta} \frac{1}{n} \sum_{i=1}^{n} (x_i^2 - \bar{x}^2) = \frac{1}{n} \sum_{i=1}^{n} x_i y_i - \bar{x}\bar{y}$$

$$\hat{\beta} S_X^2 = S_{XY}$$

$$\hat{\beta} = \frac{S_{XY}}{S_X^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (1.37)$$

The ordinary least squares estimator of the parameter $\sigma^2$ is based on the following idea: since $\sigma^2$ is the expected value of $u_i^2$ and $\hat{u}_i$ is an estimate of $u_i$, our initial estimator

$$\tilde{\sigma}^2 = \frac{1}{n} \sum_i \hat{u}_i^2 \qquad (1.38)$$

would seem to be a natural estimator of $\sigma^2$. But due to the fact that $E\left(\sum_i \hat{u}_i^2\right) = (n-2)\sigma^2$, this implies

$$E(\tilde{\sigma}^2) = \frac{n-2}{n}\,\sigma^2 \neq \sigma^2. \qquad (1.39)$$

Therefore, the unbiased estimator of $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{\sum_i \hat{u}_i^2}{n-2} \qquad (1.40)$$

With this expression, we obtain that $E(\hat{\sigma}^2) = \sigma^2$.

In the next section we will introduce an example of the least squares estimation criterion.
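Before turning to the example, note that formulas (1.36), (1.37) and (1.40) can be computed directly from data. The following Python sketch does so for a sample simulated from the DGP of (1.26); it is an illustrative stand-in, not the book's XploRe code:

import numpy as np

# OLS estimates via (1.36), (1.37) and (1.40) on simulated data from (1.26).
rng = np.random.default_rng(1)

n = 100
x = rng.uniform(0, 1, n)                   # X ~ U[0, 1]
y = 3 + 2.5 * x + rng.normal(0, 1, n)      # y = 3 + 2.5 x + u, u ~ N(0, 1)

beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # (1.37)
alpha_hat = y.mean() - beta_hat * x.mean()                                        # (1.36)

u_hat = y - (alpha_hat + beta_hat * x)     # estimated residuals
sigma2_hat = np.sum(u_hat ** 2) / (n - 2)  # unbiased variance estimator (1.40)

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")

With n = 100 the estimates should fall close to the true values α = 3, β = 2.5 and σ² = 1.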

    1.2.3 Example

We can obtain a graphical representation of the ordinary least squares estimation by using the following quantlet:

gl = grlinreg (x)

The regression line computed by the least squares method using the data generated in (1.26) (XEGlinreg08.xpl)


is shown in Figure 1.7 jointly with the data set.

Figure 1.7: Ordinary Least Squares Estimation

    1.2.4 Goodness of Fit Measures

Once the regression line is estimated, it is useful to know how well the regression line approximates the data from the sample. A measure that describes this quality of representation is called the coefficient of determination (R-squared or $R^2$). Its computation is based on a decomposition of the variance of the values of the dependent variable $Y$.

The smaller the sum of squared estimated residuals, the better the quality of the regression line. Since the least squares method minimizes the variance of the estimated residuals, it also maximizes the R-squared by construction:

$$\sum (y_i - \hat{y}_i)^2 = \sum \hat{u}_i^2 \to \min. \qquad (1.41)$$

The sample variance of the values of $Y$ is:

$$S_Y^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n} \qquad (1.42)$$


The element $\sum_{i=1}^{n} (y_i - \bar{y})^2$ is known as the Total Sum of Squares (TSS); it is the total variation of the values of $Y$ around $\bar{y}$. The deviation of the observed values, $y_i$, from the arithmetic mean, $\bar{y}$, can be decomposed into two parts: the deviation of the observed values of $Y$ from the estimated regression values, and the deviation of the estimated regression values from the sample mean, i.e.

$$y_i - \bar{y} = (y_i - \hat{y}_i + \hat{y}_i - \bar{y}) = \hat{u}_i + (\hat{y}_i - \bar{y}), \quad i = 1, \ldots, n \qquad (1.43)$$

where $\hat{u}_i = y_i - \hat{y}_i$ is the error term in this estimate. Note also that, considering the properties of the OLS estimators, it can be proved that the mean of the fitted values equals $\bar{y}$. Squaring the residuals and summing over all the observations, we obtain the Residual Sum of Squares, $RSS = \sum_{i=1}^{n} \hat{u}_i^2$. As a goodness of fit criterion the RSS is not satisfactory, because it is very sensitive to the units in which $Y$ is measured. In order to propose a criterion that is not sensitive to the measurement units, let us decompose the sum of the squared deviations of equation (1.43) as

$$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} \left[ (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y}) \right]^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + 2 \sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) \qquad (1.44)$$

Now, noting that by the properties of the OLS estimators $\sum_{i=1}^{n} (y_i - \hat{y}_i)(\hat{y}_i - \bar{y}) = 0$, expression (1.44) can be written as

$$TSS = ESS + RSS, \qquad (1.45)$$

where $ESS = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$ is the so-called Explained Sum of Squares. Now, dividing both sides of equation (1.45) by $n$, we obtain

$$\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n} + \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{n} = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n} + \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{n} \qquad (1.46)$$


and then,

$$S_Y^2 = S_{\hat{u}}^2 + S_{\hat{Y}}^2 \qquad (1.47)$$

The total variance of $Y$ is equal to the sum of the sample variance of the estimated residuals (the unexplained part of the sampling variance of $Y$) and the part of the sampling variance of $Y$ that is explained by the regression function (the sampling variance of the regression function).

The larger the portion of the sampling variance of the values of $Y$ explained by the model, the better the fit of the regression function.

The Coefficient of Determination

The coefficient of determination is defined as the ratio between the sampling variance of the values of $Y$ explained by the regression function and the sampling variance of the values of $Y$. That is, it represents the proportion of the sampling variance in the values of $Y$ explained by the estimated regression function:

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{S_{\hat{Y}}^2}{S_Y^2} \qquad (1.48)$$

This expression is unit-free, because both the numerator and the denominator have the same units. The higher the coefficient of determination, the better the regression function explains the observed values. Other expressions for the coefficient are

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS} = \frac{\hat{\beta} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

One special feature of this coefficient is that the R-squared can take values only in the range $0 \leq R^2 \leq 1$. This is always true if the model includes a constant term in the population regression function. A small value of $R^2$ implies that a lot of the variation in the values of $Y$ has not been explained by the variation of the values of $X$.
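The decomposition (1.45) and the definition (1.48) are easy to verify numerically. A minimal Python sketch, again with data simulated from (1.26):

import numpy as np

# Numerical check of TSS = ESS + RSS (1.45) and of R-squared (1.48).
rng = np.random.default_rng(1)

x = rng.uniform(0, 1, 100)
y = 3 + 2.5 * x + rng.normal(0, 1, 100)

beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
y_fit = (y.mean() - beta_hat * x.mean()) + beta_hat * x   # fitted values

tss = np.sum((y - y.mean()) ** 2)        # total sum of squares
ess = np.sum((y_fit - y.mean()) ** 2)    # explained sum of squares
rss = np.sum((y - y_fit) ** 2)           # residual sum of squares

print("TSS = ESS + RSS:", np.isclose(tss, ess + rss))
print(f"R2 = {ess / tss:.3f}")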


    1.2.5 Example

Ordinary least squares estimates of the parameters of interest are given by executing the following quantlet:

{beta,bse,bstan,bpval}=linreg(x,y)

As an example, we use the original data source that was already shown in Figure 1.4 (XEGlinreg09.xpl).

1.2.6 Properties of the OLS Estimates of α, β and σ²

Once the econometric model has been both specified and estimated, we are now interested in analyzing the relationship between the estimators (sample) and their respective parameter values (population). This relationship is of great interest when trying to extend propositions based on econometric models that have been estimated with a unique sample to the whole population. One way to do so is to obtain the sampling distribution of the different estimators. A sampling distribution describes the behavior of the estimators in repeated applications of the estimating formulae. A given sample yields a specific numerical estimate; another sample from the same population will yield another numerical estimate. A sampling distribution describes the results that will be obtained for the estimators over the potentially infinite set of samples that may be drawn from the population.
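This idea is easy to make concrete by simulation: draw many samples from a known DGP, re-estimate β on each, and inspect the distribution of the estimates. The sketch below assumes the DGP of (1.26) with the design held fixed (assumption (A.1)); the values it reports anticipate the unbiasedness result (1.54) and the variance formula (1.58) derived below.

import numpy as np

# Sampling distribution of the OLS slope estimator by Monte Carlo.
rng = np.random.default_rng(7)

n, replications = 100, 5000
x = rng.uniform(0, 1, n)                   # fixed design (assumption A.1)
sxx = np.sum((x - x.mean()) ** 2)

beta_hats = np.empty(replications)
for r in range(replications):
    y = 3 + 2.5 * x + rng.normal(0, 1, n)  # new errors -> new sample
    beta_hats[r] = np.sum((x - x.mean()) * (y - y.mean())) / sxx

print("mean of beta_hat:", beta_hats.mean())  # close to the true beta = 2.5
print("var. of beta_hat:", beta_hats.var())   # close to sigma^2 / sum_i (x_i - x_bar)^2
print("theoretical var.:", 1.0 / sxx)         # with sigma^2 = 1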

Properties of $\hat{\alpha}$ and $\hat{\beta}$

We start by computing the finite sample distribution of the parameter vector $(\hat{\alpha}, \hat{\beta})^\top$. In order to do so, note that taking the expression for $\hat{\alpha}$ in (1.36) and $\hat{\beta}$ in (1.37) we can write

$$\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \sum_{i=1}^{n} \begin{pmatrix} \frac{1}{n} - \bar{x}\,\omega_i \\ \omega_i \end{pmatrix} y_i, \qquad (1.49)$$

where

$$\omega_i = \frac{x_i - \bar{x}}{\sum_{l=1}^{n} (x_l - \bar{x})^2}. \qquad (1.50)$$


If we now substitute the value of $y_i$ by the process that has generated it (equation (1.22)), we obtain

$$\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \sum_{i=1}^{n} \begin{pmatrix} \frac{1}{n} - \bar{x}\,\omega_i \\ \omega_i \end{pmatrix} u_i. \qquad (1.51)$$

Equations (1.49) and (1.51) show the first property of the OLS estimators of $\alpha$ and $\beta$: they are linear with respect to the sampling values of the endogenous variable $y_1, \ldots, y_n$, and they are also linear in the error terms $u_1, \ldots, u_n$. This property is crucial for deriving the finite sample distribution of the vector of parameters $(\hat{\alpha}, \hat{\beta})'$ since, assuming the values of $X$ are fixed (assumption A.1) and the errors are independent gaussian variables (assumptions A.6 and A.7), linear combinations of independent gaussian variables are themselves gaussian, and therefore $(\hat{\alpha}, \hat{\beta})'$ follows a bivariate gaussian distribution:

$$\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} \sim N\left( \begin{pmatrix} \alpha \\ \beta \end{pmatrix}, \begin{pmatrix} Var(\hat{\alpha}) & Cov(\hat{\alpha},\hat{\beta}) \\ Cov(\hat{\alpha},\hat{\beta}) & Var(\hat{\beta}) \end{pmatrix} \right) \qquad (1.52)$$

To fully characterize the whole sampling distribution we need to determine both the mean vector and the variance-covariance matrix of the OLS estimators. Assumptions (A.1), (A.2) and (A.3) immediately imply that

$$E\left[\begin{pmatrix} \frac{1}{n} - \bar{x}\,\omega_i \\ \omega_i \end{pmatrix} u_i\right] = \begin{pmatrix} \frac{1}{n} - \bar{x}\,\omega_i \\ \omega_i \end{pmatrix} E(u_i) = 0, \quad \forall i, \qquad (1.53)$$

and therefore by equation (1.51) we obtain

$$E\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}. \qquad (1.54)$$

That is, the OLS estimators of $\alpha$ and $\beta$, under assumptions (A.1) to (A.7), are unbiased. Now we calculate the variance-covariance matrix. In order to do so, let

$$\begin{pmatrix} Var(\hat{\alpha}) & Cov(\hat{\alpha},\hat{\beta}) \\ Cov(\hat{\alpha},\hat{\beta}) & Var(\hat{\beta}) \end{pmatrix} = E\left[\begin{pmatrix} \hat{\alpha}-\alpha \\ \hat{\beta}-\beta \end{pmatrix} \begin{pmatrix} \hat{\alpha}-\alpha & \hat{\beta}-\beta \end{pmatrix}\right]. \qquad (1.55)$$


Then, if we substitute $(\hat{\alpha}-\alpha, \hat{\beta}-\beta)'$ by its definition from equation (1.51), the last expression will be equal to

$$= \sum_{i=1}^{n}\sum_{j=1}^{n} E\left[\begin{pmatrix} (\frac{1}{n}-\bar{x}\omega_i)(\frac{1}{n}-\bar{x}\omega_j) & (\frac{1}{n}-\bar{x}\omega_i)\,\omega_j \\ \omega_i(\frac{1}{n}-\bar{x}\omega_j) & \omega_i\omega_j \end{pmatrix} u_i u_j\right]. \qquad (1.56)$$

Now, assumptions (A.1), (A.5) and (A.6) allow us to simplify expression (1.56) and we obtain

$$= \sigma^2 \sum_{i=1}^{n} \begin{pmatrix} (\frac{1}{n}-\bar{x}\omega_i)^2 & (\frac{1}{n}-\bar{x}\omega_i)\,\omega_i \\ \omega_i(\frac{1}{n}-\bar{x}\omega_i) & \omega_i^2 \end{pmatrix}. \qquad (1.57)$$

Finally, substituting $\omega_i$ by its definition in equation (1.50), we obtain the following expression for the variance-covariance matrix:

$$\begin{pmatrix} Var(\hat{\alpha}) & Cov(\hat{\alpha},\hat{\beta}) \\ Cov(\hat{\alpha},\hat{\beta}) & Var(\hat{\beta}) \end{pmatrix} = \sigma^2 \begin{pmatrix} \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2} & \frac{-\bar{x}}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \\ \frac{-\bar{x}}{\sum_{i=1}^{n}(x_i-\bar{x})^2} & \frac{1}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \end{pmatrix} \qquad (1.58)$$

We can say that the OLS method produces BLUE (Best Linear Unbiased Estimators) in the following sense: the OLS estimators are the linear, unbiased estimators which satisfy the Gauss-Markov Theorem. We now give the simplest version of the Gauss-Markov Theorem, which is proved in Johnston and Dinardo (1997), p. 36.

Gauss-Markov Theorem: Consider the regression model (1.22). Under assumptions (A.1) to (A.6), the OLS estimators of $\alpha$ and $\beta$ are those that have minimum variance among the set of all linear and unbiased estimators of the parameters.

We remark that for the Gauss-Markov theorem to hold we do not need to include assumption (A.7) on the distribution of the error term. Furthermore, the properties of the OLS estimators mentioned above are established for finite samples; that is, the divergence between the estimator and the parameter value is analyzed for a fixed sample size. Other properties of the estimators that are also of interest are the asymptotic properties. In this case, the behavior of the estimators with respect to their true parameter values is analyzed as the sample size increases.


Among the asymptotic properties of the estimators we will study the so-called consistency property.

We will say that the OLS estimators $\hat{\alpha}$, $\hat{\beta}$ are consistent if they converge weakly in probability (see Serfling (1984) for a definition) to their respective parameter values $\alpha$ and $\beta$. For weak convergence in probability, a sufficient condition is

$$\lim_{n\to\infty} E\begin{pmatrix} \hat{\alpha}_n \\ \hat{\beta}_n \end{pmatrix} = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \qquad (1.59)$$

and

$$\lim_{n\to\infty} \begin{pmatrix} Var(\hat{\alpha}) \\ Var(\hat{\beta}) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \qquad (1.60)$$

Condition (1.59) is immediately verified, since under conditions (A.1) to (A.6) we have shown that both OLS estimators are unbiased for finite sample sizes. Condition (1.60) is shown as follows:

$$Var(\hat{\alpha}) = \sigma^2\left(\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right) = \frac{\sigma^2}{n}\left(1 + \frac{\bar{x}^2}{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}\right)$$

then, by the properties of the limits,

$$\lim_{n\to\infty} Var(\hat{\alpha}) = \lim_{n\to\infty}\frac{\sigma^2}{n}\,\lim_{n\to\infty}\frac{\frac{1}{n}\sum_{i=1}^{n}x_i^2}{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}$$

Assumption (A.3) ensures that

$$\lim_{n\to\infty}\frac{\frac{1}{n}\sum_{i=1}^{n}x_i^2}{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2} < \infty$$

and since by assumption (A.5) $\sigma^2$ is constant and bounded, $\lim_{n\to\infty}\frac{\sigma^2}{n} = 0$. This proves the first part of condition (1.60). The proof for $\hat{\beta}$ follows the same lines.

Properties of $\hat{\sigma}^2$

For the statistical properties of $\hat{\sigma}^2$, we will just enumerate the different statistical results, which will be proved in a more general setting in Chapter 2, Section 2.4.2 of this monograph.


Under assumptions (A.1) to (A.7), the finite sample distribution of this estimator is given by

$$\frac{(n-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2}. \qquad (1.61)$$

Then, by the properties of the $\chi^2$ distribution, it is easy to show that

$$Var\left(\frac{(n-2)\hat{\sigma}^2}{\sigma^2}\right) = 2(n-2).$$

This result allows us to calculate the variance of $\hat{\sigma}^2$ as

$$Var(\hat{\sigma}^2) = \frac{2\sigma^4}{n-2}. \qquad (1.62)$$

Note that to calculate this variance, the normality assumption (A.7) plays a crucial role. In fact, by assuming that $u \sim N(0, \sigma^2)$, we have $E(u^3) = 0$, and the fourth order moment is already known and related to $\sigma^2$. These two properties are of great help to simplify the third and fourth order terms in the derivation of equation (1.62).

Under assumptions (A.1) to (A.7) in Section 1.2, it is possible to show (see Chapter 2, Section 2.4.2 for a proof):

Unbiasedness:

$$E(\hat{\sigma}^2) = E\left(\frac{\sum_{i=1}^{n}\hat{u}_i^2}{n-2}\right) = \frac{1}{n-2}\,E\left(\sum_{i=1}^{n}\hat{u}_i^2\right) = \frac{1}{n-2}\,(n-2)\,\sigma^2 = \sigma^2$$

Non-efficiency: The OLS estimator of $\sigma^2$ is not efficient because it does not achieve the Cramer-Rao lower bound (this bound is $2\sigma^4/n$).

Consistency: The OLS estimator of $\sigma^2$ converges weakly in probability to $\sigma^2$, i.e.

$$\hat{\sigma}^2 \xrightarrow{p} \sigma^2$$

as $n$ tends to infinity.

Asymptotic distribution:

$$\sqrt{n}\left(\hat{\sigma}^2 - \sigma^2\right) \xrightarrow{d} N\left(0, 2\sigma^4\right)$$


as $n$ tends to infinity.

From the last result, note finally that although $\hat{\sigma}^2$ is not efficient for finite sample sizes, this estimator achieves the Cramer-Rao lower bound asymptotically.

    1.2.7 Examples

To illustrate the different statistical properties given in the previous section, we develop three different simulations. The first Monte Carlo experiment analyzes the finite sample distributions of $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\sigma}^2$. The second study performs a simulation to explain consistency, and the third compares the finite sample and asymptotic distributions of the OLS estimator of $\sigma^2$.

Example 1

The following program illustrates the statistical properties of the OLS estimators of $\alpha$ and $\beta$. We implement the following Monte Carlo experiment: we generate 500 replications of sample size $n = 20$ of the model $y_i = 1.5 + 2x_i + u_i$, $i = 1, \ldots, 20$. The values of $X$ have been generated according to a uniform distribution, $X \sim U[0,1]$, and the values for the error term have been generated following a normal distribution with zero mean and variance one, $u \sim N(0,1)$. To fulfill assumption (A.1), the values of $X$ are fixed across the 500 replications. For each sample (replication) we estimate the parameters $\alpha$ and $\beta$ and their respective variances (note that $\sigma^2$ has been replaced by $\hat{\sigma}^2$). With the 500 values of the estimators of these parameters, we generate four different histograms.

XEGlinreg10.xpl (http://www.quantlet.org/mdstat/codes/xeg/XEGlinreg10.html)
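The quantlet is linked above; as a hedged alternative, the same experiment can be sketched in a few lines of Python (the seed is arbitrary, histogram plotting omitted):

import numpy as np

rng = np.random.default_rng(1)
n, reps = 20, 500
x = rng.uniform(0, 1, n)            # regressors kept fixed (assumption A.1)
sxx = np.sum((x - x.mean()) ** 2)

alphas = np.empty(reps)
betas = np.empty(reps)
for r in range(reps):
    y = 1.5 + 2.0 * x + rng.normal(0, 1, n)
    betas[r] = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    alphas[r] = y.mean() - betas[r] * x.mean()

print(alphas.mean(), betas.mean())  # close to 1.5 and 2: unbiasedness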

The result of this procedure is presented in Figure 1.8. With a sample size of $n = 20$, the histograms that contain the estimates of $\alpha$ and $\beta$ in the different replications approximate a gaussian distribution. On the other hand, the histograms for the variance estimates approximate a $\chi^2$ distribution, as expected.

    Example 2

This program analyzes by simulation the asymptotic behavior of $\hat{\alpha}$ and $\hat{\beta}$ as the sample size increases. We generate observations using the model $y_i = 2 + 0.5x_i + u_i$, with $X \sim U[0,1]$ and $u \sim N(0, 10^2)$. For 200 different sample sizes ($n = 5, \ldots, 1000$), we generate 50 replications for each sample size.


[Figure: four histograms, panel titles "histogram of alpha", "histogram of var(alpha)", "histogram of beta" and "histogram of var(beta)"]

Figure 1.8: Finite sample distribution

For each sample size we compute the 50 estimates of $\alpha$ and $\beta$; then, we calculate $E(\hat{\alpha})$ and $E(\hat{\beta})$ conditional on the sample size.

XEGlinreg11.xpl (http://www.quantlet.org/mdstat/codes/xeg/XEGlinreg11.html)
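A Python sketch of the same design (one set of regressors per sample size, 50 replications each; details such as the seed are our own choices) is:

import numpy as np

rng = np.random.default_rng(2)
sizes = np.arange(5, 1005, 5)               # 200 sample sizes, as in the text
avg_beta = np.empty(len(sizes))
for k, n in enumerate(sizes):
    x = rng.uniform(0, 1, n)
    sxx = np.sum((x - x.mean()) ** 2)
    b = np.empty(50)
    for r in range(50):                     # 50 replications per sample size
        y = 2 + 0.5 * x + rng.normal(0, 10, n)
        b[r] = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    avg_beta[k] = b.mean()
# avg_beta fluctuates less and settles near beta = 0.5 as n grows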

The code gives the output presented in Figure 1.9. As expected, when we increase the sample size, $E(\hat{\beta})$ tends to $\beta$, in this case $\beta = 0.5$, and $E(\hat{\alpha})$ tends to $\alpha = 2$.

[Figure: two panels, "convergence of alpha" and "convergence of beta", plotting the average estimates against the sample-size index]

Figure 1.9: Consistency


    Example 3

In the model $y_i = 1.5 + 2x_i + u_i$, with $X \sim U[0,1]$ and $u \sim N(0, 16)$, we implement the following Monte Carlo experiment. For two different sample sizes we generate 500 replications each: the first 500 replications have sample size $n = 10$, the second $n = 1000$. For both sample sizes we compute the 500 estimates of $\sigma^2$. Then, we calculate two histograms of the estimates of $\frac{(n-2)\hat{\sigma}^2}{\sigma^2}$, one for $n = 10$, the other for $n = 1000$.

XEGlinreg12.xpl (http://www.quantlet.org/mdstat/codes/xeg/XEGlinreg12.html)
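An illustrative Python version of this experiment (plotting omitted; the printed moments make the point numerically, and the seed is our own choice) might be:

import numpy as np

rng = np.random.default_rng(3)
sigma2 = 16.0
for n in (10, 1000):
    x = rng.uniform(0, 1, n)
    sxx = np.sum((x - x.mean()) ** 2)
    stat = np.empty(500)
    for r in range(500):
        y = 1.5 + 2.0 * x + rng.normal(0, 4, n)   # sd 4, so sigma^2 = 16
        beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx
        alpha = y.mean() - beta * x.mean()
        sigma2_hat = np.sum((y - alpha - beta * x) ** 2) / (n - 2)
        stat[r] = (n - 2) * sigma2_hat / sigma2
    # chi2(n-2) has mean n-2 and variance 2(n-2)
    print(n, stat.mean(), stat.var())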

The output of the code is presented in Figure 1.10. As expected, the histogram for $n = 10$ approximates a $\chi^2$ density, whereas for $n = 1000$ the approximating density is gaussian.

[Figure: two histograms, "hist of var(u) n=10" and "hist of var(u) n=1000"]

Figure 1.10: Distribution of $\hat{\sigma}^2$

    1.3 Inference

In the framework of a univariate linear regression model, one can be interested in testing two different groups of hypotheses about $\alpha$, $\beta$ and $\sigma^2$. In the first group, the user has some prior knowledge about the value of $\beta$, for example he believes $\beta = \beta_0$; then he is interested in knowing whether this value, $\beta_0$, is compatible with the sample data. In this case the null hypothesis will be $H_0: \beta = \beta_0$, and the alternative $H_1: \beta \neq \beta_0$. This is what is called a two-sided test.


In the other group, the prior knowledge about the parameter $\beta$ can be more diffuse. For example, we may have some knowledge about the sign of the parameter, and we want to know whether this sign agrees with our data. Then, two possible tests are available: $H_0: \beta \leq \beta_0$ against $H_1: \beta > \beta_0$ (for $\beta_0 = 0$ this would be a test of positive sign); and $H_0: \beta \geq \beta_0$ against $H_1: \beta < \beta_0$ (for $\beta_0 = 0$ this would be a test of negative sign). These are the so-called one-sided tests. Equivalent tests for $\alpha$ are available.

The tool we are going to use to test the previous hypotheses is the sampling distribution of the different estimators. The key to designing a testing procedure lies in being able to analyze the potential variability of the estimated value; that is, one must be able to say whether a large divergence between it and the hypothetical value is better ascribed to sampling variability alone or to the hypothetical value being incorrect. In order to do so, we need to know the sampling distribution of the estimators.

1.3.1 Hypothesis Testing about $\beta$

In Section 1.2.6, equations (1.52) to (1.58) show that the joint finite sample distribution of the OLS estimators of $\alpha$ and $\beta$ is a normal density. Then, by standard properties of the multivariate gaussian distribution (see Greene (1993), p. 76), and under assumptions (A.1) to (A.7) from Section 1.2.6, it is possible to show that

$$\hat{\beta} \sim N\left(\beta, \frac{\sigma^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}\right), \qquad (1.63)$$

and then, by a standard transformation,

$$z = \frac{\hat{\beta} - \beta}{\sqrt{\sigma^2 / \sum_{i=1}^{n}(x_i-\bar{x})^2}} \qquad (1.64)$$

is standard normal. $\sigma^2$ is unknown and therefore the previous expression is unfeasible. Replacing the unknown value of $\sigma^2$ with $\hat{\sigma}^2$ (the unbiased estimator of $\sigma^2$), the result

$$z = \frac{\hat{\beta} - \beta}{\sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n}(x_i-\bar{x})^2}}, \qquad (1.65)$$


is the ratio of a standard normal variable (see (1.63)) to the square root of a chi-squared variable divided by its degrees of freedom (see (1.61)). It is not difficult to show that both random variables are independent, and therefore $z$ in (1.65) follows a Student-$t$ distribution with $n-2$ degrees of freedom (see Johnston and Dinardo (1997), p. 489 for a proof), i.e.

$$z \sim t_{n-2} \qquad (1.66)$$

To test the hypotheses, we have the following alternative procedures:

                        Null Hypothesis              Alternative Hypothesis
a) Two-sided test       $H_0: \beta = \beta_0$       $H_1: \beta \neq \beta_0$
b) One-sided tests
   Right-sided test     $H_0: \beta \leq \beta_0$    $H_1: \beta > \beta_0$
   Left-sided test      $H_0: \beta \geq \beta_0$    $H_1: \beta < \beta_0$

According to this set of hypotheses, we first present the steps for a one-sided test; after this, we present the procedure for a two-sided test.

One-sided Test

The steps for a one-sided test are as follows:

Step 1: Establish the set of hypotheses $H_0: \beta \leq \beta_0$ versus $H_1: \beta > \beta_0$.

Step 2: The test statistic is $z = \frac{\hat{\beta} - \beta_0}{\sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n}(x_i-\bar{x})^2}}$, which can be calculated from the sample. Under the null hypothesis, it has the $t$-distribution with $n-2$ degrees of freedom. If the calculated $z$ is large, we would suspect that $\beta$ is probably not equal to $\beta_0$. This leads to the next step.

Step 3: In the $t$-table, look up the entry for $n-2$ degrees of freedom and the given level of significance ($\alpha$) and find the point $t_{\alpha,n-2}$ such that $P(t > t_{\alpha}) = \alpha$.

Step 4: Reject $H_0$ if $z > t_{\alpha,n-2}$.


If the calculated $t$-statistic ($z$) falls in the critical region, that is, $z > t_{\alpha,n-2}$, the null hypothesis is rejected and we conclude that $\beta$ is significantly greater than $\beta_0$.

    The p-value Approach to Hypothesis Testing

The $t$-test can also be carried out in an equivalent way. First, calculate the probability that the random variable $t$ ($t$-distributed with $n-2$ degrees of freedom) is greater than the observed $z$, that is, calculate

$$p\text{-value} = P(t > z)$$

This probability is the area to the right of $z$ in the $t$-distribution. A high value of this probability implies that the consequences of erroneously rejecting a true $H_0$ are severe. A low $p$-value implies that the consequences of erroneously rejecting a true $H_0$ are not very severe, and hence we are safe in rejecting $H_0$. The decision rule is therefore to accept $H_0$ (that is, not reject it) if the $p$-value is too high. In other words, if the $p$-value is higher than the specified level of significance (say $\alpha$), we conclude that the regression coefficient is not significantly greater than $\beta_0$ at the level $\alpha$. If the $p$-value is less than $\alpha$, we reject $H_0$ and conclude that $\beta$ is significantly greater than $\beta_0$. The modified steps for the $p$-value approach are as follows:

Step 3a: Calculate the probability (denoted as $p$-value) that $t$ is greater than $z$, that is, compute the area to the right of the calculated $z$.

Step 4a: Reject $H_0$ and conclude that the coefficient is significant if the $p$-value is less than the given level of significance ($\alpha$).

If we want to establish a more constrained null hypothesis, that is, if the set of possible values that $\beta$ can take under the null hypothesis is a single value, we must use a two-sided test.

Two-sided Test

The procedure for a two-sided alternative is quite similar. The steps are as follows:

Step 1: Establish the set of hypotheses $H_0: \beta = \beta_0$ versus $H_1: \beta \neq \beta_0$.


Step 2: The test statistic is $z = \frac{\hat{\beta} - \beta_0}{\sqrt{\hat{\sigma}^2 / \sum_{i=1}^{n}(x_i-\bar{x})^2}}$, which is the same as before. Under the null hypothesis, it has the $t$-distribution with $n-2$ degrees of freedom.

Step 3: In the $t$-table, look up the entry for $n-2$ degrees of freedom and the given level of significance ($\alpha$) and find the point $t_{\alpha/2,n-2}$ such that $P(t > t_{\alpha/2}) = \alpha/2$ (one-half of the level of significance).

Step 3a: To use the $p$-value approach, calculate

$$p\text{-value} = P(t > |z| \text{ or } t < -|z|) = 2P(t > |z|)$$

because of the symmetry of the $t$-distribution around the origin.

Step 4: Reject $H_0$ if $|z| > t_{\alpha/2,n-2}$ and conclude that $\beta$ is significantly different from $\beta_0$ at the level $\alpha$.

Step 4a: In case of the $p$-value approach, reject $H_0$ if $p$-value $< \alpha$, the level of significance.

The different sets of hypotheses and their decision regions for testing at a significance level of $\alpha$ can be summarized in the following table:

Test          Rejection region for $H_0$                                      Non-rejection region for $H_0$
Two-sided     $\{z \mid z < -t_{\alpha/2} \text{ or } z > t_{\alpha/2}\}$     $\{z \mid -t_{\alpha/2} \leq z \leq t_{\alpha/2}\}$
Right-sided   $\{z \mid z > t_{\alpha}\}$                                     $\{z \mid z \leq t_{\alpha}\}$
Left-sided    $\{z \mid z < -t_{\alpha}\}$                                    $\{z \mid z \geq -t_{\alpha}\}$

1.3.2 Example

We implement the following Monte Carlo experiment. We generate one sample of size $n = 20$ of the model $y_i = 2 + 0.75x_i + u_i$, $i = 1, \ldots, 20$. $X$ has a uniform distribution, $X \sim U[0,1]$, and the error term is $u \sim N(0,1)$. We estimate $\alpha$, $\beta$ and $\sigma^2$. The program gives the three possible tests for $\beta$ when $\beta_0 = 0$, showing the critical values and the rejection regions.

XEGlinreg13.xpl (http://www.quantlet.org/mdstat/codes/xeg/XEGlinreg13.html)
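A hedged Python sketch of these tests (the sample, seed and the 5% level are our choices) computes the statistic (1.65) and the two- and one-sided decisions:

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 20
x = rng.uniform(0, 1, n)
y = 2 + 0.75 * x + rng.normal(0, 1, n)

sxx = np.sum((x - x.mean()) ** 2)
beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx
alpha = y.mean() - beta * x.mean()
sigma2 = np.sum((y - alpha - beta * x) ** 2) / (n - 2)

beta0 = 0.0
z = (beta - beta0) / np.sqrt(sigma2 / sxx)       # statistic (1.65)

t_two = stats.t.ppf(0.975, df=n - 2)             # two-sided critical value
t_one = stats.t.ppf(0.95, df=n - 2)              # one-sided critical value
print("two-sided reject:", abs(z) > t_two)
print("right-sided reject:", z > t_one)
print("two-sided p-value:", 2 * stats.t.sf(abs(z), df=n - 2))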

The previous hypothesis-testing procedure is confined to the slope coefficient $\beta$. In the next section we present a testing procedure based on the fit of the regression.


    1.3.3 Testing Hypothesis Based on the Regression Fit

In this section we present an alternative view of the two-sided test on $\beta$ that we developed in the previous section. Recall that the null hypothesis is $H_0: \beta = \beta_0$ against the alternative hypothesis $H_1: \beta \neq \beta_0$. In order to implement the test statistic, recall that the OLS estimators $\hat{\alpha}$ and $\hat{\beta}$ are such that they minimize the residual sum of squares (RSS). Since $R^2 = 1 - RSS/TSS$, equivalently $\hat{\alpha}$ and $\hat{\beta}$ maximize the $R^2$, and therefore any other value of $\beta$ leads to a relevant loss of fit. Consider now the value under the null, $\beta_0$, rather than $\hat{\beta}$ (the OLS estimator). We can investigate the

changes in the regression fit when using $\beta_0$ instead of $\hat{\beta}$. To this end, consider the following residual sum of squares, where $\hat{\beta}$ has been replaced by $\beta_0$:

$$RSS_0 = \sum_{i=1}^{n}\left(y_i - \alpha - \beta_0 x_i\right)^2. \qquad (1.67)$$

Then, the value of $\alpha$, $\hat{\alpha}_0$, that minimizes (1.67) is

$$\hat{\alpha}_0 = \bar{y} - \beta_0\bar{x}. \qquad (1.68)$$

Substituting (1.68) into (1.67) we obtain

$$RSS_0 = \sum_{i=1}^{n}\left(y_i - \bar{y} - \beta_0(x_i-\bar{x})\right)^2. \qquad (1.69)$$

Doing some standard algebra (expand the square in (1.69) and use $\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) = \hat{\beta}\sum_{i=1}^{n}(x_i-\bar{x})^2$ together with $ESS = \hat{\beta}^2\sum_{i=1}^{n}(x_i-\bar{x})^2$), we can show that this last expression is equal to

$$RSS_0 = TSS + (\hat{\beta} - \beta_0)^2 \sum_{i=1}^{n}(x_i-\bar{x})^2 - ESS, \qquad (1.70)$$

and since $TSS = ESS + RSS$, defining

$$R_0^2 = 1 - \frac{RSS_0}{TSS} \qquad (1.71)$$


then (1.70) is equal to

$$R^2 - R_0^2 = \frac{(\hat{\beta} - \beta_0)^2 \sum_{i=1}^{n}(x_i-\bar{x})^2}{TSS}, \qquad (1.72)$$

which is positive, because $R_0^2$ must be smaller than $R^2$; that is, the alternative regression will not fit as well as the OLS regression line. Finally,

$$F = \frac{(R^2 - R_0^2)/1}{(1 - R^2)/(n-2)} \sim F_{1,n-2} \qquad (1.73)$$

where $F_{1,n-2}$ is an F-Snedecor distribution with 1 and $n-2$ degrees of freedom.

The last statement is easily proved, since under the assumptions established in Section 1.2.6,

$$(\hat{\beta} - \beta_0)^2 \sum_{i=1}^{n}(x_i-\bar{x})^2 / \sigma^2 \sim \chi^2_1, \qquad (1.74)$$

$$RSS/\sigma^2 \sim \chi^2_{n-2}, \qquad (1.75)$$

and

$$\frac{(R^2 - R_0^2)/1}{(1 - R^2)/(n-2)} = \frac{(\hat{\beta} - \beta_0)^2 \sum_{i=1}^{n}(x_i-\bar{x})^2 / \sigma^2}{\dfrac{RSS/\sigma^2}{n-2}}. \qquad (1.76)$$

The proof of (1.73) is completed by remarking that (1.74) and (1.75) are independent.

The procedure for the two-sided test is as follows:

Step 1: Establish the set of hypotheses $H_0: \beta = \beta_0$ versus $H_1: \beta \neq \beta_0$.

Step 2: The test statistic is $F = \frac{(R^2 - R_0^2)/1}{(1 - R^2)/(n-2)}$. Under the null hypothesis, it has the F-distribution with 1 and $n-2$ degrees of freedom.

Step 3: In the F-table, look up the entry for $(1, n-2)$ degrees of freedom and the given level of significance ($\alpha$) and find the points $F_{\alpha/2,1,n-2}$ and $F_{1-\alpha/2,1,n-2}$.


Step 4: Reject $H_0$ if $F_0 > F_{\alpha/2,1,n-2}$ or $F_0 < F_{1-\alpha/2,1,n-2}$, where $F_0$ is the calculated value of the statistic, and conclude that $\beta$ is significantly different from $\beta_0$ at the level $\alpha$.

    1.3.4 Example

With the same data as in the previous example, the program computes the hypothesis test for $\beta_0 = 0$ by using the regression fit. The output is the critical value and the rejection regions.

XEGlinreg14.xpl (http://www.quantlet.org/mdstat/codes/xeg/XEGlinreg14.html)
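The same decision can be reproduced with a short Python sketch (data generated as in the previous sketch, with our own seed), which also checks that the F statistic (1.73) equals the squared t statistic for the same null:

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 20
x = rng.uniform(0, 1, n)
y = 2 + 0.75 * x + rng.normal(0, 1, n)

sxx = np.sum((x - x.mean()) ** 2)
beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx
alpha = y.mean() - beta * x.mean()

tss = np.sum((y - y.mean()) ** 2)
rss = np.sum((y - alpha - beta * x) ** 2)
r2 = 1 - rss / tss

beta0 = 0.0
rss0 = np.sum((y - y.mean() - beta0 * (x - x.mean())) ** 2)   # eq. (1.69)
r2_0 = 1 - rss0 / tss
F = (r2 - r2_0) / ((1 - r2) / (n - 2))                        # eq. (1.73)

z2 = (beta - beta0) ** 2 * sxx / (rss / (n - 2))              # squared t statistic
print(np.isclose(F, z2), stats.f.sf(F, 1, n - 2))             # upper-tail p-value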

1.3.5 Hypothesis Testing about $\alpha$

As in Section 1.3.1, by standard properties of the multivariate gaussian distribution (see Greene (1993), p. 76), and under assumptions (A.1) to (A.7) from Section 1.2.6, it is possible to show that

$$z = \frac{\hat{\alpha} - \alpha}{\sqrt{\hat{\sigma}^2\left(\frac{1}{n} + \bar{x}^2/\sum_{i=1}^{n}(x_i-\bar{x})^2\right)}} \sim t_{n-2} \qquad (1.77)$$

The construction of the tests is similar to that for $\beta$; a two-sided or one-sided test can be carried out:

1) Two-sided test: $H_0: \alpha = \alpha_0$ versus $H_1: \alpha \neq \alpha_0$.

2) Right-sided test: $H_0: \alpha \leq \alpha_0$ versus $H_1: \alpha > \alpha_0$.

3) Left-sided test: $H_0: \alpha \geq \alpha_0$ versus $H_1: \alpha < \alpha_0$.

If we assume a two-sided test, the steps are as follows:


Step 1: Establish the set of hypotheses $H_0: \alpha = \alpha_0$ versus $H_1: \alpha \neq \alpha_0$.

Step 2: The test statistic is $z = \frac{\hat{\alpha} - \alpha_0}{\sqrt{\hat{\sigma}^2\left(\frac{1}{n} + \bar{x}^2/\sum_{i=1}^{n}(x_i-\bar{x})^2\right)}}$, which has the same form as before. Under the null hypothesis, it has the $t$-distribution with $n-2$ degrees of freedom.

Step 3: In the $t$-table, look up the entry for $n-2$ degrees of freedom and the given level of significance and find the point $t_{\alpha/2,n-2}$ such that $P(t > t_{\alpha/2}) = \alpha/2$ (one-half of the level of significance).

Step 4: Reject $H_0$ if $|z| > t_{\alpha/2,n-2}$ and conclude that $\alpha$ is significantly different from $\alpha_0$ at the chosen level.

    1.3.6 Example

With the same data as in the previous example, the program gives the three possible tests for $\alpha$ when $\alpha_0 = 2$, showing the critical values and the rejection regions.

    XEGlinreg15.xpl
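For completeness, a minimal Python sketch of the two-sided test for $\alpha$ (statistic (1.77); the data and level are again our own choices):

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 20
x = rng.uniform(0, 1, n)
y = 2 + 0.75 * x + rng.normal(0, 1, n)

sxx = np.sum((x - x.mean()) ** 2)
beta = np.sum((x - x.mean()) * (y - y.mean())) / sxx
alpha_hat = y.mean() - beta * x.mean()
sigma2 = np.sum((y - alpha_hat - beta * x) ** 2) / (n - 2)

alpha0 = 2.0
se_alpha = np.sqrt(sigma2 * (1 / n + x.mean() ** 2 / sxx))    # from eq. (1.58)
z = (alpha_hat - alpha0) / se_alpha                           # statistic (1.77)
print(abs(z) > stats.t.ppf(0.975, df=n - 2))                  # reject at 5%?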

1.3.7 Hypotheses Testing about $\sigma^2$

Although a test for the variance of the error term $\sigma^2$ is not as common as tests about the parameters of the regression line, for the sake of completeness we present it here. The test on $\sigma^2$ can be obtained from the sampling distribution of $\hat{\sigma}^2$,

$$\frac{(n-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{n-2} \qquad (1.78)$$

Using this result, one may write:

$$\text{Prob}\left\{\chi^2_{1-\alpha/2,\,n-2} \;\leq\; \frac{(n-2)\hat{\sigma}^2}{\sigma^2} \;\leq\; \chi^2_{\alpha/2,\,n-2}\right\} = 1 - \alpha,$$

which yields both a rejection region for a two-sided test of $H_0: \sigma^2 = \sigma_0^2$ (evaluate the statistic at $\sigma_0^2$) and a confidence interval for $\sigma^2$.
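In code, the two-sided test (or, equivalently, the confidence interval for $\sigma^2$) can be sketched as follows; the numbers are purely illustrative assumptions:

import numpy as np
from scipy import stats

n, sigma2_hat, sigma2_0 = 20, 1.3, 1.0        # illustrative values; H0: sigma^2 = 1
stat = (n - 2) * sigma2_hat / sigma2_0        # compare with chi2(n-2)
lo = stats.chi2.ppf(0.025, df=n - 2)
hi = stats.chi2.ppf(0.975, df=n - 2)
print("reject H0:", stat < lo or stat > hi)
# 95% confidence interval for sigma^2 from the same pivot:
print((n - 2) * sigma2_hat / hi, (n - 2) * sigma2_hat / lo)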

2.4 Properties of the estimators

A sequence of random variables $z_n$ converges in probability to a constant $c$ if

$$\lim_{n\to\infty} \text{Prob}\left(|z_n - c| < \epsilon\right) = 1 \qquad (2.91)$$

and it converges in probability to a random variable $z$ if

$$\lim_{n\to\infty} \text{Prob}\left(|z_n - z| < \epsilon\right) = 1 \qquad (2.92)$$

where $\epsilon > 0$ is an arbitrary constant. Equivalently, we can express this convergence as

$$z_n \xrightarrow{p} c \quad \text{and} \quad z_n \xrightarrow{p} z, \quad \text{or} \quad \text{plim}\, z_n = c \quad \text{and} \quad \text{plim}\, z_n = z \qquad (2.93)$$

Result (2.91) implies that all the probability of the distribution becomes concentrated at points close to $c$. Result (2.92) implies that the values that the variable may take that are not far from $z$ become more probable as $n$ increases, and moreover, this probability tends to one.
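Convergence in probability is easy to visualize by simulation. In the following hedged Python sketch, $z_n$ is the mean of $n$ draws from $U[0,1]$, so $\text{plim}\, z_n = 0.5$; the empirical frequency of $|z_n - 0.5| < \epsilon$ approaches one as $n$ grows:

import numpy as np

rng = np.random.default_rng(5)
eps = 0.01
for n in (10, 1000, 100000):
    # 500 realizations of z_n, the mean of n U[0,1] draws
    z_n = np.array([rng.uniform(0, 1, n).mean() for _ in range(500)])
    print(n, np.mean(np.abs(z_n - 0.5) < eps))   # P(|z_n - 0.5| < eps) -> 1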

A second form of convergence is convergence in distribution. If $z_n$ is a sequence of random variables with cumulative distribution function (cdf) $F_n(z)$, then the sequence converges in distribution to a variable $z$ with cdf $F(z)$ if

$$\lim_{n\to\infty} F_n(z) = F(z) \qquad (2.94)$$


which can be denoted by

$$z_n \xrightarrow{d} z \qquad (2.95)$$

and $F(z)$ is said to be the limit distribution of $z$.

Having established these preliminary concepts, we now consider the following desirable asymptotic properties: asymptotic unbiasedness, consistency and asymptotic efficiency.

Asymptotic unbiasedness. There are two alternative definitions of this concept. The first states that an estimator $\hat{\theta}$ is asymptotically unbiased if, as $n$ increases, the sequence of its first moments converges to the parameter $\theta$. It can be expressed as:

$$\lim_{n\to\infty} E(\hat{\theta}_n) = \theta \quad \Longleftrightarrow \quad \lim_{n\to\infty}\left(E(\hat{\theta}_n) - \theta\right) = 0 \qquad (2.96)$$

Note that the second part of (2.96) also means that the possible bias of $\hat{\theta}$ disappears as $n$ increases, so we can deduce that an unbiased estimator is also an asymptotically unbiased estimator.

The second definition is based on the convergence in distribution of a sequence of random variables. According to this definition, an estimator is asymptotically unbiased if its asymptotic expectation, or expectation of its limit distribution, is the parameter $\theta$. It is expressed as follows:

$$E_{as}(\hat{\theta}) = \theta \qquad (2.97)$$

Since this second definition requires knowing the limit distribution of the sequence of random variables, and this is not always easy to know, the first definition is very often used.

In our case, since $\hat{\beta}$ and $\tilde{\beta}$ are unbiased, it follows that they are asymptotically unbiased:

$$\lim_{n\to\infty} E(\hat{\beta}_n) - \beta = \lim_{n\to\infty} E(\tilde{\beta}_n) - \beta = 0 \qquad (2.98)$$

In order to simplify notation, in what follows we will use $\hat{\beta}$, $\tilde{\beta}$ instead of $\hat{\beta}_n$, $\tilde{\beta}_n$. Nevertheless, we must continue considering them as sequences of random variables indexed by the sample size.


Consistency. An estimator is said to be consistent if it converges in probability to the unknown parameter, that is to say:

$$\text{plim}\, \hat{\beta}_n = \beta \qquad (2.99)$$

which, in view of (2.91), means that a consistent estimator satisfies the convergence in probability to a constant, with the unknown parameter $\beta$ being such a constant.

The simplest way of showing consistency consists of proving two sufficient conditions: i) the estimator must be asymptotically unbiased, and ii) its variance must converge to zero as $n$ increases. These conditions are derived from the convergence in quadratic mean (or convergence in second moments), given that this concept of convergence implies convergence in probability (for a detailed study of the several modes of convergence and their relations, see Amemiya (1985), Spanos (1986) and White (1984)).

In our case, since the asymptotic unbiasedness of $\hat{\beta}$ and $\tilde{\beta}$ has been shown earlier, we only have to prove the second condition. In this sense, we calculate:

$$\lim_{n\to\infty} V(\hat{\beta}) = \lim_{n\to\infty} \sigma^2 (X'X)^{-1} \qquad (2.100)$$

Multiplying and dividing (2.100) by $n$, we obtain:

$$\lim_{n\to\infty} V(\hat{\beta}) = \lim_{n\to\infty} \frac{n}{n}\,\sigma^2 (X'X)^{-1} = \lim_{n\to\infty} \frac{\sigma^2}{n}\left(\frac{X'X}{n}\right)^{-1} = \lim_{n\to\infty} \frac{\sigma^2}{n}\,\lim_{n\to\infty}\left(\frac{X'X}{n}\right)^{-1} = 0 \cdot Q^{-1} = 0 \qquad (2.101)$$

where we have used the condition (2.6) included in assumption 1. Thus, result (2.101) proves the consistency of the OLS and ML estimators of the coefficient vector. As we mentioned before, this
