Date post: | 19-Jan-2015 |
Category: |
Education |
Upload: | vignes-gopal |
Introduction to EViews
Vignes Gopal Krishna
Fast track PhD student & SLAI fellow
Faculty of Economics and Administration, University of Malaya
Email Address: [email protected]/[email protected]
Steps for Quantitative Analysis
Data Screening/Cleaning
Data reliability & validity
Data Analysis
Missing values
Outliers
Standard errors/Standard deviation
Logical sequence of numerical presentations
Sources & Measurement of Data
Linear regression, Granger causality, Cointegration
Conditions
Data Cleaning/Screening
• Deals with the management of missing values and outliers.
• A crucial step that helps to avoid dubious results.
• Useful for monitoring trends in the numerical data.
Missing values
• Common occurrence in research
• Significant impact on the results, except for some tests such as survival analysis and impact analysis
• Types of missing values
a) Missing Completely at Random (MCAR)
b) Missing at Random (MAR)
c) Missing Not at Random (MNAR)
Types of Missing Values
Missing Completely at Random (MCAR)
• Missing values of Y do not depend on X or Y
• Ex: Selection of survey questions
Missing at Random (MAR)
• Missing values of Y depend on X, but not on Y
• Ex: Income reporting is quite weak among respondents in the service industry.
Missing Not at Random (MNAR)
• Pr(Y is missing) = f(Y, …)
• Ex: Respondents with high income are less likely to report their income
Methods:
a) Heckman selection model
b) Patterns of missing values
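The difference between MCAR and MNAR can be illustrated with a small simulation. This is only a sketch under stated assumptions: the incomes are simulated (not data from these slides), and the missingness rule in `p_missing` is a hypothetical one chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical incomes in arbitrary units (assumption, not real data).
incomes = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(incomes)

# MCAR: every value has the same 30% chance of being missing.
mcar_observed = [y for y in incomes if random.random() > 0.30]

def p_missing(y):
    """Assumed MNAR rule: high incomes are more likely to go unreported."""
    return 0.60 if y >= 50 else 0.10

# MNAR: the chance of being missing depends on the value of Y itself.
mnar_observed = [y for y in incomes if random.random() > p_missing(y)]

mcar_mean = statistics.mean(mcar_observed)
mnar_mean = statistics.mean(mnar_observed)

print(f"true mean      : {true_mean:.2f}")
print(f"MCAR, observed : {mcar_mean:.2f}")
print(f"MNAR, observed : {mnar_mean:.2f}")
```

Under MCAR the mean of the observed values stays close to the true mean, while under this MNAR rule it is pulled downward, which is why MNAR data usually need a model of the missingness itself.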
Outliers
• Inconsistent with existing range of data points
• Deals with lower and higher levels of outliers
• Positively related to error terms/residuals
• Inclusion and exclusion of outliers
• Main methods that can be used to identify outliers
a) Chauvenet's criterion
b) Grubbs' test for outliers
c) Peirce's criterion
d) Box plot
e) Extreme values
Ways to remove outliers
(a) By reducing the effects of autocorrelation
(b) Winsorizing
(c) Robustness of the standard errors: minimization of standard errors
(d) Normalization
(e) Trimming
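As a rough illustration of the box-plot rule, Winsorizing and trimming, the sketch below flags points outside the usual 1.5 × IQR fences. The data are made up, and clipping to the box-plot fences is only one possible Winsorizing rule (percentile cut-offs are also common).

```python
import statistics

def iqr_fences(values, k=1.5):
    """Box-plot fences: points outside [Q1 - k*IQR, Q3 + k*IQR] are flagged."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def winsorize(values, low, high):
    """Keep every observation but pull extreme values back to the fences."""
    return [min(max(v, low), high) for v in values]

def trim(values, low, high):
    """Drop observations that fall outside the fences."""
    return [v for v in values if low <= v <= high]

# Hypothetical sample with one obvious outlier (95).
data = [10, 12, 11, 13, 12, 11, 10, 14, 13, 95]

low, high = iqr_fences(data)
outliers = [v for v in data if v < low or v > high]
winsorized = winsorize(data, low, high)
trimmed = trim(data, low, high)

print("fences    :", low, high)
print("outliers  :", outliers)
print("winsorized:", winsorized)
print("trimmed   :", trimmed)
```

Winsorizing keeps the sample size unchanged, whereas trimming reduces it; which one is appropriate depends on whether the extreme value is believed to be an error or a genuine observation.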
Normality
• Requirement for parametric analysis
• Most of the tests require all the variables to be normally distributed in order to ensure the normal distribution of error terms
• Skewness: + (skewed to the right), - (skewed to the left); skewness = 0 under normality
• Kurtosis should be approximately 3 (JB test)
• Available normality tests
a) Jarque-Bera test (skewness & kurtosis)
b) Shapiro-Wilk test
c) Shapiro-Francia test
• Data transformations
a) Zero-skewness log transform
b) Box-Cox transform
• Graphical checks: histogram with normal curve, quantile-quantile (Q-Q) plot/qnorm, etc.
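The Jarque-Bera statistic combines the two moments listed above: JB = n/6 × (S² + (K − 3)²/4), where S is skewness and K is kurtosis. A minimal sketch with made-up samples follows; the p-value uses the asymptotic chi-square distribution with 2 degrees of freedom, which is only a rough guide in small samples.

```python
import math

def skewness_kurtosis(x):
    """Moment-based skewness S and kurtosis K (K is about 3 for a normal sample)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

def jarque_bera(x):
    """JB statistic and its asymptotic p-value (chi-square, 2 df)."""
    n = len(x)
    s, k = skewness_kurtosis(x)
    jb = n / 6 * (s ** 2 + (k - 3) ** 2 / 4)
    p_value = math.exp(-jb / 2)  # survival function of chi-square with 2 df
    return jb, p_value

symmetric = [-2, -1, 0, 1, 2]     # hypothetical symmetric sample
right_skewed = [1, 1, 1, 2, 10]   # hypothetical sample with a long right tail

s_sym, k_sym = skewness_kurtosis(symmetric)
s_right, k_right = skewness_kurtosis(right_skewed)
jb_sym, p_sym = jarque_bera(symmetric)

print("symmetric   : S =", s_sym, " K =", k_sym)
print("right-skewed: S =", s_right, " K =", k_right)
print("JB (symmetric sample) =", jb_sym, " p =", p_sym)
```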
Linear regression
• Associations between DV and IV
• DV: continuous/interval/scale/ratio variable
• IV: continuous/interval/scale/ratio/categorical variables
• Assumptions:
a) Linear in parameters
b) No endogeneity problem
c) No multicollinearity
d) Homoscedasticity (no heteroscedasticity)
e) Number of observations > number of variables
f) Error terms must be normally distributed.
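For a single independent variable, the OLS estimates have a closed form: slope = Sxy / Sxx and intercept = mean(Y) − slope × mean(X). The sketch below applies these formulas to a small made-up data set; the function name `ols_simple` is introduced here purely for illustration.

```python
def ols_simple(x, y):
    """OLS for Y = intercept + slope * X + error; returns (intercept, slope, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical data, roughly following Y = 1 + 2X.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope, r_squared = ols_simple(x, y)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, R^2 = {r_squared:.4f}")
```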
Diagnostic Testing
Multicollinearity (deals with multivariate analysis)
• Correlations between independent variables
• Symptoms: high R square, large covariances and correlations, more insignificant t-ratios
• Detection: Variance Inflation Factor (VIF), Tolerance Value (TL), auxiliary regressions, graphical method
• Ways to reduce the effects
a) Drop variables that have high correlations
b) Data transformation, etc.
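VIF and tolerance come from the auxiliary regression of one independent variable on the others: tolerance = 1 − R² and VIF = 1 / tolerance. With only two regressors, that R² is simply the squared correlation between them, which keeps the sketch below short. The data are hypothetical, and the often-quoted "VIF above 10" cut-off is a rule of thumb rather than a formal test.

```python
import math

def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    sab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    saa = sum((u - a_bar) ** 2 for u in a)
    sbb = sum((v - b_bar) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_regressors(x1, x2):
    """VIF and tolerance when the model has exactly two regressors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2   # 1 - R^2 of the auxiliary regression
    return 1 / tolerance, tolerance

# Hypothetical regressors: x2 nearly duplicates x1, x3 does not.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 2.0, 2.9, 4.2, 5.1, 5.8]
x3 = [2, 1, 2, 1, 2, 1]

vif_high, tol_high = vif_two_regressors(x1, x2)
vif_low, tol_low = vif_two_regressors(x1, x3)
print(f"x1 vs x2: VIF = {vif_high:.1f}, tolerance = {tol_high:.4f}")
print(f"x1 vs x3: VIF = {vif_low:.2f}, tolerance = {tol_low:.4f}")
```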
Heteroscedasticity
• No homoscedasticity (unequal spread of variances)
• Common sources
a) Error-learning models
b) Outliers
c) Techniques of data collection
• Common remedies: remove outliers (Winsorizing/Trimming), GLS
• Detection: Park test, Glejser test, Goldfeld-Quandt test, White test, graphical method, etc.
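The idea behind the Goldfeld-Quandt test can be sketched as follows: sort the observations by the regressor, drop a middle block, fit separate regressions on the low and high groups, and compare their residual variances. The data are simulated with an error spread that grows with X (an assumption made for illustration), and the 20% share of dropped observations is an arbitrary choice.

```python
import random

def ols_rss(x, y):
    """Residual sum of squares from a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def goldfeld_quandt(x, y, drop_fraction=0.2):
    """Ratio of residual variances: high-X group over low-X group."""
    pairs = sorted(zip(x, y))            # order observations by X
    n = len(pairs)
    n_each = (n - int(n * drop_fraction)) // 2
    low, high = pairs[:n_each], pairs[-n_each:]
    df = n_each - 2                      # two estimated parameters per group
    rss_low = ols_rss(*zip(*low))
    rss_high = ols_rss(*zip(*high))
    return (rss_high / df) / (rss_low / df)

random.seed(1)
x = [float(i) for i in range(1, 201)]
# Simulated heteroscedastic errors: standard deviation proportional to X.
y = [1 + 2 * xi + random.gauss(0, 0.1 * xi) for xi in x]

gq_ratio = goldfeld_quandt(x, y)
print(f"Goldfeld-Quandt variance ratio = {gq_ratio:.2f}")
```

A ratio well above 1 points towards heteroscedasticity; in practice the ratio is compared with an F critical value for the two groups' degrees of freedom.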
Autocorrelation (correlation between residuals)
• The estimated error variance tends to underestimate the population error variance
• R square will be overestimated
• Misleading results: F and t tests are not valid
• Methods:
a) Graphical method
b) Runs test
c) Durbin-Watson d test
d) Breusch-Godfrey test
e) Corrected version of GLS
f) Newey-West method
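The Durbin-Watson d statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation. A minimal sketch with made-up residual series:

```python
def durbin_watson(residuals):
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals: the first series keeps its sign for long runs,
# the second flips sign every period.
persistent = [1, 1, 1, -1, -1, -1]
alternating = [1, -1, 1, -1, 1, -1]

d_persistent = durbin_watson(persistent)
d_alternating = durbin_watson(alternating)
print("persistent residuals : d =", d_persistent)
print("alternating residuals: d =", d_alternating)
```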
Unit root test
• Stationary & non-stationary
• Test specifications: intercept, trend, intercept + trend
General hypotheses:
Null hypothesis: the variable has a unit root
Alternative hypothesis: the variable has no unit root
(Applicable for Augmented Dickey-Fuller (ADF), Dickey-Fuller GLS (DF-GLS), Phillips-Perron (PP), ERS Point Optimal and Ng-Perron)
The hypotheses are reversed in the case of KPSS.
Unit root testing is a requirement for cointegration tests (e.g. the Johansen-Juselius cointegration test): the variables should be I(1), i.e. non-stationary at the level form and stationary, I(0), at the first-difference form.
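To make the unit root idea concrete, the sketch below computes the simplest Dickey-Fuller statistic: the t-ratio on gamma in the regression Δy_t = gamma × y_(t-1) + e_t, with no intercept, trend or lagged differences. This is a simplification of the ADF test that software such as EViews reports, and the series are simulated. The t-ratio must be compared with Dickey-Fuller critical values (roughly -1.95 at the 5% level for this specification in large samples), not with the usual t table.

```python
import math
import random

def dickey_fuller_t(y):
    """t-ratio on gamma in: dy_t = gamma * y_{t-1} + e_t (no intercept, no lags)."""
    lagged = y[:-1]
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]
    s_ll = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, diffs)) / s_ll
    resid = [d - gamma * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return gamma / math.sqrt(sigma2 / s_ll)

random.seed(42)
shocks = [random.gauss(0, 1) for _ in range(500)]

# Random walk: y_t = y_{t-1} + shock, so it has a unit root, i.e. it is I(1).
walk, level = [], 0.0
for e in shocks:
    level += e
    walk.append(level)

# Its first difference is just the shocks, which are stationary, I(0).
first_diff = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

# Stationary AR(1) for comparison: y_t = 0.5 * y_{t-1} + shock.
ar1, prev = [], 0.0
for e in shocks:
    prev = 0.5 * prev + e
    ar1.append(prev)

t_level = dickey_fuller_t(walk)
t_diff = dickey_fuller_t(first_diff)
t_ar1 = dickey_fuller_t(ar1)
print(f"random walk, level form      : t = {t_level:.2f}")
print(f"random walk, first difference: t = {t_diff:.2f}")
print(f"stationary AR(1)             : t = {t_ar1:.2f}")
```

A strongly negative t-ratio rejects the null of a unit root, which is what is expected for the first-differenced series and the AR(1) series here.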
Logit/Probit regression
• Type of probabilistic model
• Used when DV = binary/categorical variable
• In logit regression the error terms are not assumed to be normally distributed; they follow a logistic distribution
(Note: the error terms are assumed to be normally distributed in probit regression)
• Pr(Y = 1) = f(X1, X2, X3, …)
Methods to assess goodness of fit
(a) Likelihood ratio tests
(b) Pseudo R2
(c) Hosmer-Lemeshow test
(d) Binary classification
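The only difference between the two models is the function f that maps the linear index into a probability: the logistic CDF for logit and the standard normal CDF for probit. The sketch below shows both links; the coefficients are hypothetical values chosen for illustration, not estimates from any data.

```python
import math

def logistic_cdf(z):
    """Logit link: Pr(Y = 1) = 1 / (1 + exp(-z))."""
    return 1 / (1 + math.exp(-z))

def normal_cdf(z):
    """Probit link: Pr(Y = 1) = Phi(z), the standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def predicted_probability(coeffs, x, link):
    """coeffs = [intercept, b1, b2, ...]; x = [X1, X2, ...]."""
    z = coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))
    return link(z)

# Hypothetical coefficients: intercept = -1.0, slope on X1 = 0.8.
coeffs = [-1.0, 0.8]

for x1 in (0.0, 1.25, 3.0):
    p_logit = predicted_probability(coeffs, [x1], logistic_cdf)
    p_probit = predicted_probability(coeffs, [x1], normal_cdf)
    # Binary classification with the conventional 0.5 cut-off.
    print(f"X1 = {x1}: logit = {p_logit:.3f}, probit = {p_probit:.3f}, "
          f"class = {int(p_logit >= 0.5)}")
```

Both links return 0.5 when the linear index is zero and stay strictly between 0 and 1, which a linear probability model cannot guarantee.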
Cointegration
• Two or more variables are said to be cointegrated if they share a common stochastic trend/drift.
• In general, the variables have to be I(1), i.e. non-stationary at the level form, before testing for cointegration
Types of cointegration test
(a) Engle-Granger two-step method
(b) Johansen-Juselius test
(c) Phillips-Ouliaris cointegration test
(d) Autoregressive Distributed Lag (ARDL)
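The Engle-Granger two-step method can be sketched in a few lines: (1) regress Y on X at the level form, (2) test the residuals for a unit root. The series below are simulated so that they share one stochastic trend by construction, and the residual test is a bare Dickey-Fuller t-ratio without lagged differences, which is a simplification. Note that residual-based tests use their own critical values (roughly -3.34 at the 5% level for two variables), not the standard Dickey-Fuller ones.

```python
import math
import random

def ols(x, y):
    """Step 1: long-run regression y = a + b * x; returns (a, b, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b, [yi - a - b * xi for xi, yi in zip(x, y)]

def dickey_fuller_t(u):
    """Step 2: t-ratio on gamma in du_t = gamma * u_{t-1} + e_t."""
    lagged = u[:-1]
    diffs = [u[t] - u[t - 1] for t in range(1, len(u))]
    s_ll = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, diffs)) / s_ll
    resid = [d - gamma * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return gamma / math.sqrt(sigma2 / s_ll)

random.seed(7)

# X is a random walk, so it is I(1).
x, level = [], 0.0
for _ in range(500):
    level += random.gauss(0, 1)
    x.append(level)

# Y shares X's stochastic trend plus stationary noise (cointegrated by design).
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]

intercept, slope, residuals = ols(x, y)
eg_t = dickey_fuller_t(residuals)
print(f"long-run relation: Y = {intercept:.2f} + {slope:.2f} X")
print(f"unit root t-ratio on residuals = {eg_t:.2f}")
```

If the residuals are stationary, the two I(1) series move together in the long run and are treated as cointegrated.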