ECS550NFB Introduction to Numerical Methods using Matlab
Day 4
Lukas [email protected]
Department of Mathematics, University of Matej Bel
June 11, 2015
Today
I Basic econometrics
  I Linear regression
  I Instrumental variables
  I Panel Data regression
I Bootstrap
  I Introduction to the Bootstrap
  I Theory of Bootstrap
  I Practical issues
I Selected topics
  I Principal Components Analysis
  I Support Vector Machines
  I Cross validation
  I Non-parametric Estimation
What is available
I MATLAB's Econometrics Toolbox (time series models)
I MATLAB's Statistics and Machine Learning Toolbox (including regression analysis)
I LeSage - Econometrics toolbox (1999, free)
I Panel Data toolbox (Alvarez, Barbero, Zofio)
Basic econometrics - linear regression
y = Xβ + ε
[b,bint,r,rint,stats] = regress(y, X)
I β = (X^T X)^{-1} X^T y
I se(β) = √( n/(n−k−1) · σ̂²_ε · (X^T X)^{-1} )
I t_{β_i} = β_i / se(β_i)
I R² = 1 − Σ_{i=1}^n (y_i − X_i β)² / Σ_{i=1}^n (y_i − ȳ)²
I CI_{i,α} = [ β_i − t^{n−k−1}_{α} se(β_i), β_i + t^{n−k−1}_{α} se(β_i) ]
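A minimal sketch of these formulas on simulated data (regress requires the Statistics and Machine Learning Toolbox; the numbers are illustrative):

```matlab
% Simulate y = X*beta + eps and estimate by OLS
rng(1); n = 200; k = 2;
X = [ones(n,1) randn(n,k)];               % regressors including an intercept
beta_true = [1; 2; -0.5];
y = X*beta_true + randn(n,1);

% Closed-form OLS matching the formulas above
beta_hat = (X'*X)\(X'*y);
res    = y - X*beta_hat;
sigma2 = (res'*res)/(n - k - 1);                  % error variance estimate
se     = sqrt(diag(sigma2*((X'*X)\eye(k+1))));    % standard errors
R2     = 1 - sum(res.^2)/sum((y - mean(y)).^2);

% The same estimates, with 95% confidence intervals, via the toolbox
[b, bint] = regress(y, X);
```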
Basic econometrics - instrumental variable regression
y = Xβ + ε
I Z is a matrix of exogenous variables (instruments) such that E(Z^T ε) = 0
I P_Z = Z(Z^T Z)^{-1} Z^T is a projection matrix
I β_2SLS = (X^T P_Z X)^{-1} X^T P_Z y
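The 2SLS formula can be sketched directly (simulated data; the variable names are illustrative):

```matlab
% Two-stage least squares via the projection matrix P_Z
rng(2); n = 500;
z = randn(n,1);                       % instrument
u = randn(n,1);                       % error correlated with x
x = z + 0.5*u + randn(n,1);           % endogenous regressor
y = 1 + 2*x + u;

X  = [ones(n,1) x];
Z  = [ones(n,1) z];
PZ = Z*((Z'*Z)\Z');                   % projection onto the column space of Z
beta_2sls = (X'*PZ*X)\(X'*PZ*y);      % consistent; plain OLS would be biased here
```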
Basic econometrics - Panel Data Regression
http://www.paneldatatoolbox.com/
y_it = α + X_it β + µ_i + v_it
Models
I Panel Data Models
I Instrumental Panel Data Models
I Spatial Panel Data Models
What is available
I Pooled OLS
I Fixed Effects (with option Robust)
I Between Effects
I Random Effects (with option Robust)
I Hausman test (Fixed vs Random effects)
Basic econometrics - time series
http://www.mathworks.com/products/econometrics/
What is available
I Time Series modelling - ARIMA
I State space modelling - Kalman filter
I Monte Carlo simulation
I Forecasting
I Cointegration modelling - VEC
I Volatility modelling - ARCH, GARCH
Bootstrap - what is it
I resampling method for estimating the distribution of an estimator or test statistic
I produces an approximation that is often at least as accurate as a first-order asymptotic expansion
I may provide test statistic distributions/p-values/confidence intervals when no asymptotic results are available
I usually used when we have a consistent estimator but do not know how to derive standard errors
I after all, the data sample is all we have
Bootstrap - how does it work
We will pretend our finite sample is a population and draw random samples with replacement from this population.
Bootstrap - example
Nice animations:
https://www.stat.auckland.ac.nz/~wild/BootAnim/
Bootstrap - notation
I {X_i, i = 1, …, n} data from F_0 ∈ ℱ
I parametric: F_0(x, θ_0) = P(X ≤ x)
I statistic T_n = T_n(X_1, …, X_n)
I G_n(τ, F_0) = P(T_n ≤ τ) denotes the exact finite-sample CDF of T_n
I T_n is pivotal if G_n(τ, F) does not depend on F
I T_n is asymptotically pivotal if G_∞(τ, F) does not depend on F
I how can we estimate G_n(·, F_0)?
  I by G_∞ - asymptotic approximation (we need large n)
  I replace F_0 with a known estimator - bootstrap
I F_n denotes the estimator of F_0
  I ECDF: F_n(x) = (1/n) Σ_{i=1}^n I(X_i ≤ x) →a.s. F_0(x)
  I from a parametric family: F_0(·) = F(·, θ_0)
Bootstrap - Monte Carlo
G_n(·, F_n) → G_n(·, F_0)
Approximation procedure for G_n(τ, F_0):
Step 1 Generate a random sample of size n from F_n: {X*_i : i = 1, …, n}
Step 2 Compute T*_n = T_n(X*_1, …, X*_n)
Step 3 Repeat (1) and (2) many times to get an empirical probability of (T*_n ≤ τ)
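The three steps above take only a few lines of code; here for the statistic T_n = √n(X̄ − µ), with an illustrative sample and an illustrative τ:

```matlab
% Bootstrap approximation of G_n(tau, F_0) for T_n = sqrt(n)*(Xbar - mu)
rng(3);
x = exprnd(1, 100, 1);                 % the observed sample plays the role of F_n
n = numel(x); B = 2000;
Tstar = zeros(B,1);
for b = 1:B
    xs = x(randi(n, n, 1));            % Step 1: draw n values from F_n (with replacement)
    Tstar(b) = sqrt(n)*(mean(xs) - mean(x));   % Step 2: the bootstrap statistic
end
tau = 0.5;
Gn_hat = mean(Tstar <= tau);           % Step 3: empirical probability of (T*_n <= tau)
```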
Bootstrap - "does it work"?
What does it mean for the bootstrap to work? We would at least expect it to get the approximation right as the data sample grows to infinity.
G_n(τ, F_n) is consistent if ∀ε > 0, ∀F_0 ∈ ℱ:
lim_{n→∞} P_n [ sup_τ |G_n(τ, F_n) − G_∞(τ, F_0)| > ε ] = 0
Bootstrap - when does it work
G_n(τ, F_n) ≈ G_∞(τ, F_n) ≈ G_∞(τ, F_0) ≈ G_n(τ, F_0)
(Beran and Ducharme 1991) The bootstrap is consistent if:
I F_n → F_0
I G_∞(τ, F) is a continuous function of τ for any F ∈ ℱ
I for any τ and any sequence H_n such that H_n → F_0: G_n(τ, H_n) → G_∞(τ, F_0)
(Mammen 1992)
I ḡ_n = (1/n) Σ g_n(X_i), T_n = (ḡ_n − t_n)/σ_n
I ḡ*_n = (1/n) Σ g_n(X*_i), T*_n = (ḡ*_n − ḡ_n)/σ_n
I Then G*_n consistently estimates G_n if and only if T_n →d N(0, 1)
Bootstrap - when does it not work
I Heavy-tailed distributions: X_i a random sample from a Cauchy distribution, T_n = X̄
I X_i a random sample from N(µ, σ²), T_n = n^{1/2}(X̄² − µ²) if µ ≠ 0, otherwise T_n = n X̄²
I Maximum of a sample: F_0 has support [0, θ_0], θ_n = max{X_1, …, X_n}, T_n = n(θ_n − θ_0), T*_n = n(θ*_n − θ_n). P*_n(T*_n = 0) = 1 − (1 − 1/n)^n → 1 − e^{-1} while P(T_n = 0) → 0.
I Parameter on a boundary: X_i a random sample from N(µ, 1) where µ ∈ [0, ∞) [Andrews (2000)]
Bootstrap - even more
I consistency is nice, but the bootstrap also allows us to improve the finite-sample properties of an estimator!
I bias correction
I bootstrapping critical values
I parametric vs non-parametric bootstrap - how to choose F_n?
Bootstrap - bias correction
We care about E[θ_n − θ]
Step 1 Compute θ_n
Step 2 Generate a random sample of size n from F_n: {X*_i : i = 1, …, n} and calculate θ*_n = g(X*)
Step 3 Repeat (2) many times to calculate E*θ*_n. The bias estimate is B*_n = E*θ*_n − θ_n. The bias-corrected estimator is θ_n − B*_n.
Do not use it for √n-consistent estimators, because of the higher variance of the bias-corrected estimator.
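The bias-correction steps above can be sketched on the maximum-likelihood variance estimator, which divides by n and is therefore biased downward (the sample is illustrative):

```matlab
% Bootstrap bias correction of the MLE of the variance
rng(4);
x = randn(50,1); n = numel(x); B = 2000;
theta_n = mean((x - mean(x)).^2);            % biased: divides by n, not n-1
theta_star = zeros(B,1);
for b = 1:B
    xs = x(randi(n, n, 1));                  % resample from F_n
    theta_star(b) = mean((xs - mean(xs)).^2);
end
Bn       = mean(theta_star) - theta_n;       % bias estimate B*_n = E*[theta*_n] - theta_n
theta_bc = theta_n - Bn;                     % bias-corrected estimator
```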
Bootstrap - hypothesis tests
T_n = n^{1/2} (θ_n − θ_0) / s_{θ_n}
I With Asymptotic Refinement - use the bootstrap to get critical values
I Without Asymptotic Refinement
  I use the bootstrap to estimate the standard error of an estimator
  I percentile method - use quantiles of the distribution of θ*_n
Bootstrap - hypothesis tests
Step 1 Compute θ_n
Step 2 Generate a random sample of size n from F_n: {X*_i : i = 1, …, n} and calculate T*_n = n^{1/2}(θ*_n − θ_n)/s*_n
Step 3 Repeat (2) many times to get the empirical distribution of T*_n. We set z*_{n,α/2} to the (1 − α) quantile of this distribution.
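The bootstrap-t steps above, sketched for the mean of an illustrative sample with a symmetric two-sided interval:

```matlab
% Bootstrap-t: resample the studentized statistic to get critical values
rng(5);
x = exprnd(1, 80, 1); n = numel(x); B = 4999;
theta_n = mean(x); s_n = std(x)/sqrt(n);
Tstar = zeros(B,1);
for b = 1:B
    xs = x(randi(n, n, 1));
    Tstar(b) = sqrt(n)*(mean(xs) - theta_n)/std(xs);  % T*_n centered at theta_n
end
z  = quantile(abs(Tstar), 0.95);             % symmetric critical value at alpha = 0.05
CI = [theta_n - z*s_n, theta_n + z*s_n];     % bootstrap-t confidence interval
```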
Bootstrap - choice of the number of bootstrap replications
I the larger the number the better
I Efron and Tibshirani (1993) - 200
I Andrews and Buchinsky (2000) - 2000
I the smaller the quantile of interest, the larger the number of replications needed
Implementation in MATLAB
Statistics and Machine Learning Toolbox
Bootstrapping statistics:
[bootstat, bootsam] = bootstrp(nboot, bootfun, d1)
Bootstrap confidence intervals:
[ci, bootstat] = bootci(nboot, bootfun, ..., ..., 'Options', options)
I Normal approximation
I Percentile method
I Bias corrected
I Bias corrected and accelerated
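A short usage sketch of both toolbox functions (the statistic and data are illustrative):

```matlab
% Bootstrapping the median of a sample with the toolbox functions
rng(6);
x = exprnd(1, 100, 1);
bootstat = bootstrp(2000, @median, x);            % 2000 bootstrap medians
ci_bca   = bootci(2000, {@median, x});            % default: bias corrected and accelerated
ci_per   = bootci(2000, {@median, x}, 'type', 'per');   % percentile method
```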
When bootstrap fails - Subsampling
An alternative to the bootstrap.
I we draw smaller samples without replacement
I crucial difference: we draw samples from the true data generating process (F_0) and not from the estimated model (F_n)
I more general than the bootstrap
I less powerful in cases where the bootstrap works
I practical difficulty: how to choose the subsample size?
Selected topics
I Principal Components Analysis
I Support Vector Machines
I Cross Validation
I Non-parametric estimation
Principal Components Analysis
Suppose we have many variables but there are certain regularities in our data.
Can we encode (almost) the same information using fewer variables? (dimension reduction)
We transform the space that our data span into an orthogonal space; the basis vectors are the principal components.
We order the principal components according to their importance, that is, the fraction of variation they explain.
What is it good for?
Dimension reduction.
Principal Components Analysis - example
Nice animation:
http://setosa.io/ev/principal-component-analysis/
PCA - how does it work
I the principal components are the eigenvectors of the covariance matrix of the demeaned data; the corresponding eigenvalues measure how much variance each component explains
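A sketch of PCA from first principles, which the toolbox function pca reproduces (up to sign); the data are illustrative:

```matlab
% PCA via the eigendecomposition of the covariance of demeaned data
rng(7);
X  = randn(200,3)*[2 0 0; 0.5 1 0; 0 0 0.2];   % correlated 3-D data
Xc = X - repmat(mean(X,1), size(X,1), 1);      % demean each column
[V, D]   = eig(cov(Xc));
[lam, j] = sort(diag(D), 'descend');           % eigenvalues = variance per component
V = V(:, j);                                   % columns: principal components
explained = lam/sum(lam);                      % fraction of variation explained
scores    = Xc*V;                              % data expressed in the new basis
```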
Support Vector Machines
An SVM is a 0-1 (binary) classifier.
Training set → classifier.
This classifier is the result of an optimization problem (quadratic programming) that tries to separate the 0s from the 1s.
The training points that lie closest to the separating boundary (on the margin) are called support vectors.
Support Vector Machines - Hard vs Soft Margin
Source: www.stackoverflow.com
Support Vector Machine - Non-linear
Separation may not be possible in the original space, so we project the points into a feature space.
Source: www.stackoverflow.com
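A minimal classification sketch with the toolbox function fitcsvm (available from R2014a; the data and kernel choice are illustrative):

```matlab
% Linear (soft-margin) and kernelized SVM on two simulated classes
rng(8);
X = [randn(50,2) + 1.5; randn(50,2) - 1.5];
y = [ones(50,1); zeros(50,1)];
mdl_lin = fitcsvm(X, y);                              % linear SVM
mdl_rbf = fitcsvm(X, y, 'KernelFunction', 'rbf');     % non-linear: Gaussian kernel
sv   = mdl_lin.SupportVectors;                        % the support vectors
yhat = predict(mdl_lin, X);                           % predicted 0/1 labels
```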
Cross validation
Our goal is prediction.
We divide our data into three parts.
I Training set - here we train different predictors
I Validation set - pick the winner that performs best on the validation set (the one that fits best in the training phase may be overfitting)
I Test set - check how well the winner does
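The three-way split can be sketched as follows, with polynomial regression as an illustrative family of candidate predictors (split sizes are also illustrative):

```matlab
% Train / validation / test split for choosing a polynomial degree
rng(9);
x = linspace(0, 1, 200)'; y = sin(2*pi*x) + 0.3*randn(200,1);
idx = randperm(200);
tr = idx(1:120); va = idx(121:160); te = idx(161:200);
degrees = 1:8; vErr = zeros(size(degrees));
for d = degrees
    p = polyfit(x(tr), y(tr), d);                      % train each candidate
    vErr(d) = mean((y(va) - polyval(p, x(va))).^2);    % validation error
end
[~, best] = min(vErr);                                 % pick the winner
p = polyfit(x(tr), y(tr), best);
testErr = mean((y(te) - polyval(p, x(te))).^2);        % honest final check
```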
Non-parametric estimation - Why?
(DiNardo and Tobias 2001) Parametric model: [figure]
Why Non-parametric?
(DiNardo and Tobias 2001) Non-parametric model: [figure]
Kernel density estimation
[f,xi] = ksdensity(x,pts,Name,Value)
I kernel - 'normal', 'box', 'triangle', 'epanechnikov' or a custom function
I npoints - at how many points to evaluate (length of xi)
I support - 'unbounded', 'positive'
I bandwidth - width of the kernel-smoothing window
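A short ksdensity sketch on a bimodal sample (data and option values are illustrative):

```matlab
% Kernel density estimate of a two-component mixture
rng(10);
x = [randn(200,1); 4 + 0.5*randn(100,1)];
[f, xi] = ksdensity(x, 'kernel', 'epanechnikov', 'npoints', 200);
plot(xi, f)                                  % smooth estimate of the density
[f2, xi2] = ksdensity(x, 'bandwidth', 0.2);  % manually chosen smoothing window
```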
Kernel density estimation
Source: www.mathworks.com
Literature
I LeSage's Econometrics Toolbox http://www.spatial-econometrics.com/html/mbook.pdf
I Panel Data toolbox (Alvarez, Barbero, Zofio) https://ideas.repec.org/p/uam/wpaper/201305.html www.paneldatatoolbox.com
I Horowitz, Joel L. "The bootstrap." Handbook of Econometrics 5 (2001): 3159-3228.
I Cameron, A. Colin, and Pravin K. Trivedi. Microeconometrics: Methods and Applications. Cambridge University Press, 2005.
I Efron, Bradley, and Robert J. Tibshirani. An Introduction to the Bootstrap. CRC Press, 1994.
I (mathematical theory) Beran, Rudolf J., and Gilles R. Ducharme. Asymptotic Theory for Bootstrap Methods in Statistics. Centre de Recherches Mathematiques, 1991.
I Kennedy, P. Bootstrapping Student Understanding of What is Going On in Econometrics http://www.sfu.ca/~kkasa/middle.pdf - this is very instructive and useful
I Bootstrap Testing in Econometrics http://qed.econ.queensu.ca/faculty/mackinnon/papers/bt-cea.pdf
I Politis, D. N., J. P. Romano, and M. Wolf. Subsampling. Springer, 1999.
I Bootstrap vs Subsampling https://normaldeviate.wordpress.com/2013/01/19/bootstrapping-and-subsampling-part-i/ https://normaldeviate.wordpress.com/2013/01/27/bootstrapping-and-subsampling-part-ii/
I Bootstrap vs Subsampling http://web.stanford.edu/~doubleh/eco273/subsampling.pdf
I Hastie, Trevor, et al. The Elements of Statistical Learning. 2nd ed. New York: Springer, 2009. http://statweb.stanford.edu/~tibs/ElemStatLearn/
I Varian, H. Big Data: New Tricks for Econometrics http://people.ischool.berkeley.edu/~hal/Papers/2013/ml.pdf
I Support Vector Machine http://research.microsoft.com/pubs/67119/svmtutorial.pdf
I Cross Validation MATLAB example http://white.stanford.edu/~knk/Psych216A/Psych216ALecture5Tutorial.m
I Nonparametric econometrics http://www.ssc.wisc.edu/~bhansen/718/NonParametrics1.pdf