Post on 25-Feb-2016
EC 827
Module 2: Forecasting a Single Variable from its own History
Regression Refresher (oh what fun?)
Consider a simple regression, e.g. one based on the incomes of workers at a firm:

Y_i = function(education [E], age [A], sex [S], race [R], tenure [T], rank [K])

…for which a computer will calculate an optimal mathematical model:

Y_i = b0 + b1·E_i + b2·A_i + b3·S_i + b4·R_i + b5·T_i + b6·K_i + e_i

where the b's are coefficients on education, age etc… and e_i is a random error.
Coefficients Let’s say that b2 (the coefficient on age) is 125. If we assume that the data for age is measured in years (e.g. 32 years of age etc…), then the coefficient implies that, on average, a worker who is one year older will earn $125 more than a younger equivalent, all else being equal.
But…
– is the relationship real?
– does it matter at all?
Statistical Significance To tell whether the coefficient should be believed, the computer will generally give us some measure of the reliability of the coefficient estimate, normally either:
– the standard error
– the t-stat
– the p-value
Fortunately, they all tell the same story...
Statistical Significance II …they all measure, given the natural
variability of the data, how likely it is that the actual (as opposed to estimated) coefficient is really zero (and hence, there’s no real relationship between, e.g. age and income).
Generally, we use a 95% confidence level as our measure of “certainty”. If a coefficient passes one of three equivalent tests, we are reasonably confident there is a relationship:
Coefficient Tests I If:
– the coefficient is approximately twice the size of the standard error, or
– the t-stat is greater (in absolute value) than 1.96, or
– the p-value is less than 0.05
…then we’re confident we’ve found a real relationship. Note: saying we’re confident there’s a relationship isn’t the same as saying we’re confident that our relationship is accurate.
Coefficient Tests II As a general rule, if a coefficient passes one of
these tests, we think it’s important as an explanation of the dependent variable (in our example, income). If it fails the test, we may want to discard it.
We can also test whole sets of variables by using F-tests. Again, the computer will calculate these scores for you. Their meaning is the same: they measure whether or not it appears the variables are important determinants of whatever we’re modeling.
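As a concrete sketch of these tests (with made-up data, not the course's example), ordinary least squares and the three equivalent significance checks can be run with plain NumPy:

```python
import math
import numpy as np

# Synthetic illustration: income generated with a true age coefficient of 125,
# so the significance tests should all pass.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(20, 60, n)
income = 20000 + 125 * age + rng.normal(0, 2000, n)

X = np.column_stack([np.ones(n), age])          # constant + age
beta, *_ = np.linalg.lstsq(X, income, rcond=None)
resid = income - X @ beta
s2 = resid @ resid / (n - X.shape[1])            # residual variance
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

t_stat = beta[1] / se[1]                         # t-stat for the age coefficient
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(t_stat) / math.sqrt(2))))

# The three equivalent tests from the slide:
print(abs(beta[1]) > 2 * se[1])   # coefficient roughly twice its standard error
print(abs(t_stat) > 1.96)         # |t| > 1.96
print(p_value < 0.05)             # p < 0.05
```

All three checks agree because they are algebraic restatements of the same comparison of the estimate to its sampling variability.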
Economic Significance Be careful: all these tests just look for a
mathematically meaningful relationship, not a practically important one.
For example, it would be possible that the coefficient on age was 1.25, not 125, but was very statistically significant. That would imply that an extra year of age raised average salaries by $1.25 per year. Statistically measurable? Yes. Relevant or important? No.
Time Series Data Data that we are interested in forecasting are time series: observations that have a well-defined temporal ordering.
Notation: _{t-1}X_t = forecast constructed at time t-1 of the observation on X to be realized at time t. There is no standard notation.
By contrast, other data have no particular ordering: e.g. a sample of heights of people in this class.
Forecasting a Single Variable I Any time series can be considered as the sum of two components:
– Deterministic Component: that part of the time series for which a perfect forecast of the future value can be constructed
» examples: constant, time trends, constant seasonal factors
– Stochastic Component: that part of the time series which is random (stochastic), for which predictions of future values may turn out to be in error
Forecasting a Single Variable II The assumption is that a history of the variable is available: a time series of observations.
We require that information in that history has implications for current or future realizations of the variable.
Useful information available at present for forecasting the future requires correlation between events at different points in time.
Correlation and Causation Correlation simply implies a link between two
data series, not that the link is “cause and effect”.
In a perfect world, we’d like to find the effects of causes
In the real world, the best we can hope for is to find potential causes of effects
Forecasts from Own History: An Example
Consider a coin tossing experiment:
– Toss a coin N times and record heads (1) or tails (0) for each replication
– Generate a time series: 0,1,1,0, ...
– No deterministic component to this time series
– Does the information in the time series of outcomes provide any basis for forecasting the outcome of additional tosses? Why or why not?
Covariance Stationarity Characteristic of the stochastic component of a time series:
– mean does not depend on time
– variance does not depend on time and is finite
– autocovariances (autocorrelations) depend only on the distance between observations and not the time of the observations
Sample Correlation Coefficients
For two series, X_i and Y_i, on which N observations are available:

correlation coefficient = Σ_{i=1}^{N} (X_i − X̄)(Y_i − Ȳ) / sqrt[ Σ_{i=1}^{N} (X_i − X̄)² · Σ_{i=1}^{N} (Y_i − Ȳ)² ]
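A quick numerical check of the sample correlation formula, using arbitrary illustrative data, compared against NumPy's built-in corrcoef:

```python
import numpy as np

def corr(x, y):
    # Sample correlation: covariance of deviations over product of spreads
    x, y = np.asarray(x, float), np.asarray(y, float)
    xd, yd = x - x.mean(), y - y.mean()
    return (xd * yd).sum() / np.sqrt((xd**2).sum() * (yd**2).sum())

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
print(round(corr(x, y), 6) == round(np.corrcoef(x, y)[0, 1], 6))  # True
```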
Autocorrelations: Definitions Definition: An autocorrelation coefficient is the correlation between observations of a time series that are separated by a fixed time interval.
– A first-order autocorrelation is the correlation between observations in a time series and the same observations lagged one period.
– A p-th order autocorrelation is the correlation between observations in a time series and the same observations lagged p periods.
Autocorrelations: Formulae
Let Y_t = X_{t−j}, j ≥ 0. Then the j-th order autocorrelation coefficient is:

ρ_j = Σ_t (X_t − X̄)(X_{t−j} − X̄) / sqrt[ Σ_t (X_t − X̄)² · Σ_t (X_{t−j} − X̄)² ]
Sample Autocorrelation Function I
[Figure: sample autocorrelation function for the coin tossing experiment, lags 1–9; all estimated autocorrelations are close to zero.]
Predictions of Outcomes of Future Coin Tosses
The outcome of any particular coin toss (head or tail) is not influenced by the outcome of any past toss (assuming a fair coin).
Coin tosses are independent events. Autocorrelations for a time series of coin toss outcomes are zero (estimated autocorrelations are not significantly different from zero).
There is no useful forecasting info in the time series.
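This can be verified by simulation; the coin data here are synthetic, generated with a seeded random generator:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2000
tosses = rng.integers(0, 2, N).astype(float)   # 0 = tails, 1 = heads

xd = tosses - tosses.mean()
denom = (xd**2).sum()
acs = [(xd[j:] * xd[:-j]).sum() / denom for j in range(1, 11)]

# All sample autocorrelations should sit near zero; a rough 95% band
# around zero is +/- 2/sqrt(N), about 0.045 here.
print(max(abs(a) for a in acs))
```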
Sample Partial Autocorrelation Coefficients
Construct a linear regression of a variable on a constant and lagged observations on the dependent variable up to order p.
Estimated coefficients in this regression model are the estimated partial autocorrelation coefficients; i.e. the coefficient on the n-th lag is the correlation between the event n periods ago and today’s event, given all of the intervening events.
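A sketch of this regression approach, fed a synthetic AR(1) series (the function name pacf_by_regression is my own, and the lag order and coefficient are illustrative):

```python
import numpy as np

def pacf_by_regression(x, p):
    # Regress X_t on a constant and X_{t-1}, ..., X_{t-p}; the fitted lag
    # coefficients are the estimated partial autocorrelation coefficients.
    x = np.asarray(x, float)
    X = np.column_stack([np.ones(len(x) - p)] +
                        [x[p - k:len(x) - k] for k in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return beta[1:]                               # drop the constant

# For an AR(1) series the lag-1 coefficient should dominate.
rng = np.random.default_rng(2)
e = rng.normal(size=1000)
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = 0.8 * x[t - 1] + e[t]

print(np.round(pacf_by_regression(x, 3), 2))
```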
White Noise Variables A white noise process (stochastic variable) has:
– zero mean
– constant finite variance
– no serial correlation (uncorrelated with observations at different points in time)
» autocorrelation coefficients of order 1 and above are all zero
Coin tossing is such a white noise process (actually stronger: independent white noise).
Wold Representation Theorem Any zero mean covariance stationary process x_t can be written as an infinite sum of white noise processes:

x_t = Σ_{j=0}^{∞} b_j ε_{t−j},  with b_0 = 1,  Σ_{j=0}^{∞} b_j² < ∞,  ε_t = white noise
Wold Theorem: Implementation What does that mean? It means that everything
that happens is a function of an infinite series of all past random events. True, but… so what?
The problem is to estimate the b_j terms
– impossible, since there are an infinite number
– the trick is to find some model that approximates the Wold representation
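One concrete case, assuming a zero-mean AR(1) with coefficient a = 0.8 (an illustrative value, not from the course): its Wold weights are b_j = a**j, so a truncated sum of past shocks should reproduce the recursively simulated series:

```python
import numpy as np

rng = np.random.default_rng(3)
a = 0.8
e = rng.normal(size=500)

# Recursive AR(1) simulation: x_t = a*x_{t-1} + e_t, x_0 = 0
x = np.zeros(500)
for t in range(1, 500):
    x[t] = a * x[t - 1] + e[t]

# Wold (moving-average) reconstruction, truncated at 100 lags
t = 499
wold = sum(a**j * e[t - j] for j in range(100))

print(abs(x[t] - wold) < 1e-6)   # True: truncation error is a**100 * x[t-100], tiny
```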
Forecasting Without Infinite Information
What will tomorrow look like? Generally, tomorrow will look like today.
What’s more important:
– How much will tomorrow look like today?
– How will tomorrow respond to today’s shocks?
– How will tomorrow be different from today?
First, know what question to ask… then worry about answers.
Autocorrelation Patterns I Autoregressive (AR) patterns: “Today looks like previous days”
– at least three components:
» deterministic component (e.g. constant, trend, constant seasonal factors)
» second component depends on observed values of previous periods
» third component is a new shock, independent of anything that has happened in the past
Example: AR(1) process

X_t = a0 + a1·X_{t−1} + e_t

a0 = constant or deterministic component
a1 = first-order autoregressive coefficient (−1 < a1 < 1)
e_t = error term (independent white noise)
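A minimal simulation sketch (a0 = 0, a1 = 0.8, synthetic shocks) showing that the sample autocorrelations of an AR(1) decay roughly geometrically, with the j-th order autocorrelation near a1**j:

```python
import numpy as np

rng = np.random.default_rng(4)
a1, N = 0.8, 5000
e = rng.normal(size=N)
x = np.zeros(N)
for t in range(1, N):
    x[t] = a1 * x[t - 1] + e[t]

xd = x - x.mean()
denom = (xd**2).sum()
rhos = [(xd[j:] * xd[:-j]).sum() / denom for j in (1, 2, 3)]
for j, r in zip((1, 2, 3), rhos):
    print(j, round(r, 2), round(a1**j, 2))   # sample vs. theoretical a1**j
```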
Sample Autocorrelation Function II
[Figure: sample autocorrelation function of an AR(1) variable with a1 = 0.8, lags 1–9; the autocorrelations are positive and decay gradually toward zero.]
Sample Autocorrelation II Note that in an AR(1) process all observations prior to time t are correlated with the outcome at t.
– all previous observations have information that is useful in forecasting what will happen at t
– usefulness (size of autocorrelation coefficients) decreases as information becomes older (move to more distant past)
This is the general result for AR processes, although the autocorrelation patterns can differ.
Autocorrelation Patterns II Moving Average (MA) Process: “Today is determined by yesterday’s shocks”
– at least three components:
» the deterministic component of the series
» the effect of the current shock on the series
» the effect of one or more previous shocks whose influence still persists in the current observation
MA(1) Process: Example

X_t = c1·e_{t−1} + e_t

c1 = first-order moving average parameter
e_t = current shock; independent of (not correlated with) any past shocks
Sample Autocorrelation Function III
[Figure: sample autocorrelation function of an MA(1) variable with c1 = 0.8, lags 1–9; only the first-order autocorrelation is noticeably different from zero.]
The shock occurs at t = 1, with full strength, then persists with strength c1·e_{t−1} = (0.8 · 1) = 0.8 at t = 2, then vanishes.
Sample Autocorrelations III Note that in an MA process only a limited number of past observations have information (autocorrelations) that is useful in forecasting the current outcome.
– the number of past observations that have potentially useful forecasting information is determined by the length of the MA process (here only one past observation)
Sample Autocorrelations IV The size of the first-order autocorrelation of an MA(1) process is determined by the value of c1 (see Diebold, p. 158):

ρ1 = c1 / (1 + c1²)

(don’t worry about remembering this. That’s what PCs are for)
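A numerical check of this result, simulating an MA(1) with c1 = 0.8 and synthetic shocks:

```python
import numpy as np

rng = np.random.default_rng(5)
c1, N = 0.8, 5000
e = rng.normal(size=N + 1)
x = e[1:] + c1 * e[:-1]                   # X_t = c1*e_{t-1} + e_t

xd = x - x.mean()
denom = (xd**2).sum()
rho = [(xd[j:] * xd[:-j]).sum() / denom for j in (1, 2, 3)]

print(round(c1 / (1 + c1**2), 3))         # 0.488 (theoretical rho_1)
print([round(r, 2) for r in rho])         # rho_1 near 0.49, the rest near 0
```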
Forecasts from AR Models I How does anything that occurs today (time = t) carry forward to influence future observations?
– Assume shock e_t = 1.0
– AR(1) coefficient in (e.g.) an Annual Inflation Model is 0.58
– A shock of 1.0 at time t carries forward to generate an increase in X_{t+1} of 0.58
– The shock of 0.58 to X_{t+1} carries forward to generate an increase in X_{t+2} of 0.58·0.58 = 0.34
Forecasts from AR Models II The effect of a shock at time t persists to affect observations at t+3, t+4, etc. The size of the effect on future observations becomes smaller as long as the absolute value of the autoregressive coefficient is < 1.0.
Contrast:
– Shocks in an AR model carry forward to affect future observations indefinitely
– Shocks in an MA model carry forward a limited number of periods
Forecasts from AR Models III The effect on future observations at time t+j of a shock at a particular time t is measured by the impulse response function.
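For the annual-inflation example (a1 = 0.58), the AR(1) impulse response to a unit shock can be computed directly, since the effect at horizon j is just a1**j:

```python
a1 = 0.58
irf = [a1**j for j in range(6)]           # effect of a unit shock at horizons 0..5
print([round(v, 2) for v in irf])         # [1.0, 0.58, 0.34, 0.2, 0.11, 0.07]
```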
Impulse Response Function: Annual Inflation AR(1) Model
[Figure: AR(1) impulse response function over horizons 0–12; the response starts at 1.0 and decays geometrically toward zero.]
Forecasts from MA Models
How does anything that occurs today (time = t) carry forward to influence future observations?
– Assume that we see a shock e_t = 1.0.
– What are the implications for t+1, t+2, etc?
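For an MA(1), X_t = c1·e_{t−1} + e_t, the answer is short: a unit shock e_t raises X_t by 1 and X_{t+1} by c1, and has no effect on X_{t+2} or beyond (c1 = 0.8 here as an illustrative value):

```python
c1 = 0.8
# Effect of a unit shock e_t on X_t, X_{t+1}, X_{t+2}, X_{t+3}:
ma_irf = [1.0, c1, 0.0, 0.0]
print(ma_irf)   # [1.0, 0.8, 0.0, 0.0]
```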
Transitory & Permanent Shocks A transitory shock is one whose effect eventually dies off.
– go far enough out into the future and events at that time are not influenced by what is happening today
A permanent shock is one whose effect continues to influence events, no matter how far into the future.
– all past events have a lasting effect on the present and the future