Forecasting Part 1
JY Le Boudec, 1 March 2015
perfeval.epfl.ch/printMe/forecastPost.pdf

Contents

1. What is forecasting?
2. Linear Regression
3. Estimation error vs Prediction interval
4. Avoiding Overfitting
5. Use of Bootstrap


1. What is forecasting?

Assume you have been able to define the nature of the load for your study. It remains to have an idea about its intensity.

It is impossible to forecast without error.

The good engineer should forecast what can be forecast and give uncertainty intervals.

The rest is outside our control.


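Forecasting = finding conditional distribution of future given past

Assume we observe some data. We have observed $Y_1, \dots, Y_t$ and want to forecast $Y_{t+\ell}$.
A full forecast is the conditional distribution of $Y_{t+\ell}$ given $Y_1, \dots, Y_t$.

A point forecast is (e.g.) the mean, i.e. $\hat{Y}_t(\ell) = E(Y_{t+\ell} \mid Y_1, \dots, Y_t)$ (or the median).

A prediction interval $[A, B]$ at level 95% is such that $P(A \le Y_{t+\ell} \le B \mid Y_1, \dots, Y_t) = 0.95$.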


2. Use of Regression Models

Simple, often used.
Based on a model fitted over the past, assumed to hold in the future.


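Prediction

We have obtained the model
$Y_t = \sum_{j=1}^{p} \beta_j f_j(t) + \epsilon_t$
with $\epsilon_t \sim$ iid $N(0, \sigma^2)$.

The conditional distribution of $Y_{t+\ell}$ given $Y_1, \dots, Y_t$ is
$N(\hat{Y}_t(\ell), \sigma^2)$ with $\hat{Y}_t(\ell) = \sum_{j=1}^{p} \beta_j f_j(t+\ell)$,
because $\epsilon_{t+\ell}$ is independent of $Y_1, \dots, Y_t$ (iid assumption).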


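Virus Growth Data

We have obtained the model (in log scale)
$Y_t = \log a + \alpha t + \epsilon_t$
with $\epsilon_t \sim$ iid Laplace($\lambda$), $\lambda = 6.2205$.

A 95% prediction interval is $\hat{Y}_t(\ell) \pm \eta$,
where $\eta$ is the 97.5% quantile of the Laplace($\lambda$) distribution.

In natural scale:
Point prediction: $e^{\hat{Y}_t(\ell)}$
95% prediction interval: $[e^{\hat{Y}_t(\ell) - \eta},\ e^{\hat{Y}_t(\ell) + \eta}]$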


[Figure: virus growth data with fitted model and prediction intervals; panels "Natural scale" and "Log scale"; label "6.2205"]

Prediction interval at time 25: PI = [19942 ; 52248]
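The interval quoted above can be checked against the Laplace quantile. The sketch below is only a consistency check under stated assumptions: that 6.2205 is the fitted Laplace parameter $\lambda$, that the 97.5% quantile of Laplace($\lambda$) is $\ln(20)/\lambda$, and that the log-scale point forecast sits at the midpoint of the log-scale interval. All variable names are illustrative and do not come from the slides.

```python
import math

lam = 6.2205                      # assumed: fitted Laplace parameter (log scale)
pi_low, pi_high = 19942, 52248    # prediction interval at time 25, as quoted

# 97.5% quantile of Laplace(lam): solve 1 - 0.5*exp(-lam*eta) = 0.975
eta = math.log(20) / lam

# In log scale the interval is y_hat +/- eta, so in natural scale the ratio
# of the two endpoints must be exp(2*eta), whatever the point forecast is.
ratio_from_model = math.exp(2 * eta)
ratio_quoted = pi_high / pi_low

# Assumed point forecast: midpoint of the log-scale interval (geometric mean).
point_forecast = math.sqrt(pi_low * pi_high)

print(f"eta = {eta:.4f}")
print(f"endpoint ratio: model {ratio_from_model:.4f}, quoted {ratio_quoted:.4f}")
print(f"implied point forecast = {point_forecast:.0f}")
```

The interval is symmetric around the point forecast in log scale only; in natural scale the upper endpoint lies further from the point forecast than the lower one.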


Say what is true, for this model

A. The width of prediction interval is constant and equal to $2 \times 1.96\,\sigma$
B. A is true and $\sigma$ is the root mean square of the residuals up to time $t$
C. A is true and $\sigma$ is the root mean square of the forecast errors if we apply the model up to time $t$
D. B and C
E. None of the above
F. I don’t know

[Poll results chart; percentages as extracted: 60%, 0%, 0%, 0%, 40%, 0%]

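Solution

The 95% prediction interval is $\hat{Y}_t(\ell) \pm 1.96\,\sigma$.
The model is fitted with least squares, therefore $\sigma$ is the root mean square of the residuals (Thm 3.1).

Note that the residuals are equal to the forecast errors (observed value minus forecast).

Answer D.

[Figure annotation: forecast; observed value minus forecast = residuals]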
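The Gaussian case can be sketched in a few lines. This is a minimal illustration, not the model used in the slides: the data are synthetic (a linear trend plus Gaussian noise), the fit is the closed-form least-squares line, and every name is hypothetical. It shows $\sigma$ estimated as the root mean square of the residuals and a prediction interval whose width does not depend on the horizon.

```python
import math
import random

random.seed(1)

# Hypothetical data: linear trend plus Gaussian noise (true sigma = 2).
times = list(range(1, 101))
obs = [3.0 + 0.5 * s + random.gauss(0.0, 2.0) for s in times]

# Least-squares fit of y = b0 + b1 * s (closed form for one regressor).
n = len(times)
mean_s = sum(times) / n
mean_y = sum(obs) / n
sxy = sum((s - mean_s) * (y - mean_y) for s, y in zip(times, obs))
sxx = sum((s - mean_s) ** 2 for s in times)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_s

# Residuals = in-sample forecast errors; sigma = their root mean square.
residuals = [y - (b0 + b1 * s) for s, y in zip(times, obs)]
sigma = math.sqrt(sum(r * r for r in residuals) / n)

def prediction_interval(horizon):
    """95% interval for the observation `horizon` steps after the last one."""
    point = b0 + b1 * (times[-1] + horizon)
    return point - 1.96 * sigma, point + 1.96 * sigma

low, high = prediction_interval(5)
print(f"sigma = {sigma:.3f}, PI at horizon 5 = [{low:.2f}, {high:.2f}]")
```

The interval ignores the uncertainty on the fitted coefficients, which is the simplification discussed in section 3.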


Say what is true, for this model

A. In log scale the width of prediction intervals is constant and is equal to the 97.5% quantile of Laplace($\lambda$)
B. A is true and $1/\lambda$ is the mean square of log-scale residuals
C. A is true and $1/\lambda$ is the mean of the absolute value of log-scale residuals
D. None of the above
E. I don’t know

[Poll results chart; percentages as extracted: 20%, 0%, 0%, 53%, 27%]

Solution

A is true because the model in which we believe assumes Laplace noise; further, $1/\lambda$ is the mean of the absolute value of the residuals (Thm 3.2). Answer C.

Note that the residuals are also the forecast errors (in log scale).

Note that in natural scale, the prediction interval is not constant (and not symmetric).


What is the 97.5% quantile of the Laplace($\lambda$) distribution?

[Answer options A to F: formulas not recoverable from the extraction]
G. I don’t know

[Poll results chart; percentages as extracted: 0%, 6%, 0%, 47%, 24%, 12%, 12%]

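Solution

$1/\lambda$ is a scale parameter of the Laplace distribution, hence the quantile should scale like $1/\lambda$
(hint: to simulate Laplace noise, with proba ½ you do $X = -\frac{1}{\lambda}\log U$ and with proba ½ you do $X = \frac{1}{\lambda}\log U$, with $U$ uniform on $(0,1)$).

All answers except D are thus impossible. Answer D.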


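Solution

From the CDF of Laplace($\lambda$) we obtain $1 - \frac{1}{2} e^{-\lambda \eta} = 0.975$, which gives $\eta = \frac{\ln 20}{\lambda} \approx \frac{3.00}{\lambda}$.

Note that the 95% prediction interval for Laplace noise is $\hat{Y}_t(\ell) \pm \eta$ where $\eta$ is the 97.5% quantile, because the pdf is symmetric. We can also obtain $\eta$ by computing the 95% quantile of the absolute value of Laplace noise, which is an exponential RV, i.e. solve for $\eta$:
$1 - e^{-\lambda \eta} = 0.95$

Thus $\eta = \frac{\ln 20}{\lambda} \approx \frac{3.00}{\lambda}$.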
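The quantile can also be checked numerically. The sketch below simulates Laplace noise as an exponential magnitude with a random sign, then compares the empirical 97.5% quantile with $\ln(20)/\lambda$ and the mean absolute value with $1/\lambda$. The value of `lam`, the sample size and the function name are arbitrary choices for the demonstration, not values from the slides.

```python
import math
import random

random.seed(0)
lam = 2.0          # hypothetical Laplace parameter, chosen for the demo
n = 200_000

def laplace_sample(lam):
    """Exponential(lam) magnitude with a random sign, each sign with proba 1/2."""
    magnitude = -math.log(1.0 - random.random()) / lam   # 1 - U lies in (0, 1]
    return magnitude if random.random() < 0.5 else -magnitude

noise = sorted(laplace_sample(lam) for _ in range(n))

# Empirical 97.5% quantile versus the closed form ln(20)/lam.
empirical_q = noise[int(0.975 * n)]
closed_form_q = math.log(20) / lam

# Estimate of 1/lam as the mean absolute value of the noise.
mean_abs = sum(abs(x) for x in noise) / n

print(f"empirical 97.5% quantile = {empirical_q:.3f}")
print(f"ln(20)/lam               = {closed_form_q:.3f}")
print(f"mean |noise| = {mean_abs:.3f}  (1/lam = {1 / lam:.3f})")
```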


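3. How about the estimation error?

In practice we estimate the model parameter $\theta$ from $Y_1, \dots, Y_t$.
When computing the forecast, we pretend $\theta$ is known, and thus make an estimation error (i.e. we ignore confidence intervals on $\theta$; it is hoped that the estimation error is much less than the prediction interval).

Let us return to an example we already saw. Assume we observe $Y_1, \dots, Y_t$ and want to forecast $Y_{t+1}$. Assume that we believe in the model $Y_t \sim$ iid $N(\mu, \sigma^2)$. We estimate and obtain $\hat{\mu}$, $\hat{\sigma}$.

Point prediction for $Y_{t+1}$ if we ignore estimation uncertainty: $\hat{\mu}$; if we account for estimation uncertainty, $\hat{\mu}$ as well.

95% prediction interval for $Y_{t+1}$ if we ignore estimation uncertainty: $\hat{\mu} \pm 1.96\,\hat{\sigma}$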


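Thm 2.6 says that an exact interval that accounts for estimation uncertainty is $\hat{\mu} \pm \eta \sqrt{1 + \frac{1}{t}}\,\hat{\sigma}$, where $\eta$ is the 97.5% quantile of Student's $t_{t-1}$ distribution; compare to $\hat{\mu} \pm 1.96\,\hat{\sigma}$.

The estimation error decays as $t$ grows and is small for large $t$.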
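The effect of ignoring estimation uncertainty can be seen by simulation. The sketch below assumes iid Gaussian observations with arbitrary, hypothetical values of the mean and standard deviation, and measures how often the naive interval (estimated mean plus or minus 1.96 estimated standard deviations) contains the next observation. The function name and all constants are illustrative.

```python
import math
import random

random.seed(42)

def naive_coverage(t, trials=10_000, mu=10.0, sigma=3.0):
    """Fraction of trials in which mu_hat +/- 1.96*sigma_hat contains Y_{t+1}."""
    hits = 0
    for _ in range(trials):
        past = [random.gauss(mu, sigma) for _ in range(t)]
        mu_hat = sum(past) / t
        # Sample standard deviation with the 1/(t-1) normalisation.
        sigma_hat = math.sqrt(sum((y - mu_hat) ** 2 for y in past) / (t - 1))
        future = random.gauss(mu, sigma)          # the value to forecast
        if abs(future - mu_hat) <= 1.96 * sigma_hat:
            hits += 1
    return hits / trials

cov_small = naive_coverage(t=5)
cov_large = naive_coverage(t=100)
print(f"t = 5:   coverage of the naive 95% interval = {cov_small:.3f}")
print(f"t = 100: coverage of the naive 95% interval = {cov_large:.3f}")
```

With few observations the naive interval is too narrow and covers noticeably less than 95% of the time; with many observations the shortfall becomes small.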


Exact Formulas exist for Linear Regression with LS


Take-Home Message

When we use a fitted model there is some uncertainty that adds to the prediction intervals.
In most cases we can ignore the model uncertainty because it impacts the prediction intervals only marginally.
In some rare cases (e.g. linear regression with Gaussian errors) there are exact formulas.


4. The Overfitting Problem

Assume we want to improve our model by adding more parameters: add a polynomial term + more harmonics.

[Figure: two fitted models, panels labelled "0, 1" and "10, 3"]

Prediction for the better model

This is the overfitting problem: a better fit is not the best predictor; in the extreme case, a model can fit the data exactly and yet be unable to model it.
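Overfitting is easy to reproduce on synthetic data. The sketch below is an illustration only: it uses hypothetical data (a sine wave plus Gaussian noise) and polynomial fits via `numpy.polyfit`, not the Internet traffic data or the harmonic model of the slides. The last ten points are held out and play the role of the future.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a smooth signal plus noise, observed at 40 time points.
t = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.3, size=t.size)

# Fit on the first 30 points only; keep the last 10 as "future" test data.
t_fit, y_fit = t[:30], y[:30]
t_test, y_test = t[30:], y[30:]

def rms(values):
    """Root mean square of an array of errors."""
    return float(np.sqrt(np.mean(np.square(values))))

errors = {}
for degree in (3, 10):
    coeffs = np.polyfit(t_fit, y_fit, degree)
    fit_error = rms(y_fit - np.polyval(coeffs, t_fit))
    test_error = rms(y_test - np.polyval(coeffs, t_test))
    errors[degree] = (fit_error, test_error)
    print(f"degree {degree:2d}: fit RMS = {fit_error:.3f}, "
          f"prediction RMS = {test_error:.3f}")
```

The higher-degree polynomial always fits the past at least as well, since the models are nested, but its error on the held-out points is typically far larger.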


How to avoid overfitting

Method 1: use of test data
Method 2: information criterion


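Method 2: Information Criteria

We saw that the likelihood can be used to define a score function for the model fitting phase (e.g. for an LS model, maximizing the likelihood is the same as minimizing the sum of squared residuals).
To avoid overfitting, add a penalty term to the score.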
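A penalized score can be sketched as follows. The exact criteria used in the slides are not present in the extracted text, so this example assumes the standard definitions AIC $= -2\ell + 2k$ and BIC $= -2\ell + k \ln n$, where $\ell$ is the maximized Gaussian log-likelihood of a least-squares fit and $k$ counts the regression coefficients plus the noise variance. The data and the polynomial model class are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data generated by a degree-2 polynomial plus Gaussian noise.
n = 60
x = np.linspace(-1.0, 1.0, n)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.5, size=n)

def information_criteria(degree):
    """Return (AIC, BIC) for a least-squares polynomial fit of given degree."""
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    sigma2 = rss / n                                  # ML estimate of the variance
    minus_two_loglik = n * np.log(2 * np.pi * sigma2) + n
    k = degree + 2                                    # coefficients + variance
    aic = minus_two_loglik + 2 * k
    bic = minus_two_loglik + k * np.log(n)
    return aic, bic

scores = {d: information_criteria(d) for d in range(0, 9)}
best_aic = min(scores, key=lambda d: scores[d][0])
best_bic = min(scores, key=lambda d: scores[d][1])
for d, (aic, bic) in scores.items():
    print(f"degree {d}: AIC = {aic:7.2f}, BIC = {bic:7.2f}")
print(f"selected degree: AIC -> {best_aic}, BIC -> {best_bic}")
```

Because the penalty is computed from the fit itself, all available data can be used for fitting; no test subset has to be set aside.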


Best Model for Internet Data, d=1, h up to 10

Information criteria are able to identify the best model.

Best Model for Internet Data, h=3, d up to 10

Information criteria are not able to identify the best model; the polynomial models are not a good class of models.

Say what is true

A. When doing the fit and if we use an information criterion, we can use all data available up to time $t$
B. When doing the fit and if we use a score + test data we can use all data available up to time $t$
C. A and B
D. None
E. I don’t know

[Poll results chart; percentages as extracted: 75%, 6%, 6%, 13%, 0%]

Solution

A is true.
B is not true: if we use test data we need to keep a subset of the data for testing the prediction accuracy. We should not use this subset of data for fitting the model, otherwise the prediction performance is not properly assessed.
Answer A.


5. Use of Bootstrap

Assume we have a prediction model with noise term $\epsilon_t$.
The estimation of the model parameters is done assuming some distribution for $\epsilon_t$. Assume this distribution is only approximately known; we can improve the prediction intervals if we use a better approximation of this distribution.
For example, we can use the principle of the Bootstrap, i.e. estimate the distribution of $\epsilon_t$ by its empirical distribution.


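Assume the $\epsilon_t$ are iid and apply Theorem 2.5 to $\epsilon_{t+\ell}$.

This gives the algorithm:
1. Estimate the model parameters by some method
2. Estimate the residuals $e_1, \dots, e_t$
3. (Thm 2.5) Compute the order statistics $e_{(j)}$ and $e_{(k)}$ of the residuals, with $j = \lfloor 0.025\,(t+1) \rfloor$ and $k = \lceil 0.975\,(t+1) \rceil$
4. Prediction interval for $Y_{t+\ell}$: $[\hat{Y}_t(\ell) + e_{(j)},\ \hat{Y}_t(\ell) + e_{(k)}]$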
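The algorithm can be sketched as follows. This is an illustration under assumptions: the index rule $j = \lfloor (n+1)\alpha/2 \rfloor$, $k = \lceil (n+1)(1-\alpha/2) \rceil$ is the usual order-statistic prediction interval for iid data, and the residuals and the point forecast are hypothetical values rather than results from the slides. The function name is illustrative.

```python
import math
import random

random.seed(7)

def order_statistic_interval(residuals, alpha=0.05):
    """Interval [e_(j), e_(k)] from the sorted residuals.

    Uses j = floor((n+1)*alpha/2) and k = ceil((n+1)*(1-alpha/2)),
    which needs n >= 2/alpha - 1 residuals (39 for alpha = 0.05).
    """
    n = len(residuals)
    j = math.floor((n + 1) * alpha / 2)
    k = math.ceil((n + 1) * (1 - alpha / 2))
    if j < 1 or k > n:
        raise ValueError("not enough residuals for this level")
    ordered = sorted(residuals)
    return ordered[j - 1], ordered[k - 1]      # 1-based order statistics

# Hypothetical residuals: skewed noise centred at zero, so that the
# resulting interval is visibly asymmetric.
residuals = [random.expovariate(1.0) - 1.0 for _ in range(199)]

point_forecast = 50.0                          # hypothetical forecast value
e_low, e_high = order_statistic_interval(residuals)
interval = (point_forecast + e_low, point_forecast + e_high)
print(f"prediction interval = [{interval[0]:.2f}, {interval[1]:.2f}]")
```

No distribution is assumed for the noise beyond iid, so the interval follows the shape of the residuals: with skewed residuals it is not symmetric around the point forecast.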


Example

For this example, the bootstrap (done in log scale) gives an asymmetric prediction interval.

[Figure: prediction intervals from the bootstrap and assuming Laplace noise]

Example

For this example, the bootstrap gives slightly smaller intervals than the ones based on Gaussian noise.

[Figure: prediction intervals assuming Gaussian noise and from the bootstrap]