

Date post: 11-Mar-2020
Bottom-up Estimation and Top-down Prediction for Multi-level Models: Solar Energy Prediction Combining Information from Multiple Sources Jae-Kwang Kim Department of Statistics, Iowa State University Ross-Royall Symposium: Johns Hopkins University Feb 26, 2016 1/37

Bottom-up Estimation and Top-down Prediction for Multi-level Models:

Solar Energy Prediction Combining Information from Multiple Sources

Jae-Kwang Kim

Department of Statistics, Iowa State University

Ross-Royall Symposium: Johns Hopkins University, Feb 26, 2016

1/37


Collaborators

- Youngdeok Hwang (IBM Research)
- Siyuan Lu (IBM Research)

2/37


Outline

- Introduction
- Modeling approach
- Application: Solar Energy Prediction
- Conclusion

Overview 3/37


Mountain Climbing for Problem Solving!

[Diagram labels: Real Problem, Stat Problem, Math Problem, Math Solution, Stat Solution, Real Solution]

We need a map (abstraction) to move from problem to solution!

Overview 4/37


Real Problem: Solar Energy Prediction

- Solar electricity is now projected to supply 14% of the total demand of the contiguous U.S. by 2030, and 27% by 2050.

Introduction 5/37


IBM Solar Forecasting

Figure: Sky camera for short-term forecasting (located at Watson)

- Research program funded by the U.S. Department of Energy’s SunShot Initiative.

Introduction 6/37


Monitoring Network

- Global Horizontal Irradiance (GHI): the total amount of shortwave radiation received from above by a horizontal surface.

- GHI measurements are collected every 15 minutes from 1,528 sensor units.

Introduction 7/37


Weather Models

- GHI predictions are available from two widely used weather models: the North American Mesoscale Forecast System (NAM) and the Short-Range Ensemble Forecast (SREF).

- We want to combine GHI measurements with the weather model outcomes to obtain the solar energy prediction.

Introduction 8/37


Statistical Model: Basic setup

- The population is divided into $H$ exhaustive and non-overlapping groups, where group $h$ has $n_h$ units, for $h = 1, \ldots, H$.

- For group $h$, $n_h$ units are selected for measurement.
- From the $i$-th unit of group $h$, the measurements and their associated covariates, $(y_{hij}, \mathbf{x}_{hij})$, are available for $j = 1, \ldots, n_{hi}$.

Model 9/37


Multi-level Model

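- Consider the level one and level two models,

$$\mathbf{y}_{hi} \sim f_1(\mathbf{y}_{hi} \mid \mathbf{x}_{hi}; \theta_{hi}),$$

$$\theta_{hi} \sim f_2(\theta_{hi} \mid z_{hi}; \zeta_h).$$

- $\mathbf{y}_{hi} = (y_{hi1}, \ldots, y_{hin_{hi}})^\top$: observations at unit $(hi)$.

- $\mathbf{x}_{hi} = (\mathbf{x}_{hi1}^\top, \ldots, \mathbf{x}_{hin_{hi}}^\top)^\top$: covariates associated with unit $(hi)$ (= two weather model outcomes).

- $z_{hi}$: unit-specific covariate.
- Note that $\theta_{hi}$ is a parameter in the level 1 model, but a random variable (latent variable) in the level 2 model.
- We can build a level 3 model on $\zeta_h$ if necessary:

$$\zeta_h \sim f_3(\zeta_h \mid q_h; \alpha).$$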
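To make the hierarchy concrete, here is a minimal simulation sketch. It assumes normal distributions at all three levels, a two-dimensional $\theta_{hi}$, and made-up numeric values; the slide leaves $f_1$, $f_2$, $f_3$ generic, so the function name `simulate_three_level` and every constant below are illustrative assumptions only.

```python
import numpy as np

def simulate_three_level(H=3, n_h=4, n_hi=50, seed=0):
    """Draw data from a hypothetical normal three-level model.

    Level 3: zeta_h (here a mean vector beta_h) ~ N(mu, 0.3^2 I)
    Level 2: theta_hi ~ N(beta_h, 0.1^2 I)
    Level 1: y_hij ~ N(x_hij . theta_hi, 0.5^2)
    All numeric values are assumptions chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    mu = np.array([1.0, 0.5])  # plays the role of alpha
    data = []
    for h in range(H):
        beta_h = mu + 0.3 * rng.standard_normal(2)            # level 3 draw
        for i in range(n_h):
            theta_hi = beta_h + 0.1 * rng.standard_normal(2)  # level 2 draw
            x = rng.standard_normal((n_hi, 2))                # two covariates
            y = x @ theta_hi + 0.5 * rng.standard_normal(n_hi)  # level 1 draw
            data.append({"group": h, "unit": i, "x": x, "y": y,
                         "theta": theta_hi})
    return data
```

Each returned record corresponds to one unit $(hi)$, which is the granularity at which the later summarization step operates.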

Model 10/37


Data Structure Under Two-level Model

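[Diagram: $\zeta_h$ at the top is linked by $f_2$ to $\theta_{h1}$, $\theta_{h2}$, $\theta_{h3}$; each $\theta_{hi}$ is linked by $f_1$ to its observations $y_{hi1}, \ldots, y_{hin_i}$.]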

Model 11/37


Why Multi-level Models?

1. To reflect reality: to allow for structural heterogeneity (= variety in big data) across areas.

2. To borrow strength: we need to predict at locations with no direct measurement.

Model 12/37


Real Problems Become Statistical Problems!

1. Parameter estimation
2. Prediction
3. Uncertainty quantification

A Bayesian method using MCMC computation is a useful tool.

Model 13/37


Classical Solutions Do Not Necessarily Work in Reality!

1. No single data file exists, as the data are stored in the cloud (Hadoop Distributed File System).

2. Micro-level data are not always available to the analyst, for confidentiality and security reasons.

3. The classical solution, based on an MCMC algorithm, is time consuming, and the computational cost can be huge for big data.

This is a typical big data problem.

Solution 14/37


New Solution: Divide-and-Conquer Approach

- Three steps for parameter estimation at each level:
  1. Summarization: find a summary (= measurement) for the latent variable to obtain the sampling error model.
  2. Combine: combine the sampling error model and the latent variable model.
  3. Learning: estimate the parameters from the summary data.

- Apply the three steps to the level two model, and then repeat them for the level three model.

Solution 15/37


Modeling Structure

[Diagram: at Site 1, Site 2, and Site 3, each sensor sends individual data to storage, where a Level 1 model produces a unit summary; the unit summaries feed a Level 2 model that produces a group summary.]

Solution 16/37


Summarization

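- Find a measurement for $\theta_{hi}$.
- For each unit, treat $(\mathbf{x}_{hi}, \mathbf{y}_{hi})$ as a single data set to obtain the best estimator $\hat{\theta}_{hi}$ of $\theta_{hi}$, treating $\theta_{hi}$ as a fixed parameter.

- Obtain the sampling distribution of $\hat{\theta}_{hi}$ as a function of $\theta_{hi}$: $\hat{\theta}_{hi} \sim g_1(\hat{\theta}_{hi} \mid \theta_{hi})$.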

Solution 17/37


Summarization Step under Two-Level Model Structure

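[Diagram: $\zeta_h$ is linked by $f_2$ to $\theta_{h1}$, $\theta_{h2}$, $\theta_{h3}$; each $\theta_{hi}$ is linked by $g_1$ to its estimate $\hat{\theta}_{hi}$.]

$g_1(\hat{\theta}_{hi} \mid \theta_{hi})$: sampling error model, $\hat{\theta}_{hi} \sim N(\theta_{hi}, V(\hat{\theta}_{hi}))$.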
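The summarization step can be sketched for one unit as follows. This assumes a linear level 1 model fitted by ordinary least squares, which is a simplification chosen for illustration (the application later in the talk uses $t$-distributed errors); the function name `summarize_unit` is hypothetical.

```python
import numpy as np

def summarize_unit(x, y):
    """Return (theta_hat, V_hat) for one unit from an OLS fit.

    theta_hat is the unit-level estimate and V_hat its estimated sampling
    covariance, so that theta_hat is approximately N(theta, V_hat).
    Assumes y = x @ theta + error with homoscedastic errors.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = x.shape
    xtx_inv = np.linalg.inv(x.T @ x)
    theta_hat = xtx_inv @ x.T @ y
    resid = y - x @ theta_hat
    sigma2_hat = resid @ resid / (n - p)   # unbiased error variance estimate
    V_hat = sigma2_hat * xtx_inv           # sampling covariance of theta_hat
    return theta_hat, V_hat
```

Only the pair `(theta_hat, V_hat)` needs to leave the unit's storage, which is what makes the approach workable when micro-level data cannot be pooled.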

Solution 18/37


Combining

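- The marginal distribution of $\hat{\theta}_{hi}$ is

$$m_2(\hat{\theta}_{hi} \mid z_{hi}; \zeta_h) = \int g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)\, d\theta_{hi}, \qquad (1)$$

which combines $g_1(\hat{\theta}_{hi} \mid \theta_{hi})$ and $f_2(\theta_{hi} \mid z_{hi}; \zeta_h)$ via the latent variable $\theta_{hi}$.

- Also, the prediction model for the latent variable $\theta_{hi}$ is obtained by using Bayes' theorem:

$$p_2(\theta_{hi} \mid \hat{\theta}_{hi}; \zeta_h) = \frac{g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)}{\int g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)\, d\theta_{hi}}. \qquad (2)$$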
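When both $g_1$ and $f_2$ are normal, the integrals in (1) and (2) have closed forms. The sketch below assumes $\hat{\theta} \mid \theta \sim N(\theta, V)$ and $\theta \sim N(\beta, \Sigma)$ with no covariate $z_{hi}$; the function name `combine_normal` and this particular parameterization are assumptions for illustration.

```python
import numpy as np

def combine_normal(theta_hat, V, beta, Sigma):
    """Closed-form marginal (1) and prediction model (2) in the normal case.

    g1: theta_hat | theta ~ N(theta, V)
    f2: theta ~ N(beta, Sigma)
    Returns ((marginal mean, marginal cov), (posterior mean, posterior cov)).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    beta = np.asarray(beta, dtype=float)
    V = np.asarray(V, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    marg_cov = Sigma + V                      # m2: theta_hat ~ N(beta, Sigma + V)
    gain = Sigma @ np.linalg.inv(marg_cov)    # shrinkage toward the group mean
    post_mean = beta + gain @ (theta_hat - beta)
    post_cov = Sigma - gain @ Sigma
    return (beta, marg_cov), (post_mean, post_cov)
```

With equal variances, for example `combine_normal([2.0], [[1.0]], [0.0], [[1.0]])`, the posterior mean lands halfway between the estimate and the group mean, which illustrates how units with noisy estimates borrow strength from the group.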

Solution 19/37


Combining Step

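[Diagram: $g_1$ links $\theta_{hi}$ and $\hat{\theta}_{hi}$, and $f_2$ links $\zeta_h$ and $\theta_{hi}$; combining them yields $m_2$ and $p_2$.]

Sampling error model ($g_1$) + Latent variable model ($f_2$) ⇒ Marginal model ($m_2$), Prediction model ($p_2$)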

Solution 20/37


Learning

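- The level two model can be learned by the EM algorithm: at the $t$-th iteration, we update $\zeta_h$ by solving

$$\hat{\zeta}_h^{(t+1)} \leftarrow \arg\max_{\zeta_h} \sum_{i=1}^{n_h} E_{p_2}\left\{ \log f_2(\theta_{hi} \mid z_{hi}; \zeta_h) \,\Big|\, \hat{\theta}_{hi}; \hat{\zeta}_h^{(t)} \right\},$$

where the conditional expectation is taken with respect to the prediction model $p_2$ in (2) evaluated at $\hat{\zeta}_h^{(t)}$, and $\hat{\zeta}_h^{(t)}$ denotes the value of $\zeta_h$ at the $t$-th iteration of the EM algorithm.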
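A minimal sketch of this EM update for one group, assuming the normal case $\theta_{hi} \sim N(\beta_h, \Sigma_h)$ and $\hat{\theta}_{hi} \mid \theta_{hi} \sim N(\theta_{hi}, V_{hi})$ with the $V_{hi}$ treated as known. The function name `em_level2`, the starting values, and the fixed iteration count are assumptions for illustration.

```python
import numpy as np

def em_level2(theta_hats, Vs, n_iter=200):
    """EM estimate of zeta_h = (beta, Sigma) from unit summaries.

    theta_hats: (n, p) array of unit estimates.
    Vs:         (n, p, p) array of their sampling covariances.
    """
    theta_hats = np.asarray(theta_hats, dtype=float)
    Vs = np.asarray(Vs, dtype=float)
    n, p = theta_hats.shape
    beta = theta_hats.mean(axis=0)
    Sigma = np.cov(theta_hats, rowvar=False).reshape(p, p) + 1e-6 * np.eye(p)
    for _ in range(n_iter):
        # E-step: posterior mean and covariance of each theta_hi under p2
        gains = Sigma @ np.linalg.inv(Sigma + Vs)            # (n, p, p)
        means = beta + np.einsum("ipq,iq->ip", gains, theta_hats - beta)
        covs = Sigma - gains @ Sigma
        # M-step: maximize the expected log f2 over (beta, Sigma)
        beta = means.mean(axis=0)
        dev = means - beta
        Sigma = (covs + np.einsum("ip,iq->ipq", dev, dev)).mean(axis=0)
    return beta, Sigma
```

Note that the loop touches only the unit summaries, never the raw measurements, so its cost depends on the number of units rather than the number of observations.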

Solution 21/37


Learning Using EM Algorithm

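[Diagram: the E-step predicts $\theta_{hi}$ from $\hat{\theta}_{hi}$, $z_{hi}$, and the current $\zeta_h$; the M-step updates $\zeta_h$.]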

Solution 22/37


Bayesian Interpretation

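- Prediction model (2) can be written as

$$p_2(\theta_{hi} \mid \hat{\theta}_{hi}; \zeta_h) \propto g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h).$$

- Here, $f_2(\theta_{hi} \mid z_{hi}; \zeta_h)$ can be treated as a prior distribution, and $p_2(\theta_{hi} \mid \hat{\theta}_{hi}; \zeta_h)$ is a posterior distribution that incorporates the observation of $\hat{\theta}_{hi}$.

- Using $g_1(\hat{\theta}_{hi} \mid \theta_{hi})$ instead of the full likelihood simplifies the computation (Approximate Bayesian Computation).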

Solution 23/37


Extension to Three Level Model

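| Model | Measurement (data summary) | Parameter | Latent variable |
|---|---|---|---|
| Level 1 | $\mathbf{y}_{hi} = (y_{hi1}, \cdots, y_{hin})$ | $\theta_{hi}$ | |
| Level 2 | $\hat{\theta}_h = (\hat{\theta}_{h1}, \cdots, \hat{\theta}_{hn_h})$ | $\zeta_h$ | $\theta_h = (\theta_{h1}, \cdots, \theta_{hn_h})$ |
| Level 3 | $\hat{\zeta} = (\hat{\zeta}_1, \cdots, \hat{\zeta}_H)$ | $\alpha$ | $\zeta = (\zeta_1, \cdots, \zeta_H)$ |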

We can apply the same three steps to the level three model.

Solution 24/37


Bottom-up Estimation

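Level 3
- Latent variable model: $f_3(\zeta_h \mid q_h; \alpha)$
- Sampling error model: $\hat{\zeta}_h \sim g_2(\hat{\zeta}_h \mid \zeta_h)$
- Parameter estimation: $\hat{\alpha} = \arg\max_{\alpha} \sum_{h=1}^{H} \log \int g_2(\hat{\zeta}_h \mid \zeta_h)\, f_3(\zeta_h \mid q_h; \alpha)\, d\zeta_h$

Level 2
- Latent variable model: $f_2(\theta_{hi} \mid z_{hi}; \zeta_h)$
- Sampling error model: $\hat{\theta}_{hi} \sim g_1(\hat{\theta}_{hi} \mid \theta_{hi})$
- Parameter estimation: $\hat{\zeta}_h = \arg\max_{\zeta_h} \sum_{i=1}^{n_h} \log \int g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)\, d\theta_{hi}$

Level 1
- Latent variable model: $f_1(y_{hij} \mid \mathbf{x}_{hij}; \theta_{hi})$
- Parameter estimation: $\hat{\theta}_{hi} = \arg\max_{\theta_{hi}} \sum_{j=1}^{n_{hi}} \log f_1(y_{hij} \mid \mathbf{x}_{hij}; \theta_{hi})$

Figure: An illustration of the bottom-up approach to parameter estimation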

Solution 25/37


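Prediction

- Our goal is to predict unobserved $y_{hij}$ values from the above models using the parameter estimates.
- The best prediction for $y_{hij}$ is

$$y^*_{hij} = E_{p_3}\left[ E_{p_2}\left\{ E_{f_1}(y_{hij} \mid \mathbf{x}_{hij}, \theta_{hi}) \mid \hat{\theta}_{hi}; \zeta_h \right\} \mid \hat{\zeta}_h; \hat{\alpha} \right],$$

where

$$p_3(\zeta_h \mid \hat{\zeta}_h, \hat{\alpha}) = \frac{g_2(\hat{\zeta}_h \mid \zeta_h)\, f_3(\zeta_h \mid q_h; \hat{\alpha})}{\int g_2(\hat{\zeta}_h \mid \zeta_h)\, f_3(\zeta_h \mid q_h; \hat{\alpha})\, d\zeta_h}$$

and

$$p_2(\theta_{hi} \mid \hat{\theta}_{hi}, \zeta_h) = \frac{g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)}{\int g_1(\hat{\theta}_{hi} \mid \theta_{hi})\, f_2(\theta_{hi} \mid z_{hi}; \zeta_h)\, d\theta_{hi}}.$$

- The prediction is made in a top-down manner.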

Solution 26/37


Prediction: Top-down Prediction

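[Diagram: $\hat{\alpha}$ at the top is linked by $p_3$ to $\zeta^*_1$, $\zeta^*_2$, $\zeta^*_3$; each $\zeta^*_h$ is linked by $p_2$ to $\theta^*_{hi}$.]

Predict $y_{hij}$ using $f_1(y_{hij} \mid \mathbf{x}_{hij}; \theta^*_{hi})$.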

Solution 27/37


Prediction: Top-down Prediction

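| Level | Latent | Prediction model | Best prediction |
|---|---|---|---|
| 3 | $\zeta_h$ | $p_3(\zeta_h \mid \hat{\zeta}_h; \hat{\alpha})$ | $\zeta^*_h \sim p_3(\zeta_h \mid \hat{\zeta}_h; \hat{\alpha})$ |
| 2 | $\theta_{hi}$ | $p_2(\theta_{hi} \mid \hat{\theta}_{hi}; \zeta_h)$ | $\theta^*_{hi} \sim p_2(\theta_{hi} \mid \hat{\theta}_{hi}; \zeta^*_h)$ |
| 1 | $y_{hij}$ | $f_1(y_{hij} \mid \mathbf{x}_{hij}; \theta_{hi})$ | $y^*_{hij} \sim f_1(y_{hij} \mid \mathbf{x}_{hij}, \theta^*_{hi})$ |

Figure: Top-down approach to prediction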
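The top-down scheme can be sketched by Monte Carlo: draw $\zeta^*_h$, then $\theta^*_{hi}$, then $y^*_{hij}$, and average. The sketch assumes normal models at every level, takes $\zeta_h$ to be a mean vector with a known level 2 covariance `Sigma`, and treats the sampling covariances `V` and `W` as known. All function and argument names are hypothetical.

```python
import numpy as np

def normal_posterior(est, est_cov, prior_mean, prior_cov):
    """Posterior of a normal mean given a normal estimate of it."""
    gain = prior_cov @ np.linalg.inv(prior_cov + est_cov)
    return prior_mean + gain @ (est - prior_mean), prior_cov - gain @ prior_cov

def predict_top_down(x_new, theta_hat, V, zeta_hat, W, Sigma, mu, T, sigma2,
                     n_draws=2000, seed=0):
    """Monte Carlo prediction of y at covariate x_new for one unit.

    Level 3: zeta ~ N(mu, T),        zeta_hat | zeta ~ N(zeta, W)
    Level 2: theta ~ N(zeta, Sigma), theta_hat | theta ~ N(theta, V)
    Level 1: y ~ N(x_new . theta, sigma2)
    Returns the mean and standard deviation of the predictive draws.
    """
    rng = np.random.default_rng(seed)
    m3, C3 = normal_posterior(zeta_hat, W, mu, T)             # p3
    draws = np.empty(n_draws)
    for d in range(n_draws):
        zeta_star = rng.multivariate_normal(m3, C3)           # level 3 draw
        m2, C2 = normal_posterior(theta_hat, V, zeta_star, Sigma)  # p2
        theta_star = rng.multivariate_normal(m2, C2)          # level 2 draw
        draws[d] = x_new @ theta_star + np.sqrt(sigma2) * rng.standard_normal()
    return draws.mean(), draws.std()
```

The spread of the draws gives a rough uncertainty measure alongside the point prediction, since it carries the variability from all three levels.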

Solution 28/37


Case study: Application to Solar Energy Prediction

- We use 15 days of data (12/01/2014 – 12/15/2014) for analysis.

- The states are organized into 12 groups.
- The number of sites in each group, $m_h$, varies between 37 and 321.

Application 29/37


Grouping Scheme

- Pooling data from nearby sites.
- Can incorporate complex structure such as distribution zone.

Application 30/37


Application: Site Level

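- First assume that

$$y_{hij} = \mathbf{x}_{hij}\theta_{hi} + \varepsilon_{hij},$$

$$\varepsilon_{hij} \sim t(0, \sigma^2_{hi}, \nu_{hi}),$$

where $\sigma^2_{hi}$ is the scale parameter and $\nu_{hi}$ is the degrees of freedom, and

$$\hat{\theta}_{hi} \mid \theta_{hi} \sim N(\theta_{hi}, V_{hi}),$$

where $V_{hi} = V(\hat{\theta}_{hi})$.

- The degrees of freedom are assumed to be unknown and are estimated by the method of Lange et al. (1989).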
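For intuition, a site-level fit with $t$ errors can be sketched with the standard EM reweighting scheme for $t$ regression. This sketch holds the degrees of freedom fixed at an assumed value `nu`, whereas the talk estimates them; the function name `fit_t_regression` is hypothetical.

```python
import numpy as np

def fit_t_regression(x, y, nu=4.0, n_iter=100):
    """EM fit of y = x @ theta + eps with eps ~ t(0, sigma2, nu), nu fixed.

    Observations with large residuals get small weights, which makes the
    fit robust to outlying measurements.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    theta = np.linalg.lstsq(x, y, rcond=None)[0]   # OLS starting value
    sigma2 = np.mean((y - x @ theta) ** 2)
    for _ in range(n_iter):
        r2 = (y - x @ theta) ** 2
        w = (nu + 1.0) / (nu + r2 / sigma2)          # E-step: latent weights
        xw = x * w[:, None]
        theta = np.linalg.solve(x.T @ xw, xw.T @ y)  # M-step: weighted LS
        sigma2 = np.sum(w * (y - x @ theta) ** 2) / n
    return theta, sigma2
```

Heavy-tailed errors are a natural choice for sensor data, where occasional faulty readings would otherwise pull a least-squares fit away from the bulk of the measurements.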

Application 31/37


Three Level Model

- Assume the level 2 model

$$\theta_{hi} \sim N(\beta_h, \Sigma_h),$$

with $\zeta_h = (\beta_h, \Sigma_h)$.

- Similarly, the level 3 model is

$$\zeta_h \sim N(\mu, \Sigma),$$

with $\alpha = (\mu, \Sigma)$.

Application 32/37


Comparison

- We compared the performance of the multi-level approach with three other modeling methods:
  - Site-by-site model: fit a different model for each individual site
  - Group-by-group model: fit a different model for each group
  - One global model: fit a single common model for all sites using the aggregate data
- To evaluate the prediction accuracy, we randomly selected 70% of the data to fit the model and tested on the remaining 30%.

Application 33/37


MSPE Comparison

- We compare accuracy by the Mean Squared Prediction Error (MSPE), $N_T^{-1} \sum (y_{hij} - \hat{y}_{hij})^2$, where the $\hat{y}_{hij}$ are obtained from the four different methods and $N_T$ is the size of the test data set.

|      | Multi level | Site model | Group model | Global model |
|------|-------------|------------|-------------|--------------|
| MSPE | 0.297 | 0.298 | 0.406 | 0.383 |
| SD   | 0.601 | 0.609 | 0.803 | 0.791 |

Table: Accuracy comparison of the different modeling methods
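The evaluation protocol (a random 70/30 split followed by MSPE on the held-out part) can be sketched as below. The helper names `split_indices` and `mspe` are hypothetical, and the sketch does not reproduce the numbers in the table.

```python
import numpy as np

def split_indices(n, train_frac=0.7, seed=0):
    """Randomly split range(n) into training and test index arrays."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(round(train_frac * n))
    return perm[:n_train], perm[n_train:]

def mspe(y_true, y_pred):
    """Mean squared prediction error over the test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))
```

Each of the four methods would be fitted on the training indices and scored with `mspe` on the test indices, so that all methods are compared on the same held-out observations.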

Application 34/37


Comparison in Detail (nhi ≤ 100 vs > 100)

[Figure: mean squared error (axis from 0.0 to 1.5) by sample size (<100 vs. >100) for four methods: Multilevel, Site Model, Group Model, Global Model.]

Application 35/37


Discussion

- Motivated by a real problem: a solar energy forecasting system has been developed.

- We used a multi-level model approach to address the practical issues.

- There are more issues to be investigated:
  - Spatial modeling
  - Estimation of group structure
  - Preferential sampling of sites
  - ...

- The proposed method is promising for handling big data.

Application 36/37
