Slide 1 Approximate Dynamic Programming for High-Dimensional Problems in Energy Modeling
Ohio State University, October 7, 2009
Warren Powell, CASTLE Laboratory, Princeton University
http://www.castlelab.princeton.edu
© 2009 Warren B. Powell, Princeton University
Slide 2
Slide 2 Goals for an energy policy model Potential questions
Policy questions How do we design policies to achieve energy goals
(e.g. 20% renewables by 2015) with a given probability? How does
the imposition of a carbon tax change the likelihood of meeting
this goal? What might happen if ethanol subsidies are reduced or
eliminated? What is the impact of a breakthrough in batteries?
Energy economics What is the best mix of energy generation
technologies? How is the economic value of wind affected by the
presence of storage? What is the best mix of storage technologies?
How would climate change impact our ability to use hydroelectric
reservoirs as a regulating source of power?
Slide 3
Slide 3 Goals for an energy policy model Designing energy
supply and storage portfolios to work with wind: The marginal value
of wind and solar farms depends on the ability to work with
intermittent supply. The impact of intermittent supply will be
mitigated by the use of storage. Different storage technologies
(batteries, flywheels, compressed air, pumped hydro) are each
designed to serve different types of variations in supply and
demand. The need for storage (and the value of wind and solar)
depends on the entire portfolio of energy producing
technologies.
Slide 4
Slide 4 Intermittent energy sources [Plots: wind speed; solar energy.]
Slide 7 Long-term uncertainties, 2010-2030: tax policy, batteries, solar panels, carbon capture and sequestration, price of oil, climate change.
Slide 8
Slide 8 Goals for an energy policy model Model capabilities we
are looking for: Multi-scale: multiple time scales (hourly, daily,
seasonal, annual, decade); multiple spatial scales; multiple
technologies (different coal-burning technologies, new wind
turbines, ...); multiple markets: transportation (commercial, commuter,
home activities) and electricity use (heavy industrial, light
industrial, business, residential). Stochastic (handles
uncertainty): hourly fluctuations in wind, solar and demands; daily
variations in prices and rainfall; seasonal changes in weather;
yearly changes in supplies, technologies and policies.
Slide 9
Slide 9 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 10
Slide 10 A resource allocation
model Attribute vectors:
Slide 11
Slide 11 A resource allocation
model Modeling resources: The attributes of a single resource: The
resource state vector: The information process:
Slide 12
Slide 12 A resource allocation
model Modeling demands: The attributes of a single demand: The
demand state vector: The information process:
Slide 13
Slide 13 Energy resource
modeling The system state:
Slide 14
Slide 14 Energy resource modeling The decision variables:
Slide 15
Slide 15 Energy resource modeling Exogenous information:
Slide 16
Slide 16 Energy resource
modeling The transition function Known as the: Transition function
Transfer function System model Plant model Model
Slide 17
Slide 17 Energy resource modeling Demands Resources
Slide 18
Slide 18 Energy resource modeling t t+1 t+2
Slide 19
Slide 19 Energy resource modeling t t+1 t+2 Optimizing at a
point in time Optimizing over time
Slide 20
Slide 20 Energy resource modeling The objective function. How do
we find the best policy? Myopic policies; rolling horizon policies;
simulation-optimization; dynamic programming. [Equation callouts:
decision function (policy), state variable, contribution function,
expectation over all random outcomes.]
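The objective-function equation on this slide did not survive extraction. A reconstruction in the standard notation of the ADP literature, matching the callouts above (decision function, state variable, contribution function, expectation):

```latex
\max_{\pi} \; \mathbb{E}\left\{ \sum_{t=0}^{T} \gamma^{t} \, C_t\bigl(S_t, X_t^{\pi}(S_t)\bigr) \right\}
```

Here $X_t^{\pi}$ is the decision function (policy), $S_t$ the state variable, $C_t$ the contribution function, and the expectation runs over all random outcomes.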
Slide 21
Slide 21 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 22
Slide 22 Introduction to dynamic programming Bellman's
optimality equation: Assume this is known; compute this for each
state S.
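The equation itself was an image on the slide; a standard reconstruction of Bellman's optimality equation is

```latex
V(S) = \max_{x \in \mathcal{X}} \Bigl( C(S, x) + \gamma \sum_{s'} \mathbb{P}(s' \mid S, x)\, V(s') \Bigr)
```

where "assume this is known" refers to $V(s')$ on the right-hand side, and "compute this for each state $S$" refers to $V(S)$ on the left.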
Slide 23
Slide 23 Introduction to dynamic programming Bellman's
optimality equation: Problem: the curse of dimensionality. Three curses:
state space, outcome space, action space (feasible region).
Slide 24
Slide 24 Introduction to dynamic programming The computational
challenges: How do we find ? How do we compute the expectation? How
do we find the optimal solution?
Slide 25
Slide 25 Introduction to ADP Classical ADP Most applications of
ADP focus on the challenge of handling multidimensional state
variables. Start with Bellman's equation, then replace the value
function with some sort of approximation, which may draw from the
entire field of statistics/machine learning.
Slide 26
Slide 26 Introduction to ADP Other statistical methods:
Regression trees combine regression with techniques for discrete
variables. Data mining is good for categorical data. Neural networks
are popular among engineers for low-dimensional continuous problems.
Kernel/locally polynomial regression approximates portions of the
value function locally using simple functions. Dirichlet mixture
models aggregate portions of the function and fit approximations
around these aggregations.
Slide 27
Slide 27 Introduction to ADP But this does not solve our
problem. Assume we have an approximate value function. We still have
to solve a problem that looks like Bellman's equation, which means
we still have to deal with a maximization problem (it might be a
linear, nonlinear or integer program) with an embedded expectation.
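The next sections resolve this with the post-decision state; as a preview, here is a minimal sketch of what "making a decision with a value function approximation" looks like for a toy inventory problem. The quadratic VFA and all numbers are illustrative, not from the slides. Because the approximation sits on the post-decision state, no expectation appears in the optimization:

```python
# Sketch: choosing a decision with a value function approximation (VFA).
# Toy single-product inventory problem; all numbers are illustrative.

def vfa(r):
    """Hypothetical concave approximation of the value of holding r units."""
    return 8.0 * r - 0.5 * r * r

def decide(resource, cost=3.0, max_order=10):
    """Solve max_x [ immediate contribution(x) + vfa(post-decision resource) ]."""
    best_x, best_val = 0, float("-inf")
    for x in range(max_order + 1):      # enumerate the feasible decisions
        contribution = -cost * x        # immediate cost of ordering x units
        post_state = resource + x       # post-decision resource level
        val = contribution + vfa(post_state)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

With two units on hand, `decide(2)` orders three more units, balancing the ordering cost against the approximate value of holding five units.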
Slide 28
Slide 28 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 29
Decision tree for the game-scheduling example ([Decision nodes]
choose Schedule game or Cancel game; [Outcome nodes] reveal the
weather). Canceling always pays -$200; scheduling pays -$2000 under
rain, $1000 under clouds, $5000 under sun. Without the weather
report, the probabilities are Rain .2, Clouds .3, Sun .5. With the
report: forecast sunny (prob. .6) gives Rain .1, Clouds .2, Sun .7;
forecast cloudy (.3) gives Rain .1, Clouds .5, Sun .4; forecast rain
(.1) gives Rain .8, Clouds .2, Sun .0. Sequence: Information,
Action, Information, Action, State.
Slide 30
Slide 30 The post-decision state New concept: The pre-decision
state variable: Same as a decision node in a decision tree. The
post-decision state variable: Same as an outcome node in a decision
tree.
Slide 31
Slide 31 The post-decision state An inventory problem: Our
basic inventory equation: Using pre- and post-decision states:
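The inventory equations on this slide were images; a sketch of the pre-/post-decision split they describe, in our own notation (the post-decision state is a deterministic function of state and action; the next pre-decision state also folds in the new exogenous information):

```python
import random

# Sketch of the pre-/post-decision split for a basic inventory equation:
#   post-decision:  Rx_t   = R_t + x_t                (after we act, before new info)
#   pre-decision:   R_{t+1} = max(0, Rx_t - D_{t+1})  (after demand is revealed)
# Notation and numbers are ours, not the slide's.

def post_decision(R, x):
    """Post-decision resource level: deterministic in the state and action."""
    return R + x

def next_pre_decision(R_x, demand):
    """Next pre-decision state, once the random demand has been observed."""
    return max(0, R_x - demand)

random.seed(1)
R, x = 5, 3
R_x = post_decision(R, x)         # known the moment we decide
D = random.randint(0, 10)         # exogenous information arriving in (t, t+1]
R_next = next_pre_decision(R_x, D)
```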
Slide 32
Slide 32 The post-decision state Pre-decision, state-action,
and post-decision Pre-decision state State Action Post-decision
state
Slide 33
Slide 33 The post-decision state Pre-decision: resources and
demands
Slide 34
Slide 34 The post-decision state
Slide 35
Slide 35 The post-decision state
Slide 36
Slide 36 The post-decision state
Slide 37
Slide 37 The post-decision state Classical form of Bellman's
equation: Bellman's equations around pre- and post-decision states:
Optimization problem (making the decision): note that this problem is
deterministic! Expectation problem (incorporating
uncertainty):
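The three equations on this slide were images. Their standard forms in the ADP literature, reconstructed here, are:

```latex
% Classical form of Bellman's equation (pre-decision state):
V_t(S_t) = \max_{x_t} \Bigl( C_t(S_t, x_t) + \gamma \, \mathbb{E}\bigl[ V_{t+1}(S_{t+1}) \mid S_t, x_t \bigr] \Bigr)

% Split around the post-decision state S_t^x:
V_t(S_t)     = \max_{x_t} \bigl( C_t(S_t, x_t) + \gamma \, V_t^{x}(S_t^{x}) \bigr)   % deterministic optimization
V_t^{x}(S_t^{x}) = \mathbb{E}\bigl[ V_{t+1}(S_{t+1}) \mid S_t^{x} \bigr]             % expectation step
```

The maximization in the second equation contains no expectation, which is what makes it a deterministic optimization problem.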
Slide 38
Slide 38 Introduction to ADP We first use the value function
around the post-decision state variable, removing the expectation:
We then replace the value function with an approximation that we
estimate using machine learning techniques:
Slide 39
Slide 39 The post-decision state Value function approximations:
Linear (in the resource state): Piecewise linear, separable:
Indexed PWL separable:
Slide 40
Slide 40 The post-decision state Value function approximations:
Ridge regression (Klabjan and Adelman) Benders cuts
Slide 41
Slide 41 Making decisions Following an ADP policy
Slide 42
Slide 42 Making decisions Following an ADP policy
Slide 43
Slide 43 Making decisions Following an ADP policy
Slide 44
Slide 44 Making decisions Following an ADP policy
Slide 45
Slide 45 Approximate dynamic programming With luck, the
objective function will improve steadily
Slide 46
Slide 46 The post-decision state Comparison to other methods:
Classical MDP (value iteration) Classical ADP (pre-decision state):
Updating around post-decision state: Expectation No
expectation
Slide 47
Slide 47 Approximate dynamic programming Step 1: Start with a
pre-decision state Step 2: Solve the deterministic optimization
using an approximate value function: to obtain. Step 3: Update the
value function approximation Step 4: Obtain Monte Carlo sample of
and compute the next pre-decision state: Step 5: Return to step 1.
Simulation Deterministic optimization Recursive statistics
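The five steps above can be sketched for a toy scalar inventory problem. The contribution function, demand distribution and all constants are ours, for illustration only; the structure (simulate forward, solve a deterministic problem with the VFA, smooth the observation into the approximation at the previous post-decision state) follows the slide:

```python
import random

# Sketch of the five-step ADP loop on a toy inventory problem.
random.seed(0)
N_ITER, T, MAX_X, MAX_R = 50, 10, 5, 30
vbar = [[0.0] * (MAX_R + 1) for _ in range(T + 1)]  # vbar[t][Rx]: post-decision VFA

def contribution(R, x):
    # Illustrative: revenue for serving up to 3 units from stock, minus order cost.
    return 3.0 * min(R, 3) - 2.0 * x

for n in range(N_ITER):
    alpha = 1.0 / (n + 1)                  # declining stepsize (recursive statistics)
    R = 5                                  # Step 1: initial pre-decision state
    for t in range(T):
        # Step 2: deterministic optimization using the approximate value function.
        feasible = [x for x in range(MAX_X + 1) if R + x <= MAX_R]
        best = max(feasible, key=lambda x: contribution(R, x) + vbar[t][R + x])
        vhat = contribution(R, best) + vbar[t][R + best]
        # Step 3: smooth vhat into the VFA at the *previous* post-decision state.
        if t > 0:
            vbar[t - 1][Rx_prev] = (1 - alpha) * vbar[t - 1][Rx_prev] + alpha * vhat
        Rx_prev = R + best                 # post-decision state
        # Step 4: Monte Carlo sample of demand; compute next pre-decision state.
        D = random.randint(0, 6)
        R = max(0, Rx_prev - D)            # Step 5: return to Step 2
```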
Slide 49
Slide 49 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
AB+,0 AB+,1 AB+,2 O-,0 O-,1 O-,2 AB+,0 AB+,1 AB+,2 AB+,3 O-,0
O-,1 O-,2 O-,3 Solve this as a linear program.
Slide 56
AB+,0 AB+,1 AB+,2 O-,0 O-,1 O-,2 AB+,0 AB+,1 AB+,2 AB+,3 O-,0
O-,1 O-,2 O-,3 Dual variables give the value of an additional unit
of blood.
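The blood-allocation network above can be posed as a small transportation LP whose duals price an extra unit of each blood type. A sketch with scipy; the blood types, supplies, demands and match values below are invented for illustration:

```python
# Tiny blood-allocation LP: O- can serve anyone, AB+ only AB+ patients.
# Variables: x1 = O- -> O- patient, x2 = O- -> AB+ patient, x3 = AB+ -> AB+ patient.
import numpy as np
from scipy.optimize import linprog

values = np.array([10.0, 9.0, 10.0])   # reward per matched unit (illustrative)
A_ub = np.array([
    [1, 1, 0],   # O- supply:        x1 + x2 <= 10
    [0, 0, 1],   # AB+ supply:       x3      <= 5
    [1, 0, 0],   # O- patient need:  x1      <= 8
    [0, 1, 1],   # AB+ patient need: x2 + x3 <= 6
])
b_ub = np.array([10.0, 5.0, 8.0, 6.0])

res = linprog(-values, A_ub=A_ub, b_ub=b_ub, method="highs")  # linprog minimizes
total_value = -res.fun
# Dual variables on the supply rows estimate the value of one more unit of blood.
duals = -res.ineqlin.marginals
```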
Slide 57
Slide 57 Updating the value function approximation Estimate the
gradient at
Slide 58
Slide 58 Updating the value function approximation Update the
value function at
Slide 59
Slide 59 Updating the value function approximation Update the
value function at
Slide 60
Slide 60 Updating the value function approximation Update the
value function at
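Slides 57-60 estimate a gradient (for example, an LP dual) and fold it into the value function approximation. A sketch of one such update for a piecewise-linear concave VFA, using recursive smoothing plus a pool-adjacent-violators projection to restore concavity; the slides do not specify this particular projection, so treat it as one reasonable choice:

```python
# Sketch: update a piecewise-linear VFA from a sampled marginal value (dual),
# then project the slopes back onto the nonincreasing (concave) sequences.

def project_nonincreasing(slopes):
    """Pool-adjacent-violators projection onto nonincreasing sequences."""
    blocks = []  # each block: [sum_of_values, count]
    for s in slopes:
        blocks.append([s, 1])
        # Merge while a block's mean is below its successor's mean (violation).
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]:
            total = blocks[-2][0] + blocks[-1][0]
            count = blocks[-2][1] + blocks[-1][1]
            blocks[-2:] = [[total, count]]
    out = []
    for total, count in blocks:
        out.extend([total / count] * count)
    return out

def update_slopes(slopes, r, vhat, alpha):
    """Smooth observed marginal value vhat into the slope at level r, then
    re-impose concavity."""
    slopes = list(slopes)
    slopes[r] = (1 - alpha) * slopes[r] + alpha * vhat   # recursive smoothing
    return project_nonincreasing(slopes)
```

For example, `update_slopes([5.0, 4.0, 3.0, 2.0], 2, 10.0, 0.5)` smooths the third slope up to 6.5, which violates concavity, and the projection pools the first three slopes to their average.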
Slide 61
Slide 61 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 62
Slide 62 SMART-Stochastic, multiscale model SMART: A
Stochastic, Multiscale Allocation model for energy Resources,
Technology and policy. Stochastic: able to handle different types of
uncertainty, both fine-grained (daily fluctuations in wind, solar,
demand, prices, ...) and coarse-grained (major climate variations, new
government policies, technology breakthroughs). Multiscale: able to
handle different levels of detail: time scales (hourly to yearly),
spatial scales (aggregate to fine-grained disaggregate), activities
(different types of demand patterns), and decisions (hourly dispatch
decisions to yearly investment decisions). Takes as input parameters
characterizing government policies, performance of technologies, and
assumptions about climate.
Slide 63
Slide 63 The annual investment problem 2008 2009
Slide 64
Slide 64 The hourly dispatch problem Hourly electricity
dispatch problem
Slide 65
Slide 65 The hourly dispatch problem Hourly model Decisions at
time t impact t+1 through the amount of water held in the
reservoir. Hour t Hour t+1
Slide 66
Slide 66 The hourly dispatch problem Hourly model Decisions at
time t impact t+1 through the amount of water held in the
reservoir. Value of holding water in the reservoir for future time
periods. Hour t
Slide 67
Slide 67 The hourly dispatch problem
Slide 68
Slide 68 The hourly dispatch problem 2008 Hour 1 2 3 4 8760
2009 1 2
Slide 69
Slide 69 The hourly dispatch problem 2008 Hour 1 2 3 4 8760
2009 1 2
Slide 70
Slide 70 SMART-Stochastic, multiscale model 2008 2009
Slide 71
Slide 71 SMART-Stochastic, multiscale model 2008 2009
Slide 72
Slide 72 SMART-Stochastic, multiscale model 2008 2009 2010 2011
2038
Slide 73
Slide 73 SMART-Stochastic, multiscale model 2008 2009 2010 2011
2038
Slide 75 SMART-Stochastic, multiscale model Use statistical
methods to learn the value of resources in the future. Resources
may be: stored energy (hydro, flywheel energy); storage capacity
(batteries, flywheels, compressed air); energy transmission capacity
(transmission lines, gas lines, shipping capacity); energy production
sources (windmills, solar panels, nuclear power plants). [Plot:
value vs. amount of resource.]
Slide 76
Slide 76 SMART-Stochastic, multiscale model Approximating
continuous functions The algorithm performs a very fine
discretization over the small range of the function that is visited
most often.
Slide 77
Slide 77 SMART-Stochastic, multiscale model Benchmarking
Compare ADP to the optimal LP for a deterministic problem. Annual model:
8,760 hours over a single year; focus on the ability to match hydro
storage decisions. 20-year model: 24-hour time increments over 20
years; focus on investment decisions. Comparisons on the stochastic
model: stochastic rainfall analysis (how does the ADP solution compare
to the LP?) and carbon tax policy analysis (demonstrate
nonanticipativity).
Slide 78
Slide 78 Benchmarking on hourly dispatch: the ADP objective
function relative to the optimal LP, 0.06% over optimal. [Chart:
percentage error from optimal vs. iterations.]
Slide 79
Slide 79 Benchmarking on hourly dispatch [Plot: reservoir
level, demand and rainfall; optimal from linear program.]
Slide 83 Multidecade energy model: optimal vs. ADP daily model
over 20 years, 0.24% over optimal.
Slide 84
Slide 84 Energy policy modeling Traditional optimization models
tend to produce all-or-nothing solutions. [Chart: investment in
IGCC vs. cost differential (IGCC - pulverized coal), spanning the
regions where pulverized coal is cheaper and where IGCC is cheaper,
comparing traditional optimization with approximate dynamic
programming.]
Slide 85
Slide 85 Stochastic rainfall [Plot: precipitation sample paths
over time.]
Slide 86
Slide 86 Stochastic rainfall [Plot: reservoir level over time;
optimal for individual scenarios vs. ADP.]
Slide 87
Slide 87 Energy policy modeling Following sample paths: demands,
prices, weather, technology, policies, ... [Chart: a metric (e.g. %
renewable) through 2030; achieved goal with probability 0.70.] Need
to consider: fine-grained noise (wind, rain, demand, prices, ...) and
coarse-grained noise (technology, policy, climate, ...).
Slide 88
Slide 88 Energy policy modeling Policy study: What is the
effect of a potential (but uncertain) carbon tax in year 8? [Chart:
carbon tax level by year, years 1-9.]
Slide 89
Slide 89 Energy policy modeling Renewable technologies
Carbon-based technologies No carbon tax
Slide 90
Slide 90 Energy policy modeling With carbon tax Carbon-based
technologies Renewable technologies Carbon tax policy unknown
Carbon tax policy determined
Slide 91
Slide 91 Energy policy modeling With carbon tax Carbon-based
technologies Renewable technologies
Slide 92
Slide 92 Conclusions Capabilities SMART can handle problems
with over 300,000 time periods, so it can model hourly variations
in a long-term energy investment model. It can simulate virtually
any form of uncertainty, either provided through an exogenous
scenario file or sampled from a probability distribution. Accurate
modeling of climate, technology and markets requires access to
exogenously provided scenarios. It properly models storage processes
over time. Current tests are on an aggregate model, but the modeling
framework (and library) is set up for spatially disaggregate
problems.
Slide 93
Slide 93 Conclusions Limitations More research is needed to
test the ability of the model to use multiple storage technologies.
Extension to spatially disaggregate model will require significant
engineering and data. Run times will start to become an issue for a
spatially disaggregate model. Value function approximations capture
the resource state vector, but are limited to very simple exogenous
state variations.
Slide 94
Slide 94 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy model
Merging machine learning and optimization
Slide 95
Slide 95 Merging machine learning and optimization The
challenge of coarse-grained uncertainty. Fine-grained uncertainty
can generally be modeled as memoryless (even if it is not).
Coarse-grained uncertainty affects what might be called the state of
the world, and the value of a resource depends on the state of the
world. Is there a carbon tax? What is the state of battery
research? Have there been major new oil discoveries? What is the
price of oil? Did the international community adopt strict limits
on carbon emissions? Have there been advances in our understanding
of climate change?
Slide 96
Slide 96 Merging machine learning and optimization Modeling the
state of the world We can use powerful machine learning algorithms
to overcome these new curses of dimensionality. Instead of one
piecewise linear value function for each resource and time period,
we need one for each state of the world, and there can be thousands
of these.
Slide 97
Slide 97 Merging machine learning and optimization Strategy 1:
Locally polynomial regression. Widely used in statistics: approximate
complex functions locally using simple functions; the estimate of the
function is a weighted sum of these local approximations. But it
cannot handle categorical variables.
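A minimal sketch of the idea behind Strategy 1, assuming a Gaussian kernel and a local linear fit; the bandwidth and test data are illustrative:

```python
# Locally weighted (local linear) regression with a Gaussian kernel.
import numpy as np

def local_linear(x_query, X, y, bandwidth=0.3):
    """Fit a weighted linear model around x_query; return its prediction there."""
    w = np.exp(-0.5 * ((X - x_query) / bandwidth) ** 2)   # kernel weights
    A = np.column_stack([np.ones_like(X), X - x_query])   # local design matrix
    W = np.diag(w)
    # Weighted least squares: (A' W A) beta = A' W y; beta[0] is f(x_query).
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[0]

rng = np.random.default_rng(0)
X = np.linspace(0, 3, 60)
y = np.sin(X) + 0.05 * rng.standard_normal(60)   # noisy samples of a smooth curve
estimate = local_linear(1.5, X, y)
```

The global estimate of the function is then a weighted sum of such local fits, evaluated over a grid of query points.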
Slide 98
Slide 98 Merging machine learning and optimization Strategy 2:
Dirichlet process mixtures of generalized linear models
Slide 99
Slide 99 Merging machine learning and optimization Strategy 3:
Hierarchical learning models Estimate piecewise constant functions
at different levels of aggregation:
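A toy sketch of Strategy 3: estimate a quantity at a disaggregate and an aggregate level and blend the estimates, weighting the disaggregate one by how much data supports it. The data, the (technology, region) attribute structure and the specific weighting rule are our own illustration, not the model's:

```python
# Hierarchical aggregation sketch: blend fine and coarse estimates.
import statistics

# Observed values for (technology, region) pairs -- illustrative data.
obs = {
    ("wind", "west"): [4.0, 5.0, 6.0],
    ("wind", "east"): [7.0],
}

def estimate(tech, region):
    """Weighted blend of the disaggregate and aggregate sample means."""
    fine = obs.get((tech, region), [])
    coarse = [v for (t, _), vs in obs.items() if t == tech for v in vs]
    est_fine = statistics.mean(fine) if fine else 0.0
    est_coarse = statistics.mean(coarse)
    # Trust the disaggregate estimate more as its observation count grows.
    w = len(fine) / (len(fine) + 1.0)
    return w * est_fine + (1 - w) * est_coarse
```

With only one eastern observation, `estimate("wind", "east")` leans heavily on the technology-level mean; the well-observed western estimate stays close to its own data.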
Slide 100
Slide 100 Merging machine learning and optimization Next steps:
We need to transition these machine learning techniques into an ADP
setting. Can they be adapted to work within a linear or nonlinear
optimization algorithm? All three methods are asymptotically
unbiased, but this depends on unbiased observations; in an ADP
algorithm, observations are biased. We need to design an effective
exploration strategy so that the solution does not become stuck.
Other issues: will the methods provide fast, robust solutions for
effective policy analysis?
Slide 104
Slide 104 Demand modeling [Chart: commercial electric demand
over 7 days.]