Slide 1 Approximate Dynamic Programming for High-Dimensional Problems in Energy Modeling
Ohio State University, October 7, 2009
Warren Powell, CASTLE Laboratory, Princeton University
http://www.castlelab.princeton.edu
© 2009 Warren B. Powell, Princeton University
Slide 2
Slide 2 Goals for an energy policy model Potential questions
Policy questions How do we design policies to achieve energy goals
(e.g. 20% renewables by 2015) with a given probability? How does
the imposition of a carbon tax change the likelihood of meeting
this goal? What might happen if ethanol subsidies are reduced or
eliminated? What is the impact of a breakthrough in batteries?
Energy economics What is the best mix of energy generation
technologies? How is the economic value of wind affected by the
presence of storage? What is the best mix of storage technologies?
How would climate change impact our ability to use hydroelectric
reservoirs as a regulating source of power?
Slide 3
Slide 3 Goals for an energy policy model Designing energy
supply and storage portfolios to work with wind: The marginal value
of wind and solar farms depends on the ability to work with
intermittent supply. The impact of intermittent supply will be
mitigated by the use of storage. Different storage technologies
(batteries, flywheels, compressed air, pumped hydro) are each
designed to serve different types of variations in supply and
demand. The need for storage (and the value of wind and solar)
depends on the entire portfolio of energy producing
technologies.
Slide 4
Slide 4 Intermittent energy sources [Plots: wind speed; solar energy.]
Slide 7 Long-term uncertainties, 2010-2030: tax policy, batteries, solar panels, carbon capture and sequestration, price of oil, climate change.
Slide 8
Slide 8 Goals for an energy policy model Model capabilities we
are looking for: Multi-scale: multiple time scales (hourly, daily,
seasonal, annual, decade); multiple spatial scales; multiple
technologies (different coal-burning technologies, new wind
turbines, ...); multiple markets: transportation (commercial, commuter,
home activities) and electricity use (heavy industrial, light
industrial, business, residential). Stochastic (handles
uncertainty): hourly fluctuations in wind, solar and demands; daily
variations in prices and rainfall; seasonal changes in weather;
yearly changes in supplies, technologies and policies.
Slide 9
Slide 9 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 10
Slide 10 A resource allocation
model Attribute vectors:
Slide 11
Slide 11 A resource allocation
model Modeling resources: The attributes of a single resource: The
resource state vector: The information process:
Slide 12
Slide 12 A resource allocation
model Modeling demands: The attributes of a single demand: The
demand state vector: The information process:
Slide 13
Slide 13 Energy resource
modeling The system state:
Slide 14
Slide 14 Energy resource modeling The decision variables:
Slide 15
Slide 15 Energy resource modeling Exogenous information:
Slide 16
Slide 16 Energy resource
modeling The transition function Known as the: Transition function
Transfer function System model Plant model Model
Slide 17
Slide 17 Energy resource modeling Demands Resources
Slide 18
Slide 18 Energy resource modeling t t+1 t+2
Slide 19
Slide 19 Energy resource modeling t t+1 t+2 Optimizing at a
point in time Optimizing over time
Slide 20
Slide 20 Energy resource modeling The objective function. How do
we find the best policy? Myopic policies; rolling horizon policies;
simulation-optimization; dynamic programming. [Equation callouts:
decision function (policy), state variable, contribution function,
expectation over all random outcomes.]
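The objective-function equation on this slide did not survive extraction. A reconstruction in the standard notation of the ADP literature, matching the callouts above (decision function, state variable, contribution function, expectation):

```latex
\max_{\pi} \; \mathbb{E}\left\{ \sum_{t=0}^{T} \gamma^{t} \, C_t\bigl(S_t, X_t^{\pi}(S_t)\bigr) \right\}
```

Here $X_t^{\pi}$ is the decision function (policy), $S_t$ the state variable, $C_t$ the contribution function, and the expectation runs over all random outcomes.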
Slide 21
Slide 21 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 22
Slide 22 Introduction to dynamic programming Bellman's
optimality equation: Assume this is known; compute this for each
state S.
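The equation itself was an image on the slide; a standard reconstruction of Bellman's optimality equation is

```latex
V(S) = \max_{x \in \mathcal{X}} \Bigl( C(S, x) + \gamma \sum_{s'} \mathbb{P}(s' \mid S, x)\, V(s') \Bigr)
```

where "assume this is known" refers to $V(s')$ on the right-hand side, and "compute this for each state $S$" refers to $V(S)$ on the left.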
Slide 23
Slide 23 Introduction to dynamic programming Bellman's
optimality equation: Problem: the curse of dimensionality. Three curses:
state space, outcome space, action space (feasible region).
Slide 24
Slide 24 Introduction to dynamic programming The computational
challenges: How do we find ? How do we compute the expectation? How
do we find the optimal solution?
Slide 25
Slide 25 Introduction to ADP Classical ADP Most applications of
ADP focus on the challenge of handling multidimensional state
variables. Start with Bellman's equation, then replace the value
function with some sort of approximation, which may draw from the
entire field of statistics/machine learning.
Slide 26
Slide 26 Introduction to ADP Other statistical methods:
Regression trees combine regression with techniques for discrete
variables. Data mining is good for categorical data. Neural networks
are popular among engineers for low-dimensional continuous problems.
Kernel/locally polynomial regression approximates portions of the
value function locally using simple functions. Dirichlet mixture
models aggregate portions of the function and fit approximations
around these aggregations.
Slide 27
Slide 27 Introduction to ADP But this does not solve our
problem. Assume we have an approximate value function. We still have
to solve a problem that looks like Bellman's equation, which means
we still have to deal with a maximization problem (it might be a
linear, nonlinear or integer program) with an embedded expectation.
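The next sections resolve this with the post-decision state; as a preview, here is a minimal sketch of what "making a decision with a value function approximation" looks like for a toy inventory problem. The quadratic VFA and all numbers are illustrative, not from the slides. Because the approximation sits on the post-decision state, no expectation appears in the optimization:

```python
# Sketch: choosing a decision with a value function approximation (VFA).
# Toy single-product inventory problem; all numbers are illustrative.

def vfa(r):
    """Hypothetical concave approximation of the value of holding r units."""
    return 8.0 * r - 0.5 * r * r

def decide(resource, cost=3.0, max_order=10):
    """Solve max_x [ immediate contribution(x) + vfa(post-decision resource) ]."""
    best_x, best_val = 0, float("-inf")
    for x in range(max_order + 1):      # enumerate the feasible decisions
        contribution = -cost * x        # immediate cost of ordering x units
        post_state = resource + x       # post-decision resource level
        val = contribution + vfa(post_state)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val
```

With two units on hand, `decide(2)` orders three more units, balancing the ordering cost against the approximate value of holding five units.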
Slide 28
Slide 28 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 29
Decision tree for the game-scheduling example ([Decision nodes]
choose Schedule game or Cancel game; [Outcome nodes] reveal the
weather). Canceling always pays -$200; scheduling pays -$2000 under
rain, $1000 under clouds, $5000 under sun. Without the weather
report, the probabilities are Rain .2, Clouds .3, Sun .5. With the
report: forecast sunny (prob. .6) gives Rain .1, Clouds .2, Sun .7;
forecast cloudy (.3) gives Rain .1, Clouds .5, Sun .4; forecast rain
(.1) gives Rain .8, Clouds .2, Sun .0. Sequence: Information,
Action, Information, Action, State.
Slide 30
Slide 30 The post-decision state New concept: The pre-decision
state variable: Same as a decision node in a decision tree. The
post-decision state variable: Same as an outcome node in a decision
tree.
Slide 31
Slide 31 The post-decision state An inventory problem: Our
basic inventory equation: Using pre- and post-decision states:
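The inventory equations on this slide were images; a sketch of the pre-/post-decision split they describe, in our own notation (the post-decision state is a deterministic function of state and action; the next pre-decision state also folds in the new exogenous information):

```python
import random

# Sketch of the pre-/post-decision split for a basic inventory equation:
#   post-decision:  Rx_t   = R_t + x_t                (after we act, before new info)
#   pre-decision:   R_{t+1} = max(0, Rx_t - D_{t+1})  (after demand is revealed)
# Notation and numbers are ours, not the slide's.

def post_decision(R, x):
    """Post-decision resource level: deterministic in the state and action."""
    return R + x

def next_pre_decision(R_x, demand):
    """Next pre-decision state, once the random demand has been observed."""
    return max(0, R_x - demand)

random.seed(1)
R, x = 5, 3
R_x = post_decision(R, x)         # known the moment we decide
D = random.randint(0, 10)         # exogenous information arriving in (t, t+1]
R_next = next_pre_decision(R_x, D)
```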
Slide 32
Slide 32 The post-decision state Pre-decision, state-action,
and post-decision Pre-decision state State Action Post-decision
state
Slide 33
Slide 33 The post-decision state Pre-decision: resources and
demands
Slide 34
Slide 34 The post-decision state
Slide 35
Slide 35 The post-decision state
Slide 36
Slide 36 The post-decision state
Slide 37
Slide 37 The post-decision state Classical form of Bellman's
equation: Bellman's equations around pre- and post-decision states:
Optimization problem (making the decision): note that this problem is
deterministic! Expectation problem (incorporating
uncertainty):
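The three equations on this slide were images. Their standard forms in the ADP literature, reconstructed here, are:

```latex
% Classical form of Bellman's equation (pre-decision state):
V_t(S_t) = \max_{x_t} \Bigl( C_t(S_t, x_t) + \gamma \, \mathbb{E}\bigl[ V_{t+1}(S_{t+1}) \mid S_t, x_t \bigr] \Bigr)

% Split around the post-decision state S_t^x:
V_t(S_t)     = \max_{x_t} \bigl( C_t(S_t, x_t) + \gamma \, V_t^{x}(S_t^{x}) \bigr)   % deterministic optimization
V_t^{x}(S_t^{x}) = \mathbb{E}\bigl[ V_{t+1}(S_{t+1}) \mid S_t^{x} \bigr]             % expectation step
```

The maximization in the second equation contains no expectation, which is what makes it a deterministic optimization problem.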
Slide 38
Slide 38 Introduction to ADP We first use the value function
around the post-decision state variable, removing the expectation:
We then replace the value function with an approximation that we
estimate using machine learning techniques:
Slide 39
Slide 39 The post-decision state Value function approximations:
Linear (in the resource state): Piecewise linear, separable:
Indexed PWL separable:
Slide 40
Slide 40 The post-decision state Value function approximations:
Ridge regression (Klabjan and Adelman) Benders cuts
Slide 41
Slide 41 Making decisions Following an ADP policy
Slide 42
Slide 42 Making decisions Following an ADP policy
Slide 43
Slide 43 Making decisions Following an ADP policy
Slide 44
Slide 44 Making decisions Following an ADP policy
Slide 45
Slide 45 Approximate dynamic programming With luck, the
objective function will improve steadily
Slide 46
Slide 46 The post-decision state Comparison to other methods:
Classical MDP (value iteration) Classical ADP (pre-decision state):
Updating around post-decision state: Expectation No
expectation
Slide 47
Slide 47 Approximate dynamic programming Step 1: Start with a
pre-decision state Step 2: Solve the deterministic optimization
using an approximate value function: to obtain. Step 3: Update the
value function approximation Step 4: Obtain Monte Carlo sample of
and compute the next pre-decision state: Step 5: Return to step 1.
Simulation Deterministic optimization Recursive statistics
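The five steps above can be sketched for a toy scalar inventory problem. The contribution function, demand distribution and all constants are ours, for illustration only; the structure (simulate forward, solve a deterministic problem with the VFA, smooth the observation into the approximation at the previous post-decision state) follows the slide:

```python
import random

# Sketch of the five-step ADP loop on a toy inventory problem.
random.seed(0)
N_ITER, T, MAX_X, MAX_R = 50, 10, 5, 30
vbar = [[0.0] * (MAX_R + 1) for _ in range(T + 1)]  # vbar[t][Rx]: post-decision VFA

def contribution(R, x):
    # Illustrative: revenue for serving up to 3 units from stock, minus order cost.
    return 3.0 * min(R, 3) - 2.0 * x

for n in range(N_ITER):
    alpha = 1.0 / (n + 1)                  # declining stepsize (recursive statistics)
    R = 5                                  # Step 1: initial pre-decision state
    for t in range(T):
        # Step 2: deterministic optimization using the approximate value function.
        feasible = [x for x in range(MAX_X + 1) if R + x <= MAX_R]
        best = max(feasible, key=lambda x: contribution(R, x) + vbar[t][R + x])
        vhat = contribution(R, best) + vbar[t][R + best]
        # Step 3: smooth vhat into the VFA at the *previous* post-decision state.
        if t > 0:
            vbar[t - 1][Rx_prev] = (1 - alpha) * vbar[t - 1][Rx_prev] + alpha * vhat
        Rx_prev = R + best                 # post-decision state
        # Step 4: Monte Carlo sample of demand; compute next pre-decision state.
        D = random.randint(0, 6)
        R = max(0, Rx_prev - D)            # Step 5: return to Step 2
```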
Slide 49
Slide 49 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
AB+,0 AB+,1 AB+,2 O-,0 O-,1 O-,2 AB+,0 AB+,1 AB+,2 AB+,3 O-,0
O-,1 O-,2 O-,3 Solve this as a linear program.
Slide 56
AB+,0 AB+,1 AB+,2 O-,0 O-,1 O-,2 AB+,0 AB+,1 AB+,2 AB+,3 O-,0
O-,1 O-,2 O-,3 Dual variables give the value of an additional unit
of blood.
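The blood-allocation network above can be posed as a small transportation LP whose duals price an extra unit of each blood type. A sketch with scipy; the blood types, supplies, demands and match values below are invented for illustration:

```python
# Tiny blood-allocation LP: O- can serve anyone, AB+ only AB+ patients.
# Variables: x1 = O- -> O- patient, x2 = O- -> AB+ patient, x3 = AB+ -> AB+ patient.
import numpy as np
from scipy.optimize import linprog

values = np.array([10.0, 9.0, 10.0])   # reward per matched unit (illustrative)
A_ub = np.array([
    [1, 1, 0],   # O- supply:        x1 + x2 <= 10
    [0, 0, 1],   # AB+ supply:       x3      <= 5
    [1, 0, 0],   # O- patient need:  x1      <= 8
    [0, 1, 1],   # AB+ patient need: x2 + x3 <= 6
])
b_ub = np.array([10.0, 5.0, 8.0, 6.0])

res = linprog(-values, A_ub=A_ub, b_ub=b_ub, method="highs")  # linprog minimizes
total_value = -res.fun
# Dual variables on the supply rows estimate the value of one more unit of blood.
duals = -res.ineqlin.marginals
```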
Slide 57
Slide 57 Updating the value function approximation Estimate the
gradient at
Slide 58
Slide 58 Updating the value function approximation Update the
value function at
Slide 59
Slide 59 Updating the value function approximation Update the
value function at
Slide 60
Slide 60 Updating the value function approximation Update the
value function at
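Slides 57-60 estimate a gradient (for example, an LP dual) and fold it into the value function approximation. A sketch of one such update for a piecewise-linear concave VFA, using recursive smoothing plus a pool-adjacent-violators projection to restore concavity; the slides do not specify this particular projection, so treat it as one reasonable choice:

```python
# Sketch: update a piecewise-linear VFA from a sampled marginal value (dual),
# then project the slopes back onto the nonincreasing (concave) sequences.

def project_nonincreasing(slopes):
    """Pool-adjacent-violators projection onto nonincreasing sequences."""
    blocks = []  # each block: [sum_of_values, count]
    for s in slopes:
        blocks.append([s, 1])
        # Merge while a block's mean is below its successor's mean (violation).
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]:
            total = blocks[-2][0] + blocks[-1][0]
            count = blocks[-2][1] + blocks[-1][1]
            blocks[-2:] = [[total, count]]
    out = []
    for total, count in blocks:
        out.extend([total / count] * count)
    return out

def update_slopes(slopes, r, vhat, alpha):
    """Smooth observed marginal value vhat into the slope at level r, then
    re-impose concavity."""
    slopes = list(slopes)
    slopes[r] = (1 - alpha) * slopes[r] + alpha * vhat   # recursive smoothing
    return project_nonincreasing(slopes)
```

For example, `update_slopes([5.0, 4.0, 3.0, 2.0], 2, 10.0, 0.5)` smooths the third slope up to 6.5, which violates concavity, and the projection pools the first three slopes to their average.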
Slide 61
Slide 61 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy
model
Slide 62
Slide 62 SMART-Stochastic, multiscale model SMART: A
Stochastic, Multiscale Allocation model for energy Resources,
Technology and policy. Stochastic: able to handle different types of
uncertainty, both fine-grained (daily fluctuations in wind, solar,
demand, prices, ...) and coarse-grained (major climate variations, new
government policies, technology breakthroughs). Multiscale: able to
handle different levels of detail: time scales (hourly to yearly),
spatial scales (aggregate to fine-grained disaggregate), activities
(different types of demand patterns), and decisions (hourly dispatch
decisions to yearly investment decisions). Takes as input parameters
characterizing government policies, performance of technologies, and
assumptions about climate.
Slide 63
Slide 63 The annual investment problem 2008 2009
Slide 64
Slide 64 The hourly dispatch problem Hourly electricity
dispatch problem
Slide 65
Slide 65 The hourly dispatch problem Hourly model Decisions at
time t impact t+1 through the amount of water held in the
reservoir. Hour t Hour t+1
Slide 66
Slide 66 The hourly dispatch problem Hourly model Decisions at
time t impact t+1 through the amount of water held in the
reservoir. Value of holding water in the reservoir for future time
periods. Hour t
Slide 67
Slide 67 The hourly dispatch problem
Slide 68
Slide 68 The hourly dispatch problem 2008 Hour 1 2 3 4 8760
2009 1 2
Slide 69
Slide 69 The hourly dispatch problem 2008 Hour 1 2 3 4 8760
2009 1 2
Slide 70
Slide 70 SMART-Stochastic, multiscale model 2008 2009
Slide 71
Slide 71 SMART-Stochastic, multiscale model 2008 2009
Slide 72
Slide 72 SMART-Stochastic, multiscale model 2008 2009 2010 2011
2038
Slide 73
Slide 73 SMART-Stochastic, multiscale model 2008 2009 2010 2011
2038
Slide 75 SMART-Stochastic, multiscale model Use statistical
methods to learn the value of resources in the future. Resources
may be: stored energy (hydro, flywheel energy); storage capacity
(batteries, flywheels, compressed air); energy transmission capacity
(transmission lines, gas lines, shipping capacity); energy production
sources (windmills, solar panels, nuclear power plants). [Plot:
value vs. amount of resource.]
Slide 76
Slide 76 SMART-Stochastic, multiscale model Approximating
continuous functions The algorithm performs a very fine
discretization over the small range of the function that is visited
most often.
Slide 77
Slide 77 SMART-Stochastic, multiscale model Benchmarking
Compare ADP to the optimal LP for a deterministic problem. Annual model:
8,760 hours over a single year; focus on the ability to match hydro
storage decisions. 20-year model: 24-hour time increments over 20
years; focus on investment decisions. Comparisons on the stochastic
model: stochastic rainfall analysis (how does the ADP solution compare
to the LP?) and carbon tax policy analysis (demonstrate
nonanticipativity).
Slide 78
Slide 78 Benchmarking on hourly dispatch: the ADP objective
function relative to the optimal LP, 0.06% over optimal. [Chart:
percentage error from optimal vs. iterations.]
Slide 79
Slide 79 Benchmarking on hourly dispatch [Plot: reservoir
level, demand and rainfall; optimal from linear program.]
Slide 83 Multidecade energy model: optimal vs. ADP daily model
over 20 years, 0.24% over optimal.
Slide 84
Slide 84 Energy policy modeling Traditional optimization models
tend to produce all-or-nothing solutions. [Chart: investment in
IGCC vs. cost differential (IGCC - pulverized coal), spanning the
regions where pulverized coal is cheaper and where IGCC is cheaper,
comparing traditional optimization with approximate dynamic
programming.]
Slide 85
Slide 85 Stochastic rainfall [Plot: precipitation sample paths
over time.]
Slide 86
Slide 86 Stochastic rainfall [Plot: reservoir level over time;
optimal for individual scenarios vs. ADP.]
Slide 87
Slide 87 Energy policy modeling Following sample paths: demands,
prices, weather, technology, policies, ... [Chart: a metric (e.g. %
renewable) through 2030; achieved goal with probability 0.70.] Need
to consider: fine-grained noise (wind, rain, demand, prices, ...) and
coarse-grained noise (technology, policy, climate, ...).
Slide 88
Slide 88 Energy policy modeling Policy study: What is the
effect of a potential (but uncertain) carbon tax in year 8? [Chart:
carbon tax level by year, years 1-9.]
Slide 89
Slide 89 Energy policy modeling Renewable technologies
Carbon-based technologies No carbon tax
Slide 90
Slide 90 Energy policy modeling With carbon tax Carbon-based
technologies Renewable technologies Carbon tax policy unknown
Carbon tax policy determined
Slide 91
Slide 91 Energy policy modeling With carbon tax Carbon-based
technologies Renewable technologies
Slide 92
Slide 92 Conclusions Capabilities SMART can handle problems
with over 300,000 time periods, so it can model hourly variations
in a long-term energy investment model. It can simulate virtually
any form of uncertainty, either provided through an exogenous
scenario file or sampled from a probability distribution. Accurate
modeling of climate, technology and markets requires access to
exogenously provided scenarios. It properly models storage processes
over time. Current tests are on an aggregate model, but the modeling
framework (and library) is set up for spatially disaggregate
problems.
Slide 93
Slide 93 Conclusions Limitations More research is needed to
test the ability of the model to use multiple storage technologies.
Extension to spatially disaggregate model will require significant
engineering and data. Run times will start to become an issue for a
spatially disaggregate model. Value function approximations capture
the resource state vector, but are limited to very simple exogenous
state variations.
Slide 94
Slide 94 Outline Modeling stochastic resource allocation
problems An introduction to ADP ADP and the post-decision state
variable A blood management example The SMART energy policy model
Merging machine learning and optimization
Slide 95
Slide 95 Merging machine learning and optimization The
challenge of coarse-grained uncertainty. Fine-grained uncertainty
can generally be modeled as memoryless (even if it is not).
Coarse-grained uncertainty affects what might be called the state of
the world, and the value of a resource depends on the state of the
world. Is there a carbon tax? What is the state of battery
research? Have there been major new oil discoveries? What is the
price of oil? Did the international community adopt strict limits
on carbon emissions? Have there been advances in our understanding
of climate change?
Slide 96
Slide 96 Merging machine learning and optimization Modeling the
state of the world We can use powerful machine learning algorithms
to overcome these new curses of dimensionality. Instead of one
piecewise linear value function for each resource and time period,
we need one for each state of the world, and there can be thousands
of these.
Slide 97
Slide 97 Merging machine learning and optimization Strategy 1:
Locally polynomial regression. Widely used in statistics: approximate
complex functions locally using simple functions; the estimate of the
function is a weighted sum of these local approximations. But it
cannot handle categorical variables.
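A minimal sketch of the idea behind Strategy 1, assuming a Gaussian kernel and a local linear fit; the bandwidth and test data are illustrative:

```python
# Locally weighted (local linear) regression with a Gaussian kernel.
import numpy as np

def local_linear(x_query, X, y, bandwidth=0.3):
    """Fit a weighted linear model around x_query; return its prediction there."""
    w = np.exp(-0.5 * ((X - x_query) / bandwidth) ** 2)   # kernel weights
    A = np.column_stack([np.ones_like(X), X - x_query])   # local design matrix
    W = np.diag(w)
    # Weighted least squares: (A' W A) beta = A' W y; beta[0] is f(x_query).
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    return beta[0]

rng = np.random.default_rng(0)
X = np.linspace(0, 3, 60)
y = np.sin(X) + 0.05 * rng.standard_normal(60)   # noisy samples of a smooth curve
estimate = local_linear(1.5, X, y)
```

The global estimate of the function is then a weighted sum of such local fits, evaluated over a grid of query points.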
Slide 98
Slide 98 Merging machine learning and optimization Strategy 2:
Dirichlet process mixtures of generalized linear models
Slide 99
Slide 99 Merging machine learning and optimization Strategy 3:
Hierarchical learning models Estimate piecewise constant functions
at different levels of aggregation:
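A toy sketch of Strategy 3: estimate a quantity at a disaggregate and an aggregate level and blend the estimates, weighting the disaggregate one by how much data supports it. The data, the (technology, region) attribute structure and the specific weighting rule are our own illustration, not the model's:

```python
# Hierarchical aggregation sketch: blend fine and coarse estimates.
import statistics

# Observed values for (technology, region) pairs -- illustrative data.
obs = {
    ("wind", "west"): [4.0, 5.0, 6.0],
    ("wind", "east"): [7.0],
}

def estimate(tech, region):
    """Weighted blend of the disaggregate and aggregate sample means."""
    fine = obs.get((tech, region), [])
    coarse = [v for (t, _), vs in obs.items() if t == tech for v in vs]
    est_fine = statistics.mean(fine) if fine else 0.0
    est_coarse = statistics.mean(coarse)
    # Trust the disaggregate estimate more as its observation count grows.
    w = len(fine) / (len(fine) + 1.0)
    return w * est_fine + (1 - w) * est_coarse
```

With only one eastern observation, `estimate("wind", "east")` leans heavily on the technology-level mean; the well-observed western estimate stays close to its own data.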
Slide 100
Slide 100 Merging machine learning and optimization Next steps:
We need to transition these machine learning techniques into an ADP
setting. Can they be adapted to work within a linear or nonlinear
optimization algorithm? All three methods are asymptotically
unbiased, but this depends on unbiased observations; in an ADP
algorithm, observations are biased. We need to design an effective
exploration strategy so that the solution does not become stuck.
Other issues: will the methods provide fast, robust solutions for
effective policy analysis?
Slide 104
Slide 104 Demand modeling [Chart: commercial electric demand
over 7 days.]