Solving High-dimensional PDEs Using Deep Learning

Jiequn Han

The Program in Applied & Computational Mathematics, Princeton University

Joint work with Weinan E and Arnulf Jentzen

Inverse Problems and Machine Learning, Caltech, February 9, 2018

Outline

1. Introduction

2. Mathematical Formulation

3. Neural Network Approximation

4. Numerical Examples

5. Summary

1. Introduction

Well-known Examples of PDEs

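• The Schrödinger equation in the quantum many-body problem,

$$ i\hbar \frac{\partial}{\partial t} \Psi(t, x) = \left(-\tfrac{1}{2}\Delta + V\right) \Psi(t, x). $$

• The Black-Scholes equation for pricing financial derivatives,

$$ v_t + \tfrac{1}{2} \mathrm{Tr}\left(\sigma\sigma^{\mathrm T} (\mathrm{Hess}_x v)\right) + r \nabla v \cdot x - r v = 0. $$

• The Hamilton-Jacobi-Bellman equation in stochastic control (dynamic programming),

$$ v_t + \max_u \left\{ \tfrac{1}{2} \mathrm{Tr}\left(\sigma\sigma^{\mathrm T} (\mathrm{Hess}_x v)\right) + \nabla v \cdot b + f \right\} = 0. $$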

Curse of Dimensionality

• The dimension of a PDE can easily be large in practice.

Equation                  Dimension (roughly)
Schrödinger equation      # of electrons × 3
Black-Scholes equation    # of underlying financial assets
HJB equation              the same as the state space

• A key computational challenge is the curse of dimensionality: the complexity is exponential in the dimension d for finite difference/element methods, which are usually unavailable for d ≥ 4.

• There is a huge gap between PDE modeling and computational algorithms.
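
To make the exponential growth concrete, the following sketch counts the nodes of a tensor-product grid. The resolution of 100 nodes per axis and the name `grid_points` are illustrative assumptions, not taken from the talk.

```python
def grid_points(points_per_axis: int, dim: int) -> int:
    """Number of nodes in a tensor-product grid with equal resolution on every axis."""
    return points_per_axis ** dim


n = 100  # hypothetical resolution: 100 nodes per coordinate direction
for d in (1, 2, 3, 4, 10, 100):
    nodes = grid_points(n, d)
    # Storing one float64 value per node needs 8 bytes.
    print(f"d = {d:3d}: {float(nodes):.1e} nodes, {8.0 * nodes:.1e} bytes")
```

Already at d = 4 this grid has 10^8 nodes, and at d = 100 storing the grid is out of the question, which is why grid-based methods are not an option for the examples later in the talk.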

Remarkable Success of Deep Learning

• Machine learning and data analysis also face the same curse of dimensionality.

• In recent years, deep learning has achieved remarkable success.

• An old but essential idea: represent functions in a compositional form rather than an additive one.

Related Work in High-dimensional Case

• Linear parabolic PDEs: Monte Carlo methods based on the Feynman-Kac formula

• Semilinear parabolic PDEs:
  1. branching diffusion approach (Henry-Labordère 2012, Henry-Labordère et al. 2014)
  2. multilevel Picard approximation (E et al. 2016)

• Hamilton-Jacobi PDEs: using Hopf formula and fast convex/nonconvex optimization methods (Darbon & Osher 2016, Chow et al. 2017)
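
As a small illustration of the first item, here is a minimal Monte Carlo sketch based on the Feynman-Kac formula for the linear equation u_t + Δu = 0 with terminal condition u(T, x) = g(x), for which u(0, x) = E[g(x + √2 W_T)]. The terminal condition g(x) = ‖x‖², the sample size, and all function names are assumptions chosen for this example so that the exact value ‖x‖² + 2dT is available for comparison.

```python
import numpy as np


def feynman_kac_heat(g, x, T, num_samples=20_000, seed=0):
    """Monte Carlo estimate of u(0, x) for u_t + Laplace(u) = 0, u(T, .) = g.

    Uses u(0, x) = E[g(x + sqrt(2) * W_T)], where W_T ~ N(0, T * I).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    w_T = np.sqrt(T) * rng.standard_normal((num_samples, d))
    return float(np.mean(g(x + np.sqrt(2.0) * w_T)))


def g_quadratic(x):
    # Hypothetical terminal condition g(x) = ||x||^2; the exact solution is
    # u(t, x) = ||x||^2 + 2 d (T - t).
    return np.sum(x**2, axis=-1)


d, T = 100, 1.0
estimate = feynman_kac_heat(g_quadratic, np.zeros(d), T)
exact = 2.0 * d * T
print(f"Monte Carlo: {estimate:.3f}, exact: {exact:.3f}")
```

The cost of such an estimator grows only linearly with d, but the representation is limited to linear equations; the nonlinear term f is what the remaining methods on this slide have to deal with.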

2. Mathematical Formulation

Semilinear Parabolic PDE

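We consider a general semilinear parabolic PDE in $[0, T] \times \mathbb{R}^d$:

$$ \frac{\partial u}{\partial t}(t, x) + \frac{1}{2} \mathrm{Tr}\left(\sigma\sigma^{\mathrm T}(t, x)\,(\mathrm{Hess}_x u)(t, x)\right) + \nabla u(t, x) \cdot \mu(t, x) + f\left(t, x, u(t, x), \sigma^{\mathrm T}(t, x)\nabla u(t, x)\right) = 0. $$

• Terminal condition is given: $u(T, x) = g(x)$.

• To fix ideas, we are interested in the solution at $t = 0$, $x = \xi$ for some vector $\xi \in \mathbb{R}^d$.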

Connection between PDE and BSDE

• The link between parabolic PDEs and backward stochastic differential equations (BSDEs) has been extensively investigated (Pardoux & Peng 1992, El Karoui et al. 1997, etc.).

• In particular, Markovian BSDEs give a nonlinear Feynman-Kac representation of some nonlinear parabolic PDEs.

• Consider the following BSDE

$$ X_t = \xi + \int_0^t \mu(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s, $$

$$ Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T (Z_s)^{\mathrm T}\, dW_s. $$

The solution is an adapted process $\{(X_t, Y_t, Z_t)\}_{t \in [0, T]}$ with values in $\mathbb{R}^d \times \mathbb{R} \times \mathbb{R}^d$.

Connection between PDE and BSDE

• Under suitable regularity assumptions, the BSDE is well-posed and related to the PDE in the sense that for all $t \in [0, T]$ it holds a.s. that

$$ Y_t = u(t, X_t) \quad \text{and} \quad Z_t = \sigma^{\mathrm T}(t, X_t)\nabla u(t, X_t). $$

• In other words, given the stochastic process satisfying

$$ X_t = \xi + \int_0^t \mu(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s, $$

the solution of the PDE satisfies the following SDE

$$ u(t, X_t) - u(0, X_0) = -\int_0^t f\left(s, X_s, u(s, X_s), \sigma^{\mathrm T}(s, X_s)\nabla u(s, X_s)\right) ds + \int_0^t [\nabla u(s, X_s)]^{\mathrm T} \sigma(s, X_s)\, dW_s. $$
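
The identity can be checked with Itô's formula. The following is a sketch of that step, under the added assumption that u is smooth enough (once continuously differentiable in t, twice in x) for the formula to apply:

```latex
\begin{aligned}
d\,u(t, X_t)
&= \Big[\frac{\partial u}{\partial t} + \nabla u \cdot \mu
   + \tfrac{1}{2}\,\mathrm{Tr}\big(\sigma\sigma^{\mathrm T}\,\mathrm{Hess}_x u\big)\Big](t, X_t)\,dt
   + [\nabla u(t, X_t)]^{\mathrm T}\,\sigma(t, X_t)\,dW_t \\
&= -f\big(t, X_t, u(t, X_t), \sigma^{\mathrm T}(t, X_t)\nabla u(t, X_t)\big)\,dt
   + [\nabla u(t, X_t)]^{\mathrm T}\,\sigma(t, X_t)\,dW_t .
\end{aligned}
```

The second line uses the PDE to replace the bracket by $-f$. Integrating from $0$ to $t$ gives the SDE for $u(t, X_t)$; integrating from $t$ to $T$ and using $u(T, x) = g(x)$ gives the backward equation with $Y_t = u(t, X_t)$ and $Z_t = \sigma^{\mathrm T}(t, X_t)\nabla u(t, X_t)$.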

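BSDE and Control – An LQG Example

Consider a classical linear-quadratic-Gaussian (LQG) control problem in $\mathbb{R}^d$:

$$ dX_t = 2\sqrt{\lambda}\, m_t\, dt + \sqrt{2}\, dW_t, $$

with cost functional $J(\{m_t\}_{0 \le t \le T}) = \mathbb{E}\left[\int_0^T \|m_t\|_2^2\, dt + g(X_T)\right]$.

The HJB equation for this problem is

$$ \frac{\partial u}{\partial t}(t, x) + \Delta u(t, x) - \lambda \|\nabla u(t, x)\|_2^2 = 0. $$

The optimal control, obtained by minimizing $2\sqrt{\lambda}\, m \cdot \nabla u + \|m\|_2^2$ over $m$, is given by

$$ m^*_t = -\sqrt{\lambda}\, \nabla u(t, X_t) \quad (\text{recall } Z_t = \sigma^{\mathrm T}(t, X_t)\nabla u(t, X_t)). $$

In the context of BSDE for control, $Y_t$ denotes the optimal value and $Z_t$ denotes the optimal control (up to a constant scaling).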
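
For reference, the LQG problem fits the general semilinear form with the following coefficients. This is a worked identification derived from the two equations, not a formula stated on the slide:

```latex
\begin{aligned}
&\mu(t, x) = 0, \qquad \sigma(t, x) = \sqrt{2}\, I_d, \qquad
  f(t, x, y, z) = -\frac{\lambda}{2}\, \|z\|_2^2, \\
&\tfrac{1}{2}\,\mathrm{Tr}\big(\sigma\sigma^{\mathrm T}\,\mathrm{Hess}_x u\big) = \Delta u, \qquad
  z = \sigma^{\mathrm T}\nabla u = \sqrt{2}\,\nabla u
  \;\Longrightarrow\; -\frac{\lambda}{2}\,\|z\|_2^2 = -\lambda\,\|\nabla u\|_2^2 .
\end{aligned}
```

With this choice the forward process is simply $X_t = \xi + \sqrt{2}\, W_t$, and all of the nonlinearity sits in $f$.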

3. Neural Network Approximation

• Key step: approximate the function $x \mapsto \sigma^{\mathrm T}(t, x)\nabla u(t, x)$ at each discretized time step $t = t_n$ by a feedforward neural network

$$ \sigma^{\mathrm T}(t_n, X_{t_n})\nabla u(t_n, X_{t_n}) = (\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n}) \approx (\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n} | \theta_n), $$

where $\theta_n$ denotes neural network parameters.

• Observation: we can stack all the subnetworks together to form a deep neural network (DNN) as a whole, based on the time discretization (see the next two slides).

Time Discretization

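We consider the simple Euler scheme of the BSDE, with a partition of the time interval $[0, T]$, $0 = t_0 < t_1 < \ldots < t_N = T$:

$$ X_{t_{n+1}} - X_{t_n} \approx \mu(t_n, X_{t_n})\, \Delta t_n + \sigma(t_n, X_{t_n})\, \Delta W_n, $$

and

$$ u(t_{n+1}, X_{t_{n+1}}) - u(t_n, X_{t_n}) \approx -f\left(t_n, X_{t_n}, u(t_n, X_{t_n}), \sigma^{\mathrm T}(t_n, X_{t_n})\nabla u(t_n, X_{t_n})\right) \Delta t_n + [\nabla u(t_n, X_{t_n})]^{\mathrm T} \sigma(t_n, X_{t_n})\, \Delta W_n, $$

where $\Delta t_n = t_{n+1} - t_n$, $\Delta W_n = W_{t_{n+1}} - W_{t_n}$.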
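
The two update rules can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: it assumes a uniform partition and a diagonal diffusion, and the callable `z_fn` stands in for $\sigma^{\mathrm T}\nabla u$ (the quantity that the subnetworks approximate). The test problem, u_t + Δu = 0 with g(x) = ‖x‖², is likewise an assumption chosen because its exact solution is known.

```python
import numpy as np


def euler_rollout(xi, y0, mu, sigma_diag, f, z_fn, T, N, num_paths, seed=0):
    """Simulate X forward and propagate Y along the same Brownian increments.

    sigma_diag(t, x) returns the diagonal of sigma (scalar or array broadcastable to x).
    z_fn(t, x) stands in for sigma^T grad u(t, x).
    """
    rng = np.random.default_rng(seed)
    d = xi.shape[0]
    dt = T / N  # uniform partition, assumed for simplicity
    x = np.tile(xi, (num_paths, 1))
    y = np.full(num_paths, y0, dtype=float)
    for n in range(N):
        t = n * dt
        dw = np.sqrt(dt) * rng.standard_normal((num_paths, d))
        z = z_fn(t, x)
        # Y update: all coefficients are evaluated at t_n (explicit Euler).
        y = y - f(t, x, y, z) * dt + np.sum(z * dw, axis=1)
        # X update: X_{n+1} = X_n + mu dt + sigma dW.
        x = x + mu(t, x) * dt + sigma_diag(t, x) * dw
    return x, y


# Test problem: mu = 0, sigma = sqrt(2) I, f = 0, g(x) = ||x||^2, so that
# u(t, x) = ||x||^2 + 2 d (T - t) and sigma^T grad u = 2 sqrt(2) x.
d, T, N = 10, 1.0, 50
x_T, y_T = euler_rollout(
    xi=np.zeros(d),
    y0=2.0 * d * T,
    mu=lambda t, x: np.zeros_like(x),
    sigma_diag=lambda t, x: np.sqrt(2.0),
    f=lambda t, x, y, z: np.zeros_like(y),
    z_fn=lambda t, x: 2.0 * np.sqrt(2.0) * x,
    T=T, N=N, num_paths=4000,
)
mismatch = np.sum(x_T**2, axis=1) - y_T
print(f"mean mismatch: {mismatch.mean():.3f}, mean square: {np.mean(mismatch**2):.3f}")
```

When the initial value and the gradient are exact, the propagated Y matches g(X_T) up to a zero-mean discretization error that shrinks as the time step decreases.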

Network Architecture

Figure: Network architecture for solving parabolic PDEs. Each column corresponds to a subnetwork at time $t = t_n$. The whole network has $(H + 2)(N - 1)$ layers in total.

Optimization

• This network takes the paths $\{X_{t_n}\}_{0 \le n \le N}$ and $\{W_{t_n}\}_{0 \le n \le N}$ as the input data and gives the final output, denoted by $\hat{u}(\{X_{t_n}\}_{0 \le n \le N}, \{W_{t_n}\}_{0 \le n \le N})$, as an approximation to $u(t_N, X_{t_N})$.

• The error in matching the given terminal condition defines the expected loss function

$$ l(\theta) = \mathbb{E}\left[\left| g(X_{t_N}) - \hat{u}\left(\{X_{t_n}\}_{0 \le n \le N}, \{W_{t_n}\}_{0 \le n \le N}\right) \right|^2\right]. $$

• The paths can be simulated easily. Therefore the commonly used SGD algorithm fits this problem well.

• We call the introduced methodology the deep BSDE method since we use the BSDE and DNN as essential tools.
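
The loss and its minimization by SGD can be illustrated on a toy case. The sketch below makes strong simplifying assumptions that are not part of the talk: the test problem is u_t + Δu = 0 with g(x) = ‖x‖² and ξ = 0, the subnetworks are replaced by the exact $\sigma^{\mathrm T}\nabla u = 2\sqrt{2}\,x$, and the only trainable parameter is the initial value `y0`, which plays the role of u(0, ξ). Plain SGD with a hand-written gradient is used instead of a deep learning framework.

```python
import numpy as np

d, T, N = 10, 1.0, 50
dt = T / N
sqrt2 = np.sqrt(2.0)


def g(x):
    # Hypothetical terminal condition with known solution u(0, 0) = 2 d T.
    return np.sum(x**2, axis=1)


def terminal_mismatch(y0, batch_size, rng):
    """Return g(X_T) - Y_N for a fresh batch of simulated paths."""
    x = np.zeros((batch_size, d))        # xi = 0
    y = np.full(batch_size, y0, dtype=float)
    for _ in range(N):
        dw = np.sqrt(dt) * rng.standard_normal((batch_size, d))
        z = 2.0 * sqrt2 * x              # exact sigma^T grad u, replaces the subnetworks
        y = y + np.sum(z * dw, axis=1)   # f = 0 in this test problem
        x = x + sqrt2 * dw
    return g(x) - y


rng = np.random.default_rng(0)
y0 = 0.0               # initial guess for u(0, xi)
learning_rate = 0.05
for step in range(300):
    diff = terminal_mismatch(y0, batch_size=64, rng=rng)
    loss = np.mean(diff**2)              # Monte Carlo estimate of l(theta)
    grad = -2.0 * np.mean(diff)          # d loss / d y0, since Y_N = y0 + (terms without y0)
    y0 -= learning_rate * grad
print(f"learned y0 = {y0:.3f}, exact u(0, 0) = {2 * d * T:.3f}, last batch loss = {loss:.3f}")
```

Each SGD step draws new paths, so no training set has to be stored; this is the sense in which the easily simulated paths make SGD a natural fit.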

Time Discretization as Skip Connection

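Why can such deep networks be trained?

Intuition: there are skip connections between different subnetworks

$$ u(t_{n+1}, X_{t_{n+1}}) - u(t_n, X_{t_n}) \approx -f\left(t_n, X_{t_n}, u(t_n, X_{t_n}), (\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n} | \theta_n)\right) \Delta t_n + \left[(\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n} | \theta_n)\right]^{\mathrm T} \Delta W_n. $$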
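
Summing the increments over all time steps makes the residual structure explicit. The following telescoped form is a direct consequence of the update rule, written out here for illustration:

```latex
u(t_N, X_{t_N}) \approx u(t_0, X_{t_0})
  + \sum_{n=0}^{N-1} \Big[
      -f\big(t_n, X_{t_n}, u(t_n, X_{t_n}), (\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n} | \theta_n)\big)\,\Delta t_n
      + \big[(\sigma^{\mathrm T}\nabla u)(t_n, X_{t_n} | \theta_n)\big]^{\mathrm T}\,\Delta W_n
    \Big].
```

Each subnetwork therefore contributes an additive correction to the running value rather than transforming it, so the derivative of the output with respect to the running value $u(t_n, X_{t_n})$ contains an identity term at every step, much as in a residual network.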

Analogy to Deep Reinforcement Learning

• Deep Reinforcement Learning (DRL) has achieved great success in game domains and sophisticated control tasks. A common strategy is to represent the policy function (control) through neural networks.

• Recall that in the example of the LQG control problem, $Z_t$ denotes the optimal control, which is approximated by neural networks.

Table: Informal analogy

Deep BSDE method                 DRL
BSDE                       ←→    Markov decision model
gradient of the solution   ←→    optimal policy function

4. Numerical Examples

Implementation

• Each subnetwork has 4 layers, with 1 input layer (d-dimensional), 2 hidden layers (both (d + 10)-dimensional), and 1 output layer (d-dimensional).

• Choose the rectifier function (ReLU) as the activation function and optimize with the Adam method.

• Implemented in TensorFlow; all reported examples are run on a MacBook Pro.

• GitHub: https://github.com/frankhan91/DeepBSDE
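
The layer sizes above can be sketched as a plain NumPy forward pass. This is only an illustration of the shapes, not the code in the linked repository: the random initialization, the linear (activation-free) output layer, and all function names are assumptions.

```python
import numpy as np


def init_subnetwork(d, rng):
    """Weights for layer widths d -> d+10 -> d+10 -> d (random init is illustrative)."""
    widths = [d, d + 10, d + 10, d]
    return [
        (rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
        for m, n in zip(widths[:-1], widths[1:])
    ]


def subnetwork_forward(params, x):
    """Map a batch of states x with shape (batch, d) to an output of the same shape."""
    h = x
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:  # ReLU on the hidden layers, linear output (assumed)
            h = np.maximum(h, 0.0)
    return h


def num_parameters(params):
    return sum(w.size + b.size for w, b in params)


rng = np.random.default_rng(0)
d = 100
params = init_subnetwork(d, rng)
z = subnetwork_forward(params, rng.standard_normal((64, d)))
print(z.shape, num_parameters(params))
```

For d = 100 one such subnetwork has 34,420 weights and biases, and the full network contains one subnetwork per interior time step.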

LQG Example Revisited

We solve the introduced HJB equation in $[0, 1] \times \mathbb{R}^{100}$. It admits an explicit formula, which allows an accuracy test:

$$ u(t, x) = -\frac{1}{\lambda} \ln\left(\mathbb{E}\left[\exp\left(-\lambda\, g(x + \sqrt{2}\, W_{T-t})\right)\right]\right). $$

Figure: Left: Relative error of the deep BSDE method for u(t=0, x=(0, . . . , 0)) when λ = 1, which achieves 0.17% in a runtime of 330 seconds. Right: Optimal cost u(t=0, x=(0, . . . , 0)) against different λ (horizontal axis: lambda; vertical axis: u(0,0,...,0); curves: Deep BSDE Solver and Monte Carlo).
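
The explicit formula can be evaluated by plain Monte Carlo, which is presumably how the reference curve in the figure is obtained. The sketch below is hedged in one important respect: the slides do not state the terminal cost g, so g(x) = ln((1 + ‖x‖²)/2) is an assumption made here for illustration, as are the sample size and the function names.

```python
import numpy as np


def g_log(x):
    # Assumed terminal cost g(x) = ln((1 + ||x||^2) / 2); not given on the slide.
    return np.log(0.5 * (1.0 + np.sum(x**2, axis=-1)))


def lqg_reference(x, t, T, lam, g, num_samples=20_000, seed=0):
    """Monte Carlo estimate of u(t, x) = -(1/lam) ln E[exp(-lam g(x + sqrt(2) W_{T-t}))]."""
    rng = np.random.default_rng(seed)
    w = np.sqrt(T - t) * rng.standard_normal((num_samples, x.shape[0]))
    values = np.exp(-lam * g(x + np.sqrt(2.0) * w))
    return float(-np.log(np.mean(values)) / lam)


u00 = lqg_reference(np.zeros(100), t=0.0, T=1.0, lam=1.0, g=g_log)
print(f"u(0, 0, ..., 0) is approximately {u00:.3f}")
```

With this choice of g the estimate for λ = 1 falls inside the range of values shown on the vertical axis of the right panel.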

Black-Scholes Equation with Default Risk

• The classical Black-Scholes model can and should be augmented by some important factors in real markets, including defaultable securities, transaction costs, uncertainties in the model parameters, etc.

• Ideally the pricing models should take into account the whole basket of financial derivative underlyings, resulting in high-dimensional nonlinear PDEs.

• To test the deep BSDE method, we study a special case of the recursive valuation model with default risk (Duffie et al. 1996, Bender et al. 2015).

Black-Scholes Equation with Default Risk

• Consider the fair price of a European claim based on 100 underlying assets conditional on no default having occurred yet.

• The underlying asset price moves as a geometric Brownian motion and the possible default is modeled by the first jump time of a Poisson process.

• The claim value is modeled by a parabolic PDE with the nonlinear function

$$ f\left(t, x, u(t, x), \sigma^{\mathrm T}(t, x)\nabla u(t, x)\right) = -(1 - \delta)\, Q(u(t, x))\, u(t, x) - R\, u(t, x). $$

Black-Scholes Equation with Default Risk

The “exact” solution at t = 0, x = (100, . . . , 100), which is not known explicitly, is computed by the multilevel Picard method.

Figure: Approximation of u(t=0, x=(100, . . . , 100)) against the number of iteration steps. The deep BSDE method achieves a relative error of size 0.46% in a runtime of 617 seconds.

Allen-Cahn Equation

The Allen-Cahn equation is a reaction-diffusion equation for the modeling of phase separation and transition in physics. Here we consider a typical Allen-Cahn equation with the “double-well potential” in 100-dimensional space:

$$ \frac{\partial u}{\partial t}(t, x) = \Delta u(t, x) + u(t, x) - [u(t, x)]^3, $$

with initial condition $u(0, x) = g(x)$.
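
This is an initial-value problem, whereas the general formulation uses a terminal condition. A time reversal connects the two; the following worked step is added for illustration, with $v$ a name introduced here:

```latex
v(t, x) := u(T - t, x)
\quad\Longrightarrow\quad
\frac{\partial v}{\partial t}(t, x) + \Delta v(t, x) + v(t, x) - [v(t, x)]^3 = 0,
\qquad v(T, x) = g(x).
```

This matches the semilinear form with $\mu = 0$, $\sigma = \sqrt{2}\, I_d$ and $f(t, x, y, z) = y - y^3$, and the quantity of interest is $u(T, x) = v(0, x)$.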

Allen-Cahn Equation

The “exact” solution at t = 0.3, x = (0, . . . , 0), which is not known explicitly, is computed by the branching diffusion method.

Figure: Left: relative error of the deep BSDE method for u(t=0.3, x=(0, . . . , 0)), which achieves 0.30% in a runtime of 647 seconds. Right: time evolution of u(t, x=(0, . . . , 0)) for t ∈ [0, 0.3], computed by means of the deep BSDE method (horizontal axis: t; vertical axis: u(t,0,...,0)).

An Example with Quadratically Growing Derivatives

We consider an example studied in the literature on numerical methods for PDEs (Gobet & Turkedjiev 2016). The PDE is constructed artificially in the form

$$ \frac{\partial u}{\partial t}(t, x) + \|(\nabla_x u)(t, x)\|_2^2 + \frac{1}{2} (\Delta_x u)(t, x) = \frac{\partial \psi}{\partial t}(t, x) + \|(\nabla_x \psi)(t, x)\|_2^2 + \frac{1}{2} (\Delta_x \psi)(t, x), $$

with the explicit solution

$$ \psi(t, x) = \sin\left(\left[T - t + \|x\|_2^2 / d\right]^{0.4}\right). $$

An Example with Quadratically Growing Derivatives

Compared to the literature, we set d = 100 instead of d ∈ {3, 5, 7} and T = 1 instead of T = 0.2.

Figure: Left: relative error of the deep BSDE method for u(t=0, x=(0, . . . , 0)), which achieves 0.09% in a runtime of 957 seconds. Right: learning curves of the loss function.
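
Because the solution is explicit, the reference value behind the reported relative error is easy to compute. The helper below is an illustrative sketch; the function name `psi` and its signature are chosen here, not taken from the talk.

```python
import numpy as np


def psi(t, x, T=1.0, alpha=0.4):
    """Explicit solution psi(t, x) = sin([T - t + ||x||^2 / d]^alpha)."""
    d = x.shape[-1]
    return np.sin((T - t + np.sum(x**2, axis=-1) / d) ** alpha)


reference = float(psi(0.0, np.zeros(100)))  # equals sin(1) because T = 1 and x = 0
tolerance = 0.0009 * abs(reference)         # absolute error matching a 0.09% relative error
print(f"psi(0, 0) = {reference:.4f}, 0.09% of it = {tolerance:.5f}")
```

At the origin the dimension drops out of the formula, so the same reference value applies for any d; the dimension only affects how hard the value is to approximate numerically.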

References and Follow-up Works

• References:
  - Han, Jentzen, and E, Solving high-dimensional partial differential equations using deep learning, arXiv:1707.02568
  - E, Han, and Jentzen, Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Communications in Mathematics and Statistics (2017)

• Follow-up works:
  - Beck et al. 2017: deep 2BSDE method – solve fully nonlinear PDEs and second-order BSDEs through their connections and approximate the gradient and Hessian by DNN.
  - Henry-Labordère 2017: deep primal-dual algorithm for BSDEs
  - Fujii et al. 2017: use asymptotic expansion as prior knowledge to reduce error and accelerate convergence.

5. Summary

This work proposes the so-called deep BSDE method, which can solve general nonlinear high-dimensional parabolic PDEs.

1. We reformulate the parabolic PDEs as BSDEs and approximate the unknown gradient by deep neural networks.

2. Numerical results validate the proposed algorithm in high dimensions, in terms of both accuracy and speed.

3. This opens up new possibilities in various disciplines involving PDE modeling.

Thank you for your attention!
