Numerical Optimal Control – Part 3: Function space methods –
SADCO Summer School and Workshop on Optimal and Model Predictive Control – OMPC 2013, Bayreuth –
Matthias Gerdts
Institute of Mathematics and Applied Computing, Department of Aerospace Engineering
Universität der Bundeswehr München (UniBw M)
[email protected]
http://www.unibw.de/lrt1/gerdts
Photos (from http://de.wikipedia.org/wiki/Munchen): Magnus Manske (Panorama), Luidger (Theatinerkirche), Kurmis (Chin. Turm), Arad Mojtahedi (Olympiapark), Max-k (Deutsches Museum), Oliver Raupach (Friedensengel), Andreas Praefcke (Nationaltheater)
Numerical Optimal Control – Part 3: Function space methods –Matthias Gerdts
Schedule and Contents
Time Topic
9:00 - 10:30 Introduction, overview, examples, indirect method
10:30 - 11:00 Coffee break
11:00 - 12:30 Discretization techniques, structure exploitation, calculation of gradients, extensions: sensitivity analysis, mixed-integer optimal control
12:30 - 14:00 Lunch break
14:00 - 15:30 Function space methods: Gradient and Newton type methods
15:30 - 16:00 Coffee break
16:00 - 17:30 Numerical experiments
Contents
Introduction
Necessary Conditions
Adjoint Formalism
Gradient Method
  Gradient Method in Finite Dimensions
  Gradient Method for Optimal Control Problems
  Extensions
  Gradient Method for Discrete Problem
  Examples
Lagrange-Newton Method
  Lagrange-Newton Method in Finite Dimensions
  Lagrange-Newton Method in Infinite Dimensions
  Application to Optimal Control
  Search Direction
  Examples
  Extensions
Overview on Solution Methods
DAE Optimal Control Problem: two routes.

Discretization (approximation by a finite dimensional problem):
I Direct approach: methods for the discretized optimization problem
  - reduced approach (direct shooting) or full discretization (collocation)
  - SQP methods, interior point methods, gradient methods, penalty methods, multiplier methods, dynamic programming
I Indirect approach based on the finite dimensional optimality system
  - methods for finite dimensional complementarity problems and variational inequalities (semismooth Newton, Josephy-Newton, fixed-point iteration, projection methods)

Function space approach:
I Direct approach: methods for the infinite dimensional optimization problem
  - reduced approach or full approach
  - SQP methods, interior point methods, gradient methods, penalty methods, multiplier methods, dynamic programming
I Indirect approach based on the infinite dimensional optimality system
  - semianalytical methods (indirect method, boundary value problems)
  - methods for infinite dimensional complementarity problems and variational inequalities (semismooth Newton, Josephy-Newton, fixed-point iteration, projection methods)
Function Space Methods
Paradigm
Analyze and develop methods for some infinite dimensional optimization problem (e.g. an optimal control problem) of type

Minimize J(z) subject to G(z) ∈ K, H(z) = 0

in the same Banach or Hilbert spaces where z, G, and H live.

Why is this useful?
I no immediate approximation error: algorithms work in the same spaces as the problem
I massive exploitation of structure possible
I subtle requirements can get lost in discretizations (e.g. smoothing operator for semismooth methods)
I methods can be very fast

What are the difficulties?
I detailed functional analytic background necessary (cannot be expected in an industrial context)
I discretizations become necessary at a lower level anyway; so, why not discretize right away?
I theoretical difficulties with, e.g., state constraints (multipliers are measures; how to handle them numerically?)
Functional Analysis
I Banach space: complete normed vector space (X, ‖ · ‖X)
I Hilbert space: vector space X with inner product 〈·, ·〉X×X, complete w.r.t. the induced norm
I Dual space X∗ of a vector space X is defined by

X∗ := { f : X → R | f linear and continuous },   ‖f‖X∗ = sup_{‖x‖=1} |f(x)|

(X∗, ‖ · ‖X∗) is a Banach space, if (X, ‖ · ‖X) is a Banach space.
I For Banach spaces X and Y, the Fréchet derivative of f : X −→ Y at x is a linear and continuous operator f′(x) : X −→ Y with the property

‖f(x + h) − f(x) − f′(x)h‖Y = o(‖h‖X).

The Fréchet derivative of f at x in direction d is denoted by f′(x)d.
I The Fréchet derivative of a function f : X × U → Y at (x, u), applied to a direction (d, e) ∈ X × U, is given by

f′(x, u)(d, e) = f′x(x, u)d + f′u(x, u)e,

where f′x and f′u are the partial derivatives of f w.r.t. x and u, respectively.
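The defining property of the Fréchet derivative can be checked numerically. A minimal sketch in finite dimensions, with a hypothetical map f : R² → R² of our own choosing (not from the slides): the remainder ‖f(x + h) − f(x) − f′(x)h‖ divided by ‖h‖ should tend to zero as ‖h‖ → 0.

```python
import math

# Hypothetical smooth map f : R^2 -> R^2 (our choice) and its Frechet derivative,
# i.e. the Jacobian applied to a direction h.
def f(x):
    return [x[0] ** 2, x[0] * x[1]]

def fprime(x, h):
    # Jacobian of f at x is [[2 x1, 0], [x2, x1]]; apply it to h
    return [2.0 * x[0] * h[0], x[1] * h[0] + x[0] * h[1]]

def remainder_ratio(x, h):
    # ||f(x+h) - f(x) - f'(x)h|| / ||h||, which must vanish as h -> 0
    fx = f(x)
    fxh = f([x[0] + h[0], x[1] + h[1]])
    dfh = fprime(x, h)
    r = [fxh[i] - fx[i] - dfh[i] for i in range(2)]
    return math.hypot(*r) / math.hypot(*h)

x = [1.0, 2.0]
ratios = [remainder_ratio(x, [t, t]) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)  # decreases proportionally to ||h||: the remainder is o(||h||)
```

For this quadratic example the ratio shrinks linearly with ‖h‖, which is exactly the o(‖h‖X) behaviour in the definition.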
Functional Analysis
Lebesgue spaces
For 1 ≤ p ≤ ∞ the Lebesgue spaces are defined by

Lp(I, Rn) := { f : I −→ Rn | ‖f‖p < ∞ }

with

‖f‖p := ( ∫_I ‖f(t)‖^p dt )^{1/p}   (1 ≤ p < ∞),
‖f‖∞ := ess sup_{t∈I} ‖f(t)‖.

Properties:
I L2(I, Rn) is a Hilbert space with inner product 〈f, g〉 = ∫_I f(t)>g(t) dt
I L∞(I, Rn) is the dual space of L1(I, Rn), but not vice versa
I Lp(I, Rn) and Lq(I, Rn) are dual to each other, if 1/p + 1/q = 1 and 1 < p, q < ∞
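The L^p norms above can be approximated on a grid. A small sketch (our own illustration, plain Python with a midpoint quadrature rule, not from the slides): for f(t) = t on I = [0, 1] the exact values are ‖f‖₂ = 1/√3 and ‖f‖∞ = 1.

```python
import math

def lp_norm(f, p, a=0.0, b=1.0, n=10000):
    # Midpoint-rule approximation of the L^p norm of a scalar f on I = [a, b];
    # for p = inf, the max over the grid approximates the essential supremum.
    h = (b - a) / n
    if p == math.inf:
        return max(abs(f(a + (i + 0.5) * h)) for i in range(n))
    return (sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(n)) * h) ** (1.0 / p)

norm2 = lp_norm(lambda t: t, 2)            # exact value: 1/sqrt(3)
norm_inf = lp_norm(lambda t: t, math.inf)  # exact value: 1
print(norm2, norm_inf)
```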
Functional Analysis
Sobolev spaces
For 1 ≤ p ≤ ∞ and q ∈ N the Sobolev spaces are defined by

W q,p(I, Rn) := { f : I −→ Rn | ‖f‖q,p < ∞ }

with

‖f‖q,p := ( Σ_{j=0}^{q} ‖f^(j)‖p^p )^{1/p}   (1 ≤ p < ∞),   ‖f‖q,∞ := max_{0≤j≤q} ‖f^(j)‖∞.

Properties:
I W 1,2([t0, tf], Rn) is a Hilbert space with inner product

〈f, g〉 = f(t0)>g(t0) + ∫_{t0}^{tf} f′(t)>g′(t) dt

I W 1,1(I, Rn) is the space of absolutely continuous functions, i.e. functions satisfying

f(t) = f(t0) + ∫_{t0}^{t} f′(τ) dτ   in I = [t0, tf].
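The W 1,2 inner product can likewise be approximated numerically. A sketch with a test function of our own choosing (f(t) = sin t, f′(t) = cos t on [t0, tf] = [0, 1], scalar case): the exact value of 〈f, f〉 is sin(0)² + ∫₀¹ cos(t)² dt = 1/2 + sin(2)/4.

```python
import math

def w12_inner(f, fp, g, gp, t0=0.0, tf=1.0, n=10000):
    # <f, g> = f(t0) g(t0) + \int_{t0}^{tf} f'(t) g'(t) dt  (scalar case,
    # midpoint rule for the integral; fp, gp are the derivatives of f, g)
    h = (tf - t0) / n
    integral = sum(fp(t0 + (i + 0.5) * h) * gp(t0 + (i + 0.5) * h)
                   for i in range(n)) * h
    return f(t0) * g(t0) + integral

# f(t) = sin t with f'(t) = cos t; exact value: 1/2 + sin(2)/4
val = w12_inner(math.sin, math.cos, math.sin, math.cos)
print(val)
```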
Contents
Introduction
Necessary Conditions
Adjoint Formalism
Gradient Method
  Gradient Method in Finite Dimensions
  Gradient Method for Optimal Control Problems
  Extensions
  Gradient Method for Discrete Problem
  Examples
Lagrange-Newton Method
  Lagrange-Newton Method in Finite Dimensions
  Lagrange-Newton Method in Infinite Dimensions
  Application to Optimal Control
  Search Direction
  Examples
  Extensions
Infinite Optimization Problem
Throughout we restrict the discussion to the following class of optimization problems.
Problem (Infinite Optimization Problem (NLP))
Given:
I Banach spaces (Z, ‖ · ‖Z), (Y, ‖ · ‖Y)
I mappings J : Z −→ R, H : Z −→ Y
I convex set S ⊆ Z

Minimize J(z) subject to z ∈ S, H(z) = 0
Infinite Optimization Problem
Theorem (KKT Conditions)
Assumptions:
I z̄ is a local minimum of NLP
I J is Fréchet-differentiable at z̄, H is continuously Fréchet-differentiable at z̄
I S is convex with non-empty interior
I Mangasarian-Fromowitz Constraint Qualification (MFCQ): H′(z̄) is surjective and there exists d ∈ int(S − z̄) with H′(z̄)(d) = 0

Then there exists a multiplier λ∗ ∈ Y∗ such that

J′(z̄)(z − z̄) + λ∗(H′(z̄)(z − z̄)) ≥ 0   ∀z ∈ S

For a proof see [1, Theorems 3.1, 4.1].

[1] S. Kurcyusz. On the Existence and Nonexistence of Lagrange Multipliers in Banach Spaces. Journal of Optimization Theory and Applications, 20(1):81–110, 1976.
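The variational inequality can be illustrated in one dimension. A hypothetical instance of our own choosing (not from the slides): J(z) = (z − 2)², S = [0, 1], H absent. The minimizer is z̄ = 1, which lies on the boundary of S, so J′(z̄) ≠ 0 and yet the inequality J′(z̄)(z − z̄) ≥ 0 holds for all z ∈ S.

```python
# Hypothetical 1-D instance: J(z) = (z - 2)^2 minimized over S = [0, 1].
# The minimizer is zbar = 1; J'(zbar) = -2 is nonzero, but the variational
# inequality J'(zbar)(z - zbar) >= 0 holds for every z in S.
def Jprime(z):
    return 2.0 * (z - 2.0)

zbar = 1.0
# check the inequality on a grid of points z in [0, 1]
vi_holds = all(Jprime(zbar) * (z / 100.0 - zbar) >= 0.0 for z in range(101))
print(vi_holds)  # True: -2 * (z - 1) >= 0 for every z in [0, 1]
```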
Infinite Optimization Problem
Lagrange function:

L(z, λ∗) := J(z) + λ∗(H(z))

Special cases:
I If S = Z, then

J′(z̄)(z) + λ∗(H′(z̄)(z)) = 0   ∀z ∈ Z

or equivalently

L′z(z̄, λ∗) = 0

I If H is not present, then

J′(z̄)(z − z̄) ≥ 0   ∀z ∈ S

If Z is a Hilbert space, then this condition is equivalent to the nonsmooth equation

z̄ = ΠS( z̄ − αJ′(z̄) )   (α > 0, ΠS projection onto S, J′(z̄) identified with its Riesz representative in Z)

I If S = Z and H is not present, then

J′(z̄)(z) = 0   ∀z ∈ Z, i.e. J′(z̄) = 0
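The fixed-point characterization via the projection suggests the projected gradient iteration. A minimal sketch for a hypothetical one-dimensional problem of our own choosing (J(z) = (z − 2)², S = [0, 1]): iterating z ← ΠS(z − αJ′(z)) converges to the minimizer, which satisfies the nonsmooth equation exactly.

```python
def proj(z, lo=0.0, hi=1.0):
    # projection onto S = [lo, hi]
    return min(max(z, lo), hi)

def Jprime(z):
    # J(z) = (z - 2)^2, a hypothetical choice; minimizer over S = [0, 1] is 1
    return 2.0 * (z - 2.0)

alpha, z = 0.25, 0.0          # any fixed alpha > 0; start inside S
for _ in range(100):          # fixed-point iteration z <- proj(z - alpha J'(z))
    z = proj(z - alpha * Jprime(z))
print(z)  # converges to zbar = 1, a fixed point of the projected map
```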
Contents
Introduction
Necessary Conditions
Adjoint Formalism
Gradient Method
  Gradient Method in Finite Dimensions
  Gradient Method for Optimal Control Problems
  Extensions
  Gradient Method for Discrete Problem
  Examples
Lagrange-Newton Method
  Lagrange-Newton Method in Finite Dimensions
  Lagrange-Newton Method in Infinite Dimensions
  Application to Optimal Control
  Search Direction
  Examples
  Extensions
Adjoint Formalism
Equality constrained NLP (EQ-NLP)
Minimize J(x, u) subject to Ax = Bu.
A ∈ Rn×n, B ∈ Rn×m, J : Rn × Rm −→ R differentiable
Solution operator: Let A be nonsingular. Then:
Ax = Bu ⇔ x = A−1Bu =: Su
S : Rm −→ Rn, u 7→ x = Su is called solution operator.
Reduced objective functional:
J(x, u) = J(Su, u) =: j(u), j : Rm −→ R.
Reduced NLP (R-NLP)
Minimize j(u) = J(Su, u) w.r.t. u ∈ Rm.
Adjoint Formalism
Task
Compute the gradient of j at a given u, for instance in a gradient-based optimization method or for the evaluation of necessary optimality conditions.
Differentiation yields
j′(u) = J′x (Su, u)S + J′u(Su, u)
= J′x (Su, u)A−1B + J′u(Su, u)
Computation of A−1 is expensive. Try to avoid it!
To this end define adjoint vector λ by
λ> := J′x (Su, u)A−1
Adjoint Formalism
Adjoint equation
A>λ = ∇x J(Su, u)
The gradient of j at u reads
j′(u) = λ>B + J′u(Su, u)
Connection to KKT conditions: Lagrange function of EQ-NLP
L(x, u, λ) := J(x, u) + λ>(Bu − Ax)
If λ solves adjoint equation, then
L′x (x, u, λ) = J′x (x, u)− λ>A = 0
This choice of λ automatically satisfies one part of the KKT conditions. The second part
0 = L′u(x, u, λ) = J′u(x, u) + λ>B = j′(u)
is only satisfied at a stationary point u of the reduced problem R-NLP!
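The adjoint recipe can be sanity-checked in a small instance. A sketch with hypothetical data of our own choosing (a diagonal A ∈ R²ˣ², a B ∈ R²ˣ¹, and a quadratic J): solve A⊤λ = ∇ₓJ(Su, u), form j′(u) = λ⊤B + J′ᵤ(Su, u), and compare against a finite difference of the reduced objective j(u) = J(Su, u).

```python
# Hypothetical data (our choice): A diagonal, so linear solves are divisions
# and A^T = A; J(x, u) = 1/2 ||x||^2 + 1/2 u^2, hence grad_x J = x, J'_u = u.
A = [[2.0, 0.0], [0.0, 3.0]]
B = [[1.0], [2.0]]

def solve_diag(M, b):
    return [b[i] / M[i][i] for i in range(len(b))]

def J(x, u):
    return 0.5 * (x[0] ** 2 + x[1] ** 2) + 0.5 * u ** 2

def j(u):
    # reduced objective j(u) = J(Su, u) with x = Su = A^{-1} B u
    x = solve_diag(A, [B[0][0] * u, B[1][0] * u])
    return J(x, u)

def j_adjoint_grad(u):
    x = solve_diag(A, [B[0][0] * u, B[1][0] * u])
    lam = solve_diag(A, x)          # adjoint equation A^T lam = grad_x J = x
    return lam[0] * B[0][0] + lam[1] * B[1][0] + u   # lam^T B + J'_u(Su, u)

u = 3.0
ad = j_adjoint_grad(u)
fd = (j(u + 1e-6) - j(u - 1e-6)) / 2e-6   # central finite difference of j
print(ad, fd)
```

The adjoint value and the finite difference agree, without ever forming A⁻¹ explicitly: only two linear solves with A are needed.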
Adjoint Formalism
Let X and Y be Banach spaces and X∗ and Y∗ their topological dual spaces.
Adjoint operator
Let A : X −→ Y be a bounded linear operator. The operator A∗ : Y∗ −→ X∗ with the property

(A∗y∗)(x) = y∗(Ax)   ∀x ∈ X, y∗ ∈ Y∗

is called the adjoint operator (it is a bounded linear operator).
Notation: (A∗y∗, x) = (y∗, Ax)  "dual pairing"
Equality constrained NLP (EQ-NLP)
Minimize J(x, u) subject to Ax = Bu.
A : X −→ Y, B : U −→ Y bounded linear operators, J : X × U −→ R Fréchet differentiable, X, U, Y Banach spaces
Adjoint Formalism
Solution operator: Let A be continuously invertible. Then:
Ax = Bu ⇐⇒ x = A−1Bu =: Su
with solution operator S : U −→ X , u 7→ x = Su.
Reduced objective functional:
J(x, u) = J(Su, u) =: j(u), j : U −→ R.
Reduced NLP (R-NLP)
Minimize j(u) = J(Su, u) w.r.t. u ∈ U.
Adjoint Formalism
Differentiation at u in direction d yields

j′(u)d = J′x(Su, u)Sd + J′u(Su, u)d
       = J′x(Su, u)A−1Bd + J′u(Su, u)d

Computation of A−1 is expensive. Try to avoid it!

Define the adjoint y∗ ∈ Y∗ by

y∗ := J′x(Su, u)A−1,  i.e.  y∗(Ax) = (A∗y∗)(x) = J′x(Su, u)x   ∀x ∈ X

Adjoint equation

A∗y∗ = J′x(Su, u)   (operator equation in X∗)

The derivative of j at u in direction d then reads

j′(u)d = y∗(Bd) + J′u(Su, u)d
Adjoint Formalism
Connection to KKT conditions:
I Lagrange function of EQ-NLP

L(x, u, y∗) := J(x, u) + y∗(Bu − Ax)

I y∗ solves the adjoint equation =⇒

L′x(x, u, y∗)x = J′x(x, u)x − y∗(Ax) = J′x(x, u)x − (A∗y∗)x = 0

This choice of y∗ automatically satisfies one part of the KKT conditions.
I The second part

0 = L′u(x, u, y∗) = J′u(x, u) + y∗B = j′(u)
is only satisfied at a stationary point u of the reduced problem R-NLP!
How to compute the adjoint operator?
Theorem (Riesz)
Given: Hilbert space X, inner product 〈·, ·〉, dual space X∗.
For every f∗ ∈ X∗ there exists a unique f ∈ X such that ‖f∗‖X∗ = ‖f‖X and

f∗(x) = 〈f, x〉   ∀x ∈ X.

Example (linear ODE)
Given: I := [0, T], linear mapping A : W 1,2(I, Rn) =: X −→ Y := L2(I, Rn) × Rn,

(Ax)(·) = ( x′(·) − A(·)x(·),  x(0) ).

X and Y are Hilbert spaces. According to Riesz' theorem, for y∗ ∈ Y∗ and x∗ ∈ X∗ there exist (λ, σ) ∈ Y and µ ∈ X with

y∗(h, k) = ∫_I λ(t)>h(t) dt + σ>k   ∀(h, k) ∈ Y,
x∗(x) = µ(0)>x(0) + ∫_I µ′(t)>x′(t) dt   ∀x ∈ X.
How to compute the adjoint operator?
Example (linear ODE, continued)
We intend to identify λ, σ, and µ such that y∗(Ax) = (A∗y∗)(x) holds for all x ∈ X and y∗ ∈ Y∗. Note that A∗y∗ ∈ X∗.
By partial integration we obtain

y∗(Ax) = ∫_I λ(t)>( x′(t) − A(t)x(t) ) dt + σ>x(0)

= −[ ( −∫_t^T λ(s)>A(s) ds ) x(t) ]_{t=0}^{t=T} + σ>x(0) + ∫_I ( λ(t)> − ∫_t^T λ(s)>A(s) ds ) x′(t) dt

= ( σ> − ∫_0^T λ(s)>A(s) ds ) x(0) + ∫_I ( λ(t)> − ∫_t^T λ(s)>A(s) ds ) x′(t) dt

= µ(0)>x(0) + ∫_I µ′(t)>x′(t) dt = (A∗y∗)(x)

with

µ(0)> := σ> − ∫_0^T λ(s)>A(s) ds,   µ′(t)> := λ(t)> − ∫_t^T λ(s)>A(s) ds.

The latter equation defines the adjoint operator since y∗(Ax) = (A∗y∗)(x) holds for all x ∈ X, y∗ ∈ Y∗.
How to compute the adjoint equation?
Example (OCP and adjoint equation)
Optimal control problem (assumption: ϕ Fréchet differentiable):

    Minimize J(x, u) := ϕ(x(T)) s.t.
    ( x′(t) − A(t)x(t) ,  x(0) ) = ( B(t)u(t) ,  0 ),
    where the left-hand side is (Ax)(t) and the right-hand side is (Bu)(t).

Adjoint equation: For every x ∈ W^{1,2}(I, R^n) we have

    0 = (A∗y∗)(x) − J′x(x, u)(x)
      = ( σ> − ∫_0^T λ(s)>A(s) ds ) x(0)
        + ∫_I ( λ(t)> − ∫_t^T λ(s)>A(s) ds ) x′(t) dt − ϕ′(x(T))x(T).

Application of the variation lemma (Du Bois-Reymond) yields

    λ(t)> = λ(T)> + ∫_t^T λ(s)>A(s) ds   and   λ(0) = σ,   λ(T)> = ϕ′(x(T)).
How to compute the adjoint equation?
Example (OCP and adjoint equation, continued)
- Adjoint equation:
    λ′(t) = −A(t)>λ(t),   λ(T) = ∇ϕ(x(T)).
- Gradient of the reduced objective functional j(u) = J(x(u), u):
    j′(u)(u) = y∗(Bu) + J′u(x, u)u
             = ∫_I λ(t)>B(t)u(t) dt + σ>·0
             = ∫_I λ(t)>B(t)u(t) dt    (u ∈ L²(I, R^m))
- Stationary point:
    0 = j′(u)(u) = ∫_I λ(t)>B(t)u(t) dt   ∀u ∈ L²(I, R^m)
  implies
    0 = B(t)>λ(t)   a.e. in I.
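The adjoint ODE λ′(t) = −A(t)>λ(t) with terminal condition λ(T) = ∇ϕ(x(T)) is integrated backward in time. A minimal numerical sketch, assuming a constant system matrix A and a plain Euler backward sweep (both illustrative choices, not prescribed by the slides):

```python
import numpy as np

def solve_adjoint(A, lam_T, T, N):
    """Integrate lambda' = -A^T lambda backward from lambda(T) = lam_T.

    Euler step on the backward sweep: lambda_i = lambda_{i+1} + h * A^T lambda_{i+1}.
    Returns the (N+1) x n array of adjoint values on the uniform grid on [0, T].
    """
    n = len(lam_T)
    h = T / N
    lam = np.zeros((N + 1, n))
    lam[N] = lam_T            # terminal condition
    for i in range(N - 1, -1, -1):
        lam[i] = lam[i + 1] + h * A.T @ lam[i + 1]
    return lam

# scalar illustration: A = (0.5), so lambda(t) = exp(0.5*(T - t)) * lambda(T)
lam = solve_adjoint(np.array([[0.5]]), np.array([1.0]), T=1.0, N=1000)
```

For the scalar case the exact solution is available in closed form, which makes the sweep easy to verify.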
Gradient Method in Finite Dimensions
Unconstrained minimization problem

    Minimize J(u) w.r.t. u ∈ R^n,   J : R^n −→ R continuously differentiable.

Gradient Method for Finite-Dimensional Problems
(0) Let u(0) ∈ R^n, β ∈ (0, 1), σ ∈ (0, 1), and k := 0.
(1) Compute d(k) := −∇J(u(k)).
(2) If ‖d(k)‖ ≈ 0, STOP.
(3) Perform a line search: find the smallest j ∈ {0, 1, 2, …} with
    J(u(k) + β^j d(k)) ≤ J(u(k)) − σβ^j ‖∇J(u(k))‖₂²
    and set αk := β^j.
(4) Set u(k+1) := u(k) + αk d(k), k := k + 1, and go to (1).
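Steps (0)–(4) translate directly into code; the quadratic test function below is an illustrative choice, not from the slides:

```python
import numpy as np

def gradient_method(J, gradJ, u0, beta=0.5, sigma=0.1, tol=1e-8, max_iter=1000):
    """Steepest descent with Armijo line search, following steps (0)-(4)."""
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        d = -gradJ(u)                      # step (1): steepest-descent direction
        g2 = np.dot(d, d)                  # ||grad J(u)||_2^2
        if np.sqrt(g2) < tol:              # step (2): stopping test
            break
        alpha = 1.0                        # step (3): Armijo, alpha = beta^j
        while J(u + alpha * d) > J(u) - sigma * alpha * g2:
            alpha *= beta
        u = u + alpha * d                  # step (4): update
    return u

# illustrative quadratic: J(u) = 1/2 u^T Q u - b^T u, minimizer solves Q u = b
Q = np.array([[2.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 1.0])
J = lambda u: 0.5 * u @ Q @ u - b @ u
gradJ = lambda u: Q @ u - b
u_star = gradient_method(J, gradJ, np.zeros(2))
```

For this quadratic the exact minimizer is Q⁻¹b = (0.5, 0.1)>, so the result is easy to check.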
Gradient Method
Pro's:
- requires only first derivatives
- easy to implement
- global convergence achieved by Armijo line search
- extension: projected gradient method for simple constraints

Con's:
- only linear convergence rate
- only unconstrained minimization (except simple bounds)
Gradient Method for Optimal Control Problems
Optimal control problem (OCP)
Given: I := [t0, tf], x ∈ R^{nx}, and continuously differentiable functions
    ϕ : R^{nx} −→ R,   f0 : R^{nx} × R^{nu} −→ R,   f : R^{nx} × R^{nu} −→ R^{nx}.

Minimize
    Γ(x, u) := ϕ(x(tf)) + ∫_I f0(x(t), u(t)) dt
w.r.t. x ∈ W^{1,∞}(I, R^{nx}), u ∈ L^∞(I, R^{nu}) subject to the constraints
    x′(t) = f(x(t), u(t)) a.e. in I,   x(t0) = x.

Define X := W^{1,∞}(I, R^{nx}) and U := L^∞(I, R^{nu}).
Gradient Method for Optimal Control Problems
Assumptions (existence of a solution operator):
- For every u ∈ U the initial value problem
    x′(t) = f(x(t), u(t)) a.e. in I,   x(t0) = x
  has a unique solution x = S(u) ∈ X.
- The solution mapping S : U −→ X is continuously Fréchet differentiable.

Reduced optimal control problem (R-OCP)
    Minimize J(u) := Γ(S(u), u) subject to u ∈ U.

The gradient method requires the gradient of the reduced objective function J at some u ∈ U.

What does the gradient look like?
Computation of Gradient
- In R^n:
    ∇J(u) := J′(u)> = ( ∂J/∂u1(u), …, ∂J/∂un(u) )>,   J′(u)(u) = ∇J(u)>u = 〈∇J(u), u〉,
  i.e. the gradient represents the derivative via the inner product.
- In a Hilbert space setting, i.e. J : U −→ R with U a Hilbert space:
    J′(u) ∈ U∗  =⇒ (Riesz)  ∃η(u) ∈ U :  J′(u)(u) = 〈η(u), u〉_U.
  Hence, ∇J(u) := η(u) ∈ U is the gradient of J at u.
- But in our case U = L^∞(I, R^{nu}) is not a Hilbert space, and hence the functional J′(u) ∈ U∗ does not a priori have such a nice representation as in the above cases.

How to define the gradient in this case? −→ use the formal Lagrange technique (see also the metric gradient [1]).

[1] M. Golomb and R. A. Tapia. The metric gradient in normed linear spaces. Numerische Mathematik, 20:115–124, 1972.
Computation of Gradient by Formal Lagrange Technique
Hamilton function:
    H(x, u, λ) := f0(x, u) + λ>f(x, u)

Auxiliary functional (u and x = S(u) given; λ ∈ X to be specified later):

    Ĵ(u) := J(u) + 〈λ, f(x, u) − x′〉_{L²(I,R^{nx})}
          = ϕ(x(tf)) + ∫_I H(x(t), u(t), λ(t)) − λ(t)>x′(t) dt
          = ϕ(x(tf)) − [λ(t)>x(t)]_{t0}^{tf} + ∫_I H(x(t), u(t), λ(t)) + λ′(t)>x(t) dt   (partial integration).

Fréchet derivative (exploit S′(u)(t0) = 0):

    Ĵ′(u)(u) = ( ϕ′(x(tf)) − λ(tf)> ) S′(u)(tf)
             + ∫_I ( H′x[t] + λ′(t)> ) S′(u)(t) + H′u[t]u(t) dt.

The computation of S′(u) is expensive! Eliminate the terms involving S′(u) by choosing λ as the solution of the adjoint ODE

    λ′(t) = −H′x(x(t), u(t), λ(t))>,   λ(tf) = ϕ′(x(tf))>.
Computation of Gradient
“Gradient form”:

    Ĵ′(u)(u) = ∫_I H′u[t]u(t) dt = 〈∇uH, u〉_{L²(I,R^{nu})}.

Theorem (steepest descent)
The direction
    d(t) := −(1/‖H′u‖₂) H′u[t]>
solves: Minimize Ĵ′(u)(u) w.r.t. u subject to ‖u‖₂ = 1.

Proof: For every u with ‖u‖₂ = 1 we have, by the Cauchy–Schwarz inequality,
    |Ĵ′(u)(u)| ≤ ‖H′u‖₂ · ‖u‖₂ = ‖H′u‖₂,
and for d we have Ĵ′(u)(d) = −‖H′u‖₂.
Computation of Gradient
Relation between Ĵ′(u) and J′(u), for a proof see [5, Chapter 8]:

Theorem
Let u ∈ U and x = S(u) be given, and let λ satisfy the adjoint ODE
    λ′ = −H′x(x, u, λ)>,   λ(tf) = ϕ′(x(tf))>.
Then:
    Ĵ′(u)(u) = J′(u)(u)   ∀u ∈ U.

Owing to the theorem we may define the gradient of J at u as follows.

Definition (Gradient of the reduced objective functional)
Let u ∈ U and x = S(u) be given, and let λ satisfy the adjoint ODE. The gradient ∇J(u) ∈ U of J at u is defined by
    ∇J(u)(·) := H′u(x(·), u(·), λ(·))>.
Gradient Method
Gradient method for R-OCP
(0) Choose u(0) ∈ U, β ∈ (0, 1), σ ∈ (0, 1), and set k := 0.
(1) Solve the ODE
    x′ = f(x, u(k)),   x(t0) = x,
    and the adjoint ODE
    λ′ = −H′x(x, u(k), λ)>,   λ(tf) = ϕ′(x(tf))>.
    Denote the solutions by x(k) = S(u(k)) and λ(k).
(2) If ‖H′u‖₂ ≈ 0, STOP.
(3) Set
    d(k)(t) := −H′u(x(k)(t), u(k)(t), λ(k)(t))>.
(4) Perform an Armijo line search: find the smallest j ∈ {0, 1, 2, …} with
    J(u(k) + β^j d(k)) ≤ J(u(k)) − σβ^j ‖H′u(x(k), u(k), λ(k))‖₂²
    and set αk := β^j.
(5) Set u(k+1) := u(k) + αk d(k), k := k + 1, and go to (1).
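The forward/backward sweep structure of this algorithm can be sketched in code. The sketch below applies it to the first example of the following slides (minimize x2(1) with x1′ = −x1 + √3 u, x1(0) = 2 and running cost ½(x1² + u²), so that H = ½(x1² + u²) + λ(−x1 + √3 u), the adjoint ODE is λ′ = λ − x1 with λ(1) = 0, and H′u = u + √3 λ). Explicit Euler replaces the symplectic Euler of the slides, so digits and iteration counts differ slightly:

```python
import numpy as np

N, T = 100, 1.0
h = T / N
s3 = np.sqrt(3.0)

def state_and_cost(u):
    """Forward sweep: explicit Euler for x1' = -x1 + sqrt(3)*u, x1(0) = 2,
    accumulating the running cost (1/2)(x1^2 + u^2)."""
    x1 = np.empty(N + 1); x1[0] = 2.0
    J = 0.0
    for i in range(N):
        J += h * 0.5 * (x1[i] ** 2 + u[i] ** 2)
        x1[i + 1] = x1[i] + h * (-x1[i] + s3 * u[i])
    return x1, J

def adjoint(x1):
    """Backward sweep for lam' = lam - x1, lam(t_f) = 0 (discrete adjoint
    of the explicit Euler forward sweep)."""
    lam = np.empty(N + 1); lam[N] = 0.0
    for i in range(N - 1, -1, -1):
        lam[i] = lam[i + 1] - h * (lam[i + 1] - x1[i])
    return lam

u = np.zeros(N)                              # u^(0) = 0
beta, sigma = 0.9, 0.1
for k in range(200):
    x1, J = state_and_cost(u)
    lam = adjoint(x1)
    grad = u + s3 * lam[1:]                  # H'_u(x_i, u_i, lam_{i+1})
    g2 = h * np.sum(grad ** 2)               # squared L2 norm of the gradient
    if np.sqrt(g2) < 1e-6:                   # step (2)
        break
    alpha, d = 1.0, -grad                    # steps (3)-(4): Armijo backtracking
    while state_and_cost(u + alpha * d)[1] > J - sigma * alpha * g2:
        alpha *= beta
    u = u + alpha * d                        # step (5)
```

Because the backward sweep is the exact discrete adjoint of the forward sweep, the computed direction is an exact descent direction for the discrete cost and the Armijo loop always terminates.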
Gradient Method
Theorem (Convergence)
Suppose that the gradient method does not terminate. Let u∗ be an accumulation point of the sequence {u(k)}k∈N generated by the gradient method and x∗ := S(u∗). Then:
    ‖∇J(u∗)‖₂ = 0.
Gradient Method – Examples
Example
Minimize x2(1) subject to the constraints
    x1′(t) = −x1(t) + √3 u(t),   x1(0) = 2,
    x2′(t) = (1/2)( x1(t)² + u(t)² ),   x2(0) = 0.

Output of the gradient method (u(0) ≡ 0, β = 0.9, σ = 0.1, symplectic Euler, N = 100):

    k    αk              J(u(k))         ‖H′u‖∞          ‖H′u‖₂²
    0    0.00000000E+00  0.87037219E+00  0.14877655E+01  0.65717322E+00
    1    0.10000000E+01  0.72406641E+00  0.61765831E+00  0.21168343E+00
    2    0.10000000E+01  0.68017548E+00  0.35175249E+00  0.72633493E-01
    3    0.10000000E+01  0.66515486E+00  0.20519966E+00  0.24977643E-01
    ...
    23   0.10000000E+01  0.65728265E+00  0.47781223E-05  0.13406009E-10
    24   0.10000000E+01  0.65728265E+00  0.28158031E-05  0.46203574E-11
    25   0.10000000E+01  0.65728265E+00  0.16459056E-05  0.15877048E-11
    26   0.10000000E+01  0.65728265E+00  0.97422070E-06  0.54326016E-12
Gradient Method – Examples
Example (continued)
Some iterates (red) and converged solution (blue): [figure: three plots of state x1(t), control u(t), and adjoint lambda1(t) over time t ∈ [0, 1]]
Gradient Method – Examples
Example
Minimize x2(1) + 2.5(x1(1) − 1)² subject to the constraints
    x1′(t) = u(t) − 15 exp(−2t),   x1(0) = 4,
    x2′(t) = (1/2)( u(t)² + x1(t)³ ),   x2(0) = 0.

Output of the gradient method (u(0) ≡ 0, β = 0.9, σ = 0.1, symplectic Euler, N = 100):

    k    αk              J(u(k))         ‖H′u‖∞          ‖H′u‖₂²
    0    0.00000000E+00  0.32263509E+02  0.17750257E+02  0.23786744E+03
    1    0.31381060E+00  0.21450571E+02  0.16994586E+02  0.18987670E+03
    2    0.25418658E+00  0.15604667E+02  0.91485607E+01  0.78563930E+02
    3    0.28242954E+00  0.11414005E+02  0.75417222E+01  0.39835695E+02
    4    0.25418658E+00  0.97695774E+01  0.42526367E+01  0.15677990E+02
    ...
    61   0.28242954E+00  0.84261386E+01  0.22532128E-05  0.28170368E-11
    62   0.28242954E+00  0.84261386E+01  0.16626581E-05  0.17267323E-11
    63   0.28242954E+00  0.84261386E+01  0.16639237E-05  0.13398784E-11
    64   0.22876792E+00  0.84261386E+01  0.83196204E-06  0.21444710E-12
Gradient Method – Examples
Example (continued)
Some iterates (red) and converged solution (blue): [figure: three plots of state x1(t), control u(t), and adjoint lambda1(t) over time t ∈ [0, 1]]
Projected Gradient Method
We add “simple” constraints u ∈ U with a convex set U ⊂ U to the reduced problem.

Reduced optimal control problem (R-OCP)
    Minimize J(u) := Γ(S(u), u) subject to u ∈ U ⊂ U.

Assumption:
- the projection ΠU : U −→ U is easy to compute

Example: For box constraints
    U = {u ∈ R | a ≤ u ≤ b}
the projection computes to
    ΠU(u) = max{a, min{b, u}} = { a if u < a;  u if a ≤ u ≤ b;  b if u > b }.

Optimality:
    J′(u)(v − u) ≥ 0   ∀v ∈ U   ⇐⇒   u = ΠU( u − α∇J(u) )   (α > 0)
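For box constraints, both the projection and the fixed-point optimality test are one-liners. A sketch (function names are ad hoc):

```python
import numpy as np

def project_box(u, a, b):
    """Componentwise projection onto {u : a <= u <= b}: max{a, min{b, u}}."""
    return np.minimum(b, np.maximum(a, u))

def is_stationary(u, grad, a, b, alpha=1.0, tol=1e-10):
    """Fixed-point characterization:
    u is stationary  <=>  u == project_box(u - alpha * grad(J)(u))."""
    return np.linalg.norm(u - project_box(u - alpha * grad, a, b)) < tol
```

For example, minimizing ½(u − 2)² over [0, 1] has the solution u = 1 with gradient u − 2 = −1 there, and the fixed-point test confirms stationarity even though the gradient does not vanish.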
Projected Gradient Method
The projected gradient method requires a feasible initial guess u(0) and differs from the gradient method in one of the following two components:

Version 1: In iteration k compute
    ū(k) := ΠU( u(k) + d(k) )
and use the direction d̄(k) := ū(k) − u(k) instead of d(k) in steps (4) and (5), i.e.
    J( u(k) + β^j d̄(k) ) ≤ J(u(k)) + σβ^j J′(u(k))d̄(k),   u(k+1) = u(k) + αk d̄(k).

Version 2: In iteration k use the projection within the Armijo line search in step (4), i.e.
    J( ΠU( u(k) + β^j d(k) ) ) ≤ J(u(k)) + σβ^j J′(u(k))d(k),
and set u(k+1) := ΠU( u(k) + αk d(k) ) in step (5).
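Version 1 can be sketched on a small quadratic model problem (the objective and the box are illustrative choices, not from the slides):

```python
import numpy as np

def projected_gradient_v1(J, gradJ, u0, a, b, beta=0.5, sigma=0.1,
                          tol=1e-10, max_iter=500):
    """Projected gradient method, version 1: project the full gradient step,
    then do an Armijo line search along d_bar = Pi_U(u + d) - u."""
    clip = lambda v: np.minimum(b, np.maximum(a, v))
    u = clip(np.asarray(u0, dtype=float))         # feasible initial guess
    for _ in range(max_iter):
        d = -gradJ(u)
        d_bar = clip(u + d) - u                   # projected search direction
        if np.linalg.norm(d_bar) < tol:           # stationarity test
            break
        slope = gradJ(u) @ d_bar                  # J'(u) d_bar (<= 0)
        alpha = 1.0
        while J(u + alpha * d_bar) > J(u) + sigma * alpha * slope:
            alpha *= beta
        u = u + alpha * d_bar                     # convex combination: feasible
    return u

# illustrative problem: minimize 1/2 ||u - c||^2 over the box [0, 1]^2;
# the solution is the projection of c onto the box
c = np.array([2.0, 0.3])
J = lambda u: 0.5 * np.sum((u - c) ** 2)
gradJ = lambda u: u - c
u_star = projected_gradient_v1(J, gradJ, np.array([0.0, 0.0]), 0.0, 1.0)
```

Since αk ≤ 1 and the box is convex, every iterate u(k) + αk d̄(k) is a convex combination of two feasible points and hence stays feasible.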
Projected Gradient Method – Examples
Example
Minimize x2(1) + 2.5(x1(1) − 1)² subject to the constraints
    x1′(t) = u(t) − 15 exp(−2t),   x1(0) = 4,
    x2′(t) = (1/2)( u(t)² + x1(t)³ ),   x2(0) = 0,
    u(t) ∈ U := {u ∈ R | 1 ≤ u ≤ 3}.

Output of the projected gradient method, version 1 (u(0) ≡ 1, β = 0.9, σ = 0.1, symplectic Euler, N = 100):

    k    αk              J(u(k))         ‖u(k) − ΠU(u(k) − J′(u(k)))‖∞   J′(u(k))d̄(k)
    0    0.00000000E+00  0.19145273E+02  0.20000000E+01                  -0.21446656E+02
    1    0.10000000E+01  0.87483967E+01  0.20000000E+01                  -0.10674827E+01
    2    0.47829690E+00  0.86918573E+01  0.95659380E+00                  -0.39783867E+00
    3    0.10000000E+01  0.85622908E+01  0.16165415E+01                  -0.34588659E+00
    ...
    56   0.81000000E+00  0.84783285E+01  0.13795810E-05                  -0.18974642E-12
    57   0.72900000E+00  0.84783285E+01  0.13795797E-05                  -0.98603024E-13
    58   0.81000000E+00  0.84783285E+01  0.90526344E-06                  -0.25693860E-13
Projected Gradient Method – Examples
Example (continued)
Some iterates (red) and converged solution (blue): [figure: three plots of state x1(t), control u(t), and adjoint lambda1(t) over time t ∈ [0, 1]]
Gradient Method for the Discretized Problem
Instead of applying the “function space” gradient method to OCP, we could firstdiscretize OCP and apply the standard gradient method to the discretized problem.
Problem (Discretized optimal control problem (D-OCP))
Given: x ∈ R^{nx}, the grid
    GN := { ti | ti = t0 + ih, i = 0, 1, …, N },   h = (tf − t0)/N,   N ∈ N,
and continuously differentiable functions
    ϕ : R^{nx} −→ R,   f0 : R^{nx} × R^{nu} −→ R,   f : R^{nx} × R^{nu} −→ R^{nx}.

Minimize
    ΓN(x, u) := ϕ(xN) + h Σ_{i=0}^{N−1} f0(xi, ui)
w.r.t. x = (x0, x1, …, xN)> ∈ R^{(N+1)nx}, u = (u0, …, uN−1)> ∈ R^{Nnu} subject to the constraints
    (xi+1 − xi)/h = f(xi, ui),   i = 0, …, N − 1,
    x0 = x.
Gradient Method for the Discretized Problem
Denote by
    S : R^{Nnu} −→ R^{(N+1)nx},   u 7→ x = S(u),
the solution operator that maps the control input u to the solution x of the discrete dynamics.

Problem (Reduced Discretized Problem (RD-OCP))
    Minimize JN(u) := ΓN(S(u), u) w.r.t. u ∈ R^{Nnu}.
Gradient Method for the Discretized Problem
Auxiliary functional (u and x = S(u) given; λ ∈ R^{(N+1)nx} to be defined later):

    ĴN(u) := ϕ(xN) + h Σ_{i=0}^{N−1} ( H(xi, ui, λi+1) − λi+1>·( (xi+1 − xi)/h ) )

Derivative (S′i(u) denotes the sensitivity matrix ∂xi/∂u; exploit S′0(u) = 0):

    Ĵ′N(u) = ϕ′(xN)S′N(u)
             + h Σ_{i=0}^{N−1} ( H′x[ti]S′i(u) + H′u[ti]·∂ui/∂u − λi+1>·( (S′i+1(u) − S′i(u))/h ) )

           = ( ϕ′(xN) − λN> ) S′N(u) + ( hH′x[t0] + λ1> ) S′0(u)
             + h Σ_{i=1}^{N−1} ( H′x[ti] + (1/h)( λi+1> − λi> ) ) S′i(u)
             + h Σ_{i=0}^{N−1} H′u[ti]·∂ui/∂u

Avoid the calculation of the sensitivities S′i! Choose λ such that the terms involving S′i(u) vanish.
Gradient Method for the Discretized Problem
Discrete adjoint ODE:

    (λi+1 − λi)/h = −H′x(xi, ui, λi+1)>,   i = 0, …, N − 1,
    λN = ϕ′(xN)>

Gradient of the auxiliary functional:

    Ĵ′N(u) = h Σ_{i=0}^{N−1} H′u[ti]·∂ui/∂u = h ( H′u[t0]  H′u[t1]  · · ·  H′u[tN−1] ).
Gradient Method for the Discretized Problem
Link to the gradient of the reduced objective functional JN:

Theorem
Let u ∈ R^{Nnu} be given and let λ ∈ R^{(N+1)nx} satisfy the discrete adjoint ODE. Then,
    ∇JN(u) = Ĵ′N(u)>.

Consequence:
- The gradient method for RD-OCP uses in iteration k the search direction
    d(k) = −∇JN(u(k)) = −h ( H′u[t0]>, …, H′u[tN−1]> )>.
- This is the same search direction as in the “function space” gradient method, except that it is scaled by h. Slower convergence is to be expected!
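The theorem can be checked numerically: the backward sweep below computes ∇JN(u) for an illustrative scalar problem (f, f0, ϕ are made-up test functions, not from the slides), and a finite-difference test confirms that it is the exact gradient of the discrete objective:

```python
import numpy as np

# Illustrative scalar data:
#   f(x,u) = -x**3 + u,  f0(x,u) = (x**2 + u**2)/2,  phi(x) = x**2,
# with H(x,u,lam) = f0 + lam*f.  The i-th gradient component of the discrete
# objective is h * H'_u(x_i, u_i, lam_{i+1}).
N, h, x0 = 50, 0.02, 1.0

def discrete_J(u):
    """Discrete objective phi(x_N) + h * sum f0(x_i, u_i) (explicit Euler)."""
    x, J = x0, 0.0
    for i in range(N):
        J += h * 0.5 * (x**2 + u[i]**2)
        x = x + h * (-x**3 + u[i])
    return J + x**2                       # + phi(x_N)

def discrete_gradient(u):
    # forward sweep: store the state trajectory
    x = np.empty(N + 1); x[0] = x0
    for i in range(N):
        x[i + 1] = x[i] + h * (-x[i]**3 + u[i])
    # backward sweep: lam_i = lam_{i+1} + h * H'_x(x_i, u_i, lam_{i+1})
    lam = 2.0 * x[N]                      # lam_N = phi'(x_N)
    grad = np.empty(N)
    for i in range(N - 1, -1, -1):
        grad[i] = h * (u[i] + lam)        # h * H'_u = h * (f0_u + lam * f_u)
        lam = lam + h * (x[i] + lam * (-3.0 * x[i]**2))
    return grad

u = 0.1 * np.ones(N)
g = discrete_gradient(u)
```

The backward sweep costs one pass over the grid, whereas forming all sensitivities S′i would cost one forward solve per control component.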
Gradient Method for the Discretized Problem
Gradient method for RD-OCP
(0) Choose u(0) ∈ R^{Nnu}, β ∈ (0, 1), σ ∈ (0, 1), and set k := 0.
(1) Solve
    (xi+1 − xi)/h = f(xi, u(k)i)   (i = 0, …, N − 1),   x0 = x,
    (λi+1 − λi)/h = −H′x(xi, u(k)i, λi+1)>   (i = 0, …, N − 1),   λN = ϕ′(xN)>.
(2) Set
    d(k) := −∇JN(u(k)) = −h ( H′u[t0]>, …, H′u[tN−1]> )>.
(3) If ‖d(k)‖₂ ≈ 0, STOP.
(4) Perform an Armijo line search: find the smallest j ∈ {0, 1, 2, …} with
    JN(u(k) + β^j d(k)) ≤ JN(u(k)) − σβ^j ‖d(k)‖₂²
    and set αk := β^j.
(5) Set u(k+1) := u(k) + αk d(k), k := k + 1, and go to (1).
Gradient Method for the Discretized Problem – Example
Example (compare the “function space” equivalent example)
Minimize x2(1) subject to the constraints
    x1′(t) = −x1(t) + √3 u(t),   x1(0) = 2,
    x2′(t) = (1/2)( x1(t)² + u(t)² ),   x2(0) = 0.

Output of the gradient method for RD-OCP (u(0) = 0, N = 100, β = 0.9, σ = 0.1):

    k     αk              J(u(k))         ‖H′u‖∞          ‖H′u‖₂²
    0     0.00000000E+00  0.87037219E+00  0.14877677E-01  0.65717185E-02
    1     0.10000000E+01  0.86385155E+00  0.14671320E-01  0.63689360E-02
    ...
    813   0.10000000E+01  0.65728265E+00  0.10356781E-05  0.76292003E-11
    814   0.10000000E+01  0.65728265E+00  0.10655524E-05  0.77345706E-11
    815   0.10000000E+01  0.65728265E+00  0.98588398E-06  0.74390524E-11

Observation: Since the search direction in RD-OCP is scaled by h, we need many more iterations than with the “function space” gradient method, which required only 26 iterations at the same accuracy.
Lagrange-Newton Method
Pro's:
- locally quadratic convergence rate (fast convergence)
- can handle nonlinear equality constraints
- global convergence achieved by Armijo line search

Con's:
- requires second derivatives
- higher implementation effort compared to the gradient method
Lagrange-Newton Method in Finite Dimensions
Consider the equality constrained nonlinear optimization problem:

Equality constrained optimization problem (E-NLP)
    Minimize J(x, u) subject to H(x, u) = 0,
    J : R^{nx} × R^{nu} −→ R, H : R^{nx} × R^{nu} −→ R^{nH} twice continuously differentiable.

Lagrange function:
    L(x, u, λ) := J(x, u) + λ>H(x, u)

Theorem (KKT conditions)
Let (x, u) be a local minimum of E-NLP and let H′(x, u) have full rank. Then there exists a multiplier λ ∈ R^{nH} such that
    ∇(x,u)L(x, u, λ) = 0.
Lagrange-Newton Method in Finite Dimensions
Idea of Lagrange-Newton method: Apply Newton’s method to the optimality system T(x, u, λ) = 0 with

T(x, u, λ) := [ ∇(x,u)L(x, u, λ) ]  =  [ ∇xL(x, u, λ) ]
              [ H(x, u)          ]     [ ∇uL(x, u, λ) ]
                                       [ H(x, u)      ]

Newton system:

T′(x(k), u(k), λ(k)) d(k) = −T(x(k), u(k), λ(k))

with

T′(x, u, λ) = [ L′′xx(x, u, λ)   L′′xu(x, u, λ)   H′x(x, u)⊤ ]
              [ L′′ux(x, u, λ)   L′′uu(x, u, λ)   H′u(x, u)⊤ ]
              [ H′x(x, u)        H′u(x, u)        0          ]
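The finite-dimensional scheme fits in a few lines of NumPy. A minimal sketch (the toy problem, minimizing x² + u² subject to xu = 1, and the starting point are illustrative assumptions, not from the slides; its stationary point (1, 1) carries the multiplier λ = −2):

```python
import numpy as np

# Toy E-NLP (illustrative): minimize J(x,u) = x^2 + u^2  s.t.  H(x,u) = x*u - 1 = 0
# with Lagrangian L = x^2 + u^2 + lam*(x*u - 1).

def T(z):                                  # optimality system T(x,u,lam) = 0
    x, u, lam = z
    return np.array([2*x + lam*u,          # grad_x L
                     2*u + lam*x,          # grad_u L
                     x*u - 1.0])           # H

def Tprime(z):                             # KKT matrix T'(x,u,lam)
    x, u, lam = z
    return np.array([[2.0, lam, u],        # [L''xx  L''xu  H'x^T]
                     [lam, 2.0, x],        # [L''ux  L''uu  H'u^T]
                     [u,   x,   0.0]])     # [H'x    H'u    0   ]

z = np.array([1.2, 0.8, -1.5])             # initial guess near (1, 1, -2)
for k in range(20):
    if np.linalg.norm(T(z)) < 1e-12:
        break
    d = np.linalg.solve(Tprime(z), -T(z))  # Newton system T'(z) d = -T(z)
    z = z + d
```

Started close to the solution, the iterates converge quadratically to (x, u, λ) = (1, 1, −2).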
Lagrange-Newton Method in Finite Dimensions
Theorem (Nonsingularity)
T′(x, u, λ) is nonsingular, if the following conditions hold:
- L′′(x,u),(x,u)(x, u, λ) is positive definite on the nullspace of H′(x, u), that is,

  v⊤L′′(x,u),(x,u)(x, u, λ)v > 0 for all v with H′(x, u)v = 0.

- H′(x, u) has full rank.

Note: These conditions are actually sufficient, if (x, u, λ) is a stationary point of L.
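Both conditions are easy to check numerically. A minimal NumPy sketch (the problem min x² + u² s.t. xu = 1 and its stationary point (1, 1, −2) are illustrative assumptions): a nullspace basis Z of H′ is read off an SVD, and positive definiteness is tested on the projected Hessian Z⊤L′′Z.

```python
import numpy as np

# Stationary point (x,u,lam) = (1,1,-2) of: minimize x^2 + u^2 s.t. x*u = 1.
x, u, lam = 1.0, 1.0, -2.0

Lzz = np.array([[2.0, lam],                # L''_(x,u),(x,u) = [[2, lam], [lam, 2]]
                [lam, 2.0]])
Hz = np.array([[u, x]])                    # H'(x,u) = (u, x)

full_rank = np.linalg.matrix_rank(Hz) == 1

_, _, Vt = np.linalg.svd(Hz)               # nullspace basis from the SVD
Z = Vt[1:].T                               # here: span{(1,-1)/sqrt(2)}
reduced_hessian = Z.T @ Lzz @ Z            # Z^T L'' Z, must be positive definite
pos_def = bool(np.all(np.linalg.eigvalsh(reduced_hessian) > 0))

# Consequence of the theorem: the KKT matrix T'(x,u,lam) is nonsingular.
kkt = np.block([[Lzz, Hz.T], [Hz, np.zeros((1, 1))]])
nonsingular = abs(np.linalg.det(kkt)) > 1e-12
```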
Lagrange-Newton Method in Infinite Dimensions
Consider the equality constrained nonlinear optimization problem:

Equality constrained optimization problem (E-NLP)

Minimize J(x, u) subject to H(x, u) = 0

J : X × U −→ R and H : X × U −→ Λ twice continuously Fréchet differentiable; X, U, Λ Banach spaces

Lagrange function: L : X × U × Λ∗ −→ R with

L(x, u, λ∗) := J(x, u) + λ∗(H(x, u))

Theorem (KKT conditions)
Let (x, u) be a local minimum of E-NLP and let H′(x, u) be surjective. Then there exists a multiplier λ∗ ∈ Λ∗ such that

L′(x,u)(x, u, λ∗) = 0.
Lagrange-Newton Method in Infinite Dimensions
Idea of Lagrange-Newton method:
Use Newton’s method to find a zero z = (x, u, λ∗) ∈ Z := X × U × Λ∗ of the operator T : Z −→ Y defined by

T(x, u, λ∗) := [ L′x(x, u, λ∗) ]
               [ L′u(x, u, λ∗) ]
               [ H(x, u)       ]

Observe:

L′x(x, u, λ∗)x = J′x(x, u)x + λ∗(H′x(x, u)x) = J′x(x, u)x + (H′x(x, u)∗λ∗)x
L′u(x, u, λ∗)u = J′u(x, u)u + λ∗(H′u(x, u)u) = J′u(x, u)u + (H′u(x, u)∗λ∗)u

where

H′x(x, u)∗ : Λ∗ −→ X∗ and H′u(x, u)∗ : Λ∗ −→ U∗

denote the respective adjoint operators.
Local Lagrange-Newton Method
Local Lagrange-Newton Method
(0) Choose z(0) ∈ Z and set k := 0.
(1) If ‖T(z(k))‖Y ≈ 0, STOP.
(2) Compute the search direction d(k) from

    T′(z(k))(d(k)) = −T(z(k)).

(3) Set z(k+1) := z(k) + d(k), k := k + 1, and go to (1).
Lagrange-Newton Method in Infinite Dimensions
Theorem (Nonsingularity)
T′(x, u, λ∗) is nonsingular, if the following conditions hold:
- L′′(x,u),(x,u)(x, u, λ∗) is uniformly positive definite on the nullspace of H′(x, u), i.e. there exists C > 0 such that

  v⊤L′′(x,u),(x,u)(x, u, λ∗)v ≥ C‖v‖²_{X×U} for all v ∈ X × U with H′(x, u)v = 0.

- H′(x, u) is surjective.

Note: These conditions are actually sufficient, if (x, u, λ∗) is a stationary point of L, see [1, Theorem 5.6], [2, Theorem 2.3].
[1] H. Maurer and J. Zowe. First and Second-Order Necessary and Sufficient Optimality Conditions for Infinite-Dimensional Programming Problems. Mathematical Programming, 16:98–110, 1979.
[2] H. Maurer. First and Second Order Sufficient Optimality Conditions in Mathematical Programming and Optimal Control. Mathematical Programming Study, 14:163–177, 1981.
Local Lagrange-Newton Method
Theorem (local convergence)
Let z∗ be a zero of T. Suppose there exist constants ∆ > 0 and C > 0 such that for every z ∈ B∆(z∗) the derivative T′(z) is non-singular and

‖T′(z)−1‖L(Y,Z) ≤ C.

(a) If ϕ, f0, f, ψ are twice continuously differentiable, then there exists δ > 0 such that the local Lagrange-Newton method is well-defined for every z(0) ∈ Bδ(z∗) and the sequence {z(k)}k∈N converges superlinearly to z∗ for every z(0) ∈ Bδ(z∗).

(b) If the second derivatives of ϕ, f0, f, ψ are locally Lipschitz continuous, then the convergence in (a) is quadratic.

(c) If, in addition to the assumptions in (a), T(z(k)) ≠ 0 for all k, then the residual values converge superlinearly:

lim(k→∞) ‖T(z(k+1))‖Y / ‖T(z(k))‖Y = 0.
Global Lagrange-Newton Method
Merit function for globalization:

γ(z) := ½ ‖T(z)‖₂²

Globalized Lagrange-Newton Method

(0) Choose z(0) ∈ Z, β ∈ (0, 1), σ ∈ (0, 1/2), and set k := 0.
(1) If γ(z(k)) ≈ 0, STOP.
(2) Compute the search direction d(k) from T′(z(k))(d(k)) = −T(z(k)).
(3) Find the smallest j ∈ {0, 1, 2, . . .} with

    γ(z(k) + β^j d(k)) ≤ γ(z(k)) + σβ^j γ′(z(k))(d(k))

    and set αk := β^j.
(4) Set z(k+1) := z(k) + αk d(k), k := k + 1, and go to (1).

Note: γ : Z −→ R is Fréchet-differentiable with

γ′(z(k))(d(k)) = −2γ(z(k)) = −‖T(z(k))‖₂²
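A minimal NumPy sketch of the globalized iteration (the test problem min x² + u² s.t. xu = 1, the starting point, and the parameters β = 0.9, σ = 0.1 are illustrative assumptions, not taken from the slides):

```python
import numpy as np

# Globalized Lagrange-Newton on the illustrative problem
# minimize x^2 + u^2 s.t. x*u - 1 = 0, with merit gamma(z) = 0.5*||T(z)||_2^2.

def T(z):
    x, u, lam = z
    return np.array([2*x + lam*u, 2*u + lam*x, x*u - 1.0])

def Tprime(z):
    x, u, lam = z
    return np.array([[2.0, lam, u], [lam, 2.0, x], [u, x, 0.0]])

def gamma(z):
    return 0.5 * np.dot(T(z), T(z))

beta, sigma = 0.9, 0.1
z = np.array([2.0, 0.5, -1.0])
for k in range(200):
    if gamma(z) < 1e-16:
        break
    d = np.linalg.solve(Tprime(z), -T(z))
    # Armijo linesearch on the merit function;
    # for the Newton direction gamma'(z)(d) = -2*gamma(z)
    j = 0
    while gamma(z + beta**j * d) > gamma(z) - sigma * beta**j * 2.0 * gamma(z) and j < 50:
        j += 1
    z = z + beta**j * d
```

Far from the solution the linesearch damps the step (αk < 1); near the solution full steps αk = 1 are accepted and the fast local rate is recovered.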
Lagrange-Newton Method – Application to Optimal Control
Problem (Optimal control problem (OCP))
Given: I := [t0, tf], twice continuously differentiable functions ϕ : Rnx × Rnx −→ R, f0 : Rnx × Rnu −→ R, f : Rnx × Rnu −→ Rnx, ψ : Rnx × Rnx −→ Rnψ.

Minimize

J(x, u) := ϕ(x(t0), x(tf)) + ∫_I f0(x(t), u(t)) dt

with respect to x ∈ X := W1,∞(I, Rnx) and u ∈ U := L∞(I, Rnu) subject to the constraints

x′(t) = f(x(t), u(t)) a.e. in I,
0 = ψ(x(t0), x(tf)).

Remark:
- A partially reduced approach is possible, where x = x(·; u, x0) is expressed as a function of the initial value x0 and the control u. The constraint ψ(x0, x(tf; u, x0)) = 0 remains. This is the function space equivalent of the direct shooting method.
Lagrange-Newton Method – Application to Optimal Control
Hamilton function: H(x, u, λ) := f0(x, u) + λ⊤f(x, u)

Define κ := ϕ + σ⊤ψ.

Theorem (Minimum principle, KKT conditions)
Let (x∗, u∗) be a local minimizer of OCP and let H′(x∗, u∗) be surjective. Then there exist multipliers λ∗ ∈ W1,∞(I, Rnx) and σ∗ ∈ Rnψ with

x′∗(t) − f(x∗(t), u∗(t)) = 0
λ′∗(t) + H′x(x∗(t), u∗(t), λ∗(t))⊤ = 0
ψ(x∗(t0), x∗(tf)) = 0
λ∗(t0) + κ′x0(x∗(t0), x∗(tf), σ∗)⊤ = 0
λ∗(tf) − κ′xf(x∗(t0), x∗(tf), σ∗)⊤ = 0
H′u(x∗(t), u∗(t), λ∗(t))⊤ = 0

Root finding problem: T(z∗) = 0, T : Z −→ Y
Lagrange-Newton Method – Application to Optimal Control
Definition of operator T:

T(z)(·) := [ x′(·) − f(x(·), u(·))              ]
           [ λ′(·) + H′x(x(·), u(·), λ(·))⊤    ]
           [ ψ(x(t0), x(tf))                   ]
           [ λ(t0) + κ′x0(x(t0), x(tf), σ)⊤    ]
           [ λ(tf) − κ′xf(x(t0), x(tf), σ)⊤    ]
           [ H′u(x(·), u(·), λ(·))⊤            ]

with

z := (x, u, λ, σ),
Z := X × U × X × Rnψ,
Y := L∞(I, Rnx) × L∞(I, Rnx) × Rnψ × Rnx × Rnx × L∞(I, Rnu).
Lagrange-Newton Method – Computation of Search Direction
Newton direction:

T′(z(k))d = −T(z(k)), d = (x, u, λ, σ) ∈ Z

Fréchet derivative (evaluated at z(k)):

T′(z(k))(z) = [ x′ − f′x x − f′u u                                      ]
              [ λ′ + H′′xx x + H′′xu u + H′′xλ λ                        ]
              [ ψ′x0 x(t0) + ψ′xf x(tf)                                 ]
              [ λ(t0) + κ′′x0x0 x(t0) + κ′′x0xf x(tf) + κ′′x0σ σ        ]
              [ λ(tf) − κ′′xfx0 x(t0) − κ′′xfxf x(tf) − κ′′xfσ σ        ]
              [ H′′ux x + H′′uu u + H′′uλ λ                             ]
Lagrange-Newton Method – Computation of Search Direction
The Newton direction is equivalent to the linear DAE boundary value problem

[ x′ ]   [ f′x      0       f′u    ] [ x ]     [ (x(k))′ − f       ]
[ λ′ ] − [ −H′′xx   −H′′xλ  −H′′xu ] [ λ ] = − [ (λ(k))′ + (H′x)⊤  ]
[ 0  ]   [ −H′′ux   −H′′uλ  −H′′uu ] [ u ]     [ (H′u)⊤            ]

with boundary conditions

[ ψ′x0       0    0       ] [ x(t0) ]   [ ψ′xf       0    0 ] [ x(tf) ]     [ ψ                   ]
[ κ′′x0x0    Id   κ′′x0σ  ] [ λ(t0) ] + [ κ′′x0xf    0    0 ] [ λ(tf) ] = − [ λ(k)(t0) + (κ′x0)⊤  ]
[ −κ′′xfx0   0    −κ′′xfσ ] [ σ     ]   [ −κ′′xfxf   Id   0 ] [ σ     ]     [ λ(k)(tf) − (κ′xf)⊤  ]

Hence: in each Newton iteration, we need to solve the above linear BVP.
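For a concrete linear-quadratic problem this BVP can be written down and solved directly with finite differences. A minimal sketch (the scalar problem minimize ½∫₀^tf x(t)² + u(t)² dt subject to x′ = u, x(0) = 1 is an illustrative assumption, not from the slides): here H = ½(x² + u²) + λu, so H′u = 0 yields u = −λ, and the Newton BVP reduces to x′ = −λ, λ′ = −x with x(0) = 1, λ(tf) = 0, whose exact solution is x(t) = cosh(tf − t)/cosh(tf).

```python
import numpy as np

# Two-point BVP from the minimum principle of the illustrative LQ problem:
#   x' = -lam,  lam' = -x,  x(0) = 1,  lam(tf) = 0,
# discretized with the trapezoidal rule on a uniform grid.
N, tf = 200, 1.0
h = tf / N
n = N + 1                                  # grid points t_i = i*h
A = np.zeros((2 * n, 2 * n))               # unknowns w = (x_0..x_N, lam_0..lam_N)
b = np.zeros(2 * n)
for i in range(N):
    # x_{i+1} - x_i + (h/2)*(lam_i + lam_{i+1}) = 0
    A[i, i + 1], A[i, i] = 1.0, -1.0
    A[i, n + i], A[i, n + i + 1] = h / 2, h / 2
    # lam_{i+1} - lam_i + (h/2)*(x_i + x_{i+1}) = 0
    A[N + i, n + i + 1], A[N + i, n + i] = 1.0, -1.0
    A[N + i, i], A[N + i, i + 1] = h / 2, h / 2
A[2 * N, 0] = 1.0; b[2 * N] = 1.0          # boundary condition x(0) = 1
A[2 * N + 1, 2 * n - 1] = 1.0              # boundary condition lam(tf) = 0
w = np.linalg.solve(A, b)

t = np.linspace(0.0, tf, n)
err = np.max(np.abs(w[:n] - np.cosh(tf - t) / np.cosh(tf)))
```

The trapezoidal scheme is second order, so the error shrinks like h²; for a nonlinear OCP this linear solve is repeated in every Newton iteration.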
Lagrange-Newton Method – Computation of Search Direction
Theorem
The differential-algebraic equation (DAE) in the BVP has index one (i.e., the last equality can be solved for u), if the matrix function

M(t) := H′′uu[t]

is non-singular for almost every t ∈ I and ‖M(t)−1‖ ≤ C for some constant C and almost every t ∈ I.

If M(·) is singular:
- the BVP contains a differential-algebraic equation of higher index, which is numerically unstable;
- boundary conditions may become infeasible.
Lagrange-Newton Method – Examples
Example (Trolley)
(Figure: trolley with suspended load in the x-z plane; labels: cart position x1, angle x2, control force u, load position (x1, z), load weight −m2 g, rod length ℓ.)
Lagrange-Newton Method – Examples
Example (Trolley, continued)
Dynamics:

x′1 = x3
x′2 = x4
x′3 = ( m2²ℓ³ sin(x2) x4² − m2ℓ² u + m2 Iy ℓ x4² sin(x2) − Iy u + m2²ℓ² g cos(x2) sin(x2) ) / ( −m1m2ℓ² − m1Iy − m2²ℓ² − m2Iy + m2²ℓ² cos(x2)² )
x′4 = m2ℓ ( m2ℓ cos(x2) x4² sin(x2) − cos(x2) u + g sin(x2)(m1 + m2) ) / ( −m1m2ℓ² − m1Iy − m2²ℓ² − m2Iy + m2²ℓ² cos(x2)² )

Parameters:

g = 9.81, m1 = 0.3, m2 = 0.5, ℓ = 0.75, r = 0.1, Iy = 0.002.
Lagrange-Newton Method – Examples
Example (Trolley, continued)
Minimize

½ ∫₀^tf u(t)² + 5 x4(t)² dt

subject to the ODE, the initial conditions

x1(0) = x2(0) = x3(0) = x4(0) = 0,

and the terminal conditions

x1(tf) = 1, x2(tf) = x3(tf) = x4(tf) = 0

within the fixed time tf = 2.7.
Lagrange-Newton Method – Examples
Example (Trolley, continued)Output of Lagrange-Newton method: (N = 1000, Euler)
k αk ‖T(z(k))‖₂² ‖d(k)‖∞
0 0.000000E+00 0.100000E+01 0.451981E+01
1 0.100000E+01 0.688773E-03 0.473501E-02
2 0.100000E+01 0.809983E-12 0.118366E-06
3 0.100000E+01 0.160897E-24 0.141058E-11
Iterations for different step sizes N: mesh independence; CPU time grows roughly linearly in N
N CPU time [s] Iterations
100 0.022 3
200 0.050 3
400 0.093 3
800 0.174 3
1600 0.622 3
3200 0.822 3
6400 1.900 4
12800 3.771 4
25600 7.939 4
Lagrange-Newton Method – Examples
Example (Trolley, continued)
(Figures: states x1(t)–x4(t), adjoints λ1(t)–λ4(t), and control u(t), each plotted over time t [s].)
Lagrange-Newton Method – Example from Chemical Engineering

DAE index-1 optimal control problem: (chemical reaction of substances A, B, C, and D)

Minimize

−MC(tf) + 10⁻² ∫₀^tf FB(t)² + Q(t)² dt

w.r.t. the controls FB (feed rate of substance B) and cooling power Q subject to the index-1 DAE

M′A = −V · A1 · e^(−E1/TR) · CA · CB
M′B = FB − V ( A1 e^(−E1/TR) · CA · CB + A2 · e^(−E2/TR) · CB · CC )
M′C = V ( A1 e^(−E1/TR) · CA · CB − A2 · e^(−E2/TR) · CB · CC )
M′D = V · A2 · e^(−E2/TR) · CB · CC
H′ = 20 FB − Q − V ( −A1 e^(−E1/TR) · CA · CB − 75 A2 · e^(−E2/TR) · CB · CC )
0 = H − Σ_{i=A,B,C,D} Mi ( αi (TR − Tref) + (βi/2)(TR² − Tref²) )

where

V = Σ_{i=A,B,C,D} Mi/ρi, Ci = Mi/V, i = A, B, C, D.

Source: V. C. Vassiliades, R. W. H. Sargent, and C. C. Pantelides. Solution of a class of multistage dynamic optimization problems. 2. Problems with path constraints. Industrial & Engineering Chemistry Research, 33:2123–2133, 1994.
Lagrange-Newton Method – Example from Chemical Engineering
Lagrange-Newton method:(N = 20000 intervals)
k αk ‖T(z(k))‖₂² ‖d(k)‖∞
0 0.000000E+00 0.465186E+12 0.599335E+05
1 0.100000E+01 0.759821E+10 0.523150E+07
2 0.147809E-01 0.755228E+10 0.127262E+07
3 0.423912E-01 0.745716E+10 0.835212E+05
4 0.100000E+01 0.351002E+09 0.344908E+06
5 0.121577E+00 0.340325E+09 0.667305E+05
6 0.100000E+01 0.131555E+08 0.370395E+05
7 0.100000E+01 0.295389E+07 0.169245E+04
8 0.100000E+01 0.114958E+01 0.336606E+00
9 0.100000E+01 0.852576E-11 0.104227E-02
10 0.100000E+01 0.125527E-11 0.658317E-03
11 0.100000E+01 0.283055E-14 0.605453E-05
(Figures: controls FB(t) and Q(t) over time t [s].)
Lagrange-Newton Method – Example from Chemical Engineering
(Figures: states MA(t), MB(t), MC(t), MD(t), H(t) and algebraic variable TR(t) over time t [s].)
Lagrange-Newton Method – Example from Chemical Engineering
(Figures: adjoints λ1(t)–λ5(t) and λg(t) over time t [s].)
Lagrange-Newton Method – Navier-Stokes Example
Optimal control problem:
Minimize

½ ∫_Q ‖y(t, x) − yd(t, x)‖² dx dt + (δ/2) ∫_Q ‖u(t, x)‖² dx dt

w.r.t. velocity y, pressure p, and control u subject to the 2D Navier-Stokes equations

yt = (1/Re) ∆y − (y · ∇)y − ∇p + u in Q := (0, tf) × Ω,
0 = div(y) in Q,
0 = y(0, x) for x ∈ Ω := (0, 1) × (0, 1),
0 = y(t, x) for (t, x) ∈ (0, tf) × ∂Ω.

Given: desired velocity field

yd(t, x1, x2) = ( −q(t, x1) q′x2(t, x2), q(t, x2) q′x1(t, x1) )⊤, q(t, z) = (1 − z)²(1 − cos(2πzt))

M. Gerdts and M. Kunkel. A globally convergent semi-smooth Newton method for control-state constrained DAE optimal control problems. Computational Optimization and Applications, 48(3):601–633, 2011.
Lagrange-Newton Method – Navier-Stokes Example
Discretization by method of lines: (details omitted)

Minimize

½ ∫₀^tf ‖yh(t) − yd,h(t)‖² dt + (δ/2) ∫₀^tf ‖uh(t)‖² dt

subject to the index-2 DAE

y′h(t) = (1/Re) Ah yh(t) − ½ ( yh(t)⊤Hh,1 yh(t), . . . , yh(t)⊤Hh,2(N−1)² yh(t) )⊤ − Bh ph(t) + uh(t),
0 = B⊤h yh(t),
yh(0) = 0.
Lagrange-Newton Method – Navier-Stokes Example
Pressure p at t = 0.6, t = 1.0, t = 1.4 and t = 1.967:
(Parameters: tf = 2, δ = 10⁻⁵, Re = 1, N = 31, Nt = 60, nx = 2(N − 1)² = 1800, ny = (N − 1)² = 900, nu = 1800 controls)
Lagrange-Newton Method – Navier-Stokes Example
Desired flow (left), controlled flow (middle), control (right) at t = 0.6, t = 1.0, t = 1.4 and t = 1.967:
Lagrange-Newton Method – Navier-Stokes Example
Output of Lagrange-Newton method:
Solve Stokes problem as initial guess:
k ∫₀¹ f0[t] dt αk−1 ‖T(z(k))‖2
0 1.763432802e+03 5.938741958e+01
1 3.986109778e+02 1.0000000000000 7.255927209e-10
Solve Navier-Stokes problem:
k ∫₀¹ f0[t] dt αk−1 ‖T(z(k))‖2
0 3.986109778e+02 1.879632471e+04
1 3.988553051e+02 1.0000000000000 1.183777963e+01
2 3.988549264e+02 1.0000000000000 4.521653291e-04
3 3.988549264e+02 1.0000000000000 9.576226032e-10
Lagrange-Newton Method – Extensions

Treatment of inequality constraints:
- sequential quadratic programming (SQP)
  Idea: solve linear-quadratic optimization problems to obtain a search direction
- interior-point methods (IP)
  Idea: solve a sequence of barrier problems (equivalent: perturbation of complementarity conditions in the KKT conditions)
- semismooth Newton methods
  Idea: transform complementarity conditions into an equivalent (nonsmooth!) equation
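The semismooth Newton idea can be illustrated with the standard Fischer-Burmeister reformulation: φ(a, b) = √(a² + b²) − a − b vanishes exactly when a ≥ 0, b ≥ 0 and ab = 0. A minimal sketch (the toy bound-constrained problem min ½(x − 2)² s.t. x ≥ 0 and the starting point are illustrative assumptions, not from the slides):

```python
import numpy as np

def fb(a, b):
    # Fischer-Burmeister function: zero iff a >= 0, b >= 0 and a*b = 0
    return np.sqrt(a * a + b * b) - a - b

# KKT system of: minimize 0.5*(x - 2)^2 s.t. x >= 0, with multiplier mu:
#   x - 2 - mu = 0,  mu >= 0, x >= 0, mu*x = 0  (complementarity via fb)
def F(z):
    x, mu = z
    return np.array([x - 2.0 - mu, fb(x, mu)])

def Fprime(z):                             # Jacobian of F away from (0, 0)
    x, mu = z
    r = np.hypot(x, mu)
    return np.array([[1.0, -1.0],
                     [x / r - 1.0, mu / r - 1.0]])

z = np.array([1.0, 1.0])                   # semismooth Newton iteration
for k in range(50):
    if np.linalg.norm(F(z)) < 1e-12:
        break
    z = z + np.linalg.solve(Fprime(z), -F(z))
```

The iterates converge to the KKT point (x, μ) = (2, 0), where the inequality constraint is inactive; no active-set guessing is needed.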
References

[1] W. Alt. The Lagrange-Newton method for infinite dimensional optimization problems. Numerical Functional Analysis and Optimization, 11:201–224, 1990.
[2] W. Alt. Sequential Quadratic Programming in Banach Spaces. In W. Oettli and D. Pallaschke, editors, Advances in Optimization, pages 281–301, Berlin, 1991. Springer.
[3] W. Alt and K. Malanowski. The Lagrange-Newton method for nonlinear optimal control problems. Computational Optimization and Applications, 2:77–100, 1993.
[4] W. Alt and K. Malanowski. The Lagrange-Newton method for state constrained optimal control problems. Computational Optimization and Applications, 4:217–239, 1995.
[5] M. Gerdts. Optimal Control of ODEs and DAEs. Walter de Gruyter, Berlin/Boston, 2012.
[6] M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. Optimization with PDE Constraints. Mathematical Modelling: Theory and Applications 23. Springer, Dordrecht, 2009.
[7] K. Ito and K. Kunisch. Lagrange Multiplier Approach to Variational Problems and Applications. Advances in Design and Control 15. SIAM, Philadelphia, PA, 2008.
[8] K. C. P. Machielsen. Numerical Solution of Optimal Control Problems with State Constraints by Sequential Quadratic Programming in Function Space. Volume 53 of CWI Tract, Centrum voor Wiskunde en Informatica, Amsterdam, 1988.
[9] F. Tröltzsch. Optimale Steuerung partieller Differentialgleichungen. Vieweg, Wiesbaden, 2005.
Resources

Software: (available for registered users; free for academic use)
- OCPID-DAE1 (optimal control and parameter identification with differential-algebraic equations of index 1): http://www.optimal-control.de
- sqpfiltertoolbox: SQP method for dense NLPs; http://www.optimal-control.de
- WORHP (Büskens/Gerdts): SQP method for sparse large-scale NLPs; http://www.worhp.de
- QPSOL: interior-point and nonsmooth Newton methods for sparse large-scale convex quadratic programs; available upon request

Robotics lab at UniBw M: research stays with use of lab equipment upon request
- KUKA youBot robot (2-arm robot on a platform); 3 scale RC cars; LEGO Mindstorms robots; quarter-car test bench; quadcopter
More Resources

Further optimal control software:
- CasADi, ACADO: M. Diehl et al.; http://casadi.org; http://sourceforge.net/p/acado/
- NUDOCCCS: C. Büskens, University of Bremen
- SOCS: J. Betts, The Boeing Company, Seattle; http://www.boeing.com/boeing/phantom/socs/
- DIRCOL: O. von Stryk, TU Darmstadt; http://www.sim.informatik.tu-darmstadt.de/res/sw/dircol
- MUSCOD-II: H.G. Bock et al., IWR Heidelberg; http://www.iwr.uni-heidelberg.de/~agbock/RESEARCH/muscod.php
- MISER: K.L. Teo et al., Curtin University, Perth; http://school.maths.uwa.edu.au/~les/miser/
- PSOPT: http://www.psopt.org/
- ...

Further optimization software:
- NPSOL (dense problems), SNOPT (sparse large-scale problems): Stanford Business Software; http://www.sbsi-sol-optimize.com
- KNITRO (sparse large-scale problems): Ziena Optimization; http://www.ziena.com/knitro.htm
- IPOPT (sparse large-scale problems): A. Wächter; https://projects.coin-or.org/Ipopt
- filterSQP: R. Fletcher, S. Leyffer; http://www.mcs.anl.gov/~leyffer/solvers.html
- OOQP: M. Gertz, S. Wright; http://pages.cs.wisc.edu/~swright/ooqp/
- qpOASES: H.J. Ferreau, A. Potschka, C. Kirches; http://homes.esat.kuleuven.be/~optec/software/qpOASES/
- ...

Software for boundary value problems:
- BOUNDSCO: H. J. Oberle, University of Hamburg; http://www.math.uni-hamburg.de/home/oberle/software.html
- COLDAE: U. Ascher; www.cs.ubc.ca/~ascher/coldae.f
- ...

Links:
- Decision Tree for Optimization Software; http://plato.la.asu.edu/guide.html
- CUTEr: large collection of optimization test problems; http://www.cuter.rl.ac.uk/
- COPS: large-scale optimization test problems; http://www.mcs.anl.gov/~more/cops/
- MINTOC: test cases for mixed-integer optimal control; http://mintoc.de/
- ...
Announcement: youBot Robotics Hackathon

Description:
- student programming contest
- addresses students and PhD students who would like to realize projects with the KUKA youBot robot
- 12 participants from 5 universities (UniBw M, Bayreuth, TUM, TU Berlin, Maastricht)
Thanks for your Attention!
Questions?
Further information:

[email protected]
www.unibw.de/lrt1/gerdts
www.optimal-control.de