WHAT IS... a Fenchel Conjugate?

Heinz H. Bauschke and Yves Lucet

The ideas of duality and transforms are ubiquitous in mathematics, the most classical example being the Fourier transform in harmonic analysis. Convex analysis, an area founded by W. Fenchel, J.-J. Moreau, and R. T. Rockafellar in the mid-twentieth century, concerns convex sets, convex functions, and their applications to optimization. The counterpart of the Fourier transform in convex analysis is the Fenchel conjugate. Suppose we have a real Hilbert space $X$ and a function $f \colon X \to \left]-\infty,+\infty\right]$. We shall assume that $f$ is proper, that is,

$$\operatorname{dom} f = \{x \in X \mid f(x) \in \mathbb{R}\} \neq \varnothing.$$

Then the Fenchel conjugate $f^*$ at $u \in X$ is

$$f^*(u) = \sup_{x \in X} \bigl(\langle x, u\rangle - f(x)\bigr).$$
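To make the definition concrete, the following minimal Python sketch approximates the supremum by a maximum over a finite grid in the one-dimensional case $X = \mathbb{R}$. The helper name `conjugate_on_grid`, the grid, and the test function $f(x) = \tfrac{1}{2}x^2$ (whose conjugate is known to be $\tfrac{1}{2}u^2$) are choices made for this illustration and do not come from the article.

```python
def conjugate_on_grid(f, grid, u):
    """Approximate f*(u) = sup_x (x*u - f(x)) by a maximum over grid points.

    The result is a lower bound on the true supremum; it is accurate only
    when a (near-)maximizer lies inside the sampled interval.
    """
    return max(x * u - f(x) for x in grid)


def energy(x):
    # f(x) = (1/2) x^2, the one-dimensional energy.
    return 0.5 * x * x


# Illustrative grid on [-10, 10] with step 0.01.
GRID = [i / 100.0 for i in range(-1000, 1001)]

u = 1.5
approx = conjugate_on_grid(energy, GRID, u)
exact = 0.5 * u * u  # known closed form of the conjugate of the energy
```

On this example the two values agree because the maximizer $x = u$ happens to be a grid point; for a general $f$ the grid value only bounds $f^*(u)$ from below.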

An immediate consequence of the definition is the Fenchel-Young inequality

$$f(x) + f^*(u) \geq \langle x, u\rangle.$$

We also note that $f^*$ is convex and lower semicontinuous because it is the supremum of the family of affine continuous functions $(\langle x, \cdot\rangle - f(x))_{x \in X}$. One has the beautiful duality

$$f(x) = f^{**}(x) \;\Leftrightarrow\; f \text{ is convex and lower semicontinuous},$$

which shows that such a function $f$ can be represented as a supremum of affine functions $x \mapsto \langle u, x\rangle - f^*(u)$, where $f^*(u)$ determines the constant term of the affine function with slope $u$.
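This duality can be observed numerically. The sketch below (one-dimensional, grid-based, with hypothetical helper names) computes a discrete biconjugate. For the convex energy it reproduces $f$. For the nonconvex double well $(x^2-1)^2$ it returns, by a standard fact, the largest convex lower semicontinuous minorant, so that $f^{**}(0) = 0 < 1 = f(0)$.

```python
GRID = [i / 20.0 for i in range(-60, 61)]  # illustrative grid on [-3, 3], step 0.05


def biconjugate_on_grid(f, x):
    """Discrete f**(x): conjugate twice, replacing each sup by a max over GRID."""
    conj = [max(t * u - f(t) for t in GRID) for u in GRID]  # f* sampled on GRID
    return max(x * u - c for u, c in zip(GRID, conj))


def energy(x):
    return 0.5 * x * x           # convex and continuous, so f** = f


def double_well(x):
    return (x * x - 1.0) ** 2    # not convex, so f** differs from f


energy_gap = abs(biconjugate_on_grid(energy, 1.5) - energy(1.5))
well_value = double_well(0.0)                         # f(0) = 1
well_biconj = biconjugate_on_grid(double_well, 0.0)   # the convex envelope vanishes on [-1, 1]
```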

Heinz H. Bauschke is professor of mathematics at the University of British Columbia in Kelowna. His email address is [email protected].

Yves Lucet is associate professor of computer science at the University of British Columbia in Kelowna. His email address is [email protected].

DOI: http://dx.doi.org/10.1090/noti788.

Given a subset $C$ of $X$, its indicator function $\iota_C$ is defined by

$$\iota_C(x) = \begin{cases} 0, & \text{if } x \in C; \\ +\infty, & \text{otherwise.} \end{cases}$$

As a first example, we compute that if $f(x) = \langle x, a\rangle$, where $a \in X$, then $f^* = \iota_{\{a\}}$. Thus $+\infty$ is unavoidable and to be embraced in convex analysis. If $f$ is convex and differentiable, then the supremum in the definition of $f^*(u)$ can be found by calculus, and we obtain

$$f^*(\nabla f(x)) = \langle x, \nabla f(x)\rangle - f(x).$$

This formula not only explains why the Fenchel conjugate is also known as the Fenchel-Legendre transform, but it also shows that the energy is self-dual; in fact,

$$f = f^* \;\Leftrightarrow\; f = \tfrac{1}{2}\|\cdot\|^2.$$

Given $\alpha > 0$, one also computes that $\alpha\exp$ and the following (scaled) negative entropy are conjugates of each other:

$$(\alpha\exp)^*(u) = \begin{cases} +\infty, & \text{if } u < 0; \\ 0, & \text{if } u = 0; \\ u\ln(u/\alpha) - u, & \text{if } u > 0. \end{cases}$$
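As a hedged numerical check of this pair, the sketch below compares the closed form with a brute-force maximization of $xu - \alpha e^x$ over a grid, and with the calculus formula $f^*(\nabla f(x)) = \langle x, \nabla f(x)\rangle - f(x)$ evaluated at $x = \ln(u/\alpha)$. The function names, the grid, and the sample values $\alpha = 0.5$, $u = 2$ are assumptions made for illustration.

```python
import math


def conj_alpha_exp(u, alpha):
    """Closed form of (alpha*exp)*(u) as stated in the text."""
    if u < 0:
        return math.inf
    if u == 0:
        return 0.0
    return u * math.log(u / alpha) - u


def conj_by_grid(u, alpha, lo=-20.0, hi=5.0, step=1e-3):
    """Brute-force lower estimate of sup_x (x*u - alpha*exp(x)) on [lo, hi]."""
    n = int(round((hi - lo) / step))
    return max((lo + i * step) * u - alpha * math.exp(lo + i * step)
               for i in range(n + 1))


alpha, u = 0.5, 2.0
closed = conj_alpha_exp(u, alpha)

# Calculus route: the maximizer solves u = alpha*exp(x), i.e. x = ln(u/alpha).
x_star = math.log(u / alpha)
legendre = x_star * u - alpha * math.exp(x_star)

# Grid route: slightly below the closed form because the grid misses x_star.
brute = conj_by_grid(u, alpha)
```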

By associating each $\alpha \in \left]0,1\right]$ with a color, we are able to display an entire family of conjugates (see Figures 1 and 2).

Figure 1. The family $(\alpha\exp)_{\alpha \in \left]0,1\right]}$.

44 Notices of the AMS Volume 59, Number 1


Figure 2. The associated family of conjugates.

Many more interesting pairs can be computed (in closed form, or at least numerically). For instance, if $1 < p < +\infty$, then

$$\Bigl(\tfrac{1}{p}|\cdot|^p\Bigr)^* = \tfrac{1}{p'}|\cdot|^{p'}, \quad\text{where}\quad \tfrac{1}{p} + \tfrac{1}{p'} = 1,$$

which, along with the Fenchel-Young inequality, leads to an elegant proof of Hölder's inequality.
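A brief numerical illustration (a sketch with hypothetical helper names and arbitrarily chosen sample vectors): applied to the pair above, the Fenchel-Young inequality gives Young's inequality $xy \leq \tfrac{1}{p}|x|^p + \tfrac{1}{p'}|y|^{p'}$, with equality when $y = \operatorname{sign}(x)|x|^{p-1}$. Hölder's inequality can then be checked on sample data.

```python
def conjugate_exponent(p):
    """Return p' with 1/p + 1/p' = 1, for 1 < p < +infinity."""
    return p / (p - 1.0)


def young_gap(x, y, p):
    """|x|^p/p + |y|^p'/p' - x*y, which Fenchel-Young says is nonnegative."""
    pc = conjugate_exponent(p)
    return abs(x) ** p / p + abs(y) ** pc / pc - x * y


def p_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


p = 3.0
# Equality case: y = sign(x)*|x|^(p-1); here x = 2 gives y = 4.
equality_gap = young_gap(2.0, 4.0, p)

# Hoelder: sum |x_i * y_i| <= ||x||_p * ||y||_p' on arbitrary sample vectors.
xs = [1.0, -2.0, 3.0]
ys = [0.5, 4.0, -1.0]
lhs = sum(abs(a * b) for a, b in zip(xs, ys))
rhs = p_norm(xs, p) * p_norm(ys, conjugate_exponent(p))
```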

The natural domain for Fenchel conjugation is $\Gamma_X$, the cone of functions that are convex, lower semicontinuous, and proper on $X$. One now wishes to obtain calculus rules for Fenchel conjugation. In harmonic analysis, one is led to discover convolution as crucial in describing the Fourier transform of a product. The counterpart in convex analysis is the infimal convolution $f \mathbin{\square} g$, defined by

$$(f \mathbin{\square} g)(x) = \inf_{y \in X} \bigl(f(y) + g(x - y)\bigr).$$

Under appropriate hypotheses, one has

$$(f \mathbin{\square} g)^* = f^* + g^* \quad\text{and}\quad (f + g)^* = f^* \mathbin{\square} g^*.$$

Moreover, $(\alpha f)^*(u) = \alpha f^*(u/\alpha)$ where $\alpha > 0$.
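The first of these rules can be tested on a small example. In the sketch below (grid-based and one-dimensional; the helper names and the choice $f = |\cdot|$, $g = \tfrac{1}{2}|\cdot|^2$ are assumptions for illustration), the infimal convolution is the Huber function, $f^* = \iota_{[-1,1]}$, and $g^* = g$.

```python
GRID = [i / 20.0 for i in range(-100, 101)]  # illustrative grid on [-5, 5], step 0.05


def inf_conv(f, g, x):
    """Discrete infimal convolution: min over y in GRID of f(y) + g(x - y)."""
    return min(f(y) + g(x - y) for y in GRID)


def conj(values, u):
    """Discrete conjugate at u of a function tabulated on GRID."""
    return max(x * u - v for x, v in zip(GRID, values))


def half_square(x):
    return 0.5 * x * x


def huber(x):
    # Known closed form of |.| inf-convolved with (1/2)|.|^2.
    return 0.5 * x * x if abs(x) <= 1.0 else abs(x) - 0.5


table = [inf_conv(abs, half_square, x) for x in GRID]
max_huber_error = max(abs(v - huber(x)) for x, v in zip(GRID, table))

u = 0.5                       # a point of [-1, 1], where f*(u) = 0
lhs = conj(table, u)          # conjugate of the infimal convolution at u
rhs = 0.0 + half_square(u)    # f*(u) + g*(u)
```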

Closely tied to the Fenchel conjugate of $f \in \Gamma_X$ is the subdifferential operator $\partial f$. This is a set-valued mapping on $X$, that is, it maps from $X$ to the power set of $X$, and it is defined by

$$u \in \partial f(x) \;\Leftrightarrow\; (\forall h \in X)\;\; f(x) + \langle h, u\rangle \leq f(x + h).$$

Now equality in the Fenchel-Young inequality characterizes subgradients, that is, elements in the subdifferential, in the sense that

$$f(x) + f^*(u) = \langle x, u\rangle \;\Leftrightarrow\; u \in \partial f(x) \;\Leftrightarrow\; x \in \partial f^*(u).$$

When $f$ is continuous at $x$, then differentiability of $f$ at $x$ is the same as requiring $\partial f(x)$ to be a singleton, in which case $\partial f(x) = \{\nabla f(x)\}$. When $\operatorname{dom} f = X = \mathbb{R}$, then the left ($f'_-$) and right ($f'_+$) derivatives exist at every $x$ and

$$\partial f(x) = \bigl[f'_-(x), f'_+(x)\bigr].$$

Thus the subdifferential operator is a powerful generalized derivative. It also has a property critical for optimization:

$$0 \in \partial f(x) \;\Leftrightarrow\; x \text{ is a global minimizer of } f.$$
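For a concrete instance (a sketch; the function names, tolerance, and sample increments are illustrative choices), take $f = |\cdot|$ on $\mathbb{R}$. Its one-sided derivatives at $0$ are $-1$ and $+1$, so $\partial f(0) = [-1,1]$, which contains $0$, in agreement with $0$ being the global minimizer. The defining inequality can be tested on sample increments $h$:

```python
def is_subgradient(f, x, u, increments, tol=1e-12):
    """Test f(x) + h*u <= f(x + h) for every sampled increment h.

    Passing the test on a finite sample is necessary, not sufficient,
    for u to belong to the subdifferential of f at x.
    """
    return all(f(x) + h * u <= f(x + h) + tol for h in increments)


def one_sided_derivatives(f, x, eps=1e-6):
    """Finite-difference estimates of the left and right derivatives at x."""
    left = (f(x) - f(x - eps)) / eps
    right = (f(x + eps) - f(x)) / eps
    return left, right


H = [i / 10.0 for i in range(-50, 51)]  # sample increments in [-5, 5]

inside = is_subgradient(abs, 0.0, 0.5, H)    # 0.5 lies in [-1, 1]
outside = is_subgradient(abs, 0.0, 1.5, H)   # 1.5 does not
zero_is_subgradient = is_subgradient(abs, 0.0, 0.0, H)
left, right = one_sided_derivatives(abs, 0.0)
```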

Suppose that $Y$ is another real Hilbert space, $A \colon X \to Y$ is continuous and linear, and $g \in \Gamma_Y$. The most important theorem concerns Fenchel-Rockafellar duality, which involves the primal problem

$$(P) \qquad \underset{x \in X}{\text{minimize}} \;\; f(x) + g(Ax),$$

and the associated dual problem

$$(D) \qquad \underset{y \in Y}{\text{minimize}} \;\; f^*(-A^*y) + g^*(y).$$

Set $\mu = \inf\{f(x) + g(Ax) \mid x \in X\}$ and $\mu^* = \inf\{f^*(-A^*y) + g^*(y) \mid y \in Y\}$. Then $\mu \geq -\mu^*$. The key result asserts that in the presence of a so-called (primal) constraint qualification such as $0 \in \operatorname{int}(\operatorname{dom} g - A \operatorname{dom} f)$, one has $\mu = -\mu^*$, and the dual problem possesses at least one solution. Let $y$ be an arbitrary dual solution. Then the entire set of primal solutions is obtained as

$$\partial f^*(-A^*y) \cap A^{-1}\partial g^*(y).$$

As an example, one may formally derive the well-known linear programming (LP) duality, which concerns

$$\inf\{\langle c, x\rangle \mid x \geq 0,\ Ax = b\}$$

and

$$\sup\{\langle b, y\rangle \mid y \in \mathbb{R}^m,\ A^*y \leq c\},$$

where $c \in X = \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$, $b \in Y = \mathbb{R}^m$, and vector inequalities are interpreted entry-wise, by setting $f = \langle\cdot, c\rangle + \iota_{\mathbb{R}^n_+}$ and $g = \iota_{\{b\}}$.
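The computation behind this formal derivation can be sketched as follows (a worked outline rather than a rigorous proof; in particular, the constraint qualification is not verified here). With $f = \langle\cdot, c\rangle + \iota_{\mathbb{R}^n_+}$ and $g = \iota_{\{b\}}$, the primal problem (P) is the first linear program, and the conjugates and the dual value are

```latex
\begin{align*}
f^*(u) &= \sup_{x \geq 0} \langle x, u - c\rangle
        = \begin{cases} 0, & \text{if } u \leq c; \\ +\infty, & \text{otherwise,} \end{cases} \\
g^*(y) &= \sup_{z \in \{b\}} \langle z, y\rangle = \langle b, y\rangle, \\
\mu^*  &= \inf\{\langle b, y\rangle \mid y \in \mathbb{R}^m,\ -A^*y \leq c\}, \\
-\mu^* &= \sup\{\langle b, -y\rangle \mid y \in \mathbb{R}^m,\ A^*(-y) \leq c\}
        = \sup\{\langle b, w\rangle \mid w \in \mathbb{R}^m,\ A^*w \leq c\}.
\end{align*}
```

Here $w = -y$ is a substitution introduced only for this outline. In this reading, $\mu = -\mu^*$ states that the two linear programs have the same optimal value.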

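Let $f \in \Gamma_X$. Then the operator $\partial f + \operatorname{Id}$ is surjective; here $\operatorname{Id}$ denotes the identity mapping. The inverse operator $(\partial f + \operatorname{Id})^{-1}$ is actually single-valued and called the proximal mapping $\operatorname{Prox}_f$. In view of

$$x = \operatorname{Prox}_f x \;\Leftrightarrow\; x \text{ is a global minimizer of } f$$

and, for all $x$ and $y$ in $X$,

$$\|\operatorname{Prox}_f x - \operatorname{Prox}_f y\|^2 \leq \|x - y\|^2 - \|(\operatorname{Id} - \operatorname{Prox}_f)x - (\operatorname{Id} - \operatorname{Prox}_f)y\|^2,$$

$\operatorname{Prox}_f$ is Lipschitz continuous with constant 1 and thus enables fixed-point algorithmic approaches to optimization problems.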
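For example, on $X = \mathbb{R}$ the proximal mapping of $f = |\cdot|$ is the soft-thresholding operator, a standard fact. Equivalently, $\operatorname{Prox}_f x$ is the minimizer of $y \mapsto f(y) + \tfrac{1}{2}|y - x|^2$. The sketch below (function names, grid, and sample points are illustrative assumptions) compares the closed form with a grid minimization and evaluates both sides of the inequality above for one pair of points.

```python
def soft_threshold(x, t=1.0):
    """Closed-form proximal mapping of t*|.| on the real line."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def prox_by_grid(f, x, grid):
    """Grid minimizer of y -> f(y) + (1/2)(y - x)^2."""
    return min(grid, key=lambda y: f(y) + 0.5 * (y - x) ** 2)


GRID = [i / 20.0 for i in range(-100, 101)]  # illustrative grid on [-5, 5], step 0.05

x, y = 2.5, -0.3
px, py = soft_threshold(x), soft_threshold(y)
px_grid = prox_by_grid(abs, x, GRID)

# Both sides of the inequality from the text for this pair of points.
lhs = (px - py) ** 2
rhs = (x - y) ** 2 - ((x - px) - (y - py)) ** 2

# The minimizer 0 of |.| is a fixed point of the proximal mapping.
fixed_point = soft_threshold(0.0)
```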

Turning to further applications, let

$$q = \tfrac{1}{2}\|\cdot\|^2.$$

Strict-smooth duality: When $X = \mathbb{R}^n$ and $\operatorname{dom} f = \operatorname{dom} f^* = X$, then $f$ is strictly convex if and only if $f^*$ is differentiable.

Moreau envelope and Moreau decomposition: The beautiful identity

$$(f \mathbin{\square} q) + (f^* \mathbin{\square} q) = q$$

becomes

$$\operatorname{Prox}_f + \operatorname{Prox}_{f^*} = \operatorname{Id}$$

after taking the derivative. Let $P_Y$ denote the projector onto the closed subspace $Y$ of $X$. Then this last decomposition turns into the well-known orthogonal subspace decomposition

$$P_Y + P_{Y^\perp} = \operatorname{Id}$$

since $\iota_Y^* = \iota_{Y^\perp}$ and $\operatorname{Prox}_{\iota_Y} = P_Y$.

Figure 3.

The material thus far has been classical, although significant refinements continue to be made. We conclude with a recent development.
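As a numerical aside on the Moreau decomposition (a sketch under the assumption $X = \mathbb{R}$ and $f = |\cdot|$, with illustrative function names): here $f^* = \iota_{[-1,1]}$, so $\operatorname{Prox}_f$ is soft thresholding, $\operatorname{Prox}_{f^*}$ is the projection onto $[-1,1]$, the envelope of $f$ is the Huber function, and the envelope of $f^*$ is half the squared distance to $[-1,1]$.

```python
def prox_abs(x):
    """Prox of |.|: soft thresholding with threshold 1."""
    return max(abs(x) - 1.0, 0.0) * (1.0 if x >= 0 else -1.0)


def prox_conj(x):
    """Prox of the conjugate, the indicator of [-1, 1]: projection onto [-1, 1]."""
    return min(1.0, max(-1.0, x))


def env_abs(x):
    """Infimal convolution of |.| with q (Huber function)."""
    return 0.5 * x * x if abs(x) <= 1.0 else abs(x) - 0.5


def env_conj(x):
    """Infimal convolution of the indicator of [-1, 1] with q."""
    return 0.5 * (x - prox_conj(x)) ** 2


samples = [-3.0, -1.0, -0.25, 0.0, 0.75, 2.0]
decomposition_error = max(abs(prox_abs(x) + prox_conj(x) - x) for x in samples)
envelope_error = max(abs(env_abs(x) + env_conj(x) - 0.5 * x * x) for x in samples)
```

Both errors vanish up to rounding on these samples, which is what the two identities predict for this particular $f$.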

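Proximal average: Let $f_0$ and $f_1$ be in $\Gamma_X$. Then the proximal average $f_\lambda$ for $0 < \lambda < 1$ is defined by

$$f_\lambda = \bigl((1-\lambda)(f_0 + q)^* + \lambda(f_1 + q)^*\bigr)^* - q.$$

We have $(f_\lambda)^* = (f^*)_\lambda$, where $(f^*)_\lambda$ denotes the proximal average of $f_0^*$ and $f_1^*$; that is, taking the Fenchel conjugate and the proximal average commute, and

$$\operatorname{Prox}_{f_\lambda} = (1-\lambda)\operatorname{Prox}_{f_0} + \lambda\operatorname{Prox}_{f_1}.$$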
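The last formula can be tested numerically. The following sketch works on $X = \mathbb{R}$ with discrete conjugates on a grid; the helper names, the grid, and the choice $f_0 = |\cdot|$, $f_1 = 0$, $\lambda = \tfrac{1}{2}$ are assumptions made for illustration. Since $\operatorname{Prox}_{f_0}$ is soft thresholding and $\operatorname{Prox}_{f_1} = \operatorname{Id}$, the formula predicts $\operatorname{Prox}_{f_\lambda}(2) = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 2 = 1.5$.

```python
GRID = [i / 20.0 for i in range(-120, 121)]  # illustrative grid on [-6, 6], step 0.05


def q(x):
    return 0.5 * x * x


def conj_table(values):
    """Discrete conjugate of a function tabulated on GRID, again tabulated on GRID."""
    return [max(x * u - v for x, v in zip(GRID, values)) for u in GRID]


def proximal_average_table(f0, f1, lam):
    """Tabulate ((1 - lam)(f0 + q)* + lam (f1 + q)*)* - q on GRID."""
    c0 = conj_table([f0(x) + q(x) for x in GRID])
    c1 = conj_table([f1(x) + q(x) for x in GRID])
    mix = [(1.0 - lam) * a + lam * b for a, b in zip(c0, c1)]
    return [k - q(x) for k, x in zip(conj_table(mix), GRID)]


def prox_from_table(values, z):
    """Grid minimizer of y -> phi(y) + (1/2)(y - z)^2 for a tabulated phi."""
    return min(zip(GRID, values), key=lambda yv: yv[1] + q(yv[0] - z))[0]


def soft_threshold(z):
    """Prox of |.| on the real line."""
    return max(abs(z) - 1.0, 0.0) * (1.0 if z >= 0 else -1.0)


lam, z = 0.5, 2.0
f_lam = proximal_average_table(abs, lambda x: 0.0, lam)
prox_of_average = prox_from_table(f_lam, z)
average_of_prox = (1.0 - lam) * soft_threshold(z) + lam * z
```

The discrete conjugates are only trustworthy away from the ends of the grid, which is why the test point $z = 2$ is kept well inside $[-6, 6]$.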

The proximal average provides a homotopy between $f_0$ and $f_1$, even when $\operatorname{dom} f_0 \cap \operatorname{dom} f_1 = \varnothing$, and it is useful for the construction of antiderivatives and maximally monotone operators. By associating each $\lambda \in [0,1]$ with a color, we are able to display the full family of proximal averages; see Figure 3 for the graph of the family of proximal averages $(f_\lambda)_{\lambda \in [0,1]}$ of $f_0(x) = -\sqrt{-x} + \iota_{[-3/2,0]}(x)$ and $f_1 = f_0 \circ (-\operatorname{Id})$.

The following reading list is a starting point from which to explore the theory, history, applications, and (symbolic and numerical) computation of Fenchel conjugates.

Further Reading

1. H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, 2011.
2. J. M. Borwein and J. D. Vanderwerff, Convex Functions, Cambridge University Press, 2010.
3. Y. Lucet, What shape is your conjugate?, SIAM Rev. 52 (2010), 505–542.
4. R. T. Rockafellar, Convex Analysis, Princeton University Press, 1970.

Notices of the AMS, Volume 59, Number 1, January 2012