Post on 27-Jul-2020
Introduction The discrete case Measures The Euclidean case
Gradient flows, optimal transport, and evolution PDE’s
2 - A quick introduction to Optimal Transport
Giuseppe Savaré
http://www.imati.cnr.it/~savare
Dipartimento di Matematica, Università di Pavia
GNFM Summer School, Ravello, September 13–18, 2010
Outline
1 A short historical tour
2 The “discrete” case, duality and linear programming
3 The measure-theoretic setting
4 Euclidean spaces: geometry and transport maps
Gaspard Monge (1746-1818)
1781: “La théorie des déblais et des remblais”
Problem: how to transport soil from the ground to a given configuration in the “most efficient” way.
[Book excerpt, p. 42, chapter 3 “The founding fathers of optimal transport”:]
… minimize the total cost. Monge assumed that the transport cost of one unit of mass along a certain distance was given by the product of the mass by the distance.
[Fig. 3.1. Monge’s problem of déblais and remblais: a map T sends a point x of the déblais to a point y of the remblais.]
Nowadays there is a Monge street in Paris, and therein one can find an excellent bakery called Le Boulanger de Monge. To acknowledge this, and to illustrate how Monge’s problem can be recast in an economic perspective, I shall express the problem as follows. Consider a large number of bakeries, producing loaves, that should be transported each morning to cafés where consumers will eat them. The amount of bread that can be produced at each bakery, and the amount that will be consumed at each café are known in advance, and can be modeled as probability measures (there is a “density of production” and a “density of consumption”) on a certain space, which in our case would be Paris (equipped with the natural metric such that the distance between two points is the length of the shortest path joining them). The problem is to find in practice where each unit of bread should go (see Figure 3.2), in such a way as to minimize the total transport cost. So Monge’s problem really is the search of an optimal coupling; and to be more precise, he was looking for a deterministic optimal coupling.
Fig. 3.2. Economic illustration of Monge’s problem: squares stand for production units, circles for consumption places.
The transport cost is proportional to the distance $|T(x) - x|$.
Leonid Kantorovich (1912-1986)
1939: Mathematical Methods of Organizing and Planning of Production (unpublished until 1960).
1942: On the translocation of masses.
1948: On a problem of Monge.
1975: Nobel prize, jointly with Tjalling Koopmans, “for their contributions to the theory of optimum allocation of resources”.
Autobiography: http://nobelprize.org/nobel_prizes/economics/laureates/1975/kantorovich-autobio.html
Parallel contributions:
1941: Frank Hitchcock, The distribution of a product from several sources to numerous localities (Jour. Math. Phys.).
1947: Tjalling Koopmans, Optimum utilization of the transportation system.
1947: George Dantzig, simplex method.
Towards the recent theory...
• Statistical and probabilistic aspects (beginning of the 1900s: Gini, Dall’Aglio, Hoeffding, Fréchet, . . .); Rachev-Rüschendorf, Mass Transportation Problems (1998).
• Particle systems, Boltzmann equation: Dobrushin, Tanaka (∼’70).
• Yann Brenier (’89): fluid mechanics, transport map, polar decomposition. Dynamical interpretation of optimal transport.
• John Mather: Lagrangian dynamical systems.
• Mike Cullen: meteorological models, semigeostrophic equations.
• Regularity, geometric and functional inequalities, Riemannian geometry, urban planning, evolution equations, etc.: L. Caffarelli, C. Evans, W. Gangbo, R. McCann, F. Otto, L. Ambrosio, G. Buttazzo, C. Villani, J. Lott, N. Trudinger, G. Loeper, T. Sturm, J. Carrillo, G. Toscani, A. Pratelli, . . .
• C. Villani: Optimal transport: Old and New, Springer (2009), 978 p.
Discrete formulation
• Initial configuration of resources in $X = \{x_1, \dots, x_h\}$: at every point $x_i \in X$ the quantity $m_i = m(x_i)$ is available.
• Final configuration $Y = \{y_1, \dots, y_n\}$: at every point $y_j$ the quantity $n_j = n(y_j)$ is expected.
• The unitary cost $c_{ij} = c(x_i, y_j)$ for transporting a single unit from the position $x_i$ to the destination $y_j$.
[Figure: sources $x_1, \dots, x_4$ and destinations $y_1, y_2, y_3$, with unit costs $c_{11}, c_{12}, c_{13}, c_{21}, c_{22}, c_{23}$ and transported quantities $T_{11}, T_{21}, T_{33}, T_{42}, T_{43}$.]
Admissible transference plan: choose the quantities $T_{i,j} = T(x_i, y_j)$ moved from $x_i$ to $y_j$, so that
\[ T(x_i, y_j) \ge 0, \qquad \sum_{y \in Y} T(x_i, y) = m(x_i), \qquad \sum_{x \in X} T(x, y_j) = n(y_j). \]
The cost of the transference plan $T$ is
\[ C(T) := \sum_{x \in X,\, y \in Y} c(x, y)\, T(x, y). \]
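The admissibility constraints and the cost above can be checked mechanically. The following is a minimal sketch in plain Python; the data (two sources, three destinations) and the function names `is_admissible` and `cost` are hypothetical illustrations, not taken from the slides.

```python
# Hypothetical toy data (not from the slides): 2 sources, 3 destinations.
m = [3, 2]            # m[i]: quantity available at x_i
n = [1, 2, 2]         # n[j]: quantity expected at y_j
c = [[1, 2, 4],       # c[i][j]: unit cost from x_i to y_j
     [3, 1, 1]]
T = [[1, 2, 0],       # T[i][j]: quantity moved from x_i to y_j
     [0, 0, 2]]


def is_admissible(T, m, n):
    """Check T >= 0, row sums equal to m, column sums equal to n."""
    nonneg = all(t >= 0 for row in T for t in row)
    rows_ok = all(sum(T[i]) == m[i] for i in range(len(m)))
    cols_ok = all(
        sum(T[i][j] for i in range(len(m))) == n[j] for j in range(len(n))
    )
    return nonneg and rows_ok and cols_ok


def cost(T, c):
    """Total cost C(T) = sum over i, j of c[i][j] * T[i][j]."""
    return sum(c[i][j] * T[i][j] for i in range(len(T)) for j in range(len(T[0])))


print(is_admissible(T, m, n))  # True
print(cost(T, c))              # 1*1 + 2*2 + 1*2 = 7
```

Note that admissibility forces the total masses to agree (here 3 + 2 = 1 + 2 + 2).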
Optimal transport
Problem
Find the best transference plan $T$ which minimizes the cost $C(T)$ among all the admissible plans.
The linear programming structure: given positive coefficients $m_i$, $n_j$ and $c_{i,j}$, find the quantities $T_{i,j}$ minimizing the linear functional
\[ C(T) = \sum_{i,j} c_{i,j} T_{i,j} \]
under the linear/convex constraints
\[ T_{i,j} \ge 0, \qquad \sum_j T_{i,j} = m_i, \qquad \sum_i T_{i,j} = n_j. \]
In vector notation:
\[ \min \big\{ \vec C \cdot \vec T \;:\; A_0 \vec T \ge 0,\ A_1 \vec T = \vec b \big\}. \]
In the discrete case existence of the optimal plan is easy; more important are 3 fundamental properties:
• Cyclical monotonicity of the optimal transference plan.
• Dual characterization, Kantorovich potentials (prices in economic terms), linear programming.
• Integrality of the transference plan, transport maps.
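To make the vector notation concrete, here is a minimal sketch that flattens a plan row by row and builds the equality system $A_1 \vec T = \vec b$. The toy data and the helper names (`flatten`, `equality_system`, `dot`) are assumptions made for illustration; $A_0$ is read here as the identity matrix, so that $A_0 \vec T \ge 0$ is just $T_{i,j} \ge 0$.

```python
# Hypothetical toy data (not from the slides).
m = [3, 2]
n = [1, 2, 2]
c = [[1, 2, 4], [3, 1, 1]]
T = [[1, 2, 0], [0, 0, 2]]


def flatten(M):
    """Row-major flattening: entry (i, j) goes to position i * k + j."""
    return [x for row in M for x in row]


def equality_system(m, n):
    """Return (A1, b) encoding the row-sum and column-sum constraints."""
    h, k = len(m), len(n)
    rows = [[1 if p // k == i else 0 for p in range(h * k)] for i in range(h)]
    cols = [[1 if p % k == j else 0 for p in range(h * k)] for j in range(k)]
    return rows + cols, list(m) + list(n)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


A1, b = equality_system(m, n)
T_vec, C_vec = flatten(T), flatten(c)

print([dot(row, T_vec) for row in A1] == b)  # True: A1 T = b
print(all(t >= 0 for t in T_vec))            # True: A0 T >= 0 with A0 = identity
print(dot(C_vec, T_vec))                     # 7, the cost C . T
```

In this form the problem could be passed to any linear programming solver; none is used here so that the sketch stays self-contained.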
Cyclical monotonicity
Consider an arbitrary collection of couples $(x, y)$ joined by a transport ray, i.e. $T(x, y) > 0$: in the picture $(x_2, y_1)$, $(x_3, y_2)$, $(x_4, y_3)$.
[Figure: sources $x_1, \dots, x_4$ and destinations $y_1, y_2, y_3$, with transported quantities $T_{11}, T_{21}, T_{33}, T_{42}, T_{43}$ and the permutation $\sigma$ of the targets.]
The associated (unitary) cost satisfies
\[ c(x_2, y_1) + c(x_3, y_2) + c(x_4, y_3) \le c(x_2, y_2) + c(x_3, y_3) + c(x_4, y_1), \]
where the right-hand side is the cost obtained by applying a (cyclical) permutation $\sigma$ of the targets: $y_1 \to y_2 \to y_3 \to y_1$.
Theorem (Rachev-Rüschendorf)
If $T$ is optimal, the cost of any configuration rearranged by a cyclical permutation cannot decrease.
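Cyclical monotonicity can be tested by brute force on small examples. The sketch below takes every family of distinct transport rays and every permutation of its targets, and checks that the rearranged cost never drops below the original one. The data and the name `is_cyclically_monotone` are hypothetical, and the search is exponential in the number of rays, so it is only meant for toy instances.

```python
from itertools import combinations, permutations

# Hypothetical toy data (not from the slides).
c = [[1, 2, 4],
     [3, 1, 1]]
T_good = [[1, 2, 0], [0, 0, 2]]   # admissible for m = [3, 2], n = [1, 2, 2]
T_bad = [[1, 0, 2], [0, 2, 0]]    # admissible for the same m and n


def is_cyclically_monotone(T, c):
    """Return True if no permutation of targets lowers the cost of a family of rays."""
    # Transport rays: couples (i, j) with T[i][j] > 0.
    support = [(i, j) for i, row in enumerate(T) for j, t in enumerate(row) if t > 0]
    for k in range(2, len(support) + 1):
        for family in combinations(support, k):
            base = sum(c[i][j] for i, j in family)
            targets = [j for _, j in family]
            for perm in permutations(targets):
                rearranged = sum(c[i][j] for (i, _), j in zip(family, perm))
                if rearranged < base:
                    return False
    return True


print(is_cyclically_monotone(T_good, c))  # True
print(is_cyclically_monotone(T_bad, c))   # False
```

For `T_bad`, swapping the targets of the rays $(x_1, y_3)$ and $(x_2, y_2)$ lowers the unit cost from 4 + 1 = 5 to 2 + 1 = 3, so that plan cannot be optimal.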
Cyclical monotonicity is also sufficient
Theorem
If T is a cyclically monotone admissible plan then it is optimal.
The dual problem: optimal prices
Linear programming: the dual problem gives a crucial insight into the structure of the optimal transference plan.
Economic interpretation: a transport company offers to take care of the transportation job: they will pay the price $u(x)$ to buy a unit placed at the point $x$ and they will sell it at $y$ for the price $v(y)$.
To be competitive, the prices should be more convenient than the transportation cost $c(x, y)$:
\[ v(y) - u(x) \le c(x, y), \qquad x \in X,\ y \in Y. \tag{*} \]
The total profit for the company is
\[ P(u, v) := \sum_{y \in Y} n(y)\, v(y) - \sum_{x \in X} m(x)\, u(x) \]
and their problem is to find the prices which maximize the profits:
max $P(u, v)$ among all the competitive prices $(u, v)$ satisfying (*).
Clearly $C(T) \ge P(u, v)$ for every admissible transference plan $T$ and every couple of competitive prices $u, v$: multiplying (*) by $T(x, y) \ge 0$, summing over $x, y$ and using the marginal constraints gives
\[ C(T) \ge \sum_{x, y} \big( v(y) - u(x) \big)\, T(x, y) = \sum_{y} n(y)\, v(y) - \sum_{x} m(x)\, u(x) = P(u, v). \]
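The inequality $C(T) \ge P(u, v)$ can be checked numerically. This is a minimal sketch with hypothetical toy data and function names; the prices below are competitive but deliberately not optimal, so the gap between cost and profit is visible.

```python
# Hypothetical toy data (not from the slides).
m = [3, 2]
n = [1, 2, 2]
c = [[1, 2, 4],
     [3, 1, 1]]
plans = [
    [[1, 2, 0], [0, 0, 2]],   # admissible plan with cost 7
    [[1, 0, 2], [0, 2, 0]],   # admissible plan with cost 11
]
u = [0, 0]        # buying prices u(x_i)
v = [1, 1, 1]     # selling prices v(y_j)


def is_competitive(u, v, c):
    """Constraint (*): v[j] - u[i] <= c[i][j] for all i, j."""
    return all(v[j] - u[i] <= c[i][j] for i in range(len(u)) for j in range(len(v)))


def profit(u, v, m, n):
    """P(u, v) = sum_j n[j] v[j] - sum_i m[i] u[i]."""
    return sum(nj * vj for nj, vj in zip(n, v)) - sum(mi * ui for mi, ui in zip(m, u))


def cost(T, c):
    return sum(c[i][j] * T[i][j] for i in range(len(T)) for j in range(len(T[0])))


print(is_competitive(u, v, c))       # True
print(profit(u, v, m, n))            # 5
print([cost(T, c) for T in plans])   # [7, 11], both at least 5
```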
Duality theorem
Theorem (Min-max and “complementary slackness”)
An admissible transference plan $T$ is optimal if and only if there exist competitive prices $(u, v)$ such that
\[ C(T) = P(u, v). \]
In particular
\[ \min_T C(T) = \max_{(u, v)} P(u, v). \]
Moreover, the “slackness”
\[ S(x, y) := c(x, y) + u(x) - v(y) \ge 0 \]
satisfies the “complementary slackness principle”
\[ T(x, y)\, S(x, y) = 0, \quad \text{i.e.} \quad T(x, y) > 0 \ \Rightarrow\ S(x, y) = 0. \]
“If $x$ and $y$ are connected through an optimal transport ray then their respective prices $u(x)$ and $v(y)$ are maximal: $v(y) - u(x) = c(x, y)$.”
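As an illustration of the theorem, the sketch below exhibits a plan and a couple of competitive prices with equal cost and profit, and checks complementary slackness entry by entry. The data are hypothetical, and the slackness is computed as $S = c + u - v$, the sign convention that matches the competitiveness constraint $v(y) - u(x) \le c(x, y)$.

```python
# Hypothetical toy data (not from the slides).
m = [3, 2]
n = [1, 2, 2]
c = [[1, 2, 4],
     [3, 1, 1]]
T = [[1, 2, 0],
     [0, 0, 2]]
u = [0, 1]        # candidate optimal buying prices
v = [1, 2, 2]     # candidate optimal selling prices

h, k = len(m), len(n)

# Slackness S[i][j] = c[i][j] + u[i] - v[j]; nonnegative iff the prices are competitive.
S = [[c[i][j] + u[i] - v[j] for j in range(k)] for i in range(h)]

C_T = sum(c[i][j] * T[i][j] for i in range(h) for j in range(k))
P_uv = sum(n[j] * v[j] for j in range(k)) - sum(m[i] * u[i] for i in range(h))

print(S)                                              # [[0, 0, 2], [3, 0, 0]]
print(all(s >= 0 for row in S for s in row))          # True: prices are competitive
print(C_T, P_uv)                                      # 7 7: cost equals profit
print(all(T[i][j] * S[i][j] == 0
          for i in range(h) for j in range(k)))       # True: complementary slackness
```

Since cost and profit coincide, the theorem certifies that this plan is optimal for the toy data without comparing it to any other plan.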
Duality via Von Neumann min-max
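\[ \min_T \Big\{ \sum_{i,j} c_{i,j} T_{i,j} \;:\; T_{i,j} \ge 0,\ \sum_j T_{i,j} = m_i,\ \sum_i T_{i,j} = n_j \Big\}. \]
Introduce Lagrange multipliers $S_{i,j} \ge 0$, $u_i$, $v_j$ for the constraints:
\[
\begin{aligned}
\min_T \sum_{i,j} c_{i,j} T_{i,j}
&= \min_T \max_{S,u,v} \sum_{i,j} c_{i,j} T_{i,j} - \sum_{i,j} S_{i,j} T_{i,j}
   + \sum_i u_i \Big( \sum_j T_{i,j} - m_i \Big) - \sum_j v_j \Big( \sum_i T_{i,j} - n_j \Big) \\
&= \min_T \max_{S,u,v} \sum_{i,j} T_{i,j} \big( c_{i,j} - S_{i,j} + u_i - v_j \big) + \sum_j v_j n_j - \sum_i u_i m_i \\
&= \max_{S,u,v} \min_T \sum_{i,j} T_{i,j} \big( c_{i,j} - S_{i,j} + u_i - v_j \big) + \sum_j v_j n_j - \sum_i u_i m_i \\
&= \max_{u,v} \Big\{ \sum_j v_j n_j - \sum_i u_i m_i \;:\; c_{i,j} - S_{i,j} + u_i - v_j = 0 \Big\} \\
&= \max_{u,v} \Big\{ \sum_j v_j n_j - \sum_i u_i m_i \;:\; c_{i,j} + u_i - v_j \ge 0 \Big\},
\end{aligned}
\]
where the last constraint is the competitiveness condition $v_j - u_i \le c_{i,j}$.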
Integrality
Theorem
If the initial and final configurations $m(x), n(y) \in \mathbb{N}$ are integers, then there exists an integer optimal transference plan $T$, i.e. $T(x, y) \in \mathbb{N}$.
In other words, there is no need to split unitary quantities in order to realize the optimal transport.
Corollary
If $m(x) \equiv 1$ and $n(y)$ are integers, then the transference plan $T$ is associated to a transport map $t : X \to Y$ so that
\[ T(x, y) > 0 \ \Leftrightarrow\ y = t(x). \]
If moreover $n(y) \equiv 1$ then the map $t$ is one-to-one.
Roughly speaking: from every point $x \in X$ starts a unique transport ray and mass is not split in various directions.
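When $m \equiv 1$ and $n \equiv 1$, the corollary says that an optimal plan can be found among one-to-one maps, so for tiny instances one can simply enumerate all permutations. The sketch below does exactly that; the cost matrix and the name `optimal_map` are hypothetical, and the enumeration grows factorially, so it is an illustration rather than a practical solver.

```python
from itertools import permutations

# Hypothetical 3 x 3 unit costs (not from the slides); m = n = 1 at every point.
c = [[4, 1, 3],
     [2, 0, 5],
     [3, 2, 2]]


def optimal_map(c):
    """Brute-force search for the cheapest one-to-one map t: i -> t[i]."""
    h = len(c)
    best = min(
        permutations(range(h)),
        key=lambda t: sum(c[i][t[i]] for i in range(h)),
    )
    return list(best), sum(c[i][best[i]] for i in range(h))


t, total = optimal_map(c)
print(t, total)  # [1, 0, 2] 5

# The associated integer plan: T[i][j] = 1 exactly when j = t[i].
T = [[1 if j == t[i] else 0 for j in range(len(c))] for i in range(len(c))]
print(T)  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```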
3 The measure-theoretic setting
Measure data
- $X, Y$ discrete spaces → $X, Y$ topological spaces ($\mathbb{R}$, $\mathbb{R}^N$, locally compact spaces, Polish (i.e. complete and separable metric) spaces, Radon spaces, ...): here $\mathbb{R}^N$.
- The cost: a (lower semi-)continuous function $c : X \times Y \to \mathbb{R} \cup \{+\infty\}$.
- The initial and final configurations $m(x), n(y)$ → a couple of Borel measures $\mu, \nu$ on $X$ and $Y$. The mass is normalized to 1. Given $A \subset X$, $B \subset Y$, $\mu(A)$ denotes the quantity of resources available in $A$, and $\nu(B)$ denotes the resources expected in $B$.
- Transport plan $T$ → a measure $\gamma$ on $X \times Y$: $\gamma(A \times B)$ is the mass coming from $A$ and transported to $B$.
  Admissibility: the marginals of $\gamma$ are thus fixed ($\gamma$ is a coupling between $\mu$ and $\nu$):
  $$\gamma(A \times Y) = \mu(A), \qquad \gamma(X \times B) = \nu(B).$$
  $\Gamma(\mu,\nu)$: the collection of all the admissible transference plans (couplings).
- The cost of a transference plan: $\sum_{x,y} c(x,y)\,T(x,y)$ →
  $$C(\gamma) := \int_{X \times Y} c(x,y)\,d\gamma(x,y).$$
[Figure: the product space $\mathbb{R}^m \times \mathbb{R}^m$ with the marginals $\mu$ and $\nu$, the plan $\gamma$, and the set $|x - y| = 0$.]
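For measures with finitely many atoms, a plan $\gamma$ is just a matrix, admissibility is a statement about its row and column sums, and $C(\gamma)$ is a weighted sum. The following sketch checks this on the product coupling $\mu \otimes \nu$, which is always admissible; the helper names and the data are assumptions made for the example.

```python
def marginals(gamma):
    """Row sums and column sums of gamma[i][j] = gamma({x_i} x {y_j})."""
    first = [sum(row) for row in gamma]
    second = [sum(col) for col in zip(*gamma)]
    return first, second

def is_admissible(gamma, mu, nu, tol=1e-12):
    """True when gamma is nonnegative and has marginals mu and nu."""
    first, second = marginals(gamma)
    return (all(g >= 0 for row in gamma for g in row)
            and all(abs(a - b) <= tol for a, b in zip(first, mu))
            and all(abs(a - b) <= tol for a, b in zip(second, nu)))

def total_cost(gamma, xs, ys, c):
    """C(gamma) = sum over (i, j) of c(x_i, y_j) * gamma[i][j]."""
    return sum(gamma[i][j] * c(x, y)
               for i, x in enumerate(xs) for j, y in enumerate(ys))

# Illustrative data on the real line, cost c(x, y) = |x - y|.
xs, mu = [0.0, 1.0], [0.5, 0.5]
ys, nu = [0.0, 1.0, 2.0], [0.25, 0.5, 0.25]

product = [[a * b for b in nu] for a in mu]      # the coupling mu (x) nu
cost_product = total_cost(product, xs, ys, lambda x, y: abs(x - y))
print(is_admissible(product, mu, nu), cost_product)
```

The product coupling is rarely optimal, but its cost gives an upper bound for the minimum.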
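Transport and probability
Discrete setting: points $x_1, \dots, x_N$ with masses $m_1, \dots, m_N$, $\mu = \sum_i m_i \delta_{x_i}$. If $t$ is a transport map and $y_i = t(x_i)$, then
$$t_\#\mu = \nu = \sum_i m_i \delta_{y_i}.$$
In terms of measures:
$$\nu(B) = \sum_{i:\, y_i \in B} m_i = \sum_{i:\, t(x_i) \in B} m_i = \sum_{i:\, x_i \in t^{-1}(B)} m_i = \mu(t^{-1}(B)).$$
In general, for every Borel map $t : X \to Y$ and every Borel measure $\mu \in \mathscr{P}(X)$ we define
$$\nu = t_\#\mu \quad\Leftrightarrow\quad \nu(B) = \mu(t^{-1}(B)).$$
In probability: $\mathbb{P}$ is a probability measure on the probability space $\Omega$, $X : \Omega \to \mathcal{X}$ is a random variable, and $X_\#\mathbb{P} \in \mathscr{P}(\mathcal{X})$ is the law of $X$:
$$X_\#\mathbb{P}(A) = \mathbb{P}[X \in A].$$
Change of variable formula:
$$\int_X \phi(t(x))\,d\mu(x) = \int_Y \phi(y)\,d\nu(y).$$
Expectation:
$$\mathbb{E}[\phi(X)] = \int_\Omega \phi(X(\omega))\,d\mathbb{P}(\omega) = \int_{\mathcal{X}} \phi(x)\,d(X_\#\mathbb{P})(x).$$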
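The push-forward of a discrete measure and the change of variable formula can be checked directly. In the sketch below the measure is stored as a dictionary `{point: mass}`; this representation, the function names, and the data are assumptions made for the example.

```python
from collections import defaultdict

def push_forward(points, masses, t):
    """Return t_# mu for mu = sum_i masses[i] * delta_{points[i]} as {y: mass}."""
    nu = defaultdict(float)
    for x, m in zip(points, masses):
        nu[t(x)] += m          # the mass sitting at x is moved to y = t(x)
    return dict(nu)

def integrate(measure, phi):
    """Integral of phi against a discrete measure given as {point: mass}."""
    return sum(m * phi(p) for p, m in measure.items())

# Illustrative data: t(x) = x^2 merges the atoms at -1 and 1.
points = [-1.0, 0.0, 1.0, 2.0]
masses = [0.25, 0.25, 0.25, 0.25]
mu = dict(zip(points, masses))
t = lambda x: x * x
nu = push_forward(points, masses, t)

phi = lambda y: 3.0 * y + 1.0
lhs = integrate(mu, lambda x: phi(t(x)))   # integral of phi(t(x)) d mu(x)
rhs = integrate(nu, phi)                   # integral of phi(y) d nu(y)
print(nu, lhs, rhs)
```

Note that $t$ does not need to be injective: the two atoms at $\pm 1$ are merged into a single atom of mass $0.5$ at $y = 1$, and the two integrals still agree.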
The general problem
Problem
Given two Borel probability measures $\mu \in \mathscr{P}(X)$ and $\nu \in \mathscr{P}(Y)$, find an admissible transference plan $\gamma \in \Gamma(\mu,\nu)$ minimizing the total cost:
$$\min_{\gamma \in \Gamma(\mu,\nu)} C(\gamma).$$
Kantorovich potentials: functions $u : X \to \mathbb{R}$, $v : Y \to \mathbb{R}$ such that
$$v(y) - u(x) \le c(x,y) \qquad (\Pi(c))$$
Dual functional: $\sum_y v(y)\,n(y) - \sum_x u(x)\,m(x)$ →
$$P(u,v) := \int_Y v(y)\,d\nu(y) - \int_X u(x)\,d\mu(x).$$
Problem (Dual formulation)
Find a couple of Kantorovich potentials $(u,v) \in \Pi(c)$ maximizing
$$\max_{\Pi(c)} P(u,v).$$
A fundamental theorem
Assume that the cost is continuous and feasible, e.g.
$$C(\mu \otimes \nu) = \iint_{X \times Y} c(x,y)\,d(\mu \otimes \nu)(x,y) < +\infty \qquad \text{(sufficient feasibility condition)}$$
Theorem
Existence: There exist an optimal transference plan $\gamma_{\rm opt} \in \Gamma(\mu,\nu)$ and a couple of optimal Kantorovich potentials $(u_{\rm opt}, v_{\rm opt}) \in \Pi(c)$.
Duality:
$$C(\gamma_{\rm opt}) = \min_{\Gamma(\mu,\nu)} C(\gamma) = \max_{\Pi(c)} P(u,v) = P(u_{\rm opt}, v_{\rm opt}).$$
Slackness: For every $(x,y) \in \operatorname{supp}(\gamma_{\rm opt})$ (connection by a transport ray)
$$c(x,y) = v_{\rm opt}(y) - u_{\rm opt}(x).$$
Cyclical monotonicity: For every $(x_1,y_1), (x_2,y_2), \dots, (x_N,y_N)$ in the support of $\gamma_{\rm opt}$ and every permutation $\sigma : \{1,2,\dots,N\} \to \{1,2,\dots,N\}$
$$c(x_1,y_1) + \dots + c(x_N,y_N) \le c(x_1,y_{\sigma(1)}) + \dots + c(x_N,y_{\sigma(N)}).$$
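Cyclical monotonicity is a condition on finitely many points at a time, so it can be tested directly on a finite list of pairs. The following brute-force sketch does so; the function name, the quadratic cost, and the two lists of pairs are assumptions made for the example.

```python
from itertools import permutations

def is_cyclically_monotone(pairs, c, tol=1e-12):
    """Check sum_k c(x_k, y_k) <= sum_k c(x_k, y_sigma(k)) for every permutation sigma.

    pairs is a finite list of points (x_k, y_k); brute force, so keep it short.
    """
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    base = sum(c(x, y) for x, y in pairs)
    return all(base <= sum(c(x, ys[s]) for x, s in zip(xs, sigma)) + tol
               for sigma in permutations(range(len(pairs))))

quadratic = lambda x, y: (x - y) ** 2

monotone_pairs = [(0.0, 0.5), (1.0, 1.5), (3.0, 2.5)]   # increasing matching
crossed_pairs = [(0.0, 2.5), (1.0, 1.5), (3.0, 0.5)]    # decreasing matching

print(is_cyclically_monotone(monotone_pairs, quadratic))
print(is_cyclically_monotone(crossed_pairs, quadratic))
```

On the real line with the quadratic cost, the increasing matching passes the test while the crossed one fails: rearranging its targets lowers the total cost.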
4 Euclidean spaces: geometry and transport maps
Some important questions
- Uniqueness of the optimal transference plan.
- Integrality → existence of a transport map.
- Links with the geometry: the cost function $c(x,y)$ depends on the distance between $x$ and $y$ ($|x - y|$ when $X = Y = \mathbb{R}^d$).
- Regularity of Kantorovich potentials.
- Further information when the measures $\mu = f\,\mathscr{L}^d \ll \mathscr{L}^d$ and $\nu = g\,\mathscr{L}^d \ll \mathscr{L}^d$ are absolutely continuous with respect to the Lebesgue measure:
  $$\mu(A) = \int_A f(x)\,dx, \qquad \nu(B) = \int_B g(y)\,dy.$$
All these questions are closely linked!
From now on we will consider the Euclidean case $X = Y = \mathbb{R}^d$.
Integrality and transport maps
At the continuous level the integrality condition could be informally stated by asking that (almost) every point $x$ is the starting point of at most one transport ray.
We can say that $y$ is connected to $x$ by a transport ray if $(x,y) \in \operatorname{supp}\gamma$; thus we have
$$(x,y_1),\ (x,y_2) \in \operatorname{supp}\gamma \ \Rightarrow\ y_1 = y_2 =: t(x),$$
a property which should hold $\mu$-almost everywhere.
$t : X \to Y$ is called the transport map induced by the plan $\gamma$. It satisfies
$$\text{if } A = t^{-1}(B) \text{ then } \mu(A) = \nu(B) = \gamma(A \times B).$$
Recalling the change-of-variable formula, if $\mu = f\,dx$, $\nu = g\,dy$, and $t$ is differentiable,
$$\mu(A) = \int_A f(x)\,dx = \nu(B) = \int_B g(y)\,dy = \int_A g(t(x))\,|\det Dt(x)|\,dx$$
so that
$$f(x) = g(t(x))\,|\det Dt(x)|.$$
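A one-dimensional check of the last identity: take $\mu$ uniform on $(0,1)$ and the map $t(x) = x^2$, so that $\nu = t_\#\mu$ has density $g(y) = 1/(2\sqrt{y})$ on $(0,1)$. The choice of $\mu$ and $t$, the midpoint quadrature, and all names below are assumptions made for the example.

```python
import math

f = lambda x: 1.0                             # density of mu: uniform on (0, 1)
t = lambda x: x * x                           # smooth increasing map of (0, 1) onto itself
dt = lambda x: 2.0 * x                        # derivative of t
g = lambda y: 1.0 / (2.0 * math.sqrt(y))      # density of nu = t_# mu on (0, 1)

def midpoint_integral(h, a, b, n=20000):
    """Midpoint rule for the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (k + 0.5) * step) for k in range(n))

def jacobian_residual(x):
    """f(x) - g(t(x)) |t'(x)|, which should vanish."""
    return f(x) - g(t(x)) * abs(dt(x))

# Mass balance mu(A) = nu(t(A)) for A = [0.2, 0.7].
a, b = 0.2, 0.7
mass_mu = midpoint_integral(f, a, b)
mass_nu = midpoint_integral(g, t(a), t(b))
print(mass_mu, mass_nu, jacobian_residual(0.3))
```

In one dimension $|\det Dt(x)|$ reduces to $|t'(x)|$, which is why the residual only involves `dt`.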
Existence and uniqueness of the optimal transport map: $c(x,y) = \frac12 |x - y|^2$
Theorem (Brenier (1989))
Let $\mu = f\,dx$, $\nu = g\,dy$, $c(x,y) := \frac12 |x - y|^2$.
- There exists a unique optimal transference plan $\gamma$ and it is associated to a transport map $t$.
- The Kantorovich potentials are perturbations of convex functions; more precisely
  $$\tfrac12 |x|^2 + u(x) = \phi(x) \quad\text{and}\quad \tfrac12 |y|^2 - v(y) = \psi(y) \quad\text{are convex}$$
  and $\psi$ is the Legendre transform of $\phi$:
  $$\psi(y) = \phi^*(y) = \sup_x \big( \langle y, x \rangle - \phi(x) \big).$$
- $t(x) = \nabla\phi(x) = x + \nabla u(x)$ is the gradient of a convex function; it is essentially injective and a.e. differentiable, and $Dt = D^2\phi$ is positive definite.
- $\phi$ solves the Monge-Ampère equation
  $$\det D^2\phi(x) = \frac{f(x)}{g(\nabla\phi(x))}.$$
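A one-dimensional Gaussian example makes the statement concrete. Take $\mu = N(0,1)$ and $\nu = N(a, s^2)$ with $s > 0$: the increasing affine map $t(x) = a + s x$ pushes $\mu$ onto $\nu$ and is the derivative of the convex function $\phi(x) = a x + \frac{s}{2} x^2$. The sketch below checks the Monge-Ampère equation and the relation $t(x) = x + u'(x)$ numerically; the Gaussian data, the finite-difference step, and the names are assumptions made for the example.

```python
import math

def gaussian_density(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

# Illustrative choice: mu = N(0, 1), nu = N(mean, std^2) on the real line.
mean, std = 1.5, 2.0
f = lambda x: gaussian_density(x, 0.0, 1.0)
g = lambda y: gaussian_density(y, mean, std)

phi = lambda x: mean * x + 0.5 * std * x * x      # convex because std > 0
t = lambda x: mean + std * x                      # t = phi'
u = lambda x: phi(x) - 0.5 * x * x                # potential with t(x) = x + u'(x)

def second_derivative(h, x, eps=1e-3):
    """Central finite difference for h''(x)."""
    return (h(x + eps) - 2.0 * h(x) + h(x - eps)) / (eps * eps)

def monge_ampere_residual(x):
    """phi''(x) - f(x) / g(phi'(x)); in one dimension det D^2 phi is just phi''."""
    return second_derivative(phi, x) - f(x) / g(t(x))

print([monge_ampere_residual(x) for x in (-1.0, 0.0, 0.7)])
```

Here both sides of the Monge-Ampère equation equal the constant $s$, since the exponential factors of $f(x)$ and $g(t(x))$ cancel.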
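Brenier theorem
$\mu = f\,dx$, $\nu = g\,dx$ are absolutely continuous in $\mathbb{R}^d$.
The optimal coupling $\gamma \in \Gamma_o(\mu,\nu)$ is concentrated on the graph of a cyclically monotone map $t$:
$$\gamma = (i \times t)_\#\mu, \qquad W_2^2(\mu,\nu) = \int_{\mathbb{R}^d} |x - t(x)|^2\,d\mu(x).$$
[Figure: the product space $\mathbb{R}^d \times \mathbb{R}^d$ with the marginals $\mu$ and $\nu$, the coupling $\gamma$, and the map $t$.]
$t$ can be recovered from the optimal Kantorovich potentials $u, v$ satisfying
$$v(y) - u(x) \le \tfrac12 |x - y|^2, \qquad \tfrac12 W_2^2(\mu,\nu) = \int v(y)\,d\nu(y) - \int u(x)\,d\mu(x)$$
by
$$t(x) = x + \nabla u(x) = \nabla\Big( \tfrac12 |x|^2 + u(x) \Big), \qquad \tfrac12 |x|^2 + u(x) \ \text{is convex.}$$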
Extensions and applications
- Strictly convex costs $c(x,y) = h(|x - y|)$: Gangbo-McCann, ... ('96-)
- Monge problem $c(x,y) = |x - y|$: Sudakov ('79), Ambrosio (2000), ..., Bianchini, Champion-De Pascale, ...
- Regularity: Caffarelli, ... ('92-), Wang, Trudinger, Loeper, Villani, McCann
- Isoperimetric and functional inequalities: Gromov, Villani, Otto, McCann, Maggi, Figalli, Pratelli, ...
- Hilbert and Wiener spaces: Feyel-Ustunel, Ambrosio-Gigli-S. ('04-), ...
- Riemannian manifolds, Ricci flow: McCann, Sturm, Villani, Lott, Topping, Carfora, ... ('98-)
- ...
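A distance between probability measures
The quadratic cost $c(x,y) = |x - y|^2$ induces a distance between probability measures with finite quadratic moment ($\mathscr{P}_2(\mathbb{R}^d)$): the so-called Kantorovich-Rubinstein-Wasserstein distance
$$W_2(\mu,\nu) := \big( C(\mu,\nu) \big)^{1/2} = \Big( \min_{\gamma \in \Gamma(\mu,\nu)} \iint |x - y|^2\,d\gamma(x,y) \Big)^{1/2}.$$
This distance has a simple interpretation in the case of discrete measures: if
$$\mu = \frac1N \sum_{k=1}^N \delta_{x_k} \quad\text{and}\quad \nu = \frac1N \sum_{k=1}^N \delta_{y_k}$$
then
$$W_2^2(\mu,\nu) = \min_\sigma \frac1N \sum_{k=1}^N |x_k - y_{\sigma(k)}|^2, \qquad \sigma \text{ a permutation of } \{1,2,\dots,N\}.$$
$(\mathscr{P}_2(\mathbb{R}^d), W_2)$ is a complete and separable metric space, and the distance $W_2$ is associated to the weak convergence of measures:
$$W_2(\mu_n,\mu) \to 0 \quad\Leftrightarrow\quad \int \zeta(x)\,d\mu_n(x) \to \int \zeta(x)\,d\mu(x) \ \text{ for every } \zeta \in C^0(\mathbb{R}^d) \text{ with } |\zeta(x)| \le A|x|^2 + B.$$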
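The discrete formula can be evaluated literally by scanning all permutations. The sketch below does this for points in $\mathbb{R}^d$ stored as tuples; the function name and the sample points are assumptions made for the example, and the $N!$ search is only practical for small $N$.

```python
from itertools import permutations
import math

def w2_uniform(xs, ys):
    """W_2 between (1/N) sum delta_{x_k} and (1/N) sum delta_{y_k}.

    Points are tuples in R^d; brute force over all N! permutations.
    """
    n = len(xs)
    assert n == len(ys)

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    best = min(sum(sq_dist(xs[k], ys[sigma[k]]) for k in range(n))
               for sigma in permutations(range(n)))
    return math.sqrt(best / n)

# Illustrative data in the plane: ys is xs shifted by the vector (3, 4), listed in another order.
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
ys = [(3.0, 6.0), (3.0, 4.0), (4.0, 4.0)]
distance = w2_uniform(xs, ys)
print(distance)
```

For a translated copy of a measure the optimal permutation matches each point with its translate, so the distance is the length of the shift, here $|(3,4)| = 5$.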
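Weak convergence, lower semicontinuity, and compactness
Definition (Weak convergence)
A sequence $\mu_n \in \mathscr{P}(\mathbb{R}^d)$ converges weakly to $\mu \in \mathscr{P}(\mathbb{R}^d)$ (written $\mu_n \rightharpoonup \mu$) if
$$\lim_{n \to +\infty} \int_{\mathbb{R}^d} \varphi(x)\,d\mu_n(x) = \int_{\mathbb{R}^d} \varphi(x)\,d\mu(x) \qquad \forall\,\varphi \in C^0_b(\mathbb{R}^d).$$
- Test functions $\varphi$ can equivalently be chosen in $C^0_c(\mathbb{R}^d)$ or in $C^\infty_c(\mathbb{R}^d)$, as for distributional convergence.
- If $X_n \to X$ pointwise, then $(X_n)_\#\mathbb{P} \rightharpoonup X_\#\mathbb{P}$.
- If $\zeta : \mathbb{R}^d \to [0,+\infty]$ is just lower semicontinuous (no boundedness is required) and $\mu_n \rightharpoonup \mu$, then
  $$\liminf_{n \to +\infty} \int_{\mathbb{R}^d} \zeta(x)\,d\mu_n(x) \ge \int_{\mathbb{R}^d} \zeta(x)\,d\mu(x).$$
- Prokhorov Theorem: A set $\Gamma \subset \mathscr{P}(\mathbb{R}^d)$ is weakly relatively compact iff it is tight, i.e. for every $\varepsilon > 0$ there exists a compact set $K \Subset \mathbb{R}^d$ such that $\mu(\mathbb{R}^d \setminus K) \le \varepsilon$ for every $\mu \in \Gamma$.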
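A standard example shows that the lower semicontinuity inequality can be strict. Take $\mu_n = \delta_{1/n}$, which converges weakly to $\mu = \delta_0$, and test it against a bounded continuous function and against the indicator of the open set $(0,+\infty)$, which is lower semicontinuous but not continuous. The test functions and the names below are assumptions made for the example.

```python
import math

def integral_against_dirac(phi, point):
    """Integral of phi against the Dirac mass at `point`."""
    return phi(point)

bounded_continuous = lambda x: math.cos(x) / (1.0 + x * x)
indicator_open = lambda x: 1.0 if x > 0 else 0.0      # lower semicontinuous, not continuous

ns = [10 ** k for k in range(1, 7)]
# mu_n = delta_{1/n} converges weakly to mu = delta_0.
continuous_values = [integral_against_dirac(bounded_continuous, 1.0 / n) for n in ns]
lsc_values = [integral_against_dirac(indicator_open, 1.0 / n) for n in ns]

limit_continuous = integral_against_dirac(bounded_continuous, 0.0)
limit_lsc = integral_against_dirac(indicator_open, 0.0)
print(continuous_values[-1], limit_continuous, lsc_values, limit_lsc)
```

The integrals of the continuous function converge to the limit value, while those of the indicator stay equal to 1 and the limit integral is 0: the mass escapes from the open set to its boundary.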
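Optimal couplings and triangular inequality
Lower semicontinuity and tightness: the minimum problem
$$W_2^2(\mu_1,\mu_2) := \min\Big\{ \int_{\mathbb{R}^m\times\mathbb{R}^m} |x_1 - x_2|^2 \,d\boldsymbol{\mu}(x_1,x_2) \;:\; \boldsymbol{\mu} \in \Gamma(\mu_1,\mu_2) \Big\}$$
is attained: $\Gamma_o(\mu_1,\mu_2)$ denotes the collection (a closed, convex set) of all the optimal couplings in $\mathcal{P}_2(\mathbb{R}^m \times \mathbb{R}^m)$. In general more than one optimal coupling could exist.
Connecting a sequence of measures, disintegration and Kolmogorov theorem: if $\boldsymbol{\mu}_{1,2} \in \Gamma_o(\mu_1,\mu_2)$, $\boldsymbol{\mu}_{2,3} \in \Gamma_o(\mu_2,\mu_3)$, $\dots$, $\boldsymbol{\mu}_{j,j+1} \in \Gamma_o(\mu_j,\mu_{j+1})$, then there exist a probability measure $\mathbb{P}$ and random variables $X_1, X_2, X_3, \dots, X_j, X_{j+1}, \dots$ such that $\boldsymbol{\mu}_{1,2} = (X_1,X_2)_\#\mathbb{P}$, $\dots$, $\boldsymbol{\mu}_{j,j+1} = (X_j,X_{j+1})_\#\mathbb{P}$. In particular
$$W_2^2(\mu_j,\mu_{j+1}) = \mathbb{E}\big[|X_j - X_{j+1}|^2\big].$$
$(X_h,X_k)_\#\mathbb{P} \in \Gamma(\mu_h,\mu_k)$, but it is not optimal in general if $h$, $k$ are not consecutive.
Application: $W_2$ is a distance, triangular inequality.
$$W_2(\mu_1,\mu_3) \le W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3)$$
$$W_2(\mu_1,\mu_3) \le \Big(\mathbb{E}\big[|X_1 - X_3|^2\big]\Big)^{1/2} = \Big(\mathbb{E}\big[|(X_1 - X_2) + (X_2 - X_3)|^2\big]\Big)^{1/2}$$
$$\le \Big(\mathbb{E}\big[|X_1 - X_2|^2\big]\Big)^{1/2} + \Big(\mathbb{E}\big[|X_2 - X_3|^2\big]\Big)^{1/2} = W_2(\mu_1,\mu_2) + W_2(\mu_2,\mu_3)$$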
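The gluing argument can be checked on a toy example. The sketch below is an illustration added here, not part of the slides. It assumes three uniform measures on $n$ points of $\mathbb{R}^2$ and relies on the standard fact that, for such measures, an optimal coupling can be found among permutations, so a brute-force search suffices for small $n$. The function names `sq_dist` and `optimal_permutation` are hypothetical. Composing the two optimal assignments plays the role of the pair $(X_1, X_3)$: it yields an admissible coupling of $\mu_1$ and $\mu_3$, in general not an optimal one.

```python
import itertools
import math
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def optimal_permutation(xs, ys):
    """Brute-force optimal assignment between two uniform n-point measures.

    Returns (perm, cost): xs[i] is matched to ys[perm[i]] and cost is the
    mean squared distance, i.e. W_2^2 between the two uniform measures.
    """
    n = len(xs)
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(n)):
        cost = sum(sq_dist(xs[i], ys[perm[i]]) for i in range(n)) / n
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

rng = random.Random(0)  # fixed seed, arbitrary choice
n = 5
pts1, pts2, pts3 = ([(rng.random(), rng.random()) for _ in range(n)] for _ in range(3))

s12, c12 = optimal_permutation(pts1, pts2)
s23, c23 = optimal_permutation(pts2, pts3)
_, c13 = optimal_permutation(pts1, pts3)

# Gluing: under the uniform P on indices i, set
# X1 = pts1[i], X2 = pts2[s12[i]], X3 = pts3[s23[s12[i]]].
glued = [s23[s12[i]] for i in range(n)]
glued_cost = sum(sq_dist(pts1[i], pts3[glued[i]]) for i in range(n)) / n

w12, w23, w13 = math.sqrt(c12), math.sqrt(c23), math.sqrt(c13)
glued_l2 = math.sqrt(glued_cost)  # (E|X1 - X3|^2)^(1/2)

print("W2(mu1,mu3)            =", w13)
print("(E|X1-X3|^2)^(1/2)     =", glued_l2)
print("W2(mu1,mu2)+W2(mu2,mu3) =", w12 + w23)
```

The three printed numbers are ordered as in the chain of inequalities above: the first step uses that the glued coupling is merely admissible, the second is the Minkowski inequality in $L^2(\mathbb{P})$.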
“Soft” properties
- Convergence with respect to $W_2$ ⇔ weak convergence + convergence of the quadratic moments.
- Completeness (if one considers all the probability measures in $\mathcal{P}_2(\mathbb{R}^m)$).
- Lower semicontinuity with respect to weak/distributional convergence.
- Convexity (but linear segments are not geodesics!)
- Existence of (constant speed, minimizing) geodesics connecting arbitrary measures $\mu_0$, $\mu_1$: they are curves $\mu : t \in [0,1] \mapsto \mu_t$ s.t.
$$W_2(\mu_0,\mu_1) = L_0^1[\mu], \qquad W_2(\mu_s,\mu_t) = |t-s|\,W_2(\mu_0,\mu_1).$$
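The remark that linear segments are not geodesics can be seen on two Dirac masses on the real line. The sketch below is an added illustration under stated assumptions: it uses the standard one-dimensional fact that the monotone (sorted) coupling is optimal for the quadratic cost, and the names `w2_1d`, `displacement`, and `linear` are hypothetical. With $\mu_0 = \delta_a$ and $\mu_1 = \delta_b$, the curve $\mu_t = \delta_{(1-t)a+tb}$ satisfies the constant speed identity, while the linear segment $(1-t)\delta_a + t\delta_b$ gives $W_2 = \sqrt{|t-s|}\,|a-b|$ instead.

```python
import math

def w2_1d(xs, ps, ys, qs, tol=1e-15):
    """W2 between two discrete measures on the real line.

    Atoms xs with weights ps, atoms ys with weights qs (each summing to 1).
    Mass is matched in increasing order of the atoms (monotone coupling).
    """
    a = sorted(zip(xs, ps))
    b = sorted(zip(ys, qs))
    i = j = 0
    rem_a, rem_b = a[0][1], b[0][1]  # mass still to be shipped / received
    total = 0.0
    while i < len(a) and j < len(b):
        m = min(rem_a, rem_b)
        total += m * (a[i][0] - b[j][0]) ** 2
        rem_a -= m
        rem_b -= m
        if rem_a <= tol:
            i += 1
            rem_a = a[i][1] if i < len(a) else 0.0
        if rem_b <= tol:
            j += 1
            rem_b = b[j][1] if j < len(b) else 0.0
    return math.sqrt(total)

a_pt, b_pt = 0.0, 2.0   # mu_0 = delta_a, mu_1 = delta_b (arbitrary choice)
s, t = 0.25, 0.75

def displacement(r):
    """Atoms and weights of delta_{(1-r)a + r b}."""
    return [(1 - r) * a_pt + r * b_pt], [1.0]

def linear(r):
    """Atoms and weights of (1-r) delta_a + r delta_b."""
    return [a_pt, b_pt], [1 - r, r]

w01 = w2_1d([a_pt], [1.0], [b_pt], [1.0])
w_disp = w2_1d(*displacement(s), *displacement(t))
w_lin = w2_1d(*linear(s), *linear(t))

print("W2(mu_0, mu_1)           =", w01)
print("displacement: W2(mu_s,mu_t) =", w_disp, " vs |t-s| W2 =", abs(t - s) * w01)
print("linear:       W2(nu_s,nu_t) =", w_lin)
```

Only the displacement curve reproduces $|t-s|\,W_2(\mu_0,\mu_1)$; along the linear segment the distance scales like the square root of $|t-s|$.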