Projection methods: convergence and counterexamples
4 January 2019
Hangzhou Dianzi University
Vera Roshchina
School of Mathematics and Statistics, UNSW
[email protected]
Based on joint work with Hong-Kun Xu, Roberto Cominetti and Andrew Williamson.
The method of alternating projections
[Figure: two sets C_1 and C_2]
Let H be a Hilbert space, with inner product 〈·, ·〉 and norm ‖ · ‖.
For any closed convex set C ⊆ H and any x ∈ H there exists a unique point P_C(x) ∈ C such that
‖x − P_C(x)‖ = inf_{y∈C} ‖x − y‖.
Given two closed convex sets C_1, C_2 ⊆ H and x_0 ∈ H, let
x_1 = P_{C_1}(x_0), x_2 = P_{C_2}(x_1),
x_3 = P_{C_1}(x_2), x_4 = P_{C_2}(x_3),
. . .
x_{2k+1} = P_{C_1}(x_{2k}), x_{2k+2} = P_{C_2}(x_{2k+1}),
. . .
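The iteration above can be sketched numerically. The following is a minimal illustration in R^2, assuming (purely for the example) that C_1 is the closed unit ball and C_2 is the halfspace {x : x_1 + x_2 ≤ −1}; the helper names are hypothetical and not taken from the talk.

```python
import numpy as np

def project_ball(x, center, radius):
    """Euclidean projection onto the closed ball B(center, radius)."""
    d = x - center
    dist = np.linalg.norm(d)
    if dist <= radius:
        return x.copy()
    return center + radius * d / dist

def project_halfspace(x, a, b):
    """Euclidean projection onto the halfspace {y : <a, y> <= b}."""
    violation = a @ x - b
    if violation <= 0:
        return x.copy()
    return x - violation * a / (a @ a)

def alternating_projections(x0, proj1, proj2, iters=200):
    """Apply x -> proj2(proj1(x)) a fixed number of times."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = proj2(proj1(x))
    return x

# Illustrative sets (assumptions): unit ball and the halfspace x_1 + x_2 <= -1.
center = np.array([0.0, 0.0])
a = np.array([1.0, 1.0])
P1 = lambda x: project_ball(x, center, 1.0)
P2 = lambda x: project_halfspace(x, a, -1.0)

x_final = alternating_projections([3.0, 2.0], P1, P2)
print(x_final, np.linalg.norm(x_final), a @ x_final)
```

In this particular configuration the intersection has nonempty interior, so the iterates land in C_1 ∩ C_2 after a couple of steps; this is a feature of the example, not of the method in general.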
Convergence
Let M_1 and M_2 be closed affine subspaces of H, M = M_1 ∩ M_2.
Theorem 1 (von Neumann 1933). For each x ∈ H
lim_{n→∞} ‖(P_{M_2} P_{M_1})^n (x) − P_M(x)‖ = 0.
von Neumann, Functional Operators-Vol. II. The Geometry of Orthogonal Spaces,
Annals of Math. Studies, 1950 (reprint of 1933 lectures).
Theorem 2 (Bregman 1965). For C = C_1 ∩ C_2 ≠ ∅, where
C_1, C_2 ⊆ H are closed convex sets, the sequence of alternating
projections converges weakly to a point in C.
Bregman, The method of successive projection for finding a common point of
convex sets, Sov. Math. Dokl., 1965.
The question of whether convergence is always strong remained open until 2004, despite many works on sufficient conditions.
Counterexample of Hundal
Theorem 3 (Hundal 2004). There exist a Hilbert space H, closed
convex sets C_1, C_2 ⊂ H with intersection C_1 ∩ C_2 = {0} and a
starting point x_0 such that
lim_{n→∞} ‖(P_{C_2} P_{C_1})^n (x_0)‖ > 0.
In a separable Hilbert space with an orthonormal basis {e_i}_{i=1}^∞, let
C_1 = {x | 〈x, e_1〉 ≤ 0}, C_2 = cone {p(t) | t ≥ 0},
p(t) = e_{⌊t⌋+2} cos(f(t)) + e_{⌊t⌋+3} sin(f(t)) + e_1 h(t), t ≥ 0,
f(t) = (π/2)(t − ⌊t⌋), h(t) = exp(−100 t^3).
Hundal, An alternating projection that does not converge in norm. Nonlinear
Anal. 2004.
Rate of convergence
Angles between subspaces
The Friedrichs angle between two closed linear subspaces M_1 and M_2 is α ∈ [0, π/2] such that (B_H is the unit ball, M = M_1 ∩ M_2)
c = cos α = sup { |〈x, y〉| : x ∈ M_1 ∩ M^⊥ ∩ B_H, y ∈ M_2 ∩ M^⊥ ∩ B_H }.
Theorem 4 (Aronszajn, 1950). For each x ∈ H and n ≥ 1
‖(P_{M_2} P_{M_1})^n (x) − P_M(x)‖ ≤ c^{2n−1} ‖x‖.
We have c < 1 iff M_1 + M_2 is closed; in this case the method of alternating projections converges linearly.
Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 1950.
The constant is the smallest possible: Kayalar and Weinert, Error bounds for the method of alternating projections, Math. Control Signals Systems, 1988.
Generalisations to several sets: Reich and Zalas, The optimal error bound for the method of simultaneous projections, J. Approx. Theory, 2017.
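The bound of Theorem 4 can be checked on a toy case. The sketch below assumes two lines through the origin in R^2 meeting at angle θ = π/6 (an illustrative choice), so that M = {0}, P_M = 0 and c = cos θ; the variable names are hypothetical.

```python
import numpy as np

theta = np.pi / 6                       # angle between the two lines (assumption)
u1 = np.array([1.0, 0.0])               # unit vector spanning M_1
u2 = np.array([np.cos(theta), np.sin(theta)])  # unit vector spanning M_2
P1 = np.outer(u1, u1)                   # orthogonal projector onto M_1
P2 = np.outer(u2, u2)                   # orthogonal projector onto M_2
c = abs(u1 @ u2)                        # cosine of the Friedrichs angle, since M = {0}

x = np.array([2.0, 1.0])
T = P2 @ P1
errors, bounds = [], []
y = x.copy()
for n in range(1, 11):
    y = T @ y                           # (P_{M_2} P_{M_1})^n x
    errors.append(np.linalg.norm(y))    # distance to P_M(x) = 0
    bounds.append(c ** (2 * n - 1) * np.linalg.norm(x))

for n, (e, b) in enumerate(zip(errors, bounds), start=1):
    print(n, e, b)
```

In this example the error shrinks by exactly the factor c^2 per full cycle, which matches the linear rate stated above.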
What if c = 1?
Theorem 5 (Bauschke, Borwein and Lewis). For two closed affine subspaces M_1, M_2 ⊆ H exactly one of the alternatives holds.
(1) M_1 + M_2 is closed. Then for each x the alternating projections converge linearly to P_{M_1 ∩ M_2}(x) with rate c^2.
(2) M_1 + M_2 is not closed. Then for any sequence of positive real numbers
1 > λ_1 ≥ λ_2 ≥ · · · ≥ λ_n → 0
there exists a point x_λ ∈ H such that
‖(P_{M_2} P_{M_1})^n (x_λ) − P_M(x_λ)‖ ≥ λ_n ∀n ∈ ℕ.
Bauschke, Borwein, and Lewis, The method of cyclic projections for closed convex
sets in Hilbert space, Contemporary Mathematics, 1997.
Bauschke, Deutsch, Hundal, Characterizing arbitrarily slow convergence in the
method of alternating projections. Int. Trans. Oper. Res., 2009.
Special properties and convergence
Regularity and the existence of Slater points
Gubin, Polyak, Raik, The method of projections for finding the common point of
convex sets, USSR Comput. Math. Math. Phys., 1967.
Symmetry
Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in
Banach spaces, Houston J. Math., 1977.
Reich, A limit theorem for projections, Linear and Multilinear Algebra, 1983.
Semialgebraic structure
Borwein, Li, Yao, Analysis of the convergence rate for the cyclic projection algorithm applied to basic semialgebraic convex sets. SIAM J. Optim. 24 (2014), 498–527.
Drusvyatskiy, Li, Wolkowicz, A note on alternating projections for ill-posed semidefinite feasibility problems. Math. Program. 162 (2017), 537–548.
What if the problem is infeasible?
Assume that C_1, C_2 ⊆ H are convex and closed, but possibly C_1 ∩ C_2 = ∅.
Define the distance between C_1 and C_2 as
dist(C_1, C_2) = inf_{x∈C_1, y∈C_2} ‖y − x‖.
The following sets may be empty:
P_1 = {x ∈ C_1 | dist(x, C_2) = dist(C_1, C_2)},
P_2 = {y ∈ C_2 | dist(y, C_1) = dist(C_1, C_2)}.
[Figure: two pairs of sets C_1, C_2 with the vector v; the sets P_1 and P_2 are marked in the first pair.]
The displacement vector and convergence
Define the displacement vector
v = P_{cl(C_2 − C_1)}(0),
where cl denotes the closure and C_2 − C_1 is the Minkowski difference,
C_2 − C_1 = {y − x | x ∈ C_1, y ∈ C_2}.
For the alternating projections we have
x_{2k} − x_{2k+1} → v, x_{2k+2} − x_{2k+1} → v.
If P_1 and P_2 are empty, then ‖x_n‖ → ∞.
Otherwise x_{2k+1} ⇀ x ∈ P_1, x_{2k} ⇀ y ∈ P_2, and y − x = v.
Bauschke, Borwein, On the Convergence of von Neumann's Alternating Projection Algorithm for Two Sets, Set-Valued Analysis, 1993.
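The infeasible case can be illustrated with a small sketch. It assumes (for the example only) two disjoint unit disks in R^2 centred at (0, 0) and (4, 0), for which the displacement vector is v = (2, 0), P_1 = {(1, 0)} and P_2 = {(3, 0)}; the helper name is hypothetical.

```python
import numpy as np

def project_disk(x, center, radius):
    """Euclidean projection onto the closed disk with the given center and radius."""
    d = x - center
    dist = np.linalg.norm(d)
    return x.copy() if dist <= radius else center + radius * d / dist

c1, c2 = np.array([0.0, 0.0]), np.array([4.0, 0.0])  # illustrative centers
y = np.array([2.0, 3.0])                 # starting point x_0
for _ in range(100):
    x = project_disk(y, c1, 1.0)         # odd iterate, lies in C_1
    y = project_disk(x, c2, 1.0)         # even iterate, lies in C_2

gap = y - x                              # approximates the displacement vector v
print(x, y, gap)
```

Here the odd and even iterates settle at the two nearest points of the disks, and their difference approximates v.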
A helpful illustration
[Figure: the sets C_1 and C_2]
What about more than two sets?
For m ≥ 2 sets we can generalise alternating projections, starting from x_0 ∈ H and projecting cyclically onto each of the sets.
For three sets C_1, C_2, C_3,
x_1 = P_{C_1}(x_0), x_2 = P_{C_2}(x_1), x_3 = P_{C_3}(x_2), x_4 = P_{C_1}(x_3), · · ·
[Figure: cyclic projections onto three sets C_1, C_2, C_3, with iterates u_0, u_1, ..., u_6]
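A minimal sketch of cyclic projections for three sets follows. It assumes three pairwise disjoint unit disks in R^2 (centres chosen only for illustration), so the intersection is empty and the iterates are expected to settle on a fixed cycle; the function names are hypothetical.

```python
import numpy as np

def project_disk(x, center, radius=1.0):
    """Euclidean projection onto the closed disk with the given center and radius."""
    d = x - center
    dist = np.linalg.norm(d)
    return x.copy() if dist <= radius else center + radius * d / dist

# Illustrative centers (assumption): three pairwise disjoint unit disks.
centers = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([2.0, 3.0])]

def sweep(x):
    """One pass of P_{C_1}, then P_{C_2}, then P_{C_3}; returns the three points visited."""
    points = []
    for c in centers:
        x = project_disk(x, c)
        points.append(x)
    return points

x = np.array([5.0, 5.0])
for _ in range(200):
    cycle = sweep(x)
    x = cycle[-1]

next_cycle = sweep(x)
drift = max(np.linalg.norm(p - q) for p, q in zip(cycle, next_cycle))
print(cycle, drift)
```

After enough sweeps, one further sweep reproduces the same three points, which is the numerical signature of a limit cycle.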
There is no variational characterisation
Under mild assumptions (e.g. one of the sets is bounded) cyclic projections converge weakly either to a point in the intersection C_1 ∩ C_2 ∩ · · · ∩ C_m or to a fixed cycle if the intersection is empty.
Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in
Banach spaces. Houston J. Math., 1977.
Recall that for two sets this cycle realises the distance between the sets; however, for m ≥ 3 there is no function Φ : H^m → ℝ such that for any collection of compact convex sets C_1, C_2, . . . , C_m ⊂ H the limit cycles are precisely the solutions to the minimisation problem
min_{x_i ∈ C_i} Φ(x_1, x_2, . . . , x_m).
Baillon, Combettes, Cominetti, There is no variational characterization of the cycles in the method of periodic projections. J. Funct. Anal., 2012.
Under-relaxed projections
Fix α ∈ (0, 1] and instead of P_C(x) consider
R(x) = (1 − α)x + αP_C(x).
[Figure: a point u, its true projection onto C and its under-relaxed projection]
This leads to under-relaxed alternating and cyclic projections.
[Figure: under-relaxed cyclic projections onto three sets C_1, C_2, C_3; iterations for α = 0.75 and α = 0.35 (shown in red).]
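The under-relaxed cyclic scheme can be sketched as follows. The three unit disks and the starting point are assumptions made for illustration (they are not the sets from the figure), and the helper names are hypothetical; the relaxation parameters 0.75 and 0.35 are the ones quoted in the caption, with α = 1 added for comparison.

```python
import numpy as np

def project_disk(x, center, radius=1.0):
    """Euclidean projection onto the closed disk with the given center and radius."""
    d = x - center
    dist = np.linalg.norm(d)
    return x.copy() if dist <= radius else center + radius * d / dist

def relaxed(x, center, alpha):
    """Under-relaxed projection R(x) = (1 - alpha) x + alpha P_C(x)."""
    return (1 - alpha) * x + alpha * project_disk(x, center)

# Illustrative sets (assumption): three pairwise disjoint unit disks.
centers = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([2.0, 3.0])]

def relaxed_sweep(x, alpha):
    """One cycle of under-relaxed projections onto C_1, C_2, C_3 in turn."""
    for c in centers:
        x = relaxed(x, c, alpha)
    return x

def alpha_cycle_point(alpha, iters=2000):
    """Iterate the relaxed sweep; the result approximates a point of the alpha-cycle."""
    x = np.array([5.0, 5.0])
    for _ in range(iters):
        x = relaxed_sweep(x, alpha)
    return x

results = {alpha: alpha_cycle_point(alpha) for alpha in (1.0, 0.75, 0.35)}
for alpha, point in results.items():
    print(alpha, point)
```

Smaller values of α pull the cycle away from the sets themselves and towards the region between them, which is the behaviour the limit α ↓ 0 is meant to capture.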
Two special limits
Fix α ∈ (0, 1] and instead of P_C(x) consider
R(x) = (1 − α)x + αP_C(x) = x + α(P_C(x) − x).
The under-relaxed cyclic projections converge weakly to a fixed cycle iff such a cycle exists (e.g. when one of the sets is bounded).
Bruck, Reich, Nonexpansive projections and resolvents of accretive operators in Banach spaces. Houston J. Math., 1977.
Consider the limit of such α-cycles as α ↓ 0, or alternatively vary α, letting α_k ↓ 0 with
∑_{k∈ℕ} α_k = +∞.
De Pierro’s conjecture
Conjecture 1. The least squares solution
S = Argmin_{x∈H} ∑_{i=1}^m min_{x_i ∈ C_i} ‖x − x_i‖^2
exists iff both limits exist and solve this least squares problem.
De Pierro, From parallel to sequential projection methods and vice versa in convex
feasibility: results and conjectures, Stud. Comput. Math., 2001.
The conjecture is true for affine subspaces of ℝ^n,
Censor, Eggermont, Gordon, Strong underrelaxation in Kaczmarz's method for inconsistent systems. Numer. Math., 1983.
closed affine subspaces satisfying a metric regularity condition,
Bauschke, Edwards, A conjecture by De Pierro is true for translates of regular subspaces, J. Nonlinear Convex Anal., 2005.
and sets satisfying a certain geometric condition.
Baillon, Combettes, Cominetti, Asymptotic behavior of compositions of under-relaxed nonexpansive operators, J. Dyn. Games, 2014.
A misleading example
C_1 = co {(−2, 2, 1), (−2, 2, −1)}, C_2 = co {(2, 2, 1), (2, 2, −1)},
C_3 = {(x, y, z) | x^2 + y^2 ≤ 1, |z| ≤ 1}, S = {(0, 5/3, z) : |z| ≤ 1}.
[Figure: two panels showing S, C_1, C_2, C_3 and the starting point u_0, with z_0 = 0.5 and z_0 = −0.5.]
Under-relaxed projections for α = 0.5 and different starting points.
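The set S for this example can be checked by hand. The short derivation below is a sketch: it restricts attention to |z| ≤ 1, where the z-coordinate does not affect any of the three distances, and uses the fact that the sum of squared distances is strictly convex in (x, y) and even in x, so its minimiser has x = 0. The symbols Φ and φ are introduced here only for this calculation.

```latex
% Sum of squared distances to C_1, C_2, C_3 in the (x, y)-plane, for |z| <= 1.
\begin{align*}
\Phi(x,y) &= (x+2)^2 + (y-2)^2 + (x-2)^2 + (y-2)^2
             + \bigl(\sqrt{x^2+y^2}-1\bigr)^2
  && \text{for } x^2+y^2 \ge 1,\\
\varphi(y) &:= \Phi(0,y) = 8 + 2(y-2)^2 + (y-1)^2
  && \text{for } y \ge 1,\\
\varphi'(y) &= 4(y-2) + 2(y-1) = 6y - 10 = 0
  \iff y = \tfrac{5}{3}.
\end{align*}
```

Since 5/3 > 1, the stationary point lies outside the cylinder, consistent with the formula used for the distance to C_3, and the minimisers are exactly the points (0, 5/3, z) with |z| ≤ 1.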
Counterexample
C_1 = co {(−2, 2, 1), (−2, 2, −1)}, C_2 = co {(2, 2, 1), (2, 2, −1)},
C_3 = co {p_k | k ∈ ℕ}, p_k = (cos t_k, sin t_k, (−1)^k).
Here {t_k} is increasing, t_1 = π/4 and t_k → π/2 as k → ∞.
[Figure: the sets C_1, C_2, C_3 and the points p_1, p_2, p_3, p_4]
For this three-set system the limits described earlier do not exist; however, the least-squares problem has a solution.
Cominetti, Roshchina, Williamson, A counterexample to De Pierro’s conjecture on
the convergence of under-relaxed cyclic projections, Optimization, 2018.
Reduction to two dimensions
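The projections of the two-dimensional cycles correspond to an 'oscillating' path in 3D. As α ↓ 0, limit cycles 'follow' this path, and hence there is no convergence to a single point.
[Figure: the two-dimensional reduction with C_1' = {a}, C_2' = {b}, C_3' and the points v_1, v_2, v_3, alongside the three-dimensional sets C_1, C_2, C_3 and the points p_1, p_2, p_3, p_4]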
Bonus #1: Krasnoselskii-Mann iterations
When T : C → C is a contraction, i.e. for some ρ ∈ [0, 1) we have
‖Tx − Ty‖ ≤ ρ‖x − y‖ ∀x, y ∈ C,
for the fixed-point iterations x_{n+1} = Tx_n we get asymptotic regularity,
‖Tx_n − x_n‖ ≤ ρ^n ‖Tx_0 − x_0‖ → 0.
This is not the case for nonexpansive maps (with ρ = 1).
Let T : C → C be a nonexpansive map defined on a convex bounded subset C of a normed space X.
Krasnoselskii–Mann iterations (for α_n ∈ [0, 1]):
x_{n+1} = (1 − α_{n+1}) x_n + α_{n+1} T x_n.
[Figure: for a rotation T : ℝ^2 → ℝ^2, the full step x_{k+1} = T(x_k) on the left and the averaged step x_{k+1} = (1 − α)x_k + αT(x_k) on the right.]
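The contrast between the full step and the averaged step can be reproduced numerically. The sketch below assumes a rotation by π/2 and a constant relaxation α = 0.5, both chosen only for illustration; the function name is hypothetical.

```python
import numpy as np

theta = np.pi / 2                        # rotation angle (assumption)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def iterate(x0, alpha, steps):
    """Run x_{k+1} = (1 - alpha) x_k + alpha T x_k for the given number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = (1 - alpha) * x + alpha * (T @ x)
    return x

x0 = [1.0, 0.0]
full = iterate(x0, 1.0, 101)             # plain fixed-point iteration: keeps circling
averaged = iterate(x0, 0.5, 101)         # averaged iteration: spirals into the fixed point 0

print(np.linalg.norm(full), np.linalg.norm(averaged))
```

The full step preserves the norm, so it never approaches the fixed point at the origin, while the averaged step shrinks the norm by the factor cos(θ/2) at every iteration.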
Rate of convergence
Krasnoselskii–Mann iterations: x_{n+1} = (1 − α_{n+1}) x_n + α_{n+1} T x_n.
Theorem 6. The Krasnoselskii–Mann iterates satisfy
‖Tx_n − x_n‖ ≤ diam C / √(π ∑_{i=1}^n α_i(1 − α_i)). (1)
Cominetti, Soto, Vaisman, On the rate of convergence of Krasnoselskii–Mann iterations and their connection with sums of Bernoullis. Israel J. Math., 2014.
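As a sanity check of bound (1), the sketch below compares the residual ‖Tx_n − x_n‖ with the right-hand side for one simple instance. It assumes C is the closed unit disk in ℝ^2 (so diam C = 2), T is a rotation by π/3 and α_n ≡ 0.5; these choices are illustrative and far from the worst case, so the bound is expected to hold with a lot of slack.

```python
import numpy as np

theta = np.pi / 3                        # rotation angle (assumption)
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
alpha = 0.5                              # constant relaxation parameter (assumption)
diam_C = 2.0                             # diameter of the closed unit disk

x = np.array([1.0, 0.0])                 # x_0 in C
residuals, bounds = [], []
partial_sum = 0.0
for n in range(1, 51):
    x = (1 - alpha) * x + alpha * (T @ x)        # x_n
    partial_sum += alpha * (1 - alpha)           # sum_{i<=n} alpha_i (1 - alpha_i)
    residuals.append(np.linalg.norm(T @ x - x))
    bounds.append(diam_C / np.sqrt(np.pi * partial_sum))

print(residuals[0], bounds[0], residuals[-1], bounds[-1])
```

For this linear map the residual decays geometrically, whereas the bound decays like 1/√n; the bound is designed to cover every nonexpansive map on every bounded convex set.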
Theorem 7. The constant κ = 1/√π in the bound (1) is tight.
Specifically, for each κ < 1/√π there exists a nonexpansive map T defined on the unit cube C = [0, 1]^ℕ ⊆ ℓ^∞(ℕ), an initial point x_0 ∈ C, and a constant sequence α_n ≡ α, such that the corresponding KM iterates satisfy for some n ∈ ℕ
‖Tx_n − x_n‖ > κ diam C / √(∑_{i=1}^n α_i(1 − α_i)).
Bravo, Cominetti, Sharp convergence rates for averaged nonexpansive maps, Isr. J. Math., 2018.
Bonus #2: Over-relaxed projections
Douglas–Rachford is a variant of the projection method that uses reflections and averages instead of projections. The Douglas–Rachford operator is defined as
T_{A,B} := (1/2)(I + R_B R_A), R_C := 2P_C − I.
For the convex setting, the convergence results are very similar to those for the method of alternating projections. However, the Douglas–Rachford method is successfully applied to nonsmooth problems, where its behaviour is not fully understood.
https://carma.newcastle.edu.au/scott/#!page-beauty-in-mathematics
Aragon Artacho, Borwein, Tam, Global behavior of the Douglas–Rachford method
for a nonconvex feasibility problem. J. Global Optim. 2016
Lindstrom, Sims, Survey: Sixty Years of Douglas–Rachford, 2018 (arxiv preprint)
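The Douglas–Rachford operator defined above can be sketched in a few lines. The example assumes the convex setting with A the closed unit disk and B the halfspace {x : x_1 + x_2 ≤ −1} in ℝ^2 (illustrative choices); the function names are hypothetical. The point of interest is the "shadow" P_A(x_k), which in the convex feasible case approaches A ∩ B.

```python
import numpy as np

def project_A(x):
    """Projection onto the closed unit disk centred at the origin."""
    n = np.linalg.norm(x)
    return x.copy() if n <= 1.0 else x / n

a, b = np.array([1.0, 1.0]), -1.0        # halfspace {x : <a, x> <= b} (assumption)

def project_B(x):
    """Projection onto the halfspace {x : <a, x> <= b}."""
    violation = a @ x - b
    return x.copy() if violation <= 0 else x - violation * a / (a @ a)

def reflect(project, x):
    """Reflection R_C = 2 P_C - I."""
    return 2 * project(x) - x

def douglas_rachford_step(x):
    """T_{A,B} x = (x + R_B R_A x) / 2."""
    return 0.5 * (x + reflect(project_B, reflect(project_A, x)))

x = np.array([3.0, 2.0])
for _ in range(100):
    x = douglas_rachford_step(x)

shadow = project_A(x)                    # candidate point of A ∩ B
print(x, shadow)
```

The same operator can be evaluated for nonconvex sets, as long as a projection (selection) is available, but the convergence behaviour there is a separate matter.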
Bonus #3: Optimisation with projections
Consider an optimisation problem
min f(x)
s.t. x ∈ C1 ∩ C2 ∩ · · · ∩ Cm,
where f : ℝ^n → ℝ is convex, and C_1, . . . , C_m are closed convex sets in ℝ^n. A version of the cyclic projections algorithm can be supplemented with a gradient (subgradient) step. For example, consider sequential, parallel and cyclic projections, starting from some x_0 ∈ ℝ^n:
x_{k+1} := P_{C_m} · · · P_{C_2} P_{C_1}(x_k − λ_k v_k),
x_{k+1} := ∑_{j=1}^m β_j P_{C_j}(x_k − λ_k v_k),
x_{k+1} := P_{C_{[k+1]}}(x_k − λ_k v_k), [k + 1] = (k mod m) + 1,
where v_k ∈ ∂f(x_k) (when f is smooth, v_k = ∇f(x_k)).
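The sequential variant can be sketched on a small problem. The instance below is an assumption made for illustration: f(x) = |x_1 − 2| + |x_2 − 2|, C_1 the closed unit disk and C_2 = {x : x_1 ≤ 0.5} in ℝ^2, with step sizes λ_k = 1/k. For this instance the optimal point is (0.5, √0.75), so the optimal value is 3.5 − √0.75.

```python
import numpy as np

a = np.array([2.0, 2.0])

def f(x):
    """Objective f(x) = |x_1 - 2| + |x_2 - 2| (convex, nonsmooth)."""
    return np.abs(x - a).sum()

def subgradient(x):
    """One element of the subdifferential of f at x."""
    return np.sign(x - a)

def project_C1(x):
    """Projection onto the closed unit disk centred at the origin."""
    n = np.linalg.norm(x)
    return x.copy() if n <= 1.0 else x / n

def project_C2(x):
    """Projection onto the halfspace {x : x_1 <= 0.5}."""
    y = x.copy()
    y[0] = min(y[0], 0.5)
    return y

x = np.zeros(2)
for k in range(1, 5001):
    step = x - (1.0 / k) * subgradient(x)    # subgradient step with lambda_k = 1/k
    x = project_C2(project_C1(step))         # sequential projections P_{C_2} P_{C_1}

f_star = 3.5 - np.sqrt(0.75)                 # optimal value for this instance
print(x, f(x), f_star)
```

The step sizes satisfy λ_k → 0 and ∑ λ_k = +∞; the parallel and cyclic variants differ only in the line that applies the projections.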
Convergence
If the function f is convex, 0 < λ_k → 0, ∑_{k=1}^∞ λ_k = +∞, the feasible set is nonempty and {x_k} is bounded, then for all three methods the sequence {f(x_k)} of function values converges to the optimal value, and every cluster point of {x_k} is an optimal solution, given that the solution set is nonempty.
This result is also true for composite optimisation problems, when f = f_1 + · · · + f_N, and the subgradient step is replaced by a cycle of subgradient steps involving each one of these functions.
Convergence of sequential projections was shown in De Pierro, Neto,
Salomao, From convex feasibility to convex constrained optimization using block
action projection methods and underrelaxation. Int. Trans. Oper. Res. (2009)
Parallel and cyclic versions: Roshchina, Xu, forthcoming preprint, 2019.
Thank you