An iterative Gauss-Newton-type method for solving the TLS problem
A Gauss-Newton iteration for solving TLS problems
Antonio Fazzi¹ and Dario Fasino²
¹ Gran Sasso Science Institute. ² University of Udine.
Como, 17/02/2017
A. Fazzi, D. Fasino (Gran Sasso Science Institute, University of Udine)
The Total Least Squares problem
The Total Least Squares (TLS) problem
Given A ∈ R^{m×n} with m > n and b ∈ R^m, the Total Least Squares (TLS) problem is defined as

    min_{E,f} ‖(E | f)‖_F²   subject to   b + f ∈ Im(A + E),

where E ∈ R^{m×n} and f ∈ R^m. After we find such a matrix (E | f) whose Frobenius norm is minimum, each x ∈ R^n satisfying

    (A + E)x = b + f

is a solution of the TLS problem.
Solution, existence and uniqueness
Define the matrix C = (A | b) and consider its SVD

    C = U Σ Vᵀ.

In the following we assume that the problem has a unique solution; this happens if σ'_n > σ_{n+1}, where σ'_n and σ_{n+1} are the smallest singular values of A and C, respectively.

The solution of the TLS problem is the vector x_TLS such that

    v_{n+1} = −ζ (x_TLS, −1)ᵀ,

where ζ is a normalization constant and v_{n+1} is the last column of V.
The function η
It is known that x_TLS can be characterized as the point of global minimum of the function

    η(x) = ‖Ax − b‖₂² / (1 + ‖x‖₂²).

The function η(x) measures the backward error of the vector x as an approximate solution of the linear system Ax = b:

Lemma
For each vector x there exists a rank-one matrix (E | f) such that

    (A + E)x = b + f,   ‖(E | f)‖_F² = ‖(E | f)‖₂² = η(x).

Moreover, for each matrix (E | f) such that (A + E)x = b + f it holds that

    ‖(E | f)‖_F² ≥ ‖(E | f)‖₂² ≥ η(x).
Solution of TLS problems using Gauss-Newton
The Gauss-Newton algorithm
The Gauss-Newton algorithm is a cheap optimization method that can solve nonlinear least squares problems

    min_{x ∈ Rⁿ} ‖f(x)‖₂²,   f: Rⁿ → Rᵐ,   m ≥ n.

The basic idea is to linearize f(x) in a neighborhood of x: the step x → x + h is computed by replacing ‖f(x + h)‖₂² with ‖f(x) + J(x)h‖₂² and solving the corresponding ordinary LS problem.
Gauss-Newton applied to the function η
We set

    η(x) = ‖f(x)‖₂²,   where   f(x) = (Ax − b) / √(1 + xᵀx).

Hence argmin_x ‖f(x)‖₂ = x_TLS.

The Jacobian of f is

    J(x) = A / √(1 + xᵀx) − (Ax − b) xᵀ / (1 + xᵀx)^{3/2}.
Outline of the algorithm
Basic GN-TLS method

Input: A, b (problem data); ε, maxit (stopping criteria)
Output: x (approximation of x_TLS)

Set k := 0
Compute x_0 := argmin_x ‖Ax − b‖_2
Compute f_0 := f(x_0) and J_0 := J(x_0)
while ‖J_kᵀ f_k‖_2 ≥ ε and k < maxit
    Compute h_k := argmin_h ‖J_k h + f_k‖_2
    Set x_{k+1} := x_k + h_k
    Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
end
x := x_k
Computational Cost
The Gauss-Newton method solves at each step a least squares problem, which can be written in the form

    min_h ‖J_k h + f_k‖₂² = (1/(1 + x_kᵀx_k)) · min_h ‖(A − r_k x_kᵀ/(1 + x_kᵀx_k)) h + r_k‖₂²,

where r_k = A x_k − b; the scale factor does not affect the minimizer h_k.

We can compute the QR factorization of A only once, and then use a technique that updates the QR factorization under rank-one perturbations. Each update has only quadratic cost.
Geometry of the method
Proposition
Let
    f(x) = (Ax − b) / √(1 + xᵀx).

Then its image Im(f) ⊂ Rᵐ is an open subset of the ellipsoid vᵀXv = 1, where X = (CCᵀ)⁺.
Image of f (x)
Figure: Surface plot of Im(f). Blue star: f(x_TLS); red star: f(x_LS).
An improved variant
Improved variant of the basic GN-TLS
Motivation:
- ensure convergence
- increase convergence speed

The value f(x + h) comes from a linear combination of f(x) and f(x) + J(x)h, so it is not the retraction of the Gauss-Newton step!

Idea
Introduce a step-length parameter α such that

    f(x + αh) = τ(f(x) + J(x)h),

    α = 1/(1 + xᵀh/(1 + xᵀx)).
Example
Figure: Example in dimension 1. Notice the difference between the two methods.
GN-TLS method with "optimal" step length

Input: A, b (problem data); ε, maxit (stopping criteria)
Output: x (approximation of x_TLS)

Set k := 0
Compute x_0 := argmin_x ‖Ax − b‖_2
Compute f_0 := f(x_0) and J_0 := J(x_0)
while ‖J_kᵀ f_k‖_2 ≥ ε and k < maxit
    Compute h_k := argmin_h ‖J_k h + f_k‖_2
    Set α_k := 1/(1 + x_kᵀh_k/(1 + x_kᵀx_k))
    Set x_{k+1} := x_k + α_k h_k
    Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
end
x := x_k
Equivalence with an inverse power iteration
The GN-TLS method with optimal step length is equivalent to an inverse power method involving the matrix CᵀC. Indeed, let

    s_k = (x_k, −1)ᵀ / √(1 + x_kᵀx_k).

Then

    s_{k+1} = β_k (CᵀC)⁻¹ s_k,   β_k = 1/‖(CᵀC)⁻¹ s_k‖₂.

Meanwhile,

    f(x_{k+1}) = β_k (CCᵀ)⁺ f(x_k).

Corollary
The GN-TLS method with optimal step length is convergent. Moreover,

    ‖f(x_k) − f(x_TLS)‖ = O((σ_{n+1}/σ_n)^{2k}),
    ‖x_k − x_TLS‖ = O((σ_{n+1}/σ_n)^{2k}),
    |η(x_k) − η(x_TLS)| = O((σ_{n+1}/σ_n)^{4k}).
Numerical experiments
Test problem by Björck, Heggernes, Matstoms (2000)
Figure: Left: log ‖J_kᵀ f_k‖. Center: errors log ‖x_k − x_TLS‖ (solid lines) and log |η(x_k) − η(x_TLS)| (dashed lines). Right: plot of α_k.
Conclusions
The method produces a sequence of approximations that converges with no restriction. The value η(x_k), available during the iterations, estimates the backward error in Ax ≈ b.

The method avoids computing the SVD. At each step it only solves a least squares problem whose matrix is a rank-one perturbation of the data matrix. This can be useful in some circumstances:
- if A is large and sparse, we can use (transpose-free) Krylov methods, where the matrix is involved only in matrix-vector products;
- if the QR factorization of A is known in advance.
D. Fasino, A. Fazzi. A Gauss-Newton iteration for Total Least Squares problems. arXiv:1608.01619 (2016). Submitted.
Thank you for your attention.