An iterative Gauss-Newton-type method for solving the TLS problem
A Gauss-Newton iteration for solving TLS problems
Antonio Fazzi¹ and Dario Fasino²
¹ Gran Sasso Science Institute. ² University of Udine.
Como, 17/02/2017
A. Fazzi, D. Fasino (Gran Sasso Science Institute, University of Udine)
The Total Least Squares problem
The Total Least Squares (TLS) problem
Given A ∈ R^{m×n} with m > n and b ∈ R^m, the Total Least Squares (TLS) problem is defined as

    min_{E,f} ‖(E | f)‖_F²   subject to   b + f ∈ Im(A + E),

where E ∈ R^{m×n} and f ∈ R^m. After we find such a matrix (E | f) whose Frobenius norm is minimum, each x ∈ R^n satisfying

    (A + E)x = b + f

is a solution of the TLS problem.
Solution, existence and uniqueness
Define the matrix C = (A | b) and consider its SVD

    C = U Σ Vᵀ.

In the following we assume that the problem has a unique solution; this happens if σ'_n > σ_{n+1}, where σ'_n and σ_{n+1} are the smallest singular values of A and C, respectively.

The solution of the TLS problem is the vector x_TLS such that

    v_{n+1} = −ζ (x_TLS, −1)ᵀ,

where ζ is a normalization constant and v_{n+1} is the last column of V.
The function η
It is known that x_TLS can be characterized as the point of global minimum of the function

    η(x) = ‖Ax − b‖₂² / (1 + ‖x‖₂²).

The function η(x) measures the backward error of the vector x as an approximate solution of the linear system Ax = b:

Lemma
For each vector x there exists a rank-one matrix (E | f) such that

    (A + E)x = b + f,   ‖(E | f)‖_F² = ‖(E | f)‖₂² = η(x).

Moreover, for each matrix (E | f) such that (A + E)x = b + f it holds that

    ‖(E | f)‖_F² ≥ ‖(E | f)‖₂² ≥ η(x).
Solution of TLS problems using Gauss-Newton
The Gauss-Newton algorithm
The Gauss-Newton algorithm is a cheap optimization method that can solve nonlinear least squares problems

    min_{x ∈ Rⁿ} ‖f(x)‖₂²,   f: Rⁿ → Rᵐ,   m ≥ n.

The basic idea is to linearize f(x) in a neighborhood of x: the step x → x + h is computed by replacing ‖f(x + h)‖₂² with ‖f(x) + J(x)h‖₂² and solving the corresponding ordinary LS problem.
Gauss-Newton applied to the function η
We set

    η(x) = ‖f(x)‖₂²,   where   f(x) = (Ax − b) / √(1 + xᵀx).

Hence argmin_x ‖f(x)‖₂ = x_TLS.

The Jacobian of f is

    J(x) = A / √(1 + xᵀx) − (Ax − b) xᵀ / (1 + xᵀx)^{3/2}.
Outline of the algorithm
Basic GN-TLS method

Input: A, b (problem data); ε, maxit (stopping criteria)
Output: x (approximation of x_TLS)

Set k := 0
Compute x_0 := argmin_x ‖Ax − b‖_2
Compute f_0 := f(x_0) and J_0 := J(x_0)
while ‖J_kᵀ f_k‖_2 ≥ ε and k < maxit
    Compute h_k := argmin_h ‖J_k h + f_k‖_2
    Set x_{k+1} := x_k + h_k
    Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
end
x := x_k
Computational Cost
The Gauss-Newton method solves at each step a least squares problem, which can be written in the form

    min_h ‖J_k h + f_k‖₂² = (1/(1 + x_kᵀx_k)) · min_h ‖(A − r_k x_kᵀ/(1 + x_kᵀx_k)) h + r_k‖₂²,

where r_k = A x_k − b; the scale factor does not affect the minimizer h_k.

We can compute the QR factorization of A only once, and then use a technique that updates the QR factorization under rank-one perturbations. Each update has only quadratic cost.
Geometry of the method
Proposition
Let
    f(x) = (Ax − b) / √(1 + xᵀx).

Then its image Im(f) ⊂ Rᵐ is an open subset of the ellipsoid vᵀXv = 1, where X = (CCᵀ)⁺.
Image of f (x)
Figure: Surface plot of Im(f). Blue star: f(x_TLS); red star: f(x_LS).
An improved variant
Improved variant of the basic GN-TLS
Motivation:
- ensure convergence
- increase convergence speed

The value f(x + h) comes from a linear combination of f(x) and f(x) + J(x)h, so it is not the retraction of the Gauss-Newton step!

Idea
Introduce a step-length parameter α such that

    f(x + αh) = τ(f(x) + J(x)h),

    α = 1/(1 + xᵀh/(1 + xᵀx)).
Example
Figure: Example in dimension 1. Notice the difference between the two methods.
GN-TLS method with "optimal" step length

Input: A, b (problem data); ε, maxit (stopping criteria)
Output: x (approximation of x_TLS)

Set k := 0
Compute x_0 := argmin_x ‖Ax − b‖_2
Compute f_0 := f(x_0) and J_0 := J(x_0)
while ‖J_kᵀ f_k‖_2 ≥ ε and k < maxit
    Compute h_k := argmin_h ‖J_k h + f_k‖_2
    Set α_k := 1/(1 + x_kᵀh_k/(1 + x_kᵀx_k))
    Set x_{k+1} := x_k + α_k h_k
    Set k := k + 1, f_k := f(x_k), J_k := J(x_k)
end
x := x_k
Equivalence with an inverse power iteration
The GN-TLS method with optimal step length is equivalent to an inverse power method involving the matrix CᵀC. Indeed, let

    s_k = (x_k, −1)ᵀ / √(1 + x_kᵀx_k).

Then

    s_{k+1} = β_k (CᵀC)⁻¹ s_k,   β_k = 1/‖(CᵀC)⁻¹ s_k‖₂.

Meanwhile,

    f(x_{k+1}) = β_k (CCᵀ)⁺ f(x_k).

Corollary
The GN-TLS method with optimal step length is convergent. Moreover,

    ‖f(x_k) − f(x_TLS)‖ = O((σ_{n+1}/σ_n)^{2k}),
    ‖x_k − x_TLS‖ = O((σ_{n+1}/σ_n)^{2k}),
    |η(x_k) − η(x_TLS)| = O((σ_{n+1}/σ_n)^{4k}).
Numerical experiments
Test problem by Björck, Heggernes, Matstoms (2000)
Figure: Left: log ‖J_kᵀ f_k‖. Center: errors log ‖x_k − x_TLS‖ (solid lines) and log |η(x_k) − η(x_TLS)| (dashed lines). Right: plot of α_k.
Conclusions
The method produces a sequence of approximations that converges with no restriction. The value η(x_k), available during the iterations, estimates the backward error in Ax ≈ b.

The method avoids computing the SVD. At each step it only solves a least squares problem whose matrix is a rank-one perturbation of the data matrix. This can be useful in some circumstances:
- if A is large and sparse, we can use (transpose-free) Krylov methods, where the matrix is involved only in matrix-vector products;
- if the QR factorization of A is known in advance.
D. Fasino, A. Fazzi. A Gauss-Newton iteration for Total Least Squares problems. arXiv:1608.01619 (2016). Submitted.
Thank you for your attention.