Large-Scale Tikhonov Regularization for Total Least Squares Problems
Heinrich Voss, [email protected]
Joint work with Jörg Lampe
Hamburg University of Technology, Institute of Numerical Simulation
TUHH Heinrich Voss Tikhonov Regularization for TLS Bremen 2011 1 / 24
Outline
1 Total Least Squares Problems
2 Regularization of TLS Problems
3 Tikhonov Regularization of TLS problems
4 Numerical Experiments
5 Conclusions
Total Least Squares Problems
Total Least Squares Problems

The ordinary Least Squares (LS) method assumes that the system matrix A of a linear model is error-free and that all errors are confined to the right-hand side b.

However, in engineering applications this assumption is often unrealistic. Many data estimation problems lead to linear systems in which both the matrix A and the right-hand side b are contaminated by noise, for example if A itself is only available from measurements or if A is an idealized approximation of the true operator.

If the true values of the observed variables satisfy linear relations, and if the errors in the observations are independent random variables with zero mean and equal variance, then the total least squares (TLS) approach often gives better estimates than LS.
Given A ∈ R^{m×n}, b ∈ R^m, m ≥ n, find ∆A ∈ R^{m×n}, ∆b ∈ R^m and x ∈ R^n such that

‖[∆A, ∆b]‖_F^2 = min!  subject to  (A + ∆A)x = b + ∆b,   (1)

where ‖·‖_F denotes the Frobenius norm.
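Problem (1) admits a closed-form solution via the singular value decomposition of the augmented matrix [A, b]. A minimal dense sketch in NumPy; the function name tls_svd and the existence tolerance are our choices, not part of the talk:

```python
import numpy as np

def tls_svd(A, b):
    """Total least squares solution of (A + dA) x = b + db, computed from
    the right singular vector of [A, b] belonging to sigma_{n+1}."""
    m, n = A.shape
    _, _, Vt = np.linalg.svd(np.column_stack([A, b]))
    v = Vt[-1]  # right singular vector for the smallest singular value
    if abs(v[n]) < np.finfo(float).eps * max(m, n):
        raise ValueError("TLS solution does not exist (zero last component)")
    return -v[:n] / v[n]
```

For a consistent system (b = A x) the smallest singular value of [A, b] is zero and the sketch recovers x exactly.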
Total Least Squares Problems
Total Least Squares Problems cont.

Although the name "total least squares" was introduced in the literature only recently by Golub and Van Loan (1980), this fitting method is not new and has a long history in the statistical literature, where it is known as orthogonal regression, errors-in-variables, or measurement errors; in image deblurring it appears as blind deconvolution.

The univariate problem (n = 1) was already discussed by Adcock (1877), and it was rediscovered many times, often independently.

About 30-40 years ago, the technique was extended by Sprent (1969) and Gleser (1981) to the multivariate case (n > 1).

More recently, the total least squares method has also stimulated interest outside statistics. In numerical linear algebra it was first studied by Golub and Van Loan (1980).
Total Least Squares Problems
Total Least Squares Problems cont.

The TLS problem can be analyzed in terms of the singular value decomposition of the augmented matrix [A, b] = UΣV^T.

A TLS solution exists if and only if the right singular subspace V_min corresponding to σ_{n+1} contains at least one vector with a nonzero last component.

It is unique if σ'_n > σ_{n+1}, where σ'_n denotes the smallest singular value of A, and then it is given by

x_TLS = − (1 / V(n+1, n+1)) V(1:n, n+1).
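Both conditions are easy to check numerically. A small sketch of the uniqueness test σ'_n > σ_{n+1} (function name ours):

```python
import numpy as np

def tls_unique(A, b):
    """True if sigma'_n (smallest singular value of A) is strictly larger
    than sigma_{n+1} (smallest singular value of [A, b])."""
    s_A = np.linalg.svd(A, compute_uv=False)
    s_Ab = np.linalg.svd(np.column_stack([A, b]), compute_uv=False)
    return s_A[-1] > s_Ab[-1]
```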
Regularization of TLS Problems
Regularization of TLS Problems
Practical problems are usually ill-conditioned, and regularization is necessary to stabilize the computed solution.

Fierro, Golub, Hansen and O'Leary (1997) suggested filtering the solution by truncating the small singular values of the TLS matrix [A, b], and they proposed an iterative algorithm based on Lanczos bidiagonalization for computing truncated TLS solutions.
Regularization of TLS Problems
Regularization Adding a Quadratic Constraint
Sima, Van Huffel, and Golub (2004) suggested regularizing the TLS problem by adding a quadratic constraint:

‖[∆A, ∆b]‖_F^2 = min!  subject to  (A + ∆A)x = b + ∆b,  ‖Lx‖ ≤ δ,

where δ > 0 and the regularization matrix L ∈ R^{p×n}, p ≤ n, defines a (semi-)norm on the solution through which the size of the solution is bounded or a certain degree of smoothness can be imposed.

Let F ∈ R^{n×k} be a matrix whose columns form an orthonormal basis of the nullspace of the regularization matrix L. If

σ_min([AF, b]) < σ_min(AF),

then the solution x_RTLS of the constrained TLS problem is attained (Beck, Ben-Tal 2006).
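This attainability condition can be tested directly; a sketch in which the nullspace basis F is obtained from the SVD of L (function name and tolerance are ours, and L is assumed to have a nontrivial nullspace):

```python
import numpy as np

def rtls_attained(A, b, L, tol=1e-12):
    """Beck/Ben-Tal condition sigma_min([A F, b]) < sigma_min(A F), with the
    columns of F an orthonormal basis of the nullspace of L."""
    _, s, Vt = np.linalg.svd(L)
    rank = int(np.sum(s > tol * s[0]))
    F = Vt[rank:].T                  # n x k orthonormal nullspace basis
    AF = A @ F
    smin_AFb = np.linalg.svd(np.column_stack([AF, b]), compute_uv=False)[-1]
    smin_AF = np.linalg.svd(AF, compute_uv=False)[-1]
    return smin_AFb < smin_AF
```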
Regularization of TLS Problems
First-Order Conditions (Golub, Hansen, O'Leary 1999)

Assume that x_RTLS exists and that the constraint is active; then (RTLS) is equivalent to

f(x) := ‖Ax − b‖^2 / (1 + ‖x‖^2) = min!  subject to  ‖Lx‖^2 = δ^2.

The first-order optimality conditions are equivalent to

(A^T A + λ_I I + λ_L L^T L) x = A^T b,  µ ≥ 0,  ‖Lx‖^2 = δ^2

with

λ_I = −‖Ax − b‖^2 / (1 + ‖x‖^2),  λ_L = µ(1 + ‖x‖^2),  µ = (b^T(b − Ax) + λ_I) / (δ^2 (1 + ‖x‖^2)).
Regularization of TLS Problems
Two Iterative Algorithms based on EVPs
Two approaches for solving the first-order conditions

(A^T A + λ_I(x) I + λ_L(x) L^T L) x = A^T b   (∗)

1. Quadratic EVPs: Sima, Van Huffel, Golub (2004), Lampe, V. (2007, 2008)
Iterative algorithm based on updating λ_I:
With fixed λ_I, reformulate (∗) into a QEP.
Determine the rightmost eigenvalue, i.e. the free parameter λ_L.
Use the corresponding eigenvector to update λ_I.

2. Linear EVPs: Renaut, Guo (2005), Lampe, V. (2008)
Iterative algorithm based on updating λ_L:
With fixed λ_L, reformulate (∗) into a linear EVP.
Determine the smallest eigenvalue, i.e. the free parameter λ_I.
Use the corresponding eigenvector to update λ_L.
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
f(x) + λ‖Lx‖^2 = ‖Ax − b‖^2 / (1 + ‖x‖^2) + λ‖Lx‖^2 = min!

Beck and Ben-Tal (2006) proposed an algorithm in which a Cholesky decomposition has to be computed in each iteration step, which is prohibitive for large-scale problems.

We present a method which solves the first-order conditions, equivalent to

q(x) := (A^T A + µ L^T L − f(x) I) x − A^T b = 0,  with µ := (1 + ‖x‖^2) λ,

via a combination of Newton's method with an iterative projection method.
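The residual q(x) is cheap to evaluate. A dense sketch (function names are ours; here µ is passed as a fixed parameter, whereas with the coupling µ = (1 + ‖x‖^2)λ it would be recomputed from x):

```python
import numpy as np

def f_tls(x, A, b):
    """f(x) = ||Ax - b||^2 / (1 + ||x||^2)."""
    r = A @ x - b
    return (r @ r) / (1.0 + x @ x)

def q_res(x, A, b, L, mu):
    """First-order residual q(x) = (A^T A + mu L^T L - f(x) I) x - A^T b."""
    return A.T @ (A @ x) + mu * (L.T @ (L @ x)) - f_tls(x, A, b) * x - A.T @ b
```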
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
Newton's method:

x^{k+1} = x^k − J(x^k)^{−1} q(x^k)

with the Jacobian

J(x) = A^T A + µ L^T L − f(x) I − 2x (x^T A^T A − b^T A − f(x) x^T) / (1 + ‖x‖^2).

The Sherman-Morrison formula yields

x^{k+1} = J_k^{−1} A^T b − (1 / (1 − (v^k)^T J_k^{−1} u^k)) J_k^{−1} u^k (v^k)^T (x^k − J_k^{−1} A^T b),

with

J(x) := A^T A + µ L^T L − f(x) I,  u^k := 2x^k / (1 + ‖x^k‖^2),  v^k := A^T A x^k − A^T b − f(x^k) x^k.
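A dense sketch of this Newton iteration for fixed µ (naming ours; the large-scale variant of the talk avoids the explicit solves with J_k):

```python
import numpy as np

def newton_rank1(A, b, L, mu, x0, maxit=50, tol=1e-12):
    """Newton iteration for q(x) = 0; the rank-one part of the Jacobian is
    resolved with the Sherman-Morrison formula (mu kept fixed)."""
    n = A.shape[1]
    AtA, Atb, LtL = A.T @ A, A.T @ b, L.T @ L
    x = x0.copy()
    for _ in range(maxit):
        fx = np.dot(A @ x - b, A @ x - b) / (1.0 + x @ x)
        Jk = AtA + mu * LtL - fx * np.eye(n)  # J_k without the rank-one term
        u = 2.0 * x / (1.0 + x @ x)
        v = AtA @ x - Atb - fx * x
        z = np.linalg.solve(Jk, Atb)          # J_k^{-1} A^T b
        w = np.linalg.solve(Jk, u)            # J_k^{-1} u^k
        x_new = z - (v @ (x - z)) / (1.0 - v @ w) * w
        if np.linalg.norm(x_new - x) <= tol * (1.0 + np.linalg.norm(x_new)):
            return x_new
        x = x_new
    return x
```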
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
To avoid solving large-scale linear systems with the varying matrices J_k, we combine Newton's method with an iterative projection method.

Let V be an ansatz space of small dimension k, and let the columns of V ∈ R^{n×k} form an orthonormal basis of V.

Replace z = J_k^{−1} A^T b by V y_1^k, where y_1^k solves V^T J_k V y_1^k = V^T A^T b, and w = J_k^{−1} u^k by V y_2^k, where y_2^k solves V^T J_k V y_2^k = V^T u^k.

If

x^{k+1} = V_k y_1^k − (1 / (1 − (v^k)^T V_k y_2^k)) V_k y_2^k (v^k)^T (x^k − V_k y_1^k)

does not satisfy a prescribed accuracy requirement, then V is expanded with the residual

q(x^{k+1}) = (A^T A + µ L^T L − f(x^k) I) x^{k+1} − A^T b

and the step is repeated until convergence.
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
Initializing the iterative projection method with a Krylov space V = K_ℓ(A^T A + µ L^T L, A^T b), the iterates x^k are contained in a Krylov space of A^T A + µ L^T L.

Due to the convergence properties of the Lanczos process, the main contributions come from the first singular vectors of [A; √µ L], which for small µ are close to the first right singular vectors of A.

It is common knowledge that these vectors are not always appropriate basis vectors for a regularized solution, and it may be advantageous to apply the regularization with a general regularization matrix L implicitly.
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
Assume that L is nonsingular and use the transformation x := L^{−1} y (for general L one has to use the A-weighted generalized inverse L_A^†, cf. Elden 1982), which yields

‖A L^{−1} y − b‖^2 / (1 + ‖L^{−1} y‖^2) + λ‖y‖^2 = min!.

Transforming the first-order conditions back and multiplying from the left by L^{−1}, one gets

(L^T L)^{−1} (A^T A x + µ L^T L x − f(x) x − A^T b) = 0.

This equation suggests preconditioning the expansion of the search space with L^T L, or with an approximation M ≈ L^T L, which yields the following algorithm.
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem
Require: Initial basis V_0 with V_0^T V_0 = I, starting vector x^0
1: for k = 0, 1, ... until convergence do
2:   Compute f(x^k) = ‖A x^k − b‖^2 / (1 + ‖x^k‖^2)
3:   Solve V_k^T J_k V_k y_1^k = V_k^T A^T b for y_1^k
4:   Compute u^k = 2x^k / (1 + ‖x^k‖^2) and v^k = A^T A x^k − A^T b − f(x^k) x^k
5:   Solve V_k^T J_k V_k y_2^k = V_k^T u^k for y_2^k
6:   Compute x^{k+1} = V_k y_1^k − (1 / (1 − (v^k)^T V_k y_2^k)) V_k y_2^k (v^k)^T (x^k − V_k y_1^k)
7:   Compute q^{k+1} = (A^T A + µ L^T L − f(x^k) I) x^{k+1} − A^T b
8:   Compute r = M^{−1} q^{k+1}
9:   Orthogonalize r = (I − V_k V_k^T) r
10:  Normalize v_new = r / ‖r‖
11:  Enlarge search space V_{k+1} = [V_k, v_new]
12: end for
13: Output: Approximate Tikhonov TLS solution x^{k+1}
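A compact dense sketch of this loop (naming ours; as assumptions beyond the slides, V_0 is spanned by the normalized A^T b, the preconditioner is M = L^T L, and µ is kept fixed):

```python
import numpy as np

def tikhonov_tls_proj(A, b, L, mu, x0, maxit=100, tol=1e-10):
    """Newton/projection iteration for the Tikhonov TLS first-order
    conditions; dense toy version of the algorithm above."""
    n = A.shape[1]
    AtA, Atb, LtL = A.T @ A, A.T @ b, L.T @ L
    x = x0.copy()
    V = (Atb / np.linalg.norm(Atb))[:, None]            # initial basis V_0
    for _ in range(maxit):
        fx = np.dot(A @ x - b, A @ x - b) / (1.0 + x @ x)   # line 2
        Jk = AtA + mu * LtL - fx * np.eye(n)
        G = V.T @ Jk @ V                                     # projected J_k
        y1 = np.linalg.solve(G, V.T @ Atb)                   # line 3
        u = 2.0 * x / (1.0 + x @ x)                          # line 4
        v = AtA @ x - Atb - fx * x
        y2 = np.linalg.solve(G, V.T @ u)                     # line 5
        Vy1, Vy2 = V @ y1, V @ y2
        x = Vy1 - (v @ (x - Vy1)) / (1.0 - v @ Vy2) * Vy2    # line 6
        q = AtA @ x + mu * (LtL @ x) - fx * x - Atb          # line 7
        if np.linalg.norm(q) <= tol * np.linalg.norm(Atb):
            break
        r = np.linalg.solve(LtL, q)                          # line 8, M = L^T L
        r -= V @ (V.T @ r)                                   # line 9
        nr = np.linalg.norm(r)
        if nr > 1e-14 and V.shape[1] < n:
            V = np.hstack([V, (r / nr)[:, None]])            # lines 10-11
    return x
```

Once V spans R^n the projected step coincides with the full Newton step, so the sketch always terminates; the point of the method is that in practice far fewer basis vectors suffice.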
Tikhonov Regularization of TLS problems
Tikhonov Regularization of TLS problem

The Tikhonov TLS method allows for a massive reuse of information from previous iteration steps.

Assume that the matrices V_k, A^T A V_k, and L^T L V_k are stored. Then, neglecting multiplications with L and L^T and solves with M, the essential cost in every iteration step is only two matrix-vector products with the dense matrices A and A^T for extending A^T A V_k.

With these matrices, f(x^k) in line 2 can be evaluated as (writing x^k = V_k y^k)

f(x^k) = (1 / (1 + ‖y^k‖^2)) ((x^k)^T (A^T A V_k y^k) − 2 (y^k)^T V_k^T (A^T b) + ‖b‖^2),

and q^{k+1} in line 7 can be determined according to

q^{k+1} = (A^T A V_k) y^{k+1} + µ (L^T L V_k) y^{k+1} − f(x^k) x^{k+1} − A^T b.

Since the number of iteration steps until convergence is usually very small compared to the dimension n, the overall cost of the algorithm is of order O(mn).
Numerical Experiments
Numerical Experiments
We consider several examples from Hansen's Regularization Tools.

The regularization matrix L is chosen as the nonsingular approximation to the scaled discrete first-order derivative operator in one space dimension.

The numerical tests were carried out on an Intel Core 2 Duo T7200 computer with 2.3 GHz and 2 GB RAM under MATLAB R2009a (our numerical examples actually require less than 0.5 GB of RAM).
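The slides do not state which nonsingular variant of the first-derivative operator is used; one common construction (our choice, all names ours) appends a scaled row to the forward-difference stencil so that the constant vector is removed from the nullspace:

```python
import numpy as np

def deriv1_nonsingular(n, h=1.0, eps=1.0):
    """(n x n) nonsingular approximation of the scaled discrete first
    derivative: forward differences in rows 0..n-2, plus a last row
    eps * e_n^T that makes the matrix invertible."""
    L = np.zeros((n, n))
    idx = np.arange(n - 1)
    L[idx, idx] = -1.0 / h
    L[idx, idx + 1] = 1.0 / h
    L[n - 1, n - 1] = eps
    return L
```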
Numerical Experiments
Numerical Experiments
Problem     σ      Method    ‖q(x^k)‖/‖A^T b‖   Iters   MatVecs   ‖x − x_true‖/‖x_true‖
phillips    1e-3   TTLS      8.5e-16            8.0     25.0      8.9e-2
                   RTLSQEP   5.7e-11            3.0     42.0      8.9e-2
                   RTLSEVP   7.1e-13            4.0     47.6      8.9e-2
baart       1e-3   TTLS      2.3e-15            10.1    29.2      1.5e-1
                   RTLSQEP   1.0e-07            15.7    182.1     1.4e-1
                   RTLSEVP   4.1e-10            7.8     45.6      1.5e-1
shaw        1e-3   TTLS      9.6e-16            8.3     25.6      7.0e-2
                   RTLSQEP   3.7e-09            4.1     76.1      7.0e-2
                   RTLSEVP   2.6e-10            3.0     39.0      7.0e-2
deriv2      1e-3   TTLS      1.2e-15            10.0    29.0      4.9e-2
                   RTLSQEP   2.3e-09            3.1     52.3      4.9e-2
                   RTLSEVP   2.6e-12            5.0     67.0      4.9e-2
heat(κ=1)   1e-2   TTLS      8.4e-16            19.9    48.8      1.5e-1
                   RTLSQEP   4.1e-08            3.8     89.6      1.5e-1
                   RTLSEVP   3.2e-11            4.1     67.2      1.5e-1
heat(κ=5)   1e-3   TTLS      1.4e-13            25.0    59.0      1.1e-1
                   RTLSQEP   6.1e-07            4.6     105.2     1.1e-1
                   RTLSEVP   9.8e-11            4.0     65.0      1.1e-1
Conclusions
Conclusions
We discussed a Tikhonov regularization approach for large total least squares problems.

It is highly advantageous to combine Newton's method with an iterative projection method and to reuse information gathered in previous iteration steps.

Several examples demonstrate that fairly small ansatz spaces are sufficient to obtain accurate solutions. Hence, the method is qualified to solve large-scale regularized total least squares problems efficiently.

We assumed the regularization parameter λ to be fixed. The same technique of recycling ansatz spaces can be used in an L-curve method to determine a reasonable parameter.