Iterative Projection Methods for noisy and corrupted systems of linear equations
Jamie Haddock
Tulane Probability and Statistics Seminar, November 7, 2018
Computational and Applied Mathematics, UCLA
joint with Jesús A. De Loera, Deanna Needell, and Anna Ma
https://arxiv.org/abs/1802.03126 (BIT Numerical Mathematics 2018+)
https://arxiv.org/abs/1803.08114
https://arxiv.org/abs/1605.01418 (SISC 2017)
We are interested in solving highly overdetermined systems of equations (or inequalities), Ax = b (Ax ≤ b), where A ∈ R^{m×n}, b ∈ R^m, and m ≫ n. Rows of A are denoted a_i^T.
Iterative Projection Methods
If {x ∈ R^n : Ax = b} is nonempty, these methods construct an approximation to an element:
1. Randomized Kaczmarz Method
2. Motzkin's Method
Randomized Kaczmarz Method
Given x_0 ∈ R^n:
1. Choose i_k ∈ [m] at random with probability ‖a_{i_k}‖² / ‖A‖_F².
2. Update x_k := x_{k−1} + (b_{i_k} − a_{i_k}^T x_{k−1}) a_{i_k} / ‖a_{i_k}‖².
Theorem (Strohmer - Vershynin 2009)
Let x be the solution to the consistent system of linear equations Ax = b. Then the Randomized Kaczmarz method converges to x linearly in expectation:
E‖x_k − x‖² ≤ (1 − 1/(‖A‖_F² ‖A^{-1}‖_2²))^k ‖x_0 − x‖².
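As a concrete illustration, the following is a minimal NumPy sketch of the Randomized Kaczmarz iteration described above; the function name rk_solve, the problem sizes, and the iteration count are illustrative choices rather than anything from the talk.

import numpy as np

def rk_solve(A, b, iters=1000, x0=None, rng=None):
    """Randomized Kaczmarz: project onto one randomly chosen equation per step."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.copy()
    row_norms_sq = np.sum(A**2, axis=1)
    probs = row_norms_sq / row_norms_sq.sum()      # rows sampled prop. to ||a_i||^2
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        x += (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]   # project onto a_i^T x = b_i
    return x

# usage on a consistent overdetermined system
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 10))
x_true = rng.standard_normal(10)
b = A @ x_true
print(np.linalg.norm(rk_solve(A, b, iters=2000, rng=rng) - x_true))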
Motzkin's Method
Given x_0 ∈ R^n:
1. Choose i_k ∈ [m] as i_k := argmax_{i∈[m]} |a_i^T x_{k−1} − b_i|.
2. Update x_k := x_{k−1} + (b_{i_k} − a_{i_k}^T x_{k−1}) a_{i_k} / ‖a_{i_k}‖².
Theorem (Agmon 1954)
For a consistent, normalized system, ‖a_i‖ = 1 for all i = 1, ..., m, Motzkin's method converges linearly to the solution x:
‖x_k − x‖² ≤ (1 − 1/(m ‖A^{-1}‖_2²))^k ‖x_0 − x‖².
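A matching sketch of Motzkin's method, which differs from RK only in how the row is chosen (greedily, by largest residual entry); again a minimal illustration, with the function name motzkin_solve assumed.

import numpy as np

def motzkin_solve(A, b, iters=1000, x0=None):
    """Motzkin's method: project onto the most violated equation at each step."""
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.copy()
    row_norms_sq = np.sum(A**2, axis=1)
    for _ in range(iters):
        residual = A @ x - b
        i = np.argmax(np.abs(residual))              # greedy choice of i_k
        x -= residual[i] / row_norms_sq[i] * A[i]    # project onto a_i^T x = b_i
    return x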
Sampling Kaczmarz-Motzkin (SKM) Method
Given x_0 ∈ R^n:
1. Choose τ_k ⊂ [m] to be a sample of β constraints chosen uniformly at random from among the rows of A.
2. From among these β rows, choose i_k := argmax_{i∈τ_k} |a_i^T x_{k−1} − b_i|.
3. Update x_k := x_{k−1} + (b_{i_k} − a_{i_k}^T x_{k−1}) a_{i_k} / ‖a_{i_k}‖².
Theorem (De Loera - H. - Needell 2017)
For a consistent, normalized system the SKM method with samples of size β converges to the solution x at least linearly in expectation: if s_{k−1} is the number of constraints satisfied by x_{k−1} and V_{k−1} = max{m − s_{k−1}, m − β + 1}, then
E‖x_k − x‖² ≤ (1 − 1/(V_{k−1} ‖A^{-1}‖_2²)) ‖x_{k−1} − x‖² ≤ ∏_{j=0}^{k−1} (1 − 1/(V_j ‖A^{-1}‖_2²)) ‖x_0 − x‖².
. ‘faster’ convergence for larger sample size
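SKM interpolates between RK (β = 1, with uniform row sampling) and Motzkin's method (β = m). A minimal sketch under the same assumptions as the earlier snippets, with the parameter name beta and its default value chosen purely for illustration:

import numpy as np

def skm_solve(A, b, beta=50, iters=1000, x0=None, rng=None):
    """Sampling Kaczmarz-Motzkin: greedy projection within a uniformly sampled block."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.copy()
    row_norms_sq = np.sum(A**2, axis=1)
    for _ in range(iters):
        tau = rng.choice(m, size=beta, replace=False)   # sample tau_k of beta rows
        residual = A[tau] @ x - b[tau]
        i = tau[np.argmax(np.abs(residual))]            # most violated row within the sample
        x -= (A[i] @ x - b[i]) / row_norms_sq[i] * A[i]
    return x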
Comparison of convergence rates (for a normalized system, ‖A‖_F² = m):
RK: E‖x_k − x‖² ≤ (1 − 1/(‖A‖_F² ‖A^{-1}‖_2²))^k ‖x_0 − x‖²
Motzkin: ‖x_k − x‖² ≤ (1 − 1/(m ‖A^{-1}‖_2²))^k ‖x_0 − x‖²
An Accelerated Convergence Rate
Theorem (H. - Needell 2018+)
Let x denote the solution of the consistent, normalized system Ax = b. Motzkin's method exhibits the (possibly highly accelerated) convergence rate:
‖x_k − x‖² ≤ ∏_{j=0}^{k−1} (1 − 1/(γ_j ‖A^{-1}‖_2²)) · ‖x_0 − x‖².
Here γ_k bounds the dynamic range of the kth residual, γ_k := ‖Ax_k − Ax‖_2² / ‖Ax_k − Ax‖_∞².
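Since γ_k ≤ m always, this rate is never worse than the Agmon rate, and it is much better whenever the residual has a few dominant entries. The snippet below tracks γ_k along a Motzkin run; the helper name dynamic_range and the problem sizes are illustrative choices.

import numpy as np

def dynamic_range(r):
    """gamma = ||r||_2^2 / ||r||_inf^2, a value between 1 and len(r)."""
    return np.sum(r**2) / np.max(np.abs(r))**2

rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 50))
A /= np.linalg.norm(A, axis=1, keepdims=True)    # normalize rows so ||a_i|| = 1
x_true = rng.standard_normal(50)
b = A @ x_true                                   # consistent system

x = np.zeros(50)
for k in range(200):
    r = A @ x - b                 # residual, equal to A(x_k - x) in the consistent case
    gamma_k = dynamic_range(r)    # small gamma_k => contraction factor far below the worst case
    i = np.argmax(np.abs(r))
    x -= r[i] * A[i]              # normalized rows, so no division by ||a_i||^2 needed
print(gamma_k, np.linalg.norm(x - x_true))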
[Figure: convergence plots of the error ‖x_k − x‖ versus iterations.]
. bound uses dynamic range of sample of β rows
. use this bound to design methods which identify optimal β?
Is this the right problem?
[Figure: the noisy case, where the least-squares solution x_LS is the natural target, versus the corrupted case, where the desired solution x∗ can be far from the least-squares solution x_LS.]
Noisy Convergence Results
Theorem (Needell 2010)
Let A have full column rank, denote the desired solution to the system Ax = b by x, and define the error term e = Ax − b. Then the RK iterates satisfy
E‖x_k − x‖ ≤ (1 − 1/(‖A‖_F² ‖A^{-1}‖_2²))^{k/2} ‖x_0 − x‖ + ‖A‖_F ‖A^{-1}‖_2 · max_i |e_i|/‖a_i‖.

Theorem (H. - Needell 2018+)
Let x denote the desired solution of the system Ax = b and define the error term e = b − Ax. If Motzkin's method is run with stopping criterion ‖Ax_k − b‖_∞ ≤ 4‖e‖_∞, then the iterates satisfy
‖x_T − x‖² ≤ ∏_{k=0}^{T−1} (1 − 1/(γ_k ‖A^{-1}‖_2²)) ‖x_0 − x‖², up to an additive convergence-horizon term controlled by ‖e‖_∞.
Noisy Convergence
. A is a 50000 × 100 Gaussian matrix, inconsistent system (Ax = b + e)
. Left: Gaussian error e
. Right: sparse, ‘spiky’ error e
. Motzkin suffers from a worse ‘convergence horizon’ if e is sparse (experimental setup sketched below)
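A sketch of how such an experiment can be set up; the matrix sizes match the slide, while the noise magnitudes and the sparsity level of the ‘spiky’ error are assumed values.

import numpy as np

rng = np.random.default_rng(2)
m, n = 50000, 100
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)    # normalized rows
x_true = rng.standard_normal(n)

# dense Gaussian noise e (left panel)
e_gauss = 0.01 * rng.standard_normal(m)

# sparse, 'spiky' error e (right panel): large entries on a few coordinates
e_spiky = np.zeros(m)
support = rng.choice(m, size=50, replace=False)
e_spiky[support] = rng.standard_normal(50)

b_gauss = A @ x_true + e_gauss    # inconsistent system Ax = b + e
b_spiky = A @ x_true + e_spiky

# Running the rk_solve / motzkin_solve sketches above on (A, b_gauss) and (A, b_spiky)
# illustrates the point: Motzkin's greedy rule keeps selecting the spiky rows,
# so its convergence horizon is worse when e is sparse.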
Problem
Problem: Ax = b
Solution (x∗): x∗ ∈ {x : Ax = b}
Applications: logic programming, error correction in telecommunications
Problem: Ax = b + e
Solution (x_LS): x_LS ∈ argmin_x ‖Ax − b − e‖_2
MAX-FS (maximum feasible subsystem): find a largest consistent subsystem of Ax = b.
. no PTAS unless P = NP
Proposed Method
Goal: Use RK to detect the corrupted equations with high probability.

Lemma
Let ε∗ = min_{i∈supp(e)} |Ax∗ − b|_i = |e_i| and suppose |supp(e)| = s. If ‖a_i‖ = 1 for i ∈ [m] and ‖x − x∗‖ < ε∗/2, then the d ≤ s indices of largest-magnitude residual entries are contained in supp(e). That is, D ⊂ supp(e), where
D = argmax_{D⊂[m], |D|=d} Σ_{i∈D} |a_i^T x − b_i|.

We call ε∗/2 the detection horizon.
[Figure: once the iterate x_k lies within the detection horizon of x∗, the largest residual entries identify the corrupted equations.]
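A small numerical check of this lemma (the sizes and corruption magnitudes below are arbitrary choices): once a point is within ε∗/2 of x∗, the d largest residual entries all lie in supp(e).

import numpy as np

rng = np.random.default_rng(3)
m, n, s = 1000, 20, 10
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=1, keepdims=True)    # normalize rows so ||a_i|| = 1
x_star = rng.standard_normal(n)

e = np.zeros(m)
supp = rng.choice(m, size=s, replace=False)
e[supp] = rng.uniform(1.0, 2.0, size=s) * rng.choice([-1.0, 1.0], size=s)
b = A @ x_star + e                               # corrupted right-hand side

eps_star = np.min(np.abs(e[supp]))               # epsilon*
direction = rng.standard_normal(n)
direction /= np.linalg.norm(direction)
x = x_star + 0.4 * eps_star * direction          # ||x - x_star|| = 0.4 eps* < eps*/2

d = s
residual = np.abs(A @ x - b)
D = np.argsort(residual)[-d:]                    # d largest-magnitude residual entries
print(set(D.tolist()) <= set(supp.tolist()))     # prints True: all flagged rows are corrupted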
1: procedure MRK(A, b, k, W, d)
2:   S = ∅
3:   for i = 1, 2, ..., W do
4:     x_k^i = kth iterate produced by RK with x_0 = 0, A, b
5:     D = d indices of the largest entries of the residual |Ax_k^i − b|
6:     S = S ∪ D
7:   return x, where A_{S^C} x = b_{S^C}
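A runnable sketch of this MRK / Windowed Kaczmarz procedure, reusing the rk_solve sketch from earlier; the least-squares solve on the retained rows stands in for solving A_{S^C} x = b_{S^C}.

import numpy as np

def mrk(A, b, k, W, d, rng=None):
    """Windowed Kaczmarz (MRK): W short RK runs flag suspected corrupted rows."""
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    S = set()
    for _ in range(W):
        x_win = rk_solve(A, b, iters=k, x0=np.zeros(n), rng=rng)  # k RK iterations from x_0 = 0
        residual = np.abs(A @ x_win - b)
        D = np.argsort(residual)[-d:]            # d largest residual entries this window
        S.update(D.tolist())
    keep = [i for i in range(m) if i not in S]   # S^C: rows never flagged
    x, *_ = np.linalg.lstsq(A[keep], b[keep], rcond=None)
    return x, S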
Example
[Figure: MRK(A, b, k = 2, W = 3, d = 1) on a small system; over the W = 3 windows the detected set grows from S = {7} to S = {7, 5} to S = {7, 5, 6}.]
Lemma
Let ε∗ = min_{i∈supp(e)} |Ax∗ − b|_i = |e_i| and suppose |supp(e)| = s. Assume that ‖a_i‖ = 1 for all i ∈ [m] and let 0 < δ < 1. Define
k∗ = ⌈ log(δ ε∗² / (4‖x∗‖²)) / log(1 − 1/((m − s) ‖A_{supp(e)^C}^{-1}‖_2²)) ⌉.
Then in window i of the Windowed Kaczmarz method, the iterate produced by the RK iterations, x_{k∗}^i, satisfies
P[ ‖x_{k∗}^i − x∗‖ ≤ ε∗/2 ] ≥ (1 − δ) ((m − s)/m)^{k∗}.
Theorem (H. - Needell 2018+)
Assume that ‖a_i‖ = 1 for all i ∈ [m] and let 0 < δ < 1. Suppose d ≥ s = |supp(e)|, W ≤ ⌊(m − n)/d⌋, and k∗ is as given in the previous lemma. Then the Windowed Kaczmarz method on A, b will detect the corrupted equations (supp(e) ⊂ S) and the remaining equations given by A_{[m]−S}, b_{[m]−S} will have solution x∗ with probability at least
p_W := 1 − [ 1 − (1 − δ) ((m − s)/m)^{k∗} ]^W.
[Figures: experimental results for the Windowed Kaczmarz method as a function of the number of RK iterations k per window:
. rate of detecting all corrupted equations in one window
. experimental rate of success of detecting all corrupted equations over all W = ⌊(m − n)/d⌋ windows
. comparison of the theoretical probability with the experimental detection rates]
Conclusions
. Motzkin's method is accelerated even in the presence of noise
  • γ_k, the parameter governing this acceleration, also governs the acceleration of SKM
. RK methods may be used to detect corruption
. theoretical bounds do not reflect empirical results
[Slide on future directions: the sample size β and the theoretical bounds.]
Questions?
[DLHN17] J. A. De Loera, J. Haddock, and D. Needell. A sampling
Kaczmarz-Motzkin algorithm for linear feasibility. SIAM Journal on
Scientific Computing, 39(5):S66–S87, 2017.
[HN18a] J. Haddock and D. Needell. On Motzkin’s method for inconsistent linear
systems. BIT Numerical Mathematics, 2018. To appear.
[HN18b] J. Haddock and D. Needell. Randomized projection methods for linear
systems with arbitrarily large sparse corruptions. 2018. Submitted.
[MS54] T. S. Motzkin and I. J. Schoenberg. The relaxation method for linear
inequalities. Canadian J. Math., 6:393–404, 1954.
[Nee10] D. Needell. Randomized Kaczmarz solver for noisy linear systems. BIT,
50(2):395–403, 2010.
[SV09] T. Strohmer and R. Vershynin. A randomized Kaczmarz algorithm with
exponential convergence. J. Fourier Anal. Appl., 15:262–278, 2009.