
Nonlinear Equations of One Variable I

Solve $f(x) = 0$, $x \in R^1$

Finding a root

The difference: $f$ is nonlinear rather than linear

Bisection method: Given $[a_i, b_i]$, compute

$$p_i = \frac{a_i + b_i}{2}$$

If $f(p_i) = 0$, stop

Chih-Jen Lin (National Taiwan Univ.) 1 / 54

Nonlinear Equations of One Variable II

$a_{i+1} = a_i$ and $b_{i+1} = p_i$, if $f(a_i) f(p_i) < 0$

$a_{i+1} = p_i$ and $b_{i+1} = b_i$, otherwise
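The update rule can be sketched in Python as follows. This is a minimal sketch, not the course's program: the function name `bisect`, the `history` list, and the stopping rule (stop once the half-width of the bracket drops below `tol`) are assumptions made for illustration.

```python
import math


def bisect(f, a, b, tol=1e-3, max_iter=50):
    """Bisection on [a, b]; returns (p, history) with rows (i, p_i, f(p_i))."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) have the same sign")
    history = []
    for i in range(1, max_iter + 1):
        p = a + (b - a) / 2          # midpoint p_i = (a_i + b_i) / 2
        fp = f(p)
        history.append((i, p, fp))
        # Assumed stopping rule: exact root, or bracket half-width below tol.
        if fp == 0 or (b - a) / 2 < tol:
            return p, history
        if fa * fp < 0:              # sign change in [a_i, p_i]
            b = p
        else:                        # otherwise keep [p_i, b_i]
            a, fa = p, fp
    raise RuntimeError("maximum number of iterations reached")


p, history = bisect(math.cos, 0.0, 2.0)
for i, p_i, fp_i in history:
    print(f"{i:2d} {p_i:.8e} {fp_i: .7e}")
```

With `math.cos` on $[0, 2]$ and tolerance 0.001, this sketch halves the bracket until its half-width is below the tolerance.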

Example:

Nonlinear Equations of One Variable III

>> bisect21
This is the Bisection Method.
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'
Input endpoints A < B on separate lines
0
1
F(A) and F(B) have same sign
Input endpoints A < B on separate lines
0
0.5
F(A) and F(B) have same sign
Input endpoints A < B on separate lines
0
2

Nonlinear Equations of One Variable IV

Input tolerance
0.001
Input maximum number of iterations - no decimal point
50
Select output destination
1. Screen
2. Text file
Enter 1 or 2
1
Select amount of output
1. Answer only
2. All intermediate approximations
Enter 1 or 2
2
Bisection Method
 I   P                F(P)

Nonlinear Equations of One Variable V

 1   1.00000000e+00    5.4030231e-01
 2   1.50000000e+00    7.0737202e-02
 3   1.75000000e+00   -1.7824606e-01
 4   1.62500000e+00   -5.4177135e-02
 5   1.56250000e+00    8.2962316e-03
 6   1.59375000e+00   -2.2951658e-02
 7   1.57812500e+00   -7.3286076e-03
 8   1.57031250e+00    4.8382678e-04
 9   1.57421875e+00   -3.4224165e-03
10   1.57226562e+00   -1.4692977e-03
11   1.57128906e+00   -4.9273569e-04

Approximate solution P = 1.57128906
with F(P) = -0.00049274
Number of iterations = 11 Tolerance = 1.00000000e-03

Newton’s method I

Newton’s method II

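Solve $f(x) = 0$

Finding the tangent line at $x_k$:

$$\frac{y - f(x_k)}{x - x_k} = f'(x_k)$$

$x_k$: the current iterate

Let $y = 0$; the point where the tangent line crosses zero is the next iterate:

$$x_{k+1} = x_k - f(x_k)/f'(x_k)$$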
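The iteration can be sketched in Python as follows. This is a minimal sketch under assumptions: the function name `newton`, the helper `dcos`, and the stopping rule (successive iterates closer than `tol`) are illustrative choices, not taken from the slides. It is run on $f(x) = \cos(x)$ from the initial points 1 and 0.1.

```python
import math


def newton(f, fprime, x0, tol=1e-3, max_iter=50):
    """Newton's method; returns (approximate root, iterations used)."""
    x = x0
    for k in range(1, max_iter + 1):
        d = fprime(x)
        if d == 0:
            raise ZeroDivisionError("f'(x_k) = 0: the tangent line is horizontal")
        x_new = x - f(x) / d         # x_{k+1} = x_k - f(x_k) / f'(x_k)
        # Assumed stopping rule: successive iterates closer than tol.
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")


def dcos(x):
    """Derivative of cos(x)."""
    return -math.sin(x)


root_a, iters_a = newton(math.cos, dcos, 1.0)
root_b, iters_b = newton(math.cos, dcos, 0.1)
print(root_a, iters_a)
print(root_b, iters_b)
```

The two starting points lead to different roots of $\cos(x)$: the root reached depends on where the first tangent line lands.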

Newton’s method III

The original idea of Newton’s method:

$$\min f(x)$$

Equivalent to $f'(x) = 0$

$$f(x + d) = f(x) + f'(x)d + \frac{1}{2}d^2 f''(x) + \cdots$$

$$\approx f(x) + f'(x)d + \frac{1}{2}d^2 f''(x)$$

Newton’s method IV

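Second-order approximation

$$f(x) + f'(x)d + \frac{1}{2}d^2 f''(x)$$

Minimize it over $d$ by setting its derivative with respect to $d$ to zero:

$$f''(x)d + f'(x) = 0$$

$$d = -\frac{f'(x)}{f''(x)}$$

$$x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}.$$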
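As a small illustration of the minimization form, the sketch below applies $x_{k+1} = x_k - f'(x_k)/f''(x_k)$ to $f(x) = e^x - 2x$, whose minimizer is $\ln 2$. The test function, the name `newton_minimize`, and the stopping rule are assumptions chosen for illustration.

```python
import math


def newton_minimize(df, d2f, x0, tol=1e-10, max_iter=50):
    """Newton's method on f'(x) = 0, given f' (df) and f'' (d2f)."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)        # d = -f'(x)/f''(x), so x <- x - step
        x -= step
        if abs(step) < tol:          # assumed stopping rule
            return x
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical example: f(x) = exp(x) - 2x, f'(x) = exp(x) - 2, f''(x) = exp(x).
x_min = newton_minimize(lambda x: math.exp(x) - 2.0, math.exp, 0.0)
print(x_min, math.log(2.0))
```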

Newton’s method V

Newton’s method may not converge

{xk} may diverge

>> newton
newton
This is Newtons Method
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'

'cos(x)'
Input the derivative of F(x) in terms of x
'-sin(x)'

'-sin(x)'
Input initial approximation

Newton’s method VI

1
1
Input tolerance
0.001

0.001
Input maximum number of iterations - no decimal point
50

50
Select output destination
1. Screen
2. Text file
Enter 1 or 2
1

1
Select amount of output
1. Answer only

Newton’s method VII

2. All intermediate approximations
Enter 1 or 2
2

2
Newtons Method

 I   P                F(P)
 1   1.64209262e+00   -7.1235903e-02
 2   1.57067528e+00    1.2104963e-04
 3   1.57079633e+00   -5.9124355e-13

Approximate solution = 1.5707963268e+00
with F(P) = -5.9124355058e-13
Number of iterations = 3
Tolerance = 1.0000000000e-03

Newton’s method VIII

>> newton
This is Newtons Method
Input the function F(x) in terms of x
For example: cos(x)
'cos(x)'
Input the derivative of F(x) in terms of x
'-sin(x)'
Input initial approximation
0.1
Input tolerance
0.001
Input maximum number of iterations - no decimal point
50
Select output destination
1. Screen
2. Text file

Newton’s method IX

Enter 1 or 2
1
Select amount of output
1. Answer only
2. All intermediate approximations
Enter 1 or 2
2
Newtons Method
 I   P                F(P)
 1   1.00666444e+01   -8.0097972e-01
 2   1.14045284e+01    3.9764987e-01
 3   1.09711401e+01   -2.4431758e-02
 4   1.09955792e+01    4.8638065e-06
 5   1.09955743e+01   -4.2862638e-16

Approximate solution = 1.0995574288e+01

Newton’s method X

with F(P) = -4.2862637970e-16
Number of iterations = 5
Tolerance = 1.0000000000e-03
>>

Newton’s method needs fewer iterations than bisection: 3 and 5 in the two runs, versus 11 for bisection with the same tolerance

Convergence Rate I

Which algorithm needs fewer iterations?

Need to analyze convergence rate

Assume $x^*$ is a solution

Linear convergence: if

$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} \le r < 1.$$

Example: $r = 0.1$, and

$x_1 - x^* = 0.1$

$x_2 - x^* = 0.01$

$x_3 - x^* = 0.001$

Convergence Rate II

Superlinear convergence: if

$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = 0$$

Example:

$$\frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|} = 0.1, 0.01, 0.001, \ldots$$

Convergence Rate III

$x_1 - x^* = 0.1$

$x_2 - x^* = 0.1 \times 0.1 = 0.01$

$x_3 - x^* = 0.01 \times 0.01 = 0.0001$

$x_4 - x^* = 0.001 \times 0.0001 = 10^{-7}$

Quadratic convergence: if

$$\lim_{k\to\infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} \le r \quad (1)$$

Convergence Rate IV

Example: $r = 0.1$

$x_1 - x^* = 0.1$

$x_2 - x^* = (0.1)^2 \times 0.1 = 10^{-3}$

$x_3 - x^* = (10^{-3})^2 \times 0.1 = 10^{-7}$

$x_4 - x^* = (10^{-7})^2 \times 0.1 = 10^{-15}$

No need to have $r < 1$

(1) implies superlinear convergence

As long as $\|x_k - x^*\| \to 0$, the convergence is still faster.

Convergence Rate V

Example:

$$\frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^2} = 2^2, \text{ and start from } \|x_k - x^*\| = 2^{-3}$$

$$2^{-6} \cdot 2^2 = 2^{-4}, \quad 2^{-8} \cdot 2^2 = 2^{-6}$$

Newton’s method: Quadratic convergence
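One way to observe quadratic convergence numerically is to print the ratio $\|x_{k+1} - x^*\| / \|x_k - x^*\|^2$ along a Newton run and check that it stays bounded. The sketch below does this for $f(x) = x^2 - 2$ with root $\sqrt{2}$; the test function and the starting point 2 are assumptions chosen for illustration.

```python
import math

root = math.sqrt(2.0)                # known solution x* of x^2 - 2 = 0
x = 2.0                              # assumed starting point
errors = [abs(x - root)]
for _ in range(4):
    x = x - (x * x - 2.0) / (2.0 * x)    # Newton step for f(x) = x^2 - 2
    errors.append(abs(x - root))

# Ratio |x_{k+1} - x*| / |x_k - x*|^2; bounded ratios indicate quadratic convergence.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(len(errors) - 1)]
for e, r in zip(errors[1:], ratios):
    print(f"error = {e:.3e}   error / previous_error^2 = {r:.4f}")
```

For this particular function the ratio equals $1/(2x_k)$, so it approaches $1/(2\sqrt{2}) \approx 0.354$ rather than growing.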

Homework 7-1 I

Write C or Matlab programs for bisection and Newton methods

Find roots of some functions (also draw these functions)

You don’t want to try the same functions as others

Try different initial solutions and see what happens

Check their convergence rates

Newton Method: Quadratic Convergence I

Use Newton’s method to solve

$$f(x) = 0$$

Assume $f$ satisfies

1. $f$ is continuously differentiable
2. $f'$ is Lipschitz continuous:

$$|f'(y) - f'(x)| \le \alpha|y - x|, \quad \forall x, y$$

where $\alpha > 0$ is the Lipschitz constant

Newton Method: Quadratic Convergence II

Theorem 1

If $\{x_k\} \to x^*$ and $f'(x^*) \ne 0$, then

1. $f(x^*) = 0$
2. $\exists L \ge 1, \delta > 0$ such that $\forall k \ge L$

$$|x_{k+1} - x^*| \le \delta|x_k - x^*|^2$$

Proof.

Newton Method: Quadratic Convergence III

From Lemma 3,

$$f(x_{k+1}) = f(x_k) + f'(x_k)(x_{k+1} - x_k) + e(x_{k+1}, x_k) = 0 + e(x_{k+1}, x_k),$$

$$|e(x_{k+1}, x_k)| \le \frac{1}{2}\alpha|x_{k+1} - x_k|^2$$

$$|f(x_{k+1})| \le \frac{1}{2}\alpha|x_{k+1} - x_k|^2.$$

Newton Method: Quadratic Convergence IV

Since $\{x_k\} \to x^*$,

$$|f(x_{k+1})| \to 0$$

Since $f$ is continuous,

$$f(x_{k+1}) \to f(x^*) \text{ implies } f(x^*) = 0.$$

Define $\bar\beta \equiv |f'(x^*)^{-1}|$.

Newton Method: Quadratic Convergence V

Since $|x_k - x^*| \to 0$ and $f'$ is Lipschitz continuous, $\exists L \ge 1$ such that $\forall k \ge L$,

$$|f'(x^*)^{-1}(f'(x_k) - f'(x^*))| \le |f'(x^*)^{-1}||f'(x_k) - f'(x^*)| \le \alpha\bar\beta|x_k - x^*| \le \frac{1}{2}$$

Newton Method: Quadratic Convergence VI

For $k \ge L$, from Lemma 2, $f'(x_k)^{-1}$ exists and

$$|f'(x_k)^{-1}| \le 2\bar\beta$$

By the definition of Newton’s method:

$$x_{k+1} - x^* = x_k - x^* - f'(x_k)^{-1}f(x_k) \quad (2)$$

Since $f(x^*) = 0$,

$$f(x^*) = f(x_k) + f'(x_k)(x^* - x_k) + e(x^*, x_k) = 0$$

Newton Method: Quadratic Convergence VII

Then multiply each term by $f'(x_k)^{-1}$:

$$f'(x_k)^{-1}f(x_k) = x_k - x^* - f'(x_k)^{-1}e(x^*, x_k) \quad (3)$$

Thus (2) and (3) imply

$$x_{k+1} - x^* = f'(x_k)^{-1}e(x^*, x_k)$$

Newton Method: Quadratic Convergence VIII

For $k \ge L$,

$$|x_{k+1} - x^*| \le |f'(x_k)^{-1}||e(x^*, x_k)| \le 2\bar\beta \cdot \frac{1}{2}\alpha|x_k - x^*|^2 = \delta|x_k - x^*|^2$$

by defining $\delta = \bar\beta\alpha$

Then the proof is complete

Newton Method: Quadratic Convergence IX

Lemma 2

If

$$|f'(x^*)^{-1}(f'(x_k) - f'(x^*))| < 1 \quad (4)$$

then $f'(x_k)^{-1}$ exists and

$$|f'(x_k)^{-1}| \le \frac{|f'(x^*)^{-1}|}{1 - |f'(x^*)^{-1}(f'(x_k) - f'(x^*))|} \quad (5)$$

Proof.

Newton Method: Quadratic Convergence X

$$|f'(x_k)^{-1}| \le \frac{|f'(x^*)^{-1}|}{1 - |f'(x^*)^{-1}f'(x_k) - 1|}$$

$$|f'(x_k)^{-1}| - |f'(x^*)^{-1} - f'(x_k)^{-1}| \le |f'(x^*)^{-1}| \quad (7)$$

Note that we need (4) only for ensuring that the denominator of the right-hand side of (5) is positive.

Newton Method: Quadratic Convergence XI

Lemma 3

If $f'$ is Lipschitz continuous with constant $\alpha > 0$ and

$$f(y) = f(x) + f'(x)(y - x) + e(y, x),$$

then $\forall x, y$

$$|e(y, x)| \le \frac{1}{2}\alpha|y - x|^2$$

Newton Method: Quadratic Convergence XII

Proof.

$$f(y) - f(x) - f'(x)(y - x) = \int_0^1 (f'(x + t(y - x)) - f'(x))(y - x)\,dt$$

Note that

$$\frac{d\,f(x + t(y - x))}{dt} = f'(x + t(y - x))(y - x)$$

Newton Method: Quadratic Convergence XIII

so that $\int_0^1 f'(x + t(y - x))(y - x)\,dt = f(y) - f(x)$, because at $t = 1$:

$$f(x + t(y - x)) = f(y)$$

and at $t = 0$:

$$f(x + t(y - x)) = f(x)$$

Newton Method: Quadratic Convergence XIV

Then

$$|e(y, x)| \le \int_0^1 |(f'(x + t(y - x)) - f'(x))(y - x)|\,dt \quad (8)$$

$$\le \int_0^1 \alpha|(x + t(y - x)) - x||y - x|\,dt \quad (9)$$

$$\le \int_0^1 \alpha t|y - x|^2\,dt = \frac{1}{2}\alpha|y - x|^2$$

(8) to (9) using Lipschitz condition (by assumption)

Nonlinear Equations with more than One Variable I

$n$ equalities, $n$ variables

$$\begin{aligned} f_1(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_n(x_1, \ldots, x_n) &= 0 \end{aligned}$$

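Nonlinear Equations with more than One Variable II

The Jacobian $J(x)$

$$J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_n(x)}{\partial x_1} & \cdots & \frac{\partial f_n(x)}{\partial x_n} \end{bmatrix}$$

$J(x)$ is a square matrix but in general not symmetric

Newton’s method

$$x_{k+1} = x_k - J(x_k)^{-1}f(x_k)$$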
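The multivariate iteration can be sketched in plain Python for two variables. This is a minimal sketch under assumptions: the system $x_1^3 + 10x_1 - x_2 - 5 = 0$, $x_1 + x_2^3 - 10x_2 + 1 = 0$ is used as a test problem, the names `f`, `jacobian`, and `newton_2d` are illustrative, and the $2 \times 2$ linear system $J(x_k)s = f(x_k)$ is solved by Cramer's rule instead of forming $J(x_k)^{-1}$.

```python
import math


def f(x1, x2):
    """Test system: both components should vanish at a solution."""
    return (x1**3 + 10.0 * x1 - x2 - 5.0,
            x1 + x2**3 - 10.0 * x2 + 1.0)


def jacobian(x1, x2):
    """Rows are the gradients of f1 and f2."""
    return ((3.0 * x1**2 + 10.0, -1.0),
            (1.0, 3.0 * x2**2 - 10.0))


def newton_2d(x1, x2, tol=1e-7, max_iter=50):
    for k in range(1, max_iter + 1):
        f1, f2 = f(x1, x2)
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("J(x_k) is singular")
        # Solve J s = f by Cramer's rule, then x_{k+1} = x_k - s.
        s1 = (d * f1 - b * f2) / det
        s2 = (a * f2 - c * f1) / det
        x1, x2 = x1 - s1, x2 - s2
        # Assumed stopping rule: ||x_{k+1} - x_k|| < tol.
        if math.hypot(s1, s2) < tol:
            return x1, x2, k
    raise RuntimeError("no convergence within max_iter iterations")


sol1, sol2, iters = newton_2d(0.0, 0.0)
print(sol1, sol2, iters, f(sol1, sol2))
```

For larger $n$, a general linear solver would replace Cramer's rule; the structure of the loop stays the same.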

Nonlinear Equations with more than One Variable III

Of course whether $J(x_k)$ is invertible is an issue

Homework 7-2: Consider

$$f_1(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2 - 1 = 0$$

$$f_2(x_1, x_2, x_3) = x_1^2 + x_3^2 - 1/4 = 0$$

$$f_3(x_1, x_2, x_3) = x_1^2 + x_2^2 - 4x_3 = 0$$

With different initial solutions, use Newton’s method to solve it until

$$\|x_{k+1} - x_k\| < 10^{-7}$$

Nonlinear Equations with more than One Variable IV

Use Matlab, C, or Python.

Remaining questions about Newton’s method:

Selection of initial solutions

Under what conditions do we know the sequence converges?

If there are $m$ equalities and $n$ variables but $m \ne n$:

$$\begin{aligned} f_1(x_1, \ldots, x_n) &= 0 \\ &\;\;\vdots \\ f_m(x_1, \ldots, x_n) &= 0 \end{aligned}$$

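Nonlinear Equations with more than One Variable V

Then we cannot use Newton’s method directly, as $J(x)$ is $m \times n$ and has no inverse

$$J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$

It’s not a square matrix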

Nonlinear Equations with more than One Variable VI

Consider the following reformulation

$$\min_x \|f(x)\|^2 \quad (10)$$

That is

$$\min_x f(x)^T f(x) = f_1(x)^2 + \cdots + f_m(x)^2$$

Nonlinear Equations with more than One Variable VII

Now $f: R^n \to R^m$

If there is an $x$ such that $f(x) = 0$, then the minimal value of (10) must be zero.

Consider

$$g(x) \equiv \|f(x)\|^2$$

$\min g(x)$ means solving $\nabla g(x) = 0$

Nonlinear Equations with more than One Variable VIII

That is

$$\frac{\partial g(x)}{\partial x_1} = 0, \ldots, \frac{\partial g(x)}{\partial x_n} = 0$$

$g: R^n \to R^1$, $\nabla g: R^n \to R^n$

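Nonlinear Equations with more than One Variable IX

What is $\nabla g(x)$?

$$\frac{\partial g(x)}{\partial x_1} = \frac{\partial (f_1(x)^2 + \cdots + f_m(x)^2)}{\partial x_1} = 2f_1(x)\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2f_m(x)\frac{\partial f_m(x)}{\partial x_1}$$

$$\vdots$$

$$\frac{\partial g(x)}{\partial x_n} = 2f_1(x)\frac{\partial f_1(x)}{\partial x_n} + \cdots + 2f_m(x)\frac{\partial f_m(x)}{\partial x_n}$$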

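Nonlinear Equations with more than One Variable X

Therefore,

$$\nabla g(x) = 2 \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial f_1(x)}{\partial x_n} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix} \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix} = 2J(x)^T f(x)$$

Then we use Newton’s method to solve $\nabla g(x) = 0$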
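The identity $\nabla g(x) = 2J(x)^T f(x)$ can be checked numerically against central finite differences. The sketch below uses a hypothetical system with $m = 3$ equations and $n = 2$ variables; the functions `f`, `jacobian`, `grad_g`, and `grad_fd` and the test point are assumptions for illustration.

```python
def f(x):
    """Hypothetical system with m = 3, n = 2; f(1, 1) = 0."""
    x1, x2 = x
    return [x1 + x2 - 2.0, x1 - x2, x1 * x2 - 1.0]


def jacobian(x):
    """m x n Jacobian: row i holds the partial derivatives of f_i."""
    x1, x2 = x
    return [[1.0, 1.0], [1.0, -1.0], [x2, x1]]


def g(x):
    """g(x) = ||f(x)||^2."""
    return sum(fi * fi for fi in f(x))


def grad_g(x):
    """Analytic gradient 2 J(x)^T f(x)."""
    fx, J = f(x), jacobian(x)
    return [2.0 * sum(J[i][j] * fx[i] for i in range(len(fx)))
            for j in range(len(x))]


def grad_fd(x, h=1e-6):
    """Central finite-difference approximation of the gradient of g."""
    grad = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((g(xp) - g(xm)) / (2.0 * h))
    return grad


x = [0.3, -1.2]
analytic = grad_g(x)
numeric = grad_fd(x)
print(analytic, numeric)
```

At a solution of $f(x) = 0$, such as $(1, 1)$ for this system, the analytic gradient is zero, consistent with the minimal value of $g$ being zero there.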

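Nonlinear Equations with more than One Variable XI

The iteration

$$x_{k+1} = x_k - \nabla^2 g(x_k)^{-1}\nabla g(x_k)$$

What is $\nabla^2 g(x)$? An $n \times n$ matrix:

$$\nabla^2 g(x) = \begin{bmatrix} \frac{\partial^2 g(x)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 g(x)}{\partial x_n \partial x_1} \\ \vdots & & \vdots \\ \frac{\partial^2 g(x)}{\partial x_1 \partial x_n} & \cdots & \frac{\partial^2 g(x)}{\partial x_n \partial x_n} \end{bmatrix}$$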

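Nonlinear Equations with more than One Variable XII

As in general

$$\frac{\partial^2 g(x)}{\partial x_i \partial x_j} = \frac{\partial^2 g(x)}{\partial x_j \partial x_i}$$

$\nabla^2 g(x)$ is symmetric

How is $\nabla^2 g(x)$ expressed in terms of $f$?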

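Nonlinear Equations with more than One Variable XIII

Remember that

$$\frac{\partial g(x)}{\partial x_1} = 2f_1(x)\frac{\partial f_1(x)}{\partial x_1} + \cdots + 2f_m(x)\frac{\partial f_m(x)}{\partial x_1}$$

$$\frac{\partial^2 g(x)}{\partial x_1 \partial x_1} = 2f_1(x)\frac{\partial^2 f_1(x)}{\partial x_1 \partial x_1} + 2\left(\frac{\partial f_1(x)}{\partial x_1}\right)^2 + \cdots + 2f_m(x)\frac{\partial^2 f_m(x)}{\partial x_1 \partial x_1} + 2\left(\frac{\partial f_m(x)}{\partial x_1}\right)^2$$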

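Nonlinear Equations with more than One Variable XIV

$$\frac{\partial^2 g(x)}{\partial x_1 \partial x_2} = 2f_1(x)\frac{\partial^2 f_1(x)}{\partial x_1 \partial x_2} + 2\frac{\partial f_1(x)}{\partial x_1}\frac{\partial f_1(x)}{\partial x_2} + \cdots + 2f_m(x)\frac{\partial^2 f_m(x)}{\partial x_1 \partial x_2} + 2\frac{\partial f_m(x)}{\partial x_1}\frac{\partial f_m(x)}{\partial x_2}$$

In matrix form: Define

$$(G_k)_{ij} \equiv f_k(x)\frac{\partial^2 f_k(x)}{\partial x_i \partial x_j}$$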

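Nonlinear Equations with more than One Variable XV

$$\nabla^2 g(x) = 2\left(\sum_{k=1}^{m} G_k + J(x)^T J(x)\right)$$

$$J(x)^T = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_1} \\ \vdots & & \vdots \\ \frac{\partial f_1(x)}{\partial x_n} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}, \quad J(x) = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$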

Fixed-Point Iteration for Nonlinear Systems I

$$f_1(x_1, x_2) = x_1^3 + 10x_1 - x_2 - 5 = 0$$

$$f_2(x_1, x_2) = x_1 + x_2^3 - 10x_2 + 1 = 0$$

Fixed-Point Iteration for Nonlinear Systems II

Converting to

$$x_1 = g_1(x_1, x_2), \quad x_2 = g_2(x_1, x_2)$$

For example,

$$x_1 = -0.1x_1^3 + 0.1x_2 + 0.5$$

$$x_2 = 0.1x_1 + 0.1x_2^3 + 0.1$$

Then using the following iteration

$$x_1^{k+1} = -0.1(x_1^k)^3 + 0.1x_2^k + 0.5$$

$$x_2^{k+1} = 0.1x_1^k + 0.1(x_2^k)^3 + 0.1$$
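The iteration above can be run directly. The sketch below is a minimal illustration: the starting point $(0, 0)$, the tolerance, the stopping rule (largest componentwise change below `tol`), and the name `fixed_point` are assumptions.

```python
def g(x1, x2):
    """Right-hand sides g1, g2 of the fixed-point form x = g(x)."""
    return (-0.1 * x1**3 + 0.1 * x2 + 0.5,
            0.1 * x1 + 0.1 * x2**3 + 0.1)


def fixed_point(x1, x2, tol=1e-12, max_iter=200):
    for k in range(1, max_iter + 1):
        y1, y2 = g(x1, x2)           # x^{k+1} = g(x^k)
        # Assumed stopping rule: largest componentwise change below tol.
        if max(abs(y1 - x1), abs(y2 - x2)) < tol:
            return y1, y2, k
        x1, x2 = y1, y2
    raise RuntimeError("no convergence within max_iter iterations")


x1, x2, iters = fixed_point(0.0, 0.0)
# Residuals of the original equations f1 = 0 and f2 = 0 at the computed point.
r1 = x1**3 + 10.0 * x1 - x2 - 5.0
r2 = x1 + x2**3 - 10.0 * x2 + 1.0
print(x1, x2, iters, r1, r2)
```

Checking the residuals of $f_1$ and $f_2$ confirms that a fixed point of $g$ is a solution of the original system.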

Fixed-Point Iteration for Nonlinear Systems III

Fixed point?

If $x = g(x)$, then $x$ is a fixed point of $g$

The convergence?

Fixed-Point Iteration for Nonlinear Systems IV

Theorem 4

If $p$ is a fixed point of $g(x)$,

$$\|\nabla g(p)\| < 1,$$

and the initial point $x_0$ is sufficiently close to $p$, then the algorithm

$$x_{k+1} = g(x_k)$$

converges
