INF5620: Numerical Methods for Partial Differential Equations
Hans Petter Langtangen
Simula Research Laboratory, and
Dept. of Informatics, Univ. of Oslo
January 2006
INF5620: Numerical Methods for Partial Differential Equations – p. 1
About the course
Course data
10 study points
Lectures: Wednesdays 10-14 in B70, math building
Sometimes 4 h lectures, sometimes less, sometimes 4 h exercises
Course web page: http://folk.uio.no/hpl/INF5620, reachable from the official central UiO web page of the course
Look for messages at the web page!
Teachers: Xing Cai: [email protected]; Hans Petter Langtangen: [email protected], 99 53 20 21
Basic features of the course
Goal: produce solutions of PDEs
Integrated approach: mechanics, numerics, algorithms, software
Generic approach: methods applicable to a wide range of PDE problems
Modern numerical methods
Modern implementation techniques
Non-trivial applications with nonlinear systems of PDEs
Analysis of simplified problems
Discovery of numerical properties by computer experiments
Carry out your own 2-week PDE project
Contents
Numerical methods:
Finite difference methods
Finite element methods (main emphasis)
Application areas:
Heat transfer
Diffusion
Wave phenomena
Thermo-elasticity
Viscous fluid flow
How to learn it
Overview from lectures
Exercises with hand calculations (get the details!)
Compulsory exercises:
2 computer implementations of 1D finite difference methods
2 1D finite element hand calculations
2-week PDE project (comprehensive implementation)
The exam
20 min talk
Additional questions
6 topics given two weeks beforehand
Focus on overview and understanding
Some focus on mathematical details, derivations, intricate steps in algorithms etc.
No focus on details regarding software tools (but some topics will involve overview and the principal workings of software tools)
Acronyms
PDE = partial differential equation (plural: PDEs)
ODE = ordinary differential equation (plural: ODEs)
OOP = object-oriented programming
Scientific software trends
Dramatic increase in the interest in problem-solving environments: Maple, Matlab, Mathematica, S-Plus, ...
PDE solvers are often huge & expensive
It’s difficult to build a flexible “Matlab” for PDEs, but modern programming techniques and languages (e.g. C++) simplify the task
Diffpack is one attempt (used in this course)
Practical problem solving in industry makes use of large program packages – that is one reason why we use a package in this course
New numerical projects in industry make increasing use of C++ instead of Fortran – therefore we expose students to C++ and more modern implementation techniques
We also see the potential of high-level languages like Python, in combination with C++ or Fortran, for solving PDEs – INF5660 may be a companion course
Diffpack
Numerical library for PDE solution
(Almost) a full problem-solving environment for PDEs
A tool for programmers
Implemented in C++ and requires you to program in C++
Relies on object-oriented programming
Reduced implementation efforts for finite elements and PDEs
Enables real-world problem solving in a course
Some features of Diffpack
Free version (though with an array-size limitation)
Free version at UiO and for students
Over 200 commercial installations: www.diffpack.com (Siemens, Xerox, DaimlerChrysler, Mitsubishi, NASA, Intel, Stanford, Cornell, Cambridge, Harvard, ...)
Some application areas:
basic model equations in applied math. (Laplace, heat and wave equations)
viscous fluid flow (Navier-Stokes equations)
many types of water wave equations
heat transfer, incl. phase changes
thermo-elasticity
stochastic PDEs and ODEs
computational engineering, medicine, geology, finance
1D, 2D, 3D within the same code lines
The Diffpack philosophy
Diffpack relies on programming and scripting
Diffpack is a set of libraries, consisting of C++ classes in hierarchies (OO design), applications (examples), and (Perl/Python) scripts
A simulator mainly contains problem-dependent code; generic methods and data structures are already programmed in the libraries
Diffpack acts as a computational engine with a layered design:
primitive layers: arrays, input/output, ...
intermediate layers: linear systems/solvers, grids, fields, ...
higher-level layers: simulators, parallel toolbox, ...
How to learn Diffpack
Required:
good general programming skills
some familiarity with the class concept
thorough knowledge of the numerics
the right attitude:
don’t reinvent the wheel – learn to use others’ code
don’t try to understand all details – utilize black boxes
Principles:
learn on demand
rely on program examples
stay cool!
Have access to a C++ textbook, e.g., Barton and Nackman’s Scientific and Engineering C++
Literature
H. P. Langtangen: Computational Partial Differential Equations, Springer,2nd ed., 2003
Warnings
Numerical solution of PDEs is a huge field in rapid growth; it takes years to master the field
Many other fields (computer science, physics, mathematics) are wired into PDE numerics
C++ takes time to master
OOP takes time to understand
Diffpack requires you to have a thorough and generic understanding of the numerics
Difficulties with this course are usually not due to C++/OOP/Diffpack details – but lack of the proper overview of mathematics and numerics
1D heat conduction
Heat conduction in the continental crust
(Figure: the crust occupies 0 ≤ x ≤ b, with the earth surface at x = 0 held at temperature Ts and heat flow −Q entering from the mantle at x = b.)
Knowing the temperature at the earth’s surface and the heat flow from the mantle, what is the temperature distribution through the continental crust?
Interesting question in geology and geophysics – and for those nations exploring oil resources...
Basic assumptions
Physical assumptions:
Crust of infinite area
Steady state heat flow
Heat generated by radioactive elements
Physical quantities:
u(x) : temperature
q(x) : heat flux (velocity of heat)
s(x) : heat release per unit time and mass
Summary of the model
Differential equations and boundary conditions:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = 0, u′(1) = 1
(f(x) is a scaled version of s(x))
Finite difference method (h = cell size):
u_1 = 0
u_{i+1} − 2u_i + u_{i−1} = −h²f_i, i = 2, ..., n−1
2u_{n−1} − 2u_n = −2h − h²f_n
which can be written as a linear system
Au = b
where u = (u_1, ..., u_n) and A is tridiagonal
What to do: Fill A and b, solve for u by Gaussian elimination
Derivation of the model (1)
(Figure: a column of the crust from x = 0 to x = b, with heat inflow and outflow and heat source s(x) = R exp(−x/L).)
Physical principles:
First law of thermodynamics:
net outflow of heat = total generated heat
Fourier’s law: heat flows from hot to cold regions (i.e. heat velocity is proportional to changes in temperature)
q(x) = −λu′(x)
λ reflects the material’s ability to conduct heat
Derivation of the model (2)
(Figure: a small element of width h with heat source s(x) and fluxes q(x−h/2) and q(x+h/2) at its ends.)
The first law of thermodynamics: outflow = heat generation
q(x+ h/2)− q(x− h/2) = s(x)h
Here: heat generation s(x) due to radioactive decay,
s(x) = R exp(−x/L)
Divide left-hand side by h and make h small,
(q(x+h/2) − q(x−h/2))/h = s(x)  →  q′(x) = s(x)
Derivation of the model (3)
We have more information (boundary conditions): u(0) = Ts (at the surface of the earth)
q(b) = −Q (at the bottom of the crust)
We need to get u into the model; combining the 1st law of thermodynamics
q′(x) = s(x)
with Fourier’s law q(x) = −λu′(x)
we can eliminate q and get a differential equation for u:
−d/dx (λ du/dx) = s(x)
Mathematical model
−d/dx (λ du/dx) = R e^{−x/L},  u(0) = Ts,  λ(b)u′(b) = Q
or if λ is constant:
−u′′(x) = λ^{−1} R e^{−x/L},  u(0) = Ts,  λ(b)u′(b) = Q
Observe: u = u(x; λ, R, L, b, Ts, Q)
u varies with x and six physical parameters!
Suppose that we want to investigate the influence of the different parameters. Assume (modestly) three values of each parameter: number of possible combinations: 3⁶ = 729.
Using scaling we can reduce the six physical parameters λ, R, L, b, Ts, Q to only two!
Scaling
We introduce dimensionless quantities (see HPL A.1 and assume that λ is constant):
x = b x̄,  u = Ts + Qb ū/λ,  s(b x̄) = R s̄(x̄)
−d²ū/dx̄² = γ e^{−β x̄},  ū(0) = 0,  dū/dx̄(1) = 1
where we have two dimensionless quantities
β = b/L,  γ = bR/Q
Dropping the bars, we get a problem of the form
−u′′(x) = f(x), x ∈ (0, 1)
u(0) = 0
u′(1) = 1
Discretization of our equation
1. Divide the domain [0, 1] into n − 1 cells; the cell edges x_i are called nodes (i = 1, ..., n)
2. Let u_i = u(x_i); our goal is to let the computer calculate u_1, u_2, u_3, ...
(Figure: a grid over [0, 1] with nodes carrying the values u_1, ..., u_5.)
3. The differential equation is to be fulfilled at the nodes only:
−u′′(x_i) = f(x_i), i = 1, ..., n
4. Derivatives are approximated by finite differences
4. Derivatives are approximated by finite differences
Finite difference approximations (1)
Recall the definition of the derivative from introductory calculus:
lim_{h→0} (u(x+h) − u(x))/h = u′(x)
Idea: use this formula with a finite h – this is a finite difference approximation to the derivative
What is the error in this approximation? Expand u(x+h) in a Taylor series and compute
(1/h)(u(x+h) − u(x)) = (1/h)(u(x) + u′(x)h + ½u′′(x)h² + ··· − u(x)) = u′(x) + ½u′′(x)h + ···
The largest error term is u′′h/2, proportional to h
Finite difference approximations (2)
An alternative finite difference approximation:
u′(x) ≈ (u(x+h) − u(x−h))/(2h)
Compute the error by Taylor series expansion of u(x+h) and u(x−h) around x:
(u(x+h) − u(x−h))/(2h) = u′(x) + (1/6)u′′′(x)h² + ...
Leading error term proportional to h²
Finite difference approximations (3)
Approximation to u′′(x):
u′′(x) ≈ (u(x+h) − 2u(x) + u(x−h))/h²
or
u′′(x_i) ≈ (u(x_i+h) − 2u(x_i) + u(x_i−h))/h²
Alternative notation, noting that u_i ≡ u(x_i), u_{i+1} = u(x_i+h), and u_{i−1} = u(x_i−h):
[u′′]_i ≈ (u_{i+1} − 2u_i + u_{i−1})/h²
Show that the error is O(h²) (Hint: expand u_{i+1}, u_i, and u_{i−1} in Taylor series around x_i and insert the series in the finite difference formula)
The discrete differential equation
The equation at the nodes:
−u′′(x_i) = f(x_i), i = 1, ..., n
Replace u′′ by a centered finite difference:
u′′(x_i) ≈ (u_{i+1} − 2u_i + u_{i−1})/h²
⇒ The differential equation is transformed to a system of algebraic equations:
−(u_{i+1} − 2u_i + u_{i−1})/h² = f_i, i = 1, ..., n
Discretizing boundary conditions
u(0) = 0 simply becomes u_1 = 0
u′(1) = 1 can be approximated as
(u_{n+1} − u_{n−1})/(2h) = 1
Problem: u_{n+1} is not in the mesh!
Solution: Use the discrete differential equation for i = n:
−(u_{n−1} − 2u_n + u_{n+1})/h² = f_n
and the discrete boundary condition to eliminate u_{n+1}
The result is 2u_{n−1} − 2u_n = −2h − h²f_n
System of equations
The complete set of finite difference equations,
u_1 = 0
u_{i+1} − 2u_i + u_{i−1} = −h²f_i, i = 2, ..., n−1
2u_{n−1} − 2u_n = −2h − h²f_n
can be written as a linear system in matrix form
Au = b
where u = (u_1, ..., u_n) and A is a tridiagonal matrix
Tridiagonal coefficient matrix
A =
| A_{1,1}  A_{1,2}   0        ···          0         |
| A_{2,1}  A_{2,2}  A_{2,3}                          |
|            ⋱        ⋱        ⋱                     |
|   0    A_{i,i−1}  A_{i,i}  A_{i,i+1}    0          |
|                      ⋱        ⋱      A_{n−1,n}     |
|   0      ···        0     A_{n,n−1}  A_{n,n}       |
with only the three diagonals nonzero, and
A_{1,1} = 1, A_{1,2} = 0, A_{n,n−1} = 2
A_{i,i−1} = 1, A_{i,i+1} = 1, i = 2, ..., n−1
A_{i,i} = −2, i = 2, ..., n
Solution of linear systems
The system is solved by Gaussian elimination:
Compute the LU factorization: A = LU
L: lower triangular matrix
U : upper triangular matrix
Solve Ly = b (easy)
Solve Ux = y (easy)
Computational work:
A is dense: O(n³)
A is tridiagonal: O(n)
Solution of linear systems; general case
LU factorization (Gaussian elimination) is the optimal solution method when A is tridiagonal
However, in 2D and 3D problems,
LU factorization is a very slow process (A is no longer tridiagonal)
the structure of A favors iterative methods, which are very much faster than LU factorization
Iterative methods are discussed at the end of the course
Implementation
We want the computer to solve our linear system (for arbitrary n)
This task can easily be accomplished using any computer language and any computer
The program fills A and b with numbers according to the derived formulas and then calls a Gaussian elimination procedure to find u
In numerical simulation in general, computer codes are large and complicated, and using effective tools is fundamental
We shall use a comprehensive tool, Diffpack, even for this very simple problem
The Diffpack code will be close to similar codes in Python, Fortran 77, Matlab, C, C++, Java, ...
There is no particular advantage of using Diffpack (except that Diffpack has a solver for tridiagonal linear systems), but it is a simple problem for the first Diffpack encounter
Diffpack intro
The standard intro to a new language
A “scientific” Hello World code:
#include <iostream>  // make input/output functionality available
#include <cmath>     // make math functions available: e.g. sin(x)
int main ()          // function "main" is always the main program
{
  std::cout << "Hello, World! Give a number: ";
  double r; std::cin >> r;  // read number into double precision r
  double s = sin(r);        // declare s and initialize with sin(r)
  std::cout << "\nThe value of sin(" << r << ") is " << s << "\n";
  return 0;
}
This is pure C++ - no Diffpack!
Compiling and linking
Compile:
g++ -c hw.cpp
Link hw.o to the C/C++ standard and math library:
g++ -o app hw.o -lm
# -lm (link to math lib.) can often be left out:
g++ -o app hw.o
Run the program:
./app
Compiling and linking in one step:
g++ -o app hw.cpp -lm
C++ compilers can have other names: CC, xlC
The corresponding Diffpack program
Make a special directory: Mkdir myfirstdp
cd myfirstdp
Make a file hw.cpp with the following contents:
#include <IsOs.h>  // Diffpack tools for input/output
#include <cmath>   // make math functions available: sin(x)
int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);  // should always be performed
  s_o << "Hello world! Give a number: ";
  real r; s_i >> r;  // read real number into r
  real s = sin(r);
  s_o << "\nThe value of sin(" << r << ") is " << s << "\n";
  return 0;
}
/* Explanation:
   IsOs.h : input/output in Diffpack, much like iostream
   real   : real variables in Diffpack, equals double by default
   s_i    : standard input in Diffpack, corresponds to std::cin
   s_o    : standard output in Diffpack, corresponds to std::cout */
Compiling and linking
Diffpack is compiled using makefiles (which are automatically generated by the Mkdir command)
Compilation and linking is just a matter of
Make (safe, but results in slow code)
Make MODE=opt (fast code, but less safety checks)
Always start with Make; use only optimized mode (MODE=opt) when the program is thoroughly tested!
Arrays in Diffpack
Conventions as in Fortran:
first index is 1
subscript syntax: a(i)
Different from C, where arrays start at 0 and brackets are used: a[0], a[1], ...
Diffpack arrays are not a built-in feature of C++, but they are defined by a programmer (and can in principle be extended by anybody to meet the demands in a particular application)
Code example with arrays
#include <Arrays_real.h>
int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);
  int i,j,k,n,m,p; real r;
  n = m = 4; p = 3;
  Vec(real) w(n);        // vector of length n
  w.redim (m);           // redimension w to length m
  i = w.size();          // extract the current length
  w = -3.14;             // set all entries to -3.14
  Vec(real) z;           // declare another vector
  z = w;                 // z becomes a copy of w
  z(n-1) = w(1) - 4.3;   // assign entry no. n-1
  z.print (s_o, "z");
  z.printAscii(s_o,"z");
  z.print (s_o);
  return 0;
}
Heat conduction problem in Diffpack
Find a suitable test problem with known analytical solution
−u′′(x) = γ exp (−βx), u(0) = 0, u′(1) = 1
u(x) = (γ/β²)(1 − e^{−βx}) + (1 − (γ/β)e^{−β})x,  β ≠ 0
u(x) = x(1 + γ(1 − x/2)),  β = 0
Read β, γ and n
Initialize A and b
Call a Gaussian elimination procedure in Diffpack to solve for u
Test problem for debugging
Choose e.g. n = 2 and solve the discrete equations by hand,
u_1 = 0,  2u_1 − 2u_2 = −2h − h²γe^{−β}
Solution u_2:
u_2 = 1 + (γ/2)e^{−β}
When β = 0, the numerical solution is exact for all n (!), i.e., the analytical solution
u(x_i) = u_i = (i−1)h (1 + γ(1 − ½(i−1)h))
fulfills the discrete equations
In general, u′′ = const is solved exactly by finite difference methodson uniform grids
Diffpack/C++ program in F77/C style
Declaration and initialization of variables:
#include <Arrays_real.h>  // for array functionality (and I/O)
#include <cmath>          // for the exponential function

int main(int argc, const char* argv[])
{
  initDiffpack(argc, argv);
  s_o << "Give number of solution points: ";  // write to the screen
  int n;      // declare an integer n (no of grid points)
  s_i >> n;   // read n from s_i, i.e. the keyboard
  real h=1.0/(n-1);       // note: 1/(n-1) gives integer division (=0)
  Mat(real) A(n,n);       // create an nxn matrix
  ArrayGen(real) b(n);    // create a vector of length n
  ArrayGen(real) u(n);    // the grid point values
  s_o << "Give beta: ";  real beta;  s_i >> beta;
  s_o << "Give gamma: "; real gamma; s_i >> gamma;
Fill matrix and right-hand side
  A.fill(0.0);  // set all entries in A equal to 0.0
  b.fill(0.0);  // set all entries in b equal to 0.0
  real x; int i;
  i = 1;
  A(i,i) = 1;
  b(i) = 0;
  // inner grid points:
  for (i = 2; i <= n-1; i++) {  // i++ means i=i+1
    x = (i-1)*h;
    A(i,i-1) = 1; A(i,i) = -2; A(i,i+1) = 1;
    b(i) = - h*h*gamma*exp(-beta*x);
  }
  // i = n:
  i = n; x = (i-1)*h;
  A(i,i-1) = 2; A(i,i) = -2;
  b(i) = - 2*h - h*h*gamma*exp(-beta*x);
  if (n <= 10) A.print (s_o,"A matrix");  // print matrix to the screen
  b.print (s_o,"right-hand side");        // print vector to the screen
Solve for u and write out solution
  A.factLU(); A.forwBack(b,u);  // Gaussian elimination

  s_o << "\n\n x numerical error:\n";  // \n is newline
  real u_exact;
  for (i = 1; i <= n; i++) {
    x = (i-1)*h;
    if (beta < 1.0E-09)  // is beta zero?
      u_exact = x*(1 + gamma*(1 - 0.5*x));
    else
      u_exact = gamma/(beta*beta)*(1 - exp(-beta*x)) +
                (1 - gamma/beta*exp(-beta))*x;
    s_o << oform("%4.3f %8.5f %12.5e\n", x, u(i), u_exact-u(i));
  }
  // test for the case of only one cell:
  if (n == 2) s_o << "u(2)=" << 1+0.5*gamma*exp(-beta) << "\n";
  // write results to the file "SIMULATION.res"
  Os file ("SIMULATION.res", NEWFILE);  // open file
  for (i = 1; i <= n; i++)
    file << (i-1)*h << " " << u(i) << "\n";
  file->close();
}
Tridiagonal matrices
A is tridiagonal
Mat(real) A(n,n) is a dense matrix
Save memory and CPU time: use a tridiagonal matrix (this can give quite dramatic savings!)
MatTri(real) A(n)
A(i,-1), A(i,0), A(i,1) for Ai,i−1, Ai,i, Ai,i+1
Otherwise the program remains the same
Exercises
1. Perform the steps to be a Diffpack user
2. Type in the Diffpack version of our numerical “Hello World!” program,compile and run the program
3. Introduce MatTri instead of Mat in the 1D heat conduction program (Exercise 1.4 in HPL)
The heat conduction coefficient
The derivation of the 1D model ends in
−d/dx (λ du/dx) = s(x)
and allows a variable λ
λ: heat conduction coefficient
The continental crust is typically not homogeneous!
⇒ λ varies in space!
Model simplification: λ = λ(x) (λ = λ(x, y, z) would require a 3D model)
Need to discretize the operator
d/dx (λ du/dx)
Discretization of variable coefficients
Mathematical problem
−d/dx (λ(x) du/dx) = f(x), 0 < x < 1
u(0) = 0, u′(1) = 1.
NEVER expand (λu′)′ (by the rule of product differentiation)
Two-step discretization, first outer operator:
d/dx (λ(x) du/dx)|_{x=x_i} ≈ (1/h)[ (λ du/dx)|_{x=x_{i+1/2}} − (λ du/dx)|_{x=x_{i−1/2}} ]
Then inner operator:
(λ du/dx)|_{x=x_{i+1/2}} ≈ λ_{i+1/2} (u_{i+1} − u_i)/h
Finite difference equations
Left point, inner points, right point:
u_1 = 0
λ_{i+1/2}(u_{i+1} − u_i) − λ_{i−1/2}(u_i − u_{i−1}) = −h²f_i,  i = 2, ..., n−1
2λ_n(u_{n−1} − u_n) = −2hλ_{n+1/2} − h²f_n
Arithmetic mean:
λ_{i+1/2} = ½(λ_i + λ_{i+1})
Harmonic mean:
1/λ_{i+1/2} = ½(1/λ_i + 1/λ_{i+1})
Geometric mean:
λ_{i+1/2} = (λ_i λ_{i+1})^{1/2}
A nonlinear problem
Nonlinear heat conduction
Heat conduction typically depends upon the temperature
−d/dx (λ(u) du/dx) = f(x), 0 < x < 1, u(0) = 0, u′(1) = 1
This is a nonlinear differential equation
Using the same discretization reasoning as when λ = λ(x),
u_1 = 0
λ_{i+1/2}(u_{i+1} − u_i) − λ_{i−1/2}(u_i − u_{i−1}) = −h²f_i
2λ_n(u_{n−1} − u_n) = −2hλ_{n+1/2} − h²f_n
where λ_{i+1/2} ≡ λ(u_{i+1/2})
The new problem
Our discrete equations contain λ(u_{i+1/2}), i.e., the coefficients that we previously put in the matrix A now depend on the solution u_i and u_{i+1}
The linear system can be written as
A(u)u = b
This is a set of nonlinear algebraic equations
The nonlinearity arises from the λ(u)u′ product in the underlying differential equation
We cannot use LU decomposition because A depends on u
What can we do?
Solution method
If we only had a linear equation, we would get a linear system Au = b, which we know how to solve...
Idea: Guess a solution u⁰ and use this in λ:
−d/dx (λ(u⁰) du¹/dx) = f(x)
u¹ is – hopefully – a better approximation than u⁰
This approach suggests an iteration procedure:
use the solution from the last iteration in λ
the equation is now linear
use the solution technology for −(λ(x)u′(x))′ = f(x)
Algorithm
Guess a solution u0 (need not be correct)
Solve the recursive equations
−d/dx (λ(u^{k−1}) du^k/dx) = f(x),  u^k(0) = 0,  du^k/dx(1) = 1
until the difference between u^k and u^{k−1} is small
“Small” can mean
√( Σ_{j=1}^n |u_j^k − u_j^{k−1}|² ) ≤ ε
Pros: may reuse previous code by inserting an evaluation of λ(u_{i+1/2})
Cons: slow convergence (faster methods exist)
The complete scheme (1)
For the inner points i = 2, ..., n − 1:
½(λ(u_i^{k−1}) + λ(u_{i+1}^{k−1}))(u_{i+1}^k − u_i^k) − ½(λ(u_{i−1}^{k−1}) + λ(u_i^{k−1}))(u_i^k − u_{i−1}^k) = −h²f(x_i)
The complete scheme (2)
For i = n:
2λ(u_n^{k−1})(u_{n−1}^k − u_n^k) = −2hλ_{n+1/2} − h²f(x_n)
Now,
λ_{n+1/2} = ½(λ(u_n^{k−1}) + λ(u_{n+1}^{k−1}))
Using the boundary condition
(u_{n+1}^{k−1} − u_{n−1}^{k−1})/(2h) = 1,  k > 1
gives
λ_{n+1/2} = ½(λ(u_n^{k−1}) + λ(u_{n−1}^{k−1} + 2h))
Implementation
Reuse old program (Heat1D) with:
Loop around system generation and solution
Two arrays uk and ukm
Initial guess in ukm
New auxiliary variables (for iteration etc.)
Function lambda to evaluate λ(u)
Update A and b for each step
Call A.resetFact() to enable a new LU decomposition prior to the call to A.factLU()
Check for termination upon convergence
Set ukm equal to uk before a new iteration
The central code segment (1)
int k = 0;                       // iteration counter
const int k_max = 200;           // max no of iterations
real udiff = INFINITY;           // udiff = ||uk - ukm||
const real epsilon = 0.0000001;  // tolerance in termination crit.

while (udiff > epsilon && k <= k_max) {
  k++;                       // increase k by 1
  A.fill(0.0); b.fill(0.0);  // initialize A and b
  for (i = 1; i <= n; i++) {
    if (i == 1)
      A(1,0) = 1;
    else if (i > 1 && i < n) {
      lambda1 = lambda(ukm(i-1), m);
      lambda2 = lambda(ukm(i),   m);
      lambda3 = lambda(ukm(i+1), m);
      A(i,-1) = 0.5*(lambda1 + lambda2);
      A(i, 0) = -0.5*(lambda1 + 2*lambda2 + lambda3);
      A(i, 1) = 0.5*(lambda2 + lambda3);
    }
    else if (i == n) {
      A(i,-1) = 2*lambda(ukm(i), m);
      A(i, 0) = - A(i,-1);
      b(i) = -(h*lambda(ukm(i-1)+2*h,m)+lambda(ukm(i),m));
    }
  }
  // (the while loop continues on the next slide)
The central code segment (2)
  A.resetFact();                 // ready for new factLU
  A.factLU(); A.forwBack(b,uk);  // Gaussian elimination

  // check termination criterion:
  udiff = 0;
  for (i = 1; i <= n; i++)
    udiff += sqr(uk(i) - ukm(i));
  udiff = sqrt(udiff);
  s_o << "iteration " << k << ": udiff = " << udiff << "\n";

  ukm = uk;  // ready for next iteration
}  // end of the while loop started on the previous slide
No of iterations
λ(u) = u^m
(Plot: number of iterations as a function of m ∈ [0, 4], for n = 100.)
Numerical error
λ(u) = u^m
(Plot: log(error) as a function of m ∈ [0, 4], for n = 100.)
Convergence
Define the error from iterations:
E_I = ||u^k − u^{k−1}||
where ||·|| is some norm, e.g.,
||u|| = √( (1/n) Σ_{i=1}^n u_i² )
Define the discretization error:
E_Δ = ||u − u^k||
Basic issue in discretization: how does E_Δ vary with the cell size h?
Investigation: make E_I negligible (E_I ≪ E_Δ), compute E_Δ for different choices of h
Common model for relating discretization error to grid size:
E_Δ = Ch^r
fit C and r to data (linear least squares):
log E_Δ = log C + r log h
y = aξ + b,  y = log E_Δ,  b = log C,  ξ = log h,  r = a
Does E_Δ → 0 as h → 0? And how fast?
Second-order finite difference approximations suggest r = 2
Convergence plot (1)
λ(u) = u^m
(Plot: log(error) as a function of log(h), for m = 0.2, 1.2, 3.2.)
Convergence plot (2)
λ(u) = (1 + u)^m
(Plot: log(error) as a function of log(h), for m = 0.2, 1.2, 3.2.)
Summary of results
The number of iterations increases with m
m = 1: numerical solution is exact (!)
λ(u) = u^m: O(h^r) for r ≤ 1
(despite our use of O(h²) accurate finite differences!)
λ(u) = (1 + u)^m: O(h²) – as expected
Explanation: u^m gives u′(0) → ∞; need a very fine grid around x = 0 to get accurate results
Note: theory does not extend well to nonlinear problems; systematic experiments may be an important additional tool
Simulation of waves
Vibration of a string
Mathematical model: the wave equation
∂²u/∂t² = γ² ∂²u/∂x²,  x ∈ (a, b)
- Time- and space-dependent problem
- This is a partial differential equation (PDE)
- Boundary conditions at x = a, b (u or ∂u/∂x)
- Initial conditions: known u(x, 0) and u_t(x, 0)
Explicit finite difference method:
u_i^{ℓ+1} = f(u_i^ℓ, u_{i−1}^ℓ, u_{i+1}^ℓ, u_i^{ℓ−1})
Implementation: run through a space-time grid and compute u_i^{ℓ+1} for each grid point
Derivation of the model (1)
(Figure: the string as a curve over the x-axis in the xy-plane.)
Physical assumptions:
the string = a line in 2D space
no gravity forces
up-down movement (i.e., only in y-direction)
Physical quantities:
r = x i + u(x, t) j : position
T(x) : tension force (along the string)
θ(x) : angle with horizontal direction
ϱ(x) : density
Derivation of the model (2)
(Figure: a string element of width h and arc length Δs with mass ϱΔs, displacement u(x, t), and tension forces T(x−h/2) and T(x+h/2) at its ends.)
Physical principle: Newton’s second law applied to a small (infinitesimal) part of the string
sum of forces = total mass · acceleration
Derivation of the model (3)
Great mathematicians had great problems with understanding how to set up the mathematical model for a vibrating string
Euler, d’Alembert and Taylor all made various attempts (which look “stupid” by today’s standards...)
Lagrange was the first one to derive the right partial differential equation
This happened about 100 years after Newton had presented the mathematics and physics we need to derive this PDE
The derivation to be presented here is typical: simple principles, but lots of mathematical details; it’s easy to get lost in the details
Derivation of the model (4)
(Figure: as on the previous slide.)
Acceleration:
a = ∂²r/∂t² = (∂²u/∂t²) j
Newton’s law applied to a string element:
T(x + h/2) − T(x − h/2) = ϱ(x)Δs (∂²u/∂t²)(x, t) j
⇒ A vector equation with two components
Derivation of the model (5)
(Figure: as on the previous slide.)
The tension reads
T(x) = T(x) cos θ(x) i + T(x) sin θ(x) j
Newton’s law in component form:
T(x+h/2) cos θ(x+h/2) − T(x−h/2) cos θ(x−h/2) = 0
T(x+h/2) sin θ(x+h/2) − T(x−h/2) sin θ(x−h/2) = ϱ(x)Δs ∂²u/∂t²
Derivation of the model (6)
Divide the first component by h and let h → 0:
∂/∂x (T cos θ) = 0
Similarly for the second component:
∂/∂x (T sin θ) = ϱ ( lim_{h→0} Δs/h ) ∂²u/∂t²
Derivation of the model (7)
We need to determine the limit lim_{h→0} Δs/h
Assume a linear segment; then by Pythagoras:
Δs² = h² + Δu²,  i.e.,  lim_{h→0} Δs/h = √(1 + (∂u/∂x)²)
Furthermore,
tan θ = ∂u/∂x,
which means that
sin θ = tan θ/√(1 + tan²θ) = (∂u/∂x)/√(1 + (∂u/∂x)²)
Derivation of the model (8)
Altogether this gives
ϱ [1 + (∂u/∂x)²]^{1/2} ∂²u/∂t² = ∂/∂x ( T [1 + (∂u/∂x)²]^{−1/2} ∂u/∂x )
which is a nonlinear partial differential equation.
Assume small vibrations, i.e., (∂u/∂x)² ≪ 1. For small vibrations, θ(x) ≈ 0, such that
0 = ∂/∂x (T cos θ) = ∂/∂x ( T(1 − θ²/2! + ...) ) ≈ ∂T/∂x
This means that T is approximately a constant and that the square roots are ≈ 1
Summing up
The governing PDE:
∂²u/∂t² = c² ∂²u/∂x²,  c² = T/ϱ
String fixed at the ends:
u(a, t) = u(b, t) = 0
String initially at rest:
u(x, 0) = I(x),  ∂u/∂t(x, 0) = 0
The scaled wave equation problem
We scale the equations (γ ≡ 1, but kept as a label) and arrive at the following initial-boundary value problem:
∂²u/∂t² = γ² ∂²u/∂x²,  x ∈ (0, 1), t > 0
u(x, 0) = I(x),  x ∈ (0, 1)
∂u/∂t(x, 0) = 0,  x ∈ (0, 1)
u(0, t) = 0,  t > 0
u(1, t) = 0,  t > 0
Finite difference approximation (1)
Introduce a grid in space-time
x_i = (i−1)h, i = 1, ..., n
t_ℓ = ℓΔt, ℓ = 0, 1, ...
(Figure: the space-time grid, with spacing h in x and Δt in t.)
Finite difference approximation (2)
Central difference approximations:
∂²u/∂x²(x_i, t_ℓ) = (u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)/h² + O(h²)
∂²u/∂t²(x_i, t_ℓ) = (u_i^{ℓ−1} − 2u_i^ℓ + u_i^{ℓ+1})/Δt² + O(Δt²)
Insert these in the PDE
∂²u/∂t² = γ² ∂²u/∂x²
Finite difference approximation (3)
The PDE has been transformed to a difference equation
(u_i^{ℓ−1} − 2u_i^ℓ + u_i^{ℓ+1})/Δt² = γ² (u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)/h²
All u values at time levels ℓ and ℓ−1 are assumed known
⇒ Only one unknown term: u_i^{ℓ+1}
⇒ Can solve for u_i^{ℓ+1} explicitly:
u_i^{ℓ+1} = 2u_i^ℓ − u_i^{ℓ−1} + γ² (Δt²/h²)(u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)
This scheme is classified as an explicit finite difference method; no need to solve coupled systems of linear equations (⇒ easier programming!)
The computational procedure
(Figure: the space-time grid; the new value u_i^{ℓ+1} depends on u at time levels ℓ and ℓ−1.)
Can find u_i^{ℓ+1} for one i at a time if u at t_ℓ and t_{ℓ−1} is known
Need u_i^{−1} and u_i^0 for all i to start the algorithm
Initial conditions (1)
u_i^0 = I(x_i): evaluate directly
A bit more challenging:
∂u/∂t|_{t=0} = 0  ⇒  (u_i^1 − u_i^{−1})/(2Δt) = 0  ⇒  u_i^{−1} = u_i^1
but u_i^{−1} is outside the legal time grid...
Idea: eliminate u_i^{−1} by using the discrete PDE at t = 0, which gives a special formula for the first step:
u_i^1 = u_i^0 + γ² (Δt²/(2h²))(u_{i−1}^0 − 2u_i^0 + u_{i+1}^0)
Initial conditions (2)
It is awkward to have a special first step. Instead we introduce
u_i^{−1} = u_i^0 + γ² (Δt²/(2h²))(u_{i+1}^0 − 2u_i^0 + u_{i−1}^0)
and use the standard difference equation also at the first step
Algorithm
Define storage u_i^+, u_i, u_i^− for u_i^{ℓ+1}, u_i^ℓ, u_i^{ℓ−1} and set C = γΔt/h
Set initial conditions: u_i = I(x_i), i = 1, ..., n
Define the artificial quantity u_i^− (i = 2, ..., n−1):
u_i^− = u_i + ½C²(u_{i+1} − 2u_i + u_{i−1})
Set t = 0; while t < t_stop:
t = t + Δt
Update all inner points (i = 2, ..., n−1):
u_i^+ = 2u_i − u_i^− + C²(u_{i+1} − 2u_i + u_{i−1})
Set boundary conditions: u_1^+ = 0, u_n^+ = 0
Initialize for next step: u_i^− = u_i, u_i = u_i^+, all i
Diffpack code in F77/C style
We use functions in C++:
void timeLoop (ArrayGen(real)& up, ArrayGen(real)& u,
               ArrayGen(real)& um, real tstop, real C);
void setIC (real C, ArrayGen(real)& u0, ArrayGen(real)& um);

The main program:
ArrayGen(real) up (n);  // u at time level l+1
ArrayGen(real) u  (n);  // u at time level l
ArrayGen(real) um (n);  // u at time level l-1
// get n and Courant number C=dt/dx from the user
timeLoop (up, u, um, tstop, C);  // finite difference scheme
The timeLoop function
void timeLoop (ArrayGen(real)& up, ArrayGen(real)& u,
               ArrayGen(real)& um, real tstop, real C)
{
  int n = u.size();    // length of the vector u (no of grid points)
  real h = 1.0/(n-1);  // length of grid intervals
  real dt = C*h;       // time step, assumes unit wave velocity!!
  real t = 0;          // time

  setIC (C, u, um);    // set initial conditions

  int i;               // loop counter over grid points
  int step_no = 0;     // current step number
  while (t <= tstop) {
    t += dt; step_no++;  // increase time; count no. of steps
    // update inner points according to finite difference scheme:
    for (i = 2; i <= n-1; i++)
      up(i) = 2*u(i) - um(i) + sqr(C)*(u(i+1) - 2*u(i) + u(i-1));
    up(1) = 0; up(n) = 0;  // update boundary points
    um = u; u = up;        // update data struct. for next step
  }
}
The setIC function
    void setIC (real C, ArrayGen(real)& u0, ArrayGen(real)& um)
    {
      int n = u0.size();   // length of the vector u
      real x;              // coordinate of a grid point
      real h = 1.0/(n-1);  // length of grid intervals
      real umax = 0.05;    // max string displacement
      int i;               // loop counter over grid points
      for (i = 1; i <= n; i++) {  // set the initial displacement u(x,0)
        x = (i-1)*h;
        if (x < 0.7) u0(i) = (umax/0.7) * x;
        else         u0(i) = (umax/0.3) * (1 - x);
      }
      for (i = 2; i <= n-1; i++)  // set the help variable um:
        um(i) = u0(i) + 0.5*sqr(C) * (u0(i+1) - 2*u0(i) + u0(i-1));
      um(1) = 0; um(n) = 0;  // dummy values, not used in the scheme
    }
Simulation of waves – p. 90
Dumping solution to file
We dump the solution at each time point to file such that we can make a movie after the simulation is finished
Diffpack has tools for managing a large number of curves on files
CurvePlotFile: manager for a collection of curves
CurvePlot: a variable that holds a curve
Code example:

    int n = u.size();           // the number of unknowns
    real h = 1.0/(n-1);         // length of grid intervals
    CurvePlot plot (plotfile);  // a single plot
    plot.initPair ("displacement",      // plot title
                   oform("u(x,%g)",t),  // name of function
                   "x",                 // name of indep. var.
                   oform("C=%g, h=%g",C,h)); // comment
    for (int i = 1; i <= n; i++)  // add (x,y) data points
      plot.addPair (h*(i-1) /* x-value */, u(i) /* y-value */);
    plot.finish();
Simulation of waves – p. 91
Visualizing the results
The simulation produces a Diffpack case with name SIMULATION
Central files generated in the simulation:
SIMULATION.dp — logfile for the simulation, i.e., runtime
SIMULATION.map — overview of data files
SIMULATION.files — explanation of what the files are
.SIMULATION_1, .SIMULATION_2, ... — the (hidden) data files
Simulation of waves – p. 92
Animation
Make the animation using Diffpack features:

    curveplotmovie gnuplot SIMULATION.map -0.1 0.1
    (script)       (program) (name of map file) (ymin) (ymax)

Can replace gnuplot by matlab:

    curveplotmovie matlab SIMULATION.map -0.1 0.1
Simulation of waves – p. 93
Varying the Courant number C
$C = \gamma\Delta t/h$, $t = 0.5$, $h = 1/20$
[Plots of the numerical solution for (a) C = 1.0 and (b) C = 1.05]
Simulation of waves – p. 94
Varying the Courant number C
[Plots of the numerical solution for (c) C = 0.8 and (d) C = 0.3]
Simulation of waves – p. 95
Another numerical example
[Six plots: initial profiles at t = 0.0 with σ = 103 and σ = 1, and solutions at t = 15.0 for C = 1.0, C = 0.99, and C = 0.8]
Simulation of waves – p. 96
Numerical stability and accuracy
We have two parameters: ∆t and h
How do we choose ∆t and h?
Too large ∆t and h give
- too large numerical errors
- or, in the worst case, unstable solutions
Too small ∆t and h require too much computing power
Simplified problems can be analyzed theoretically, which yields a guide to choosing ∆t and h
Basic result for our wave equation: $\Delta t \le h/\gamma$ (derived later)
Peculiar case: the exact solution is obtained by $\Delta t = h/\gamma$, regardless of $h$ (!!!)
Simulation of waves – p. 97
2D wave equation
2D wave equation – p. 98
A more general wave equation
General form of a 1D/2D/3D wave equation for $u(\mathbf{x}, t)$:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[\lambda(\mathbf{x})\nabla u]$$
Wave travels with velocity $\gamma = \sqrt{\lambda}$
The operator $\nabla\cdot[\lambda(\mathbf{x})\nabla u]$ is frequently encountered in this course!
In 2D the operator is written out as
$$\frac{\partial}{\partial x}\left(\lambda(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda(x,y)\frac{\partial u}{\partial y}\right)$$
Goal for the next slides: learn how to discretize a 2D wave equation with variable coefficients
We shall do this by putting together elements we have learned so far
2D wave equation – p. 99
Applications of the wave equation
Vibrations of a string (1D)
Vibrations of a drum (2D)
Large destructive water waves (2D; water elevation)
Sound waves (1D: organ pipe, flute; 3D: room, space)
Light and radio waves (3D)
2D wave equation – p. 100
Example: earthquake-generated waves
[3D surface plot of the initial surface elevation over a segment of the ocean; color scale from −0.15 to 0.3]
Note: scales are distorted!
Earthquake close to seamount
Effect of earthquake: sudden elevation of the surface, modeled here as a prescribed initial surface at rest
Domain = segment of the ocean
2D wave equation – p. 101
Principles
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[\lambda(\mathbf{x})\nabla u]$$
$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(\lambda\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda\frac{\partial u}{\partial y}\right)$$
Time discretization as for 1D wave eq.
Space discretization: generalized from $(\lambda(x)u')'$
Boundary conditions: generalized from $u(0) = 0$ and $u'(1) = 1$
Initial conditions as for 1D wave eq.
Overall algorithm as for 1D wave eq.
2D wave equation – p. 102
Discretization
Seek an approximation $u_{i,j}^\ell$ on a rectangular grid to $u(x_i, y_j, t_\ell)$
$$x_i = (i-1)\Delta x, \quad y_j = (j-1)\Delta y, \quad t_\ell = \ell\Delta t$$
Approximate derivatives by central differences
$$\left[\frac{\partial^2 u}{\partial t^2}\right]_{i,j}^\ell \approx \frac{u_{i,j}^{\ell+1} - 2u_{i,j}^\ell + u_{i,j}^{\ell-1}}{\Delta t^2}$$
A spatial term like $\frac{\partial}{\partial y}\left(\lambda\frac{\partial u}{\partial y}\right)$ takes the form
$$\frac{1}{\Delta y}\left(\lambda_{i,j+\frac12}\left(\frac{u_{i,j+1}^\ell - u_{i,j}^\ell}{\Delta y}\right) - \lambda_{i,j-\frac12}\left(\frac{u_{i,j}^\ell - u_{i,j-1}^\ell}{\Delta y}\right)\right)$$
2D wave equation – p. 103
The scheme
The finite difference scheme takes the form
$$u_{i,j}^{\ell+1} = 2u_{i,j}^\ell - u_{i,j}^{\ell-1} + [\Delta u]_{i,j}^\ell$$
Exercise: derive the expression for $[\Delta u]_{i,j}^\ell$
2D wave equation – p. 104
Algorithm (BC: u = 0)
DEFINITIONS:
Storage $u_{i,j}^+$, $u_{i,j}$, and $u_{i,j}^-$ for $u_{i,j}^{\ell+1}$, $u_{i,j}^\ell$, and $u_{i,j}^{\ell-1}$
The whole grid: $i = 1,\dots,n_x$, $j = 1,\dots,n_y$
Inner points: $i = 2,\dots,n_x-1$, $j = 2,\dots,n_y-1$
INITIAL CONDITIONS:
$u_{i,j} = I(x_i, y_j)$ for all grid points $(i,j)$
SET ARTIFICIAL QUANTITY $u_{i,j}^-$:
$$u_{i,j}^- = u_{i,j} + \tfrac12[\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
Set $t = 0$
2D wave equation – p. 105
Algorithm (BC: u = 0)
While $t \le t_{\rm stop}$:
$t \leftarrow t + \Delta t$
UPDATE ALL INNER POINTS:
$$u_{i,j}^+ = 2u_{i,j} - u_{i,j}^- + [\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
INITIALIZE FOR NEXT STEP:
$u_{i,j}^- = u_{i,j}$, $u_{i,j} = u_{i,j}^+$ for all inner points $(i,j)$
2D wave equation – p. 106
A model for water waves
Physical assumption: long waves in shallow water
Corresponding mathematical model:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot\left[gH(\mathbf{x})\nabla u\right]$$
Physical quantities:
$u(x, y, t)$: water surface elevation
$g$: acceleration of gravity
$H(x, y)$: still-water depth
Boundary condition at coastline:
$$\frac{\partial u}{\partial n} \equiv \nabla u\cdot\mathbf{n} = 0$$
(full reflection of waves)
2D wave equation – p. 107
Scaling
Let Hc be a characteristic value of H(x, y)
We introduce new variables
$$\bar x = x/H_c, \quad \bar y = y/H_c, \quad \bar t = t/\sqrt{H_c/g}, \quad \lambda = H/H_c$$
$$\bar u = u/u_c \quad (u_c \text{ cancels and can be arbitrary})$$
Inserted in the equation:
$$\frac{\partial^2 \bar u}{\partial \bar t^2} = \nabla\cdot\left[\lambda\nabla\bar u\right]$$
⇒ $g$ (the driving force) is scaled away
2D wave equation – p. 108
Implementing boundary conditions (1)
There are two ways of handling $\partial u/\partial n = 0$ conditions:
"Ghost cells" at the boundary with explicit updating of fictitious values
Modify the stencil at the boundary
We choose the second option, as this allows direct output of $u_{i,j}$ to a visualization program, i.e., no need to remove ghost cells.
2D wave equation – p. 109
Implementing boundary conditions (2)
Consider the boundary $i = 1$ ($x$ = const)
Boundary condition:
$$\frac{\partial u}{\partial n} \equiv \frac{\partial u}{\partial x} = 0$$
Discrete version:
$$\frac{u_{2,j} - u_{0,j}}{2\Delta x} = 0 \quad\Leftrightarrow\quad u_{2,j} = u_{0,j}$$
$u_{0,j}$ is outside the legal mesh
Use the discrete PDE for $i = 1$ and eliminate $u_{0,j}$, that is, just replace $u_{0,j}$ by $u_{2,j}$
2D wave equation – p. 110
Modified difference operator
The boundary condition modifies the finite difference equations, and this can be viewed as modifying the operator $[\Delta u]_{i,j}$ (in this example at $i = 1$) according to
$$[\Delta u]_{1,j:\,i-1\to i+1} \equiv \left(\frac{\Delta t}{\Delta x}\right)^2\left(\lambda_{1+\frac12,j}(u_{2,j} - u_{1,j}) - \lambda_{1-\frac12,j}(u_{1,j} - u_{2,j})\right) + \left(\frac{\Delta t}{\Delta y}\right)^2\left(\lambda_{1,j+\frac12}(u_{1,j+1} - u_{1,j}) - \lambda_{1,j-\frac12}(u_{1,j} - u_{1,j-1})\right)$$
2D wave equation – p. 111
Efficiency issues
Two things should be considered:
Loops should be ordered such that u(i,j) is traversed in the order it is stored. In Diffpack, ArrayGen objects are stored columnwise. Therefore the loop should read:

    for (j = 1; j <= ny; j++)
      for (i = 1; i <= nx; i++)
        u(i,j) = ...

One should avoid if statements in loops if possible (they prevent many compiler optimization techniques); hence we will have separate loops over inner and boundary points.
Remark I: Debug code before optimizing it!!
Remark II: Focus on a readable and maintainable code before thinking of efficiency
2D wave equation – p. 112
Updating of internal points
We define a function for updating the solution:
WAVE($u^+$, $u$, $u^-$, $a$, $b$, $c$)
This function reads, at inner points,
$$u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
2D wave equation – p. 113
Updating of boundary points (1)
$i = 1$, $j = 2,\dots,n_y-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1}$
$i = n_x$, $j = 2,\dots,n_y-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1}$
$j = 1$, $i = 2,\dots,n_x-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,j-1\to j+1}$
$j = n_y$, $i = 2,\dots,n_x-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,j+1\to j-1}$
2D wave equation – p. 114
Updating of boundary points (2)
$i = 1$, $j = 1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1,\,j-1\to j+1}$
$i = n_x$, $j = 1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1,\,j-1\to j+1}$
$i = 1$, $j = n_y$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1,\,j+1\to j-1}$
$i = n_x$, $j = n_y$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1,\,j+1\to j-1}$
2D wave equation – p. 115
Modified algorithm (BC: ∂u/∂n = 0)
DEFINITIONS: as above
INITIAL CONDITIONS:
$u_{i,j} = I(x_i, y_j)$ for all grid points $(i,j)$
SET ARTIFICIAL QUANTITY $u_{i,j}^-$:
WAVE($u^-$, $u$, $u^-$, 0.5, 0, 0.5)
Set t = 0
2D wave equation – p. 116
Modified algorithm (BC: ∂u/∂n = 0)
While $t \le t_{\rm stop}$:
$t \leftarrow t + \Delta t$
UPDATE ALL POINTS:
WAVE($u^+$, $u$, $u^-$, 1, 1, 1)
INITIALIZE FOR NEXT STEP:
$u_{i,j}^- = u_{i,j}$, $u_{i,j} = u_{i,j}^+$ for all grid points $(i,j)$
2D wave equation – p. 117
Diffpack/C++ written in F77/C style
src/fdm/intro/Wave2D

    #include <Arrays_real.h>
    #include <FieldLattice.h>
    #include <SimRes2mtv.h>

    // We define a macro LaplaceU to save typing of long
    // finite difference formulas. For example,
    //
    //   #define mac(X) q0(i,j-X)
    //
    // defines a macro mac(X) and any text mac(i+2) will then be
    // transformed to q0(i,j-i+2) by the C/C++ preprocessor (cpp).

    #define LaplaceU(i,j,im1,ip1,jm1,jp1) \
      sqr(dt/dx)*\
      ( 0.5*(lambda(ip1,j )+lambda(i ,j ))*(u(ip1,j )-u(i ,j )) \
       -0.5*(lambda(i ,j )+lambda(im1,j ))*(u(i ,j )-u(im1,j )))\
      +sqr(dt/dy)*\
      ( 0.5*(lambda(i ,jp1)+lambda(i ,j ))*(u(i ,jp1)-u(i ,j )) \
       -0.5*(lambda(i ,j )+lambda(i ,jm1))*(u(i ,j )-u(i ,jm1)))
2D wave equation – p. 118
More code ...
    void WAVE (ArrayGenSel(real)& up, const ArrayGen(real)& u,
               const ArrayGenSel(real)& um, real a, real b, real c,
               const ArrayGenSel(real)& lambda, real dt, real dx, real dy)
    {
      int nx, ny; up.getDim (nx, ny);
      int i,j;
      // update inner points according to finite difference scheme:
      for (j = 2; j <= ny-1; j++)
        for (i = 2; i <= nx-1; i++)
          up(i,j) = a*2*u(i,j) - b*um(i,j)
                  + c*LaplaceU(i,j,i-1,i+1,j-1,j+1);

      // update boundary points (modified finite difference schemes):
      i=1;
      for (j = 2; j <= ny-1; j++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i+1,i+1,j-1,j+1);
      i=nx;
      for (j = 2; j <= ny-1; j++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i-1,i-1,j-1,j+1);
      j=1;
      for (i = 2; i <= nx-1; i++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i-1,i+1,j+1,j+1);
      ...
...
2D wave equation – p. 119
More code ...
    int main (int argc, const char* argv[])
    {
      initDiffpack (argc, argv);
      s_o << "Give number of intervals in x and y direction: ";
      int h; s_i >> h; int nx = h+1; s_i >> h; int ny = h+1;
      s_o << "Give width of domain in x direction: ";
      real wx; s_i >> wx;
      s_o << "Give width of domain in y direction: ";
      real wy; s_i >> wy;

      // ArrayGenSel is like ArrayGen, but has increased functionality
      // for finite difference methods
      // (we need it for the FieldLattice object for visualization)
      ArrayGenSel(real) up     (nx,ny); // u at time level l+1
      ArrayGenSel(real) u      (nx,ny); // u at time level l
      ArrayGenSel(real) um     (nx,ny); // u at time level l-1
      ArrayGenSel(real) lambda (nx,ny); // variable coefficient
      const real dx = wx/(nx-1); // length of grid intervals in x dir.
      const real dy = wy/(ny-1); // length of grid intervals in y dir.
      s_o << "Give time step length (0 gives dt=dx): ";
      real dt; s_i >> dt;
2D wave equation – p. 120
More code ...
      // fill lambda with values .... (see source file)

      // fill um with initial values:
      um.fill(0.0);
      // set the help variable um:
      WAVE (um, u, um, 0.5, 0, 0.5, lambda, dt, dx, dy);

      int step_no = 0; // current step number
      while (t <= tstop) {
        t += dt;       // increase time by the time step
        step_no++;     // increase step number by 1
        s_o << "t=" << t << "\n";
        WAVE (up, u, um, 1, 1, 1, lambda, dt, dx, dy);
        um = u; u = up;  // update data struct. for next step
        // dump solution to file ... (see source file)
      }
2D wave equation – p. 121
Visualizing the results
unix> plotmtv -colorps W0017.tmp.mtv
2D wave equation – p. 122
Nature of some PDEs
Nature of some PDEs – p. 123
Hyperbolic equations (1)
Linear PDEs can be divided into basic categories: elliptic, parabolic, and hyperbolic
A typical hyperbolic equation is the wave equation
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\frac{\partial^2 u}{\partial x^2}$$
which has the general solution
$$u(x,t) = f(x - \gamma t) + g(x + \gamma t)$$
i.e., two waves propagating to the left and to the right
The functions $f$ and $g$ are determined from $u(x, 0)$ and $\partial u(x, 0)/\partial t$
Disturbances travel with a finite wave speed
Reference: HPL appendix A.5
Nature of some PDEs – p. 124
Changing boundary conditions
Boundary conditions: u(0, t) = 0, u(1, t) = 0
[Plots of u(x, t) at t = 0.300 and t = 0.667]
Boundary conditions: u(0, t) = 0, ux(1, t) = 0
[Plots of u(x, t) at t = 0.300 and t = 0.667]
Nature of some PDEs – p. 125
Observations
Changing the BC influences the solution at only some (x, t) points (e.g., at the midpoint, after the initial disturbance has left, a change in the BC at x = 1 is not felt before the pulse has been reflected from the boundary)
An initial disturbance is transported without change of shape (essential for oral communication!)
Nature of some PDEs – p. 126
BC: let waves leave the domain
Boundary conditions: $u(0, t) = 0$ and $(u_t + \gamma u_x)|_{x=1} = 0$
[Plots of u(x, t) at t = 0.300 and t = 0.667]
The wave propagates out of the domain, exactly as we want for ocean waves! Unfortunately, the condition is hard to generalize successfully to physically relevant 2D cases
Nature of some PDEs – p. 127
Other hyperbolic equations (2)
Uni-directional “wave” or transport equation:
$$\frac{\partial u}{\partial t} + \gamma\frac{\partial u}{\partial x} = 0$$
where
$$u(x,t) = I(x - \gamma t)$$
$I(x)$ is the initial condition $u(x, 0)$
Systems of wave equations (here: long ocean waves)
$$\frac{\partial \eta}{\partial t} = -\frac{\partial}{\partial x}(uH) - \frac{\partial}{\partial y}(vH)\ \left(-\frac{\partial H}{\partial t}\right)$$
$$\frac{\partial u}{\partial t} = -\frac{\partial \eta}{\partial x}, \quad \mathbf{x}\in\Omega,\ t > 0$$
$$\frac{\partial v}{\partial t} = -\frac{\partial \eta}{\partial y}, \quad \mathbf{x}\in\Omega,\ t > 0$$
Nature of some PDEs – p. 128
Other hyperbolic equations (3)
Multi-dimensional standard wave equation:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(\lambda\nabla u)$$
Applications: radio waves, light, sound, membranes, ...
Nonlinear hyperbolic conservation law:
$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}f(u) = 0$$
or a system of such equations:
$$\frac{\partial \mathbf{u}}{\partial t} + \frac{\partial}{\partial x}\mathbf{f}(\mathbf{u}) = 0$$
Applications: gas dynamics, oil reservoir flow
Nature of some PDEs – p. 129
Elliptic equations
Elliptic equations are stationary (equilibrium or steady-state physicalconditions)
Example: $-u''(x) = 2$, $u(0) = 0$, $u'(1) = 0$
Let us look at the effect of changing u′(1) = 0 to u(1) = 0
Nature of some PDEs – p. 130
Changing boundary conditions
u′(1) = 0 versus u(1) = 0:
[Plot of the two solutions, labeled u(1)=0 and u'(1)=0]
Observe: all points in the interior are affected by the boundary condition! And the solution is smooth.
Nature of some PDEs – p. 131
Multi-dimensional elliptic equations
The Poisson equation is a typical multi-dimensional elliptic equation:
$$-\nabla^2 u(\mathbf{x}) = f(\mathbf{x})$$
or
$$-\nabla\cdot(\lambda(\mathbf{x})\nabla u(\mathbf{x})) = f(\mathbf{x})$$
Another elliptic equation:
$$-\nabla\cdot(\lambda(\mathbf{x})\nabla u(\mathbf{x})) + \alpha u = f(\mathbf{x})$$
Nature of some PDEs – p. 132
The Helmholtz equation
Sometimes one solves the wave equation
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\nabla^2 u$$
by assuming periodic waves in time:
$$u(x,y,z,t) = e^{-i\omega t}u(x,y,z), \quad i = \sqrt{-1}$$
This results in the famous Helmholtz equation
$$\nabla^2 u + k^2u = 0, \quad k = \omega/\gamma$$
The Helmholtz equation is not an elliptic equation (wrong sign!)
Nature of some PDEs – p. 133
Parabolic equations (1)
A 1D heat equation is a typical parabolic equation:
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + 2, \quad u(x,0) = 0,\ u(0,t) = 0$$
Typical for parabolic equations:
time derivative = elliptic counterpart
That is, as $t\to\infty$, the eq. above tends to the elliptic equation
$$-\frac{\partial^2 u}{\partial x^2} = 2$$
Nature of some PDEs – p. 134
Changing boundary conditions
Let us see the effect of two different conditions at x = 1: u(1, t) = 0versus ux(1, t) = 0
[Plot of u(x, t) at t = 0.300 for the two boundary conditions]
As for the elliptic counterpart, all points in the interior are affected by the boundary condition
Nature of some PDEs – p. 135
Parabolic equations (2)
Multi-dimensional parabolic (heat) equation:
$$\frac{\partial u}{\partial t} = \nabla\cdot(\lambda(\mathbf{x})\nabla u) + f$$
Recall: as $t\to\infty$, we normally have that $\partial u/\partial t\to 0$, and the parabolic equation approaches the elliptic counterpart
Nature of some PDEs – p. 136
Numerical methods
Hyperbolic equations: explicit schemes, must choose $\Delta t \sim h$
Elliptic equations: all grid-point values are coupled in a linear system, time-consuming to solve, requires sophisticated iterative methods
Parabolic equations: can use explicit schemes if $\Delta t \sim h^2$, but implicit schemes coupling all points (as in elliptic equations) are preferred
Nature of some PDEs – p. 137
Why solutions of elliptic eqs. are smooth
Solutions of elliptic equations are smooth!
$-u''(x) = f(x)$ with "noisy" $f(x)$:
$$u = -\int\left(\int f(x)\,dx\right)dx$$
Twice integration of noisy $f(x)$ ⇒ smooth $u(x)$
$(\lambda u')' = 0$ with noisy coefficient $\lambda(x)$:
$$u(x) = \text{const}\cdot\int_0^x \frac{d\tau}{\lambda(\tau)}$$
Rough $\lambda$ ⇒ smooth $u$
Nature of some PDEs – p. 138
More general result from Fourier series
$-u''(x) = f(x)$ solved by Fourier series (here: sine series):
$$u(x) = \sum_{i=1}^{\infty} u_i\sin i\pi x$$
Expanding $f(x)$ also in a Fourier series:
$$f(x) = \sum_{i=1}^{\infty} f_i\sin i\pi x$$
Inserting this in the equation gives
$$u_i \sim f_i/i^2$$
i.e., the $u$ series converges faster than the $f$ series, and from Fourier series theory this means that $u$ is smoother than $f$
Nature of some PDEs – p. 139
Smoothness and variational calculus
$\nabla^2 u = 0$ is equivalent to
$$\min_{v(\mathbf{x})} \int_\Omega \|\nabla v\|^2\,d\Omega$$
⇒ $u$ is "smoothest" among "all" $v$ over $\Omega$
Nature of some PDEs – p. 140
Analysis of difference schemes
Analysis of difference schemes – p. 141
Outline
Try to explain the observed numerical behavior (accuracy, stability) bymathematical means
Tool: exact solution of the discrete equations
Main focus on the wave and heat equations
Intro to classical topics like truncation error and von Neumann stability
Analysis of difference schemes – p. 142
Operator notation (1)
Finite difference schemes are often long and difficult to read compared to the underlying PDE
Operator notation gives condensed expressions much like the PDE
Define
$$[\delta_x u]_{i,j,k}^\ell \equiv \frac{u_{i+\frac12,j,k}^\ell - u_{i-\frac12,j,k}^\ell}{\Delta x}$$
with similar definitions of $\delta_y$, $\delta_z$, and $\delta_t$
Another difference:
$$[\delta_{2x} u]_{i,j,k}^\ell \equiv \frac{u_{i+1,j,k}^\ell - u_{i-1,j,k}^\ell}{2\Delta x}$$
Analysis of difference schemes – p. 143
Operator notation (2)
Compound difference (1D now, to save index writing):
$$[\delta_x\delta_x u]_i^\ell = \frac{1}{h^2}\left(u_{i-1}^\ell - 2u_i^\ell + u_{i+1}^\ell\right)$$
One-sided forward difference:
$$[\delta_x^+ u]_i^\ell \equiv \frac{u_{i+1}^\ell - u_i^\ell}{h}$$
and the backward difference:
$$[\delta_x^- u]_i^\ell \equiv \frac{u_i^\ell - u_{i-1}^\ell}{h}$$
Operator notation for arithmetic average:
$$[\overline{u}^{\,x}]_i^\ell \equiv \frac12\left(u_{i+\frac12}^\ell + u_{i-\frac12}^\ell\right)$$
Analysis of difference schemes – p. 144
Operator notation (3)
Put the whole equation inside brackets:
$$[\delta_x\delta_x u = -f]_i$$
This is a finite difference scheme for $u'' = -f$
Example: $(\lambda u')' = 0$ is discretized as
$$[\delta_x\overline{\lambda}^{\,x}\delta_x u = 0]_i$$
Another example, the heat equation:
$$\frac{\partial u}{\partial t} = \kappa\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$$
$$[\delta_t^+ u = \kappa\,(\delta_x\delta_x u + \delta_y\delta_y u)]_{i,j}^\ell$$
Analysis of difference schemes – p. 145
Typical solution of a wave equation
Wave equation:
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\frac{\partial^2 u}{\partial x^2}$$
Typical solution
$$u = Ae^{i(kx-\omega t)}$$
for arbitrary $k$
Only real or imaginary part has physical interpretation, e.g., the real part is $A\cos(kx - \omega t)$
Inserting the solution gives the dispersion relation $\omega = \omega(k)$:
$$\omega = \pm\gamma k \quad\Rightarrow\quad u = Ae^{ik(x\pm\gamma t)}$$
Analysis of difference schemes – p. 146
General solutions
Can build general solutions as Fourier series
$$u = \sum_k A_ke^{i(kx-\omega t)}$$
...or Fourier integrals
$$u = \int_{-\infty}^{\infty} A(k)e^{i(kx-\omega t)}\,dk$$
The basic component in these general solutions is
$$e^{i(kx-\omega t)}$$
which we will study in the following
Analysis of difference schemes – p. 147
Physical interpretation
What do $A$, $k$, and $\omega$ in
$$u = Ae^{i(kx-\omega t)}$$
really mean?
[Plot of a sine wave with the wave length λ, the amplitude A, and the wave velocity c indicated]
$\lambda = 2\pi/k$ is the wave length, $c = \omega/k = \gamma$ is the wave velocity
Analysis of difference schemes – p. 148
A heat equation
The heat (diffusion) equation:
$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}$$
Typical solution $u = A\exp(i(kx - \omega t))$ for arbitrary $k$
Inserting the solution gives $\omega = -i\kappa k^2$ (called the dispersion relation)
The form of the solution:
$$u = Ae^{-\kappa k^2t}e^{ikx}$$
Can build more complicated solutions through Fourier series or integrals
Analysis of difference schemes – p. 149
The damping in the heat equation
A general solution component,
$$u = Ae^{-\kappa k^2t}e^{ikx},$$
damps short waves (big $k$) significantly (through the factor $\exp(-\kappa k^2t)$)
Example: add two components, $k = \pi$ and $k = 100\pi$ (choose $\kappa\pi^2 = 1$), and consider the imaginary part:
$$u(x,t) = 1\cdot e^{-t}\sin\pi x + 0.6\cdot e^{-10000t}\sin 100\pi x$$
This is a sine with period 1 plus a 60 percent perturbation which oscillates 100 times faster
The damping factor of the perturbation is about $\exp(-10^4t)$ times that of the first term; after $t = 1/1000$ this damping is about $5\cdot10^{-5}$
Recall: no damping in the wave equation
Analysis of difference schemes – p. 150
Plot of the damping
[Plots of u(x, 0) and u(x, 0.001): the rapidly oscillating perturbation has vanished at t = 0.001]
Analysis of difference schemes – p. 151
Solution of discrete equations
Consider the explicit finite difference scheme for the wave equation, with $u_j^\ell$ as unknown
A typical solution reads
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} = Ae^{i(kx-\tilde\omega t)}$$
⇒ same structure as the solution of the PDE!
Inserting this form of $u_j^\ell$ in the scheme:
$$\tilde\omega = \tilde\omega(k, h, \Delta t)$$
(numerical dispersion relation)
The velocity $\tilde c = \tilde\omega/k$ should be $\gamma$ (constant), but will now depend on $k$, $h$, and $\Delta t$
Hopefully, $\tilde c\to\gamma$ as $h, \Delta t\to 0$
Analysis of difference schemes – p. 152
The wave equation scheme
Inserting $u_j^\ell = \exp(i(kjh - \tilde\omega\ell\Delta t))$ in the wave equation scheme gives
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
Can solve for $\tilde\omega$ and then we have an analytical solution of the discrete equations:
$$\tilde\omega = \pm\frac{2}{\Delta t}\arcsin\left(\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}\right)$$
Can assess the accuracy by plotting $\omega - \tilde\omega$
Better: plot $\tilde c = \tilde\omega/k$ as a function of $kh$ for different Courant ($\gamma\Delta t/h$) numbers
Can assess the accuracy by investigating
$$\omega - \tilde\omega = -\frac{1}{24}\gamma k^3(h^2 - \gamma^2\Delta t^2) + O(h^2\Delta t^2, h^4, \Delta t^4)$$
Analysis of difference schemes – p. 153
Stability (1)
We know that the exact solution of the wave equation PDE contains no damping or growth of a wave component, i.e.,
$\omega$ is real
A numerical wave component should exhibit the same qualitative behavior
...it can be slightly damped, but not amplified (otherwise, for sufficiently large $t$ the wave becomes arbitrarily large)
⇒ Should have real $\tilde\omega$ (or a small negative imaginary part only, i.e., slight damping)
The equation for $\tilde\omega$:
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
Analysis of difference schemes – p. 154
Stability (2)
The equation for $\tilde\omega$,
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
can also have complex solutions $\tilde\omega$
Complex $\tilde\omega$ will occur in conjugate pairs, i.e., one root has positive imaginary part, leading to wave growth, and cannot be allowed
⇒ Only real $\tilde\omega$ can be accepted
sin = const × sin ⇒ the const must be in $[-1, 1]$, here
$$\frac{\gamma\Delta t}{h} \le 1 \quad\text{or}\quad \Delta t \le \frac{h}{\gamma}$$
which is the stability criterion
Analysis of difference schemes – p. 155
The effect of round-off errors
Consider the initial condition u(x, 0) = A exp (ikx) for the wave eq.
General solution: $u(x,t) = Ae^{i(kx-\omega t)}$
Perturb the initial condition:
$$u(x,0) = Ae^{ikx} + ae^{i10kx}, \quad a \ll A$$
This always happens on a computer, because of round-off errors
If $\Delta t \le h/\gamma$, no numerical wave components are damped or amplified, and the solution reads
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} + ae^{i(10kjh-\tilde\omega\ell\Delta t)}$$
⇒ Initial perturbation (round-off errors) is not amplified
Analysis of difference schemes – p. 156
Consequences of stability
If ∆t > h/γ, at least one wave component starts to grow in time, i.e.,the initial perturbation is amplified
After some time, the solution is completely nonphysical
When a program gives nonsense solutions, recall that the reason can either be a bug or a too large ∆t!
Analysis of difference schemes – p. 157
Accuracy
The exact and numerical solutions,
$$e^{i(kx-\omega t)} \quad\text{and}\quad e^{i(kx-\tilde\omega t)},$$
have the same basic structure, only the $\omega$ values differ
Can assess the numerical accuracy as $E_\omega = \omega - \tilde\omega$
Can plot $E_\omega$ or make a Taylor series expansion in terms of the grid and physical parameters:
$$E_\omega = -\frac{1}{24}\gamma k^3(h^2 - \gamma^2\Delta t^2) + O(h^2\Delta t^2, h^4, \Delta t^4)$$
Errors go to zero as h2,∆t2
It turns out that Eω = 0 if ∆t = h/γ (!)
Analysis of difference schemes – p. 158
Summary of numerical properties
The wave equation:
Accuracy: $O(h^2, \Delta t^2)$, but exact curves can also be produced
Special result: $C \equiv \gamma\Delta t/h = 1$ implies that the numerical solution is exact
Stability: $\Delta t \le h/\gamma$
The method of analysis applies to linear, homogeneous, time-dependent equations with constant coefficients
Analysis of difference schemes – p. 159
The heat equation; stability
The heat equation
$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}$$
can be discretized by
$$[\delta_t^+ u = \kappa\delta_x\delta_x u]_j^\ell$$
(explicit forward scheme; $u_j^{\ell+1}$ is computed from old values)
Inserting the discrete wave component
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} = A\xi^\ell e^{ikjh}, \quad \xi = e^{-i\tilde\omega\Delta t}$$
in the numerical scheme results in
$$\xi = 1 - \kappa\frac{4\Delta t}{h^2}\sin^2\frac{kh}{2}$$
Analysis of difference schemes – p. 160
Stability criterion
Damping, i.e. no growth, implies |ξ| ≤ 1
With
$$\xi = 1 - \kappa\frac{4\Delta t}{h^2}\sin^2\frac{kh}{2}$$
this leads to
$$\Delta t \le \frac{h^2}{2\kappa}$$
as the stability criterion
Analysis of difference schemes – p. 161
Truncation error
PDE: L(u) = f
Numerical approximation: $u_\Delta$, $L_\Delta(u_\Delta) = f_\Delta$
The truncation error is defined as
$$\tau = L_\Delta(u) - f_\Delta$$
i.e. $\tau$ reflects the residual when the analytical solution is inserted in the numerical scheme
Computational technique: expand the analytical $u(x)$ at $x_{i\pm1}$ in Taylor series about $x_i$:
$$u(x_{i\pm1}) = u(x_i) \pm \left.\frac{\partial u}{\partial x}\right|_{x_i}h + \frac12\left.\frac{\partial^2 u}{\partial x^2}\right|_{x_i}h^2 \pm \frac16\left.\frac{\partial^3 u}{\partial x^3}\right|_{x_i}h^3 + \frac{1}{24}\left.\frac{\partial^4 u}{\partial x^4}\right|_{x_i}h^4 + \cdots$$
Analysis of difference schemes – p. 162
Truncation error; example (1)
Problem:
$$L(u) \equiv -u''(x), \qquad L_\Delta(u_\Delta) \equiv -\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}$$
Don't multiply by $h^2$! ($\tau \sim L_\Delta$)
Insert Taylor expansions:
$$\tau = u''(x_i) + f(x_i) + \frac{1}{12}u''''(x_i)h^2 + O(h^4)$$
PDE is fulfilled pointwise: $u''(x_i) + f(x_i) = 0$
Second order scheme because:
$$\tau = \frac{1}{12}u''''(x_i)h^2 + O(h^4), \qquad \tau \sim O(h^2)$$
Analysis of difference schemes – p. 163
What is the truncation error?
$\tau$ measures the error in the discrete equations when the exact solution of the continuous problem is inserted
That is, $\tau$ measures the error in an equation (this error is called a residual), not the error in the solution $u - u_\Delta$
Hopefully, τ reflects the true error u− u∆
Analysis of difference schemes – p. 164
Truncation error; example (2)
Model problem: the 1D wave equation
$$L(u) \equiv \frac{\partial^2 u}{\partial t^2} - \gamma^2\frac{\partial^2 u}{\partial x^2}$$
$$L_\Delta(u_\Delta) \equiv \frac{u_i^{\ell-1} - 2u_i^\ell + u_i^{\ell+1}}{\Delta t^2} - \gamma^2\frac{u_{i-1}^\ell - 2u_i^\ell + u_{i+1}^\ell}{h^2}$$
$$\tau = \frac{\partial^2 u}{\partial t^2} + \frac{1}{12}\frac{\partial^4 u}{\partial t^4}\Delta t^2 + O(\Delta t^4) - \gamma^2\frac{\partial^2 u}{\partial x^2} - \gamma^2\frac{1}{12}\frac{\partial^4 u}{\partial x^4}h^2 + O(h^4)$$
$$\tau = O(\Delta t^2, h^2)$$
Analysis of difference schemes – p. 165
von Neumann stability analysis (1)
Make an equation for the numerical error (see later)
Seek discrete solution for the error:
$$e_j^\ell = \sum_k \xi^\ell e^{ikjh}$$
Insert $e_j^\ell$ in the error equation, compute for a single component $k$:
$$e_j^\ell = \xi^\ell e^{ikjh}$$
Common stability requirement:
$$|e_j^\ell| < \infty \quad\Rightarrow\quad |\xi| \le 1$$
Gives a condition on ∆t
Very similar to discussing numerical dispersion relations
Analysis of difference schemes – p. 166
von Neumann stability analysis (2)
1D heat equation: $u_t = \kappa u_{xx}$
Perturbed solution $v$: $v_t = \kappa v_{xx}$
Equation for error $e = u - v$: $e_t = \kappa e_{xx}$
Inserting $e_j^\ell = \xi^\ell e^{ikjh}$:
$$\frac{\xi - 1}{\Delta t} = -\frac{4\kappa}{h^2}\sin^2\frac{kh}{2}$$
$|\xi| \le 1 \Rightarrow -1 \le \xi \le 1$:
$$-1 \le \xi = 1 - \frac{4\kappa\Delta t}{h^2}\sin^2\frac{kh}{2} \quad\Rightarrow\quad \Delta t \le \frac{h^2}{2\kappa}$$
1D wave equation: $u_{tt} = c^2u_{xx}$,
$$\Delta t \le \frac{h}{c}$$
Analysis of difference schemes – p. 167
Consistency and convergence
u: solution of continuous problem
u∆: solution of discrete problem
∆: mesh parameter (h,∆t etc)
Definition of consistency: τ → 0 as ∆→ 0
Example:
$$\tau = \frac{1}{12}u''''(x_i)h^2 + O(h^4) \to 0 \quad\text{as } h\to 0$$
Interpretation: the analytical solution fulfills the discrete equations as $\Delta\to 0$
Convergence: u∆ → u as ∆→ 0
τ → 0 does not imply u∆ → u
Analysis of difference schemes – p. 168
Lax’ theorem
Convergence⇔ consistency and stability
⇒ easy tool for proving convergence!
Analysis of difference schemes – p. 169
Intro to finite elements
Intro to finite elements – p. 170
Features of the method
Flexibility
Straightforward handling of complicated geometries
Easy to construct higher-order approximations
Broad spectrum of applications
Popular method for demanding engineering applications
Strong mathematical foundation
Intro to finite elements – p. 171
Basic principles
Finite difference method:
$$\frac{\partial u}{\partial x} \to \frac{u_{i+1} - u_{i-1}}{2h}$$
Finite element method:
$$u(x) \to \hat u(x) = \sum_{j=1}^M u_jN_j(x)$$
$N_j(x)$: prescribed functions
$u_j$: unknown parameters
Need $M$ equations for determining $u_j$
Optimal goal: find $u_j$ such that $\|u - \hat u\|$ is minimized
Realistic goal: find uj such that the residual is small
Intro to finite elements – p. 172
A least-squares method
We look at a general PDE
L(u) = 0, x ∈ Ω
Insert $\hat u = \sum_j u_jN_j$ for $u$, but $L(\hat u)$ is then not zero:
$$L(\hat u) = R \ne 0, \quad R = R(u_1,\dots,u_M;\,\mathbf{x})$$
Idea: adjust $u_1,\dots,u_M$ such that
$$\int_\Omega R^2(u_1,\dots,u_M;\,\mathbf{x})\,d\Omega$$
is minimized with respect to $u_1,\dots,u_M$
Result:
$$\int_\Omega 2R\frac{\partial R}{\partial u_i}\,d\Omega = 0, \quad i = 1,\dots,M$$
⇒ algebraic system of equations for $u_1,\dots,u_M$
Intro to finite elements – p. 173
Least squares: example
Boundary value problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
$N_j(x) = \sin j\pi x$, $\hat u = \sum_j u_j\sin j\pi x$
Boundary conditions are satisfied since each term vanishes at both ends ($\sin(j\pi\cdot 0) = 0$, $\sin(j\pi\cdot 1) = 0$)
Residual:
$$R = \sum_{j=1}^M u_jN_j''(x) + f(x), \qquad \frac{\partial R}{\partial u_i} = N_i''(x)$$
Least-squares equations:
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i''(x)\,dx = 0$$
Intro to finite elements – p. 174
Writing up a system of linear equations
The least-squares equations
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i''(x)\,dx = 0$$
form a system of linear equations
To see this, we write the system in standard form
$$A\mathbf{u} = \mathbf{b} \quad\text{or}\quad \sum_{j=1}^M A_{i,j}u_j = b_i, \quad i = 1,\dots,M$$
Interchange integration and summation, factor $u_j$ out:
$$\sum_{j=1}^M \underbrace{-\left(\int_0^1 N_i''N_j''\,dx\right)}_{A_{i,j}}u_j = \underbrace{\int_0^1 fN_i''\,dx}_{b_i}$$
Intro to finite elements – p. 175
Weighted residual method (WRM)
We look at a general PDE again
L(u) = 0, x ∈ Ω
Insert $\hat u = \sum_j u_jN_j$ for $u$ and obtain a residual $R$:
$$R \equiv L(\hat u), \quad R = R(u_1,\dots,u_M;\,\mathbf{x})$$
Require $R$ to be zero in a weighted mean:
$$\int_\Omega RW_i(\mathbf{x})\,d\Omega = 0, \quad i = 1,\dots,M$$
$W_i(\mathbf{x})$: prescribed weighting functions
⇒ Linear system for $u_1,\dots,u_M$
Galerkin's method: $W_i = N_i$ (common choice – often optimal)
Observation: least squares = WRM with $W_i = \partial R/\partial u_i$
Intro to finite elements – p. 176
Galerkin: example
Boundary value problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
$N_j(x) = \sin j\pi x$
Boundary conditions are satisfied
Residual:
$$R = \sum_{j=1}^M u_jN_j''(x) + f(x)$$
Galerkin equations:
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i(x)\,dx = 0$$
Intro to finite elements – p. 177
Writing up the linear system
We interchange summation and integration to write the equations in standard form $\sum_j A_{i,j}u_j = b_i$, such that we can identify the coefficient matrix and the right-hand side (these must be known before we can call up software to solve the linear system)
$$A_{i,j} = -\int_0^1 N_i(x)N_j''(x)\,dx, \qquad b_i = \int_0^1 f(x)N_i(x)\,dx$$
Intro to finite elements – p. 178
Linear algebra interpretation (1)
Recall from linear algebra:
$$(u, v) = 0\ \ \forall v\in V \quad\Leftrightarrow\quad u = 0 \text{ or } u\perp V\ (u\notin V)$$
Define function space $V$ spanned by
$$B = \{N_1, N_2, \dots, N_M\}$$
Define inner product
$$(u, v) = \int_\Omega uv\,d\Omega$$
With these definitions we can redefine/interpret the Galerkin methodas a “geometric” or linear algebra approach
Intro to finite elements – p. 179
Linear algebra interpretation (2)
Galerkin's method: find
$$\hat u = \sum_{j=1}^M u_jN_j \in V$$
such that the resulting residual is orthogonal to $V$:
$$(R, v) = 0\ \ \forall v\in V \quad(\Rightarrow R\notin V \text{ or } R\equiv 0)$$
We hope that $V$ contains the "important" functions in the problem ⇒ $R$ is "small"
Hope for convergence: $\hat u\to u$ as $M\to\infty$, when $R\to 0$
Intro to finite elements – p. 180
Collocation methods
Force the residual to vanish at M distinct points
Collocation:
R(u1, . . . , uM ; x[i]) = 0, i = 1, . . . ,M
Cf. the finite difference method: the PDE is fulfilled at M points
Note: collocation also arises from WRM with

W_i(x) = \delta(x - x^{[i]})

(Recall: \int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0))
Subdomain collocation:

\int_{\Omega_i} R \, d\Omega = 0, \quad i = 1, \ldots, M, \quad \Omega = \cup_{i=1}^M \Omega_i
Intro to finite elements – p. 181
A worked example (1)
Problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
Approximation:
u(x) \approx \hat u(x) = \sum_{j=1}^M u_j N_j(x)

Force boundary conditions:

N_j(0) = N_j(1) = 0, \quad j = 1, \ldots, M
Intro to finite elements – p. 182
A worked example (2)
Choices of Nj(x):
Nj(x) = sin jπx
Nj(x) = xj(1− x)
Least squares:
-\sum_{j=1}^M \Big( \int_0^1 N_i''(x) N_j''(x)\,dx \Big) u_j = \int_0^1 f(x) N_i''(x)\,dx
Intro to finite elements – p. 183
A worked example (3)
Galerkin’s method:
-\sum_{j=1}^M \Big( \int_0^1 N_i(x) N_j''(x)\,dx \Big) u_j = \int_0^1 f(x) N_i(x)\,dx

Observation: N_i = \sin i\pi x \ \Leftrightarrow\ A_{i,j} = 0, \ i \neq j
Therefore (Galerkin and least squares):

u_j = \frac{2}{\pi^2 j^2} \int_0^1 f(x) \sin j\pi x \, dx
No need for solving linear systems!
Intro to finite elements – p. 184
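The formula above is easy to check on a computer. A minimal NumPy sketch (the choice f = 1, with exact solution u = x(1-x)/2, is an assumption made for the test; the trapezoidal rule is an implementation choice):

```python
import numpy as np

def sine_galerkin(f, M, npts=4001):
    """Coefficients u_j = 2/(pi^2 j^2) int_0^1 f(x) sin(j pi x) dx (trapezoidal rule)."""
    x = np.linspace(0.0, 1.0, npts)
    w = np.full(npts, x[1] - x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights
    return np.array([2.0/(np.pi*j)**2 * np.sum(w*f(x)*np.sin(j*np.pi*x))
                     for j in range(1, M + 1)])

def evaluate(xp, coeffs):
    return sum(c*np.sin(j*np.pi*xp) for j, c in enumerate(coeffs, start=1))

# -u'' = 1, u(0) = u(1) = 0 has exact solution u = x(1-x)/2
coeffs = sine_galerkin(lambda x: np.ones_like(x), M=100)
uh05 = evaluate(0.5, coeffs)
print(uh05)   # the exact value at x = 0.5 is 0.125
```

Note that no linear system is solved: the sine basis diagonalizes the problem, so each u_j comes from a single integral.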
A worked example (4)
Collocation:
The residual is forced to vanish at M points

x^{[1]}, \ldots, x^{[i]}, \ldots, x^{[M]}

called the collocation points
Equivalent view: use \hat u in the PDE and require the PDE to be fulfilled at the collocation points:

-\sum_{j=1}^M N_j''(x^{[i]}) u_j = f(x^{[i]}), \quad i = 1, \ldots, M

which is a linear system with A_{i,j} = -N_j''(x^{[i]}) as coefficient matrix and b_i = f(x^{[i]}) as right-hand side
Intro to finite elements – p. 185
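The collocation system above is quickly set up numerically. A sketch (f = 1 and equally spaced interior collocation points are assumptions for illustration; the exact solution is u = x(1-x)/2):

```python
import numpy as np

M = 15
xc = np.arange(1, M + 1)/(M + 1.0)      # interior collocation points x^[i]
j = np.arange(1, M + 1)
# A[i,j] = -N_j''(x^[i]) = (j pi)^2 sin(j pi x^[i]),  b_i = f(x^[i]) = 1
A = (j*np.pi)**2 * np.sin(np.pi*np.outer(xc, j))
b = np.ones(M)
u = np.linalg.solve(A, b)
u05 = float(np.sin(0.5*np.pi*j) @ u)    # evaluate u-hat(0.5); exact u(0.5) = 0.125
print(u05)
```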
A worked example (5)
With N_j = \sin j\pi x the coefficient matrix A_{i,j} = -N_j''(x^{[i]}) is in general full (dense), e.g., with x^{[i]} = (i-1)h:

A_{i,j} = (j\pi)^2 \sin[j\pi(i-1)h], \quad i = 1, \ldots, M, \quad h = 1/(M-1)
Intro to finite elements – p. 186
Ill-conditioning
u'' = 1, \quad u(0) = u(1) = 0 \quad\Rightarrow\quad u = -x(1-x)/2

\hat u = \sum_{j=1}^M u_j x^j(1-x) contains the exact solution
Galerkin gives u_1 = -1/2, u_j = 0, j > 1
On the computer (with 6 digits):

M = 2: (-.50000, .00001)
M = 4: (-.50001, .00039, -.00079, .00048)
M = 6: (-.50072, .01296, -.07323, .17561, -.18727, .073122)
M = 8: (-.49977, -.00510, .01485, .11669, -.73884, 1.56298, ...)

⇒ The method does not converge!
N_j = x^j(1-x) are almost linearly dependent for j > 5
⇒ Ill-conditioned coefficient matrix; round-off errors accumulate
Intro to finite elements – p. 187
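The breakdown can be reproduced in a few lines. A sketch, assuming the Galerkin formulation with integration by parts, \sum_j (\int_0^1 N_i' N_j'\,dx) u_j = -\int_0^1 N_i\,dx for u'' = 1 (the trapezoidal quadrature is an implementation choice):

```python
import numpy as np

def stiffness(M, npts=20001):
    """A_ij = int_0^1 N_i' N_j' dx for the basis N_j = x^j (1 - x)."""
    x = np.linspace(0.0, 1.0, npts)
    # N_j'(x) = j x^(j-1) - (j+1) x^j
    dN = np.array([j*x**(j-1) - (j+1)*x**j for j in range(1, M + 1)])
    w = np.full(npts, x[1]-x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights
    return (dN*w) @ dN.T

def solve(M):
    i = np.arange(1, M + 1)
    b = -(1.0/(i+1) - 1.0/(i+2))     # b_i = -int_0^1 N_i dx, from u'' = 1
    A = stiffness(M)
    return A, np.linalg.solve(A, b)

A2, u2 = solve(2)
A8, u8 = solve(8)
print(u2)                             # close to (-1/2, 0): exact solution is -N_1/2
print(np.linalg.cond(A2), np.linalg.cond(A8))   # condition number explodes with M
```

The condition number grows rapidly with M, which is exactly why the computed coefficients for M = 6, 8 are garbage even though the exact answer lies in the trial space.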
Cure for ill-conditioning
Choose orthogonal or nearly orthogonal Ni
Fourier series provide orthogonal Ni
Generalized Fourier series, using e.g. Legendre polynomials, Bessel functions, Laguerre polynomials, etc., also give orthogonal N_i
Finite elements provide nearly orthogonal Ni
Intro to finite elements – p. 188
A new view on Fourier series (1)
We can interpret Fourier series, e.g.,
u \approx \hat u = \sum_{j=1}^{\infty} u_j \sin j\pi x \ \text{on}\ (0,1), \quad u_j = 2\int_0^1 u(x) \sin j\pi x \, dx

as a special case of a Galerkin or least-squares method
Solve u = f by an approximate method

u \approx \hat u = \sum_j u_j N_j

Residual: R = \hat u - f = \sum_j u_j N_j - f
Least squares and Galerkin:

\sum_{j=1}^M \Big( \int_\Omega N_i N_j \, d\Omega \Big) u_j = \int_\Omega f N_i \, d\Omega, \quad i = 1, \ldots, M
Intro to finite elements – p. 189
A new view on Fourier series (2)
Let Ω = (0, 1), choose Ni = sin iπx
Since these N_i are orthogonal, \int_0^1 N_i N_j \, dx = \frac{1}{2}\delta_{ij}
Hence, the matrix becomes diagonal, and we can solve the system by hand:

u_j = 2\int_0^1 f(x) \sin j\pi x \, dx

These are the well-known coefficients in the Fourier sine series of f(x)
⇒ Fourier series can be viewed as a least-squares or Galerkin method
Intro to finite elements – p. 190
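The diagonal Galerkin system can be checked numerically. A sketch, with f(x) = x as an assumed test function (its sine coefficients 2(-1)^{j+1}/(j\pi) are classical):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
w = np.full(x.size, x[1]-x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal rule
f = x                                                        # approximate f(x) = x

j = np.arange(1, 11)
# Diagonal Galerkin system: (1/2) u_j = int_0^1 f sin(j pi x) dx
uj = 2.0*np.array([np.sum(w*f*np.sin(jj*np.pi*x)) for jj in j])
exact = 2.0*(-1.0)**(j + 1)/(j*np.pi)    # known sine coefficients of f(x) = x
err = np.max(np.abs(uj - exact))
print(err)
```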
Treatment of boundary conditions (1)
u = 0 on \partial\Omega \ \Rightarrow\ N_i = 0 on \partial\Omega
u = \psi(x) on \partial\Omega:

\hat u = \psi(x) + \sum_{j=1}^M u_j N_j(x), \quad N_j = 0 \ \text{on}\ \partial\Omega

where \psi equals the prescribed boundary values on \partial\Omega
Example: u(0) = U_L, u(1) = U_R; choose \psi = x U_R + (1-x) U_L
Note: ψ is not uniquely determined
Intro to finite elements – p. 191
Treatment of boundary conditions (2)
What about u(0) = UL and u′(1) = β?
u(x) \approx \hat u(x) = U_L + \sum_{j=1}^M u_j N_j(x), \quad N_j(0) = 0

Integration by parts:

-\int_0^1 \hat u'' N_i \, dx = \int_0^1 f N_i \, dx \quad \text{(Galerkin)}

\int_0^1 \hat u' N_i' \, dx - [\hat u' N_i]_0^1 = \int_0^1 f N_i \, dx

\int_0^1 \hat u' N_i' \, dx = \int_0^1 f N_i \, dx + \beta N_i(1)

Recall that N_i(0) = 0 due to u(0) = U_L. Must have N_i(1) \neq 0.
Remark: could add an equation/constraint for u(0) = U_L, e.g.,

\sum_{j=1}^M u_j N_j(0) = U_L

and get rid of \psi and the requirement N_j(0) = 0; this is what we will do later (!)
Intro to finite elements – p. 192
Advantages of integration by parts
A way to incorporate derivative BCs
The physical derivative BCs naturally arise from the integration by parts (because the BC is closely related to the PDE, e.g., -(\lambda u')' = f has the derivative condition -\lambda u' = \cdots)
Symmetric matrix:

\int N_i' N_j' \, dx
Lower continuity requirements on Ni
Intro to finite elements – p. 193
Essential and natural boundary conditions
u = U_L must be enforced in \hat u \ \Rightarrow\ essential boundary conditions
u' = \beta appears naturally in the formulas \ \Rightarrow\ natural boundary conditions
Observation: forgetting the boundary term implies u' = 0
Intro to finite elements – p. 194
Multidimensional problems
Boundary-value problem
-\nabla\cdot[k(x)\nabla u(x)] = f(x), \quad x \in \Omega,

-k(x)\frac{\partial u}{\partial n} = g(x), \quad x \in \partial\Omega_N,

u(x) = \psi(x), \quad x \in \partial\Omega_E.

Expansion:

u(x) \approx \hat u(x) = \psi(x) + \sum_{j=1}^M u_j N_j(x)

with N_j = 0 on \partial\Omega_E
Intro to finite elements – p. 195
Multidimensional integration by parts
Integration by parts lemma:
-\int_\Omega \nabla\cdot[k\nabla u]\, W_i \, d\Omega = \int_\Omega k\nabla u \cdot \nabla W_i \, d\Omega - \int_{\partial\Omega} W_i\, k \frac{\partial u}{\partial n} \, d\Gamma

Galerkin's method:

\sum_{j=1}^M \Big( \int_\Omega k(x)\nabla N_i \cdot \nabla N_j \, d\Omega \Big) u_j = \int_\Omega f(x) N_i \, d\Omega - \int_{\partial\Omega_N} g(x) N_i \, d\Gamma - \int_\Omega k\nabla\psi \cdot \nabla N_i \, d\Omega

Common physical flux conditions appear as natural boundary conditions
Intro to finite elements – p. 196
2D example
Ω = (0, 1)× (0, 1)
Expansion:
\hat u = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} u_{i,j} N_{i,j}, \quad N_{i,j} = \sin i\pi x_1 \sin j\pi x_2

Weighting functions (Galerkin): N_{k,\ell}
Linear system:

\sum_i \sum_j A_{i,j,k,\ell}\, u_{i,j} = b_{k,\ell}

For implementation: convert the double indices to a single index,

N_{(j-1)n+i}(x_1, x_2) = \sin i\pi x_1 \sin j\pi x_2, \quad i, j = 1, \ldots, n, \quad M = n^2
Intro to finite elements – p. 197
Time-dependent problems (1)
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(c^2\nabla u), \quad x \in \Omega,\ t > 0

u(x, 0) = f(x), \quad x \in \Omega

\frac{\partial}{\partial t} u(x, 0) = 0, \quad x \in \Omega

\frac{\partial u}{\partial n} = 0, \quad x \in \partial\Omega

Finite differences in time:

\frac{\partial^2}{\partial t^2} u(x, t_\ell) = \frac{u^{\ell-1} - 2u^\ell + u^{\ell+1}}{\Delta t^2} + O(\Delta t^2)

Spatial problem at each time level:

u^{\ell+1}(x) = 2u^\ell(x) - u^{\ell-1}(x) + (c\Delta t)^2 \nabla^2 u^\ell(x)
Intro to finite elements – p. 198
Time-dependent problems (2)
Initial condition: u^0 = f(x)

\frac{\partial u}{\partial t} = 0 \quad\rightarrow\quad \frac{u^1 - u^{-1}}{2\Delta t} = 0

(can develop a special formula for u^{-1} such that the main scheme can be used for \ell = 0, 1, 2, \ldots, cf. the method for the 1D wave equation)
The spatial variation of u, u^\ell(x), is expanded in the standard way:

u^\ell(x) \approx \hat u^\ell = \sum_{j=1}^M u_j^\ell N_j(x), \quad \ell = -1, 0, 1, 2, \ldots
Intro to finite elements – p. 199
Summary of the time-discrete equations
u^0 = f(x), \quad x \in \Omega

u^{-1} = u^0 + \frac{1}{2} c^2 \Delta t^2 \nabla^2 u^0, \quad x \in \Omega

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2 \Delta t^2 \nabla^2 u^\ell, \quad x \in \Omega, \quad \ell = 0, 1, \ldots

\frac{\partial u^\ell}{\partial n} = 0, \quad x \in \partial\Omega, \quad \ell = 0, 1, \ldots
Intro to finite elements – p. 200
Weighted residual methods
All the equations are of the form "u = g", where g is known and u^\ell \approx \hat u^\ell = \sum_j u_j^\ell N_j
Could, in principle, start with an analytical f(x) and analytically derive u^1, u^2 and so on, but these expressions become complicated
Working with \hat u^\ell = \sum_j u_j^\ell N_j instead allows easy update of u_1^\ell, \ldots, u_M^\ell
Galerkin method for u = g: insert \hat u^\ell, multiply by N_i and integrate:

\int_\Omega N_i \sum_j u_j^\ell N_j \, d\Omega = \int_\Omega N_i\, g \, d\Omega

giving a linear system (!):

\sum_{j=1}^M \Big( \int_\Omega N_i N_j \, d\Omega \Big) u_j^\ell = \int_\Omega g N_i \, d\Omega, \quad i = 1, \ldots, M
Intro to finite elements – p. 201
Treatment of the right-hand sides
When g contains \nabla^2 u^\ell we integrate by parts on the right-hand side
Example:

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell

\int_\Omega u^{\ell+1} N_i \, d\Omega = \int_\Omega (2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell) N_i \, d\Omega

Integration of \int N_i \nabla^2 u \, d\Omega by parts:

\int_\Omega u^{\ell+1} N_i \, d\Omega = \int_\Omega (2u^\ell N_i - u^{\ell-1} N_i - c^2\Delta t^2 \nabla N_i \cdot \nabla u^\ell) \, d\Omega

(the surface integral vanishes since \partial u^\ell/\partial n = 0)
Expand u to a sum and identify the coefficient matrix and the right-hand side
Intro to finite elements – p. 202
Spatial problems
\sum_{j=1}^M M_{i,j} u_j^0 = \int_\Omega f(x) N_i \, d\Omega,

\sum_{j=1}^M M_{i,j} u_j^{-1} = \int_\Omega \Big[ \hat u^0 N_i - \frac{1}{2}(c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^0 \Big] d\Omega + \frac{1}{2}\Delta t^2 \int_{\partial\Omega} c^2 \frac{\partial f}{\partial n} N_i \, d\Gamma,

\sum_{j=1}^M M_{i,j} u_j^{\ell+1} = \int_\Omega \Big[ \big(2\hat u^\ell(x) - \hat u^{\ell-1}(x)\big) N_i - (c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^\ell(x) \Big] d\Omega,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega

is the mass matrix
Intro to finite elements – p. 203
Matrix notation
Can introduce matrix-vector notation; M = \{M_{i,j}\}:

M u^0 = f

M u^{-1} = M u^0 - \frac{1}{2} K u^0 + f_n

M u^{\ell+1} = 2M u^\ell - M u^{\ell-1} - K u^\ell
Intro to finite elements – p. 204
Finite elements
Ni: piecewise polynomials
Example: piecewise linear N_i gives a piecewise linear \hat u = \sum_j u_j N_j

(figure: a piecewise linear approximation \hat u(x))

Define elements \Omega_e and nodes x^{[i]}
Definition of N_i:
1. polynomial over each element
2. N_i(x^{[j]}) = \delta_{ij}; 1 if i = j and 0 if i \neq j
This is the type of Ni we use in the finite element method
Intro to finite elements – p. 205
A nice feature of finite element Ni
Property 2,
N_i(x^{[j]}) = \delta_{ij}; \quad \text{1 if } i = j \text{ and 0 if } i \neq j

implies that u_j is the value of \hat u at node j
Proof:

\hat u(x^{[i]}) = \sum_j u_j N_j(x^{[i]}) = u_i

This interpretation of u_i is very convenient for practical work, implementation, and comparison with finite difference methods
Intro to finite elements – p. 206
Piecewise linear Ni
(figure: piecewise linear basis functions over elements with nodes 1-7 along the x axis)
Each element has 2 nodes
Intro to finite elements – p. 207
Quadratic basis functions
Each element has three nodes
Intro to finite elements – p. 208
Essential boundary conditions
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
With u_i = u(x^{[i]}) we can construct \psi in a general way:

\psi(x) = u_L N_1(x) + u_R N_n(x)

\hat u(x) = \psi(x) + \sum_{j=2}^{n-1} u_j N_j(x)

In general (2D/3D): B = boundary nodes with essential conditions, I = internal nodes

\hat u = \sum_{j\in B} u_j N_j + \sum_{j\in I} u_j N_j

Only u_j, j \in I, enter the linear system as unknowns
Intro to finite elements – p. 209
A worked example (1)
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
Galerkin’s method:
\sum_{j=1}^n A_{i,j} u_j = b_i, \quad i = 1, \ldots, n

A_{i,j} = \int_0^1 N_i'(x) N_j'(x) \, dx, \quad b_i = \int_0^1 f(x) N_i(x) \, dx

Observation: N_i(x) and N_i'(x) vanish over large parts of the domain ("nearly" orthogonal functions)
A_{i,j} \neq 0 only for j = i-1, i, i+1
Intro to finite elements – p. 210
A worked example (2)
Computations:
A_{i,i-1} = \int_0^1 N_{i-1}' N_i' \, dx = -\frac{1}{h}, \quad A_{i,i} = \int_0^1 N_i' N_i' \, dx = \frac{2}{h}

A_{i,i+1} = \int_0^1 N_i' N_{i+1}' \, dx = -\frac{1}{h}, \quad A_{1,1} = A_{n,n} = \frac{1}{h}

A_{1,2} = A_{n,n-1} = -\frac{1}{h}, \quad b_i = \int_0^1 f(x) N_i(x) \, dx
Intro to finite elements – p. 211
A worked example (3)
Numerical integration, trapezoidal rule:
\int_0^1 f(x) N_i(x) \, dx \approx \frac{1}{2} f(x^{[1]}) N_i(x^{[1]}) h + \sum_{j=2}^{n-1} f(x^{[j]}) N_i(x^{[j]}) h + \frac{1}{2} f(x^{[n]}) N_i(x^{[n]}) h

= \frac{1}{2} f_1 \delta_{1i} h + \sum_{j=2}^{n-1} f_j \delta_{ij} h + \frac{1}{2} f_n \delta_{in} h

where f_i \equiv f(x^{[i]}) (FDM-inspired notation)
For 2 \leq i \leq n-1 the integral becomes f_i h (as with finite differences!!)
For i = 1, n we get \frac{1}{2} f_1 h and \frac{1}{2} f_n h
Intro to finite elements – p. 212
Piecewise constant N ′i(x)
(figure: the piecewise constant derivatives N_i'(x) over elements with nodes 1-7 along the x axis)
Intro to finite elements – p. 213
The resulting equations
Replace eq. no. 1 and n by boundary conditions
u1 = uL, un = uR
The linear system:
u_1 = u_L,

-\frac{1}{h} u_{i-1} + \frac{2}{h} u_i - \frac{1}{h} u_{i+1} = f(x^{[i]})\, h, \quad i = 2, \ldots, n-1,

u_n = u_R
Same result as from the finite difference method!
Exact or more accurate numerical integration: different right-handside term
Intro to finite elements – p. 214
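The tridiagonal system above is easy to set up and solve directly. A sketch (the test case f = 2 with u_L = u_R = 0 is an assumption; for it the nodal values x_i(1-x_i) come out exactly, since the truncation error of the scheme involves the fourth derivative of u, which vanishes):

```python
import numpy as np

n = 11                                # nodes x_1..x_n on [0,1]
h = 1.0/(n - 1)
x = np.linspace(0.0, 1.0, n)
f = lambda xi: 2.0                    # -u'' = 2, u(0) = u(1) = 0  ->  u = x(1-x)

A = np.zeros((n, n)); b = np.zeros(n)
for i in range(1, n - 1):             # interior equations
    A[i, i-1] = -1.0/h; A[i, i] = 2.0/h; A[i, i+1] = -1.0/h
    b[i] = h*f(x[i])                  # trapezoidal-rule load: f(x_i) h
A[0, 0] = 1.0; b[0] = 0.0             # essential BC u_1 = uL
A[-1, -1] = 1.0; b[-1] = 0.0          # essential BC u_n = uR

u = np.linalg.solve(A, b)
err = np.max(np.abs(u - x*(1.0 - x)))
print(err)                            # nodal values are exact for this f
```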
Element by element computations (1)
Split integral into a sum over each element:
A_{i,j} = \int_0^1 N_i' N_j' \, dx = \sum_{e=1}^m A^{(e)}_{i,j}, \quad A^{(e)}_{i,j} = \int_{\Omega_e} N_i' N_j' \, dx

b_i = \int_0^1 f N_i \, dx = \sum_{e=1}^m b^{(e)}_i, \quad b^{(e)}_i = \int_{\Omega_e} f N_i \, dx

A^{(e)}_{i,j} \neq 0 iff i and j are nodes in element e
b^{(e)}_i \neq 0 iff i is a node in element e
Collect the nonzero A^{(e)}_{i,j} in a 2 \times 2 element matrix:

A^{(e)}_{r,s}, \quad r, s = 1, 2, \quad r, s: \text{local node numbers}
Intro to finite elements – p. 215
Element by element computations (2)
Similar strategy for b^{(e)}_i; we collect the nonzero entries on element e in b^{(e)}_r, with r = 1, 2 counting local node numbers
Algorithm: run through all elements, compute A^{(e)}_{r,s} and b^{(e)}_r, and combine all element matrices and vectors into a linear system
The local nature of N_i gives a method where one can compute just a few numbers for an element, independent of the other elements
The result is a sparse matrix and the possibility of performing the elementwise computations in parallel
Intro to finite elements – p. 216
Local coordinates (1)
Map element \Omega_e = [x^{[e]}, x^{[e+1]}] to [-1, 1]
Formula: local \xi \in [-1, 1] to global x,

x^{(e)}(\xi) = \frac{1}{2}\big(x^{[e]} + x^{[e+1]}\big) + \frac{1}{2}\xi\big(x^{[e+1]} - x^{[e]}\big)

Define N_i in local \xi coordinates

(figure: the local basis functions N_1(\xi) and N_2(\xi) on [-1, 1])

Perform all computations in local coordinates
Local node r (= 1, 2) in element e corresponds to global node i = q(e, r)
Intro to finite elements – p. 217
Local coordinates (2)
Local linear basis functions:
N_1(\xi) = \frac{1}{2}(1 - \xi), \quad N_2(\xi) = \frac{1}{2}(1 + \xi)

In general: can always compute the finite element equations with such local basis functions in a reference element of fixed size
We need to change variables:

x = \frac{1}{2}\big(x^{[e]} + x^{[e+1]}\big) + \frac{1}{2}\xi\big(x^{[e+1]} - x^{[e]}\big) \quad\Rightarrow\quad \frac{dx}{d\xi} = \frac{h}{2}
Intro to finite elements – p. 218
Local coordinates (3)
In the integral:
\int_{x^{[e]}}^{x^{[e+1]}} N_i'(x) N_j'(x) \, dx = \int_{-1}^{1} \frac{dN_r(\xi)}{d\xi}\frac{d\xi}{dx}\, \frac{dN_s(\xi)}{d\xi}\frac{d\xi}{dx} \left|\frac{dx}{d\xi}\right| d\xi = \int_{-1}^{1} \Big(\frac{2}{h}\Big)^2 \frac{dN_r(\xi)}{d\xi} \frac{dN_s(\xi)}{d\xi}\, \frac{h}{2} \, d\xi
Intro to finite elements – p. 219
Local coordinates (3)
The variable transformation can be expressed in general formulasapplicable to general finite element problems in 1D, 2D and 3D
General (isoparametric) mapping:
x^{(e)}(\xi) = \sum_{r=1}^{n_e} N_r(\xi)\, x^{[q(e,r)]}

(specializes to the previous formula for linear N_r)
Change integration variable from x to \xi:

\int_{x^{[e]}}^{x^{[e+1]}} N_i'(x) N_j'(x) \, dx = \int_{-1}^{1} J^{-1}\frac{dN_r(\xi)}{d\xi}\, J^{-1}\frac{dN_s(\xi)}{d\xi}\, \det J \, d\xi

We often write

\int_{\Omega_e} \frac{dN_r}{dx}\frac{dN_s}{dx} \det J \, d\xi

as the expression in local coordinates, knowing that

\frac{dN_r}{dx} = J^{-1}\frac{dN_r}{d\xi} = \frac{2}{h}\frac{dN_r}{d\xi}
Intro to finite elements – p. 220
Local coordinates (4)
−u′′(x) = f(x)
Jacobian matrix of the mapping: J (1 \times 1 in 1D)
Uniform partition in 1D: J = h/2
Element matrix and vector:

A^{(e)}_{r,s} = \int_{-1}^{1} \frac{2}{h} N_r'(\xi)\, \frac{2}{h} N_s'(\xi)\, \frac{h}{2} \, d\xi

b^{(e)}_r = \int_{-1}^{1} f(x^{(e)}(\xi))\, N_r(\xi)\, \frac{h}{2} \, d\xi

Example: r = s = 1,

A^{(e)}_{1,1} = \frac{2}{h}\int_{-1}^{1} \Big(-\frac{1}{2}\Big)\Big(-\frac{1}{2}\Big) d\xi = \frac{1}{h}
Intro to finite elements – p. 221
Results
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
Element matrix and vector:

A^{(e)}_{r,s} = \frac{1}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \quad b^{(e)}_r = \frac{h}{2}\begin{pmatrix} f(x^{(e)}(-1)) \\ f(x^{(e)}(1)) \end{pmatrix}

where numerical integration is used:

\int_{-1}^{1} g(\xi) \, d\xi \approx g(-1) + g(1)
Intro to finite elements – p. 222
Essential boundary conditions
Incorporate essential boundary conditions at the element level
Element level equations:

\sum_{s=1}^{2} A^{(e)}_{r,s} u_s = b^{(e)}_r, \quad r = 1, 2

Example: essential condition u_1 = u_L
Replace eq. no. 1 by u_1 = u_L:

u_1 = u_L

A^{(e)}_{2,1} u_1 + A^{(e)}_{2,2} u_2 = b^{(e)}_2

Modify the element matrix and vector:

\begin{pmatrix} 1 & 0 \\ -\frac{1}{h} & \frac{1}{h} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} u_L \\ \frac{h}{2} f(x^{(1)}(1)) \end{pmatrix}
Intro to finite elements – p. 223
Symmetric element/coefficient matrix
The element matrix is actually symmetric
The essential BC modification makes the symmetric matrix nonsymmetric
Symmetrization: subtract column 1 in A(e) times uL from b(e), thenreplace eq. no. 1
This modification preserves the symmetry property of the elementmatrix and the resulting coefficient matrix
A symmetric coeff. matrix gives less storage and enables applicationof some efficient iterative solution methods
Intro to finite elements – p. 224
Numerical integration
Integration rules are normally tabulated for integrals on [−1, 1]:
\int_{-1}^{1} g(\xi) \, d\xi \approx \sum_{k=1}^{n_I} g(\xi_k) w_k

\xi_k: integration points
w_k: integration weights
Some rules integrating polynomials of degree p exactly:

name           | n_I | p | weights         | points
Gauss-Legendre | 1   | 1 | (2)             | (0)
Gauss-Legendre | 2   | 3 | (1, 1)          | (-1/\sqrt{3}, 1/\sqrt{3})
Gauss-Legendre | 3   | 5 | (5/9, 8/9, 5/9) | (-\sqrt{3/5}, 0, \sqrt{3/5})
Gauss-Lobatto  | 2   | 1 | (1, 1)          | (-1, 1)
Gauss-Lobatto  | 3   | 3 | (1/3, 4/3, 1/3) | (-1, 0, 1)
Intro to finite elements – p. 225
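The tabulated rules can be spot-checked by integrating polynomials of the stated degree p, for which the rules are exact; a small sketch:

```python
import numpy as np

rules = {
    "GL1":  (np.array([2.0]), np.array([0.0])),
    "GL2":  (np.array([1.0, 1.0]), np.array([-1.0, 1.0])/np.sqrt(3.0)),
    "GL3":  (np.array([5, 8, 5])/9.0, np.array([-1.0, 0.0, 1.0])*np.sqrt(3.0/5.0)),
    "GLo3": (np.array([1, 4, 1])/3.0, np.array([-1.0, 0.0, 1.0])),
}

def integrate(rule, g):
    w, xi = rules[rule]
    return float(np.sum(w*g(xi)))

# Gauss-Legendre with n_I points is exact up to degree 2 n_I - 1;
# 3-point Gauss-Lobatto is exact up to degree 3
v1 = integrate("GL2",  lambda t: t**3 + t**2 + 1)   # exact: 8/3
v2 = integrate("GL3",  lambda t: t**5 + t**4)       # exact: 2/5
v3 = integrate("GLo3", lambda t: t**3 + t**2)       # exact: 2/3
print(v1, v2, v3)
```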
Assembly
Element matrices and vectors must be assembled (added) into the global system of linear equations
Essential: the local-to-global mapping q(e, r): local node r in element e has global node number q(e, r)
In 1D, q(e, r) = e - 1 + r, but in 2D/3D the grid is more complicated and q is just a table
Algorithm:
A_{q(e,r),q(e,s)} := A_{q(e,r),q(e,s)} + A^{(e)}_{r,s}, \quad r, s = 1, 2

b_{q(e,r)} := b_{q(e,r)} + b^{(e)}_r, \quad r = 1, 2
Intro to finite elements – p. 226
Illustration of the assembly process
(figure: element matrices being added into the global matrix via the mapping q(e, r))
Intro to finite elements – p. 227
Summing up the procedures
Weighted residual formulation, often with Galerkin's choice W_i = N_i
Integration by parts
Derivative boundary conditions in boundary terms
Compute element matrices and vectors:
  local coordinates with local numbering
  numerical integration
  enforce essential boundary conditions
  assemble local contributions
Solve linear system
Intro to finite elements – p. 228
Generality
−u′′ = f , u(0) = uL, u(1) = uR
is just an example
The algorithm works in 1D, 2D, 3D
Complicated geometries can be handled
Element shapes in 2D: triangles, quadrilaterals
Element shapes in 3D: boxes, tetrahedra
Time dependency: "time loop outside a stationary solver"
Intro to finite elements – p. 229
Nonconstant element size
Trivial to work with varying element size
Just replace h by h_e in the formulas from element e
Result in the model problem:

u_1 = u_L

-\frac{1}{h_{i-1}} u_{i-1} + \Big(\frac{1}{h_{i-1}} + \frac{1}{h_i}\Big) u_i - \frac{1}{h_i} u_{i+1} = \frac{1}{2}(h_{i-1} + h_i) f(x^{[i]}), \quad i = 2, \ldots, n-1,

u_n = u_R
Varying element shape and size is straightforward in multi-dimensional problems
Intro to finite elements – p. 230
The elementwise algorithm (1)
initialize global linear system:
  set A_{i,j} = 0 for i, j = 1, ..., n
  set b_i = 0 for i = 1, ..., n
loop over all elements: for e = 1, ..., m
  set A^{(e)}_{r,s} = 0, r, s = 1, ..., n_e
  set b^{(e)}_r = 0, r = 1, ..., n_e
  loop over numerical integration points: for k = 1, ..., n_I
    evaluate N_r(\xi_k), the derivatives of N_r wrt. \xi and x, and J
    add the contribution from the current integration point to the element matrix and vector:
    for r = 1, ..., n_e
      for s = 1, ..., n_e
        A^{(e)}_{r,s} := A^{(e)}_{r,s} + \frac{dN_r}{dx}\frac{dN_s}{dx} \det J\, w_k
      b^{(e)}_r := b^{(e)}_r + f(x^{(e)}(\xi_k)) N_r \det J\, w_k
Intro to finite elements – p. 231
The elementwise algorithm (2)
  incorporate essential boundary conditions: for r = 1, ..., n_e
    if node r has an essential boundary condition then
      modify A^{(e)}_{r,s} and b^{(e)}_r due to this condition
  assemble: for r = 1, ..., n_e
    for s = 1, ..., n_e
      A_{q(e,r),q(e,s)} := A_{q(e,r),q(e,s)} + A^{(e)}_{r,s}
    b_{q(e,r)} := b_{q(e,r)} + b^{(e)}_r
Intro to finite elements – p. 232
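The elementwise algorithm above can be sketched compactly in Python. This is a minimal sketch for -u'' = f with linear elements; the test case f = \pi^2 \sin(\pi x) with exact solution \sin(\pi x) is an assumption for illustration:

```python
import numpy as np

def fem1d(f, m, uL=0.0, uR=0.0):
    """Elementwise FE algorithm for -u'' = f on (0,1) with m linear elements."""
    n = m + 1
    xn = np.linspace(0.0, 1.0, n)
    A = np.zeros((n, n)); b = np.zeros(n)
    gauss = [(-1/np.sqrt(3.0), 1.0), (1/np.sqrt(3.0), 1.0)]  # 2-point Gauss rule
    N  = lambda xi: np.array([0.5*(1 - xi), 0.5*(1 + xi)])   # local basis on [-1,1]
    dN = np.array([-0.5, 0.5])                               # dN_r/dxi (constant)
    for e in range(m):
        xL, xR = xn[e], xn[e+1]; detJ = (xR - xL)/2.0        # dx/dxi = h/2
        Ae = np.zeros((2, 2)); be = np.zeros(2)
        for xi, w in gauss:                                  # numerical integration
            dNdx = dN/detJ
            Ae += np.outer(dNdx, dNdx)*detJ*w
            be += f(0.5*(xL + xR) + xi*detJ)*N(xi)*detJ*w
        for r in range(2):                                   # assembly: q(e,r) = e + r
            b[e+r] += be[r]
            for s in range(2):
                A[e+r, e+s] += Ae[r, s]
    A[0, :] = 0.0; A[0, 0] = 1.0; b[0] = uL                  # essential BCs
    A[-1, :] = 0.0; A[-1, -1] = 1.0; b[-1] = uR
    return xn, np.linalg.solve(A, b)

x, u = fem1d(lambda x: np.pi**2*np.sin(np.pi*x), m=20)
err = np.max(np.abs(u - np.sin(np.pi*x)))
print(err)    # small nodal error for this smooth test case
```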
Exercise 2.7
-u''(x) = f(x) \equiv -(\alpha + 1)x^\alpha

Good repetition of the previous material
A new point: \int_0^1 x^\alpha N_i(x) \, dx
The final discrete equations can be written

-h[\delta_x\delta_x u]_i = h b_i

for internal nodes i = 2, \ldots, n-1, where

b_i = \frac{1}{h^2}\frac{1}{\alpha+2}\Big( -(x^{[i-1]})^{\alpha+2} + 2(x^{[i]})^{\alpha+2} - (x^{[i+1]})^{\alpha+2} \Big)

For i = 1: u_1 = 0
For i = n: a slightly modified equation
Intro to finite elements – p. 233
Exercise 2.7 cont.
A standard finite difference method gives

-[\delta_x\delta_x u]_i = f_i = -(\alpha+1)(x^{[i]})^\alpha
for internal nodes i = 2, . . . , n− 1
FEM and FDM give slightly different equations
Which method is most accurate?
By accident, FEM solves this problem exactly (!)
Intro to finite elements – p. 234
FEM for the wave equation
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(c^2\nabla u), \quad x \in \Omega,\ t > 0

u(x, 0) = f(x), \quad x \in \Omega

\frac{\partial}{\partial t} u(x, 0) = 0, \quad x \in \Omega

\frac{\partial u}{\partial n} = 0, \quad x \in \partial\Omega

Finite differences in time:

\frac{\partial^2}{\partial t^2} u(x, t_\ell) = \frac{u^{\ell-1} - 2u^\ell + u^{\ell+1}}{\Delta t^2} + O(\Delta t^2)

Spatial problem at each time level (c = const chosen for simplicity):

u^{\ell+1}(x) = 2u^\ell(x) - u^{\ell-1}(x) + (c\Delta t)^2 \nabla^2 u^\ell(x)

(can be discretized by finite elements)
Intro to finite elements – p. 235
Time stepping
u^0 = f(x), \quad x \in \Omega

u^{-1} = u^0 + \frac{1}{2}c^2\Delta t^2 \nabla^2 u^0, \quad x \in \Omega

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell, \quad x \in \Omega, \quad \ell = 0, 1, \ldots

\frac{\partial u^\ell}{\partial n} = 0, \quad x \in \partial\Omega, \quad \ell = 0, 1, \ldots

u^\ell(x) \approx \hat u^\ell = \sum_{j=1}^n u_j^\ell N_j(x), \quad \ell = -1, 0, 1, 2, \ldots
Intro to finite elements – p. 236
Spatial problems
\sum_{j=1}^n M_{i,j} u_j^0 = \int_\Omega f(x) N_i \, d\Omega,

\sum_{j=1}^n M_{i,j} u_j^{-1} = \int_\Omega \Big[ \hat u^0 N_i - \frac{1}{2}(c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^0 \Big] d\Omega + \frac{1}{2}\Delta t^2 \int_{\partial\Omega} c^2 \frac{\partial f}{\partial n} N_i \, d\Gamma,

\sum_{j=1}^n M_{i,j} u_j^{\ell+1} = \int_\Omega \Big[ \big(2\hat u^\ell(x) - \hat u^{\ell-1}(x)\big) N_i - (c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^\ell(x) \Big] d\Omega,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega

is the mass matrix
Intro to finite elements – p. 237
Spatial problems
Alternative notation:

\sum_{j=1}^n M_{i,j} u_j^0 = b_i,

\sum_{j=1}^n M_{i,j} u_j^{-1} = \sum_{j=1}^n M_{i,j} u_j^0 - \frac{1}{2}(c\Delta t)^2 \sum_{j=1}^n K_{i,j} u_j^0 + b_i,

\sum_{j=1}^n M_{i,j} u_j^{\ell+1} = 2\sum_{j=1}^n M_{i,j} u_j^\ell - \sum_{j=1}^n M_{i,j} u_j^{\ell-1} - (c\Delta t)^2 \sum_{j=1}^n K_{i,j} u_j^\ell,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega \quad \text{from the } u \text{ terms}

K_{i,j} = \int_\Omega \nabla N_i \cdot \nabla N_j \, d\Omega \quad \text{from the } \nabla^2 u \text{ terms}

Intro to finite elements – p. 238
FEM for the 1D wave equation
1D: \nabla^2 u = u_{xx}
Linear elements
Compute the element matrices corresponding to the two principal terms (u_{tt} and u_{xx}):

M^{(e)} = \frac{h}{6}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad K^{(e)} = \frac{c^2}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}

(mass matrix and stiffness matrix at the element level)
Assembling the stiffness matrix term:

-\sum_j K_{i,j} u_j^\ell = \frac{c^2}{h}\big( u^\ell_{i-1} - 2u^\ell_i + u^\ell_{i+1} \big)

(same as FDM)
Intro to finite elements – p. 239
Interpreting the mass matrix term (1)
Assembling the mass matrix contributions:
\sum_j M_{i,j} u_j^{\ell+1} = \frac{h}{6}\big( u^{\ell+1}_{i-1} + 4u^{\ell+1}_i + u^{\ell+1}_{i+1} \big)

With FDM, only h u^{\ell+1}_i would appear
Can rewrite this as

h\Big( u^{\ell+1}_i + \frac{1}{6}\big( u^{\ell+1}_{i-1} - 2u^{\ell+1}_i + u^{\ell+1}_{i+1} \big) \Big)

or expressed with difference operators:

h\big[u + \frac{h^2}{6}\delta_x\delta_x u\big]^{\ell+1}_i
⇒ FDM representation + a diffusion term
Intro to finite elements – p. 240
Interpreting the mass matrix term (2)
The complete equation:
\big[\delta_t\delta_t\big(u + \frac{h^2}{6}\delta_x\delta_x u\big) = c^2\delta_x\delta_x u\big]^\ell_i

⇒ FDM representation + \big[\frac{1}{6}h^2\delta_t\delta_t\delta_x\delta_x u\big]^\ell_i (dispersion)
Notice: FEM gives an implicit scheme (must solve a linear system to find u^{\ell+1}_i)
In 1D: solution of tridiagonal systems is fast
In 2D/3D: solving linear systems slows down the method significantly
Intro to finite elements – p. 241
Lumping the mass matrix
If we apply nodal-point integration - the trapezoidal rule - the element mass matrix becomes

\frac{h}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

Assembling:

\sum_j M_{i,j} u^{\ell+1}_j = h u^{\ell+1}_i

i.e. the same result as a finite difference method!
Making the mass matrix diagonal (by e.g. nodal-point integration) is called mass lumping
Final lumped scheme:

[\delta_t\delta_t u = c^2\delta_x\delta_x u]^\ell_i

i.e. a standard finite difference scheme
Intro to finite elements – p. 242
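The lumped scheme is the standard leapfrog FDM, which is straightforward to code. A sketch that checks it against the standing wave u = \cos(\pi c t)\sin(\pi x); the choice of fixed ends u = 0 (instead of the Neumann condition in the slides) is an assumption made to keep the test simple:

```python
import numpy as np

# Leapfrog scheme [dt dt u = c^2 dx dx u]; u = 0 at both ends is an
# assumed boundary condition chosen so u = cos(pi c t) sin(pi x) is exact.
c, n, C = 1.0, 101, 0.9                  # C = c dt/h <= 1 for stability
x = np.linspace(0.0, 1.0, n); h = x[1] - x[0]; dt = C*h/c
u0 = np.sin(np.pi*x)                     # u(x,0); du/dt(x,0) = 0
lap = lambda u: np.concatenate(([0.0], u[:-2] - 2*u[1:-1] + u[2:], [0.0]))
u_prev, u = u0, u0 + 0.5*C**2*lap(u0)    # special first step (from the u^-1 formula)
t = dt
while t < 0.5 - 1e-12:
    u_prev, u = u, 2*u - u_prev + C**2*lap(u)
    t += dt
exact = np.cos(np.pi*c*t)*np.sin(np.pi*x)
err = np.max(np.abs(u - exact))
print(err)
```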
Some questions
What is best, consistent or lumped mass matrix?
That depends on the equation!
We shall analyze the discrete wave equations
Is there any physical justification of lumping?
Yes! – see Exercise 2.11
Intro to finite elements – p. 243
Analysis of FEM for the wave equation (1)
Derive the discrete equations
Look for analytical solutions of the discrete equations
Find corresponding analytical solutions of the continuous problem
Compare principal quantities, e.g., wave velocity
This is called numerical dispersion analysis
Intro to finite elements – p. 244
Analysis of FEM for the wave equation (2)
The discrete equations from FEM:
\big[\delta_t\delta_t\big(u + \frac{h^2}{6}\delta_x\delta_x u\big) = c^2\delta_x\delta_x u\big]^\ell_i

Lumped mass: only \delta_t\delta_t u on the left-hand side
Inserting a discrete solution:

u^\ell_j = A e^{i(kjh - \omega\ell\Delta t)} = A e^{i(kx - \omega t)}

Results in a numerical dispersion relation:

\omega = \omega(k, h, \Delta t)
Can use this for stability and accuracy analysis
Intro to finite elements – p. 245
Analysis of FEM for the wave equation (3)
The numerical dispersion relation follows from (solving for \omega)

\sin^2\frac{\omega\Delta t}{2} = \frac{c^2\Delta t^2}{h^2}\Big( 1 - \frac{2}{3}\sin^2\frac{kh}{2} \Big)^{-1} \sin^2\frac{kh}{2}

Truncation error or series expansion of the error in the wave velocity:

\tau \sim \frac{\Delta t^2}{12}\Big(\frac{\partial^4 u}{\partial t^4}\Big)^\ell_i - \frac{c^2 h^2}{12}\Big(\frac{\partial^4 u}{\partial x^4}\Big)^\ell_i + \frac{h^2}{6}\Big(\frac{\partial^4 u}{\partial x^2\partial t^2}\Big)^\ell_i

No big difference from FDM
Stability: must require real \omega:

\frac{c\Delta t}{h} \leq \frac{1}{\sqrt{3}}

A reducing factor 1/\sqrt{3} compared with FDM!
Intro to finite elements – p. 246
Consistent vs. lumped mass matrix
FEM w/lumped mass = FDM scheme
FEM w/consistent mass:
- same order of accuracy as FDM
- lower stability limit (factor 1/\sqrt{3})
- the exact-solution property for C = 1 is lost
- must solve linear systems
⇒ Use lumped mass for this wave equation!
Intro to finite elements – p. 247
Other problems
The conclusions here apply to 2D/3D wave equations
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[c^2\nabla u]

Another PDE (uni-directional wave eq.):

\frac{\partial u}{\partial t} + v\cdot\nabla u = 0
Here, lumped mass reduces the accuracy significantly, so don't generalize too much
Intro to finite elements – p. 248
Error in numerical wave velocity (1)
consistent mass
(plot of the relative error (\bar c - c)/c as a function of p = kh, where C = c\Delta t/h. Top curve: C = 1/\sqrt{3} (max); mid curve: C = 0.3; bottom curve: C = 0.1.)
Intro to finite elements – p. 249
Error in numerical wave velocity (2)
lumped mass or FDM
(plot of the relative error (\bar c - c)/c as a function of p = kh, where C = c\Delta t/h. C = 1 gives the exact solution. Top curve: C = 0.98 (max); mid curves: C = 0.9, 0.5; bottom curve: C = 0.1.)
Intro to finite elements – p. 250
Software tools for experimentation (1)
Propagation of numerical high-frequency noise: src/fdm/Wave1D/steep1 (steep2)
σ (sigma) is a steepness parameter in a plug-shaped profile:
f(x) = \begin{cases} 0.5 - \pi^{-1}\arctan(\sigma(x-2)), & x > 0, \\ 0.5 + \pi^{-1}\arctan(\sigma(x+2)), & x \leq 0 \end{cases}

Vary the resolution and \sigma. Study the effect on the wave propagation.
Error in the wave velocity for a sine component:

\bar c - c = \frac{\bar\omega - \omega}{k} \sim -k^2

Use this information to explain the visual observations
Intro to finite elements – p. 251
Quadratic 1D elements
Piecewise quadratic Ni ⇒ piecewise quadratic u
Three nodes per element: one in the middle plus the two at the ends
We always have Ni(x[j]) = 0, i 6= j
Previous algorithms and techniques still work
Intro to finite elements – p. 252
Quadratic 1D elements in local coordinates
Three nodes per element:
ξ1 = −1, ξ2 = 0, ξ3 = 1
Calculation of basis functions:

N_r(\xi) = a_r\xi^2 + b_r\xi + c_r, \quad N_r(\xi_s) = \delta_{rs}

⇒ 3 equations for a_r, b_r, c_r (r = 1, 2, 3)

N_1(\xi) = \frac{1}{2}\xi(\xi - 1), \quad N_2(\xi) = (1 + \xi)(1 - \xi), \quad N_3(\xi) = \frac{1}{2}\xi(1 + \xi)

Isoparametric mapping:

x = \sum_{r=1}^{3} N_r(\xi)\, x^{[q(e,r)]}

(could also use the linear mapping if the mid node is in the center)
3 \times 3 element matrix
Intro to finite elements – p. 253
Why quadratic elements?
PDE: -u''(x) = 1, \quad x \in (0, 1)
Basic error estimate:

\Big( \int_0^1 (u - \hat u)^2 \, dx \Big)^{1/2} = O(h^{s+1}), \quad s = \text{degree of } N_i

where h is the distance between two neighboring nodes
Linear elements: error \sim h^2
Quadratic elements: error \sim h^3
⇒ doubling the number of nodes reduces the error by a factor 1/8
Intro to finite elements – p. 254
Quadratic elements: example
PDE: -u''(x) = 1, \quad x \in (0, 1)
Apply the general elementwise algorithm with n_e = 3, quadratic N_r, and analytical integration
Element matrix:

\frac{1}{3h_e}\begin{pmatrix} 7 & -8 & 1 \\ -8 & 16 & -8 \\ 1 & -8 & 7 \end{pmatrix}

h_e: physical length of element e.
Element vector:

\frac{h_e}{6}\begin{pmatrix} 1 \\ 4 \\ 1 \end{pmatrix}
Intro to finite elements – p. 255
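The 3x3 element matrix above can be verified by integrating the derivatives of the local quadratic basis functions with the 3-point Gauss-Legendre rule, which is exact here since the integrands are quadratic (the element length h = 0.35 is an arbitrary assumed value):

```python
import numpy as np

h = 0.35                                            # arbitrary element length
dN = lambda xi: np.array([xi - 0.5, -2.0*xi, xi + 0.5])   # dN_r/dxi, quadratic basis
pts = np.array([-1.0, 0.0, 1.0])*np.sqrt(3.0/5.0)   # 3-point Gauss-Legendre
wts = np.array([5.0, 8.0, 5.0])/9.0
detJ = h/2.0
# A^(e)_{rs} = sum_k w_k (dN_r/dx)(dN_s/dx) detJ, with dN/dx = dN/dxi / detJ
Ae = sum(w*np.outer(dN(xi)/detJ, dN(xi)/detJ)*detJ for xi, w in zip(pts, wts))
ref = np.array([[7, -8, 1], [-8, 16, -8], [1, -8, 7]])/(3.0*h)
err = np.max(np.abs(Ae - ref))
print(err)
```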
Implementation of 1D FE problems
Direct implementation in terms of arrays
Linear 1D elements, -u'' = \beta = const, but still a fairly general implementation
scan: read n and β, allocate vectors, matrices, etc., call initGrid.
initGrid: compute xi and q(e, r), i.e., the finite element grid.
makeSystem: calculate the linear system.
solve: solve linear system by Gaussian elimination.
calcElmMatVec: compute element matrix and vector for an element.
integrands: evaluate the integrands of the weighted residual statement
N: evaluate the basis functions in local coordinates.
dN: evaluate the derivatives of the basis functions in local coordinates.
Intro to finite elements – p. 256
Extensions of the program
1D to 2D (3D): big job
Another PDE: easy - integrands
Quadratic elements: small, scattered modifications - N, dN, integrands,initGrid ...
Goal: solver code is independent of
  element type
  linear system solver
  matrix format
  grid type
  number of space dimensions
Intro to finite elements – p. 257
2D domains
Strength of the finite element method: easy to work with geometrically complicated domains
(figure: finite element mesh of Lake Superior with 6 islands, 2330 triangles)
Intro to finite elements – p. 258
Element shapes
Rectangular, triangular
Straight or curved sides
Illegal:
illegal node
Intro to finite elements – p. 259
2D rectangular bilinear element
(figure: reference square with nodes 1-4 at the corners in (\xi_1, \xi_2) coordinates)

4 nodes
Bilinear functions:

N_r(\xi_1, \xi_2) = a_r + b_r\xi_1 + c_r\xi_2 + d_r\xi_1\xi_2

Conditions for determining a_r, b_r, c_r, d_r:

N_r(\text{node } s) = \delta_{rs}

4 equations for a_r, b_r, c_r, d_r (fixed r)
Intro to finite elements – p. 260
Mapping of 2D bilinear element
(figure: mapping between the local (\xi_1, \xi_2) reference square and a global quadrilateral in (x_1, x_2) coordinates)
Intro to finite elements – p. 261
2D linear 3-node element
(figure: local reference triangle with nodes 1-3 in (\xi_1, \xi_2) coordinates and its global image in (x_1, x_2))
linear Ni(ξ1, ξ2), straight sides
Intro to finite elements – p. 262
Typical 2D linear basis function
Intro to finite elements – p. 263
Construction of the basis functions
(figure: reference triangle with local nodes 1, 2, 3 at (1,0), (0,1), (0,0) in (\xi_1, \xi_2) coordinates)

Principles for the construction:
1. N_i(\xi_1, \xi_2) is a polynomial
2. N_i(\text{node } j) = \delta_{ij}
3 constraints ⇒ N_i must be linear

N_1 = \xi_1, \quad N_2 = \xi_2, \quad N_3 = 1 - \xi_1 - \xi_2

Intro to finite elements – p. 264
Construction of 2D basis functions; Example
Example: N2(ξ1, ξ2)
N_i(\xi_1, \xi_2) = \alpha_i + \beta_i\xi_1 + \gamma_i\xi_2

N_2(1, 0) = \alpha_2 + \beta_2\cdot 1 + \gamma_2\cdot 0 = 0
N_2(0, 1) = \alpha_2 + \beta_2\cdot 0 + \gamma_2\cdot 1 = 1
N_2(0, 0) = \alpha_2 + \beta_2\cdot 0 + \gamma_2\cdot 0 = 0

3 linear equations in the 3 unknowns \alpha_2, \beta_2, and \gamma_2.
Solution: \alpha_2 = \beta_2 = 0, \gamma_2 = 1

N_2(\xi_1, \xi_2) = \xi_2
Intro to finite elements – p. 265
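The same construction can be done for all three basis functions at once by solving the 3x3 system N_i(node j) = \delta_{ij}; a minimal sketch:

```python
import numpy as np

# Nodes of the reference triangle (local numbering as in the slides)
nodes = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
# N_i(xi1, xi2) = alpha_i + beta_i xi1 + gamma_i xi2, with N_i(node j) = delta_ij
V = np.column_stack([np.ones(3), nodes[:, 0], nodes[:, 1]])
coeffs = np.linalg.solve(V, np.eye(3))   # column i holds (alpha_i, beta_i, gamma_i)
print(coeffs.T)
```

The columns reproduce N_1 = \xi_1, N_2 = \xi_2, N_3 = 1 - \xi_1 - \xi_2.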
2D quadratic 6-node element
(figure: quadratic 6-node reference triangle in (\xi_1, \xi_2) and its global image with curved sides)
quadratic Ni(ξ1, ξ2), curved sides (parabola)
Intro to finite elements – p. 266
2D quadratic 9-node element
(figure: biquadratic 9-node reference square in (\xi_1, \xi_2) and its global image with curved sides)
biquadratic Ni(ξ1, ξ2), curved sides (parabola)
Intro to finite elements – p. 267
2D quadratic 8-node element
(figure: 8-node reference square in (\xi_1, \xi_2) and its global image with curved sides)
biquadratic Ni(ξ1, ξ2) minus ξ21ξ22 term, curved sides (parabola)
Intro to finite elements – p. 268
3D elements
tetrahedron with 4 corner nodes (linear N_i)
tetrahedron with 10 nodes (quadratic N_i), mid-node on each edge
box with 8 corner nodes (tri-linear N_i)
box with 20 nodes (quadratic N_i), mid-node on each edge
box with 27 nodes (tri-quadratic N_i), mid-nodes on edges and sides
Intro to finite elements – p. 269
Triangular vs. box shape
Any 2D geometry can be divided into triangles (if the boundary is approximated by polygons)
Any 3D geometry can be divided into tetrahedra (if the boundary is approximated by polygons)
Many geometries can be divided into rectangles/boxes, but one may need an extra triangle/tetrahedron
Different element shapes have different properties, depending on the PDE system, so choosing the right element is not obvious, and it is not only a geometry-approximation issue
Intro to finite elements – p. 270
Convection-dominated flow
Convection-dominated flow – p. 271
Projects 1.5.2 and 1.5.3
Model problem:
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
ε small: boundary layer at x = 1
Standard numerics (i.e. centered differences) will fail!
Cure: upwind differences
Convection-dominated flow – p. 272
Notation for difference equations (1)
Define

[\delta_x u]^\ell_{i,j,k} \equiv \frac{u^\ell_{i+\frac{1}{2},j,k} - u^\ell_{i-\frac{1}{2},j,k}}{h}

with similar definitions of \delta_y, \delta_z, and \delta_t
Another difference:

[\delta_{2x} u]^\ell_{i,j,k} \equiv \frac{u^\ell_{i+1,j,k} - u^\ell_{i-1,j,k}}{2h}

Compound difference:

[\delta_x\delta_x u]^\ell_i = \frac{1}{h^2}\big( u^\ell_{i-1} - 2u^\ell_i + u^\ell_{i+1} \big)
Convection-dominated flow – p. 273
Notation for difference equations (2)
One-sided forward difference:

[\delta^+_x u]^\ell_i \equiv \frac{u^\ell_{i+1} - u^\ell_i}{h}

and the backward difference:

[\delta^-_x u]^\ell_i \equiv \frac{u^\ell_i - u^\ell_{i-1}}{h}

Put the whole equation inside brackets:

[\delta_x\delta_x u = -f]_i
is a finite difference scheme for u′′ = −f
Convection-dominated flow – p. 274
Centered differences
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
\frac{u_{i+1} - u_{i-1}}{2h} = \varepsilon\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}, \quad i = 2, \ldots, n-1

u_1 = 0, \quad u_n = 1

or [\delta_{2x} u = \varepsilon\delta_x\delta_x u]_i
Analytical solution:

u(x) = \frac{1 - e^{x/\varepsilon}}{1 - e^{1/\varepsilon}}

⇒ u'(x) > 0, i.e., a monotone function
Convection-dominated flow – p. 275
Numerical experiments (1)
(plot of u(x) for n = 20, \varepsilon = 0.1: centered scheme vs. exact solution)
Convection-dominated flow – p. 276
Numerical experiments (2)
(plot of u(x) for n = 20, \varepsilon = 0.01: centered scheme vs. exact solution)
Convection-dominated flow – p. 277
Numerical experiments (3)
(plot of u(x) for n = 80, \varepsilon = 0.01: centered scheme vs. exact solution)
Convection-dominated flow – p. 278
Numerical experiments (4)
(plot of u(x) for n = 20, \varepsilon = 0.001: centered scheme vs. exact solution)
Convection-dominated flow – p. 279
Numerical experiments; summary
The solution is not monotone if h > 2\varepsilon
The convergence rate is h^2 (in agreement with truncation error analysis) provided h \leq 2\varepsilon
Completely wrong qualitative behavior for h \gg 2\varepsilon
Convection-dominated flow – p. 280
Analysis
Can find an analytical solution of the discrete problem (!)
Method: insert u_i \sim \beta^i and solve for \beta

\beta_1 = 1, \quad \beta_2 = \frac{1 + h/(2\varepsilon)}{1 - h/(2\varepsilon)}

cf. HPL app. A.4.4
Complete solution:

u_i = C_1\beta_1^i + C_2\beta_2^i

Determine C_1 and C_2 from the boundary conditions:

u_i = \frac{\beta_2^i - \beta_2}{\beta_2^n - \beta_2}
Convection-dominated flow – p. 281
Important result
Observe: u_i oscillates if \beta_2 < 0

\frac{1 + h/(2\varepsilon)}{1 - h/(2\varepsilon)} < 0 \quad\Rightarrow\quad h > 2\varepsilon
Must require h ≤ 2ε for ui to have the same qualitative property asu(x)
This explains why we observed oscillations in the numerical solution
Convection-dominated flow – p. 282
Upwind differences
Problem:
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
Use a backward difference, called upwind difference, for the u′ term:
\frac{u_i - u_{i-1}}{h} = \varepsilon\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}, \quad i = 2, \ldots, n-1

u_1 = 0, \quad u_n = 1

The scheme can be written as

[\delta^-_x u = \varepsilon\delta_x\delta_x u]_i
Convection-dominated flow – p. 283
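Both schemes are small tridiagonal systems, so the monotonicity claims are easy to check numerically; a sketch (ε = 0.01 and n = 20 are the parameters used in the experiments, chosen so that h > 2ε):

```python
import numpy as np

def solve(eps, n, upwind=False):
    """u' = eps u'' on (0,1), u(0) = 0, u(1) = 1, on n equally spaced nodes."""
    h = 1.0/(n - 1)
    A = np.zeros((n, n)); b = np.zeros(n)
    for i in range(1, n - 1):
        if upwind:   # backward difference for u'
            A[i, i-1] = -1.0/h - eps/h**2
            A[i, i]   =  1.0/h + 2.0*eps/h**2
            A[i, i+1] = -eps/h**2
        else:        # centered difference for u'
            A[i, i-1] = -1.0/(2*h) - eps/h**2
            A[i, i]   =  2.0*eps/h**2
            A[i, i+1] =  1.0/(2*h) - eps/h**2
    A[0, 0] = 1.0; A[-1, -1] = 1.0; b[-1] = 1.0
    return np.linalg.solve(A, b)

uc = solve(0.01, 20)                 # h > 2 eps: oscillations expected
uu = solve(0.01, 20, upwind=True)
mono_up = bool(np.all(np.diff(uu) >= 0))
mono_ce = bool(np.all(np.diff(uc) >= 0))
print(mono_up, mono_ce)
```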
Numerical experiments (1)
(plot of u(x) for n = 20, \varepsilon = 0.1: upwind scheme vs. exact solution)
Convection-dominated flow – p. 284
Numerical experiments (2)
[Plot of u(x) vs. x for n = 20, ε = 0.01: upwind scheme vs. exact solution]
Convection-dominated flow – p. 285
Numerical experiments; summary
The solution is always monotone, i.e., always qualitatively correct
The boundary layer is too thick
The convergence rate is h (in agreement with truncation error analysis)
Convection-dominated flow – p. 286
Analysis
Analytical solution of the discrete equations:
u_i = β^i ⇒ β_1 = 1, β_2 = 1 + h/ε
u_i = C_1 + C_2β_2^i
Using the boundary conditions:
u_i = (β_2^i − β_2) / (β_2^n − β_2)
Since β_2 > 0 (actually β_2 > 1), β_2^i does not oscillate
Convection-dominated flow – p. 287
Centered vs. upwind scheme
Truncation error: centered is more accurate than upwind
Exact analysis: centered is more accurate than upwind when centered is stable (i.e. monotone u_i), but otherwise useless
ε = 10⁻⁶ ⇒ 500 000 grid points to make h ≤ 2ε
Upwind gives the best reliability, at the cost of a too thick boundary layer
Convection-dominated flow – p. 288
An interpretation of the upwind scheme
The upwind scheme
(u_i − u_{i−1})/h = ε (u_{i−1} − 2u_i + u_{i+1})/h²
or
[δ_x^− u = ε δ_x δ_x u]_i
can be rewritten as
(u_{i+1} − u_{i−1})/(2h) = (ε + h/2)(u_{i−1} − 2u_i + u_{i+1})/h²
or
[δ_{2x} u = (ε + h/2) δ_x δ_x u]_i
Upwind = centered + artificial diffusion (h/2)
Convection-dominated flow – p. 289
Finite elements for the model problem
Galerkin formulation of
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
and linear elements leads to a centered scheme (show it!)
(u_{i+1} − u_{i−1})/(2h) = ε (u_{i−1} − 2u_i + u_{i+1})/h², i = 2, …, n−1
u_1 = 0, u_n = 1
or
[δ_{2x} u = ε δ_x δ_x u]_i
Stability problems when h > 2ε
Convection-dominated flow – p. 290
Finite element theory (1)
Abstract finite element theory starts with
a(u, v) = L(v) ∀v ∈ V
Consider
v · ∇u = ε∇²u
⇒ Nonsymmetric a(u, v)
General best-approximation result:
||u − u_h||_V ≤ (c_1/c_2) ||u − v||_V ∀v ∈ V
where c_1 and c_2 are the bounds of a:
c_2 ||v||²_V ≤ a(v, v), |a(u, v)| ≤ c_1 ||u||_V ||v||_V
Convection-dominated flow – p. 291
Finite element theory (2)
In the estimate, c_2 = ε and
c_1 = ε + √C_Ω sup{|v(x)| : x ∈ Ω}
If |v| ≫ ε, c_1/c_2 is large
Some indication that the best-approximation property of the Galerkin method is not worth that much here
Convection-dominated flow – p. 292
Finite elements and upwind differences
How to construct upwind differences in a finite element context?
One possibility: add artificial diffusion (h/2):
u′(x) = (ε + h/2) u′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
Can be solved by a Galerkin method
Equivalent strategy: use perturbed weighting functions
Convection-dominated flow – p. 293
Perturbed weighting functions in 1D
Take
W_i(x) = N_i(x) + τN_i′(x)
as weighting function for the convective term u′:
∫₀¹ u′W_i dx = ∫₀¹ u′N_i dx + ∫₀¹ τN_i′u′ dx
The new term τN_i′u′ is the weak formulation of an artificial diffusion term τu′′ weighted by N_i
With τ = h/2 we then get the upwind scheme
Convection-dominated flow – p. 294
Optimal artificial diffusion
Try a weighted sum of a centered and an upwind discretization:
[u′]_i ≈ [θδ_x^− u + (1 − θ)δ_{2x}u]_i, 0 ≤ θ ≤ 1
[θδ_x^− u + (1 − θ)δ_{2x}u = ε δ_x δ_x u]_i
Is there an optimal θ?
Yes, for
θ(h/ε) = coth(h/(2ε)) − 2ε/h
we get exact u_i (i.e. u exact at the nodal points)
Equivalent artificial diffusion τ_o = (h/2)θ(h/ε)
Exact finite element method: W_i(x) = N_i(x) + τ_o N_i′(x) for the convective term u′
Convection-dominated flow – p. 295
Multi-dimensional problems
Model problem:
v_x ∂u/∂x + v_y ∂u/∂y = ∇²u
often written as
v · ∇u = ∇²u
Non-physical oscillations occur with centered differences or Galerkin methods when the left-hand side terms are large
Remedy: upwind differences
Downside: too much diffusion
Important result: extra stabilizing diffusion is needed only in the streamline direction (v_x, v_y)
Convection-dominated flow – p. 296
Streamline diffusion
Idea: add diffusion in the streamline direction
Isotropic diffusion:
d∑
i=1
d∑
j=1
kδij∂2u
∂xi∂xj= k∇2u
kδij is the diffusion tensor (same in all directions)
Streamline diffusion:
d∑
i=1
d∑
j=1
∂
∂xi
(kij
∂u
∂xj
), kij = τ
vivj
||v||2
Implementation: artificial diffusion term or perturbed weightingfunction
Convection-dominated flow – p. 297
Perturbed weighting functions (1)
Consider the weighting function
W_i = N_i + τ* v · ∇N_i
for the convective (left-hand side) term:
∫ W_i v · ∇u dΩ
This expands to
∫ N_i v · ∇u dΩ + ∫ τ* (v · ∇u)(v · ∇N_i) dΩ
The latter term can be viewed as the Galerkin formulation of (write v · ∇u = Σ_i v_i ∂u/∂x_i etc.)
Σ_{i=1}^d Σ_{j=1}^d ∂/∂x_i (τ* v_i v_j ∂u/∂x_j)
Convection-dominated flow – p. 298
Perturbed weighting functions (2)
⇒ Streamline diffusion can be obtained by perturbing the weighting function
Common name: SUPG (streamline-upwind/Petrov-Galerkin)
Convection-dominated flow – p. 299
Consistent SUPG
Why not just add artificial diffusion?
Why bother with perturbed weighting functions?
In standard FEM,
∫_Ω L(u)W_i dΩ = 0,
the exact solution is a solution of the FEM equations (it fulfills L(u) = 0)
This no longer holds if we
add an artificial diffusion term (∼ h/2)
use different weighting functions on different terms
Idea: use consistent SUPG
no artificial diffusion term
the same (perturbed) weighting function applies to all terms
Convection-dominated flow – p. 300
A step back to 1D
Let us try to use
W_i(x) = N_i(x) + τN_i′(x)
on both terms in u′ = εu′′:
∫₀¹ (N_i u′ + (ε + τ)N_i′u′) dx + τ ∫₀¹ N_i′′u′ dx = 0
Problem: the last term
Remedy: drop it (!)
Justification: N_i′′ = 0 on each linear element
Drop 2nd-order derivatives of N_i in 2D/3D too
Consistent SUPG is not so consistent...
Convection-dominated flow – p. 301
Choosing τ*
Choosing τ* is a research topic
Many suggestions
Two classes:
τ* ∼ h
τ* ∼ ∆t (time-dependent problems)
Little theory
Convection-dominated flow – p. 302
A test problem (1)
[Figure: unit-square test domain; inflow velocity v at angle θ; u = 1 on the inflow boundary below the line y = x tan θ + 0.25, u = 0 on the remaining inflow boundaries; du/dn = 0 on the outflow boundaries]
Convection-dominated flow – p. 303
A test problem (2)
Methods:
1. Classical SUPG: Brooks and Hughes: "A streamline upwind/Petrov-Galerkin finite element formulation for advection dominated flows with particular emphasis on the incompressible Navier-Stokes equations", Comp. Methods Appl. Mech. Engrg., 199-259, 1982.
2. An additional discontinuity-capturing term
W_i = N_i + τ* v · ∇N_i + τ (v · ∇u / ||∇u||²) ∇u
was proposed in Hughes, Mallet and Mizukami: "A new finite element formulation for computational fluid dynamics: II. Beyond SUPG", Comp. Methods Appl. Mech. Engrg., 341-355, 1986.
Convection-dominated flow – p. 304
Galerkin’s method
[Surface plot of the Galerkin solution: non-physical oscillations, minimum about −0.65]
Convection-dominated flow – p. 305
SUPG
[Surface plot of the SUPG solution; same axes as the Galerkin plot]
Convection-dominated flow – p. 306
Time-dependent problems
Model problem:
∂u/∂t + v · ∇u = ε∇²u
Can add an artificial streamline diffusion term
Can use the perturbed weighting function
W_i = N_i + τ* v · ∇N_i
on all terms
How to choose τ*?
Convection-dominated flow – p. 307
Taylor-Galerkin methods (1)
Idea: Lax-Wendroff + Galerkin
Model equation:
∂u/∂t + U ∂u/∂x = 0
Lax-Wendroff: 2nd-order Taylor series in time,
u^{ℓ+1} = u^ℓ + ∆t [∂u/∂t]^ℓ + (1/2)∆t² [∂²u/∂t²]^ℓ
Replace temporal by spatial derivatives,
∂/∂t = −U ∂/∂x
Result:
u^{ℓ+1} = u^ℓ − U∆t [∂u/∂x]^ℓ + (1/2)U²∆t² [∂²u/∂x²]^ℓ
Convection-dominated flow – p. 308
Taylor-Galerkin methods (2)
We can write the scheme on the form
[δ_t^+ u + U ∂u/∂x = (1/2)U²∆t ∂²u/∂x²]^ℓ
⇒ Forward scheme with artificial diffusion
Lax-Wendroff: centered spatial differences,
[δ_t^+ u + U δ_{2x}u = (1/2)U²∆t δ_x δ_x u]_i^ℓ
Alternative: Galerkin's method in space,
[δ_t^+ u + U δ_{2x}u = (1/2)U²∆t δ_x δ_x u]_i^ℓ
provided that we lump the mass matrix
This is the Taylor-Galerkin method
Convection-dominated flow – p. 309
Taylor-Galerkin methods (3)
In multi-dimensional problems,
∂u/∂t + v · ∇u = 0
we have
∂/∂t = −v · ∇
and (∇ · v = 0)
∂²/∂t² = ∇ · (v v · ∇) = Σ_{r=1}^d Σ_{s=1}^d ∂/∂x_r (v_r v_s ∂/∂x_s)
This is streamline diffusion with τ* = ∆t/2:
[δ_t^+ u + v · ∇u = (1/2)∆t ∇ · (v v · ∇u)]^ℓ
Convection-dominated flow – p. 310
Taylor-Galerkin methods (4)
Can use the Galerkin method in space (gives centered differences)
The result is close to that of SUPG, but τ* is different
⇒ The Taylor-Galerkin method points to τ* = ∆t/2 for SUPG in time-dependent problems
Convection-dominated flow – p. 311
Nonlinear PDEs
Nonlinear PDEs – p. 312
Examples
Some nonlinear model problems to be treated next:
−u′′(x) = f(u), u(0) = u_L, u(1) = u_R
−(λ(u)u′)′ = 0, u(0) = u_L, u(1) = u_R
−∇ · [λ(u)∇u] = g(x), with u or −λ ∂u/∂n prescribed as B.C.
Discretization methods:
standard finite difference methods
standard finite element methods
the group finite element method
We get nonlinear algebraic equations
Solution method: iterate over linear equations
Nonlinear PDEs – p. 313
Nonlinear discrete equations; FDM
Finite differences for −u′′ = f(u):
−(1/h²)(u_{i−1} − 2u_i + u_{i+1}) = f(u_i)
⇒ nonlinear system of algebraic equations
F(u) = 0, or Au = b(u), u = (u_1, …, u_n)^T
Finite differences for (λ(u)u′)′ = 0:
(1/(2h²))([λ(u_{i+1}) + λ(u_i)](u_{i+1} − u_i) − [λ(u_i) + λ(u_{i−1})](u_i − u_{i−1})) = 0
⇒ nonlinear system of algebraic equations
F(u) = 0 or A(u)u = b
Nonlinear PDEs – p. 314
Nonlinear discrete equations; FEM
Finite elements for −u′′ = f(u):
u ≈ û = Σ_{k=1}^n u_k N_k(x)
Galerkin approach:
∫₀¹ N_i′û′ dx = ∫₀¹ f(Σ_k N_k u_k) N_i dx
(assuming prescribed u(0) and u(1))
The left-hand side is easy to assemble:
−(1/h)(u_{i−1} − 2u_i + u_{i+1}) = ∫₀¹ f(Σ_k u_k N_k(x)) N_i dx
Nonlinear PDEs – p. 315
Nonlinearities in the FEM
Note that
f(Σ_k N_k(x)u_k)
is a complicated function of u_1, …, u_n
F.ex.: f(u) = u²:
∫₀¹ (Σ_k N_k u_k)² N_i dx
gives rise to a difference representation
(h/12)(u_{i−1}² + 2u_i(u_{i−1} + u_{i+1}) + 6u_i² + u_{i+1}²)
(compare with f(u_i) = u_i² in the FDM!)
Must use numerical integration in general
Nonlinear PDEs – p. 316
The group finite element method
The group finite element method:
f(u) = f(Σ_k u_k N_k(x)) ≈ Σ_{k=1}^n f(u_k)N_k
Resulting term:
∫₀¹ f(u)N_i dx = ∫₀¹ Σ_k N_iN_k f(u_k) dx gives
Σ_k (∫₀¹ N_kN_i dx) f(u_k), i.e., Mf(u)
which is a mass matrix-like term: (h/6)(f(u_{i−1}) + 4f(u_i) + f(u_{i+1}))
Trapezoidal integration gives an FDM-like term:
Σ_k (∫₀¹ N_kN_i dx) f(u_k) ≈ h f(u_i)
⇒ similar results as the FDM
Nonlinear PDEs – p. 317
FEM for a nonlinear coefficient
We now look at
(λ(u)u′)′ = 0, u(0) = uL, u(1) = uR
Using a finite element method (exercise 4.2) results in an integral
∫₀¹ λ(Σ_k u_k N_k) N_i′N_j′ dx
⇒ complicated!
Linear elements and the trapezoidal rule:
(1/2)(λ(u_i) + λ(u_{i+1}))(u_{i+1} − u_i) − (1/2)(λ(u_{i−1}) + λ(u_i))(u_i − u_{i−1}) = 0
⇒ FDM with the arithmetic mean for λ(u_{i+1/2})
Nonlinear PDEs – p. 318
Nonlinear algebraic equations
FEM/FDM for nonlinear PDEs gives nonlinear algebraic equations:
(λ(u)u′)′ = 0 ⇒ A(u)u = b
−u′′ = f(u) ⇒ Au = b(u)
In general a nonlinear PDE gives
F(u) = 0
or
F_1(u_1, …, u_n) = 0
⋮
F_n(u_1, …, u_n) = 0
Nonlinear PDEs – p. 319
Solving nonlinear algebraic eqs.
Have
A(u)u − b = 0, Au − b(u) = 0, F(u) = 0
Idea: solve the nonlinear problem as a sequence of linear subproblems
Must perform some kind of linearization
Iterative method: guess u⁰, solve linear problems for u¹, u², … and hope that
lim_{k→∞} u^k = u
i.e. the iteration converges
Nonlinear PDEs – p. 320
Successive substitutions (1)
Model problem: A(u)u = b
Simple iteration scheme:
A(u^k)u^{k+1} = b, k = 0, 1, …
Must provide a (good) guess u⁰
Termination:
||u^{k+1} − u^k|| ≤ ε_u
or using the residual (expensive, requires a new A(u^{k+1})!):
||b − A(u^{k+1})u^{k+1}|| ≤ ε_r
Relative criteria:
||u^{k+1} − u^k|| ≤ ε_u ||u^k||
or (more expensive)
||b − A(u^{k+1})u^{k+1}|| ≤ ε_r ||b − A(u⁰)u⁰||
Simple method, but sometimes slow convergence
Nonlinear PDEs – p. 321
Successive substitutions (2)
Model problem: Au = b(u)
Simple iteration scheme:
Au^{k+1} = b(u^k), k = 0, 1, …
Relaxation:
Au* = b(u^k), u^{k+1} = ωu* + (1 − ω)u^k
(may improve convergence, avoids too large steps)
Picard iteration is another name for this method
Nonlinear PDEs – p. 322
Newton’s method (1)
The Newton (Newton-Raphson) method for f(x) = 0, x ∈ IR
Given an approximation x^k
Approximate f by a linear function at x^k:
f(x) ≈ M(x; x^k) = f(x^k) + f′(x^k)(x − x^k)
Find a new x^{k+1} such that
M(x^{k+1}; x^k) = 0 ⇒ x^{k+1} = x^k − f(x^k)/f′(x^k)
Nonlinear PDEs – p. 323
Newton’s method (2)
Systems of nonlinear equations:
F(u) = 0, F(u) ≈ M(u; u^k)
Multi-dimensional Taylor-series expansion:
M(u; u^k) = F(u^k) + J(u − u^k), J ≡ ∇F
J_{i,j} = ∂F_i/∂u_j
Iteration no. k:
solve the linear system J(u^k)(δu)^{k+1} = −F(u^k)
update: u^{k+1} = u^k + (δu)^{k+1}
Can use relaxation: u^{k+1} = u^k + ω(δu)^{k+1}
Nonlinear PDEs – p. 324
The Jacobian matrix; FDM (1)
Model: u′′ = −f(u)
Scheme:
F_i ≡ (1/h²)(u_{i−1} − 2u_i + u_{i+1}) − f(u_i) = 0
Jacobian matrix term (FDM):
J_{i,j} = ∂F_i/∂u_j
F_i = 0 contains only u_i, u_{i±1}
⇒ the Jacobian is sparse (tridiagonal)
Nonlinear PDEs – p. 325
The Jacobian matrix; FDM (2)
F_i ≡ (1/h²)(u_{i−1} − 2u_i + u_{i+1}) − f(u_i) = 0
Derivation:
J_{i,i−1} = ∂F_i/∂u_{i−1} = 1/h²
J_{i,i+1} = ∂F_i/∂u_{i+1} = 1/h²
J_{i,i} = ∂F_i/∂u_i = −2/h² − f′(u_i)
Must form the Jacobian in each iteration and solve
Jδu^{k+1} = −F(u^k)
and then update
u^{k+1} = u^k + ωδu^{k+1}
Nonlinear PDEs – p. 326
The Jacobian matrix; FEM
−u′′ = f(u) + FEM gives F_i = 0, where
F_i ≡ ∫₀¹ [Σ_j N_i′N_j′ u_j − f(Σ_s u_s N_s) N_i] dx
The Jacobian:
J_{i,j} = ∂F_i/∂u_j
becomes
∫₀¹ [N_i′N_j′ − f′(Σ_s u_s N_s) N_j N_i] dx
In general, for an FE function u = Σ_s u_s N_s,
∂f(u)/∂u_j = f′(u) ∂u/∂u_j = f′(u) ∂(Σ_s u_s N_s)/∂u_j = f′(u)N_j
Nonlinear PDEs – p. 327
A 2D/3D transient nonlinear PDE (1)
PDE:
PDE:
ϱC ∂u/∂t = ∇ · [κ(u)∇u]
(f.ex. u = g on the boundary and u = I at t = 0)
FDM in time:
(u^ℓ − u^{ℓ−1})/∆t = ∇ · [λ(u^ℓ)∇u^ℓ]
with λ = κ/(ϱC)
FEM → nonlinear algebraic equations:
F_i(u_1^ℓ, …, u_n^ℓ) = 0, i = 1, …, n
where
F_i ≡ ∫_Ω [(u^ℓ − u^{ℓ−1})N_i + ∆t λ(u^ℓ)∇u^ℓ · ∇N_i] dΩ
Nonlinear PDEs – p. 328
A 2D/3D transient nonlinear PDE (2)
Successive substitution: use the "old" u^{ℓ,k} in the λ(u^ℓ) term, solve a linear problem for u^{ℓ,k+1}, k = 0, 1, …
Exercise: specify the element matrix and vector
Newton-Raphson's method: need J,
J_{i,j} = ∂F_i/∂u_j
Exercise: carry out the differentiation, specify the element matrix and vector
Nonlinear PDEs – p. 329
Iteration methods at the PDE level
Consider −u′′ = f(u)
Could introduce a successive substitution at the PDE level:
−(d²/dx²) u^{k+1} = f(u^k), k = 0, 1, …
⇒ linear problem for u^{k+1}
A PDE-level Newton-Raphson method can also be formulated (see the book for details)
We get identical results for our model problem
Time-dependent problems: first use finite differences in time, then use an iteration method (successive subst. or Newton-Raphson) at the time-discrete PDE level
Nonlinear PDEs – p. 330
Continuation methods
Challenging nonlinear PDE:
∇ · (||∇u||^q ∇u) = 0
For q = 0 this problem is simple
Idea: solve a sequence of problems, starting with q = 0, and increase q towards a target value
Sequence of PDEs:
∇ · (||∇u_r||^{q_r} ∇u_r) = 0, r = 0, 1, 2, …
with 0 = q_0 < q_1 < q_2 < ⋯ < q_m = q
The start guess for u_r is u_{r−1} (the solution of a "simpler" problem)
CFD: the Reynolds number is often the continuation parameter q
Nonlinear PDEs – p. 331
Exercises
Methods for nonlinear PDEs are best learned through exercises
Exercises 4.1, 4.2, 4.4, 4.5, 4.6, 4.7, 4.9, 4.10
Nonlinear PDEs – p. 332
Model problem for nonlinear PDEs
Let us make software for solving
∂u/∂t = ∇ · [λ(u)∇u]
Boundary conditions: u prescribed
Test solution: λ(u) = u, u = φ(x, t) = dt + Σ_j x_j at the boundary ⇒ u = φ everywhere
Nonlinear PDEs – p. 333
Discretization in time
Backward Euler scheme in time:
(u^ℓ − u^{ℓ−1})/∆t = ∇ · [λ(u^ℓ)∇u^ℓ]
⇒ recursive set of spatial problems for u^ℓ(x)
Nonlinear PDEs – p. 334
Discretization in space
FEM for the spatial problems:
u^ℓ(x) ≈ û^ℓ(x) = Σ_{j=1}^n u_j^ℓ N_j(x)
F_i(u_1^ℓ, …, u_n^ℓ) = 0, i = 1, …, n,
where
F_i ≡ ∫_Ω [(û^ℓ − û^{ℓ−1})W_i + ∆t λ(û^ℓ)∇û^ℓ · ∇W_i] dΩ.
⇒ Nonlinear system of algebraic equations for u = (u_1^ℓ, …, u_n^ℓ)
Nonlinear PDEs – p. 335
Solution of nonlinear systems
Idea: solve a nonlinear system as a sequence of linear systems
Approx. to u^ℓ in iteration k: u^{ℓ,k}(x)
Successive substitutions: use the "old" u^{ℓ,k} in nonlinear coefficients, λ(u^{ℓ,k}), and solve for u^{ℓ,k+1}
Newton's or Newton-Raphson's method: in iteration k, solve
1. Jδu^{k+1} = −F
2. u^{ℓ,k+1} = u^{ℓ,k} + δu^{k+1}
where
J_{i,j} = ∂F_i/∂u_j
and F_i are computed using the "old" values u^{ℓ,k}
(F = (F_1, …, F_n), J = [J_{i,j}])
Nonlinear PDEs – p. 336
The Jacobian
In our example:
J_{i,j} ≡ ∂F_i/∂u_j^ℓ = ∫_Ω [W_i N_j + ∆t (dλ/du)(u^{ℓ,k}) N_j ∇W_i · ∇u^{ℓ,k} + ∆t λ(u^{ℓ,k}) ∇W_i · ∇N_j] dΩ
Such expressions must be calculated by hand (or with symbolic math software)
Nonlinear PDEs – p. 337
What to implement
F_i: these are the same terms that appear in a corresponding linear PDE problem
In case of Newton-Raphson's method, we also need to implement J_{i,j} (containing terms that are not identical to those in the PDE)
⇒ Nonlinear PDE solver = linear PDE solver + an outer "nonlinear" loop
Nonlinear PDEs – p. 338
Implementation in Diffpack
The evaluation of the Jacobian and right-hand side at an integration point follows the same set-up as in linear problems
The management of a "nonlinear" loop is a new component
Need information about the type of nonlinear solver
Nonlinear PDEs – p. 339
Nonlinear systems in Diffpack
Nonlinear solvers are realized as subclasses of a class hierarchy NonLinEqSolver:
- SuccessiveSubst
- NewtonRaphson
Nonlinear solver algorithm:
// calling NonLinEqSolver's solve() leads to
iteration = 0;
while (!converged) {
  iteration++;
  // ask simulator to set up the linear (sub)system to be
  // solved in this iteration:
  solver->makeAndSolveLinearSystem(); // define the PDE!
  // perform updates according to the algorithm
}
This loop takes place in the Diffpack libraries
Nonlinear PDEs – p. 340
makeAndSolveLinearSystem
The purpose of makeAndSolveLinearSystem() is the same as for a linear problem:
makeSystem (assemble the system)
lineq's solve (solve the linear system)
Hence, the programmer has complete control of the linear system and its solution in each iteration
Note: makeSystem defines the linear system and hence the PDE (implicitly)
This set-up makes it easy to switch between iteration methods/strategies
Nonlinear PDEs – p. 341
Implementation
Derive the simulator from NonLinEqSolverUDC and FEM
Add three new data items:
Vec(real) nonlin_solution;
Handle(NonLinEqSolver_prm) nlsolver_prm;
Handle(NonLinEqSolver) nlsolver;
Initialize these objects in scan
Call the nonlinear solver:
nlsolver->solve();
In each iteration, the nonlinear solver jumps back to your
virtual void makeAndSolveLinearSystem()
// essentially
makeSystem (*dof, *lineq); // set up linear subsystem
lineq->solve();            // solve linear subsystem
Tip: learn the numerics well before starting with the implementation!
Nonlinear PDEs – p. 342
A real makeAndSolveLinearSystem
void NlHeat1:: makeAndSolveLinearSystem ()
{
  dof->vec2field (nonlin_solution, *u);  // u = most recent guess

  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON)
    // essential boundary conditions must be set to zero because
    // the unknown vector in the linear system is a correction
    // vector (assume that nonlin_solution has correct ess. bc.)
    dof->fillEssBC2zero();
  else
    // normal (default) treatment of essential boundary cond.
    dof->unfillEssBC2zero();

  makeSystem (*dof, *lineq);

  // init start vector for iterative linear solver:
  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON)
    // start for a correction vector (expected to be approx 0):
    linear_solution.fill (0.0);
  else
    // use the most recent nonlinear solution:
    linear_solution = nonlin_solution;

  lineq->solve();  // invoke a linear system solver
  // the solution of the linear system is now available
  // in the vector linear_solution
}
Nonlinear PDEs – p. 343
integrands (1)
void NlHeat1::integrands(ElmMatVec& elmat, const FiniteElement& fe)
{
  const real dt = tip->Delta();             // current time step
  const int nsd = fe.getNoSpaceDim();       // no of space dims
  const real u_pt = u->valueFEM (fe);       // interpolate u
  const real up_pt = u_prev->valueFEM (fe); // interpolate u_prev
  Ptv(real) gradu_pt (nsd);                 // grad u at present pt.
  u->derivativeFEM (gradu_pt, fe);          // compute gradu_pt
  Ptv(real) gradup_pt (nsd);                // grad u_prev --"--
  u_prev->derivativeFEM (gradup_pt, fe);    // compute gradup_pt

  const int nbf = fe.getNoBasisFunc();      // no of local nodes
  const real detJxW = fe.detJxW();
  real gradNi_gradNj, gradNi_gradu, h;
  int i, j, s;
Nonlinear PDEs – p. 344
integrands (2)
  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON) {
    for (i = 1; i <= nbf; i++) {
      gradNi_gradu = 0;
      for (s = 1; s <= nsd; s++)
        gradNi_gradu += fe.dN(i,s)*gradu_pt(s);
      for (j = 1; j <= nbf; j++) {
        gradNi_gradNj = 0;
        for (s = 1; s <= nsd; s++)
          gradNi_gradNj += fe.dN(i,s)*fe.dN(j,s);
        h = fe.N(i)*fe.N(j) + dt*( lambda(u_pt)*gradNi_gradNj
            + dlambda(u_pt)*fe.N(j)*gradNi_gradu );
        elmat.A(i,j) += h*detJxW;
      }
      h = fe.N(i)*(u_pt - up_pt) + dt*u_pt*gradNi_gradu;
      elmat.b(i) += -h*detJxW;
    }
  }
  else
    // error message ... not implemented ...
}
Nonlinear PDEs – p. 345
Lessons learned
Transient PDE solver = stationary PDE solver + a time loop and a couple of extra data items
Nonlinear PDE solver = linear PDE solver + a hidden nonlinear loop and three extra data items
Learn software tools for ∇²u = f well;
- they can be trivially reused for systems of transient nonlinear PDEs
- the Diffpack programming philosophy remains the same
The implementational steps from simple to advanced problems can be small
Be prepared for major numerical steps when moving to advanced problems
Diffpack does not simplify the numerics, just the implementation (⇒ the Diffpack book contains both numerics and software)
Nonlinear PDEs – p. 346
Elasticity
Elasticity – p. 347
Linear thermo-elasticity
Application area: structural analysis
[Figure: structure subject to a pressure load]
Purpose of simulation: compute deformation and internal forces(stress)
Elasticity – p. 348
The deformation and a stress measure
[Figure: deformed configuration with a color-coded stress measure, values ranging from about 0.02 to 1.43]
Elasticity – p. 349
Mathematical model (1)
Basic quantities:
u(x): displacement field (a vector at each point)
σ_ij: the stress tensor (3×3 matrix) (needed for evaluating stresses)
λ, µ: elasticity coefficients
T: temperature deviation
Basic equations:
Equilibrium:
∇ · σ = 0
Constitutive law for elasticity (Hooke's law):
σ = λ(∇ · u)I + µ(∇u + (∇u)^T) − α(3λ + 2µ)TI
Elasticity – p. 350
A look at the very basics
Consider elongation of a bar
[Figure: bar pulled with force F at both ends; cuts through the bar show the force distributed as F/2 + F/2 over the cross-sections]
The stress at the red circle depends on the surface orientation(note: stress is difficult to understand!)
Elasticity – p. 351
Mathematical model (2)
Combining the equations gives:
∇ [(λ+ µ)∇ · u] +∇ · [µ∇u] = ∇ [α(3λ+ 2µ)T ]
or with constant λ and µ:
(λ+ µ)∇(∇ · u) + µ∇2u = α(3λ+ 2µ)∇T
Primary unknown: u
Primary interest: σ
T is prescribed or found from a heat eq.
Solve for u, find σ from Hooke’s law
Note: d unknowns per node: u
Linear (elliptic) vector PDE for u
Elasticity – p. 352
Special versions of the model
Full 3D thermo-elasticity
2D plane strain elasticity: u_3 = 0, ∂/∂x_3 = 0
2D plane stress elasticity: set u_3 = 0, ∂/∂x_3 = 0 and modify λ
Elasticity – p. 353
Notation
Index notation to condense formulas
Rule 1: a_i is a vector, a_ij is a tensor
Rule 2: sum over repeated indices
a_i b_i ≡ Σ_{i=1}^d a_i b_i
Rule 3: a comma denotes differentiation
f,_i ≡ ∂f/∂x_i, a_{i,k} ≡ ∂a_i/∂x_k
These rules can be combined, e.g.,
σ_{ij,j} ≡ Σ_{j=1}^d ∂σ_ij/∂x_j
Elasticity – p. 354
The Kronecker delta
Kronecker delta: δ_ij = 0 if i ≠ j, δ_ij = 1 if i = j
With the summation convention: δ_ii = 1 + 1 + 1 = 3 (!)
Without the summation convention: δ_ii = 1
Rule 4: annihilate expressions with δ_ij
δ_ij v_j = v_i, v_j δ_ij = v_i, a_ij δ_ij = a_ii
(e.g., set i = 1: δ_1j v_j = 1 · v_1 + 0 · v_2 + 0 · v_3)
Elasticity – p. 355
Mathematical model with new notation
Basic quantities:
u_i(x_j): displacement field
σ_ij: the stress tensor
λ, µ: elasticity coefficients
T: temperature deviation
Basic equations:
Equilibrium:
σ_{ij,j} = 0
Constitutive law for elasticity (Hooke's law):
σ_ij = λu_{k,k}δ_ij + µ(u_{i,j} + u_{j,i}) − α(3λ + 2µ)Tδ_ij
Combined into an equation for u_i:
((λ + µ)u_{k,k}),_i + (µu_{i,j}),_j = (α(3λ + 2µ)T),_i
Elasticity – p. 356
The Poisson equation revisited (1)
−∇ · [λ∇u] = f
Written as a first-order system:
∇ · q = f
q = −λ∇u
Starting with this system, and eliminating q after having derived the weighted residual form and performed integration by parts, is the approach we shall use in elasticity (as it simplifies the mathematical details in the elasticity problem)
Elasticity – p. 357
The Poisson equation revisited (2)
Weighted residual form (Galerkin's method):
∫_Ω ∇ · q̂ N_i dΩ, q ≈ q̂ = Σ_j q_j N_j, u ≈ û = Σ_j u_j N_j
Integration by parts:
∫_Ω ∇ · q̂ N_i dΩ = −∫_Ω ∇N_i · q̂ dΩ + ∫_∂Ω N_i q̂ · n dΓ
Insert q̂ = −λ∇û, and obtain a standard FEM problem
Short notation with indices (q → q_i):
∇ · q = Σ_{k=1}^d ∂q_k/∂x_k = q_{k,k}, q_k = −λu,_k
Sum over repeated index; a comma denotes differentiation
Elasticity – p. 358
The Poisson equation revisited (3)
Integration by parts in the alternative notation:
∫_Ω q_{k,k}N_i dΩ = −∫_Ω N_{i,k}q_k dΩ + ∫_∂Ω N_i n_k q_k dΓ
q_k = −λu,_k ≈ −λ Σ_{j=1}^n u_j N_{j,k}
⇒ Linear system (as usual)
Σ_{j=1}^n (∫_Ω λ N_{i,k}N_{j,k} dΩ) u_j = ∫_Ω fN_i dΩ + ∫_∂Ω N_i λu,_k n_k dΓ
(note: N_{i,k}N_{j,k} = ∇N_i · ∇N_j, and λu,_k n_k = λ ∂u/∂n)
This new notation and the use of both q_k and u in the derivation make the numerical details of more complicated problems (e.g. elasticity) easier
Elasticity – p. 359
FEM in elasticity (1)
Equilibrium equation:
σ_{rs,s} = 0
Galerkin's method + integration by parts:
∫_Ω σ_rs N_{i,s} dΩ = ∫_∂Ω N_i σ_rs n_s dΓ (boundary term → b.c.)
d equations, each weighted by N_i
Replace σ_rs by u_r:
σ_rs = λu_{k,k}δ_rs + µ(u_{r,s} + u_{s,r}) − α(3λ + 2µ)Tδ_rs
d equations for each i (node)
Elasticity – p. 360
FEM in elasticity (2)
Expansion:
u_i ≈ û_i = Σ_{j=1}^n u_j^i N_j(x_1, …, x_d)
d unknowns at each node: u_j^1, …, u_j^d
Linear system: Kx = b
x = (u_1^1, …, u_1^d, u_2^1, …, u_2^d, …, u_n^1, …, u_n^d)^T
Matrices (vectors) consist of d × d (d) blocks
Elasticity – p. 361
The element equations
Element level equations can be written as
Σ_j Σ_s A_{i,j}^{rs} u_j^s = b_i^r, r = 1, …, d, i = 1, …, n
Equation number: (i, r) → d(i − 1) + r
Unknown number: (j, s) → d(j − 1) + s
Element matrix: dn × dn (n = no of nodes in the element)
For fixed i and j (node numbers), A_{i,j}^{rs} is a d × d matrix reflecting the coupling of nodes i and j
Elasticity – p. 362
Derivation of the element equations
Aim 1: insert σ_rs in ∫_Ω σ_rs N_{i,s} dΩ, where
σ_rs = λu_{k,k}δ_rs + µ(u_{r,s} + u_{s,r}) − α(3λ + 2µ)Tδ_rs
and
u_i ≈ û_i = Σ_{j=1}^n u_j^i N_j(x)
Aim 2: manipulate the expressions to identify the element matrix A_{i,j}^{rs} and vector b_i^r
Elasticity – p. 363
Derivation; cont.
Let's look at the first term in σ_rs:
σ_rs = λu_{k,k}δ_rs = λ Σ_j N_{j,k}u_j^k δ_rs in ∫_Ω σ_rs N_{i,s} dΩ
gives
Σ_j Σ_s ∫_Ω λ (Σ_k N_{j,k}u_j^k) N_{i,s}δ_rs dΩ
Elasticity – p. 364
Derivation; cont.
Try to rewrite
Σ_j Σ_s ∫_Ω λ (Σ_k N_{j,k}u_j^k) N_{i,s}δ_rs dΩ
on the form Σ_j Σ_s A_{i,j}^{rs} u_j^s
Step 1: δ_rs φ_s = φ_r for any vector φ
⇒ annihilate s: Σ_s N_{i,s}δ_rs = N_{i,r}
Step 2: exchange k with s (dummy summation index):
Σ_j Σ_s [∫_Ω λ N_{i,r}N_{j,s} dΩ] u_j^s, where the bracket is A_{i,j}^{rs}
Elasticity – p. 365
Derivation; cont.
Next term:
Σ_j Σ_s [∫_Ω µ N_{j,s}N_{i,s} dΩ] u_j^r
Step 1: u_j^r = δ_rk u_j^k (sum over k)
Step 2: exchange k and s
Result:
Σ_j Σ_s [∫_Ω (Σ_k µ N_{i,k}N_{j,k}) δ_rs dΩ] u_j^s
Easy to identify A_{i,j}^{rs}
The next terms are straightforward
Elasticity – p. 366
Result of derivation (1)
General formula for A_{i,j}^{rs}:
A_{i,j}^{rs} = ∫_Ω [µ (Σ_k N_{i,k}N_{j,k}) δ_rs + µN_{i,s}N_{j,r} + λN_{i,r}N_{j,s}] dΩ
This derivation: global level
Local level: replace Ω by the reference element and dΩ by det J dξ_1 ⋯ dξ_d
Elasticity – p. 367
Result of derivation (2)
Right-hand side:
b_i^r = ∫_Ω (2µ + 3λ)αT N_{i,r} dΩ + ∫_∂Ω N_i t_r dΓ
t_r = σ_rs n_s: the stress vector at the surface ∂Ω
Essential conditions: u_r given
Natural conditions: t_r given
Elasticity – p. 368
Implementation
Standard Poisson1/2 data
The DegFreeFE object is more central (d unknowns per node!)
FieldsFE u (vector field)
Vec(real) solution (solution of the linear system)
Shuffling u ↔ solution using DegFreeFE
Initialization:
u.rebind (new FieldsFE (*grid, "u"));
dof.rebind (new DegFreeFE (*grid, nsd)); // !!!
solution.redim (u->getNoValues());
lineq->attach (solution);
Elasticity – p. 369
Entries in the element matrix
[Figure: element matrix for a 4-node element, divided into 2×2 blocks; one block couples nodes 2 and 3, and a single entry couples local dof 2 in node 3 with local dof 1 in node 4]
Elasticity – p. 370
The heart of the integrands routine
4 loops:
// matrix:
for i = 1,...,nbf
  for j = 1,...,nbf
    for r = 1,...,d
      for s = 1,...,d
        add A_i,j^rs into
          elmat.A (d*(i-1)+r, d*(j-1)+s)
// right-hand side:
for i = 1,...,nbf
  for r = 1,...,d
    add b_i^r into
      elmat.b (d*(i-1)+r)
Elasticity – p. 371
The important variables in elasticity
Primary unknowns in the finite element method: the displacement field u_r
Primary interest:
the components of the stress tensor σ_rs
a norm of σ_rs, e.g.,
m ∝ √(σ′_ij σ′_ij)
where
σ′_rs ≡ σ_rs − (1/3)σ_{kk}δ_rs
Note: σ_rs ∼ ∂u_r/∂x_s
σ_rs is discontinuous across element boundaries
Smoothing might be necessary
Elasticity – p. 372
Computing derivatives (1)
u(x) is a finite element field
Define g = ∂u/∂x_1
Bilinear u ⇒ g ∼ a + bx_2
Linear u ⇒ g ∼ const
g is discontinuous across element boundaries
g has optimal accuracy at the reduced Gauss points (= the centroid in linear/bilinear elements)
Elasticity – p. 373
Computing derivatives (2)
g = ∂u/∂x_1 (e.g.)
Find a continuous ĝ = Σ_j g_j N_j(x) as an approximate solution of
ĝ = g
Galerkin or least squares:
Σ_j (∫_Ω N_i N_j dΩ) g_j = ∫_Ω gN_i dΩ
Integrate the rhs with a reduced Gauss rule
Lump the mass matrix ∫N_i N_j dΩ
Efficient solution of a diagonal system for g_j
⇒ ĝ is a continuous field
Elasticity – p. 374
Computing derivatives in Diffpack
Let m be a stress norm (discontinuous)
Representation of m: class FieldsFEatItgPt
FieldsFEatItgPt = nf point values of derivatives at each (possiblyreduced) integration point in each element (representing m only:nf = 1)
Class FieldsFEatItgPt has a functionvoid derivedQuantitiesAtItgPt(FEM& fesolver,GridFE& grid,int nfields,NumItgPoints pt_tp = GAUSS_POINTS,int relative_order = -1 // reduced Gauss pts);
that runs through all elements in the grid and their (reduced)integration points, and for each point, fesolver’s virtualderivedQuantitiesAtItgPt is called for defining thevalues of the nfields discontinuous fields at the current point
Elasticity – p. 375
Computing derivatives in Diffpack
In class Elasticity1:
class Elasticity1 : public FEM
{
  ...
  Handle(FieldsFEatItgPts) stress_measures; // m
  ...
};

void Elasticity1:: calcDerivedQuantities ()
{
  // stress_measures contains stresses (now only the norm m)
  stress_measures->derivedQuantitiesAtItgPt
    (*this, *grid, 1 /* 1 derived quantity */,
     GAUSS_POINTS, -1 /* reduced Gauss-Legendre points */);
  FEM::smoothFields (*smooth_stress_measures, *stress_measures);
}

void Elasticity1:: derivedQuantitiesAtItgPt
  (VecSimple(NUMT)& quantities, const FiniteElement& fe)
{
  // fill quantities(1) with the expression for m
}
Elasticity – p. 376
Plate with imperfection
Consider a plate with an elliptic hole:
[Figure: plate with an elliptic hole, pulled with tension σ at both ends]
Plane stress (thin plate)
Elasticity – p. 377
Boundary conditions
Equation: Navier, 2D, with λ replaced by
λ′ = 2λµ/(λ + 2µ)
Boundary with tension force: stress vector known (t = σi)
Inner boundary of the hole: no stress
Upper and lower boundaries: no stress
⇒ 2 conditions at each point on the boundary
(recall that σ_i3 = 0 by definition of plane stress)
(In a 3D formulation of the problem we would trivially get 3 conditions at each point)
Elasticity – p. 378
Symmetry
For numerical computations it is crucial to reduce the size of the domain as much as possible
Here: symmetry about two lines
[Figure: plate under tension σ with the two symmetry lines indicated]
Conditions at a symmetry line:
vanishing normal displacement: 1 condition
no shear stress: 1 (2D) or 2 (3D) conditions
Elasticity – p. 379
Numerical simulations
[Figure: equivalent stress in the deformed configuration of the quarter plate; values from about 0.37 to 7.91]
Elasticity – p. 380
Plate with crack
Let the ellipse collapse to a line
[Figure: plate with a line crack, pulled with tension σ at both ends]
Elasticity – p. 381
Boundary conditions
What has actually changed?
Nothing;
same boundary conditions (stress-free inner surface of the crack)
same symmetry properties
However: the extreme geometry will lead to infinite stresses at the crack tip
Elasticity – p. 382
Numerical simulations
[Figure: equivalent stress in the deformed configuration; values from about 0.45 to 5.37]
Elasticity – p. 383
Elastic beam with a crack
[Figure: elastic beam with a crack, clamped at one end and subject to a uniform pressure load; undeformed and deformed configurations]
Elasticity – p. 384
Mathematical model
Elasticity
Plane strain
No temperature effects
Elasticity – p. 385
Numerical simulation: stress
[Figure: equivalent stress in the deformed configuration of the cracked beam]
Elasticity – p. 386
Shallow water waves
Shallow water waves – p. 387
Tsunamis
Waves in fjords, lakes, or oceans, generated by
slides
earthquakes
subsea volcanoes
asteroids
Human activity, like nuclear detonations, or slides triggered by oil drilling, may also generate tsunamis
Propagation over large distances
Hardly recognizable in the open ocean, but wave amplitude increases near shore
Run-up at the coasts may result in severe damage
Giant events: Dec 26 2004 (≈ 300 000 killed), 1883 (similar to 2004), 65 My ago (extinction of the dinosaurs)
Shallow water waves – p. 388
Norwegian tsunamis
[Map of Norway and Sweden with Oslo, Stockholm, Bergen, Tromsø, Bodø, and Trondheim marked]
Circles: major incidents, > 10 killed; Triangles: selected smaller incidents; Square: Storegga (5000 B.C.)
Shallow water waves – p. 389
Tsunamis in the Pacific
Scenario: an earthquake outside Chile generates a tsunami that propagates at 800 km/h across the Pacific, with run-up on densely populated coasts in Japan; http://www.geophys.washington.edu/tsunami/
Shallow water waves – p. 390
Selected events; slides
location            year       run-up   dead
Loen                1905       40 m     61
Tafjord             1934       62 m     41
Loen                1936       74 m     73
Storegga            5000 B.C.  10 m(?)  ??
Vaiont, Italy       1963       270 m    2600
Lituya Bay, Alaska  1958       520 m    2
Shimabara, Japan    1792       10 m(?)  15 000
Shallow water waves – p. 391
Selected events; earthquakes etc.
location    year       strength  run-up   dead
Thera       1640 B.C.  volcano   ?        ?
Thera       1650       volcano   ?        ?
Lisboa      1755       M=9       15(?) m  ?000
Portugal    1969       M=7.9     1 m
Amorgos     1956       M=7.4     5(?) m   1
Krakatao    1883       volcano   40 m     36 000
Flores      1992       M=7.5     25 m     1 000
Nicaragua   1992       M=7.2     10 m     168
Sumatra     2004       M=9       50 m     300 000
The selection is biased wrt. European events; 150 catastrophic tsunami events have been recorded along the Japanese coast in modern times.
Tsunamis: the no. 5 killer among natural hazards
Shallow water waves – p. 392
Why simulation?
Increase the understanding of tsunamis
Assist warning systems
Assist building of harbor protection (break waters)
Recognize critical coastal areas (e.g. move population)
Hindcast historical tsunamis (assist geologists/biologists)
Shallow water waves – p. 393
Problem sketch
[Sketch: surface elevation η(x, y, t) above the stillwater level of depth H(x, y, t), with coordinates x, y, z]
Assume wavelength ≫ depth (long waves)
Assume small amplitudes relative to depth
Appropriate approx. for many ocean wave phenomena
Reference: HPL chapter 6.2
Shallow water waves – p. 394
Mathematical model
PDEs:

∂η/∂t = −∂(uH)/∂x − ∂(vH)/∂y (− ∂H/∂t)

∂u/∂t = −∂η/∂x, x ∈ Ω, t > 0

∂v/∂t = −∂η/∂y, x ∈ Ω, t > 0
η(x, y, t) : surface elevation
u(x, y, t) and v(x, y, t) : horizontal (depth averaged) velocities
H(x, y) : stillwater depth (given)
Boundary conditions: either η, u or v given at each point
Initial conditions: all of η, u and v given
Shallow water waves – p. 395
Primary unknowns
Discretization: finite differences
Staggered grid in time and space
⇒ η, u, and v unknown at different points:

η^ℓ_{i+1/2, j+1/2}, u^{ℓ+1/2}_{i, j+1/2}, v^{ℓ+1/2}_{i+1/2, j+1}

[Sketch of one grid cell: η^ℓ_{i+1/2, j+1/2} at the centre, u^{ℓ+1/2}_{i, j+1/2} and u^{ℓ+1/2}_{i+1, j+1/2} at the left and right faces, v^{ℓ+1/2}_{i+1/2, j} and v^{ℓ+1/2}_{i+1/2, j+1} at the bottom and top faces]
Shallow water waves – p. 396
A global staggered grid
Widely used grid in computational fluid dynamics (CFD)
Important for Navier-Stokes solvers
Basic idea: centered differences in time and space
Shallow water waves – p. 397
Discrete equations; η
∂η/∂t = −∂(uH)/∂x − ∂(vH)/∂y at (i + 1/2, j + 1/2, ℓ − 1/2):

(1/Δt)[η^ℓ_{i+1/2, j+1/2} − η^{ℓ−1}_{i+1/2, j+1/2}]
= −(1/Δx)[(Hu)^{ℓ−1/2}_{i+1, j+1/2} − (Hu)^{ℓ−1/2}_{i, j+1/2}]
− (1/Δy)[(Hv)^{ℓ−1/2}_{i+1/2, j+1} − (Hv)^{ℓ−1/2}_{i+1/2, j}]
Shallow water waves – p. 398
Discrete equations; u
∂u/∂t = −∂η/∂x at (i, j + 1/2, ℓ):

(1/Δt)[u^{ℓ+1/2}_{i, j+1/2} − u^{ℓ−1/2}_{i, j+1/2}] = −(1/Δx)[η^ℓ_{i+1/2, j+1/2} − η^ℓ_{i−1/2, j+1/2}]
Shallow water waves – p. 399
Discrete equations; v
∂v/∂t = −∂η/∂y at (i + 1/2, j, ℓ):

(1/Δt)[v^{ℓ+1/2}_{i+1/2, j} − v^{ℓ−1/2}_{i+1/2, j}] = −(1/Δy)[η^ℓ_{i+1/2, j+1/2} − η^ℓ_{i+1/2, j−1/2}]
Shallow water waves – p. 400
Complicated coastline boundary
Saw-tooth approximation to real boundary
Successful method, widely used
Warning: can lead to nonphysical waves
Shallow water waves – p. 401
Relation to the wave equation
Eliminate u and v (easy!)
∂²η/∂t² = ∇ · [H(x, y)∇η]
Eliminate discrete u and v
⇒ Standard 5-point explicit finite difference scheme for discrete η
Shallow water waves – p. 402
Stability and accuracy
Centered differences in time and space
⇒ truncation error: O(Δx², Δy², Δt²)
Stability as for the std. wave equation in 2D:
Δt ≤ H^{−1/2} [1/Δx² + 1/Δy²]^{−1/2}

(CFL condition)
If H is constant, an exact numerical solution is possible for one-dimensional wave propagation
Shallow water waves – p. 403
Verification of an implementation
How can we verify that the program works?
Compare with an analytical solution (if possible)
Check that basic physical mechanisms are reproduced in a qualitatively correct way by the program
Shallow water waves – p. 404
Tsunami due to a slide
Surface elevation ahead of the slide, dump behind
Initially, negative dump propagates backwards
The surface waves propagate faster than the slide moves
Shallow water waves – p. 405
Tsunami due to faulting
The sea surface deformation reflects the bottom deformation
Velocity of surface waves (H ∼ 5 km): 790 km/h
Velocity of seismic waves in the bottom: 6000–25000 km/h
Shallow water waves – p. 406
Tsunami approaching the shore
The velocity of a tsunami is √(gH(x, y, t)).
The back part of the wave moves at higher speed ⇒ the wave becomes more peak-formed
Deep water (H ∼ 3 km): wave length 40 km, height 1 m
Shallow water (H ∼ 10 m): wave length 2 km, height 4 m
Shallow water waves – p. 407
Tsunamis experienced from shore
As a fast tide, with strong currents in fjords
A wall of water approaching the beach
Wave breaking: the top has larger effective depth and moves faster than the front part (requires a nonlinear PDE)
Shallow water waves – p. 408
A penalty N-S solver
A penalty N-S solver – p. 409
Viscous fluid flow
[Figure: flow in a constricted channel]
Many processes in science and technology involve viscous fluid flow, and the numerical models then need solvers for the Navier-Stokes (N-S) equations:
ϱ(∂v/∂t + v · ∇v) = −∇p + µ∇²v + ϱb

∇ · v = 0
A penalty N-S solver – p. 410
Different ways of writing the N-S eqs.
With vector symbols:
%
(∂v
∂t+ v · ∇v
)= −∇p+ µ∇2v + %b
∇ · v = 0
With index notation:
%(vr,t + vsvr,s) = −p,r + µvr,ss + %br
vs,s = 0
The latter is preferred when deriving the details of a numerical method (as in the elasticity problem)
A penalty N-S solver – p. 411
Numerical methods
The condition ∇ · v = 0 and the term ∇p make the N-S equations hard to solve numerically
There are numerous approaches:
fully implicit
artificial compressibility
penalty functions
operator splitting
A penalty N-S solver – p. 412
Penalty methods
Firm basis in calculus of variations
Main result: p = −λ∇ · v, λ → ∞
⇒ Can eliminate the pressure!!!
Result: a kind of nonlinear transient elasticity problem
Very convenient from a numerical point of view
λ → ∞ gives some undesired numerical properties (ill-conditioned matrix systems)
Good educational example on using Diffpack
Reference: HPL chapter 6.3.
A penalty N-S solver – p. 413
Modified N-S equations
p = −λ∇ · v = −λ v_{s,s} eliminates p and the equation ∇ · v = v_{s,s} = 0

Result: ϱ(α v_{r,t} + v_s v_{r,s}) = λ v_{s,sr} + µ v_{r,ss} + ϱ b_r

≈ the equation of linear elasticity, modulo the acceleration terms on the left-hand side (which add transient and nonlinear effects)
Implementation in Diffpack: extend class Elasticity1 with a time loop and a nonlinear solver
⇒ Combine classes Poisson1, Heat1, NlHeat1, and Elasticity1
A penalty N-S solver – p. 414
Basic steps
1. Derive the weak form
2. Identify the integrands
3. Get control of the element degrees of freedom, i.e., how the formulas are stacked in the element matrix/vector
Strong similarity to the elasticity problem!
Then, use class NlHeat1 as a template for administering the solution process
A penalty N-S solver – p. 415
Discretization (1)
In time: θ-rule (θ = 1: backward Euler, θ = 0.5: Crank-Nicolson)
In space: isoparametric finite elements
v_r^ℓ(x, t) = Σ_{j=1}^{n} v_j^{r,ℓ} N_j(x)

Weak form: multiply by N_i, integrate 2nd order derivatives by parts

⇒ Nonlinear system at each time level:

F_i^r(v_1^1, ..., v_1^d, v_2^1, ..., v_2^d, ..., v_n^1, ..., v_n^d) = 0

for i = 1, ..., n (nodes), r = 1, ..., d (loc. dof.)
A penalty N-S solver – p. 416
Discretization (2)
Newton-Raphson method for F_i^r = 0

⇒ Sequence of linear systems

Σ_{j=1}^{n} Σ_{s=1}^{d} A_{i,j}^{rs} δv_j^s = −F_i^r

A_{i,j}^{rs} ≡ ∂F_i^r / ∂v_j^{s,ℓ}
A penalty N-S solver – p. 417
Entries in the element matrix
[Sketch: global matrix organized in 2×2 blocks; a block couples two nodes (e.g. nodes 2 and 3), and an entry within a block couples local degrees of freedom, e.g. local dof 2 in node 3 with local dof 1 in node 4]
A penalty N-S solver – p. 418
Selective reduced integration
The λ term must be integrated by a rule of one order lower than the rule used for the other terms (equivalent to using mixed interpolation).
A penalty N-S solver – p. 419
Computing the pressure
p = −λ v_{s,s} can be computed when the velocity is known

p ∼ derivatives of v_s ⇒ p becomes discontinuous

Might smooth p:

Σ_{j=1}^{n} M_{i,j} p_j = b_i

where

b_i = −λ ∫ v_{s,s} N_i dΩ, M_{i,j} = ∫ N_i N_j dΩ

normally with lumped M_{i,j}

⇒ Another finite element assembly process
Diffpack tool: class FieldsFEatItgPt or integrands functor
A penalty N-S solver – p. 420
The idea of integrand functors
A solver can only have one integrands function
What if it needs more than one?
if-else tests in integrands
external integrands functions as functors
Basic structure of an integrand functor:

class MyExtraIntegrand : public IntegrandCalc
{
  MySim* data;  // access to all solver data
public:
  MyExtraIntegrand (MySim* sim) : data(sim) {}

  // normal integrands function;
  // access physical parameters in the solver by data->
  virtual void integrands (ElmMatVec& em, const FiniteElement& fe);
};
Overloaded versions of FEM::makeSystem work with integrand functors (as an alternative to the integrands function in the solver)
A penalty N-S solver – p. 421
Implementation; pressure computation
Integrand functor for

b_i = −λ ∫ v_{s,s} N_i dΩ

M_{i,j} can be computed once and for all by makeMassMatrix

Class FEM has a function smoothField for solving

Σ_j M_{i,j} p_j = functor-defined right-hand side
// integrand functor:
class PressureIntg : public IntegrandCalc { ... };

void NsPenalty1:: calcDerivedQuantities ()
{
  PressureIntg penalty_integrand (this);
  FEM::smoothField (*p, penalty_integrand);
  // calls makeMassMatrix (if necessary), makeSystem,
  // and solves the diagonal system
}
A penalty N-S solver – p. 422
Flow in a constricted channel (1)
[Figure: constricted channel with a uniform inlet profile, an outlet, and solid walls; Re = 60]
A penalty N-S solver – p. 423
Flow in a constricted channel (2)
Re = 60

[Figure: contour plots of the flow in the constricted channel; contour levels from −0.00336 to 0.0195]
A penalty N-S solver – p. 424
A fast FE N-S solver
A fast FE N-S solver – p. 425
Splitting the N-S equations
ϱ(v_{r,t} + v_s v_{r,s}) = −p_{,r} + µ v_{r,ss} + ϱ b_r
v_{s,s} = 0

Difficulty: p_{,r} and v_{s,s} = 0
Idea: Split N-S into simpler equations
Common approach: split N-S into
an explicit convection-diffusion equation for v_r
an (implicit) Poisson equation for p
an explicit updating formula for v_r
A fast FE N-S solver – p. 426
A 2nd order algorithm (1)
ϱ(v_{r,t} + v_s v_{r,s}) = −p_{,r} + µ v_{r,ss} + ϱ b_r
v_{s,s} = 0

1. Calculation of an intermediate velocity field:

k_r^(1) = −Δt (v_s^ℓ v_{r,s}^ℓ − ν v_{r,ss}^ℓ)

v̄_r = v_r^ℓ + k_r^(1)

k_r^(2) = −Δt (v̄_s v̄_{r,s} − ν v̄_{r,ss})

v_r* = v_r^ℓ + (1/2)(k_r^(1) + k_r^(2))
A fast FE N-S solver – p. 427
A 2nd order algorithm (2)
2. Solution of a Poisson equation for the new pressure (arising from the incompressibility constraint v_{s,s}^{ℓ+1} = 0):

∇²p^{ℓ+1} = (ϱ/Δt) v*_{s,s}

3. Correction of the intermediate velocity field:

v_r^{ℓ+1} = v_r* − (p_{,r}^{ℓ+1} − ϱ b_r) Δt/ϱ

Same interpolation for v_r and p (no need for mixed finite elements)
A fast FE N-S solver – p. 428
Discrete equations
First step:

M k_r^(1) = −Δt a_r(v_1, ..., v_d) − ν Δt K v_r

for r = 1, ..., d.
K: ∇² operator
a_r: nonlinear convective term

Explicit updates à la

v̄_r = v_r^ℓ + k_r^(1), v_r* = v_r^ℓ + (1/2)(k_r^(1) + k_r^(2))

The pressure Poisson equation:

K p^{ℓ+1} = (ϱ/Δt) B_s v_s*

Correcting the velocity field (c_r contains body forces):

M v_r^{ℓ+1} = M v_r* − (B_r p^{ℓ+1} − ϱ c_r)
A fast FE N-S solver – p. 429
Implementation
Obvious: create a standard solver for the Poisson equation and the scalar explicit updates
Observation I: the original vector equations are split into d independent scalar equations (!)
Requires several weak forms ⇒ integrand functors
Observation II: M , K, and Bs are independent of time
⇒ Can speed up the code by precomputing M, K, and B_s, and generate the Poisson equation from matrix-vector products only (cf. class Wave0)
Problem: the nonlinear term ar
Solution: precompute as much as possible, multiply by v at theelement level and assemble
A fast FE N-S solver – p. 430
Solving linear systems
Solving linear systems – p. 431
The importance of linear system solvers
PDE problems often (usually) result in linear systems of algebraic equations
Ax = b
Special methods utilizing that A is sparse are much faster than Gaussian elimination!
Most of the CPU time in a PDE solver is often spent on solving Ax = b
⇒ Important to use fast methods
Solving linear systems – p. 432
Example: Poisson eq. on the unit cube (1)
−∇²u = f on an n = q × q × q grid
FDM/FEM result in Ax = b system
FDM: 7 entries per row in A are nonzero
FEM: 7 (tetrahedra), 27 (trilinear elements), or 125 (triquadratic elements) entries per row in A are nonzero
A is sparse (mostly zeroes)
Fraction of nonzeroes: Rq^{−3}
(R is the number of nonzero entries per row)
Important to work with nonzeroes only!
Solving linear systems – p. 433
Example: Poisson eq. on the unit cube (2)
Compare Banded Gaussian elimination (BGE) versus Conjugate Gradients (CG)
Work in BGE: O(q⁷) = O(n^{2.33})
Work in CG: O(q³) = O(n) (multigrid; optimal); for the numbers below we use incomplete factorization preconditioning: O(n^{1.17})
n = 27000:
CG 72 times faster than BGE
BGE needs 20 times more memory than CG
n = 8 million:
CG 10⁷ times faster than BGE
BGE needs 4871 times more memory than CG
Solving linear systems – p. 434
Classical iterative methods
Ax = b, A ∈ ℝ^{n×n}, x, b ∈ ℝ^n
Split A: A = M −N
Write Ax = b as

Mx = Nx + b,
and introduce an iteration
Mx^k = Nx^{k−1} + b, k = 1, 2, ...
Systems My = z should be easy/cheap to solve
Different choices of M correspond to different classical iteration methods:
Jacobi iteration
Gauss-Seidel iteration
Successive Over-Relaxation (SOR)
Symmetric Successive Over-Relaxation (SSOR)
Solving linear systems – p. 435
Convergence
Mx^k = Nx^{k−1} + b, k = 1, 2, ...

The iteration converges if G = M⁻¹N has spectral radius ϱ(G) less than 1

Rate of convergence: R∞(G) = −ln ϱ(G)

To reduce the initial error by a factor ε,

||x − x^k|| ≤ ε ||x − x^0||

one needs

−ln ε / R∞(G)

iterations
Solving linear systems – p. 436
Some classical iterative methods
Split: A = L + D + U
L and U are lower and upper triangular parts, D is A’s diagonal
Jacobi iteration: M = D (N = −L− U )
Gauss-Seidel iteration: M = L + D (N = −U )
SOR iteration: Gauss-Seidel + relaxation
SSOR: two (forward and backward) SOR steps
Rate of convergence R∞(G) for −∇²u = f in 2D with u = 0 as BC:
Jacobi: π²h²/2
Gauss-Seidel: π²h²
SOR: 2πh
SSOR: > πh
SOR/SSOR is superior (h vs. h², and h → 0 is small)
Solving linear systems – p. 437
Jacobi iteration
M = D
Put everything, except the diagonal, on the rhs
2D Poisson equation −∇²u = f:

u_{i,j−1} + u_{i−1,j} + u_{i+1,j} + u_{i,j+1} − 4u_{i,j} = −h² f_{i,j}

Solve for the diagonal element and use old values on the rhs:

u_{i,j}^k = (1/4)(u_{i,j−1}^{k−1} + u_{i−1,j}^{k−1} + u_{i+1,j}^{k−1} + u_{i,j+1}^{k−1} + h² f_{i,j})

for k = 1, 2, ...
Solving linear systems – p. 438
Relaxed Jacobi iteration
Idea: compute a new approximation x* from

Dx* = (−L − U)x^{k−1} + b

Set

x^k = ωx* + (1 − ω)x^{k−1}

x^k is a weighted mean of x^{k−1} and x* if ω ∈ (0, 1)
Solving linear systems – p. 439
Relation to explicit time stepping
Relaxed Jacobi iteration for −∇²u = f is equivalent to solving

α ∂u/∂t = ∇²u + f

by an explicit forward scheme until ∂u/∂t ≈ 0, provided ω = 4Δt/(αh²)

Stability for the forward scheme implies ω ≤ 1

In this example: ω = 1 best (⇔ largest Δt)

The forward scheme for t → ∞ is a slow scheme, hence Jacobi iteration is slow
Solving linear systems – p. 440
Gauss-Seidel/SOR iteration
M = L + D
For our 2D Poisson eq. scheme:
u_{i,j}^k = (1/4)(u_{i,j−1}^k + u_{i−1,j}^k + u_{i+1,j}^{k−1} + u_{i,j+1}^{k−1} + h² f_{i,j})

i.e., solve for the diagonal term and use the most recently computed values on the right-hand side

SOR is relaxed Gauss-Seidel iteration:
compute x* from a Gauss-Seidel iteration
set x^k = ωx* + (1 − ω)x^{k−1}
ω ∈ (0, 2), with ω = 2 − O(h) as the optimal choice
Very easy to implement!
Solving linear systems – p. 441
Symmetric/double SOR: SSOR
SSOR = Symmetric SOR
One (forward) SOR sweep for unknowns 1, 2, 3, . . . , n
One (backward) SOR sweep for unknowns n, n− 1, n− 2, . . . , 1
M can be shown to be

M = 1/(2 − ω) ((1/ω)D + L) ((1/ω)D)⁻¹ ((1/ω)D + U)

Notice that each factor in M is diagonal or lower/upper triangular (⇒ very easy to solve systems My = z)
Solving linear systems – p. 442
Status: classical iterative methods
Jacobi, Gauss-Seidel/SOR, SSOR are too slow for practical PDE computations
The simplest possible solution method for −∇²u = f and other stationary PDEs in 2D/3D is to use SOR
Classical iterative methods converge quickly in the beginning but slow down after a few iterations
Classical iterative methods are important ingredients in multigrid methods
Solving linear systems – p. 443
Conjugate Gradient-like methods
Ax = b, A ∈ ℝ^{n×n}, x, b ∈ ℝ^n
Use a Galerkin or least-squares method to solve a linear system (!)
Idea: write

x^k = x^{k−1} + Σ_{j=1}^{k} α_j q_j

α_j: unknown coefficients, q_j: known vectors

Compute the residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

and apply the ideas of the Galerkin or least-squares methods
Solving linear systems – p. 444
Galerkin
Residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

Galerkin's method (r ∼ R, q_j ∼ N_j, α_j ∼ u_j):

(r^k, q_i) = 0, i = 1, ..., k

(·, ·): Euclidean inner product

Result: linear system for α_j,

Σ_{j=1}^{k} (A q_j, q_i) α_j = (r^{k−1}, q_i), i = 1, ..., k
Solving linear systems – p. 445
Least squares
Residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

Least squares: minimize (r^k, r^k):

∂/∂α_i (r^k, r^k) = 0

Result: linear system for α_j:

Σ_{j=1}^{k} (A q_i, A q_j) α_j = (r^{k−1}, A q_i), i = 1, ..., k
Solving linear systems – p. 446
The nature of the methods
Start with a guess x0
In iteration k: seek xk in a k-dimensional vector space Vk
Basis for the space: q1, . . . , qk
Use Galerkin or least squares to compute the (optimal) approximation x^k in V_k
Extend the basis from Vk to Vk+1 (i.e. find qk+1)
Solving linear systems – p. 447
Extending the basis
V_k is normally selected as a so-called Krylov subspace:

V_k = span{r^0, Ar^0, ..., A^{k−1}r^0}

Alternatives for computing q_{k+1} ∈ V_{k+1}:

q_{k+1} = r^k + Σ_{j=1}^{k} β_j q_j

q_{k+1} = Ar^k + Σ_{j=1}^{k} β_j q_j
How to choose βj?
Solving linear systems – p. 448
Orthogonality properties
Bad news: must solve a k × k linear system for α_j in each iteration (as k → n, the work in each iteration approaches the work of solving Ax = b!)
The coefficient matrix in the αj system:
(Aqi, qj), (Aqi,Aqj)
Idea: make the coefficient matrices diagonal
That is,
Galerkin: (A q_i, q_j) = 0 for i ≠ j
Least squares: (A q_i, A q_j) = 0 for i ≠ j
Use βj to enforce orthogonality of qi
Solving linear systems – p. 449
Formula for updating the basis vectors
Define

⟨u, v⟩ ≡ (Au, v) = uᵀAv

and

[u, v] ≡ (Au, Av) = uᵀAᵀAv

Galerkin: require A-orthogonal q_j vectors, which then results in

β_i = −⟨r^k, q_i⟩ / ⟨q_i, q_i⟩

Least squares: require AᵀA-orthogonal q_j vectors, which then results in

β_i = −[r^k, q_i] / [q_i, q_i]
Solving linear systems – p. 450
Simplifications
Galerkin: ⟨q_i, q_j⟩ = 0 for i ≠ j gives

α_k = (r^{k−1}, q_k) / ⟨q_k, q_k⟩

and α_i = 0 for i < k (!):

x^k = x^{k−1} + α_k q_k

That is, hand-derived formulas for α_j

Least squares:

α_k = (r^{k−1}, A q_k) / [q_k, q_k]

and α_i = 0 for i < k
Solving linear systems – p. 451
Symmetric A
If A is symmetric (Aᵀ = A) and positive definite (positive eigenvalues ⇔ yᵀAy > 0 for any y ≠ 0), also β_i = 0 for i < k ⇒ need to store q_k only (q_1, ..., q_{k−1} are not used in iteration k)
Solving linear systems – p. 452
Summary: least squares algorithm
given a start vector x^0,
compute r^0 = b − Ax^0 and set q_1 = r^0.
for k = 1, 2, ... until termination criteria are fulfilled:
    α_k = (r^{k−1}, A q_k) / [q_k, q_k]
    x^k = x^{k−1} + α_k q_k
    r^k = r^{k−1} − α_k A q_k
    if A is symmetric then
        β_k = [r^k, q_k] / [q_k, q_k]
        q_{k+1} = r^k − β_k q_k
    else
        β_j = [r^k, q_j] / [q_j, q_j], j = 1, ..., k
        q_{k+1} = r^k − Σ_{j=1}^{k} β_j q_j
The Galerkin-version requires A to be symmetric and positivedefinite and results in the famous Conjugate Gradient method
Solving linear systems – p. 453
Truncation and restart
Problem: need to store q1, . . . , qk
Much storage and computations when k becomes large
Truncation: work with a truncated sum for x^k,

x^k = x^{k−1} + Σ_{j=k−K+1}^{k} α_j q_j

where a possible choice is K = 5
Small K might give convergence problems
Restart: restart the algorithm after K iterations (alternative to truncation)
Solving linear systems – p. 454
Family of methods
Generalized Conjugate Residual method = least squares + restart
Orthomin method = least squares + truncation
Conjugate Gradient method = Galerkin + symmetric and positive definite A
Conjugate Residuals method = least squares + symmetric and positive definite A
Many other related methods: BiCGStab, Conjugate Gradients Squared (CGS), Generalized Minimum Residuals (GMRES), Minimum Residuals (MinRes), SYMMLQ
Common name: Conjugate Gradient-like methods
All of these are easily called in Diffpack
Solving linear systems – p. 455
Convergence
Conjugate Gradient-like methods converge slowly (but usually faster than SOR/SSOR)
To reduce the initial error by a factor ε,

(1/2) ln(2/ε) √κ

iterations are needed, where κ is the condition number:

κ = (largest eigenvalue of A) / (smallest eigenvalue of A)

κ = O(h⁻²) when solving 2nd-order PDEs (incl. elasticity and the Poisson eq.)
Solving linear systems – p. 456
Preconditioning
Idea: introduce an equivalent system

M⁻¹Ax = M⁻¹b

solve it with a Conjugate Gradient-like method, and construct M such that
1. κ = O(1) ⇒ M ≈ A (i.e. fast convergence)
2. M is cheap to compute
3. M is sparse (little storage)
4. systems My = z (occurring in the algorithm due to M⁻¹Av-like products) are efficiently solved (O(n) op.)
Contradictory requirements!
The preconditioning business: find a good balance between 1-4
Solving linear systems – p. 457
Classical methods as preconditioners
Idea: “solve” My = z by one iteration with a classical iterative method (Jacobi, SOR, SSOR)
Jacobi preconditioning: M = D (diagonal of A)
No extra storage as M is stored in A
No extra computations as M is a part of A
Efficient solution of My = z
But: M is probably not a good approx to A
⇒ poor quality of this type of preconditioners?
Conjugate Gradient method + SSOR preconditioner is widely used
Solving linear systems – p. 458
M as a factorization of A
Idea: Let M be an LU-factorization of A, i.e.,

M = LU

where L and U are lower and upper triangular matrices, respectively.

Implications:
1. M = A (κ = 1): very efficient preconditioner!
2. M is not cheap to compute (requires Gaussian elimination on A!)
3. M is not sparse (L and U are dense!)
4. systems My = z are not efficiently solved (an O(n²) process when L and U are dense)
Solving linear systems – p. 459
M as an incomplete factorization of A
New idea: compute sparse L and U
How? compute only with nonzeroes in A
⇒ Incomplete factorization, M = L̂Û ≠ LU
M is not a perfect approx to A
M is cheap to compute and store (O(n) complexity)
My = z is efficiently solved (O(n) complexity)
This method works well, much better than SOR/SSOR preconditioning
Solving linear systems – p. 460
How to compute M
Run through a standard Gaussian elimination, which factors A as A = LU
Normally, L and U have nonzeroes where A has zeroes
Idea: let L and U be as sparse as A
Compute only with the nonzeroes of A
Such a preconditioner is called Incomplete LU Factorization, ILU
Option: add contributions outside A's sparsity pattern to the diagonal, multiplied by ω
Relaxed Incomplete Factorization (RILU): ω < 1
Modified Incomplete Factorization (MILU): ω = 1
See algorithm C.3 in the book
Solving linear systems – p. 461
Numerical experiments
Two test cases:
−∇²u = f on the unit cube and FDM
−∇²u = f on the unit cube and FEM
Diffpack makes it easy to run through a series of numerical experiments, using multiple loops, e.g.,
sub LinEqSolver_prm
set basic method = ConjGrad & MinRes
ok
sub Precond_prm
set preconditioning type = PrecRILU
set RILU relaxation parameter = 0.0 & 0.4 & 0.7 & 1.0
ok
Solving linear systems – p. 462
Test case 1: 3D FDM Poisson eq.
Equation: −∇²u = 1
Boundary condition: u = 0
7-pt star standard finite difference scheme
Grid size: 20 × 20 × 20 = 8000 points and 30 × 30 × 30 = 27000 points
Source code: $NOR/doc/Book/src/linalg/LinSys4/
All details in HPL Appendix D
Input files: $NOR/doc/Book/src/linalg/LinSys4/experiments
Solver’s CPU time written to standard output
Solving linear systems – p. 463
Jacobi vs. SOR vs. SSOR
n = 20³ = 8000 and n = 30³ = 27000
Jacobi: not converged in 1000 iterations
SOR(ω = 1.8): 2.0s and 9.2s
SSOR(ω = 1.8): 1.8s and 9.8s
Gauss-Seidel: 13.2s and 97s
SOR's sensitivity to the relaxation parameter ω:
1.0: 96s, 1.6: 23s, 1.7: 16s, 1.8: 9s, 1.9: 11s
SSOR's sensitivity to the relaxation parameter ω:
1.0: 66s, 1.6: 17s, 1.7: 13s, 1.8: 9s, 1.9: 11s
⇒ relaxation is important; great sensitivity to ω
Solving linear systems – p. 464
Conjugate Residuals or Gradients?
Compare Conjugate Residuals with Conjugate Gradients
Or: least squares vs. Galerkin
Diffpack names: MinRes and ConjGrad
MinRes: not converged in 1000 iterations
ConjGrad: 0.7s and 3.9s
⇒ ConjGrad is clearly faster than the best SOR/SSOR
Add ILU preconditioner
MinRes: 0.7s and 4s
ConjGrad: 0.6s and 2.7s
The importance of preconditioning grows as n grows
Solving linear systems – p. 465
Different preconditioners
ILU, Jacobi, SSOR preconditioners (ω = 1.2)
MinRes: Jacobi: not conv., SSOR: 11.4s, ILU: 4s
ConjGrad: Jacobi: 4.8s, SSOR: 2.8s, ILU: 2.7s
Sensitivity to the relaxation parameter in SSOR, with ConjGrad as solver:
1.0: 3.3s, 1.6: 2.1s, 1.8: 2.1s, 1.9: 2.6s
Sensitivity to the relaxation parameter in RILU, with ConjGrad as solver:
0.0: 2.7s, 0.6: 2.4s, 0.8: 2.2s, 0.9: 1.9s, 0.95: 1.9s, 1.0: 2.7s
⇒ ω slightly less than 1 is optimal; RILU and SSOR are equally fast (here)
Solving linear systems – p. 466
Test case 2: 3D FEM Poisson eq.
Equation: −∇²u = A₁π² sin πx + 4A₂π² sin 2πy + 9A₃π² sin 3πz
Boundary condition: u known
ElmB8n3D and ElmB27n3D elements
Grid size: 21× 21× 21 = 9261 nodes and 31× 31× 31 = 29791 nodes
Source code: $NOR/doc/Book/src/fem/Poisson2
All details in HPL Chapter 3.2 and 3.5
Input files: $NOR/doc/Book/src/fem/Poisson2/linsol-experiments
Solver’s CPU time available in casename-summary.txt
Solving linear systems – p. 467
Jacobi vs. SOR vs. SSOR
n = 9261 and n = 31³ = 29791, trilinear and triquadratic elements
Jacobi: not converged in 1000 iterations
SOR(ω = 1.8): 9.1s and 81s, 42s and 338s
SSOR(ω = 1.8): 47s and 248s, 138s and 755s
Gauss-Seidel: not converged in 1000 iterations
SOR's sensitivity to the relaxation parameter ω:
1.0: not conv., 1.6: 200s, 1.8: 83s, 1.9: 57s (n = 29791 and trilinear elements)
SSOR's sensitivity to the relaxation parameter ω:
1.0: not conv., 1.6: 212s, 1.7: 207s, 1.8: 245s, 1.9: 435s (n = 29791 and trilinear elements)
⇒ relaxation is important; great sensitivity to ω
Solving linear systems – p. 468
Conjugate Residuals or Gradients?
Compare Conjugate Residuals with Conjugate Gradients
Or: least squares vs. Galerkin
Diffpack names: MinRes and ConjGrad
MinRes: not converged in 1000 iterations
9261 vs 29791 unknowns, trilinear elements
ConjGrad: 5s and 22s
⇒ ConjGrad is clearly faster than the best SOR/SSOR!
Add ILU preconditioner
MinRes: 5s and 28s
ConjGrad: 4s and 16s
ILU preconditioning has a greater impact when using triquadratic elements (and when n grows)
Solving linear systems – p. 469
Different preconditioners
ILU, Jacobi, SSOR preconditioners (ω = 1.2)
MinRes: Jacobi: 68s, SSOR: 57s, ILU: 28s
ConjGrad: Jacobi: 19s, SSOR: 14s, ILU: 16s
Sensitivity to the relaxation parameter in SSOR, with ConjGrad as solver:
1.0: 17s, 1.6: 12s, 1.8: 13s, 1.9: 18s
Sensitivity to the relaxation parameter in RILU, with ConjGrad as solver:
0.0: 16s, 0.6: 15s, 0.8: 13s, 0.9: 12s, 0.95: 11s, 1.0: 16s
⇒ ω slightly less than 1 is optimal; RILU and SSOR are equally fast (here)
Solving linear systems – p. 470
More experiments
Convection-diffusion equations: $NOR/doc/Book/src/app/Cd/Verify
Files: linsol_a.i etc. as for LinSys4 and Poisson2
Elasticity equations: $NOR/doc/Book/src/app/Elasticity1/Verify
Files: linsol_a.i etc. as for the others
Run experiments and learn!
Solving linear systems – p. 471
Multigrid methods
Multigrid methods are the most efficient methods for solving linearsystems
Multigrid methods have optimal complexity O(n)
Multigrid can be used as stand-alone solver or preconditioner
Multigrid applies a hierarchy of grids
Multigrid is not as robust as Conjugate Gradient-like methods with incomplete factorization as preconditioner, but faster when it works
Multigrid is complicated to implement
Diffpack has a multigrid toolbox that simplifies the use of multigrid dramatically
Solving linear systems – p. 472
The rough ideas of multigrid
Observation: e.g. Gauss-Seidel methods are very efficient during the first iterations
High-frequency errors are efficiently damped by Gauss-Seidel
Low-frequency errors are slowly reduced by Gauss-Seidel
Idea: jump to a coarser grid such that low-frequency errors get higher frequency
Repeat the procedure
On the coarsest grid: solve the system exactly
Transfer the solution to the finest grid
Iterate over this procedure
Solving linear systems – p. 473
Damping in Gauss-Seidel’s method (1)
Model problem: −u′′ = f by finite differences:
−u_{j−1} + 2u_j − u_{j+1} = h² f_j

solved by Gauss-Seidel iteration:

2u_j^ℓ = u_{j−1}^ℓ + u_{j+1}^{ℓ−1} + h² f_j

Study the error e_j^ℓ = u_j^ℓ − u_j^∞:

2e_j^ℓ = e_{j−1}^ℓ + e_{j+1}^{ℓ−1}
This is like a time-dependent problem, where the iteration index ` is apseudo time
Solving linear systems – p. 474
Damping in Gauss-Seidel’s method (2)
Can find e`j with techniques from Appendix A.4:
e_j^ℓ = Σ_k A_k exp(i(kjh − ωℓΔt))

or (easier to work with here):

e_j^ℓ = Σ_k A_k ξ^ℓ exp(ikjh), ξ = exp(−iωΔt)

Inserting a wave component in the scheme:

ξ = exp(−iωΔt) = exp(ikh) / (2 − exp(−ikh)), |ξ| = 1/√(5 − 4 cos kh)
Interpretation of |ξ|: reduction in the error per iteration
Solving linear systems – p. 475
Gauss-Seidel’s damping factor
|ξ| = 1/√(5 − 4 cos p), p = kh ∈ [0, π]

[Plot: |ξ| as a function of p, decreasing from 1 at p = 0 to 1/3 at p = π]
Small p = kh ∼ h/λ: low frequency (relative to the grid) and small damping
Large (→ π) p = kh ∼ h/λ: high frequency (relative to the grid) and efficient damping
Solving linear systems – p. 476
More than one grid
From the previous analysis: error components with high frequency are quickly damped
Jump to a coarser grid, e.g. h′ = 2h
p is increased by a factor of 2, i.e., waves that are not so high-frequency on the h grid are efficiently damped by Gauss-Seidel on the h′ grid
Repeat the procedure
On the coarsest grid: solve by Gaussian elimination
Interpolate the solution to a finer grid, perform Gauss-Seidel iterations, and repeat until the finest grid is reached
Solving linear systems – p. 477
Transferring the solution between grids
From fine to coarser: restriction
[Sketch: a fine grid function on level q (nodes 1–9) transferred to level q−1 (nodes 1–5) by simple or weighted restriction]
From coarse to finer: prolongation
[Sketch: a coarse grid function on level q−1 (nodes 1–5) interpolated to a fine grid function on level q (nodes 1–9)]
Solving linear systems – p. 478
Smoothers
The Gauss-Seidel method is called a smoother when used to damp high-frequency error components in multigrid
Other smoothers: Jacobi, SOR, SSOR, incomplete factorization
The number of iterations is called the number of smoothing sweeps
Common choice: one sweep
Solving linear systems – p. 479
A multigrid algorithm
Start with the finest grid
Perform smoothing (pre-smoothing)
Restrict to coarser grid
Repeat the procedure (recursive algorithm!)
On the coarsest grid: solve accurately
Prolongate to finer grid
Perform smoothing (post-smoothing)
One cycle is finished when reaching the finest grid again
Can repeat the cycle
Multigrid solves the system in O(n) operations
Check out HPL C.4.2 for details!!
Solving linear systems – p. 480
V- and W-cycles
Different strategies for constructing cycles:
[Sketch: V-cycle (γ = 1) and W-cycle (γ = 2) traversals of grid levels 1–4, with smoothing on each level and a coarse grid solve on the lowest level]
Solving linear systems – p. 481
Multigrid requires flexible software
Many ingredients in multigrid:
pre- and post-smoother
no of smoothing sweeps
solver on the coarsest level
cycle strategy
restriction and prolongation methods
how to construct the various grids?
There are also other variants of multigrid (e.g. for nonlinear problems)
The optimal combination of ingredients is only known for simple model problems (e.g. the Poisson eq.)
In general: numerical experimentation is required! (Diffpack has a special multigrid toolbox for this)
Solving linear systems – p. 482
Coupling simulators
Coupling simulators – p. 483
System of PDEs; coupling simulators
System of 2 PDEs: one momentum equation + one energy equation
Develop independent solvers for each PDE
Combine solvers in a few lines
Coupling simulators – p. 484
Physical problem
[Figure: pipe cross-section Ω; flow in the z direction]
Non-Newtonian fluid
Temperature-dependent viscosity
Steady flow
Straight pipe
[Figures: computed velocity (contour values 0–0.0192) and temperature (contour values 0–0.000181) over the pipe cross-section]
Coupling simulators – p. 485
General mathematical model
Equation of continuity: ∇ · v = 0
Momentum equation:
ϱ v · ∇v = −∇p + ∇ · P
Constitutive law:
P ∼ exp(−αT) |γ|^(n−1) (∇v + (∇v)^T)
γ = √(∇v : ∇v)
Energy equation:
Cϱ v · ∇T = k∇²T + c exp(−αT) |γ|^(n+1)
Coupling simulators – p. 486
Simplified mathematical model
Assumption: rectilinear flow v = (0, 0, w)
Simplified equation system:
∂/∂x (µ ∂w/∂x) + ∂/∂y (µ ∂w/∂y) = const
µ = µ₀ e^(−αT) S(w)^(n−1)
S(w) = √( (∂w/∂x)² + (∂w/∂y)² )
∂²T/∂x² + ∂²T/∂y² = −µ₀ e^(−αT) S(w)^(n+1)
Two nonlinear Poisson equations
BC: w = 0 and T = 0 at the walls
Coupling simulators – p. 487
Numerical solution methods
Galerkin finite element method
Fully implicit formulation:
A(w^k, T^k) w^k = a
B T^k = b(w^k, T^k)
Sequential solution method
A(w^k, T^(k−1)) w^k = a
B T^k = b(w^k, T^k)
Solution of nonlinear systems: Picard iteration, Newton-Raphson
Coupling simulators – p. 488
Simplified structure in 1D
System of PDEs:
d/dx ( e^(−αT) |dw/dx|^(n−1) dw/dx ) = const
d²T/dx² = −e^(−αT) |dw/dx|^(n+1)
Nonlinearities depend on n and α
Coupling simulators – p. 489
Software development
Momentum1: ∇ · [µ∇u] = 0
Energy1: ∇²u = −f
[Class diagram: base solvers Momentum1 (simple µ) and Energy1 (simple f); subclasses Momentum2 (relevant µ) and Energy2 (relevant f); a CommonRel class holds the viscosity models; a Manager class couples the solvers]
Very little code in Momentum2, Energy2, Manager
Coupling simulators – p. 490
Advanced Diffpack features
Advanced Diffpack features – p. 491
Parallel computing in Diffpack
Idea: add a few statements to a Diffpack solver and get a parallel version
Two approaches:
domain decomposition of the mathematical problem (two-level block Jacobi iteration with coarse grid correction)
parallelization of matrix generation and linear solver
Both approaches start with a sequential Diffpack solver and add parallel features in small subclasses
Current status: upcoming module 2001
Advanced Diffpack features – p. 492
A parallel simulator
[Class diagram: a sequential simulator class MySim plus parallel tools yield the parallel simulator subclass MySimP]
Advanced Diffpack features – p. 493
Multigrid methods
Why multigrid? It is often the fastest solution method for linear systems
How to implement multigrid?
wind it into the inner grid/PDE details?
put it on top of a solver?
Diffpack was designed without multigrid methods in mind
Multigrid has recently been added as a top module
Advanced Diffpack features – p. 494
Basic features of Diffpack multigrid
Multigrid ingredients:
multiple grids
multiple linear solvers
interpolation/prolongation
...put together in an algorithm
Data structures: vectors of grids, linear solvers, sparse matrices
⇒ Can reuse standard Diffpack classes
About 10 lines of additional code in e.g. class Poisson1
Multigrid in a new problem requires heavy experimentation
⇒ Flexible run-time combination of standard Diffpack modules into multigrid algorithms
Efficiency: surprisingly good
Available through the Multi-Level Module
Advanced Diffpack features – p. 495
Mixed finite elements
Diffpack applies a grid overlay, class BasisFuncGrid, for defining basis functions on a GridFE geometry
Isoparametric elements: BasisFuncGrid is transparent
Mixed finite elements: BasisFuncGrid defines new nodes
Programming with MxFEM and MxFiniteElement instead of FEM and FiniteElement
Easy and flexible tool
Coupled to block matrices, block preconditioners etc
Available in Diffpack v3.5
Advanced Diffpack features – p. 496
Domain decomposition
DD as solver or preconditioner
Overlapping vs. non-overlapping
The ideas from OO implementation of multigrid carry over to DD
In fact, an abstract multilevel algorithm constitutes the general software, with multigrid and DD as special cases
Current status: upcoming extension of the Multi-Level Module
Advanced Diffpack features – p. 497
Adaptivity
How to implement adaptivity:
wind adaptive discretization and solution algorithm (multigrid) together?
separate discretization and solvers? Diffpack always separates discretization and solvers!
Adaptive grids require about 10 lines of extra code:
adaptive grids are subclasses of GridFE
adaptivity: simple loop calling up (1) a refinement criterion and (2) grid->refine
Available through the Adaptivity Module
Advanced Diffpack features – p. 498
Example: adaptive grids (1)
[Figure: adaptively refined grid over the unit cube (X, Y, Z axes from 0 to 1)]
Advanced Diffpack features – p. 499
Example: adaptive grids (2)
[Figures: sequence of adaptively refined grids on [−1, 1]²; error indicator values range from 0.773 to 1.68]
Advanced Diffpack features – p. 500
Example: adaptive grids (3)
[Figures: further stages of adaptive refinement on [−1, 1]²]
Advanced Diffpack features – p. 501
Example: adaptive grids (4)
[Figures: two close-ups of the adaptively refined grid on approximately [0.1, 0.88] × [0.2, 0.88]]
Advanced Diffpack features – p. 502
The 2nd Diffpack book on Springer
Advanced Computational Partial Differential Equations – Numerical Methods and Diffpack Programming, edited by H.P. Langtangen and A. Tveito. Integration of some theory, models, and algorithms, with emphasis on Diffpack software.
Basic concepts in parallel computing
Parallel computing with Diffpack
Multilevel methods
Mixed finite elements
Block preconditioning
Stochastic PDEs
Computational medicine
Computational finance
Computational geology
Published 2003
Advanced Diffpack features – p. 503
Intro to OOP
Intro to OOP – p. 504
Traditional programming
Traditional procedural programming:
subroutines/procedures/functions
data structures = variables, arrays
data are shuffled between functions
Problems with procedural approach:
Numerical codes are usually large, resulting in lots of functions with lots of arrays (and their dimensions)
Too many visible details
Little correspondence between mathematical abstraction and computer code
Redesign and reimplementation tend to be expensive
Intro to OOP – p. 505
Programming with objects (OOP)
Programming with objects makes it easier to handle large and complicated codes:
Well-known in computer science/industry
Can group large amounts of data (arrays) as a single variable
Can make different implementations look the same for a user
Not much explored in numerical computing (until the late 1990s)
Intro to OOP – p. 506
Example: programming with matrices
Mathematical problem:
Matrix-matrix product: C = MB
Matrix-vector product: y = Mx
Points to consider:
What is a matrix?
a well-defined mathematical quantity, containing a table of numbers and a set of legal operations
How do we program with matrices?
Do standard arrays in any computer language give good enough support for matrices?
Intro to OOP – p. 507
A dense matrix in Fortran 77
Fortran syntax (or C, conceptually)
      integer p, q, r
      double precision M(p,q), B(q,r), C(p,r)
      double precision y(p), x(q)

C     matrix-matrix product: C = M*B
      call prodm(M, p, q, B, q, r, C)

C     matrix-vector product: y = M*x
      call prodv(M, p, q, x, y)
Drawback with this implementation:
Array sizes must be explicitly transferred
New routines for different precisions
Intro to OOP – p. 508
Working with a dense matrix in C++
// given integers p, q, j, k, r
MatDense M(p,q);    // declare a p times q matrix
M(j,k) = 3.54;      // assign a number to entry (j,k)

MatDense B(q,r), C(p,r);
Vector x(q), y(p);  // vectors of length q and p
C = M*B;            // matrix-matrix product
y = M*x;            // matrix-vector product
M.prod(x,y);        // matrix-vector product
Observe that
we hide information about array sizes
we hide storage structure (the underlying C array)
the computer code is as compact as the mathematical notation
Intro to OOP – p. 509
A dense matrix class
class MatDense
{
private:
  double** A;   // pointer to the matrix data
  int m, n;     // A is an m times n matrix
public:
  // --- mathematical interface ---
  MatDense (int p, int q);               // create pxq matrix
  double& operator () (int i, int j);    // M(i,j)=4; s=M(k,l);
  void operator = (MatDense& B);         // M = B;
  void prod (MatDense& B, MatDense& C);  // M.prod(B,C); (C=M*B)
  void prod (Vector& x, Vector& z);      // M.prod(y,z); (z=M*y)
  MatDense operator * (MatDense& B);     // C = M*B;
  Vector operator * (Vector& y);         // z = M*y;
  void size (int& m, int& n);            // get size of matrix
};
Notice that the storage format is hidden from the user
Intro to OOP – p. 510
What is this object or class thing?
A class is a collection of data structures and operations on them
An object is a realization (variable) of a class
The MatDense object is a good example:
1. data: matrix size + array entries
2. operations: creating a matrix, accessing matrix entries, matrix-vector products, ...
A class is a new type of variable, like reals, integers etc
A class can contain other objects; in this way we can create complicated variables that are easy to program with
Intro to OOP – p. 511
Extension to sparse matrices
Matrix for the discretization of −∇²u = f.
Only 5n out of n² entries are nonzero.
Store only the nonzero entries!
Many iterative solution methods for Au = b can operate on the nonzeroes only
Intro to OOP – p. 512
How to store sparse matrices (1)
A =
  | a1,1  0     0     a1,4  0    |
  | 0     a2,2  a2,3  0     a2,5 |
  | 0     a3,2  a3,3  0     0    |
  | a4,1  0     0     a4,4  a4,5 |
  | 0     a5,2  0     a5,4  a5,5 |
Working with the nonzeroes only is important for efficiency!
Intro to OOP – p. 513
How to store sparse matrices (2)
The nonzeroes can be stacked in a one-dimensional array
Need two extra arrays to tell where a row starts and the column index of a nonzero
A = (a1,1, a1,4, a2,2, a2,3, a2,5, ...),
irow = (1, 3, 6, 8, 11, 14),
jcol = (1, 4, 2, 3, 5, 2, 3, 1, 4, 5, 2, 4, 5).
⇒ more complicated data structures and hence more complicated programs
Intro to OOP – p. 514
Sparse matrices in Fortran
Code example for y = Mx
      integer p, q, nnz
      integer irow(p+1), jcol(nnz)
      double precision M(nnz), x(q), y(p)
      ...
      call prodvs (M, p, q, nnz, irow, jcol, x, y)
Two major drawbacks:
Explicit transfer of storage structure (5 args)
Different name for two functions that perform the same task on two different matrix formats
Intro to OOP – p. 515
Sparse matrix as a C++ class (1)
class MatSparse
{
private:
  double* A;   // long vector with the nonzero matrix entries
  int* irow;   // indexing array
  int* jcol;   // indexing array
  int m, n;    // A is (logically) m times n
  int nnz;     // number of nonzeroes
public:
  // the same functions as in the example above,
  // plus functionality for initializing the data structures
  void prod (Vector& x, Vector& z);  // M.prod(y,z); (z=M*y);
};
Intro to OOP – p. 516
Sparse matrix as a C++ class (2)
What has been gained?
Users cannot see the sparse matrix data structure
Matrix-vector product syntax remains the same
The usage of MatSparse and MatDense is the same
Easy to switch between MatDense and MatSparse
Intro to OOP – p. 517
The jungle of matrix formats
When solving PDEs by finite element/difference methods there are numerous advantageous matrix formats:
- dense matrix
- banded matrix
- tridiagonal matrix
- general sparse matrix
- structured sparse matrix
- diagonal matrix
- finite difference stencil as matrix
The efficiency of numerical algorithms is often strongly dependent on the matrix storage scheme
Goal: hide the details of the storage schemes
Intro to OOP – p. 518
Different matrix formats
Intro to OOP – p. 519
The matrix class hierarchy
[Class hierarchy: base class Matrix with subclasses MatDense, MatSparse, MatTriDiag, MatBanded]
Generic interface in base class Matrix
Implementation of storage and member functions in the subclasses
Generic programming in user code:
  Matrix& M;
  M.prod(x,y); // y=M*x
i.e., we need not know the structure of M, only that it refers to some concrete subclass object; C++ keeps track of which subclass object!
prod must then be a virtual function
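A minimal sketch of this design in standard C++ (simplified stand-ins, not the real Diffpack classes):

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> Vector;

// base class: defines the interface, owns no data
class Matrix {
public:
    virtual ~Matrix() {}
    virtual void prod(const Vector& x, Vector& y) const = 0;  // y = M*x
};

// dense storage: row-major array of all m*n entries
class MatDense : public Matrix {
    std::vector<double> a;  int m, n;
public:
    MatDense(int m_, int n_) : a(m_*n_, 0.0), m(m_), n(n_) {}
    double& operator()(int i, int j) { return a[i*n + j]; }
    void prod(const Vector& x, Vector& y) const {
        y.assign(m, 0.0);
        for (int i = 0; i < m; ++i)
            for (int j = 0; j < n; ++j) y[i] += a[i*n + j] * x[j];
    }
};

// diagonal matrix: store the diagonal only
class MatDiag : public Matrix {
    std::vector<double> d;
public:
    explicit MatDiag(const std::vector<double>& diag) : d(diag) {}
    void prod(const Vector& x, Vector& y) const {
        y.resize(d.size());
        for (std::size_t i = 0; i < d.size(); ++i) y[i] = d[i] * x[i];
    }
};

// generic user code: works unchanged for any Matrix subclass
double first_entry_of_Mx(const Matrix& M, const Vector& x) {
    Vector y;
    M.prod(x, y);    // virtual call: dispatched to the actual subclass
    return y[0];
}
```

The generic function never mentions a storage format; the virtual `prod` call resolves at run time.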
Intro to OOP – p. 520
Object-oriented programming
Matrix = object
Details of storage schemes are hidden
Common interface to matrix operations
Base class: define operations, no data
Subclasses: implement specific storage schemes and algorithms
It is possible to program with the base class only!
Intro to OOP – p. 521
Bad news...
Object-oriented programming does wonderful things, but might be inefficient
Adjusted picture: when indexing a matrix, one needs to know its data storage structure for efficiency reasons
In the rest of the code one can work with the generic base class and its virtual functions
⇒ Object-oriented numerics: balance between efficiency and OOtechniques
Intro to OOP – p. 522
Some Diffpack/C++ programming
Some Diffpack/C++ programming – p. 523
Base class, subclass, inheritance
A subclass inherits data and functions from its base class
Base class:
  class X { int i,k; void calc(); };
Subclass:
  class Y : public X { int n; void calc2(); };
Class Y has int i,k,n and functions calc, calc2
Some Diffpack/C++ programming – p. 524
Organization of Diffpack vectors
class VecSimplest(Type):
  just a C array with indexing
  class Type: no requirements
subclass VecSimple(Type):
  adds operator=, input/output
  class Type: operator=, operator<<, operator>>
subclass VecSort(Type):
  adds operator< etc, sorting
  class Type: operator<, operator<= etc
subclass Vec(Type):
  adds numerical operations on vectors
  class Type: operator*, operator/ etc
Some Diffpack/C++ programming – p. 525
The vector class hierarchy
[Class hierarchy diagram: VecSimplest → VecSimple (op=, op<<, op>>) → VecSort (op<, op<= etc) → Vec (op+, op-, op*, op/); a parallel array hierarchy ArrayGenSimplest (plain C array; op()(int i), op()(int i, int j), op()(int i, int j, int k)) → ArrayGenSimple → ArrayGen (can print, scan, arithmetic) → ArrayGenSel (inactive entries, for FDM & non-rect. geom.); Vector = Vec + multiple indices]
Some Diffpack/C++ programming – p. 526
Why this vector organization?
Vector of real (=double): Vec(real)
Vector of int: VecSort(int) (Vec(int) has too many arithmetic operations)
Vector of grids: VecSimple(Grid): operator= and printing/reading for Grid make sense, but not arithmetic operations or sorting
Vector of simulators: VecSimplest(MySim): neither printing/reading, operator=, nor arithmetic operators make sense
Want to use same basic array handling code for VecSimplest(Grid) as for Vec(real)
⇒ Use inheritance to share code and increase reliability
Some Diffpack/C++ programming – p. 527
Matrices revisited
Recall the intro example on handling various matrix formats
Declare base class Matrix
Define virtual functions for mathematical operations
Realize dense matrix, diagonal matrix, etc. as subclasses
Implement mathematical operations in subclasses only
Magic: Program with Matrix, C++ figures out which subclass you really mean!
Keywords: virtual functions, inheritance, object-oriented programming
Some Diffpack/C++ programming – p. 528
Why emphasize software design?
PDE simulator: 50 000+ code lines
Maintainability important
Should be easy to extend
Should be easy to use/understand
Abstractions close to mathematical language are needed
The design must be a balance between attractive abstractions and computational efficiency
Some Diffpack/C++ programming – p. 529
Grid and field abstractions
PDE: −∇ · [λ(x)∇u(x)] = f(x), x ∈ Ω
Assume some discretization (FDM, FEM, ...)
Natural abstractions:
scalar fields: λ(x), f(x), u(x) (explicit functions, discrete fields)
discrete Ω: grid
field = grid + values, or field = explicit formula
discrete operators?
Some Diffpack/C++ programming – p. 530
Programming considerations
Obvious ideas:
collect grid information in a grid class
collect field information in a field class
Gain:
shorter code, closer to the mathematics
finite difference methods: minor
finite element methods: important
big programs: fundamental
Some Diffpack/C++ programming – p. 531
Grids and fields in Diffpack
Assume a finite difference method:
Field represented by class FieldLattice:
a grid of type GridLattice
a set of point values, ArrayGenSel
(ArrayGenSel is a subclass of ArrayGen with extra functionality)
Grid represented by GridLattice (uniform partition in d dimensions)
Some Diffpack/C++ programming – p. 532
The GridLattice class
[Figure: a uniform 2D lattice grid on the unit square]
class GridLattice
{
private:
  // data that hold grid spacing, size of domain etc
public:
  GridLattice (int nsd);
  real getPt (int dir, int index); // get coordinate of pt.
  int getBase (int dir);           // loops: start index
  int getMaxI (int dir);           // loops: stop index
  real Delta (int dir);            // grid spacing
  void scan (Is is);               // scan("d=1 [0,1] [1:40]");
};

// declare a 2D grid in a program:
// GridLattice grid(2);
// grid.scan("d=2 [0,1]x[1,2] [1:10]x[-20:20]");
Some Diffpack/C++ programming – p. 533
The FieldLattice class
[Figure: a scalar field over a 2D lattice grid on the unit square; values range from 0 to 0.0595]
class FieldLattice
{
private:
  Handle(GridLattice)    grid; // pointer to the grid
  Handle(ArrayGen(real)) vec;  // pointer to the field values
public:
  FieldLattice (GridLattice& grid, const char* fieldname);
  GridLattice& grid ();        // access to the grid
  ArrayGen(real)& values ();   // access to the field values
};

// given some 2D FieldLattice f, set f=sin(f):
int i0 = f.grid().getBase(1); // start index, x-dir
int in = f.grid().getMaxI(1); // stop index, x-dir
int j0 = f.grid().getBase(2); // start index, y-dir
int jn = f.grid().getMaxI(2); // stop index, y-dir
int i,j;
for (j = j0; j <= jn; j++)
  for (i = i0; i <= in; i++)
    f.values()(i,j) = sin (f.values()(i,j));

Some Diffpack/C++ programming – p. 534
Smart pointers (handles)
Dynamic memory in C/C++ ⇒ need pointers
Bug no. 1 in C/C++: pointers
For example, if 5 fields point to the same grid, when can we safely remove the grid object?
Make life easy: use a smart pointer
Handle(X) x;         // NULL pointer
x.rebind (new X());  // x points to new X object

// given a function void someFunc (X& xobj):
someFunc (*x);       // send object (not the handle)
someFunc (x());      // alternative syntax

// given a Handle(X) y:
x.rebind (*y);       // x points to y's object
x = y;               // not recommended (often a bug...)
*x = *y;             // set x's object equal to y's object

x.getRef();          // extract reference to x (same as *x)
x.getPtr();          // extract pointer to x
negligible overhead, automatic garbage collection
Some Diffpack/C++ programming – p. 535
Simulator classes
The PDE solver is a class itself
⇒ Easy to extend/modify solver
⇒ Enables coupling to optimization, automatic parameter analysis etc.
⇒ Easy to combine solvers (systems of PDEs)
Typical look:
class MySim
{
protected:
  // grid and field objects
  // PDE dependent parameters
public:
  void scan();  // read input and init
  void solveProblem();
  void resultReport();
};
Some Diffpack/C++ programming – p. 536
Diffpack naming conventions
Local variables have lower-case letters, words are separated by underscores, e.g., my_variable
Functions start with lower-case letters, words are separated by capitals, e.g., myFunction
Class and enum names start with a capital letter, words are separated by capitals, e.g., MyClass
Macros and enum values have upper-case letters, words are separated by underscores, e.g., MY_MACRO
Remark: of course, you can follow your own convention, but it is important to be consistent!
Some Diffpack/C++ programming – p. 537
String vibration revisited
Problem: ∂²u/∂t² = γ² ∂²u/∂x²
Explicit finite difference method; loop through (x, t) grid
New class-based code: Wave1D
class Wave1D
{
  Handle(GridLattice)  grid; // lattice grid, here 1D grid
  Handle(FieldLattice) up;   // solution u at time level l+1
  Handle(FieldLattice) u;    // solution u at time level l
  Handle(FieldLattice) um;   // solution u at time level l-1
  Handle(TimePrm)      tip;  // time discretization parameters (dt etc.)
  CurvePlotFile plotfile;    // for plotting results
  real C;                    // the Courant number (appears in the scheme)

  void setIC ();        // set initial conditions
  void timeLoop ();     // perform time stepping
  void dumpSolution (); // make a curve plot of u
public:
  Wave1D() {}
  ~Wave1D() {}
  void scan ();         // read discretization parameters and initialize
  void solveProblem (); // solve the problem
  void resultReport (); // just dummy here
};

Some Diffpack/C++ programming – p. 538
.h files and .cpp files
The class declaration (listing of data and functions) is placed in a separate file, with extension .h (here Wave1D.h)
#ifndef Wave1D_h_IS_INCLUDED
#define Wave1D_h_IS_INCLUDED
#include <FieldLattice.h>
#include <TimePrm.h>

class Wave1D
{
  Handle(GridLattice)  grid; // lattice grid, here 1D grid
  Handle(FieldLattice) up;   // solution u at time level l+1
  ...
};
#endif
The bodies of the member functions are put in a file with extension .cpp (here Wave1D.cpp)
Some Diffpack/C++ programming – p. 539
Class TimePrm
Class TimePrm holds time parameters: ∆t, time interval for simulation etc.
Initialization:
Handle(TimePrm) tip = new TimePrm();
tip.scan ("dt=0.1 t in [0,8]");
// only some characters are important:
tip.scan ("=0.1 [0,8]");
Useful methods:
class TimePrm
{
public:
  real Delta() const;     // return time step
  real time() const;      // return current time
  void initTimeLoop();    // initialize
  bool finished();        // is stop time reached?
  void increaseTime();    // t = t + dt
  int getTimeStepNo();    // return time step number
};
Some Diffpack/C++ programming – p. 540
Reading input
Let us read input (C, the grid, and the stop time) from the Unix command line like this:
./app -C 0.8 -g 'd=1 [0,1] [0:40]' -t 6.5
void Wave1D:: scan ()
{
  // real C is a class member, initialize it here:
  initFromCommandLineArg
    ("-C", C, 1.0, "Courant number", "R1[0:1]");

  String grid_str;
  initFromCommandLineArg
    ("-g", grid_str, "d=1 [0,1] [0:20]", "grid", "S");
  grid.rebind(new GridLattice(1));  grid->scan (grid_str);

  tip.rebind (new TimePrm());
  real tstop;
  initFromCommandLineArg
    ("-t", tstop, 1.0, "tstop", "R1[0:10]");
  // construct the proper initialization string from C:
  tip->scan (aform("dt=%g t in [0,%g]", C*grid->Delta(1), tstop));
  // (we assume unit wave velocity)
  ...
}
Some Diffpack/C++ programming – p. 541
Solving the problem
void Wave1D:: solveProblem () { timeLoop(); }

void Wave1D:: timeLoop ()
{
  tip->initTimeLoop();
  setIC();
  const int i0 = u->grid().getBase(1); // start of loop
  const int n  = u->grid().getMaxI(1); // end of loop
  int i;
  dumpSolution (); // plot initial condition
  // useful abbreviations (also for efficiency):
  const ArrayGen(real)& U  = u ->values();
  const ArrayGen(real)& Um = um->values();
  ArrayGen(real)&       Up = up->values();

  while (!tip->finished()) {
    tip->increaseTime();
    for (i = i0+1; i <= n-1; i++)
      Up(i) = 2*U(i) - Um(i) + sqr(C) * (U(i+1) - 2*U(i) + U(i-1));
    Up(i0) = 0;  Up(n) = 0; // insert boundary values
    *um = *u;  *u = *up;    // update for next step
    // alternative syntax: um() = u(); u() = up();
    dumpSolution ();
  }
}
Some Diffpack/C++ programming – p. 542
Set initial conditions
void Wave1D:: setIC ()
{
  // set initial conditions on u and um
  const int i0 = u->grid().getBase(1); // start point index
  const int n  = u->grid().getMaxI(1); // end point index
  const real umax = 0.05;              // max amplitude

  // initialization of up
  up->fill(0.0);

  // initialization of u (the initial displacement of the string)
  u->fill(0.0);
  int i;  real x;
  for (i = i0; i <= n; i++) {
    x = grid->getPt(1,i); // get x coord of grid point no i
    if (x < 0.7)
      u->values()(i) = (umax/0.7) * x;
    else
      u->values()(i) = (umax/0.3) * (1 - x);
  }

  // initialization of um (the special formula)
  um->fill(0.0);
  for (i = i0+1; i <= n-1; i++)  // set the help variable um:
    um->values()(i) = u->values()(i) + 0.5*sqr(C) *
      (u->values()(i+1) - 2*u->values()(i) + u->values()(i-1));
}
Some Diffpack/C++ programming – p. 543
Dump results and main function
void Wave1D:: dumpSolution ()
{
  // automatic dump of a curve plot of a 1D field:
  SimRes2gnuplot::makeCurvePlot
    (*u,                              // field to be plotted (1D)
     plotfile,                        // curve plot manager
     "displacement",                  // plot title
     oform("u(x,%.4f)", tip->time()), // name of function
     oform("C=%g, h=%g, t=%g",        // comment
           C, u->grid().Delta(1), tip->time()));
}

// main.cpp:
#include <Wave1D.h>

int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);
  Wave1D simulator;
  simulator.scan ();
  simulator.solveProblem ();
  simulator.resultReport ();
  return 0;
}
Some Diffpack/C++ programming – p. 544
Nice exercise: manual plotting
Let us rewrite dumpSolution: write each data point on the u(x, ·) curve to a CurvePlot object:
void Wave1D:: dumpSolution ()
{
  CurvePlot curve (plotfile); // tie CurvePlot to CurvePlotFile
  curve.initPair ("displacement",                  // title
                  aform("u(x,%.4f)", tip->time()), // curve name
                  "x",                             // indep. var.
                  aform("C=%g", C));               // comment
  // loop through all points in the grid, add (x,u) to curve:
  int i0 = grid->getBase(1); // start index
  int in = grid->getMaxI(1); // stop index
  real x, uval;
  for (int i = i0; i <= in; i++) {
    x = grid->getPt(1,i); // extract x coordinate
    uval = u->values()(i);
    curve.addPair (x, uval);
  }
  curve.finish();
}
Some Diffpack/C++ programming – p. 545
Exercise 1.17
Consider a wave equation with damping:
∂²u/∂t² + β ∂u/∂t = γ² ∂²u/∂x²
Same initial and boundary conditions as in class Wave1D
Modify the numerical scheme
Take a copy of class Wave1D
Implement the modification
Give β on the command line
Display a movie of a damped string:
curveplotmovie gnuplot SIMULATION.map -0.1 0.1
Some Diffpack/C++ programming – p. 546