INF5620: Numerical Methods for Partial Differential Equations
Hans Petter Langtangen
Simula Research Laboratory, and
Dept. of Informatics, Univ. of Oslo
January 2006
INF5620: Numerical Methods for Partial Differential Equations – p. 1
About the course
Course data
10 study points
Lectures: Wednesdays 10-14 in B70, math building
Sometimes 4 h lectures, sometimes less, sometimes 4 h exercises
Course web page: http://folk.uio.no/hpl/INF5620, reachable from the official central UiO web page of the course
Look for messages at the web page!
Teachers: Xing Cai: [email protected]; Hans Petter Langtangen: [email protected], 99 53 20 21
Basic features of the course
Goal: produce solutions of PDEs
Integrated approach: mechanics, numerics, algorithms, software
Generic approach: methods applicable to a wide range of PDE problems
Modern numerical methods
Modern implementation techniques
Non-trivial applications with nonlinear systems of PDEs
Analysis of simplified problems
Discovery of numerical properties by computer experiments
Carry out your own 2-week PDE project
Contents
Numerical methods:
Finite difference methods
Finite element methods (main emphasis)
Application areas:
Heat transfer
Diffusion
Wave phenomena
Thermo-elasticity
Viscous fluid flow
How to learn it
Overview from lectures
Exercises with hand calculations (get the details!)
Compulsory exercises:
2 computer implementations of 1D finite difference methods
2 1D finite element hand calculations
2-week PDE project (comprehensive implementation)
The exam
20 min talk
Additional questions
6 topics given two weeks beforehand
Focus on overview and understanding
Some focus on mathematical details, derivations, intricate steps in algorithms etc.
No focus on details regarding software tools (but some topics will involve overview and the principal workings of software tools)
Acronyms
PDE = partial differential equation (plural: PDEs)
ODE = ordinary differential equation (plural: ODEs)
OOP = object-oriented programming
Scientific software trends
Dramatic increase in the interest in problem-solving environments: Maple, Matlab, Mathematica, S-Plus, ...
PDE solvers are often huge & expensive
It’s difficult to build a flexible “Matlab” for PDEs, but modern programming techniques and languages (e.g. C++) simplify the task
Diffpack is one attempt (used in this course)
Practical problem solving in industry makes use of large program packages – that is one reason why we use a package in this course
New numerical projects in industry make increasing use of C++ instead of Fortran – therefore we expose students to C++ and more modern implementation techniques
We also see the potential of high-level languages like Python, in combination with C++ or Fortran, for solving PDEs – INF5660 may be a companion course
Diffpack
Numerical library for PDE solution
(Almost) a full problem-solving environment for PDEs
A tool for programmers
Implemented in C++ and requires you to program in C++
Relies on object-oriented programming
Reduced implementation efforts for finite elements and PDEs
Enables real-world problem solving in a course
Some features of Diffpack
Free version (though with an array-size limitation)
Free version at UiO and for students
Over 200 commercial installations: www.diffpack.com (Siemens, Xerox, DaimlerChrysler, Mitsubishi, NASA, Intel, Stanford, Cornell, Cambridge, Harvard, ...)
Some application areas:
basic model equations in applied math. (Laplace, heat and wave equations)
viscous fluid flow (Navier-Stokes equations)
many types of water wave equations
heat transfer, incl. phase changes
thermo-elasticity
stochastic PDEs and ODEs
computational engineering, medicine, geology, finance
1D, 2D, 3D within the same code lines
The Diffpack philosophy
Diffpack relies on programming and scripting
Diffpack is a set of libraries, consisting of C++ classes in hierarchies (OO design), applications (examples), and (Perl/Python) scripts
A simulator mainly contains problem-dependent code; generic methods and data structures are already programmed in the libraries
Diffpack acts as a computational engine with a layered design:
primitive layers: arrays, input/output, ...
intermediate layers: linear systems/solvers, grids, fields, ...
higher-level layers: simulators, parallel toolbox, ...
How to learn Diffpack
Required:
good general programming skills
some familiarity with the class concept
thorough knowledge of the numerics
the right attitude:
don’t reinvent the wheel – learn to use others’ code
don’t try to understand all details – utilize black boxes
Principles:
learn on demand
rely on program examples
stay cool!
Have access to a C++ textbook, e.g., Barton and Nackman’s Scientific and Engineering C++
Literature
H. P. Langtangen: Computational Partial Differential Equations, Springer,2nd ed., 2003
Warnings
Numerical solution of PDEs is a huge field in rapid growth; it takes years to master the field
Many other fields (computer science, physics, mathematics) are wired into PDE numerics
C++ takes time to master
OOP takes time to understand
Diffpack requires you to have a thorough and generic understanding of the numerics
Difficulties with this course are usually not due to C++/OOP/Diffpack details – but lack of the proper overview of mathematics and numerics
1D heat conduction
Heat conduction in the continental crust
(Figure: the crust occupies 0 ≤ x ≤ b, with the earth surface at x = 0 held at temperature Ts and heat flow −Q entering from the mantle at x = b.)
Knowing the temperature at the earth’s surface and the heat flow from the mantle, what is the temperature distribution through the continental crust?
Interesting question in geology and geophysics – and for those nations exploring oil resources...
Basic assumptions
Physical assumptions:
Crust of infinite area
Steady state heat flow
Heat generated by radioactive elements
Physical quantities:
u(x) : temperature
q(x) : heat flux (velocity of heat)
s(x) : heat release per unit time and mass
Summary of the model
Differential equations and boundary conditions:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = 0, u′(1) = 1
(f(x) is a scaled version of s(x))
Finite difference method (h = cell size):
u_1 = 0
u_{i+1} − 2u_i + u_{i−1} = −h²f_i, i = 2, ..., n−1
2u_{n−1} − 2u_n = −2h − h²f_n
which can be written as a linear system
Au = b
where u = (u_1, ..., u_n) and A is tridiagonal
What to do: Fill A and b, solve for u by Gaussian elimination
Derivation of the model (1)
(Figure: a column of the crust from x = 0 to x = b, with heat inflow and outflow and heat source s(x) = R exp(−x/L).)
Physical principles:
First law of thermodynamics:
net outflow of heat = total generated heat
Fourier’s law: heat flows from hot to cold regions (i.e. heat velocity is proportional to changes in temperature)
q(x) = −λu′(x)
λ reflects the material’s ability to conduct heat
Derivation of the model (2)
(Figure: a small element of width h with heat source s(x) and fluxes q(x−h/2) and q(x+h/2) at its ends.)
The first law of thermodynamics: outflow = heat generation
q(x+ h/2)− q(x− h/2) = s(x)h
Here: heat generation s(x) due to radioactive decay,
s(x) = R exp(−x/L)
Divide left-hand side by h and make h small,
(q(x+h/2) − q(x−h/2))/h = s(x)  →  q′(x) = s(x)
Derivation of the model (3)
We have more information (boundary conditions): u(0) = Ts (at the surface of the earth)
q(b) = −Q (at the bottom of the crust)
We need to get u into the model; combining the 1st law of thermodynamics
q′(x) = s(x)
with Fourier’s law q(x) = −λu′(x)
we can eliminate q and get a differential equation for u:
−d/dx (λ du/dx) = s(x)
Mathematical model
−d/dx (λ du/dx) = R e^{−x/L},  u(0) = Ts,  λ(b)u′(b) = Q
or if λ is constant:
−u′′(x) = λ^{−1} R e^{−x/L},  u(0) = Ts,  λ(b)u′(b) = Q
Observe: u = u(x; λ, R, L, b, Ts, Q)
u varies with x and six physical parameters!
Suppose that we want to investigate the influence of the different parameters. Assume (modestly) three values of each parameter: number of possible combinations: 3⁶ = 729.
Using scaling we can reduce the six physical parameters λ, R, L, b, Ts, Q to only two!
Scaling
We introduce dimensionless quantities (see HPL A.1 and assume that λ is constant):
x = b x̄,  u = Ts + Qb ū/λ,  s(b x̄) = R s̄(x̄)
−d²ū/dx̄² = γ e^{−β x̄},  ū(0) = 0,  dū/dx̄(1) = 1
where we have two dimensionless quantities
β = b/L,  γ = bR/Q
Dropping the bars, we get a problem of the form
−u′′(x) = f(x), x ∈ (0, 1)
u(0) = 0
u′(1) = 1
Discretization of our equation
1. Divide the domain [0, 1] into n − 1 cells; the cell edges x_i are called nodes (i = 1, ..., n)
2. Let u_i = u(x_i); our goal is to let the computer calculate u_1, u_2, u_3, ...
(Figure: a grid over [0, 1] with nodes carrying the values u_1, ..., u_5.)
3. The differential equation is to be fulfilled at the nodes only:
−u′′(x_i) = f(x_i), i = 1, ..., n
4. Derivatives are approximated by finite differences
4. Derivatives are approximated by finite differences
Finite difference approximations (1)
Recall the definition of the derivative from introductory calculus:
lim_{h→0} (u(x+h) − u(x))/h = u′(x)
Idea: use this formula with a finite h – this is a finite difference approximation to the derivative
What is the error in this approximation? Expand u(x+h) in a Taylor series and compute
(1/h)(u(x+h) − u(x)) = (1/h)(u(x) + u′(x)h + ½u′′(x)h² + ··· − u(x)) = u′(x) + ½u′′(x)h + ···
The largest error term is u′′h/2, proportional to h
Finite difference approximations (2)
An alternative finite difference approximation:
u′(x) ≈ (u(x+h) − u(x−h))/(2h)
Compute the error by Taylor series expansion of u(x+h) and u(x−h) around x:
(u(x+h) − u(x−h))/(2h) = u′(x) + (1/6)u′′′(x)h² + ...
Leading error term proportional to h²
Finite difference approximations (3)
Approximation to u′′(x):
u′′(x) ≈ (u(x+h) − 2u(x) + u(x−h))/h²
or
u′′(x_i) ≈ (u(x_i+h) − 2u(x_i) + u(x_i−h))/h²
Alternative notation, noting that u_i ≡ u(x_i), u_{i+1} = u(x_i+h), and u_{i−1} = u(x_i−h):
[u′′]_i ≈ (u_{i+1} − 2u_i + u_{i−1})/h²
Show that the error is O(h²) (Hint: expand u_{i+1}, u_i, and u_{i−1} in Taylor series around x_i and insert the series in the finite difference formula)
The discrete differential equation
The equation at the nodes:
−u′′(x_i) = f(x_i), i = 1, ..., n
Replace u′′ by a centered finite difference:
u′′(x_i) ≈ (u_{i+1} − 2u_i + u_{i−1})/h²
⇒ The differential equation is transformed to a system of algebraic equations:
−(u_{i+1} − 2u_i + u_{i−1})/h² = f_i, i = 1, ..., n
Discretizing boundary conditions
u(0) = 0 simply becomes u_1 = 0
u′(1) = 1 can be approximated as
(u_{n+1} − u_{n−1})/(2h) = 1
Problem: u_{n+1} is not in the mesh!
Solution: Use the discrete differential equation for i = n:
−(u_{n−1} − 2u_n + u_{n+1})/h² = f_n
and the discrete boundary condition to eliminate u_{n+1}
The result is 2u_{n−1} − 2u_n = −2h − h²f_n
System of equations
The complete set of finite difference equations,
u_1 = 0
u_{i+1} − 2u_i + u_{i−1} = −h²f_i, i = 2, ..., n−1
2u_{n−1} − 2u_n = −2h − h²f_n
can be written as a linear system in matrix form
Au = b
where u = (u_1, ..., u_n) and A is a tridiagonal matrix
Tridiagonal coefficient matrix
A =
| A_{1,1}  A_{1,2}   0        ···          0         |
| A_{2,1}  A_{2,2}  A_{2,3}                          |
|            ⋱        ⋱        ⋱                     |
|   0    A_{i,i−1}  A_{i,i}  A_{i,i+1}    0          |
|                      ⋱        ⋱      A_{n−1,n}     |
|   0      ···        0     A_{n,n−1}  A_{n,n}       |
with only the three diagonals nonzero, and
A_{1,1} = 1, A_{1,2} = 0, A_{n,n−1} = 2
A_{i,i−1} = 1, A_{i,i+1} = 1, i = 2, ..., n−1
A_{i,i} = −2, i = 2, ..., n
Solution of linear systems
The system is solved by Gaussian elimination:
Compute the LU factorization: A = LU
L: lower triangular matrix
U : upper triangular matrix
Solve Ly = b (easy)
Solve Ux = y (easy)
Computational work:
A is dense: O(n³)
A is tridiagonal: O(n)
Solution of linear systems; general case
LU factorization (Gaussian elimination) is the optimal solution method when A is tridiagonal
However, in 2D and 3D problems,
LU factorization is a very slow process (A is no longer tridiagonal)
the structure of A favors iterative methods, which are very much faster than LU factorization
Iterative methods are discussed at the end of the course
Implementation
We want the computer to solve our linear system (for arbitrary n)
This task can easily be accomplished using any computer language and any computer
The program fills A and b with numbers according to the derived formulas and then calls a Gaussian elimination procedure to find u
In numerical simulation in general, computer codes are large and complicated, and using effective tools is fundamental
We shall use a comprehensive tool, Diffpack, even for this very simple problem
The Diffpack code will be close to similar codes in Python, Fortran 77, Matlab, C, C++, Java, ...
There is no particular advantage of using Diffpack (except that Diffpack has a solver for tridiagonal linear systems), but it is a simple problem for the first Diffpack encounter
Diffpack intro
The standard intro to a new language
A “scientific” Hello World code:
#include <iostream>  // make input/output functionality available
#include <cmath>     // make math functions available: e.g. sin(x)
int main ()          // function "main" is always the main program
{
  std::cout << "Hello, World! Give a number: ";
  double r; std::cin >> r;  // read number into double precision r
  double s = sin(r);        // declare s and initialize with sin(r)
  std::cout << "\nThe value of sin(" << r << ") is " << s << "\n";
  return 0;
}
This is pure C++ - no Diffpack!
Compiling and linking
Compile:
g++ -c hw.cpp
Link hw.o to the C/C++ standard and math library:
g++ -o app hw.o -lm
# -lm (link to math lib.) can often be left out:
g++ -o app hw.o
Run the program:
./app
Compiling and linking in one step:
g++ -o app hw.cpp -lm
C++ compilers can have other names: CC, xlC
The corresponding Diffpack program
Make a special directory: Mkdir myfirstdp
cd myfirstdp
Make a file hw.cpp with the following contents:
#include <IsOs.h>  // Diffpack tools for input/output
#include <cmath>   // make math functions available: sin(x)
int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);  // should always be performed
  s_o << "Hello world! Give a number: ";
  real r; s_i >> r;  // read real number into r
  real s = sin(r);
  s_o << "\nThe value of sin(" << r << ") is " << s << "\n";
  return 0;
}
/* Explanation:
   IsOs.h : input/output in Diffpack, much like iostream
   real   : real variables in Diffpack, equals double by default
   s_i    : standard input in Diffpack, corresponds to std::cin
   s_o    : standard output in Diffpack, corresponds to std::cout */
Compiling and linking
Diffpack is compiled using makefiles (which are automatically generated by the Mkdir command)
Compilation and linking is just a matter of
Make (safe, but results in slow code)
Make MODE=opt (fast code, but less safety checks)
Always start with Make; use only optimized mode (MODE=opt) when the program is thoroughly tested!
Arrays in Diffpack
Conventions as in Fortran:
first index is 1
subscript syntax: a(i)
Different from C, where arrays start at 0 and brackets are used: a[0], a[1], ...
Diffpack arrays are not a built-in feature of C++, but they are defined by a programmer (and can in principle be extended by anybody to meet the demands in a particular application)
Code example with arrays
#include <Arrays_real.h>
int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);
  int i,j,k,n,m,p; real r;
  n = m = 4; p = 3;
  Vec(real) w(n);        // vector of length n
  w.redim (m);           // redimension w to length m
  i = w.size();          // extract the current length
  w = -3.14;             // set all entries to -3.14
  Vec(real) z;           // declare another vector
  z = w;                 // z becomes a copy of w
  z(n-1) = w(1) - 4.3;   // assign entry no. n-1
  z.print (s_o, "z");
  z.printAscii(s_o,"z");
  z.print (s_o);
  return 0;
}
Heat conduction problem in Diffpack
Find a suitable test problem with known analytical solution
−u′′(x) = γ exp (−βx), u(0) = 0, u′(1) = 1
u(x) = (γ/β²)(1 − e^{−βx}) + (1 − (γ/β)e^{−β})x,  β ≠ 0
u(x) = x(1 + γ(1 − x/2)),  β = 0
Read β, γ and n
Initialize A and b
Call a Gaussian elimination procedure in Diffpack to solve for u
Test problem for debugging
Choose e.g. n = 2 and solve the discrete equations by hand,
u_1 = 0,  2u_1 − 2u_2 = −2h − h²γe^{−β}
Solution u_2:
u_2 = 1 + (γ/2)e^{−β}
When β = 0, the numerical solution is exact for all n (!), i.e., the analytical solution
u(x_i) = u_i = (i−1)h (1 + γ(1 − ½(i−1)h))
fulfills the discrete equations
In general, u′′ = const is solved exactly by finite difference methodson uniform grids
Diffpack/C++ program in F77/C style
Declaration and initialization of variables:
#include <Arrays_real.h>  // for array functionality (and I/O)
#include <cmath>          // for the exponential function

int main(int argc, const char* argv[])
{
  initDiffpack(argc, argv);
  s_o << "Give number of solution points: ";  // write to the screen
  int n;      // declare an integer n (no of grid points)
  s_i >> n;   // read n from s_i, i.e. the keyboard
  real h=1.0/(n-1);       // note: 1/(n-1) gives integer division (=0)
  Mat(real) A(n,n);       // create an nxn matrix
  ArrayGen(real) b(n);    // create a vector of length n
  ArrayGen(real) u(n);    // the grid point values
  s_o << "Give beta: ";  real beta;  s_i >> beta;
  s_o << "Give gamma: "; real gamma; s_i >> gamma;
Fill matrix and right-hand side
  A.fill(0.0);  // set all entries in A equal to 0.0
  b.fill(0.0);  // set all entries in b equal to 0.0
  real x; int i;
  i = 1;
  A(i,i) = 1;
  b(i) = 0;
  // inner grid points:
  for (i = 2; i <= n-1; i++) {  // i++ means i=i+1
    x = (i-1)*h;
    A(i,i-1) = 1; A(i,i) = -2; A(i,i+1) = 1;
    b(i) = - h*h*gamma*exp(-beta*x);
  }
  // i = n:
  i = n; x = (i-1)*h;
  A(i,i-1) = 2; A(i,i) = -2;
  b(i) = - 2*h - h*h*gamma*exp(-beta*x);
  if (n <= 10) A.print (s_o,"A matrix");  // print matrix to the screen
  b.print (s_o,"right-hand side");        // print vector to the screen
Solve for u and write out solution
  A.factLU(); A.forwBack(b,u);  // Gaussian elimination

  s_o << "\n\n x numerical error:\n";  // \n is newline
  real u_exact;
  for (i = 1; i <= n; i++) {
    x = (i-1)*h;
    if (beta < 1.0E-09)  // is beta zero?
      u_exact = x*(1 + gamma*(1 - 0.5*x));
    else
      u_exact = gamma/(beta*beta)*(1 - exp(-beta*x)) +
                (1 - gamma/beta*exp(-beta))*x;
    s_o << oform("%4.3f %8.5f %12.5e\n", x, u(i), u_exact-u(i));
  }
  // test for the case of only one cell:
  if (n == 2) s_o << "u(2)=" << 1+0.5*gamma*exp(-beta) << "\n";
  // write results to the file "SIMULATION.res"
  Os file ("SIMULATION.res", NEWFILE);  // open file
  for (i = 1; i <= n; i++)
    file << (i-1)*h << " " << u(i) << "\n";
  file->close();
}
Tridiagonal matrices
A is tridiagonal
Mat(real) A(n,n) is a dense matrix
Save memory and CPU time: use a tridiagonal matrix (this can give quite dramatic savings!)
MatTri(real) A(n)
A(i,-1), A(i,0), A(i,1) for Ai,i−1, Ai,i, Ai,i+1
Otherwise the program remains the same
Exercises
1. Perform the steps to be a Diffpack user
2. Type in the Diffpack version of our numerical “Hello World!” program,compile and run the program
3. Introduce MatTri instead of Mat in the 1D heat conduction program (Exercise 1.4 in HPL)
The heat conduction coefficient
The derivation of the 1D model ends in
−d/dx (λ du/dx) = s(x)
and allows a variable λ
λ: heat conduction coefficient
The continental crust is typically not homogeneous!
⇒ λ varies in space!
Model simplification: λ = λ(x) (λ = λ(x, y, z) would require a 3D model)
Need to discretize the operator
d/dx (λ du/dx)
Discretization of variable coefficients
Mathematical problem
−d/dx (λ(x) du/dx) = f(x), 0 < x < 1
u(0) = 0, u′(1) = 1.
NEVER expand (λu′)′ (by the rule of product differentiation)
Two-step discretization, first outer operator:
d/dx (λ(x) du/dx)|_{x=x_i} ≈ (1/h)[ (λ du/dx)|_{x=x_{i+1/2}} − (λ du/dx)|_{x=x_{i−1/2}} ]
Then inner operator:
(λ du/dx)|_{x=x_{i+1/2}} ≈ λ_{i+1/2} (u_{i+1} − u_i)/h
Finite difference equations
Left point, inner points, right point:
u_1 = 0
λ_{i+1/2}(u_{i+1} − u_i) − λ_{i−1/2}(u_i − u_{i−1}) = −h²f_i,  i = 2, ..., n−1
2λ_n(u_{n−1} − u_n) = −2hλ_{n+1/2} − h²f_n
Arithmetic mean:
λ_{i+1/2} = ½(λ_i + λ_{i+1})
Harmonic mean:
1/λ_{i+1/2} = ½(1/λ_i + 1/λ_{i+1})
Geometric mean:
λ_{i+1/2} = (λ_i λ_{i+1})^{1/2}
A nonlinear problem
Nonlinear heat conduction
Heat conduction typically depends upon the temperature
−d/dx (λ(u) du/dx) = f(x), 0 < x < 1, u(0) = 0, u′(1) = 1
This is a nonlinear differential equation
Using the same discretization reasoning as when λ = λ(x),
u_1 = 0
λ_{i+1/2}(u_{i+1} − u_i) − λ_{i−1/2}(u_i − u_{i−1}) = −h²f_i
2λ_n(u_{n−1} − u_n) = −2hλ_{n+1/2} − h²f_n
where λ_{i+1/2} ≡ λ(u_{i+1/2})
The new problem
Our discrete equations contain λ(u_{i+1/2}), i.e., the coefficients that we previously put in the matrix A now depend on the solution u_i and u_{i+1}
The linear system can be written as
A(u)u = b
This is a set of nonlinear algebraic equations
The nonlinearity arises from the λ(u)u′ product in the underlying differential equation
We cannot use LU decomposition because A depends on u
What can we do?
Solution method
If we only had a linear equation, we would get a linear system Au = b, which we know how to solve...
Idea: Guess a solution u⁰ and use this in λ:
−d/dx (λ(u⁰) du¹/dx) = f(x)
u¹ is – hopefully – a better approximation than u⁰
This approach suggests an iteration procedure:
use the solution from the last iteration in λ
the equation is now linear
use the solution technology for −(λ(x)u′(x))′ = f(x)
Algorithm
Guess a solution u0 (need not be correct)
Solve the recursive equations
−d/dx (λ(u^{k−1}) du^k/dx) = f(x),  u^k(0) = 0,  du^k/dx(1) = 1
until the difference between u^k and u^{k−1} is small
“Small” can mean
√( Σ_{j=1}^n |u_j^k − u_j^{k−1}|² ) ≤ ε
Pros: may reuse previous code by inserting an evaluation of λ(u_{i+1/2})
Cons: slow convergence (faster methods exist)
The complete scheme (1)
For the inner points i = 2, ..., n − 1:
½(λ(u_i^{k−1}) + λ(u_{i+1}^{k−1}))(u_{i+1}^k − u_i^k) − ½(λ(u_{i−1}^{k−1}) + λ(u_i^{k−1}))(u_i^k − u_{i−1}^k) = −h²f(x_i)
The complete scheme (2)
For i = n:
2λ(u_n^{k−1})(u_{n−1}^k − u_n^k) = −2hλ_{n+1/2} − h²f(x_n)
Now,
λ_{n+1/2} = ½(λ(u_n^{k−1}) + λ(u_{n+1}^{k−1}))
Using the boundary condition
(u_{n+1}^{k−1} − u_{n−1}^{k−1})/(2h) = 1,  k > 1
gives
λ_{n+1/2} = ½(λ(u_n^{k−1}) + λ(u_{n−1}^{k−1} + 2h))
Implementation
Reuse old program (Heat1D) with:
Loop around system generation and solution
Two arrays uk and ukm
Initial guess in ukm
New auxiliary variables (for iteration etc.)
Function lambda to evaluate λ(u)
Update A and b for each step
Call A.resetFact() to enable a new LU decomposition prior to the call to A.factLU()
Check for termination upon convergence
Set ukm equal to uk before a new iteration
The central code segment (1)
int k = 0;                       // iteration counter
const int k_max = 200;           // max no of iterations
real udiff = INFINITY;           // udiff = ||uk - ukm||
const real epsilon = 0.0000001;  // tolerance in termination crit.

while (udiff > epsilon && k <= k_max) {
  k++;                       // increase k by 1
  A.fill(0.0); b.fill(0.0);  // initialize A and b
  for (i = 1; i <= n; i++) {
    if (i == 1)
      A(1,0) = 1;
    else if (i > 1 && i < n) {
      lambda1 = lambda(ukm(i-1), m);
      lambda2 = lambda(ukm(i),   m);
      lambda3 = lambda(ukm(i+1), m);
      A(i,-1) = 0.5*(lambda1 + lambda2);
      A(i, 0) = -0.5*(lambda1 + 2*lambda2 + lambda3);
      A(i, 1) = 0.5*(lambda2 + lambda3);
    }
    else if (i == n) {
      A(i,-1) = 2*lambda(ukm(i), m);
      A(i, 0) = - A(i,-1);
      b(i) = -(h*lambda(ukm(i-1)+2*h,m)+lambda(ukm(i),m));
    }
  }
  // (the while loop continues on the next slide)
The central code segment (2)
  A.resetFact();                 // ready for new factLU
  A.factLU(); A.forwBack(b,uk);  // Gaussian elimination

  // check termination criterion:
  udiff = 0;
  for (i = 1; i <= n; i++)
    udiff += sqr(uk(i) - ukm(i));
  udiff = sqrt(udiff);
  s_o << "iteration " << k << ": udiff = " << udiff << "\n";

  ukm = uk;  // ready for next iteration
}  // end of the while loop started on the previous slide
No of iterations
λ(u) = u^m
(Plot: number of iterations as a function of m ∈ [0, 4], for n = 100.)
Numerical error
λ(u) = u^m
(Plot: log(error) as a function of m ∈ [0, 4], for n = 100.)
Convergence
Define the error from iterations:
E_I = ||u^k − u^{k−1}||
where ||·|| is some norm, e.g.,
||u|| = √( (1/n) Σ_{i=1}^n u_i² )
Define the discretization error:
E_Δ = ||u − u^k||
Basic issue in discretization: how does E_Δ vary with the cell size h?
Investigation: make E_I negligible (E_I ≪ E_Δ), compute E_Δ for different choices of h
Common model for relating discretization error to grid size:
E_Δ = Ch^r
fit C and r to data (linear least squares):
log E_Δ = log C + r log h
y = aξ + b,  y = log E_Δ,  b = log C,  ξ = log h,  r = a
Does E_Δ → 0 as h → 0? And how fast?
Second-order finite difference approximations suggest r = 2
Convergence plot (1)
λ(u) = u^m
(Plot: log(error) as a function of log(h), for m = 0.2, 1.2, 3.2.)
Convergence plot (2)
λ(u) = (1 + u)^m
(Plot: log(error) as a function of log(h), for m = 0.2, 1.2, 3.2.)
Summary of results
The number of iterations increases with m
m = 1: numerical solution is exact (!)
λ(u) = u^m: O(h^r) for r ≤ 1
(despite our use of O(h²) accurate finite differences!)
λ(u) = (1 + u)^m: O(h²) – as expected
Explanation: u^m gives u′(0) → ∞; need a very fine grid around x = 0 to get accurate results
Note: theory does not extend well to nonlinear problems; systematic experiments may be an important additional tool
Simulation of waves
Vibration of a string
Mathematical model: the wave equation
∂²u/∂t² = γ² ∂²u/∂x²,  x ∈ (a, b)
- Time- and space-dependent problem
- This is a partial differential equation (PDE)
- Boundary conditions at x = a, b (u or ∂u/∂x)
- Initial conditions: known u(x, 0) and u_t(x, 0)
Explicit finite difference method:
u_i^{ℓ+1} = f(u_i^ℓ, u_{i−1}^ℓ, u_{i+1}^ℓ, u_i^{ℓ−1})
Implementation: run through a space-time grid and compute u_i^{ℓ+1} for each grid point
Derivation of the model (1)
(Figure: the string as a curve over the x-axis in the xy-plane.)
Physical assumptions:
the string = a line in 2D space
no gravity forces
up-down movement (i.e., only in y-direction)
Physical quantities:
r = x i + u(x, t) j : position
T(x) : tension force (along the string)
θ(x) : angle with horizontal direction
ϱ(x) : density
Derivation of the model (2)
(Figure: a string element of width h and arc length Δs with mass ϱΔs, displacement u(x, t), and tension forces T(x−h/2) and T(x+h/2) at its ends.)
Physical principle: Newton’s second law applied to a small (infinitesimal) part of the string
sum of forces = total mass · acceleration
Derivation of the model (3)
Great mathematicians had great problems with understanding how to set up the mathematical model for a vibrating string
Euler, d’Alembert and Taylor all made various attempts (which look “stupid” by today’s standards...)
Lagrange was the first one to derive the right partial differential equation
This happened about 100 years after Newton had presented the mathematics and physics we need to derive this PDE
The derivation to be presented here is typical: simple principles, but lots of mathematical details; it’s easy to get lost in the details
Derivation of the model (4)
(Figure: as on the previous slide.)
Acceleration:
a = ∂²r/∂t² = (∂²u/∂t²) j
Newton’s law applied to a string element:
T(x + h/2) − T(x − h/2) = ϱ(x)Δs (∂²u/∂t²)(x, t) j
⇒ A vector equation with two components
Derivation of the model (5)
(Figure: as on the previous slide.)
The tension reads
T(x) = T(x) cos θ(x) i + T(x) sin θ(x) j
Newton’s law in component form:
T(x+h/2) cos θ(x+h/2) − T(x−h/2) cos θ(x−h/2) = 0
T(x+h/2) sin θ(x+h/2) − T(x−h/2) sin θ(x−h/2) = ϱ(x)Δs ∂²u/∂t²
Derivation of the model (6)
Divide the first component by h and let h → 0:
∂/∂x (T cos θ) = 0
Similarly for the second component:
∂/∂x (T sin θ) = ϱ ( lim_{h→0} Δs/h ) ∂²u/∂t²
Derivation of the model (7)
We need to determine the limit lim_{h→0} Δs/h
Assume a linear segment; then by Pythagoras:
Δs² = h² + Δu²,  i.e.,  lim_{h→0} Δs/h = √(1 + (∂u/∂x)²)
Furthermore,
tan θ = ∂u/∂x,
which means that
sin θ = tan θ/√(1 + tan²θ) = (∂u/∂x)/√(1 + (∂u/∂x)²)
Derivation of the model (8)
Altogether this gives
ϱ [1 + (∂u/∂x)²]^{1/2} ∂²u/∂t² = ∂/∂x ( T [1 + (∂u/∂x)²]^{−1/2} ∂u/∂x )
which is a nonlinear partial differential equation.
Assume small vibrations, i.e., (∂u/∂x)² ≪ 1. For small vibrations, θ(x) ≈ 0, such that
0 = ∂/∂x (T cos θ) = ∂/∂x ( T(1 − θ²/2! + ...) ) ≈ ∂T/∂x
This means that T is approximately a constant and that the square roots are ≈ 1
Summing up
The governing PDE:
∂²u/∂t² = c² ∂²u/∂x²,  c² = T/ϱ
String fixed at the ends:
u(a, t) = u(b, t) = 0
String initially at rest:
u(x, 0) = I(x),  ∂u/∂t(x, 0) = 0
The scaled wave equation problem
We scale the equations (γ ≡ 1, but kept as a label) and arrive at the following initial-boundary value problem:
∂²u/∂t² = γ² ∂²u/∂x²,  x ∈ (0, 1), t > 0
u(x, 0) = I(x),  x ∈ (0, 1)
∂u/∂t(x, 0) = 0,  x ∈ (0, 1)
u(0, t) = 0,  t > 0
u(1, t) = 0,  t > 0
Finite difference approximation (1)
Introduce a grid in space-time
x_i = (i−1)h, i = 1, ..., n
t_ℓ = ℓΔt, ℓ = 0, 1, ...
(Figure: the space-time grid, with spacing h in x and Δt in t.)
Finite difference approximation (2)
Central difference approximations:
∂²u/∂x²(x_i, t_ℓ) = (u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)/h² + O(h²)
∂²u/∂t²(x_i, t_ℓ) = (u_i^{ℓ−1} − 2u_i^ℓ + u_i^{ℓ+1})/Δt² + O(Δt²)
Insert these in the PDE
∂²u/∂t² = γ² ∂²u/∂x²
Finite difference approximation (3)
The PDE has been transformed to a difference equation
(u_i^{ℓ−1} − 2u_i^ℓ + u_i^{ℓ+1})/Δt² = γ² (u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)/h²
All u values at time levels ℓ and ℓ−1 are assumed known
⇒ Only one unknown term: u_i^{ℓ+1}
⇒ Can solve for u_i^{ℓ+1} explicitly:
u_i^{ℓ+1} = 2u_i^ℓ − u_i^{ℓ−1} + γ² (Δt²/h²)(u_{i−1}^ℓ − 2u_i^ℓ + u_{i+1}^ℓ)
This scheme is classified as an explicit finite difference method; no need to solve coupled systems of linear equations (⇒ easier programming!)
The computational procedure
(Figure: the space-time grid; the new value u_i^{ℓ+1} depends on u at time levels ℓ and ℓ−1.)
Can find u_i^{ℓ+1} for one i at a time if u at t_ℓ and t_{ℓ−1} is known
Need u_i^{−1} and u_i^0 for all i to start the algorithm
Initial conditions (1)
u_i^0 = I(x_i): evaluate directly
A bit more challenging:
∂u/∂t|_{t=0} = 0  ⇒  (u_i^1 − u_i^{−1})/(2Δt) = 0  ⇒  u_i^{−1} = u_i^1
but u_i^{−1} is outside the legal time grid...
Idea: eliminate u_i^{−1} by using the discrete PDE at t = 0, which gives a special formula for the first step:
u_i^1 = u_i^0 + γ² (Δt²/(2h²))(u_{i−1}^0 − 2u_i^0 + u_{i+1}^0)
Initial conditions (2)
It is awkward to have a special first step. Instead we introduce
u_i^{−1} = u_i^0 + γ² (Δt²/(2h²))(u_{i+1}^0 − 2u_i^0 + u_{i−1}^0)
and use the standard difference equation also at the first step
Algorithm
Define storage u_i^+, u_i, u_i^− for u_i^{ℓ+1}, u_i^ℓ, u_i^{ℓ−1} and set C = γΔt/h
Set initial conditions: u_i = I(x_i), i = 1, ..., n
Define the artificial quantity u_i^− (i = 2, ..., n−1):
u_i^− = u_i + ½C²(u_{i+1} − 2u_i + u_{i−1})
Set t = 0; while t < t_stop:
t = t + Δt
Update all inner points (i = 2, ..., n−1):
u_i^+ = 2u_i − u_i^− + C²(u_{i+1} − 2u_i + u_{i−1})
Set boundary conditions: u_1^+ = 0, u_n^+ = 0
Initialize for next step: u_i^− = u_i, u_i = u_i^+, all i
Diffpack code in F77/C style
We use functions in C++:
void timeLoop (ArrayGen(real)& up, ArrayGen(real)& u,
               ArrayGen(real)& um, real tstop, real C);
void setIC (real C, ArrayGen(real)& u0, ArrayGen(real)& um);

The main program:
ArrayGen(real) up (n);  // u at time level l+1
ArrayGen(real) u  (n);  // u at time level l
ArrayGen(real) um (n);  // u at time level l-1
// get n and Courant number C=dt/dx from the user
timeLoop (up, u, um, tstop, C);  // finite difference scheme
The timeLoop function
void timeLoop (ArrayGen(real)& up, ArrayGen(real)& u,
               ArrayGen(real)& um, real tstop, real C)
{
  int n = u.size();    // length of the vector u (no of grid points)
  real h = 1.0/(n-1);  // length of grid intervals
  real dt = C*h;       // time step, assumes unit wave velocity!!
  real t = 0;          // time

  setIC (C, u, um);    // set initial conditions

  int i;               // loop counter over grid points
  int step_no = 0;     // current step number
  while (t <= tstop) {
    t += dt; step_no++;  // increase time; count no. of steps
    // update inner points according to finite difference scheme:
    for (i = 2; i <= n-1; i++)
      up(i) = 2*u(i) - um(i) + sqr(C)*(u(i+1) - 2*u(i) + u(i-1));
    up(1) = 0; up(n) = 0;  // update boundary points
    um = u; u = up;        // update data struct. for next step
  }
}
The setIC function
    void setIC (real C, ArrayGen(real)& u0, ArrayGen(real)& um)
    {
      int n = u0.size();   // length of the vector u
      real x;              // coordinate of a grid point
      real h = 1.0/(n-1);  // length of grid intervals
      real umax = 0.05;    // max string displacement
      int i;               // loop counter over grid points
      for (i = 1; i <= n; i++) {  // set the initial displacement u(x,0)
        x = (i-1)*h;
        if (x < 0.7) u0(i) = (umax/0.7) * x;
        else         u0(i) = (umax/0.3) * (1 - x);
      }
      for (i = 2; i <= n-1; i++)  // set the help variable um:
        um(i) = u0(i) + 0.5*sqr(C) * (u0(i+1) - 2*u0(i) + u0(i-1));
      um(1) = 0; um(n) = 0;  // dummy values, not used in the scheme
    }
Simulation of waves – p. 90
Dumping solution to file
We dump the solution at each time point to file such that we can make a movie after the simulation is finished
Diffpack has tools for managing a large number of curves on files
CurvePlotFile: manager for a collection of curves
CurvePlot: a variable that holds a curve
Code example:

    int n = u.size();           // the number of unknowns
    real h = 1.0/(n-1);         // length of grid intervals
    CurvePlot plot (plotfile);  // a single plot
    plot.initPair ("displacement",      // plot title
                   oform("u(x,%g)",t),  // name of function
                   "x",                 // name of indep. var.
                   oform("C=%g, h=%g",C,h)); // comment
    for (int i = 1; i <= n; i++)  // add (x,y) data points
      plot.addPair (h*(i-1) /* x-value */, u(i) /* y-value */);
    plot.finish();
Simulation of waves – p. 91
Visualizing the results
The simulation produces a Diffpack case with name SIMULATION
Central files generated in the simulation:
SIMULATION.dp — logfile for the simulation, i.e., runtime
SIMULATION.map — overview of data files
SIMULATION.files — explanation of what the files are
.SIMULATION_1, .SIMULATION_2, ... — the (hidden) data files
Simulation of waves – p. 92
Animation
Make the animation using Diffpack features:

    curveplotmovie gnuplot SIMULATION.map -0.1 0.1
    (script)       (program) (name of map file) (ymin) (ymax)

Can replace gnuplot by matlab:

    curveplotmovie matlab SIMULATION.map -0.1 0.1
Simulation of waves – p. 93
Varying the Courant number C
$C = \gamma\Delta t/h$, $t = 0.5$, $h = 1/20$
[Plots of the numerical solution for (a) C = 1.0 and (b) C = 1.05]
Simulation of waves – p. 94
Varying the Courant number C
[Plots of the numerical solution for (c) C = 0.8 and (d) C = 0.3]
Simulation of waves – p. 95
Another numerical example
[Six plots: initial profiles at t = 0.0 with σ = 103 and σ = 1, and solutions at t = 15.0 for C = 1.0, C = 0.99, and C = 0.8]
Simulation of waves – p. 96
Numerical stability and accuracy
We have two parameters: ∆t and h
How do we choose ∆t and h?
Too large ∆t and h give
- too large numerical errors
- or, in the worst case, unstable solutions
Too small ∆t and h require too much computing power
Simplified problems can be analyzed theoretically, which yields a guide to choosing ∆t and h
Basic result for our wave equation: $\Delta t \le h/\gamma$ (derived later)
Peculiar case: the exact solution is obtained by $\Delta t = h/\gamma$, regardless of $h$ (!!!)
Simulation of waves – p. 97
2D wave equation
2D wave equation – p. 98
A more general wave equation
General form of a 1D/2D/3D wave equation for $u(\mathbf{x}, t)$:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[\lambda(\mathbf{x})\nabla u]$$
Wave travels with velocity $\gamma = \sqrt{\lambda}$
The operator $\nabla\cdot[\lambda(\mathbf{x})\nabla u]$ is frequently encountered in this course!
In 2D the operator is written out as
$$\frac{\partial}{\partial x}\left(\lambda(x,y)\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda(x,y)\frac{\partial u}{\partial y}\right)$$
Goal for the next slides: learn how to discretize a 2D wave equation with variable coefficients
We shall do this by putting together elements we have learned so far
2D wave equation – p. 99
Applications of the wave equation
Vibrations of a string (1D)
Vibrations of a drum (2D)
Large destructive water waves (2D; water elevation)
Sound waves (1D: organ pipe, flute; 3D: room, space)
Light and radio waves (3D)
2D wave equation – p. 100
Example: earthquake-generated waves
[3D surface plot of the initial surface elevation over a segment of the ocean; color scale from −0.15 to 0.3]
Note: scales are distorted!
Earthquake close to seamount
Effect of earthquake: sudden elevation of the surface, modeled here as a prescribed initial surface at rest
Domain = segment of the ocean
2D wave equation – p. 101
Principles
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[\lambda(\mathbf{x})\nabla u]$$
$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\left(\lambda\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\lambda\frac{\partial u}{\partial y}\right)$$
Time discretization as for 1D wave eq.
Space discretization: generalized from $(\lambda(x)u')'$
Boundary conditions: generalized from $u(0) = 0$ and $u'(1) = 1$
Initial conditions as for 1D wave eq.
Overall algorithm as for 1D wave eq.
2D wave equation – p. 102
Discretization
Seek an approximation $u_{i,j}^\ell$ on a rectangular grid to $u(x_i, y_j, t_\ell)$
$$x_i = (i-1)\Delta x, \quad y_j = (j-1)\Delta y, \quad t_\ell = \ell\Delta t$$
Approximate derivatives by central differences
$$\left[\frac{\partial^2 u}{\partial t^2}\right]_{i,j}^\ell \approx \frac{u_{i,j}^{\ell+1} - 2u_{i,j}^\ell + u_{i,j}^{\ell-1}}{\Delta t^2}$$
A spatial term like $\frac{\partial}{\partial y}\left(\lambda\frac{\partial u}{\partial y}\right)$ takes the form
$$\frac{1}{\Delta y}\left(\lambda_{i,j+\frac12}\left(\frac{u_{i,j+1}^\ell - u_{i,j}^\ell}{\Delta y}\right) - \lambda_{i,j-\frac12}\left(\frac{u_{i,j}^\ell - u_{i,j-1}^\ell}{\Delta y}\right)\right)$$
2D wave equation – p. 103
The scheme
The finite difference scheme takes the form
$$u_{i,j}^{\ell+1} = 2u_{i,j}^\ell - u_{i,j}^{\ell-1} + [\Delta u]_{i,j}^\ell$$
Exercise: derive the expression for $[\Delta u]_{i,j}^\ell$
2D wave equation – p. 104
Algorithm (BC: u = 0)
DEFINITIONS:
Storage $u_{i,j}^+$, $u_{i,j}$, and $u_{i,j}^-$ for $u_{i,j}^{\ell+1}$, $u_{i,j}^\ell$, and $u_{i,j}^{\ell-1}$
The whole grid: $i = 1,\dots,n_x$, $j = 1,\dots,n_y$
Inner points: $i = 2,\dots,n_x-1$, $j = 2,\dots,n_y-1$
INITIAL CONDITIONS:
$u_{i,j} = I(x_i, y_j)$ for all grid points $(i,j)$
SET ARTIFICIAL QUANTITY $u_{i,j}^-$:
$$u_{i,j}^- = u_{i,j} + \tfrac12[\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
Set $t = 0$
2D wave equation – p. 105
Algorithm (BC: u = 0)
While $t \le t_{\rm stop}$:
$t \leftarrow t + \Delta t$
UPDATE ALL INNER POINTS:
$$u_{i,j}^+ = 2u_{i,j} - u_{i,j}^- + [\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
INITIALIZE FOR NEXT STEP:
$u_{i,j}^- = u_{i,j}$, $u_{i,j} = u_{i,j}^+$ for all inner points $(i,j)$
2D wave equation – p. 106
A model for water waves
Physical assumption: long waves in shallow water
Corresponding mathematical model:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot\left[gH(\mathbf{x})\nabla u\right]$$
Physical quantities:
$u(x, y, t)$: water surface elevation
$g$: acceleration of gravity
$H(x, y)$: still-water depth
Boundary condition at coastline:
$$\frac{\partial u}{\partial n} \equiv \nabla u\cdot\mathbf{n} = 0$$
(full reflection of waves)
2D wave equation – p. 107
Scaling
Let Hc be a characteristic value of H(x, y)
We introduce new variables
$$\bar x = x/H_c, \quad \bar y = y/H_c, \quad \bar t = t/\sqrt{H_c/g}, \quad \lambda = H/H_c$$
$$\bar u = u/u_c \quad (u_c \text{ cancels and can be arbitrary})$$
Inserted in the equation:
$$\frac{\partial^2 \bar u}{\partial \bar t^2} = \nabla\cdot\left[\lambda\nabla\bar u\right]$$
⇒ $g$ (the driving force) is scaled away
2D wave equation – p. 108
Implementing boundary conditions (1)
There are two ways of handling $\partial u/\partial n = 0$ conditions:
"Ghost cells" at the boundary with explicit updating of fictitious values
Modify the stencil at the boundary
We choose the second option, as this allows direct output of $u_{i,j}$ to a visualization program, i.e., no need to remove ghost cells.
2D wave equation – p. 109
Implementing boundary conditions (2)
Consider the boundary $i = 1$ ($x$ = const)
Boundary condition:
$$\frac{\partial u}{\partial n} \equiv \frac{\partial u}{\partial x} = 0$$
Discrete version:
$$\frac{u_{2,j} - u_{0,j}}{2\Delta x} = 0 \quad\Leftrightarrow\quad u_{2,j} = u_{0,j}$$
$u_{0,j}$ is outside the legal mesh
Use the discrete PDE for $i = 1$ and eliminate $u_{0,j}$, that is, just replace $u_{0,j}$ by $u_{2,j}$
2D wave equation – p. 110
Modified difference operator
The boundary condition modifies the finite difference equations, and this can be viewed as modifying the operator $[\Delta u]_{i,j}$ (in this example at $i = 1$) according to
$$[\Delta u]_{1,j:\,i-1\to i+1} \equiv \left(\frac{\Delta t}{\Delta x}\right)^2\left(\lambda_{1+\frac12,j}(u_{2,j} - u_{1,j}) - \lambda_{1-\frac12,j}(u_{1,j} - u_{2,j})\right) + \left(\frac{\Delta t}{\Delta y}\right)^2\left(\lambda_{1,j+\frac12}(u_{1,j+1} - u_{1,j}) - \lambda_{1,j-\frac12}(u_{1,j} - u_{1,j-1})\right)$$
2D wave equation – p. 111
Efficiency issues
Two things should be considered:
Loops should be ordered such that u(i,j) is traversed in the order it is stored. In Diffpack, ArrayGen objects are stored columnwise. Therefore the loop should read:

    for (j = 1; j <= ny; j++)
      for (i = 1; i <= nx; i++)
        u(i,j) = ...

One should avoid if statements in loops if possible (they prevent many compiler optimization techniques); hence we will have separate loops over inner and boundary points.
Remark I: Debug code before optimizing it!!
Remark II: Focus on a readable and maintainable code before thinking of efficiency
2D wave equation – p. 112
Updating of internal points
We define a function for updating the solution:
WAVE($u^+$, $u$, $u^-$, $a$, $b$, $c$)
This function reads, at inner points,
$$u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j} \quad\text{for all inner points } (i,j)$$
2D wave equation – p. 113
Updating of boundary points (1)
$i = 1$, $j = 2,\dots,n_y-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1}$
$i = n_x$, $j = 2,\dots,n_y-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1}$
$j = 1$, $i = 2,\dots,n_x-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,j-1\to j+1}$
$j = n_y$, $i = 2,\dots,n_x-1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,j+1\to j-1}$
2D wave equation – p. 114
Updating of boundary points (2)
$i = 1$, $j = 1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1,\,j-1\to j+1}$
$i = n_x$, $j = 1$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1,\,j-1\to j+1}$
$i = 1$, $j = n_y$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i-1\to i+1,\,j+1\to j-1}$
$i = n_x$, $j = n_y$: $u_{i,j}^+ = 2au_{i,j} - bu_{i,j}^- + c[\Delta u]_{i,j:\,i+1\to i-1,\,j+1\to j-1}$
2D wave equation – p. 115
Modified algorithm (BC: ∂u/∂n = 0)
DEFINITIONS: as above
INITIAL CONDITIONS:
$u_{i,j} = I(x_i, y_j)$ for all grid points $(i,j)$
SET ARTIFICIAL QUANTITY $u_{i,j}^-$:
WAVE($u^-$, $u$, $u^-$, 0.5, 0, 0.5)
Set t = 0
2D wave equation – p. 116
Modified algorithm (BC: ∂u/∂n = 0)
While $t \le t_{\rm stop}$:
$t \leftarrow t + \Delta t$
UPDATE ALL POINTS:
WAVE($u^+$, $u$, $u^-$, 1, 1, 1)
INITIALIZE FOR NEXT STEP:
$u_{i,j}^- = u_{i,j}$, $u_{i,j} = u_{i,j}^+$ for all grid points $(i,j)$
2D wave equation – p. 117
Diffpack/C++ written in F77/C style
src/fdm/intro/Wave2D

    #include <Arrays_real.h>
    #include <FieldLattice.h>
    #include <SimRes2mtv.h>

    // We define a macro LaplaceU to save typing of long
    // finite difference formulas. For example,
    //
    //   #define mac(X) q0(i,j-X)
    //
    // defines a macro mac(X) and any text mac(i+2) will then be
    // transformed to q0(i,j-i+2) by the C/C++ preprocessor (cpp).

    #define LaplaceU(i,j,im1,ip1,jm1,jp1) \
      sqr(dt/dx)*\
      ( 0.5*(lambda(ip1,j )+lambda(i ,j ))*(u(ip1,j )-u(i ,j )) \
       -0.5*(lambda(i ,j )+lambda(im1,j ))*(u(i ,j )-u(im1,j )))\
      +sqr(dt/dy)*\
      ( 0.5*(lambda(i ,jp1)+lambda(i ,j ))*(u(i ,jp1)-u(i ,j )) \
       -0.5*(lambda(i ,j )+lambda(i ,jm1))*(u(i ,j )-u(i ,jm1)))
2D wave equation – p. 118
More code ...
    void WAVE (ArrayGenSel(real)& up, const ArrayGen(real)& u,
               const ArrayGenSel(real)& um, real a, real b, real c,
               const ArrayGenSel(real)& lambda, real dt, real dx, real dy)
    {
      int nx, ny; up.getDim (nx, ny);
      int i,j;
      // update inner points according to finite difference scheme:
      for (j = 2; j <= ny-1; j++)
        for (i = 2; i <= nx-1; i++)
          up(i,j) = a*2*u(i,j) - b*um(i,j)
                  + c*LaplaceU(i,j,i-1,i+1,j-1,j+1);

      // update boundary points (modified finite difference schemes):
      i=1;
      for (j = 2; j <= ny-1; j++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i+1,i+1,j-1,j+1);
      i=nx;
      for (j = 2; j <= ny-1; j++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i-1,i-1,j-1,j+1);
      j=1;
      for (i = 2; i <= nx-1; i++)
        up(i,j) = a*2*u(i,j) - b*um(i,j) + c*LaplaceU(i,j,i-1,i+1,j+1,j+1);
      ...
...
2D wave equation – p. 119
More code ...
    int main (int argc, const char* argv[])
    {
      initDiffpack (argc, argv);
      s_o << "Give number of intervals in x and y direction: ";
      int h; s_i >> h; int nx = h+1; s_i >> h; int ny = h+1;
      s_o << "Give width of domain in x direction: ";
      real wx; s_i >> wx;
      s_o << "Give width of domain in y direction: ";
      real wy; s_i >> wy;

      // ArrayGenSel is like ArrayGen, but has increased functionality
      // for finite difference methods
      // (we need it for the FieldLattice object for visualization)
      ArrayGenSel(real) up     (nx,ny); // u at time level l+1
      ArrayGenSel(real) u      (nx,ny); // u at time level l
      ArrayGenSel(real) um     (nx,ny); // u at time level l-1
      ArrayGenSel(real) lambda (nx,ny); // variable coefficient
      const real dx = wx/(nx-1); // length of grid intervals in x dir.
      const real dy = wy/(ny-1); // length of grid intervals in y dir.
      s_o << "Give time step length (0 gives dt=dx): ";
      real dt; s_i >> dt;
2D wave equation – p. 120
More code ...
      // fill lambda with values .... (see source file)

      // fill um with initial values:
      um.fill(0.0);
      // set the help variable um:
      WAVE (um, u, um, 0.5, 0, 0.5, lambda, dt, dx, dy);

      int step_no = 0; // current step number
      while (t <= tstop) {
        t += dt;       // increase time by the time step
        step_no++;     // increase step number by 1
        s_o << "t=" << t << "\n";
        WAVE (up, u, um, 1, 1, 1, lambda, dt, dx, dy);
        um = u; u = up;  // update data struct. for next step
        // dump solution to file ... (see source file)
      }
2D wave equation – p. 121
Visualizing the results
unix> plotmtv -colorps W0017.tmp.mtv
2D wave equation – p. 122
Nature of some PDEs
Nature of some PDEs – p. 123
Hyperbolic equations (1)
Linear PDEs can be divided into basic categories: elliptic, parabolic, and hyperbolic
A typical hyperbolic equation is the wave equation
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\frac{\partial^2 u}{\partial x^2}$$
which has the general solution
$$u(x,t) = f(x - \gamma t) + g(x + \gamma t)$$
i.e., two waves propagating to the left and to the right
The functions $f$ and $g$ are determined from $u(x, 0)$ and $\partial u(x, 0)/\partial t$
Disturbances travel with a finite wave speed
Reference: HPL appendix A.5
Nature of some PDEs – p. 124
Changing boundary conditions
Boundary conditions: u(0, t) = 0, u(1, t) = 0
[Plots of u(x, t) at t = 0.300 and t = 0.667]
Boundary conditions: u(0, t) = 0, ux(1, t) = 0
[Plots of u(x, t) at t = 0.300 and t = 0.667]
Nature of some PDEs – p. 125
Observations
Changing the BC influences the solution at only some (x, t) points (e.g., at the midpoint, after the initial disturbance has left, a change in the BC at x = 1 is not felt before the pulse has been reflected from the boundary)
An initial disturbance is transported without change of shape (essential for oral communication!)
Nature of some PDEs – p. 126
BC: let waves leave the domain
Boundary conditions: $u(0, t) = 0$ and $(u_t + \gamma u_x)|_{x=1} = 0$
[Plots of u(x, t) at t = 0.300 and t = 0.667]
The wave propagates out of the domain, exactly as we want for ocean waves! Unfortunately, the condition is hard to generalize successfully to physically relevant 2D cases
Nature of some PDEs – p. 127
Other hyperbolic equations (2)
Uni-directional “wave” or transport equation:
$$\frac{\partial u}{\partial t} + \gamma\frac{\partial u}{\partial x} = 0$$
where
$$u(x,t) = I(x - \gamma t)$$
$I(x)$ is the initial condition $u(x, 0)$
Systems of wave equations (here: long ocean waves)
$$\frac{\partial \eta}{\partial t} = -\frac{\partial}{\partial x}(uH) - \frac{\partial}{\partial y}(vH)\ \left(-\frac{\partial H}{\partial t}\right)$$
$$\frac{\partial u}{\partial t} = -\frac{\partial \eta}{\partial x}, \quad \mathbf{x}\in\Omega,\ t > 0$$
$$\frac{\partial v}{\partial t} = -\frac{\partial \eta}{\partial y}, \quad \mathbf{x}\in\Omega,\ t > 0$$
Nature of some PDEs – p. 128
Other hyperbolic equations (3)
Multi-dimensional standard wave equation:
$$\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(\lambda\nabla u)$$
Applications: radio waves, light, sound, membranes, ...
Nonlinear hyperbolic conservation law:
$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}f(u) = 0$$
or a system of such equations:
$$\frac{\partial \mathbf{u}}{\partial t} + \frac{\partial}{\partial x}\mathbf{f}(\mathbf{u}) = 0$$
Applications: gas dynamics, oil reservoir flow
Nature of some PDEs – p. 129
Elliptic equations
Elliptic equations are stationary (equilibrium or steady-state physicalconditions)
Example: $-u''(x) = 2$, $u(0) = 0$, $u'(1) = 0$
Let us look at the effect of changing u′(1) = 0 to u(1) = 0
Nature of some PDEs – p. 130
Changing boundary conditions
u′(1) = 0 versus u(1) = 0:
[Plot of the two solutions, labeled u(1)=0 and u'(1)=0]
Observe: all points in the interior are affected by the boundary condition! And the solution is smooth.
Nature of some PDEs – p. 131
Multi-dimensional elliptic equations
The Poisson equation is a typical multi-dimensional elliptic equation:
$$-\nabla^2 u(\mathbf{x}) = f(\mathbf{x})$$
or
$$-\nabla\cdot(\lambda(\mathbf{x})\nabla u(\mathbf{x})) = f(\mathbf{x})$$
Another elliptic equation:
$$-\nabla\cdot(\lambda(\mathbf{x})\nabla u(\mathbf{x})) + \alpha u = f(\mathbf{x})$$
Nature of some PDEs – p. 132
The Helmholtz equation
Sometimes one solves the wave equation
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\nabla^2 u$$
by assuming periodic waves in time:
$$u(x,y,z,t) = e^{-i\omega t}u(x,y,z), \quad i = \sqrt{-1}$$
This results in the famous Helmholtz equation
$$\nabla^2 u + k^2u = 0, \quad k = \omega/\gamma$$
The Helmholtz equation is not an elliptic equation (wrong sign!)
Nature of some PDEs – p. 133
Parabolic equations (1)
A 1D heat equation is a typical parabolic equation:
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + 2, \quad u(x,0) = 0,\ u(0,t) = 0$$
Typical for parabolic equations:
time derivative = elliptic counterpart
That is, as $t\to\infty$, the eq. above tends to the elliptic equation
$$-\frac{\partial^2 u}{\partial x^2} = 2$$
Nature of some PDEs – p. 134
Changing boundary conditions
Let us see the effect of two different conditions at x = 1: u(1, t) = 0versus ux(1, t) = 0
[Plot of u(x, t) at t = 0.300 for the two boundary conditions]
As for the elliptic counterpart, all points in the interior are affected by the boundary condition
Nature of some PDEs – p. 135
Parabolic equations (2)
Multi-dimensional parabolic (heat) equation:
$$\frac{\partial u}{\partial t} = \nabla\cdot(\lambda(\mathbf{x})\nabla u) + f$$
Recall: as $t\to\infty$, we normally have that $\partial u/\partial t\to 0$, and the parabolic equation approaches the elliptic counterpart
Nature of some PDEs – p. 136
Numerical methods
Hyperbolic equations: explicit schemes, must choose $\Delta t \sim h$
Elliptic equations: all grid-point values are coupled in a linear system, time-consuming to solve, requires sophisticated iterative methods
Parabolic equations: can use explicit schemes if $\Delta t \sim h^2$, but implicit schemes coupling all points (as in elliptic equations) are preferred
Nature of some PDEs – p. 137
Why solutions of elliptic eqs. are smooth
Solutions of elliptic equations are smooth!
$-u''(x) = f(x)$ with "noisy" $f(x)$:
$$u = -\int\left(\int f(x)\,dx\right)dx$$
Twice integration of noisy $f(x)$ ⇒ smooth $u(x)$
$(\lambda u')' = 0$ with noisy coefficient $\lambda(x)$:
$$u(x) = \text{const}\cdot\int_0^x \frac{d\tau}{\lambda(\tau)}$$
Rough $\lambda$ ⇒ smooth $u$
Nature of some PDEs – p. 138
More general result from Fourier series
$-u''(x) = f(x)$ solved by Fourier series (here: sine series):
$$u(x) = \sum_{i=1}^{\infty} u_i\sin i\pi x$$
Expanding $f(x)$ also in a Fourier series:
$$f(x) = \sum_{i=1}^{\infty} f_i\sin i\pi x$$
Inserting this in the equation gives
$$u_i \sim f_i/i^2$$
i.e., the $u$ series converges faster than the $f$ series, and from Fourier series theory this means that $u$ is smoother than $f$
Nature of some PDEs – p. 139
Smoothness and variational calculus
$\nabla^2 u = 0$ is equivalent to
$$\min_{v(\mathbf{x})} \int_\Omega \|\nabla v\|^2\,d\Omega$$
⇒ $u$ is "smoothest" among "all" $v$ over $\Omega$
Nature of some PDEs – p. 140
Analysis of difference schemes
Analysis of difference schemes – p. 141
Outline
Try to explain the observed numerical behavior (accuracy, stability) bymathematical means
Tool: exact solution of the discrete equations
Main focus on the wave and heat equations
Intro to classical topics like truncation error and von Neumann stability
Analysis of difference schemes – p. 142
Operator notation (1)
Finite difference schemes are often long and difficult to read compared to the underlying PDE
Operator notation gives condensed expressions much like the PDE
Define
$$[\delta_x u]_{i,j,k}^\ell \equiv \frac{u_{i+\frac12,j,k}^\ell - u_{i-\frac12,j,k}^\ell}{\Delta x}$$
with similar definitions of $\delta_y$, $\delta_z$, and $\delta_t$
Another difference:
$$[\delta_{2x} u]_{i,j,k}^\ell \equiv \frac{u_{i+1,j,k}^\ell - u_{i-1,j,k}^\ell}{2\Delta x}$$
Analysis of difference schemes – p. 143
Operator notation (2)
Compound difference (1D now, to save index writing):
$$[\delta_x\delta_x u]_i^\ell = \frac{1}{h^2}\left(u_{i-1}^\ell - 2u_i^\ell + u_{i+1}^\ell\right)$$
One-sided forward difference:
$$[\delta_x^+ u]_i^\ell \equiv \frac{u_{i+1}^\ell - u_i^\ell}{h}$$
and the backward difference:
$$[\delta_x^- u]_i^\ell \equiv \frac{u_i^\ell - u_{i-1}^\ell}{h}$$
Operator notation for arithmetic average:
$$[\overline{u}^{\,x}]_i^\ell \equiv \frac12\left(u_{i+\frac12}^\ell + u_{i-\frac12}^\ell\right)$$
Analysis of difference schemes – p. 144
Operator notation (3)
Put the whole equation inside brackets:
$$[\delta_x\delta_x u = -f]_i$$
This is a finite difference scheme for $u'' = -f$
Example: $(\lambda u')' = 0$ is discretized as
$$[\delta_x\overline{\lambda}^{\,x}\delta_x u = 0]_i$$
Another example, the heat equation:
$$\frac{\partial u}{\partial t} = \kappa\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$$
$$[\delta_t^+ u = \kappa\,(\delta_x\delta_x u + \delta_y\delta_y u)]_{i,j}^\ell$$
Analysis of difference schemes – p. 145
Typical solution of a wave equation
Wave equation:
$$\frac{\partial^2 u}{\partial t^2} = \gamma^2\frac{\partial^2 u}{\partial x^2}$$
Typical solution
$$u = Ae^{i(kx-\omega t)}$$
for arbitrary $k$
Only real or imaginary part has physical interpretation, e.g., the real part is $A\cos(kx - \omega t)$
Inserting the solution gives the dispersion relation $\omega = \omega(k)$:
$$\omega = \pm\gamma k \quad\Rightarrow\quad u = Ae^{ik(x\pm\gamma t)}$$
Analysis of difference schemes – p. 146
General solutions
Can build general solutions as Fourier series
$$u = \sum_k A_ke^{i(kx-\omega t)}$$
...or Fourier integrals
$$u = \int_{-\infty}^{\infty} A(k)e^{i(kx-\omega t)}\,dk$$
The basic component in these general solutions is
$$e^{i(kx-\omega t)}$$
which we will study in the following
Analysis of difference schemes – p. 147
Physical interpretation
What do $A$, $k$, and $\omega$ in
$$u = Ae^{i(kx-\omega t)}$$
really mean?
[Plot of a sine wave with the wave length λ, the amplitude A, and the wave velocity c indicated]
$\lambda = 2\pi/k$ is the wave length, $c = \omega/k = \gamma$ is the wave velocity
Analysis of difference schemes – p. 148
A heat equation
The heat (diffusion) equation:
$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}$$
Typical solution $u = A\exp(i(kx - \omega t))$ for arbitrary $k$
Inserting the solution gives $\omega = -i\kappa k^2$ (called the dispersion relation)
The form of the solution:
$$u = Ae^{-\kappa k^2t}e^{ikx}$$
Can build more complicated solutions through Fourier series or integrals
Analysis of difference schemes – p. 149
The damping in the heat equation
A general solution component,
$$u = Ae^{-\kappa k^2t}e^{ikx},$$
damps short waves (big $k$) significantly (through the factor $\exp(-\kappa k^2t)$)
Example: add two components, $k = \pi$ and $k = 100\pi$ (choose $\kappa\pi^2 = 1$), and consider the imaginary part:
$$u(x,t) = 1\cdot e^{-t}\sin\pi x + 0.6\cdot e^{-10000t}\sin 100\pi x$$
This is a sine with period 1 plus a 60 percent perturbation which oscillates 100 times faster
The damping factor of the perturbation is about $\exp(-10^4t)$ times that of the first term; after $t = 1/1000$ this damping is about $5\cdot10^{-5}$
Recall: no damping in the wave equation
Analysis of difference schemes – p. 150
Plot of the damping
[Plots of u(x, 0) and u(x, 0.001): the rapidly oscillating perturbation has vanished at t = 0.001]
Analysis of difference schemes – p. 151
Solution of discrete equations
Consider the explicit finite difference scheme for the wave equation, with $u_j^\ell$ as unknown
A typical solution reads
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} = Ae^{i(kx-\tilde\omega t)}$$
⇒ same structure as the solution of the PDE!
Inserting this form of $u_j^\ell$ in the scheme:
$$\tilde\omega = \tilde\omega(k, h, \Delta t)$$
(numerical dispersion relation)
The velocity $\tilde c = \tilde\omega/k$ should be $\gamma$ (constant), but will now depend on $k$, $h$, and $\Delta t$
Hopefully, $\tilde c\to\gamma$ as $h, \Delta t\to 0$
Analysis of difference schemes – p. 152
The wave equation scheme
Inserting $u_j^\ell = \exp(i(kjh - \tilde\omega\ell\Delta t))$ in the wave equation scheme gives
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
Can solve for $\tilde\omega$ and then we have an analytical solution of the discrete equations:
$$\tilde\omega = \pm\frac{2}{\Delta t}\arcsin\left(\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}\right)$$
Can assess the accuracy by plotting $\omega - \tilde\omega$
Better: plot $\tilde c = \tilde\omega/k$ as a function of $kh$ for different Courant ($\gamma\Delta t/h$) numbers
Can assess the accuracy by investigating
$$\omega - \tilde\omega = -\frac{1}{24}\gamma k^3(h^2 - \gamma^2\Delta t^2) + O(h^2\Delta t^2, h^4, \Delta t^4)$$
Analysis of difference schemes – p. 153
Stability (1)
We know that the exact solution of the wave equation PDE contains no damping or growth of a wave component, i.e.,
$\omega$ is real
A numerical wave component should exhibit the same qualitative behavior
...it can be slightly damped, but not amplified (otherwise, for sufficiently large $t$ the wave becomes arbitrarily large)
⇒ Should have real $\tilde\omega$ (or a small negative imaginary part only, i.e., slight damping)
The equation for $\tilde\omega$:
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
Analysis of difference schemes – p. 154
Stability (2)
The equation for $\tilde\omega$,
$$\sin\frac{\tilde\omega\Delta t}{2} = \pm\frac{\gamma\Delta t}{h}\sin\frac{kh}{2}$$
can also have complex solutions $\tilde\omega$
Complex $\tilde\omega$ will occur in conjugate pairs, i.e., one root has positive imaginary part, leading to wave growth, and cannot be allowed
⇒ Only real $\tilde\omega$ can be accepted
sin = const × sin ⇒ the const must be in $[-1, 1]$, here
$$\frac{\gamma\Delta t}{h} \le 1 \quad\text{or}\quad \Delta t \le \frac{h}{\gamma}$$
which is the stability criterion
Analysis of difference schemes – p. 155
The effect of round-off errors
Consider the initial condition u(x, 0) = A exp (ikx) for the wave eq.
General solution: $u(x,t) = Ae^{i(kx-\omega t)}$
Perturb the initial condition:
$$u(x,0) = Ae^{ikx} + ae^{i10kx}, \quad a \ll A$$
This always happens on a computer, because of round-off errors
If $\Delta t \le h/\gamma$, no numerical wave components are damped or amplified, and the solution reads
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} + ae^{i(10kjh-\tilde\omega\ell\Delta t)}$$
⇒ Initial perturbation (round-off errors) is not amplified
Analysis of difference schemes – p. 156
Consequences of stability
If ∆t > h/γ, at least one wave component starts to grow in time, i.e.,the initial perturbation is amplified
After some time, the solution is completely nonphysical
When a program gives nonsense solutions, recall that the reason can either be a bug or a too large ∆t!
Analysis of difference schemes – p. 157
Accuracy
The exact and numerical solutions,
$$e^{i(kx-\omega t)} \quad\text{and}\quad e^{i(kx-\tilde\omega t)},$$
have the same basic structure, only the $\omega$ values differ
Can assess the numerical accuracy as $E_\omega = \omega - \tilde\omega$
Can plot $E_\omega$ or make a Taylor series expansion in terms of the grid and physical parameters:
$$E_\omega = -\frac{1}{24}\gamma k^3(h^2 - \gamma^2\Delta t^2) + O(h^2\Delta t^2, h^4, \Delta t^4)$$
Errors go to zero as h2,∆t2
It turns out that Eω = 0 if ∆t = h/γ (!)
Analysis of difference schemes – p. 158
Summary of numerical properties
The wave equation:
Accuracy: $O(h^2, \Delta t^2)$, but exact curves can also be produced
Special result: $C \equiv \gamma\Delta t/h = 1$ implies that the numerical solution is exact
Stability: $\Delta t \le h/\gamma$
The method of analysis applies to linear, homogeneous, time-dependent equations with constant coefficients
Analysis of difference schemes – p. 159
The heat equation; stability
The heat equation
$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}$$
can be discretized by
$$[\delta_t^+ u = \kappa\delta_x\delta_x u]_j^\ell$$
(explicit forward scheme; $u_j^{\ell+1}$ is computed from old values)
Inserting the discrete wave component
$$u_j^\ell = Ae^{i(kjh-\tilde\omega\ell\Delta t)} = A\xi^\ell e^{ikjh}, \quad \xi = e^{-i\tilde\omega\Delta t}$$
in the numerical scheme results in
$$\xi = 1 - \kappa\frac{4\Delta t}{h^2}\sin^2\frac{kh}{2}$$
Analysis of difference schemes – p. 160
Stability criterion
Damping, i.e. no growth, implies |ξ| ≤ 1
With
$$\xi = 1 - \kappa\frac{4\Delta t}{h^2}\sin^2\frac{kh}{2}$$
this leads to
$$\Delta t \le \frac{h^2}{2\kappa}$$
as the stability criterion
Analysis of difference schemes – p. 161
Truncation error
PDE: L(u) = f
Numerical approximation: $u_\Delta$, $L_\Delta(u_\Delta) = f_\Delta$
The truncation error is defined as
$$\tau = L_\Delta(u) - f_\Delta$$
i.e. $\tau$ reflects the residual when the analytical solution is inserted in the numerical scheme
Computational technique: expand the analytical $u(x)$ at $x_{i\pm1}$ in Taylor series about $x_i$:
$$u(x_{i\pm1}) = u(x_i) \pm \left.\frac{\partial u}{\partial x}\right|_{x_i}h + \frac12\left.\frac{\partial^2 u}{\partial x^2}\right|_{x_i}h^2 \pm \frac16\left.\frac{\partial^3 u}{\partial x^3}\right|_{x_i}h^3 + \frac{1}{24}\left.\frac{\partial^4 u}{\partial x^4}\right|_{x_i}h^4 + \cdots$$
Analysis of difference schemes – p. 162
Truncation error; example (1)
Problem:
$$L(u) \equiv -u''(x), \qquad L_\Delta(u_\Delta) \equiv -\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}$$
Don't multiply by $h^2$! ($\tau \sim L_\Delta$)
Insert Taylor expansions:
$$\tau = u''(x_i) + f(x_i) + \frac{1}{12}u''''(x_i)h^2 + O(h^4)$$
PDE is fulfilled pointwise: $u''(x_i) + f(x_i) = 0$
Second order scheme because:
$$\tau = \frac{1}{12}u''''(x_i)h^2 + O(h^4), \qquad \tau \sim O(h^2)$$
Analysis of difference schemes – p. 163
What is the truncation error?
$\tau$ measures the error in the discrete equations when the exact solution of the continuous problem is inserted
That is, $\tau$ measures the error in an equation (this error is called a residual), not the error in the solution $u - u_\Delta$
Hopefully, τ reflects the true error u− u∆
Analysis of difference schemes – p. 164
Truncation error; example (2)
Model problem: the 1D wave equation
$$L(u) \equiv \frac{\partial^2 u}{\partial t^2} - \gamma^2\frac{\partial^2 u}{\partial x^2}$$
$$L_\Delta(u_\Delta) \equiv \frac{u_i^{\ell-1} - 2u_i^\ell + u_i^{\ell+1}}{\Delta t^2} - \gamma^2\frac{u_{i-1}^\ell - 2u_i^\ell + u_{i+1}^\ell}{h^2}$$
$$\tau = \frac{\partial^2 u}{\partial t^2} + \frac{1}{12}\frac{\partial^4 u}{\partial t^4}\Delta t^2 + O(\Delta t^4) - \gamma^2\frac{\partial^2 u}{\partial x^2} - \gamma^2\frac{1}{12}\frac{\partial^4 u}{\partial x^4}h^2 + O(h^4)$$
$$\tau = O(\Delta t^2, h^2)$$
Analysis of difference schemes – p. 165
von Neumann stability analysis (1)
Make an equation for the numerical error (see later)
Seek discrete solution for the error:
$$e_j^\ell = \sum_k \xi^\ell e^{ikjh}$$
Insert $e_j^\ell$ in the error equation, compute for a single component $k$:
$$e_j^\ell = \xi^\ell e^{ikjh}$$
Common stability requirement:
$$|e_j^\ell| < \infty \quad\Rightarrow\quad |\xi| \le 1$$
Gives a condition on ∆t
Very similar to discussing numerical dispersion relations
Analysis of difference schemes – p. 166
von Neumann stability analysis (2)
1D heat equation: $u_t = \kappa u_{xx}$
Perturbed solution $v$: $v_t = \kappa v_{xx}$
Equation for error $e = u - v$: $e_t = \kappa e_{xx}$
Inserting $e_j^\ell = \xi^\ell e^{ikjh}$:
$$\frac{\xi - 1}{\Delta t} = -\frac{4\kappa}{h^2}\sin^2\frac{kh}{2}$$
$|\xi| \le 1 \Rightarrow -1 \le \xi \le 1$:
$$-1 \le \xi = 1 - \frac{4\kappa\Delta t}{h^2}\sin^2\frac{kh}{2} \quad\Rightarrow\quad \Delta t \le \frac{h^2}{2\kappa}$$
1D wave equation: $u_{tt} = c^2u_{xx}$,
$$\Delta t \le \frac{h}{c}$$
Analysis of difference schemes – p. 167
Consistency and convergence
u: solution of continuous problem
u∆: solution of discrete problem
∆: mesh parameter (h,∆t etc)
Definition of consistency: τ → 0 as ∆→ 0
Example:
$$\tau = \frac{1}{12}u''''(x_i)h^2 + O(h^4) \to 0 \quad\text{as } h\to 0$$
Interpretation: the analytical solution fulfills the discrete equations as $\Delta\to 0$
Convergence: u∆ → u as ∆→ 0
τ → 0 does not imply u∆ → u
Analysis of difference schemes – p. 168
Lax’ theorem
Convergence⇔ consistency and stability
⇒ easy tool for proving convergence!
Analysis of difference schemes – p. 169
Intro to finite elements
Intro to finite elements – p. 170
Features of the method
Flexibility
Straightforward handling of complicated geometries
Easy to construct higher-order approximations
Broad spectrum of applications
Popular method for demanding engineering applications
Strong mathematical foundation
Intro to finite elements – p. 171
Basic principles
Finite difference method:
$$\frac{\partial u}{\partial x} \to \frac{u_{i+1} - u_{i-1}}{2h}$$
Finite element method:
$$u(x) \to \hat u(x) = \sum_{j=1}^M u_jN_j(x)$$
$N_j(x)$: prescribed functions
$u_j$: unknown parameters
Need $M$ equations for determining $u_j$
Optimal goal: find $u_j$ such that $\|u - \hat u\|$ is minimized
Realistic goal: find uj such that the residual is small
Intro to finite elements – p. 172
A least-squares method
We look at a general PDE
L(u) = 0, x ∈ Ω
Insert $\hat u = \sum_j u_jN_j$ for $u$, but $L(\hat u)$ is then not zero:
$$L(\hat u) = R \ne 0, \quad R = R(u_1,\dots,u_M;\,\mathbf{x})$$
Idea: adjust $u_1,\dots,u_M$ such that
$$\int_\Omega R^2(u_1,\dots,u_M;\,\mathbf{x})\,d\Omega$$
is minimized with respect to $u_1,\dots,u_M$
Result:
$$\int_\Omega 2R\frac{\partial R}{\partial u_i}\,d\Omega = 0, \quad i = 1,\dots,M$$
⇒ algebraic system of equations for $u_1,\dots,u_M$
Intro to finite elements – p. 173
Least squares: example
Boundary value problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
$N_j(x) = \sin j\pi x$, $\hat u = \sum_j u_j\sin j\pi x$
Boundary conditions are satisfied since each term vanishes at both ends ($\sin(j\pi\cdot 0) = 0$, $\sin(j\pi\cdot 1) = 0$)
Residual:
$$R = \sum_{j=1}^M u_jN_j''(x) + f(x), \qquad \frac{\partial R}{\partial u_i} = N_i''(x)$$
Least-squares equations:
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i''(x)\,dx = 0$$
Intro to finite elements – p. 174
Writing up a system of linear equations
The least-squares equations
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i''(x)\,dx = 0$$
form a system of linear equations
To see this, we write the system in standard form
$$A\mathbf{u} = \mathbf{b} \quad\text{or}\quad \sum_{j=1}^M A_{i,j}u_j = b_i, \quad i = 1,\dots,M$$
Interchange integration and summation, factor $u_j$ out:
$$\sum_{j=1}^M \underbrace{-\left(\int_0^1 N_i''N_j''\,dx\right)}_{A_{i,j}}u_j = \underbrace{\int_0^1 fN_i''\,dx}_{b_i}$$
Intro to finite elements – p. 175
Weighted residual method (WRM)
We look at a general PDE again
L(u) = 0, x ∈ Ω
Insert $\hat u = \sum_j u_jN_j$ for $u$ and obtain a residual $R$:
$$R \equiv L(\hat u), \quad R = R(u_1,\dots,u_M;\,\mathbf{x})$$
Require $R$ to be zero in a weighted mean:
$$\int_\Omega RW_i(\mathbf{x})\,d\Omega = 0, \quad i = 1,\dots,M$$
$W_i(\mathbf{x})$: prescribed weighting functions
⇒ Linear system for $u_1,\dots,u_M$
Galerkin's method: $W_i = N_i$ (common choice – often optimal)
Observation: least squares = WRM with $W_i = \partial R/\partial u_i$
Intro to finite elements – p. 176
Galerkin: example
Boundary value problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
$N_j(x) = \sin j\pi x$
Boundary conditions are satisfied
Residual:
$$R = \sum_{j=1}^M u_jN_j''(x) + f(x)$$
Galerkin equations:
$$\int_0^1\left(\sum_{j=1}^M u_jN_j''(x) + f(x)\right)N_i(x)\,dx = 0$$
Intro to finite elements – p. 177
Writing up the linear system
We interchange summation and integration to write the equations in standard form $\sum_j A_{i,j}u_j = b_i$, such that we can identify the coefficient matrix and the right-hand side (these must be known before we can call up software to solve the linear system)
$$A_{i,j} = -\int_0^1 N_i(x)N_j''(x)\,dx, \qquad b_i = \int_0^1 f(x)N_i(x)\,dx$$
Intro to finite elements – p. 178
Linear algebra interpretation (1)
Recall from linear algebra:
$$(u, v) = 0\ \ \forall v\in V \quad\Leftrightarrow\quad u = 0 \text{ or } u\perp V\ (u\notin V)$$
Define function space $V$ spanned by
$$B = \{N_1, N_2, \dots, N_M\}$$
Define inner product
$$(u, v) = \int_\Omega uv\,d\Omega$$
With these definitions we can redefine/interpret the Galerkin methodas a “geometric” or linear algebra approach
Intro to finite elements – p. 179
Linear algebra interpretation (2)
Galerkin's method: find
$$\hat u = \sum_{j=1}^M u_jN_j \in V$$
such that the resulting residual is orthogonal to $V$:
$$(R, v) = 0\ \ \forall v\in V \quad(\Rightarrow R\notin V \text{ or } R\equiv 0)$$
We hope that $V$ contains the "important" functions in the problem ⇒ $R$ is "small"
Hope for convergence: $\hat u\to u$ as $M\to\infty$, when $R\to 0$
Intro to finite elements – p. 180
Collocation methods
Force the residual to vanish at M distinct points
Collocation:
R(u1, . . . , uM ; x[i]) = 0, i = 1, . . . ,M
Cf. the finite difference method: the PDE is fulfilled at M points
Note: collocation also arises from WRM with

W_i(x) = \delta(x - x^{[i]})

(Recall: \int_{-\infty}^{\infty} f(x)\,\delta(x - x_0)\,dx = f(x_0))
Subdomain collocation:

\int_{\Omega_i} R \, d\Omega = 0, \quad i = 1, \ldots, M, \quad \Omega = \cup_{i=1}^M \Omega_i
Intro to finite elements – p. 181
A worked example (1)
Problem:
−u′′(x) = f(x), x ∈ (0, 1), u(0) = u(1) = 0
Approximation:
u(x) \approx \hat u(x) = \sum_{j=1}^M u_j N_j(x)

Force boundary conditions:

N_j(0) = N_j(1) = 0, \quad j = 1, \ldots, M
Intro to finite elements – p. 182
A worked example (2)
Choices of Nj(x):
Nj(x) = sin jπx
Nj(x) = xj(1− x)
Least squares:
-\sum_{j=1}^M \Big( \int_0^1 N_i''(x) N_j''(x)\,dx \Big) u_j = \int_0^1 f(x) N_i''(x)\,dx
Intro to finite elements – p. 183
A worked example (3)
Galerkin’s method:
-\sum_{j=1}^M \Big( \int_0^1 N_i(x) N_j''(x)\,dx \Big) u_j = \int_0^1 f(x) N_i(x)\,dx

Observation: N_i = \sin i\pi x \ \Leftrightarrow\ A_{i,j} = 0, \ i \neq j
Therefore (Galerkin and least squares):

u_j = \frac{2}{\pi^2 j^2} \int_0^1 f(x) \sin j\pi x \, dx
No need for solving linear systems!
Intro to finite elements – p. 184
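The formula above is easy to check on a computer. A minimal NumPy sketch (the choice f = 1, with exact solution u = x(1-x)/2, is an assumption made for the test; the trapezoidal rule is an implementation choice):

```python
import numpy as np

def sine_galerkin(f, M, npts=4001):
    """Coefficients u_j = 2/(pi^2 j^2) int_0^1 f(x) sin(j pi x) dx (trapezoidal rule)."""
    x = np.linspace(0.0, 1.0, npts)
    w = np.full(npts, x[1] - x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights
    return np.array([2.0/(np.pi*j)**2 * np.sum(w*f(x)*np.sin(j*np.pi*x))
                     for j in range(1, M + 1)])

def evaluate(xp, coeffs):
    return sum(c*np.sin(j*np.pi*xp) for j, c in enumerate(coeffs, start=1))

# -u'' = 1, u(0) = u(1) = 0 has exact solution u = x(1-x)/2
coeffs = sine_galerkin(lambda x: np.ones_like(x), M=100)
uh05 = evaluate(0.5, coeffs)
print(uh05)   # the exact value at x = 0.5 is 0.125
```

Note that no linear system is solved: the sine basis diagonalizes the problem, so each u_j comes from a single integral.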
A worked example (4)
Collocation:
The residual is forced to vanish at M points

x^{[1]}, \ldots, x^{[i]}, \ldots, x^{[M]}

called the collocation points
Equivalent view: use \hat u in the PDE and require the PDE to be fulfilled at the collocation points:

-\sum_{j=1}^M N_j''(x^{[i]}) u_j = f(x^{[i]}), \quad i = 1, \ldots, M

which is a linear system with A_{i,j} = -N_j''(x^{[i]}) as coefficient matrix and b_i = f(x^{[i]}) as right-hand side
Intro to finite elements – p. 185
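The collocation system above is quickly set up numerically. A sketch (f = 1 and equally spaced interior collocation points are assumptions for illustration; the exact solution is u = x(1-x)/2):

```python
import numpy as np

M = 15
xc = np.arange(1, M + 1)/(M + 1.0)      # interior collocation points x^[i]
j = np.arange(1, M + 1)
# A[i,j] = -N_j''(x^[i]) = (j pi)^2 sin(j pi x^[i]),  b_i = f(x^[i]) = 1
A = (j*np.pi)**2 * np.sin(np.pi*np.outer(xc, j))
b = np.ones(M)
u = np.linalg.solve(A, b)
u05 = float(np.sin(0.5*np.pi*j) @ u)    # evaluate u-hat(0.5); exact u(0.5) = 0.125
print(u05)
```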
A worked example (5)
With N_j = \sin j\pi x the coefficient matrix A_{i,j} = -N_j''(x^{[i]}) is in general full (dense), e.g., with x^{[i]} = (i-1)h:

A_{i,j} = (j\pi)^2 \sin[j\pi(i-1)h], \quad i = 1, \ldots, M, \quad h = 1/(M-1)
Intro to finite elements – p. 186
Ill-conditioning
u'' = 1, \quad u(0) = u(1) = 0 \quad\Rightarrow\quad u = -x(1-x)/2

\hat u = \sum_{j=1}^M u_j x^j(1-x) contains the exact solution
Galerkin gives u_1 = -1/2, u_j = 0, j > 1
On the computer (with 6 digits):

M = 2: (-.50000, .00001)
M = 4: (-.50001, .00039, -.00079, .00048)
M = 6: (-.50072, .01296, -.07323, .17561, -.18727, .073122)
M = 8: (-.49977, -.00510, .01485, .11669, -.73884, 1.56298, ...)

⇒ The method does not converge!
N_j = x^j(1-x) are almost linearly dependent for j > 5
⇒ Ill-conditioned coefficient matrix; round-off errors accumulate
Intro to finite elements – p. 187
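The breakdown can be reproduced in a few lines. A sketch, assuming the Galerkin formulation with integration by parts, \sum_j (\int_0^1 N_i' N_j'\,dx) u_j = -\int_0^1 N_i\,dx for u'' = 1 (the trapezoidal quadrature is an implementation choice):

```python
import numpy as np

def stiffness(M, npts=20001):
    """A_ij = int_0^1 N_i' N_j' dx for the basis N_j = x^j (1 - x)."""
    x = np.linspace(0.0, 1.0, npts)
    # N_j'(x) = j x^(j-1) - (j+1) x^j
    dN = np.array([j*x**(j-1) - (j+1)*x**j for j in range(1, M + 1)])
    w = np.full(npts, x[1]-x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights
    return (dN*w) @ dN.T

def solve(M):
    i = np.arange(1, M + 1)
    b = -(1.0/(i+1) - 1.0/(i+2))     # b_i = -int_0^1 N_i dx, from u'' = 1
    A = stiffness(M)
    return A, np.linalg.solve(A, b)

A2, u2 = solve(2)
A8, u8 = solve(8)
print(u2)                             # close to (-1/2, 0): exact solution is -N_1/2
print(np.linalg.cond(A2), np.linalg.cond(A8))   # condition number explodes with M
```

The condition number grows rapidly with M, which is exactly why the computed coefficients for M = 6, 8 are garbage even though the exact answer lies in the trial space.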
Cure for ill-conditioning
Choose orthogonal or nearly orthogonal Ni
Fourier series provide orthogonal Ni
Generalized Fourier series, using e.g. Legendre polynomials, Bessel functions, Laguerre polynomials, etc., also give orthogonal N_i
Finite elements provide nearly orthogonal Ni
Intro to finite elements – p. 188
A new view on Fourier series (1)
We can interpret Fourier series, e.g.,
u \approx \hat u = \sum_{j=1}^{\infty} u_j \sin j\pi x \ \text{on}\ (0,1), \quad u_j = 2\int_0^1 u(x) \sin j\pi x \, dx

as a special case of a Galerkin or least-squares method
Solve u = f by an approximate method

u \approx \hat u = \sum_j u_j N_j

Residual: R = \hat u - f = \sum_j u_j N_j - f
Least squares and Galerkin:

\sum_{j=1}^M \Big( \int_\Omega N_i N_j \, d\Omega \Big) u_j = \int_\Omega f N_i \, d\Omega, \quad i = 1, \ldots, M
Intro to finite elements – p. 189
A new view on Fourier series (2)
Let Ω = (0, 1), choose Ni = sin iπx
Since these N_i are orthogonal, \int_0^1 N_i N_j \, dx = \frac{1}{2}\delta_{ij}
Hence, the matrix becomes diagonal, and we can solve the system by hand:

u_j = 2\int_0^1 f(x) \sin j\pi x \, dx

These are the well-known coefficients in the Fourier sine series of f(x)
⇒ Fourier series can be viewed as a least-squares or Galerkin method
Intro to finite elements – p. 190
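The diagonal Galerkin system can be checked numerically. A sketch, with f(x) = x as an assumed test function (its sine coefficients 2(-1)^{j+1}/(j\pi) are classical):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
w = np.full(x.size, x[1]-x[0]); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal rule
f = x                                                        # approximate f(x) = x

j = np.arange(1, 11)
# Diagonal Galerkin system: (1/2) u_j = int_0^1 f sin(j pi x) dx
uj = 2.0*np.array([np.sum(w*f*np.sin(jj*np.pi*x)) for jj in j])
exact = 2.0*(-1.0)**(j + 1)/(j*np.pi)    # known sine coefficients of f(x) = x
err = np.max(np.abs(uj - exact))
print(err)
```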
Treatment of boundary conditions (1)
u = 0 on \partial\Omega \ \Rightarrow\ N_i = 0 on \partial\Omega
u = \psi(x) on \partial\Omega:

\hat u = \psi(x) + \sum_{j=1}^M u_j N_j(x), \quad N_j = 0 \ \text{on}\ \partial\Omega

where \psi equals the prescribed boundary values on \partial\Omega
Example: u(0) = U_L, u(1) = U_R; choose \psi = x U_R + (1-x) U_L
Note: ψ is not uniquely determined
Intro to finite elements – p. 191
Treatment of boundary conditions (2)
What about u(0) = UL and u′(1) = β?
u(x) \approx \hat u(x) = U_L + \sum_{j=1}^M u_j N_j(x), \quad N_j(0) = 0

Integration by parts:

-\int_0^1 \hat u'' N_i \, dx = \int_0^1 f N_i \, dx \quad \text{(Galerkin)}

\int_0^1 \hat u' N_i' \, dx - [\hat u' N_i]_0^1 = \int_0^1 f N_i \, dx

\int_0^1 \hat u' N_i' \, dx = \int_0^1 f N_i \, dx + \beta N_i(1)

Recall that N_i(0) = 0 due to u(0) = U_L. Must have N_i(1) \neq 0.
Remark: could add an equation/constraint for u(0) = U_L, e.g.,

\sum_{j=1}^M u_j N_j(0) = U_L

and get rid of \psi and the requirement N_j(0) = 0; this is what we will do later (!)
Intro to finite elements – p. 192
Advantages of integration by parts
A way to incorporate derivative BCs
The physical derivative BCs naturally arise from the integration by parts (because the BC is closely related to the PDE, e.g., -(\lambda u')' = f has the derivative condition -\lambda u' = \cdots)
Symmetric matrix:

\int N_i' N_j' \, dx
Lower continuity requirements on Ni
Intro to finite elements – p. 193
Essential and natural boundary conditions
u = U_L must be enforced in \hat u \ \Rightarrow\ essential boundary conditions
u' = \beta appears naturally in the formulas \ \Rightarrow\ natural boundary conditions
Observation: forgetting the boundary term implies u' = 0
Intro to finite elements – p. 194
Multidimensional problems
Boundary-value problem
-\nabla\cdot[k(x)\nabla u(x)] = f(x), \quad x \in \Omega,

-k(x)\frac{\partial u}{\partial n} = g(x), \quad x \in \partial\Omega_N,

u(x) = \psi(x), \quad x \in \partial\Omega_E.

Expansion:

u(x) \approx \hat u(x) = \psi(x) + \sum_{j=1}^M u_j N_j(x)

with N_j = 0 on \partial\Omega_E
Intro to finite elements – p. 195
Multidimensional integration by parts
Integration by parts lemma:
-\int_\Omega \nabla\cdot[k\nabla u]\, W_i \, d\Omega = \int_\Omega k\nabla u \cdot \nabla W_i \, d\Omega - \int_{\partial\Omega} W_i\, k \frac{\partial u}{\partial n} \, d\Gamma

Galerkin's method:

\sum_{j=1}^M \Big( \int_\Omega k(x)\nabla N_i \cdot \nabla N_j \, d\Omega \Big) u_j = \int_\Omega f(x) N_i \, d\Omega - \int_{\partial\Omega_N} g(x) N_i \, d\Gamma - \int_\Omega k\nabla\psi \cdot \nabla N_i \, d\Omega

Common physical flux conditions appear as natural boundary conditions
Intro to finite elements – p. 196
2D example
Ω = (0, 1)× (0, 1)
Expansion:
\hat u = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} u_{i,j} N_{i,j}, \quad N_{i,j} = \sin i\pi x_1 \sin j\pi x_2

Weighting functions (Galerkin): N_{k,\ell}
Linear system:

\sum_i \sum_j A_{i,j,k,\ell}\, u_{i,j} = b_{k,\ell}

For implementation: convert the double indices to a single index,

N_{(j-1)n+i}(x_1, x_2) = \sin i\pi x_1 \sin j\pi x_2, \quad i, j = 1, \ldots, n, \quad M = n^2
Intro to finite elements – p. 197
Time-dependent problems (1)
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(c^2\nabla u), \quad x \in \Omega,\ t > 0

u(x, 0) = f(x), \quad x \in \Omega

\frac{\partial}{\partial t} u(x, 0) = 0, \quad x \in \Omega

\frac{\partial u}{\partial n} = 0, \quad x \in \partial\Omega

Finite differences in time:

\frac{\partial^2}{\partial t^2} u(x, t_\ell) = \frac{u^{\ell-1} - 2u^\ell + u^{\ell+1}}{\Delta t^2} + O(\Delta t^2)

Spatial problem at each time level:

u^{\ell+1}(x) = 2u^\ell(x) - u^{\ell-1}(x) + (c\Delta t)^2 \nabla^2 u^\ell(x)
Intro to finite elements – p. 198
Time-dependent problems (2)
Initial condition: u^0 = f(x)

\frac{\partial u}{\partial t} = 0 \quad\rightarrow\quad \frac{u^1 - u^{-1}}{2\Delta t} = 0

(can develop a special formula for u^{-1} such that the main scheme can be used for \ell = 0, 1, 2, \ldots, cf. the method for the 1D wave equation)
The spatial variation of u, u^\ell(x), is expanded in the standard way:

u^\ell(x) \approx \hat u^\ell = \sum_{j=1}^M u_j^\ell N_j(x), \quad \ell = -1, 0, 1, 2, \ldots
Intro to finite elements – p. 199
Summary of the time-discrete equations
u^0 = f(x), \quad x \in \Omega

u^{-1} = u^0 + \frac{1}{2} c^2 \Delta t^2 \nabla^2 u^0, \quad x \in \Omega

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2 \Delta t^2 \nabla^2 u^\ell, \quad x \in \Omega, \quad \ell = 0, 1, \ldots

\frac{\partial u^\ell}{\partial n} = 0, \quad x \in \partial\Omega, \quad \ell = 0, 1, \ldots
Intro to finite elements – p. 200
Weighted residual methods
All the equations are of the form "u = g", where g is known and u^\ell \approx \hat u^\ell = \sum_j u_j^\ell N_j
Could, in principle, start with an analytical f(x) and analytically derive u^1, u^2 and so on, but these expressions become complicated
Working with \hat u^\ell = \sum_j u_j^\ell N_j instead allows easy update of u_1^\ell, \ldots, u_M^\ell
Galerkin method for u = g: insert \hat u^\ell, multiply by N_i and integrate:

\int_\Omega N_i \sum_j u_j^\ell N_j \, d\Omega = \int_\Omega N_i\, g \, d\Omega

giving a linear system (!):

\sum_{j=1}^M \Big( \int_\Omega N_i N_j \, d\Omega \Big) u_j^\ell = \int_\Omega g N_i \, d\Omega, \quad i = 1, \ldots, M
Intro to finite elements – p. 201
Treatment of the right-hand sides
When g contains \nabla^2 u^\ell we integrate by parts on the right-hand side
Example:

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell

\int_\Omega u^{\ell+1} N_i \, d\Omega = \int_\Omega (2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell) N_i \, d\Omega

Integration of \int N_i \nabla^2 u \, d\Omega by parts:

\int_\Omega u^{\ell+1} N_i \, d\Omega = \int_\Omega (2u^\ell N_i - u^{\ell-1} N_i - c^2\Delta t^2 \nabla N_i \cdot \nabla u^\ell) \, d\Omega

(the surface integral vanishes since \partial u^\ell/\partial n = 0)
Expand u to a sum and identify the coefficient matrix and the right-hand side
Intro to finite elements – p. 202
Spatial problems
\sum_{j=1}^M M_{i,j} u_j^0 = \int_\Omega f(x) N_i \, d\Omega,

\sum_{j=1}^M M_{i,j} u_j^{-1} = \int_\Omega \Big[ \hat u^0 N_i - \frac{1}{2}(c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^0 \Big] d\Omega + \frac{1}{2}\Delta t^2 \int_{\partial\Omega} c^2 \frac{\partial f}{\partial n} N_i \, d\Gamma,

\sum_{j=1}^M M_{i,j} u_j^{\ell+1} = \int_\Omega \Big[ \big(2\hat u^\ell(x) - \hat u^{\ell-1}(x)\big) N_i - (c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^\ell(x) \Big] d\Omega,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega

is the mass matrix
Intro to finite elements – p. 203
Matrix notation
Can introduce matrix-vector notation; M = \{M_{i,j}\}:

M u^0 = f

M u^{-1} = M u^0 - \frac{1}{2} K u^0 + f_n

M u^{\ell+1} = 2M u^\ell - M u^{\ell-1} - K u^\ell
Intro to finite elements – p. 204
Finite elements
Ni: piecewise polynomials
Example: piecewise linear N_i gives a piecewise linear \hat u = \sum_j u_j N_j

(figure: a piecewise linear approximation \hat u(x))

Define elements \Omega_e and nodes x^{[i]}
Definition of N_i:
1. polynomial over each element
2. N_i(x^{[j]}) = \delta_{ij}; 1 if i = j and 0 if i \neq j
This is the type of Ni we use in the finite element method
Intro to finite elements – p. 205
A nice feature of finite element Ni
Property 2,
N_i(x^{[j]}) = \delta_{ij}; \quad \text{1 if } i = j \text{ and 0 if } i \neq j

implies that u_j is the value of \hat u at node j
Proof:

\hat u(x^{[i]}) = \sum_j u_j N_j(x^{[i]}) = u_i

This interpretation of u_i is very convenient for practical work, implementation, and comparison with finite difference methods
Intro to finite elements – p. 206
Piecewise linear Ni
(figure: piecewise linear basis functions over elements with nodes 1-7 along the x axis)
Each element has 2 nodes
Intro to finite elements – p. 207
Quadratic basis functions
Each element has three nodes
Intro to finite elements – p. 208
Essential boundary conditions
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
With u_i = u(x^{[i]}) we can construct \psi in a general way:

\psi(x) = u_L N_1(x) + u_R N_n(x)

\hat u(x) = \psi(x) + \sum_{j=2}^{n-1} u_j N_j(x)

In general (2D/3D): B = boundary nodes with essential conditions, I = internal nodes

\hat u = \sum_{j\in B} u_j N_j + \sum_{j\in I} u_j N_j

Only u_j, j \in I, enter the linear system as unknowns
Intro to finite elements – p. 209
A worked example (1)
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
Galerkin’s method:
\sum_{j=1}^n A_{i,j} u_j = b_i, \quad i = 1, \ldots, n

A_{i,j} = \int_0^1 N_i'(x) N_j'(x) \, dx, \quad b_i = \int_0^1 f(x) N_i(x) \, dx

Observation: N_i(x) and N_i'(x) vanish over large parts of the domain ("nearly" orthogonal functions)
A_{i,j} \neq 0 only for j = i-1, i, i+1
Intro to finite elements – p. 210
A worked example (2)
Computations:
A_{i,i-1} = \int_0^1 N_{i-1}' N_i' \, dx = -\frac{1}{h}, \quad A_{i,i} = \int_0^1 N_i' N_i' \, dx = \frac{2}{h}

A_{i,i+1} = \int_0^1 N_i' N_{i+1}' \, dx = -\frac{1}{h}, \quad A_{1,1} = A_{n,n} = \frac{1}{h}

A_{1,2} = A_{n,n-1} = -\frac{1}{h}, \quad b_i = \int_0^1 f(x) N_i(x) \, dx
Intro to finite elements – p. 211
A worked example (3)
Numerical integration, trapezoidal rule:
\int_0^1 f(x) N_i(x) \, dx \approx \frac{1}{2} f(x^{[1]}) N_i(x^{[1]}) h + \sum_{j=2}^{n-1} f(x^{[j]}) N_i(x^{[j]}) h + \frac{1}{2} f(x^{[n]}) N_i(x^{[n]}) h

= \frac{1}{2} f_1 \delta_{1i} h + \sum_{j=2}^{n-1} f_j \delta_{ij} h + \frac{1}{2} f_n \delta_{in} h

where f_i \equiv f(x^{[i]}) (FDM-inspired notation)
For 2 \leq i \leq n-1 the integral becomes f_i h (as with finite differences!!)
For i = 1, n we get \frac{1}{2} f_1 h and \frac{1}{2} f_n h
Intro to finite elements – p. 212
Piecewise constant N ′i(x)
(figure: the piecewise constant derivatives N_i'(x) over elements with nodes 1-7 along the x axis)
Intro to finite elements – p. 213
The resulting equations
Replace eq. no. 1 and n by boundary conditions
u1 = uL, un = uR
The linear system:
u_1 = u_L,

-\frac{1}{h} u_{i-1} + \frac{2}{h} u_i - \frac{1}{h} u_{i+1} = f(x^{[i]})\, h, \quad i = 2, \ldots, n-1,

u_n = u_R
Same result as from the finite difference method!
Exact or more accurate numerical integration: different right-handside term
Intro to finite elements – p. 214
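The tridiagonal system above is easy to set up and solve directly. A sketch (the test case f = 2 with u_L = u_R = 0 is an assumption; for it the nodal values x_i(1-x_i) come out exactly, since the truncation error of the scheme involves the fourth derivative of u, which vanishes):

```python
import numpy as np

n = 11                                # nodes x_1..x_n on [0,1]
h = 1.0/(n - 1)
x = np.linspace(0.0, 1.0, n)
f = lambda xi: 2.0                    # -u'' = 2, u(0) = u(1) = 0  ->  u = x(1-x)

A = np.zeros((n, n)); b = np.zeros(n)
for i in range(1, n - 1):             # interior equations
    A[i, i-1] = -1.0/h; A[i, i] = 2.0/h; A[i, i+1] = -1.0/h
    b[i] = h*f(x[i])                  # trapezoidal-rule load: f(x_i) h
A[0, 0] = 1.0; b[0] = 0.0             # essential BC u_1 = uL
A[-1, -1] = 1.0; b[-1] = 0.0          # essential BC u_n = uR

u = np.linalg.solve(A, b)
err = np.max(np.abs(u - x*(1.0 - x)))
print(err)                            # nodal values are exact for this f
```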
Element by element computations (1)
Split integral into a sum over each element:
A_{i,j} = \int_0^1 N_i' N_j' \, dx = \sum_{e=1}^m A^{(e)}_{i,j}, \quad A^{(e)}_{i,j} = \int_{\Omega_e} N_i' N_j' \, dx

b_i = \int_0^1 f N_i \, dx = \sum_{e=1}^m b^{(e)}_i, \quad b^{(e)}_i = \int_{\Omega_e} f N_i \, dx

A^{(e)}_{i,j} \neq 0 iff i and j are nodes in element e
b^{(e)}_i \neq 0 iff i is a node in element e
Collect the nonzero A^{(e)}_{i,j} in a 2 \times 2 element matrix:

A^{(e)}_{r,s}, \quad r, s = 1, 2, \quad r, s: \text{local node numbers}
Intro to finite elements – p. 215
Element by element computations (2)
Similar strategy for b^{(e)}_i; we collect the nonzero entries on element e in b^{(e)}_r, with r = 1, 2 counting local node numbers
Algorithm: run through all elements, compute A^{(e)}_{r,s} and b^{(e)}_r, and combine all element matrices and vectors into a linear system
The local nature of N_i gives a method where one can compute just a few numbers for an element, independent of the other elements
The result is a sparse matrix and the possibility of performing the elementwise computations in parallel
Intro to finite elements – p. 216
Local coordinates (1)
Map element \Omega_e = [x^{[e]}, x^{[e+1]}] to [-1, 1]
Formula: local \xi \in [-1, 1] to global x,

x^{(e)}(\xi) = \frac{1}{2}\big(x^{[e]} + x^{[e+1]}\big) + \frac{1}{2}\xi\big(x^{[e+1]} - x^{[e]}\big)

Define N_i in local \xi coordinates

(figure: the local basis functions N_1(\xi) and N_2(\xi) on [-1, 1])

Perform all computations in local coordinates
Local node r (= 1, 2) in element e corresponds to global node i = q(e, r)
Intro to finite elements – p. 217
Local coordinates (2)
Local linear basis functions:
N_1(\xi) = \frac{1}{2}(1 - \xi), \quad N_2(\xi) = \frac{1}{2}(1 + \xi)

In general: can always compute the finite element equations with such local basis functions in a reference element of fixed size
We need to change variables:

x = \frac{1}{2}\big(x^{[e]} + x^{[e+1]}\big) + \frac{1}{2}\xi\big(x^{[e+1]} - x^{[e]}\big) \quad\Rightarrow\quad \frac{dx}{d\xi} = \frac{h}{2}
Intro to finite elements – p. 218
Local coordinates (3)
In the integral:
\int_{x^{[e]}}^{x^{[e+1]}} N_i'(x) N_j'(x) \, dx = \int_{-1}^{1} \frac{dN_r(\xi)}{d\xi}\frac{d\xi}{dx}\, \frac{dN_s(\xi)}{d\xi}\frac{d\xi}{dx} \left|\frac{dx}{d\xi}\right| d\xi = \int_{-1}^{1} \Big(\frac{2}{h}\Big)^2 \frac{dN_r(\xi)}{d\xi} \frac{dN_s(\xi)}{d\xi}\, \frac{h}{2} \, d\xi
Intro to finite elements – p. 219
Local coordinates (3)
The variable transformation can be expressed in general formulasapplicable to general finite element problems in 1D, 2D and 3D
General (isoparametric) mapping:
x^{(e)}(\xi) = \sum_{r=1}^{n_e} N_r(\xi)\, x^{[q(e,r)]}

(specializes to the previous formula for linear N_r)
Change integration variable from x to \xi:

\int_{x^{[e]}}^{x^{[e+1]}} N_i'(x) N_j'(x) \, dx = \int_{-1}^{1} J^{-1}\frac{dN_r(\xi)}{d\xi}\, J^{-1}\frac{dN_s(\xi)}{d\xi}\, \det J \, d\xi

We often write

\int_{\Omega_e} \frac{dN_r}{dx}\frac{dN_s}{dx} \det J \, d\xi

as the expression in local coordinates, knowing that

\frac{dN_r}{dx} = J^{-1}\frac{dN_r}{d\xi} = \frac{2}{h}\frac{dN_r}{d\xi}
Intro to finite elements – p. 220
Local coordinates (4)
−u′′(x) = f(x)
Jacobian matrix of the mapping: J (1 \times 1 in 1D)
Uniform partition in 1D: J = h/2
Element matrix and vector:

A^{(e)}_{r,s} = \int_{-1}^{1} \frac{2}{h} N_r'(\xi)\, \frac{2}{h} N_s'(\xi)\, \frac{h}{2} \, d\xi

b^{(e)}_r = \int_{-1}^{1} f(x^{(e)}(\xi))\, N_r(\xi)\, \frac{h}{2} \, d\xi

Example: r = s = 1,

A^{(e)}_{1,1} = \frac{2}{h}\int_{-1}^{1} \Big(-\frac{1}{2}\Big)\Big(-\frac{1}{2}\Big) d\xi = \frac{1}{h}
Intro to finite elements – p. 221
Results
Boundary-value problem
−u′′ = f, x ∈ (0, 1), u(0) = uL, u(1) = uR
Element matrix and vector:

A^{(e)}_{r,s} = \frac{1}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \quad b^{(e)}_r = \frac{h}{2}\begin{pmatrix} f(x^{(e)}(-1)) \\ f(x^{(e)}(1)) \end{pmatrix}

where numerical integration is used:

\int_{-1}^{1} g(\xi) \, d\xi \approx g(-1) + g(1)
Intro to finite elements – p. 222
Essential boundary conditions
Incorporate essential boundary conditions at the element level
Element level equations:

\sum_{s=1}^{2} A^{(e)}_{r,s} u_s = b^{(e)}_r, \quad r = 1, 2

Example: essential condition u_1 = u_L
Replace eq. no. 1 by u_1 = u_L:

u_1 = u_L

A^{(e)}_{2,1} u_1 + A^{(e)}_{2,2} u_2 = b^{(e)}_2

Modify the element matrix and vector:

\begin{pmatrix} 1 & 0 \\ -\frac{1}{h} & \frac{1}{h} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} u_L \\ \frac{h}{2} f(x^{(1)}(1)) \end{pmatrix}
Intro to finite elements – p. 223
Symmetric element/coefficient matrix
The element matrix is actually symmetric
The essential BC modification makes the symmetric matrix nonsymmetric
Symmetrization: subtract column 1 in A(e) times uL from b(e), thenreplace eq. no. 1
This modification preserves the symmetry property of the elementmatrix and the resulting coefficient matrix
A symmetric coeff. matrix gives less storage and enables applicationof some efficient iterative solution methods
Intro to finite elements – p. 224
Numerical integration
Integration rules are normally tabulated for integrals on [−1, 1]:
\int_{-1}^{1} g(\xi) \, d\xi \approx \sum_{k=1}^{n_I} g(\xi_k) w_k

\xi_k: integration points
w_k: integration weights
Some rules integrating polynomials of degree p exactly:

name           | n_I | p | weights         | points
Gauss-Legendre | 1   | 1 | (2)             | (0)
Gauss-Legendre | 2   | 3 | (1, 1)          | (-1/\sqrt{3}, 1/\sqrt{3})
Gauss-Legendre | 3   | 5 | (5/9, 8/9, 5/9) | (-\sqrt{3/5}, 0, \sqrt{3/5})
Gauss-Lobatto  | 2   | 1 | (1, 1)          | (-1, 1)
Gauss-Lobatto  | 3   | 3 | (1/3, 4/3, 1/3) | (-1, 0, 1)
Intro to finite elements – p. 225
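The tabulated rules can be spot-checked by integrating polynomials of the stated degree p, for which the rules are exact; a small sketch:

```python
import numpy as np

rules = {
    "GL1":  (np.array([2.0]), np.array([0.0])),
    "GL2":  (np.array([1.0, 1.0]), np.array([-1.0, 1.0])/np.sqrt(3.0)),
    "GL3":  (np.array([5, 8, 5])/9.0, np.array([-1.0, 0.0, 1.0])*np.sqrt(3.0/5.0)),
    "GLo3": (np.array([1, 4, 1])/3.0, np.array([-1.0, 0.0, 1.0])),
}

def integrate(rule, g):
    w, xi = rules[rule]
    return float(np.sum(w*g(xi)))

# Gauss-Legendre with n_I points is exact up to degree 2 n_I - 1;
# 3-point Gauss-Lobatto is exact up to degree 3
v1 = integrate("GL2",  lambda t: t**3 + t**2 + 1)   # exact: 8/3
v2 = integrate("GL3",  lambda t: t**5 + t**4)       # exact: 2/5
v3 = integrate("GLo3", lambda t: t**3 + t**2)       # exact: 2/3
print(v1, v2, v3)
```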
Assembly
Element matrices and vectors must be assembled (added) into the global system of linear equations
Essential: the local-to-global mapping q(e, r): local node r in element e has global node number q(e, r)
In 1D, q(e, r) = e - 1 + r, but in 2D/3D the grid is more complicated and q is just a table
Algorithm:
A_{q(e,r),q(e,s)} := A_{q(e,r),q(e,s)} + A^{(e)}_{r,s}, \quad r, s = 1, 2

b_{q(e,r)} := b_{q(e,r)} + b^{(e)}_r, \quad r = 1, 2
Intro to finite elements – p. 226
Illustration of the assembly process
(figure: element matrices being added into the global matrix via the mapping q(e, r))
Intro to finite elements – p. 227
Summing up the procedures
Weighted residual formulation, often with Galerkin's choice W_i = N_i
Integration by parts
Derivative boundary conditions in boundary terms
Compute element matrices and vectors:
  local coordinates with local numbering
  numerical integration
  enforce essential boundary conditions
  assemble local contributions
Solve linear system
Intro to finite elements – p. 228
Generality
−u′′ = f , u(0) = uL, u(1) = uR
is just an example
The algorithm works in 1D, 2D, 3D
Complicated geometries can be handled
Element shapes in 2D: triangles, quadrilaterals
Element shapes in 3D: boxes, tetrahedra
Time dependency: "time loop outside a stationary solver"
Intro to finite elements – p. 229
Nonconstant element size
Trivial to work with varying element size
Just replace h by h_e in the formulas from element e
Result in the model problem:

u_1 = u_L

-\frac{1}{h_{i-1}} u_{i-1} + \Big(\frac{1}{h_{i-1}} + \frac{1}{h_i}\Big) u_i - \frac{1}{h_i} u_{i+1} = \frac{1}{2}(h_{i-1} + h_i) f(x^{[i]}), \quad i = 2, \ldots, n-1,

u_n = u_R
Varying element shape and size is straightforward in multi-dimensional problems
Intro to finite elements – p. 230
The elementwise algorithm (1)
initialize global linear system:
  set A_{i,j} = 0 for i, j = 1, ..., n
  set b_i = 0 for i = 1, ..., n
loop over all elements: for e = 1, ..., m
  set A^{(e)}_{r,s} = 0, r, s = 1, ..., n_e
  set b^{(e)}_r = 0, r = 1, ..., n_e
  loop over numerical integration points: for k = 1, ..., n_I
    evaluate N_r(\xi_k), the derivatives of N_r wrt. \xi and x, and J
    add the contribution from the current integration point to the element matrix and vector:
    for r = 1, ..., n_e
      for s = 1, ..., n_e
        A^{(e)}_{r,s} := A^{(e)}_{r,s} + \frac{dN_r}{dx}\frac{dN_s}{dx} \det J\, w_k
      b^{(e)}_r := b^{(e)}_r + f(x^{(e)}(\xi_k)) N_r \det J\, w_k
Intro to finite elements – p. 231
The elementwise algorithm (2)
  incorporate essential boundary conditions: for r = 1, ..., n_e
    if node r has an essential boundary condition then
      modify A^{(e)}_{r,s} and b^{(e)}_r due to this condition
  assemble: for r = 1, ..., n_e
    for s = 1, ..., n_e
      A_{q(e,r),q(e,s)} := A_{q(e,r),q(e,s)} + A^{(e)}_{r,s}
    b_{q(e,r)} := b_{q(e,r)} + b^{(e)}_r
Intro to finite elements – p. 232
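The elementwise algorithm above can be sketched compactly in Python. This is a minimal sketch for -u'' = f with linear elements; the test case f = \pi^2 \sin(\pi x) with exact solution \sin(\pi x) is an assumption for illustration:

```python
import numpy as np

def fem1d(f, m, uL=0.0, uR=0.0):
    """Elementwise FE algorithm for -u'' = f on (0,1) with m linear elements."""
    n = m + 1
    xn = np.linspace(0.0, 1.0, n)
    A = np.zeros((n, n)); b = np.zeros(n)
    gauss = [(-1/np.sqrt(3.0), 1.0), (1/np.sqrt(3.0), 1.0)]  # 2-point Gauss rule
    N  = lambda xi: np.array([0.5*(1 - xi), 0.5*(1 + xi)])   # local basis on [-1,1]
    dN = np.array([-0.5, 0.5])                               # dN_r/dxi (constant)
    for e in range(m):
        xL, xR = xn[e], xn[e+1]; detJ = (xR - xL)/2.0        # dx/dxi = h/2
        Ae = np.zeros((2, 2)); be = np.zeros(2)
        for xi, w in gauss:                                  # numerical integration
            dNdx = dN/detJ
            Ae += np.outer(dNdx, dNdx)*detJ*w
            be += f(0.5*(xL + xR) + xi*detJ)*N(xi)*detJ*w
        for r in range(2):                                   # assembly: q(e,r) = e + r
            b[e+r] += be[r]
            for s in range(2):
                A[e+r, e+s] += Ae[r, s]
    A[0, :] = 0.0; A[0, 0] = 1.0; b[0] = uL                  # essential BCs
    A[-1, :] = 0.0; A[-1, -1] = 1.0; b[-1] = uR
    return xn, np.linalg.solve(A, b)

x, u = fem1d(lambda x: np.pi**2*np.sin(np.pi*x), m=20)
err = np.max(np.abs(u - np.sin(np.pi*x)))
print(err)    # small nodal error for this smooth test case
```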
Exercise 2.7
-u''(x) = f(x) \equiv -(\alpha + 1)x^\alpha

Good repetition of the previous material
A new point: \int_0^1 x^\alpha N_i(x) \, dx
The final discrete equations can be written

-h[\delta_x\delta_x u]_i = h b_i

for internal nodes i = 2, \ldots, n-1, where

b_i = \frac{1}{h^2}\frac{1}{\alpha+2}\Big( -(x^{[i-1]})^{\alpha+2} + 2(x^{[i]})^{\alpha+2} - (x^{[i+1]})^{\alpha+2} \Big)

For i = 1: u_1 = 0
For i = n: a slightly modified equation
Intro to finite elements – p. 233
Exercise 2.7 cont.
A standard finite difference method gives

-[\delta_x\delta_x u]_i = f_i = -(\alpha+1)(x^{[i]})^\alpha
for internal nodes i = 2, . . . , n− 1
FEM and FDM give slightly different equations
Which method is most accurate?
By accident, FEM solves this problem exactly (!)
Intro to finite elements – p. 234
FEM for the wave equation
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot(c^2\nabla u), \quad x \in \Omega,\ t > 0

u(x, 0) = f(x), \quad x \in \Omega

\frac{\partial}{\partial t} u(x, 0) = 0, \quad x \in \Omega

\frac{\partial u}{\partial n} = 0, \quad x \in \partial\Omega

Finite differences in time:

\frac{\partial^2}{\partial t^2} u(x, t_\ell) = \frac{u^{\ell-1} - 2u^\ell + u^{\ell+1}}{\Delta t^2} + O(\Delta t^2)

Spatial problem at each time level (c = const chosen for simplicity):

u^{\ell+1}(x) = 2u^\ell(x) - u^{\ell-1}(x) + (c\Delta t)^2 \nabla^2 u^\ell(x)

(can be discretized by finite elements)
Intro to finite elements – p. 235
Time stepping
u^0 = f(x), \quad x \in \Omega

u^{-1} = u^0 + \frac{1}{2}c^2\Delta t^2 \nabla^2 u^0, \quad x \in \Omega

u^{\ell+1} = 2u^\ell - u^{\ell-1} + c^2\Delta t^2 \nabla^2 u^\ell, \quad x \in \Omega, \quad \ell = 0, 1, \ldots

\frac{\partial u^\ell}{\partial n} = 0, \quad x \in \partial\Omega, \quad \ell = 0, 1, \ldots

u^\ell(x) \approx \hat u^\ell = \sum_{j=1}^n u_j^\ell N_j(x), \quad \ell = -1, 0, 1, 2, \ldots
Intro to finite elements – p. 236
Spatial problems
\sum_{j=1}^n M_{i,j} u_j^0 = \int_\Omega f(x) N_i \, d\Omega,

\sum_{j=1}^n M_{i,j} u_j^{-1} = \int_\Omega \Big[ \hat u^0 N_i - \frac{1}{2}(c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^0 \Big] d\Omega + \frac{1}{2}\Delta t^2 \int_{\partial\Omega} c^2 \frac{\partial f}{\partial n} N_i \, d\Gamma,

\sum_{j=1}^n M_{i,j} u_j^{\ell+1} = \int_\Omega \Big[ \big(2\hat u^\ell(x) - \hat u^{\ell-1}(x)\big) N_i - (c\Delta t)^2 \nabla N_i \cdot \nabla \hat u^\ell(x) \Big] d\Omega,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega

is the mass matrix
Intro to finite elements – p. 237
Spatial problems
Alternative notation:

\sum_{j=1}^n M_{i,j} u_j^0 = b_i,

\sum_{j=1}^n M_{i,j} u_j^{-1} = \sum_{j=1}^n M_{i,j} u_j^0 - \frac{1}{2}(c\Delta t)^2 \sum_{j=1}^n K_{i,j} u_j^0 + b_i,

\sum_{j=1}^n M_{i,j} u_j^{\ell+1} = 2\sum_{j=1}^n M_{i,j} u_j^\ell - \sum_{j=1}^n M_{i,j} u_j^{\ell-1} - (c\Delta t)^2 \sum_{j=1}^n K_{i,j} u_j^\ell,

where

M_{i,j} = \int_\Omega N_i N_j \, d\Omega \quad \text{from the } u \text{ terms}

K_{i,j} = \int_\Omega \nabla N_i \cdot \nabla N_j \, d\Omega \quad \text{from the } \nabla^2 u \text{ terms}

Intro to finite elements – p. 238
FEM for the 1D wave equation
1D: \nabla^2 u = u_{xx}
Linear elements
Compute the element matrices corresponding to the two principal terms (u_{tt} and u_{xx}):

M^{(e)} = \frac{h}{6}\begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad K^{(e)} = \frac{c^2}{h}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}

(mass matrix and stiffness matrix at the element level)
Assembling the stiffness matrix term:

-\sum_j K_{i,j} u_j^\ell = \frac{c^2}{h}\big( u^\ell_{i-1} - 2u^\ell_i + u^\ell_{i+1} \big)

(same as FDM)
Intro to finite elements – p. 239
Interpreting the mass matrix term (1)
Assembling the mass matrix contributions:
\sum_j M_{i,j} u_j^{\ell+1} = \frac{h}{6}\big( u^{\ell+1}_{i-1} + 4u^{\ell+1}_i + u^{\ell+1}_{i+1} \big)

With FDM, only h u^{\ell+1}_i would appear
Can rewrite this as

h\Big( u^{\ell+1}_i + \frac{1}{6}\big( u^{\ell+1}_{i-1} - 2u^{\ell+1}_i + u^{\ell+1}_{i+1} \big) \Big)

or expressed with difference operators:

h\big[u + \frac{h^2}{6}\delta_x\delta_x u\big]^{\ell+1}_i
⇒ FDM representation + a diffusion term
Intro to finite elements – p. 240
Interpreting the mass matrix term (2)
The complete equation:
\big[\delta_t\delta_t\big(u + \frac{h^2}{6}\delta_x\delta_x u\big) = c^2\delta_x\delta_x u\big]^\ell_i

⇒ FDM representation + \big[\frac{1}{6}h^2\delta_t\delta_t\delta_x\delta_x u\big]^\ell_i (dispersion)
Notice: FEM gives an implicit scheme (must solve a linear system to find u^{\ell+1}_i)
In 1D: solution of tridiagonal systems is fast
In 2D/3D: solving linear systems slows down the method significantly
Intro to finite elements – p. 241
Lumping the mass matrix
If we apply nodal-point integration - the trapezoidal rule - the element mass matrix becomes

\frac{h}{2}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

Assembling:

\sum_j M_{i,j} u^{\ell+1}_j = h u^{\ell+1}_i

i.e. the same result as a finite difference method!
Making the mass matrix diagonal (by e.g. nodal-point integration) is called mass lumping
Final lumped scheme:

[\delta_t\delta_t u = c^2\delta_x\delta_x u]^\ell_i

i.e. a standard finite difference scheme
Intro to finite elements – p. 242
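The lumped scheme is the standard leapfrog FDM, which is straightforward to code. A sketch that checks it against the standing wave u = \cos(\pi c t)\sin(\pi x); the choice of fixed ends u = 0 (instead of the Neumann condition in the slides) is an assumption made to keep the test simple:

```python
import numpy as np

# Leapfrog scheme [dt dt u = c^2 dx dx u]; u = 0 at both ends is an
# assumed boundary condition chosen so u = cos(pi c t) sin(pi x) is exact.
c, n, C = 1.0, 101, 0.9                  # C = c dt/h <= 1 for stability
x = np.linspace(0.0, 1.0, n); h = x[1] - x[0]; dt = C*h/c
u0 = np.sin(np.pi*x)                     # u(x,0); du/dt(x,0) = 0
lap = lambda u: np.concatenate(([0.0], u[:-2] - 2*u[1:-1] + u[2:], [0.0]))
u_prev, u = u0, u0 + 0.5*C**2*lap(u0)    # special first step (from the u^-1 formula)
t = dt
while t < 0.5 - 1e-12:
    u_prev, u = u, 2*u - u_prev + C**2*lap(u)
    t += dt
exact = np.cos(np.pi*c*t)*np.sin(np.pi*x)
err = np.max(np.abs(u - exact))
print(err)
```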
Some questions
What is best, consistent or lumped mass matrix?
That depends on the equation!
We shall analyze the discrete wave equations
Is there any physical justification of lumping?
Yes! – see Exercise 2.11
Intro to finite elements – p. 243
Analysis of FEM for the wave equation (1)
Derive the discrete equations
Look for analytical solutions of the discrete equations
Find corresponding analytical solutions of the continuous problem
Compare principal quantities, e.g., wave velocity
This is called numerical dispersion analysis
Intro to finite elements – p. 244
Analysis of FEM for the wave equation (2)
The discrete equations from FEM:
\big[\delta_t\delta_t\big(u + \frac{h^2}{6}\delta_x\delta_x u\big) = c^2\delta_x\delta_x u\big]^\ell_i

Lumped mass: only \delta_t\delta_t u on the left-hand side
Inserting a discrete solution:

u^\ell_j = A e^{i(kjh - \omega\ell\Delta t)} = A e^{i(kx - \omega t)}

Results in a numerical dispersion relation:

\omega = \omega(k, h, \Delta t)
Can use this for stability and accuracy analysis
Intro to finite elements – p. 245
Analysis of FEM for the wave equation (3)
The numerical dispersion relation follows from (solving for \omega)

\sin^2\frac{\omega\Delta t}{2} = \frac{c^2\Delta t^2}{h^2}\Big( 1 - \frac{2}{3}\sin^2\frac{kh}{2} \Big)^{-1} \sin^2\frac{kh}{2}

Truncation error or series expansion of the error in the wave velocity:

\tau \sim \frac{\Delta t^2}{12}\Big(\frac{\partial^4 u}{\partial t^4}\Big)^\ell_i - \frac{c^2 h^2}{12}\Big(\frac{\partial^4 u}{\partial x^4}\Big)^\ell_i + \frac{h^2}{6}\Big(\frac{\partial^4 u}{\partial x^2\partial t^2}\Big)^\ell_i

No big difference from FDM
Stability: must require real \omega:

\frac{c\Delta t}{h} \leq \frac{1}{\sqrt{3}}

A reducing factor 1/\sqrt{3} compared with FDM!
Intro to finite elements – p. 246
Consistent vs. lumped mass matrix
FEM w/lumped mass = FDM scheme
FEM w/consistent mass:
- same order of accuracy as FDM
- lower stability limit (factor 1/\sqrt{3})
- the exact-solution property for C = 1 is lost
- must solve linear systems
⇒ Use lumped mass for this wave equation!
Intro to finite elements – p. 247
Other problems
The conclusions here apply to 2D/3D wave equations
\frac{\partial^2 u}{\partial t^2} = \nabla\cdot[c^2\nabla u]

Another PDE (uni-directional wave eq.):

\frac{\partial u}{\partial t} + v\cdot\nabla u = 0
Here, lumped mass reduces the accuracy significantly, so don't generalize too much
Intro to finite elements – p. 248
Error in numerical wave velocity (1)
consistent mass
(plot of the relative error (\bar c - c)/c as a function of p = kh, where C = c\Delta t/h. Top curve: C = 1/\sqrt{3} (max); mid curve: C = 0.3; bottom curve: C = 0.1.)
Intro to finite elements – p. 249
Error in numerical wave velocity (2)
lumped mass or FDM
(plot of the relative error (\bar c - c)/c as a function of p = kh, where C = c\Delta t/h. C = 1 gives the exact solution. Top curve: C = 0.98 (max); mid curves: C = 0.9, 0.5; bottom curve: C = 0.1.)
Intro to finite elements – p. 250
Software tools for experimentation (1)
Propagation of numerical high-frequency noise: src/fdm/Wave1D/steep1 (steep2)
σ (sigma) is a steepness parameter in a plug-shaped profile:
f(x) = \begin{cases} 0.5 - \pi^{-1}\arctan(\sigma(x-2)), & x > 0, \\ 0.5 + \pi^{-1}\arctan(\sigma(x+2)), & x \leq 0 \end{cases}

Vary the resolution and \sigma. Study the effect on the wave propagation.
Error in the wave velocity for a sine component:

\bar c - c = \frac{\bar\omega - \omega}{k} \sim -k^2

Use this information to explain the visual observations
Intro to finite elements – p. 251
Quadratic 1D elements
Piecewise quadratic Ni ⇒ piecewise quadratic u
Three nodes per element: one in the middle plus the two at the ends
We always have Ni(x[j]) = 0, i 6= j
Previous algorithms and techniques still work
Intro to finite elements – p. 252
Quadratic 1D elements in local coordinates
Three nodes per element:
ξ1 = −1, ξ2 = 0, ξ3 = 1
Calculation of basis functions:

N_r(\xi) = a_r\xi^2 + b_r\xi + c_r, \quad N_r(\xi_s) = \delta_{rs}

⇒ 3 equations for a_r, b_r, c_r (r = 1, 2, 3)

N_1(\xi) = \frac{1}{2}\xi(\xi - 1), \quad N_2(\xi) = (1 + \xi)(1 - \xi), \quad N_3(\xi) = \frac{1}{2}\xi(1 + \xi)

Isoparametric mapping:

x = \sum_{r=1}^{3} N_r(\xi)\, x^{[q(e,r)]}

(could also use the linear mapping if the mid node is in the center)
3 \times 3 element matrix
Intro to finite elements – p. 253
Why quadratic elements?
PDE: -u''(x) = 1, \quad x \in (0, 1)
Basic error estimate:

\Big( \int_0^1 (u - \hat u)^2 \, dx \Big)^{1/2} = O(h^{s+1}), \quad s = \text{degree of } N_i

where h is the distance between two neighboring nodes
Linear elements: error \sim h^2
Quadratic elements: error \sim h^3
⇒ doubling the number of nodes reduces the error by a factor 1/8
Intro to finite elements – p. 254
Quadratic elements: example
PDE: -u''(x) = 1, \quad x \in (0, 1)
Apply the general elementwise algorithm with n_e = 3, quadratic N_r, and analytical integration
Element matrix:

\frac{1}{3h_e}\begin{pmatrix} 7 & -8 & 1 \\ -8 & 16 & -8 \\ 1 & -8 & 7 \end{pmatrix}

h_e: physical length of element e.
Element vector:

\frac{h_e}{6}\begin{pmatrix} 1 \\ 4 \\ 1 \end{pmatrix}
Intro to finite elements – p. 255
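The 3x3 element matrix above can be verified by integrating the derivatives of the local quadratic basis functions with the 3-point Gauss-Legendre rule, which is exact here since the integrands are quadratic (the element length h = 0.35 is an arbitrary assumed value):

```python
import numpy as np

h = 0.35                                            # arbitrary element length
dN = lambda xi: np.array([xi - 0.5, -2.0*xi, xi + 0.5])   # dN_r/dxi, quadratic basis
pts = np.array([-1.0, 0.0, 1.0])*np.sqrt(3.0/5.0)   # 3-point Gauss-Legendre
wts = np.array([5.0, 8.0, 5.0])/9.0
detJ = h/2.0
# A^(e)_{rs} = sum_k w_k (dN_r/dx)(dN_s/dx) detJ, with dN/dx = dN/dxi / detJ
Ae = sum(w*np.outer(dN(xi)/detJ, dN(xi)/detJ)*detJ for xi, w in zip(pts, wts))
ref = np.array([[7, -8, 1], [-8, 16, -8], [1, -8, 7]])/(3.0*h)
err = np.max(np.abs(Ae - ref))
print(err)
```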
Implementation of 1D FE problems
Direct implementation in terms of arrays
Linear 1D elements, -u'' = \beta = const, but still a fairly general implementation
scan: read n and β, allocate vectors, matrices, etc., call initGrid.
initGrid: compute xi and q(e, r), i.e., the finite element grid.
makeSystem: calculate the linear system.
solve: solve linear system by Gaussian elimination.
calcElmMatVec: compute element matrix and vector for an element.
integrands: evaluate the integrands of the weighted residual statement
N: evaluate the basis functions in local coordinates.
dN: evaluate the derivatives of the basis functions in local coordinates.
Intro to finite elements – p. 256
Extensions of the program
1D to 2D (3D): big job
Another PDE: easy - integrands
Quadratic elements: small, scattered modifications - N, dN, integrands,initGrid ...
Goal: solver code is independent of
  element type
  linear system solver
  matrix format
  grid type
  number of space dimensions
Intro to finite elements – p. 257
2D domains
Strength of the finite element method: easy to work with geometrically complicated domains
(figure: finite element mesh of Lake Superior with 6 islands, 2330 triangles)
Intro to finite elements – p. 258
Element shapes
Rectangular, triangular
Straight or curved sides
Illegal:
illegal node
Intro to finite elements – p. 259
2D rectangular bilinear element
(figure: reference square with nodes 1-4 at the corners in (\xi_1, \xi_2) coordinates)

4 nodes
Bilinear functions:

N_r(\xi_1, \xi_2) = a_r + b_r\xi_1 + c_r\xi_2 + d_r\xi_1\xi_2

Conditions for determining a_r, b_r, c_r, d_r:

N_r(\text{node } s) = \delta_{rs}

4 equations for a_r, b_r, c_r, d_r (fixed r)
Intro to finite elements – p. 260
Mapping of 2D bilinear element
(figure: mapping between the local (\xi_1, \xi_2) reference square and a global quadrilateral in (x_1, x_2) coordinates)
Intro to finite elements – p. 261
2D linear 3-node element
(figure: local reference triangle with nodes 1-3 in (\xi_1, \xi_2) coordinates and its global image in (x_1, x_2))
linear Ni(ξ1, ξ2), straight sides
Intro to finite elements – p. 262
Typical 2D linear basis function
Intro to finite elements – p. 263
Construction of the basis functions
(figure: reference triangle with local nodes 1, 2, 3 at (1,0), (0,1), (0,0) in (\xi_1, \xi_2) coordinates)

Principles for the construction:
1. N_i(\xi_1, \xi_2) is a polynomial
2. N_i(\text{node } j) = \delta_{ij}
3 constraints ⇒ N_i must be linear

N_1 = \xi_1, \quad N_2 = \xi_2, \quad N_3 = 1 - \xi_1 - \xi_2

Intro to finite elements – p. 264
Construction of 2D basis functions; Example
Example: N2(ξ1, ξ2)
N_i(\xi_1, \xi_2) = \alpha_i + \beta_i\xi_1 + \gamma_i\xi_2

N_2(1, 0) = \alpha_2 + \beta_2\cdot 1 + \gamma_2\cdot 0 = 0
N_2(0, 1) = \alpha_2 + \beta_2\cdot 0 + \gamma_2\cdot 1 = 1
N_2(0, 0) = \alpha_2 + \beta_2\cdot 0 + \gamma_2\cdot 0 = 0

3 linear equations in the 3 unknowns \alpha_2, \beta_2, and \gamma_2.
Solution: \alpha_2 = \beta_2 = 0, \gamma_2 = 1

N_2(\xi_1, \xi_2) = \xi_2
Intro to finite elements – p. 265
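The same construction can be done for all three basis functions at once by solving the 3x3 system N_i(node j) = \delta_{ij}; a minimal sketch:

```python
import numpy as np

# Nodes of the reference triangle (local numbering as in the slides)
nodes = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
# N_i(xi1, xi2) = alpha_i + beta_i xi1 + gamma_i xi2, with N_i(node j) = delta_ij
V = np.column_stack([np.ones(3), nodes[:, 0], nodes[:, 1]])
coeffs = np.linalg.solve(V, np.eye(3))   # column i holds (alpha_i, beta_i, gamma_i)
print(coeffs.T)
```

The columns reproduce N_1 = \xi_1, N_2 = \xi_2, N_3 = 1 - \xi_1 - \xi_2.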
2D quadratic 6-node element
(figure: quadratic 6-node reference triangle in (\xi_1, \xi_2) and its global image with curved sides)
quadratic Ni(ξ1, ξ2), curved sides (parabola)
Intro to finite elements – p. 266
2D quadratic 9-node element
(figure: biquadratic 9-node reference square in (\xi_1, \xi_2) and its global image with curved sides)
biquadratic Ni(ξ1, ξ2), curved sides (parabola)
Intro to finite elements – p. 267
2D quadratic 8-node element
(figure: 8-node reference square in (\xi_1, \xi_2) and its global image with curved sides)
biquadratic Ni(ξ1, ξ2) minus ξ21ξ22 term, curved sides (parabola)
Intro to finite elements – p. 268
3D elements
tetrahedron with 4 corner nodes (linear N_i)
tetrahedron with 10 nodes (quadratic N_i), mid-node on each edge
box with 8 corner nodes (tri-linear N_i)
box with 20 nodes (quadratic N_i), mid-node on each edge
box with 27 nodes (tri-quadratic N_i), mid-nodes on edges and sides
Intro to finite elements – p. 269
Triangular vs. box shape
Any 2D geometry can be divided into triangles (if the boundary is approximated by polygons)
Any 3D geometry can be divided into tetrahedra (if the boundary is approximated by polygons)
Many geometries can be divided into rectangles/boxes, but one may need an extra triangle/tetrahedron
Different element shapes have different properties, depending on the PDE system, so choosing the right element is not obvious, and it is not only a geometry-approximation issue
Intro to finite elements – p. 270
Convection-dominated flow
Convection-dominated flow – p. 271
Projects 1.5.2 and 1.5.3
Model problem:
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
ε small: boundary layer at x = 1
Standard numerics (i.e. centered differences) will fail!
Cure: upwind differences
Convection-dominated flow – p. 272
Notation for difference equations (1)
Define

[\delta_x u]^\ell_{i,j,k} \equiv \frac{u^\ell_{i+\frac{1}{2},j,k} - u^\ell_{i-\frac{1}{2},j,k}}{h}

with similar definitions of \delta_y, \delta_z, and \delta_t
Another difference:

[\delta_{2x} u]^\ell_{i,j,k} \equiv \frac{u^\ell_{i+1,j,k} - u^\ell_{i-1,j,k}}{2h}

Compound difference:

[\delta_x\delta_x u]^\ell_i = \frac{1}{h^2}\big( u^\ell_{i-1} - 2u^\ell_i + u^\ell_{i+1} \big)
Convection-dominated flow – p. 273
Notation for difference equations (2)
One-sided forward difference:

[\delta^+_x u]^\ell_i \equiv \frac{u^\ell_{i+1} - u^\ell_i}{h}

and the backward difference:

[\delta^-_x u]^\ell_i \equiv \frac{u^\ell_i - u^\ell_{i-1}}{h}

Put the whole equation inside brackets:

[\delta_x\delta_x u = -f]_i
is a finite difference scheme for u′′ = −f
Convection-dominated flow – p. 274
Centered differences
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
\frac{u_{i+1} - u_{i-1}}{2h} = \varepsilon\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}, \quad i = 2, \ldots, n-1

u_1 = 0, \quad u_n = 1

or [\delta_{2x} u = \varepsilon\delta_x\delta_x u]_i
Analytical solution:

u(x) = \frac{1 - e^{x/\varepsilon}}{1 - e^{1/\varepsilon}}

⇒ u'(x) > 0, i.e., a monotone function
Convection-dominated flow – p. 275
Numerical experiments (1)
(plot of u(x) for n = 20, \varepsilon = 0.1: centered scheme vs. exact solution)
Convection-dominated flow – p. 276
Numerical experiments (2)
(plot of u(x) for n = 20, \varepsilon = 0.01: centered scheme vs. exact solution)
Convection-dominated flow – p. 277
Numerical experiments (3)
(plot of u(x) for n = 80, \varepsilon = 0.01: centered scheme vs. exact solution)
Convection-dominated flow – p. 278
Numerical experiments (4)
(plot of u(x) for n = 20, \varepsilon = 0.001: centered scheme vs. exact solution)
Convection-dominated flow – p. 279
Numerical experiments; summary
The solution is not monotone if h > 2\varepsilon
The convergence rate is h^2 (in agreement with truncation error analysis) provided h \leq 2\varepsilon
Completely wrong qualitative behavior for h \gg 2\varepsilon
Convection-dominated flow – p. 280
Analysis
Can find an analytical solution of the discrete problem (!)
Method: insert u_i \sim \beta^i and solve for \beta

\beta_1 = 1, \quad \beta_2 = \frac{1 + h/(2\varepsilon)}{1 - h/(2\varepsilon)}

cf. HPL app. A.4.4
Complete solution:

u_i = C_1\beta_1^i + C_2\beta_2^i

Determine C_1 and C_2 from the boundary conditions:

u_i = \frac{\beta_2^i - \beta_2}{\beta_2^n - \beta_2}
Convection-dominated flow – p. 281
Important result
Observe: u_i oscillates if \beta_2 < 0

\frac{1 + h/(2\varepsilon)}{1 - h/(2\varepsilon)} < 0 \quad\Rightarrow\quad h > 2\varepsilon
Must require h ≤ 2ε for ui to have the same qualitative property asu(x)
This explains why we observed oscillations in the numerical solution
Convection-dominated flow – p. 282
Upwind differences
Problem:
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
Use a backward difference, called upwind difference, for the u′ term:
\frac{u_i - u_{i-1}}{h} = \varepsilon\frac{u_{i-1} - 2u_i + u_{i+1}}{h^2}, \quad i = 2, \ldots, n-1

u_1 = 0, \quad u_n = 1

The scheme can be written as

[\delta^-_x u = \varepsilon\delta_x\delta_x u]_i
Convection-dominated flow – p. 283
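Both schemes are small tridiagonal systems, so the monotonicity claims are easy to check numerically; a sketch (ε = 0.01 and n = 20 are the parameters used in the experiments, chosen so that h > 2ε):

```python
import numpy as np

def solve(eps, n, upwind=False):
    """u' = eps u'' on (0,1), u(0) = 0, u(1) = 1, on n equally spaced nodes."""
    h = 1.0/(n - 1)
    A = np.zeros((n, n)); b = np.zeros(n)
    for i in range(1, n - 1):
        if upwind:   # backward difference for u'
            A[i, i-1] = -1.0/h - eps/h**2
            A[i, i]   =  1.0/h + 2.0*eps/h**2
            A[i, i+1] = -eps/h**2
        else:        # centered difference for u'
            A[i, i-1] = -1.0/(2*h) - eps/h**2
            A[i, i]   =  2.0*eps/h**2
            A[i, i+1] =  1.0/(2*h) - eps/h**2
    A[0, 0] = 1.0; A[-1, -1] = 1.0; b[-1] = 1.0
    return np.linalg.solve(A, b)

uc = solve(0.01, 20)                 # h > 2 eps: oscillations expected
uu = solve(0.01, 20, upwind=True)
mono_up = bool(np.all(np.diff(uu) >= 0))
mono_ce = bool(np.all(np.diff(uc) >= 0))
print(mono_up, mono_ce)
```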
Numerical experiments (1)
(plot of u(x) for n = 20, \varepsilon = 0.1: upwind scheme vs. exact solution)
Convection-dominated flow – p. 284
Numerical experiments (2)
[Plot of u(x) vs. x for n = 20, ε = 0.01: upwind scheme vs. exact solution]
Convection-dominated flow – p. 285
Numerical experiments; summary
The solution is always monotone, i.e., always qualitatively correct
The boundary layer is too thick
The convergence rate is h (in agreement with truncation error analysis)
Convection-dominated flow – p. 286
Analysis
Analytical solution of the discrete equations:
u_i = β^i ⇒ β_1 = 1, β_2 = 1 + h/ε
u_i = C_1 + C_2β_2^i
Using the boundary conditions:
u_i = (β_2^i − β_2) / (β_2^n − β_2)
Since β_2 > 0 (actually β_2 > 1), β_2^i does not oscillate
Convection-dominated flow – p. 287
Centered vs. upwind scheme
Truncation error: centered is more accurate than upwind
Exact analysis: centered is more accurate than upwind when centered is stable (i.e. monotone u_i), but otherwise useless
ε = 10⁻⁶ ⇒ 500 000 grid points to make h ≤ 2ε
Upwind gives the best reliability, at the cost of a too thick boundary layer
Convection-dominated flow – p. 288
An interpretation of the upwind scheme
The upwind scheme
(u_i − u_{i−1})/h = ε (u_{i−1} − 2u_i + u_{i+1})/h²
or
[δ_x^− u = ε δ_x δ_x u]_i
can be rewritten as
(u_{i+1} − u_{i−1})/(2h) = (ε + h/2)(u_{i−1} − 2u_i + u_{i+1})/h²
or
[δ_{2x} u = (ε + h/2) δ_x δ_x u]_i
Upwind = centered + artificial diffusion (h/2)
Convection-dominated flow – p. 289
Finite elements for the model problem
Galerkin formulation of
u′(x) = εu′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
and linear elements leads to a centered scheme (show it!)
(u_{i+1} − u_{i−1})/(2h) = ε (u_{i−1} − 2u_i + u_{i+1})/h², i = 2, …, n−1
u_1 = 0, u_n = 1
or
[δ_{2x} u = ε δ_x δ_x u]_i
Stability problems when h > 2ε
Convection-dominated flow – p. 290
Finite element theory (1)
Abstract finite element theory starts with
a(u, v) = L(v) ∀v ∈ V
Consider
v · ∇u = ε∇²u
⇒ Nonsymmetric a(u, v)
General best-approximation result:
||u − u_h||_V ≤ (c_1/c_2) ||u − v||_V ∀v ∈ V
where c_1 and c_2 are the bounds of a:
c_2 ||v||²_V ≤ a(v, v), |a(u, v)| ≤ c_1 ||u||_V ||v||_V
Convection-dominated flow – p. 291
Finite element theory (2)
In the estimate, c_2 = ε and
c_1 = ε + √C_Ω sup{|v(x)| : x ∈ Ω}
If |v| ≫ ε, c_1/c_2 is large
Some indication that the best-approximation property of the Galerkin method is not worth that much here
Convection-dominated flow – p. 292
Finite elements and upwind differences
How to construct upwind differences in a finite element context?
One possibility: add artificial diffusion (h/2):
u′(x) = (ε + h/2) u′′(x), x ∈ (0, 1), u(0) = 0, u(1) = 1
Can be solved by a Galerkin method
Equivalent strategy: use perturbed weighting functions
Convection-dominated flow – p. 293
Perturbed weighting functions in 1D
Take
W_i(x) = N_i(x) + τN_i′(x)
as weighting function for the convective term u′:
∫₀¹ u′W_i dx = ∫₀¹ u′N_i dx + ∫₀¹ τN_i′u′ dx
The new term τN_i′u′ is the weak formulation of an artificial diffusion term τu′′ weighted by N_i
With τ = h/2 we then get the upwind scheme
Convection-dominated flow – p. 294
Optimal artificial diffusion
Try a weighted sum of a centered and an upwind discretization:
[u′]_i ≈ [θδ_x^− u + (1 − θ)δ_{2x}u]_i, 0 ≤ θ ≤ 1
[θδ_x^− u + (1 − θ)δ_{2x}u = ε δ_x δ_x u]_i
Is there an optimal θ?
Yes, for
θ(h/ε) = coth(h/(2ε)) − 2ε/h
we get exact u_i (i.e. u exact at the nodal points)
Equivalent artificial diffusion τ_o = (h/2)θ(h/ε)
Exact finite element method: W_i(x) = N_i(x) + τ_o N_i′(x) for the convective term u′
Convection-dominated flow – p. 295
Multi-dimensional problems
Model problem:
v_x ∂u/∂x + v_y ∂u/∂y = ∇²u
often written as
v · ∇u = ∇²u
Non-physical oscillations occur with centered differences or Galerkin methods when the left-hand side terms are large
Remedy: upwind differences
Downside: too much diffusion
Important result: extra stabilizing diffusion is needed only in the streamline direction (v_x, v_y)
Convection-dominated flow – p. 296
Streamline diffusion
Idea: add diffusion in the streamline direction
Isotropic diffusion:
d∑
i=1
d∑
j=1
kδij∂2u
∂xi∂xj= k∇2u
kδij is the diffusion tensor (same in all directions)
Streamline diffusion:
d∑
i=1
d∑
j=1
∂
∂xi
(kij
∂u
∂xj
), kij = τ
vivj
||v||2
Implementation: artificial diffusion term or perturbed weightingfunction
Convection-dominated flow – p. 297
Perturbed weighting functions (1)
Consider the weighting function
W_i = N_i + τ* v · ∇N_i
for the convective (left-hand side) term:
∫ W_i v · ∇u dΩ
This expands to
∫ N_i v · ∇u dΩ + ∫ τ* (v · ∇u)(v · ∇N_i) dΩ
The latter term can be viewed as the Galerkin formulation of (write v · ∇u = Σ_i v_i ∂u/∂x_i etc.)
Σ_{i=1}^d Σ_{j=1}^d ∂/∂x_i (τ* v_i v_j ∂u/∂x_j)
Convection-dominated flow – p. 298
Perturbed weighting functions (2)
⇒ Streamline diffusion can be obtained by perturbing the weighting function
Common name: SUPG (streamline-upwind/Petrov-Galerkin)
Convection-dominated flow – p. 299
Consistent SUPG
Why not just add artificial diffusion?
Why bother with perturbed weighting functions?
In standard FEM,
∫_Ω L(u)W_i dΩ = 0,
the exact solution is a solution of the FEM equations (it fulfills L(u) = 0)
This no longer holds if we
add an artificial diffusion term (∼ h/2)
use different weighting functions on different terms
Idea: use consistent SUPG
no artificial diffusion term
the same (perturbed) weighting function applies to all terms
Convection-dominated flow – p. 300
A step back to 1D
Let us try to use
W_i(x) = N_i(x) + τN_i′(x)
on both terms in u′ = εu′′:
∫₀¹ (N_i u′ + (ε + τ)N_i′u′) dx + τ ∫₀¹ N_i′′u′ dx = 0
Problem: the last term
Remedy: drop it (!)
Justification: N_i′′ = 0 on each linear element
Drop 2nd-order derivatives of N_i in 2D/3D too
Consistent SUPG is not so consistent...
Convection-dominated flow – p. 301
Choosing τ*
Choosing τ* is a research topic
Many suggestions
Two classes:
τ* ∼ h
τ* ∼ ∆t (time-dependent problems)
Little theory
Convection-dominated flow – p. 302
A test problem (1)
[Figure: unit-square test domain; inflow velocity v at angle θ; u = 1 on the inflow boundary below the line y = x tan θ + 0.25, u = 0 on the remaining inflow boundaries; du/dn = 0 on the outflow boundaries]
Convection-dominated flow – p. 303
A test problem (2)
Methods:
1. Classical SUPG: Brooks and Hughes: "A streamline upwind/Petrov-Galerkin finite element formulation for advection dominated flows with particular emphasis on the incompressible Navier-Stokes equations", Comp. Methods Appl. Mech. Engrg., 199-259, 1982.
2. An additional discontinuity-capturing term
W_i = N_i + τ* v · ∇N_i + τ (v · ∇u / ||∇u||²) ∇u
was proposed in Hughes, Mallet and Mizukami: "A new finite element formulation for computational fluid dynamics: II. Beyond SUPG", Comp. Methods Appl. Mech. Engrg., 341-355, 1986.
Convection-dominated flow – p. 304
Galerkin’s method
[Surface plot of the Galerkin solution: non-physical oscillations, minimum about −0.65]
Convection-dominated flow – p. 305
SUPG
[Surface plot of the SUPG solution; same axes as the Galerkin plot]
Convection-dominated flow – p. 306
Time-dependent problems
Model problem:
∂u/∂t + v · ∇u = ε∇²u
Can add an artificial streamline diffusion term
Can use the perturbed weighting function
W_i = N_i + τ* v · ∇N_i
on all terms
How to choose τ*?
Convection-dominated flow – p. 307
Taylor-Galerkin methods (1)
Idea: Lax-Wendroff + Galerkin
Model equation:
∂u/∂t + U ∂u/∂x = 0
Lax-Wendroff: 2nd-order Taylor series in time,
u^{ℓ+1} = u^ℓ + ∆t [∂u/∂t]^ℓ + (1/2)∆t² [∂²u/∂t²]^ℓ
Replace temporal by spatial derivatives,
∂/∂t = −U ∂/∂x
Result:
u^{ℓ+1} = u^ℓ − U∆t [∂u/∂x]^ℓ + (1/2)U²∆t² [∂²u/∂x²]^ℓ
Convection-dominated flow – p. 308
Taylor-Galerkin methods (2)
We can write the scheme on the form
[δ_t^+ u + U ∂u/∂x = (1/2)U²∆t ∂²u/∂x²]^ℓ
⇒ Forward scheme with artificial diffusion
Lax-Wendroff: centered spatial differences,
[δ_t^+ u + U δ_{2x}u = (1/2)U²∆t δ_x δ_x u]_i^ℓ
Alternative: Galerkin's method in space,
[δ_t^+ u + U δ_{2x}u = (1/2)U²∆t δ_x δ_x u]_i^ℓ
provided that we lump the mass matrix
This is the Taylor-Galerkin method
Convection-dominated flow – p. 309
Taylor-Galerkin methods (3)
In multi-dimensional problems,
∂u/∂t + v · ∇u = 0
we have
∂/∂t = −v · ∇
and (∇ · v = 0)
∂²/∂t² = ∇ · (v v · ∇) = Σ_{r=1}^d Σ_{s=1}^d ∂/∂x_r (v_r v_s ∂/∂x_s)
This is streamline diffusion with τ* = ∆t/2:
[δ_t^+ u + v · ∇u = (1/2)∆t ∇ · (v v · ∇u)]^ℓ
Convection-dominated flow – p. 310
Taylor-Galerkin methods (4)
Can use the Galerkin method in space (gives centered differences)
The result is close to that of SUPG, but τ* is different
⇒ The Taylor-Galerkin method points to τ* = ∆t/2 for SUPG in time-dependent problems
Convection-dominated flow – p. 311
Nonlinear PDEs
Nonlinear PDEs – p. 312
Examples
Some nonlinear model problems to be treated next:
−u′′(x) = f(u), u(0) = u_L, u(1) = u_R
−(λ(u)u′)′ = 0, u(0) = u_L, u(1) = u_R
−∇ · [λ(u)∇u] = g(x), with u or −λ ∂u/∂n prescribed as B.C.
Discretization methods:
standard finite difference methods
standard finite element methods
the group finite element method
We get nonlinear algebraic equations
Solution method: iterate over linear equations
Nonlinear PDEs – p. 313
Nonlinear discrete equations; FDM
Finite differences for −u′′ = f(u):
−(1/h²)(u_{i−1} − 2u_i + u_{i+1}) = f(u_i)
⇒ nonlinear system of algebraic equations
F(u) = 0, or Au = b(u), u = (u_1, …, u_n)^T
Finite differences for (λ(u)u′)′ = 0:
(1/(2h²))([λ(u_{i+1}) + λ(u_i)](u_{i+1} − u_i) − [λ(u_i) + λ(u_{i−1})](u_i − u_{i−1})) = 0
⇒ nonlinear system of algebraic equations
F(u) = 0 or A(u)u = b
Nonlinear PDEs – p. 314
Nonlinear discrete equations; FEM
Finite elements for −u′′ = f(u):
u ≈ û = Σ_{k=1}^n u_k N_k(x)
Galerkin approach:
∫₀¹ N_i′û′ dx = ∫₀¹ f(Σ_k N_k u_k) N_i dx
(assuming prescribed u(0) and u(1))
The left-hand side is easy to assemble:
−(1/h)(u_{i−1} − 2u_i + u_{i+1}) = ∫₀¹ f(Σ_k u_k N_k(x)) N_i dx
Nonlinear PDEs – p. 315
Nonlinearities in the FEM
Note that
f(Σ_k N_k(x)u_k)
is a complicated function of u_1, …, u_n
F.ex.: f(u) = u²:
∫₀¹ (Σ_k N_k u_k)² N_i dx
gives rise to a difference representation
(h/12)(u_{i−1}² + 2u_i(u_{i−1} + u_{i+1}) + 6u_i² + u_{i+1}²)
(compare with f(u_i) = u_i² in the FDM!)
Must use numerical integration in general
Nonlinear PDEs – p. 316
The group finite element method
The group finite element method:
f(u) = f(Σ_k u_k N_k(x)) ≈ Σ_{k=1}^n f(u_k)N_k
Resulting term:
∫₀¹ f(u)N_i dx = ∫₀¹ Σ_k N_iN_k f(u_k) dx gives
Σ_k (∫₀¹ N_kN_i dx) f(u_k), i.e., Mf(u)
which is a mass matrix-like term: (h/6)(f(u_{i−1}) + 4f(u_i) + f(u_{i+1}))
Trapezoidal integration gives an FDM-like term:
Σ_k (∫₀¹ N_kN_i dx) f(u_k) ≈ h f(u_i)
⇒ similar results as the FDM
Nonlinear PDEs – p. 317
FEM for a nonlinear coefficient
We now look at
(λ(u)u′)′ = 0, u(0) = uL, u(1) = uR
Using a finite element method (exercise 4.2) results in an integral
∫₀¹ λ(Σ_k u_k N_k) N_i′N_j′ dx
⇒ complicated!
Linear elements and the trapezoidal rule:
(1/2)(λ(u_i) + λ(u_{i+1}))(u_{i+1} − u_i) − (1/2)(λ(u_{i−1}) + λ(u_i))(u_i − u_{i−1}) = 0
⇒ FDM with the arithmetic mean for λ(u_{i+1/2})
Nonlinear PDEs – p. 318
Nonlinear algebraic equations
FEM/FDM for nonlinear PDEs gives nonlinear algebraic equations:
(λ(u)u′)′ = 0 ⇒ A(u)u = b
−u′′ = f(u) ⇒ Au = b(u)
In general a nonlinear PDE gives
F(u) = 0
or
F_1(u_1, …, u_n) = 0
⋮
F_n(u_1, …, u_n) = 0
Nonlinear PDEs – p. 319
Solving nonlinear algebraic eqs.
Have
A(u)u − b = 0, Au − b(u) = 0, F(u) = 0
Idea: solve the nonlinear problem as a sequence of linear subproblems
Must perform some kind of linearization
Iterative method: guess u⁰, solve linear problems for u¹, u², … and hope that
lim_{k→∞} u^k = u
i.e. the iteration converges
Nonlinear PDEs – p. 320
Successive substitutions (1)
Model problem: A(u)u = b
Simple iteration scheme:
A(u^k)u^{k+1} = b, k = 0, 1, …
Must provide a (good) guess u⁰
Termination:
||u^{k+1} − u^k|| ≤ ε_u
or using the residual (expensive, requires a new A(u^{k+1})!):
||b − A(u^{k+1})u^{k+1}|| ≤ ε_r
Relative criteria:
||u^{k+1} − u^k|| ≤ ε_u ||u^k||
or (more expensive)
||b − A(u^{k+1})u^{k+1}|| ≤ ε_r ||b − A(u⁰)u⁰||
Simple method, but sometimes slow convergence
Nonlinear PDEs – p. 321
Successive substitutions (2)
Model problem: Au = b(u)
Simple iteration scheme:
Au^{k+1} = b(u^k), k = 0, 1, …
Relaxation:
Au* = b(u^k), u^{k+1} = ωu* + (1 − ω)u^k
(may improve convergence, avoids too large steps)
Picard iteration is another name for this method
Nonlinear PDEs – p. 322
Newton’s method (1)
The Newton (Newton-Raphson) method for f(x) = 0, x ∈ IR
Given an approximation x^k
Approximate f by a linear function at x^k:
f(x) ≈ M(x; x^k) = f(x^k) + f′(x^k)(x − x^k)
Find a new x^{k+1} such that
M(x^{k+1}; x^k) = 0 ⇒ x^{k+1} = x^k − f(x^k)/f′(x^k)
Nonlinear PDEs – p. 323
Newton’s method (2)
Systems of nonlinear equations:
F(u) = 0, F(u) ≈ M(u; u^k)
Multi-dimensional Taylor-series expansion:
M(u; u^k) = F(u^k) + J(u − u^k), J ≡ ∇F
J_{i,j} = ∂F_i/∂u_j
Iteration no. k:
solve the linear system J(u^k)(δu)^{k+1} = −F(u^k)
update: u^{k+1} = u^k + (δu)^{k+1}
Can use relaxation: u^{k+1} = u^k + ω(δu)^{k+1}
Nonlinear PDEs – p. 324
The Jacobian matrix; FDM (1)
Model: u′′ = −f(u)
Scheme:
F_i ≡ (1/h²)(u_{i−1} − 2u_i + u_{i+1}) − f(u_i) = 0
Jacobian matrix term (FDM):
J_{i,j} = ∂F_i/∂u_j
F_i = 0 contains only u_i, u_{i±1}
⇒ the Jacobian is sparse (tridiagonal)
Nonlinear PDEs – p. 325
The Jacobian matrix; FDM (2)
F_i ≡ (1/h²)(u_{i−1} − 2u_i + u_{i+1}) − f(u_i) = 0
Derivation:
J_{i,i−1} = ∂F_i/∂u_{i−1} = 1/h²
J_{i,i+1} = ∂F_i/∂u_{i+1} = 1/h²
J_{i,i} = ∂F_i/∂u_i = −2/h² − f′(u_i)
Must form the Jacobian in each iteration and solve
Jδu^{k+1} = −F(u^k)
and then update
u^{k+1} = u^k + ωδu^{k+1}
Nonlinear PDEs – p. 326
The Jacobian matrix; FEM
−u′′ = f(u) + FEM gives F_i = 0, where
F_i ≡ ∫₀¹ [Σ_j N_i′N_j′ u_j − f(Σ_s u_s N_s) N_i] dx
The Jacobian:
J_{i,j} = ∂F_i/∂u_j
becomes
∫₀¹ [N_i′N_j′ − f′(Σ_s u_s N_s) N_j N_i] dx
In general, for an FE function u = Σ_s u_s N_s,
∂f(u)/∂u_j = f′(u) ∂u/∂u_j = f′(u) ∂(Σ_s u_s N_s)/∂u_j = f′(u)N_j
Nonlinear PDEs – p. 327
A 2D/3D transient nonlinear PDE (1)
PDE:
PDE:
ϱC ∂u/∂t = ∇ · [κ(u)∇u]
(f.ex. u = g on the boundary and u = I at t = 0)
FDM in time:
(u^ℓ − u^{ℓ−1})/∆t = ∇ · [λ(u^ℓ)∇u^ℓ]
with λ = κ/(ϱC)
FEM → nonlinear algebraic equations:
F_i(u_1^ℓ, …, u_n^ℓ) = 0, i = 1, …, n
where
F_i ≡ ∫_Ω [(u^ℓ − u^{ℓ−1})N_i + ∆t λ(u^ℓ)∇u^ℓ · ∇N_i] dΩ
Nonlinear PDEs – p. 328
A 2D/3D transient nonlinear PDE (2)
Successive substitution: use the "old" u^{ℓ,k} in the λ(u^ℓ) term, solve a linear problem for u^{ℓ,k+1}, k = 0, 1, …
Exercise: specify the element matrix and vector
Newton-Raphson's method: need J,
J_{i,j} = ∂F_i/∂u_j
Exercise: carry out the differentiation, specify the element matrix and vector
Nonlinear PDEs – p. 329
Iteration methods at the PDE level
Consider −u′′ = f(u)
Could introduce a successive substitution at the PDE level:
−(d²/dx²) u^{k+1} = f(u^k), k = 0, 1, …
⇒ linear problem for u^{k+1}
A PDE-level Newton-Raphson method can also be formulated (see the book for details)
We get identical results for our model problem
Time-dependent problems: first use finite differences in time, then use an iteration method (successive subst. or Newton-Raphson) at the time-discrete PDE level
Nonlinear PDEs – p. 330
Continuation methods
Challenging nonlinear PDE:
∇ · (||∇u||^q ∇u) = 0
For q = 0 this problem is simple
Idea: solve a sequence of problems, starting with q = 0, and increase q towards a target value
Sequence of PDEs:
∇ · (||∇u_r||^{q_r} ∇u_r) = 0, r = 0, 1, 2, …
with 0 = q_0 < q_1 < q_2 < ⋯ < q_m = q
The start guess for u_r is u_{r−1} (the solution of a "simpler" problem)
CFD: the Reynolds number is often the continuation parameter q
Nonlinear PDEs – p. 331
Exercises
Methods for nonlinear PDEs are best learned through exercises
Exercises 4.1, 4.2, 4.4, 4.5, 4.6, 4.7, 4.9, 4.10
Nonlinear PDEs – p. 332
Model problem for nonlinear PDEs
Let us make software for solving
∂u/∂t = ∇ · [λ(u)∇u]
Boundary conditions: u prescribed
Test solution: λ(u) = u, u = φ(x, t) = dt + Σ_j x_j at the boundary ⇒ u = φ everywhere
Nonlinear PDEs – p. 333
Discretization in time
Backward Euler scheme in time:
(u^ℓ − u^{ℓ−1})/∆t = ∇ · [λ(u^ℓ)∇u^ℓ]
⇒ recursive set of spatial problems for u^ℓ(x)
Nonlinear PDEs – p. 334
Discretization in space
FEM for the spatial problems:
u^ℓ(x) ≈ û^ℓ(x) = Σ_{j=1}^n u_j^ℓ N_j(x)
F_i(u_1^ℓ, …, u_n^ℓ) = 0, i = 1, …, n,
where
F_i ≡ ∫_Ω [(û^ℓ − û^{ℓ−1})W_i + ∆t λ(û^ℓ)∇û^ℓ · ∇W_i] dΩ.
⇒ Nonlinear system of algebraic equations for u = (u_1^ℓ, …, u_n^ℓ)
Nonlinear PDEs – p. 335
Solution of nonlinear systems
Idea: solve a nonlinear system as a sequence of linear systems
Approx. to u^ℓ in iteration k: u^{ℓ,k}(x)
Successive substitutions: use the "old" u^{ℓ,k} in nonlinear coefficients, λ(u^{ℓ,k}), and solve for u^{ℓ,k+1}
Newton's or Newton-Raphson's method: in iteration k, solve
1. Jδu^{k+1} = −F
2. u^{ℓ,k+1} = u^{ℓ,k} + δu^{k+1}
where
J_{i,j} = ∂F_i/∂u_j
and F_i are computed using the "old" values u^{ℓ,k}
(F = (F_1, …, F_n), J = [J_{i,j}])
Nonlinear PDEs – p. 336
The Jacobian
In our example:
J_{i,j} ≡ ∂F_i/∂u_j^ℓ = ∫_Ω [W_i N_j + ∆t (dλ/du)(u^{ℓ,k}) N_j ∇W_i · ∇u^{ℓ,k} + ∆t λ(u^{ℓ,k}) ∇W_i · ∇N_j] dΩ
Such expressions must be calculated by hand (or with symbolic math software)
Nonlinear PDEs – p. 337
What to implement
F_i: these are the same terms that appear in a corresponding linear PDE problem
In case of Newton-Raphson's method, we also need to implement J_{i,j} (containing terms that are not identical to those in the PDE)
⇒ Nonlinear PDE solver = linear PDE solver + an outer "nonlinear" loop
Nonlinear PDEs – p. 338
Implementation in Diffpack
The evaluation of the Jacobian and right-hand side at an integration point follows the same set-up as in linear problems
The management of a "nonlinear" loop is a new component
Need information about the type of nonlinear solver
Nonlinear PDEs – p. 339
Nonlinear systems in Diffpack
Nonlinear solvers are realized as subclasses of a class hierarchy NonLinEqSolver:
- SuccessiveSubst
- NewtonRaphson
Nonlinear solver algorithm:
// calling NonLinEqSolver's solve() leads to
iteration = 0;
while (!converged) {
  iteration++;
  // ask simulator to set up the linear (sub)system to be
  // solved in this iteration:
  solver->makeAndSolveLinearSystem(); // define the PDE!
  // perform updates according to the algorithm
}
This loop takes place in the Diffpack libraries
Nonlinear PDEs – p. 340
makeAndSolveLinearSystem
The purpose of makeAndSolveLinearSystem() is the same as for a linear problem:
makeSystem (assemble the system)
lineq's solve (solve the linear system)
Hence, the programmer has complete control of the linear system and its solution in each iteration
Note: makeSystem defines the linear system and hence the PDE (implicitly)
This set-up makes it easy to switch between iteration methods/strategies
Nonlinear PDEs – p. 341
Implementation
Derive the simulator from NonLinEqSolverUDC and FEM
Add three new data items:
Vec(real) nonlin_solution;
Handle(NonLinEqSolver_prm) nlsolver_prm;
Handle(NonLinEqSolver) nlsolver;
Initialize these objects in scan
Call the nonlinear solver:
nlsolver->solve();
In each iteration, the nonlinear solver jumps back to your
virtual void makeAndSolveLinearSystem()
// essentially
makeSystem (*dof, *lineq); // set up linear subsystem
lineq->solve();            // solve linear subsystem
Tip: learn the numerics well before starting with the implementation!
Nonlinear PDEs – p. 342
A real makeAndSolveLinearSystem
void NlHeat1:: makeAndSolveLinearSystem ()
{
  dof->vec2field (nonlin_solution, *u);  // u = most recent guess

  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON)
    // essential boundary conditions must be set to zero because
    // the unknown vector in the linear system is a correction
    // vector (assume that nonlin_solution has correct ess. bc.)
    dof->fillEssBC2zero();
  else
    // normal (default) treatment of essential boundary cond.
    dof->unfillEssBC2zero();

  makeSystem (*dof, *lineq);

  // init start vector for iterative linear solver:
  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON)
    // start for a correction vector (expected to be approx 0):
    linear_solution.fill (0.0);
  else
    // use the most recent nonlinear solution:
    linear_solution = nonlin_solution;

  lineq->solve();  // invoke a linear system solver
  // the solution of the linear system is now available
  // in the vector linear_solution
}
Nonlinear PDEs – p. 343
integrands (1)
void NlHeat1::integrands(ElmMatVec& elmat, const FiniteElement& fe)
{
  const real dt = tip->Delta();             // current time step
  const int nsd = fe.getNoSpaceDim();       // no of space dims
  const real u_pt = u->valueFEM (fe);       // interpolate u
  const real up_pt = u_prev->valueFEM (fe); // interpolate u_prev
  Ptv(real) gradu_pt (nsd);                 // grad u at present pt.
  u->derivativeFEM (gradu_pt, fe);          // compute gradu_pt
  Ptv(real) gradup_pt (nsd);                // grad u_prev --"--
  u_prev->derivativeFEM (gradup_pt, fe);    // compute gradup_pt

  const int nbf = fe.getNoBasisFunc();      // no of local nodes
  const real detJxW = fe.detJxW();
  real gradNi_gradNj, gradNi_gradu, h;
  int i, j, s;
Nonlinear PDEs – p. 344
integrands (2)
  if (nlsolver->getCurrentState().method == NEWTON_RAPHSON) {
    for (i = 1; i <= nbf; i++) {
      gradNi_gradu = 0;
      for (s = 1; s <= nsd; s++)
        gradNi_gradu += fe.dN(i,s)*gradu_pt(s);
      for (j = 1; j <= nbf; j++) {
        gradNi_gradNj = 0;
        for (s = 1; s <= nsd; s++)
          gradNi_gradNj += fe.dN(i,s)*fe.dN(j,s);
        h = fe.N(i)*fe.N(j) + dt*( lambda(u_pt)*gradNi_gradNj
            + dlambda(u_pt)*fe.N(j)*gradNi_gradu );
        elmat.A(i,j) += h*detJxW;
      }
      h = fe.N(i)*(u_pt - up_pt) + dt*u_pt*gradNi_gradu;
      elmat.b(i) += -h*detJxW;
    }
  }
  else
    // error message ... not implemented ...
}
Nonlinear PDEs – p. 345
Lessons learned
Transient PDE solver = stationary PDE solver + a time loop and a couple of extra data items
Nonlinear PDE solver = linear PDE solver + a hidden nonlinear loop and three extra data items
Learn software tools for ∇²u = f well;
- they can be trivially reused for systems of transient nonlinear PDEs
- the Diffpack programming philosophy remains the same
The implementational steps from simple to advanced problems can be small
Be prepared for major numerical steps when moving to advanced problems
Diffpack does not simplify the numerics, just the implementation (⇒ the Diffpack book contains both numerics and software)
Nonlinear PDEs – p. 346
Elasticity
Elasticity – p. 347
Linear thermo-elasticity
Application area: structural analysis
[Figure: structure subject to a pressure load]
Purpose of simulation: compute deformation and internal forces(stress)
Elasticity – p. 348
The deformation and a stress measure
[Figure: deformed configuration with a color-coded stress measure, values ranging from about 0.02 to 1.43]
Elasticity – p. 349
Mathematical model (1)
Basic quantities:
u(x): displacement field (a vector at each point)
σ_ij: the stress tensor (3×3 matrix) (needed for evaluating stresses)
λ, µ: elasticity coefficients
T: temperature deviation
Basic equations:
Equilibrium:
∇ · σ = 0
Constitutive law for elasticity (Hooke's law):
σ = λ(∇ · u)I + µ(∇u + (∇u)^T) − α(3λ + 2µ)TI
Elasticity – p. 350
A look at the very basics
Consider elongation of a bar
[Figure: bar pulled with force F at both ends; cuts through the bar show the force distributed as F/2 + F/2 over the cross-sections]
The stress at the red circle depends on the surface orientation(note: stress is difficult to understand!)
Elasticity – p. 351
Mathematical model (2)
Combining the equations gives:
∇ [(λ+ µ)∇ · u] +∇ · [µ∇u] = ∇ [α(3λ+ 2µ)T ]
or with constant λ and µ:
(λ+ µ)∇(∇ · u) + µ∇2u = α(3λ+ 2µ)∇T
Primary unknown: u
Primary interest: σ
T is prescribed or found from a heat eq.
Solve for u, find σ from Hooke’s law
Note: d unknowns per node: u
Linear (elliptic) vector PDE for u
Elasticity – p. 352
Special versions of the model
Full 3D thermo-elasticity
2D plane strain elasticity: u_3 = 0, ∂/∂x_3 = 0
2D plane stress elasticity: set u_3 = 0, ∂/∂x_3 = 0 and modify λ
Elasticity – p. 353
Notation
Index notation to condense formulas
Rule 1: a_i is a vector, a_ij is a tensor
Rule 2: sum over repeated indices
a_i b_i ≡ Σ_{i=1}^d a_i b_i
Rule 3: a comma denotes differentiation
f,_i ≡ ∂f/∂x_i, a_{i,k} ≡ ∂a_i/∂x_k
These rules can be combined, e.g.,
σ_{ij,j} ≡ Σ_{j=1}^d ∂σ_ij/∂x_j
Elasticity – p. 354
The Kronecker delta
Kronecker delta: δ_ij = 0 if i ≠ j, δ_ij = 1 if i = j
With the summation convention: δ_ii = 1 + 1 + 1 = 3 (!)
Without the summation convention: δ_ii = 1
Rule 4: annihilate expressions with δ_ij
δ_ij v_j = v_i, v_j δ_ij = v_i, a_ij δ_ij = a_ii
(e.g., set i = 1: δ_1j v_j = 1 · v_1 + 0 · v_2 + 0 · v_3)
Elasticity – p. 355
Mathematical model with new notation
Basic quantities:
u_i(x_j): displacement field
σ_ij: the stress tensor
λ, µ: elasticity coefficients
T: temperature deviation
Basic equations:
Equilibrium:
σ_{ij,j} = 0
Constitutive law for elasticity (Hooke's law):
σ_ij = λu_{k,k}δ_ij + µ(u_{i,j} + u_{j,i}) − α(3λ + 2µ)Tδ_ij
Combined into an equation for u_i:
((λ + µ)u_{k,k}),_i + (µu_{i,j}),_j = (α(3λ + 2µ)T),_i
Elasticity – p. 356
The Poisson equation revisited (1)
−∇ · [λ∇u] = f
Written as a first-order system:
∇ · q = f
q = −λ∇u
Starting with this system, and eliminating q after having derived the weighted residual form and performed integration by parts, is the approach we shall use in elasticity (as it simplifies the mathematical details in the elasticity problem)
Elasticity – p. 357
The Poisson equation revisited (2)
Weighted residual form (Galerkin's method):
∫_Ω ∇ · q̂ N_i dΩ, q ≈ q̂ = Σ_j q_j N_j, u ≈ û = Σ_j u_j N_j
Integration by parts:
∫_Ω ∇ · q̂ N_i dΩ = −∫_Ω ∇N_i · q̂ dΩ + ∫_∂Ω N_i q̂ · n dΓ
Insert q̂ = −λ∇û, and obtain a standard FEM problem
Short notation with indices (q → q_i):
∇ · q = Σ_{k=1}^d ∂q_k/∂x_k = q_{k,k}, q_k = −λu,_k
Sum over repeated index; a comma denotes differentiation
Elasticity – p. 358
The Poisson equation revisited (3)
Integration by parts in the alternative notation:
∫_Ω q_{k,k}N_i dΩ = −∫_Ω N_{i,k}q_k dΩ + ∫_∂Ω N_i n_k q_k dΓ
q_k = −λu,_k ≈ −λ Σ_{j=1}^n u_j N_{j,k}
⇒ Linear system (as usual)
Σ_{j=1}^n (∫_Ω λ N_{i,k}N_{j,k} dΩ) u_j = ∫_Ω fN_i dΩ + ∫_∂Ω N_i λu,_k n_k dΓ
(note: N_{i,k}N_{j,k} = ∇N_i · ∇N_j, and λu,_k n_k = λ ∂u/∂n)
This new notation and the use of both q_k and u in the derivation make the numerical details of more complicated problems (e.g. elasticity) easier
Elasticity – p. 359
FEM in elasticity (1)
Equilibrium equation:
σ_{rs,s} = 0
Galerkin's method + integration by parts:
∫_Ω σ_rs N_{i,s} dΩ = ∫_∂Ω N_i σ_rs n_s dΓ (boundary term → b.c.)
d equations, each weighted by N_i
Replace σ_rs by u_r:
σ_rs = λu_{k,k}δ_rs + µ(u_{r,s} + u_{s,r}) − α(3λ + 2µ)Tδ_rs
d equations for each i (node)
Elasticity – p. 360
FEM in elasticity (2)
Expansion:
u_i ≈ û_i = Σ_{j=1}^n u_j^i N_j(x_1, …, x_d)
d unknowns at each node: u_j^1, …, u_j^d
Linear system: Kx = b
x = (u_1^1, …, u_1^d, u_2^1, …, u_2^d, …, u_n^1, …, u_n^d)^T
Matrices (vectors) consist of d × d (d) blocks
Elasticity – p. 361
The element equations
Element level equations can be written as
Σ_j Σ_s A_{i,j}^{rs} u_j^s = b_i^r, r = 1, …, d, i = 1, …, n
Equation number: (i, r) → d(i − 1) + r
Unknown number: (j, s) → d(j − 1) + s
Element matrix: dn × dn (n = no of nodes in the element)
For fixed i and j (node numbers), A_{i,j}^{rs} is a d × d matrix reflecting the coupling of nodes i and j
Elasticity – p. 362
Derivation of the element equations
Aim 1: insert σ_rs in ∫_Ω σ_rs N_{i,s} dΩ, where
σ_rs = λu_{k,k}δ_rs + µ(u_{r,s} + u_{s,r}) − α(3λ + 2µ)Tδ_rs
and
u_i ≈ û_i = Σ_{j=1}^n u_j^i N_j(x)
Aim 2: manipulate the expressions to identify the element matrix A_{i,j}^{rs} and vector b_i^r
Elasticity – p. 363
Derivation; cont.
Let's look at the first term in σ_rs:
σ_rs = λu_{k,k}δ_rs = λ Σ_j N_{j,k}u_j^k δ_rs in ∫_Ω σ_rs N_{i,s} dΩ
gives
Σ_j Σ_s ∫_Ω λ (Σ_k N_{j,k}u_j^k) N_{i,s}δ_rs dΩ
Elasticity – p. 364
Derivation; cont.
Try to rewrite
Σ_j Σ_s ∫_Ω λ (Σ_k N_{j,k}u_j^k) N_{i,s}δ_rs dΩ
on the form Σ_j Σ_s A_{i,j}^{rs} u_j^s
Step 1: δ_rs φ_s = φ_r for any vector φ
⇒ annihilate s: Σ_s N_{i,s}δ_rs = N_{i,r}
Step 2: exchange k with s (dummy summation index):
Σ_j Σ_s [∫_Ω λ N_{i,r}N_{j,s} dΩ] u_j^s, where the bracket is A_{i,j}^{rs}
Elasticity – p. 365
Derivation; cont.
Next term:
Σ_j Σ_s [∫_Ω µ N_{j,s}N_{i,s} dΩ] u_j^r
Step 1: u_j^r = δ_rk u_j^k (sum over k)
Step 2: exchange k and s
Result:
Σ_j Σ_s [∫_Ω (Σ_k µ N_{i,k}N_{j,k}) δ_rs dΩ] u_j^s
Easy to identify A_{i,j}^{rs}
The next terms are straightforward
Elasticity – p. 366
Result of derivation (1)
General formula for A_{i,j}^{rs}:
A_{i,j}^{rs} = ∫_Ω [µ (Σ_k N_{i,k}N_{j,k}) δ_rs + µN_{i,s}N_{j,r} + λN_{i,r}N_{j,s}] dΩ
This derivation: global level
Local level: replace Ω by the reference element and dΩ by det J dξ_1 ⋯ dξ_d
Elasticity – p. 367
Result of derivation (2)
Right-hand side:
b_i^r = ∫_Ω (2µ + 3λ)αT N_{i,r} dΩ + ∫_∂Ω N_i t_r dΓ
t_r = σ_rs n_s: the stress vector at the surface ∂Ω
Essential conditions: u_r given
Natural conditions: t_r given
Elasticity – p. 368
Implementation
Standard Poisson1/2 data
The DegFreeFE object is more central (d unknowns per node!)
FieldsFE u (vector field)
Vec(real) solution (solution of the linear system)
Shuffling u ↔ solution using DegFreeFE
Initialization:
u.rebind (new FieldsFE (*grid, "u"));
dof.rebind (new DegFreeFE (*grid, nsd)); // !!!
solution.redim (u->getNoValues());
lineq->attach (solution);
Elasticity – p. 369
Entries in the element matrix
[Figure: element matrix for a 4-node element, divided into 2×2 blocks; one block couples nodes 2 and 3, and a single entry couples local dof 2 in node 3 with local dof 1 in node 4]
Elasticity – p. 370
The heart of the integrands routine
4 loops:
// matrix:
for i = 1,...,nbf
  for j = 1,...,nbf
    for r = 1,...,d
      for s = 1,...,d
        add A_i,j^rs into
          elmat.A (d*(i-1)+r, d*(j-1)+s)
// right-hand side:
for i = 1,...,nbf
  for r = 1,...,d
    add b_i^r into
      elmat.b (d*(i-1)+r)
Elasticity – p. 371
The important variables in elasticity
Primary unknowns in the finite element method: the displacement field u_r
Primary interest:
the components of the stress tensor σ_rs
a norm of σ_rs, e.g.,
m ∝ √(σ′_ij σ′_ij)
where
σ′_rs ≡ σ_rs − (1/3)σ_{kk}δ_rs
Note: σ_rs ∼ ∂u_r/∂x_s
σ_rs is discontinuous across element boundaries
Smoothing might be necessary
Elasticity – p. 372
Computing derivatives (1)
u(x) is a finite element field
Define g = ∂u/∂x_1
Bilinear u ⇒ g ∼ a + bx_2
Linear u ⇒ g ∼ const
g is discontinuous across element boundaries
g has optimal accuracy at the reduced Gauss points (= the centroid in linear/bilinear elements)
Elasticity – p. 373
Computing derivatives (2)
g = ∂u/∂x_1 (e.g.)
Find a continuous ĝ = Σ_j g_j N_j(x) as an approximate solution of
ĝ = g
Galerkin or least squares:
Σ_j (∫_Ω N_i N_j dΩ) g_j = ∫_Ω gN_i dΩ
Integrate the rhs with a reduced Gauss rule
Lump the mass matrix ∫N_i N_j dΩ
Efficient solution of a diagonal system for g_j
⇒ ĝ is a continuous field
Elasticity – p. 374
Computing derivatives in Diffpack
Let m be a stress norm (discontinuous)
Representation of m: class FieldsFEatItgPt
FieldsFEatItgPt = nf point values of derivatives at each (possiblyreduced) integration point in each element (representing m only:nf = 1)
Class FieldsFEatItgPt has a functionvoid derivedQuantitiesAtItgPt(FEM& fesolver,GridFE& grid,int nfields,NumItgPoints pt_tp = GAUSS_POINTS,int relative_order = -1 // reduced Gauss pts);
that runs through all elements in the grid and their (reduced)integration points, and for each point, fesolver’s virtualderivedQuantitiesAtItgPt is called for defining thevalues of the nfields discontinuous fields at the current point
Elasticity – p. 375
Computing derivatives in Diffpack
In class Elasticity1:
class Elasticity1 : public FEM
{
  ...
  Handle(FieldsFEatItgPts) stress_measures; // m
  ...
};

void Elasticity1:: calcDerivedQuantities ()
{
  // stress_measures contains stresses (now only the norm m)
  stress_measures->derivedQuantitiesAtItgPt
    (*this, *grid, 1 /* 1 derived quantity */,
     GAUSS_POINTS, -1 /* reduced Gauss-Legendre points */);
  FEM::smoothFields (*smooth_stress_measures, *stress_measures);
}

void Elasticity1:: derivedQuantitiesAtItgPt
  (VecSimple(NUMT)& quantities, const FiniteElement& fe)
{
  // fill quantities(1) with the expression for m
}
Elasticity – p. 376
Plate with imperfection
Consider a plate with an elliptic hole:
[Figure: plate with an elliptic hole, pulled with tension σ at both ends]
Plane stress (thin plate)
Elasticity – p. 377
Boundary conditions
Equation: Navier, 2D, with λ replaced by
λ′ = 2λµ/(λ + 2µ)
Boundary with tension force: stress vector known (t = σi)
Inner boundary of the hole: no stress
Upper and lower boundaries: no stress
⇒ 2 conditions at each point on the boundary
(recall that σ_i3 = 0 by definition of plane stress)
(In a 3D formulation of the problem we would trivially get 3 conditions at each point)
Elasticity – p. 378
Symmetry
For numerical computations it is crucial to reduce the size of the domain as much as possible
Here: symmetry about two lines
[Figure: plate under tension σ with the two symmetry lines indicated]
Conditions at a symmetry line:
vanishing normal displacement: 1 condition
no shear stress: 1 (2D) or 2 (3D) conditions
Elasticity – p. 379
Numerical simulations
[Figure: equivalent stress in the deformed configuration of the quarter plate; values from about 0.37 to 7.91]
Elasticity – p. 380
Plate with crack
Let the ellipse collapse to a line
[Figure: plate with a line crack, pulled with tension σ at both ends]
Elasticity – p. 381
Boundary conditions
What has actually changed?
Nothing;
same boundary conditions (stress-free inner surface of the crack)
same symmetry properties
However: the extreme geometry will lead to infinite stresses at the crack tip
Elasticity – p. 382
Numerical simulations
[Figure: equivalent stress in the deformed configuration; values from about 0.45 to 5.37]
Elasticity – p. 383
Elastic beam with a crack
[Figure: elastic beam with a crack, clamped at one end and subject to a uniform pressure load; undeformed and deformed configurations]
Elasticity – p. 384
Mathematical model
Elasticity
Plane strain
No temperature effects
Elasticity – p. 385
Numerical simulation: stress
[Figure: equivalent stress in the deformed configuration of the cracked beam]
Elasticity – p. 386
Shallow water waves
Shallow water waves – p. 387
Tsunamis
Waves in fjords, lakes, or oceans, generated by
slides
earthquakes
subsea volcanoes
asteroids
Human activity, like nuclear detonations, or slides triggered by oil drilling, may also generate tsunamis
Propagation over large distances
Hardly recognizable in the open ocean, but wave amplitude increases near shore
Run-up at the coasts may result in severe damage
Giant events: Dec 26 2004 (≈ 300 000 killed), 1883 (similar to 2004), 65 My ago (extinction of the dinosaurs)
Shallow water waves – p. 388
Norwegian tsunamis
[Map of Norway and Sweden with Oslo, Stockholm, Bergen, Tromsø, Bodø, and Trondheim marked]
Circles: major incidents, > 10 killed; Triangles: selected smaller incidents; Square: Storegga (5000 B.C.)
Shallow water waves – p. 389
Tsunamis in the Pacific
Scenario: an earthquake outside Chile generates a tsunami that propagates at 800 km/h across the Pacific, with run-up on densely populated coasts in Japan; http://www.geophys.washington.edu/tsunami/
Shallow water waves – p. 390
Selected events; slides
location            year       run-up   dead
Loen                1905       40 m     61
Tafjord             1934       62 m     41
Loen                1936       74 m     73
Storegga            5000 B.C.  10 m(?)  ??
Vaiont, Italy       1963       270 m    2600
Lituya Bay, Alaska  1958       520 m    2
Shimabara, Japan    1792       10 m(?)  15 000
Shallow water waves – p. 391
Selected events; earthquakes etc.
location    year       strength  run-up   dead
Thera       1640 B.C.  volcano   ?        ?
Thera       1650       volcano   ?        ?
Lisboa      1755       M=9       15(?) m  ?000
Portugal    1969       M=7.9     1 m
Amorgos     1956       M=7.4     5(?) m   1
Krakatao    1883       volcano   40 m     36 000
Flores      1992       M=7.5     25 m     1 000
Nicaragua   1992       M=7.2     10 m     168
Sumatra     2004       M=9       50 m     300 000
The selection is biased wrt. European events; 150 catastrophic tsunami events have been recorded along the Japanese coast in modern times.
Tsunamis: the no. 5 killer among natural hazards
Shallow water waves – p. 392
Why simulation?
Increase the understanding of tsunamis
Assist warning systems
Assist building of harbor protection (break waters)
Recognize critical coastal areas (e.g. move population)
Hindcast historical tsunamis (assist geologists/biologists)
Shallow water waves – p. 393
Problem sketch
[Sketch: surface elevation η(x, y, t) above the stillwater level of depth H(x, y, t), with coordinates x, y, z]
Assume wavelength ≫ depth (long waves)
Assume small amplitudes relative to depth
Appropriate approx. for many ocean wave phenomena
Reference: HPL chapter 6.2
Shallow water waves – p. 394
Mathematical model
PDEs:

∂η/∂t = −∂(uH)/∂x − ∂(vH)/∂y (− ∂H/∂t)

∂u/∂t = −∂η/∂x, x ∈ Ω, t > 0

∂v/∂t = −∂η/∂y, x ∈ Ω, t > 0
η(x, y, t) : surface elevation
u(x, y, t) and v(x, y, t) : horizontal (depth averaged) velocities
H(x, y) : stillwater depth (given)
Boundary conditions: either η, u or v given at each point
Initial conditions: all of η, u and v given
Shallow water waves – p. 395
Primary unknowns
Discretization: finite differences
Staggered grid in time and space
⇒ η, u, and v unknown at different points:

η^ℓ_{i+1/2, j+1/2}, u^{ℓ+1/2}_{i, j+1/2}, v^{ℓ+1/2}_{i+1/2, j+1}

[Sketch of one grid cell: η^ℓ_{i+1/2, j+1/2} at the centre, u^{ℓ+1/2}_{i, j+1/2} and u^{ℓ+1/2}_{i+1, j+1/2} at the left and right faces, v^{ℓ+1/2}_{i+1/2, j} and v^{ℓ+1/2}_{i+1/2, j+1} at the bottom and top faces]
Shallow water waves – p. 396
A global staggered grid
Widely used grid in computational fluid dynamics (CFD)
Important for Navier-Stokes solvers
Basic idea: centered differences in time and space
Shallow water waves – p. 397
Discrete equations; η
∂η/∂t = −∂(uH)/∂x − ∂(vH)/∂y at (i + 1/2, j + 1/2, ℓ − 1/2):

(1/Δt)[η^ℓ_{i+1/2, j+1/2} − η^{ℓ−1}_{i+1/2, j+1/2}]
= −(1/Δx)[(Hu)^{ℓ−1/2}_{i+1, j+1/2} − (Hu)^{ℓ−1/2}_{i, j+1/2}]
− (1/Δy)[(Hv)^{ℓ−1/2}_{i+1/2, j+1} − (Hv)^{ℓ−1/2}_{i+1/2, j}]
Shallow water waves – p. 398
Discrete equations; u
∂u/∂t = −∂η/∂x at (i, j + 1/2, ℓ):

(1/Δt)[u^{ℓ+1/2}_{i, j+1/2} − u^{ℓ−1/2}_{i, j+1/2}] = −(1/Δx)[η^ℓ_{i+1/2, j+1/2} − η^ℓ_{i−1/2, j+1/2}]
Shallow water waves – p. 399
Discrete equations; v
∂v/∂t = −∂η/∂y at (i + 1/2, j, ℓ):

(1/Δt)[v^{ℓ+1/2}_{i+1/2, j} − v^{ℓ−1/2}_{i+1/2, j}] = −(1/Δy)[η^ℓ_{i+1/2, j+1/2} − η^ℓ_{i+1/2, j−1/2}]
Shallow water waves – p. 400
Complicated coastline boundary
Saw-tooth approximation to real boundary
Successful method, widely used
Warning: can lead to nonphysical waves
Shallow water waves – p. 401
Relation to the wave equation
Eliminate u and v (easy!)
∂²η/∂t² = ∇ · [H(x, y)∇η]
Eliminate discrete u and v
⇒ Standard 5-point explicit finite difference scheme for discrete η
Shallow water waves – p. 402
Stability and accuracy
Centered differences in time and space
⇒ truncation error: O(Δx², Δy², Δt²)
Stability as for the std. wave equation in 2D:
Δt ≤ H^{−1/2} [1/Δx² + 1/Δy²]^{−1/2}

(CFL condition)
If H is constant, an exact numerical solution is possible for one-dimensional wave propagation
Shallow water waves – p. 403
Verification of an implementation
How can we verify that the program works?
Compare with an analytical solution (if possible)
Check that basic physical mechanisms are reproduced in a qualitatively correct way by the program
Shallow water waves – p. 404
Tsunami due to a slide
Surface elevation ahead of the slide, dump behind
Initially, negative dump propagates backwards
The surface waves propagate faster than the slide moves
Shallow water waves – p. 405
Tsunami due to faulting
The sea surface deformation reflects the bottom deformation
Velocity of surface waves (H ∼ 5 km): 790 km/h
Velocity of seismic waves in the bottom: 6000–25000 km/h
Shallow water waves – p. 406
Tsunami approaching the shore
The velocity of a tsunami is √(gH(x, y, t)).
The back part of the wave moves at higher speed ⇒ the wave becomes more peak-formed
Deep water (H ∼ 3 km): wave length 40 km, height 1 m
Shallow water (H ∼ 10 m): wave length 2 km, height 4 m
Shallow water waves – p. 407
Tsunamis experienced from shore
As a fast tide, with strong currents in fjords
A wall of water approaching the beach
Wave breaking: the top has larger effective depth and moves faster than the front part (requires a nonlinear PDE)
Shallow water waves – p. 408
A penalty N-S solver
A penalty N-S solver – p. 409
Viscous fluid flow
[Figure: flow in a constricted channel]
Many processes in science and technology involve viscous fluid flow, and the numerical models then need solvers for the Navier-Stokes (N-S) equations:
ϱ(∂v/∂t + v · ∇v) = −∇p + µ∇²v + ϱb

∇ · v = 0
A penalty N-S solver – p. 410
Different ways of writing the N-S eqs.
With vector symbols:
%
(∂v
∂t+ v · ∇v
)= −∇p+ µ∇2v + %b
∇ · v = 0
With index notation:
%(vr,t + vsvr,s) = −p,r + µvr,ss + %br
vs,s = 0
The latter is preferred when deriving the details of a numerical method (as in the elasticity problem)
A penalty N-S solver – p. 411
Numerical methods
The condition ∇ · v = 0 and the term ∇p make the N-S equations hard to solve numerically
There are numerous approaches:
fully implicit
artificial compressibility
penalty functions
operator splitting
A penalty N-S solver – p. 412
Penalty methods
Firm basis in calculus of variations
Main result: p = −λ∇ · v, λ → ∞
⇒ Can eliminate the pressure!!!
Result: a kind of nonlinear transient elasticity problem
Very convenient from a numerical point of view
λ → ∞ gives some undesired numerical properties (ill-conditioned matrix systems)
Good educational example on using Diffpack
Reference: HPL chapter 6.3.
A penalty N-S solver – p. 413
Modified N-S equations
p = −λ∇ · v = −λ v_{s,s} eliminates p and the equation ∇ · v = v_{s,s} = 0

Result: ϱ(α v_{r,t} + v_s v_{r,s}) = λ v_{s,sr} + µ v_{r,ss} + ϱ b_r

≈ the equation of linear elasticity, modulo the acceleration terms on the left-hand side (which add transient and nonlinear effects)
Implementation in Diffpack: extend class Elasticity1 with a time loop and a nonlinear solver
⇒ Combine classes Poisson1, Heat1, NlHeat1, and Elasticity1
A penalty N-S solver – p. 414
Basic steps
1. Derive the weak form
2. Identify the integrands
3. Get control of the element degrees of freedom, i.e., how the formulas are stacked in the element matrix/vector
Strong similarity to the elasticity problem!
Then, use class NlHeat1 as a template for administering the solution process
A penalty N-S solver – p. 415
Discretization (1)
In time: θ-rule (θ = 1: backward Euler, θ = 0.5: Crank-Nicolson)
In space: isoparametric finite elements
v_r^ℓ(x, t) = Σ_{j=1}^{n} v_j^{r,ℓ} N_j(x)

Weak form: multiply by N_i, integrate 2nd order derivatives by parts

⇒ Nonlinear system at each time level:

F_i^r(v_1^1, ..., v_1^d, v_2^1, ..., v_2^d, ..., v_n^1, ..., v_n^d) = 0

for i = 1, ..., n (nodes), r = 1, ..., d (loc. dof.)
A penalty N-S solver – p. 416
Discretization (2)
Newton-Raphson method for F_i^r = 0

⇒ Sequence of linear systems

Σ_{j=1}^{n} Σ_{s=1}^{d} A_{i,j}^{rs} δv_j^s = −F_i^r

A_{i,j}^{rs} ≡ ∂F_i^r / ∂v_j^{s,ℓ}
A penalty N-S solver – p. 417
Entries in the element matrix
[Sketch: global matrix organized in 2×2 blocks; a block couples two nodes (e.g. nodes 2 and 3), and an entry within a block couples local degrees of freedom, e.g. local dof 2 in node 3 with local dof 1 in node 4]
A penalty N-S solver – p. 418
Selective reduced integration
The λ term must be integrated by a rule of one order lower than the rule used for the other terms (equivalent to using mixed interpolation).
A penalty N-S solver – p. 419
Computing the pressure
p = −λ v_{s,s} can be computed when the velocity is known

p ∼ derivatives of v_s ⇒ p becomes discontinuous

Might smooth p:

Σ_{j=1}^{n} M_{i,j} p_j = b_i

where

b_i = −λ ∫ v_{s,s} N_i dΩ, M_{i,j} = ∫ N_i N_j dΩ

normally with lumped M_{i,j}

⇒ Another finite element assembly process
Diffpack tool: class FieldsFEatItgPt or integrands functor
A penalty N-S solver – p. 420
The idea of integrand functors
A solver can only have one integrands function
What if it needs more than one?
if-else tests in integrands
external integrands functions as functors
Basic structure of an integrand functor:

class MyExtraIntegrand : public IntegrandCalc
{
  MySim* data;  // access to all solver data
public:
  MyExtraIntegrand (MySim* sim) : data(sim) {}

  // normal integrands function;
  // access physical parameters in the solver by data->
  virtual void integrands (ElmMatVec& em, const FiniteElement& fe);
};
Overloaded versions of FEM::makeSystem work with integrand functors (as an alternative to the integrands function in the solver)
A penalty N-S solver – p. 421
Implementation; pressure computation
Integrand functor for

b_i = −λ ∫ v_{s,s} N_i dΩ

M_{i,j} can be computed once and for all by makeMassMatrix

Class FEM has a function smoothField for solving

Σ_j M_{i,j} p_j = functor-defined right-hand side
// integrand functor:
class PressureIntg : public IntegrandCalc { ... };

void NsPenalty1:: calcDerivedQuantities ()
{
  PressureIntg penalty_integrand (this);
  FEM::smoothField (*p, penalty_integrand);
  // calls makeMassMatrix (if necessary), makeSystem,
  // and solves the diagonal system
}
A penalty N-S solver – p. 422
Flow in a constricted channel (1)
[Figure: constricted channel with a uniform inlet profile, an outlet, and solid walls; Re = 60]
A penalty N-S solver – p. 423
Flow in a constricted channel (2)
Re = 60

[Figure: contour plots of the flow in the constricted channel; contour levels from −0.00336 to 0.0195]
A penalty N-S solver – p. 424
A fast FE N-S solver
A fast FE N-S solver – p. 425
Splitting the N-S equations
ϱ(v_{r,t} + v_s v_{r,s}) = −p_{,r} + µ v_{r,ss} + ϱ b_r
v_{s,s} = 0

Difficulty: p_{,r} and v_{s,s} = 0
Idea: Split N-S into simpler equations
Common approach: split N-S into
an explicit convection-diffusion equation for v_r
an (implicit) Poisson equation for p
an explicit updating formula for v_r
A fast FE N-S solver – p. 426
A 2nd order algorithm (1)
ϱ(v_{r,t} + v_s v_{r,s}) = −p_{,r} + µ v_{r,ss} + ϱ b_r
v_{s,s} = 0

1. Calculation of an intermediate velocity field:

k_r^(1) = −Δt (v_s^ℓ v_{r,s}^ℓ − ν v_{r,ss}^ℓ)

v̄_r = v_r^ℓ + k_r^(1)

k_r^(2) = −Δt (v̄_s v̄_{r,s} − ν v̄_{r,ss})

v_r* = v_r^ℓ + (1/2)(k_r^(1) + k_r^(2))
A fast FE N-S solver – p. 427
A 2nd order algorithm (2)
2. Solution of a Poisson equation for the new pressure (arising from the incompressibility constraint v_{s,s}^{ℓ+1} = 0):

∇²p^{ℓ+1} = (ϱ/Δt) v*_{s,s}

3. Correction of the intermediate velocity field:

v_r^{ℓ+1} = v_r* − (p_{,r}^{ℓ+1} − ϱ b_r) Δt/ϱ

Same interpolation for v_r and p (no need for mixed finite elements)
A fast FE N-S solver – p. 428
Discrete equations
First step:

M k_r^(1) = −Δt a_r(v_1, ..., v_d) − ν Δt K v_r

for r = 1, ..., d.
K: ∇² operator
a_r: nonlinear convective term

Explicit updates à la

v̄_r = v_r^ℓ + k_r^(1), v_r* = v_r^ℓ + (1/2)(k_r^(1) + k_r^(2))

The pressure Poisson equation:

K p^{ℓ+1} = (ϱ/Δt) B_s v_s*

Correcting the velocity field (c_r contains body forces):

M v_r^{ℓ+1} = M v_r* − (B_r p^{ℓ+1} − ϱ c_r)
A fast FE N-S solver – p. 429
Implementation
Obvious: create a standard solver for the Poisson equation and the scalar explicit updates
Observation I: the original vector equations are split into d independent scalar equations (!)
Requires several weak forms ⇒ integrand functors
Observation II: M , K, and Bs are independent of time
⇒ Can speed up the code by precomputing M, K, and B_s, and generate the Poisson equation from matrix-vector products only (cf. class Wave0)
Problem: the nonlinear term ar
Solution: precompute as much as possible, multiply by v at theelement level and assemble
A fast FE N-S solver – p. 430
Solving linear systems
Solving linear systems – p. 431
The importance of linear system solvers
PDE problems often (usually) result in linear systems of algebraic equations
Ax = b
Special methods utilizing that A is sparse are much faster than Gaussian elimination!
Most of the CPU time in a PDE solver is often spent on solving Ax = b
⇒ Important to use fast methods
Solving linear systems – p. 432
Example: Poisson eq. on the unit cube (1)
−∇²u = f on an n = q × q × q grid
FDM/FEM result in Ax = b system
FDM: 7 entries per row in A are nonzero
FEM: 7 (tetrahedra), 27 (trilinear elements), or 125 (triquadratic elements) entries per row in A are nonzero
A is sparse (mostly zeroes)
Fraction of nonzeroes: Rq^{−3}
(R is the number of nonzero entries per row)
Important to work with nonzeroes only!
Solving linear systems – p. 433
Example: Poisson eq. on the unit cube (2)
Compare Banded Gaussian elimination (BGE) versus Conjugate Gradients (CG)
Work in BGE: O(q⁷) = O(n^{2.33})
Work in CG: O(q³) = O(n) (multigrid; optimal); for the numbers below we use incomplete factorization preconditioning: O(n^{1.17})
n = 27000:
CG 72 times faster than BGE
BGE needs 20 times more memory than CG
n = 8 million:
CG 10⁷ times faster than BGE
BGE needs 4871 times more memory than CG
Solving linear systems – p. 434
Classical iterative methods
Ax = b, A ∈ ℝ^{n×n}, x, b ∈ ℝ^n
Split A: A = M −N
Write Ax = b as

Mx = Nx + b,
and introduce an iteration
Mx^k = Nx^{k−1} + b, k = 1, 2, ...
Systems My = z should be easy/cheap to solve
Different choices of M correspond to different classical iteration methods:
Jacobi iteration
Gauss-Seidel iteration
Successive Over-Relaxation (SOR)
Symmetric Successive Over-Relaxation (SSOR)
Solving linear systems – p. 435
Convergence
Mx^k = Nx^{k−1} + b, k = 1, 2, ...

The iteration converges if G = M⁻¹N has spectral radius ϱ(G) less than 1

Rate of convergence: R∞(G) = −ln ϱ(G)

To reduce the initial error by a factor ε,

||x − x^k|| ≤ ε ||x − x^0||

one needs

−ln ε / R∞(G)

iterations
Solving linear systems – p. 436
Some classical iterative methods
Split: A = L + D + U
L and U are lower and upper triangular parts, D is A’s diagonal
Jacobi iteration: M = D (N = −L− U )
Gauss-Seidel iteration: M = L + D (N = −U )
SOR iteration: Gauss-Seidel + relaxation
SSOR: two (forward and backward) SOR steps
Rate of convergence R∞(G) for −∇²u = f in 2D with u = 0 as BC:
Jacobi: π²h²/2
Gauss-Seidel: π²h²
SOR: 2πh
SSOR: > πh
SOR/SSOR is superior (h vs. h², and h → 0 is small)
Solving linear systems – p. 437
Jacobi iteration
M = D
Put everything, except the diagonal, on the rhs
2D Poisson equation −∇²u = f:

u_{i,j−1} + u_{i−1,j} + u_{i+1,j} + u_{i,j+1} − 4u_{i,j} = −h² f_{i,j}

Solve for the diagonal element and use old values on the rhs:

u_{i,j}^k = (1/4)(u_{i,j−1}^{k−1} + u_{i−1,j}^{k−1} + u_{i+1,j}^{k−1} + u_{i,j+1}^{k−1} + h² f_{i,j})

for k = 1, 2, ...
Solving linear systems – p. 438
Relaxed Jacobi iteration
Idea: compute a new approximation x* from

Dx* = (−L − U)x^{k−1} + b

Set

x^k = ωx* + (1 − ω)x^{k−1}

x^k is a weighted mean of x^{k−1} and x* if ω ∈ (0, 1)
Solving linear systems – p. 439
Relation to explicit time stepping
Relaxed Jacobi iteration for −∇²u = f is equivalent to solving

α ∂u/∂t = ∇²u + f

by an explicit forward scheme until ∂u/∂t ≈ 0, provided ω = 4Δt/(αh²)

Stability for the forward scheme implies ω ≤ 1

In this example: ω = 1 best (⇔ largest Δt)

The forward scheme for t → ∞ is a slow scheme, hence Jacobi iteration is slow
Solving linear systems – p. 440
Gauss-Seidel/SOR iteration
M = L + D
For our 2D Poisson eq. scheme:
u_{i,j}^k = (1/4)(u_{i,j−1}^k + u_{i−1,j}^k + u_{i+1,j}^{k−1} + u_{i,j+1}^{k−1} + h² f_{i,j})

i.e., solve for the diagonal term and use the most recently computed values on the right-hand side

SOR is relaxed Gauss-Seidel iteration:
compute x* from a Gauss-Seidel iteration
set x^k = ωx* + (1 − ω)x^{k−1}
ω ∈ (0, 2), with ω = 2 − O(h) as the optimal choice
Very easy to implement!
Solving linear systems – p. 441
Symmetric/double SOR: SSOR
SSOR = Symmetric SOR
One (forward) SOR sweep for unknowns 1, 2, 3, . . . , n
One (backward) SOR sweep for unknowns n, n− 1, n− 2, . . . , 1
M can be shown to be

M = 1/(2 − ω) ((1/ω)D + L) ((1/ω)D)⁻¹ ((1/ω)D + U)

Notice that each factor in M is diagonal or lower/upper triangular (⇒ very easy to solve systems My = z)
Solving linear systems – p. 442
Status: classical iterative methods
Jacobi, Gauss-Seidel/SOR, SSOR are too slow for practical PDE computations
The simplest possible solution method for −∇²u = f and other stationary PDEs in 2D/3D is to use SOR
Classical iterative methods converge quickly in the beginning but slow down after a few iterations
Classical iterative methods are important ingredients in multigrid methods
Solving linear systems – p. 443
Conjugate Gradient-like methods
Ax = b, A ∈ ℝ^{n×n}, x, b ∈ ℝ^n
Use a Galerkin or least-squares method to solve a linear system (!)
Idea: write

x^k = x^{k−1} + Σ_{j=1}^{k} α_j q_j

α_j: unknown coefficients, q_j: known vectors

Compute the residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

and apply the ideas of the Galerkin or least-squares methods
Solving linear systems – p. 444
Galerkin
Residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

Galerkin's method (r ∼ R, q_j ∼ N_j, α_j ∼ u_j):

(r^k, q_i) = 0, i = 1, ..., k

(·, ·): Euclidean inner product

Result: linear system for α_j,

Σ_{j=1}^{k} (A q_j, q_i) α_j = (r^{k−1}, q_i), i = 1, ..., k
Solving linear systems – p. 445
Least squares
Residual:

r^k = b − Ax^k = r^{k−1} − Σ_{j=1}^{k} α_j A q_j

Least squares: minimize (r^k, r^k):

∂/∂α_i (r^k, r^k) = 0

Result: linear system for α_j:

Σ_{j=1}^{k} (A q_i, A q_j) α_j = (r^{k−1}, A q_i), i = 1, ..., k
Solving linear systems – p. 446
The nature of the methods
Start with a guess x0
In iteration k: seek xk in a k-dimensional vector space Vk
Basis for the space: q1, . . . , qk
Use Galerkin or least squares to compute the (optimal) approximation x^k in V_k
Extend the basis from Vk to Vk+1 (i.e. find qk+1)
Solving linear systems – p. 447
Extending the basis
V_k is normally selected as a so-called Krylov subspace:

V_k = span{r^0, Ar^0, ..., A^{k−1}r^0}

Alternatives for computing q_{k+1} ∈ V_{k+1}:

q_{k+1} = r^k + Σ_{j=1}^{k} β_j q_j

q_{k+1} = Ar^k + Σ_{j=1}^{k} β_j q_j
How to choose βj?
Solving linear systems – p. 448
Orthogonality properties
Bad news: must solve a k × k linear system for α_j in each iteration (as k → n, the work in each iteration approaches the work of solving Ax = b!)
The coefficient matrix in the αj system:
(Aqi, qj), (Aqi,Aqj)
Idea: make the coefficient matrices diagonal
That is,
Galerkin: (A q_i, q_j) = 0 for i ≠ j
Least squares: (A q_i, A q_j) = 0 for i ≠ j
Use βj to enforce orthogonality of qi
Solving linear systems – p. 449
Formula for updating the basis vectors
Define

⟨u, v⟩ ≡ (Au, v) = uᵀAv

and

[u, v] ≡ (Au, Av) = uᵀAᵀAv

Galerkin: require A-orthogonal q_j vectors, which then results in

β_i = −⟨r^k, q_i⟩ / ⟨q_i, q_i⟩

Least squares: require AᵀA-orthogonal q_j vectors, which then results in

β_i = −[r^k, q_i] / [q_i, q_i]
Solving linear systems – p. 450
Simplifications
Galerkin: ⟨q_i, q_j⟩ = 0 for i ≠ j gives

α_k = (r^{k−1}, q_k) / ⟨q_k, q_k⟩

and α_i = 0 for i < k (!):

x^k = x^{k−1} + α_k q_k

That is, hand-derived formulas for α_j

Least squares:

α_k = (r^{k−1}, A q_k) / [q_k, q_k]

and α_i = 0 for i < k
Solving linear systems – p. 451
Symmetric A
If A is symmetric (Aᵀ = A) and positive definite (positive eigenvalues ⇔ yᵀAy > 0 for any y ≠ 0), also β_i = 0 for i < k ⇒ need to store q_k only (q_1, ..., q_{k−1} are not used in iteration k)
Solving linear systems – p. 452
Summary: least squares algorithm
given a start vector x^0,
compute r^0 = b − Ax^0 and set q_1 = r^0.
for k = 1, 2, ... until termination criteria are fulfilled:
    α_k = (r^{k−1}, A q_k) / [q_k, q_k]
    x^k = x^{k−1} + α_k q_k
    r^k = r^{k−1} − α_k A q_k
    if A is symmetric then
        β_k = [r^k, q_k] / [q_k, q_k]
        q_{k+1} = r^k − β_k q_k
    else
        β_j = [r^k, q_j] / [q_j, q_j], j = 1, ..., k
        q_{k+1} = r^k − Σ_{j=1}^{k} β_j q_j
The Galerkin-version requires A to be symmetric and positivedefinite and results in the famous Conjugate Gradient method
Solving linear systems – p. 453
Truncation and restart
Problem: need to store q1, . . . , qk
Much storage and computations when k becomes large
Truncation: work with a truncated sum for x^k,

x^k = x^{k−1} + Σ_{j=k−K+1}^{k} α_j q_j

where a possible choice is K = 5
Small K might give convergence problems
Restart: restart the algorithm after K iterations (alternative to truncation)
Solving linear systems – p. 454
Family of methods
Generalized Conjugate Residual method = least squares + restart
Orthomin method = least squares + truncation
Conjugate Gradient method = Galerkin + symmetric and positive definite A
Conjugate Residuals method = least squares + symmetric and positive definite A
Many other related methods: BiCGStab, Conjugate Gradients Squared (CGS), Generalized Minimum Residuals (GMRES), Minimum Residuals (MinRes), SYMMLQ
Common name: Conjugate Gradient-like methods
All of these are easily called in Diffpack
Solving linear systems – p. 455
Convergence
Conjugate Gradient-like methods converge slowly (but usually faster than SOR/SSOR)
To reduce the initial error by a factor ε,

(1/2) ln(2/ε) √κ

iterations are needed, where κ is the condition number:

κ = (largest eigenvalue of A) / (smallest eigenvalue of A)

κ = O(h⁻²) when solving 2nd-order PDEs (incl. elasticity and the Poisson eq.)
Solving linear systems – p. 456
Preconditioning
Idea: introduce an equivalent system

M⁻¹Ax = M⁻¹b

solve it with a Conjugate Gradient-like method, and construct M such that
1. κ = O(1) ⇒ M ≈ A (i.e. fast convergence)
2. M is cheap to compute
3. M is sparse (little storage)
4. systems My = z (occurring in the algorithm due to M⁻¹Av-like products) are efficiently solved (O(n) op.)
Contradictory requirements!
The preconditioning business: find a good balance between 1-4
Solving linear systems – p. 457
Classical methods as preconditioners
Idea: “solve” My = z by one iteration with a classical iterative method (Jacobi, SOR, SSOR)
Jacobi preconditioning: M = D (diagonal of A)
No extra storage as M is stored in A
No extra computations as M is a part of A
Efficient solution of My = z
But: M is probably not a good approx to A
⇒ poor quality of this type of preconditioners?
Conjugate Gradient method + SSOR preconditioner is widely used
Solving linear systems – p. 458
M as a factorization of A
Idea: Let M be an LU-factorization of A, i.e.,

M = LU

where L and U are lower and upper triangular matrices, respectively.

Implications:
1. M = A (κ = 1): very efficient preconditioner!
2. M is not cheap to compute (requires Gaussian elimination on A!)
3. M is not sparse (L and U are dense!)
4. systems My = z are not efficiently solved (an O(n²) process when L and U are dense)
Solving linear systems – p. 459
M as an incomplete factorization of A
New idea: compute sparse L and U
How? compute only with nonzeroes in A
⇒ Incomplete factorization, M = L̂Û ≠ LU
M is not a perfect approx to A
M is cheap to compute and store (O(n) complexity)
My = z is efficiently solved (O(n) complexity)
This method works well, much better than SOR/SSOR preconditioning
Solving linear systems – p. 460
How to compute M
Run through a standard Gaussian elimination, which factors A as A = LU
Normally, L and U have nonzeroes where A has zeroes
Idea: let L and U be as sparse as A
Compute only with the nonzeroes of A
Such a preconditioner is called Incomplete LU Factorization, ILU
Option: add contributions outside A's sparsity pattern to the diagonal, multiplied by ω
Relaxed Incomplete Factorization (RILU): ω < 1
Modified Incomplete Factorization (MILU): ω = 1
See algorithm C.3 in the book
Solving linear systems – p. 461
Numerical experiments
Two test cases:
−∇²u = f on the unit cube and FDM
−∇²u = f on the unit cube and FEM
Diffpack makes it easy to run through a series of numerical experiments, using multiple loops, e.g.,
sub LinEqSolver_prm
set basic method = ConjGrad & MinRes
ok
sub Precond_prm
set preconditioning type = PrecRILU
set RILU relaxation parameter = 0.0 & 0.4 & 0.7 & 1.0
ok
Solving linear systems – p. 462
Test case 1: 3D FDM Poisson eq.
Equation: −∇²u = 1
Boundary condition: u = 0
7-pt star standard finite difference scheme
Grid size: 20 × 20 × 20 = 8000 points and 30 × 30 × 30 = 27000 points
Source code: $NOR/doc/Book/src/linalg/LinSys4/
All details in HPL Appendix D
Input files: $NOR/doc/Book/src/linalg/LinSys4/experiments
Solver’s CPU time written to standard output
Solving linear systems – p. 463
Jacobi vs. SOR vs. SSOR
n = 20³ = 8000 and n = 30³ = 27000
Jacobi: not converged in 1000 iterations
SOR(ω = 1.8): 2.0s and 9.2s
SSOR(ω = 1.8): 1.8s and 9.8s
Gauss-Seidel: 13.2s and 97s
SOR's sensitivity to the relaxation parameter ω:
1.0: 96s, 1.6: 23s, 1.7: 16s, 1.8: 9s, 1.9: 11s
SSOR's sensitivity to the relaxation parameter ω:
1.0: 66s, 1.6: 17s, 1.7: 13s, 1.8: 9s, 1.9: 11s
⇒ relaxation is important; great sensitivity to ω
Solving linear systems – p. 464
Conjugate Residuals or Gradients?
Compare Conjugate Residuals with Conjugate Gradients
Or: least squares vs. Galerkin
Diffpack names: MinRes and ConjGrad
MinRes: not converged in 1000 iterations
ConjGrad: 0.7s and 3.9s
⇒ ConjGrad is clearly faster than the best SOR/SSOR
Add ILU preconditioner
MinRes: 0.7s and 4s
ConjGrad: 0.6s and 2.7s
The importance of preconditioning grows as n grows
Solving linear systems – p. 465
Different preconditioners
ILU, Jacobi, SSOR preconditioners (ω = 1.2)
MinRes: Jacobi: not conv., SSOR: 11.4s, ILU: 4s
ConjGrad: Jacobi: 4.8s, SSOR: 2.8s, ILU: 2.7s
Sensitivity to the relaxation parameter in SSOR, with ConjGrad as solver:
1.0: 3.3s, 1.6: 2.1s, 1.8: 2.1s, 1.9: 2.6s
Sensitivity to the relaxation parameter in RILU, with ConjGrad as solver:
0.0: 2.7s, 0.6: 2.4s, 0.8: 2.2s, 0.9: 1.9s, 0.95: 1.9s, 1.0: 2.7s
⇒ ω slightly less than 1 is optimal; RILU and SSOR are equally fast (here)
Solving linear systems – p. 466
Test case 2: 3D FEM Poisson eq.
Equation: −∇²u = A₁π² sin πx + 4A₂π² sin 2πy + 9A₃π² sin 3πz
Boundary condition: u known
ElmB8n3D and ElmB27n3D elements
Grid size: 21× 21× 21 = 9261 nodes and 31× 31× 31 = 29791 nodes
Source code: $NOR/doc/Book/src/fem/Poisson2
All details in HPL Chapter 3.2 and 3.5
Input files: $NOR/doc/Book/src/fem/Poisson2/linsol-experiments
Solver’s CPU time available in casename-summary.txt
Solving linear systems – p. 467
Jacobi vs. SOR vs. SSOR
n = 9261 and n = 31³ = 29791, trilinear and triquadratic elements
Jacobi: not converged in 1000 iterations
SOR(ω = 1.8): 9.1s and 81s, 42s and 338s
SSOR(ω = 1.8): 47s and 248s, 138s and 755s
Gauss-Seidel: not converged in 1000 iterations
SOR's sensitivity to the relaxation parameter ω:
1.0: not conv., 1.6: 200s, 1.8: 83s, 1.9: 57s (n = 29791 and trilinear elements)
SSOR's sensitivity to the relaxation parameter ω:
1.0: not conv., 1.6: 212s, 1.7: 207s, 1.8: 245s, 1.9: 435s (n = 29791 and trilinear elements)
⇒ relaxation is important; great sensitivity to ω
Solving linear systems – p. 468
Conjugate Residuals or Gradients?
Compare Conjugate Residuals with Conjugate Gradients
Or: least squares vs. Galerkin
Diffpack names: MinRes and ConjGrad
MinRes: not converged in 1000 iterations
9261 vs 29791 unknowns, trilinear elements
ConjGrad: 5s and 22s
⇒ ConjGrad is clearly faster than the best SOR/SSOR!
Add ILU preconditioner
MinRes: 5s and 28s
ConjGrad: 4s and 16s
ILU preconditioning has a greater impact when using triquadratic elements (and when n grows)
Solving linear systems – p. 469
Different preconditioners
ILU, Jacobi, SSOR preconditioners (ω = 1.2)
MinRes: Jacobi: 68s, SSOR: 57s, ILU: 28s
ConjGrad: Jacobi: 19s, SSOR: 14s, ILU: 16s
Sensitivity to the relaxation parameter in SSOR, with ConjGrad as solver:
1.0: 17s, 1.6: 12s, 1.8: 13s, 1.9: 18s
Sensitivity to the relaxation parameter in RILU, with ConjGrad as solver:
0.0: 16s, 0.6: 15s, 0.8: 13s, 0.9: 12s, 0.95: 11s, 1.0: 16s
⇒ ω slightly less than 1 is optimal; RILU and SSOR are equally fast (here)
Solving linear systems – p. 470
More experiments
Convection-diffusion equations: $NOR/doc/Book/src/app/Cd/Verify
Files: linsol_a.i etc. as for LinSys4 and Poisson2
Elasticity equations: $NOR/doc/Book/src/app/Elasticity1/Verify
Files: linsol_a.i etc. as for the others
Run experiments and learn!
Solving linear systems – p. 471
Multigrid methods
Multigrid methods are the most efficient methods for solving linearsystems
Multigrid methods have optimal complexity O(n)
Multigrid can be used as stand-alone solver or preconditioner
Multigrid applies a hierarchy of grids
Multigrid is not as robust as Conjugate Gradient-like methods with incomplete factorization as preconditioner, but faster when it works
Multigrid is complicated to implement
Diffpack has a multigrid toolbox that simplifies the use of multigrid dramatically
Solving linear systems – p. 472
The rough ideas of multigrid
Observation: e.g. Gauss-Seidel methods are very efficient during the first iterations
High-frequency errors are efficiently damped by Gauss-Seidel
Low-frequency errors are slowly reduced by Gauss-Seidel
Idea: jump to a coarser grid such that low-frequency errors get higher frequency
Repeat the procedure
On the coarsest grid: solve the system exactly
Transfer the solution to the finest grid
Iterate over this procedure
Solving linear systems – p. 473
Damping in Gauss-Seidel’s method (1)
Model problem: −u′′ = f by finite differences:
−u_{j−1} + 2u_j − u_{j+1} = h² f_j

solved by Gauss-Seidel iteration:

2u_j^ℓ = u_{j−1}^ℓ + u_{j+1}^{ℓ−1} + h² f_j

Study the error e_j^ℓ = u_j^ℓ − u_j^∞:

2e_j^ℓ = e_{j−1}^ℓ + e_{j+1}^{ℓ−1}
This is like a time-dependent problem, where the iteration index ` is apseudo time
Solving linear systems – p. 474
Damping in Gauss-Seidel’s method (2)
Can find e`j with techniques from Appendix A.4:
e_j^ℓ = Σ_k A_k exp(i(kjh − ωℓΔt))

or (easier to work with here):

e_j^ℓ = Σ_k A_k ξ^ℓ exp(ikjh), ξ = exp(−iωΔt)

Inserting a wave component in the scheme:

ξ = exp(−iωΔt) = exp(ikh) / (2 − exp(−ikh)), |ξ| = 1/√(5 − 4 cos kh)
Interpretation of |ξ|: reduction in the error per iteration
Solving linear systems – p. 475
Gauss-Seidel’s damping factor
|ξ| = 1/√(5 − 4 cos p), p = kh ∈ [0, π]

[Plot: |ξ| as a function of p, decreasing from 1 at p = 0 to 1/3 at p = π]
Small p = kh ∼ h/λ: low frequency (relative to the grid) and small damping
Large (→ π) p = kh ∼ h/λ: high frequency (relative to the grid) and efficient damping
Solving linear systems – p. 476
More than one grid
From the previous analysis: error components with high frequency are quickly damped
Jump to a coarser grid, e.g. h′ = 2h
p is increased by a factor of 2, i.e., waves that are not so high-frequency on the h grid are efficiently damped by Gauss-Seidel on the h′ grid
Repeat the procedure
On the coarsest grid: solve by Gaussian elimination
Interpolate the solution to a finer grid, perform Gauss-Seidel iterations, and repeat until the finest grid is reached
Solving linear systems – p. 477
Transferring the solution between grids
From fine to coarser: restriction
[Sketch: a fine grid function on level q (nodes 1–9) transferred to level q−1 (nodes 1–5) by simple or weighted restriction]
From coarse to finer: prolongation
[Sketch: a coarse grid function on level q−1 (nodes 1–5) interpolated to a fine grid function on level q (nodes 1–9)]
Solving linear systems – p. 478
Smoothers
The Gauss-Seidel method is called a smoother when used to damp high-frequency error components in multigrid
Other smoothers: Jacobi, SOR, SSOR, incomplete factorization
The number of iterations is called the number of smoothing sweeps
Common choice: one sweep
Solving linear systems – p. 479
A multigrid algorithm
Start with the finest grid
Perform smoothing (pre-smoothing)
Restrict to coarser grid
Repeat the procedure (recursive algorithm!)
On the coarsest grid: solve accurately
Prolongate to finer grid
Perform smoothing (post-smoothing)
One cycle is finished when reaching the finest grid again
Can repeat the cycle
Multigrid solves the system in O(n) operations
Check out HPL C.4.2 for details!!
Solving linear systems – p. 480
V- and W-cycles
Different strategies for constructing cycles:
[Sketch: V-cycle (γ = 1) and W-cycle (γ = 2) traversals of grid levels 1–4, with smoothing on each level and a coarse grid solve on the lowest level]
Solving linear systems – p. 481
Multigrid requires flexible software
Many ingredients in multigrid:
pre- and post-smoother
no of smoothing sweeps
solver on the coarsest level
cycle strategy
restriction and prolongation methods
how to construct the various grids?
There are also other variants of multigrid (e.g. for nonlinear problems)
The optimal combination of ingredients is only known for simple model problems (e.g. the Poisson eq.)
In general: numerical experimentation is required! (Diffpack has a special multigrid toolbox for this)
Solving linear systems – p. 482
Coupling simulators
Coupling simulators – p. 483
System of PDEs; coupling simulators
System of 2 PDEs: one momentum equation + one energy equation
Develop independent solvers for each PDE
Combine solvers in a few lines
Coupling simulators – p. 484
Physical problem
[Figure: pipe cross-section Ω; flow in the z direction]
Non-Newtonian fluid
Temperature-dependent viscosity
Steady flow
Straight pipe
[Figures: computed velocity (contour values 0–0.0192) and temperature (contour values 0–0.000181) over the pipe cross-section]
Coupling simulators – p. 485
General mathematical model
Equation of continuity: ∇ · v = 0
Momentum equation:
ϱ v · ∇v = −∇p + ∇ · P
Constitutive law:
P ∼ exp(−αT) |γ|^(n−1) (∇v + (∇v)^T)
γ = √(∇v : ∇v)
Energy equation:
Cϱ v · ∇T = k∇²T + c exp(−αT) |γ|^(n+1)
Coupling simulators – p. 486
Simplified mathematical model
Assumption: rectilinear flow v = (0, 0, w)
Simplified equation system:
∂/∂x (µ ∂w/∂x) + ∂/∂y (µ ∂w/∂y) = const
µ = µ₀ e^(−αT) S(w)^(n−1)
S(w) = √( (∂w/∂x)² + (∂w/∂y)² )
∂²T/∂x² + ∂²T/∂y² = −µ₀ e^(−αT) S(w)^(n+1)
Two nonlinear Poisson equations
BC: w = 0 and T = 0 at the walls
Coupling simulators – p. 487
Numerical solution methods
Galerkin finite element method
Fully implicit formulation:
A(w^k, T^k) w^k = a
B T^k = b(w^k, T^k)
Sequential solution method
A(w^k, T^(k−1)) w^k = a
B T^k = b(w^k, T^k)
Solution of nonlinear systems: Picard iteration, Newton-Raphson
Coupling simulators – p. 488
Simplified structure in 1D
System of PDEs:
d/dx ( e^(−αT) |dw/dx|^(n−1) dw/dx ) = const
d²T/dx² = −e^(−αT) |dw/dx|^(n+1)
Nonlinearities depend on n and α
Coupling simulators – p. 489
Software development
Momentum1: ∇ · [µ∇u] = 0
Energy1: ∇²u = −f
[Class diagram: base solvers Momentum1 (simple µ) and Energy1 (simple f); subclasses Momentum2 (relevant µ) and Energy2 (relevant f); a CommonRel class holds the viscosity models; a Manager class couples the solvers]
Very little code in Momentum2, Energy2, Manager
Coupling simulators – p. 490
Advanced Diffpack features
Advanced Diffpack features – p. 491
Parallel computing in Diffpack
Idea: add a few statements to a Diffpack solver and get a parallel version
Two approaches:
domain decomposition of the mathematical problem (two-level block Jacobi iteration with coarse grid correction)
parallelization of matrix generation and linear solver
Both approaches start with a sequential Diffpack solver and add parallel features in small subclasses
Current status: upcoming module 2001
Advanced Diffpack features – p. 492
A parallel simulator
[Class diagram: a sequential simulator class MySim plus parallel tools yield the parallel simulator subclass MySimP]
Advanced Diffpack features – p. 493
Multigrid methods
Why multigrid? It is often the fastest solution method for linear systems
How to implement multigrid?
wind it into the inner grid/PDE details?
put it on top of a solver?
Diffpack was designed without multigrid methods in mind
Multigrid has recently been added as a top module
Advanced Diffpack features – p. 494
Basic features of Diffpack multigrid
Multigrid ingredients:
multiple grids
multiple linear solvers
interpolation/prolongation
...put together in an algorithm
Data structures: vectors of grids, linear solvers, sparse matrices
⇒ Can reuse standard Diffpack classes
About 10 lines of additional code in e.g. class Poisson1
Multigrid in a new problem requires heavy experimentation
⇒ Flexible run-time combination of standard Diffpack modules into multigrid algorithms
Efficiency: surprisingly good
Available through the Multi-Level Module
Advanced Diffpack features – p. 495
Mixed finite elements
Diffpack applies a grid overlay, class BasisFuncGrid, for defining basis functions on a GridFE geometry
Isoparametric elements: BasisFuncGrid is transparent
Mixed finite elements: BasisFuncGrid defines new nodes
Programming with MxFEM and MxFiniteElement instead of FEM and FiniteElement
Easy and flexible tool
Coupled to block matrices, block preconditioners etc
Available in Diffpack v3.5
Advanced Diffpack features – p. 496
Domain decomposition
DD as solver or preconditioner
Overlapping vs. non-overlapping
The ideas from OO implementation of multigrid carry over to DD
In fact, an abstract multilevel algorithm constitutes the general software, with multigrid and DD as special cases
Current status: upcoming extension of the Multi-Level Module
Advanced Diffpack features – p. 497
Adaptivity
How to implement adaptivity:
wind adaptive discretization and solution algorithm (multigrid) together?
separate discretization and solvers? Diffpack always separates discretization and solvers!
Adaptive grids require about 10 lines of extra code:
adaptive grids are subclasses of GridFE
adaptivity: simple loop calling up (1) a refinement criterion and (2) grid->refine
Available through the Adaptivity Module
Advanced Diffpack features – p. 498
Example: adaptive grids (1)
[Figure: adaptively refined grid over the unit cube (X, Y, Z axes from 0 to 1)]
Advanced Diffpack features – p. 499
Example: adaptive grids (2)
[Figures: sequence of adaptively refined grids on [−1, 1]²; error indicator values range from 0.773 to 1.68]
Advanced Diffpack features – p. 500
Example: adaptive grids (3)
[Figures: further stages of adaptive refinement on [−1, 1]²]
Advanced Diffpack features – p. 501
Example: adaptive grids (4)
[Figures: two close-ups of the adaptively refined grid on approximately [0.1, 0.88] × [0.2, 0.88]]
Advanced Diffpack features – p. 502
The 2nd Diffpack book on Springer
Advanced Computational Partial Differential Equations – Numerical Methods and Diffpack Programming, edited by H.P. Langtangen and A. Tveito. Integration of some theory, models, and algorithms, with emphasis on Diffpack software.
Basic concepts in parallel computing
Parallel computing with Diffpack
Multilevel methods
Mixed finite elements
Block preconditioning
Stochastic PDEs
Computational medicine
Computational finance
Computational geology
Published 2003
Advanced Diffpack features – p. 503
Intro to OOP
Intro to OOP – p. 504
Traditional programming
Traditional procedural programming:
subroutines/procedures/functions
data structures = variables, arrays
data are shuffled between functions
Problems with procedural approach:
Numerical codes are usually large, resulting in lots of functions with lots of arrays (and their dimensions)
Too many visible details
Little correspondence between mathematical abstraction and computer code
Redesign and reimplementation tend to be expensive
Intro to OOP – p. 505
Programming with objects (OOP)
Programming with objects makes it easier to handle large and complicated codes:
Well-known in computer science/industry
Can group large amounts of data (arrays) as a single variable
Can make different implementations look the same for a user
Not much explored in numerical computing (until the late 1990s)
Intro to OOP – p. 506
Example: programming with matrices
Mathematical problem:
Matrix-matrix product: C = MB
Matrix-vector product: y = Mx
Points to consider:
What is a matrix?
a well-defined mathematical quantity, containing a table of numbers and a set of legal operations
How do we program with matrices?
Do standard arrays in any computer language give good enough support for matrices?
Intro to OOP – p. 507
A dense matrix in Fortran 77
Fortran syntax (or C, conceptually)
      integer p, q, r
      double precision M(p,q), B(q,r), C(p,r)
      double precision y(p), x(q)

C     matrix-matrix product: C = M*B
      call prodm(M, p, q, B, q, r, C)

C     matrix-vector product: y = M*x
      call prodv(M, p, q, x, y)
Drawback with this implementation:
Array sizes must be explicitly transferred
New routines for different precisions
Intro to OOP – p. 508
Working with a dense matrix in C++
// given integers p, q, j, k, r
MatDense M(p,q);    // declare a p times q matrix
M(j,k) = 3.54;      // assign a number to entry (j,k)

MatDense B(q,r), C(p,r);
Vector x(q), y(p);  // vectors of length q and p
C = M*B;            // matrix-matrix product
y = M*x;            // matrix-vector product
M.prod(x,y);        // matrix-vector product
Observe that
we hide information about array sizes
we hide storage structure (the underlying C array)
the computer code is as compact as the mathematical notation
Intro to OOP – p. 509
A dense matrix class
class MatDense
{
private:
  double** A;   // pointer to the matrix data
  int m, n;     // A is an m times n matrix
public:
  // --- mathematical interface ---
  MatDense (int p, int q);               // create pxq matrix
  double& operator () (int i, int j);    // M(i,j)=4; s=M(k,l);
  void operator = (MatDense& B);         // M = B;
  void prod (MatDense& B, MatDense& C);  // M.prod(B,C); (C=M*B)
  void prod (Vector& x, Vector& z);      // M.prod(y,z); (z=M*y)
  MatDense operator * (MatDense& B);     // C = M*B;
  Vector operator * (Vector& y);         // z = M*y;
  void size (int& m, int& n);            // get size of matrix
};
Notice that the storage format is hidden from the user
Intro to OOP – p. 510
What is this object or class thing?
A class is a collection of data structures and operations on them
An object is a realization (variable) of a class
The MatDense object is a good example:
1. data: matrix size + array entries
2. operations: creating a matrix, accessing matrix entries, matrix-vector products, ...
A class is a new type of variable, like reals, integers etc
A class can contain other objects; in this way we can create complicated variables that are easy to program with
Intro to OOP – p. 511
Extension to sparse matrices
Matrix for the discretization of −∇²u = f.
Only 5n out of n² entries are nonzero.
Store only the nonzero entries!
Many iterative solution methods for Au = b can operate on the nonzeroes only
Intro to OOP – p. 512
How to store sparse matrices (1)
A =
  | a1,1  0     0     a1,4  0    |
  | 0     a2,2  a2,3  0     a2,5 |
  | 0     a3,2  a3,3  0     0    |
  | a4,1  0     0     a4,4  a4,5 |
  | 0     a5,2  0     a5,4  a5,5 |
Working with the nonzeroes only is important for efficiency!
Intro to OOP – p. 513
How to store sparse matrices (2)
The nonzeroes can be stacked in a one-dimensional array
Need two extra arrays to tell where a row starts and the column index of a nonzero
A = (a1,1, a1,4, a2,2, a2,3, a2,5, ...),
irow = (1, 3, 6, 8, 11, 14),
jcol = (1, 4, 2, 3, 5, 2, 3, 1, 4, 5, 2, 4, 5).
⇒ more complicated data structures and hence more complicated programs
Intro to OOP – p. 514
Sparse matrices in Fortran
Code example for y = Mx
      integer p, q, nnz
      integer irow(p+1), jcol(nnz)
      double precision M(nnz), x(q), y(p)
      ...
      call prodvs (M, p, q, nnz, irow, jcol, x, y)
Two major drawbacks:
Explicit transfer of storage structure (5 args)
Different name for two functions that perform the same task on two different matrix formats
Intro to OOP – p. 515
Sparse matrix as a C++ class (1)
class MatSparse
{
private:
  double* A;   // long vector with the nonzero matrix entries
  int* irow;   // indexing array
  int* jcol;   // indexing array
  int m, n;    // A is (logically) m times n
  int nnz;     // number of nonzeroes
public:
  // the same functions as in the example above,
  // plus functionality for initializing the data structures
  void prod (Vector& x, Vector& z);  // M.prod(y,z); (z=M*y);
};
Intro to OOP – p. 516
Sparse matrix as a C++ class (2)
What has been gained?
Users cannot see the sparse matrix data structure
Matrix-vector product syntax remains the same
The usage of MatSparse and MatDense is the same
Easy to switch between MatDense and MatSparse
Intro to OOP – p. 517
The jungle of matrix formats
When solving PDEs by finite element/difference methods there are numerous advantageous matrix formats:
- dense matrix
- banded matrix
- tridiagonal matrix
- general sparse matrix
- structured sparse matrix
- diagonal matrix
- finite difference stencil as matrix
The efficiency of numerical algorithms is often strongly dependent on the matrix storage scheme
Goal: hide the details of the storage schemes
Intro to OOP – p. 518
Different matrix formats
Intro to OOP – p. 519
The matrix class hierarchy
[Class hierarchy: base class Matrix with subclasses MatDense, MatSparse, MatTriDiag, MatBanded]
Generic interface in base class Matrix
Implementation of storage and member functions in the subclasses
Generic programming in user code:
  Matrix& M;
  M.prod(x,y); // y=M*x
i.e., we need not know the structure of M, only that it refers to some concrete subclass object; C++ keeps track of which subclass object!
prod must then be a virtual function
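A minimal sketch of this design in standard C++ (simplified stand-ins, not the real Diffpack classes):

```cpp
#include <cassert>
#include <vector>

typedef std::vector<double> Vector;

// base class: defines the interface, owns no data
class Matrix {
public:
    virtual ~Matrix() {}
    virtual void prod(const Vector& x, Vector& y) const = 0;  // y = M*x
};

// dense storage: row-major array of all m*n entries
class MatDense : public Matrix {
    std::vector<double> a;  int m, n;
public:
    MatDense(int m_, int n_) : a(m_*n_, 0.0), m(m_), n(n_) {}
    double& operator()(int i, int j) { return a[i*n + j]; }
    void prod(const Vector& x, Vector& y) const {
        y.assign(m, 0.0);
        for (int i = 0; i < m; ++i)
            for (int j = 0; j < n; ++j) y[i] += a[i*n + j] * x[j];
    }
};

// diagonal matrix: store the diagonal only
class MatDiag : public Matrix {
    std::vector<double> d;
public:
    explicit MatDiag(const std::vector<double>& diag) : d(diag) {}
    void prod(const Vector& x, Vector& y) const {
        y.resize(d.size());
        for (std::size_t i = 0; i < d.size(); ++i) y[i] = d[i] * x[i];
    }
};

// generic user code: works unchanged for any Matrix subclass
double first_entry_of_Mx(const Matrix& M, const Vector& x) {
    Vector y;
    M.prod(x, y);    // virtual call: dispatched to the actual subclass
    return y[0];
}
```

The generic function never mentions a storage format; the virtual `prod` call resolves at run time.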
Intro to OOP – p. 520
Object-oriented programming
Matrix = object
Details of storage schemes are hidden
Common interface to matrix operations
Base class: define operations, no data
Subclasses: implement specific storage schemes and algorithms
It is possible to program with the base class only!
Intro to OOP – p. 521
Bad news...
Object-oriented programming does wonderful things, but might be inefficient
Adjusted picture: when indexing a matrix, one needs to know its data storage structure for efficiency reasons
In the rest of the code one can work with the generic base class and its virtual functions
⇒ Object-oriented numerics: balance between efficiency and OOtechniques
Intro to OOP – p. 522
Some Diffpack/C++ programming
Some Diffpack/C++ programming – p. 523
Base class, subclass, inheritance
A subclass inherits data and functions from its base class
Base class:
  class X { int i,k; void calc(); };
Subclass:
  class Y : public X { int n; void calc2(); };
Class Y has int i,k,n and functions calc, calc2
Some Diffpack/C++ programming – p. 524
Organization of Diffpack vectors
class VecSimplest(Type):
  just a C array with indexing
  class Type: no requirements
subclass VecSimple(Type):
  adds operator=, input/output
  class Type: operator=, operator<<, operator>>
subclass VecSort(Type):
  adds operator< etc, sorting
  class Type: operator<, operator<= etc
subclass Vec(Type):
  adds numerical operations on vectors
  class Type: operator*, operator/ etc
Some Diffpack/C++ programming – p. 525
The vector class hierarchy
[Class hierarchy diagram: VecSimplest → VecSimple (op=, op<<, op>>) → VecSort (op<, op<= etc) → Vec (op+, op-, op*, op/); a parallel array hierarchy ArrayGenSimplest (plain C array; op()(int i), op()(int i, int j), op()(int i, int j, int k)) → ArrayGenSimple → ArrayGen (can print, scan, arithmetic) → ArrayGenSel (inactive entries, for FDM & non-rect. geom.); Vector = Vec + multiple indices]
Some Diffpack/C++ programming – p. 526
Why this vector organization?
Vector of real (=double): Vec(real)
Vector of int: VecSort(int) (Vec(int) has too many arithmetic operations)
Vector of grids: VecSimple(Grid): operator= and printing/reading for Grid make sense, but not arithmetic operations or sorting
Vector of simulators: VecSimplest(MySim): neither printing/reading, operator=, nor arithmetic operators make sense
Want to use same basic array handling code for VecSimplest(Grid) as for Vec(real)
⇒ Use inheritance to share code and increase reliability
Some Diffpack/C++ programming – p. 527
Matrices revisited
Recall the intro example on handling various matrix formats
Declare base class Matrix
Define virtual functions for mathematical operations
Realize dense matrix, diagonal matrix, etc. as subclasses
Implement mathematical operations in subclasses only
Magic: Program with Matrix, C++ figures out which subclass you really mean!
Keywords: virtual functions, inheritance, object-oriented programming
Some Diffpack/C++ programming – p. 528
Why emphasize software design?
PDE simulator: 50 000+ code lines
Maintainability important
Should be easy to extend
Should be easy to use/understand
Abstractions close to mathematical language are needed
The design must be a balance between attractive abstractions and computational efficiency
Some Diffpack/C++ programming – p. 529
Grid and field abstractions
PDE: −∇ · [λ(x)∇u(x)] = f(x), x ∈ Ω
Assume some discretization (FDM, FEM, ...)
Natural abstractions:
scalar fields: λ(x), f(x), u(x) (explicit functions, discrete fields)
discrete Ω: grid
field = grid + values, or field = explicit formula
discrete operators?
Some Diffpack/C++ programming – p. 530
Programming considerations
Obvious ideas:
collect grid information in a grid class
collect field information in a field class
Gain:
shorter code, closer to the mathematics
finite difference methods: minor
finite element methods: important
big programs: fundamental
Some Diffpack/C++ programming – p. 531
Grids and fields in Diffpack
Assume a finite difference method:
Field represented by class FieldLattice:
a grid of type GridLattice
a set of point values, ArrayGenSel
(ArrayGenSel is a subclass of ArrayGen with extra functionality)
Grid represented by GridLattice (uniform partition in d dimensions)
Some Diffpack/C++ programming – p. 532
The GridLattice class
[Figure: a uniform 2D lattice grid on the unit square]
class GridLattice
{
private:
  // data that hold grid spacing, size of domain etc
public:
  GridLattice (int nsd);
  real getPt (int dir, int index); // get coordinate of pt.
  int getBase (int dir);           // loops: start index
  int getMaxI (int dir);           // loops: stop index
  real Delta (int dir);            // grid spacing
  void scan (Is is);               // scan("d=1 [0,1] [1:40]");
};

// declare a 2D grid in a program:
// GridLattice grid(2);
// grid.scan("d=2 [0,1]x[1,2] [1:10]x[-20:20]");
Some Diffpack/C++ programming – p. 533
The FieldLattice class
[Figure: a scalar field over a 2D lattice grid on the unit square; values range from 0 to 0.0595]
class FieldLattice
{
private:
  Handle(GridLattice)    grid; // pointer to the grid
  Handle(ArrayGen(real)) vec;  // pointer to the field values
public:
  FieldLattice (GridLattice& grid, const char* fieldname);
  GridLattice& grid ();        // access to the grid
  ArrayGen(real)& values ();   // access to the field values
};

// given some 2D FieldLattice f, set f=sin(f):
int i0 = f.grid().getBase(1); // start index, x-dir
int in = f.grid().getMaxI(1); // stop index, x-dir
int j0 = f.grid().getBase(2); // start index, y-dir
int jn = f.grid().getMaxI(2); // stop index, y-dir
int i,j;
for (j = j0; j <= jn; j++)
  for (i = i0; i <= in; i++)
    f.values()(i,j) = sin (f.values()(i,j));

Some Diffpack/C++ programming – p. 534
Smart pointers (handles)
Dynamic memory in C/C++ ⇒ need pointers
Bug no. 1 in C/C++: pointers
For example, if 5 fields point to the same grid, when can we safely remove the grid object?
Make life easy: use a smart pointer
Handle(X) x;         // NULL pointer
x.rebind (new X());  // x points to new X object

// given a function void someFunc (X& xobj):
someFunc (*x);       // send object (not the handle)
someFunc (x());      // alternative syntax

// given a Handle(X) y:
x.rebind (*y);       // x points to y's object
x = y;               // not recommended (often a bug...)
*x = *y;             // set x's object equal to y's object

x.getRef();          // extract reference to x (same as *x)
x.getPtr();          // extract pointer to x
negligible overhead, automatic garbage collection
Some Diffpack/C++ programming – p. 535
Simulator classes
The PDE solver is a class itself
⇒ Easy to extend/modify solver
⇒ Enables coupling to optimization, automatic parameter analysis etc.
⇒ Easy to combine solvers (systems of PDEs)
Typical look:
class MySim
{
protected:
  // grid and field objects
  // PDE dependent parameters
public:
  void scan();  // read input and init
  void solveProblem();
  void resultReport();
};
Some Diffpack/C++ programming – p. 536
Diffpack naming conventions
Local variables have lower-case letters, words are separated by underscores, e.g., my_variable
Functions start with lower-case letters, words are separated by capitals, e.g., myFunction
Class and enum names start with a capital letter, words are separated by capitals, e.g., MyClass
Macros and enum values have upper-case letters, words are separated by underscores, e.g., MY_MACRO
Remark: of course, you can follow your own convention, but it is important to be consistent!
Some Diffpack/C++ programming – p. 537
String vibration revisited
Problem: ∂²u/∂t² = γ² ∂²u/∂x²
Explicit finite difference method; loop through (x, t) grid
New class-based code: Wave1D
class Wave1D
{
  Handle(GridLattice)  grid; // lattice grid, here 1D grid
  Handle(FieldLattice) up;   // solution u at time level l+1
  Handle(FieldLattice) u;    // solution u at time level l
  Handle(FieldLattice) um;   // solution u at time level l-1
  Handle(TimePrm)      tip;  // time discretization parameters (dt etc.)
  CurvePlotFile plotfile;    // for plotting results
  real C;                    // the Courant number (appears in the scheme)

  void setIC ();        // set initial conditions
  void timeLoop ();     // perform time stepping
  void dumpSolution (); // make a curve plot of u
public:
  Wave1D() {}
  ~Wave1D() {}
  void scan ();         // read discretization parameters and initialize
  void solveProblem (); // solve the problem
  void resultReport (); // just dummy here
};

Some Diffpack/C++ programming – p. 538
.h files and .cpp files
The class declaration (listing of data and functions) is placed in a separate file, with extension .h (here Wave1D.h)
#ifndef Wave1D_h_IS_INCLUDED
#define Wave1D_h_IS_INCLUDED
#include <FieldLattice.h>
#include <TimePrm.h>

class Wave1D
{
  Handle(GridLattice)  grid; // lattice grid, here 1D grid
  Handle(FieldLattice) up;   // solution u at time level l+1
  ...
};
#endif
The bodies of the member functions are put in a file with extension .cpp (here Wave1D.cpp)
Some Diffpack/C++ programming – p. 539
Class TimePrm
Class TimePrm holds time parameters: ∆t, time interval for simulation etc.
Initialization:
Handle(TimePrm) tip = new TimePrm();
tip.scan ("dt=0.1 t in [0,8]");
// only some characters are important:
tip.scan ("=0.1 [0,8]");
Useful methods:
class TimePrm
{
public:
  real Delta() const;     // return time step
  real time() const;      // return current time
  void initTimeLoop();    // initialize
  bool finished();        // is stop time reached?
  void increaseTime();    // t = t + dt
  int getTimeStepNo();    // return time step number
};
Some Diffpack/C++ programming – p. 540
Reading input
Let us read input (C, the grid, and the stop time) from the Unix command line like this:
./app -C 0.8 -g 'd=1 [0,1] [0:40]' -t 6.5
void Wave1D:: scan ()
{
  // real C is a class member, initialize it here:
  initFromCommandLineArg
    ("-C", C, 1.0, "Courant number", "R1[0:1]");

  String grid_str;
  initFromCommandLineArg
    ("-g", grid_str, "d=1 [0,1] [0:20]", "grid", "S");
  grid.rebind(new GridLattice(1));  grid->scan (grid_str);

  tip.rebind (new TimePrm());
  real tstop;
  initFromCommandLineArg
    ("-t", tstop, 1.0, "tstop", "R1[0:10]");
  // construct the proper initialization string from C:
  tip->scan (aform("dt=%g t in [0,%g]", C*grid->Delta(1), tstop));
  // (we assume unit wave velocity)
  ...
}
Some Diffpack/C++ programming – p. 541
Solving the problem
void Wave1D:: solveProblem () { timeLoop(); }

void Wave1D:: timeLoop ()
{
  tip->initTimeLoop();
  setIC();
  const int i0 = u->grid().getBase(1); // start of loop
  const int n  = u->grid().getMaxI(1); // end of loop
  int i;
  dumpSolution (); // plot initial condition
  // useful abbreviations (also for efficiency):
  const ArrayGen(real)& U  = u ->values();
  const ArrayGen(real)& Um = um->values();
  ArrayGen(real)&       Up = up->values();

  while (!tip->finished()) {
    tip->increaseTime();
    for (i = i0+1; i <= n-1; i++)
      Up(i) = 2*U(i) - Um(i) + sqr(C) * (U(i+1) - 2*U(i) + U(i-1));
    Up(i0) = 0;  Up(n) = 0; // insert boundary values
    *um = *u;  *u = *up;    // update for next step
    // alternative syntax: um() = u(); u() = up();
    dumpSolution ();
  }
}
Some Diffpack/C++ programming – p. 542
Set initial conditions
void Wave1D:: setIC ()
{
  // set initial conditions on u and um
  const int i0 = u->grid().getBase(1); // start point index
  const int n  = u->grid().getMaxI(1); // end point index
  const real umax = 0.05;              // max amplitude

  // initialization of up
  up->fill(0.0);

  // initialization of u (the initial displacement of the string)
  u->fill(0.0);
  int i;  real x;
  for (i = i0; i <= n; i++) {
    x = grid->getPt(1,i); // get x coord of grid point no i
    if (x < 0.7)
      u->values()(i) = (umax/0.7) * x;
    else
      u->values()(i) = (umax/0.3) * (1 - x);
  }

  // initialization of um (the special formula)
  um->fill(0.0);
  for (i = i0+1; i <= n-1; i++)  // set the help variable um:
    um->values()(i) = u->values()(i) + 0.5*sqr(C) *
      (u->values()(i+1) - 2*u->values()(i) + u->values()(i-1));
}
Some Diffpack/C++ programming – p. 543
Dump results and main function
void Wave1D:: dumpSolution ()
{
  // automatic dump of a curve plot of a 1D field:
  SimRes2gnuplot::makeCurvePlot
    (*u,                              // field to be plotted (1D)
     plotfile,                        // curve plot manager
     "displacement",                  // plot title
     oform("u(x,%.4f)", tip->time()), // name of function
     oform("C=%g, h=%g, t=%g",        // comment
           C, u->grid().Delta(1), tip->time()));
}

// main.cpp:
#include <Wave1D.h>

int main (int argc, const char* argv[])
{
  initDiffpack (argc, argv);
  Wave1D simulator;
  simulator.scan ();
  simulator.solveProblem ();
  simulator.resultReport ();
  return 0;
}
Some Diffpack/C++ programming – p. 544
Nice exercise: manual plotting
Let us rewrite dumpSolution: write each data point on the u(x, ·) curve to a CurvePlot object:
void Wave1D:: dumpSolution ()
{
  CurvePlot curve (plotfile); // tie CurvePlot to CurvePlotFile
  curve.initPair ("displacement",                  // title
                  aform("u(x,%.4f)", tip->time()), // curve name
                  "x",                             // indep. var.
                  aform("C=%g", C));               // comment
  // loop through all points in the grid, add (x,u) to curve:
  int i0 = grid->getBase(1); // start index
  int in = grid->getMaxI(1); // stop index
  real x, uval;
  for (int i = i0; i <= in; i++) {
    x = grid->getPt(1,i); // extract x coordinate
    uval = u->values()(i);
    curve.addPair (x, uval);
  }
  curve.finish();
}
Some Diffpack/C++ programming – p. 545
Exercise 1.17
Consider a wave equation with damping:
∂²u/∂t² + β ∂u/∂t = γ² ∂²u/∂x²
Same initial and boundary conditions as in class Wave1D
Modify the numerical scheme
Take a copy of class Wave1D
Implement the modification
Give β on the command line
Display a movie of a damped string:
curveplotmovie gnuplot SIMULATION.map -0.1 0.1
Some Diffpack/C++ programming – p. 546