
Finite Elements

Guido Kanschat

May 7, 2019

Contents

1 Elliptic PDE and Their Weak Formulation
  1.1 Elliptic boundary value problems
    1.1.1 Linear second order PDE
    1.1.2 Variational principle and weak formulation
    1.1.3 Boundary conditions in weak form
  1.2 Hilbert Spaces and Bilinear Forms
  1.3 Fast Facts on Sobolev Spaces
  1.4 Regularity of Weak Solutions

2 Conforming Finite Element Methods
  2.1 Meshes, shape functions, and degrees of freedom
    2.1.1 Shape function spaces on simplices
    2.1.2 Shape functions on tensor product cells
    2.1.3 The Galerkin equations and Céa’s lemma
    2.1.4 Mapped finite elements
  2.2 A priori error analysis
    2.2.1 Approximation of Sobolev spaces by finite elements
    2.2.2 Estimates of stronger norms
    2.2.3 Estimates of weaker norms and linear functionals
    2.2.4 Green’s function and maximum norm estimates
  2.3 A posteriori error analysis
    2.3.1 Quasi-interpolation in H1

3 Variational Crimes
  3.1 Numerical quadrature

4 Solving the Discrete Problem
  4.1 The Richardson iteration
  4.2 The conjugate gradient method
  4.3 Condition numbers of finite element matrices
  4.4 Multigrid methods

5 Discontinuous Galerkin methods
  5.1 Nitsche’s method
  5.2 The interior penalty method
    5.2.1 Bounded formulation in H1

Chapter 1

Elliptic PDE and Their Weak Formulation

1.1 Elliptic boundary value problems

1.1.1 Linear second order PDE

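1.1.1 Notation: The dimension of “physical space” will be denoted by $d$. We denote coordinates in $\mathbb{R}^d$ as

$$\mathbf{x} = (x_1, \dots, x_d)^T.$$

In the special cases $d = 2, 3$ we also write

$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix},$$

respectively. The Euclidean norm on $\mathbb{R}^d$ is denoted as

$$|\mathbf{x}| = \sqrt{\sum_{i=1}^{d} x_i^2}.$$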

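1.1.2 Notation: Partial derivatives of a function $u \in C^1(\mathbb{R}^d)$ are denoted by

$$\frac{\partial u(x)}{\partial x_i} = \frac{\partial}{\partial x_i} u(x) = \partial_{x_i} u(x) = \partial_i u(x).$$

The gradient of $u \in C^1$ is the row vector

$$\nabla u = (\partial_1 u, \dots, \partial_d u).$$

The Laplacian of a function $u \in C^2(\mathbb{R}^d)$ is

$$\Delta u = \partial_1^2 u + \dots + \partial_d^2 u = \sum_{i=1}^{d} \partial_i^2 u.$$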

1.1.3 Notation: When we write equations, we typically omit the independent variable x. Therefore,

∆u ≡ ∆u(x).

1.1.4 Definition: A linear PDE of second order in divergence form for a function $u \in C^2(\mathbb{R}^d)$ is an equation of the form

$$-\sum_{i,j=1}^{d} \partial_i \bigl( a_{ij}(x)\, \partial_j u \bigr) + \sum_{i=1}^{d} b_i(x)\, \partial_i u + c(x)\, u = f(x). \tag{1.1}$$

1.1.5 Definition: An important model problem for the equations we are going to study is Poisson’s equation

−∆u = f. (1.2)

1.1.6. Already with ordinary differential equations we experience that we typically do not search for solutions of the equation itself, but that we “anchor” the solution by solving an initial value problem, fixing the solution at one point on the time axis.

It does not make sense to speak about an initial point in $\mathbb{R}^d$. Instead, it turns out that it is appropriate to consider solutions on certain subsets of $\mathbb{R}^d$ and impose conditions at the boundary.

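1.1.7 Definition: A domain in $\mathbb{R}^d$ is a connected, open subset of $\mathbb{R}^d$. We typically use the notation $\Omega \subset \mathbb{R}^d$.
The boundary of a domain $\Omega$ is denoted by $\partial\Omega$. To any point $x \in \partial\Omega$, we associate the outer unit normal vector $n \equiv n(x)$.
The symbol $\partial_n u \equiv (\nabla u)\, n$ denotes the normal derivative of a function $u \in C^1(\overline{\Omega})$ at a point $x \in \partial\Omega$.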

1.1.8 Definition: We distinguish three types of boundary conditions for Poisson’s equation, namely for a point $x \in \partial\Omega$ with a given function $g$:

1. Dirichlet:

$$u(x) = g(x)$$

2. Neumann:

$$\partial_n u(x) = g(x)$$

3. Robin: for some positive function $\alpha$ on $\partial\Omega$

$$\partial_n u(x) + \alpha(x)\, u(x) = g(x)$$

While only one of these boundary conditions can hold at a single point $x$, different boundary conditions can be active on different subsets of $\partial\Omega$. We denote such subsets as $\Gamma_D$, $\Gamma_N$, and $\Gamma_R$.

1.1.9 Definition: The Dirichlet problem for Poisson’s equation (in differential form) is: find $u \in C^2(\Omega) \cap C(\overline{\Omega})$, such that

$$-\Delta u(x) = f(x), \qquad x \in \Omega, \tag{1.3a}$$

$$u(x) = g(x), \qquad x \in \partial\Omega. \tag{1.3b}$$

Here, the functions $f$ on $\Omega$ and $g$ on $\partial\Omega$ are data of the problem. The Dirichlet problem is called homogeneous, if $g \equiv 0$.

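1.1.10 Theorem (Dirichlet principle): If a function $u \in C^2(\Omega) \cap C(\overline{\Omega})$ solves the Dirichlet problem, then it minimizes the Dirichlet energy

$$E(v) = \int_\Omega \tfrac12 |\nabla v|^2 \, dx - \int_\Omega f v \, dx, \tag{1.4}$$

among all functions $v$ from the set

$$V_g = \bigl\{ v \in C^2(\Omega) \cap C(\overline{\Omega}) \bigm| v|_{\partial\Omega} = g \bigr\}. \tag{1.5}$$

This minimizer is unique.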

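Proof. Using variation of $E$, we will show that

$$\frac{d}{d\varepsilon} E(u + \varepsilon v) \Big|_{\varepsilon = 0} = 0$$

for all $v \in V_0$. The variations are taken from $V_0$ since this implies $u + \varepsilon v = g$ on $\partial\Omega$. By evaluating the square we have

$$\frac{d}{d\varepsilon} E(u + \varepsilon v) = \int_\Omega \nabla u \cdot \nabla v + \varepsilon |\nabla v|^2 - f v \, dx.$$

Since we are interested in $E(u)$, we now consider $\varepsilon = 0$. Hence, $u$ is a stationary point of $E$ if and only if

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx, \qquad \forall v \in V_0.$$

By Green’s formula

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega -\Delta u \, v \, dx + \int_{\partial\Omega} \partial_n u \, v \, ds$$

we obtain that if $u$ solves the Dirichlet problem, then

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx, \qquad \forall v \in V_0,$$

since $v \in V_0$ vanishes on $\partial\Omega$. In summary, we have proven so far that if $u$ solves Poisson’s equation, then it is a stationary point of $E(\cdot)$. It remains to show that $E(u) \le E(u + v)$ for any $v \in V_0$. Using $\int_\Omega f v \, dx = \int_\Omega \nabla u \cdot \nabla v \, dx$ yields

$$E(u + v) - E(u) = \frac12 \int_\Omega |\nabla (u + v)|^2 - 2 \nabla (u + v) \cdot \nabla u + |\nabla u|^2 \, dx = \frac12 \int_\Omega |\nabla v|^2 \, dx \ge 0.$$

This also proves uniqueness, since equality holds only if $\nabla v = 0$, which for $v \in V_0$ means $v = 0$.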
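The identity at the end of the proof can be checked on a discrete analogue. The following is a minimal sketch, assuming a uniform grid on $\Omega = (0, 1)$, homogeneous boundary values, $f \equiv 1$, and a finite difference version of the energy; the helper names `energy` and `solve_poisson_1d` are illustrative and not part of these notes.

```python
import random

def energy(v, f, h):
    """Discrete Dirichlet energy of interior values v with zero boundary values."""
    full = [0.0] + list(v) + [0.0]
    grad_sq = sum(((full[i + 1] - full[i]) / h) ** 2 for i in range(len(full) - 1))
    return 0.5 * h * grad_sq - h * sum(fi * vi for fi, vi in zip(f, v))

def solve_poisson_1d(f, h):
    """Thomas algorithm for (-u[i-1] + 2 u[i] - u[i+1]) / h^2 = f[i]."""
    n = len(f)
    diag = [2.0] * n
    rhs = [h * h * fi for fi in f]
    for i in range(1, n):               # forward elimination, off-diagonals are -1
        factor = -1.0 / diag[i - 1]
        diag[i] -= factor * (-1.0)
        rhs[i] -= factor * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):      # back substitution
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u

n = 49
h = 1.0 / (n + 1)
f = [1.0] * n
u = solve_poisson_1d(f, h)              # discrete stationary point

random.seed(0)
v = [random.uniform(-1.0, 1.0) for _ in range(n)]   # arbitrary perturbation, zero on the boundary
u_plus_v = [ui + vi for ui, vi in zip(u, v)]

gap = energy(u_plus_v, f, h) - energy(u, f, h)
half_seminorm_sq = energy(v, [0.0] * n, h)          # equals 1/2 |v|_1^2 in the discrete sense
```

In this discrete setting `gap` and `half_seminorm_sq` agree up to rounding, mirroring $E(u+v) - E(u) = \tfrac12 \int_\Omega |\nabla v|^2\,dx$.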


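1.1.11 Lemma: A minimizing sequence for the Dirichlet energy exists and it is a Cauchy sequence.

Proof. First, we show that the Dirichlet energy $E(\cdot)$ is bounded from below, and hence an infimum exists. Thus, there also exists a sequence $\{u^{(n)}\}_{n \in \mathbb{N}}$ whose energies converge to this infimum, i.e.

$$\lim_{n \to \infty} E(u^{(n)}) = \inf_{v \in V_0} E(v).$$

Second, we show that $\{u^{(n)}\}_n$ is a Cauchy sequence.

For the first part we use Friedrichs’ inequality

$$\|v\|_{L^2(\Omega)} \le \lambda(\Omega) \|\nabla v\|_{L^2(\Omega)}, \qquad v \in V_0.$$

The proof of this result will be given later. Using Hölder’s inequality we obtain

$$E(v) = \frac12 \|\nabla v\|_{L^2(\Omega)}^2 - \int_\Omega f v \, dx \ge \frac12 \|\nabla v\|_{L^2(\Omega)}^2 - \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)}.$$

Applying Friedrichs’ inequality yields that the above expression is greater than or equal to

$$\frac12 \|\nabla v\|_{L^2(\Omega)}^2 - \|\nabla v\|_{L^2(\Omega)} \, \lambda(\Omega) \|f\|_{L^2(\Omega)}.$$

Finally, we apply Young’s inequality $ab \le \tfrac12 (a^2 + b^2)$ with $a = \|\nabla v\|_{L^2(\Omega)}$ and $b = \lambda(\Omega) \|f\|_{L^2(\Omega)}$ to obtain the lower bound

$$\frac12 \|\nabla v\|_{L^2(\Omega)}^2 - \frac12 \|\nabla v\|_{L^2(\Omega)}^2 - \frac12 \lambda(\Omega)^2 \|f\|_{L^2(\Omega)}^2,$$

which yields $E(v) \ge -\tfrac12 \lambda(\Omega)^2 \|f\|_{L^2(\Omega)}^2$ as a lower bound independent of $v$.

To prove the second part, we use the parallelogram identity $|v + w|^2 + |v - w|^2 = 2|v|^2 + 2|w|^2$. Let $m, n$ be natural numbers, then

$$\begin{aligned}
|u^{(n)} - u^{(m)}|_1^2 &= 2 |u^{(n)}|_1^2 + 2 |u^{(m)}|_1^2 - 4 \bigl| \tfrac12 (u^{(n)} + u^{(m)}) \bigr|_1^2 \\
&= 4 E(u^{(n)}) + 4 \int_\Omega f u^{(n)} \, dx + 4 E(u^{(m)}) + 4 \int_\Omega f u^{(m)} \, dx \\
&\quad - 8 E\bigl( \tfrac12 (u^{(n)} + u^{(m)}) \bigr) - 8 \int_\Omega \tfrac12 f (u^{(n)} + u^{(m)}) \, dx \\
&= 4 E(u^{(n)}) + 4 E(u^{(m)}) - 8 E\bigl( \tfrac12 (u^{(n)} + u^{(m)}) \bigr).
\end{aligned}$$

Taking the limit $m, n \to \infty$ yields $4 E(u^{(n)}) + 4 E(u^{(m)}) \to 8 \inf_{v \in V_0} E(v)$. Lastly, since $\tfrac12 (u^{(n)} + u^{(m)}) \in V_0$, the term $-8 E\bigl( \tfrac12 (u^{(n)} + u^{(m)}) \bigr)$ is bounded from above by $-8 \inf_{v \in V_0} E(v)$. It follows that $\limsup_{m,n \to \infty} |u^{(n)} - u^{(m)}|_1^2 \le 0$ and consequently, as desired,

$$\lim_{m,n \to \infty} |u^{(n)} - u^{(m)}|_1^2 = 0.$$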

Note 1.1.12. Dirichlet’s principle proved essential for the development of a rigorous solution theory for Poisson’s equation. Its proof will be deferred to the next theorem.

1.1.2 Variational principle and weak formulation

1.1.13 Theorem: A function $u \in V_g$ minimizes the Dirichlet energy, if and only if there holds

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx, \qquad \forall v \in V_0. \tag{1.6}$$

Moreover, any solution to the Dirichlet problem in Definition 1.1.9 solves this equation.

1.1.14 Corollary: If a minimizer of the Dirichlet energy exists, it is necessarily unique.

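1.1.15 Lemma: A function $u \in V_g$ that minimizes the Dirichlet energy admits the representation $u = u_g + u_0$, where $u_g \in V_g$ is arbitrary and $u_0 \in V_0$ solves

$$\int_\Omega \nabla u_0 \cdot \nabla v \, dx = \int_\Omega f v \, dx - \int_\Omega \nabla u_g \cdot \nabla v \, dx, \qquad \forall v \in V_0. \tag{1.7}$$

The function $u_0$ depends on the choice of $u_g$, but the minimizer $u$ does not.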

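1.1.16 Notation: The inner product of $L^2(\Omega)$ is denoted by

$$(u, v) \equiv (u, v)_\Omega \equiv (u, v)_{L^2(\Omega)} = \int_\Omega u v \, dx.$$

Its norm is

$$\|u\| \equiv \|u\|_\Omega \equiv \|u\|_{L^2(\Omega)} \equiv \|u\|_{L^2} = \sqrt{(u, u)_{L^2(\Omega)}}.$$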

1.1.17 Lemma (Friedrichs inequality): For any function $v \in V_0$ there holds

$$\|v\|_\Omega \le \operatorname{diam}(\Omega) \, \|\nabla v\|_\Omega. \tag{1.8}$$
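As a quick plausibility check of (1.8), the following sketch evaluates both sides for one particular function, assuming $\Omega = (0, 1)$ (so $\operatorname{diam}(\Omega) = 1$) and $v(x) = \sin(\pi x)$, which vanishes on the boundary. The quadrature helper `l2_norm` is illustrative only; a single example of course does not replace the proof.

```python
import math

def l2_norm(func, a, b, n=2000):
    """Approximate the L2 norm of func on (a, b) with the midpoint rule."""
    h = (b - a) / n
    return math.sqrt(h * sum(func(a + (i + 0.5) * h) ** 2 for i in range(n)))

def v(x):
    return math.sin(math.pi * x)

def dv(x):
    return math.pi * math.cos(math.pi * x)

diam = 1.0
norm_v = l2_norm(v, 0.0, 1.0)       # exact value: sqrt(1/2)
norm_dv = l2_norm(dv, 0.0, 1.0)     # exact value: pi * sqrt(1/2)
friedrichs_holds = norm_v <= diam * norm_dv
```

For this function the ratio of the two norms is $1/\pi$, so the constant $\operatorname{diam}(\Omega)$ is not sharp here.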


1.1.18 Lemma: The definitions

$$|v|_1 = \|\nabla v\|_{L^2(\Omega)}, \qquad \|v\|_1 = \sqrt{\|v\|_{L^2(\Omega)}^2 + |v|_1^2}, \tag{1.9}$$

both define a norm on $V_0$.

1.1.19 Problem: Prove the Friedrichs inequality.

1.1.20 Lemma: The Dirichlet energy with homogeneous boundary conditions is bounded from below and thus has an infimum. In particular, there exists a minimizing sequence $\{u_n\}$ such that as $n \to \infty$,

$$E(u_n) \to \inf_{v \in V_0} E(v). \tag{1.10}$$

1.1.21 Lemma: The minimizing sequence for the Dirichlet energy is a Cauchy sequence.

1.1.22 Definition: The completion of $V_0$ under the norm $\|v\|_1$ is the Sobolev space $H^1_0(\Omega)$.

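1.1.23 Lemma (Friedrichs inequality): For any function $v \in H^1_0(\Omega)$ there holds

$$\|v\|_\Omega \le \operatorname{diam}(\Omega) \, \|\nabla v\|_\Omega. \tag{1.11}$$

Proof. Let $v \in H^1_0(\Omega)$. We make use of the fact that, by definition of $H^1_0(\Omega)$, there is a sequence $v_n \to v$ with $v_n \in V_0$. By Lemma 1.1.17, Friedrichs’ inequality holds for $v_n$ uniformly in $n$. We conclude

$$\begin{aligned}
\|v\|_\Omega &\le \|v - v_n\|_\Omega + \|v_n\|_\Omega \\
&\le \|v - v_n\|_\Omega + \operatorname{diam}(\Omega) \, \|\nabla v_n\|_\Omega \\
&\le \|v - v_n\|_\Omega + \operatorname{diam}(\Omega) \bigl( \|\nabla v_n - \nabla v\|_\Omega + \|\nabla v\|_\Omega \bigr).
\end{aligned}$$

As $n \to \infty$, the norms of the differences converge to zero, such that the desired result holds in the limit.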

1.1.24 Definition: The Dirichlet problem for Poisson’s equation in weak form reads: find $u \in H^1_g(\Omega)$ such that

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx, \qquad \forall v \in H^1_0(\Omega). \tag{1.12}$$

1.1.25 Theorem: The weak formulation in Definition 1.1.24 has a unique solution.

1.1.3 Boundary conditions in weak form

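1.1.26 Lemma: Let $u \in V = H^1(\Omega)$ be a solution to the weak formulation

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx, \qquad \forall v \in V. \tag{1.13}$$

If $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$ and $\Omega$ has $C^1$-boundary, then $u$ solves the boundary value problem

$$\begin{aligned}
-\Delta u &= f && \text{in } \Omega, \\
\partial_n u &= 0 && \text{on } \partial\Omega.
\end{aligned} \tag{1.14}$$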

1.1.27 Definition: A boundary condition inherent in the weak formulation and not explicitly stated is called natural boundary condition. If boundary values are obtained by constraining the function space, it is called essential boundary condition.
We also call a boundary condition in strong form, if it is a constraint on the function space, and in weak form, if it is part of the weak formulation.

Remark 1.1.28. Dirichlet and homogeneous Neumann boundary conditions are examples of essential and natural boundary conditions, respectively.

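1.1.29 Lemma: The boundary value problem

$$\begin{aligned}
-\Delta u &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \Gamma_D \subset \partial\Omega, \\
\partial_n u + \alpha u &= g && \text{on } \Gamma_R \subset \partial\Omega,
\end{aligned} \tag{1.15}$$

has the weak form: find $u \in V$ such that

$$\int_\Omega \nabla u \cdot \nabla v \, dx + \int_{\Gamma_R} \alpha u v \, ds = \int_\Omega f v \, dx + \int_{\Gamma_R} g v \, ds, \qquad \forall v \in V. \tag{1.16}$$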
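The step from (1.15) to (1.16) can be sketched as follows. This is a formal derivation under assumptions not spelled out in the lemma: $u$ is smooth enough for Green’s formula, $\partial\Omega = \Gamma_D \cup \Gamma_R$, and the test functions in $V$ vanish on $\Gamma_D$.

```latex
\begin{align*}
\int_\Omega f v \, dx
  &= \int_\Omega -\Delta u \, v \, dx
     && \text{multiply (1.15) by } v \in V \text{ and integrate} \\
  &= \int_\Omega \nabla u \cdot \nabla v \, dx
     - \int_{\partial\Omega} \partial_n u \, v \, ds
     && \text{Green's formula} \\
  &= \int_\Omega \nabla u \cdot \nabla v \, dx
     - \int_{\Gamma_R} \partial_n u \, v \, ds
     && v = 0 \text{ on } \Gamma_D \\
  &= \int_\Omega \nabla u \cdot \nabla v \, dx
     - \int_{\Gamma_R} (g - \alpha u) \, v \, ds
     && \text{Robin condition on } \Gamma_R .
\end{align*}
```

Moving the boundary term containing $g$ to the left-hand side of this chain gives (1.16). The Dirichlet condition enters only through the choice of the space $V$, while the Robin condition appears as boundary integrals in the equation itself.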

1.2 Hilbert Spaces and Bilinear Forms

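1.2.1 Definition: Let $V$ be a vector space over $\mathbb{R}$. An inner product on $V$ is a mapping $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ with the properties

$$\langle \alpha x + y, z \rangle = \alpha \langle x, z \rangle + \langle y, z \rangle \qquad \forall x, y, z \in V;\ \alpha \in \mathbb{R}, \tag{1.17}$$

$$\langle x, y \rangle = \langle y, x \rangle \qquad \forall x, y \in V, \tag{1.18}$$

$$\langle x, x \rangle \ge 0 \qquad \forall x \in V, \text{ and} \tag{1.19}$$

$$\langle x, x \rangle = 0 \iff x = 0, \tag{1.20}$$

usually referred to as (bi-)linearity, symmetry, and positive definiteness. We note that linearity in the second argument follows immediately by symmetry.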

1.2.2 Theorem (Bunyakovsky-Cauchy-Schwarz inequality): For every inner product there holds the inequality

$$\langle v, w \rangle \le \sqrt{\langle v, v \rangle} \sqrt{\langle w, w \rangle}. \tag{1.21}$$

Equality holds if and only if $v$ and $w$ are collinear.

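Proof. The case $w = 0$ is trivial. Without loss of generality we can therefore assume that $w \ne 0$. Define $\lambda \in \mathbb{R}$ as $\lambda = \frac{\langle v, w \rangle}{\langle w, w \rangle}$. By (1.19) we have

$$0 \le \langle v - \lambda w, v - \lambda w \rangle$$

and by (1.17) and (1.18) the right-hand side expands to

$$\langle v, v \rangle - \langle v, \lambda w \rangle - \langle \lambda w, v \rangle + \langle \lambda w, \lambda w \rangle = \langle v, v \rangle - \lambda \langle v, w \rangle - \lambda \langle v, w \rangle + \lambda^2 \langle w, w \rangle.$$

Inserting the value of $\lambda$ yields the inequality

$$0 \le \langle v, v \rangle - \frac{\langle v, w \rangle \langle v, w \rangle}{\langle w, w \rangle} - \frac{\langle v, w \rangle \langle v, w \rangle}{\langle w, w \rangle} + \frac{\langle v, w \rangle \langle v, w \rangle}{\langle w, w \rangle^2} \langle w, w \rangle.$$

The result follows from multiplication with $\langle w, w \rangle$ and rearranging the summands.

For the second part let $v, w$ be collinear, i.e. there is a $\lambda \in \mathbb{R}$ such that $v = \lambda w$. Then deducing the equality is trivial. Now let equality hold in (1.21). We immediately get that equality must also hold in

$$0 = \langle v - \lambda w, v - \lambda w \rangle.$$

However, by (1.20) this implies

$$0 = v - \lambda w.$$

Thus, $v$ and $w$ are collinear.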
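The inequality holds for every inner product, not only the Euclidean one. As a small illustration, the sketch below uses a weighted inner product on $\mathbb{R}^3$; the weights and the vectors are arbitrary choices made for this example.

```python
import math

WEIGHTS = (1.0, 2.0, 0.5)   # positive weights, so inner() satisfies (1.17)-(1.20)

def inner(x, y):
    """Weighted inner product <x, y> = sum_i w_i x_i y_i."""
    return sum(w * a * b for w, a, b in zip(WEIGHTS, x, y))

def norm(x):
    return math.sqrt(inner(x, x))

v = (1.0, -2.0, 3.0)
w = (0.5, 4.0, -1.0)
lhs = inner(v, w)
rhs = norm(v) * norm(w)

# For collinear vectors with a positive factor, both sides coincide.
two_v = (2.0, -4.0, 6.0)
collinear_gap = norm(v) * norm(two_v) - inner(v, two_v)
```

Here `lhs` is negative and `rhs` positive, so (1.21) holds with room to spare, while `collinear_gap` vanishes up to rounding.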

1.2.3 Lemma: Every inner product defines a norm by

$$\|v\| = \sqrt{\langle v, v \rangle}. \tag{1.22}$$

Proof. Definiteness and homogeneity follow from the properties of the inner product. It remains to show the triangle inequality

$$\|u + v\| \le \|u\| + \|v\|.$$

Squaring the left-hand side and applying the Bunyakovsky-Cauchy-Schwarz inequality yields

$$\|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + 2 \langle u, v \rangle + \|v\|^2 \le \|u\|^2 + 2 \|u\| \|v\| + \|v\|^2 = \bigl( \|u\| + \|v\| \bigr)^2.$$

1.2.4 Definition: A space $V$ is complete with respect to a norm, if all Cauchy sequences with elements in $V$ have their limit in $V$. A subspace $W \subset V$ is closed if it is complete in the topology of $V$.
The completion of a space $V$ with respect to a norm consists of the space $V$ and the limits of all Cauchy sequences in $V$. We denote the completion of a space $V$ by

$$\overline{V} = \overline{V}^{\|\cdot\|_V}. \tag{1.23}$$

1.2.5 Definition: A normed vector space is a vector space $V$ with a norm $\|\cdot\|$. We may also write $\|\cdot\|_V$ to highlight the connection.
A normed vector space $V$ which is complete with respect to its norm is called a Banach space.
A vector space $V$ equipped with an inner product $\langle \cdot, \cdot \rangle$ is called an inner product space or pre-Hilbert space. A Hilbert space is a pre-Hilbert space which is also complete.

1.2.6 Definition: Let $V$ be an inner product space over a field $K$. Two vectors $x, y \in V$ are called orthogonal if $\langle x, y \rangle = 0$. We write $x \perp y$.
Let $W$ be a subspace of $V$. We say that a vector $v$ is orthogonal to the subspace $W$, if it is orthogonal to every vector in $W$.
A set of nonzero mutually orthogonal vectors $\{x_i\} \subset V$ is called an orthogonal set. If additionally $\|x_i\| = 1$ for all vectors, it is called an orthonormal set. These notions transfer directly from finite to countable sets.

1.2.7 Definition: Let $W \subset V$ be a subspace of a Hilbert space $V$. We define its orthogonal complement $W^\perp \subset V$ by

$$W^\perp = \bigl\{ v \in V \bigm| \langle v, w \rangle_V = 0 \ \forall w \in W \bigr\}. \tag{1.24}$$

1.2.8 Lemma: The orthogonal complement $W^\perp$ of a subspace $W \subset V$ is closed in the sense of Definition 1.2.4.

Proof. By the Bunyakovsky-Cauchy-Schwarz inequality, the inner product is continuous on $V \times V$. Therefore, the mapping

$$\varphi_w : V \to \mathbb{R}, \qquad v \mapsto \langle v, w \rangle,$$

is continuous. For any $w \in W$, the kernel of $\varphi_w$ is closed as the pre-image of the closed set $\{0\}$. Since

$$W^\perp = \bigcap_{w \in W} \ker(\varphi_w),$$

it is closed as the intersection of closed sets.

1.2.9 Theorem: Let $W$ be a subspace of a Hilbert space $V$ and $W^\perp$ its orthogonal complement. Then, $W^\perp = \overline{W}^\perp$. Further, $V = W \oplus W^\perp$ if and only if $W$ is closed.

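Proof. Clearly, $\overline{W}^\perp \subset W^\perp$ since $W \subset \overline{W}$. Let now $u \in W^\perp$. Then, $\varphi = \langle u, \cdot \rangle$ is a continuous linear functional on $V$. Therefore, if a sequence $w_n \subset W$ converges to $w \in \overline{W}$, we have

$$\langle u, w \rangle = \lim_{n \to \infty} \langle u, w_n \rangle = 0,$$

since $u \in W^\perp$. Hence, $u \in \overline{W}^\perp$ and $W^\perp = \overline{W}^\perp$.

Now, the “only if” follows by the fact that, if $W$ is not closed, there is an element $w \in \overline{W}$ but not in $W$ such that $\langle w, u \rangle = 0$ for all $u \in W^\perp$. Thus, $w \notin W^\perp$ and consequently $w \notin W^\perp \oplus W$.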

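Let now $W$ be closed. We show that for all $v \in V$ there is a unique decomposition

$$v = w + u, \qquad \text{with } w \in W,\ u \in W^\perp. \tag{1.25}$$

This is equivalent to $V = W \oplus W^\perp$. Uniqueness follows, since

$$v = w_1 + u_1 = w_2 + u_2$$

implies that for any $y \in V$

$$0 = \langle w_1 - w_2 + u_1 - u_2, y \rangle = \langle w_1 - w_2, y \rangle + \langle u_1 - u_2, y \rangle.$$

Choosing $y = u_1 - u_2$ and $y = w_1 - w_2$ in turn, we see that one of the inner products vanishes by orthogonality and the other implies that the difference is zero.

If $v \in W$, we choose $w = v$ and $u = 0$. For $v \notin W$, we prove existence by considering that due to the closedness of $W$ there holds

$$d = \inf_{w' \in W} \|v - w'\| > 0.$$

Let $w_n$ be a minimizing sequence. Using the parallelogram identity

$$\|a + b\|^2 + \|a - b\|^2 = 2 \|a\|^2 + 2 \|b\|^2,$$

we prove that $\{w_n\}$ is a Cauchy sequence by

$$\begin{aligned}
\|w_m - w_n\|^2 &= \|(v - w_n) - (v - w_m)\|^2 \\
&= 2 \|v - w_n\|^2 + 2 \|v - w_m\|^2 - \|2v - w_m - w_n\|^2 \\
&= 2 \|v - w_n\|^2 + 2 \|v - w_m\|^2 - 4 \Bigl\| v - \frac{w_m + w_n}{2} \Bigr\|^2 \\
&\le 2 \|v - w_n\|^2 + 2 \|v - w_m\|^2 - 4 d^2,
\end{aligned}$$

since $(w_m + w_n)/2 \in W$ and $d$ is the infimum. Now we use the minimizing property to obtain

$$\limsup_{m,n \to \infty} \|w_m - w_n\|^2 \le 2 d^2 + 2 d^2 - 4 d^2 = 0.$$

Since $V$ is given as a Hilbert space and as such complete, $w = \lim w_n$ exists and by the closedness of $W$, we have $w \in W$. Let $u = v - w$. By continuity of the norm, we have $\|u\| = d$. It remains to show that $u \in W^\perp$. To this end, we introduce the variation $w + \varepsilon \tilde{w}$ with $\tilde{w} \in W$ to obtain

$$d^2 \le \|v - w - \varepsilon \tilde{w}\|^2 = \|u\|^2 - 2 \varepsilon \langle u, \tilde{w} \rangle + \varepsilon^2 \|\tilde{w}\|^2,$$

implying, since $\|u\| = d$, for any $\varepsilon > 0$

$$0 \le -2 \varepsilon \langle u, \tilde{w} \rangle + \varepsilon^2 \|\tilde{w}\|^2.$$

Dividing by $\varepsilon$ and letting $\varepsilon \to 0$ gives $\langle u, \tilde{w} \rangle \le 0$, and the same argument applied to $-\tilde{w} \in W$ gives the reverse inequality, which requires $\langle u, \tilde{w} \rangle = 0$. Since $\tilde{w} \in W$ was chosen arbitrarily, we have $u \in W^\perp$.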

1.2.10 Definition: Let $W$ be a closed subspace of the Hilbert space $V$ and $W^\perp$ be its orthogonal complement. Then, the orthogonal projection operators

$$\Pi_W : V \to W, \qquad \Pi_{W^\perp} : V \to W^\perp \tag{1.26}$$

are defined by the unique decomposition

$$v = \Pi_W v + \Pi_{W^\perp} v. \tag{1.27}$$
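In finite dimensions every subspace is closed, so the decomposition (1.27) can be computed directly. The sketch below assumes $V = \mathbb{R}^3$ with the Euclidean inner product and a two-dimensional subspace $W$ spanned by two example vectors; the helper names are illustrative.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def axpy(alpha, x, y):
    """Return alpha * x + y, componentwise."""
    return tuple(alpha * a + b for a, b in zip(x, y))

def orthonormalize(vectors):
    """Gram-Schmidt: orthonormal basis of the span of linearly independent vectors."""
    basis = []
    for v in vectors:
        for q in basis:
            v = axpy(-dot(v, q), q, v)
        length = dot(v, v) ** 0.5
        basis.append(tuple(a / length for a in v))
    return basis

def project(v, basis):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    result = (0.0,) * len(v)
    for q in basis:
        result = axpy(dot(v, q), q, result)
    return result

W = [(1.0, 1.0, 0.0), (0.0, 1.0, 1.0)]   # spanning set of the subspace W
v = (1.0, 2.0, 3.0)

p = project(v, orthonormalize(W))        # Pi_W v
r = axpy(-1.0, p, v)                     # Pi_{W^perp} v = v - Pi_W v
```

The remainder `r` is orthogonal to both spanning vectors of $W$, and `p` plus `r` reproduces `v`.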


1.2.11 Definition: A linear functional on a vector space $V$ is a linear mapping from $V$ to $K$.
The dual space $V^*$ of a vector space $V$, also called the normed dual, is the space of all bounded linear functionals on $V$ equipped with the norm

$$\|\varphi\|_{V^*} = \sup_{v \in V,\, v \ne 0} \frac{\varphi(v)}{\|v\|_V}. \tag{1.28}$$

1.2.12 Theorem (Riesz representation theorem): Let $V$ be a Hilbert space. Then, $V$ is isometrically isomorphic to $V^*$. In particular, there is an isomorphism

$$\varrho : V \to V^*, \qquad y \mapsto f, \tag{1.29}$$

such that

$$\langle x, y \rangle = f(x) \quad \forall x \in V, \qquad \|y\|_V = \|f\|_{V^*}. \tag{1.30}$$

We refer to $\varrho$ as the Riesz isomorphism.

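Proof. The proof is constructive and makes use of the orthogonal complement.

First, it is clear that for any $y \in V$ a linear functional $f \in V^*$ is defined by $f(\cdot) = \langle \cdot, y \rangle$. Furthermore, $\varrho$ is injective, since

$$\langle x, y \rangle = 0 \qquad \forall x \in V$$

implies $y \in V^\perp = \{0\}$. By the Bunyakovsky-Cauchy-Schwarz inequality, we have

$$\|f\|_{V^*} = \sup_{x \in V,\, x \ne 0} \frac{|f(x)|}{\|x\|_V} = \sup_{x \in V,\, x \ne 0} \frac{|\langle x, y \rangle|}{\|x\|_V} \le \|y\|_V,$$

with equality for $x = y$. It remains to show that $\varrho$ is surjective. To this end, let $f \in V^*$ be arbitrary and let $N = \ker(f)$. If $N = V$, we choose $y = 0$. If not, choose a nonzero $y^\perp \in N^\perp$ and let

$$y = \frac{f(y^\perp)}{\|y^\perp\|^2} \, y^\perp \in N^\perp, \tag{1.31}$$

such that $f(y) = |f(y^\perp)|^2 / \|y^\perp\|^2 \ne 0$. Let now $x \in V$ be chosen arbitrarily. Then, there holds

$$x = \Bigl( x - \frac{f(x)}{f(y)} \, y \Bigr) + \frac{f(x)}{f(y)} \, y,$$

where $\frac{f(x)}{f(y)}$ denotes a scalar. Since

$$f\Bigl( x - \frac{f(x)}{f(y)} \, y \Bigr) = f(x) - \frac{f(x)}{f(y)} f(y) = 0,$$

this decomposition amounts to $x = x_0 + x^\perp$ with $x_0 \in N$ and $x^\perp \in N^\perp$. It is unique according to Definition 1.2.10. Thus, we have that $x^\perp$ is a multiple of $y$, say $x^\perp = \alpha y$ with $\alpha = \frac{f(x)}{f(y)}$, and thus

$$\begin{aligned}
f(x) &= f(x_0) + f(x^\perp) = \alpha f(y) = \alpha \frac{|f(y^\perp)|^2}{\|y^\perp\|^2}, \\
\langle x, y \rangle &= \langle x_0, y \rangle + \langle x^\perp, y \rangle = \alpha \|y\|_V^2 = \alpha \frac{|f(y^\perp)|^2}{\|y^\perp\|^4} \|y^\perp\|^2.
\end{aligned}$$

Hence, the two terms are equal and $\varrho$ is surjective.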

1.2.13 Definition: A bilinear form $a(\cdot, \cdot)$ on a Hilbert space $V$ is a mapping $a : V \times V \to \mathbb{R}$, which is linear in both arguments. The bilinear form is bounded, if there is a constant $M$ such that

$$a(u, v) \le M \|u\|_V \|v\|_V, \qquad \forall u, v \in V. \tag{1.32}$$

It is called coercive or elliptic, if there is a constant $\alpha > 0$ such that

$$a(u, u) \ge \alpha \|u\|_V^2 \qquad \forall u \in V. \tag{1.33}$$

Notation 1.2.14. One often speaks of $V$-elliptic instead of elliptic to point out that the ellipticity of the bilinear form indeed depends on the scalar product inducing the $V$-norm.

1.2.15 Lemma: Let $a_{ij} \in C^1(\Omega)$, $b_i, c \in C^0(\Omega)$. A solution to the Dirichlet problem

$$\begin{aligned}
-\sum_{i,j=1}^{d} \partial_i \bigl( a_{ij} \partial_j u \bigr) + \sum_{i=1}^{d} b_i \partial_i u + c u &= f && \text{in } \Omega, \\
u &= 0 && \text{on } \partial\Omega
\end{aligned} \tag{1.34}$$

solves the weak problem: find $u \in V_0$ such that for all $v \in V_0$

$$a(u, v) \equiv (A \nabla u, \nabla v) + (b \cdot \nabla u, v) + (c u, v) = (f, v), \tag{1.35}$$

where $A(x) = \bigl( a_{ij}(x) \bigr)$ is the matrix of coefficients of the second order term and $b(x) = \bigl( b_i(x) \bigr)$ is the vector of coefficients of the first order term.
If additionally $u \in C^2(\Omega)$ holds, then the solution to the weak problem solves the Dirichlet problem in differential form.

1.2.16 Lemma (Lax-Milgram): Let $a(\cdot, \cdot)$ be a bounded, coercive bilinear form on a Hilbert space $V$ and let $f \in V^*$. Then, there is a unique element $u \in V$ such that

$$a(u, v) = f(v) \qquad \forall v \in V. \tag{1.36}$$

Furthermore, there holds

$$\|u\|_V \le \frac{1}{\alpha} \|f\|_{V^*}. \tag{1.37}$$

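Proof. To prove the Lax-Milgram lemma, we first consider uniqueness and then the existence of a solution.

Assume that there are solutions $u_1, u_2 \in V$ of (1.36), i.e. there holds $a(u_1, v) = f(v)$ and $a(u_2, v) = f(v)$ for all $v \in V$. Thus, $a(u_1 - u_2, v) = 0$ for all $v \in V$. Now choose $v = u_1 - u_2 \in V$. Since $a(\cdot, \cdot)$ is coercive with $\alpha > 0$, it holds

$$0 = a(u_1 - u_2, u_1 - u_2) \ge \alpha \|u_1 - u_2\|_V^2,$$

which implies $u_1 - u_2 = 0$. Hence $u_1 = u_2$.

Let us now consider the existence of a solution. We will define a linear functional in order to apply the Riesz representation theorem and the Banach fixed point theorem. For all $y \in V$ there holds

$$\langle y, \cdot \rangle - \omega \bigl[ a(y, \cdot) - f(\cdot) \bigr] \in V^*$$

with $\omega > 0$. By the Riesz representation theorem, there is a unique $z \in V$ which the Riesz isomorphism $\varrho : V \to V^*$ maps to this functional, that is,

$$\langle v, z \rangle = \langle y, v \rangle - \omega \bigl[ a(y, v) - f(v) \bigr] \qquad \forall v \in V.$$

Now we define the mapping $T_\omega : V \to V$ that maps $y \mapsto z$ and define $S : V \to V$ such that

$$\langle S u, v \rangle = a(u, v) \qquad \forall v \in V,$$

with $\|S u\|_V \le M \|u\|_V$, where $M > 0$ is the boundedness constant of $a(\cdot, \cdot)$. Taking the difference of the defining relations for two elements $x, y \in V$ leads for all $v \in V$ to

$$\langle T_\omega y - T_\omega x, v \rangle = \langle y - x, v \rangle - \omega \, a(y - x, v) = \langle y - x, v \rangle - \omega \langle S(y - x), v \rangle.$$

Thus, we can conclude that $T_\omega y - T_\omega x = y - x - \omega S(y - x)$. Applying the norm and using the fact that it is induced by the inner product of $V$, we get

$$\begin{aligned}
\|T_\omega y - T_\omega x\|_V^2 &= \|y - x\|_V^2 - 2 \omega \langle S(y - x), y - x \rangle + \omega^2 \|S(y - x)\|_V^2 \\
&\le \|y - x\|_V^2 - 2 \omega \, a(y - x, y - x) + \omega^2 M^2 \|y - x\|_V^2 \\
&\le (1 - 2 \omega \alpha + \omega^2 M^2) \|y - x\|_V^2.
\end{aligned}$$

As we want to apply the Banach fixed point theorem, we need $T_\omega$ to be a contraction. Therefore, we need $1 - 2 \omega \alpha + \omega^2 M^2 < 1$, which holds for $\omega \in \bigl( 0, \frac{2\alpha}{M^2} \bigr)$.

Hence, $T_\omega$ is a contraction and there exists a $\nu \in V$ such that $T_\omega \nu = \nu$. By the definition of $T_\omega$, we have $\langle T_\omega \nu, v \rangle = \langle \nu, v \rangle - \omega \bigl[ a(\nu, v) - f(v) \bigr]$ for all $v \in V$, which implies $a(\nu, v) = f(v)$. Thus, there exists a solution $\nu$.

Now the stability estimate (1.37) is left to prove. Using the coercivity of our bilinear form yields

$$\alpha \|u\|_V \le \frac{a(u, u)}{\|u\|_V} = \frac{f(u)}{\|u\|_V} \le \sup_{v \in V,\, v \ne 0} \frac{|f(v)|}{\|v\|_V} = \|f\|_{V^*}.$$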
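The fixed point map $T_\omega$ from the proof can be run directly in a finite dimensional setting. The sketch below assumes $V = \mathbb{R}^2$ with the Euclidean inner product and $a(u, v) = v^T A u$ for a nonsymmetric example matrix $A$, so that $S = A$ and $T_\omega y = y - \omega (A y - f)$. For this matrix $a(u, u) = 2 |u|^2$ and $A^T A = 5 I$, so $\alpha = 2$ and $M = \sqrt{5}$; the matrix, the right-hand side and the choice $\omega = \alpha / M^2$ are assumptions made for this example.

```python
A = ((2.0, 1.0), (-1.0, 2.0))    # a(u, v) = v^T A u, nonsymmetric but coercive
f = (1.0, 3.0)                   # Riesz representative of the right-hand side
alpha, M = 2.0, 5.0 ** 0.5       # coercivity and boundedness constants of this A
omega = alpha / M**2             # lies in the admissible interval (0, 2 alpha / M^2)

def apply_A(u):
    return tuple(sum(A[i][j] * u[j] for j in range(2)) for i in range(2))

def T(y):
    """One application of the contraction T_omega y = y - omega (A y - f)."""
    Ay = apply_A(y)
    return tuple(y[i] - omega * (Ay[i] - f[i]) for i in range(2))

u = (0.0, 0.0)
for _ in range(80):              # contraction factor is sqrt(0.2) per step here
    u = T(u)

norm_u = sum(c * c for c in u) ** 0.5
norm_f = sum(c * c for c in f) ** 0.5
```

The iterates converge to the solution of $A u = f$, and the computed norms are consistent with the stability estimate (1.37).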

1.2.17 Lemma: Let $a_{ij}, c \in L^\infty(\Omega)$, $b_i \in C^1(\Omega)$ such that there holds for a positive constant $\alpha$

$$\begin{aligned}
\alpha |\xi|^2 &\le \xi^T A(x) \xi, \qquad \forall \xi \in \mathbb{R}^d, \\
0 &\le c + \nabla \cdot b.
\end{aligned} \tag{1.38}$$

Then, the associated bilinear form is coercive and bounded on $H^1_0(\Omega)$, and thus the weak formulation has a unique solution.

1.2.18 Definition: A differential equation of second order in divergence form (1.34) such that (1.38) holds is called elliptic. The lower bound $\alpha$ is the ellipticity constant.

1.3 Fast Facts on Sobolev Spaces

1.3.1. While this section reviews some of the basic mathematical properties of Sobolev spaces, it suffers a bit from overly abstract mathematical arguments. While the strongly mathematically inclined reader might appreciate this, it is not really necessary for the remainder of this lecture, where we only need the basic results.

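1.3.2 Notation: For a multi-index $\alpha = (\alpha_1, \dots, \alpha_d)$ with nonnegative integers $\alpha_i$ and a function with sufficient differentiability, we define the derivative

$$\partial^\alpha f = \partial_1^{\alpha_1} \cdots \partial_d^{\alpha_d} f.$$

The order of $\partial^\alpha$ is

$$|\alpha| = \sum_i \alpha_i.$$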
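To make the notation concrete, the following sketch enumerates all multi-indices up to a given order. The function name `multi_indices` is illustrative; the count is compared with the binomial coefficient $\binom{k+d}{d}$, which is the standard number of multi-indices with $|\alpha| \le k$ in $d$ dimensions.

```python
from itertools import product
from math import comb

def multi_indices(d, k):
    """All multi-indices alpha in N_0^d with order |alpha| <= k."""
    return [alpha for alpha in product(range(k + 1), repeat=d) if sum(alpha) <= k]

second_order_2d = multi_indices(2, 2)   # derivatives up to order 2 in two dimensions
count_3d = len(multi_indices(3, 2))     # same in three dimensions
```

For $d = 2$ and $k = 2$ this lists the six derivatives $u$, $\partial_1 u$, $\partial_2 u$, $\partial_1^2 u$, $\partial_1 \partial_2 u$, $\partial_2^2 u$, encoded by their exponents.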

1.3.3 Definition: If for a given function $u$ there exists a function $w$ such that

$$\int_\Omega w \varphi \, dx = -\int_\Omega u \, \partial_i \varphi \, dx, \qquad \forall \varphi \in C^\infty_{00}(\Omega), \tag{1.39}$$

then we define $\partial_i u := w$ as the distributional derivative (partial) of $u$ with respect to $x_i$. Here, $C^\infty_{00}(\Omega)$ is the space of all functions in $C^\infty(\Omega)$ with compact support in $\Omega$.
Similarly through integration by parts, we define distributional directional derivatives, distributional gradients, $\partial^\alpha u$, etc.
We call a distributional derivative a weak derivative in $L^p$ if it is a function in this space.

Remark 1.3.4. Formula (1.39) is the usual integration by parts. Therefore, whenever $u \in C^1$ in a neighborhood of $x$, the distributional derivative and the usual derivative coincide.

Example 1.3.5. Let $\Omega = \mathbb{R}$ and $u(x) = |x|$. Intuitively, it is clear that the distributional derivative, if it exists, must be the Heaviside function

$$w(x) = \begin{cases} -1 & x < 0, \\ 1 & x > 0. \end{cases} \tag{1.40}$$

The proof that this is actually the distributional derivative is left to the reader.
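Before attempting the proof, the defining relation (1.39) can be tested numerically for one test function. The sketch below assumes the standard smooth bump function with compact support, shifted so that its support $(-0.7, 1.3)$ contains the kink at $x = 0$ asymmetrically, and a midpoint quadrature rule; all names and parameters are choices made for this illustration.

```python
import math

CENTER, RADIUS = 0.3, 1.0   # support of phi is (CENTER - RADIUS, CENTER + RADIUS)

def phi(x):
    """Smooth bump function with compact support."""
    t = (x - CENTER) / RADIUS
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    t = (x - CENTER) / RADIUS
    if abs(t) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * t) / ((1.0 - t * t) ** 2 * RADIUS)

def midpoint(func, a, b, n=20000):
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def w(x):
    """Candidate weak derivative (1.40) of |x|."""
    return 1.0 if x > 0 else -1.0

a, b = CENTER - RADIUS, CENTER + RADIUS
lhs = midpoint(lambda x: w(x) * phi(x), a, b)         # integral of w * phi
rhs = -midpoint(lambda x: abs(x) * dphi(x), a, b)     # minus integral of u * phi'
```

Both quadrature values agree up to discretization error, as (1.39) requires; this is a check for one test function only and not a proof.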

Example 1.3.6. For the derivative of the Heaviside function in (1.40), we first observe that it must be zero whenever $x \ne 0$, since the function is continuously differentiable there. Now, we take a test function $\varphi \in C^\infty$ with support in the interval $(-\varepsilon, \varepsilon)$ for some positive $\varepsilon$. Let $w'(x)$ be the derivative of $w$. Then, by integration by parts on each of the two subintervals,

$$\int_{-\varepsilon}^{\varepsilon} w(x) \varphi'(x) \, dx = -\int_{-\varepsilon}^{0} w'(x) \varphi(x) \, dx - \int_{0}^{\varepsilon} w'(x) \varphi(x) \, dx - 2 \varphi(0) = -2 \varphi(0),$$

since $w'(x) = 0$ under both integrals. By (1.39), the distributional derivative therefore satisfies $\int w' \varphi \, dx = -\int w \varphi' \, dx = 2 \varphi(0)$. Thus, $w'(x)$ is an object which is zero everywhere except at zero, but its integral against a test function $\varphi$ is nonzero. This contradicts our notion that integrable functions can be changed on a set of measure zero without changing the integral. Indeed, $w'$ is not a function in the usual sense, and we write $w'(x) = 2 \delta(x)$, where $\delta(x)$ is the Dirac $\delta$-distribution, which is defined by the two conditions

$$\delta(x) = 0, \quad \forall x \ne 0, \qquad \int_{\mathbb{R}} \delta(x) \varphi(x) \, dx = \varphi(0), \quad \forall \varphi \in C^0(\mathbb{R}).$$

We stress that $\delta$ is not an integrable function, or a function at all.

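1.3.7 Definition: The Sobolev space $W^{k,p}(\Omega)$ is the space

$$W^{k,p}(\Omega) = \bigl\{ u \in L^p(\Omega) \bigm| \partial^\alpha u \in L^p(\Omega) \ \forall |\alpha| \le k \bigr\}, \tag{1.41}$$

where the derivatives are understood in the weak sense. Its norm is defined by

$$\|v\|_{k,p}^p = \|v\|_{k,p;\Omega}^p = \sum_{|\alpha| \le k} \|\partial^\alpha v\|_{L^p(\Omega)}^p. \tag{1.42}$$

The following seminorm will be useful:

$$|v|_{k,p}^p = |v|_{k,p;\Omega}^p = \sum_{|\alpha| = k} \|\partial^\alpha v\|_{L^p(\Omega)}^p. \tag{1.43}$$

1.3.8 Notation: We will use the notation

$$\|v\|_0 = \|v\|_{0;\Omega} = \|v\|_{L^2(\Omega)}.$$

Accordingly, $W^{0,p}(\Omega) = L^p(\Omega)$.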

1.3.9 Corollary: There holds

$$W^{k,p}(\Omega) \subset W^{k-1,p}(\Omega) \subset \dots \subset W^{0,p}(\Omega) = L^p(\Omega). \tag{1.44}$$

1.3.10 Definition: The Sobolev space $H^{k,p}(\Omega)$ is the completion of $C^\infty(\Omega)$ with respect to the norm $\|\cdot\|_{k,p}$.
In the case $p = 2$, we write $H^k(\Omega) = H^{k,2}(\Omega)$.

1.3.11 Theorem (Meyers-Serrin):

$$H^{k,p}(\Omega) \cong W^{k,p}(\Omega)$$

Example 1.3.12. Functions which are in $W^{k,p}(\Omega)$ or not.

1. The function $x/|x|$ is in $H^1(B_1(0))$ if $d = 3$, but not if $d = 2$.

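1.3.13 Definition: A bounded domain $\Omega \subset \mathbb{R}^d$ is said to have $C^k$-boundary or to be a $C^k$-domain, if there is a finite covering $\{U_i\}$ of its boundary $\partial\Omega$, such that for each $U_i$ there is a mapping $\Phi_i \in C^k(U_i)$ with the following properties:

$$\begin{aligned}
\Phi_i(\partial\Omega \cap U_i) &\subset \bigl\{ x \in \mathbb{R}^d \bigm| x_1 = 0 \bigr\}, \\
\Phi_i(\Omega \cap U_i) &\subset \bigl\{ x \in \mathbb{R}^d \bigm| x_1 > 0 \bigr\}.
\end{aligned} \tag{1.45}$$

The domain is called Lipschitz, if such a construction exists with Lipschitz-continuous mappings.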

1.3.14 Definition: We say that a normed vector space $U \subset V$ is continuously embedded in another space $V$, in symbolic language

$$U \hookrightarrow V, \tag{1.46}$$

if the inclusion mapping $U \ni x \mapsto x \in V$ is continuous, that is, there is a constant $c$ such that

$$\|x\|_V \le c \|x\|_U. \tag{1.47}$$

If the spaces $U$ and $V$ consist of equivalence classes, the inclusion may involve choosing representatives on the left or on the right.

1.3.15 Theorem: Let $\Omega \subset \mathbb{R}^d$ be a bounded Lipschitz domain. For the space $W^{k,p}(\Omega)$ define the number

$$s = k - \frac{d}{p}. \tag{1.48}$$

Assume $k_1 \ge k_2$ and $p_1, p_2 \in [1, \infty)$. Then, if $s_1 \ge s_2$, we have the continuous embedding

$$W^{k_1,p_1}(\Omega) \hookrightarrow W^{k_2,p_2}(\Omega). \tag{1.49}$$

1.3.16 Lemma: Let $\Omega$ be a bounded Lipschitz domain in $\mathbb{R}^d$. Then, there exists a constant $c$ only depending on $\Omega$, such that every function $u \in H^1(\Omega) \cap C^1(\Omega)$ admits the estimate

$$\|u\|_{L^p(\partial\Omega)} \le c \|u\|_{W^{1,p}(\Omega)}. \tag{1.50}$$

1.3.17 Theorem (Trace theorem): Let $\Omega$ be a bounded Lipschitz domain in $\mathbb{R}^d$. Then, every function $u \in W^{1,p}(\Omega)$ has a well defined trace $\gamma u \in L^p(\partial\Omega)$ and there holds

$$\|\gamma u\|_{L^p(\partial\Omega)} \le c \|u\|_{W^{1,p}(\Omega)}, \tag{1.51}$$

with the same constant as in the previous lemma. We simply write

$$u|_{\partial\Omega} = \gamma u. \tag{1.52}$$

Remark 1.3.18. The trace theorem guarantees that the imposition of Dirichlet boundary conditions on Sobolev functions is a reasonable operation. In particular, it ensures that $H^1(\Omega)$ and $H^1_0(\Omega)$ are indeed different spaces. The same does not hold if we complete $C^1(\Omega)$ and $C^1_0(\Omega)$ in $L^2(\Omega)$.

Remarkably, we set out defining $W^{1,p}(\Omega)$ as a subset of $L^p(\Omega)$, which consists of functions “defined up to a set of measure zero”. Now it turns out that functions in $W^{1,p}(\Omega)$ can have well-defined values on certain sets of measure zero. From the point of view of subsets of $L^p(\Omega)$, this is always to be understood by choosing representatives of the equivalence class. This is also the reason why we write “$\hookrightarrow$” instead of “$\subset$”.

1.3.19 Definition: A function $f \in C^0(\Omega)$ is Hölder-continuous with exponent $\gamma \in (0, 1]$, if there is a constant $C_f$ such that

$$|f(x) - f(y)| \le C_f |x - y|^\gamma \qquad \forall x, y \in \Omega. \tag{1.53}$$

In particular, for $\gamma = 1$, we obtain Lipschitz-continuity.
We define the Hölder space $C^{k,\gamma}$ of $k$-times continuously differentiable functions such that all derivatives of order $k$ are Hölder-continuous. The norm is

$$\|u\|_{C^{k,\gamma}(\Omega)} = \max_{|\alpha| \le k} \sup_{x, y \in \Omega} \frac{|\partial^\alpha u(x) - \partial^\alpha u(y)|}{|x - y|^\gamma}. \tag{1.54}$$

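1.3.20 Theorem: Let $\Omega \subset \mathbb{R}^d$ be a bounded Lipschitz domain. If $s = k - \frac{d}{p} > j + \gamma$, then every function in $W^{k,p}$ has a representative in $C^{j,\gamma}$. We write

$$W^{k,p}(\Omega) \hookrightarrow C^{j,\gamma}(\Omega). \tag{1.55}$$

1.3.21 Corollary: Elements of Sobolev spaces are continuous if the derivative order is sufficiently high. In particular,

$$\begin{aligned}
H^1(\Omega) &\hookrightarrow C(\Omega), && d = 1, \\
H^2(\Omega) &\hookrightarrow C(\Omega), && d = 2, 3.
\end{aligned} \tag{1.56}$$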
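The corollary follows from Theorem 1.3.20 by evaluating the number $s = k - d/p$ from (1.48). The sketch below does this arithmetic with exact fractions; the function names are illustrative, and the test `s > 0` is the condition of Theorem 1.3.20 with $j = 0$ and some small $\gamma > 0$. A result of `False` only means that this sufficient criterion does not apply.

```python
from fractions import Fraction

def sobolev_number(k, p, d):
    """The number s = k - d/p associated with W^{k,p} in d space dimensions."""
    return Fraction(k) - Fraction(d, p)

def continuity_criterion(k, p, d):
    """True if s > 0, so that s > 0 + gamma holds for some gamma in (0, 1]."""
    return sobolev_number(k, p, d) > 0

h1_in_1d = continuity_criterion(1, 2, 1)   # s = 1/2
h1_in_2d = continuity_criterion(1, 2, 2)   # s = 0, criterion not met
h2_in_3d = continuity_criterion(2, 2, 3)   # s = 1/2
```

This reproduces the two cases of (1.56) and shows why $H^1$ is not covered in two or three dimensions.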

1.4 Regularity of Weak Solutions

1.4.1. So far, we have proven existence and uniqueness of weak solutions. We have seen that these solutions may not even be continuous, let alone differentiable. In this section, we collect a few results from the analysis of elliptic PDE which establish higher regularity under stronger conditions.

1.4.2 Definition: The space $W^{k,p}_{\mathrm{loc}}(\Omega)$ consists of functions $u$ such that $u \in W^{k,p}(\Omega_1)$ for any $\Omega_1 \subset\subset \Omega$, where the latter reads compactly embedded, namely $\overline{\Omega_1} \subset \Omega$. Similarly, we define $H^k_{\mathrm{loc}}$.

1.4.3 Theorem ([Gilbarg and Trudinger, 1998, Theorem 8.8]): Let $a_{ij} \in C^{0,1}(\Omega)$ and $b_i, c \in L^\infty(\Omega)$. If $u \in H^1(\Omega)$ is a solution to the elliptic equation and $f \in L^2(\Omega)$, then $u \in H^2_{\mathrm{loc}}(\Omega)$.

1.4.4 Theorem (Interior regularity): Let $a_{ij} \in C^{k,1}(\Omega)$ and $b_i, c \in C^{k-1,1}(\Omega)$. If $u \in H^1(\Omega)$ is a solution to the elliptic equation and $f \in W^{k,2}(\Omega)$, then $u \in H^{k+2}_{\mathrm{loc}}(\Omega)$.

Proof. [Gilbarg and Trudinger, 1998, Theorem 8.10]

1.4.5 Corollary: If in the interior regularity theorem $d = 2, 3$, then $u$ is a classical solution of the PDE if $k \ge 2$.
If $a_{ij}, b_i, c \in C^\infty(\Omega)$, then $u \in C^\infty(\Omega)$.

1.4.6 Theorem (Global regularity): If in addition to the assumptions of the interior regularity theorem $\Omega$ is a $C^{k+2}$-domain, then the solution $u \in H^1_0(\Omega)$ to the homogeneous Dirichlet boundary value problem is in $H^{k+2}(\Omega)$.

Proof. [Gilbarg and Trudinger, 1998, Theorem 8.13]

1.4.7 Corollary: Let $\Omega \subset \mathbb{R}^d$ with $d = 2, 3$ be a $C^2$-domain, $a_{ij} \in C^{0,1}(\Omega)$ and $b_i, c \in L^\infty(\Omega)$. If $u \in H^1_0(\Omega)$ is a solution to the elliptic equation and $f \in L^2(\Omega)$, then $u \in H^2(\Omega)$.

1.4.8 Remark: In order to guarantee a classical solution by these arguments, we must require that $\Omega$ has $C^4$ boundary.

1.4.9 Remark: The condition $\partial\Omega \in C^2$ in the previous corollary can be replaced by the assumption that $\Omega$ is convex.

1.4.10 Theorem (Kondratev): Let the assumptions of the interior regularity theorem (Theorem 1.4.3) hold. Assume further that $\partial\Omega$ is piecewise $C^2$ with finitely many irregular points. Then, the solution $u \in H^1_0(\Omega)$ of the elliptic PDE admits a representation

$$u = u_0 + \sum_{i=1}^{n} u_i, \tag{1.57}$$

where $u_0 \in H^2(\Omega)$ and $u_i$ is a singularity function associated with the irregular point $x_i$.

Proof. [Kondrat’ev, 1967]

Chapter 2

Conforming Finite Element Methods

2.1 Meshes, shape functions, and degrees of freedom

2.1.1 Definition: Let $T \subset \mathbb{R}^d$ be a polyhedron. We call the lower dimensional polyhedra constituting its boundary facets. A facet of dimension zero is called a vertex, of dimension one an edge, and a facet of codimension one is called a face.

2.1.2 Definition: A mesh $\mathbb{T}$ is a nonoverlapping subdivision of the domain $\Omega$ into polyhedral cells denoted by $T$, for instance simplices, quadrilaterals, or hexahedra. The faces of a cell are denoted by $F$, the vertices by $X$. Cells are typically considered open sets.
A mesh $\mathbb{T}$ is called regular, if each face $F \subset \partial T$ of the cell $T \in \mathbb{T}$ is either a face of another cell $T'$, that is, $F = \overline{T} \cap \overline{T'}$, or a subset of $\partial\Omega$.

Remark 2.1.3. For this introduction, we will assume that $\Omega$ is indeed the union of mesh cells, which means that its boundary consists of a finite union of planar faces. The more general case of a mesh approximating the domain will be deferred to later discussion.

2.1.4 Definition: With a mesh cell $T$, we associate a finite dimensional shape function space $\mathcal{P}(T)$ of dimension $n_T$. The term node functional denotes linear functionals on this space.
A set of node functionals $\{N_T^i\}_{i=1,\dots,n_T}$ is called unisolvent on $\mathcal{P}(T)$ if for any vector $\mathbf{u} = (u_1, \dots, u_{n_T})^T$ there exists a unique $u \in \mathcal{P}(T)$ such that

$$N_T^i(u) = u_i, \qquad i = 1, \dots, n_T. \tag{2.1}$$

A finite element is a set of shape function spaces $\mathcal{P}(T)$ for all $T \in \mathbb{T}$ together with a unisolvent set of node functionals.

2.1.5 Notation: If the node functionals $N^i$ are unisolvent on $\mathcal{P}(T)$, then there is a basis $\{p_k\}$ of $\mathcal{P}(T)$ such that

$$N^i(p_k) = \delta_{ik}. \tag{2.2}$$

We refer to $\{p_k\}$ as shape function basis and use the term degrees of freedom for both the node functionals and the basis functions.

2.1.6 Definition: Node functionals can be associated with the cell $T$ or with one of its lower dimensional boundary facets. We call this association the topology of the finite element.

2.1.7 Definition: The finite element space on the mesh $\mathbb{T}$, denoted by $V_{\mathbb{T}}$, is a subset of the concatenation of all shape function spaces,

$$V_{\mathbb{T}} \subset \bigl\{ f \in L^2(\Omega) \bigm| f|_T \in \mathcal{P}(T) \bigr\}. \tag{2.3}$$

The degrees of freedom of $V_{\mathbb{T}}$ are the union of all node functionals, where we identify node functionals associated to boundary facets among all cells sharing this facet. The resulting dimension is

$$n = \dim V_{\mathbb{T}} \le \sum_{T \in \mathbb{T}} n_T. \tag{2.4}$$

2.1.8 Notation: When we enumerate the degrees of freedom of $V_{\mathbb{T}}$, we obtain a global numbering of degrees of freedom $N^i$ with $i = 1, \dots, n$. For each mesh cell $T$, we have a local numbering $N_T^j$ with $j = 1, \dots, n_T$. By construction of the finite element space, for every cell $T$ and every local index $j$ there is a unique global index $i$ such that $N_T^j(f) = N^i(f)$. The converse is not true due to the identification process.

Figure 2.1: Identification of node functionals. The node functionals on shared edges (separated for presentation purposes) are distinguished locally as belonging to their respective cells, but identical global indices are assigned to all nodes in a single circle. Thus, all associated shape functions obtain the same coefficient in the global basis representation of a finite element function $u$.

2.1.9 Definition: We refer to the mapping between $N^i$ and $N_T^j$ as the mapping between global and local indices

$$\iota : (T, j) \mapsto i. \tag{2.5}$$

It induces a "natural" basis $\{v_i\}$ of $V_{\mathbb{T}}$ by

$$v_i|_T = p_{T,j}, \tag{2.6}$$

where $\{p_{T,j}\}$ is the shape function basis on $T$. For each $N^i$, we define $\mathbb{T}(N^i)$ as the set of cells $T$ sharing the node functional $N^i$, and

$$\Omega(N^i) = \bigcup_{T \in \mathbb{T}(N^i)} T. \tag{2.7}$$

2.1.10 Lemma: The support of the basis function $v_i \in V_{\mathbb{T}}$ is

$$\operatorname{supp}(v_i) \subset \Omega(N^i).$$

2.1.11 Lemma: Let $\mathbb{T}$ be a subdivision of $\Omega$, and let $u$ be a function on $\Omega$, such that $u|_T \in C^1(T)$ for each $T \in \mathbb{T}$. Then,

$$u \in H^1(\Omega) \iff u \in C(\Omega). \tag{2.8}$$

2.1.12 Lemma: We have $V_{\mathbb{T}} \subset C(\Omega)$ if and only if for every facet $F$ of dimension $d_F < d$ there holds that

1. the traces of the spaces $\mathcal{P}(T)$ on $F$ coincide for all cells $T$ having $F$ as a facet,

2. the node functionals associated to the facet are unisolvent on this trace space.

2.1.1 Shape function spaces on simplices

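2.1.13 Definition: A simplex $T \subset \mathbb{R}^d$ with vertices $X_0, \dots, X_d$ is described by a set of $d + 1$ barycentric coordinates $\lambda = (\lambda_0, \dots, \lambda_d)^T$ such that

$$0 \le \lambda_i \le 1, \qquad i = 0, \dots, d, \tag{2.9}$$
$$\lambda_i(X_j) = \delta_{ij}, \qquad i, j = 0, \dots, d, \tag{2.10}$$
$$\sum_{i=0}^{d} \lambda_i(x) = 1, \tag{2.11}$$

and there holds

$$T = \Bigl\{ x \in \mathbb{R}^d \Bigm| x = \sum_{k=0}^{d} X_k \lambda_k \Bigr\}. \tag{2.12}$$

2.1.14 Lemma: There is a matrix $B_T \in \mathbb{R}^{(d+1) \times d}$ and a vector $b_T \in \mathbb{R}^{d+1}$, such that

$$\lambda = B_T x + b_T. \tag{2.13}$$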

2.1.15 Corollary: The barycentric coordinates $\lambda_0, \dots, \lambda_d$ are the linear Lagrange interpolating functions for the points $X_0, \dots, X_d$. In particular, $\lambda_k \equiv 0$ on the facet not containing $X_k$.
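The affine representation (2.13) can be made concrete with a few lines of Python. The following is a minimal sketch, assuming NumPy is available; the function name `barycentric_map` is hypothetical and not part of these notes. It obtains $B_T$ and $b_T$ by solving the interpolation conditions (2.10) at the vertices.

```python
import numpy as np

def barycentric_map(vertices):
    """Return (B, b) with lambda(x) = B @ x + b for a simplex.

    vertices: array of shape (d+1, d), one vertex X_k per row.
    """
    X = np.asarray(vertices, dtype=float)
    d = X.shape[1]
    # Rows of M are (X_k, 1). The conditions lambda_i(X_k) = delta_ik read
    # M @ C = I, where column i of C holds the coefficients of lambda_i.
    M = np.hstack([X, np.ones((d + 1, 1))])
    C = np.linalg.inv(M)
    B = C[:d, :].T      # shape (d+1, d)
    b = C[d, :]         # shape (d+1,)
    return B, b

# Reference triangle: lambda = (1 - x - y, x, y)
B, b = barycentric_map([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
lam = B @ np.array([0.2, 0.3]) + b
```

For the reference triangle, the three coordinates at the point $(0.2, 0.3)$ sum to one, in agreement with (2.11).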

Example 2.1.16. We can use barycentric coordinates to define interpolating polynomials on simplicial meshes easily, as in Table 2.1.

Remark 2.1.17. The functions $\lambda_i(x)$ are the shape functions of the linear $P_1$ element on $T$. They allow us to define basis functions on the cell $T$ without use of a reference element $\hat{T}$.

Note that $\lambda_i \equiv 0$ on the face opposite to the vertex $X_i$.

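Degrees of freedom | Shape functions

Linear element, vertices | $\varphi_i = \lambda_i$, $i = 0, 1, 2$

Quadratic element, vertices | $\varphi_{ii} = 2\lambda_i^2 - \lambda_i$, $i = 0, 1, 2$

Quadratic element, edges | $\varphi_{ij} = 4\lambda_i\lambda_j$, $j \ne i$

Cubic element, vertices | $\varphi_{iii} = \tfrac{1}{2}\lambda_i(3\lambda_i - 1)(3\lambda_i - 2)$, $i = 0, 1, 2$

Cubic element, edges | $\varphi_{ij} = \tfrac{9}{2}\lambda_i\lambda_j(3\lambda_j - 1)$, $j \ne i$

Cubic element, cell interior | $\varphi_0 = 27\lambda_0\lambda_1\lambda_2$

Table 2.1: Degrees of freedom and shape functions of simplicial elements in terms of barycentric coordinates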

2.1.2 Shape functions on tensor product cells

2.1.18 Definition: The space of tensor product polynomials of degree $k$ in $d$ dimensions, denoted as $Q_k$, consists of polynomials of degree up to $k$ in each variable. Given a basis for one-dimensional polynomials $\{p_i\}_{i=0,\dots,k}$, a natural basis for $Q_k$ is the tensor product basis

$$p_{i_1,\dots,i_d}(x) = p_{i_1} \otimes \dots \otimes p_{i_d}(x) = \prod_{k=1}^{d} p_{i_k}(x_k). \tag{2.14}$$

Remark 2.1.19. Note that the basis functions of $Q_k$ can be denoted as products of univariate polynomials, but that general polynomials in this space as linear combinations of these basis functions do not have this structure.

2.1.20 Lemma: Let $\{N_j\}$ be a set of one-dimensional node functionals dual to the one-dimensional basis $\{p_i\}$ such that

$$N_j(p_i) = \delta_{ij}. \tag{2.15}$$

Then, a dual basis for $\{p_{i_1,\dots,i_d}\}$ is obtained by defining on the tensor product basis of $Q_k$

$$N_{j_1,\dots,j_d}(p_{i_1,\dots,i_d}) = N_{j_1} \otimes \dots \otimes N_{j_d}(p_{i_1} \otimes \dots \otimes p_{i_d}) = \prod_{k=1}^{d} N_{j_k}(p_{i_k}). \tag{2.16}$$

Proof. It is a theorem in linear algebra that a linear functional on a vector space is uniquely defined by its values on a basis of the space. Thus, (2.16) uniquely defines the node functionals $N_{j_1,\dots,j_d}$. The duality property follows from the fact that

$$N_{j_1,\dots,j_d}(p_{i_1,\dots,i_d}) = \prod_{k=1}^{d} \delta_{i_k,j_k},$$

which is one if and only if all index pairs match and zero in all other cases.

Example 2.1.21. Let a basis $\{p_i\}$ of the univariate space $P_k$ be defined by Lagrange interpolation in $k + 1$ points $t_j \in [0, 1]$. A basis of the $d$-dimensional space $Q_k$ is then obtained by all possible products

$$p_{i_1,\dots,i_d}(x) = \prod_{k=1}^{d} p_{i_k}(x_k).$$

The node functionals following the construction above are obtained by

$$N_{j_1,\dots,j_d}(p_{i_1,\dots,i_d}) = \prod_{k=1}^{d} p_{i_k}(t_{j_k}).$$

Finally, we have to convert the term on the right into an expression which can be applied to any polynomial in $Q_k$. To this end, we observe that

$$\prod_{k=1}^{d} p_{i_k}(t_{j_k}) = p_{i_1,\dots,i_d}(t_{j_1}, \dots, t_{j_d}).$$

Therefore, we conclude that the tensor product node functionals resulting from this construction are point evaluations in the tensor product points,

$$N_{j_1,\dots,j_d}(p) = p(t_{j_1}, \dots, t_{j_d}).$$
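The duality between the tensor product Lagrange basis and point evaluation in the tensor product points can be checked numerically. The following is a small sketch, assuming NumPy; the helper names `lagrange_1d` and `tensor_basis` as well as the choice of equidistant points are illustrative assumptions, not part of the notes.

```python
import itertools
import numpy as np

def lagrange_1d(points, i, t):
    """Value at t of the i-th Lagrange polynomial for the given points."""
    value = 1.0
    for m, tm in enumerate(points):
        if m != i:
            value *= (t - tm) / (points[i] - tm)
    return value

def tensor_basis(points, multi_index, x):
    """p_{i1,...,id}(x) = prod_k p_{ik}(x_k), the tensor product basis."""
    return np.prod([lagrange_1d(points, i, xk) for i, xk in zip(multi_index, x)])

points = [0.0, 0.5, 1.0]   # k = 2 with equidistant points (an arbitrary choice)
d = 2
indices = list(itertools.product(range(len(points)), repeat=d))

# Entry (row js, column i): node functional N_js applied to basis function p_i,
# that is, p_i evaluated in the point (t_{j1}, ..., t_{jd}).
duality = np.array([
    [tensor_basis(points, i, [points[j] for j in js]) for i in indices]
    for js in indices
])
```

If the construction is correct, `duality` is the identity matrix of size $(k+1)^d = 9$.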


2.1.22 Example (The space Q2):

2.1.23 Lemma: The trace of the $d$-dimensional tensor product polynomial space $Q_k$ on the $\delta$-dimensional facets of the reference cube $\hat{T} = (0, 1)^d$ is the $\delta$-dimensional space $Q_k$.
The traces from two cells sharing the same face coincide, if the mapping is continuous. Therefore, continuity can be achieved by unisolvent sets of node functionals on the face.

Proof. By keeping $d - \delta$ variables constant in the tensor product basis in (2.14).

2.1.24 Example (Continuous basis functions):

2.1.25 Example (Discontinuous basis functions):

Example 2.1.26. As a second example, we choose $d = 2$ and the univariate space $P_2$ with node functionals

$$N_0(p) = p(0), \qquad N_1(p) = \int_0^1 p(t)\,dt, \qquad N_2(p) = p(1), \tag{2.17}$$

that is, a mixture of Lagrange interpolation and orthogonality on the interval $[0, 1]$. The matching basis polynomials are

$$p_0(t) = 3(1 - t)^2 - 2(1 - t), \qquad p_1(t) = 6t(1 - t), \qquad p_2(t) = 3t^2 - 2t. \tag{2.18}$$

Following the construction of the previous example, we obtain

$$N_{00}(p) = p(0, 0), \quad N_{02}(p) = p(0, 1), \quad N_{20}(p) = p(1, 0), \quad N_{22}(p) = p(1, 1). \tag{2.19}$$

Then,

$$N_{01}(p_{01}) = N_{01}(p_0 \otimes p_1) = p_0(0) \int_0^1 p_1(y)\,dy = \int_0^1 p_{01}(0, y)\,dy. \tag{2.20}$$

Thus, the node functional $N_{01}$ is the integral over the left edge of the reference square. By the same construction, $N_{21}$ is the integral over the right edge. $N_{10}$ and $N_{12}$ are the integrals over the bottom and top edge, respectively. Finally,

$$N_{11}(p_{11}) = \int_0^1 p_1(x)\,dx \int_0^1 p_1(y)\,dy = \int_0^1\!\!\int_0^1 p_{11}(x, y)\,dx\,dy. \tag{2.21}$$

Thus, the tensor product of two line integrals becomes the integral over the area.
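The basis polynomials in (2.18) can be recovered from the node functionals (2.17) by inverting a small matrix. The following sketch, assuming NumPy, expresses the unknown basis in monomials $1, t, t^2$; the variable names are illustrative.

```python
import numpy as np

# Entry (j, m) is the node functional N_j applied to the monomial t^m:
# N_0(t^m) = 0^m, N_1(t^m) = 1/(m+1), N_2(t^m) = 1.
V = np.array([[1.0, 0.0, 0.0],
              [1.0, 1 / 2, 1 / 3],
              [1.0, 1.0, 1.0]])

# Column i of C holds the monomial coefficients of p_i, so that V @ C = I
# expresses the duality N_j(p_i) = delta_ij.
C = np.linalg.inv(V)

# Coefficients of (2.18) after expanding:
# p_0 = 1 - 4t + 3t^2, p_1 = 6t - 6t^2, p_2 = -2t + 3t^2.
expected = np.array([[1.0, 0.0, 0.0],
                     [-4.0, 6.0, -2.0],
                     [3.0, -6.0, 3.0]])
```

The computed columns of `C` agree with the expanded form of (2.18), which confirms that these polynomials are dual to the node functionals.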

2.1.3 The Galerkin equations and Céa’s lemma

2.1.27 Definition (Galerkin approximation): Let $u \in V$ be determined by the weak formulation

$$a(u, v) = f(v) \qquad \forall v \in V,$$

where $V$ is a suitable function space including boundary conditions. The Galerkin approximation, also called conforming approximation of this problem, reads as follows: choose a subspace $V_n \subset V$ of dimension $n$ and find $u_n \in V_n$, such that

$$a(u_n, v_n) = f(v_n) \qquad \forall v_n \in V_n.$$

We will refer to this equation as the discrete problem.

2.1.28 Corollary (Galerkin equations): After choosing a basis $\{v_i\}$ for $V_n$, the Galerkin equations are equivalent to a linear system

$$A\mathbf{u} = \mathbf{f}, \tag{2.22}$$

with $A \in \mathbb{R}^{n \times n}$ and $\mathbf{f} \in \mathbb{R}^n$ defined by

$$a_{ij} = a(v_j, v_i), \qquad f_i = f(v_i). \tag{2.23}$$

2.1.29 Lemma: If the lemma of Lax-Milgram holds for $a(.,.)$ on $V$, it holds on $V_n \subset V$. In particular, solvability of the Galerkin equations is implied.

2.1.30 Lemma (Céa): Let $a(.,.)$ be a bounded and elliptic bilinear form on the Hilbert space $V$. Let $u \in V$ and $u_n \in V_n \subset V$ be the solution to the weak formulation and its Galerkin approximation

$$a(u, v) = f(v) \qquad \forall v \in V,$$
$$a(u_n, v_n) = f(v_n) \qquad \forall v_n \in V_n,$$

respectively. Then, there holds

$$\|u - u_n\|_V \le \frac{M}{\alpha} \inf_{v_n \in V_n} \|u - v_n\|_V, \tag{2.24}$$

where $M$ and $\alpha$ are the boundedness and ellipticity constants of $a(.,.)$.

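2.1.31 Lemma: For a finite element discretization of Poisson's equation with the space $V_{\mathbb{T}}$, the Galerkin equations can be computed using the following formulas:

$$a_{ij} = \int_\Omega \nabla v_j \cdot \nabla v_i \,dx = \int_{\Omega(N^i)} \nabla v_j \cdot \nabla v_i \,dx = \sum_{T \in \mathbb{T}(N^i)} \int_T \nabla v_j \cdot \nabla v_i \,dx,$$

$$f_i = \int_\Omega f v_i \,dx = \int_{\Omega(N^i)} f v_i \,dx = \sum_{T \in \mathbb{T}(N^i)} \int_T f v_i \,dx.$$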

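2.1.32 Algorithm (Assembling the matrix):

1. Start with a matrix $A = 0 \in \mathbb{R}^{n \times n}$.

2. Loop over all cells $T \in \mathbb{T}$.

3. On each cell $T$, compute a cell matrix $A_T \in \mathbb{R}^{n_T \times n_T}$ by integrating

$$a_{T,ij} = \int_T \nabla p_{T,j} \cdot \nabla p_{T,i} \,dx, \tag{2.25}$$

where $\{p_{T,i}\}$ is the shape function basis.

4. Assemble the cell matrices into the global matrix by

$$a_{\iota(T,i),\iota(T,j)} = a_{\iota(T,i),\iota(T,j)} + a_{T,ij}, \qquad i, j = 1, \dots, n_T, \tag{2.26}$$

where $\iota$ is the mapping between local and global indices from (2.5).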
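The four steps of the algorithm translate almost literally into code. The following is a minimal sketch for linear elements on triangles, assuming NumPy and a dense global matrix for simplicity; the function names and the data layout (an array of vertex coordinates and a list of vertex index triples per cell, which plays the role of $\iota$) are assumptions made for this illustration only.

```python
import numpy as np

def cell_matrix_p1(X):
    """Cell matrix (2.25) for linear shape functions on a triangle.

    X: array of shape (3, 2) with the vertex coordinates of the cell.
    """
    M = np.hstack([X, np.ones((3, 1))])
    # The shape functions are the barycentric coordinates; their gradients
    # are constant on the cell. Row i of grads is the gradient of lambda_i.
    grads = np.linalg.inv(M)[:2, :].T
    area = 0.5 * abs(np.linalg.det(M))
    return area * grads @ grads.T

def assemble(vertices, cells):
    """Steps 1 to 4 of the assembly algorithm."""
    vertices = np.asarray(vertices, dtype=float)
    n = len(vertices)
    A = np.zeros((n, n))                        # step 1
    for cell in cells:                          # step 2
        A_T = cell_matrix_p1(vertices[cell])    # step 3
        for i, gi in enumerate(cell):           # step 4: iota(T, i) = cell[i]
            for j, gj in enumerate(cell):
                A[gi, gj] += A_T[i, j]
    return A

# Unit square split into two triangles along the diagonal
vertices = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
cells = [[0, 1, 2], [0, 2, 3]]
A = assemble(vertices, cells)
```

Since no boundary conditions are applied here, constant functions lie in the kernel of `A`, so every row sums to zero. In practice a sparse matrix format would replace the dense array.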


2.1.4 Mapped finite elements

2.1.33 Definition: A mapped mesh $\mathbb{T}$ is a set of cells $T$, which are defined by a single reference cell $\hat{T}$ and individual smooth mappings

$$\Phi_T : \hat{T} \to \mathbb{R}^d, \qquad \Phi_T(\hat{T}) = T. \tag{2.27}$$

The definition extends to small sets of reference cells, for instance for triangles and quadrilaterals.

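2.1.34 Example: Let the reference triangle $\hat{T}$ be defined by

$$\hat{T} = \left\{ \begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix} \,\middle|\, \hat{x}, \hat{y} > 0,\ \hat{x} + \hat{y} < 1 \right\}. \tag{2.28}$$

Then, every cell $T$ spanned by the vertices $\mathbf{X}_0$, $\mathbf{X}_1$, and $\mathbf{X}_2$ with coordinates $\mathbf{X}_i = (X_i, Y_i)^T$ is obtained by mapping $\hat{T}$ by the affine mapping

$$\Phi_T(\hat{\mathbf{x}}) = \begin{pmatrix} X_1 - X_0 & X_2 - X_0 \\ Y_1 - Y_0 & Y_2 - Y_0 \end{pmatrix} \begin{pmatrix} \hat{x} \\ \hat{y} \end{pmatrix} + \begin{pmatrix} X_0 \\ Y_0 \end{pmatrix} =: B_T \hat{\mathbf{x}} + b_T. \tag{2.29}$$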

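2.1.35 Example: The reference cell for a quadrilateral is the reference square $\hat{T} = (0, 1)^2$. Every quadrilateral $T$ spanned by the vertices $\mathbf{X}_0$ to $\mathbf{X}_3$ is then obtained by the bilinear mapping

$$\Phi_T(\hat{\mathbf{x}}) = \mathbf{X}_0 (1 - \hat{x})(1 - \hat{y}) + \mathbf{X}_1 \hat{x} (1 - \hat{y}) + \mathbf{X}_2 (1 - \hat{x}) \hat{y} + \mathbf{X}_3 \hat{x} \hat{y}. \tag{2.30}$$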

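2.1.36 Definition: Mapped shape functions $\{p_i\}$ on a mesh cell $T$ are defined by a set of shape functions $\{\hat{p}_i\}$ on the reference cell $\hat{T}$ through pull-back

$$p_i(x) = \hat{p}_i\bigl(\Phi^{-1}(x)\bigr) = \hat{p}_i(\hat{x}), \qquad \nabla p_i(x) = \nabla\Phi^{-T}(\hat{x})\, \hat{\nabla} \hat{p}_i(\hat{x}). \tag{2.31}$$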
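For the affine mapping (2.29), the gradient transformation in (2.31) is a single linear solve per cell. The following sketch, assuming NumPy, computes the gradients of the mapped linear shape functions on a triangle; the function name `mapped_gradients` is a hypothetical choice for this illustration.

```python
import numpy as np

# Gradients of the linear shape functions 1 - x - y, x, y on the
# reference triangle, one gradient per row.
ref_grads = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])

def mapped_gradients(X):
    """Gradients of the mapped linear shape functions on a triangle.

    X: array of shape (3, 2) with the vertices X_0, X_1, X_2 as rows.
    """
    X = np.asarray(X, dtype=float)
    # Jacobian of the affine mapping: columns X_1 - X_0 and X_2 - X_0
    B = np.column_stack([X[1] - X[0], X[2] - X[0]])
    # grad p_i = B^{-T} grad_hat p_hat_i, computed by solving with B^T
    return np.linalg.solve(B.T, ref_grads.T).T

grads = mapped_gradients([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]])
```

On the triangle with vertices $(0,0)$, $(2,0)$, $(0,4)$ the mapped shape functions are $1 - x/2 - y/4$, $x/2$, and $y/4$, so the rows of `grads` can be checked by hand.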


2.1.37 Lemma: Let $\hat{T}$ be the reference triangle and let $T$ be a triangular mesh cell with mapping $x = \Phi_T(\hat{x}) = B\hat{x} + b$. Let there hold $u(x) = \hat{u}(\hat{x})$. Then, $u \in H^k(T)$ if and only if $\hat{u} \in H^k(\hat{T})$ and we have with some constant $c$ the estimates

$$|\hat{u}|_{k,\hat{T}} \le c \|B\|^k (\det B)^{-1/2} |u|_{k,T}, \qquad |u|_{k,T} \le c \|B^{-1}\|^k (\det B)^{1/2} |\hat{u}|_{k,\hat{T}}. \tag{2.32}$$

2.1.38 Lemma: For a cell $T$, let $R$ be the radius of the circumscribed circle and $\varrho$ the radius of the inscribed circle. Then,

$$\|B\| \le cR, \qquad \|B^{-1}\| \le c\varrho^{-1}. \tag{2.33}$$

2.1.39 Assumption: For more general mappings $\Phi : \hat{T} \to T$, we make the assumption that they can be decomposed into three factors,

$$\Phi = \Phi_O \circ \Phi_S \circ \Phi_W, \tag{2.34}$$

where $\Phi_O$ is a combination of translation and rotation, $\Phi_S$ is a scaling with a characteristic length $h_T$, and $\Phi_W$ is a warping function not changing the characteristic length.

Example 2.1.40. We construct the inverse of $\Phi$ in two dimensions by the following three steps, using as $h_T$ the length of the longest edge of $T$.

1. Choose $\Phi_O$ as the rigid body movement which maps the longest edge to the interval $(0, h_T)$ on the $x$-axis and the cell itself to $T_O$ in the positive half plane. This mapping has the structure

$$\Phi_O^{-1}(x) = Sx - SX_0,$$

where $S$ is an orthogonal matrix and $X_0$ is the vertex moved to the origin.

2. Choose the scaling

$$\Phi_S^{-1}(x) = \frac{1}{h_T} x,$$

such that the longest edge of the resulting cell $T_S$ is equal to the interval $(0, 1)$ on the $x$-axis.

3. Warp the cell $T_S$ into the reference cell $\hat{T}$ by the mapping $\Phi_W^{-1}$. This operation leaves the longest edge untouched. For triangles, it is the uniquely defined linear transformation mapping the vertex not on the longest edge to $(0, 1)$. For quadrilaterals, it is a bilinear transformation.

In the first step, we have assumed that the cell is convex, which is always true for triangles. For nonconvex quadrilaterals, it can be shown that the determinant of $\nabla\Phi$ changes sign inside the cell, such that these cells are not useful for computations.

The idea of this decomposition is that we separate mappings changing the position, size, and shape of the cells.
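The statement on nonconvex quadrilaterals can be observed numerically by evaluating $\det\nabla\Phi_T$ for the bilinear mapping (2.30) in the corners of the reference square. The following is a small sketch, assuming NumPy; the function name and the two sample quadrilaterals are arbitrary choices for illustration.

```python
import numpy as np

def bilinear_jacobian_det(X, xh, yh):
    """Determinant of the Jacobian of the bilinear mapping at (xh, yh).

    X: array of shape (4, 2) with the vertices X_0, ..., X_3, ordered
    as in the bilinear mapping (X_0 and X_3 are diagonally opposite).
    """
    X = np.asarray(X, dtype=float)
    # Partial derivatives of Phi_T with respect to the reference coordinates
    d_dx = (X[1] - X[0]) * (1 - yh) + (X[3] - X[2]) * yh
    d_dy = (X[2] - X[0]) * (1 - xh) + (X[3] - X[1]) * xh
    return d_dx[0] * d_dy[1] - d_dx[1] * d_dy[0]

corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
convex = [[0, 0], [1, 0], [0, 1], [1, 1]]
nonconvex = [[0, 0], [1, 0], [0, 1], [0.25, 0.25]]   # X_3 pushed inward
det_convex = [bilinear_jacobian_det(convex, *c) for c in corners]
det_nonconvex = [bilinear_jacobian_det(nonconvex, *c) for c in corners]
```

For the unit square the determinant is positive in all corners, while for the nonconvex cell it is positive at $\hat{x} = (0,0)$ and negative at $\hat{x} = (1,1)$, so it must vanish somewhere inside the cell.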

2.1.41 Lemma (Scaling lemma): Let the typical length of a cell $T$ be $h_T$. Assume there are constants $0 < M_T, m_T, d_T, D_T$, such that

$$\|\nabla\Phi_W(\hat{x})\| \le M_T, \qquad \|\nabla\Phi_W^{-1}(\hat{x})\| \le m_T^{-1}, \qquad d_T^2 \le \det\nabla\Phi_W(\hat{x}) \le D_T^2, \tag{2.35}$$

for all $\hat{x} \in \hat{T}$. Then, for $k = 0, 1$ and a constant $c$

$$|\hat{u}|_{k,\hat{T}} \le c \frac{M_T}{d_T} h_T^{k - d/2} |u|_{k,T}, \qquad |u|_{k,T} \le c \frac{D_T}{m_T} h_T^{d/2 - k} |\hat{u}|_{k,\hat{T}}. \tag{2.36}$$

This extends to higher derivatives under assumptions on higher derivatives of $\Phi_T$.

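Proof. By the chain rule, $\nabla\Phi_T = \nabla\Phi_O \nabla\Phi_S \nabla\Phi_W$. By construction, $\nabla\Phi_O$ is an orthogonal matrix, such that

$$\|\nabla\Phi_O\| = \|\nabla\Phi_O^{-1}\| = 1.$$

Since it preserves angles and lengths, $\det\nabla\Phi_O = 1$. Since $\Phi_S$ is a multiple of the identity, we have

$$\|\nabla\Phi_S\| = h_T, \qquad \|\nabla\Phi_S^{-1}\| = \frac{1}{h_T}, \qquad \det\nabla\Phi_S = h_T^d.$$

By change of variables, we have

$$\int_T u^2 \,dx = \int_{\hat{T}} \hat{u}^2 \,|\det\nabla\Phi_T| \,d\hat{x} = \int_{\hat{T}} \hat{u}^2 \det\nabla\Phi_S \det\nabla\Phi_O \det\nabla\Phi_W \,d\hat{x},$$

such that the case $k = 0$ is proven immediately by

$$h_T^d d_T^2 \int_{\hat{T}} \hat{u}^2 \,d\hat{x} \le \int_T u^2 \,dx \le h_T^d D_T^2 \int_{\hat{T}} \hat{u}^2 \,d\hat{x}.$$

By the chain rule, we have

$$\hat{\nabla}\hat{u}(\hat{x}) = \nabla\Phi_T^T \nabla u(x) = \nabla\Phi_W^T \nabla\Phi_S^T \nabla\Phi_O^T \nabla u(x),$$

such that there holds

$$|\hat{\nabla}\hat{u}(\hat{x})| \le \|\nabla\Phi_W\| \,h_T\, |\nabla u(x)|, \qquad |\nabla u(x)| \le \|\nabla\Phi_W^{-1}\| \,h_T^{-1}\, |\hat{\nabla}\hat{u}(\hat{x})|. \tag{2.37}$$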

2.1.42 Remark: We have $d_T = D_T$, if and only if the mapping is affine. The quotient $M_T / m_T$ measures how much the shape of the mesh cell deviates from the reference cell. For instance, it is one for squares.

2.2 A priori error analysis

2.2.1. In this section, we develop error estimates of the following type:

If the size of mesh cells converges to zero, then the difference between the true solution and the finite element solution converges to zero with a certain order.

They are thus a prediction that the solutions actually converge, and they measure the asymptotic convergence rate. They do contain unknown constants, such that they are no prediction of the error on a given mesh.

The theory in this chapter is about bilinear forms which are bounded and elliptic on a subspace $V \subset H^1(\Omega)$ determined by boundary conditions. We will choose $V = H^1_0(\Omega)$, even if more general boundary conditions are possible. It is a good approach to think of the Dirichlet problem for the Laplacian, even if we allow for somewhat more general equations.

2.2.1 Approximation of Sobolev spaces by finite elements

2.2.2 Lemma (Poincaré inequality): Let $\Omega$ be a bounded Lipschitz domain. For any function $u \in H^1(\Omega)$ define the mean value

$$\bar{u} = \frac{1}{|\Omega|} \int_\Omega u(x) \,dx, \tag{2.38}$$

where $|\Omega|$ denotes the measure of $\Omega$. There exists a constant $c$ depending on the domain only, such that each of the following inequalities holds:

$$\|u - \bar{u}\|_{L^2(\Omega)} \le c \|\nabla u\|_{L^2(\Omega)}, \tag{2.39}$$

$$\|u\|^2_{L^2(\Omega)} \le c \bigl( \|\nabla u\|^2_{L^2(\Omega)} + \bar{u}^2 \bigr). \tag{2.40}$$

Proof. The proof exceeds the tools we have developed in this class. The proof in [Gilbarg and Trudinger, 1998, Section 7.8] seems elementary and direct, but is technical and requires star-shaped domains. The proof in [Evans, 1998, Section 5.8.1] is more elegant, but it uses compact embedding and is indirect, such that the constant cannot be determined.

2.2.3 Lemma (Bramble-Hilbert): Let $T \subset \mathbb{R}^d$ be a domain with Lipschitz boundary and let $s(.)$ be a bounded sublinear functional on $H^{k+1}(T)$ with the property

$$s(p) = 0 \qquad \forall p \in P_k. \tag{2.41}$$

Then, there exists a constant $c$ only dependent on $T$ such that

$$|s(v)| \le c |v|_{k+1,T}. \tag{2.42}$$

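Proof. Since $s(\cdot)$ is sublinear and vanishes on $P_k$, we have for $v \in H^{k+1}(T)$:

$$|s(v)| \le |s(v + p)| + |s(p)| = |s(v + p)| \qquad \forall p \in P_k. \tag{2.43}$$

We will construct a polynomial $p \in P_k$, such that

$$\overline{\partial^\alpha (v + p)} = \frac{1}{|T|} \int_T \partial^\alpha (v + p) \,dx = 0 \qquad \forall |\alpha| \le k, \tag{2.44}$$

that is, the sum $v + p$ and all its derivatives up to order $k$ are mean-value free. Thus, by recursive application of the Poincaré inequality, we get for $|\alpha| \le k$

$$\|v + p\|^2_{L^2(T)} \le c \Bigl[ \|\nabla (v + p)\|^2_{L^2(T)} + \overline{v + p}^{\,2} \Bigr] \le c |v + p|^2_{1;T},$$

$$\|\partial^\alpha (v + p)\|^2_{L^2(T)} \le c \Bigl[ \|\nabla \partial^\alpha (v + p)\|^2_{L^2(T)} + \overline{\partial^\alpha (v + p)}^{\,2} \Bigr] \le c |v + p|^2_{|\alpha|+1;T},$$

such that $\|v + p\|_{k+1;T} \le c |v + p|_{k+1;T}$. Furthermore, since $p \in P_k$,

$$\|\partial^\alpha (v + p)\|_{L^2(T)} = \|\partial^\alpha v\|_{L^2(T)} \qquad \forall |\alpha| = k + 1.$$

Combining with (2.43) and the boundedness of $s(\cdot)$, we obtain

$$|s(v)| \le c \|v + p\|_{k+1;T} \le c |v + p|_{k+1;T} = c |v|_{k+1;T}.$$

It remains to construct the polynomial $p$ with the desired properties. To this end, we note that for two multi-indices $\alpha$ and $\beta$ there holds $\partial^\alpha x^\beta = 0$ as soon as $\alpha_i > \beta_i$ for some index $i$. Let

$$p(x) = \sum_{|\beta| \le k} a_\beta x^\beta.$$

Then, for any $|\alpha| = k$ we get

$$\partial^\alpha p(x) = \alpha!\, a_\alpha,$$

where $\alpha! = \alpha_1! \alpha_2! \cdots \alpha_d!$. Thus, we can use condition (2.44) to fix the coefficients $a_\beta$ to

$$a_\beta = -\frac{1}{\beta!\, |T|} \int_T \partial^\beta v \,dx, \qquad |\beta| = k.$$

Thus, we have decomposed $p = \tilde{p}_k + p_{k-1}$, where $\tilde{p}_k$ is known and $p_{k-1} \in P_{k-1}$. Thus, we can repeat the process determining the coefficients of highest order in $p_{k-1}$,

$$a_\beta = -\frac{1}{\beta!\, |T|} \int_T \partial^\beta (v + \tilde{p}_k) \,dx, \qquad |\beta| = k - 1.$$

Recursion down to order zero yields the polynomial $p$ with the desired property.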

2.2.4 Corollary: Let $\Pi : H^{k+1}(T) \to P_k$ be a continuous, linear projector. For any $m \le k$ there exists a constant $c$ such that

$$\|u - \Pi u\|_{m,T} \le c |u|_{k+1,T}. \tag{2.45}$$

2.2.5 Definition: Let $\{\mathbb{T}_h\}$ for $h > 0$ be a family of meshes parametrized by the parameter

$$h = \max_{T \in \mathbb{T}_h} h_T, \tag{2.46}$$

where $h_T$ is the characteristic length from the discussion of mappings, for instance the diameter of $T$. Such a family is called shape regular, if the constants $M_T$, $m_T$, $d_T$, and $D_T$ in the scaling lemma can be chosen independent of the cell $T \in \mathbb{T}_h$ and of $h > 0$. The family is called quasi-uniform, if in addition there is a positive constant $c$ independent of $h$ such that

$$h \le c \min_{T \in \mathbb{T}_h} h_T. \tag{2.47}$$

2.2.6 Definition: Let $V$ be a function space on $\Omega$, and let $V_h = V_{\mathbb{T}_h}$ be a finite element space on the mesh $\mathbb{T}_h$ on $\Omega$ with node functionals $N_i$ and dual basis $p_i$, where $i = 1, \dots, n_h$. We define the nodal interpolation operator by

$$I_h : V \to V_h, \qquad v \mapsto \sum_{i=1}^{n_h} N_i(v)\, p_i. \tag{2.48}$$

2.2.7 Lemma: The nodal interpolation operator $I_h$ is a projector. It is continuous on $H^2(\Omega)$ if the dimension is $d = 2, 3$ and the node functionals are defined as Lagrange interpolation.

2.2.8 Definition: On a mesh $\mathbb{T}_h$, we define the broken Sobolev norm and seminorm by

$$\|u\|^2_{k;h} = \sum_{T \in \mathbb{T}_h} \|u\|^2_{k;T}, \qquad |u|^2_{k;h} = \sum_{T \in \mathbb{T}_h} |u|^2_{k;T}. \tag{2.49}$$

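2.2.9 Theorem: Let $\{\mathbb{T}_h\}$ be a shape regular family of meshes with finite element spaces $V_h = V_{\mathbb{T}_h}$. Let the nodal interpolation operator $I_h$ be surjective onto $P_k$ on every cell $T \in \mathbb{T}_h$ and continuous on $H^{k+1}(\Omega)$. Then, there is a constant $c$ such that for any $u \in H^{k+1}(\Omega)$ and $m \le k + 1$ there holds

$$\|u - I_h u\|_{m;h} \le c h^{k+1-m} |u|_{k+1;h}. \tag{2.50}$$

Proof. We have by definition

$$|u - I_h u|^2_{m;h} = \sum_{T \in \mathbb{T}_h} |u - I_h u|^2_{m;T}.$$

Using the scaling lemma, we get

$$|u - I_h u|_{m;T} \le c h_T^{d/2 - m} |\hat{u} - \widehat{I_h u}|_{m;\hat{T}}.$$

On the reference cell, we use the Bramble-Hilbert lemma, more precisely Corollary 2.2.4, to obtain

$$|\hat{u} - \widehat{I_h u}|_{m;\hat{T}} \le c |\hat{u}|_{k+1;\hat{T}}.$$

Scaling back yields

$$|\hat{u}|_{k+1;\hat{T}} \le c h_T^{k+1-d/2} |u|_{k+1;T}.$$

Combining, we obtain

$$|u - I_h u|_{m;T} \le c h_T^{k+1-d/2-m+d/2} |u|_{k+1;T} = c h_T^{k+1-m} |u|_{k+1;T}.$$

Squaring, summing up, and pulling the maximum of $h_T^{k+1-m}$ out of the sum yields the estimate for the seminorm of order $m$. Adding the seminorm estimates of all orders up to $m$ yields the result for $h_T \le 1$.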
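The predicted orders can be observed in a simple experiment. The following sketch, assuming NumPy, interpolates $u(x) = \sin(\pi x)$ with piecewise linear functions ($k = 1$) on uniform meshes of the unit interval and estimates the convergence order from two meshes; the test function, the mesh sizes, and the function name are arbitrary choices for illustration.

```python
import numpy as np

def interpolation_errors(n, u=lambda x: np.sin(np.pi * x),
                         du=lambda x: np.pi * np.cos(np.pi * x)):
    """L2 error and H1-seminorm error of piecewise linear nodal
    interpolation on n uniform cells of the interval (0, 1)."""
    nodes = np.linspace(0.0, 1.0, n + 1)
    q, w = np.polynomial.legendre.leggauss(4)   # Gauss rule on (-1, 1)
    err0 = err1 = 0.0
    for a, b in zip(nodes[:-1], nodes[1:]):
        h = b - a
        x = 0.5 * (a + b) + 0.5 * h * q         # quadrature points in the cell
        slope = (u(b) - u(a)) / h
        Iu = u(a) + slope * (x - a)             # nodal interpolant on the cell
        err0 += 0.5 * h * np.sum(w * (u(x) - Iu) ** 2)
        err1 += 0.5 * h * np.sum(w * (du(x) - slope) ** 2)
    return np.sqrt(err0), np.sqrt(err1)

e_coarse = interpolation_errors(16)
e_fine = interpolation_errors(32)
# Observed orders for m = 0 and m = 1 when h is halved
rates = [np.log2(c / f) for c, f in zip(e_coarse, e_fine)]
```

With $k = 1$, the estimate (2.50) predicts order $h^2$ for $m = 0$ and order $h$ for $m = 1$, so the entries of `rates` should be close to 2 and 1.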

Remark 2.2.10. Strictly speaking, we have only proven the result for hT ≤ 1.But then, if diam Ω = 1, this condition is always true. Therefore, by rescalingthe domain before computing, the estimate holds in general.

2.2.11 Corollary: Let $a(.,.)$ be a bounded and elliptic bilinear form on $V = H^1_0(\Omega)$ and let the finite element space $V_h$ be defined on a shape-regular family of meshes $\{\mathbb{T}_h\}$, such that the interpolation estimate (2.50) holds. If furthermore the solution $u \in H^{k+1}(\Omega)$, the error of the finite element solution $u_h \in V_h \subset V$ admits the estimate

$$\|u - u_h\|_{1;h} \le c h^k |u|_{k+1;h}. \tag{2.51}$$

2.2.12. For 2nd order elliptic problems, we have now derived estimates of the $H^1$-norm of the error under the assumption that the solution exhibits further regularity. Now, let us drop this assumption for an assumption on the boundary condition only, to obtain a qualitative convergence result.

2.2.13 Theorem: Let $a(.,.)$ be a bounded and elliptic bilinear form on $V = H^1_0(\Omega)$ and let the finite element space $V_h$ be defined on a shape-regular family of meshes $\{\mathbb{T}_h\}$, such that the interpolation estimate (2.50) holds. Let $u \in V$ and $u_h \in V_h$ be solutions to the exact and the finite element versions of a 2nd order boundary value problem. Then,

$$\lim_{h \searrow 0} \|u - u_h\|_{1;\Omega} = 0. \tag{2.52}$$

2.2.2 Estimates of stronger norms

2.2.14. So far, we have seen error estimates in a "natural norm" defined as a norm such that the Lax-Milgram lemma holds for a given bilinear form $a(.,.)$. In this subsection, we now consider the question of estimates in stronger norms, such that the bilinear form is not elliptic with respect to this norm.

2.2.15 Definition: Let $\|\cdot\|_X$ and $\|\cdot\|_Y$ be norms on a vector space $V$. We call $\|\cdot\|_X$ a stronger norm than $\|\cdot\|_Y$, if there is a constant $c$ such that for all $v \in V$:

$$\|v\|_Y \le c \|v\|_X. \tag{2.53}$$

In this case, $\|\cdot\|_Y$ is called the weaker norm. If a converse inequality holds, the norms are called equivalent.

Example 2.2.16. For a bounded domain $\Omega$, the norms $\|\cdot\|_{k+1}$ and $\|\cdot\|_k$ are both defined on $V = H^1_0(\Omega)$. By the Sobolev embedding theorem, there is a constant $c$ such that for any $v \in V$

$$\|v\|_k \le c \|v\|_{k+1}.$$

2.2.17 Lemma (Inverse estimate): Let $T$ be a mesh cell of size $h_T$. Then, there is a constant only depending on $k$ and the constants of the scaling lemma, such that for every $u \in P_k$ there holds

$$\|u\|_{1;T} \le c h_T^{-1} \|u\|_{0;T}. \tag{2.54}$$
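The scaling $h_T^{-1}$ in the inverse estimate can be made explicit in the simplest setting. The following sketch, assuming NumPy, computes the largest ratio $|u|_1 / \|u\|_0$ over the linear polynomials on the interval $(0, h)$ as a generalized eigenvalue of the local stiffness and mass matrices; the one-dimensional setting and the function name are assumptions made for illustration, and only the seminorm part of the norm in (2.54) is considered.

```python
import numpy as np

def inverse_constant_p1(h):
    """Largest ratio |u|_1 / ||u||_0 over linear polynomials on (0, h)."""
    # Matrices with respect to the two linear shape functions on (0, h)
    K = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h        # stiffness matrix
    M = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0    # mass matrix
    # |u|_1^2 / ||u||_0^2 is a generalized Rayleigh quotient of (K, M),
    # whose maximum is the largest eigenvalue of M^{-1} K.
    eigenvalues = np.linalg.eigvals(np.linalg.solve(M, K))
    return np.sqrt(np.max(eigenvalues.real))

scaled = [h * inverse_constant_p1(h) for h in (1.0, 0.1, 0.01)]
```

The product of $h$ and the computed constant does not depend on $h$; in this example it equals $\sqrt{12}$, attained by the function taking the values $1$ and $-1$ at the two end points.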


2.2.18 Theorem: Let $a(.,.)$ be a bounded and elliptic bilinear form on $V \subset H^1(\Omega)$ and let $\{\mathbb{T}_h\}$ be a family of quasi-uniform meshes with finite element spaces $V_h \subset V$ containing the space $P_k$ with $k \ge 2$ on each mesh cell. If furthermore the solution $u \in H^{k+1}(\Omega)$, the error between the exact and the finite element solution to a uniquely solvable elliptic boundary value problem admits the estimate

$$\|u - u_h\|_{2;h} \le c h^{k-1} |u|_{k+1;h}. \tag{2.55}$$

Proof. We cannot apply Céa's lemma directly, since the bilinear form is not elliptic with respect to the broken $H^2$-inner product on the space $H^1(\Omega)$. On the other hand, the error is not polynomial, such that we cannot apply the inverse estimate to it. Instead, we use the triangle inequality

$$\|u - u_h\|_{2;h} \le \|u - I_h u\|_{2;h} + \|I_h u - u_h\|_{2;h},$$

and observe that we already have the desired estimate for the first term. For the second, we estimate by the inverse estimate on each cell

$$\min_{T \in \mathbb{T}_h} h_T \,\|I_h u - u_h\|_{2;h} \,\|I_h u - u_h\|_1 \le c \|I_h u - u_h\|_1^2,$$

and

$$\begin{aligned}
\|I_h u - u_h\|_1^2 &\le \frac{c}{\gamma}\, a(I_h u - u_h, I_h u - u_h) \\
&= \frac{c}{\gamma}\, a(I_h u - u, I_h u - u_h) \\
&\le \frac{cM}{\gamma}\, \|I_h u - u\|_1 \|I_h u - u_h\|_1 \\
&\le C h^k |u|_{k+1;h} \|I_h u - u_h\|_1,
\end{aligned}$$

where we have used Galerkin orthogonality and the interpolation estimate. Combining the two and using the quasi-uniformity, we obtain the result of the theorem.

2.2.3 Estimates of weaker norms and linear functionals

2.2.19. Deriving optimal error estimates in the $L^2$-norm cannot be achieved by the same technique as used for the $H^2$-norm, as the following simple argument shows: using the triangle inequality, we obtain

$$\|u - u_h\|_{L^2} \le \|u - I_h u\|_{L^2} + \|I_h u - u_h\|_{L^2},$$

where the first term is of order $h^{k+1}$. Thus, for an optimal error estimate, we require that the second term be of order $h^{k+1}$ as well. We need a replacement of the inverse estimate, which gains an order of $h$ instead of losing it. This is the Poincaré inequality, but it requires an almost mean-value free function. And since $I_h u$ is not the interpolant of $u_h$, we cannot guarantee a small mean value of the difference on each mesh cell.

To the rescue comes a "duality argument" known as the Aubin-Nitsche trick, which we introduce now.

2.2.20 Definition: Let the weak form of a boundary value problem on the domain $\Omega$ be defined as: find $u \in V$ such that

$$a(u, v) = f(v) \qquad \forall v \in V.$$

Then, the dual problem, also called adjoint problem, with right hand side $g \in V^*$ is: find $u^* \in V$ such that

$$a(v, u^*) = g(v) \qquad \forall v \in V.$$

2.2.21 Lemma: The dual problem of the Dirichlet boundary value problem for Poisson's equation is equal to the original problem, that is, for $u \in V = H^1_0(\Omega)$, the two statements

$$a(u, v) = f(v) \qquad \forall v \in V,$$
$$a(v, u) = f(v) \qquad \forall v \in V,$$

are equivalent.

2.2.22 Assumption (Elliptic regularity): Let $a(.,.)$ be a bounded and elliptic bilinear form on $H^1_0(\Omega)$. We say that a boundary value problem has elliptic regularity, if for any $g \in L^2(\Omega)$ the solution $u$ is in $H^2(\Omega)$. In other words, there is a constant $c$ independent of $g$, such that

$$\|u\|_{2;\Omega} \le c \|g\|_{0;\Omega}. \tag{2.56}$$

Example 2.2.23. By Remark 1.4.9, a second order PDE with coefficients $a_{ij} \in C^{0,1}(\Omega)$ and $b_i, c \in L^\infty(\Omega)$ on a convex domain has elliptic regularity. The same holds by integration by parts for its adjoint.

The same boundary value problem does not have elliptic regularity, if the domain has a nonconvex corner, since we have corner singularity functions which are not in $H^2(\Omega)$, and we can construct a right hand side $g \in L^2(\Omega)$, which produces such singularities.

2.2.24 Theorem: Let $a(.,.)$ be a bounded and elliptic bilinear form on $V = H^1_0(\Omega)$ and let the finite element space $V_h$ be defined on a shape-regular family of meshes $\{\mathbb{T}_h\}$, such that the interpolation estimate (2.50) holds. If furthermore the solution $u \in H^{k+1}(\Omega)$ and the dual problem has elliptic regularity, the error of the finite element solution $u_h \in V_h \subset V$ admits the estimate

$$\|u - u_h\|_0 \le c h^{k+1} |u|_{k+1;h}. \tag{2.57}$$
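The additional order of $h$ in the $L^2$-norm can be observed for a one-dimensional model problem. The following sketch, assuming NumPy, solves $-u'' = f$ on $(0, 1)$ with homogeneous Dirichlet conditions and piecewise linear elements ($k = 1$) for the manufactured solution $u(x) = \sin(\pi x)$; the model problem, the mesh sizes, and the function name are choices made for this illustration.

```python
import numpy as np

def fem_l2_error(n):
    """L2 error of the piecewise linear Galerkin solution of
    -u'' = f on (0, 1) with u(0) = u(1) = 0 on n uniform cells."""
    u = lambda x: np.sin(np.pi * x)
    f = lambda x: np.pi ** 2 * np.sin(np.pi * x)
    h = 1.0 / n
    nodes = np.linspace(0.0, 1.0, n + 1)
    q, w = np.polynomial.legendre.leggauss(4)   # Gauss rule on (-1, 1)
    A = np.zeros((n + 1, n + 1))
    rhs = np.zeros(n + 1)
    for c in range(n):
        x = nodes[c] + 0.5 * h * (q + 1.0)      # quadrature points in cell c
        phi = np.array([(nodes[c + 1] - x) / h, (x - nodes[c]) / h])
        A[c:c + 2, c:c + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        rhs[c:c + 2] += 0.5 * h * phi @ (w * f(x))
    uh = np.zeros(n + 1)
    # Dirichlet values are zero; solve for the interior unknowns only
    uh[1:-1] = np.linalg.solve(A[1:-1, 1:-1], rhs[1:-1])
    err2 = 0.0
    for c in range(n):
        x = nodes[c] + 0.5 * h * (q + 1.0)
        uh_x = uh[c] * (nodes[c + 1] - x) / h + uh[c + 1] * (x - nodes[c]) / h
        err2 += 0.5 * h * np.sum(w * (u(x) - uh_x) ** 2)
    return np.sqrt(err2)

# Observed order when h is halved
rate = np.log2(fem_l2_error(16) / fem_l2_error(32))
```

For $k = 1$, the estimate (2.57) predicts order $h^2$, so `rate` should be close to 2.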

2.2.25 Corollary: Under the assumptions of Theorem 2.2.24, let $J(.)$ be a bounded linear functional on $L^2(\Omega)$. Then,

$$|J(u) - J(u_h)| \le c h^{k+1} |u|_{k+1;h}. \tag{2.58}$$

2.2.4 Green’s function and maximum norm estimates

2.2.26. In this section, we will introduce the basic concepts needed for pointwise error estimation. They heavily rely on Green's function, which is useful for a general understanding of the solution structure as well.

2.2.27 Definition: For a differential equation $Lu = f$ in $\Omega$ with boundary conditions $u = 0$ on the boundary $\partial\Omega$, we define Green's function $G(y, x)$ associated to the point $y \in \Omega$ as solution to the problem

$$LG(y, x) = \delta(x - y) \quad \forall x \in \Omega, \qquad G(y, x) = 0 \quad \forall x \in \partial\Omega. \tag{2.59}$$

2.2.28 Theorem: Green's function for Poisson's equation on the whole space $\Omega = \mathbb{R}^d$ is

$$G(y, x) = \begin{cases} -\dfrac{1}{2\pi} \log|x - y| & d = 2, \\[2ex] \dfrac{1}{d(d-2)|B_1(0)|} \dfrac{1}{|x - y|^{d-2}} & d \ge 3. \end{cases} \tag{2.60}$$

Proof. See [Evans, 1998, Section 2.2].

2.2.29 Lemma: Let $\Omega \subset \mathbb{R}^d$ be a bounded domain. Then, Green's function associated to a point $y \in \Omega$ for a linear differential operator $L$ on $\Omega$ is obtained as the sum

$$G(y, x) = G_\infty(y, x) - G_0(y, x), \tag{2.61}$$

where $G_\infty(y, x)$ is Green's function for the whole space and $G_0(y, x)$ solves $LG_0(y, x) = 0$ with boundary values $G_\infty(y, x)$. If the domain is convex, then $G_0(y, .) \in H^2(\Omega)$ for any interior point $y$.

2.2.30 Theorem: Let $f \in C(\Omega)$ and let $\Omega$ be such that the solution to

$$Lu = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega, \tag{2.62}$$

is in $C^2(\Omega)$. Then,

$$u(x) = \int_\Omega f(y)\, G(y, x) \,dy. \tag{2.63}$$

Proof. See [Evans, 1998, Section 2.2].

2.2.31 Theorem: Let $a(.,.)$ be a bounded and elliptic bilinear form on $V = H^1_0(\Omega)$ on a bounded, convex domain $\Omega \subset \mathbb{R}^2$. Let the finite element space $V_h$ be defined on a quasi-uniform family of meshes $\{\mathbb{T}_h\}$ with local spaces $P_1$ or $Q_1$. If furthermore the solution $u \in W^{2,\infty}(\Omega)$, the error of the finite element solution $u_h \in V_h \subset V$ admits the estimate

$$\|u - u_h\|_\infty \le c h^2 (1 + |\log h|)\, |u|_{2,\infty}. \tag{2.64}$$

Proof. The proof relies on solving a dual problem for the error in a single point. The solution is Green's function for the adjoint equation, which is very irregular at the point of interest. Then, complicated analysis is needed to generate useful approximation estimates in spite of the singularity. Details can be found in [Schatz and Wahlbin, 1977, Rannacher and Scott, 1982].

Remark 2.2.32. This estimate extends to $\Omega \subset \mathbb{R}^d$ for $d \ge 3$ and to higher polynomial degrees without the logarithmic factor. The regularity assumptions are quite high then.

2.3 A posteriori error analysis

2.3.1 Definition: Let $u \in V$ be the solution to a boundary value problem in weak form and $u_h \in V_h$ be its finite element approximation on the mesh $\mathbb{T}_h$. We call a quantity $\eta_h(u_h)$ an a posteriori error estimator, if there holds

$$\|u - u_h\| \le c\, \eta_h(u_h). \tag{2.65}$$

The estimator is reliable, if the constant $c$ is computable. It is efficient, if the converse estimate holds, namely

$$\eta_h(u_h) \le c \|u - u_h\|. \tag{2.66}$$

2.3.1 Quasi-interpolation in H1

2.3.2. Interpolation in Sobolev spaces in Theorem 2.2.9 relies on the nodal interpolation operator in Definition 2.2.6, which in turn requires point values of the interpolated function. Therefore, it is not defined on $H^1(\Omega)$ in dimensions greater than one. Since we need such interpolation operators in the analysis of a posteriori error estimates, we provide them in this section.

Most details in this section are from [Verfürth, 2013]. For the ease of presentation, we present the results for the Dirichlet problem of the Laplacian, namely $V = H^1_0(\Omega)$ and

$$a(u, v) = \int_\Omega \nabla u \cdot \nabla v \,dx.$$

2.3.3 Definition: A shape-regular family of meshes $\{\mathbb{T}_h\}$ is called locally quasi-uniform, if there is a constant $c$ such that for every pair of cells $T_1$ and $T_2$ sharing at least one vertex there holds

$$h_{T_1} \le c\, h_{T_2}. \tag{2.67}$$

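2.3.4 Definition: For a vertex or higher-dimensional boundary facet $F$, we define the set of cells

$$\mathbb{T}_F = \bigl\{ T \in \mathbb{T} \bigm| F \subset \partial T \bigr\}. \tag{2.68}$$

Similarly, the set of cells sharing at least one vertex with $T$ is called $\mathbb{T}_T$. Additionally, we define the subdomains

$$\Omega_F = \bigcup_{T \in \mathbb{T}_F} T, \qquad \Omega_T = \bigcup_{T' \in \mathbb{T}_T} T'. \tag{2.69}$$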

2.3.5 Theorem (Clément quasi-interpolation): Let $\{\mathbb{T}_h\}$ be a locally quasi-uniform family of meshes with piecewise polynomial finite element spaces $V_h \subset H^1_0(\Omega)$. Then, there exist bounded operators $I_h : H^1_0(\Omega) \to V_h$ such that for every function $u \in H^1_0(\Omega)$, every mesh cell $T$, every face $F$, and for $m = 0, 1$ there holds

$$\|u - I_h u\|_{m;T} \le c h_T^{1-m} |u|_{1;\Omega_T}, \tag{2.70}$$

$$\|u - I_h u\|_{0;F} \le c h_T^{1/2} |u|_{1;\Omega_T}. \tag{2.71}$$

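Proof. We construct the quasi-interpolation operator on a mesh cell $T$ into the lowest order space $P_1$ or $Q_1$ in two steps. First, for each vertex $X_i$ of $T$, let

$$\bar{u}_i = \frac{1}{|\Omega_{X_i}|} \int_{\Omega_{X_i}} u \,dx. \tag{2.72}$$

By the Poincaré inequality,

$$\|u - \bar{u}_i\|_{\Omega_{X_i}} \le c \operatorname{diam}(\Omega_{X_i}) \|\nabla u\|_{\Omega_{X_i}}. \tag{2.73}$$

For a vertex $X_i$ on the boundary, we observe

$$\begin{aligned}
\|\bar{u}_i\|^2_{\Omega_{X_i}} &= \frac{1}{|\Omega_{X_i}|^2} \int_{\Omega_{X_i}} \Bigl( \int_{\Omega_{X_i}} u \,dy \Bigr)^2 dx
= \frac{1}{|\Omega_{X_i}|} \Bigl( \int_{\Omega_{X_i}} 1 \cdot u \,dy \Bigr)^2 \\
&\le \frac{1}{|\Omega_{X_i}|} \int_{\Omega_{X_i}} 1^2 \,dy \int_{\Omega_{X_i}} u^2 \,dy
= \|u\|^2_{\Omega_{X_i}} \le c \operatorname{diam}(\Omega_{X_i})^2 \|\nabla u\|^2_{\Omega_{X_i}},
\end{aligned} \tag{2.74}$$

by Friedrichs' inequality, since the subdomain $\Omega_{X_i}$ has at least one face on $\partial\Omega$.

Now, we define the quasi-interpolation operator $I_h : H^1_0(\Omega) \to V_h$ cellwise on simplices by

$$I_h u|_T = \sum_{X_i \in \partial T \cap \Omega} \lambda_i \bar{u}_i. \tag{2.75}$$

Note that zero boundary conditions are enforced by omitting vertices on the boundary. On quadrilaterals, we use the bilinear shape functions associated with the vertices instead of the barycentric coordinates $\lambda_i$. Now, using $\sum \lambda_i = 1$ and $0 \le \lambda_i \le 1$, we estimate

$$\begin{aligned}
\|u - I_h u\|_T &\le \sum_{X_i \in \partial T} \|\lambda_i (u - \bar{u}_i)\|_T + \sum_{X_i \in \partial T \cap \partial\Omega} \|\lambda_i \bar{u}_i\|_T \\
&\le \sum_{X_i \in \partial T} \|u - \bar{u}_i\|_T + \sum_{X_i \in \partial T \cap \partial\Omega} \|\bar{u}_i\|_T \\
&\le c \max_{X_i \in \partial T} \operatorname{diam}(\Omega_{X_i}) \Bigl( \sum_{X_i \in \partial T} \|\nabla u\|_{\Omega_{X_i}} + \sum_{X_i \in \partial T \cap \partial\Omega} \|\nabla u\|_{\Omega_{X_i}} \Bigr),
\end{aligned}$$

where we used (2.73) and (2.74) in the end. Observing that both sums extend over finitely many vertices and that by local quasi-uniformity we can bound the diameter of $\Omega_{X_i}$ by that of $T$, namely $\operatorname{diam}(\Omega_{X_i}) \le c h_T$, we obtain the estimate in $L^2(T)$. For the estimate in $H^1(T)$, we observe that the mean value calculation is a continuous operation on $H^1_0(\Omega_{X_i})$, and that (2.75) as a finite sum is continuous, therefore,

$$|u - I_h u|_{1;T} \le \|\nabla u\|_T + \|\nabla I_h u\|_T \le c \|\nabla u\|_{\Omega_T}. \tag{2.76}$$

Finally, we use the trace estimate for $u \in H^1(T)$

$$\|u\|_F \le c \bigl( h^{-1/2} \|u\|_T + h^{1/2} \|\nabla u\|_T \bigr), \tag{2.77}$$

to obtain the estimate on the face.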

2.3.6 Theorem (Scott-Zhang quasi-interpolation): Let $\{\mathbb{T}_h\}$ be a locally quasi-uniform family of meshes with piecewise polynomial finite element spaces $V_h \subset H^1(\Omega)$. Then, there exist bounded operators $I_h : H^1(\Omega) \to V_h$ such that for every function $u \in H^1(\Omega)$, every mesh cell $T$, every face $F$, and for $m = 0, 1$ there holds

$$\|u - I_h u\|_{m;T} \le c h_T^{1-m} |u|_{1;\Omega_T}, \tag{2.78}$$

$$\|u - I_h u\|_{0;F} \le c h_T^{1/2} |u|_{1;\Omega_T}, \tag{2.79}$$

and $I_h u$ is a quasi-interpolation on the boundary $\partial\Omega$.

2.3.7 Theorem (Schöberl quasi-interpolation): Let $\{\mathbb T_h\}$ be a locally quasi-uniform family of meshes with piecewise polynomial finite element spaces $V_h \subset H^1(\Omega)$. Then, there exist bounded projection operators $I_h : H^1(\Omega) \to V_h$ such that for every function $u \in H^1(\Omega)$, every mesh cell $T$, every face $F$, and for $m = 0, 1$ there holds

$$ \|u - I_h u\|_{m;T} \le c\, h_T^{1-m} |u|_{1;\Omega_T} \tag{2.80} $$

$$ \|u - I_h u\|_{0;F} \le c\, h_T^{1/2} |u|_{1;\Omega_T}. \tag{2.81} $$

2.3.8 Definition: Let $a(u, v) = f(v)$ be the weak formulation of a BVP on the space $V$. Then, we define the residual

$$ R : V \to V^*, \qquad w \mapsto f(\cdot) - a(w, \cdot). \tag{2.82} $$

2.3.9 Lemma: For the solution $u \in V = H^1_0(\Omega)$ of Poisson's equation and any other function $v \in V$, there holds

$$ |u - v|_1 = \|Rv\|_{-1} := \sup_{w \in V} \frac{\langle Rv, w \rangle}{|w|_1}. \tag{2.83} $$

Proof. We have

$$ \langle Rv, w \rangle = f(w) - a(v, w) = a(u - v, w). \tag{2.84} $$

Since $a(.,.)$ is symmetric and positive definite, we can apply the Cauchy-Schwarz inequality to obtain the result.
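The two inequalities behind this argument can be spelled out as follows. This worked step assumes, as is the case for Poisson's equation, that $a(u, v) = (\nabla u, \nabla v)$, so that $a(w, w) = |w|_1^2$.

```latex
\begin{align*}
\langle Rv, w \rangle &= a(u - v, w) \le |u - v|_1 \, |w|_1
  && \Longrightarrow \quad \|Rv\|_{-1} \le |u - v|_1, \\
\langle Rv, u - v \rangle &= a(u - v, u - v) = |u - v|_1^2
  && \Longrightarrow \quad \|Rv\|_{-1} \ge |u - v|_1 .
\end{align*}
```

The first line is the Cauchy-Schwarz inequality for arbitrary $w$, the second line is the particular choice $w = u - v$ in the supremum.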

2.3.10 Definition: Let $u$ be a piecewise continuous function on a mesh $\mathbb T$. On a face $F$ between two cells $T_1$ and $T_2$, we define the mean value operator

$$ \{\!\{u\}\!\}(x) = \frac{u_1(x) + u_2(x)}{2} = \lim_{\varepsilon \searrow 0} \frac{u(x - \varepsilon n_1) + u(x - \varepsilon n_2)}{2}, \tag{2.85} $$

where $n_i$ are the outer normal vectors of the two cells. The jump operator is

$$ \{\!\{u n\}\!\} = \frac{(u_1 - u_2)\, n_1}{2} = \frac{(u_2 - u_1)\, n_2}{2}. \tag{2.86} $$


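2.3.11 Definition: Let $u \in V = H^1_0(\Omega)$ be the solution to Poisson's equation and let $u_h \in V_h$ be a finite element function. The strong form of the residual is

$$ \langle R u_h, w \rangle = \sum_{T \in \mathbb T_h} \int_T r_T(u_h)\, w \, dx - \sum_{F \in \mathbb F^i_h} \int_F 2 \{\!\{n \cdot \nabla u_h\}\!\}\, w \, ds, \tag{2.87} $$

where

$$ r_T(u_h) = f + \Delta u_h. \tag{2.88} $$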
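The strong form follows from the weak form by cellwise integration by parts. The derivation below is a sketch assuming $f \in L^2(\Omega)$, $a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx$, and a test function $w \in H^1_0(\Omega)$; the subscripts 1 and 2 refer to the two cells sharing an interior face, as in the definition of the mean value operator.

```latex
\begin{align*}
\langle R u_h, w \rangle
 &= f(w) - a(u_h, w)
  = \sum_{T} \int_T \bigl( f w - \nabla u_h \cdot \nabla w \bigr) \, dx \\
 &= \sum_{T} \Bigl( \int_T (f + \Delta u_h) \, w \, dx
      - \int_{\partial T} (n_T \cdot \nabla u_h) \, w \, ds \Bigr) \\
 &= \sum_{T} \int_T r_T(u_h) \, w \, dx
      - \sum_{F \text{ interior}} \int_F
        \bigl( n_1 \cdot \nabla u_{h,1} + n_2 \cdot \nabla u_{h,2} \bigr) \, w \, ds .
\end{align*}
```

Faces on $\partial\Omega$ drop out because $w$ vanishes there, and each interior face collects one contribution from either side, which is twice the mean value of the normal derivative taken with the respective outer normals.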

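2.3.12 Lemma: Let $\mathbb T_h$ be a locally quasi-uniform mesh. There is a constant $c > 0$ such that the error between the true solution $u \in V = H^1_0(\Omega)$ and the finite element solution $u_h \in V_h \subset V$ is bounded by

$$ |u - u_h|_{1;\mathbb T_h} \le c \Bigl( \sum_{T \in \mathbb T_h} h_T^2 \|r_T(u_h)\|_T^2 + \sum_{F \in \mathbb F^i_h} h_F \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F^2 \Bigr)^{1/2}. \tag{2.89} $$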

Proof. We begin with the equivalence of error in $V$ and residual in $V^*$. For the residual, there holds Galerkin orthogonality, such that we obtain

$$ |u - u_h|_1 = \sup_{w \in V} \frac{\langle R u_h, w \rangle}{|w|_1} = \sup_{w \in V} \frac{\langle R u_h, w - I_h w \rangle}{|w|_1}. \tag{2.90} $$

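Using the strong form and the quasi-interpolation $I_h$, we get

$$
\begin{aligned}
\langle R u_h, w \rangle
 &= \sum_{T \in \mathbb T_h} \int_T r_T(u_h) (w - I_h w) \, dx - \sum_{F \in \mathbb F^i_h} \int_F 2 \{\!\{n \cdot \nabla u_h\}\!\} (w - I_h w) \, ds \\
 &\le \sum_{T \in \mathbb T_h} \|r_T(u_h)\|_T \|w - I_h w\|_T + \sum_{F \in \mathbb F^i_h} 2 \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F \|w - I_h w\|_F \\
 &\le c_1 \sum_{T \in \mathbb T_h} h_T \|r_T(u_h)\|_T \|\nabla w\|_{\Omega_T} + c_2 \sum_{F \in \mathbb F^i_h} h_F^{1/2} \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F \|\nabla w\|_{\Omega_T},
\end{aligned}
$$

where in the face sums $T$ denotes a cell adjacent to $F$.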

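Applying the Hölder inequality, we obtain for the first term

$$ \sum_{T \in \mathbb T_h} h_T \|r_T(u_h)\|_T \|\nabla w\|_{\Omega_T} \le \Bigl( \sum_{T \in \mathbb T_h} h_T^2 \|r_T(u_h)\|_T^2 \Bigr)^{1/2} \Bigl( \sum_{T \in \mathbb T_h} \|\nabla w\|_{\Omega_T}^2 \Bigr)^{1/2}, $$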


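and similarly for the second term. Both contain a term of the form

$$ \sum_{T \in \mathbb T_h} \|\nabla w\|_{\Omega_T}^2 = \sum_{T \in \mathbb T_h} \sum_{T' \in \mathbb T_T} \|\nabla w\|_{T'}^2 \le n \|\nabla w\|_\Omega^2, $$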

where $n$ is the maximal number of occurrences of a cell $T'$ in the double sum. This $n$ is bounded uniformly on a shape regular family of meshes, since shape regularity prohibits degeneration of cells. Therefore, we conclude

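$$ \langle R u_h, w \rangle \le c \Bigl( \sum_{T \in \mathbb T_h} h_T^2 \|r_T(u_h)\|_T^2 + \sum_{F \in \mathbb F^i_h} h_F \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F^2 \Bigr)^{1/2} |w|_{1;\Omega}. $$

Inserting this into (2.90) yields the proposition.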

Remark 2.3.13. The constant $c$ in Lemma 2.3.12 depends on the constant in the quasi-interpolation estimate, which in turn was derived using Poincaré and trace inequalities. Traditionally, both are derived using indirect arguments, but a constructive proof with computable bounds is possible. For details, see [Verfürth, 2013, Chapter 3]. The question remains whether these bounds are sufficiently sharp to be applicable in practice.

2.3.14 Definition: For a function $f \in L^2(\Omega)$, we define the cell-wise average

$$ \bar f(x) = \frac{1}{|T|} \int_T f \, dx, \qquad x \in T, \; T \in \mathbb T_h, \tag{2.91} $$

and the data oscillation

$$ \operatorname{osc}_T f = \|f - \bar f\|_T. \tag{2.92} $$
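Both quantities are easy to compute on a one-dimensional cell. The sketch below assumes cells are intervals $[a, b]$ and evaluates the integrals with a three-point Gauss rule, so the results are exact only for low-degree polynomial data; the function names are hypothetical.

```python
import math

def gauss3(g, a, b):
    """Three-point Gauss-Legendre rule on [a, b] (exact up to degree 5)."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    r = math.sqrt(0.6)
    return half * (5.0 * g(mid - half * r) + 8.0 * g(mid) + 5.0 * g(mid + half * r)) / 9.0

def cell_average(f, a, b):
    """Mean value of f on the cell T = [a, b], cf. (2.91)."""
    return gauss3(f, a, b) / (b - a)

def oscillation(f, a, b):
    """Data oscillation osc_T f = ||f - mean(f)||_{L2(T)}, cf. (2.92)."""
    fbar = cell_average(f, a, b)
    return math.sqrt(gauss3(lambda x: (f(x) - fbar) ** 2, a, b))

def weighted_oscillation(f, n):
    """(sum_T h_T^2 osc_T(f)^2)^(1/2) on a uniform mesh of (0, 1) with n cells."""
    h = 1.0 / n
    total = sum(h**2 * oscillation(f, k * h, (k + 1) * h) ** 2 for k in range(n))
    return math.sqrt(total)
```

For $f(x) = x$ one finds $\operatorname{osc}_T f = h^{3/2}/\sqrt{12}$ on a cell of length $h$, so the weighted sum over a uniform mesh equals $h^2/\sqrt{12}$. For smooth data, the oscillation term is therefore of higher order than the $O(h)$ error of linear elements.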

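2.3.15 Definition: The residual based error estimator for the finite element method for Poisson's equation on a mesh $\mathbb T$ is defined by

$$ \eta_h(u_h)^2 = \sum_{T \in \mathbb T} \eta_T(u_h)^2, \tag{2.93} $$

where

$$ \eta_T(u_h)^2 = h_T^2 \|\bar f + \Delta u_h\|_T^2 + \frac{1}{2} \sum_{F \subset \partial T} h_F \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F^2. \tag{2.94} $$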


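2.3.16 Theorem: There holds with positive constants $c_1$ and $c_2$

$$ c_1 |u - u_h|_1^2 \le \eta_h(u_h)^2 + \sum_{T \in \mathbb T_h} h_T^2 \operatorname{osc}_T^2 f \tag{2.95} $$

$$ c_2\, \eta_T^2(u_h) \le |u - u_h|_{1;\Omega_T}^2 + \sum_{T' \in \mathbb T_T} h_{T'}^2 \operatorname{osc}_{T'}^2 f \tag{2.96} $$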

Proof. Note that the right hand side of the estimate (2.89) in Lemma 2.3.12 differs from the estimator $\eta_h(u_h)$ only by the replacement of $f$ by $\bar f$. Therefore, we can start as in the proof of that lemma, changing (2.90) to

$$ |u - u_h|_1 = \sup_{w \in V} \frac{\langle R u_h, w \rangle}{|w|_1} = \sup_{w \in V} \frac{\langle \bar R u_h, w \rangle + (f - \bar f, w)}{|w|_1}, \tag{2.97} $$

where we ad hoc defined $\bar R$ like $R$, but replacing $r_T = f + \Delta u_h$ by $\bar f + \Delta u_h$. Due to this equality, we can still subtract the quasi-interpolant of $w$ and continue through the whole proof. What remains is the estimate

$$ (f - \bar f, w - I_h w) \le c \sum_{T \in \mathbb T} h_T \|f - \bar f\|_T \|\nabla w\|_{\Omega_T} \le c \Bigl( \sum_{T \in \mathbb T} h_T^2 \|f - \bar f\|_T^2 \Bigr)^{1/2} |w|_{1;\Omega}, \tag{2.98} $$

which is proven by the Hölder inequality and the boundedness of the number of cells in $\mathbb T_T$. Thus, the upper bound (2.95) is an immediate consequence of Lemma 2.3.12.

Note: the proof for the lower bound is for linear elements only. It can be generalized to higher order by generalizing the bubble functions below, which is mostly technical.

We observe that the lower bound is local, that is, the estimator on each cell $T$ is bounded from above by the error on the halo $\Omega_T$. Therefore, we start by constructing test functions with support on a cell. On a simplex, we define the bubble function in terms of barycentric coordinates

$$ B_T = (d + 1)^{d+1} \lambda_0 \lambda_1 \cdots \lambda_d, \tag{2.99} $$

while on the reference hypercube $(0, 1)^d$ we let

$$ \hat B_T(\hat x) = b(\hat x_1)\, b(\hat x_2) \cdots b(\hat x_d), \qquad b(x) = 4x(1 - x). \tag{2.100} $$

Both share the following properties: they vanish on $\partial T$, they are positive inside $T$, and $\max B_T = 1$. From positivity, we deduce the existence of a constant $c$ depending on shape regularity and the shape function space $\mathcal P(T)$, such that

$$ \int_T p^2 B_T \, dx \ge c \|p\|_T^2 \qquad \forall p \in \mathcal P(T). \tag{2.101} $$


Furthermore, BT is polynomial, such that the inverse estimate holds.
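The stated properties of the cell bubbles can be checked numerically. The following sketch uses hypothetical function names and normalises both bubbles so that their maximum is 1, as stated above. On the simplex this requires the factor $(d+1)^{d+1}$, since the product of the barycentric coordinates attains its maximum $(d+1)^{-(d+1)}$ at the barycentre.

```python
import math

def simplex_bubble(lam):
    """Cell bubble on a d-simplex from barycentric coordinates lam_0, ..., lam_d.

    The factor (d + 1)^(d + 1) normalises the maximum, attained at the
    barycentre lam_i = 1 / (d + 1), to the value 1.
    """
    d = len(lam) - 1
    return (d + 1) ** (d + 1) * math.prod(lam)

def cube_bubble(x):
    """Tensor product bubble b(x_1) ... b(x_d) with b(t) = 4 t (1 - t) on (0, 1)^d."""
    return math.prod(4.0 * t * (1.0 - t) for t in x)
```

Both functions vanish as soon as one barycentric coordinate, or one factor $b(\hat x_i)$, is zero, which is exactly the condition for a point to lie on the boundary of the cell.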

Choose now the test function $w_T = (\bar f + \Delta u_h) B_T$. Since it vanishes on the boundary and outside of $T$, the strong and weak form of the residual reduce to

$$ \int_T r_T(u_h)\, w_T \, dx = \int_T \nabla(u - u_h) \cdot \nabla w_T \, dx. \tag{2.102} $$

Adding the difference of $\bar f$ and $f$ on both sides yields

$$ \int_T (\bar f + \Delta u_h)^2 B_T \, dx = \int_T \nabla(u - u_h) \cdot \nabla w_T \, dx + \int_T (\bar f - f)\, w_T \, dx. \tag{2.103} $$

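On the left, we estimate

$$ c \|\bar f + \Delta u_h\|_T^2 \le \int_T (\bar f + \Delta u_h)^2 B_T \, dx. \tag{2.104} $$

On the right, we estimate the residual term by

$$ \int_T \nabla(u - u_h) \cdot \nabla w_T \, dx \le |u - u_h|_{1;T}\, |w_T|_{1;T} \le c\, |u - u_h|_{1;T}\, h_T^{-1} \|\bar f + \Delta u_h\|_T. $$

The data oscillation is estimated in a straightforward way by

$$ \int_T (\bar f - f)\, w_T \, dx \le \operatorname{osc}_T f \, \|\bar f + \Delta u_h\|_T. $$

Combining these three estimates yields

$$ c \|\bar f + \Delta u_h\|_T \le h_T^{-1} |u - u_h|_{1;T} + \operatorname{osc}_T f, \tag{2.105} $$

which is the desired estimate for the cell term of the estimator $\eta_T(u_h)$ if we multiply by $h_T$.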
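The combination of the three estimates can be written as a single chain. In this sketch, $\bar r = \bar f + \Delta u_h$ abbreviates the cell residual formed with the cell-wise average $\bar f$ from (2.91), $w_T = \bar r B_T$, and $c'$ is the constant of the inverse estimate; these abbreviations are introduced here for convenience only.

```latex
\begin{align*}
c \, \|\bar r\|_T^2
 &\le \int_T \bar r^{\,2} B_T \, dx
  = \int_T \nabla(u - u_h) \cdot \nabla w_T \, dx + \int_T (\bar f - f) \, w_T \, dx \\
 &\le \bigl( c' \, h_T^{-1} \, |u - u_h|_{1;T} + \operatorname{osc}_T f \bigr) \, \|\bar r\|_T .
\end{align*}
```

Dividing by $\|\bar r\|_T$, which may be assumed nonzero since otherwise there is nothing to prove, and renaming the constants gives (2.105).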

The estimate for the face term is similar, using a bubble function for the face. Since this function is nonzero on the face, its support extends over two mesh cells. On simplicial cells, it is constructed like the cell bubble in equation (2.99), but extending the product only over indices of vertices on the face. Therefore, it is a polynomial of degree $d$ on the face and also in the interior of each cell. For hypercubes, it is the product of the quadratic polynomials $b(x)$ of variables on the face times a linear function decaying from 1 to zero. Again, we normalize such that the maximum equals 1. For linear elements, the derivative on the face is constant, so that the product of the jump term with the face bubble $B_F$ is well defined on both cells. For higher order polynomials, we need an extension from the face into the interior, which can be obtained from the barycentric coordinates or the tensor product structure. Choosing $w_F = 2\{\!\{n \cdot \nabla u_h\}\!\}_F B_F$, we obtain

$$
\begin{aligned}
c \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F^2
 &\le \int_F \bigl(2\{\!\{n \cdot \nabla u_h\}\!\}\bigr)^2 B_F \, ds \\
 &= \int_F 2\{\!\{n \cdot \nabla u_h\}\!\}\, w_F \, ds \\
 &= \sum_{T \in \mathbb T_F} \int_T r_T\, w_F \, dx - \langle R u_h, w_F \rangle \\
 &= \sum_{T \in \mathbb T_F} \int_T r_T\, w_F \, dx - \int_{\Omega_F} \nabla(u - u_h) \cdot \nabla w_F \, dx \\
 &= \sum_{T \in \mathbb T_F} \int_T (\bar f + \Delta u_h)\, w_F \, dx - \int_{\Omega_F} \nabla(u - u_h) \cdot \nabla w_F \, dx + \sum_{T \in \mathbb T_F} \int_T (f - \bar f)\, w_F \, dx.
\end{aligned}
$$

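Again, we estimate the three terms on the right hand side separately. For the first, we estimate the norm of $w_F$ by its trace on the edge $F$. From the already proven estimate for the cell residual (2.105), we obtain

$$
\begin{aligned}
\sum_{T \in \mathbb T_F} \int_T (\bar f + \Delta u_h)\, w_F \, dx
 &\le \sum_{T \in \mathbb T_F} \|\bar f + \Delta u_h\|_T \|w_F\|_T \\
 &\le \sum_{T \in \mathbb T_F} \|\bar f + \Delta u_h\|_T \, c\, h_F^{1/2} \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F \\
 &\le c \sum_{T \in \mathbb T_F} \Bigl( h_F^{-1/2} |u - u_h|_{1;T} + h_F^{1/2} \operatorname{osc}_T f \Bigr) \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F.
\end{aligned}
$$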

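Similarly, the inverse estimate for $w_F$ yields

$$
\begin{aligned}
\int_{\Omega_F} \nabla(u - u_h) \cdot \nabla w_F \, dx
 &\le |u - u_h|_{1;\Omega_F} \|\nabla w_F\|_{\Omega_F} \\
 &\le |u - u_h|_{1;\Omega_F} \, c\, h_F^{-1/2} \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F.
\end{aligned}
$$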

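Finally, the data oscillation term becomes

$$
\begin{aligned}
\sum_{T \in \mathbb T_F} \int_T (f - \bar f)\, w_F \, dx
 &\le \sum_{T \in \mathbb T_F} \|f - \bar f\|_T \|w_F\|_T \\
 &\le \sum_{T \in \mathbb T_F} \|f - \bar f\|_T \, c\, h_F^{1/2} \|\{\!\{n \cdot \nabla u_h\}\!\}\|_F.
\end{aligned}
$$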

Summing and scaling with $h_F^{1/2}$ yields the result.


Chapter 3

Variational Crimes

3.0.1. In the previous chapter, we considered finite element methods applying the original bilinear form to a subspace $V_h \subset V$. This assumes that the domain $\Omega$ is exactly represented by the mesh, and that all integrals are computed exactly. Both assumptions reduce the applicability of the finite element method considerably. Therefore, we now extend our analysis to cases where we allow $V_h \not\subset V$ and discrete bilinear forms $a_h(.,.) \neq a(.,.)$.

3.1 Numerical quadrature

3.1.1. We begin the investigation of variational crimes by studying the effect of using numerical quadrature instead of exact integration on mesh cells. In particular, we investigate approximations of the form

$$ \int_T f(x) \, dx \approx \sum_{k=1}^{n_q} \omega_k f(x_k) =: Q_T(f). \tag{3.1} $$

First, we observe that $Q_T$ is not a bounded operator on $L^1(\Omega)$, such that $Q_T(\nabla u \cdot \nabla v)$ is undefined for functions in $H^1(\Omega)$. The surprising result of this section is that quadrature is still admissible for the implementation of a finite element method.
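The unboundedness of point evaluation in $L^1$ can be made concrete with a small experiment. The sketch below assumes the cell $T = [0, 1]$ and the two-point Gauss rule as an example of (3.1); the names `Q` and `spike` are hypothetical.

```python
import math

# Two-point Gauss-Legendre rule on T = [0, 1]: points x_k and weights w_k.
POINTS = (0.5 - 0.5 / math.sqrt(3.0), 0.5 + 0.5 / math.sqrt(3.0))
WEIGHTS = (0.5, 0.5)

def Q(f):
    """Quadrature Q_T(f) = sum_k w_k f(x_k), cf. (3.1)."""
    return sum(w * f(x) for x, w in zip(POINTS, WEIGHTS))

def spike(eps):
    """Hat function of height 1/eps and width 2*eps centred at the first point.

    Its exact integral over [0, 1], and hence its L1 norm, equals 1 for
    every eps smaller than POINTS[0].
    """
    x0 = POINTS[0]
    return lambda x: max(0.0, 1.0 - abs(x - x0) / eps) / eps
```

The rule integrates cubic polynomials exactly, for example `Q(lambda x: x**3)` reproduces the value 1/4 up to rounding. For the spike, however, `Q(spike(eps))` equals `0.5 / eps` while the $L^1$ norm stays at 1, so no bound of the form $|Q_T(f)| \le C \|f\|_{L^1}$ can hold.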

We will first set a theoretical framework and then investigate quadrature rules in detail. The presentation follows [Ciarlet, 1978, Chapter 4].


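3.1.2 Lemma (Strang's first lemma): Let $a(.,.)$ be a bounded and elliptic bilinear form on the Hilbert space $V$. Let $V_n \subset V$ and let $a_n(.,.)$ be a bilinear form, bounded and elliptic on $V_n$ with constants $M_n$ and $\alpha_n$. Let $f, f_n \in V^*$. If $u \in V$ and $u_n \in V_n \subset V$ are solutions to

$$
\begin{aligned}
a(u, v) &= f(v) && \forall v \in V, \\
a_n(u_n, v_n) &= f_n(v_n) && \forall v_n \in V_n,
\end{aligned}
$$

respectively, there holds

$$ \|u - u_n\|_V \le \inf_{v_n \in V_n} \Bigl[ \Bigl(1 + \frac{M}{\alpha_n}\Bigr) \|u - v_n\|_V + \frac{1}{\alpha_n} \|a_n(v_n, .) - a(v_n, .)\|_{V_n^*} \Bigr] + \frac{1}{\alpha_n} \|f_n - f\|_{V_n^*}. \tag{3.2} $$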

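Proof. Using $V_n$-ellipticity of $a_n(.,.)$ yields for arbitrary $v_n \in V_n$

$$
\begin{aligned}
\alpha_n \|u_n - v_n\|^2 &\le a_n(u_n - v_n, u_n - v_n) \\
 &= f_n(u_n - v_n) + a(u - v_n, u_n - v_n) - f(u_n - v_n) \\
 &\quad + a(v_n, u_n - v_n) - a_n(v_n, u_n - v_n).
\end{aligned}
$$

We estimate separately

$$
\begin{aligned}
\frac{a(u - v_n, u_n - v_n)}{\|u_n - v_n\|} &\le M \|u - v_n\|, \\
\frac{|f_n(u_n - v_n) - f(u_n - v_n)|}{\|u_n - v_n\|} &\le \sup_{w_n \in V_n} \frac{|f_n(w_n) - f(w_n)|}{\|w_n\|}, \\
\frac{|a(v_n, u_n - v_n) - a_n(v_n, u_n - v_n)|}{\|u_n - v_n\|} &\le \sup_{w_n \in V_n} \frac{|a_n(v_n, w_n) - a(v_n, w_n)|}{\|w_n\|}.
\end{aligned}
$$

Combining all terms, we obtain

$$
\begin{aligned}
\|u - u_n\| &\le \|u - v_n\| + \|u_n - v_n\| \\
 &\le \|u - v_n\| + \frac{1}{\alpha_n} \Bigl( M \|u - v_n\| + \|a_n(v_n, .) - a(v_n, .)\|_{V_n^*} + \|f_n - f\|_{V_n^*} \Bigr).
\end{aligned} \tag{3.3}
$$

Since $v_n \in V_n$ was arbitrary, taking the infimum yields (3.2).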

Remark 3.1.3. Strang's lemma states that the error can be split into the approximation error of the space $V_n$ and the consistency error of the approximations $a_n(.,.)$ and $f_n(.)$. Both errors are scaled with the stability factor $1/\alpha_n$ of the discrete problem. So far, this is consistent with error estimates, for instance, for Runge-Kutta methods. There is an important difference though: the consistency error is not evaluated for the exact solution, but only for its discrete approximation.

Remark 3.1.4. We will apply Strang's lemma to a family of meshes indexed by mesh size $h$ and assess the infimum by an interpolation operator. It is clear that we will only obtain optimal convergence rates compared to the interpolation estimate if there exists $\alpha_0 > 0$ such that $\alpha_n \ge \alpha_0$ uniformly with respect to $n$. While it is not a prerequisite of Strang's lemma, it is our goal for all discretizations.

Remark 3.1.5. Quadrature would be infeasible if we had to devise a quadrature rule for every mesh cell $T$. Instead, we tabulate quadrature formulas for the reference cell $\hat T$ by choosing quadrature points $\hat x_k$ and weights $\omega_k$ and write

$$ Q_{\hat T}(\hat f) = \sum_{k=1}^{n_q} \omega_k \hat f(\hat x_k). \tag{3.4} $$

We compute integrals over $T$ through mapping,

$$ Q_T(f) = \sum_{k=1}^{n_q} \det \nabla\Phi_T(\hat x_k)\, \omega_k \hat f(\hat x_k). \tag{3.5} $$

Thus, $Q_T$ is defined by quadrature points $x_k = \Phi_T(\hat x_k)$ and quadrature weights $\det \nabla\Phi_T(\hat x_k)\, \omega_k$.
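The mapping (3.5) can be sketched for an affine triangle. The example assumes the reference triangle with vertices (0,0), (1,0), (0,1), the edge-midpoint rule (weights 1/6, exact for quadratic polynomials) as the tabulated reference rule, and vertices given in counterclockwise order so that the determinant is positive. The names `mapped_rule` and `Q_T` are hypothetical.

```python
# Edge-midpoint rule on the reference triangle (0,0), (1,0), (0,1).
REF_POINTS = ((0.5, 0.0), (0.5, 0.5), (0.0, 0.5))
REF_WEIGHTS = (1.0 / 6.0, 1.0 / 6.0, 1.0 / 6.0)

def mapped_rule(v0, v1, v2):
    """Points Phi_T(xhat_k) and weights det(grad Phi_T) * w_k for the affine map
    Phi_T(xhat) = v0 + xhat_1 * (v1 - v0) + xhat_2 * (v2 - v0)."""
    b11, b21 = v1[0] - v0[0], v1[1] - v0[1]
    b12, b22 = v2[0] - v0[0], v2[1] - v0[1]
    det = b11 * b22 - b12 * b21   # constant, because the map is affine
    points = [(v0[0] + b11 * s + b12 * t, v0[1] + b21 * s + b22 * t)
              for s, t in REF_POINTS]
    weights = [det * w for w in REF_WEIGHTS]
    return points, weights

def Q_T(f, v0, v1, v2):
    """Mapped quadrature of f over the triangle with vertices v0, v1, v2, cf. (3.5)."""
    points, weights = mapped_rule(v0, v1, v2)
    return sum(w * f(x, y) for (x, y), w in zip(points, weights))
```

For the triangle with vertices (0,0), (2,0), (0,1), the rule reproduces the area 1 and the integral of $xy$, which is 1/6. For non-affine maps, such as bilinear maps of quadrilaterals, the determinant has to be evaluated at each quadrature point separately.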

Quadrature rules on the reference cell $\hat T$ are obtained by interpolation after choosing quadrature points by employing the properties of orthogonal polynomials. The construction of such quadrature rules for simplices is somewhat complicated, and we refer to tables in the cited literature.

An important consequence of choosing the roots of orthogonal polynomials as quadrature points is the fact that all weights are positive.

3.1.6 Definition: Let a one-dimensional quadrature rule $Q_I$ on the interval $I = [0, 1]$ be given with

$$ Q_I(\hat f) = \sum_{k=1}^{n_q} \omega_k \hat f(\hat x_k). \tag{3.6} $$

Then, a quadrature rule on $\hat T = [0, 1]^d$ is defined by

QT̂ (f̂) =

nq∑k1
