Application of H-matrices for solving
multiscale problems
Litvinenko Alexander,
Dissertation work
Max-Planck-Institut für Mathematik in den Naturwissenschaften,
Leipzig, 10 August, 2006.
www.hlib.org www.mis.mpg.de

[Figure: diagram of application areas of H-matrices: integral equations, BEM 3D; parallel implementation of H-matrices; Helmholtz equation; convection-diffusion problems; multigrid + H-matrices; H-matrix approximation of sign(A), exp(A), etc.; a posteriori error estimation + efficient H-matrix update; Lyapunov and Riccati equations; DD methods; Schur complement methods; hierarchical domain decomposition for multiscale problems*; 3D skin problem*; multidimensional problems.]
Fig. 1 – Main directions of application of H-matrices. The symbol * refers to the projects in which I took part.

Contents
1. Examples of multiscale problems
2. Multiscale methods
3. HDD method
4. Hierarchical matrices
5. Application of H-matrices to HDD
6. Complexity and storage of HDD
7. Modifications of HDD
– Two scales
– Truncation of the small scales
8. Numerical results

Examples of multiscale problems
[Figure: (a) macroscopic scale, (b) microscopic scale.]
Different scales in a porous medium [Bastian 99].
[Figure: time scales from 10^-6 s to 10^3 s against length scales from 10^-12 m to 10^-3 m (atom, protein, cell, tissue); processes shown: molecular events (ion channel gating), diffusion, cell signalling, mitosis.]
Example of time and length scales for modeling tumor growth [Alarcon, Byrne, Maini 05].

[Figure: plot of a solution over x.]
Fig. 2 – The fine-scale properties of the solution are not of interest.
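
Multiscale methods
The model equation is
$$-\nabla\cdot(a(x)\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega. \qquad (1)$$
Homogenisation [Babuska 75], [Bensoussan, Lions, Papanicolaou 78], [Jikov, Kozlov, Oleinik 94]
The solution has the expansion
$$u_\varepsilon(x) = u_0(x) + \varepsilon u_1(x, x/\varepsilon) + O(\varepsilon^2),$$
where $u_0$ is the solution of the homogenized equation
$$-\nabla\cdot(a^*\nabla u_0) = f \ \text{in } \Omega, \qquad u_0 = 0 \ \text{on } \partial\Omega. \qquad (2)$$
Resonance effect in MsFEM [T. Hou, X. Wu 97]
$$\|u - u_h\|_{0,\Omega} = O(h^2 + \varepsilon/h). \qquad (3)$$
Heterogeneous multiscale method [Weinan E, B. Engquist 03]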
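
Problem setup
The Poisson problem: find $u \in H^1(\Omega)$ s.t.
$$\sum_{1\le i,j\le 2} \frac{\partial}{\partial x_i}\, a_{i,j}(x)\, \frac{\partial}{\partial x_j} u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \Gamma, \qquad (4)$$
where $a_{i,j} \in L^\infty(\Omega)$ are such that the matrix $A(x) = (a_{i,j})_{i,j=1,\dots,d}$ satisfies
$$0 < \underline{\lambda} \le \lambda_{\min}(A(x)) \le \lambda_{\max}(A(x)) \le \overline{\lambda}, \quad \forall x \in \Omega.$$
⇒ Oscillatory or jumping coefficients are allowed.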

The idea of HDD
Find operators $B_h$, $C_h$ s.t.
$$u_h = B_h f_h + C_h g_h, \qquad (5)$$
where $f_h$ is the right-hand side and $g_h$ are the Dirichlet boundary values.
The composed matrix $(B_h, C_h)$ is the 'inverse' of the stiffness matrix $A_h$.
The complete inverse $(B_h, C_h)$ contains more information than is usually needed: we might be interested in only a few functionals of the solution.
Example: we want to know $u_h(f_h, g_h)$ only for $f_h$ in a smaller space $V_H \subset V_h$.
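The splitting (5) can be illustrated on a toy problem. The sketch below is not the HDD algorithm itself; it assumes a 1D finite-difference discretisation of $-u'' = f$ on (0, 1) (the function name `hdd_operators_1d` and the test data are hypothetical) and forms $B_h$ and $C_h$ directly from the interior block of the stiffness matrix. It also shows that one functional of the solution needs only one row vector instead of the full operators.

```python
import numpy as np

def hdd_operators_1d(n):
    """Return (B, C, x) for -u'' = f on (0, 1) with n interior nodes (finite differences)."""
    h = 1.0 / (n + 1)
    x = np.linspace(0.0, 1.0, n + 2)
    # Interior block A_II: tridiagonal (-1, 2, -1) / h^2.
    A_II = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    # Coupling A_IB between the interior nodes and the two boundary nodes.
    A_IB = np.zeros((n, 2))
    A_IB[0, 0] = -1.0 / h**2
    A_IB[-1, 1] = -1.0 / h**2
    B = np.linalg.inv(A_II)   # maps the right-hand side to the solution
    C = -B @ A_IB             # maps the Dirichlet values to the solution
    return B, C, x

B, C, x = hdd_operators_1d(15)
f = -2.0 * np.ones(15)        # f = -u'' for the test solution u = x^2
g = np.array([0.0, 1.0])      # Dirichlet values u(0), u(1)
u = B @ f + C @ g             # discrete analogue of u_h = B_h f_h + C_h g_h
max_err = np.max(np.abs(u - x[1:-1] ** 2))

# Hypothetical functional: mean of the nodal values.
w = np.full(15, 1.0 / 15)
mean_u = (w @ B) @ f + (w @ C) @ g   # uses only the row vectors w^T B and w^T C
```

Because central differences are exact for quadratics, `max_err` is at rounding level here. For a general right-hand side the discrete solution only approximates the continuous one.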
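
Domain decomposition tree $T_{\mathcal{T}_h}$
FE discretisation: triangulation $\mathcal{T}_h$, $\Omega = \bigcup_{t\in\mathcal{T}_h} t$.
[Figure: a triangulated domain with numbered vertices and its recursive splitting into subdomains, which forms the domain decomposition tree.]
• $\Omega$ is the root of the tree,
• $T_{\mathcal{T}_h}$ is a binary tree,
• if $\omega \in T_{\mathcal{T}_h}$ has two sons $\omega_1, \omega_2 \in T_{\mathcal{T}_h}$, then $\omega = \omega_1 \cup \omega_2$ and $\gamma_\omega = \partial\omega_1 \cap \partial\omega_2$,
• $\omega \in T_{\mathcal{T}_h}$ is a leaf if and only if $\omega \in \mathcal{T}_h$.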
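
Notations
Let $\omega \in T_{\mathcal{T}_h}$, $\omega = \omega_1 \cup \omega_2$.
$\Gamma_{\omega,1} := \partial\omega \cap \omega_1$, $\Gamma_{\omega,2} := \partial\omega \cap \omega_2$ and $\gamma_\omega := \partial\omega_1 \setminus \partial\omega = \partial\omega_2 \setminus \partial\omega$.
[Figure: subdomain $\omega$ split into $\omega_1$ and $\omega_2$; labels: $\gamma_\omega$, $\partial\gamma_\omega$, $\Gamma_{\omega,1}$, $\Gamma_{\omega,2}$, $\Gamma_\omega$.]
$I = I(\Omega)$ = set of all vertices of $\Omega$.
$I(\omega) = \{ i \in I : x_i \in \omega \}$.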
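
Discretisation
Let $\omega \in T_{\mathcal{T}_h}$. Denote $d_\omega := \big((f_i)_{i\in I(\omega)}, (g_i)_{i\in I(\partial\omega)}\big)$. Define the following discrete problem in variational form:
$$a_\omega(u_h, b_j) = (f_\omega, b_j)_{L^2(\omega)} \ \ \forall j \in I(\omega), \qquad u_h(x_j) = g_j \ \ \forall j \in I(\partial\omega), \qquad (6)$$
where
$$a(b_i, b_j) = \int_\Omega \alpha(x)(\nabla b_i, \nabla b_j)\,dx, \qquad (f, b_j) = \int_{\operatorname{supp} b_j} f\, b_j\,dx.$$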
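
1. Mapping $\Psi_\omega$
$\Psi_\omega(d_\omega) = \big((\Psi_\omega(d_\omega))_i\big)_{i\in I(\partial\omega)}$ with $(\Psi_\omega(d_\omega))_i = a_\omega(u_h, b_i) - (f_\omega, b_i)_{L^2(\omega)}$,
$\Psi_\omega d_\omega = \Psi^f_\omega f_\omega + \Psi^g_\omega g_\omega$.
2. Mapping $\Phi_\omega$
$(\Phi_\omega(d_\omega))_i := u_h(x_i)$, $\forall i \in I(\gamma_\omega)$.
Hence, $\Phi_\omega(d_\omega)$ is the trace of $u_h$ on $\gamma_\omega$.
The goal of HDD is to build the set of mappings $\Phi_0, \Phi_1, \Phi_2, \dots, \Phi_n$, which then produce the solution sequentially on $\gamma_{\omega_0}, \gamma_{\omega_1}, \gamma_{\omega_2}, \dots, \gamma_{\omega_n}$.
[Figure: subdomain $\omega$ with sons $\omega_1$, $\omega_2$ and nodal points $x_j$ on the interface $\gamma_\omega$.]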
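
Construction of the mappings $\Psi_\omega$ and $\Phi_\omega$
Let $\omega_1$ and $\omega_2$ be the two sons of $\omega \in T_{\mathcal{T}_h}$. Let $d_{\omega_1}$ and $d_{\omega_2}$ be the data associated with $\omega_1$ and $\omega_2$ s.t.:
• (consistency conditions for the Dirichlet data) $g_{1,i} = g_{2,i}$, $\forall i \in I(\omega_1) \cap I(\omega_2)$, (7)
• (consistency conditions for the right-hand side) $f_{1,i} = f_{2,i}$, $\forall i \in I(\omega_1) \cap I(\omega_2)$. (8)
Let $u_{\omega_1}$ and $u_{\omega_2}$ be the local FE solutions of problem (6) for the data $d_{\omega_1}$, $d_{\omega_2}$.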
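
If $u_{\omega_1}$, $u_{\omega_2}$ satisfy the Neumann condition
$$\gamma\Psi_{\omega_1}(d_{\omega_1}) + \gamma\Psi_{\omega_2}(d_{\omega_2}) = 0,$$
then $u_\omega$ defined by
$$u_\omega(x_i) := \begin{cases} u_{\omega_1}(x_i) & \text{for } i \in I(\omega_1), \\ u_{\omega_2}(x_i) & \text{for } i \in I(\omega_2) \end{cases} \qquad (9)$$
is the solution of (6) for the data $d_\omega := (f_\omega, g_\omega)$ given by
$$(f_\omega)_i := \begin{cases} f_{1,i} & \text{for } i \in I(\omega_1), \\ f_{2,i} & \text{for } i \in I(\omega_2), \end{cases} \qquad (g_\omega)_i := \begin{cases} g_{1,i} & \text{for } i \in I(\partial\omega_1), \\ g_{2,i} & \text{for } i \in I(\partial\omega_2). \end{cases}$$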
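
$$(\gamma\Psi^{\gamma}_{\omega_1} + \gamma\Psi^{\gamma}_{\omega_2})\, g_\gamma = -\Psi^{f}_{\omega_1} f_1 - \Psi^{\Gamma}_{\omega_1} g_{1,\Gamma} - \Psi^{f}_{\omega_2} f_2 - \Psi^{\Gamma}_{\omega_2} g_{2,\Gamma}.$$
We set
$$M := -(\gamma\Psi^{\gamma}_{\omega_1} + \gamma\Psi^{\gamma}_{\omega_2}),$$
compute $M^{-1}$ and solve for $g_\gamma$:
$$g_\gamma = M^{-1}\big(\Psi^{f}_{\omega_1} f_1 + \Psi^{\Gamma}_{\omega_1} g_{1,\Gamma} + \Psi^{f}_{\omega_2} f_2 + \Psi^{\Gamma}_{\omega_2} g_{2,\Gamma}\big).$$
For given mappings $\Psi_{\omega_1}$, $\Psi_{\omega_2}$, defined on the sons $\omega_1$, $\omega_2$, we can compute $\Phi_\omega$ and $\Psi_\omega$ for the father $\omega$. This recursion ends as soon as $\omega = \Omega$.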
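
Hierarchical Process
1. Leaves to Root
   1. Compute $\Psi_\omega$ on all leaves (3 × 3 matrices).
   2. Recursion from the leaves to the root:
      (a) Compute and store $\Phi_\omega$ and $\Psi_\omega$ from $\Psi_{\omega_1}$, $\Psi_{\omega_2}$.
      (b) Delete $\Psi_{\omega_1}$, $\Psi_{\omega_2}$.
2. Root to Leaves
   1. Given $d_\omega = (f_\omega, g_\omega)$, compute the solution $u_h$ on the interior boundary $\gamma_\omega$ by $\Phi_\omega(d_\omega)$.
   2. Build the data $d_{\omega_1} = (f_{\omega_1}, g_{\omega_1})$, $d_{\omega_2} = (f_{\omega_2}, g_{\omega_2})$ from $d_\omega = (f_\omega, g_\omega)$ and $g_\gamma := \Phi_\omega(d_\omega)$.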
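
Rank-k matrices
1. $R \in \mathbb{R}^{n\times m}$, $R = AB^T$, where $A \in \mathbb{R}^{n\times k}$, $B \in \mathbb{R}^{m\times k}$, $k \ll \min(n, m)$.
The storage for $A$ and $B$ is $k(n + m)$ instead of $n \cdot m$.
[Figure: $R$ ($n \times m$) $= A$ ($n \times k$) $\cdot\, B^T$ ($k \times m$).]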
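The storage count for a rank-k matrix can be checked numerically. The following sketch uses a truncated SVD to obtain the factors, which is one standard way to build them and is not claimed to be the construction used in HLIB. The kernel $1/|x - y|$ and the point sets are illustrative assumptions chosen so that the two clusters are well separated.

```python
import numpy as np

def low_rank_factors(R, k):
    """Best rank-k factors A (n x k), B (m x k) with R ~ A @ B.T, via truncated SVD."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = U[:, :k] * s[:k]      # absorb the singular values into A
    B = Vt[:k, :].T
    return A, B

n, m, k = 200, 150, 5
x = np.linspace(0.0, 1.0, n)                 # first cluster of points
y = np.linspace(2.0, 3.0, m)                 # second, well-separated cluster
R = 1.0 / np.abs(x[:, None] - y[None, :])    # smooth kernel block (illustrative choice)

A, B = low_rank_factors(R, k)
rel_err = np.linalg.norm(R - A @ B.T) / np.linalg.norm(R)

storage_low_rank = k * (n + m)   # 1750 numbers
storage_dense = n * m            # 30000 numbers
```

With these sizes the factors need 1750 entries instead of 30000, and `rel_err` reports how much accuracy the rank-5 approximation gives up on this particular block.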
H-matrices (Hackbusch ’98)
2. Grid → cluster tree ($T_I$) → block cluster tree ($T_{I\times J}$) + admissibility condition → admissible partitioning → H-matrix → H-matrix arithmetics.
[Figure: block partitioning of an H-matrix.]

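3. Let $I := I(\Omega)$, $t, s \in T_I$, $(t \times s) \in T_{I\times I}$.
Admissibility: $\max\{\operatorname{diam}(t), \operatorname{diam}(s)\} \le \eta \cdot \operatorname{dist}(t, s)$.
if (adm = true) then $M|_{t\times s}$ is a rank-k matrix block,
if (adm = false) then divide $M|_{t\times s}$ further or define it as a dense matrix block.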
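The admissibility test can be sketched for clusters given as point sets. This is a brute-force illustration with exact pairwise distances; the helper names `diam`, `dist` and `is_admissible` and the default $\eta = 1$ are assumptions for this example, and practical codes typically replace the exact quantities by cheaper estimates.

```python
import numpy as np

def diam(points):
    """Diameter of a cluster: largest pairwise Euclidean distance."""
    d = points[:, None, :] - points[None, :, :]
    return float(np.sqrt((d ** 2).sum(axis=-1)).max())

def dist(p, q):
    """Distance between two clusters: smallest pairwise Euclidean distance."""
    d = p[:, None, :] - q[None, :, :]
    return float(np.sqrt((d ** 2).sum(axis=-1)).min())

def is_admissible(t, s, eta=1.0):
    """Check max(diam(t), diam(s)) <= eta * dist(t, s)."""
    return max(diam(t), diam(s)) <= eta * dist(t, s)

# 16 equidistant points on [0, 1], split into four clusters of four points each.
x = np.linspace(0.0, 1.0, 16).reshape(-1, 1)
clusters = np.split(x, 4)

neighbours = is_admissible(clusters[0], clusters[1])   # adjacent clusters
separated = is_admissible(clusters[0], clusters[2])    # one cluster in between
```

Adjacent clusters fail the test (their distance is smaller than their diameter), so the corresponding block would be subdivided or stored densely, while the separated pair passes and would be stored as a rank-k block.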
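[Figure: clusters $t$ and $s$ (boxes $Q_t$, $Q_s$) and the distance dist between them; cluster tree that splits $I$ into $I_1$, $I_2$ and then into $I_{11}$, $I_{12}$, $I_{21}$, $I_{22}$.]
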
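Definition 0.1 $\mathcal{H}(T_{I\times J}, k) := \{ M \in \mathbb{R}^{I\times J} \mid \operatorname{rank}(M|_{t\times s}) \le k \text{ for all admissible leaves } t \times s \text{ of } T_{I\times J} \}$.
$n := \max(|I|, |J|, |K|)$.
Operation | Sequential complexity | Parallel complexity (R. Kriemann 2005)
building(M) | $N = O(n \log n)$ | $N/p + O(|V(T) \setminus L(T)|)$
storage(M) | $N = O(kn \log n)$ | $N$
$Mx$ | $N = O(kn \log n)$ | $N/p$
$\alpha M' \oplus \beta M''$ | $N = O(k^2 n \log n)$ | $N/p$
$\alpha M' \odot M'' \oplus \beta M$ | $N = O(k^2 n \log^2 n)$ | $N/p + O(C_{sp}(T)\,|V(T)|)$
$M^{-1}$ | $N = O(k^2 n \log^2 n)$ | $N/p + O(n\, n_{\min}^2)$
LU | $N = O(k^2 n \log^2 n)$ | $N$
H-LU | $N = O(k^2 n \log^2 n)$ | $N/p + O(k^2 n \log^2 n / n^{1/d})$
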
Application of H-matrices to HDD
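Let $\omega = \omega_1 \cup \omega_2$, $\gamma_\omega = \partial\omega_1 \setminus \partial\omega$.
Suppose $\Psi^g_{\omega_1}, \Psi^g_{\omega_2} \to \Psi^g_\omega =: A$ and $\Psi^f_{\omega_1}, \Psi^f_{\omega_2} \to \Psi^f_\omega =: F$.
$$\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix} b.$$
Eliminate internal nodal points:
$$\begin{pmatrix} A_{11} - A_{12}A_{22}^{-1}A_{21} & 0 \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} F_1 - A_{12}A_{22}^{-1}F_2 \\ F_2 \end{pmatrix} b.$$
$$\Psi^g_\omega x_1 := (A_{11} - A_{12}A_{22}^{-1}A_{21})\,x_1 = (F_1 - A_{12}A_{22}^{-1}F_2)\,b = \Psi^f_\omega b,$$
$$x_2 = A_{22}^{-1}F_2\, b - A_{22}^{-1}A_{21}\,x_1 =: \Phi^f_\omega b + \Phi^g_\omega x_1.$$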
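The elimination step above is a Schur complement and can be verified on a small dense example. The sketch below uses random test data (the block sizes and the SPD construction of `A` are assumptions made only so that $A_{22}$ is invertible) and dense inverses in place of H-matrix arithmetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, nb = 4, 6, 3          # sizes of x1 (kept), x2 (eliminated), and of the vector b
G = rng.standard_normal((n1 + n2, n1 + n2))
A = G @ G.T + (n1 + n2) * np.eye(n1 + n2)   # SPD test matrix, so A22 is invertible
F = rng.standard_normal((n1 + n2, nb))
b = rng.standard_normal(nb)

A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]
F1, F2 = F[:n1], F[n1:]

A22_inv = np.linalg.inv(A22)
Psi_g = A11 - A12 @ A22_inv @ A21    # Schur complement acting on x1
Psi_f = F1 - A12 @ A22_inv @ F2      # reduced right-hand side operator
Phi_f = A22_inv @ F2                 # contribution of b to the eliminated unknowns
Phi_g = -A22_inv @ A21               # contribution of x1 to the eliminated unknowns

# Reference: solve the full block system and compare with the reduced relations.
x = np.linalg.solve(A, F @ b)
x1, x2 = x[:n1], x[n1:]
res_schur = np.linalg.norm(Psi_g @ x1 - Psi_f @ b)
res_back = np.linalg.norm(x2 - (Phi_f @ b + Phi_g @ x1))
```

Both residuals are at rounding level, which confirms that the reduced system for $x_1$ and the back-substitution for $x_2$ reproduce the solution of the full system.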

[Figure: block structure of the H-matrix.]
Matrix $\Psi^g$ with the weak admissibility condition
[Figure: block structure of the H-matrix.]
Matrix $\Psi^f$ with the standard admissibility condition

Building $(\Psi^g_\omega)^{\mathcal{H}} \in \mathbb{R}^{512\times 512}$ from $(\Psi^g_{\omega_1})^{\mathcal{H}}$ and $(\Psi^g_{\omega_2})^{\mathcal{H}} \in \mathbb{R}^{384\times 384}$.
[Figure: block structures of $(\Psi^g_{\omega_1})^{\mathcal{H}}$ and $(\Psi^g_{\omega_2})^{\mathcal{H}}$, of the matrices $(\Psi^g_{\omega_1})^{\mathcal{H}}|_{I\times I}$ and $(\Psi^g_{\omega_2})^{\mathcal{H}}|_{I\times I}$, and of the result $(\Psi^g_\omega)^{\mathcal{H}} \in \mathcal{H}(T_{I\times I}, k)$.]
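
Complexity and storage
Mapping | Storage | Complexity
$\Psi^g$ | - | $O(k^3 \sqrt{n_h n_H} \log \sqrt{n_h n_H})$
$\Psi^f$ | - | $O(k^3 \sqrt{n_h n_H} \log^2 \sqrt{n_h n_H})$
$\Phi^g$ | $O(k \sqrt{n_h n_H})$ | $O(k^2 \sqrt{n_h n_H})$
$\Phi^f$ | $O(k^2 \sqrt{n_h n_H} \log^2 \sqrt{n_h n_H})$ | $O(k^3 \sqrt{n_h n_H} \log \sqrt{n_h n_H})$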
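
Prolongation of the right-hand side on the fine grid
$h \ll H$, $f_H \in V_H \subset V_h$ is given ⇒ build $f_h$.
The mappings $\Psi^f$, $\Phi^f$ can be compressed.
[Figure: matrix diagram with the blocks $\Phi^{f_H}_\omega$, $\Phi^{f_h}_\omega$ and $P^{h\leftarrow H}_\omega$; the block sizes are marked by $H$ and $h$.]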
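
Truncation of the small scales:
$$S(\Phi_\omega) = S(\Phi^g_\omega) + S(\Phi^f_\omega) = O(k^2 \sqrt{n_h n_H} \log \sqrt{n_h n_H}).$$
[Figure: domain $\Omega$ with mesh sizes $h$ and $H$; the tree $T_{\mathcal{T}_h}$ and its parts $T^{\ge H}_{\mathcal{T}_h}$ and $T^{<H}_{\mathcal{T}_h}$.]
Fig. 3 – Domain decomposition tree $T_{\mathcal{T}_h}$ and its parts.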

Numerical results

[Figure: (left) skin problem, (right) model of a cell; labels: lipid layer, $\alpha$, $\beta$; axis ticks 0, 0.25, 0.5, 0.75, 1.] [Khoromskij, Wittum 02]

$\alpha$ | $\|u_{cg} - u\|_2 / \|u_{cg}\|_2$ | $\|u_{cg} - u\|_\infty$ | $\|u_{cg} - u\|_A$
1.0 | 6.6 * 10^-9 | 7.1 * 10^-10 | 2.3 * 10^-7
10^-1 | 2.0 * 10^-8 | 1.4 * 10^-8 | 2.0 * 10^-6
10^-2 | 6.6 * 10^-8 | 2.6 * 10^-7 | 1.7 * 10^-5
10^-3 | 7.4 * 10^-7 | 1.8 * 10^-5 | 4.2 * 10^-4
10^-4 | 4.2 * 10^-6 | 1.8 * 10^-3 | 1.4 * 10^-2
10^-5 | 7.0 * 10^-5 | 2.3 * 10^-1 | 9.0 * 10^-1
Tab. 1 – Dependence of the absolute and relative errors on $\alpha$. $129^2$ dofs, $\varepsilon_k = 10^{-8}$, $\beta = 1.0$, residual $\|Au - f\| = 10^{-10}$, $\|A\| = 1.22 \cdot 10^5$.
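
$\varepsilon$ | $\|u_{cg} - u\|_2 / \|u_{cg}\|_2$ | $\|u_{cg} - u\|_\infty$ | $\|u_{cg} - u\|_A$
10^-6 | 4.4 * 10^-1 | 6.67 * 10^2 | 1.1 * 10^3
10^-8 | 7.27 * 10^-5 | 2.3 * 10^-1 | 9.0 * 10^-1
10^-10 | 5.1 * 10^-7 | 1.0 * 10^-3 | 3.0 * 10^-3
10^-12 | 3.9 * 10^-9 | 1.2 * 10^-5 | 2.9 * 10^-5
10^-14 | 1.2 * 10^-11 | 6.6 * 10^-7 | 1.2 * 10^-7
10^-16 | 1.6 * 10^-12 | 1.1 * 10^-8 | 1.7 * 10^-8
Tab. 2 – Dependence of the absolute and relative errors on $\varepsilon_k$. $129^2$ dofs, $\alpha = 10^{-5}$, $\beta = 1.0$, residual $\|Au - f\| = 10^{-10}$, $\|A\|_2 = 1.22 \cdot 10^5$.
$\varepsilon$ controls the accuracy of the H-matrix approximation: the rank $k$ of a block is chosen such that $\sigma_k \le \varepsilon \sigma_1$.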
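A relative truncation criterion of this kind can be sketched with a dense SVD. The convention used below (keep every singular value with $\sigma_i > \varepsilon\sigma_1$, so that the first discarded one satisfies $\sigma_{k+1} \le \varepsilon\sigma_1$) is one common choice and an assumption here; the function name `truncate_relative` and the test kernel are hypothetical as well.

```python
import numpy as np

def truncate_relative(M, eps):
    """Keep singular values sigma_i > eps * sigma_1; return A, B with M ~ A @ B.T and the rank k."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    k = max(1, int(np.sum(s > eps * s[0])))
    return U[:, :k] * s[:k], Vt[:k, :].T, k

# Smooth kernel block on two separated point sets (illustrative data).
x = np.linspace(0.0, 1.0, 100)
y = np.linspace(2.0, 3.0, 80)
M = 1.0 / (1.0 + np.abs(x[:, None] - y[None, :]))
sigma1 = np.linalg.norm(M, 2)

results = {}
for eps in (1e-4, 1e-8):
    A, B, k = truncate_relative(M, eps)
    err = np.linalg.norm(M - A @ B.T, 2)   # spectral error equals the first discarded singular value
    results[eps] = (k, err)
```

A smaller $\varepsilon$ yields a larger rank and a smaller spectral error, which is the same accuracy versus cost trade-off that the error tables report for the full solver.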

dofs | $\Phi^g$, $\Phi^f$, $h$, KB | $\Phi^g$, $\Phi^f$, $H = 0.5$, KB | $\Phi^g$, $\Phi^f$, $H = 0.125$, KB
33^2 | 2.45 * 10^2, 4 * 10^2 | 9.1 * 10, 1.7 * 10^2 | 2 * 10^2, 2.8 * 10^2
65^2 | 1.1 * 10^3, 2.4 * 10^3 | 2.9 * 10^2, 1.2 * 10^3 | 7.9 * 10^2, 1.8 * 10^3
129^2 | 5 * 10^3, 1.4 * 10^4 | 6.8 * 10^2, 8 * 10^3 | 2.6 * 10^3, 1.2 * 10^4
256^2 | 2.1 * 10^4, 7.86 * 10^4 | 1.4 * 10^3, 4.1 * 10^4 | 7.4 * 10^3, 6.9 * 10^4
Tab. 3 – Dependence of the memory requirements for $\Phi^g$ and $\Phi^f$ on the number of dofs and the size of the interface, $f = 4$, $n_{\min} = 12$, $u = x^2 + y^2$ and $k = 7$.

Storage
$\varepsilon$ | $LL^T$ (MB) | HDD (MB) | $(A^{-1})^{\mathcal{H}}$ (MB)
10^-3 | 13.3 | 19.7 | 51.0
10^-4 | 14.7 | 20.1 | 64.0
10^-5 | 16.0 | 20.4 | 75.2
10^-6 | 17.2 | 20.6 | 87.4
Tab. 4 – Dependence of the memory requirements on $\varepsilon$, $129^2$ dofs.

dofs | HDD | pre, $LL^T$, cg | Inv(A) | pre, $LL^T$
33^2 | 0.19 | 0.1 = 0.03 + 0.02 + 0.04 | 0.24 | 0.11 = 0.03 + 0.08
65^2 | 0.96 | 0.6 = 0.2 + 0.1 + 0.26 | 3.54 | 0.5 = 0.2 + 0.3
129^2 | 5.6 | 5 = 2.6 + 0.6 + 1.8 | 65.8 | 4.7 = 2.7 + 2.0
257^2 | 36.1 | 53 = 38.0 + 3.4 + 11.4 | - | 50 = 38.2 + 11.7
512^2 | 218 | - | - | -
Tab. 5 – Comparison of times for the skin problem with $\alpha(x, y) = 10^{-5}$, $\varepsilon = \varepsilon_{cg} = 10^{-8}$.

Oscillatory coefficients
global $k$ | $\|u_{40} - u_k\|_2 / \|u_{40}\|_2$ | $\|u_{40} - u_k\|_\infty$
2 | 7 | 7 * 10^-2
4 | 2 * 10^-2 | 1.8 * 10^-3
6 | 5.4 * 10^-4 | 4.5 * 10^-5
8 | 6.6 * 10^-5 | 6.3 * 10^-6
10 | 7.6 * 10^-6 | 9 * 10^-7
Tab. 6 – $\alpha(x, y) = 1 + 0.5 \sin(50x) \sin(50y)$
$\omega$ | $\|u_{40} - u_k\|_2 / \|u_{40}\|_2$ | $\|u_{40} - u_k\|_\infty$
10 | 1.65 * 10^-4 | 1.76 * 10^-5
50 | 1.8 * 10^-4 | 1.9 * 10^-5
450 | 7.7 * 10^-4 | 10^-4
Tab. 7 – $257^2$ dofs, $f = 1$, $\alpha(x, y) = 1 + 0.5 \sin(\omega x) \sin(\omega y)$.
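
[Figure: unit square $\Omega$ with regions of coefficients $\alpha$ and $\beta$; axis ticks 0.1, 0.2, 0.8, 0.9 in both directions.]
Fig. 4 – Domain $\Omega = (0, 1)^2$ with jumping coefficients $\alpha$ and $\beta$.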

$\varepsilon$ | $\|Au - f\|_2$ | cg ; HDD (sec) | $\|u_{cg} - u\|_2 / \|u\|_2$ | $\|u_{cg} - u\|_\infty$
10^-4 | 2 * 10^-4 | 5.3 ; 8.9 | 6.7 * 10^-1 | 1.4
10^-6 | 4.8 * 10^-7 | 5.0 ; 10.1 | 1.8 * 10^-4 | 9.5 * 10^-4
10^-8 | 1.4 * 10^-8 | 5.7 ; 11.5 | 1.1 * 10^-6 | 1.48 * 10^-5
10^-10 | 1.45 * 10^-8 | 6.7 ; 12.3 | 5.3 * 10^-7 | 10^-5
10^-12 | 1.2 * 10^-8 | 7.4 ; 13.5 | 5.2 * 10^-7 | 10^-5
Tab. 8 – Dependence of the relative and absolute errors on $\varepsilon$; $u$ is the HDD solution computed with accuracy $\varepsilon$, $\alpha = 10$, $\beta = 0.01$, $129^2$ dofs.

Properties of HDD:
1. HDD computes $u_h := B_h f_h + C_h g_h$ or $u_h := B_H f_H + C_h g_h$.
2. $B_h$, $B_H$ and $C_h$ have H-matrix format.
3. The complexities are $O(k^2 n_h \log^3 n_h)$ and $O(k^2 \sqrt{n_h n_H} \log^3 \sqrt{n_h n_H})$.
4. The storage requirements are $O(k n_h \log^2 n_h)$ and $O(k \sqrt{n_h n_H} \log^2 \sqrt{n_h n_H})$.
5. HDD computes functionals of the solution:
   (a) Neumann data $\partial u_h / \partial n$ at the boundary,
   (b) mean values $\int_\omega u_h\,dx$, $\omega \subset \Omega$, the solution at a point, the solution in a small subdomain $\omega$,
   (c) flux $\int_C \nabla u \cdot \vec{n}\,dx$, where $C$ is a curve in $\Omega$.
6. HDD for multiple right-hand sides and multiple Dirichlet
data.
7. HDD can easily be parallelized.
8. Problems with repeated patterns.

Thank you for your attention!