Introduction to Parallel FEM

Kengo Nakajima
Information Technology Center
The University of Tokyo
• Introduction to MPI
• Road to Parallel FEM
  – Local Data Structure
• MPI for Parallel FEM
  – Collective Communication
  – Point-to-Point (Peer-to-Peer) Communication
What is MPI? (1/2)

• Message Passing Interface
• "Specification" of a message-passing API for distributed-memory environments
  – Not a program, not a library
  – http://phase.hpcc.jp/phase/mpi-j/ml/mpi-j-html/contents.html
• History
  – 1992: MPI Forum
  – 1994: MPI-1
  – 1997: MPI-2; MPI-3 is soon available
• Implementations
  – mpich: ANL (Argonne National Laboratory)
  – OpenMPI, LAM
  – H/W vendors
  – C/C++, FORTRAN, Java; Unix, Linux, Windows, Mac OS
What is MPI? (2/2)

• "mpich" (free) is widely used
  – supports the MPI-2 spec. (partially)
  – MPICH2 after Nov. 2005
  – http://www-unix.mcs.anl.gov/mpi/
• Why is MPI so widely used as a de facto standard?
  – Uniform interface through the MPI Forum
    • Portable: can work on any type of computer
    • Can be called from Fortran, C, etc.
  – mpich
    • free, supports every architecture
• PVM (Parallel Virtual Machine) was also proposed in the early 90's, but is not as widely used as MPI
References

• W. Gropp et al., Using MPI, second edition, MIT Press, 1999.
• M.J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill, 2003.
• W. Gropp et al., MPI: The Complete Reference Vol. I, II, MIT Press, 1998.
• http://www-unix.mcs.anl.gov/mpi/www/
  – API (Application Interface) of MPI
How to learn MPI (1/2)

• Grammar
  – 10-20 functions of MPI-1 will be taught in this class
    • although there are many convenient capabilities in MPI-2
  – If you need further information, you can find it on the web, in books, and from MPI experts.
• Practice is important
  – Programming
  – "Running the codes" is the most important part
• Be familiar with, or "grab", the idea of SPMD/SIMD operations
  – Single Program/Instruction Multiple Data
  – Each process does the same operation for different data
    • Large-scale data is decomposed, and each part is computed by each process
  – Global/Local Data, Global/Local Numbering
SPMD

(figure: PE #0 runs the Program on Data #0, PE #1 on Data #1, PE #2 on Data #2, ..., PE #M-1 on Data #M-1)

mpirun -np M <Program>

You understand 90% of MPI if you understand this figure.

PE: Processing Element (Processor, Domain, Process)

Each process does the same operation for different data.
Large-scale data is decomposed, and each part is computed by each process.
Ideally, a parallel program is no different from a serial one, except for communication.
Some Technical Terms

• Processor, Core
  – Processing unit (H/W); Processor = Core for single-core processors
• Process
  – Unit of MPI computation, nearly equal to "core"
  – Each core (or processor) can host multiple processes (but this is not efficient)
• PE (Processing Element)
  – PE originally meant "processor", but it is sometimes used to mean "process" in this class. Moreover, it can mean "domain" (next item).
    • In multicore processors, PE generally means "core"
• Domain
  – domain = process (= PE); each of the "MD" in "SPMD"; each data set
• The process ID of MPI (ID of PE, ID of domain) starts from "0"
  – if you have 8 processes (PE's, domains), the IDs are 0-7
How to learn MPI (2/2)

• NOT so difficult.
• Therefore, 2-3 lectures are enough for just learning the grammar of MPI.
• Grab the idea of SPMD!
Actually, MPI is not enough ...

• Multicore Clusters
• Heterogeneous Clusters (CPU+GPU, CPU+Manycore)
• MPI + X (+ Y)
  – X: OpenMP, Pthreads; OpenACC; CUDA, OpenCL
  – Y: Vectorization
First Example

hello.f:

implicit REAL*8 (A-H,O-Z)
include 'mpif.h'
integer :: PETOT, my_rank, ierr

call MPI_INIT (ierr)
call MPI_COMM_SIZE (MPI_COMM_WORLD, PETOT, ierr )
call MPI_COMM_RANK (MPI_COMM_WORLD, my_rank, ierr )

write (*,'(a,2i8)') 'Hello World FORTRAN', my_rank, PETOT

call MPI_FINALIZE (ierr)
stop
end

hello.c:

#include "mpi.h"
#include <stdio.h>
int main(int argc, char **argv)
{
  int n, myid, numprocs, i;
  MPI_Init(&argc,&argv);
  MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD,&myid);
  printf ("Hello World %d\n", myid);
  MPI_Finalize();
}
Basic/Essential Functions

(Same hello.f / hello.c program as in the First Example.)

• 'mpif.h', "mpi.h": essential include files ("use mpi" is also possible in F90)
• MPI_Init: initialization
• MPI_Comm_size: number of MPI processes (as given by mpirun -np XX <prog>)
• MPI_Comm_rank: process ID, starting from 0
• MPI_Finalize: termination of MPI processes
mpi.h, mpif.h

• These headers define various parameters and variables for MPI, and their initial values.
• The name of each variable starts with "MPI_".
• The values of these parameters and variables cannot be changed by users.
• Users must not declare variables starting with "MPI_" in their own programs.
MPI_Init

• Initializes the MPI execution environment (required)
• It is recommended to put this BEFORE all executable statements in the program.

C:
  MPI_Init(argc, argv)
MPI_Finalize

• Terminates the MPI execution environment (required)
• It is recommended to put this AFTER all executable statements in the program.
• Please do not forget this.

C:
  MPI_Finalize()
MPI_Comm_size

• Determines the size of the group associated with a communicator
• Not required, but a very convenient function

C:
  MPI_Comm_size (comm, size)
    – comm   MPI_Comm   I   communicator
    – size   int        O   number of processes in the group of the communicator
What is a Communicator?

• A group of processes for communication
• A communicator must be specified in an MPI program as the unit of communication
• All processes belong to a default group named "MPI_COMM_WORLD"
• Multiple communicators can be created, and complicated operations are possible
  – e.g. Computation, Visualization
• Only "MPI_COMM_WORLD" is needed in this class.

  MPI_Comm_size (MPI_COMM_WORLD, PETOT)
MPI_COMM_WORLD: the default communicator in MPI

One process can belong to multiple communicators.
(figure: overlapping communicators COMM_MANTLE, COMM_CRUST, COMM_VIS)
Target Application

• Coupling between "Ground Motion" and "Sloshing of Tanks for Oil Storage"
  – "One-way" coupling from "Ground Motion" to "Tanks"
  – Displacement of the ground surface is given as forced displacement of the bottom surface of the tanks
  – 1 Tank = 1 PE (serial)

Deformation of the ground surface is given as boundary conditions at the bottom of the tanks.

2003 Tokachi Earthquake (M8.0): fire accident of oil tanks due to long-period ground motion (surface waves) developed in the basin of Tomakomai
Simulation Codes

• Ground Motion (Ichimura): Fortran
  – Parallel FEM, 3D Elastic/Dynamic
    • Explicit forward Euler scheme
  – Each element: 2m × 2m × 2m cube
  – 240m × 240m × 100m region
• Sloshing of Tanks (Nagashima): C
  – Serial FEM (Embarrassingly Parallel)
    • Implicit backward Euler, Skyline method
    • Shell elements + inviscid potential flow
  – D: 42.7m, H: 24.9m, T: 20mm
  – Frequency: 7.6 sec.
  – 80 elements in circumference, 0.6m mesh in height
  – Tank-to-tank spacing: 60m, 4 × 4 layout
• Total number of unknowns: 2,918,169
Three Communicators

meshGLOBAL%MPI_COMM: basement #0-#3 + tank #0-#8
meshBASE%MPI_COMM:   basement #0-#3
meshTANK%MPI_COMM:   tank #0-#8

On basement processes: meshGLOBAL%my_rank = 0~3,  meshBASE%my_rank = 0~3,  meshTANK%my_rank = -1
On tank processes:     meshGLOBAL%my_rank = 4~12, meshTANK%my_rank = 0~8,  meshBASE%my_rank = -1
MPI_Comm_rank

• Determines the rank of the calling process in the communicator
  – The "ID of an MPI process" is sometimes called its "rank"

C:
  MPI_Comm_rank (comm, rank)
    – comm   MPI_Comm   I   communicator
    – rank   int        O   rank of the calling process in the group of comm, starting from "0"
• Introduction to MPI
• Road to Parallel FEM
  – Local Data Structure
• MPI for Parallel FEM
  – Collective Communication
  – Point-to-Point (Peer-to-Peer) Communication
Motivation for Parallel Computing

• Large-scale parallel computers enable fast computing in large-scale scientific simulations with detailed models. Computational science develops new frontiers of science and engineering.
• Why parallel computing?
  – faster
  – larger
  – "larger" is more important from the viewpoint of "new frontiers of science & engineering", but "faster" is also important.
  – + more complicated
  – Ideal: scalable
    • Solving an Nx-scale problem using Nx computational resources in the same computation time.
What is Parallel Computing?

... to solve larger problems faster
  – finer meshes provide more accurate solutions

(figure: homogeneous vs. heterogeneous porous media; Lawrence Livermore National Laboratory)

Very fine meshes are required for simulations of heterogeneous fields.
What is Parallel Computing? (cont.)

• A PC with 1 GB memory: 1M meshes are the limit for FEM
  – Southwest Japan, (1000 km)^3 in a 1 km mesh -> 10^9 meshes
• Large Data -> Domain Decomposition -> Local Operation
• Inter-Domain Communication for Global Operation

(figure: Large-Scale Data -> Partitioning -> many Local Data sets + Communication)
What is Communication?

• Parallel Computing -> Local Operations
• Communications are required in Global Operations for Consistency.
Finite Element Procedures

• Initialization
  – Control Data
  – Nodes, Connectivity of Elements (N: Node#, NE: Elem#)
  – Initialization of Arrays (Global/Element Matrices)
  – Element-Global Matrix Mapping (Index, Item)
• Generation of Matrix: Local Op.: good for parallel computing
  – Element-by-Element Operations (do icel= 1, NE)
    • Element matrices
    • Accumulation to global matrix
• Boundary Conditions
• Linear Solver: Global Op.: Communication needed
  – Conjugate Gradient Method
• Calculation of Stress
Operations in Parallel FEM
SPMD: Single-Program Multiple-Data

Large-scale data -> partitioned into distributed local data sets.

Each local data set -> FEM matrix: the FEM code can assemble the coefficient matrix for each local data set; this part can be completely local, the same as serial operations.

Linear solvers (+ MPI): global operations & communications happen only in the linear solvers
  – dot products, matrix-vector multiply, preconditioning
Parallel FEM Procedures

• Design of the "Local Data Structure" is important
  – for the SPMD-type operations on the previous page
• Matrix Generation
• Preconditioned Iterative Solvers for Linear Equations
Bi-Linear Square Elements
Values are defined on each node

(figure: a small mesh of bi-linear square elements, divided into two domains)

• Local information is not enough for matrix assembly.
• Divide into two domains in a "node-based" manner, where the numbers of "nodes (vertices)" are balanced.
• Information on overlapped elements and connected nodes is required for matrix assembly on boundary nodes.
Local Data of Parallel FEM

• Node-based partitioning, for IC/ILU-type preconditioning methods
• Local data includes information for:
  – Nodes originally assigned to the partition/PE
  – Elements which include those nodes: element-based operations (matrix assembly) are then possible for fluid/structure subsystems
  – All nodes which form those elements but are outside the partition
• Nodes are classified into the following 3 categories from the viewpoint of message passing:
  – Internal nodes: nodes originally assigned to the partition
  – External nodes: nodes in the overlapped elements but outside the partition
  – Boundary nodes: internal nodes which are external nodes of other partitions
• Communication tables between partitions
• NO global information is required, except partition-to-partition connectivity
Node-based Partitioning
internal nodes - elements - external nodes

(figure: a 5x5-node mesh, 25 nodes in global numbering, partitioned into PE#0-PE#3; each PE carries its own local node numbering)
Elements which include Internal Nodes
Node-based Partitioning: internal nodes - elements - external nodes

(figure: one partition, showing its internal nodes, the elements which include them, and the external nodes in the overlapped region)

• Partitioned nodes themselves (internal nodes)
• External nodes: included in the elements in the overlapped region among partitions
• Info on external nodes is required for completely local element-based operations on each processor.
Elements which include Internal Nodes (cont.)

With the info on external nodes available locally, we do not need communication during matrix assembly!!
Parallel Computing in FEM
SPMD: Single-Program Multiple-Data

(figure, built up over several slides: each of the four processes runs the same FEM code on its own Local Data set; only the Linear Solvers exchange data with the neighboring processes via MPI)
1D FEM: 12 nodes / 11 elements / 3 domains

(figure: global numbering of the 12 nodes (0-11) and 11 elements (0-10), and the mesh split into 3 domains)
1D FEM: 12 nodes / 11 elements / 3 domains

(figure: global node numbering 0-11 and element numbering 0-10)
# of "Internal Nodes" should be balanced

(figure: internal nodes assigned as domain #0: nodes 0-3, domain #1: nodes 4-7, domain #2: nodes 8-11)
Matrices are incomplete!

(figure: with only internal nodes, the local matrices of domains #0, #1, #2 are incomplete in the rows next to the domain boundaries)
Connected Elements + External Nodes

(figure: domain #0 keeps nodes 0-4, domain #1 keeps nodes 3-8, domain #2 keeps nodes 7-11; each domain also stores the elements connected to its boundary and the external nodes they contain)
1D FEM: 12 nodes / 11 elements / 3 domains

(figure: with the overlapped elements and external nodes added, the local matrix row of every internal node is complete)
1D FEM: 12 nodes / 11 elements / 3 domains

(figure: the three local data sets in global numbering: #0 = nodes 0-4, #1 = nodes 3-8, #2 = nodes 7-11, each with its connected elements)
Local Numbering for SPMD

Numbering of internal nodes is 0-(N-1) (or 1-N), so the same operations as in a serial program can be applied. How about the numbering of external nodes?

(figure: each domain numbers its internal nodes 0-3; the external nodes are marked "?")
Local Numbering for SPMD

Numbering of external nodes: N+1, N+2, ... (N, N+1, ... with 0-based numbering)

(figure: domain #0: internal nodes 0-3, external node 4; domain #1: internal 0-3, external 4 and 5; domain #2: internal 0-3, external 4)
• Introduction to MPI
• Road to Parallel FEM
  – Local Data Structure
• MPI for Parallel FEM
  – Collective Communication
  – Point-to-Point (Peer-to-Peer) Communication
How to "Parallelize" Iterative Solvers?

Parallel procedures are required in:
• Dot products
• Matrix-vector multiplication

e.g. CG method (with no preconditioning):

Compute r(0)= b-[A]x(0)
for i= 1, 2, ...
  z(i-1)= r(i-1)
  rho(i-1)= r(i-1)z(i-1)
  if i=1
    p(1)= z(0)
  else
    beta(i-1)= rho(i-1)/rho(i-2)
    p(i)= z(i-1) + beta(i-1)p(i-1)
  endif
  q(i)= [A]p(i)
  alpha(i)= rho(i-1)/p(i)q(i)
  x(i)= x(i-1) + alpha(i)p(i)
  r(i)= r(i-1) - alpha(i)q(i)
  check convergence |r|
end
Preconditioning, DAXPY
Local operations on internal points only: parallel processing is possible

/*
//-- {x}= {x} + ALPHA*{p}
//   {r}= {r} - ALPHA*{q}
*/
for(i=0;i<N;i++){
  PHI[i] += Alpha * W[P][i];
  W[R][i] -= Alpha * W[Q][i];
}

/*
//-- {z}= [Minv]{r}
*/
for(i=0;i<N;i++){
  W[Z][i] = W[DD][i] * W[R][i];
}
Dot Products
Global summation needed: communication?

/*
//-- ALPHA= RHO / {p}{q}
*/
C1 = 0.0;
for(i=0;i<N;i++){
  C1 += W[P][i] * W[Q][i];
}
Alpha = Rho / C1;
MPI_Reduce

• Reduces values on all processes to a single value on the root process
  – Summation, Product, Max, Min etc.

C:
  MPI_Reduce (sendbuf,recvbuf,count,datatype,op,root,comm)
    – sendbuf   choice        I  starting address of send buffer
    – recvbuf   choice        O  starting address of receive buffer (type is defined by "datatype")
    – count     int           I  number of elements in send/receive buffer
    – datatype  MPI_Datatype  I  data type of elements of send/receive buffer
        FORTRAN: MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_CHARACTER etc.
        C: MPI_INT, MPI_FLOAT, MPI_DOUBLE, MPI_CHAR etc.
    – op        MPI_Op        I  reduce operation
        MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_BAND etc.
        Users can define operations with MPI_OP_CREATE
    – root      int           I  rank of root process
    – comm      MPI_Comm      I  communicator

(figure: Reduce. P#0-P#3 each hold A, B, C, D; after the call, P#0 holds op.A0-A3, op.B0-B3, op.C0-C3, op.D0-D3)
MPI_Bcast

• Broadcasts a message from the process with rank "root" to all other processes of the communicator

C:
  MPI_Bcast (buffer,count,datatype,root,comm)
    – buffer    choice        I/O  starting address of buffer (type is defined by "datatype")
    – count     int           I    number of elements in send/receive buffer
    – datatype  MPI_Datatype  I    data type of elements of send/receive buffer
        FORTRAN: MPI_INTEGER, MPI_REAL, MPI_DOUBLE_PRECISION, MPI_CHARACTER etc.
        C: MPI_INT, MPI_FLOAT, MPI_DOUBLE, MPI_CHAR etc.
    – root      int           I    rank of root process
    – comm      MPI_Comm      I    communicator

(figure: Broadcast. P#0 holds A0, B0, C0, D0; after the call, all of P#0-P#3 hold A0, B0, C0, D0)
MPI_Allreduce

• MPI_Reduce + MPI_Bcast
• Summation (of dot products) and MAX/MIN values are likely to be used in every process

C:
  MPI_Allreduce (sendbuf,recvbuf,count,datatype,op,comm)
    – sendbuf   choice        I  starting address of send buffer
    – recvbuf   choice        O  starting address of receive buffer (type is defined by "datatype")
    – count     int           I  number of elements in send/receive buffer
    – datatype  MPI_Datatype  I  data type of elements of send/receive buffer
    – op        MPI_Op        I  reduce operation
    – comm      MPI_Comm      I  communicator

(figure: Allreduce. After the call, every one of P#0-P#3 holds op.A0-A3, op.B0-B3, op.C0-C3, op.D0-D3)
"op" of MPI_Reduce/Allreduce

  MPI_Reduce (sendbuf,recvbuf,count,datatype,op,root,comm)

• MPI_MAX, MPI_MIN     Max, Min
• MPI_SUM, MPI_PROD    Summation, Product
• MPI_LAND             Logical AND
• Introduction to MPI
• Road to Parallel FEM
  – Local Data Structure
• MPI for Parallel FEM
  – Collective Communication
  – Point-to-Point (Peer-to-Peer) Communication
Matrix-Vector Products
Values at external points: P-to-P communication

/*
//-- {q}= [A]{p}
*/
for(i=0;i<N;i++){
  W[Q][i] = Diag[i] * W[P][i];
  for(j=Index[i];j<Index[i+1];j++){
    W[Q][i] += AMat[j]*W[P][Item[j]];
  }
}
Mat-Vec Products: Local Op. Possible

(figures, over several slides: the global 12x12 matrix-vector product is split into three 4-row blocks; each block's product needs its own internal entries of {p} plus a few external entries owned by the neighboring blocks)
1D FEM: 12 nodes / 11 elements / 3 domains

(figure: the global numbering and the three local data sets with their overlapped elements)
1D FEM: 12 nodes / 11 elements / 3 domains
Local ID: starting from 0 for nodes and elements in each domain

(figure: domains #0, #1, #2 with local node and element numbering)
1D FEM: 12 nodes / 11 elements / 3 domains
Internal/External Nodes

(figure: in each domain, internal nodes are numbered 0-3; the remaining local numbers are external nodes)
What is Peer-to-Peer Communication?

• Collective Communication
  – MPI_Reduce, MPI_Scatter/Gather etc.
  – Communication with all processes in the communicator
  – Application area
    • BEM, Spectral Method, MD: global interactions are considered
    • Dot products, MAX/MIN: global summation & comparison
• Peer-to-Peer / Point-to-Point
  – MPI_Send, MPI_Recv
  – Communication with a limited number of processes (neighbors)
  – Application area
    • FEM, FDM: localized methods

(figure: the three 1D FEM domains; each domain only communicates with its neighbors)
MPI_Isend

• Begins a non-blocking send
  – Sends the contents of the sending buffer (starting from sendbuf, number of elements: count) to dest, with tag.
  – The contents of the sending buffer cannot be modified before calling the corresponding MPI_Waitall.

C:
  MPI_Isend (sendbuf,count,datatype,dest,tag,comm,request)
    – sendbuf   choice        I  starting address of sending buffer
    – count     int           I  number of elements in sending buffer
    – datatype  MPI_Datatype  I  data type of each sending buffer element
    – dest      int           I  rank of destination
    – tag       int           I  message tag. This integer can be used by the application to distinguish messages. Communication occurs if the tags of MPI_Isend and MPI_Irecv match. Usually tag is set to "0" (in this class).
    – comm      MPI_Comm      I  communicator
    – request   MPI_Request   O  communication request array used in MPI_Waitall
SEND: sending from boundary nodes

Send continuous data to the send buffers of the neighbors.

(figure: partitions PE#0-PE#3; each PE copies the values on its boundary nodes into send buffers, one per neighbor)
Communication Request: request

  MPI_Isend (sendbuf,count,datatype,dest,tag,comm,request)
    – request   MPI_Request   O  communication request array used in MPI_Waitall

• The size of the request array is the total number of neighboring processes.
• Just define the array.
RECV: receiving to external nodes

Receive continuous data into the receive buffers from the neighbors.

  MPI_Irecv (recvbuf,count,datatype,source,tag,comm,request)
    – recvbuf   choice        I  starting address of receiving buffer
    – count     int           I  number of elements in receiving buffer
    – datatype  MPI_Datatype  I  data type of each receiving buffer element
    – source    int           I  rank of source

(figure: each PE receives the values for its external nodes from the neighboring PEs)
MPI_Irecv

• Begins a non-blocking receive
  – Receives into the receiving buffer (starting from recvbuf, number of elements: count) from source, with tag.
  – The contents of the receiving buffer cannot be used before calling the corresponding MPI_Waitall.

C:
  MPI_Irecv (recvbuf,count,datatype,source,tag,comm,request)
    – recvbuf   choice        I  starting address of receiving buffer
    – count     int           I  number of elements in receiving buffer
    – datatype  MPI_Datatype  I  data type of each receiving buffer element
    – source    int           I  rank of source
    – tag       int           I  message tag. Communication occurs if the tags of MPI_Isend and MPI_Irecv match. Usually tag is set to "0" (in this class).
    – comm      MPI_Comm      I  communicator
    – request   MPI_Request   O  communication request array used in MPI_Waitall
MPI_Waitall
• MPI_Waitall blocks until all communications associated with request in the array complete. It is used for synchronizing MPI_Isend and MPI_Irecv in this class.
• At the sending phase, contents of the sending buffer cannot be modified before calling the corresponding MPI_Waitall. At the receiving phase, contents of the receiving buffer cannot be used before calling the corresponding MPI_Waitall.
• MPI_Isend and MPI_Irecv can be synchronized simultaneously with a single MPI_Waitall if it is consistent.
– The same request array should be used in MPI_Isend and MPI_Irecv.
• Its operation is similar to that of MPI_Barrier, but MPI_Waitall cannot be replaced by MPI_Barrier.
– Possible troubles using MPI_Barrier instead of MPI_Waitall: contents of request and status are not updated properly, very slow operations, etc.
• MPI_Waitall (count,request,status)
– count int I number of processes to be synchronized
– request MPI_Request I/O comm. request used in MPI_Waitall (array size: count)
– status MPI_Status O array of status objects
  MPI_STATUS_SIZE: defined in ‘mpif.h’, ‘mpi.h’
Array of status objects: status
• MPI_Waitall (count,request,status)
– count int I number of processes to be synchronized
– request MPI_Request I/O comm. request used in MPI_Waitall (array size: count)
– status MPI_Status O array of status objects
  MPI_STATUS_SIZE: defined in ‘mpif.h’, ‘mpi.h’
• Just define the array
Node-based Partitioning
internal nodes - elements - external nodes
[Figure: node-based partitioning of a 5×5 mesh into four subdomains PE#0–PE#3, shown with local numbering per PE and with global numbering 1–25]
Description of Distributed Local Data
• Internal/External Nodes
– Numbering: starting from internal pts, then external pts after that
• Neighbors
– Share overlapped elements
– Number and ID of neighbors
• External Nodes
– From where, how many, and which external points are received/imported?
• Boundary Nodes
– To where, how many, and which boundary points are sent/exported?
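The items above can be collected into one local data structure per process. The following is a minimal sketch in C; the container name LocalMesh and the sample values are illustrative (not part of the lecture's actual data format), but the index arrays follow the CSR-like convention used throughout this lecture: entries for the neib-th neighbor occupy positions index[neib]..index[neib+1]-1 of the corresponding item array.

```c
#include <assert.h>

/* Hypothetical container for the distributed local data described above. */
typedef struct {
  int N;                   /* internal nodes (numbered first)            */
  int NP;                  /* all local nodes: internal + external       */
  int NeibPETot;           /* number of neighboring processes            */
  const int *NeibPE;       /* ranks of the neighbors                     */
  const int *import_index; /* NeibPETot+1 offsets into import_item       */
  const int *import_item;  /* local IDs of external (imported) nodes     */
  const int *export_index; /* NeibPETot+1 offsets into export_item       */
  const int *export_item;  /* local IDs of boundary (exported) nodes     */
} LocalMesh;

/* how many values are received from / sent to the neib-th neighbor */
static int import_length(const LocalMesh *m, int neib) {
  return m->import_index[neib+1] - m->import_index[neib];
}
static int export_length(const LocalMesh *m, int neib) {
  return m->export_index[neib+1] - m->export_index[neib];
}

/* illustrative instance with two neighbors (values are made up) */
static int demo_lengths(void) {
  const int neibpe[]  = {1, 3};
  const int imp_idx[] = {0, 3, 6}, imp_itm[] = {8, 9, 10, 11, 12, 13};
  const int exp_idx[] = {0, 2, 5}, exp_itm[] = {0, 3, 3, 4, 5};
  LocalMesh m = {8, 14, 2, neibpe, imp_idx, imp_itm, exp_idx, exp_itm};
  return import_length(&m, 0) == 3 && export_length(&m, 1) == 3;
}
```

Message sizes never need to be stored separately: they are always differences of adjacent entries of the index arrays.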
External Nodes: RECEIVE
PE#2: receive information for “external nodes”
Boundary Nodes: SEND
PE#2: send information on “boundary nodes”
Generalized Comm. Table: Send
• Neighbors
– NeibPETot, NeibPE[neib]
• Message size for each neighbor
– export_index[neib], neib= 0, NeibPETot-1
• ID of boundary points
– export_item[k], k= 0, export_index[NeibPETot]-1
• Messages to each neighbor
– SendBuf[k], k= 0, export_index[NeibPETot]-1
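The table above drives the packing step: values at the boundary points listed in export_item are gathered contiguously into SendBuf, neighbor by neighbor. A minimal sketch (the function name pack_send_buffer and the demo values are illustrative, not from the lecture):

```c
#include <assert.h>

/* Gather boundary values into the sending buffer, neighbor by neighbor,
   following the export_index/export_item convention described above. */
static void pack_send_buffer(int NeibPETot, const int *export_index,
                             const int *export_item,
                             const double *VAL, double *SendBuf) {
  for (int neib = 0; neib < NeibPETot; neib++) {
    for (int k = export_index[neib]; k < export_index[neib+1]; k++) {
      SendBuf[k] = VAL[export_item[k]];
    }
  }
}

/* illustrative check: two neighbors, two boundary nodes each */
static int demo_pack(void) {
  const int export_index[] = {0, 2, 4};
  const int export_item[]  = {1, 3, 0, 2};
  const double VAL[] = {10.0, 11.0, 12.0, 13.0};
  double SendBuf[4];
  pack_send_buffer(2, export_index, export_item, VAL, SendBuf);
  return SendBuf[0] == 11.0 && SendBuf[1] == 13.0
      && SendBuf[2] == 10.0 && SendBuf[3] == 12.0;
}
```

Because the k-th boundary node can appear in several neighbors' export lists, the same VAL entry may be copied into SendBuf more than once; the gather loop handles this naturally.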
SEND: MPI_Isend/Irecv/Waitall
[Figure: SendBuf divided into segments for neib#0–neib#3; the segment for the neib-th neighbor starts at export_index[neib] and has length BUFlength_e = export_index[neib+1]-export_index[neib]]
for (neib=0; neib<NeibPETot; neib++){
  for (k=export_index[neib]; k<export_index[neib+1]; k++){
    kk= export_item[k];
    SendBuf[k]= VAL[kk];
  }
}

for (neib=0; neib<NeibPETot; neib++){
  tag= 0;
  iS_e= export_index[neib];
  iE_e= export_index[neib+1];
  BUFlength_e= iE_e - iS_e;
  ierr= MPI_Isend
        (&SendBuf[iS_e], BUFlength_e, MPI_DOUBLE, NeibPE[neib], 0,
         MPI_COMM_WORLD, &ReqSend[neib]);
}
MPI_Waitall(NeibPETot, ReqSend, StatSend);
Copied to sending buffers
export_item (export_index[neib]:export_index[neib+1]-1) are sent to the neib-th neighbor
Generalized Comm. Table: Receive
• Neighbors
– NeibPETot, NeibPE[neib]
• Message size for each neighbor
– import_index[neib], neib= 0, NeibPETot-1
• ID of external points
– import_item[k], k= 0, import_index[NeibPETot]-1
• Messages from each neighbor
– RecvBuf[k], k= 0, import_index[NeibPETot]-1
RECV: MPI_Isend/Irecv/Waitall
[Figure: RecvBuf divided into segments for neib#0–neib#3; the segment for the neib-th neighbor starts at import_index[neib] and has length BUFlength_i = import_index[neib+1]-import_index[neib]]
for (neib=0; neib<NeibPETot; neib++){
  tag= 0;
  iS_i= import_index[neib];
  iE_i= import_index[neib+1];
  BUFlength_i= iE_i - iS_i;
  ierr= MPI_Irecv
        (&RecvBuf[iS_i], BUFlength_i, MPI_DOUBLE, NeibPE[neib], 0,
         MPI_COMM_WORLD, &ReqRecv[neib]);
}
MPI_Waitall(NeibPETot, ReqRecv, StatRecv);

for (neib=0; neib<NeibPETot; neib++){
  for (k=import_index[neib]; k<import_index[neib+1]; k++){
    kk= import_item[k];
    VAL[kk]= RecvBuf[k];
  }
}
import_item (import_index[neib]:import_index[neib+1]-1) are received from the neib-th neighbor
Copied from receiving buffer
Relationship SEND/RECV
do neib= 1, NEIBPETOT
  iS_i= import_index(neib-1) + 1
  iE_i= import_index(neib  )
  BUFlength_i= iE_i + 1 - iS_i
  call MPI_IRECV                                                   &
&      (RECVbuf(iS_i), BUFlength_i, MPI_INTEGER, NEIBPE(neib), 0,  &
&       MPI_COMM_WORLD, request_recv(neib), ierr)
enddo

do neib= 1, NEIBPETOT
  iS_e= export_index(neib-1) + 1
  iE_e= export_index(neib  )
  BUFlength_e= iE_e + 1 - iS_e
  call MPI_ISEND                                                   &
&      (SENDbuf(iS_e), BUFlength_e, MPI_INTEGER, NEIBPE(neib), 0,  &
&       MPI_COMM_WORLD, request_send(neib), ierr)
enddo
• Consistency of IDs of sources/destinations, sizes, and contents of messages!
• Communication occurs when NEIBPE(neib) matches
Relationship SEND/RECV (#0 to #3)
• Consistency of IDs of sources/destinations, sizes, and contents of messages!
• Communication occurs when NEIBPE(neib) matches
[Figure: matching of Send at #0 (NEIBPE(:)= 1,3,5,9) with Recv. at #3 (NEIBPE(:)= 1,0,10)]
Example: SEND
PE#2: send information on “boundary nodes”
NEIBPETOT= 2
NEIBPE[0]= 3, NEIBPE[1]= 0
EXPORT_INDEX[0]= 0
EXPORT_INDEX[1]= 2
EXPORT_INDEX[2]= 2+3 = 5
EXPORT_ITEM[0-4]= 1,4,4,5,6
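With these values the message sizes follow directly from differences of adjacent EXPORT_INDEX entries: 2 boundary-node values go to NEIBPE[0]=3 and 3 go to NEIBPE[1]=0. A quick check in C using the numbers above (the helper name send_length is illustrative):

```c
#include <assert.h>

/* EXPORT table of PE#2 from the example above */
static const int EXPORT_INDEX[] = {0, 2, 5};
static const int EXPORT_ITEM[]  = {1, 4, 4, 5, 6};  /* boundary node IDs */

/* message length for the neib-th neighbor */
static int send_length(int neib) {
  return EXPORT_INDEX[neib+1] - EXPORT_INDEX[neib];
}
```

Note that node 4 appears twice in EXPORT_ITEM: it lies on the boundary with both neighbors, so its value is sent to both.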
Sending Buffer is nice ...
Numbering of these boundary nodes is not continuous; therefore the following procedure of MPI_Isend is not applied directly:
・starting address of sending buffer
・XX-messages from that address
for (neib=0; neib<NeibPETot; neib++){
  tag= 0;
  iS_e= export_index[neib];
  iE_e= export_index[neib+1];
  BUFlength_e= iE_e - iS_e;
  ierr= MPI_Isend
        (&SendBuf[iS_e], BUFlength_e, MPI_DOUBLE, NeibPE[neib], 0,
         MPI_COMM_WORLD, &ReqSend[neib]);
}
NEIBPETOT= 2
NEIBPE[0]= 3, NEIBPE[1]= 0
EXPORT_INDEX[0]= 0
EXPORT_INDEX[1]= 2
EXPORT_INDEX[2]= 2+3 = 5
EXPORT_ITEM[0-4]= 1,4,4,5,6
Example: RECEIVE
PE#2: receive information for “external nodes”
NEIBPETOT= 2
NEIBPE[0]= 3, NEIBPE[1]= 0
IMPORT_INDEX[0]= 0
IMPORT_INDEX[1]= 3
IMPORT_INDEX[2]= 3+3 = 6
IMPORT_ITEM[0-5]= 7,8,10,9,11,12
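After MPI_Waitall, the six received values are scattered to external nodes 7, 8, 10, 9, 11, 12 via IMPORT_ITEM. A sketch using the numbers above; the received values are made up for the demo, and since the node IDs in the figure are 1-based, VAL is indexed with IMPORT_ITEM[k]-1 here (the lecture's own C code assumes 0-based local IDs instead):

```c
#include <assert.h>

/* IMPORT table of PE#2 from the example above (1-based node IDs) */
static const int IMPORT_INDEX[] = {0, 3, 6};
static const int IMPORT_ITEM[]  = {7, 8, 10, 9, 11, 12};

/* copy received values to the external nodes, neighbor by neighbor */
static void unpack_recv_buffer(int NeibPETot, const double *RecvBuf,
                               double *VAL) {
  for (int neib = 0; neib < NeibPETot; neib++) {
    for (int k = IMPORT_INDEX[neib]; k < IMPORT_INDEX[neib+1]; k++) {
      VAL[IMPORT_ITEM[k] - 1] = RecvBuf[k];
    }
  }
}

static int demo_unpack(void) {
  double VAL[12] = {0.0};                                /* 12 local nodes */
  const double RecvBuf[6] = {70, 80, 100, 90, 110, 120}; /* made-up data   */
  unpack_recv_buffer(2, RecvBuf, VAL);
  /* external node 10 is the 3rd value received from NEIBPE[0]=3 */
  return VAL[9] == 100 && VAL[6] == 70 && VAL[11] == 120;
}
```

Unlike the export side, each external node appears exactly once in IMPORT_ITEM: every external value has exactly one owner.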
Notice: Send/Recv Arrays
#PE0 send:
VEC(start_send)~
VEC(start_send+length_send-1)

#PE1 recv:
VEC(start_recv)~
VEC(start_recv+length_recv-1)

#PE1 send:
VEC(start_send)~
VEC(start_send+length_send-1)

#PE0 recv:
VEC(start_recv)~
VEC(start_recv+length_recv-1)
• “length_send” of the sending process must be equal to “length_recv” of the receiving process.
– PE#0 to PE#1, PE#1 to PE#0
• “sendbuf” and “recvbuf”: different addresses
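This matching constraint can be checked programmatically: for every pair of neighboring processes, the export segment length on the sender must equal the import segment length on the receiver. The sketch below assumes both ranks' tables are visible in one place for the check (in a real run each rank holds only its own table); the function name tables_consistent is illustrative:

```c
#include <assert.h>

/* length of the k-th segment of a CSR-style index array */
static int seg_len(const int *index, int k) { return index[k+1] - index[k]; }

/* Does every message sent by rank A to rank B have a matching receive?
   a_neibpe/a_export_index describe A's send side; b_neibpe/b_import_index
   describe B's receive side.  Returns 1 if the lengths agree. */
static int tables_consistent(int a_rank, int a_tot, const int *a_neibpe,
                             const int *a_export_index,
                             int b_rank, int b_tot, const int *b_neibpe,
                             const int *b_import_index) {
  for (int i = 0; i < a_tot; i++) {
    if (a_neibpe[i] != b_rank) continue;
    for (int j = 0; j < b_tot; j++) {
      if (b_neibpe[j] != a_rank) continue;
      if (seg_len(a_export_index, i) != seg_len(b_import_index, j)) return 0;
    }
  }
  return 1;
}

static int demo_consistency(void) {
  /* illustrative tables: PE#0 sends 2 values to PE#1, PE#1 expects 2 */
  const int neibpe0[] = {1};  const int exp0[] = {0, 2};
  const int neibpe1[] = {0};  const int imp1[] = {0, 2};
  const int imp1_bad[] = {0, 3};   /* wrong length: a mismatch */
  return tables_consistent(0, 1, neibpe0, exp0, 1, 1, neibpe1, imp1) == 1
      && tables_consistent(0, 1, neibpe0, exp0, 1, 1, neibpe1, imp1_bad) == 0;
}
```

A mismatched pair is exactly the situation that makes MPI_Waitall hang or receive truncated messages at run time, so a check like this is worth running once on the partitioned data.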
Communication Pattern using 1D Structure
[Figure: 1D decomposition; each subdomain exchanges halo regions with its neighbors]
Dr. Osni Marques (Lawrence Berkeley National Laboratory)
Distributed Local Data Structure for Parallel Computation
• Distributed local data structure for domain-to-domain communications has been introduced, which is appropriate for such applications with sparse coefficient matrices (e.g. FDM, FEM, FVM, etc.).
– SPMD
– Local Numbering: Internal pts to External pts
– Generalized communication table
• Everything is easy, if proper data structure is defined:
– Values at boundary pts are copied into sending buffers
– Send/Recv
– Values at external pts are updated through receiving buffers
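As a self-contained illustration of the whole cycle (copy boundary values to sending buffers, Send/Recv, update external values from receiving buffers), the sketch below simulates the exchange of a 1D decomposition inside a single process, replacing MPI_Isend/MPI_Irecv/MPI_Waitall with memcpy between the two subdomains' buffers; all names, sizes, and values are made up for the demo:

```c
#include <assert.h>
#include <string.h>

/* One subdomain of a 1D mesh: 5 internal nodes followed by 1 external node. */
typedef struct {
  double VAL[6];        /* node values: internal first, then external  */
  int export_item[1];   /* boundary node copied into SendBuf           */
  int import_item[1];   /* external node filled from RecvBuf           */
  double SendBuf[1], RecvBuf[1];
} Domain1D;

/* pack boundary values, "transfer", then update external values */
static void halo_exchange(Domain1D *a, Domain1D *b) {
  a->SendBuf[0] = a->VAL[a->export_item[0]];
  b->SendBuf[0] = b->VAL[b->export_item[0]];
  /* stands in for MPI_Isend/MPI_Irecv/MPI_Waitall between the two ranks */
  memcpy(a->RecvBuf, b->SendBuf, sizeof a->RecvBuf);
  memcpy(b->RecvBuf, a->SendBuf, sizeof b->RecvBuf);
  a->VAL[a->import_item[0]] = a->RecvBuf[0];
  b->VAL[b->import_item[0]] = b->RecvBuf[0];
}

static int demo_halo(void) {
  /* left's external node 5 mirrors right's first node, and vice versa */
  Domain1D left  = {{1, 2, 3, 4, 5,  0}, {4}, {5}, {0}, {0}};
  Domain1D right = {{6, 7, 8, 9, 10, 0}, {0}, {5}, {0}, {0}};
  halo_exchange(&left, &right);
  return left.VAL[5] == 6 && right.VAL[5] == 5;
}
```

With the generalized communication table in place, the real MPI version differs from this toy only in how the buffers travel between processes; the pack and unpack loops are identical.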