Model Reduction of A Coupled Numerical Model Using Proper
Orthogonal Decomposition

Xinya Li1, Xiao Chen2, Bill X. Hu3,*, and I. Michael Navon4
1Hydrology, Energy & Environment Directorate, Pacific Northwest National Laboratory,
Richland, WA 99352, United States
2Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore,
CA 94551, United States
3Department of Earth, Ocean and Atmospheric Science, Florida State University, Tallahassee, FL
32306, United States
4Department of Scientific Computing, Florida State University, Tallahassee, FL 32306, United
States

June 28th, 2011

Manuscript submitted to
Water Resources Research

*Corresponding Author: Tel: (850)644-3743; Fax: (850)644-4214; Email: [email protected]
Abstract
Numerical models for variable-density flow and solute transport (VDFST) are widely used to
simulate seawater intrusion and related problems. The mathematical model for VDFST is a
coupled nonlinear system posed in both space and time, so the numerical discretizations in
time and space usually need to be as fine as possible. As a result, such large transient
models are computationally very demanding, which is a disadvantage for state estimation,
forward prediction, and model inversion. The purpose of this research was to develop
mathematical and numerical methods to simulate variable-density flow and salt transport via a
model reduction technique called Proper Orthogonal Decomposition (POD), which is designed for
both linear and nonlinear models. This method retains the information that characterizes the
solutions of the original partial differential equations. POD was applied to extract leading
"model features" (basis functions) through singular value decomposition from observational
data or detailed simulations (snapshots) of high-dimensional systems. These basis functions
were then used in a Galerkin projection procedure that yielded low-dimensional (reduced-order)
models. The original full numerical models were discretized by the Galerkin finite-element
method (GFEM). Implementing the POD reduced-order method was straightforward once the complex
full model was available. The developed GFEM-POD model was applied to two classic VDFST
problems, the Henry problem and the Elder problem, to investigate the accuracy and efficiency
of the POD method. The reduced-order model can reproduce and predict the full model results
very accurately with much less computational effort than the full model. The accuracy and
efficiency of the POD reduced-order model are mainly determined by the selection of snapshots
and POD bases.

Keywords: model reduction, proper orthogonal decomposition, singular value decomposition,
Galerkin projection, variable density flow, Galerkin finite element

1. Introduction
Standard spatial discretization schemes for hydrogeological models usually lead to large,
high-dimensional, and in general nonlinear systems of partial differential equations. Because
computational and storage capabilities are limited, model reduction techniques provide an
attractive approach for approximating the large discretized state equations with
low-dimensional models. Model reduction techniques have therefore received significant
attention in recent years. The application of model reduction techniques to subsurface flow
problems has been developed, analyzed and implemented by Vermeulen and his colleagues
[Vermeulen et al., 2004a; 2004b; 2005; Vermeulen and Heemink, 2006a]. In these pioneering
studies, the proposed minimization procedure results in a significant time reduction, although
the original full forward model must still be executed a certain number of times in order to
determine the optimal design or the operating parameters. The model reduction procedures
developed for subsurface flow applications are based on the use of proper orthogonal
decomposition (POD) [Cardoso and Durlofsky, 2010].
Lumley [1967] introduced POD in the context of the analysis of turbulent flow. In other
disciplines, the same procedure goes by the names of Karhunen-Loeve decomposition or
principal component analysis. It is a powerful and efficient method of data analysis aimed at
obtaining low-dimensional approximate descriptions (reduced-order models) of high-dimensional
processes [Holmes et al., 1996]. Data analysis using POD is often conducted to extract dominant
"model characters", or basis functions, from an ensemble of experimental data or detailed
simulations of high-dimensional systems, for subsequent use in a Galerkin projection
procedure that yields low-dimensional models [Chatterjee, 2000]. This model reduction technique
essentially identifies the most energetic modes in a time-dependent system, thus providing a
way to obtain a low-dimensional description of the system's dynamics [Fang et al., 2008]. The
POD reduced-order approach is introduced to transform the original flow and transport
equations into a reduced form that can reproduce the behavior of the original model. The basic
idea is to collect an ensemble of data of state variables (hydraulic head or solute
concentration), called snapshots, by running the original model, and then to use singular
value decomposition (SVD) to create a set of basis functions that span the snapshot
collection. The snapshots can be reconstructed using these basis functions. The state variable
at any time and location in the domain is expressed as a linear combination of these POD basis
functions and time coefficients. A Galerkin numerical discretization method is applied to the
original model to obtain a set of ordinary differential equations for the time coefficients in
the linear representation [Kunisch and Volkwein, 2002].
POD has been introduced and applied to various linear and nonlinear systems [Kunisch and
Volkwein, 2002; Zheng et al., 2002; Ravindran, 2002; Meyer and Matthies, 2003; Vermeulen et
al., 2006b; Cao et al., 2006; Khalil et al., 2007; Fang et al., 2008; Reis and Stykel, 2007;
Siade et al., 2010]. In practice, very few field-scale groundwater problems can be solved by a
single flow model. More complicated groundwater processes require coupled modeling with
different numerical models. Robinson et al. [2009] attempted a simulation of solute transport
in porous media using model reduction techniques. POD was also applied to multiphase
(oil-water) flow [van Doren et al., 2006]. Overall, model reduction via POD is still a new
mathematical technique in the area of hydrogeological modeling. Its effective application to
other groundwater flow and transport processes, such as variable-density flow and solute
transport (VDFST), remains a challenging issue.
Numerical models of VDFST are widely used to simulate seawater intrusion and submarine
groundwater discharge processes [Bear, 1999; Diersch and Kolditz, 2002; Guo and Langevin,
2002; Voss and Provost, 2002; Li et al., 2009]. During seawater intrusion into a coastal
aquifer, fresh groundwater flow causes the distribution of solute (mainly salt) concentration
to vary. This variation alters the fluid density, which in turn affects groundwater movement.
The groundwater movement and the solute transport in the aquifer are coupled processes, and the
governing equations for the two processes must be solved jointly. Consequently, the governing
equations for a VDFST problem are both transient and nonlinear. The classical Galerkin finite
element method (GFEM) is often adopted to solve the VDFST problem, converting a continuous
operator problem to a discrete problem [Segol et al., 1975; Navon, 1979; Navon and Muller,
1979]. Compared with the finite difference method, the GFEM approach is more straightforward
for reducing a complicated model because its approximate solution has a weighting structure
similar to that of the trial solution of the reduced-order model.
In this study, a GFEM-POD reduced-order method was developed to transform the original
VDFST model into a low-dimensional form that can approximately reproduce or predict the
results with much less computational effort. To the best of our knowledge, this is the first
time that the POD reduction method has been applied to a density-dependent flow system. Two
benchmark cases were used to demonstrate the capability of the method to approximately solve
density-dependent flow problems. As a boundary-controlled system, the modified Henry problem
was used to test the quality of the GFEM-POD model. Additionally, the GFEM-POD model was
applied to another classic VDFST problem, the Elder problem, in which the calculation results
are determined only by the coupled governing equations and not by boundary forcing.
Reproduction and prediction tests were performed for the two problems with various
permeability distributions to investigate the accuracy and efficiency of the POD method in
approximating the density-dependent flow fields. The developed method paves the way for future
studies of parameter estimation for VDFST problems based on POD reduced-order modeling.
The paper is organized as follows. In section 2, the variable-density flow and solute
transport model is introduced and the GFEM is applied to solve the mathematical model. In
section 3, the model reduction method using POD for density-dependent flow is developed. In
section 4, the developed method is applied to two density-dependent flow problems to show the
efficiency and accuracy of the POD method in various scenarios. Finally, in section 5, we
provide concluding remarks based on the findings of this study.

2. Variable Density Flow and Solute Transport (VDFST) Model
2.1. Mathematical Description of Variable-Density Flow and Solute Transport Problems
Using a Cartesian coordinate system whose axes coincide with the principal directions of an
anisotropic medium, the governing equation of two-dimensional (cross-section) variable-density
flow in terms of equivalent freshwater head and fluid concentration is [Guo and Langevin, 2002]:
$$
\frac{\partial}{\partial x}\left[(1+\eta C)K_{fx}\frac{\partial h_f}{\partial x}\right]
+\frac{\partial}{\partial z}\left[(1+\eta C)K_{fz}\left(\frac{\partial h_f}{\partial z}+\eta C\right)\right]
= S_f\frac{\partial h_f}{\partial t}+\theta\eta\frac{\partial C}{\partial t}-\frac{\rho_{ss}}{\rho_0}q_{ss},
\qquad (x,z)\in\Omega,\ 0\le t\le T \tag{1}
$$
where $h_f$ [L] is the equivalent freshwater head, $K_f(x,z)$ [LT$^{-1}$] is the freshwater
hydraulic conductivity tensor, $S_f$ [L$^{-1}$] is specific storage, $\theta$ is the effective
porosity, $\rho_{ss}$ [ML$^{-3}$] and $q_{ss}$ [T$^{-1}$] represent the source and/or sink term,
and $C$ [ML$^{-3}$] is the fluid concentration. $C_s$ [ML$^{-3}$] is the maximum fluid
concentration, $\rho_s$ [ML$^{-3}$] is the corresponding maximum fluid density, and $\rho_0$
[ML$^{-3}$] is the freshwater density. $\eta$ is a dimensionless constant that represents the
density-coupling coefficient, where $\eta=\varepsilon/C_s$, $\varepsilon=(\rho_s-\rho_0)/\rho_0$
and thus $\rho/\rho_0=1+\eta C$. The relationship between concentration and density is assumed
to be linear, and the influence of water temperature on saltwater fluid density and viscosity
is neglected. The ratio of freshwater to saltwater fluid viscosity, $\mu_f/\mu$, is considered
equal to 1 in this study. $\Omega$ represents the bounded calculation domain and $T$ is the
time period of calculation. Equation (1) is subject to the following initial and boundary
conditions:
$$
\begin{aligned}
& h_f(x,z,0)=h_0(x,z), && (x,z)\in\Omega\\
& h_f(x,z,t)=h_1(x,z,t), && (x,z)\in s_1\\
& \rho K_{fx}\frac{\partial h_f}{\partial x}n_x+\rho K_{fz}\left(\frac{\partial h_f}{\partial z}+\eta c\right)n_z=\rho_q\,q(x,z,t), && (x,z)\in s_2
\end{aligned}
\tag{2}
$$
where $s_1$ is the Dirichlet boundary and $s_2$ is the Neumann boundary.
A second governing equation for the two-dimensional transport of solute mass in the porous
media is [Guo and Langevin, 2002],
$$
\frac{\partial}{\partial x}\left(D_{xx}\frac{\partial C}{\partial x}\right)
+\frac{\partial}{\partial z}\left(D_{zz}\frac{\partial C}{\partial z}\right)
-\frac{\partial (u_x C)}{\partial x}-\frac{\partial (u_z C)}{\partial z}
=\frac{\partial C}{\partial t}-\frac{q_{ss}}{\theta}C_{ss},
\qquad (x,z)\in\Omega,\ 0\le t\le T \tag{3}
$$
where $D$ [L$^2$T$^{-1}$] is the hydrodynamic dispersion coefficient, $u$ [LT$^{-1}$] is the pore
velocity, and $C_{ss}$ [ML$^{-3}$] is the solute concentration of source or sink terms.
Equation (3) is subject to the following initial and boundary conditions,
$$
\begin{aligned}
& c(x,z,0)=c_0(x,z), && (x,z)\in\Omega\\
& c(x,z,t)=c_1(x,z,t), && (x,z)\in s_1\\
& D_{xx}\frac{\partial c}{\partial x}n_x+D_{zz}\frac{\partial c}{\partial z}n_z=g(x,z,t), && (x,z)\in s_2
\end{aligned}
\tag{4}
$$
Darcy's Law is adopted in the variable-density form as,
$$
u_x=-\frac{K_{fx}}{\theta}\frac{\partial h_f}{\partial x},\qquad
u_z=-\frac{K_{fz}}{\theta}\left(\frac{\partial h_f}{\partial z}+\eta c\right) \tag{5}
$$
Inserting (5) into (1) and (3) and using the empirical linear relation between the saltwater
density and concentration gives
$$
\frac{\partial}{\partial x}\left[(1+\eta C)K_{fx}\frac{\partial h_f}{\partial x}\right]
+\frac{\partial}{\partial z}\left[(1+\eta C)K_{fz}\left(\frac{\partial h_f}{\partial z}+\eta C\right)\right]
= S_f\frac{\partial h_f}{\partial t}+\theta\eta\frac{\partial C}{\partial t}-\frac{\rho_{ss}}{\rho_0}q_{ss},
\qquad (x,z)\in\Omega,\ 0\le t\le T \tag{6}
$$
$$
\frac{\partial}{\partial x}\left(D_{xx}\frac{\partial C}{\partial x}\right)
+\frac{\partial}{\partial z}\left(D_{zz}\frac{\partial C}{\partial z}\right)
+\frac{\partial}{\partial x}\left(\frac{K_{fx}}{\theta}C\frac{\partial h_f}{\partial x}\right)
+\frac{\partial}{\partial z}\left[\frac{K_{fz}}{\theta}C\left(\frac{\partial h_f}{\partial z}+\eta C\right)\right]
=\frac{\partial C}{\partial t}-\frac{q_{ss}}{\theta}C_{ss},
\qquad (x,z)\in\Omega,\ 0\le t\le T \tag{7}
$$
Eqs. (6) and (7) are the governing equations of a coupled nonlinear system of VDFST.

2.2 Numerical GFEM Solutions
The approximate solutions for hydraulic head and solute concentration in Eqs. (6) and (7) are
defined in Eq. (8) using the nodal basis functions of the Galerkin finite element method
[Xue and Xie, 2007],
$$
\begin{aligned}
h_f(x,z,t)&\approx\tilde h(x,z,t)=\sum_{L=1}^{NNODE}h_L(t)\,N_L(x,z)\\
C(x,z,t)&\approx\tilde c(x,z,t)=\sum_{L=1}^{NNODE}c_L(t)\,N_L(x,z)
\end{aligned}
\tag{8}
$$
where $h_L(t)$ is the approximate hydraulic head at node L at time t, $c_L(t)$ is the
approximate solute concentration at node L at time t, $N_L(x,z)$ is the finite-element basis
function, and NNODE is the total number of nodes used across the domain. In general, the
approximation improves as NNODE increases.
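The nodal expansion in Eq. (8) can be illustrated on a single element. The following is a minimal sketch that assumes a linear triangular element (the element type is not stated here); the function names `triangle_shape_functions` and `interpolate` and the sample nodal heads are hypothetical.

```python
def triangle_shape_functions(vertices, point):
    """Linear shape functions N_1, N_2, N_3 of a triangle evaluated at a point.

    Assumption: a three-node linear triangle; other element types would
    need different shape functions.
    """
    (x1, z1), (x2, z2), (x3, z3) = vertices
    x, z = point
    det = (x2 - x1) * (z3 - z1) - (x3 - x1) * (z2 - z1)  # twice the signed area
    n2 = ((x - x1) * (z3 - z1) - (x3 - x1) * (z - z1)) / det
    n3 = ((x2 - x1) * (z - z1) - (x - x1) * (z2 - z1)) / det
    return (1.0 - n2 - n3, n2, n3)


def interpolate(nodal_values, vertices, point):
    """Evaluate sum_L value_L * N_L(x, z) inside one element, as in Eq. (8)."""
    shape = triangle_shape_functions(vertices, point)
    return sum(value * n for value, n in zip(nodal_values, shape))


# Hypothetical element and nodal heads, for illustration only.
element = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
heads = [1.0, 2.0, 4.0]
head_at_centroid = interpolate(heads, element, (1.0 / 3.0, 1.0 / 3.0))
```

Each shape function equals 1 at its own node and 0 at the other two, so the interpolant returns the nodal values at the vertices and their average at the centroid.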
An implicit time-extrapolated method was used to integrate in time the system of ordinary
differential equations resulting from the application of the GFEM to the VDFST model.
The boundary conditions are imposed by modifying the global matrices in GFEM until all
prescribed boundary nodal variables have been treated. Aquifer parameters such as the spatial
distribution of hydraulic conductivity are represented in an element-wise discrete way [Voss
and Provost, 2002]. The coupling between flow and transport is accomplished through the
synchronous approach [Guo and Langevin, 2002], iterating the solutions between the flow and
transport equations. With this implicit coupling scheme, the numerical solutions of the flow
and transport equations are iterated, and densities and concentrations are updated
simultaneously within each time step, until the maximum difference in fluid density at any
single cell between successive iterations is less than a tolerance value. Compared with a
constant-density flow and transport model, this procedure requires more computation because of
the additional coupling loop, and it also complicates parts of the POD model. Applying the POD
model significantly reduces computation time in such a computationally expensive system.
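The coupling loop described above can be sketched as a fixed-point iteration within one time step. This is a minimal illustration, not the authors' implementation: `coupled_time_step`, `toy_flow`, and `toy_transport` are hypothetical names, the two toy solvers are simple algebraic stand-ins for the GFEM flow and transport solves, and the density constants are illustrative values.

```python
def coupled_time_step(c_old, solve_flow, solve_transport, density,
                      tol=1.0e-6, max_iter=50):
    """Iterate flow and transport within one time step until density converges."""
    c_iter = list(c_old)
    for iteration in range(1, max_iter + 1):
        h_new = solve_flow(c_iter)               # heads for the current densities
        c_new = solve_transport(h_new, c_old)    # concentrations for those heads
        # Maximum change in fluid density between successive iterations.
        change = max(abs(density(a) - density(b)) for a, b in zip(c_new, c_iter))
        c_iter = c_new
        if change < tol:
            return h_new, c_new, iteration
    raise RuntimeError("flow-transport coupling did not converge")


RHO_0 = 1000.0        # freshwater density, illustrative value
ETA = 0.025 / 35.0    # eta = epsilon / C_s, illustrative values


def density(c):
    """Linear density relation rho = rho_0 * (1 + eta * C)."""
    return RHO_0 * (1.0 + ETA * c)


def toy_flow(c):
    """Stand-in for the flow solve: head responds weakly to concentration."""
    return [1.0 + 0.025 * ci for ci in c]


def toy_transport(h, c_old):
    """Stand-in for the transport solve over one time step."""
    return [0.5 * co + 0.1 * hi for hi, co in zip(h, c_old)]


heads, conc, iterations = coupled_time_step(
    [0.0, 17.5, 35.0], toy_flow, toy_transport, density)
```

The stopping test mirrors the criterion in the text: the loop ends when the largest nodal density change between successive iterations falls below the tolerance.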

3. Model Reduction using Proper Orthogonal Decomposition (POD)
The reduced-order model construction methodology is given in Figure 1, modified from
Vermeulen et al. [2004b]. First, the original full numerical model is run to generate several
snapshots of model states. Second, we extract dominant patterns (the basis functions) from
these state snapshots via SVD. These two steps can be treated as the preprocessing steps for
the reduced-order model. With the numerical formulation and system inputs (e.g. parameters,
boundary conditions, initial conditions) of the original model unchanged, the selected bases
are used in the Galerkin projection. The Galerkin projection is the central procedure used to
construct the reduced-order model by projecting the partial differential equations of both
groundwater flow and solute transport onto a low-dimensional space. After the projection step,
the reduced-order model is able to simulate the same model behavior through the reconstruction
of model states with a significantly decreased computational burden. In this section, we
summarize the formulation of the GFEM-POD model, which is capable of simulating the coupled
process of VDFST.

3.1. Snapshots and Singular Value Decomposition
For the VDFST model, the most important simulation results from the numerical model described
above are the equivalent freshwater heads and the solute concentrations in the model domain.
The two variables are sampled from simulation results at defined checkpoint times during the
simulation period as snapshots. An ensemble of snapshots, represented by nodal values and
chosen in the analysis time interval [0, T], can be written as:
$$
\begin{aligned}
&\{h_f^1,h_f^2,\ldots,h_f^n\}, && h_f^i\in R^{NN}, && i=1,2,\ldots,n\\
&\{c^1,c^2,\ldots,c^n\}, && c^i\in R^{NN}, && i=1,2,\ldots,n
\end{aligned}
\tag{9}
$$
where n is the number of snapshots and NN is the number of nodes in the mesh. The vectors
$h_f^i$ and $c^i$ both have NN entries:
$$
h_f^i=\left(h_{f,1}^i,\ldots,h_{f,NN}^i\right)^T,\qquad
c^i=\left(c_1^i,\ldots,c_{NN}^i\right)^T \tag{10}
$$
The collection of all $h_f^i$ forms a rectangular $NN\times n$ matrix R1, and the collection of
all $c^i$ forms a rectangular $NN\times n$ matrix R2. The aim of POD is to find a set of
orthogonal basis functions of R1 and R2, respectively, that can capture the most energy in the
original VDFST system.
Singular Value Decomposition (SVD) is a well-known technique for extracting dominant
"features" and coherent structures from 2D data and "compressing" that information into a few
low order "weights" (singular values) and associated orthonormal eigenfunctions [Golub and van
Loan, 1996]. The SVD of the matrix R is calculated through the equation,
$$
R=USV^T \tag{11}
$$
where U is an $NN\times NN$ orthogonal matrix whose columns are the eigenvectors of $RR^T$, V is
an $n\times n$ orthogonal matrix whose columns are the eigenvectors of $R^TR$, and S is a
diagonal $NN\times n$ matrix of singular values. The singular values in S are the square roots
of the eigenvalues of $RR^T$ or $R^TR$. The singular values are arranged in descending order. An
optimal rank m approximation to R is calculated by,
$$
R_m=US_mV^T \tag{12}
$$
In computation, one would actually replace U and V with the matrices of their first m columns,
and replace $S_m$ by its leading $m\times m$ principal minor, the sub-matrix consisting of the
first m rows and first m columns of S. The optimality of the approximation in Eq. (12) lies in
the fact that no other rank m matrix can be closer to R in the Frobenius norm, which is a
discrete version of the L2 norm [Chatterjee, 2000]. So the first m columns of the matrix U (for
any m) give an optimal orthonormal basis for approximating the data. The basis vectors are
given by:
$$
\phi_m=U_m,\qquad 1\le m\le M \tag{13}
$$
where M is the number of basis functions.
SVD is applied to the snapshot matrices R1 and R2, respectively, to obtain the POD basis
functions of head and concentration:
$$
\Psi^h=\{\psi^{h,1},\psi^{h,2},\ldots,\psi^{h,M_h}\},\qquad
\Psi^c=\{\psi^{c,1},\psi^{c,2},\ldots,\psi^{c,M_c}\} \tag{14}
$$
where $M_h$ is the number of bases from snapshots of hydraulic head and $M_c$ is the number of
bases from snapshots of solute concentration.
The eigenvalues $\lambda_j$ are real and positive, and they are sorted in descending order,
where the jth eigenvalue is a measure of the energy transferred within the jth basis mode
[Fang et al., 2008]. Hence, if $\lambda_j$ decays very fast, the basis functions corresponding
to small eigenvalues can be neglected. The following formula is defined as the criterion for
choosing a low-dimensional basis of size M (M << n) [Fang et al., 2008]:
$$
I(M)=\frac{\sum_{j=1}^{M}\lambda_j}{\sum_{j=1}^{n}\lambda_j} \tag{15}
$$
where I(M) represents the percentage of energy which is captured by the POD basis
$\Psi_1,\ldots,\Psi_m,\ldots,\Psi_M$. This equation is used for both heads and concentrations.

3.2. Generation of POD Reduced-Order Model Using Galerkin Projection
To obtain the reduced-order model, we solved the numerical models of (6) and (7) to obtain
an ensemble of snapshots from which the POD bases were generated, and then used a Galerkin
projection scheme to project the model equations onto the subspace spanned by the POD basis
elements. The POD solution can be expressed as [Chatterjee, 2000; Pinnau, 2008; Chen et al.,
2011]:
$$
\begin{aligned}
h_f^{POD}(x,z,t)&=\sum_{i=1}^{M_h}\psi_{FEM-POD}^{h,i}(x,z)\,\alpha_i^h(t)\\
c^{POD}(x,z,t)&=\sum_{i=1}^{M_c}\psi_{FEM-POD}^{c,i}(x,z)\,\alpha_i^c(t)
\end{aligned}
\tag{16}
$$
where $\psi^i(x,z)$ are POD basis functions, also known as POD modes. These modes can be used to
incorporate characteristics of the solution into a bounded problem by using numerical
simulation results and/or observational data. $h_f(x,z,t)$ and $c(x,z,t)$ are decomposed into
linear combinations of time coefficients and POD modes, which are functions of space.
The POD modes are interpolated using finite element basis functions to form the GFEM-POD
modes as [Aquino et al., 2008]:
$$
\begin{aligned}
\psi_{FEM-POD}^{h,i}(x,z)&=\sum_{j=1}^{NN}N_j(x,z)\,\psi_j^{h,i}, && i=1,\ldots,M_h\\
\psi_{FEM-POD}^{c,i}(x,z)&=\sum_{j=1}^{NN}N_j(x,z)\,\psi_j^{c,i}, && i=1,\ldots,M_c
\end{aligned}
\tag{17}
$$
where $\psi^i$ is a column vector that contains the nodal values of mode i.
The discretization (finite element) mesh used for generating the snapshots was consistent
with the mesh used for approximating the modes, but the GFEM-POD used a distinct mesh.
Therefore, we must use a Galerkin projection approach to smooth the derivatives of the modes
later [Aquino et al., 2008]. Based on Eqs. (16) and (17), the corresponding finite-element
representation of the POD solution can be expressed as [Chen et al., 2011]:
$$
\begin{aligned}
h_f(x,z,t)&\approx\hat h(x,z,t)=\sum_{i=1}^{M_h}\sum_{j=1}^{NN}\psi_j^{h,i}\,N_j(x,z)\,\alpha_i^h(t)\\
c(x,z,t)&\approx\hat c(x,z,t)=\sum_{i=1}^{M_c}\sum_{j=1}^{NN}\psi_j^{c,i}\,N_j(x,z)\,\alpha_i^c(t)
\end{aligned}
\tag{18}
$$
The model states are decomposed into linear combinations of GFEM basis functions, POD
modes and time coefficients.
From Eqs. (6) and (7), we define two residual functions,
$$
\begin{aligned}
f_1(h_f,c,x,z,t)={}&\frac{\partial}{\partial x}\left[(1+\eta c)K_{fx}\frac{\partial h_f}{\partial x}\right]
+\frac{\partial}{\partial z}\left[(1+\eta c)K_{fz}\left(\frac{\partial h_f}{\partial z}+\eta c\right)\right]
-S_f\frac{\partial h_f}{\partial t}-\theta\eta\frac{\partial c}{\partial t}+\frac{\rho_{ss}}{\rho_0}q_{ss}\\
f_2(h_f,c,x,z,t)={}&\frac{\partial}{\partial x}\left(D_{xx}\frac{\partial c}{\partial x}\right)
+\frac{\partial}{\partial z}\left(D_{zz}\frac{\partial c}{\partial z}\right)
+\frac{K_{fx}}{\theta}\frac{\partial h_f}{\partial x}\frac{\partial c}{\partial x}
+\frac{K_{fz}}{\theta}\left(\frac{\partial h_f}{\partial z}+\eta c\right)\frac{\partial c}{\partial z}
-\frac{q_{ss}}{\theta}\left(c-c_{ss}\right)-\frac{\partial c}{\partial t}
\end{aligned}
\tag{19}
$$
The Galerkin method requires the residuals to be orthogonal with respect to the basis
functions. Therefore, we need to project the original high-dimensional model onto a
low-dimensional subspace generated by full model snapshots [Vermeulen et al., 2005].
Substituting (18) into (19) and integrating with respect to the POD bases according to the
Galerkin method gives:
$$
\begin{aligned}
\left\langle f_1(\hat h,\hat c,x,z,t),\,N_k\psi_k^{h,m}\right\rangle&=0, && k=1,\ldots,NN;\ m=1,\ldots,M_h\\
\left\langle f_2(\hat c,\hat h,x,z,t),\,N_k\psi_k^{c,m}\right\rangle&=0, && k=1,\ldots,NN;\ m=1,\ldots,M_c
\end{aligned}
\tag{20}
$$
with the inner product
$$
\langle f,g\rangle=\int_\Omega fg\,d\Omega
$$
and L2 norm
$$
\|f\|=\langle f,f\rangle^{1/2}
$$
In the reduced-order model, equations (6) and (7) finally become:
$$
\iint_\Omega\left\{\frac{\partial}{\partial x}\left[(1+\eta\hat c)K_{fx}\frac{\partial\hat h}{\partial x}\right]
+\frac{\partial}{\partial z}\left[(1+\eta\hat c)K_{fz}\left(\frac{\partial\hat h}{\partial z}+\eta\hat c\right)\right]
-S_f\frac{\partial\hat h}{\partial t}-\theta\eta\frac{\partial\hat c}{\partial t}+\frac{\rho_{ss}}{\rho_0}q_{ss}\right\}N_k\Psi^h\,dx\,dz=0 \tag{21}
$$
$$
\iint_\Omega\left\{\frac{\partial}{\partial x}\left(D_{xx}\frac{\partial\hat c}{\partial x}\right)
+\frac{\partial}{\partial z}\left(D_{zz}\frac{\partial\hat c}{\partial z}\right)
+\frac{K_{fx}}{\theta}\frac{\partial\hat h}{\partial x}\frac{\partial\hat c}{\partial x}
+\frac{K_{fz}}{\theta}\left(\frac{\partial\hat h}{\partial z}+\eta\hat c\right)\frac{\partial\hat c}{\partial z}
-\frac{q_{ss}}{\theta}\left(\hat c-c_{ss}\right)-\frac{\partial\hat c}{\partial t}\right\}N_k\Psi^c\,dx\,dz=0 \tag{22}
$$
The key to generating a POD reduced-order model is to find the coupled ODEs for
$\alpha^c(t)$ and $\alpha^h(t)$ according to Eqs. (18)-(20). This step is also known as Galerkin
projection. The integrations in equations (21) and (22) are the same as those for the full
numerical model, except that the trial solutions substituted into (19) are now given by
equation (18) rather than equation (8). The finite-element basis function has a different
expression for each element, so Eq. (19) must be evaluated element by element before summing
over all elements. It should be noted that the GFEM basis functions $N_j(x,z)$ are the only
spatial functions involved in the areal integration over each element. Since the POD bases
$\Psi^h$ and $\Psi^c$ and the time coefficients $\alpha^h$ and $\alpha^c$ are not spatial
functions, they can be taken outside the areal integrals [Chen et al., 2011].
The coupled system of ODEs for $\alpha^c(t)$, $\alpha^h(t)$ is expressed as,
$$
\begin{aligned}
A_1\alpha^h+A_2(\alpha^c)^T\alpha^h+A_3\alpha^c+A_4(\alpha^c)^T\alpha^c+A_5\frac{d\alpha^h}{dt}+A_6\frac{d\alpha^c}{dt}&=F_1\\
B_1\alpha^c+B_2(\alpha^h)^T\alpha^c+B_3(\alpha^c)^T\alpha^c+B_4\frac{d\alpha^c}{dt}&=F_2
\end{aligned}
\tag{23}
$$
along with the initial conditions:
$$
\begin{aligned}
\alpha_m^h(t_0)&=\left\langle h_f(x,z,t_0),\,\psi^{h,m}\right\rangle, && m=1,\ldots,M_h\\
\alpha_m^c(t_0)&=\left\langle c(x,z,t_0),\,\psi^{c,m}\right\rangle, && m=1,\ldots,M_c
\end{aligned}
\tag{24}
$$
where
$$
\alpha^h(t)=\left(\alpha_1^h(t),\ldots,\alpha_{M_h}^h(t)\right)^T;\qquad
\alpha^c(t)=\left(\alpha_1^c(t),\ldots,\alpha_{M_c}^c(t)\right)^T
$$
with the matrix notation:
( )
( )
f f1 1 1 ,
2 2 2 ,
3 3
z
fx3,
1 1
f
1,..., 1,...,
;
;c
T j jh h e i ixe
e
je iM
Th h c mj je
m e j je
i j
iz
T
i j
h c
i NN j
A a a
NN
N NN NK K dxdz
x x z z
NNK
x xN dxdzANN
K
a
z
A a
z
a η ψ= =
= =
∂ ∂ ∂ ∂ = Ψ Ψ = + ∂ ∂ ∂ ∂
∂ ∂ ∂ ∂ = Ψ Ψ = ∂∂ + ∂ ∂
= Ψ Ψ
∑ ∫∫
∑ ∑ ∑∫∫
( )
( ) ( ) ( )
2
3 ,
4 4 4 ,
5 5 5 ,
6 6 6 ,
f
32 ,
f1 1
01
0
;
;
;
;
c
e iz je
e
MTh c e c m i
z j j jem e j
Th hs i je
e
Th
i j
i j
i
ci je
e
T qh ssi s
j
s
i
ie
j
s
NK N dxdz
z
NK N N dxdz
z
S N N dxdz
N N dx
a
A a a
A a a
A a a dz
qN ds q N dxdF z
η
η ψ
θη
ρ ρρ ρ
= =
∂ = ∂
∂= Ψ Ψ = ∂
= Ψ Ψ =
= Ψ Ψ =
= Ψ +
∑ ∫∫
∑ ∑ ∑∫∫
∑ ∫∫
∑ ∫∫
∫ ∫∫e
∑

The detailed derivation of the GFEM-POD model for a VDFST system is presented in Li
[2010]. The dimensions of the matrices A1-A6 and B1-B4 in Eq. (23) are now determined by the
number of POD bases (NB) instead of the number of nodes (NN), where NB << NN. Thus, the
dimension of the reduced-order model is much smaller than that of the original full model,
which saves a large amount of computation. The coupled ODEs, Eq. (23), still need to be solved
according to the same implicit scheme described in section 2.2. The estimated nodal values of
$h_f$ and $c$ in the domain at a certain time can be reconstructed through Eq. (16).
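The sequence of snapshots, SVD, basis selection by the energy criterion, Galerkin projection, and reconstruction can be sketched on a toy problem. The example below assumes NumPy and uses a linear 1D diffusion model as a stand-in for the full model; it is not the coupled nonlinear VDFST system, and the names `pod_basis` and `implicit_euler` are hypothetical. The toy initial condition is built from two eigenmodes of the discrete operator, so two POD modes are expected to reproduce the full model almost exactly.

```python
import numpy as np


def pod_basis(snapshots, energy=0.9999):
    """Leading left singular vectors that capture the requested energy fraction."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    eigenvalues = s ** 2                              # eigenvalues of R R^T
    fraction = np.cumsum(eigenvalues) / np.sum(eigenvalues)
    m = min(int(np.searchsorted(fraction, energy)) + 1, s.size)
    return U[:, :m], fraction


def implicit_euler(A, u0, dt, steps):
    """Integrate du/dt = A u with backward Euler; columns are time levels."""
    lhs = np.eye(A.shape[0]) - dt * A
    states = [u0]
    for _ in range(steps):
        states.append(np.linalg.solve(lhs, states[-1]))
    return np.column_stack(states)


# Toy full model (assumption): 1D diffusion with fixed ends, 50 interior nodes.
nn, dt, steps = 50, 1.0e-3, 200
dx = 1.0 / (nn + 1)
A = (np.diag(-2.0 * np.ones(nn)) + np.diag(np.ones(nn - 1), 1)
     + np.diag(np.ones(nn - 1), -1)) / dx ** 2
x = np.linspace(dx, 1.0 - dx, nn)
u0 = np.sin(np.pi * x) + 0.5 * np.sin(3.0 * np.pi * x)

full = implicit_euler(A, u0, dt, steps)               # NN x (steps + 1) snapshots
phi, fraction = pod_basis(full)                       # NN x M POD basis
A_r = phi.T @ A @ phi                                 # Galerkin projection, M x M
alpha = implicit_euler(A_r, phi.T @ u0, dt, steps)    # reduced time coefficients
reconstructed = phi @ alpha                           # nodal values from the modes
rmse = float(np.sqrt(np.mean((full - reconstructed) ** 2)))
```

The reduced system has the size of the basis rather than the number of nodes, which is the source of the savings discussed above. The nonlinear terms of Eq. (23) would need additional projected operators that this linear sketch does not include.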

3.3. Error analysis
In this subsection, the error estimates between numerical solutions of the original model and
the reduced model based on POD bases are discussed.
Let $u_{NN}^n$ ($n=1,2,\ldots,T$) be vectors constructed with solutions of the full model, and
$u_{NN}^{*n}$ ($n=1,2,\ldots,T$) be the vectors constituted with solutions of the reduced model.
NN equals the number of active nodes. T represents the number of time steps.
( )
( )
3,f
1
1 1 1 ,
2 2 2 , 31 ,fz
1
,
1,..., 1,..., 1,...,
;
;h
T j jc c e ei ixx zze
e
ej jh mx
j iMT jc
i
c
em j jh m
j ij
j
i j k
i NN j NN k NN
N NN ND D dxdz
x x x z
N NKN
x xdxd
N NK
B b
Nz z
b
B b b
ψ ψ
ψθ
ψ ψψ
θ
=
=
=
= = =
∂ ∂ ∂ ∂ = = + ∂ ∂ ∂ ∂
∂ ∂ ⋅ ∂ ∂ = =
∂ ∂+ ⋅
∂ ∂
∑ ∫∫
∑∑
∑
( )
( ) ( )
2
3,fz
1 13 3 3 , ,
4 4 4 ,
2
;
;
c
i j k
ee
M eT jc c c m
j j iem e j
Tc ci je
e
Tc ssss i ie s
e
i j
z
NKN N dxdz
z
N N dxdz
qc N dxdz gN d
b
B
F s
B b
b b
ηψ ψ ψθ
ψ ψ
ψθ
= =
∂ = = ⋅ ∂
= =
= ⋅ +
∑ ∫∫
∑ ∑ ∑∫∫
∑ ∫∫
∑ ∫∫ ∫
If $n\in\{1,2,\ldots,T\}$, the error estimates are obtained as follows [Aquino et al., 2008;
Di et al., 2011]:
$$
\left\|u_{NN}^n-u_{NN}^{*n}\right\|_{L^2}\le\lambda_{(M_u+1)},\qquad n\in\{1,2,\ldots,T\} \tag{25}
$$
where $\lambda$ represents the set of the eigenvalues of the matrices $RR^T$ or $R^TR$, R is the
matrix of an ensemble of snapshots $u_{NN}^l$ ($1\le l\le L$), and $M_u$ is the number of basis
functions chosen in the reduced model.
Else, if $n\notin\{1,2,\ldots,T\}$, when $t_l$ ($1\le l\le L$) are uniformly chosen from $t_n$
($1\le n\le N$), and $\left\|\partial u_{NN}(\zeta_1)/\partial t\right\|_{L^2}$ and
$\left\|\partial u_{NN}^{*}(\zeta_2)/\partial t\right\|_{L^2}$ are bounded (i.e.,
$\left\|\partial u_{NN}(\zeta_1)/\partial t\right\|_{L^2}\le\omega$ and
$\left\|\partial u_{NN}^{*}(\zeta_2)/\partial t\right\|_{L^2}\le\omega$), the following error
estimates exist [Di et al., 2011]:
$$
\left\|u_{NN}^n-u_{NN}^{*n}\right\|_{L^2}\le\lambda_{(M_u+1)}+f(T,L,\Delta t,\omega),\qquad n\notin\{1,2,\ldots,T\} \tag{26}
$$
Equation (25) indicates that the error can be controlled through optimal basis selection when
the sampling time period of the snapshots is the same as the simulation period (e.g. a
reproduction test), but the error will inevitably be larger according to Eq. (26) when the
sampling time period of the snapshots differs from the simulation period (e.g. a prediction
test). The error in a prediction test is not bounded by the descending sorted eigenvalues
because of the added error function $f(T,L,\Delta t,\omega)$.

4. Application Cases: Henry Problem and Elder Problem
4.1. Henry Problem
The Henry problem [Henry, 1964], a classic variable-density flow and salt transport problem,
is used to test the proposed GFEM-POD model. The Henry problem has played a key role in
understanding seawater intrusion into coastal aquifers and in benchmarking density-dependent
flow codes [Abarca et al., 2007]. Although the problem has been studied for decades, its
importance for parametric analysis of seawater intrusion still attracts great attention [Sanz
and Voss, 2006].
Numerical programs were written by Li [2010] to solve VDFST models using GFEM. To
examine the accuracy of these numerical programs, we used the same model inputs as Simpson
and Clement [2004] to simulate a standard Henry problem ($D_m$ = 1.62925 m$^2$/d), except that
the time step is 1 minute and the convergence criterion is $10^{-6}$ kg/m$^3$ for the fluid
concentration between consecutive iterations. The system reached a steady state after
approximately 250 minutes. The concentration solutions from this numerical model were compared
with the semi-analytical results [Simpson and Clement, 2004]. The isochlors showed excellent
correspondence: both their shape and their position matched very well [Li, 2010].
By halving the recharge rate of freshwater ($Q_{in}$), a modified Henry problem is simulated,
which serves as the original full model. All other model inputs are the same as in the standard
Henry problem. Meanwhile, the maximum grid Peclet number is reduced from 4.1 under the
standard conditions to 2.8 for the modified conditions on this $41\times 21$ grid [Simpson and
Clement, 2004]. Under the modified conditions, the isochlor distribution is more diffuse,
which can help alleviate potential oscillations near the top-right of the aquifer [Segol et
al., 1975]. The system required approximately 460 minutes for the change in fluid concentration
between two successive time steps to fall below $10^{-3}$ kg/m$^3$. The CPU time required to
simulate 500 minutes in MATLAB with a time step of 1 minute is approximately 1500 seconds for
the original full model.

4.2. Model Reduction of the Henry Problem 360
To demonstrate the application of model reduction, POD method discussed in section 3 is 361
illustrated using the modified Henry problem in various cases with different combination of 362
heterogeneity and anisotropy of the conductivity field in the aquifer. In the first case, a 363
homogeneous and isotropic aquifer is considered for the modified Henry problem. The hydraulic 364
conductivity throughout the domain is 864 m/day. Following the same procedure, the 365
original numerical model was used to generate snapshots. 366
For a prediction test, the snapshots were selected initially every 1 minute from the original 367
model solutions of the first 100 minutes for both head and concentration. We have an ensemble 368
of snapshots with a size of 100. Reduced model abstracted a certain number of bases from the 369
100 snapshots to predict the head and concentration distributions in a time period of 400 minutes, 370
from t = 101 minute to t = 500 minutes and the predicted time step is 1 minute. 371
The number of bases (NB), the snapshot selection, and the predicted time length are the most
important factors determining the accuracy and efficiency of the reduced model in this study.
The influence of these three factors on prediction was investigated using the prediction test,
as follows.
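The prediction workflow described above (a small basis extracted from early snapshots, then time stepping in the reduced space) can be illustrated with a minimal Python sketch. It assumes a generic linear system $M\,du/dt + K u = f$ with constant matrices and an implicit Euler step, which is a simplification: the VDFST model in this paper is coupled and nonlinear. The names `reduced_march`, `M`, `K`, and `f` are hypothetical placeholders, not the authors' MATLAB implementation.

```python
import numpy as np

def reduced_march(M, K, f, basis, u0, dt, n_steps):
    """Implicit Euler on the Galerkin-projected system M du/dt + K u = f.

    basis: (n_nodes, nb) matrix with orthonormal columns (POD basis vectors).
    Returns the reconstructed full-space states, one column per time step.
    """
    # Project the full operators onto the basis (assumed linear and constant).
    Mr = basis.T @ M @ basis
    Kr = basis.T @ K @ basis
    fr = basis.T @ f
    # Initial POD coefficients; valid because the basis is orthonormal.
    alpha = basis.T @ u0
    lhs = Mr + dt * Kr
    states = []
    for _ in range(n_steps):
        # Solve the small nb-by-nb system instead of the full one.
        alpha = np.linalg.solve(lhs, Mr @ alpha + dt * fr)
        # Reconstruct the full-space state from the coefficients.
        states.append(basis @ alpha)
    return np.column_stack(states)
```

Each step solves only an NB-by-NB system, which is consistent with the observation in this section that the cost of the reduced model is governed mainly by NB.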

4.2.1. Basis selection
As discussed in section 3.1, in many cases the first few eigenvalues comprise most of the
total energy of a matrix. Under this condition, we need to choose a number of bases that
captures most of the energy so that the concentration can be predicted with limited
calculation. The relationship between the percentage of the total energy and the number of
eigenvalues is illustrated in Figure 2. By retaining only the first 5 eigenvalues (NB = 5) of
the ensemble of snapshots of head solutions, 99.99% of the total energy is extracted. However,
for concentration solutions, more than 12 eigenvalues of the same ensemble of snapshots are
needed to reach the same percentage. Hence, the concentration must be approximated and
predicted from the reduced model using more than 12 bases to obtain an accurate
reproduction of the original model.
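The energy criterion above can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the snapshots are stored as columns of a matrix and that the "energy" of each mode is the squared singular value of that matrix; the function name `pod_basis` and its arguments are hypothetical.

```python
import numpy as np

def pod_basis(snapshots, energy=0.9999):
    """Return (basis, nb, cumulative_energy) for a snapshot matrix.

    snapshots: array of shape (n_nodes, n_snapshots), one column per snapshot.
    energy:    target fraction of the total energy to retain.
    """
    # Thin SVD: the columns of U are the candidate POD basis vectors.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    # Eigenvalues of the snapshot correlation matrix are the squared singular values.
    eigvals = s ** 2
    cumulative = np.cumsum(eigvals) / np.sum(eigvals)
    # Smallest number of bases whose cumulative energy reaches the target.
    nb = int(np.searchsorted(cumulative, energy)) + 1
    nb = min(nb, len(s))
    return U[:, :nb], nb, cumulative
```

Applied separately to the head and concentration snapshot matrices, a routine like this would return a different `nb` for each variable, which is the behavior reported above (5 bases for head, more than 12 for concentration).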
To investigate the effect of NB on the solution accuracy, we vary NB but keep the size of the
ensemble of snapshots at 100 and the number of predicted time steps at 400. The accuracy of
the computed concentrations using model reduction with various NBs is presented in Figure 3.
Two error criteria are employed to compare the predicted results of the reduced model with
those of the original full model: the root mean square error (RMSE) and the correlation, both
calculated over the domain for each predicted time step. Figure 3 shows that the accuracy of
the reduced model is positively correlated with the number of bases. The computation time of
the reduced model with different NB is listed in Table 1. As NB increases, the required
computation time increases. An optimal value of NB is therefore important to increase the
efficiency of the reduced model without sacrificing accuracy: beyond that value, employing
more bases does not appreciably increase the accuracy but requires more computation time.
In Figure 3, the accuracy of the reduced model decreases gradually as the number of
prediction time steps increases. The accuracy of the reduced model is best at t = 100 minutes.
The predicted results using 20 bases have a relatively lower accuracy at t = 500 minutes
(Figure 4 (b) and (d)) than at t = 200 minutes (Figure 4 (a) and (c)), although there are still
good matches between the reduced model and the full model. The reduced model only took
snapshots from the first 100 minutes. The coefficient $\alpha(t)$ is calculated in the reduced
model as a function of time; thus, calculation error accumulates as time increases. Normally,
without additional information from new snapshots, the best prediction time period is the
same as the period that the snapshots cover. That is the reason we need to take more than 12
bases to keep the accuracy from dropping to a lower level (smaller than 99%) in the future.
The computation time using the original full model to predict 400 time steps is about 1150
seconds, whereas only 5 seconds of CPU time were required for the reduced model with
NB = 20 to conduct the same prediction, which is at least 230 times faster. It runs nearly
1200 times faster when NB = 5.
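The two error criteria can be computed per time step as in the following sketch. The exact definitions used by the authors are not given in this section, so this assumes the usual node-wise RMSE and the Pearson correlation coefficient over the domain; the function name and array layout are hypothetical.

```python
import numpy as np

def rmse_and_correlation(full, reduced):
    """Compare two solution histories of shape (n_nodes, n_steps).

    Returns (rmse, corr), each of shape (n_steps,), computed over the
    spatial nodes at every time step.
    """
    diff = reduced - full
    rmse = np.sqrt(np.mean(diff ** 2, axis=0))
    # Pearson correlation over the domain (assumed definition).
    fc = full - full.mean(axis=0)
    rc = reduced - reduced.mean(axis=0)
    corr = np.sum(fc * rc, axis=0) / np.sqrt(
        np.sum(fc ** 2, axis=0) * np.sum(rc ** 2, axis=0)
    )
    return rmse, corr
```

Note that the correlation is insensitive to a uniform offset between the two fields, so the two criteria are complementary rather than redundant.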

4.2.2. Predicted Time Length
To overcome the decrease of accuracy with time, the best approach is to add updated
information in the prediction period. Observations add a significant amount of information
to the reduced model through new snapshots. Assume that we add to the old snapshots only
one new snapshot, obtained from observations at t = 200 minutes. The number of snapshots
is now 101. The prediction period is still the same, from t = 101 minutes to t = 500 minutes,
and the NB used is still 20. The updated results are shown in Figure 5. Compared with
Figure 3, all predicted results were significantly improved. The reduced model can thus be
calibrated with updated information from observations or new snapshots to significantly
increase the accuracy. The addition of observation data not only greatly increases the
accuracy but also leads to a better snapshot selection. It is worth mentioning that the
computational time changed only slightly with the additional snapshot; the computational
time is mainly determined by the NB used in the reduced model.
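The snapshot update can be sketched as appending one column to the snapshot matrix and recomputing the basis. This is a minimal illustration under the assumption that the basis is simply recomputed by SVD of the augmented ensemble; `update_basis` and `projection_error` are hypothetical helper names.

```python
import numpy as np

def projection_error(basis, state):
    """Norm of the part of `state` that the basis cannot represent."""
    return float(np.linalg.norm(state - basis @ (basis.T @ state)))

def update_basis(snapshots, new_snapshot, nb):
    """Append one snapshot column and recompute the leading nb POD vectors."""
    augmented = np.column_stack([snapshots, new_snapshot])
    U, _, _ = np.linalg.svd(augmented, full_matrices=False)
    return augmented, U[:, :nb]
```

Because NB is unchanged, the reduced system keeps the same size after the update, which matches the statement above that the computational time changes only slightly.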

4.2.3. Snapshot selection
The ability of a reduced model obtained from POD to accurately represent and, in practice,
replace the full model depends mainly on the manner in which the full-model snapshots are
obtained [Siade et al., 2010], because both the number of snapshots and the sampling time
intervals affect the accuracy of the reduced model. If the snapshots do not include enough
information, the reduced-order model will not provide accurate results no matter how many
bases are used. Therefore, as shown in Figure 1, to maximize the accuracy it is important to
optimize the snapshots through the interaction between the original full model and the
reduced-order model. The number of snapshots is optimal when the addition of another
snapshot does not add a significant amount of information to the reduced model [Siade et
al., 2010].
The sampling interval of snapshots from the solutions of the original model determines the
number of snapshots. If we sample every time step of the first 100 minutes, we have 100
snapshots; 50 snapshots are taken with a sampling interval of 2 minutes, and 25 snapshots
with a sampling interval of 4 minutes. The results using different numbers of snapshots
without changing NB are shown in Figure 6. The accuracy of the reduced model changes only
slightly. The correlation coefficients are still higher than 99.99%, which means that all three
ensembles of snapshots captured the dominant characteristics of the model. A small set of
snapshots is sufficient for the reduced model to perform accurately.
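The relation between sampling interval and ensemble size can be made concrete with a short sketch. It assumes the full-model solutions are stored as columns ordered in time (column j holds time step j+1) and that regular sampling keeps the last step of each interval; the source does not state which step within each interval was kept, and `sample_snapshots` is a hypothetical name.

```python
import numpy as np

def sample_snapshots(solutions, stride):
    """Keep every `stride`-th column of a (n_nodes, n_steps) solution array.

    Assumes column j holds time step j + 1, so stride = 4 keeps the
    states at steps 4, 8, 12, and so on.
    """
    return solutions[:, stride - 1::stride]
```

For 100 stored time steps, strides of 1, 2, and 4 give 100, 50, and 25 snapshots, the three ensembles compared in Figure 6.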
In subsection 4.2.2, the snapshot ensemble changed because new information was included, so
the selection of snapshots can be reevaluated. Figure 5 showed that the accuracy is further
enhanced with a selection of 101 snapshots. The importance of this new snapshot is obvious.
A large number of the old snapshots from the first 100 minutes are not necessary. Adopting
as many snapshots as possible in a certain time period does not guarantee a high level of
accuracy. It can be expected that the 100+1 snapshots could be reduced to 25+1 snapshots
without sacrificing accuracy. The result indicates that a snapshot from a new time period
contains much more information than a snapshot from an old time period.

4.2.4. Heterogeneous Case
Hydraulic conductivity fields in natural media are commonly heterogeneous and anisotropic.
Thus, it is necessary to test the POD method on a more “realistic” case with a variable
conductivity field. The conductivity field significantly affects the velocity field of the VDFST
system, which controls the solute advection and dispersion processes. In the case study, the
variability of the conductivity field is represented by the pattern and parameter values of
$K_f$ in Eq. (6) and (7).
In this case study, all other settings for both the full model and the reduced model are the
same as those in the homogeneous case. We consider two common heterogeneous cases, a
random field and a zonal field. From the homogeneous case, we note that the influences of
snapshots, bases, and predicted period length on prediction must be considered. Under these
field conditions, we investigate whether the reduced model via POD can still produce results
efficiently and accurately for a heterogeneous porous medium.
The first case employed a hydraulic conductivity field generated by a geostatistical approach.
The $K_f$ (hydraulic conductivity) field is assumed to be heterogeneous and anisotropic, where
$K_f$ is assumed to follow a Gaussian distribution, $N(864, 200)$. The anisotropy ratio
$K_{fx}/K_{fz}$ is 5 over the whole domain. The distribution of $K_f$ in the x-coordinate
direction, $K_{fx}$, is displayed in Figure 7. The range of the parameter values is 200 m/day
to 1400 m/day. Employing 20 bases from 100 snapshots for this case, the reduced model runs
approximately 250 times faster than the full model. The accuracy of the reduced model is
illustrated by the consistently good fit over time between the head and concentration
distributions of the full and the reduced models (Figures 8 - 9).
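A field with the stated statistics can be mimicked with the sketch below. Several assumptions are made that the text does not confirm: the second parameter of $N(864, 200)$ is read as a standard deviation, the values are drawn independently per cell (a geostatistical generator would normally impose spatial correlation), and samples are clipped to the reported range of 200 to 1400 m/day. The function name and the grid size are hypothetical.

```python
import numpy as np

def random_conductivity(nx, nz, mean=864.0, std=200.0, ratio=5.0,
                        lo=200.0, hi=1400.0, seed=0):
    """Draw an uncorrelated Gaussian K_fx field and derive K_fz from it.

    Assumptions (not from the paper): `std` is a standard deviation,
    cells are independent, and values are clipped to [lo, hi] m/day.
    """
    rng = np.random.default_rng(seed)
    kfx = rng.normal(mean, std, size=(nz, nx))
    kfx = np.clip(kfx, lo, hi)
    # Constant anisotropy ratio K_fx / K_fz over the whole domain.
    kfz = kfx / ratio
    return kfx, kfz
```

Fixing the seed makes the field reproducible, so the full and reduced models can be run on exactly the same realization.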
The second case employed a zonal heterogeneous medium. It is assumed that the $K_f$ field is
zonally distributed and anisotropic. The anisotropy ratio $K_{fx}/K_{fz}$ is still 5 over the
whole domain. The distribution of the $K_{fx}$ field is displayed in Figure 10. The confined
aquifer is divided into four zones. Two patterns are adopted to represent the hydraulic
conductivities. In this confined aquifer, whose depth is 1 m, the hydraulic conductivities
decrease with depth from zone 1 to zone 4 in case A, and increase with depth from zone 1 to
zone 4 in case B (Figure 10).
Whichever pattern is chosen, the same model reduction procedure is followed. To run the
reduced model efficiently while retaining accuracy, 25 snapshots are sampled from the first
100 minutes, that is, 1 snapshot every 4 minutes. Ten bases are then computed by SVD. The
spatial and temporal distributions of head and concentration over a period of 400 minutes are
then solved with the reduced model.
For case A, the reduced model runs nearly 950 times faster than the full model. Figure 11
shows the spatial distributions of hydraulic head and concentration at t = 500 minutes, which
are identical to the results from the full model.
For case B, the reduced model runs nearly 750 times faster than the full model. Figure 12
shows the spatial distributions of hydraulic head and concentration at t = 500 minutes, which
match the results from the original full model almost perfectly.

4.3. Model Reduction of the Elder Problem
In section 4.2, the modified Henry problem, a boundary-controlled system, was used to study
the accuracy and efficiency of the GFEM-POD reduced model. The GFEM-POD reduced model
is now applied to another classic VDFST problem, the Elder problem. Compared to the Henry
problem, the Elder problem has the characteristic that the calculation results are determined
only by correctly coupled governing equations, not by boundary forcing. As a result, the
Elder problem is influenced more by the nonlinearity induced by the variable-density
condition.
The Elder problem was originally designed for heat flow [Elder, 1967a; 1967b], but Voss and
Souza [1987] modified it into a density-dependent groundwater problem in which the fluid
density is a function of salt concentration. The Elder problem describes laminar fluid flow in
a closed rectangular aquifer and is commonly used to verify variable-density groundwater
codes [Simpson and Clement, 2003].
For the Elder problem, we only consider advection and diffusion, without dispersion. The
coupled governing equations are still Eq. (6) and (7). To show the significance of applying the
POD reduced-order model to the Elder problem, a modified Elder problem is used in which the
molecular diffusion coefficient (Dm) is doubled. For this modified Elder problem, the domain
is regularly discretized using $61 \times 31 = 1891$ nodes and 3600 triangular elements. A
uniform time interval of 5 days is used for a simulation period of 5 years. All other settings
are the same as in the standard Elder problem [Simpson and Clement, 2003]. This modified
Elder problem is used as the original full model. The five-year evolution of the dense fluid in
this confined aquifer is shown in Figure 13. With symmetric system settings, the distribution
of the plume lobes is also symmetric about the centerline of the aquifer.
The full MATLAB code solving the standard or modified Elder problem was adapted from the
code for the Henry problems. The CPU time in MATLAB to simulate 5 years with a time step
of 5 days is approximately 3 hours for the original full model.
In the previous section, the reduced model was applied only to predict the results for the
modified Henry problems. The performance of model reduction was verified for different
patterns of spatial variation, and the importance of snapshot selection and basis selection
was discussed.
To further investigate the quality of the reduced model for the Elder problem, two types of
calculation are performed: reproduction and prediction. For the reproduction calculation, the
simulation period of the reduced model is the same as the time period used in the full model
to generate snapshots. For the prediction calculation, the simulation period of the reduced
model extends beyond the time period used in the full model to generate snapshots. Based on
the error analysis in section 3.3, the errors of the reproduction test are addressed by equation
(25) and the errors of the prediction test are expressed by equation (26). From the error
analysis, the errors of the reproduction test can be controlled through optimal snapshot
selection and basis selection, which determine the (M+1)th eigenvalue. The errors of the
prediction test are determined not only by the eigenvalues but also by the selected time
period length and a case-specific coefficient. It is much more difficult to control the errors
of prediction tests: the accuracy decreases gradually as the prediction time increases.
Therefore, the accuracy and efficiency of the reduced model have to be discussed according to
the different objectives of reduced modeling.

4.3.1. Reproduction Calculation
The reproduction test repeats the forward simulation of the full model. The original full
model was run to simulate a time period of five years (1825 days) with a uniform time
interval of 5 days. 73 snapshots were chosen from the full model results for hydraulic heads
and concentrations, respectively. These 73 snapshots were sampled regularly, one every 25
days. From the SVD process, 11 bases are selected for the reduced model, which reproduces
the same time period with a time interval of 5 days, thus using 365 time steps. The reduced
model ran approximately 2500 times faster in MATLAB than the original full model. The
comparison of the dense fluid distribution is shown in Figure 13 at the end of the first year,
the third year, and the fifth year, respectively.
The accuracy of the reduced model is satisfactory according to Figure 13. The results of the
reduced model matched the results from the full model by over 99.9%. For the reproduction
test, the error can be very low because the important system information in this time period
is all available through optimal selection of snapshots. As long as the snapshots cover most
of the information, the reduced model can reproduce the head and concentration results at
any time inside this time period very accurately. The reproduction tests confirmed that the
reduced model can be used to replace the full numerical model for state estimation and
inverse modeling, which normally require repeated forward runs of the full model.

4.3.2. Prediction Calculation
The snapshots for the prediction tests were sampled from the full-model results of the first
year. For the first 365 days, we selected one snapshot every 5 days, and 11 bases were
selected from the 73 snapshots. We used the information from the first year to predict the
results in the next two years. The time interval used in the prediction test is 5 days. The
correlation of predicted concentrations for the following two years between the reduced
model and the full model is shown in Figure 14. The accuracy of the reduced model decreases
rapidly as the prediction time increases. At the end of the second year (number of time
steps = 146), the accuracy is nearly 99%. However, at the end of the third year (number of
time steps = 219), the accuracy is only 80%. Apparently, the reduced model cannot attain a
satisfactory prediction over a time period longer than one year for this modified Elder
problem if a decision maker requires the accuracy to remain higher than 99%.
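The idea of a usable prediction horizon under an accuracy requirement can be expressed as a small helper. This is only a sketch: it assumes a per-step correlation series like the one plotted in Figure 14 is available as an array, and `reliable_horizon` is a hypothetical name.

```python
import numpy as np

def reliable_horizon(correlation, threshold=0.99):
    """Number of leading time steps whose correlation stays at or above threshold."""
    correlation = np.asarray(correlation)
    below = np.flatnonzero(correlation < threshold)
    # If the series never drops below the threshold, every step is usable.
    return len(correlation) if below.size == 0 else int(below[0])
```

With the 5-day time step used here, multiplying the returned step count by 5 gives the horizon in days.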
More snapshots and more basis functions were then included in an attempt to obtain more
accurate predictions. However, the precision of the predicted results at the end of the third
year was still not satisfactory. As mentioned previously, the errors generated in the
prediction calculation inevitably increase as the predicted time length increases. The errors
cannot be reduced by choosing more POD bases produced from the unchanged ensemble of
snapshots. Compared with the Henry problem, the POD reduced-order method encountered
increased errors due to the stronger mathematical nonlinearity of the Elder problem.
In section 4.2.2, we proposed an approach to overcome the decrease of accuracy with time:
adding updated information in the prediction period. The principle is very similar to the
process of weather forecasting. The reduced model is kept running, but the snapshots used
also need to be updated. Observations at a certain time in the prediction period add a
significant amount of new information. As illustrated by Figure 5, new snapshots are obtained
from observations and are added to the old ensemble of snapshots. The updated snapshots are
then applied in the reduced model to increase the prediction accuracy. This updating is
conducted continuously to maintain the accuracy of the reduced model.
To investigate the efficiency of this method, another case is designed. The concentration
results of the reduced model from the previous prediction test are compared with the results
of the full model at the end of the 2nd year (Figure 15, (a) and (b)). The snapshots are all
sampled from the first year. Although the two contours fit each other well, the transport
depths of the lobes at both sides do not match well, as marked by the red dashed line in
Figure 15. It is assumed that we obtained a small set of observation data at a certain time
point early in the 2nd year, imitated here by the simulation of the original full model. A new
snapshot is generated from the observation data and included in the old snapshots. With the
updated snapshots, we reran the reduced model to predict results in the same time period.
The simulation results are clearly improved (Figure 15, (c)).
The importance of updating snapshots indicates again that the accuracy of the reduced model
depends strongly on the time period in which the full-model snapshots are taken, as discussed
in section 3.3. In practice, the observations need to be filtered and weighted before they are
adopted in the reduced model [Siade et al., 2010].

5. Conclusion
In this study, we developed a POD approach to efficiently simulate a coupled nonlinear
subsurface flow and transport process. An integrated model reduction methodology was
developed by combining POD with the GFEM, and is therefore referred to as the GFEM-POD
method. The GFEM-POD method can reduce the dimension of the stiffness matrices and
forcing vectors in the full finite element numerical model to a very small size. The reduced
dimension depends on the selected number of basis functions.
This method is efficient because the reduced-order model represents new states in terms of
the dominant basis vectors generated from a subset of old states. The simulations of the
reduced-order model are performed in a low-dimensional space that depends on the proper
decomposition of the model states (hydraulic head and solute concentration) in space and time.
We applied this procedure to two benchmark VDFST problems with various scenarios. The
results of these case studies indicate that the GFEM-POD reduced-order model can reproduce
and predict the full model results of the spatial distributions of both hydraulic head and
solute concentration very accurately. The computational time required for the reduced-order
model is dramatically reduced compared to that of the full model simulation. The calculation
accuracy depends strongly on the sampling and updating strategy of the full-model snapshots.
The selected snapshots further determine how many basis functions should be applied to
achieve satisfactory results in the reduced-order model. The optimal selection of snapshots
and basis functions is crucial for the application of POD and should be considered carefully
in light of the model’s mathematical and parametric structures. We also observed that the
POD approach is less robust for model prediction than for model reproduction. The
reduced-order model encounters significant calculation errors for long-term prediction. This
phenomenon is more pronounced when the problem studied is highly nonlinear
mathematically. An effective way to relieve this issue is to update the snapshots continuously
to assimilate new information from observations or experiments.
Building on the present study, future work will focus on the development of the adjoint
model for optimal parameter estimation (e.g., the freshwater hydraulic conductivity tensor)
with reduced-order modeling, and on the application of the GFEM-POD method to other
coupled and nonlinear hydrogeological models.

References
Abarca, E., J. Carrera, X. Sanchez-Vila, and M. Dentz (2007), Anisotropic dispersive Henry
problem, Advances in Water Resources, 30, 913-926.
Aquino, W., J. C. Brigham, C. J. Earls and N. Sukumar (2009), Generalized finite element
method using proper orthogonal decomposition, International Journal for Numerical
Methods in Engineering, 79(7), 887-906.
Bear, J. (1999), Mathematical modeling of seawater intrusion, in Seawater Intrusion into Coastal
Aquifers, edited by J. Bear, et al., pp. 127-161, Kluwer Academic Publishers.
Cao, Y., J. Zhu, Z. Luo and I. M. Navon (2006), Reduced order modeling of the upper tropical
Pacific ocean model using proper orthogonal decomposition, Computers & Mathematics with
Applications, 52(8–9), 1373–1386.
Cardoso, M. A. and L. J. Durlofsky (2010), Development and application of reduced-order
modeling procedures for subsurface flow simulation, Journal of Computational Physics, 229,
681-700.
Chatterjee, A. (2000), An introduction to the proper orthogonal decomposition, Current Science,
78(7), 808-817.
Chen, X., I. M. Navon and F. Fang (2011), A dual weighted trust-region adaptive POD 4D-Var
applied to a finite-element shallow water equations model, International Journal for
Numerical Methods in Fluids, 65(5), 520-541.
Di, Z., Z. Luo, Z. Xie, A. Wang and I. M. Navon (2011), An optimizing implicit difference
scheme based on proper orthogonal decomposition for the two-dimensional unsaturated soil
water flow equation, International Journal for Numerical Methods in Fluids, in early view.
Diersch, H. -J. G. and O. Kolditz (2002), Variable-density flow and transport in porous media:
approaches and challenges, Advances in Water Resources, 25(8-12), 899–944.
Elder, J. W. (1967a), Steady free convection in a porous medium heated from below, J. Fluid
Mech., 27, 29–50.
Elder, J. W. (1967b), Transient convection in a porous medium, J. Fluid Mech., 27, 609–623.
Fang, F., C. C. Pain, I. M. Navon, M. D. Piggott, G. J. Gorman, P. E. Farrell, P. A. Allison, and
A. J. H. Goddard (2008), A POD reduced-order 4D-Var adaptive mesh ocean modeling approach,
International Journal for Numerical Methods in Fluids, 60(7), 709-732.
Golub, G. H. and C. F. Van Loan (1996), Matrix Computations, 3rd Edition, Johns Hopkins Univ.
Press, Baltimore, Maryland.
Guo, W., and C. D. Langevin (2002), User’s guide to SEAWAT: A computer program for
simulation of three-dimensional variable-density ground-water flow, U.S. Geological Survey
Techniques of Water-Resources Investigations, Book 6, chapter A7, 77 p.
Henry, H. R. (1964), Effects of dispersion on salt encroachment in coastal aquifers, U.S.
Geological Survey Water-Supply Paper, 1613-C, C71-C84.
Holmes, P., J. L. Lumley and G. Berkooz (1996), Turbulence, Coherent Structures, Dynamical
Systems and Symmetry, Cambridge University Press, Cambridge.
Khalil, M., S. Adhikari, and A. Sarkar (2007), Linear system identification using proper
orthogonal decomposition, Mechanical Systems and Signal Processing, 21(8), 3123-3145.
Kunisch, K., and S. Volkwein (2002), Galerkin proper orthogonal decomposition methods for a
general equation in fluid dynamics, SIAM Journal on Numerical Analysis, 40(2), 492-515.
Li, X., B. X. Hu, W. C. Burnett, I. R. Santos and J. P. Chanton (2009), Submarine ground water
discharge driven by tidal pumping in a heterogeneous aquifer, Ground Water, 47(4), 558-568.
Li, X. (2010), Model simulation and reduction of variable-density flow and solute transport using
proper orthogonal decomposition, Ph.D. Thesis, Department of Earth, Ocean and
Atmospheric Science, Florida State University, Tallahassee, Florida.
Lumley, J. L. (1967), in Atmospheric turbulence and radio wave propagation, edited by A.
Yaglom and V. Tatarski, pp. 166–178, Nauka, Moscow.
Meyer, M. and H. G. Matthies (2003), Efficient model reduction in non-linear dynamics using
the Karhunen-Loève expansion and dual-weighted-residual methods, Computational
Mechanics, 31, 179–191.
Navon, I. M. (1979), Finite element simulation of the shallow-water equations model on a
limited-area domain, Appl. Math. Modeling, 3, 337-348.
Navon, I. M. and U. Muller (1979), FESW - A finite-element FORTRAN IV program for solving
the shallow-water equations, Advances in Engineering Software, 1, 77-86.
Pinnau, R. (2008), Model reduction via proper orthogonal decomposition, in Model Order
Reduction: Theory, Research Aspects and Applications, edited by W. H. A. Schilder and H.
van der Vorst, pp. 96-109, Springer.
Ravindran, S. S. (2002), Adaptive reduced-order controllers for thermal flow system using proper
orthogonal decomposition, SIAM Journal on Scientific Computing, 23(6), 1924–1942.
Reis, T. and T. Stykel (2007), Stability analysis and model order reduction of coupled systems,
Mathematical and Computer Modelling of Dynamical Systems, 13(5), 413-436.
Robinson, B. A., Z. Lu, and D. Pasqualini (2009), Simulating solute transport in porous media
using model reduction techniques, submitted to Advances in Water Resources.
Sanz, E., and C. I. Voss (2006), Inverse modeling for seawater intrusion in coastal aquifers:
Insights about parameter sensitivities, variances, correlations and estimation procedures
derived from the Henry problem, Advances in Water Resources, 29, 439-457.
Segol, G., G. F. Pinder and W. G. Gray (1975), A Galerkin-finite element technique for
calculating the transient position of the saltwater front, Water Resour. Res., 11, 343–7.
Siade, A. J., M. Putti and W. W.-G. Yeh (2010), Snapshot selection for groundwater model
reduction using proper orthogonal decomposition, Water Resour. Res., 46, W08539.
Simpson, M. J. and T. P. Clement (2003), Theoretical analysis of the worthiness of Henry and
Elder problems as benchmarks of density-dependent groundwater flow models, Advances in
Water Resources, 26, 17-31.
Simpson, M. J. and T. P. Clement (2004), Improving the worthiness of the Henry problem as a
benchmark for density-dependent groundwater flow models, Water Resour. Res., 40,
W01504.
van Doren, J. F. M., R. Markovinovic and J. D. Jansen (2006), Reduced-order optimal control of
water flooding using proper orthogonal decomposition, Computational Geosciences, 10,
137–158.
Vermeulen, P. T. M., A. W. Heemink, and C. B. M. te Stroet (2004a), Low-dimensional modeling
of numerical groundwater flow, Hydrological Processes, 18(8), 1487-1504.
Vermeulen, P. T. M., A. W. Heemink, and C. B. M. te Stroet (2004b), Reduced models for linear
groundwater flow models using empirical orthogonal functions, Advances in Water
Resources, 27(1), 57-69.
Vermeulen, P. T. M., A. W. Heemink, and J. R. Valstar (2005), Inverse modeling of groundwater
flow using model reduction, Water Resour. Res., 41(6), W06003.
Vermeulen, P. T. M., and A. W. Heemink (2006a), Model-reduced variational data assimilation,
Monthly Weather Review, 134(10), 2888-2899.
Vermeulen, P. T. M., C. B. M. te Stroet, and A. W. Heemink (2006b), Model inversion of
transient nonlinear groundwater flow model using model reduction, Water Resour. Res.,
42(9), W09417.
Voss, C. I. and W. R. Souza (1987), Variable density flow and solute transport simulation of
regional aquifers containing a narrow freshwater-saltwater transition zone, Water Resour.
Res., 23(10), 1851-1866.
Voss, C. I., and A. M. Provost (2002), SUTRA: A model for saturated-unsaturated,
variable-density ground-water flow with solute or energy transport, U.S. Geological Survey
Water-Resources Investigations Report 02-4231, 290 p.
Xue, Y. and C. Xie (2007), Numerical Simulation for Groundwater, 451 pp., Science, Beijing.
Zheng, D., K. A. Hoo, and M. J. Piovoso (2002), Low-order model identification of distributed
parameter systems by a combination of singular value decomposition and the
Karhunen-Loève expansion, Industrial & Engineering Chemistry Research, 41(6), 1545–1556.

Table Captions
Table 1. Computation times of the reduced-order model for the homogeneous case with different
NB to predict 400 time steps.

Number of Bases (NB)    Computation Time (seconds)
1                       0.125
2                       0.350
5                       0.880
10                      1.820
15                      3.250
20                      4.900

Figure 1. Methodology for constructing a reduced-order model.
[Flowchart; box and arrow labels: Original Full Model (GFEM); Numerical Formulation;
Snapshots; SVD; Basis Functions; Selected Bases; Galerkin Projection; Reduced-Order Model;
Reconstruction; Results; Snapshots Optimization; Bases Optimization]

Figure 2. (Top) The percentage of total energy of head extracted as a function of the number of
eigenvalues for the homogeneous case; (Bottom) The percentage of total energy of concentration
extracted as a function of the number of eigenvalues for the homogeneous case.
[Plots; x-axis: Number of Eigenvalues (0 to 20); y-axis: Percentage of the Total Energy
(top panel 99.95% to 100.00%, bottom panel 75% to 100%)]
749
750
Figure 3. RMSE (Top) and correlation (Bottom) of predicted concentrations between the 751
reduced-order model and the original full model for the homogeneous case using different 752
number of bases from 100 snapshots. 753
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
1.8
100 150 200 250 300 350 400 450 500
Number of Predicted Time Steps
RM
SE
of C
once
ntra
tion
NB = 5NB = 10NB = 15NB = 20
0.994
0.995
0.996
0.997
0.998
0.999
1.000
100 150 200 250 300 350 400 450 500
Number of Predicted Time Steps
Cor
rela
tion
of C
once
ntra
tion
NB = 5NB = 10NB = 15NB = 20
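
The two error measures in Figure 3 can be sketched as follows. This is a minimal illustration, assuming the correlation is the Pearson coefficient computed over all nodes at a given time step, which this section does not state explicitly; the function names and the concentration values are hypothetical.

```python
import math

def rmse(pred, ref):
    """Root-mean-square error between two equally long sequences."""
    n = len(ref)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / n)

def pearson(pred, ref):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(ref)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)

# Hypothetical nodal concentrations (kg/m^3) at one time step.
full_model = [0.0, 5.0, 10.0, 20.0, 35.0]
reduced_model = [0.5, 4.5, 10.5, 19.5, 35.5]

print("RMSE:", rmse(reduced_model, full_model))
print("correlation:", pearson(reduced_model, full_model))
```

Evaluating both functions at each predicted time step would give curves of the kind described in the caption.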

Figure 4. Comparison of results between the reduced-order model (red dashed) and the original full model (blue dashed) for the
homogeneous case. (a) Predicted head distribution (m) at time t = 200 minutes; (b) Predicted head distribution (m) at time t = 500
minutes; (c) Predicted concentration distribution (kg/m³) at time t = 200 minutes; (d) Predicted concentration distribution (kg/m³) at
time t = 500 minutes.

Figure 5. RMSE of predicted concentrations between the reduced-order model and the original full model for the homogeneous case
with the addition of a new snapshot at t = 200 minutes (red), compared with the previous simulation without new snapshots (black).
[Plot: x-axis, Number of Predicted Time Steps (100 to 500); y-axis, RMSE of Concentration (0 to 0.15); legend, NB = 20 and NB = 20 with a new snapshot at t = 200 minutes.]

Figure 6. RMSE of predicted concentrations between the reduced-order model and the original full model for the homogeneous case
using different numbers of snapshots with the same NB = 20.
[Plot: x-axis, Number of Predicted Time Steps (100 to 500); y-axis, RMSE of Concentration (0 to 0.2); legend, 100 snapshots, 50 snapshots, 25 snapshots.]

Figure 7. Stochastically distributed hydraulic conductivity field used in the first heterogeneous case, with a Gaussian distribution,
N(864, 200).

Figure 8. Comparison of results between the reduced-order model (red dashed) and the original full
model (blue dashed) for the first heterogeneous case. (a) Predicted head distribution (m) at time t =
200 minutes; (b) Predicted head distribution (m) at time t = 500 minutes.

Figure 9. Comparison of results between the reduced-order model (red dashed) and the original full
model (blue dashed) for the first heterogeneous case. (Top) Predicted concentration distribution
(kg/m³) at time t = 200 minutes; (Bottom) Predicted concentration distribution (kg/m³) at time t =
500 minutes.

Figure 10. Cross-section diagrams of the two zonal patterns and parameter values
used in the second heterogeneous case. (A) Hydraulic conductivity decreases with depth; (B)
Hydraulic conductivity increases with depth.

Figure 11. Comparison of results between the reduced-order model (red dashed) and the original full
model (blue dashed) for Case A using the zonal approach. (Top) Predicted head distribution (m) at
time t = 500 minutes; (Bottom) Predicted concentration distribution (kg/m³) at time t = 500
minutes.

Figure 12. Comparison of results between the reduced-order model (red dashed) and the original full
model (blue dashed) for Case B using the zonal approach. (Top) Predicted head distribution (m) at
time t = 500 minutes; (Bottom) Predicted concentration distribution (kg/m³) at time t = 500
minutes.

Figure 13. Comparison of dense fluid distribution between the reduced-order model (right) and the original full model (left) in the
reproduction test. The concentration contour interval is 28 kg/m³.

Figure 14. Correlation of predicted concentrations between the reduced-order model and the
original full model in the prediction test for the next 2 years with 146 time steps.
[Plot: x-axis, Number of Predicted Time Steps (ticks at 73, 146, 219); y-axis, Correlation of Concentration (0.80 to 1.00).]

Figure 15. Predicted dense fluid distribution of the reduced-order model (a), the original full
model (b) and the updated reduced-order model (c) in the prediction test at the end of the 2nd
year. The concentration contour interval is 28 kg/m³.
[Panels a, b, c: horizontal axis 0 to 600, vertical axis 0 to 150.]