Source: pangea.stanford.edu/departments/ere/dropbox/scrf/documents/reports/...

Ensemble Kalman Filtering in Distance-Kernel Space

Kwangwon Park

Abstract

Current geostatistical simulation methods allow generating a realization that honors available data, such as hard and secondary data, under certain geological scenarios. However, it is difficult to simulate large models that honor highly nonlinear response functions. The objective of this study is to generate multiple realizations, all of which honor all available data. First, we generate a large ensemble of possible realizations describing the spatial uncertainty for given hard data. Second, using multidimensional scaling, we map these models into a low-dimensional space by defining a proper distance between these prior models. Next, kernel Karhunen-Loève expansion is applied to the realizations mapped in this metric space to parameterize them into short standard normal random vectors. We apply ensemble Kalman filtering to these parameterizations to update multiple realizations matching the same data. A back-transformation (the pre-image problem) allows the generation of multiple geostatistical models that match all data, hard and nonlinear response. The proposed framework has been successfully applied to generate multiple Gaussian models which honor hard data and dynamic response data. A dissimilarity distance that is highly correlated with the difference in dynamic data, combined with multidimensional scaling, provides a powerful tool for analyzing the path of optimization and the probability density of the inverse problem. Additionally, EnKF is boosted by reducing the ensemble size through kernel k-means clustering without any significant loss of ensemble statistics.

1 Introduction

Conditioning

Many geoscientists and petroleum engineers have attempted to build geological models for predicting reservoir performance and building development plans. One of the main issues when modeling a reservoir is "conditioning". A model should be consistent with all currently available data: geologic processes, outcrop, log, core, seismic survey, production history, etc. Most of the data are static, except for production history and 4-D seismic.

Presently, geostatistical algorithms are widely used to generate a model conditional to static data. Both two-point and multiple-point algorithms can handle hard and soft data under a certain geological scenario. Other methods, such as object-based or process-based methods, still have limitations on conditioning to those data.

Optimization

Conditioning to dynamic data (usually called history matching) using geostatistical simulations, however, is difficult because of the severe nonlinearity of the problem. Hence, we often regard it as an optimization problem which minimizes the difference between the observed data and the calculated output, the so-called objective function. Compared with various conventional optimization problems, this problem contains several tough challenges: ill-posedness, sparsity of data, a large number of model parameters, time-consuming forward simulation, and so forth. Various approaches to overcome those limitations have been proposed; they fall into two main categories: gradient-based and stochastic methods.

Gradient-based optimization methods, usually using sensitivity coefficients, are widely used because of their fast convergence. Although the approaches are well-defined, the fluctuating nature of the objective function leads to non-optimal solutions, such as a local minimum. In the presence of local minima, stochastic optimization techniques, such as simulated annealing (SA) and genetic algorithms, are better suited to reach the optimal solution. Although those methods can find a solution which is conditioned to dynamic data, they may not preserve static data conditioning as imposed by geostatistical algorithms.

The Probability Perturbation Method (PPM; Caers, 2003) and the Gradual Deformation Method (GDM; Hu, 2000) make it possible to update a model stochastically to condition to dynamic data as well as static data and geologic constraints. Both methods update a current model by combining it with a new possible model through a one-parameter optimization. The difference is that PPM deals with the probabilities and GDM with the random numbers used in geostatistical simulations (Caers, 2007). While PPM and GDM can provide a model conditional to static and dynamic data with geologic constraints, they need a relatively large number of forward simulations, each of which usually takes several hours to days. Moreover, they yield only one inverse solution per optimization; hence a large number of optimizations with different initial realizations is required to obtain multiple conditional realizations. Due to uncertainty and nonuniqueness in the problem, it is necessary to generate multiple conditional realizations.

Ensemble Kalman Filtering (EnKF)

In order to get multiple models, Ensemble Kalman Filtering (EnKF) has recently been applied and researched very actively. The Kalman Filter (KF) is a technique to obtain the specification of a linear dynamic system which accomplishes the prediction, separation, or detection of a random signal (Kalman, 1960). Though KF deals with a linear stochastic difference equation, the Extended Kalman Filter (EKF) can be applied when the relationship between the process and measurements is nonlinear (Welch and Bishop, 2004). However, EKF is not applicable to highly nonlinear systems. Evensen (1994) developed a modified Kalman filter, EnKF, for highly nonlinear problems. Due to its high performance and flexible applicability, EnKF has rapidly spread and has been effectively applied to a variety of fields, such as ocean dynamics, meteorology, hydrogeology, and so forth (Houtekamer and Mitchell, 1998; Reichle et al., 2002; Margulis et al., 2002; Evensen, 2003; Evensen, 2004).

Nævdal and Vefring (2002) brought EnKF into the history matching problem. In their earlier studies, EnKF was applied to characterize a near-wellbore reservoir. Since then, EnKF has been utilized to identify the detailed permeability distribution of an entire reservoir (Nævdal et al., 2003). Gu and Oliver (2004) updated the permeability and porosity of a full 3-dimensional (3-D) reservoir simultaneously in the PUNQ-S3 reservoir, a realistic synthetic reservoir used to verify the performance of history matching and uncertainty analysis. Gao et al. (2005) compared the randomized maximum likelihood method with EnKF. Liu and Oliver (2005) carried out EnKF with consideration of the geologic facies. Park et al. (2005) demonstrated the superior performance of EnKF in aquifer parameter identification compared to SA and GDM. Park et al. (2006) also verified the applicability of EnKF to a waterflooded reservoir with the methods of regeneration and selective use of the measurements. As history matching through EnKF showed good performance and numerous advantages, it has been researched actively in petroleum academia and industry (Zhang et al., 2005; Zafari and Reynolds, 2005; Skjervheim et al., 2005; Lorentzen et al., 2005).

The strong points of EnKF are: (1) It makes it possible to update a model efficiently and elaborately, since EnKF needs exactly one forward simulation per ensemble member, as opposed to hundreds to thousands of forward simulations with conventional methods. (2) Since EnKF provides optimal output from noisy input, we can avoid the deviation of the solution caused by measurement errors. (3) It handles any kind of reservoir property. (4) It utilizes all available kinds of measurement data simultaneously. (5) It is easily implemented with various forward simulators, without additional calculation or modification of the equations.

However, the limitations of EnKF are: (1) It does not preserve the geologic information; it deals with Gaussian models only. Therefore, EnKF cannot handle facies models, such as channel bed reservoirs, which are often generated by multiple-point geostatistics (MPS). (2) It often provides physically unreasonable values, such as pressure and saturation inconsistent with the permeability, or saturation values greater than 1.0 or smaller than 0.0. (3) It requires an ensemble size large enough to represent the uncertainty, and a large ensemble needs a large number of forward simulations.

KL expansion

Recently, Sarma (2006) developed a novel method to parameterize a geologic realization. In the method, a realization is parameterized by a relatively short Gaussian random vector through the Karhunen-Loève (KL) expansion of the empirical covariance of an ensemble of geologic realizations. The KL expansion makes it possible to generate new realizations which share the same covariance as the ensemble. In addition, in order to maintain higher-order moments as well as covariance, Sarma introduced a kernel into the parameterization. The parameterization is therefore done in a very high-dimensional kernel feature space, and a realization is obtained by solving a pre-image problem. The pre-image problem consists of yet another optimization problem to convert a realization from the feature space to the realization space. Consequently, a new realization is obtained by a nonlinear combination of an ensemble of realizations. Sarma optimized these relatively short parameters and found a solution conditioned to dynamic data by gradient-based optimization algorithms.
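As a sketch of the linear KL (PCA) step described above: given an ensemble, we can build new realizations sharing its empirical mean and covariance from the leading eigenpairs. The kernel version and the pre-image step are omitted, and all names and sizes below are illustrative, not Sarma's implementation.

```python
import numpy as np

def kl_parameterize(ensemble, n_keep):
    """KL (PCA) expansion of an ensemble's empirical covariance.

    ensemble: (n_cells, n_real) matrix, one realization per column.
    Returns the mean, the n_keep leading scaled eigenvectors, and a
    sampler mapping a short standard normal vector to a new realization.
    """
    mean = ensemble.mean(axis=1, keepdims=True)
    X = ensemble - mean
    # Economy-size SVD avoids forming the (n_cells x n_cells) covariance.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    n_real = ensemble.shape[1]
    # Columns of `basis` are eigenvectors scaled by sqrt(eigenvalue).
    basis = U[:, :n_keep] * (s[:n_keep] / np.sqrt(n_real - 1))
    def sample(xi):
        # xi: standard normal vector of length n_keep.
        return (mean + basis @ xi.reshape(-1, 1)).ravel()
    return mean.ravel(), basis, sample

rng = np.random.default_rng(0)
ens = rng.standard_normal((100, 50))        # toy 100-cell, 50-member ensemble
mean, basis, sample = kl_parameterize(ens, n_keep=10)
new_real = sample(rng.standard_normal(10))  # one new realization
```

Any new realization generated this way is a linear combination of the ensemble members; the kernel extension replaces this with a nonlinear combination.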

A kernel can be understood as a similarity measure, because a kernel is a dot product of feature vectors, and similar feature vectors yield a large dot product (Scholkopf and Smola, 2002). If this similarity from a kernel function can represent the similarity of dynamic data, optimization becomes easier and more efficient. However, the polynomial kernels used in Sarma (2007) are not well correlated with the difference in dynamic data, as will be shown later. Better and more relevant similarity measures are required for reservoir applications.

Distance

While a kernel is a similarity measure, a distance can be regarded as a dissimilarity measure. As a dissimilarity measure, Suzuki and Caers (2006) introduced a Hausdorff distance. They showed how a (static) distance between any two realizations that correlates with their difference in flow response can be used to search for history-matched models by means of efficient search algorithms, such as the neighborhood algorithm and the tree-search algorithm. This method was successfully applied to structurally complex reservoirs (Suzuki et al., 2006). In order for this method to work, the dissimilarity distance between any two realizations should be reasonably correlated with the dynamic data. To this end, Park and Caers (2007) proposed a connectivity distance which correlates well with the difference in dynamic data between two realizations.

Scheidt and Caers (2007) showed how a distance which is well correlated with dynamic data can be applied for uncertainty analysis in kernel feature space. Although we only know the distances between realizations in a distance space, if we use a radial basis function (RBF) kernel of the Euclidean distances calculated by multi-dimensional scaling of the distance measure, we can calculate a kernel function. Scheidt (2008) also shows that the RBF kernel based on a dissimilarity distance can be applied to the parameterization based on kernel KL expansion.

The objective of this research is to overcome the limitations of conventional EnKF by implementing it in distance-based kernel space. Further, this research shows how EnKF in distance-based kernel space generates multiple geologic realizations, all of which honor all available data. In the following chapter, the theoretical methodology is explained. Then a simple example is presented and discussed. This report focuses on ensemble Kalman filtering in distance-based kernel space. For details about the other procedures (distance, parameterization, kernel, and pre-image problem), refer to the Appendix or the report of Caers (2008).

2 Methodology

2.1 Kalman Filter

Prior to explaining EnKF, the Kalman filter is briefly introduced. KF consists of a set of mathematical equations that provides an efficient computational (recursive) means to estimate the state of a process, in a way that minimizes the mean of the squared estimation error (Welch and Bishop, 2004). KF considers the uncertainties in the model and simultaneously provides optimal model estimates from noisy measurements. Each time measurements are taken during the prediction process, KF carries out the correction process. Through the repetition of prediction and correction processes, KF helps the state converge to a value near the truth.

KF addresses the problem of estimating the state of a linear stochastic difference equation (Equation 1) with a measurement (Equation 2). In this system, we obtain measurements through a linear combination of the state vector, and the measurements contain measurement noise.

zt = A zt−1 + B ut−1 + wt−1 (1)
dt = Ht zt + vt (2)

where z represents the state vector, i.e. the state of the system which cannot be measured directly (for example, gridblock pressure and saturation), u the control input (for example, the boundary condition of the system), and w the process noise (for example, the estimated error from the forward simulator, which is A in KF). The subscript t represents the time step. d denotes the measurement vector (for instance, bottom hole pressure), and v the measurement noise (for instance, the noise when measuring the bottom hole pressure). Although we measure the same properties several times, the measurements are not exactly identical because of noise. H is called the measurement matrix operator (for example, if we measure permeability at some gridblocks, H becomes a matrix of 0s and 1s such that we can obtain the permeability values at those gridblocks from the state vector by d = Hz).
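As a toy illustration of the 0/1 measurement operator, d = Hz simply picks the measured entries out of the state vector (the numbers below are made up, not from the report):

```python
import numpy as np

# State vector: permeability at 5 gridblocks; we "measure" blocks 0 and 3.
z = np.array([120.0, 85.0, 240.0, 150.0, 60.0])
measured = [0, 3]

# H has one row per measurement, with a single 1 in the measured column.
H = np.zeros((len(measured), z.size))
for row, col in enumerate(measured):
    H[row, col] = 1.0

d = H @ z   # d = Hz extracts the measured values: [120., 150.]
```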

The estimation error is defined by Equations 3 and 4.

e− = z − ẑ− (3)
e = z − ẑ (4)

where e is the estimation error. The superscript '−' indicates a vector for the a priori state, and no superscript denotes the a posteriori state. A priori here means before assimilating the measurements and a posteriori after assimilating them. The hat denotes an estimated state and no hat denotes the true state.

Four error covariances are defined by Equations 5 to 8.

Q = E[wwᵀ] (5)
R = E[vvᵀ] (6)
P− = E[e−e−ᵀ] (7)
P = E[eeᵀ] (8)

where Q is the model error covariance, R the measurement noise covariance, and P the estimation error covariance. E[·] represents the expectation operator.

In deriving the equations for KF, we begin with the goal of finding an equation that computes an a posteriori state estimate as a linear combination of the a priori estimate and a weighted difference between an actual measurement and the prediction, as shown in Equation 9, the assimilation equation:

ẑt = ẑ−t + Gt (dt − H ẑ−t) (9)

where G is the Kalman gain, which is determined to minimize the estimation error covariance. Substituting Equations 4 and 9 into Equation 8, we get Equation 10.

Pt = E[{zt − ẑ−t − Gt (dt − H ẑ−t)} {zt − ẑ−t − Gt (dt − H ẑ−t)}ᵀ] (10)

Expanding Equation 10 and assuming that the measurement errors are independent of the estimation error (Equation 11), we obtain Equation 12.

E[vt e−ᵀt] = E[e−t vᵀt] = 0 (11)

Pt = (I−GtH)P−t (I−GtH)ᵀ + GtRGᵀt (12)

Differentiating Equation 12 with respect to the gain and setting the derivative to zero to minimize the estimation error covariance, we deduce the Kalman gain equation, Equation 13. Additionally, the a posteriori estimation error covariance is calculated from the a priori estimation error covariance directly (Equation 14) and vice versa (Equation 15).

Gt = P−t Hᵀ (HP−t Hᵀ + R)−1 (13)

Pt = (I − GtH) P−t (14)
P−t = A Pt−1 Aᵀ + Q (15)

Equations 1 and 15 represent the forecast (predict) step, or time update, and Equations 13 and 14 represent the assimilation (correct) step, or measurement update. After each time and measurement update, the process is repeated with the previous a posteriori estimate used to project, or predict, the new a priori estimate. This recursive nature is one of the very appealing features of the KF (Welch and Bishop, 2004).
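The predict/correct recursion of Equations 1, 9, and 13-15 can be sketched for a scalar system; the system constants, noise levels, and true state below are hypothetical, chosen only to show the filter converging:

```python
import numpy as np

# Toy 1-D system: z_t = a z_{t-1} + w,  d_t = h z_t + v (Eqs 1 and 2),
# with made-up values for a, h, Q, R and a static true state.
a, h, Q, R = 1.0, 1.0, 1e-4, 0.25
rng = np.random.default_rng(1)

z_true = 5.0
z_hat, P = 0.0, 1.0                # initial estimate and its error variance
for t in range(50):
    # Forecast step (Eqs 1 and 15): time update of estimate and covariance.
    z_hat_prior = a * z_hat
    P_prior = a * P * a + Q
    # Noisy measurement of the truth.
    d = h * z_true + rng.normal(0.0, np.sqrt(R))
    # Assimilation step (Eqs 13, 9, 14): gain, state update, covariance update.
    G = P_prior * h / (h * P_prior * h + R)
    z_hat = z_hat_prior + G * (d - h * z_hat_prior)
    P = (1.0 - G * h) * P_prior

# z_hat now lies close to the true state 5.0, and P has shrunk accordingly.
```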

2.2 Ensemble Kalman Filter

EnKF consists of two steps: the forecast step and the assimilation step. In history matching, the forecast step is the reservoir simulation step from the current state to the next measurement time. The assimilation step is a correction step to honor the measurements at that time. EnKF is a recursive data processing algorithm that updates all the variables simultaneously through repetition of the forecast and assimilation steps.

A nonlinear difference equation that calculates the state at time step t from that at time step t − 1 is represented by Equation 16.

zt = f (zt−1, ut−1) (16)

where the operator f( ) denotes the nonlinear difference equation. In this case, the state vector consists of the parameterized permeability, pressure, and water saturation at each gridblock and the dynamic data, as in Equation 17.

z = [ykᵀ, ypᵀ, ySwᵀ, dᵀ]ᵀ (17)

where yk, yp, and ySw represent the parameterized permeability, pressure, and water saturation vectors of each gridblock in kernel feature space. In the simplest parameterization, each gridblock permeability, pressure, and water saturation are themselves the parameters.

Equation 18 shows the measurement at time step t, which contains measurement noise. The measurement noise is assumed to be white noise. The measurement matrix operator is composed of 0s and 1s.

dt = Hzt + vt (18)

The measurement noise is generated by Equation 19.

vt = v̄t + δt (19)

where the bar denotes the mean of the measurement noise and δ represents the white noise. The measurement error covariance is calculated by Equation 20. If we assume that the measurement errors at different locations are independent, while the errors of different properties at the same location (for example, the porosity and the permeability at the same location) may be correlated, then the measurement error covariance is a block diagonal matrix. If we further assume that the measurement error of any variable at one measurement location is independent of that at any other location, even for the same property, the measurement error covariance is a diagonal matrix. In other words, the measurement error covariance reduces to the measurement error variances.

R = E[vvᵀ] = (1/NR) Σj vj vᵀj (20)
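The block diagonal versus diagonal structure of R described above can be illustrated with made-up variances; the property names and numbers here are ours, for illustration only:

```python
import numpy as np

# Hypothetical error covariance for porosity and permeability measured
# at one location: variances 0.04 and 0.25, with some cross-correlation.
loc_cov = np.array([[0.04, 0.01],
                    [0.01, 0.25]])

# Block diagonal R for two locations: errors independent across locations,
# but the two properties at the same location correlated.
R_block = np.zeros((4, 4))
R_block[:2, :2] = loc_cov
R_block[2:, 2:] = loc_cov

# Fully independent errors: R collapses to a diagonal matrix of variances.
R_diag = np.diag([0.04, 0.25, 0.04, 0.25])
```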

The aim of the assimilation step is to minimize the estimation error covariance. The estimation errors and the estimation error covariances are defined by Equations 21 and 22 and Equations 23 and 24, respectively.

e− = z − ẑ− (21)
e = z − ẑ (22)

P− = E[e−e−ᵀ] = (1/NR) Σj e−j e−ᵀj (23)

P = E[eeᵀ] = (1/NR) Σj ej eᵀj (24)

where NR indicates the ensemble size. That is to say, the assimilation step updates an a priori state to an a posteriori state in which the estimation error covariance is minimized. In EnKF, the true state is assumed to be the mean of the ensemble members. Note that the ensemble size should be large enough to represent the underlying uncertainty.

The state vector that minimizes the estimation error covariance is obtained from Equations 25 and 26.

ẑt = ẑ−t + Gt (dt − H ẑ−t) (25)

Gt = P−t Hᵀ (HP−t Hᵀ + R)−1 (26)

Once we obtain an a priori estimate through forward simulation, we can acquire the a posteriori estimate after some basic matrix calculations. Assimilation can be conducted whenever measurements are available.

Summarizing the EnKF process (Figure 1): first, we generate an initial ensemble based on initial measurements at time t0. Second, we conduct the time update to acquire the next production data at time t1 through a reservoir simulator; this is the prediction step. When the next measurements become available, we carry out the measurement update by calculating the Kalman gain; this is the assimilation step. From the corrected state, we conduct the prediction step again until the next measurements are obtained at time t2. Likewise, whenever we get measurements, we execute the correction. The update proceeds with iterative prediction and correction steps.
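A minimal EnKF predict/assimilate loop in the spirit of Equations 23, 25, and 26 can be sketched with a stand-in forward model; this is a generic textbook-style sketch with hypothetical dimensions and numbers, not the report's reservoir implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
NR, n = 200, 3                       # ensemble size and state length (toy values)
truth = np.array([1.0, -2.0, 0.5])   # hypothetical true state
H = np.array([[1.0, 0.0, 0.0]])      # measure only the first state entry
R = np.array([[0.01]])               # measurement noise variance

Z = rng.standard_normal((n, NR))     # initial ensemble, one member per column

def forecast(Z):
    # Stand-in for the nonlinear forward model f of Equation 16 (illustrative).
    return Z + 0.1 * np.tanh(Z)

for _ in range(10):
    # Prediction step: advance every ensemble member (and the truth) in time.
    Z = forecast(Z)
    truth = forecast(truth)
    d = H @ truth + rng.normal(0.0, 0.1)
    # Equation 23: ensemble estimate of the a priori error covariance.
    A = Z - Z.mean(axis=1, keepdims=True)
    P = (A @ A.T) / (NR - 1)
    # Equations 26 and 25: Kalman gain, then assimilation of perturbed observations.
    G = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    D = d[:, None] + rng.normal(0.0, 0.1, size=(1, NR))
    Z = Z + G @ (D - H @ Z)

est = Z.mean(axis=1)   # the ensemble mean tracks the observed component
```

Note that each cycle costs exactly one forward-model evaluation per ensemble member, which is the efficiency argument made for EnKF above.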

3 The proposed workflow

Based on the theory stated above, the proposed procedure for conditioning an ensemble to dynamic data under realistic geologic scenarios is as follows (Figure 2):

1. Generate the initial ensemble (realization space)

First we generate an initial ensemble. The initial ensemble should include realizations that honor geologic information and are conditioned to all available static data, that is, hard and soft data. To do this, we can choose a proper geostatistical algorithm, such as SGSIM, SISIM, DSSIM, or SNESIM and FILTERSIM if using a training image. When generating the ensemble, we may have to consider the uncertainty in the static data. For example, if our geologic information is uncertain, we can use multiple training images or variogram models.

2. Calculate the dissimilarity distances (distance space to metric space)

From the initial ensemble, we calculate the dissimilarity distances and construct a distance matrix. At this step, it is important for the distances to be correlated with the dynamic data that we want to condition to. If needed, we can apply multidimensional scaling to lower the dimension and get Euclidean distances, which make it possible to use RBF kernels. The dissimilarity distance employed in this research is explained in the Appendix.

3. Calculate the kernel matrix (to feature space)

Based on the Euclidean distances, we calculate the kernel matrix. An RBF kernel matrix is easily calculated, but a proper kernel should be chosen cautiously.

4. Parameterize the initial ensemble (to parameterization space)

After obtaining the eigenvalues and eigenvectors of the kernel matrix, each realization of the initial ensemble is parameterized into relatively short Gaussian random variables. The parameterization is actually obtained from the eigenvectors of the kernel matrix directly (see the Appendix).

5. Ensemble Kalman filtering (in parameterization space)

The ensemble Kalman filtering is done on the parameterization of the initial ensemble. Since the parameterizations are low-dimensional Gaussian random vectors, we could apply various optimization methods, such as gradient-based methods using the sensitivity coefficients, the probability perturbation method, the gradual deformation method, the ensemble Kalman filter, and so on. Since we already have an ensemble, EnKF can be applied effectively and provides multiple realizations which show the same dynamic data response.

6. Solve the pre-image problems (to realization space)

Now the optimized parameterizations are converted back into the realization space. Using a proper minimization algorithm, such as a fixed-point iteration, we solve the pre-image problem for each optimized parameterization. The pre-image problem is discussed in another report (Caers, 2008).

7. Analyze multiple realizations

Finally, we obtain multiple realizations which satisfy all available data and geologic scenarios. We can use these multiple realizations for a variety of purposes. Since we generate an initial ensemble reflecting the uncertainty after conditioning to the static data acquired so far, these final multiple realizations indicate the uncertainty (a posteriori) after conditioning to static and dynamic data.
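Steps 2-4 of the workflow (distances, multidimensional scaling, RBF kernel, and kernel eigendecomposition) can be sketched end-to-end on toy data. The ensemble, the bandwidth choice, and all dimensions below are assumptions of ours; steps 5-6 are omitted because they require a flow simulator and a pre-image solver:

```python
import numpy as np

rng = np.random.default_rng(3)

# Step 1 (stand-in): a toy "initial ensemble" of NR realizations (columns).
NR = 40
ensemble = rng.standard_normal((64, NR))

# Step 2: squared pairwise distances, then classical MDS by double
# centering and eigendecomposition of the centered matrix.
sq = (ensemble**2).sum(axis=0)
D2 = sq[:, None] + sq[None, :] - 2.0 * ensemble.T @ ensemble
J = np.eye(NR) - np.ones((NR, NR)) / NR
B = -0.5 * J @ D2 @ J
evals, evecs = np.linalg.eigh(B)
order = np.argsort(evals)[::-1]
k = 2                                   # embed in 2-D, as in the MDS maps
coords = evecs[:, order[:k]] * np.sqrt(np.maximum(evals[order[:k]], 0.0))

# Step 3: RBF kernel matrix from the Euclidean MDS coordinates.
d2 = ((coords[:, None, :] - coords[None, :, :])**2).sum(-1)
sigma2 = np.median(d2)                  # bandwidth choice is ours
K = np.exp(-d2 / (2.0 * sigma2))

# Step 4: parameterize via the eigendecomposition of the kernel matrix
# (kernel KL expansion); each realization maps to a short vector (a row).
kvals, kvecs = np.linalg.eigh(K)
params = kvecs[:, -5:] * np.sqrt(np.maximum(kvals[-5:], 0.0))

# Steps 5-6 (EnKF on `params`, then the pre-image back-transformation)
# require a forward simulator and are omitted here.
```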

4 Example

4.1 Given information

• Geometry

The target reservoir is 310 ft × 310 ft × 10 ft.

• Geology

The permeability distribution is modeled with an anisotropic spherical variogram (NE50 direction; correlation lengths of 200 ft (major direction) and 50 ft (minor)). The permeability is log-normally distributed, and the mean and standard deviation of log-permeability are 4.0 and 1.0, respectively.

• Hard data

Two wells are available in a quarter five-spot pattern. We are given the permeability values at the two wells as hard data. The injector and producer are located at (45 ft, 45 ft) and (275 ft, 275 ft), respectively. The permeability values at both wells are 150 md.

• Dynamic data

For 3 years, watercut is measured at the producer every two months. The measured watercut values contain Gaussian noise (noise level: 10%). Figure 3 shows the watercut data acquired.

4.2 Ensemble Generation

Based on the given geometry, the reservoir is discretized into 961 (31 × 31) gridblocks (10 ft × 10 ft × 10 ft). Based on the given geology and hard data, 1,000 realizations are generated by SGSIM with conditioning to hard data (permeability values at two locations). Figure 4 shows the log-permeability of 6 out of the 1,000 realizations. Figure 5 represents the mean and conditional variance of the log-permeability of the 1,000 initial realizations (the initial ensemble).

Figure 6 depicts the simulated watercut curves for the 1,000 initial realizations and the measured watercut values. Conditioning only to hard data provides significantly biased realizations in terms of flow response: the measured watercut values lie at almost the P90 of the initial realizations. The watercut curves for the initial realizations may be called the a priori flow responses, which are useful when compared with the a posteriori flow responses later.

4.3 Distance calculation and multi-dimensional scaling

Dissimilarity distances between all possible pairs of the 1,000 initial realizations have been calculated. In this example, the connectivity distance has been used (see the Appendix). Figure 7 represents the scatter plot of the dissimilarity distance against the difference in actual dynamic data. The high linear correlation means that the distance can represent the difference in dynamic data without conducting any time-consuming forward simulation. One million (1,000 × 1,000) distance calculations are one to 100 times faster than a single forward simulation.
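The correlation check behind Figure 7 amounts to computing a correlation coefficient between per-pair distances and per-pair response differences. The data below are fabricated so that the relationship is built in by construction; in the report, the high correlation is an empirical property of the connectivity distance:

```python
import numpy as np

rng = np.random.default_rng(4)
n_pairs = 500

# Toy stand-ins: one dissimilarity distance per pair of realizations and
# the corresponding difference in dynamic response, related by construction.
distance = rng.uniform(0.0, 1.0, n_pairs)
response_diff = 2.0 * distance + rng.normal(0.0, 0.1, n_pairs)

# A high linear correlation justifies screening realizations by distance
# instead of running a forward simulation for every pair.
rho = np.corrcoef(distance, response_diff)[0, 1]
```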

Figure 8 displays the 1,000 initial realizations in 2D MDS space. It is easy to see that there are two groups of realizations in MDS space: a narrow line on the left and a wide plume on the right. Figure 9 demonstrates the nature of the two groups: in the first group (the right-side wide plume), the injector is connected to the producer by a high-permeability region (hot spot); in the second group (the left-side narrow line), the two wells are disconnected. Furthermore, the realizations located in the far-left part of the narrow line region are more disconnected than those located near the center (near the intersection of the two regions). The same holds for the right wide plume region.

Figure 10 exhibits the difference in dynamic data in 2D MDS space. Since we plot the difference in dynamic data relative to one realization (indicated by ×), the difference in dynamic data can be thought of as the objective function where the reference is the realization indicated by ×. Figure 10 shows that the objective function in 2D MDS space is smoothly varying and does not have any local minimum, which is a favorable case for fast optimization.

4.4 Ensemble Kalman Filtering

EnKF has been implemented with 300 (randomly chosen) out of the 1,000 initial realizations. After the eigenvalue decomposition of the Gaussian kernel matrix based on the connectivity distance, standard Gaussian parameterizations are assigned to the 300 realizations. Detailed parameterization can be found in the Appendix and in Caers (2008).

At 260 days, a first correction is executed based on the measurement. Reservoir simulations (the prediction step) from 0 days to 260 days are run for the 300 initial realizations. Figure 11 shows the watercut curves calculated by the reservoir simulations and the measured watercut at 260 days. None of the realizations has shown breakthrough yet, so all the watercut values simulated up to 260 days are zero. The measured watercut is also zero, which means there is no need to correct the state vectors. Figure 12 displays the update at 260 days in 2D MDS space. As expected, none of the realizations moves in the 2D MDS map.

At 520 days, a second correction is performed based on the measurement. Reservoir simulations (the prediction step) from 260 days to 520 days are run for the 300 realizations, starting from the pressure and water saturation updated at 260 days. Figure 13 shows the watercut curves calculated by the reservoir simulations and the measured watercut at 520 days. The simulated watercut values vary from 0.0 to 0.2, while the measured watercut is 0.03, which means some realizations show breakthrough too early. The realizations displaying early breakthrough have their injector connected to the producer by a high-permeability zone. Therefore, the realizations that are more connected have to be corrected.
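The correction applied at each measurement time is the standard EnKF analysis equation with perturbed observations. A minimal sketch with a toy ensemble and a single scalar watercut measurement (all sizes, names, and noise levels are hypothetical, not the paper's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy forecast ensemble: Nx-dimensional state vectors, Ne members
Ne, Nx = 300, 5
X = rng.normal(size=(Nx, Ne))                    # forecast (a priori) ensemble
H = np.zeros((1, Nx))                            # observation operator:
H[0, 0] = 1.0                                    # observe the first state component
R = np.array([[0.01]])                           # measurement-error covariance (std 0.1)
d_obs = 0.03                                     # measured watercut

# forecast ensemble covariance
A = X - X.mean(axis=1, keepdims=True)
C = A @ A.T / (Ne - 1)

# Kalman gain and analysis update with perturbed observations
K = C @ H.T @ np.linalg.inv(H @ C @ H.T + R)
D = d_obs + rng.normal(scale=0.1, size=(1, Ne))  # perturbed observations
Xa = X + K @ (D - H @ X)                         # analysis (a posteriori) ensemble

# the observed component is pulled toward the measurement and its spread shrinks
assert (H @ Xa).std() < (H @ X).std()
```

In the proposed framework the state vector contains the standard Gaussian parameterization rather than gridblock permeabilities directly, but the analysis equation is unchanged.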

Figure 14 displays the update at 520 days in 2D MDS space. Recall that the realizations located in the right wide-plume region are the highly connected ones and those located in the left narrow-line region are the disconnected ones. As a result, almost all the realizations that are too strongly connected are corrected toward disconnected ones, and the realizations in the left narrow-line region also move toward the reference point (indicated by × in Figure 10). The dissimilarity distance, which is highly correlated with the difference in dynamic data, combined with multidimensional scaling provides a powerful tool for analyzing the path of optimization and the quality of the EnKF methodology.

Figure 15 displays the updates after 520 days. As the updates proceed, all the realizations move toward the reference point and the watercut curves start honoring the measurements. Figure 16 lists the updates of log-permeability and the a priori and a posteriori water saturation of one realization. Within three updates, the solution converges and the update of permeability is consistent with the water saturation. The pre-image problem (Caers, 2008) on the dissimilarity-distance-based kernel makes these consistent updates possible.

Figure 17 exhibits the final mean and conditional variance after EnKF. While the initial mean (Figure 5) shows a high-permeability zone disconnected from injector to producer, the final mean shows a more connected high-permeability zone. Figure 18 shows the reference field, which was assumed to be the true realization. The mean of the final realizations looks very similar to the reference. Figure 19 shows the watercut predictions of the 300 final realizations from 0 days to 1,095 days. The mean of the 300 watercut curves matches the measurements very well, as can be seen by comparing it with Figure 6.


5 A priori and a posteriori probability density

Figures 20 to 22 depict the probability density in the 2D MDS map of unconditional realizations, of realizations conditioned to hard data only, and of realizations conditioned to hard and dynamic data. Figure 20 shows the a priori probability density when we have geologic information only; in this example, the geologic information is a spherical variogram and a log-normal distribution of permeability. Figure 21 shows the a posteriori probability density after conditioning to hard data. Compared with the a priori distribution, the portion of connected realizations (in the right wide-plume region) has increased, because the given hard data are high permeabilities at the well locations. Figure 22 shows the a posteriori probability density after conditioning to hard and dynamic data, which indicates that EnKF can provide multiple solutions, all of which honor the dynamic data and represent the a posteriori probability distribution.

The dissimilarity distance and multidimensional scaling made it possible to analyze the a priori and a posteriori probability densities. This would be impossible unless the dissimilarity distance were highly correlated with the difference in dynamic data and MDS mapped the realizations into a low-dimensional space.

6 Ensemble size reduction

Scheidt (2008) presents an effective selection method using kernel k-means clustering in distance-kernel space. First, cluster the ensemble members in distance-kernel space by kernel k-means. Second, for each cluster, select the realization nearest to the centroid of that cluster. The selected realizations then reproduce the statistics of the flow responses of the whole ensemble. This method is applied here to reduce the ensemble size in EnKF.

First, the initial 300 realizations (Figure 8) are clustered into 30 clusters through kernel k-means clustering (Figure 24). Second, the 30 realizations nearest to the centroids of the corresponding clusters are selected (Figure 25). Figures 26 and 27 display the watercut curves for the selected 30 realizations and for the 300 realizations, together with their p10, p50, and p90 (10th percentile, median, and 90th percentile), respectively. Figure 28 compares the p10, p50, and p90 of the selected 30 realizations with those of the whole initial 300 realizations. Only 30 realizations reproduce the statistics of the flow responses of the 300 realizations.
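The cluster-then-pick-medoid step can be sketched with plain k-means on the low-dimensional MDS coordinates as a stand-in for kernel k-means; the selection logic (one representative nearest each centroid) is the same. All names and the toy data below are hypothetical:

```python
import numpy as np

def kmeans_medoids(X, k, n_iter=50, seed=0):
    """Cluster the rows of X with k-means, then return the index of the
    member nearest to each centroid (one representative per cluster)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    medoids = [int(np.where(labels == j)[0][d[labels == j, j].argmin()])
               for j in range(k) if np.any(labels == j)]
    return sorted(set(medoids))

rng = np.random.default_rng(1)
# two well-separated toy clusters standing in for the two MDS-space groups
X = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
reps = kmeans_medoids(X, k=2)
assert len(reps) == 2
```

Returning medoids rather than centroids matters here: a centroid is not itself a realization, while the nearest member is, so it can be forward-simulated directly.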

Third, the same EnKF procedure is applied to the 30 realizations only. In this case, only 30 reservoir simulations are needed. Figure 29 shows the movements of the 30 realizations at the 520-day update. As in the 300-realization case, all 30 realizations move toward the reference point. Figure 30 depicts the flow predictions of the final 30 realizations. All the realizations are conditioned to the measurements within the measurement-error level (10%).

Figures 31 and 32 display watercut curves for the final 30 realizations and the final 300


realizations, together with their p10, p50, and p90 (10th percentile, median, and 90th percentile), respectively. Figure 33 compares the p10, p50, and p90 of the final 30 realizations with those of the whole final 300 realizations. EnKF with only 30 selected realizations reproduces the statistics of the flow responses of the final 300 realizations. In summary, the ensemble size can be reduced significantly by kernel k-means clustering, and the reduced ensemble reproduces results (final ensemble statistics, or uncertainty) similar to those of the full-ensemble case.

7 Conclusion

EnKF has been successfully implemented in distance-based kernel space.

1. In distance-kernel space, EnKF can be applied to any type of geologic model, including Gaussian models.

2. The standard normal parameterization through the kernel Karhunen-Loeve expansion fits well as the state vector of EnKF.

3. The proposed framework makes it possible to update the permeability, pressure,and water saturation consistently.

4. The dissimilarity distance, which is highly correlated with the difference in dynamic data, combined with multidimensional scaling provides a powerful tool for analyzing the path of optimization and the probability density of the inverse problem.

5. EnKF is accelerated by reducing the ensemble size through kernel k-means clustering without any significant loss of ensemble statistics.


References

[1] Caers, J.: Distance-based stochastic modeling: theory and applications, 21st SCRF Annual Meeting Report (2008).

[2] Caers, J.: Comparison of the Gradual Deformation with the Probability Perturbation Method for Solving Inverse Problems, Mathematical Geology (2007) 39, 1.

[3] Caers, J.: History Matching under Training Image-based Geological Model Constraints, SPEJ (2003) 218-226.

[4] Datta-Gupta, A. and King, M.J.: A Semianalytic Approach to Tracer Flow Modeling in Heterogeneous Permeable Media, Advances in Water Resources (1995) 18, 9.

[5] Deutsch, C.V. and Journel, A.G.: Geostatistical Software Library and User's Guide, Oxford University Press, NY (1998).

[6] Evensen, G.: Sampling Strategies and Square Root Analysis Schemes for the EnKF, Ocean Dynamics (2004) 54, 539.

[7] Evensen, G.: The Ensemble Kalman Filter: Theoretical Formulation and Practical Implementation, Ocean Dynamics (2003) 53, 343.

[8] Evensen, G.: Sequential Data Assimilation with a Nonlinear Quasi-Geostrophic Model Using Monte Carlo Methods to Forecast Error Statistics, J. of Geophysical Research (1994) 99, 10143.

[9] Gao, G., Zafari, M., and Reynolds, A.C.: Quantifying Uncertainty for the PUNQ-S3 Problem in a Bayesian Setting with RML and EnKF, paper SPE93324 presented at the 2005 SPE Reservoir Simulation Symposium, TX.

[10] Gu, Y. and Oliver, D.S.: History Matching of the PUNQ-S3 Reservoir Model Using the Ensemble Kalman Filter, paper SPE89942 presented at the 2004 SPE ATCE, TX.

[11] Houtekamer, P.L. and Mitchell, H.L.: Data Assimilation Using an Ensemble Kalman Filter Technique, Monthly Weather Review (1998) 126, 796.

[12] Hu, L.Y.: Extended Probability Perturbation Method for Calibrating Stochastic Reservoir Models, Mathematical Geology (2008), under review.

[13] Hu, L.Y., Blanc, G., and Noetinger, B.: Gradual Deformation and Iterative Calibration of Sequential Stochastic Simulations, Mathematical Geology (2001) 33, 4.

[14] Hu, L.Y.: Gradual Deformation and Iterative Calibration of Gaussian-Related Stochastic Models, Mathematical Geology (2000) 32, 1.


[15] Hu, L.Y. and Blanc, G.: Constraining a Reservoir Facies Model to Dynamic Data Using a Gradual Deformation Method, proceedings of the 1998 ECMOR, UK.

[16] Journel, A.: Combining knowledge from diverse information sources: an alternative to Bayesian analysis, Mathematical Geology (2002) 34, 5.

[17] Kalman, R.E.: A New Approach to Linear Filtering and Prediction Problems, J. of Basic Engineering (1960) 82, 35.

[18] Lorentzen, R.J., Nævdal, G., Valles, B., Berg, A.M., and Grimstad, A.-A.: Analysis of the Ensemble Kalman Filter for Estimation of Permeability and Porosity in Reservoir Models, paper SPE96375 presented at the 2005 SPE ATCE, TX.

[19] Margulis, S.A., McLaughlin, D., Entekhabi, D., and Dunne, S.: Land Data Assimilation and Estimation of Soil Moisture Using Measurements from the Southern Great Plains 1997 Field Experiment, Water Resources Research (2002) 38, 1.

[20] Nævdal, G., Johnsen, L.M., Aanonsen, S.I., and Vefring, E.H.: Reservoir Monitoring and Continuous Model Updating Using Ensemble Kalman Filter, paper SPE84372 presented at the 2003 SPE ATCE, CO.

[21] Nævdal, G. and Vefring, E.H.: Near-Well Reservoir Monitoring through Ensemble Kalman Filter, paper SPE75235 presented at the 2002 SPE/DOE Improved Oil Recovery Symposium, OK.

[22] Park, K. and Caers, J.: History Matching in Low-Dimensional Connectivity Vector Space, proceedings of the EAGE Petroleum Geostatistics 2007 Conference, Cascais, Portugal.

[23] Park, K., Choe, J., and Shin, Y.: Real-time Reservoir Characterization Using Ensemble Kalman Filter During Waterflooding, J. of the Korean Society for Geosystem Engineering (2006) 43, 143.

[24] Park, K., Choe, J., and Ki, S.: Real-Time Aquifer Characterization Using Ensemble Kalman Filter, proceedings of the GIS and Spatial Analysis 2005 Annual Conference of the IAMG, Toronto, Canada.

[25] Reichle, R.H., McLaughlin, D.B., and Entekhabi, D.: Hydrologic Data Assimilation with the Ensemble Kalman Filter, Monthly Weather Review (2002) 130, 103.

[26] Sarma, P.: Efficient Closed-Loop Optimal Control of Petroleum Reservoirs under Uncertainty, Ph.D. Dissertation, Stanford University (2006).

[27] Scholkopf, B. and Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, The MIT Press, Cambridge, MA (2002).


[28] Schaaf, T., Chavent, G., and Mezghani, M.: Refinement Indicators for Optimal Selection of Geostatistical Realizations Using the Gradual Deformation Method, Mathematical Geology (2004) 36, 425.

[29] Scheidt, C.: Quantification of Uncertainty on Spatial and Non-spatial Reservoir Parameters: Comparison between the Experimental Design and the Distance-Kernel Method, 21st SCRF Annual Meeting Report (2008).

[30] Scheidt, C. and Caers, J.: Using Distances and Kernels to Parameterize Spatial Uncertainty for Flow Applications, proceedings of the EAGE Petroleum Geostatistics 2007 Conference, Cascais, Portugal.

[31] Skjervheim, J.-A., Evensen, G., Aanonsen, S.I., Ruud, B.O., and Johansen, T.A.: Incorporating 4D Seismic Data in Reservoir Simulation Models Using Ensemble Kalman Filter, paper SPE95789 presented at the 2005 SPE ATCE, TX.

[32] Suzuki, S. and Caers, J.: History Matching with an Uncertain Geological Scenario, paper SPE102154 presented at the 2006 SPE ATCE, TX.

[33] Suzuki, S., Caers, J., and Caumon, G.: History Matching of Structurally Complex Reservoirs Using Discrete Space Optimization Method, proceedings of the 2006 26th GOCAD Meeting.

[34] Thiele, M.R., Batycky, R.P., and Blunt, M.J.: Simulating Flow in Heterogeneous Media Using Streamtubes and Streamlines, SPERE (1996) 10, 5.

[35] Welch, G. and Bishop, G.: An Introduction to the Kalman Filter, Department of Computer Science, University of North Carolina at Chapel Hill, NC (2004) 1-16.

[36] Zafari, M. and Reynolds, A.C.: Assessing the Uncertainty in Reservoir Description and Performance Predictions with the Ensemble Kalman Filter, paper SPE95750 presented at the 2005 SPE ATCE, TX.

[37] Zhang, D., Lu, Z., and Chen, Y.: Dynamic Reservoir Data Assimilation with an Efficient, Dimension-Reduced Kalman Filter, paper SPE95277 presented at the 2005 SPE ATCE, TX.

[38] Zuur, A.F., Ieno, E.N., and Smith, G.M.: Analysing Ecological Data, Springer, NY (2007).


A Dissimilarity Distance

A distance is a measure of dissimilarity between two realizations. The distance between any two realizations may contain more useful information than the realizations themselves. The aspect of the realizations we are interested in determines the type of distance. For example, if we are interested in the flow responses of the realizations, the dissimilarity distance should be correlated with the flow response; if we are interested in the geometry of the realizations, the distance should represent the difference in geometry.

Simply, we can evaluate the dissimilarity between realizations x_a and x_b (discretized into N_{gb} gridblocks) through the Minkowski model (Equation 27).

d(x_a, x_b) = \left[ \sum_{i=1}^{N_{gb}} |(x_a)_i - (x_b)_i|^p \right]^{1/p}    (27)

where (x)_i represents the i-th element of the vector x, and p (≥ 1) defines the distance space, such as Euclidean space (p = 2), city-block (Manhattan) space (p = 1), or dominance space (p = ∞). Although the Minkowski model is easy to calculate, it may not be well correlated with the dynamic data, because the forward-simulated dynamic data may change dramatically when even a small portion of the realization is perturbed. Figure 23 depicts the correlation between the Euclidean distance and the dissimilarity between dynamic data (the difference between watercut curves). For this example, 1,000 Gaussian realizations (the same as the initial realizations in the example above) were used. It turns out that the Euclidean distance is not correlated with the dynamic data.
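Equation 27 is straightforward to implement; a minimal sketch covering the three named special cases (the function name is hypothetical):

```python
import numpy as np

def minkowski(xa, xb, p=2.0):
    """Minkowski distance (Eq. 27) between two realizations flattened to vectors."""
    xa = np.asarray(xa, dtype=float).ravel()
    xb = np.asarray(xb, dtype=float).ravel()
    if np.isinf(p):
        return np.abs(xa - xb).max()             # dominance space (p = infinity)
    return float((np.abs(xa - xb) ** p).sum() ** (1.0 / p))

a, b = np.array([0.0, 0.0]), np.array([3.0, 4.0])
assert minkowski(a, b, p=2) == 5.0               # Euclidean
assert minkowski(a, b, p=1) == 7.0               # city-block (Manhattan)
assert minkowski(a, b, p=np.inf) == 4.0          # dominance
```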

In order to optimize an inverse solution efficiently in the distance space, it is necessary that the dynamic data be spatially correlated in that space. Various distances may be utilized for this purpose. For instance, Suzuki and Caers (2006) proposed the Hausdorff distance, a measure of dissimilarity between two binary images; Scheidt and Caers (2007), a streamline-assisted, solution-variable-based distance; and Park and Caers (2007), the connectivity-based distance, which is discussed below. Note that we do not have to identify the coordinates of the space; knowing only the distances among the realizations is sufficient to define the distance space.

Any distance can be used as long as it is well correlated with the dynamic data. The connectivity distance, as an example of a dissimilarity distance, is discussed here.

In order for the connectivity vector to have a high correlation with the production history, a TOF-based (time of flight; Datta-Gupta and King, 1995) distance calculation is proposed, which shows satisfactory correlation. The TOF from an injector to a producer is calculated by streamline simulation (Thiele et al., 1996) under steady-state conditions. Typically, a steady-state simulation requires only one hundredth to one thousandth of the usual reservoir simulation time. Equation 28 shows the TOF-based injector-to-producer distance calculation. We can choose a percentile among the TOFs of the streamlines that arrive at a producer.

\tau_{jik} = \int_{w_j^I}^{w_i^P} \frac{d\zeta_k}{v(\zeta_k)}    (28)

where \tau_{jik} represents the TOF of the k-th streamline from injector w_j^I to producer w_i^P, \zeta_k is the coordinate along the k-th streamline, and v(\zeta_k) is the interstitial velocity along the streamline. Then the analytical water saturation at producer w_i^P is calculated by

S_w(t; w_i^P) = \frac{1}{N_{wp} N_{sl}} \sum_{i,j}^{N_{wp}, N_{sl}} \frac{1}{M^\circ - 1} \left( \sqrt{M^\circ \frac{t}{\tau_{jik}}} - 1 \right)    (29)

where N_{wp} and N_{sl} represent the number of producers and streamlines, respectively, M^\circ is the end-point mobility, and t is the time. The analytical fractional flow at producer w_i^P is obtained by

f_w(t; w_i^P) = \frac{q_w}{q_w + q_o} = \frac{k_{rw}(S_w)/\mu_w}{k_{rw}(S_w)/\mu_w + k_{ro}(S_w)/\mu_o}    (30)

where q_w and q_o are the water and oil flow rates, k_{rw} and k_{ro} the water and oil relative permeabilities, and \mu_w and \mu_o the water and oil viscosities, respectively.

Finally, the connectivity distance between realizations m_a and m_b is calculated from the difference between the fractional-flow curves:

d(m_a, m_b) = \int_0^{t_1} \left( f_w^a(t) - f_w^b(t) \right)^2 dt    (31)

where the time t_1 can be set equal to the simulation time.
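Given two fractional-flow curves sampled on a common time grid, Equation 31 reduces to a numerical quadrature; a minimal sketch using the trapezoidal rule (the example curves are hypothetical stand-ins for the output of Equation 30):

```python
import numpy as np

def connectivity_distance(t, fw_a, fw_b):
    """Connectivity distance (Eq. 31): trapezoidal-rule integral of the
    squared difference between two fractional-flow curves over [0, t1]."""
    y = (fw_a - fw_b) ** 2
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

# hypothetical fractional-flow curves: realization a breaks through earlier
t = np.linspace(0.0, 1.0, 1001)
fw_a = np.clip(2.0 * t - 0.5, 0.0, 1.0)
fw_b = np.clip(2.0 * t - 1.0, 0.0, 1.0)

assert connectivity_distance(t, fw_a, fw_a) == 0.0   # identical curves: zero distance
assert connectivity_distance(t, fw_a, fw_b) > 0.0    # different breakthrough: positive
```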

B Parameterization

Prior to the parameterization of the geological model space, we start from an ensemble of realizations x_j (j = 1, ..., N_R, if we generate N_R realizations). x could represent a facies, porosity, or permeability realization, or any combination of these. The initial ensemble can be generated by various geostatistical algorithms honoring the geologic information and conditioned to the static data (hard and soft data). For simplicity, define the ensemble matrix X as

[X]_{:,j} = x_j    (32)

where [X]_{i,j} is the (i, j) element of matrix X and [X]_{:,j} is the j-th column of matrix X. The covariance of the ensemble is calculated numerically by Equation 33.

C = \frac{1}{N_R} \sum_{j=1}^{N_R} x_j x_j^T = \frac{1}{N_R} X X^T    (33)


When we perform an eigenvalue decomposition of the covariance (Equation 34), a new realization can be obtained by Equation 35 (the Karhunen-Loeve expansion). This algorithm is analogous to LUSIM (lower/upper decomposition simulation; Deutsch and Journel, 1998).

C V = V L    (34)

x_{new} = V L^{1/2} y_{new}    (35)

where V is a matrix whose columns are the eigenvectors of the covariance, and L is a diagonal matrix whose diagonal elements are the eigenvalues of the covariance. y_{new} is the parameter vector for the realization x_{new}. The parameter y is a standard Gaussian random vector, and its size is determined by how many eigenvalues are retained. We do not have to use all N_R nonzero eigenvalues; typically only a few large eigenvalues are retained. By Equation 35, we can generate many realizations based on the same covariance.
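Equations 33 to 35 can be sketched directly with a toy ensemble (all sizes and names below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
NR, Nx = 200, 10
X = rng.normal(size=(Nx, NR))             # toy ensemble: one realization per column

C = X @ X.T / NR                          # Eq. 33: numerical covariance
L_vals, V = np.linalg.eigh(C)             # Eq. 34: C V = V L (eigh returns ascending)
order = np.argsort(L_vals)[::-1]          # sort eigenvalues descending
L_vals, V = L_vals[order], V[:, order]

n_keep = 5                                # retain only a few large eigenvalues
y_new = rng.normal(size=n_keep)           # standard Gaussian parameter vector y_new
x_new = V[:, :n_keep] @ (np.sqrt(L_vals[:n_keep]) * y_new)   # Eq. 35

assert x_new.shape == (Nx,)
```

Drawing a fresh `y_new` for each call generates as many realizations as desired, all sharing the same covariance.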

In order to account for higher-order moments or spatial correlation beyond the point-by-point covariance, feature expansions of the realizations can be introduced. Let \phi be the feature map from the realization space R to the feature space F (Equations 36 and 37):

\phi : R \to F    (36)

x \mapsto \phi := \phi(x)    (37)

where \phi is the feature expansion of a realization. With the feature expansions of the ensemble \phi(X) (defined by Equation 38), a new feature expansion can be generated in the same manner as above (Equation 40). The covariance of the feature expansions \phi(x_j) of the ensemble is calculated by Equation 39.

[\phi(X)]_{:,j} = \phi(x_j)    (38)

C = \frac{1}{N_R} \sum_{j=1}^{N_R} \phi(x_j) \phi(x_j)^T = \frac{1}{N_R} \phi(X) \phi(X)^T = \frac{1}{N_R} \Phi \Phi^T    (39)

\phi(x_{new}) = V L^{1/2} y_{new}    (40)

However, since the feature expansion is often very high-dimensional and sometimes infinite-dimensional, the eigenvalue decomposition of the covariance matrix is practically impossible. The kernel trick makes it possible to obtain a solution exactly equivalent to the eigenvalue decomposition of the covariance. If we define a kernel function as a dot product of two feature expansions (Equation 41), the kernel function can be evaluated without representing the high-dimensional feature expansions explicitly. Then the kernel matrix (Equation 42) can be calculated efficiently.

k(x_i, x_j) := \langle \phi(x_i), \phi(x_j) \rangle    (41)

K := \phi(X)^T \phi(X) = \Phi^T \Phi    (42)

where [K]_{i,j} = k(x_i, x_j) and \langle \cdot, \cdot \rangle denotes the dot product. The main idea of the kernel trick is to assume that the new feature expansion is a linear combination of the feature expansions of the ensemble, and to represent all the elements of the equations as dot products of two feature expansions. Consider the eigenvalue decomposition of the kernel matrix:

K E = E \Lambda    (43)

where E is a matrix whose columns are the eigenvectors of the kernel matrix, and \Lambda is a diagonal matrix whose diagonal elements are the eigenvalues of the kernel matrix.

Then the eigenvectors and the corresponding eigenvalues of the covariance are calculated directly from the eigenvectors and eigenvalues of the kernel matrix, which takes much less time (Equation 44):

L = \frac{1}{N_R} \Lambda, \qquad \Phi^T V = E \Lambda^{1/2}    (44)

For the parameterization of the given ensemble, we have to find an ensemble of parameterizations Y such that

\Phi = V L^{1/2} Y    (45)

\Phi^T \Phi = \underbrace{\Phi^T V}_{E \Lambda^{1/2}} \, \underbrace{L^{1/2}}_{\frac{1}{\sqrt{N_R}} \Lambda^{1/2}} \, Y    (46)

K = \frac{1}{\sqrt{N_R}} E \Lambda Y    (47)

and Equation 49 gives the parameterization.

Y = \sqrt{N_R} \, \Lambda^{-1} E^T K = \sqrt{N_R} \, \Lambda^{-1} E^T E \Lambda E^T = \sqrt{N_R} \, E^T    (49)
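Equation 49 can be checked numerically: with K = E \Lambda E^T from Equation 43, the parameterization Y = \sqrt{N_R} E^T satisfies Equation 47. In the sketch below the feature matrix is a random stand-in (in practice \Phi is never formed explicitly; only K is available):

```python
import numpy as np

rng = np.random.default_rng(0)
NR = 50
Phi = rng.normal(size=(200, NR))          # stand-in feature expansions (hypothetical)
K = Phi.T @ Phi                           # kernel matrix (Eq. 42)

Lam, E = np.linalg.eigh(K)                # Eq. 43: K E = E Lambda
Y = np.sqrt(NR) * E.T                     # Eq. 49: the parameterization

# Eq. 47 holds: K = (1/sqrt(NR)) E Lambda Y
assert np.allclose(K, (E * Lam) @ Y / np.sqrt(NR))
assert np.allclose(E.T @ E, np.eye(NR))   # E is orthonormal
```

The columns of Y (scaled eigenvector rows) serve as the short standard-normal-like vectors that EnKF updates in place of the full realizations.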


C Kernels

We can use various types of kernels, but the kernel matrix should be positive definite (Mercer's theorem). Some widely used kernels are:

• Polynomial: k(x, z) = (\langle x, z \rangle + c)^d

• Gaussian: k(x, z) = \exp\left( -\frac{\|x - z\|^2}{2\sigma^2} \right)

• Sigmoid: k(x, z) = \tanh(\kappa \langle x, z \rangle + \vartheta)

Like the Gaussian kernel, any kernel based on the Euclidean distance is called an RBF kernel. Even if we know only the Euclidean distance, the RBF kernel function can be evaluated. Moreover, although a dissimilarity distance is not a Euclidean distance, we can map the ensemble into a metric space by multidimensional scaling. As long as the Euclidean distance in the metric space is well correlated with the dissimilarity distance, we can evaluate the kernel function by replacing the distance with the Euclidean distance in the metric space.
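After MDS, the Gaussian/RBF kernel can therefore be evaluated from the Euclidean distance matrix in the metric space alone; a minimal sketch (the function name and toy coordinates are hypothetical):

```python
import numpy as np

def rbf_kernel_from_distances(D, sigma=1.0):
    """Gaussian (RBF) kernel matrix evaluated directly from a Euclidean
    distance matrix D in the (MDS) metric space, without coordinates of F."""
    return np.exp(-D ** 2 / (2.0 * sigma ** 2))

# Euclidean distances between MDS coordinates of three realizations (toy values)
Z = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
K = rbf_kernel_from_distances(D, sigma=1.0)

assert np.allclose(np.diag(K), 1.0)       # zero self-distance gives k = 1
assert np.all(K <= 1.0) and np.all(K > 0.0)
```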

D Pre-image problem

Once the new feature expansion is acquired, the new realization can be calculated from it (x_{new} = \phi^{-1}(y_{new})). Since \phi^{-1} often cannot be computed explicitly, we have to find the new model such that

x_{new} = \arg\min_{x_{new}} \| \phi(x_{new}) - \phi(X) b \|^2 = \arg\min_{x_{new}} \left\{ \phi(x_{new})^T \phi(x_{new}) - 2\, \phi(x_{new})^T \phi(X) b + b^T K b \right\}    (50)

This is another optimization problem, called the pre-image problem, which can be solved by the fixed-point iteration method (Scholkopf and Smola, 2002). We find x_{new} such that

\nabla_{x_{new}} \left\{ \phi(x_{new})^T \phi(x_{new}) - 2\, \phi(x_{new})^T \phi(X) b + b^T K b \right\} = 0    (51)

by iterations (Equation 52).

x_{new} = \frac{ \sum_{j=1}^{N_R} [b]_j \, k'(x_j, x_{new}) \, x_j }{ \sum_{j=1}^{N_R} [b]_j \, k'(x_j, x_{new}) }    (52)


where k' denotes the derivative of k. Since we have the kernel functions rather than the explicit feature expansions, these iterations can be carried out efficiently. In conclusion, the new realization is obtained as a nonlinear combination of the ensemble members. Note that the nonlinear weights sum to unity. This pre-image problem is discussed in detail in Caers (2008).
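For the Gaussian kernel, k'(x_j, x) is proportional to k(x_j, x) itself, so the fixed-point update of Equation 52 becomes a kernel-weighted average of the ensemble members (Scholkopf and Smola, 2002). A minimal sketch with toy data (all names are hypothetical):

```python
import numpy as np

def preimage_fixed_point(Xens, b, sigma=1.0, n_iter=100):
    """Fixed-point pre-image iteration (Eq. 52) for the Gaussian kernel:
    x <- sum_j b_j k(x_j, x) x_j / sum_j b_j k(x_j, x).
    Xens: (NR, Nx) ensemble, one realization per row; b: (NR,) weights."""
    x = (b @ Xens) / b.sum()                  # start from the linear combination
    for _ in range(n_iter):
        w = b * np.exp(-np.sum((Xens - x) ** 2, axis=1) / (2.0 * sigma ** 2))
        x = (w @ Xens) / w.sum()              # normalized weights sum to unity
    return x

rng = np.random.default_rng(0)
Xens = rng.normal(size=(30, 4))               # toy ensemble of 30 realizations
b = np.full(30, 1.0 / 30)                     # uniform combination weights (toy case)
x_new = preimage_fixed_point(Xens, b)
assert x_new.shape == (4,)
```

Because the update only evaluates the kernel, no explicit feature expansion is ever needed, which is what makes the iteration cheap.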


Figure 1: Flow diagram of EnKF applied in this study.


Figure 2: The proposed workflow: Realization space → Distance space → MDS space → Feature space → Parameterization space → Optimization → Feature space → Realization space.


Figure 3: Watercut data measured every two months. Only the red circles (noisy data) are available to the algorithm.

Figure 4: Log-permeability of 6 out of 1,000 realizations generated by SGSIM. All the realizations are conditioned to hard data: 150 md at (45 ft, 45 ft) and (275 ft, 275 ft).


Figure 5: The mean (left) and conditional variance (right) of log-permeability of the 1,000 initial realizations generated by SGSIM. It can be verified that all the realizations are conditioned to hard data. In the map of the mean (left), the well locations, i.e., the hard-data locations, are easily identified: (45 ft, 45 ft) and (275 ft, 275 ft).

Figure 6: Watercut curves simulated with all 1,000 initial realizations and the measured watercut data. Red circles denote the measured data, the green line the mean of the watercut curves, and the grey lines the 1,000 watercut curves.


Figure 7: Scatterplot of dissimilarity distance versus the difference in dynamic data for the 1,000 initial realizations generated by SGSIM. The correlation coefficient is almost 0.94, which means the connectivity distance is highly correlated with the difference in dynamic data.

Figure 8: 2D MDS map of the 1,000 initial realizations based on their dissimilarity distances. Each point represents one realization.


Figure 9: 2D MDS map of the 1,000 initial realizations based on their dissimilarity distances. All the realizations turn out to be well sorted by their injector-producer connectivity.

Figure 10: 2D MDS map of the 1,000 initial realizations based on their dissimilarity distances. Color represents the difference in dynamic data relative to the realization indicated by ×. Since the connectivity distance is highly correlated with the difference in dynamic data, the realizations are well sorted by the difference in dynamic data even though they are mapped based on the connectivity distance.


Figure 11: Watercut curves calculated by the reservoir simulations and the measured watercut at 260 days.

Figure 12: Update at 260 days in 2D MDS space. ◦'s represent the a priori realizations (before correction) and ◦'s the a posteriori realizations (after correction).


Figure 13: Watercut curves calculated by the reservoir simulations and the measured watercut at 520 days.

Figure 14: Update at 520 days in 2D MDS space. ◦'s represent the a priori realizations (before correction) and ◦'s the a posteriori realizations (after correction). Grey lines (−) show the path of the update.


(a) 580 days

(b) 640 days

(c) 710 days

(d) 770 days


(e) 840 days

(f) 900 days

(g) 970 days

(h) 1030 days


(i) 1095 days

Figure 15: LEFT: Watercut curves calculated by the reservoir simulations and the measured watercut from 580 days to 1,095 days. RIGHT: Updates from 580 days to 1,095 days in 2D MDS space. ◦'s represent the a priori realizations (before correction) and ◦'s the a posteriori realizations (after correction). Grey lines (−) show the path of the update.


(a) 260 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(b) 520 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(c) 580 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(d) 640 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(e) 710 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)


(f) 770 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(g) 840 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(h) 900 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(i) 970 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(j) 1030 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

(k) 1095 days (LEFT: ln k; CENTER: a priori Sw; RIGHT: a posteriori Sw)

Figure 16: Log-permeability (ln k, LEFT) and water saturation (Sw) before (CENTER) and after (RIGHT) each EnKF update, from 260 days to 1,095 days.

Figure 17: The mean (left) and conditional variance (right) of log-permeability of 300 final realizations after EnKF.

Figure 18: Log permeability of reference realization.

Figure 19: Watercut curves predicted by reservoir simulations of 300 final realizations from 0 days to 1,095 days.

Figure 20: Probability density of unconditional realizations in 2D MDS map. (unconditional SGSIM)

Figure 21: Probability density of conditional realizations to hard data only in 2D MDS map. (conditional SGSIM)

Figure 22: Probability density of conditional realizations to hard and dynamic data in 2D MDS map. (EnKF)
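Figures 20 to 22 overlay a density estimate on the 2D MDS map of realizations. As a rough sketch of how such a map and density could be produced — classical MDS from a distance matrix, followed by a kernel density estimate — here with synthetic stand-in realizations (all array names and sizes are hypothetical, not the report's settings):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)

# Hypothetical stand-in: pairwise distances between 300 realizations.
X = rng.normal(size=(300, 20))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)

# Classical MDS: double-center the squared distances, then eigendecompose.
n = D.shape[0]
H = np.eye(n) - 1.0 / n
B = -0.5 * H @ (D ** 2) @ H
vals, vecs = np.linalg.eigh(B)
idx = np.argsort(vals)[::-1][:2]                 # two leading eigenpairs
coords = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))  # 2D MDS map

# Gaussian kernel density estimate over the 2D map, as in Figures 20-22.
kde = gaussian_kde(coords.T)
density = kde(coords.T)
print("density at densest realization:", density.max())
```

The same `coords` array is what the update-path plots (Figures 14 and 15) would scatter before and after each EnKF correction.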

Figure 23: The distance and dissimilarity of dynamic data (watercut). On the y-axis is the difference in watercut between any two realizations; on the x-axis is the distance between any two realizations.
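The correlation that Figure 23 visualizes can be checked numerically: compute the pairwise model distance and the pairwise watercut difference for every pair of realizations and correlate the two. A minimal sketch with random stand-in data, so the resulting correlation is meaningless here; all names and sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 300 realizations, each a flattened permeability
# field plus a simulated watercut curve (both random here).
n_real, n_cells, n_times = 300, 50, 50
perm = rng.normal(size=(n_real, n_cells))            # model parameters
watercut = np.cumsum(rng.random((n_real, n_times)), axis=1)
watercut /= watercut[:, -1:]                          # normalize to [0, 1]

# All pairs (i < j): model distance (plain Euclidean here; the report uses
# a distance chosen to correlate with dynamic response) and dissimilarity
# of the dynamic data (L2 difference between watercut curves).
i, j = np.triu_indices(n_real, k=1)
model_dist = np.linalg.norm(perm[i] - perm[j], axis=1)
wc_dissim = np.linalg.norm(watercut[i] - watercut[j], axis=1)

# High correlation would mean the chosen distance is informative about
# differences in dynamic response -- the property Figure 23 demonstrates.
r = np.corrcoef(model_dist, wc_dissim)[0, 1]
print(f"correlation between distance and watercut dissimilarity: {r:.2f}")
```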

Figure 24: The initial 300 realizations clustered into 30 clusters through kernel k-mean clustering.

Figure 25: The 30 realizations selected by kernel k-mean clustering.
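The report reduces the ensemble from 300 to 30 members via kernel k-mean clustering (Figures 24 and 25). One common way to realize this, sketched below, is to embed the realizations with kernel PCA and run Lloyd's k-means in that feature space, then keep the member nearest each centroid. The RBF kernel, its bandwidth, and all sizes are assumptions for illustration, not the report's exact settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical ensemble: 300 realizations flattened to vectors.
n_real, n_cells, n_select = 300, 50, 30
X = rng.normal(size=(n_real, n_cells))

# RBF kernel matrix from pairwise squared Euclidean distances.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-d2 / d2.mean())

# Center the kernel and embed via kernel PCA (spectral coordinates).
H = np.eye(n_real) - 1.0 / n_real
vals, vecs = np.linalg.eigh(H @ K @ H)
top = np.argsort(vals)[::-1][:10]                 # 10 leading components
Y = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Plain Lloyd k-means in the kernel feature space.
centers = Y[rng.choice(n_real, n_select, replace=False)]
for _ in range(50):
    labels = np.argmin(((Y[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([Y[labels == k].mean(0) if (labels == k).any()
                        else centers[k] for k in range(n_select)])

# One representative per cluster: the member nearest its centroid.
reps = []
for k in range(n_select):
    members = np.where(labels == k)[0]
    if members.size:
        dist_k = ((Y[members] - centers[k]) ** 2).sum(-1)
        reps.append(int(members[np.argmin(dist_k)]))
reps = np.array(sorted(set(reps)))
print(f"selected {reps.size} representative realizations")
```

Selecting an actual ensemble member (rather than a centroid, which is not a valid realization) is what allows the reduced ensemble to be run through the reservoir simulator directly.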

Figure 26: Watercut curves for the initial 300 realizations and their p50, p10, and p90 (red solid line and dotted lines).

Figure 27: Watercut curves for the selected initial 30 realizations and their p50, p10, and p90 (red solid line and dotted lines).

Figure 28: p50, p10, and p90 of the initial 300 realizations (red) and the selected initial 30 realizations (blue).
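The p10/p50/p90 envelopes in Figures 26 to 28 are per-time-step percentiles across the ensemble, and a reduced ensemble is judged by how closely it reproduces them. A minimal sketch with synthetic monotone curves (shapes and names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical watercut curves: 300 realizations x 50 time steps,
# sorted per row so each stand-in curve is monotone like a watercut.
wc = np.sort(rng.random((300, 50)), axis=1)

# Percentile envelopes across the ensemble, per time step.
p10, p50, p90 = np.percentile(wc, [10, 50, 90], axis=0)

# A reduced 30-member ensemble should reproduce these statistics closely
# if the representatives were chosen well (the comparison in Figure 28).
subset = wc[rng.choice(300, 30, replace=False)]
q10, q50, q90 = np.percentile(subset, [10, 50, 90], axis=0)
print("max |p50 - q50|:", np.abs(p50 - q50).max())
```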

Figure 29: EnKF update of the selected 30 realizations at 520 days in 2D MDS space.

Figure 30: Watercut curves predicted by the final 30 realizations of EnKF.

Figure 31: Watercut curves for the final 300 realizations (original ensemble) and their p50, p10, and p90 (red solid line and dotted lines).

Figure 32: Watercut curves for the final 30 realizations (reduced ensemble) and their p50, p10, and p90 (red solid line and dotted lines).

Figure 33: p50, p10, and p90 of the final 300 realizations (red) and the final 30 realizations (blue).


