Modeling Uncertainty in the Earth Sciences
Jef Caers
Stanford University
Modeling spatial uncertainty
A reminder: models of uncertainty
Models of uncertainty
$P(A)$
$P(A \mid B = b)$
$f(X \mid Y = y)$
Samples: $x_1, x_2, x_3, \ldots, x_n$
A set of samples drawn by Monte Carlo simulation is a valid model of uncertainty
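As a minimal sketch of why a set of samples can serve as a model of uncertainty, the snippet below estimates a probability and a conditional probability directly from Monte Carlo samples. The joint model in `draw_samples`, the event A = {X > 12}, and all numbers are hypothetical and chosen only for illustration.

```python
import random

def draw_samples(n, seed=0):
    """Draw n Monte Carlo samples (x, y) from a hypothetical joint model."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        y = rng.choice([0, 1])               # hypothetical binary variable Y
        x = rng.gauss(10.0 + 5.0 * y, 2.0)   # hypothetical X whose mean depends on Y
        samples.append((x, y))
    return samples

def prob(samples, event):
    """Estimate P(event) as the fraction of samples for which event holds."""
    return sum(1 for s in samples if event(s)) / len(samples)

def cond_prob(samples, event, given):
    """Estimate P(event | given) using only the samples for which given holds."""
    subset = [s for s in samples if given(s)]
    return prob(subset, event)

samples = draw_samples(10_000)
# P(A) with A = {X > 12}
p_a = prob(samples, lambda s: s[0] > 12.0)
# P(A | Y = 1)
p_a_given_b = cond_prob(samples, lambda s: s[0] > 12.0, lambda s: s[1] == 1)
```

Any other summary (a histogram, a quantile, a conditional distribution) can be read off the same list of samples in the same way.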
Motivation
[Figure: workflow from raw observations and datasets through spatial input parameters, the spatial stochastic model, physical input parameters, and the physical model to the response and the forecast and decision model. Almost every component is labeled “uncertain”; the labels also include “uncertain/error” and “certain or uncertain”.]
Spatial stochastic simulation
[Figure: a spatial stochastic simulator takes input (input uncertainty) and data (samples, well, geophysics…) and, for each seed (Seed 1, Seed 2, …), generates an Earth model (spatial uncertainty)]
Conditional simulation = constrained to data
Unconditional simulation = not constrained to any data
“Hard” data
Definition: Hard data = direct, exact information at the scale of modeling
All the rest is soft data
If you dig out a volume of this size at that location and measure the variable of interest, then you have hard data
Example
Size of the core = 6 inch x 6 inch x 12 inch
1 grid cell = 200 ft x 200 ft x 3 ft
[Figure: grid cells, with a 5-mile scale label]
If we assume the core to be hard data, i.e., assign it to a grid cell, then we are assuming that there is NO spatial variation in that grid cell
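To make the scale difference concrete, the short calculation below compares the core volume with the grid-cell volume, using only the dimensions quoted in the example. The variable names are illustrative.

```python
INCHES_PER_FOOT = 12.0

# Core: 6 inch x 6 inch x 12 inch, converted to feet
core_ft = (6 / INCHES_PER_FOOT, 6 / INCHES_PER_FOOT, 12 / INCHES_PER_FOOT)
# Grid cell: 200 ft x 200 ft x 3 ft
cell_ft = (200.0, 200.0, 3.0)

def volume(dims):
    """Volume of a box given its three side lengths."""
    x, y, z = dims
    return x * y * z

core_volume = volume(core_ft)       # 0.25 cubic feet
cell_volume = volume(cell_ft)       # 120000 cubic feet
ratio = cell_volume / core_volume   # 480000
```

The grid cell is 480,000 times larger than the core, which is why treating the core as hard data at the grid-cell scale is a strong assumption.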
Example spatial stochastic simulation
Input uncertainty:
Range horizontal = [15, 25]
Range vertical = 5
Isotropy, or horizontal anisotropy with max range = [20, 25], min range = [10, 15], Azm = 45
[Figure: unknown Earth model, marked “?”]
Example spatial stochastic simulation
If I take a fixed input:
Range horizontal = 25
Range vertical = 5
Horizontal isotropy
Result
The Earth model reflects our knowledge about:
The sampled values at their location
The interpreted spatial continuity model
[Figure: vertical and horizontal panels of the spatial continuity model]
Spatial uncertainty
and infinitely more if you want
Input and spatial uncertainty
and infinitely more if you want
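The two levels of uncertainty can be sketched as a nested loop: an outer draw of the uncertain input (here the horizontal range, drawn from [15, 25]) and an inner change of seed. The simulator below is a stand-in chosen for brevity (a 1D moving average of white noise), not the simulator used for the figures. `simulate_1d`, `sample_earth_models`, and the uniform draw of the range are assumptions.

```python
import random

def simulate_1d(n_cells, corr_range, seed):
    """Unconditional 1D simulation by moving average of white noise.

    corr_range is the averaging window in cells: values farther apart than
    this share no noise terms and are therefore uncorrelated.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_cells + corr_range - 1)]
    scale = corr_range ** 0.5  # keeps the simulated values at unit variance
    return [sum(noise[i:i + corr_range]) / scale for i in range(n_cells)]

def sample_earth_models(n_inputs, n_seeds, n_cells=100, master_seed=42):
    """Outer loop: input uncertainty. Inner loop: spatial uncertainty."""
    rng = random.Random(master_seed)
    models = []
    for _ in range(n_inputs):
        corr_range = rng.randint(15, 25)      # uncertain input: range in [15, 25]
        for _ in range(n_seeds):
            seed = rng.randrange(10**9)       # a new seed gives a new realization
            field = simulate_1d(n_cells, corr_range, seed)
            models.append((corr_range, seed, field))
    return models

models = sample_earth_models(n_inputs=3, n_seeds=2)
```

Fixing `corr_range` and varying only the seed reproduces the "spatial uncertainty" case. Varying both reproduces the "input and spatial uncertainty" case.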
Object-based stochastic simulation
Unconditional simulation
Principle mostly used: rejection sampler
Draw an object from the pdfs of the various parameters defining the object
Place the object, either randomly or according to a trend
If the object violates interaction or other rules, then reject it
Else accept it
This will make sure that the 3D Earth model created reflects exactly the Boolean model specified
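The rejection loop above can be sketched for a toy Boolean model of discs in a square domain. The object type (discs), the uniform pdf for the radius, the purely random placement, and the non-overlap interaction rule with `min_gap` are all assumptions made for illustration.

```python
import random

def simulate_objects(n_objects, domain=100.0, min_gap=1.0, seed=0, max_tries=10_000):
    """Unconditional Boolean simulation of discs by rejection sampling."""
    rng = random.Random(seed)
    accepted = []  # list of (x, y, radius)
    tries = 0
    while len(accepted) < n_objects and tries < max_tries:
        tries += 1
        radius = rng.uniform(2.0, 6.0)   # draw the object from its parameter pdf
        x = rng.uniform(0.0, domain)     # place it at random (no trend here)
        y = rng.uniform(0.0, domain)
        # Assumed interaction rule: discs may not come closer than min_gap
        violates = any(
            ((x - ax) ** 2 + (y - ay) ** 2) ** 0.5 < radius + ar + min_gap
            for ax, ay, ar in accepted
        )
        if not violates:
            accepted.append((x, y, radius))  # accept; otherwise reject and redraw
    return accepted

discs = simulate_objects(n_objects=20)
```

Because every accepted object was drawn from the specified pdfs and satisfies the rule, the result reflects the specified Boolean model. The cost is the wasted proposals, which grows as the domain fills up.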
Conditional simulation
Use rejection sampler: slow
Use a “Metropolis sampler”:
Create an initial 3D Earth model: it probably violates data constraints
Propose a perturbation: move an object, remove an object, or add an object
Accept the change with a probability a, where a depends on how much improvement was made
These methods are “iterative” methods, hence slow, and they may not converge
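A minimal sketch of such an iterative sampler, on a 1D grid where objects are fixed-width intervals and the wells record sand (1) or shale (0) at a few cells. The acceptance rule `a = min(1, exp(-(increase in mismatch) / temperature))` is one common choice and an assumption here, as are the object type, the `temperature` value, and the well data. The sketch only honors the data and ignores any prior rules on the objects.

```python
import math
import random

def facies(objects, n_cells, width):
    """Rasterize interval objects (given by their start cell) into a 0/1 grid."""
    grid = [0] * n_cells
    for start in objects:
        for i in range(start, min(start + width, n_cells)):
            grid[i] = 1
    return grid

def mismatch(objects, wells, n_cells, width):
    """Number of well data (cell -> facies) not honored by the model."""
    grid = facies(objects, n_cells, width)
    return sum(1 for cell, value in wells.items() if grid[cell] != value)

def metropolis(wells, n_cells=50, width=5, n_iter=2000, temperature=0.5, seed=0):
    rng = random.Random(seed)
    current = [rng.randrange(n_cells) for _ in range(3)]  # initial model
    cur_err = mismatch(current, wells, n_cells, width)
    initial_err = cur_err
    best, best_err = list(current), cur_err
    for _ in range(n_iter):
        proposal = list(current)
        move = rng.choice(["move", "remove", "add"])
        if move == "add" or not proposal:
            proposal.append(rng.randrange(n_cells))
        elif move == "remove":
            proposal.pop(rng.randrange(len(proposal)))
        else:
            proposal[rng.randrange(len(proposal))] = rng.randrange(n_cells)
        new_err = mismatch(proposal, wells, n_cells, width)
        # Assumed rule: always accept improvements, sometimes accept a worse model
        a = min(1.0, math.exp(-(new_err - cur_err) / temperature))
        if rng.random() < a:
            current, cur_err = proposal, new_err
            if cur_err < best_err:
                best, best_err = list(current), cur_err
    return best, best_err, initial_err

wells = {5: 1, 6: 1, 20: 0, 30: 1, 45: 0}  # hypothetical well data
model, err, initial_err = metropolis(wells)
```

The function returns the best model visited, so `err` can never exceed the mismatch of the initial model, but nothing guarantees that it reaches zero within `n_iter` iterations.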
Example: conditional simulation
[Figure: Earth model after 198, 200, 202, and 204 iterations. Between snapshots a channel is added, the same channel is removed, and a channel position changes; some well data are still not honored.]
[Figure: final result, constrained to all wells]
Courtesy: Norwegian Computing Center
Conditional Boolean simulation
Too slow for practical Earth modeling involving uncertainty
Any conflict between model and data makes it even slower
Can only be constrained to a limited amount of well data, not to geophysical data or more complex data
Training-image-based simulation
Idea
Generate a single unconditional Boolean Earth model
Extract 3D patterns
Anchor patterns to data
[Figure: patterns anchored to the actual data give a conditional Earth model]
Principle of sequential simulation
A reservoir with a 2x2 grid
A training image
Step 1. Pick a cell
Step 2. Assign probability: 50% / 50%
Step 3. Assign color
Step 4. Pick a cell: 100%
Step 5. Assign color
Step 6. Final result
Another realization
In general
[Figure: simulation grid; marked cells = data; the data near the cell being simulated form the data event]
P(A | B) ?
Algorithm
Generic sequential simulation algorithm
(1) Assign any hard data to grid cells if required
(2) Define a random path
(3) Loop over all grid cells
    (1) Determine P(A|B), where B = any data and previously simulated values
    (2) Draw a value from P(A|B)
    (3) Add that value to the data set
How to use the training image ?
[Figure: simulation grid with some data points: the unknown node u? surrounded by the data nodes u1, u2, u3, u4]
Using a training image
[Figure: training image]
P(A | B) = 1/4, A = blue
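One way to read P(A | B) off a training image is to scan it for replicates of the data event B and count how often the central value equals A. The sketch below does this for a small hypothetical binary image (1 = blue, 0 = white) and a two-node data event. The image was constructed so that the answer happens to be 1/4; it is not the training image of the slide.

```python
def scan_probability(training_image, data_event, target):
    """Estimate P(center == target | data event) by scanning the training image.

    data_event maps (row_offset, col_offset) -> required facies value.
    Returns (probability or None, number of replicates found).
    """
    n_rows, n_cols = len(training_image), len(training_image[0])
    matches = 0  # replicates of the data event B
    hits = 0     # replicates whose central value is also A
    for r in range(n_rows):
        for c in range(n_cols):
            ok = True
            for (dr, dc), value in data_event.items():
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                if not inside or training_image[rr][cc] != value:
                    ok = False
                    break
            if ok:
                matches += 1
                hits += training_image[r][c] == target
    return (hits / matches if matches else None), matches

# Hypothetical training image: 1 = blue, 0 = white
ti = [
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
]
# Data event B: left neighbor is blue, right neighbor is white
event = {(0, -1): 1, (0, 1): 0}
p_blue, n_replicates = scan_probability(ti, event, target=1)  # (0.25, 8)
```

Every new data event requires a full pass over the training image, which is what makes direct scanning expensive.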
Scanning is CPU-demanding
[Figure: template formed by the nodes u1, u2, u3, u4 around u?, taken from the simulation grid with some data points]
Create a database
[Figure: training image and the search tree built from it; the numbers at the tree nodes are pattern counts]
Construction requires scanning the training image only once
Minimizes memory demand
Allows retrieving all training probabilities for the template adopted
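The idea of scanning once and looking up afterwards can be sketched with a flat dictionary of pattern counts. This is a simplification and an assumption: it is not the search tree itself, and it only stores fully informed data events for one fixed template, whereas a search tree also serves partially informed templates. The training image is hypothetical (1 = blue, 0 = white).

```python
from collections import Counter

def build_pattern_counts(training_image, offsets):
    """Scan the training image once and count every (data event, center) pair."""
    n_rows, n_cols = len(training_image), len(training_image[0])
    counts = Counter()
    for r in range(n_rows):
        for c in range(n_cols):
            inside = all(0 <= r + dr < n_rows and 0 <= c + dc < n_cols
                         for dr, dc in offsets)
            if inside:
                event = tuple(training_image[r + dr][c + dc] for dr, dc in offsets)
                counts[(event, training_image[r][c])] += 1
    return counts

def lookup_probability(counts, event, target, categories=(0, 1)):
    """Retrieve P(center == target | event) from the stored counts."""
    total = sum(counts[(event, k)] for k in categories)
    return counts[(event, target)] / total if total else None

# Hypothetical training image: 1 = blue, 0 = white
ti = [
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0],
]
template = [(0, -1), (0, 1)]              # left and right neighbor
counts = build_pattern_counts(ti, template)
p_blue = lookup_probability(counts, (1, 0), target=1)  # 2 / 8 = 0.25
```

After the single scan, every conditional probability for this template is a dictionary lookup rather than a new pass over the image.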
Spatial continuity at large scale
Freeze coarse grid nodes and use them as conditioning data to simulate finer grid nodes
[Figure: the training image is scanned with a coarse template for the coarse simulation grid and with a fine template for the finer simulation grid]
Example
[Figure: training image with background shales, estuarine sands, tidal sand flats, transgressive lags, and tidal bars]

Facies type         Conceptual description                                Length (m)     Width (m)   Thickness (ft)   Stratigraphy
Tidal bars          Elongated ellipses w/ upper sigmoidal cross-section   2000 to 4000   500         3 to 7           Anywhere
Tidal sand flats    Sheets (rectangles)                                   2000           1000        6                Anywhere, eroded by sand bars
Transgressive lags  Sheets (rectangles)                                   3000           1000        4                Top of reservoir
Estuarine sands     Sheets (rectangles)                                   4000           2000        8                Top of estuarines
Example
[Figure: plan view of the stratigraphic grid with the location of the 140 wells]
[Figure: areal proportion maps (scale 0 to 1) for background shale, sand bars, and estuarine sands]
[Figure: facies model]
[Figure: vertical proportion curves for background shale, sand bars, and estuarine sands]
Variogram-based simulation
Introduction to spatial estimation
[Figure: data points and an unsampled location $\mathbf{u}_j$ marked “?”]
Spatial estimation = What is the best guess for the value at a location where no data was taken?
There is only one single guess that is the best
It depends on what you define as “best”
What is best?
Best = as close as possible to the unknown truth
Consider a situation where you want to estimate the total amount of Pb pollution at a specific location. You have two methodologies to do so. Consider that you apply these methodologies to 10 sites
Estimation: principles
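site #   estimation method 1    estimation method 2    unknown real Pb   error method 1    error method 2
1        $\hat{m}_1^{(1)}$      $\hat{m}_1^{(2)}$      $m_1$             $e_1^{(1)}$       $e_1^{(2)}$
2        $\hat{m}_2^{(1)}$      $\hat{m}_2^{(2)}$      $m_2$             $e_2^{(1)}$       $e_2^{(2)}$
3        $\hat{m}_3^{(1)}$      $\hat{m}_3^{(2)}$      $m_3$             $e_3^{(1)}$       $e_3^{(2)}$
4        $\hat{m}_4^{(1)}$      $\hat{m}_4^{(2)}$      $m_4$             $e_4^{(1)}$       $e_4^{(2)}$
5        $\hat{m}_5^{(1)}$      $\hat{m}_5^{(2)}$      $m_5$             $e_5^{(1)}$       $e_5^{(2)}$
6        $\hat{m}_6^{(1)}$      $\hat{m}_6^{(2)}$      $m_6$             $e_6^{(1)}$       $e_6^{(2)}$
7        $\hat{m}_7^{(1)}$      $\hat{m}_7^{(2)}$      $m_7$             $e_7^{(1)}$       $e_7^{(2)}$
8        $\hat{m}_8^{(1)}$      $\hat{m}_8^{(2)}$      $m_8$             $e_8^{(1)}$       $e_8^{(2)}$
9        $\hat{m}_9^{(1)}$      $\hat{m}_9^{(2)}$      $m_9$             $e_9^{(1)}$       $e_9^{(2)}$
10       $\hat{m}_{10}^{(1)}$   $\hat{m}_{10}^{(2)}$   $m_{10}$          $e_{10}^{(1)}$    $e_{10}^{(2)}$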
Estimation: principles
Unbiased: the average error is zero (it is a property measured over many “trials”)
Best?
Average square error is zero?
Absolute value of error is zero?
[Figure: loss as a function of the error, from -error to +error]
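The difference between "unbiased" and "best" can be illustrated on synthetic numbers for 10 sites. All values below are invented for illustration, and the sign convention error = estimate minus truth is an assumption.

```python
def error_summary(estimates, truths):
    """Average error, average squared error, and average absolute error."""
    errors = [est - true for est, true in zip(estimates, truths)]
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "mean_squared_error": sum(e * e for e in errors) / n,
        "mean_absolute_error": sum(abs(e) for e in errors) / n,
    }

# Synthetic "unknown real" Pb values at 10 sites (illustration only)
truth = [10, 12, 9, 15, 11, 14, 8, 13, 10, 12]
# Method 1: errors alternate +1, -1, so they cancel on average
method_1 = [t + (1 if i % 2 == 0 else -1) for i, t in enumerate(truth)]
# Method 2: always 0.5 too high
method_2 = [t + 0.5 for t in truth]

summary_1 = error_summary(method_1, truth)  # mean error 0.0, mean squared error 1.0
summary_2 = error_summary(method_2, truth)  # mean error 0.5, mean squared error 0.25
```

In this constructed case method 1 is unbiased but has the larger squared and absolute error, so which method is "best" depends on the loss chosen.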
Introduction to Kriging
What is kriging?
Is an estimation method
Finds the “best” (least-squares) linear estimate of the unknown
Accounts for the variogram
* Spatial correlation between unknown and data
* Redundancy between data
But is mostly used in sequential simulation
Linear estimation
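[Figure: data z1 = 0.5, z2 = 0.9, z3 = 1.5 at distances d1, d2, d3 from the unknown; h12 and h13 are distances between data locations]
$z = \sum_{i=1}^{3} \lambda_i z_i$
e.g. $\lambda_i = \dfrac{1/d_i}{1/d_1 + 1/d_2 + 1/d_3}$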
Layered system
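[Figure: two panels, “Problem” and “Inverse distance solution”]
Inverse distance mapping
$\lambda_i = \dfrac{1/d_i}{\sum_{j} 1/d_j}$
[Figure: data 1, 2, 3 around the unknown in two configurations, Situation 1 and Situation 2]
Inverse distance: $\lambda_1 = 1/3$, $\lambda_2 = 1/3$, $\lambda_3 = 1/3$
Kriging: $\lambda_1 = 1/4$, $\lambda_2 = 1/4$, $\lambda_3 = 1/2$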
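A short sketch of inverse distance weighting, using the three data values quoted earlier (0.5, 0.9, 1.5). The distances and the `power` parameter are hypothetical.

```python
def inverse_distance_weights(distances, power=1.0):
    """Weights proportional to 1 / d**power, normalized to sum to one."""
    inverse = [1.0 / d ** power for d in distances]
    total = sum(inverse)
    return [w / total for w in inverse]

def linear_estimate(weights, values):
    """Weighted linear combination of the data values."""
    return sum(w * z for w, z in zip(weights, values))

z = [0.5, 0.9, 1.5]

# Three data at the same distance from the unknown: each gets weight 1/3
w_equal = inverse_distance_weights([10.0, 10.0, 10.0])
estimate_equal = linear_estimate(w_equal, z)

# Distances 1, 2, 4 give weights 4/7, 2/7, 1/7
w_unequal = inverse_distance_weights([1.0, 2.0, 4.0])
```

The weights depend only on the distance to the unknown, so two data sitting next to each other each receive a full share. Kriging differs on exactly this point, because it lets redundant data share their weight.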
Linear estimation
[Figure: data $Z(\mathbf{u}_\alpha)$, $\alpha = 1, \ldots, n$, around the location $\mathbf{u}_j$ to be estimated, with the direction of major continuity indicated]
$Z^*_{SK}(\mathbf{u}_j) = \sum_{\alpha=1}^{n} \lambda_\alpha Z(\mathbf{u}_\alpha)$, what is $\lambda_\alpha$?
Principle 1: a datum close in “geological distance” to the unknown should get a large weight
Principle 2: data close together are redundant and should “share” their weight
How does kriging do it?
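[Figure: data z(u1) = 0.5, z(u2) = 0.9, z(u3) = 1.5 around the unknown, with distances h12, h13, h23 between the data locations]
$z(\mathbf{u}) = \sum_{i=1}^{3} \lambda_i z_i$
Record all the distances between:
* Data locations
* Data location vs. location of unknown
Calculate the covariance function for those distances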
How does kriging do it?
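$$
\begin{bmatrix}
\mathrm{Var}(z) & C(h_{12}) & C(h_{13}) \\
C(h_{12}) & \mathrm{Var}(z) & C(h_{23}) \\
C(h_{13}) & C(h_{23}) & \mathrm{Var}(z)
\end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix}
=
\begin{bmatrix} C(h_{01}) \\ C(h_{02}) \\ C(h_{03}) \end{bmatrix}
$$
Solve this linear system of equations
$\hat{z}(\mathbf{u}_j) = \sum_{\alpha=1}^{n} \lambda_\alpha z(\mathbf{u}_\alpha)$ is the kriging estimate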
Note: mean = assumed zero
What is the average error we make?
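$$
\begin{bmatrix}
\mathrm{Var}(z) & C(h_{12}) & C(h_{13}) \\
C(h_{12}) & \mathrm{Var}(z) & C(h_{23}) \\
C(h_{13}) & C(h_{23}) & \mathrm{Var}(z)
\end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix}
=
\begin{bmatrix} C(h_{01}) \\ C(h_{02}) \\ C(h_{03}) \end{bmatrix}
$$
Solve this linear system of equations
$\hat{\sigma}^2(\mathbf{u}_j) = \sigma^2 - \sum_{\alpha=1}^{n} \lambda_\alpha C(h_{0\alpha})$ is the kriging variance, with $\sigma^2 = \mathrm{Var}(z)$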
Note: mean = assumed zero
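The kriging system, estimate, and variance can be sketched in a few lines for three data with zero mean. The exponential covariance model, its range `a`, and the coordinates are assumptions made for illustration. The data values 0.5, 0.9, 1.5 are the ones quoted earlier. The small Gaussian-elimination solver is included only to keep the example self-contained.

```python
import math

def covariance(h, sill=1.0, a=10.0):
    """Hypothetical exponential covariance model C(h) = sill * exp(-3h / a)."""
    return sill * math.exp(-3.0 * h / a)

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def simple_kriging(coords, values, target, cov=covariance):
    """Return (estimate, variance, weights) for zero-mean simple kriging."""
    n = len(coords)
    # Left-hand side: covariances between data locations (Var(z) on the diagonal)
    K = [[cov(math.dist(coords[i], coords[j])) for j in range(n)] for i in range(n)]
    # Right-hand side: covariances between each datum and the unknown
    k0 = [cov(math.dist(coords[i], target)) for i in range(n)]
    weights = solve(K, k0)
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = cov(0.0) - sum(w * c for w, c in zip(weights, k0))
    return estimate, variance, weights

coords = [(0.0, 0.0), (1.0, 0.0), (8.0, 3.0)]  # hypothetical data locations
values = [0.5, 0.9, 1.5]
estimate, variance, weights = simple_kriging(coords, values, target=(4.0, 1.0))
```

The first two data are close together, so they are largely redundant and split their weight, while the isolated third datum keeps its own. Kriging at a data location returns the datum itself with zero variance.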
Example
Note: mean = assumed zero
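[Figure: the unknown location “?” with data on either side, each at distance d]
$$
\begin{bmatrix}
\mathrm{Var}(z) & \mathrm{Var}(z) & C(2d) \\
\mathrm{Var}(z) & \mathrm{Var}(z) & C(2d) \\
C(2d) & C(2d) & \mathrm{Var}(z)
\end{bmatrix}
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix}
=
\begin{bmatrix} C(d) \\ C(d) \\ C(d) \end{bmatrix}
\;\Rightarrow\;
\begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix}
=
\begin{bmatrix} 1/4 \\ 1/4 \\ 1/2 \end{bmatrix}
$$
$\hat{\sigma}^2(\mathbf{u}) = \mathrm{var}(z) - \left( \tfrac{1}{4} C(d) + \tfrac{1}{4} C(d) + \tfrac{1}{2} C(d) \right) = \mathrm{var}(z) - C(d)$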
Various “flavors” of kriging
Simple kriging
You assume there is no trend
You assume you know the mean of the variable over the domain
Ordinary kriging
You don’t want to assume anything explicitly
Kriging with locally varying mean
You assume there is a trend and you know that trend exactly
Kriging with trend
You assume there is a trend, but you only know the type of trend (you don’t know its exact magnitude)
Example
Back to sequential simulation
(1) Assign any hard data to grid cells if required
(2) Define a random path
(3) Loop over all grid cells
    (1) Determine P(A|B), where B = any data and previously simulated values
    (2) Draw a value from P(A|B)
    (3) Add that value to the data set
Kriging is not simulation !
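[Figure: the unsampled location $\mathbf{u}_j$ marked “?”, shown once for kriging and once for simulation]
Kriging: what is the best guess?
Simulation: what is the uncertainty, as expressed through a probability, a probability distribution, or a set of possible outcomes?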
Gaussian simulation
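[Figure: Gaussian distribution at the unsampled location $\mathbf{u}_j$, with mean $m$ and variance $\sigma^2$]
mean: $Z^*_{SK}(\mathbf{u}_j) = \sum_{\alpha=1}^{n} \lambda_\alpha Z(\mathbf{u}_\alpha)$ is the simple kriging mean
variance: $\sigma^2_{SK}(\mathbf{u}_j) = 1 - \sum_{\alpha=1}^{n} \lambda_\alpha \mathrm{Cov}(\mathbf{u}_j - \mathbf{u}_\alpha)$ is the kriging variance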
Transformation of the data
Gaussian simulation assumes the data is Gaussian
Problem?
Prior to simulation, perform a normal score transform of the hard data
After the sequential visit of all cells, perform a back-transform, which is the exact reverse of the normal score transform
Normal score transform
Unit free
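A minimal sketch of a rank-based normal score transform and its reverse. The plotting position `(rank + 0.5) / n`, the linear interpolation between data pairs, the clamping at the extremes (no tail extrapolation), and the data values are all assumptions. Practical implementations differ in how they treat ties and tails.

```python
from statistics import NormalDist

def normal_score_transform(values):
    """Map each value to the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n                 # assumed plotting position
        scores[i] = NormalDist().inv_cdf(p)
    return scores

def back_transform(score, values, scores):
    """Reverse the transform by linear interpolation between the data pairs."""
    pairs = sorted(zip(scores, values))
    if score <= pairs[0][0]:
        return pairs[0][1]                   # clamp: no extrapolation below the data
    if score >= pairs[-1][0]:
        return pairs[-1][1]                  # clamp: no extrapolation above the data
    for (s0, v0), (s1, v1) in zip(pairs, pairs[1:]):
        if s0 <= score <= s1:
            return v0 + (v1 - v0) * (score - s0) / (s1 - s0)

data = [0.5, 0.9, 1.5, 3.2, 0.1]             # hypothetical hard data
scores = normal_score_transform(data)
recovered = [back_transform(s, data, scores) for s in scores]
```

The scores carry no units and depend only on the ranks of the data. Simulated scores that fall between two data scores are back-transformed by interpolation.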
Complete SGS algorithm
(1) Transform the data into the normal score domain
(2) Assign any hard data to grid cells if required
(3) Define a random path
(4) Loop over all grid cells
    (1) Determine, using kriging, the distribution P(A|B), where B = any data and previously simulated values
    (2) Draw a value from P(A|B)
    (3) Add that value to the data set
(5) Back-transform all simulated values (requires extrapolation/interpolation)
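Steps (2) to (4) can be sketched on a 1D grid, assuming the hard data are already normal scores, so the transform and back-transform of steps (1) and (5) are left out. The exponential covariance, its range, the limit of `max_neighbors` nearest informed cells, and the two hard data are assumptions made for illustration.

```python
import math
import random

def cov(h, a=10.0):
    """Hypothetical exponential covariance with unit sill (normal score domain)."""
    return math.exp(-3.0 * abs(h) / a)

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def sgs_1d(n_cells, hard_data, seed=0, max_neighbors=8):
    """Sequential Gaussian simulation on a 1D grid; hard_data maps cell -> normal score."""
    rng = random.Random(seed)
    known = dict(hard_data)                        # assign hard data to grid cells
    path = [i for i in range(n_cells) if i not in known]
    rng.shuffle(path)                              # define a random path
    for cell in path:                              # loop over all grid cells
        neighbors = sorted(known, key=lambda j: abs(j - cell))[:max_neighbors]
        if neighbors:
            K = [[cov(i - j) for j in neighbors] for i in neighbors]
            k0 = [cov(i - cell) for i in neighbors]
            w = solve(K, k0)
            mean = sum(wi * known[i] for wi, i in zip(w, neighbors))   # kriging mean
            var = max(0.0, 1.0 - sum(wi * ci for wi, ci in zip(w, k0)))  # kriging variance
        else:
            mean, var = 0.0, 1.0                   # no data yet: standard normal
        known[cell] = rng.gauss(mean, math.sqrt(var))  # draw, then add to the data set
    return [known[i] for i in range(n_cells)]

hard_data = {5: 1.2, 30: -0.8}                     # hypothetical normal scores
realization = sgs_1d(50, hard_data, seed=1)
```

Each call with a different seed gives another realization that honors the same hard data. Repeating the call with the same seed reproduces the same realization.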
Example