Sampling for soil survey

D G Rossiter

Department of Earth Systems Analysis

International Institute for Geo-information Science & Earth Observation (ITC)

December 28, 2008

Copyright 2008 ITC.

All rights reserved. Reproduction and dissemination of the work as a whole (not parts) freely permitted if this original copyright notice is included. Sale or placement on a web site where payment must be made to access this document is strictly prohibited.

To adapt or translate please contact the author (http://www.itc.nl/personal/rossiter).
Soil Sampling 1

Topic: Sampling for soil survey

1. Sampling in routine survey

2. Sampling for detailed survey

3. Sampling for detailed survey, with prior information

4. Sampling for environmental correlation


1 Soil sampling in routine soil survey

Routine soil survey follows the Discrete Model of Spatial Variability (DMSV):

homogeneous soil bodies mapped as polygons

conceptually-sharp boundaries

So the main aim of sampling is to characterize the soils in each map unit, i.e.

legend category

set of polygons with the same soil type


An area-class map

State Soil Geographic Database, Schuyler County, NY (USA)


Sample is a very small proportion of the population

A soil pit is about 1 x 2 m in surface area; a typical soil borehole (auger hole) is 10 cm in diameter, so about $\pi \times 0.05^2 \approx 0.00785$ m²

So in 1 ha there are 10 000/2 = 5000 potential pit sites, or $10\,000/(\pi \times 0.05^2) \approx 1\,273\,240$ potential borehole sites!

Sampling density is usually specified as one field observation per 1 to 4 cm² of map (regardless of map scale)

Example: at 1:25 000, 1 cm² on the map = 250 m × 250 m on the ground = 62 500 m² = 6.25 ha

So, one observation per 6.25 to 25 ha

This is a tiny sampling fraction!
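The arithmetic above can be checked with a short Python sketch. The function name `ground_area_ha` and the constant names are illustrative only; the figures are the ones quoted on this slide.

```python
import math

HA_M2 = 10_000.0                     # one hectare in square metres

pit_area = 1.0 * 2.0                 # soil pit: about 1 m x 2 m
auger_area = math.pi * 0.05 ** 2     # auger hole: 10 cm diameter, 5 cm radius

pit_sites_per_ha = HA_M2 / pit_area
auger_sites_per_ha = HA_M2 / auger_area


def ground_area_ha(scale_denominator: float, map_cm2: float) -> float:
    """Ground area (ha) represented by map_cm2 square centimetres of map."""
    side_m = scale_denominator / 100.0   # ground length of 1 map cm, in metres
    return map_cm2 * side_m ** 2 / HA_M2


low = ground_area_ha(25_000, 1.0)    # one observation per 1 cm2 of map
high = ground_area_ha(25_000, 4.0)   # one observation per 4 cm2 of map
print(f"{pit_sites_per_ha:.0f} pit sites/ha, {auger_sites_per_ha:.0f} auger sites/ha")
print(f"1:25 000: one observation per {low:.2f} to {high:.2f} ha")
```

Changing the scale denominator shows how quickly the ground area per observation grows at smaller map scales.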

How can we make a map with such a low sampling density?


Representative sampling

Solution: the surveyor uses expert opinion of the soil-landscape model:

Soils occur in specific positions because of the specific combination of soil-forming factors (Jenny equation)

So, place observations in the most representative (typical, modal, central

concept) sites, where the soil class is expected to be best-expressed

* Some observations nearby to get an idea of heterogeneity

* Maybe some quick observations (not full samples) near boundaries to

improve their location


Block diagram of a soil landscape

Wysocki, D. A., Schoeneberger, P. J., & LaGarry, H. E. (2005). Soil surveys: a window to the subsurface. Geoderma,

126(1-2), 167-180


Some landscapes to sample

Dorchester, England (GB)


Truxton, Cortland County, NY (USA)


Herikhuizerveld, Rheden (NL)


Sampling for associations

In smaller-scale maps (depending on landscape, from 1:50 000 down) we usually

expect more than one soil type in each map unit.

The map units are usually associations of related soils (e.g. hillslope catena).

Then the surveyor observes at the central concept of each component.

The proportion of components is estimated by landscape analysis.


2 Soil sampling in detailed soil survey

These are usually grid samples, to completely cover an area of interest.

Example: an area of suspected soil pollution.

The grid is then interpolated into a raster map, usually by kriging.

Two-step sampling:

1. For modelling the variogram

2. For kriging, once the variogram is known

Note: the success of kriging depends on a correct variogram model!

Note: the variogram may be known from similar studies


Sampling to model spatial dependence

Must have several separations to estimate structure

Especially important are some closely-separated observations, to estimate the nugget

Can use a transect with variable spacing or a 2-D scheme (random directions,

fixed separations in a hierarchy)

Webster, R., Welham, S. J., Potts, J. M., & Oliver, M. A. (2006). Estimating the

spatial scales of regionalized variables by nested sampling, hierarchical analysis

of variance and residual maximum likelihood. Computers & Geosciences, 32(9),

1320-1333.

Lark, R. M. (2002). Optimized spatial sampling of soil for estimation of the

variogram by maximum likelihood. Geoderma, 105(1-2), 49-80.


What sample size to fit a variogram model?

Stochastic simulation from an assumed random field with a known variogram

suggests:

1. < 50 points: not at all reliable

2. 100 to 150 points: more or less acceptable

3. > 250 points: almost certainly reliable

More points are needed to estimate an anisotropic variogram.

This is very worrying for many environmental datasets (soil cores, vegetation

plots, . . . ) especially from short-term fieldwork, where sample sizes of 40 to 60

are typical. Should variograms even be attempted on such small samples?


How to design the nested sample

The widest spacing $s_1$ separates the stations, which are assumed to be so far away from each other as to be spatially independent

* furthest expected dependence . . .

* . . . based on the landscape . . .

* . . . and expected range of process to be modelled

Closest spacing sn is the shortest distance whose dependence we want to know


Geometric series

A geometric series increases terms by multiplication

It allows us to cover a wide range of distances (possible ranges) with a few stages.

Increase spacing in geometric series:

$s = \sqrt{s_1 \cdot s_n}$

Fill in series with further geometric means


Geometric series: example

First series: $s_1 = 600$ m (stations), $s_5 = 6$ m (closest)

Intermediate spacing: $s_3 = \sqrt{6 \times 600} = 60$ m

Series now {600 m, 60 m, 6 m}

Fill in with the geometric means

* $s_2 = \sqrt{600 \times 60} \approx 190$ m

* $s_4 = \sqrt{60 \times 6} \approx 19$ m

Final series {600 m, 190 m, 60 m, 19 m, 6 m}
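A minimal Python sketch of the same construction. It uses a constant ratio between successive stages, which gives the same result as repeatedly filling in geometric means when the number of stages is 3, 5, 9, and so on. The function name `geometric_spacings` is an assumption for illustration.

```python
import math


def geometric_spacings(widest: float, closest: float, n_stages: int) -> list[float]:
    """Spacings from widest to closest with a constant ratio between stages."""
    if n_stages < 2:
        raise ValueError("need at least two stages")
    ratio = (closest / widest) ** (1.0 / (n_stages - 1))
    return [widest * ratio ** k for k in range(n_stages)]


spacings = geometric_spacings(600.0, 6.0, 5)
print([round(s) for s in spacings])   # [600, 190, 60, 19, 6]
```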


Locating the sample points

Objective: cover the landscape, while avoiding systematic or periodic features

Method: random bearings from centres at each stage

Stations can be along a transect if desired (no spatial dependence)

From a centre at stage $i$, $(E_i, N_i)$, to find a point $(E_{i+1}, N_{i+1})$ at the next spacing $s_{i+1}$:

* $\theta$ = random uniform $[0 \ldots 2\pi]$

* $E_{i+1} = E_i + (s_{i+1} \cdot \sin\theta)$

* $N_{i+1} = N_i + (s_{i+1} \cdot \cos\theta)$
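The random-bearing step could be sketched in Python as follows. The function name `next_point`, the seed, and the station coordinates are hypothetical; only the formulas come from the slide.

```python
import math
import random


def next_point(e: float, n: float, spacing: float, rng: random.Random) -> tuple[float, float]:
    """Return a point at distance `spacing` from (e, n) on a random bearing."""
    theta = rng.uniform(0.0, 2.0 * math.pi)   # bearing, clockwise from north
    return e + spacing * math.sin(theta), n + spacing * math.cos(theta)


rng = random.Random(42)           # fixed seed only to make the sketch repeatable
centre = (1000.0, 2000.0)         # hypothetical station coordinates (E, N), metres
point = next_point(*centre, 190.0, rng)
print(point, math.dist(centre, point))
```

Whatever the bearing, the new point lies exactly one spacing away from its centre, which is what the nested analysis requires.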


Number of sample points

Number of stations selected to cover the area of interest

At each stage $S_i$, the next stage $S_{i+1}$ has in principle double the samples

One is for all the previous centres from stages $S_1 \ldots S_{i-1}$ and one is for the new centre from stage $S_i$

So the total number doubles: half old, half new centres


Unbalanced sampling

After the first 4 stages, use an unbalanced design

Only half the centres at $S_i$ ($i \ge 4$) are further sampled at $S_{i+1}$

This still covers the area, but only uses half the samples at the shortest ranges

Number of pairs is still enough to estimate short-range dependence


Number of sample points: example

Five stages {600m, 190m, 60m, 19m, 6m}

Nine stations: n1 = 9

Double at stages 2 . . . 4: n2 = 18, n3 = 36, n4 = 72

At stage 5, only use half the 72 centres, i.e. 36

Total at stage 5: 72+ 36 = 108 (would have been 144 with balanced sampling)
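The counting rule can be written as a small Python function. The name `points_per_stage` and its `balanced_stages` parameter are assumptions for illustration; the rule itself (double up to stage 4, then subdivide only half the centres) is the one described above.

```python
def points_per_stage(n_stations: int, n_stages: int, balanced_stages: int = 4) -> list[int]:
    """Cumulative number of sample points after each stage of a nested design.

    Stages up to `balanced_stages` double the points; later stages subdivide
    only half of the centres available at the previous stage.
    """
    counts = [n_stations]
    for stage in range(2, n_stages + 1):
        previous = counts[-1]
        new_points = previous if stage <= balanced_stages else previous // 2
        counts.append(previous + new_points)
    return counts


print(points_per_stage(9, 5))                      # [9, 18, 36, 72, 108]
print(points_per_stage(9, 5, balanced_stages=5))   # [9, 18, 36, 72, 144]
```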


Nested ANOVA : Partition Variability by sampling level

Linear model:

$z_{ijk \ldots m} = \mu + A_i + B_{ij} + C_{ijk} + \cdots + Q_{ijk \ldots m} + \varepsilon_{ijk \ldots m}$

Link with regional variable theory (semivariances): $m$ stages; $d_1$ shortest distance at $m$th stage; $d_m$ largest distance at first stage

$\sigma^2_m = \gamma(d_1)$

$\sigma^2_{m-1} + \sigma^2_m = \gamma(d_2)$

...

$\sigma^2_1 + \ldots + \sigma^2_m = \gamma(d_m)$

F-test from ANOVA table; for stage $m+1$: $F = MS_m / MS_{m+1}$
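As a hedged illustration of the link between variance components and semivariances, the sketch below accumulates components from the shortest spacing upwards. The component values are invented for illustration and the function name is an assumption; estimating the components themselves (by nested ANOVA or REML) is not shown.

```python
from itertools import accumulate


def semivariances_from_components(components: list[float]) -> list[float]:
    """Accumulate variance components into semivariances.

    `components` is ordered from stage 1 (largest spacing) to stage m
    (smallest spacing). The result is ordered from the shortest distance d_1
    to the longest distance d_m.
    """
    return list(accumulate(reversed(components)))


# Hypothetical variance components for the five-stage example
# {600 m, 190 m, 60 m, 19 m, 6 m}; the values are invented for illustration.
sigma2 = [0.5, 1.0, 2.0, 1.5, 1.0]
distances = [6, 19, 60, 190, 600]
for d, g in zip(distances, semivariances_from_components(sigma2)):
    print(f"gamma({d} m) = {g}")
```

Plotting these accumulated values against distance gives a rough first variogram.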


Nested ANOVA : Interpretation

There is spatial dependence from the closest spacing until the F-ratio is not

significant.

Samples from this distance are independent

To take advantage of spatial interpolation, must sample closer than this

Can estimate how much of the variation is accounted for at each spacing


Grid sampling for kriging

This assumes the Continuous Model of Spatial Variability (CMSV).

So the soil property is modelled as a random field and the map is made by

kriging.


[Figure, two panels: kriging prediction; kriging prediction variance]

Note: Prediction variance depends only on the spatial configuration of the observations, not on the data values.


Sampling designs with the CMSV: objectives

1. Maximize information

Cover the largest possible area at minimum cost

Minimize some optimization criterion

2. Minimize costs

3. (Incorporate any existing sample; see next subtopic)


What is to be optimized?

An optimization criterion is some numerical measure of the quality of the

sampling design. Some possibilities:

1. Minimize the maximum kriging variance in the area: nowhere is more poorly

predicted than this maximum

2. Minimize the average kriging variance over the entire area


Optimal point configuration (CMSV)

In a square area to be mapped, given a fixed number of points that can be

sampled, in the case of bounded spatial dependence:

Points should lie on some regular pattern; otherwise some points duplicate information at others (in kriging, they will share weights)

Optimal (for both the minimal maximum and minimal average criteria): equilateral triangles (if the triangle is $1^2$, max. distance to a point = $\sqrt{7}/4 \approx 0.661$)

Sub-optimal but close: square grid (max. distance = $\sqrt{2}/2 \approx 0.707$)

* Grid should be slightly perturbed so samples do not line up exactly; avoids

unexpected periodic effects

(Problems: edge effects in small areas; irregular areas.)


Optimal point configuration in the presence of anisotropy

Optimal designs are easily adjusted for anisotropy (different range of spatial

dependence in two orthogonal axes)

The regular grid may be adapted for affine or geometric anisotropy: stretch it in the direction of maximum dependence, based on the anisotropy ratio.

E.g. for a ratio of 0.5, squares become rectangles, with the distance in the

direction with the longest range twice that of the shortest range.


Computing an optimal grid size

Reference: McBratney, A. B. & Webster, R. (1981) The design of optimal

sampling schemes for local estimation and mapping of regionalized variables -

I and II. Computers and Geosciences, 7(4), 331-334 and 335-365; also in Webster & Oliver.

Key point: In kriging, the estimation error is based only on the sample

configuration and the chosen model of spatial dependence, not the actual

data values

So, if we know the spatial structure (variogram model), we can compute the

maximum or average kriging variances before sampling, i.e. before we know

any data values.

This is known as OSSFIM from the original articles.


Error variance

Recall: The kriging variance at a point is given by:

$\hat{\sigma}^2(x_0) = b^T \lambda = 2 \sum_{i=1}^{N} \lambda_i \gamma(x_i, x_0) - \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i, x_j)$

This depends only on the sample distribution (what we want to optimise) and the spatial structure (modelled by the semivariogram)

In a block this will be lowered by the within-block variance $\bar{\gamma}(B, B)$
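The following NumPy sketch evaluates the punctual ordinary kriging variance from sample coordinates and a variogram model alone, with no data values. It assumes a spherical variogram with illustrative parameters (nugget 0.1, partial sill 0.9, range 600 m); the function names and the four-corner layout are assumptions, not taken from the source. Both forms of the formula above are computed so they can be compared.

```python
import numpy as np


def spherical(h, nugget, psill, a):
    """Spherical semivariogram; gamma(0) = 0 by definition."""
    h = np.asarray(h, dtype=float)
    inside = nugget + psill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    gamma = np.where(h < a, inside, nugget + psill)
    return np.where(h == 0.0, 0.0, gamma)


def ok_variance(samples, x0, gamma):
    """Ordinary (punctual) kriging variance at x0 from sample coordinates only."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    # Semivariances among samples, bordered by the unbiasedness constraint.
    dists = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(dists)
    A[n, n] = 0.0
    # Right-hand side: semivariances between samples and the target, then 1.
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(samples - np.asarray(x0, dtype=float), axis=1))
    solution = np.linalg.solve(A, b)   # weights lambda_i, then Lagrange multiplier
    lam = solution[:n]
    var_bt = b @ solution                                   # b^T lambda form
    var_sum = 2.0 * lam @ b[:n] - lam @ A[:n, :n] @ lam     # double-sum form
    return var_bt, var_sum


def gamma(h):
    """Illustrative variogram model; the parameter values are assumptions."""
    return spherical(h, nugget=0.1, psill=0.9, a=600.0)


variances = {}
for spacing in (100.0, 400.0):
    corners = [(0, 0), (spacing, 0), (0, spacing), (spacing, spacing)]
    variances[spacing] = ok_variance(corners, (spacing / 2, spacing / 2), gamma)
    print(spacing, variances[spacing])
```

Because no observed values enter the calculation, the same function can be run for candidate layouts before any fieldwork.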


Reducing kriging error

Once a regular sampling pattern is decided upon (triangles, rectangles, . . . ), the

kriging variance is decreased in two ways:

1. reduce the spacing (finer grid) to reduce semivariances; or

2. increase the block size of the prediction

These can be traded off; but usually the largest possible block size is selected,

based on the minimum decision area.


Error as a function of increasing grid resolution

Consider 4 sample points in a square

The point to predict is the one in the middle (furthest from the samples, so highest kriging variance)

Criterion: minimize the maximum prediction error

If the variogram is short-range, high nugget, low sill, we need a fine grid to take advantage of spatial dependence; high cost

If the variogram is long-range, low nugget, high sill, a coarse grid will give

similar results


Kriging variances at centre point

[Figure, two panels: kriging variance at the centre point (colour scale 0.3 to 1) as a function of spacing (ticks 100 to 400) and block.size (ticks 20 to 120). Panels: long range variogram (1200 m); short range variogram (600 m)]


3 Sampling with prior information

Problem: how to optimally place a limited number of observations in a study

area in order to extract the maximum information at minimum cost.

We consider here the information to be a map over some study area, made by ordinary kriging from the sample points; so the assumptions of the CMSV must be met.

Reference:

van Groenigen, J.-W. (2000). The influence of variogram parameters on optimal

sampling schemes for mapping by kriging. Geoderma, 97(3-4), 223-236.

also contained in the PhD thesis:

van Groenigen, J.-W. Constrained optimisation of spatial sampling. Enschede, NL: ITC.


Problems with the optimal grid

The optimal grid presented in the previous section is optimal only in restricted

circumstances. There are many reasons that approach might not apply:

Edge effects: study area is not infinite

Irregularly-shaped areas, e.g. a flood plain along a river

Off-limits or uninteresting areas, e.g. in a soils study: buildings, rock outcrops, ditches . . .

Existing samples, maybe from a preliminary survey; don't duplicate the effort!

Impossible to compute an optimum analytically (as for the regular grid on an

infinite plane).


Annealing

Slowly cooling a molten mixture of metals into a stable crystal structure.

During annealing the temperature is slowly lowered.

At high temperatures, molecules move around rapidly and over long distances

At low temperatures the system stabilizes.

Critical factor: speed with which temperature is lowered

too fast: stabilize in a sub-optimal configuration

too slow: waste of time


Simulated annealing

This is a numerical analogy to actual annealing:

Some aspect of a numerical system is perturbed

The configuration should approach an optimum

The amount of perturbation is controlled by a temperature


Outline of spatial simulated annealing (SSA)

1. Decide on an optimality criterion

2. Place the desired number of sample points anywhere in the study area (grid, random . . . ); compute fitness according to optimality criterion

3. Repeat (iterate):

(a) Select a point to move; move it a random distance and direction

(b) If outside study area, try again

(c) Compute new fitness

(d) If better, accept new plan; if worse also accept with a certain probability

4. Stop according to some stopping criterion
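The outline above can be sketched in plain Python. To keep the example short, the fitness used here is a geometric stand-in (mean distance from evaluation nodes to the nearest sample) rather than the kriging variance; the study area (a unit square), all parameter values, and the names `anneal`, `inside` and `mean_shortest_distance` are assumptions for illustration.

```python
import math
import random


def mean_shortest_distance(points, nodes):
    """Stand-in fitness: mean distance from each node to its nearest sample."""
    return sum(min(math.dist(p, q) for p in points) for q in nodes) / len(nodes)


def inside(x, y):
    """Hypothetical study area: the unit square."""
    return 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0


def anneal(n_points=10, n_iter=1000, t0=0.05, alpha=0.995, max_step=0.5, seed=1):
    rng = random.Random(seed)
    nodes = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
    # Step 2: initial scheme (random here) and its fitness.
    points = [(rng.random(), rng.random()) for _ in range(n_points)]
    fitness = mean_shortest_distance(points, nodes)
    initial = fitness
    best_points, best_fitness = list(points), fitness
    temperature = t0
    for _ in range(n_iter):                    # Step 4: fixed number of iterations
        # Step 3a: move one point a random distance in a random direction;
        # the maximum distance shrinks with the temperature.
        idx = rng.randrange(n_points)
        step = max_step * (temperature / t0) * rng.random()
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x = points[idx][0] + step * math.sin(theta)
        y = points[idx][1] + step * math.cos(theta)
        if not inside(x, y):                   # Step 3b: outside, try again
            continue
        candidate = list(points)
        candidate[idx] = (x, y)
        new_fitness = mean_shortest_distance(candidate, nodes)   # Step 3c
        # Step 3d: accept if better; if worse, accept with a probability
        # that shrinks as the temperature falls.
        delta = new_fitness - fitness
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            points, fitness = candidate, new_fitness
            if fitness < best_fitness:
                best_points, best_fitness = list(points), fitness
        temperature *= alpha
    return initial, best_fitness, best_points


initial, best, layout = anneal()
print(f"mean shortest distance: {initial:.3f} -> {best:.3f}")
```

Replacing `mean_shortest_distance` with a function that returns the mean or maximum kriging variance, and `inside` with a test against the real study-area boundary, would bring the sketch closer to the method described here.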


Example of a single step

Colour ramp is from blue (low kriging variance) to red (high).

Point at lower right is moved to middle-bottom:

A large hot area (high kriging variance) is now cooler.


Temperature

The distance to move a point is controlled by the temperature; this is used to multiply some distance.

$T_{k+1} = \alpha \cdot T_k$    (1)

where $k$ is the step number and $\alpha < 1$ is an empirical factor that reduces the temperature; we must also specify an initial temperature $T_0$.
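A short Python sketch of this cooling schedule. Applying the rule repeatedly gives $T_k = T_0 \alpha^k$, so the number of steps needed to reach a target temperature can be computed in advance. The values of `t0` and `alpha` are illustrative, not from the source.

```python
import math


def temperature(k: int, t0: float, alpha: float) -> float:
    """Temperature after k cooling steps: T_k = T_0 * alpha**k."""
    return t0 * alpha ** k


def steps_to_reach(t_target: float, t0: float, alpha: float) -> int:
    """Smallest k with T_k <= t_target, for 0 < alpha < 1 and 0 < t_target <= t0."""
    return math.ceil(math.log(t_target / t0) / math.log(alpha))


t0, alpha = 1.0, 0.95            # illustrative values, not from the source
k = steps_to_reach(0.01, t0, alpha)
print(k, temperature(k, t0, alpha))
```

A value of `alpha` closer to 1 cools more slowly: more iterations, but less risk of stabilizing in a sub-optimal configuration.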


Fitness

Several choices, all based on the kriging variance:

Mean over the study area (MEAN OK)

* appropriate when estimating spatial averages to a given precision

Maximum anywhere in the study area (MAX OK)

* appropriate when the entire area must be mapped to a given precision, e.g.

to guarantee there is no health risk in a polluted area.


Stopping criterion

Possibilities:

fixed number of iterations

reach a certain (low) temperature

after a certain number of iterations with no change.


Acceptance criterion

Metropolis criterion: the probability $P(S_0 \to S_1)$ of accepting the new scheme is:

$P(S_0 \to S_1) = 1$, if $\phi(S_1) \le \phi(S_0)$

$P(S_0 \to S_1) = \exp\left(\frac{\phi(S_0) - \phi(S_1)}{c}\right)$, if $\phi(S_1) > \phi(S_0)$    (2)

where $\phi(S_0)$ is the fitness of the current scheme $S_0$, $\phi(S_1)$ is the fitness of the proposed new scheme $S_1$, and $c$ is the temperature. This can also be written:

$p = e^{-\Delta f / T_k}$    (3)

where $T_k$ is the current temperature and $\Delta f$ is the change in fitness due to the proposed new scheme.

Note that $\Delta f$ is positive for a poorer solution, so its negative is used for the exponent.
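The acceptance rule is small enough to write out directly. This is a sketch assuming the fitness is being minimized, as with the kriging-variance criteria; the function names and the example fitness values are illustrative.

```python
import math
import random


def acceptance_probability(fitness_current: float, fitness_proposed: float,
                           temperature: float) -> float:
    """Metropolis criterion for a fitness that is being minimized."""
    delta = fitness_proposed - fitness_current
    if delta <= 0:
        return 1.0                      # better or equal: always accept
    return math.exp(-delta / temperature)


def accept(fitness_current, fitness_proposed, temperature, rng=random):
    """Draw a uniform random number and compare it with the probability."""
    p = acceptance_probability(fitness_current, fitness_proposed, temperature)
    return rng.random() < p


# The same worsening (0.50 -> 0.55) is accepted less often as it gets colder.
for t in (1.0, 0.1, 0.01):
    print(t, acceptance_probability(0.50, 0.55, t))
```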


A real example

Industrial area, existing samples; more must be taken to lower the prediction

variance to a target level everywhere; where to place the new samples?

Reference: van Groenigen, J. W., Stein, A., & Zuurbier, R. (1997). Optimization of environmental sampling using interactive GIS. Soil Technology, 10(2), 83-97.


4 Soil sampling for environmental correlation

We want to make a set of observations of soil properties, from which to build

regression models from a set of environmental covariables, e.g.

terrain parameters

digital imagery

climate-related layers (elevation, aspect . . . )

For example, if z is some soil property:

z = f( zxy

, CTI, z, . . . )

So the soil observations must somehow represent this feature-space, as well as geographic space, efficiently.


Regression modelling

1. Simple (one dominant factor)

2. Multiple

3. (Stepwise: automatic selection of predictor set; dangerous!)

4. Standardized principal components: removes multicollinearity (inter-correlated predictors) and differences between measurement scales

Generally linear models are used; may linearize some predictors if necessary.

$z = \beta_0 + \sum_{i=1}^{n} \beta_i q_i$

(See standard regression textbooks)


Feature-space sampling schemes

The aim is to efficiently sample combinations of feature-space predictors.

But:

not all combinations are found in nature (e.g. steep slope + high TWI)

combinations occupy different proportions of the area

Latin hypercube:

Minasny, B., & McBratney, A. B. (2006). A conditioned Latin hypercube method

for sampling in the presence of ancillary information. Computers & Geosciences,

32(9), 1378-1388.
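For orientation, the sketch below draws a plain (unconditioned) Latin hypercube sample in the unit hypercube with NumPy. It is not the conditioned method of the reference above, which selects real locations from the covariate layers; it only illustrates the stratification idea, where each variable has exactly one sample in each of its equal-probability strata. The function name is an assumption.

```python
import numpy as np


def latin_hypercube(n_samples: int, n_vars: int, rng: np.random.Generator) -> np.ndarray:
    """Plain Latin hypercube sample in the unit hypercube [0, 1)^n_vars."""
    # One random position inside each of n_samples equal-width strata ...
    u = (np.arange(n_samples)[:, None] + rng.random((n_samples, n_vars))) / n_samples
    # ... then shuffle the strata independently for every variable.
    for j in range(n_vars):
        u[:, j] = rng.permutation(u[:, j])
    return u


rng = np.random.default_rng(0)
sample = latin_hypercube(8, 2, rng)
strata = np.floor(sample * 8).astype(int)   # stratum index of each sample
print(np.sort(strata, axis=0).T)            # each row lists strata 0..7 once
```

In practice the unit-interval values would be mapped to quantiles of each covariate (slope, CTI, and so on), and combinations that do not occur in the landscape would have to be handled, which is the problem the conditioned method addresses.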


Mixed sampling schemes

Try to optimize sample placement in both feature and geographic space.

Simulated annealing

Brus, D. J., & Heuvelink, G. B. M. Optimization of sample patterns for universal

kriging of environmental variables. Geoderma, 138(1-2), 86-95.


Designing a sampling plan

1. Define the study area

2. Determine the objectives of the sampling

inferring spatial processes? mapping? decision support?

3. Define costs (budget) vs. benefits (precision needed)

4. Decide on any stratification by differential objectives, costs, benefits

Good luck!
