Bayesian methods for parameter estimation and data assimilation with crop models
Part 4: The Metropolis-Hastings algorithm
David Makowski and Daniel Wallach
INRA, France
January 2007
Previously
• Approximation of posterior distribution with the Importance Sampling algorithm.
• Implementation with the R statistical software.
• Application to estimate one parameter.
Bayes' Theorem:
P(θ|Y) = P(Y|θ) P(θ) / P(Y)
Objectives of part 4
• Present another algorithm to approximate the posterior probability distribution, the Metropolis-Hastings algorithm.
• Illustrate with 2 examples. The first has 1 parameter, the second 3 parameters.
• Furnish a program in the R language that you can run to implement Metropolis-Hastings for the examples.
• R is free (see http://www.r-project.org/).
Two approaches for approximating posterior distributions from Monte Carlo simulations
1. Non-adaptive methods
• All parameter vectors can be generated at the start of the procedure. The choice of parameters to be tested does not depend on the results for previous parameters.
•Example: Importance Sampling (see part 3).
2. Markov chain Monte Carlo (MCMC) methods
• Parameter values are generated from a Markov chain. The parameter value to be tested at stage i+1 can depend on the parameter value at stage i.
• The most important methods are the Metropolis-Hastings algorithm and Gibbs sampling.
The Metropolis-Hastings algorithm
General case
Step 0. Choose a starting value θ1. Define a proposal distribution Pp(θc|θi). (For example, use a normal distribution with mean equal to θi.)
Repeat steps 1-3 for i=1,…,N
Step 1. Generate a candidate parameter value θc from Pp(θc|θi).
Step 2. Calculate
T = [P(Y|θc) P(θc) Pp(θi|θc)] / [P(Y|θi) P(θi) Pp(θc|θi)]
Step 3. If T ≥ 1, then θi+1 = θc. If T < 1, draw u from a uniform distribution on the interval (0, 1); if u < T then θi+1 = θc, otherwise θi+1 = θi.
The result of the algorithm is a list of N parameter values. The same value may be repeated several times.
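The three steps above can be written as a generic function. The course's own implementation is in R (the file MHyield.txt presented later); the sketch below uses Python for illustration and assumes the caller supplies the log of the unnormalized posterior, log[P(Y|θ) P(θ)].

```python
import math
import random

def metropolis_hastings(log_post, theta1, proposal_sd, n_iter, seed=0):
    """Metropolis-Hastings with a symmetric normal (random-walk) proposal.

    log_post: log of the unnormalized posterior, log[P(Y|theta) P(theta)].
    Returns the list of n_iter parameter values; the same value may be
    repeated several times when candidates are rejected.
    """
    rng = random.Random(seed)
    chain = [theta1]                                    # Step 0: starting value
    for _ in range(n_iter - 1):
        theta_i = chain[-1]
        theta_c = rng.gauss(theta_i, proposal_sd)       # Step 1: candidate
        log_t = log_post(theta_c) - log_post(theta_i)   # Step 2: log of T
        # Step 3: accept the candidate with probability min(1, T)
        if log_t >= 0 or rng.random() < math.exp(log_t):
            chain.append(theta_c)
        else:
            chain.append(theta_i)
    return chain
```

Working with log T rather than T avoids numerical underflow of the densities. Because the normal proposal is symmetric, the Pp terms of the general ratio cancel here; for an asymmetric proposal, the term log Pp(θi|θc) − log Pp(θc|θi) would have to be added to log_t.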
The Metropolis-Hastings algorithm
with symmetric proposal distribution
• A common choice for the proposal distribution P(θc|θi) is a normal distribution with mean equal to θi and constant variance.
• In this case P( θc|θi ) = P(θi|θc ) and the expression for T simplifies:
c c
i i
P Y PT
P Y P
Likelihood Prior density
The Metropolis-Hastings algorithm
Choices to be made
• The proposal distribution. The number of iterations required for a good approximation to the posterior distribution depends on this choice.
• The number of iterations. A large number is generally necessary (N=10000, 100000 or more depending on the problem).
• The number of parameter values in the list to be discarded (to reduce dependence on the value chosen for starting the algorithm).
• We will give some suggestions for these choices with example 1.
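One simple diagnostic for the proposal choice is the fraction of accepted candidates: very high acceptance suggests steps that are too small (slow exploration), very low acceptance steps that are too large. The sketch below is a hypothetical Python helper, not part of the course's R programs.

```python
import math
import random

def acceptance_rate(log_post, theta1, proposal_sd, n_iter, seed=0):
    """Fraction of accepted candidates over a pilot run: a rough
    diagnostic for tuning the proposal standard deviation.
    (Hypothetical helper; log_post is the log unnormalized posterior.)"""
    rng = random.Random(seed)
    theta_i, accepted = theta1, 0
    for _ in range(n_iter):
        theta_c = rng.gauss(theta_i, proposal_sd)
        # u < min(1, T) in log form; log u < 0, so T >= 1 always accepts
        if math.log(rng.random()) < log_post(theta_c) - log_post(theta_i):
            theta_i, accepted = theta_c, accepted + 1
    return accepted / n_iter
```

For random-walk proposals, acceptance rates of very roughly 20-50% are often quoted as a reasonable target; a proposal sd much wider than the posterior drives the rate down.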
Example 1
Example already presented in parts 2 and 3: estimation of crop yield.
• The single unknown parameter is the yield of a particular field. The prior information is an expert’s opinion. There is also information from a measurement. Both the prior density and the likelihood are normal distributions.
• In this case, the exact expression of the posterior distribution is known.
• This example is used to show that the Metropolis-Hastings method can give a good approximation of the posterior distribution.
Example 1 – Exact posterior distribution
From part 2, we have:
• Measurement: Y=9 t/ha (sd=1)
• Prior distribution: P(θ) = N(5, 2²)
• Likelihood: P(Y|θ ) = N(θ, 1)
• Exact posterior distribution: P(θ|Y) = N(8.2, 0.89²), i.e. posterior mean 8.2 and posterior variance 1/(1/2² + 1/1²) = 0.8
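The exact posterior follows from the conjugacy of a normal prior with a normal likelihood: posterior precision is the sum of prior and data precisions. The numbers above can be checked directly (Python used here for illustration):

```python
# Normal prior N(mu0, s0^2) combined with one normal measurement y of
# variance s^2: posterior precision = prior precision + data precision.
mu0, s0_sq = 5.0, 4.0   # prior N(5, 2^2)
y, s_sq = 9.0, 1.0      # measurement 9 t/ha, sd = 1

post_var = 1.0 / (1.0 / s0_sq + 1.0 / s_sq)       # posterior variance: 0.8
post_mean = post_var * (mu0 / s0_sq + y / s_sq)   # posterior mean: 8.2
```

The posterior standard deviation is sqrt(0.8) ≈ 0.89, and the mean 8.2 sits between the prior mean 5 and the measurement 9, pulled toward the more precise measurement.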
[Figure: prior probability distribution, likelihood function, and posterior probability distribution as functions of θ (t/ha).]
Example 1 – Metropolis-Hastings
Step 0. Choose θ1 = 5 t/ha. As proposal distribution, use a normal distribution: P(θc|θi) = N(θi, 0.8²).
Repeat steps 1-3 for i=1,…,N
Step 1. Generate a candidate parameter value θc from P(θc|θi).
Step 2. Calculate
T = [P(9|θc) P(θc)] / [P(9|θi) P(θi)]
with
P(9|θ) = (1/√(2π)) exp(−(9 − θ)²/2)
and
P(θ) = (1/(2√(2π))) exp(−(θ − 5)²/(2 × 2²))
Step 3. Draw u from a uniform distribution on the interval (0, 1). If u < min(1, T), then θi+1 = θc; otherwise θi+1 = θi.
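Steps 0-3 can be run end to end for this example. The course provides the R program MHyield.txt for this; the following is a Python sketch of the same computation, using the starting value and proposal from this slide.

```python
import math
import random

def log_post(theta):
    # log of P(9|theta) * P(theta), constants dropped:
    # likelihood N(theta, 1) at y = 9, prior N(5, 2^2)
    return -((9.0 - theta) ** 2) / 2.0 - ((theta - 5.0) ** 2) / 8.0

rng = random.Random(42)
n_iter = 50000
chain = [5.0]                            # Step 0: theta_1 = 5 t/ha
for _ in range(n_iter - 1):
    theta_i = chain[-1]
    theta_c = rng.gauss(theta_i, 0.8)    # Step 1: proposal N(theta_i, 0.8^2)
    log_t = log_post(theta_c) - log_post(theta_i)   # Step 2
    # Step 3: accept with probability min(1, T)
    chain.append(theta_c if math.log(rng.random()) < log_t else theta_i)

tail = chain[25000:]                     # discard the first half as burn-in
mean = sum(tail) / len(tail)
sd = math.sqrt(sum((x - mean) ** 2 for x in tail) / len(tail))
# mean and sd should be close to the exact posterior values 8.2 and 0.89
```

With N = 50000 and 25000 values discarded, the chain's mean and standard deviation approximate the exact posterior well, as the results on the next slides show.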
Example 1 – Results, N = 500, chain 1
[Figure: chain of parameter values; posterior distribution approximated from the last 250 values, compared with the true posterior distribution and the measurement.]
Example 1 – Results, N = 500, chain 2
[Figure: chain of parameter values; posterior distribution approximated from the last 250 values, compared with the true posterior distribution and the measurement.]
Example 1 – Results, N = 50000, chain 1
[Figure: chain of parameter values; posterior distribution approximated from the last 25000 values, compared with the true posterior distribution and the measurement.]
Example 1 – Results, N = 50000, chain 2
[Figure: chain of parameter values; posterior distribution approximated from the last 25000 values, compared with the true posterior distribution and the measurement.]
Example 1 – Running the R program
• Save the file "MHyield.txt" on your computer and note its path. This file contains the R program that does the calculations.
• Open R.
• You will use the "source" command to run the program:
– The command is given as a comment in the first line of the program.
– In my case, I had to type: source("c:\\David\\Enseignements\\Cours ICASA\\MHyield.txt").
– You must replace the path name by your own path name.
– Copy and paste the corrected command (without the "#" character) into the Commands window of R.
– Press RETURN to execute.
• You can easily change the value of N, the measurement value, its accuracy… See comments in my R function MHyield.
Example 2
Estimation of the three parameters of a model of yield response to fertilizer.
• Nonlinear model relating wheat yield to nitrogen fertilizer dose.
Yield = θ1 + θ2 (Dose – θ3) if Dose < θ3
Yield = θ1 if Dose ≥ θ3
• Objective: Estimation of the 3 parameters for a given wheat field.
[Figure: linear-plus-plateau response curve of yield versus N dose, with plateau θ1, slope θ2, and threshold dose θ3.]
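The two-branch model above translates directly into code. A minimal Python sketch (the parameter names follow the slide):

```python
def yield_response(dose, theta1, theta2, theta3):
    """Linear-plus-plateau model: yield rises with slope theta2 (t/kg N)
    up to the threshold dose theta3 (kg N/ha), then stays at the
    plateau value theta1 (t/ha)."""
    if dose < theta3:
        return theta1 + theta2 * (dose - theta3)
    return theta1
```

Note that the two branches join continuously at Dose = θ3, where both give Yield = θ1.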
Example 2 – Prior distribution
• The prior distribution of the parameters was defined in a previous study (Makowski and Lavielle. 2006. JABES 11, 45-60).
• It represents the between-field variability of the parameter values in a region (the Paris basin).
P(θ1) = N(9.18, 1.16²) t/ha (maximal yield value)
P(θ2) = N(0.026, 0.0065²) t/kg N (slope of the linear part)
P(θ3) = N(123.85, 46.7²) kg N/ha (N dose threshold)
• The prior means define the "average" response curve in the region of interest.
Example 2 - Data
Data collected in a new wheat plot in the same region.
Four yield measurements obtained in this plot for four different N doses.
Tested doses: 0, 50, 100, and 200 kg/ha.
Corresponding yield measurements in the plot: 2.50, 5.01, 7.45, and 7.51 t/ha.
Example 2 - Likelihood
P(Y1, Y2, Y3, Y4 | θ) = P(Y1|θ) P(Y2|θ) P(Y3|θ) P(Y4|θ)
with
P(Yj|θ) = (1/(σ√(2π))) exp(−(Yj − f(Dj; θ))²/(2σ²))
f(D; θ) is the linear-plus-plateau response function (D = N dose).
σ was estimated in a previous study and is set equal to 0.3.
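This likelihood, combined with the three prior distributions given earlier, defines the log-posterior for the three-parameter chain. The course's program is the R file MHresponse.txt; the Python sketch below is illustrative only, and the per-parameter proposal standard deviations are guesses of mine to be tuned, not values from the slides.

```python
import math
import random

doses = [0.0, 50.0, 100.0, 200.0]      # tested N doses (kg/ha)
yields = [2.50, 5.01, 7.45, 7.51]      # measured yields (t/ha)
sigma = 0.3                            # measurement sd, fixed

prior_means = [9.18, 0.026, 123.85]    # theta1, theta2, theta3
prior_sds = [1.16, 0.0065, 46.7]

def f(dose, th):
    # linear-plus-plateau response function
    return th[0] + th[1] * (dose - th[2]) if dose < th[2] else th[0]

def log_post(th):
    # log likelihood: product of 4 independent normal densities
    ll = sum(-((y - f(d, th)) ** 2) / (2 * sigma ** 2)
             for d, y in zip(doses, yields))
    # log prior: independent normals for the 3 parameters
    lp = sum(-((t - m) ** 2) / (2 * s ** 2)
             for t, m, s in zip(th, prior_means, prior_sds))
    return ll + lp

rng = random.Random(7)
proposal_sds = [0.3, 0.003, 10.0]      # illustrative guesses, one per parameter
chain = [prior_means[:]]               # start the chain at the prior means
for _ in range(50000 - 1):
    th_i = chain[-1]
    th_c = [rng.gauss(t, s) for t, s in zip(th_i, proposal_sds)]
    log_t = log_post(th_c) - log_post(th_i)
    chain.append(th_c if math.log(rng.random()) < log_t else th_i)

tail = chain[25000:]                   # discard the first half as burn-in
post_means = [sum(th[k] for th in tail) / len(tail) for k in range(3)]
```

Here all three parameters are updated together in one Metropolis-Hastings step; updating them one at a time, or via Gibbs sampling, are common alternatives.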
Example 2 – Results with N = 50000, chain 1
[Figure: chains of the three parameter values.]
Example 2 – Results with N = 50000, chain 1
[Figure: response curve based on the prior means versus the curve based on the posterior means.]
Example 2 – Results with N = 50000, chain 2
[Figure: chains of the three parameter values.]
Example 2 – Results with N = 50000, chain 2
[Figure: response curve based on the prior means versus the curve based on the posterior means.]
Example 2 – Running the R program
• The R program is in the file MHresponse.txt.
• To run this function yourself, follow the previous instructions.
• Press RETURN after the first series of graphs to obtain the second series.
Conclusion
Importance sampling versus MCMC
• Both methods can be used to approximate the posterior distribution of parameters for any model.
• Both methods require the definition of a proposal distribution to generate parameter values, which is not easy in practice.
• The comparison of the two types of methods is an active area of research.
• MCMC methods (Gibbs sampling and Metropolis-Hastings) can easily be implemented with the WinBUGS software. See http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml.