Page 1

Bayesian methods for parameter estimation and data assimilation with crop models

Part 4: The Metropolis-Hastings algorithm

David Makowski and Daniel Wallach

INRA, France

January 2007

Page 2

Previously

• Approximation of posterior distribution with the Importance Sampling algorithm.

• Implementation with the R statistical software.

• Application to the estimation of one parameter.

Bayes' Theorem:

P(θ | Y) = P(Y | θ) P(θ) / ∫ P(Y | θ) P(θ) dθ

Page 3

Objectives of part 4

• Present another algorithm to approximate the posterior probability distribution, the Metropolis-Hastings algorithm.

• Illustrate with 2 examples. The first has 1 parameter, the second 3 parameters.

• Provide a program in the R language that you can run to implement Metropolis-Hastings for the examples.

• R is free (see http://www.r-project.org/).

Page 4

Two approaches for approximating posterior distributions from Monte Carlo simulations

1. Non-adaptive methods

• All parameter vectors can be generated at the start of the procedure. The choice of parameters to be tested does not depend on the results for previous parameters.

• Example: Importance Sampling (see part 3).

2. Markov chain Monte Carlo (MCMC) methods

• Parameter values are generated from a Markov chain. The parameter value to be tested at stage i+1 can depend on the parameter value at stage i.

• The most important methods are the Metropolis-Hastings algorithm and Gibbs sampling.

Page 5

The Metropolis-Hastings algorithm

General case

Step 0. Choose a starting value θ1. Define a proposal distribution Pp(θc|θi) (for example, a normal distribution with mean equal to θi).

Repeat steps 1-3 for i=1,…,N

Step 1. Generate a candidate parameter value θc from Pp ( θc|θi ).

Step 2. Calculate

T = [ P(Y|θc) P(θc) Pp(θi|θc) ] / [ P(Y|θi) P(θi) Pp(θc|θi) ]

Step 3. If T ≥ 1, then θi+1 = θc. If T < 1, then draw u from a uniform distribution on the interval (0, 1). If u < T then θi+1 = θc, otherwise θi+1 = θi.

The result of the algorithm is a list of N parameter values. The same value may be repeated several times.
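As an illustration of these three steps, here is a minimal generic sketch in R (the language used for the programs of this course). It is written for these notes, not taken from the authors' programs; posterior.unnorm, rproposal, and dproposal are hypothetical functions that the user must supply.

# Generic Metropolis-Hastings (sketch).
# posterior.unnorm(theta): unnormalized posterior P(Y|theta) * P(theta)
# rproposal(theta):        draws a candidate from Pp(. | theta)
# dproposal(x, theta):     proposal density Pp(x | theta)
metropolis.hastings <- function(theta1, N, posterior.unnorm, rproposal, dproposal) {
  chain <- numeric(N + 1)
  chain[1] <- theta1
  for (i in 1:N) {
    theta.i <- chain[i]
    theta.c <- rproposal(theta.i)                                    # Step 1: candidate value
    T <- (posterior.unnorm(theta.c) * dproposal(theta.i, theta.c)) /
         (posterior.unnorm(theta.i) * dproposal(theta.c, theta.i))   # Step 2: ratio
    chain[i + 1] <- if (runif(1) < min(1, T)) theta.c else theta.i   # Step 3: accept/reject
  }
  chain                                                              # list of parameter values
}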

Page 6

The Metropolis-Hastings algorithm

with symmetric proposal distribution

• A common choice for the proposal distribution P(θc|θi) is a normal distribution with mean equal to θi and constant variance.

• In this case P( θc|θi ) = P(θi|θc ) and the expression for T simplifies:

T = [ P(Y|θc) P(θc) ] / [ P(Y|θi) P(θi) ]

where P(Y|θ) is the likelihood and P(θ) is the prior density.

Page 7

The Metropolis-Hastings algorithm

Choices to be made

• The proposal distribution. The number of iterations required for a good approximation to the posterior distribution depends on this choice.

• The number of iterations. A large number is generally necessary (N=10000, 100000 or more depending on the problem).

• The number of parameter values in the list to be discarded (to reduce dependence on the value chosen for starting the algorithm).

• We will give some suggestions for these choices with example 1.
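For the last point (burn-in), assuming the sampled values are stored in a vector chain as in the sketch above, discarding the first half and summarizing the rest might look like this; the variable names are ours:

# Discard the first half of the chain (burn-in), then summarize the rest.
burnin <- length(chain) %/% 2
kept   <- chain[-(1:burnin)]
mean(kept); sd(kept)       # approximate posterior mean and standard deviation
hist(kept, freq = FALSE)   # approximate posterior density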

Page 8

Example 1: Estimation of crop yield (already presented in parts 2 and 3).

• The single unknown parameter is the yield of a particular field. The prior information is an expert’s opinion. There is also information from a measurement. Both the prior density and the likelihood are normal distributions.

• In this case, the exact expression of the posterior distribution is known.

• This example is used to show that the Metropolis-Hastings method can give a good approximation of the posterior distribution.

Page 9

Example 1 – Exact posterior distribution

From part 2, we have:

• Measurement: Y=9 t/ha (sd=1)

• Prior distribution: P(θ) = N(5, 2²)

• Likelihood: P(Y|θ ) = N(θ, 1)

• Exact posterior distribution: P(θ|Y) = N(8.2, 0.8), i.e. posterior variance 0.8 (sd ≈ 0.89)
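These values follow from the standard formulas for a normal prior N(μ0, σ0²) combined with a normal likelihood with known variance σ²:

E(θ|Y) = (σ0² Y + σ² μ0) / (σ0² + σ²) = (4 × 9 + 1 × 5) / (4 + 1) = 8.2

Var(θ|Y) = σ0² σ² / (σ0² + σ²) = (4 × 1) / (4 + 1) = 0.8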

[Figure: prior probability distribution, likelihood function, and posterior probability distribution, plotted as density (0 to 0.5) against theta (0 to 12 t/ha).]
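The figure can be reproduced in R with a few lines; this is an illustrative sketch written for these notes, not the authors' code:

# Prior, likelihood, and exact posterior for Example 1.
theta <- seq(0, 12, by = 0.01)
plot(theta, dnorm(theta, 5, 2), type = "l", lty = 2,
     xlab = "Theta (t/ha)", ylab = "Density", ylim = c(0, 0.5))
lines(theta, dnorm(9, theta, 1), lty = 3)            # likelihood of Y = 9
lines(theta, dnorm(theta, 8.2, sqrt(0.8)), lty = 1)  # exact posterior
legend("topleft", lty = c(2, 3, 1),
       legend = c("Prior", "Likelihood", "Posterior"))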

Page 10

Example 1 – Metropolis-Hastings

Step 0. Choose θ1 = 5 t/ha. As proposal distribution, use a normal distribution: P(θc|θi) = N(θi, 0.8²).

Repeat steps 1-3 for i=1,…,N

Step 1. Generate a candidate parameter value θc from P( θc|θi ).

Step 2. Calculate

T = [ P(9|θc) P(θc) ] / [ P(9|θi) P(θi) ]

with

P(9|θ) = (1/√(2π)) exp( −(9 − θ)² / 2 )

and

P(θ) = (1/√(2π × 2²)) exp( −(θ − 5)² / (2 × 2²) )

Step 3. If u < min(1, T), where u is drawn from a uniform distribution on the interval (0, 1), then θi+1 = θc, otherwise θi+1 = θi.
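Putting the three steps together for this example gives the following R sketch (written for these notes; the authors' MHyield.txt program may differ in detail):

# Metropolis-Hastings for Example 1.
N <- 50000
chain <- numeric(N + 1)
chain[1] <- 5                                     # starting value theta_1, in t/ha
for (i in 1:N) {
  theta.c <- rnorm(1, chain[i], 0.8)              # Step 1: symmetric proposal N(theta_i, 0.8^2)
  T <- (dnorm(9, theta.c, 1) * dnorm(theta.c, 5, 2)) /
       (dnorm(9, chain[i], 1) * dnorm(chain[i], 5, 2))              # Step 2: ratio
  chain[i + 1] <- if (runif(1) < min(1, T)) theta.c else chain[i]   # Step 3: accept/reject
}
kept <- chain[-(1:(N / 2))]                       # discard the first half as burn-in
mean(kept)                                        # should be close to 8.2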

Page 11

Example 1 – Results: N = 500, chain 1.

[Figure: chain of parameter values; posterior distribution approximated from the last 250 values, compared with the true posterior distribution and the measurement.]

Page 12

Example 1 – Results: N = 500, chain 2.

[Figure: chain of parameter values; posterior distribution approximated from the last 250 values, compared with the true posterior distribution and the measurement.]

Page 13

Example 1 – Results: N = 50000, chain 1.

[Figure: chain of parameter values; posterior distribution approximated from the last 25000 values, compared with the true posterior distribution and the measurement.]

Page 14

Example 1 – Results: N = 50000, chain 2.

[Figure: chain of parameter values; posterior distribution approximated from the last 25000 values, compared with the true posterior distribution and the measurement.]

Page 15

Example 1 – Running the R program

• Save the file "MHyield.txt" on your computer and note the path. This file contains the R program that does the calculations.

• Open R.

• Use the "source" command to run the program:

– The command is given as a comment in the first line of the program.

– In my case, I had to type: source("c:\\David\\Enseignements\\Cours ICASA\\MHyield.txt").

– Replace the path name by your own path name.

– Copy and paste the corrected command (without the "#" character) into the R command window.

– Press RETURN to execute.

• You can easily change the value of N, the measurement value, its accuracy… See the comments in the R function MHyield.

Page 16

Example 2: Estimation of the three parameters of a model of yield response to fertilizer.

• Nonlinear model relating wheat yield to nitrogen fertilizer dose.

Yield = θ1 + θ2 (Dose – θ3) if Dose < θ3

Yield = θ1 if Dose ≥ θ3

• Objective: Estimation of the 3 parameters for a given wheat field.

[Figure: yield plotted against dose. The curve rises linearly with slope θ2 up to the threshold dose θ3, then plateaus at the maximal yield θ1.]
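In R, the model can be written as a short function; the name f.response is ours:

# Linear-plus-plateau response of yield to N dose.
f.response <- function(dose, theta) {
  # theta = c(theta1, theta2, theta3) = plateau yield, slope, dose threshold
  ifelse(dose < theta[3],
         theta[1] + theta[2] * (dose - theta[3]),  # linear part
         theta[1])                                 # plateau
}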

Page 17

Example 2 – Prior distribution

• The prior distribution of the parameters was defined in a previous study (Makowski and Lavielle. 2006. JABES 11, 45-60).

• It represents the between-field variability of the parameter values in a region (the Paris Basin).

P(θ1) = N(9.18, 1.16²) t/ha (maximal yield value)

P(θ2) = N(0.026, 0.0065²) t/kg N (slope of the linear part)

P(θ3) = N(123.85, 46.7²) kg N/ha (N dose threshold)

• The prior means define the "average" response curve in the region of interest.

Page 18

Example 2 - Data

Data collected in a new wheat plot in the same region.

Four yield measurements obtained in this plot for four different N doses.

Tested doses: 0, 50, 100, and 200 kg/ha.

Corresponding yield measurements in the plot: 2.50, 5.01, 7.45, and 7.51 t/ha.
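In R, these data are simply two vectors (the names are ours):

# Doses and measured yields for the new plot.
dose  <- c(0, 50, 100, 200)          # kg N/ha
yield <- c(2.50, 5.01, 7.45, 7.51)   # t/ha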

Page 19

Example 2 - Likelihood

P(Y1, Y2, Y3, Y4 | θ) = P(Y1 | θ) P(Y2 | θ) P(Y3 | θ) P(Y4 | θ)

with

P(Yj | θ) = (1/√(2πσ²)) exp( −(Yj − f(Dj; θ))² / (2σ²) )

f(D; θ) is the linear-plus-plateau response function (D = N dose).

σ was estimated in a previous study and is set equal to 0.3.
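In R, working on the log scale for numerical stability, the likelihood and the prior of the previous slide can be coded as follows. This is a sketch assuming the f.response function and the dose and yield vectors defined above; it is not the authors' MHresponse.txt program.

# Log-likelihood of the four yield measurements.
loglik <- function(theta, dose, yield, sigma = 0.3) {
  sum(dnorm(yield, mean = f.response(dose, theta), sd = sigma, log = TRUE))
}

# Log prior density: independent normals from the previous slide.
logprior <- function(theta) {
  dnorm(theta[1], 9.18,   1.16,   log = TRUE) +
  dnorm(theta[2], 0.026,  0.0065, log = TRUE) +
  dnorm(theta[3], 123.85, 46.7,   log = TRUE)
}

With a symmetric proposal, a candidate θc is then accepted when log(u) < loglik(θc, dose, yield) + logprior(θc) − loglik(θi, dose, yield) − logprior(θi).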

Page 20

Example 2 – Results with N=50000 – Chain 1.

[Figure: chains of the three parameter values.]

Page 21

Example 2 – Results with N=50000 – Chain 1.

[Figure: response curve based on the prior means compared with the curve based on the posterior means.]

Page 22

Example 2 – Results with N=50000 – Chain 2.

[Figure: chains of the three parameter values.]

Page 23

Example 2 – Results with N=50000 – Chain 2.

[Figure: response curve based on the prior means compared with the curve based on the posterior means.]

Page 24

Example 2 – Running the R program

• The R program is in the file MHresponse.txt.

• To run this function yourself, follow the previous instructions.

• Press RETURN after the first series of graphs to obtain the second series.

Page 25

Conclusion: Importance Sampling versus MCMC

• Both methods can be used to approximate the posterior distribution of parameters for any model.

• Both methods require the definition of a proposal distribution to generate parameter values. This is not easy in practice.

• The comparison of the two types of methods is an active area of research.

• MCMC methods (Gibbs sampling and MH) can easily be implemented with the WinBUGS software. See http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml.

