July 11, 2006 Bayesian Inference and Maximum Entropy 2006 1
Probing the covariance matrix
Kenneth M. Hanson
T-16, Nuclear Physics; Theoretical Division
Los Alamos National Laboratory
This presentation available at http://www.lanl.gov/home/kmh/
LA-UR-06-xxxx
Bayesian Inference and Maximum Entropy Workshop, July 9-13, 2006

Overview
• Analogy between minus-log-probability and a physical potential
• Gaussian approximation
• Probing the covariance matrix with an external force
  ► deterministic technique to replace stochastic calculations
• Examples
• Potential applications

Analogy to physical system
• Analogy between minus-log-posterior and a physical potential:
  $\varphi(\mathbf{a}) = -\log p(\mathbf{a} \mid \mathbf{d}, I)$
  ► $\mathbf{a}$ represents the parameters
  ► $\mathbf{d}$ represents the data
  ► $I$ represents background information, essential for modeling
• Gradient $\partial_{\mathbf{a}} \varphi$ corresponds to forces acting on the parameters
• Maximum a posteriori (MAP) estimate of the parameters is $\hat{\mathbf{a}}_{\mathrm{MAP}}$
  ► condition is $\partial_{\mathbf{a}} \varphi = 0$
  ► optimized model may be interpreted as a mechanical system in equilibrium: the net force on each parameter is zero
• This analogy is very useful for Bayesian inference
  ► conceptualization
  ► developing algorithms
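
Gaussian approximation
• Posterior distribution is very often well approximated by a Gaussian
• Then $\varphi$ is quadratic in the perturbations of the model parameters, $\delta\mathbf{a} = \mathbf{a} - \hat{\mathbf{a}}$, from the minimum of $\varphi$ at $\hat{\mathbf{a}}$:
  $\varphi(\mathbf{a}) = \varphi_{\min} + \tfrac{1}{2}\, \delta\mathbf{a}^{\mathsf T} \mathbf{K}\, \delta\mathbf{a}$
  where $\mathbf{K}$ is the curvature matrix of $\varphi$ (aka Hessian)
• Uncertainties in the estimated parameters are summarized by the covariance matrix:
  $\mathbf{C} \equiv \operatorname{cov}(\mathbf{a}) = \langle (\mathbf{a} - \hat{\mathbf{a}})(\mathbf{a} - \hat{\mathbf{a}})^{\mathsf T} \rangle = \mathbf{K}^{-1}$
• Inference process reduced to finding $\hat{\mathbf{a}}$ and $\mathbf{C}$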
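As a minimal sketch of the Gaussian approximation, the curvature matrix K can be estimated by central finite differences of φ at its minimum and then inverted to give C. The helper name `hessian_fd`, the step size, and the two-parameter example potential are illustrative assumptions, not taken from the presentation.

```python
import numpy as np

def hessian_fd(phi, a_hat, h=1e-4):
    """Central finite-difference estimate of the Hessian of phi at a_hat."""
    a_hat = np.asarray(a_hat, dtype=float)
    n = a_hat.size
    E = h * np.eye(n)                      # row i is a step of size h along axis i
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = (phi(a_hat + E[i] + E[j]) - phi(a_hat + E[i] - E[j])
                       - phi(a_hat - E[i] + E[j]) + phi(a_hat - E[i] - E[j])) / (4 * h * h)
    return K

# Hypothetical quadratic potential with a known curvature matrix.
K_true = np.array([[4.0, 1.0], [1.0, 2.0]])
a_min = np.array([0.5, -1.0])

def phi(a):
    da = a - a_min
    return 0.5 * da @ K_true @ da

K_est = hessian_fd(phi, a_min)
C_est = np.linalg.inv(K_est)               # covariance matrix C = K^{-1}
sigma = np.sqrt(np.diag(C_est))            # standard deviations of the parameters
```

This brute-force route needs on the order of n² evaluations of φ, which is part of the motivation for probing only the quantities of interest.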
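
External force
• Consider applying a constant external force $\mathbf{f}$ to the parameters
• Effect is to add a piece that is linear in the parameters to the potential:
  $\varphi'(\mathbf{a}) = \varphi_{\min} + \tfrac{1}{2}\, \delta\mathbf{a}^{\mathsf T} \mathbf{K}\, \delta\mathbf{a} - \mathbf{f}^{\mathsf T} \delta\mathbf{a}$
• Gradient of perturbed potential is
  $\partial_{\mathbf{a}} \varphi' = \mathbf{K}\, \delta\mathbf{a} - \mathbf{f}$
• At the new minimum, gradient is zero, or
  $\delta\mathbf{a} = \mathbf{K}^{-1} \mathbf{f} = \mathbf{C}\, \mathbf{f}$
• Displacement of the minimum, $\delta\mathbf{a}$, is proportional to covariance matrix times the force
• With the external force, one may "probe" the covariance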
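The relation between displacement and force can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-parameter quadratic potential; the curvature matrix, the force, and the plain gradient-descent minimizer are illustrative choices rather than anything specified in the presentation.

```python
import numpy as np

K = np.array([[4.0, 1.0], [1.0, 2.0]])   # assumed curvature matrix
C = np.linalg.inv(K)                     # covariance, used only for comparison
a_hat = np.array([0.5, -1.0])            # assumed unperturbed minimum
f = np.array([1.0, 0.0])                 # constant force on the first parameter

def grad_perturbed(a):
    """Gradient of phi'(a) = 0.5 da^T K da - f^T da, with da = a - a_hat."""
    return K @ (a - a_hat) - f

# Deterministic re-minimization of the perturbed potential by gradient descent.
a = a_hat.copy()
step = 1.0 / np.linalg.eigvalsh(K).max()
for _ in range(2000):
    a -= step * grad_perturbed(a)

delta_a = a - a_hat                      # displacement of the minimum
```

Here `delta_a` reproduces the first column of `C`. Because `C` has off-diagonal elements, the displacement is not parallel to the applied force.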
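
Effect of external force
• Displacement of minimizer of $\varphi$ is in a direction different from that of the applied force
  ► its direction is affected by the covariance matrix
[Figure: 2-D parameter space with axes a and b, showing a $\varphi$ contour, the force $\mathbf{f}$, and the displacement $\delta\mathbf{a}$]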

Fit straight line to data
• Linear model: $y = a + b x$
• Simulate 10 data points, exact values: $a = 0.5$, $b = 0.5$, with $\sigma_y = 0.2$
• Determine parameters, intercept a and slope b, by minimizing chi-squared (standard least-squares analysis)
• Result:
  $\hat{a} = 0.484$, $\sigma_a = 0.127$; $\hat{b} = 0.523$, $\sigma_b = 0.044$
  $\chi^2_{\min} = 4.04$, $p = 0.775$
  $\mathbf{R} = \begin{pmatrix} 1 & -0.867 \\ -0.867 & 1 \end{pmatrix}$
• Strong correlations between a and b
[Figure: 10 data points with best-fit line; scatter plot]
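The least-squares analysis can be sketched as follows. The x positions of the data are not given in the slides, so the evenly spaced grid used here is an assumption, as are the random seed and variable names; the simulated noise will therefore not reproduce the fitted values quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)              # arbitrary seed
x = 0.25 + 0.5 * np.arange(10)              # assumed x grid (not given in the slides)
sigma_y = 0.2
y = 0.5 + 0.5 * x + sigma_y * rng.normal(size=x.size)

# phi = chi^2 / 2, with chi^2 = sum((y - a - b x)^2) / sigma_y^2
A = np.column_stack([np.ones_like(x), x])   # design matrix for parameters (a, b)
K = A.T @ A / sigma_y**2                    # curvature matrix of phi
C = np.linalg.inv(K)                        # covariance matrix
a_hat, b_hat = C @ (A.T @ y) / sigma_y**2   # least-squares estimates

sig = np.sqrt(np.diag(C))                   # (sigma_a, sigma_b)
R = C / np.outer(sig, sig)                  # correlation matrix
chi2_min = np.sum((y - a_hat - b_hat * x) ** 2) / sigma_y**2
```

For a linear model the covariance does not depend on the noise realization. With this assumed grid it gives σ_a ≈ 0.127, σ_b ≈ 0.044, and a correlation coefficient of about −0.867.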

Apply force to solution
• Apply upward force to solution line at x = 0 and find new minimum in $\varphi$
• Effect is to pull line upward at x = 0 and reduce its slope
  ► data constrain solution
• Conclude that parameters a (intercept) and b (slope) are anti-correlated
• Furthermore, these relationships yield quantified results
[Figure: pull upward on line]

Straight line fit
• Family of lines for forces applied upward at x = 0: $f = \pm 1, \pm 2$ in units of $\sigma_a^{-1}$
[Figure: family of fitted lines under these forces]

Straight line fit
• Family of lines for forces applied upward at x = 0
• Plot on top shows
  ► perturbations proportional to f
  ► slope of $\delta a$ versus f is $\sigma_a^2 = C_{aa}$
  ► slope of $\delta b$ versus f is $C_{ab}$
• Plot below shows $\varphi$ (or $\chi^2$) is a quadratic function of force
  ► for force of $f = \pm\sigma_a^{-1}$, min $\varphi$ increases by 0.5, or min $\chi^2$ increases by 1
• Either dependence provides a way to quantify variance
[Figure: two plots versus force f applied at x = 0]
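Both dependences can be reproduced with a short sketch. It assumes the same hypothetical x grid and seed as any illustrative fit (neither is given in the slides), and it represents an upward pull on the line at x = 0 as a force acting on the intercept a alone, since y(0) = a.

```python
import numpy as np

rng = np.random.default_rng(0)              # arbitrary seed
x = 0.25 + 0.5 * np.arange(10)              # assumed x grid (not given in the slides)
sigma_y = 0.2
y = 0.5 + 0.5 * x + sigma_y * rng.normal(size=x.size)
A = np.column_stack([np.ones_like(x), x])
K = A.T @ A / sigma_y**2

def phi(p):
    """phi = chi^2 / 2 for line parameters p = (a, b)."""
    return 0.5 * np.sum((y - A @ p) ** 2) / sigma_y**2

def minimize_with_force(force):
    """Minimizer of phi(p) - force^T p; exact here because the model is linear."""
    return np.linalg.solve(K, A.T @ y / sigma_y**2 + force)

p_hat = minimize_with_force(np.zeros(2))
sigma_a = np.sqrt(np.linalg.inv(K)[0, 0])   # used only to scale the force

results = []
for n_sigma in (-2, -1, 1, 2):
    f = n_sigma / sigma_a
    p = minimize_with_force(np.array([f, 0.0]))   # pull on y(0) = a only
    da, db = p - p_hat
    # (force in units of 1/sigma_a, da/f, db/f, increase in phi)
    results.append((n_sigma, da / f, db / f, phi(p) - phi(p_hat)))
```

In every entry of `results`, `da / f` equals the variance of a and `db / f` is the negative covariance of a and b, while φ rises by half the square of the force measured in units of 1/σ_a.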

Simple spectrum
• Simulate a simple spectrum:
  ► Gaussian peak (ampl = 2, w = 0.2)
  ► quadratic background
  ► add random noise (rmsdev = 0.2)
• Fit involves 6 parameters
  ► nonlinear problem
  ► results:
    $\chi^2_{\min} = 34.32$, $p = 0.852$
    parameters of interest:
      ampl.: $\hat{a} = 1.948$, $\sigma_a = 0.149$
      width: $\hat{w} = 0.1759$, $\sigma_w = 0.0165$
    $\mathbf{R}_{aw} = \begin{pmatrix} 1 & 0.427 \\ 0.427 & 1 \end{pmatrix}$
  ► fair degree of correlation

Simple spectrum – apply force to area
• To probe area under Gaussian peak, apply force appropriate to area:
  $A = \sqrt{2\pi}\, a\, w$
• Force should be proportional to derivatives of area wrt parameters, a = amplitude, w = rms width:
  $\partial A / \partial a = \sqrt{2\pi}\, w$, $\quad \partial A / \partial w = \sqrt{2\pi}\, a$
• Plot shows result of applying force to these two parameters in this proportion
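Why a force in this proportion probes the area can be seen from a short first-order derivation. The gradient vector g is notation introduced here for illustration, and the result assumes the Gaussian approximation holds near the minimum, so that the displacement is the covariance matrix times the force.

```latex
\begin{align}
  % g: gradient of the area with respect to the parameters (notation assumed here)
  \mathbf{g} &= \partial_{\mathbf{a}} A, \qquad \mathbf{f} = f\,\mathbf{g} \\
  % displacement of the minimum under this force
  \delta\mathbf{a} &= \mathbf{C}\,\mathbf{f} = f\,\mathbf{C}\,\mathbf{g} \\
  % first-order change in the area; g^T C g is the propagated variance of A
  \delta A &\approx \mathbf{g}^{\mathsf T}\,\delta\mathbf{a}
            = f\,\mathbf{g}^{\mathsf T}\mathbf{C}\,\mathbf{g} = f\,\sigma_A^2 \\
  % displacement of any single parameter gives its covariance with the area
  \delta a_i / f &= (\mathbf{C}\,\mathbf{g})_i = \operatorname{cov}(a_i, A)
\end{align}
```

Only the components of g for amplitude and width are nonzero, so the force acts on those two parameters while the remaining fit parameters are free to readjust.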

Simple spectrum
• Plots for +/– forces applied to area
• Plot below shows nonlinear response, but approximately linear for small f
  ► slope at 0 is $\sigma_A^2$
  ► $\varphi$ increases by 0.5 for $|f| = \sigma_A^{-1}$
• Other displacements give covariance wrt area
[Figure: fitted spectra for $f = 3.4\,\sigma_A^{-1}$ and $f = -8\,\sigma_A^{-1}$]

Tomographic reconstruction from two views
• Problem: reconstruct uniform-density object from two projections
  ► 2 orthogonal, parallel projections (128 samples in each view)
  ► Gaussian noise added
[Figure: original object; two orthogonal projections with 5% rms noise]

The Bayes Inference Engine
• BIE data-flow diagram to find max. a posteriori (MAP) solution
  ► Optimizer uses gradients that are efficiently calculated by adjoint differentiation, a key capability of the BIE
[Figure: BIE data-flow diagram, with blocks labeled "Boundary description" and "Input projections" and outputs $-\log(\text{likelihood}) = \tfrac{1}{2}\chi^2$ and $-\log(\text{prior})$]

MAP reconstruction – two views
• Model object in terms of:
  ► deformable polygonal boundary with 50 vertices
  ► smoothness constraint
  ► constant interior density
• Determine boundary that maximizes posterior probability
• Not perfect, but very good for only two projections
• Question is: How do we quantify uncertainty in reconstruction?
[Figure: reconstructed boundary (gray-scale) compared with original object (red line)]

Tomographic reconstruction from two views
• Stiffness of model proportional to curvature of $\varphi$
• Displacement obtained by applying a force to MAP model and re-minimizing is proportional to a row of the covariance matrix
• Displacement divided by force
  ► at position of force is proportional to variance there
  ► elsewhere, proportional to covariance
[Figure: applying force (white bar) to MAP boundary (red) moves it to new location (yellow-dashed)]
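The row-of-the-covariance statement can be illustrated with a small sketch for a many-parameter problem. It is not the BIE: the curvature matrix here is a random symmetric positive-definite stand-in, the potential is exactly quadratic, and the conjugate-gradient re-minimizer that uses only gradient evaluations is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)           # arbitrary seed
n = 50                                   # number of parameters (hypothetical model)
B = rng.normal(size=(n, n))
K = B @ B.T + n * np.eye(n)              # hypothetical SPD curvature matrix

def grad_phi(delta):
    """Gradient of the unperturbed quadratic potential at displacement delta."""
    return K @ delta

def reminimize(force, iters=200, tol=1e-12):
    """Conjugate-gradient minimization of phi(delta) - force^T delta,
    using only gradient evaluations of phi."""
    delta = np.zeros(n)
    r = force - grad_phi(delta)          # negative gradient of perturbed potential
    d = r.copy()
    rr = r @ r
    for _ in range(iters):
        if np.sqrt(rr) < tol:
            break
        Kd = grad_phi(d)                 # valid because the gradient is linear here
        alpha = rr / (d @ Kd)
        delta += alpha * d
        r -= alpha * Kd
        rr_new = r @ r
        d = r + (rr_new / rr) * d
        rr = rr_new
    return delta

j = 10                                   # index of the parameter being probed
f = 0.01                                 # magnitude of the applied force
force = np.zeros(n)
force[j] = f
row_j = reminimize(force) / f            # estimate of row j of the covariance matrix
```

Element j of `row_j` is the variance of the probed parameter, and the other elements are its covariances with the remaining parameters. The full covariance matrix is never formed or inverted.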

Situations where probing covariance is useful
• Technique will be most useful when
  ► posterior can be well approximated by Gaussian pdf in parameters
  ► interest is in uncertainty of one or a few quantities, but there are many parameters
  ► optimization easy to do
  ► gradient calculation can be done efficiently, e.g. by adjoint differentiation of the forward simulation code
  ► self-optimizing natural systems (populations, bacteria, traffic)
• May be useful in contexts other than probabilistic inference where Gaussian pdfs are used

Summary
• Technique has been presented
  ► based on interpreting minus-log-posterior as physical potential
  ► probe covariance matrix by applying force to estimated model
  ► stochastic calculation replaced by deterministic one
  ► may be related to fluctuation-dissipation relation from statistical mechanics

Bibliography
► "The hard truth," K. M. Hanson and G. S. Cunningham, Maximum Entropy and Bayesian Methods, J. Skilling and S. Sibisi, eds., pp. 157-164 (Kluwer Academic, Dordrecht, 1996)
► "Uncertainty assessment for reconstructions based on deformable models," K. M. Hanson et al., Int. J. Imaging Syst. Technol. 8, pp. 506-512 (1997)
► "Operation of the Bayes Inference Engine," K. M. Hanson and G. S. Cunningham, Maximum Entropy and Bayesian Methods, W. von der Linden et al., eds., pp. 309-318 (Kluwer Academic, Dordrecht, 1999)
► "Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation," A. Griewank (SIAM, 2000)