Variational Bayesian Image Processing on Stochastic Factor Graphs
Xin Li
Lane Dept. of CSEE
West Virginia University
Outline
Statistical modeling of natural images
From old-fashioned local models to newly proposed nonlocal models
Factor graph based image modeling
A powerful framework unifying local and nonlocal approaches
EM-based inference on stochastic factor graphs
Applications and experimental results
Denoising, inpainting, interpolation, post-processing, inverse halftoning, deblurring ...
Cast Signal/Image Processing Under a Bayesian Framework
Image restoration (Besag et al.’1991)
Image denoising (Simoncelli&Adelson’1996)
Interpolation (Mackay’1992) and super-resolution (Schultz&Stevenson’1996)
Inverse halftoning (Wong’1995)
Image segmentation (Bouman&Shapiro’1994)
$$p(x \mid y, H) = \frac{p(y \mid x, H)\, p(x \mid H)}{p(y \mid H)}$$
x: unobservable data
y: observation data
p(x|H): image prior (the focus of this talk)
p(y|x,H): likelihood (varies from application to application)
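As a sketch of how the prior and the likelihood are typically combined, the maximum a posteriori (MAP) estimate follows from Bayes' rule by dropping the evidence p(y|H), which does not depend on x. The additive white Gaussian noise model in the last line is an assumption made only for illustration; the likelihood actually used varies from application to application.

```latex
\begin{align}
\hat{x}_{\mathrm{MAP}}
  &= \arg\max_{x}\; p(x \mid y, H) \\
  &= \arg\max_{x}\; p(y \mid x, H)\, p(x \mid H) \\
  &= \arg\min_{x}\; \bigl[ -\log p(y \mid x, H) - \log p(x \mid H) \bigr] \\
% Assumed for illustration: y = x + w with w ~ N(0, \sigma_w^2 I)
  &= \arg\min_{x}\; \frac{1}{2\sigma_w^2}\, \lVert y - x \rVert_2^2 - \log p(x \mid H)
\end{align}
```

Under that assumed noise model, the first term is a data-fidelity penalty and the second term is where the image prior enters.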
Statistical Modeling of Natural Images: the Pursuit of a Good Prior
Local models
Markov Random Field (MRF) and its extensions (e.g., 2D Kalman-filtering, Field-of-Expert)
Sparsity-based: DCT, wavelets, steerable pyramids, geometric wavelets (edgelets, curvelets, ridgelets, bandelets)
Nonlocal models
Bilateral filtering (Tomasi et al. ICCV’1998)
Texture synthesis (Efros&Leung ICCV’1999)
Exemplar-based inpainting (Criminisi et al. TIP’2004)
Nonlocal mean denoising (Buades et al. CVPR’2005)
Total Least-Square denoising (Hirakawa&Parks TIP’2006)
Block-matching 3D denoising (Dabov et al. TIP’2007)
Introducing a New Language of Factor Graphs
Why Factor Graphs?
The most general form of graphical probability models (both MRF and Bayesian networks can be converted to FGs)
Widely used in computer science and engineering (forward-backward algorithm, Viterbi algorithm, turbo decoding algorithm, Pearl’s belief propagation algorithm, Kalman filter)¹
What is a Factor Graph?
A bipartite graph that expresses which variables are arguments of which local functions
Factor/function nodes (solid squares) vs. variable nodes (empty circles)
[Figure: factor graph with variable nodes B1 to B8 and factor nodes f1 to f4. The mapping L: F → V assigns f1 → {1,2,4}, f2 → {3,6}, f3 → {5,7}, f4 → {7,8}.]
¹ Kschischang, F.R.; Frey, B.J.; Loeliger, H.-A., "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498-519, Feb. 2001
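A minimal sketch of the bipartite structure described on this slide, written in Python. The mapping L: F → V is read off the example figure as best it can be recovered (f1 → {1,2,4}, f2 → {3,6}, f3 → {5,7}, f4 → {7,8}); the function names `variable_neighbors` and `edges` are hypothetical helpers introduced here for illustration.

```python
from collections import defaultdict

# Factor-to-variable mapping L: F -> V, as read from the example figure.
# Each factor node lists the indices of the variable nodes (patches) it takes
# as arguments.
L = {
    "f1": (1, 2, 4),
    "f2": (3, 6),
    "f3": (5, 7),
    "f4": (7, 8),
}


def variable_neighbors(mapping):
    """Invert L: for each variable node, list the factors it is an argument of."""
    neighbors = defaultdict(list)
    for factor, variables in mapping.items():
        for v in variables:
            neighbors[v].append(factor)
    return dict(neighbors)


def edges(mapping):
    """Edges of the bipartite graph: one (factor, variable) pair per argument."""
    return [(f, v) for f, vs in mapping.items() for v in vs]


neighbors = variable_neighbors(L)
print(neighbors[7])   # factors that share variable node B7
print(len(edges(L)))  # total number of edges in the bipartite graph
```

Storing only the factor-to-variable map keeps the graph structure itself as a single object, which matches the later slides where L is treated as a latent variable to be estimated.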
Variable Nodes = Image Patches
Neuroscience: receptive fields of neighboring cells in the human vision system overlap heavily
Engineering: the patch has gone under many different names, such as windows in digital filters, blocks in JPEG, and the support of wavelet bases
Cited from D. Hubel, “Eye, Brain and Vision”, 1988
Factorization: the Art of Statistical Image Modeling
Wavelet-based statistical models (geometric proximity defines the neighborhood)
Locally linear embedding¹ (perceptual similarity defines the neighborhood)
SP ML
Domain-Markovian
Range-Markovian
¹ S.T. Roweis and L.K. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding”, Science 290 (5500), 2323, 22 December 2000.
Unification Using Factor Graphs
[Figure: three factor-graph panels over image patches (B1 to B4 with factor nodes f1 to f4; B0 to B3): naive Bayesian (DCT/wavelet-based models); MRF-based; kNN/kmeans clustering (nonlocal image models)]
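The nonlocal panel groups a reference patch with its most similar patches anywhere in the image. The sketch below shows one way such a kNN grouping could look in Python with NumPy. The patch size, the sampling step, the squared Euclidean distance, and the helper names `extract_patches` and `knn_group` are all assumptions made for illustration, not details taken from the slides.

```python
import numpy as np


def extract_patches(image, size, step=1):
    """Collect overlapping size x size patches and their top-left coordinates."""
    h, w = image.shape
    coords = [(r, c)
              for r in range(0, h - size + 1, step)
              for c in range(0, w - size + 1, step)]
    patches = np.stack([image[r:r + size, c:c + size] for r, c in coords])
    return patches, coords


def knn_group(patches, ref_index, k):
    """Indices of the reference patch followed by its k nearest patches."""
    flat = patches.reshape(len(patches), -1).astype(np.float64)
    dist = np.sum((flat - flat[ref_index]) ** 2, axis=1)
    dist[ref_index] = -1.0  # force the reference patch B0 to come first
    order = np.argsort(dist, kind="stable")
    return order[:k + 1]


# Toy image with period 2, so many patches are exact repeats of each other.
image = np.tile(np.array([[0.0, 1.0], [2.0, 3.0]]), (4, 4))
patches, coords = extract_patches(image, size=2)
group = knn_group(patches, ref_index=0, k=3)
D = patches[group]  # stacked group [B0 B1 ... Bk], shape (k + 1, 2, 2)
```

Here the neighborhood is defined by patch similarity rather than by spatial position, which is the distinction the slide draws between nonlocal and MRF-style models.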
A Manifold Interpretation of Nonlocal Image Prior
[Figure: patches B0, B1, ..., Bk on a manifold M in R^N, grouped as D = [B0 B1 ...]; their primed counterparts B'0, B'1, ..., B'k are grouped as D' = [B'0 B'1 ...]]
How to maximize the sparsity of a representation?
Conventional wisdom: adapt basis to signal (e.g., basis pursuit, matching pursuit)
New proposal: adapt signal to basis (by probing its underlying organization principle)
Organizing Principle: Latent Variable L
[Figure: the image x is observed as y through the likelihood P(y|x) (image denoising, image inpainting, image coding, image halftoning, image deblurring); the latent variable L connects patches B11 to B44 to factor nodes fA, fB, fC]
$$p(x) = \prod_{f_j \in F} f_j(D_j) = \prod_{f_j \in F} f_j(T^{-1} S_j)$$
$$D_j = [B_{i(0)}\; B_{i(1)}\; \cdots\; B_{i(k)}]$$
T: sparsifying transform
“Nature is not economical of structures but organizing principles.” - Stanislaw M. Ulam
Maximum-Likelihood Estimation of Graph Structure L
[Flow diagram, looped over every factor node fj: patches B0, B1, ..., Bk → pack into 3D array D → forward transform → coring → inverse transform → unpack into 2D patches B̂0, B̂1, ..., B̂k; update the estimate of L; update the estimate of x; P(y|x)]
For a variational interpretation of this EM-based inference on FGs, see the paper.
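The pack, transform, coring, and inverse-transform steps for a single factor node can be sketched as follows in Python with NumPy. This is an illustration under stated assumptions, not the method from the slides: the transform is assumed to be a separable orthonormal DCT, coring is assumed to be hard thresholding, and the names `dct_matrix` and `core_group` are hypothetical. The updates of L and x are not covered.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (assumed transform)."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat


def core_group(group, threshold):
    """Forward 3D transform, coring, and inverse transform of one patch group.

    group: array of shape (k + 1, p, q) holding the stacked patches B0..Bk.
    """
    t0, t1, t2 = (dct_matrix(n) for n in group.shape)
    # Separable forward transform along all three axes of the 3D array.
    coeffs = np.einsum("ai,bj,ck,ijk->abc", t0, t1, t2, group)
    # Coring, assumed here to be hard thresholding of small coefficients.
    coeffs = np.where(np.abs(coeffs) > threshold, coeffs, 0.0)
    # Inverse transform: the transpose inverts an orthonormal matrix.
    return np.einsum("ai,bj,ck,abc->ijk", t0, t1, t2, coeffs)


# Toy group: eight identical ramp patches plus Gaussian noise.
rng = np.random.default_rng(0)
clean = np.tile(np.outer(np.arange(4.0), np.ones(4)), (8, 1, 1))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = core_group(noisy, threshold=0.3)
```

Because the stacked patches are similar, the clean signal concentrates in a few 3D coefficients while the noise spreads over all of them, so thresholding removes mostly noise. Each slice of `denoised` would then be unpacked back to its patch location.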
Problem 1: Image Denoising
[Table: PSNR (dB) performance comparison among different schemes for 12 test images at σw = 100]
[Table: SSIM performance comparison among different schemes for 12 test images at σw = 100]
[Figure: BM3D (kNN, iter=2) vs. SFG (kmeans, iter=20); σw; labels: org., 200, 400, 600, 800, 1000]
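The comparisons are reported as PSNR in dB. For reference, the standard definition can be sketched in Python with NumPy as below. The peak value of 255 assumes 8-bit images; the peak used in these experiments is not stated on the slides, and the function name `psnr` is introduced here for illustration.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


# A constant error of 25.5 gray levels gives peak^2 / MSE = 100.
reference = np.zeros((8, 8))
print(psnr(reference, reference + 25.5))
```

A gain of 1 dB, as in some of the post-processing results, corresponds to roughly a 21% reduction in mean squared error.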
Problem 2: Image Recovery
[Figure: test images, top-down: test1, test3, test5; top-down: test2, test4, test6]
Methods compared: DCT, FoE, EXP, BM3D, LSP, SFG
[Table: PSNR (dB) performance comparison]
[Table: SSIM performance comparison]
Local models: DCT, FoE and LSP
Nonlocal models: EXP, BM3D¹ and SFG
¹ Our own extension into image recovery
x y
Problem 3: Resolution Enhancement
[Figure columns: x, y, bicubic, NEDI¹, FG; PSNR per row:]
bicubic    NEDI¹      FG
28.70dB    27.34dB    28.19dB
31.76dB    32.36dB    32.63dB
34.71dB    34.45dB    37.35dB
18.81dB    15.37dB    16.45dB
¹ X. Li and M. Orchard, “New edge directed interpolation”, IEEE TIP, 2001
Problem 4: Irregular Interpolation
[Figure columns: x, y, DT, KR, FG¹; PSNR per row:]
DT         KR         FG¹
29.06dB    31.56dB    34.96dB
28.46dB    31.16dB    36.51dB
17.90dB    18.49dB    29.25dB
26.04dB    24.63dB    29.91dB
DT: Delaunay triangulation-based (griddata in MATLAB)
KR: kernel regression-based (Takeda et al., IEEE TIP 2007, w/o parameter optimization)
¹ X. Li, “Patch-based image interpolation: algorithms and applications,” Inter. Workshop on Local and Non-Local Approximation (LNLA)’2008
25% kept
Problem 5: Post-processing
JPEG-decoded at rate of 0.32bpp (PSNR=32.07dB)
SFG-enhanced at rate of 0.32bpp (PSNR=33.22dB)
SPIHT-decoded at rate of 0.20bpp (PSNR=26.18dB)
SFG-enhanced at rate of 0.20bpp (PSNR=27.33dB)
Maximum-Likelihood (ML) Decoding
Maximum a Posteriori (MAP) Decoding
Problem 6: Inverse Halftoning
Without nonlocal prior¹ (PSNR=31.84dB, SSIM=0.8390)
With nonlocal prior (PSNR=32.82dB, SSIM=0.8515)
¹ Available from the Image Halftoning Toolbox released by UT-Austin researchers
Conclusions and Perspectives
Despite the rich structures in natural images, the underlying organization principle is simple (self-similarity)
We have shown how similarity can lead to sparsity in a nonlinear representation of images
FG represents only one mathematical language for interpreting this principle (multifractal formalism is another)
Image processing (low-level vision) could benefit from data clustering (higher-level vision): how does the human visual cortex learn to decode the latent variable L through unsupervised learning?
Reproducible Research: MATLAB codes accompanying this work are available at http://www.csee.wvu.edu/~xinl/sfg.html (more will be added)