Irrera gold2010

1

INTEGRATION OF SUPPORT VECTOR MACHINES AND MARKOV RANDOM FIELDS FOR REMOTE SENSING IMAGE CLASSIFICATION

Paolo Irrera, Gabriele Moser, Sebastiano B. Serpico

University of Genoa, Dept. of Biophysical and Electronic Eng. (DIBE), Via Opera Pia 11a, I-16145 Genoa, Italy

2

OUTLINE

• Introduction
  – Remote sensing image classification
  – Objective of the paper
  – Support vector machines
  – Markov random fields

• Methodology
  – Markovian proposed method
  – Architecture of the method
  – Parameter optimization

• Experimental results
  – Confusion matrices
  – Classification maps

• Conclusions

3

REMOTE SENSING IMAGE CLASSIFICATION

• Techniques that aim at labeling each image pixel as belonging to a thematic class.

• Examples of applications:
  – land-use or land-cover mapping;
  – urban-area mapping;
  – forest inventory;
  – snow-cover mapping.

• Many approaches have been proposed for supervised classification:
  – parametric and nonparametric Bayesian;
  – neural;
  – fuzzy;
  – support vector machines (SVMs);
  – …

4

OBJECTIVE OF THE PAPER

• Key idea of SVMs:
  – identifying an optimal linear discriminant hypersurface in a suitable nonlinearly transformed feature space.

• Good analytical properties (generalization capability) and excellent performance in many applications (e.g., object recognition, hyperspectral image classification).

• Limitation:
  – SVMs are formulated for i.i.d. (independent and identically distributed) samples;
  – in image classification, this implies an intrinsically noncontextual approach.

• Objective of the paper:
  – integration of the SVM and Markov random field (MRF) approaches to classification, aiming at a rigorous contextual generalization of SVMs.

5

SVM CLASSIFIER

• It exploits the information associated with the samples located at the interface between distinct classes (support vectors).

• Training is expressed as a quadratic programming problem.

• The nonlinear transformation of the feature space is implicitly defined by a kernel function K(x, y), which allows a nonlinear problem to be formalized as a linear one without a significant increase in computational complexity.

• Here, we use a Gaussian kernel.

Quadratic programming problem

Discriminant function, nonlinear case, two classes
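
For reference, the standard textbook soft-margin form of these two expressions is sketched below. The notation (training samples x_i with labels y_i in {-1, +1}, Lagrange multipliers α_i, bias b, support-vector index set S, kernel variance σ²) is an assumption and may differ from the authors' own.

```latex
\max_{\alpha}\; \sum_{i=1}^{N} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j y_i y_j K(x_i, x_j)
\quad \text{subject to} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{N} \alpha_i y_i = 0

f(x) = \sum_{i \in S} \alpha_i y_i K(x_i, x) + b,
\qquad
K(x, y) = \exp\!\left( -\frac{\lVert x - y \rVert^2}{2\sigma^2} \right)
```

A sample x is assigned to the class given by the sign of f(x).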

6

MARKOV RANDOM FIELDS

• MRFs constitute a general family of stochastic models for the contextual information associated with an image, in Bayesian image-analysis problems.

• They allow global stochastic models to be formalized according to the local statistical relationships among neighboring pixels (Hammersley-Clifford theorem).

• When modeling the random field of the thematic class labels as an MRF, the “maximum a posteriori” criterion can be formalized as the minimization of a suitable energy function:
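
One common form of such an energy is sketched below, assuming conditionally independent pixel features and a Potts-type spatial prior. These modeling choices and the symbols (labels y_i, features x_i, neighborhood N(i), smoothing weight β, Kronecker delta δ) are assumptions, not necessarily the model used by the authors.

```latex
U(y \mid x) = \sum_{i} \Big[ -\ln p(x_i \mid y_i)
  \;-\; \beta \sum_{j \in N(i)} \delta(y_i, y_j) \Big]
```

The first term rewards labels that explain the pixel's own features; the second rewards agreement with the labels of neighboring pixels.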

7

INTEGRATING MRF AND SVM

• Here, we prove that, under proper assumptions, the Markovian minimum-energy decision rule can be reformulated as the application of an SVM discriminant function in a transformed feature space, associated with a suitable “contextual kernel”.

• Contextual information is formalized through an additional feature (“stacked vector”).

• A modified kernel function fuses contextual and noncontextual information (the linear combination of two related contributions).

• In this framework, a novel classifier is introduced by using the “iterated conditional mode” approach.

Discriminant function.

Contextual kernel

Kernel-based expression of the discriminant function
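
The fusion of noncontextual and contextual information can be illustrated with a minimal sketch. It is not the authors' exact formulation: it assumes binary labels in {-1, +1}, a scalar contextual feature equal to the sum of the labels in the 4-connected neighborhood of a pixel, and a contextual kernel of the form K(x, x') + λ·ε·ε'. All function names are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma2):
    """Gaussian kernel between two feature vectors, with variance sigma2."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma2))

def neighborhood_feature(label_map, row, col):
    """Assumed contextual feature: sum of the +1/-1 labels of the 4-connected neighbors."""
    n_rows, n_cols = len(label_map), len(label_map[0])
    total = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            total += label_map[r][c]
    return total

def contextual_kernel(x, eps_x, y, eps_y, sigma2, lam):
    """Assumed form: Gaussian kernel on the pixel features plus lam times
    a linear kernel on the contextual features."""
    return gaussian_kernel(x, y, sigma2) + lam * eps_x * eps_y

# Hypothetical 3 x 3 label map, used only to exercise the functions.
label_map = [[1, 1, -1],
             [1, -1, -1],
             [1, 1, -1]]
eps_center = neighborhood_feature(label_map, 1, 1)
eps_corner = neighborhood_feature(label_map, 0, 0)
k = contextual_kernel([0.2, 0.4], eps_center, [0.2, 0.4], eps_corner,
                      sigma2=1.0, lam=0.5)
```

Because a nonnegative linear combination of valid kernels is itself a valid kernel, this assumed form remains usable in a standard SVM solver for any λ ≥ 0.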

8

PROPOSED CLASSIFIER

I = n-channel image to be classified.

T = training map.

m = classification map, updated at each iteration.
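
The iterative update of the map m can be sketched in the spirit of the iterated conditional mode approach. This is a simplified illustration under stated assumptions, not the authors' architecture: the noncontextual discriminant values are taken as given (in the full method they would come from an SVM trained on I and T), any retraining step is omitted, and the contextual term is assumed to be a weight times the sum of the neighboring labels. The function name, the weight, and the toy scores are hypothetical.

```python
def icm_relabel(scores, weight, max_iter=10):
    """Iteratively relabel pixels from noncontextual scores plus a neighborhood term.

    scores: 2-D list of noncontextual discriminant values (nonnegative -> class +1).
    weight: assumed strength of the contextual contribution.
    Returns the final label map and the number of sweeps performed.
    """
    n_rows, n_cols = len(scores), len(scores[0])
    # Initial map m: purely noncontextual decision.
    m = [[1 if s >= 0 else -1 for s in row] for row in scores]
    sweeps = 0
    for sweeps in range(1, max_iter + 1):
        changed = False
        for r in range(n_rows):
            for c in range(n_cols):
                # Contextual feature from the current labels of the 4 neighbors.
                eps = 0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        eps += m[rr][cc]
                new_label = 1 if scores[r][c] + weight * eps >= 0 else -1
                if new_label != m[r][c]:
                    m[r][c] = new_label  # in-place update, as in ICM
                    changed = True
        if not changed:
            break
    return m, sweeps

# Toy scores: the center pixel is weakly negative, its neighbors strongly positive.
scores = [[2.0, 1.5, 1.8],
          [1.7, -0.3, 1.6],
          [1.9, 1.4, 2.1]]
final_map, n_sweeps = icm_relabel(scores, weight=0.5)
```

In this toy case the isolated center pixel is flipped to agree with its neighbors in the first sweep, and the second sweep confirms that no label changes any more.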

9

PARAMETER OPTIMIZATION

• The method has the following parameters:
  – SVM regularization parameter C;
  – variance of the Gaussian kernel;
  – weight parameter λ of the spatial kernel contribution.

• Algorithms used for parameter estimation: Powell, Ho-Kashyap.

• Powell’s algorithm is a local unconstrained minimization method for multidimensional spaces. It does not involve derivatives and is applied here to the cross-validation error (a nondifferentiable function) to optimize C and the variance of the Gaussian kernel.

• For the estimation of λ, a recently proposed approach has been used, based on the Ho-Kashyap algorithm for the optimization of weight parameters in MRF models.
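
The idea of minimizing a nondifferentiable error without derivatives can be illustrated with a compass (coordinate) search. This is a simpler derivative-free scheme than Powell's algorithm and is used here only as a stand-in. The objective below is a hypothetical piecewise-linear function, not a real cross-validation error, and the parameterization in log C and log variance is an assumption.

```python
def compass_search(objective, start, step=1.0, tol=1e-3, max_evals=1000):
    """Derivative-free local minimization by probing each coordinate in turn.

    The step is halved whenever no coordinate move improves the objective.
    """
    point = list(start)
    best = objective(point)
    evals = 1
    while step > tol and evals < max_evals:
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                value = objective(trial)
                evals += 1
                if value < best:
                    point, best = trial, value
                    improved = True
                    break  # move on to the next coordinate
        if not improved:
            step /= 2.0
    return point, best

def stand_in_cv_error(log_params):
    """Hypothetical nondifferentiable surrogate for the cross-validation error."""
    log_c, log_var = log_params
    return abs(log_c - 1.0) + abs(log_var + 2.0)

best_point, best_value = compass_search(stand_in_cv_error, start=[0.0, 0.0])
```

In practice the objective would train an SVM and return its cross-validation error for the given parameters, which makes each evaluation expensive; this is why the number of evaluations is capped.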

10

DATA SETS FOR EXPERIMENTS

• Data set “Pavia”
  – SIR-C/XSAR
  – Rural area (near Pavia)
  – 700 × 280 pixels
  – 4 channels (XSAR channel is shown in the figure)
  – Medium resolution (25 m)
  – Main classes: “dry soil” and “wet soil”.

• Data set “Tanaro”
  – COSMO/SkyMed
  – Flood of the Tanaro River near Alessandria
  – 3155 × 1695 pixels
  – Single channel
  – Very high resolution (1 m)
  – Main classes: “dry soil” and “water or flooded soil”.

• Spatially disjoint training and test fields are available for both data sets.

11

EXPERIMENTAL RESULTS: CONFUSION MATRICES AND ACCURACIES

Pavia. Confusion matrix, noncontextual SVM.
Pavia. Confusion matrix, proposed method.
Tanaro. Confusion matrix, noncontextual SVM.
Tanaro. Confusion matrix, proposed method.
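
Accuracy figures of the kind reported with such confusion matrices can be derived as in the sketch below. It assumes that rows index the true classes and columns the predicted classes; the matrix entries are made-up numbers for illustration, not results from the experiments.

```python
def overall_accuracy(cm):
    """Fraction of all test samples that were correctly classified."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def average_accuracy(cm):
    """Mean of the per-class accuracies (each row normalized by its own total)."""
    per_class = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return sum(per_class) / len(per_class)

# Hypothetical two-class confusion matrix (rows: true class, columns: predicted class).
cm = [[180, 20],
      [30, 70]]
oa = overall_accuracy(cm)
aa = average_accuracy(cm)
```

The two measures differ when the classes have unequal numbers of test samples: the overall accuracy is dominated by the larger class, whereas the average accuracy weights each class equally.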

12

EXPERIMENTAL RESULTS: CLASSIFICATION MAPS

Pavia: map generated by a noncontextual SVM.
Pavia: map generated by the proposed method.
Tanaro: map generated by a noncontextual SVM.
Tanaro: map generated by the proposed method.

13

EXPERIMENTAL RESULTS: CONVERGENCE OF THE METHOD

Tanaro: behavior of the accuracy (overall accuracy, OA; average accuracy, AA; and cross-validation accuracy, XVAL) as a function of the number of iterations of the proposed method.

14

CONCLUSIONS

• A feasible Markovian extension of SVM to contextual classification has been introduced.

• Experiments with real data suggest that the proposed method yields a significant accuracy increase compared with a standard (noncontextual) SVM.

• Very accurate results were obtained on different types of remote-sensing data, including very high resolution COSMO/SkyMed SAR data.

• Possible future extensions:
  – theoretical analysis of convergence properties (even though no experimental evidence of critical convergence issues was collected);
  – testing the method with other types of remote-sensing data (in particular, optical and hyperspectral images) and with more sophisticated MRF models.


Date post:	12-May-2015