
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 52, NO. 3, MARCH 2014 1811

Spectral-Spatial Constraint Hyperspectral Image Classification

Rongrong Ji, Member, IEEE, Yue Gao, Richang Hong, Member, IEEE, Qiong Liu, Member, IEEE, Dacheng Tao, and Xuelong Li, Fellow, IEEE

Abstract— Hyperspectral image classification has attracted extensive research efforts in the recent decade. The main difficulty lies in the few labeled samples versus the high-dimensional features. To this end, it is a fundamental step to explore the relationship among different pixels in hyperspectral image classification, toward jointly handling both the lack of labels and the high dimensionality. In hyperspectral images, the classification task can benefit from the spatial layout information. In this paper, we propose a hyperspectral image classification method that addresses both the pixel spectral and spatial constraints, in which the relationship among pixels is formulated in a hypergraph structure. In the constructed hypergraph, each vertex denotes a pixel in the hyperspectral image, and the hyperedges are constructed from both the distances between pixels in the feature space and the spatial locations of pixels. More specifically, a feature-based hyperedge is generated by using distances among pixels, where each pixel is connected with its K nearest neighbors in the feature space. Second, a spatial-based hyperedge is generated to model the layout among pixels, in which each pixel is linked with its spatial local neighbors. Learning on the combined hypergraph is conducted by jointly investigating the image features and the spatial layout of pixels to seek their joint optimal partitions. Experiments on four data sets are performed to evaluate the effectiveness and efficiency of the proposed method. Comparisons with state-of-the-art methods demonstrate the superiority of the proposed method in hyperspectral image classification.

Index Terms— Hypergraph learning, hyperspectral, image classification, spatial constraint.

Manuscript received July 13, 2012; revised December 4, 2012 and January 22, 2013; accepted February 18, 2013. Date of publication June 10, 2013; date of current version December 17, 2013. This work was supported in part by the National Basic Research Program of China (973 Program) under Grant 2012CB316400, the 985 Project of Xiamen University, the National Natural Science Foundation of China under Grant 61125106, Grant 91120302, and Grant 61072093, and by the Shaanxi Key Innovation Team of Science and Technology under Grant 2012KCT-04. (Corresponding author: Y. Gao.)

R. Ji is with the Department of Cognitive Science, School of Information Science and Technology, Xiamen University, Xiamen 361005, China.

Y. Gao is with the School of Computing, National University of Singapore, Singapore 117417, Singapore (e-mail: [email protected]).

R. Hong is with Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China.

Q. Liu is with the Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan, China.

D. Tao is with the Centre for Quantum Computation and Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology, Sydney 2007, Australia.

X. Li is with the Center for OPTical IMagery Analysis and Learning, State Key Laboratory of Transient Optics and Photonics, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, China.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TGRS.2013.2255297

I. INTRODUCTION

HYPERSPECTRAL image classification is to classify the image pixels within a hyperspectral image into multiple categories, and it has received increasing research interest in a wide variety of applications [1]–[5]. Under such circumstances, the training data for hyperspectral image classification are typically limited because of the expensive image labeling [6]. For example, it is extremely difficult, if not impossible, to label adequate samples for a large hyperspectral image, e.g., one containing hundreds of spectral bands from the visible to the infrared range of the electromagnetic spectrum [7], [8]. Therefore, we face the challenge of few training samples versus high data dimensionality. Another challenge comes from the high spatial correlation among pixels: nearby pixels in hyperspectral images are captured from spatially close areas, very likely with the same labels. Consequently, how to explore the rich spatial correlation among pixels under high data dimensionality is a key challenge in hyperspectral image classification.

To handle the few labeled samples, semisupervised learning methods have been employed in hyperspectral image classification, exploiting both labeled and unlabeled data. Gómez-Chova et al. [9] introduced a graph Laplacian into support vector machine (SVM)-based hyperspectral image classification, in which a graph is constructed to explore the relations among pixels in a globally stable state. Motivated by the graph Laplacian, we propose a semisupervised spatial-constraint hyperspectral image classification method. In our method, rather than a simple graph, the relationship among pixels is formulated by a hypergraph structure. Two types of hyperedges are constructed to model the relations among pixels, which are further combined to generate a joint hypergraph Laplacian. In our model, each vertex in the hypergraph denotes a pixel in the image. First, a feature-based hyperedge is generated by considering the similarity among pixels in the feature space, where each pixel is connected with its K nearest neighbors. Second, a spatial-based hyperedge is generated to take the spatial layout information into consideration, in which each pixel is linked with its spatial neighbors. Both types of hyperedges are combined to generate the joint hypergraph Laplacian, upon which semisupervised learning is conducted for hyperspectral image classification. Fig. 1 shows the flowchart of the proposed method. We conducted hyperspectral image classification experiments on four data sets, i.e., Indian Pine, Indian Pine Sub, Salinas A, and University of Pavia, to evaluate the effectiveness of the proposed method, with quantitative

0196-2892 © 2013 IEEE


Fig. 1. Flowchart of the proposed spectral-spatial constraint hyperspectral image classification method.

comparisons with the state-of-the-art methods to demonstrate the superiority of the proposed method.

The rest of this paper is organized as follows. Section II surveys related work. Section III introduces the motivation of this paper and briefly describes the proposed method. The detailed method is provided in Section IV. Experimental results and comparisons with existing methods are provided in Section V. Finally, we conclude this paper in Section VI.

II. RELATED WORK

Existing works in hyperspectral image classification mainly focus on two aspects: reducing the feature dimensions [10] and handling the few training examples [11], [12]. For feature dimension reduction, previous works investigated several traditional approaches such as independent component analysis [13] and principal component analysis (PCA) [14], [15]. More recently, to better address the high-dimensional feature issue, kernel nonparametric weighted feature extraction [16] was introduced, using a kernel nonparametric method to extract more compact features. Guo et al. [17] proposed selecting a limited number of representative spectral bands for feature dimension reduction. In this method, the correlation between each pair of spectral bands is measured by mutual information, and the representative bands are selected by minimizing the distance between the selected bands and the estimated reference map. A clustering-based spectral band selection [18] was proposed to select the bands with the largest similarity to other bands. Camps-Valls et al. [19] introduced a kernel method for feature selection based on measuring the nonlinear dependence between spectral bands and class labels. In this method, redundant features are removed by minimizing the Hilbert-Schmidt independence criterion p-value. Concerning feature selection in multiclass problems, Bruzzone and Serpico [20] proposed selecting features by minimizing the error probability of the Bayes classifier, in which an upper bound on the error probability of the Bayes classifier was employed. Tuia et al. [21] proposed a hyperspectral image classification method that learns the relevant features simultaneously. In this method, a linear combination of kernels for different feature sets serves as the SVM objective function. The learnt weight for each kernel can be employed to further improve the classification accuracy.

In terms of the classification part, popular techniques such as the K-nearest neighbor (KNN) classifier and the SVM have been widely used [22], [23]. In [24], manifold learning is deployed in combination with the KNN classifier for hyperspectral image classification. In [25], a manifold structure is constructed from the pixels, local manifold learning is conducted on this structure, and a weighted KNN classifier is then employed over the learnt manifold for hyperspectral image classification. Recent research has focused on the classification of hyperspectral images with few training data. In [7], sparse representation was investigated to deal with the few labeled samples. Wen et al. [26] proposed employing a hypergraph structure to calculate the similarity among pixels using the spectral information. Zhong and Wang introduced conditional random fields (CRFs) [27], which model the spatial connections among pixels to improve classification accuracy. To perform CRF inference more efficiently, sparse higher order potentials can be further employed as in [28].

To deal with the few-training-example problem in hyperspectral image classification, Bruzzone et al. [11] proposed a transductive SVM, which explores all samples to deal with sampling insufficiency by weighting unlabeled patterns using a time-dependent criterion. A transductive graph Laplacian [9] is employed in the SVM to explore the relations among pixels in a globally stable state, which is able to characterize the marginal distribution of the class of interest. In [12], Tuia and Camps-Valls proposed employing an SVM with a kernel generated by clustering all the data. This method is able to train a kernelized SVM classifier directly and robustly from the image. Ratle et al. [29] proposed employing semisupervised neural networks for efficient hyperspectral image classification. In this method, a regularizer is added to the loss function used to train the neural networks, and the resulting networks are able to handle large-scale remote sensing classification. Bruzzone et al. [30] proposed a context-sensitive semisupervised SVM to deal with mislabeled training samples. In this method, a contextual term is employed in the cost function of the semisupervised framework.


JI et al.: SPECTRAL-SPATIAL CONSTRAINT 1813

The spatial correlation among pixels has shown its merits in hyperspectral image classification, creating connections among pixels that improve classification performance. Multiple lines of evidence indicate that exploring the spatial context is key to dealing with both the high-dimensional features and the few training examples. The basic observation is that spatially close pixels are generally highly correlated and are therefore likely to have similar labels. Therefore, on one hand, the spatial information can be used to investigate both the labeled and unlabeled samples in the hyperspectral image analysis process. On the other hand, the spatial information can be employed to smooth the hyperspectral image classification results. As discussed in [31], it is important to mix the spectral and the spatial information in a better way. A spatial preprocessing approach is introduced in [32] to remove noise for image smoothing, which enhances the spatial texture information by using locally linear embedding in the feature space. In this method, the image is presmoothed by nonlinear diffusion of partial differential equations and wavelet shrinkage. Gao and Chua [33] proposed employing a hypergraph structure to estimate the relationship among pixels using the pixel spatial correlation. Benediktsson et al. [34] introduced a morphology-based hyperspectral image preprocessing method, in which opening and closing morphological transforms are employed to isolate bright and dark structures, i.e., features brighter or darker than their surroundings in the image. More specifically, a morphological profile is generated using base images selected by PCA. These morphological data can be further used for hyperspectral image classification. However, its limitation lies in that the spectral information of the data is not fully utilized. Fauvel et al. [35] further fused the morphological information and the original hyperspectral data to come up with a more compact feature space for subsequent SVM training.

III. MOTIVATION AND BRIEF INTRODUCTION OF THE PROPOSED METHOD

The following observations hold for hyperspectral image classification.

1) Pixels that are close in the feature space are highly likely to have the same label.

2) Pixels that are spatially nearby are very likely to have the same label. As hyperspectral images represent real land surfaces, each type of surface pixel typically spreads over an area (at least a group of pixels). Therefore, each pixel should have a label similar to those of its spatial neighbor pixels.

Fig. 2(a) and (b) show examples of the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Indiana's Indian Pine image [23], [36], [37] and the corresponding groundtruth image. As shown in Fig. 2(a) and (b), pixel A and pixel B are spatially nearby and belong to the same category; pixel A and pixel C are far apart in the spatial layout and have different labels. Let us further look into the detailed example shown in Fig. 2(c), where the blue-green and red-yellow points denote two types of pixels from different categories. Pixel X is on the boundary between these two groups of pixels. Under these circumstances, pixel X has the same label as part of its


Fig. 2. Examples of spatial information in hyperspectral images. (a) Original image of AVIRIS Indiana's Indian Pine. (b) Groundtruth for (a). (c) Example showing two groups of pixels with different labels.

spatial neighbor pixels, i.e., the blue-green pixels in the blue box, which can be explored through their spatial relationship. On the other hand, for the other spatial neighbor pixels of X, i.e., the red-yellow pixels in the red box, X can be distinguished from them with the help of the distances in the feature space.

Based on the above discussion, the relationships among pixels in both the feature space and the spatial layout play an important role in hyperspectral image classification. To build the connections among pixels, most existing methods are based on pair-wise pixel connections. In the graph-based hyperspectral image analysis method [25], each edge in the constructed graph links two pixels. The connections among pixels are more complex than pair-wise pixel relations, so taking the pair-wise pixel connections alone is not adequate to formulate the relationship among pixels. On the other hand, the spatial information is typically leveraged as a pre- or post-smoothing over the pixel-wise classification results [32].

Exploring the complex relationships among pixels to jointly investigate the feature-based and spatial-based information is of fundamental importance in hyperspectral image classification. To address this problem, we propose a unified spatial-constraint hypergraph analysis method under a semisupervised learning scheme.

The hypergraph has been widely investigated in multimedia information retrieval [38], [39] because of its capability of capturing high-order relationships among samples. Bu et al. [40] proposed a hypergraph learning approach for music recommendation, where the multitype objects and relations in music-oriented social networks are modeled in a hypergraph structure. The learnt hypergraph is used to measure the relationships among music tracks for music recommendation. Xia et al. [41] further extended this method to learn a large-scale class-specific hypergraph (CSHG) model for 3-D object recognition. A multiple-hypergraph learning method [42] was proposed for view-based 3-D model retrieval and recognition. Huang et al. [43] proposed a transductive learning framework on the hypergraph structure for image retrieval, in which each vertex denotes an image in the image corpus, and semisupervised learning on the hypergraph is conducted to estimate the relevance scores among images. A joint textual-visual social image reranking method is introduced in [44], in which a social image hypergraph is constructed using both the textual and the visual information of images; learning on the hypergraph is conducted to estimate the relevance of social images for reranking. A CSHG [45] is proposed to integrate local SIFT and global geometric constraints for object recognition. In this method, a


Fig. 3. Proposed hyperspectral image classification method with spatial constraint analysis.

specific category of objects with multiple appearance instances is formulated in a hypergraph structure. The vertices of the hypergraph denote the images that belong to the objects, and the selected SIFT points are employed as the vertex features.

Given the effectiveness of the hypergraph structure in modeling high-order relationships, in this paper we employ it to model the relationship among different pixels in the hyperspectral image. Correspondingly, learning on the constructed hypergraph is conducted to estimate the relevance scores among these pixels. Fig. 1 shows a schematic illustration of the proposed method.

The proposed method benefits from the following two aspects.

1) Each constructed hyperedge connects a group of pixels, which expands the connection capability compared with one-to-one pair-wise pixel connections. In addition, the generated hypergraph provides a robust structure for modeling the relationship among pixels.

2) There are two types of hyperedges: feature-based and spatial-based. The feature-based hyperedges explore the relationships among pixels in the feature space, and the spatial-based hyperedges analyze the relationships among pixels through their spatial neighborhoods. In this framework, both the hyperspectral image features and the spatial information can be considered simultaneously in our subsequent learning scheme toward an optimal hyperspectral image classification.

IV. SPECTRAL-SPATIAL CONSTRAINT HYPERSPECTRAL IMAGE CLASSIFICATION

Here, we introduce the proposed spatial-constraint hyperspectral image classification method shown in Fig. 1. We first present the hyperspectral hypergraph construction process, where the generation of the two types of hyperedges is detailed. Then, the learning on the hypergraph structure for hyperspectral image classification is given. The whole algorithm is shown in Fig. 3.

A. Spatial-Constraint Hypergraph Construction

To construct the spatial-constraint hypergraph, each pixel in the hyperspectral image X = {x1, x2, . . . , xn} is regarded as a vertex in the constructed hypergraph G = {V, E, W}, where

Fig. 4. Illustration of feature-based hyperedge construction.

G is the constructed hypergraph, V is the set of vertices of the hypergraph, E is the set of hyperedges of the hypergraph, and W is the diagonal matrix of the hyperedge weights. Two types of hyperedges are introduced here: the feature-based hyperedges EFea and the spatial-based hyperedges ESpa.

1) Feature-Based Hyperedge: For a feature-based hyperedge, pixels with small distances in the feature space are connected. The assumption is that pixels which are close in the feature space should share similar labels. The hyperedge construction method in [43] is employed: each pixel in the hyperspectral image is considered as a centroid vertex, and a hyperedge is generated from its K nearest neighbors, so each hyperedge contains K + 1 vertices in total. Fig. 4 provides an example of feature-based hyperedges: the blue-green pixel is the centroid, and its five nearest neighbors in the feature space (denoted by yellow pixels) are selected to construct the hyperedge, which thus connects six pixels.

Each hyperedge $e_i \in E_{Fea}$ carries a weight $w_{Fea}(e_i)$, which is estimated from the similarities among the pixels inside the hyperedge in the corresponding band cluster

$$w_{Fea}(e_i) = \exp\left(-\frac{d^2(v_i)}{2\sigma^2}\right) \qquad (1)$$

where $d(v_i)$ is the mean distance between all vertices in $e_i$, and $\sigma$ is the mean distance among all pixels, calculated as $\sigma = \frac{1}{n^2}\sum_i \sum_j d(x_i, x_j)$. By (1), a hyperedge whose pixels have small pair-wise distances is given a high weight.

For a feature-based hyperedge $e_i \in E_{Fea}$, the entry of the incidence matrix $H_{Fea}$ of the hypergraph $G$ is generated as

$$H_{Fea}(v, e_i) = \begin{cases} 1, & \text{if } v \in e_i \\ 0, & \text{if } v \notin e_i. \end{cases} \qquad (2)$$

Because of the hyperedge construction criterion (minimal distance in the feature space), the pixels connected by one hyperedge $e_i \in E_{Fea}$ are highly similar. This definition guarantees that these pixels contribute to $e_i$ with the same weight.
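As a concrete illustration, the feature-based hyperedge construction above (KNN grouping, weight (1), and binary incidence (2)) can be sketched in numpy. The function name and the brute-force distance computation are illustrative choices, not the authors' implementation:

```python
import numpy as np

def feature_hyperedges(X, K):
    """Build feature-based hyperedges: one hyperedge per pixel,
    linking it with its K nearest neighbors in the feature space.
    X is an (n, d) array of pixel spectra. Returns the binary
    incidence matrix H (Eq. (2)) and hyperedge weights w (Eq. (1))."""
    n = X.shape[0]
    # pairwise Euclidean distances between all pixels
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sigma = D.sum() / (n * n)          # mean distance among all pixels
    H = np.zeros((n, n))               # rows = vertices, cols = hyperedges
    w = np.zeros(n)
    for i in range(n):
        nn = np.argsort(D[i])[:K + 1]  # pixel i itself plus its K NNs
        H[nn, i] = 1.0                 # binary incidence, Eq. (2)
        # mean distance over ordered vertex pairs inside the hyperedge
        d_mean = D[np.ix_(nn, nn)].sum() / (len(nn) * (len(nn) - 1))
        w[i] = np.exp(-d_mean ** 2 / (2 * sigma ** 2))
    return H, w
```

For the small pixel counts of a sketch, brute-force distances are fine; a KD-tree or approximate nearest-neighbor index would replace `argsort` at real image sizes.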

2) Spatial-Based Hyperedge: The spatial-based hyperedge is constructed from each pixel and its spatial neighbors. In this procedure, each pixel in the hyperspectral image is taken as a centroid vertex, and a hyperedge is generated from its spatial neighbors. Fig. 5 shows four types of spatial neighborhoods for the centroid pixel, with 4, 8, 12, and 24 spatial neighbors, respectively. In Fig. 5, the blue-green pixel is the centroid, and it is connected with its spatial nearest neighbors (denoted by yellow pixels) in the constructed spatial-based hyperedge.



Fig. 5. Illustration of the spatial-based hyperedge construction procedure.

Assuming the selected number of spatial neighbors is L, each hyperedge contains L + 1 vertices in total. Each hyperedge $e_i \in E_{Spa}$ is given the weight $w_{Spa}(e_i) = 1$, which reflects that each hyperedge is constructed for one individual pixel and has a fixed influence on the constructed hypergraph structure.

Note that connecting pixels based on the spatial constraint does not guarantee that these pixels are close in the feature space. Therefore, these pixels may have different connection strengths to the corresponding hyperedge; i.e., for a spatial-based hyperedge $e_i \in E_{Spa}$, the entry of the incidence matrix $H_{Spa}$ of the hypergraph $G$ is generated as

$$H_{Spa}(v, e_i) = \begin{cases} 1, & \text{if } v = v_c \\ \exp\left(-\dfrac{d^2(v, v_c)}{2\sigma^2}\right), & \text{otherwise} \end{cases} \qquad (3)$$

where $v_c$ is the centroid pixel, $d(v, v_c)$ is the distance between pixel $v$ and $v_c$, and $\sigma$ is the mean distance among all pixels.
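A minimal sketch of the spatial-based incidence matrix of (3), assuming pixels are stored in row-major order on an image grid and using the 4-neighborhood (the paper also uses 8, 12, and 24 neighbors); the function and parameter names are illustrative:

```python
import numpy as np

def spatial_hyperedges(X, rows, cols, sigma):
    """Build spatial-based hyperedges on a rows x cols pixel grid.
    X is (rows*cols, d) in row-major order. Each pixel's hyperedge
    links it to its 4-connected neighbors; the centroid entry is 1
    and each neighbor entry is exp(-d^2/(2 sigma^2)), per Eq. (3).
    sigma is taken as a parameter (the paper uses the mean
    pairwise distance among all pixels)."""
    n = rows * cols
    H = np.zeros((n, n))
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # 4-neighborhood
    for r in range(rows):
        for c in range(cols):
            vc = r * cols + c
            H[vc, vc] = 1.0                        # centroid entry
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    v = rr * cols + cc
                    d = np.linalg.norm(X[v] - X[vc])   # feature-space distance
                    H[v, vc] = np.exp(-d ** 2 / (2 * sigma ** 2))
    w = np.ones(n)   # every spatial hyperedge has weight 1
    return H, w
```

Boundary pixels simply get smaller hyperedges, since out-of-grid neighbors are skipped.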

With the generated incidence matrices $H_{Fea}$ and $H_{Spa}$, the vertex degree matrices and edge degree matrices can be constructed.

The feature-based and spatial-based degrees of a vertex $v \in V$ are calculated as

$$d_{Fea}(v) = \sum_{e \in E_{Fea}} w_{Fea}(e)\, H_{Fea}(v, e) \qquad (4)$$

$$d_{Spa}(v) = \sum_{e \in E_{Spa}} w_{Spa}(e)\, H_{Spa}(v, e). \qquad (5)$$

The degree of a hyperedge $e \in E_{Fea}$ is measured by

$$d_{Fea}(e) = \sum_{v \in V} H_{Fea}(v, e) \qquad (6)$$

and the degree of a hyperedge $e \in E_{Spa}$ by

$$d_{Spa}(e) = \sum_{v \in V} H_{Spa}(v, e). \qquad (7)$$
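When each hyperedge set is stored as an incidence matrix, the degree computations (4)-(7) reduce to matrix-vector products and column sums; a small sketch (the helper name is illustrative):

```python
import numpy as np

def hypergraph_degrees(H, w):
    """Vertex and hyperedge degrees from an incidence matrix H
    (n vertices x m hyperedges) and hyperedge weights w:
    d(v) = sum_e w(e) H(v, e)  and  d(e) = sum_v H(v, e),
    matching Eqs. (4)-(7) for either hyperedge type."""
    d_v = H @ w          # weighted vertex degrees
    d_e = H.sum(axis=0)  # hyperedge degrees
    return d_v, d_e
```

The same helper serves both the feature-based pair (H_Fea, w_Fea) and the spatial-based pair (H_Spa, w_Spa).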

Let $D_{vFea}$, $D_{vSpa}$, $D_{eFea}$, and $D_{eSpa}$ denote the diagonal matrices of the feature-based and spatial-based vertex degrees and hyperedge degrees, respectively, and let $W_{Fea}$ and $W_{Spa}$ denote the diagonal matrices of the hyperedge weights for the feature-based and spatial-based hyperedges, respectively. They are defined as

$$D_{vFea} = \mathrm{diag}\big(d_{Fea}(v_1), \ldots, d_{Fea}(v_n)\big) \qquad (8)$$

$$D_{vSpa} = \mathrm{diag}\big(d_{Spa}(v_1), \ldots, d_{Spa}(v_n)\big) \qquad (9)$$

$$D_{eFea} = \mathrm{diag}\big(d_{Fea}(e_1), \ldots, d_{Fea}(e_n)\big) \qquad (10)$$

$$D_{eSpa} = \mathrm{diag}\big(d_{Spa}(e_1), \ldots, d_{Spa}(e_n)\big) \qquad (11)$$

$$W_{Fea} = \mathrm{diag}\big(w_{Fea}(e_1), \ldots, w_{Fea}(e_n)\big) \qquad (12)$$

$$W_{Spa} = \mathrm{diag}\big(w_{Spa}(e_1), \ldots, w_{Spa}(e_n)\big). \qquad (13)$$

The constructed hypergraph is composed of two types of hyperedges: the distance-based hyperedges and the spatial-based hyperedges. To combine these two types of hyperedges, a pair of weights κFea and κSpa is introduced, where κFea + κSpa = 1 and κFea, κSpa > 0.
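The diagonal matrices (8)-(13) can then be assembled directly from the degrees and weights; a minimal sketch with toy values (illustrative names, not the authors' code):

```python
import numpy as np

# Assemble the diagonal matrices (8)-(13) from a small illustrative
# incidence matrix H and hyperedge weight vector w.
H = np.array([[1.0, 0.5],
              [1.0, 0.0],
              [0.0, 1.0]])
w = np.array([2.0, 1.0])

Dv = np.diag(H @ w)     # vertex-degree diagonal, as in (8)/(9)
De = np.diag(H.sum(0))  # edge-degree diagonal, as in (10)/(11)
W = np.diag(w)          # hyperedge-weight diagonal, as in (12)/(13)

# the two hyperedge types are later combined with weights that sum to one
kappa_fea, kappa_spa = 0.6, 0.4
```

In practice one pair (Dv, De, W) is built for the feature-based hyperedges and one for the spatial-based hyperedges.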

B. Learning on the Constructed Hypergraph

With the constructed hypergraph, we carry out semisupervised learning for classification, which follows the regularization framework proposed in [38] as follows:

$$\arg\min_{F} \left\{ \Omega(F) + \xi R_{\mathrm{emp}}(F) \right\}. \tag{14}$$

Let C be the number of pixel categories, and let F = [f1, f2, . . . , fC] be the to-be-learned confidence score matrix for hyperspectral image classification, where fk(s) is the confidence score for categorizing the sth pixel into the kth class. Remp is the empirical loss defined by the following:

$$R_{\mathrm{emp}} = \sum_{k=1}^{C} \| f_k - y_k \|^2 \tag{15}$$

where yk is an n × 1 labeled training vector for the kth class, and Y = [y1, y2, . . . , yC]. ξ > 0 is a tradeoff parameter, and Ω(F) is a regularizer on the hypergraph.


$$\begin{aligned}
\Omega(F) &= \frac{1}{2}\sum_{k=1}^{C}\left\{\kappa_{\mathrm{Fea}}\sum_{e\in E_{\mathrm{Fea}}}\sum_{u,v\in V}\frac{w_{\mathrm{Fea}}(e)H_{\mathrm{Fea}}(u,e)H_{\mathrm{Fea}}(v,e)}{d_{\mathrm{Fea}}(e)}\left(\frac{F_{u,k}}{\sqrt{d_{\mathrm{Fea}}(u)}}-\frac{F_{v,k}}{\sqrt{d_{\mathrm{Fea}}(v)}}\right)^{2}\right.\\
&\qquad\left.+\,\kappa_{\mathrm{Spa}}\sum_{e\in E_{\mathrm{Spa}}}\sum_{u,v\in V}\frac{w_{\mathrm{Spa}}(e)H_{\mathrm{Spa}}(u,e)H_{\mathrm{Spa}}(v,e)}{d_{\mathrm{Spa}}(e)}\left(\frac{F_{u,k}}{\sqrt{d_{\mathrm{Spa}}(u)}}-\frac{F_{v,k}}{\sqrt{d_{\mathrm{Spa}}(v)}}\right)^{2}\right\}\\
&= \sum_{k=1}^{C}\left\{\kappa_{\mathrm{Fea}}\sum_{e\in E_{\mathrm{Fea}}}\sum_{u,v\in V}\frac{w_{\mathrm{Fea}}(e)H_{\mathrm{Fea}}(u,e)H_{\mathrm{Fea}}(v,e)}{d_{\mathrm{Fea}}(e)}\left(\frac{F_{u,k}^{2}}{d_{\mathrm{Fea}}(u)}-\frac{F_{u,k}F_{v,k}}{\sqrt{d_{\mathrm{Fea}}(u)\,d_{\mathrm{Fea}}(v)}}\right)\right.\\
&\qquad\left.+\,\kappa_{\mathrm{Spa}}\sum_{e\in E_{\mathrm{Spa}}}\sum_{u,v\in V}\frac{w_{\mathrm{Spa}}(e)H_{\mathrm{Spa}}(u,e)H_{\mathrm{Spa}}(v,e)}{d_{\mathrm{Spa}}(e)}\left(\frac{F_{u,k}^{2}}{d_{\mathrm{Spa}}(u)}-\frac{F_{u,k}F_{v,k}}{\sqrt{d_{\mathrm{Spa}}(u)\,d_{\mathrm{Spa}}(v)}}\right)\right\}\\
&= \sum_{k=1}^{C}\left\{\kappa_{\mathrm{Fea}}\left[\sum_{u\in V}F_{u,k}^{2}\sum_{e\in E_{\mathrm{Fea}}}\frac{w_{\mathrm{Fea}}(e)H_{\mathrm{Fea}}(u,e)}{d_{\mathrm{Fea}}(u)}\sum_{v\in V}\frac{H_{\mathrm{Fea}}(v,e)}{d_{\mathrm{Fea}}(e)}-\sum_{e\in E_{\mathrm{Fea}}}\sum_{u,v\in V}\frac{F_{u,k}H_{\mathrm{Fea}}(u,e)w_{\mathrm{Fea}}(e)H_{\mathrm{Fea}}(v,e)F_{v,k}}{\sqrt{d_{\mathrm{Fea}}(u)\,d_{\mathrm{Fea}}(v)}\,d_{\mathrm{Fea}}(e)}\right]\right.\\
&\qquad\left.+\,\kappa_{\mathrm{Spa}}\left[\sum_{u\in V}F_{u,k}^{2}\sum_{e\in E_{\mathrm{Spa}}}\frac{w_{\mathrm{Spa}}(e)H_{\mathrm{Spa}}(u,e)}{d_{\mathrm{Spa}}(u)}\sum_{v\in V}\frac{H_{\mathrm{Spa}}(v,e)}{d_{\mathrm{Spa}}(e)}-\sum_{e\in E_{\mathrm{Spa}}}\sum_{u,v\in V}\frac{F_{u,k}H_{\mathrm{Spa}}(u,e)w_{\mathrm{Spa}}(e)H_{\mathrm{Spa}}(v,e)F_{v,k}}{\sqrt{d_{\mathrm{Spa}}(u)\,d_{\mathrm{Spa}}(v)}\,d_{\mathrm{Spa}}(e)}\right]\right\}\\
&= \sum_{k=1}^{C}\left\{\kappa_{\mathrm{Fea}}\,f_{k}^{T}\left(I-\Theta_{\mathrm{Fea}}\right)f_{k}+\kappa_{\mathrm{Spa}}\,f_{k}^{T}\left(I-\Theta_{\mathrm{Spa}}\right)f_{k}\right\}\\
&= \sum_{k=1}^{C}f_{k}^{T}\left(I-\Theta\right)f_{k}. \end{aligned} \tag{16}$$

The regularizer Ω(F) acts as a smoothness constraint, which means that the label information of nearby points in the hypergraph structure should not change too much. It is defined as the sum of the weighted label distances between all pairs of vertices in every hyperedge, as shown in (16), where ΘFea = D_{vFea}^{-1/2} H_{Fea} W_{Fea} D_{eFea}^{-1} H_{Fea}^T D_{vFea}^{-1/2}, ΘSpa = D_{vSpa}^{-1/2} H_{Spa} W_{Spa} D_{eSpa}^{-1} H_{Spa}^T D_{vSpa}^{-1/2}, and Θ = κFea ΘFea + κSpa ΘSpa. Letting Δ = I − Θ, Ω(F) can be written as follows:

$$\Omega(F) = \sum_{k=1}^{C} f_k^T \Delta f_k. \tag{17}$$
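Assuming the incidence matrices and weights are available, Θ and Δ can be formed exactly as defined above. The following sketch (illustrative names and toy values, not the authors' code) builds the combined hypergraph Laplacian of (17):

```python
import numpy as np

def theta(H, w):
    """Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}, as used in (16)-(17)."""
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(H @ w))
    De_inv = np.diag(1.0 / H.sum(0))
    W = np.diag(w)
    return Dv_inv_sqrt @ H @ W @ De_inv @ H.T @ Dv_inv_sqrt

# toy incidence matrices for the two hyperedge types (illustrative values)
H_fea = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
H_spa = np.array([[1.0], [0.5], [1.0]])
w_fea, w_spa = np.ones(2), np.ones(1)
kappa_fea, kappa_spa = 0.5, 0.5

Theta = kappa_fea * theta(H_fea, w_fea) + kappa_spa * theta(H_spa, w_spa)
Delta = np.eye(3) - Theta  # the hypergraph Laplacian Delta of (17)
```

Δ is symmetric positive semidefinite, so the regularizer of (17) is nonnegative, which underlies the convergence argument given later.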

With the help of the training data, we can further investigate the weight learning for the different types of hyperedges. To that effect, a hypergraph weight regularizer can be added to the objective function as follows:

$$\arg\min_{F,\,\kappa_{\mathrm{Fea}},\,\kappa_{\mathrm{Spa}}} \left\{ \Omega(F) + \xi R_{\mathrm{emp}}(F) + \zeta\!\left(\kappa_{\mathrm{Fea}}^2 + \kappa_{\mathrm{Spa}}^2\right) \right\} \quad \text{s.t. } \kappa_{\mathrm{Fea}} + \kappa_{\mathrm{Spa}} = 1 \tag{18}$$

where ξ and ζ are two positive parameters. In our experiments, ζ is set as ζ = μn, where n is the number of pixels in the hyperspectral image; the setting of ζ is thus controlled through μ.

Now the objective function can be rewritten as follows:

$$\arg\min_{F,\,\kappa_{\mathrm{Fea}},\,\kappa_{\mathrm{Spa}}} \left\{ \sum_{k=1}^{C} f_k^T \Delta f_k + \xi \sum_{k=1}^{C} \| f_k - y_k \|^2 + \zeta\!\left(\kappa_{\mathrm{Fea}}^2 + \kappa_{\mathrm{Spa}}^2\right) \right\}. \tag{19}$$

To solve the above optimization problem, an alternating optimization method is employed: we optimize F and κ in turn, fixing one while updating the other.

We first fix κ and optimize F. The objective function then becomes

$$\arg\min_{F} \left\{ \sum_{k=1}^{C} f_k^T \Delta f_k + \xi \sum_{k=1}^{C} \| f_k - y_k \|^2 \right\} \quad \text{s.t. } \xi > 0. \tag{20}$$

Setting the derivative with respect to F to zero yields the closed-form solution

$$F = \left( I + \frac{1}{\xi} \Delta \right)^{-1} Y. \tag{21}$$
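Equation (21) is a single linear solve; a minimal sketch (hypothetical names), using a linear-system solver rather than an explicit matrix inverse for numerical stability:

```python
import numpy as np

def solve_F(Delta, Y, xi):
    """Closed-form update of (21): F = (I + Delta/xi)^{-1} Y."""
    n = Delta.shape[0]
    return np.linalg.solve(np.eye(n) + Delta / xi, Y)

# tiny sanity check with Delta = 0 (no smoothing): F must equal Y
Y = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
F = solve_F(np.zeros((3, 3)), Y, xi=10.0)
```

With a nonzero Δ, the labels in Y are smoothed over the hypergraph structure, with ξ controlling the fit to the labeled pixels.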

Then we optimize κ with F fixed. The objective function can be written as follows:

$$\arg\min_{\kappa_{\mathrm{Fea}},\,\kappa_{\mathrm{Spa}}} \left\{ \sum_{k=1}^{C} f_k^T \Delta f_k + \zeta\!\left(\kappa_{\mathrm{Fea}}^2 + \kappa_{\mathrm{Spa}}^2\right) \right\} \quad \text{s.t. } \zeta > 0,\ \kappa_{\mathrm{Fea}} + \kappa_{\mathrm{Spa}} = 1. \tag{22}$$

Since κFea + κSpa = 1, we can obtain

$$\Delta = I - \Theta = \left(\kappa_{\mathrm{Fea}} + \kappa_{\mathrm{Spa}}\right) I - \Theta = \kappa_{\mathrm{Fea}}\left(I - \Theta_{\mathrm{Fea}}\right) + \kappa_{\mathrm{Spa}}\left(I - \Theta_{\mathrm{Spa}}\right). \tag{23}$$

We employ the Lagrangian method to solve this optimization problem, which becomes

$$\min_{\kappa_{\mathrm{Fea}},\,\kappa_{\mathrm{Spa}},\,\eta} \left\{ \sum_{k=1}^{C} f_k^T \left( \kappa_{\mathrm{Fea}}\left(I - \Theta_{\mathrm{Fea}}\right) + \kappa_{\mathrm{Spa}}\left(I - \Theta_{\mathrm{Spa}}\right) \right) f_k + \zeta\!\left(\kappa_{\mathrm{Fea}}^2 + \kappa_{\mathrm{Spa}}^2\right) + \eta\left(\left(\kappa_{\mathrm{Fea}} + \kappa_{\mathrm{Spa}}\right) - 1\right) \right\} \tag{24}$$

Setting the derivatives with respect to κFea, κSpa, and η to zero gives

$$\eta = \frac{ -\displaystyle\sum_{k=1}^{C} f_k^T \left( \left(I - \Theta_{\mathrm{Fea}}\right) + \left(I - \Theta_{\mathrm{Spa}}\right) \right) f_k - 2\zeta }{2} \tag{25}$$


TABLE I
DETAILS OF THE INDIAN PINE DATA SET

Class                 # of Pixels    Class          # of Pixels
Soybeans-no till      972            Corn-no till   1428
Soybeans-min          2455           Corn-min       830
Grass/pasture         483            Grass/trees    730
Soybeans-clean till   593            Woods          1265
Hay-windrowed         478            Total          9234

$$\kappa_{\mathrm{Fea}} = \frac{1}{2} + \frac{\displaystyle\sum_{k=1}^{C} f_k^T \left( \left(I - \Theta_{\mathrm{Fea}}\right) + \left(I - \Theta_{\mathrm{Spa}}\right) \right) f_k}{4\zeta} - \frac{\displaystyle\sum_{k=1}^{C} f_k^T \left(I - \Theta_{\mathrm{Fea}}\right) f_k}{2\zeta} \tag{26}$$

$$\kappa_{\mathrm{Spa}} = 1 - \kappa_{\mathrm{Fea}}. \tag{27}$$
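The updates (25)-(27) can be written compactly by noting that the sums over classes equal traces, i.e., Σ_k f_k^T M f_k = trace(F^T M F). A sketch with illustrative values (not the authors' code):

```python
import numpy as np

def update_kappa(F, theta_fea, theta_spa, zeta):
    """Kappa update of (25)-(27); the columns of F are the class scores f_k."""
    n = F.shape[0]
    I = np.eye(n)
    s_fea = np.trace(F.T @ (I - theta_fea) @ F)  # sum_k f_k^T (I - Theta_Fea) f_k
    s_spa = np.trace(F.T @ (I - theta_spa) @ F)
    kappa_fea = 0.5 + (s_fea + s_spa) / (4 * zeta) - s_fea / (2 * zeta)
    return kappa_fea, 1.0 - kappa_fea

# symmetric toy case: identical Thetas must yield equal weights
F = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
T = 0.5 * np.eye(3)
kf, ks = update_kappa(F, T, T, zeta=3.0)
```

The update shifts weight toward the hyperedge type whose smoothness term f_k^T(I − Θ)f_k is smaller, while ζ keeps the weights from collapsing onto a single type.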

As each step of the alternating optimization decreases the objective function in (19), and the objective function is bounded below by 0, the convergence of the alternating optimization process is guaranteed.

After the computation of the confidence score matrix F, each pixel in the hyperspectral image is classified into the class with the highest confidence score.

V. EXPERIMENTAL RESULTS

Here, an experimental study is provided to demonstrate the effectiveness of the proposed hyperspectral image classification approach. First, the testing data sets are introduced, and then the experimental results and discussion are provided.

A. Experimental Data Sets

In our experiments, four data sets are used to evaluate the performance of the proposed method. The first data set is the AVIRIS image data set, which is widely used in [36], [37], and [23]. It was acquired over NW Indiana's Indian Pine test site in June 1992 from an aircraft flown at a 65 000-ft altitude. The Indian Pine data set has a size of 145 × 145 pixels and 220 spectral bands ranging from 0.4 to 2.5 μm, with a spatial resolution of 20 m. A few bands are removed because of water absorption, and 200 out of the 220 bands are finally used in our experiment (the removed bands are [104−108], [150−163], and 220). There are originally 16 classes in total, ranging in size from 20 to 2455 pixels. Because of the small sizes of some classes, only nine classes are selected for evaluation. Detailed information about the selected classes is shown in Table I.
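The band removal described above can be sketched as follows, assuming the scene is loaded as a hypothetical `cube` array and using the 1-based band numbers given in the text:

```python
import numpy as np

# Remove the water-absorption bands of the Indian Pine cube, keeping
# 200 of the 220 bands; `cube` is a zero-filled stand-in for the scene.
cube = np.zeros((145, 145, 220))  # rows x cols x bands

# 1-based band numbers [104-108], [150-163], and 220, as listed in the text
removed = list(range(104, 109)) + list(range(150, 164)) + [220]
keep = [b for b in range(1, 221) if b not in removed]
cube_clean = cube[:, :, np.array(keep) - 1]  # convert to 0-based indices
```

The same pattern applies to Salinas A (bands [108−112], [154−167], and 224 removed) with the indices swapped in.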

Following the experimental settings in [46], we further select a subset scene of the Indian Pine data set, consisting of the pixels [27−94] × [31−116], i.e., a 68 × 86 data set (denoted by Indian Pine Sub). In Indian Pine Sub, there are four labeled classes in total. This data set [46] is designed to evaluate hyperspectral image classification methods when dealing with different classes that have similar spectral signatures. As with Indian Pine, 20 bands are removed because of water absorption. The details of the Indian Pine Sub data set are shown in Table II.

TABLE II
DETAILS OF THE INDIAN PINE SUB DATA SET

Class               # of Pixels    Class          # of Pixels
Soybeans-no till    732            Corn-no till   1005
Soybeans-min        1903           Grass/trees    730
Total               4370

TABLE III
DETAILS OF THE SALINAS A DATA SET

Class                        # of Pixels
Brocoli green weeds 1        391
Corn senesced green weeds    1343
Lettuce romaine 4wk          616
Lettuce romaine 5wk          1525
Lettuce romaine 6wk          674
Lettuce romaine 7wk          799
Total                        5348

TABLE IV
DETAILS OF THE PAVIAU DATA SET

Class                  # of Pixels    Class                  # of Pixels
Asphalt                6631           Bare Soil              5029
Meadows                18 649         Bitumen                1330
Gravel                 2099           Self-Blocking Bricks   3682
Trees                  3064           Shadows                947
Painted Metal Sheets   1345           Total                  42 776

The third testing data set is the Salinas Valley data. This scene was collected by the AVIRIS sensor over Salinas Valley, California, in 1998. The scene has a size of 512 × 217 pixels and is characterized by a high spatial resolution of 3.7 m per pixel. In Salinas Valley, there are 16 classes. A subset of the Salinas Valley scene [7], denoted as Salinas A hereinafter, is employed, which consists of 86 × 83 pixels located at [591−678] × [158−240] of Salinas Valley. There are six classes in Salinas A. The number of spectral bands is 224. The bands [108−112], [154−167], and 224 are removed because of water absorption, and 204 out of the 224 bands are finally used in our experiment. The details of the Salinas A data set are shown in Table III.

The fourth testing data set is the University of Pavia image data set, which was captured over an urban area (the University of Pavia) by the ROSIS-03 optical sensor. The scene has a size of 610 × 340 pixels and is characterized by a high spatial resolution of 1.3 m per pixel. In the University of Pavia (denoted by PaviaU in the following) data set, there are nine classes. The original number of spectral bands is 115. The 12 noisiest bands are removed, and 103 out of the 115 bands are finally used in our experiment. The details of the PaviaU data set are shown in Table IV.

B. Compared Methods

To evaluate the effectiveness of the proposed spatial-constraint hyperspectral image classification, the following methods are implemented and compared.

Fig. 6. Classification results of the compared methods on all testing data sets. (a) OA @ Indian Pine. (b) OA @ Indian Pine Sub. (c) OA @ Salinas A. (d) OA @ PaviaU.

1) Semisupervised graph-based method [46], which formulates hyperspectral image classification as semisupervised graph learning. All pixels are denoted by the vertices in the graph structure, which is able to exploit the wealth of unlabeled samples through the graph learning procedure. For comparison, the Cross+Stacked kernel is chosen, which shows the best results in [46]. This method is denoted by SSG+CS.

2) A robust sparse approach (SPARSE) [7]. The SPARSE method is based on the assumption that each hyperspectral data class lies in a low-dimensional subspace. In SPARSE, a dictionary is generated from the labeled data, and a sparse representation is calculated using this dictionary for each testing pixel. In this method, an l1-minimization-based sparse representation method is employed.

3) CRF [27]. Toward modeling the spatial layouts, a natural alternative is to adopt the CRF, instead of our spatial-layout-based hyperedge, trained on local samples. Inference over the CRF can also provide contextual smoothing over the spatial layouts. As introduced in [27], the CRF is able to incorporate the contextual information in both the labeled and unlabeled data in a principled way.

4) Local manifold learning-based method (LML+KNN) [25]. LML+KNN combines local manifold learning and the KNN classifier for hyperspectral image classification. In this method, all pixels in the hyperspectral image are embedded in a manifold, and local manifold learning is conducted to estimate the relationship among different pixels. Then, the weighted KNN classifier is employed for classification. The Supervised Locally Linear Embedding method is used as the weighting method because of its steady performance, as introduced in [25].

5) Hypergraph analysis with feature-based hyperedges for hyperspectral image classification (HG-Fea) [26]. In HG-Fea, only the feature-based hyperedges are constructed, and learning on the generated hypergraph is conducted to classify the hyperspectral image.

6) Hypergraph analysis with spatial-based hyperedges for hyperspectral image classification (HG-Spa) [33]. In HG-Spa, only the spatial-based hyperedges are constructed, and learning on the generated hypergraph is conducted to classify the hyperspectral image.

7) Hypergraph analysis with spatial constraint for hyperspectral image classification (HG). In HG, both the feature-based and the spatial-based hyperedges are employed for hypergraph construction. Learning on the generated hypergraph is conducted with the spatial constraint only, without regard to the hyperedge weights.

Fig. 7. Example of pixels belonging to the same class that cannot be connected by a spatial-based hyperedge.

8) Hypergraph analysis with spatial constraint and hyperedge weight learning for hyperspectral image classification (HG-W). Different from HG, it further learns the weights for the two types of hyperedges to generate an optimal hypergraph structure, i.e., the proposed method.

C. Experimental Results

In this experiment, we evaluate the proposed method and all compared methods on the four testing data sets introduced above. In these experiments, we set K = 20, L = 12, ξ = 10, and μ = 0.5. In each data set, the number of labeled samples per class varies from 3 to 100, i.e., over {3, 5, 10, 15, 20, 25, 30, 50, 100}. The overall accuracy (OA) is selected as the evaluation metric, which is widely employed in existing works [2]. In all experiments, we randomly select the labeled data and employ all other data as the testing set, repeated for ten trials, and the reported results are the average performance over the ten trials. The experimental comparison on all testing data sets is shown in Fig. 6.
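The OA metric is simply the fraction of correctly classified pixels; a minimal sketch:

```python
import numpy as np

def overall_accuracy(pred, truth):
    """Overall accuracy (OA): fraction of correctly classified pixels."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    return np.mean(pred == truth)

# toy example: 4 of 5 pixels classified correctly
oa = overall_accuracy([0, 1, 1, 2, 2], [0, 1, 2, 2, 2])
```

In the averaged-trials protocol above, this quantity would be computed once per random split and then averaged over the ten trials.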

The variability of the OA curves for the proposed method is shown in Fig. 6. When the labeled data per class are very few, i.e., three, the performance variance is large. This variance becomes smaller as the amount of labeled training data increases. Once the number of labeled pixels per class exceeds five, the classification performance is stable across the different randomly selected training sets. This observation indicates the robustness of the proposed method.

Compared with the state-of-the-art methods, the proposed method HG outperforms SSG+CS, SPARSE, CRF, and LML+KNN on all testing data sets. We take the experimental results when five samples per class are selected as the training data as an example. In the Indian Pine data set, the proposed method achieves performance gains of 15.04%, 17.67%, 11.20%, and 13.68% in terms of the OA measure compared with SSG+CS, SPARSE, CRF, and LML+KNN, respectively. In the Indian Pine Sub data set, the corresponding gains are 33.84%, 11.83%, 8.55%, and 10.74%. In the Salinas A data set, they are 6.40%, 0.85%, 2.86%, and 5.15%, and in the PaviaU data set, they are 15.71%, 11.08%, 16.53%, and 12.87%.

Fig. 8. Classification results for the Indian Pine data set. (a) Ground-truth map with nine classes. (b)–(j) Classification maps with 3, 5, 10, 15, 20, 25, 30, 50, and 100 labeled training samples per class.

Fig. 9. Classification results for the Indian Pine Sub data set. (a) Ground-truth map with four classes. (b)–(j) Classification maps with 3, 5, 10, 15, 20, 25, 30, 50, and 100 labeled training samples per class.

Fig. 10. Classification results for the Salinas A data set. (a) Ground-truth map with six classes. (b)–(j) Classification maps with 3, 5, 10, 15, 20, 25, 30, 50, and 100 labeled training samples per class.

Fig. 11. Classification results for the PaviaU data set. (a) Ground-truth map with nine classes. (b)–(j) Classification maps with 3, 5, 10, 15, 20, 25, 30, 50, and 100 labeled training samples per class.

From the experimental results, we can draw the following conclusions. The proposed method HG achieves the best classification performance in most cases on all testing data sets, which indicates its effectiveness. The improvement on the Salinas A data set is smaller than that on the other three data sets. This is partially because the results of most compared methods on the Salinas A data set are already good, and thus the room for improvement is limited in comparison with the other data sets.

We further compare the proposed methods using both types of hyperedges, i.e., HG and HG-W, with the methods using a single type of hyperedge, i.e., HG-Fea and HG-Spa. In comparison with HG-Fea, in which only the distance-based hyperedges are generated, HG and HG-W achieve better results. These results indicate that the spatial constraint is effective in improving the structure of the constructed hypergraph.

In comparison with HG and HG-W, HG-Spa, in which only the spatial-based hyperedges are generated, performs somewhat worse. This result indicates that a hypergraph using only the spatial constraint is not effective enough. With only the spatial constraint, the constructed hypergraph may lose the general connections among different regions in the hyperspectral image. One example is shown in Fig. 7, where green pixels are segmented into two regions. Under these circumstances, the two groups of green pixels cannot be connected by the hypergraph of HG-Spa, which contains only spatial-based hyperedges; this may decrease the discriminative ability of the hypergraph structure. However, the connection can be achieved by a feature-based hyperedge in HG and HG-W, as shown by the link in Fig. 7, which therefore improves the classification performance considerably.

Fig. 12. Classification performance comparison with different K values by using five training samples per class in all testing data sets.

To investigate the effectiveness of the weight learning for the different types of hyperedges, we further compare HG-W with HG. In comparison with HG, HG-W achieves better results on all testing data sets. We again take the experimental results when five samples per class are selected as the training data as an example. Compared with HG, HG-W achieves performance gains of 4.66%, 1.95%, 0.58%, and 2.38% in terms of the OA measure on the Indian Pine, Indian Pine Sub, Salinas A, and PaviaU data sets, respectively. These results indicate that learning the weights of the different types of hyperedges is effective in optimizing the constructed hypergraph structure for hyperspectral image modeling.

Figs. 8–11 show the classification results of the proposed method on the testing data sets with different numbers of selected training samples per class.

D. On the Number of Selected Neighbors K for Feature-Based Hyperedge Construction

The parameter K is employed to select the distance-based nearest neighbors for hyperedge construction. When K is small, only a few pixels can be connected by each hyperedge; therefore, many pixels cannot be fully connected by the constructed hypergraph structure. When K is too large, too many pixels are connected by one hyperedge, and the discriminative ability of the constructed hypergraph is limited. Fig. 12 provides the OA performance curves with respect to the variation of K in all testing data sets, where five samples per class are selected as the training data.

Fig. 13. Classification performance comparison with different L values by using five training samples per class in all testing data sets.

Fig. 14. Classification performance comparison with different ξ values by using five training samples per class in all testing data sets.

As shown in the experimental results, the classification performance is worse when K is too small. With the increase of K, the classification performance improves; when K becomes too large, the performance degrades again. In turn, these results validate our previous analysis on the number of selected neighbors K for feature-based hyperedge construction.
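A feature-based hyperedge built from K nearest neighbors can be sketched as below (binary incidence for simplicity; names and toy values are illustrative, and a real implementation would use a KD-tree or similar index rather than the full pairwise distance matrix):

```python
import numpy as np

def feature_hyperedges(X, K):
    """One feature-based hyperedge per pixel: the pixel plus its K nearest
    neighbors in the feature space (binary incidence, illustrative sketch)."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    H = np.zeros((n, n))
    for i in range(n):
        nn = np.argsort(D[i])[:K + 1]  # the pixel itself plus its K neighbors
        H[nn, i] = 1.0
    return H

# toy 1-band features: two clusters of pixels
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1]])
H = feature_hyperedges(X, K=1)
```

Each column is one hyperedge; as K grows, every column connects more pixels, which matches the tradeoff discussed above.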

E. On the Spatial Constraint L

The spatial constraint L determines the selected local neighbors for each pixel in the procedure of spatial-based hyperedge construction. When L is too small, each pixel connects only to a few pixels around it. When L becomes larger, more pixels can be linked to the central pixel. In this experiment, we vary the parameter L over {4, 8, 12, 24} and evaluate its influence. Fig. 13 provides the OA curves with respect to the variation of L in all testing data sets, where five samples per class are selected as the training data.

Fig. 15. Classification performance comparison with different μ values by using five training samples per class in all testing data sets.

As shown in Fig. 13, the following conclusions can be drawn.

1) Generally, the hyperspectral image classification performance is high for all experiments with different L values. These results demonstrate the effectiveness of the hypergraph analysis method with the spatial constraint.

2) The results are stable under variation of the parameter L. When L is large (e.g., 24) or small (e.g., 4), the classification performance is only slightly lower than that of L = 8 and L = 12. These results indicate that the proposed method achieves steady performance under different settings of L.
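Under our reading, L counts the pixels in a local window around the center: L = 8 corresponds to the 3 × 3 window minus the center and L = 24 to the 5 × 5 window, while the L = 4 and L = 12 settings would use cross- or diamond-shaped neighborhoods instead. A sketch for the square-window cases (an assumption, not the authors' code):

```python
def spatial_neighbors(r, c, L, shape):
    """Indices of the L spatial neighbors of pixel (r, c) for square windows:
    L = 8 is the 3x3 window minus the center, L = 24 the 5x5 window minus
    the center. Neighbors outside the image are dropped at the borders."""
    radius = {8: 1, 24: 2}[L]
    rows, cols = shape
    out = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue  # skip the center pixel itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                out.append((rr, cc))
    return out

nbrs = spatial_neighbors(2, 2, L=8, shape=(5, 5))
```

The centroid pixel together with these neighbors forms one spatial-based hyperedge, whose incidence entries are then filled in via (3).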

F. On the Parameters ξ and μ in the Hypergraph Learning Procedure

The parameter ξ modulates the effect of the loss term ‖F − Y‖², and μ modulates the effect of the weight regularizer. In these experiments, we fix one parameter and vary the other. Fig. 14 shows the OA curves with respect to the variation of ξ, where μ is fixed to 0.5. Fig. 15 provides the OA curves with respect to the variation of μ, where ξ is fixed to 10. The experimental results demonstrate that the proposed method achieves steady performance when the parameters vary over a large range.

G. Computation Cost Analysis

Here, we discuss the computation cost of the proposed method. All the experiments are conducted on a PC with an Intel i7-2600 3.4-GHz CPU and 16-GB memory. The running time of the proposed method on all four data sets is shown in Fig. 16. When the size of the data set increases, the computation cost grows quickly. To deal with this problem, a large data set can first be divided into several segments, and the proposed method is then conducted on each segment. For PaviaU, we first split the scene into five segments and conduct the proposed method on each segment in turn. The split is shown in Fig. 17(a), and Fig. 17(b) provides the comparison between the performance obtained by using the whole data set directly and by using the split images. The running time of the method on the split images is also shown in Fig. 16. We make the following observations. First, the running time is reduced significantly when the large original data set is split into several smaller pieces. Second, the classification performance degrades only slightly in this procedure. These observations indicate that the image splitting procedure is a practical way to reduce the computation cost with an acceptable performance decrease.

Fig. 16. Computation cost of the proposed method on all data sets.

Fig. 17. Experimental evaluation on the split images of PaviaU.
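The segment-wise strategy can be sketched as a simple row-wise split of the scene (the paper's actual split pattern in Fig. 17(a) may differ; this is an illustrative assumption):

```python
import numpy as np

def split_rows(cube, n_segments):
    """Split a hyperspectral cube into horizontal segments so the
    classifier can be run on each piece separately (sketch of the
    cost-reduction strategy)."""
    return np.array_split(cube, n_segments, axis=0)

cube = np.zeros((610, 340, 103))  # PaviaU-sized stand-in array
segments = split_rows(cube, 5)
```

Each segment is then classified independently, which shrinks the per-run hypergraph at the cost of ignoring pixel relations that cross segment boundaries.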

VI. CONCLUSION

Hyperspectral image classification attracted extensiveresearch efforts in the recent decade. The main challengelaid in the high-dimensional features and few training sam-ples. In this paper, we proposed a spectral-spatial constrainthyperspectral image classification method to deal with bothissues by investigating the spatial context. In the proposedmethod, the relationship among pixels in the hyperspectralimage was formulated into a hypergraph structure, in whicheach vertex denoted a pixel in the image. Two types ofhyperedges, i.e., the feature-based and spatial-based, weregenerated according to the pair-wise distances in the featurespace and spatial neighborhood. We further learnt the weightsfor these two types of hyperedges to generate an optimalhypergraph structure. Semisupervised learning was conductedin the hypergraph for hyperspectral image classification.

Experiments on four data sets, i.e., Indian Pine, Indian Pine Sub, Salinas A, and PaviaU, were conducted to evaluate the effectiveness of the proposed method, with comparisons against the state-of-the-art methods. From the experimental results, we can draw the following conclusions.

1) The proposed method can achieve better results in comparison with the state-of-the-art methods.

2) The combination of the two types of hyperedges was effective, and the spatial-based hyperedge was able to improve the classification performance significantly, owing to jointly investigating the pixel relationships in the feature space and the spatial layout.

As all the data were used in the hypergraph construction, we can explore the pixel relations in the whole data set, considering all the labeled and unlabeled data. However, this leads to a high computation cost in both memory and running time. Therefore, when dealing with large data sets, we first divided the data set into several segments and then conducted the hypergraph-based classification method on each segment. We quantitatively showed that this method was able to reduce the computation cost significantly, with a small decrease of classification performance. Such a performance decrease mainly came from ignoring the relations among the different unlabeled subsets in the classification procedure. A potential solution is to split the data set while still investigating the relations among the different subsets. An alternative solution is to classify the hyperspectral image in a hierarchical way: the hyperspectral image can first be segmented at different scales, and then a coarse-to-fine hypergraph-based classification process can be conducted to generate coarse and then refined results. Both alternatives will be investigated in our future work.

REFERENCES

[1] D. Landgrebe, “Hyperspectral image data analysis,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 17–28, Jan. 2002.

[2] B. Guo, S. Gunn, R. Damper, and J. Nelson, “Customizing kernel functions for SVM-based hyperspectral image classification,” IEEE Trans. Image Process., vol. 17, no. 4, pp. 622–629, Apr. 2008.

[3] H. Jiao, Y. Zhong, L. Zhang, and P. Li, “Unsupervised remote sensing image classification using artificial DNA computing,” in Proc. Int. Conf. Comput., Netw. Commun., vol. 3, Jul. 2011, pp. 1341–1345.

[4] O. Eches, N. Dobigeon, C. Mailhes, and J. Tourneret, “Bayesian estimation of linear mixtures using the normal compositional model. Application to hyperspectral imagery,” IEEE Trans. Image Process., vol. 19, no. 6, pp. 1403–1413, Jun. 2010.

[5] T. Liu, L. Zhang, P. Li, and H. Lin, “Remotely sensed image retrieval based on region-level semantic mining,” EURASIP J. Image Video Process., no. 1, pp. 1–11, Jan. 2012.

[6] L. Zhang, L. Zhang, D. Tao, and X. Huang, “A multifeature tensor for remote-sensing target recognition,” IEEE Geosci. Remote Sens. Lett., vol. 8, no. 2, pp. 374–378, Mar. 2011.

[7] Q. Sami ul Haq, L. Tao, F. Sun, and S. Yang, “A fast and robust sparse approach for hyperspectral data classification using a few labeled samples,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 6, pp. 2287–2302, Jun. 2012.

[8] G. Bilgin, S. Erturk, and T. Yildirim, “Unsupervised classification of hyperspectral-image data using fuzzy approaches that spatially exploit membership relations,” IEEE Geosci. Remote Sens. Lett., vol. 5, no. 4, pp. 673–677, Oct. 2008.

[9] G. Chova, G. Camps-Valls, J. Munoz-Mari, and J. Calpe, “Semi-supervised image classification with Laplacian support vector machines,” IEEE Geosci. Remote Sens. Lett., vol. 5, no. 3, pp. 336–340, Jul. 2008.

[10] L. Zhang, L. Zhang, D. Tao, and X. Huang, “On combining multiple features for hyperspectral remote sensing image classification,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 3, pp. 879–893, Mar. 2012.


[11] L. Bruzzone, M. Chi, and M. Marconcini, “A novel transductive SVM for semisupervised classification of remote sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 11, pp. 3363–3373, Nov. 2006.

[12] D. Tuia and G. Camps-Valls, “Semi-supervised remote sensing image classification with cluster kernels,” IEEE Geosci. Remote Sens. Lett., vol. 6, no. 2, pp. 224–228, Apr. 2009.

[13] A. Hyvärinen and E. Oja, “Independent component analysis: Algorithms and applications,” Neural Netw., vol. 13, nos. 4–5, pp. 411–430, Jun. 2001.

[14] S. Wold, K. Esbensen, and P. Geladi, “Principal component analysis,” Chemometrics Intell. Lab. Syst., vol. 2, nos. 1–3, pp. 37–52, 1987.

[15] J. Wang and C.-I. Chang, “Independent component analysis-based dimensionality reduction with applications in hyperspectral image analysis,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 6, pp. 1586–1600, Jun. 2006.

[16] B.-C. Kuo, C.-H. Li, and J.-M. Yang, “Kernel nonparametric weighted feature extraction for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, pp. 1139–1155, Apr. 2009.

[17] B. Guo, S. Gunn, R. Damper, and J. Nelson, “Band selection for hyperspectral image classification using mutual information,” IEEE Geosci. Remote Sens. Lett., vol. 3, no. 4, pp. 522–526, Oct. 2006.

[18] A. Martinez-Uso, F. Pla, J. M. Sotoca, and P. Garcia-Sevilla, “Clustering-based hyperspectral band selection using information measures,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 4158–4171, Dec. 2007.

[19] G. Camps-Valls, J. Mooij, and B. Schölkopf, “Remote sensing feature selection by kernel dependence measures,” IEEE Geosci. Remote Sens. Lett., vol. 7, no. 3, pp. 587–591, Jul. 2010.

[20] L. Bruzzone and S. Serpico, “A technique for feature selection in multiclass problems,” Int. J. Remote Sens., vol. 21, no. 3, pp. 549–563, 2000.

[21] D. Tuia, G. Camps-Valls, G. Matasci, and M. Kanevski, “Learning relevant image features with multiple kernel classification,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 10, pp. 3780–3791, Oct. 2010.

[22] M. Marconcini, G. Camps-Valls, and L. Bruzzone, “A composite semisupervised SVM for classification of hyperspectral images,” IEEE Geosci. Remote Sens. Lett., vol. 6, no. 2, pp. 234–238, Apr. 2009.

[23] R. Archibald and G. Fann, “Feature selection and classification of hyperspectral images with support vector machines,” IEEE Geosci. Remote Sens. Lett., vol. 4, no. 4, pp. 674–677, Oct. 2007.

[24] J. Wang, Z. Zhang, and H. Zha, “Adaptive manifold learning,” in Proc. Adv. Neural Inf. Process. Syst., 2004, pp. 313–338.

[25] L. Ma, M. Crawford, and J. Tian, “Local manifold learning-based k-nearest-neighbor for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 11, pp. 4099–4109, Nov. 2010.

[26] Y. Wen, Y. Gao, S. Liu, Q. Cheng, and R. Ji, “Hyperspectral image classification with hypergraph modeling,” in Proc. Int. Conf. Internet Multimedia Comput. Service, 2012, pp. 34–37.

[27] P. Zhong and R. Wang, “Learning conditional random fields for classification of hyperspectral images,” IEEE Trans. Image Process., vol. 19, no. 7, pp. 1890–1907, Jul. 2010.

[28] P. Zhong and R. Wang, “Modeling and classifying hyperspectral imagery by CRFs with sparse higher order potentials,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 2, pp. 688–705, Feb. 2011.

[29] F. Ratle, G. Camps-Valls, and J. Weston, “Semi-supervised neural networks for efficient hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 5, pp. 2271–2282, May 2010.

[30] L. Bruzzone and C. Persello, “A novel context-sensitive semisupervised SVM classifier robust to mislabeled training samples,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 7, pp. 2142–2154, Jul. 2009.

[31] G. Licciardi, F. Pacifici, D. Tuia, S. Prasad, T. West, F. Giacco, J. Inglada, E. Christophe, J. Chanussot, and P. Gamba, “Decision fusion for the classification of hyperspectral data: Outcome of the 2008 GRS-S data fusion contest,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3857–3865, Nov. 2009.

[32] S. Velasco-Forero and V. Manian, “Improving hyperspectral image classification using spatial preprocessing,” IEEE Geosci. Remote Sens. Lett., vol. 6, no. 2, pp. 297–301, Apr. 2009.

[33] Y. Gao and T.-S. Chua, “Hyperspectral image classification by using the pixel spatial correlation,” in Proc. Int. Conf. Multimedia Model., Jan. 2013, pp. 141–151.

[34] J. Benediktsson, J. Palmason, and J. Sveinsson, “Classification of hyperspectral data from urban areas based on extended morphological profiles,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 480–490, Mar. 2005.

[35] M. Fauvel, J. Benediktsson, J. Chanussot, and J. Sveinsson, “Spectral and spatial classification of hyperspectral data using SVMs and morphological profiles,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 11, pp. 3804–3814, Nov. 2008.

[36] G. Camps-Valls and L. Bruzzone, “Kernel-based methods for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 6, pp. 1351–1362, Jun. 2005.

[37] T. Bandos, L. Bruzzone, and G. Camps-Valls, “Classification of hyperspectral images with regularized linear discriminant analysis,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 3, pp. 862–873, Mar. 2009.

[38] D. Zhou, J. Huang, and B. Schölkopf, “Learning with hypergraphs: Clustering, classification, and embedding,” in Proc. Adv. Neural Inf. Process. Syst., 2007, pp. 1600–1601.

[39] Y. Gao, M. Wang, H. Luan, J. Shen, S. Yan, and D. Tao, “Tag-based social image search with visual-text joint hypergraph learning,” in Proc. ACM Conf. Multimedia, 2011, pp. 1517–1520.

[40] J. Bu, S. Tan, C. Chen, C. Wang, H. Wu, L. Zhang, and X. He, “Music recommendation by unified hypergraph: Combining social media information and music content,” in Proc. ACM Int. Conf. Multimedia, 2010, pp. 391–400.

[41] S. Xia and E. Hancock, “Learning large scale class specific hypergraphs for object recognition,” in Proc. Int. Conf. Image Graph., 2008, pp. 366–371.

[42] Y. Gao, M. Wang, D. Tao, R. Ji, and Q. Dai, “3D object retrieval and recognition with hypergraph analysis,” IEEE Trans. Image Process., vol. 21, no. 9, pp. 4290–4303, Sep. 2012.

[43] Y. Huang, Q. Liu, S. Zhang, and D. Metaxas, “Image retrieval viaprobabilistic hypergraph ranking,” in Proc. IEEE Int. Conf. Comput.Vis. Pattern Recognit., Jun. 2010, pp. 3376–3383.

[44] Y. Gao, M. Wang, Z. Zha, J. Shen, X. Li, and X. Wu, “Visual-textualjoint relevance learning for tag-based social image search,” IEEE Trans.Image Process., vol. 22, no. 1, pp. 363–376, Jan. 2013.

[45] S. Xia and E. Hancock, “3D object recognition using hyper-graphs andranked local invariant features,” in Proc. Int. Joint IAPR Int. Workshop,SSPR SPR, Dec. 2008, pp. 117–126.

[46] G. Camps-Valls, T. B. Marsheva, and D. Zhou, “Semi-supervised graph-based hyperspectral image classification,” IEEE Trans. Geosci. RemoteSens., vol. 45, no. 10, pp. 3044–3054, Oct. 2007.

Rongrong Ji (M’10) received the Ph.D. degree in computer science from the Harbin Institute of Technology, Harbin, China.

He has been a Post-Doctoral Research Fellow with the Department of Electrical Engineering, Columbia University, New York, NY, USA, since 2011, working with Prof. S.-F. Chang. He was a Visiting Student with the University of Texas at San Antonio, San Antonio, TX, USA, where he worked with Prof. Q. Tian from March 2010 to May 2010; a Research Assistant with Peking University, Beijing, China, where he worked with Prof. W. Gao from April 2010 to November 2010; and a Research Intern with Microsoft Research Asia, where he worked with Dr. X. Xie. He led the Multimedia Retrieval Group, Visual Intelligence Lab, Harbin Institute of Technology, from 2007 to 2010. He has authored over 80 refereed journal and conference papers in venues including IJCV, TIP, TMM, TOMCCAP, IEEE Multimedia, PR, ACM Multimedia Systems, CVPR, ACM Multimedia, IJCAI, and AAAI. His current research interests include image and video search, content understanding, mobile visual search, and interactive human-computer interfaces.

Dr. Ji was the recipient of the Best Paper Award at ACM Multimedia 2011 and the Microsoft Fellowship in 2007. He is an Associate Editor of the International Journal of Computer Applications, a Guest Editor of the International Journal of Advanced Computer Science and Applications, and a Session Chair of ICME 2008, ICIMCS 2010, MMM 2013, and PCM 2012. He serves as a Reviewer for the IEEE TPAMI, TIP, TMM, CSVT, TSMC Parts A, B, and C, and the IEEE Signal Processing Magazine. He is on the program committees of over 20 international conferences, including CVPR 2013, ECCV 2012, ACM Multimedia 2012, ACM Multimedia 2011, and ICME 2012.



Yue Gao received the B.S. degree from the Harbin Institute of Technology, Harbin, China, in 2005, and the M.E. and Ph.D. degrees from Tsinghua University, Beijing, China, in 2008 and 2012, respectively.

He was a Visiting Scholar at Carnegie Mellon University, Pittsburgh, PA, USA, where he worked with Dr. Alexander Hauptmann from October 2010 to March 2011, and was a Research Intern at the National University of Singapore, Singapore, and at the Intel China Research Center, Zhong Guan Cun, China. He is currently a Research Fellow with the School of Computing, National University of Singapore, working with Prof. Tat-Seng Chua. He has authored over 40 journal and conference papers in venues including TIP, TMM, TOMCCAP, PR, CVPR, and ACM Multimedia. His current research interests include large-scale image/video retrieval, 3-D object retrieval and recognition, and social media analysis.

Dr. Gao is a Guest Editor of the Multimedia Systems journal, a Session Chair of PCM 2012, MMM 2013, and ICIMCS 2013, and a Reviewer for the IEEE TRANSACTIONS ON IMAGE PROCESSING, the IEEE TRANSACTIONS ON MULTIMEDIA, CSVT, and the IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, PART B. He is a member of the Association for Computing Machinery.

Richang Hong (M’10) received the Ph.D. degree from the University of Science and Technology of China, Hefei, China, in 2008.

He was a Research Fellow with the School of Computing, National University of Singapore, Singapore, from September 2008 to December 2010. He is currently a Professor with the Hefei University of Technology, Hefei, China. His current research interests include multimedia question answering, video content analysis, and pattern recognition. He has co-authored more than 50 publications in his areas of expertise.

Dr. Hong is a member of the Association for Computing Machinery (ACM). He was the recipient of the Best Paper Award at ACM Multimedia 2010.

Qiong Liu (M’11) received the B.S. degree in computer science and technology from Central China Normal University, Wuhan, China, and the Ph.D. degree from the National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan, China, in 2008.

She was a Post-Doctoral Fellow with the Department of Automation, Tsinghua University, Beijing, China, from 2010 to 2012. She is currently with the Faculty of the Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan, China. She has authored more than 30 journal and conference papers on image and video processing, 3-D image modeling and representation, multi-view and free-viewpoint video, and 3-D television. She holds more than 15 patents on video coding and 3-D video processing, and is one of the chief contributors to three domestic 3-D industry standards in China.

Dacheng Tao received the B.Eng. degree from the University of Science and Technology of China, Hefei, China, the M.Phil. degree from the Chinese University of Hong Kong, Hong Kong, and the Ph.D. degree from the University of London, London, U.K.

He is currently a Professor with the Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, Australia, a Research Associate Fellow with the University of London, a Visiting Professor with Xidian University, Xi’an, China, and a Guest Professor with Wuhan University, Wuhan, China. He mainly applies statistics and mathematics to data analysis problems in neuroscience and related areas, e.g., neurocomputing, cognitive vision, large-scale data processing, machine intelligence, and compressed sensing. He has authored and co-authored more than 100 scientific articles at top venues, including IEEE T-PAMI, T-KDE, T-IP, NIPS, ICDM, CVPR, ECCV, ACM T-KDD, KDD, and Cognitive Computation, with best paper awards. One of his T-PAMI papers was selected as a “New Hot Paper” by ScienceWatch.com (Thomson Scientific) for its contribution of tensor machines to computational neuroscience. His publications have been cited more than 1000 times in Google Scholar, his h-index in Google Scholar is 17+, and his Erdős number is 3. His current research interests include computational neuroscience.

Dr. Tao was the recipient of several Meritorious Awards from the International Interdisciplinary Contest in Modeling, the highest-level mathematical modeling contest in the world, organized by COMAP. He holds the K. C. Wong Education Foundation Award of the Chinese Academy of Sciences. He is an Associate Editor of the IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (T-KDE), Neurocomputing (Elsevier), Neural Processing Letters (Springer), Information Sciences (Elsevier), the official journal of the International Association for Statistical Computing, Computational Statistics and Data Analysis (Elsevier), Signal Processing (Elsevier), and Pattern Analysis and Applications (Springer). He has co-edited five books and eight journal special issues on topics in computational neuroscience. He has (co-)chaired special sessions, invited sessions, workshops, panels, and conferences. He has served for more than 100 major international conferences, including ICDM, KDD, CVPR, ICCV, ECCV, and ACM Multimedia, and more than 40 prestigious international journals, including T-PAMI, T-KDE, T-OIS, and T-IP. He is a member of the IEEE Computer Society, the IEEE Signal Processing Society, the IEEE Systems, Man, and Cybernetics (SMC) Society, and the IEEE SMC Technical Committee on Cognitive Computing.

Xuelong Li (M’02–SM’07–F’12) received the Ph.D. degree from the University of Science and Technology of China, Hefei, China.

He is currently a Full Professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an, China.

