Post on 05-Dec-2014
Explicit Signal to Noise Ratio in Reproducing Kernel Hilbert Spaces
Luis Gómez-Chova (1), Allan A. Nielsen (2), Gustavo Camps-Valls (1)
(1) Image Processing Laboratory (IPL), Universitat de València, Spain. luis.gomez-chova@uv.es, http://www.valencia.edu/chovago
(2) DTU Space - National Space Institute, Technical University of Denmark.
IGARSS 2011 – Vancouver, Canada
Intro SNR KMNF Results Conclusions
Outline
1 Introduction
2 Signal-to-noise ratio transformation
3 Kernel Minimum Noise Fraction
4 Experimental Results
5 Conclusions and Open questions
L. Gómez-Chova et al. Explicit Kernel Signal to Noise Ratio IGARSS 2011 – Vancouver 1/23
Motivation
Feature Extraction
Feature selection/extraction is essential before classification or regression:
  to discard redundant or noisy components
  to reduce the dimensionality of the data
Create a subset of new features by combinations of the existing ones
Linear Feature Extraction
Linear methods offer interpretability ∼ knowledge discovery:
  PCA: projections maximizing the variance of the data set
  PLS: projections maximally aligned with the labels
  ICA: non-orthogonal projections with maximally independent axes
Drawbacks
1 Most feature extractors disregard the noise characteristics!
2 Linear methods fail when data distributions are curved (nonlinear relations)
Objectives
Objectives
New nonlinear kernel feature extraction method for remote sensing data
Extract features robust to data noise
Method
Based on the Minimum Noise Fraction (MNF) transformation
Explicit Kernel MNF (KMNF):
  Noise is explicitly estimated in the reproducing kernel Hilbert space
  Deals with non-linear relations between the noise and signal features jointly
  Reduces the number of free parameters in the formulation to one
Experiments
PCA, MNF, KPCA, and two versions of KMNF (implicit and explicit)
Test feature extractors for real hyperspectral image classification
Signal and noise
Signal vs noise
Signal: magnitude generated by an inaccessible system, $s_i$
Noise: magnitude generated by the medium corrupting the signal, $n_i$
Observation: signal corrupted by noise, $x_i$
Notation
Observations: $x_i \in \mathbb{R}^N$, $i = 1, \ldots, n$
Matrix notation: $X = [x_1, \ldots, x_n]^\top \in \mathbb{R}^{n \times N}$
Centered data sets: assume $X$ has zero mean
Empirical covariance matrix: $C_{xx} = \frac{1}{n} X^\top X$
Projection matrix: $U$ (size $N \times n_p$) → $X' = XU$ ($n_p$ extracted features)
Principal Component Analysis Transformation
Principal Component Analysis (PCA)
Find projections of $X = [x_1, \ldots, x_n]^\top$ maximizing the variance of the projected data $XU$
PCA: maximize: $\operatorname{Tr}\{(XU)^\top (XU)\} \propto \operatorname{Tr}\{U^\top C_{xx} U\}$
subject to: $U^\top U = I$
Including Lagrange multipliers $\lambda_i$, this is equivalent to the eigenproblem
$C_{xx} u_i = \lambda_i u_i$ → $C_{xx} U = U D$, with $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n_p})$
$u_i$ are the eigenvectors of $C_{xx}$ and they are orthonormal: $u_i^\top u_j = \delta_{ij}$
PCA limitations
1 It only rotates the axes to the directions of maximum variance of the data
2 It does not consider noise characteristics:
  Assumes noise variance is low → noise is expected in the last eigenvectors, with low eigenvalues
  Maximum variance directions may be affected by noise
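The PCA eigenproblem above can be sketched in a few lines of NumPy. This is only an illustration on synthetic data: the function name `pca_projection` and the random test matrix are assumptions made here, not part of the slides.

```python
import numpy as np

def pca_projection(X, n_components):
    """Return (U, eigvals): top eigenvectors/eigenvalues of C_xx for X (n x N)."""
    Xc = X - X.mean(axis=0)                  # center the data (zero-mean assumption)
    C_xx = Xc.T @ Xc / Xc.shape[0]           # empirical covariance, N x N
    eigvals, eigvecs = np.linalg.eigh(C_xx)  # symmetric eigenproblem, ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]

# Synthetic data, used only to exercise the function (hypothetical example).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
U, lam = pca_projection(X, n_components=2)
X_proj = (X - X.mean(axis=0)) @ U            # extracted features X' = XU
```

The variance of each column of `X_proj` equals the corresponding eigenvalue, which is why the ordering by eigenvalue says nothing about how much of that variance is noise.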
Minimum Noise Fraction Transformation
The SNR transformation
Find projections maximizing the ratio between signal and noise variances:
SNR: maximize: $\operatorname{Tr}\{(U^\top C_{nn} U)^{-1} (U^\top C_{ss} U)\}$
subject to: $U^\top C_{nn} U = I$
Unknown signal and noise covariance matrices $C_{ss}$ and $C_{nn}$
The MNF transformation
Assuming additive noise, $X = S + N$, orthogonal to the signal, $S^\top N = N^\top S = 0$, we have $C_{xx} = C_{ss} + C_{nn}$
Maximizing SNR is equivalent to minimizing NF = 1/(SNR+1):
MNF: maximize: $\operatorname{Tr}\{(U^\top C_{nn} U)^{-1} (U^\top C_{xx} U)\}$
subject to: $U^\top C_{nn} U = I$
This is equivalent to solving the generalized eigenproblem:
$C_{xx} u_i = \lambda_i C_{nn} u_i$ → $C_{xx} U = C_{nn} U D$
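As a rough sketch of the generalized eigenproblem, SciPy's `scipy.linalg.eigh(a, b)` solves $a v = \lambda b v$ and normalizes the eigenvectors so that $v^\top b v = I$, which matches the constraint on $U$. The function name `mnf` and the synthetic signal and noise are assumptions for illustration; in practice the noise matrix has to be estimated from the data.

```python
import numpy as np
from scipy.linalg import eigh

def mnf(X, N_est, n_components):
    """Solve C_xx u = lambda C_nn u and keep the directions with largest lambda."""
    n = X.shape[0]
    C_xx = X.T @ X / n              # covariance of the (centered) observations
    C_nn = N_est.T @ N_est / n      # covariance of the noise estimate
    lam, U = eigh(C_xx, C_nn)       # generalized symmetric eigenproblem, ascending
    order = np.argsort(lam)[::-1][:n_components]
    return U[:, order], lam[order]

# Synthetic low-rank signal plus additive noise (hypothetical example data).
rng = np.random.default_rng(0)
S = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 6))
N_true = 0.1 * rng.normal(size=(500, 6))
X = S + N_true
X = X - X.mean(axis=0)
U, lam = mnf(X, N_true, n_components=3)   # here the true noise stands in for an estimate
X_mnf = X @ U
```

Unlike PCA, the columns of `U` are not orthonormal in the usual sense; they are orthonormal with respect to the noise covariance.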
Minimum Noise Fraction Transformation
The MNF transformation
The Minimum Noise Fraction is equivalent to solving the generalized eigenproblem:
$C_{xx} u_i = \lambda_i C_{nn} u_i$ → $C_{xx} U = C_{nn} U D$
Since $U^\top C_{nn} U = I$, the eigenvalues $\lambda_i$ are the SNR+1 in the projected space
Need estimates of the data covariance $C_{xx} = \frac{1}{n} X^\top X$ and the noise covariance $C_{nn} \approx \frac{1}{n} N^\top N$
The noise covariance estimation
Noise estimate: difference between the actual value and a reference 'clean' value
$N = X - X_r$
$X_r$ is obtained from the neighborhood, assuming a spatially smoother signal than the noise
Assume stationary processes in the wide sense:
  Differentiation: $n_i \approx x_i - x_{i-1}$
  Smoothing filtering: $n_i \approx x_i - \frac{1}{M} \sum_{k=1}^{M} w_k x_{i-k}$
  Wiener estimates
  Wavelet domain estimates
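The two simplest noise estimates listed above can be sketched as follows for an image cube of shape (rows, cols, bands). The function names, the choice of differencing along image columns, and the use of an unweighted 4-connected mean as the smoothing filter are assumptions made for this illustration.

```python
import numpy as np

def noise_by_differencing(cube):
    """n_i ~ x_i - x_{i-1}: difference between horizontally adjacent pixels."""
    diff = cube[:, 1:, :] - cube[:, :-1, :]
    return diff.reshape(-1, cube.shape[2])       # one noise sample per row

def noise_by_local_mean(cube):
    """n_i ~ x_i - mean of the 4-connected neighbours (interior pixels only)."""
    center = cube[1:-1, 1:-1, :]
    neigh = (cube[:-2, 1:-1, :] + cube[2:, 1:-1, :]
             + cube[1:-1, :-2, :] + cube[1:-1, 2:, :]) / 4.0
    return (center - neigh).reshape(-1, cube.shape[2])

# A smooth ramp along the columns: the local-mean residual vanishes,
# while plain differencing returns the constant slope of the ramp.
ramp = np.tile(np.arange(12.0)[None, :, None], (10, 1, 4))
N_diff = noise_by_differencing(ramp)
N_mean = noise_by_local_mean(ramp)
```

The ramp example hints at the trade-off: differencing treats any spatial trend as noise, whereas a symmetric local average removes linear trends before the residual is taken.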
Kernel methods for non-linear feature extraction
Kernel methods
[Figure: mapping Φ from the input feature space to the kernel feature space]
1 Map the data to a high-dimensional feature space, $\mathcal{H}$ ($d_{\mathcal{H}} \to \infty$)
2 Solve a linear problem there
Kernel trick
No need to know the $d_{\mathcal{H}} \to \infty$ coordinates of each mapped sample $\phi(x_i)$
Kernel trick: "if an algorithm can be expressed in the form of dot products, its non-linear (kernel) version only needs the dot products among mapped samples, the so-called kernel function:"
$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$
Using this trick, we can implement K-PCA, K-PLS, K-ICA, etc.
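A small numerical check can make the kernel trick concrete. The sketch below uses a homogeneous degree-2 polynomial kernel, chosen here only because its feature map is finite and can be written out explicitly; the slides do not say which kernel was used. An RBF kernel, whose feature space is infinite-dimensional, is included as the kind of function one would typically evaluate in practice. All function names are illustrative.

```python
import numpy as np

def phi_poly2(x):
    """Explicit feature map of the kernel K(x, y) = (x . y)^2."""
    return np.outer(x, x).ravel()

def poly2_kernel(x, y):
    """Same dot product, computed without ever forming phi."""
    return float(np.dot(x, y)) ** 2

def rbf_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
explicit = float(np.dot(phi_poly2(x), phi_poly2(y)))   # dot product in feature space
implicit = poly2_kernel(x, y)                          # kernel function in input space

K = rbf_kernel(rng.normal(size=(5, 4)), rng.normal(size=(7, 4)), sigma=1.0)
```

`explicit` and `implicit` agree up to rounding, even though only the first one builds the 16-dimensional mapped vectors.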
Kernel Principal Component Analysis (KPCA)
Kernel Principal Component Analysis (KPCA)
Find projections maximizing the variance of the mapped data $\Phi = [\phi(x_1), \ldots, \phi(x_n)]^\top$
KPCA: maximize: $\operatorname{Tr}\{(\Phi U)^\top (\Phi U)\} = \operatorname{Tr}\{U^\top \Phi^\top \Phi U\}$
subject to: $U^\top U = I$
The covariance matrix $\Phi^\top \Phi$ is $d_{\mathcal{H}} \times d_{\mathcal{H}}$ and the projection matrix $U$ has $d_{\mathcal{H}}$ rows!!!
KPCA through kernel trick
Apply the representer theorem: $U = \Phi^\top A$, where $A = [\alpha_1, \ldots, \alpha_n]$
KPCA: maximize: $\operatorname{Tr}\{A^\top \Phi \Phi^\top \Phi \Phi^\top A\} = \operatorname{Tr}\{A^\top K K A\}$
subject to: $U^\top U = A^\top \Phi \Phi^\top A = A^\top K A = I$
Including Lagrange multipliers $\lambda_i$, this is equivalent to the eigenproblem
$K K \alpha_i = \lambda_i K \alpha_i$ → $K \alpha_i = \lambda_i \alpha_i$
Now matrix $A$ is $n \times n$!!! (eigendecomposition of $K$)
Projections are obtained as $\Phi U = \Phi \Phi^\top A = K A$
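The KPCA steps can be sketched as an eigendecomposition of the kernel matrix followed by a rescaling that enforces $A^\top K A = I$. The RBF kernel, the explicit kernel centering, and the names `center_kernel` and `kpca` are assumptions for this example; the slides only assume centered data.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def center_kernel(K):
    """Center the mapped data in feature space: K <- H K H."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def kpca(K, n_components):
    """Eigendecomposition K alpha = lambda alpha, scaled so that A^T K A = I."""
    lam, A = np.linalg.eigh(K)                    # ascending eigenvalues
    order = np.argsort(lam)[::-1][:n_components]
    lam, A = lam[order], A[:, order]
    A = A / np.sqrt(lam)                          # assumes the kept eigenvalues are > 0
    return K @ A, A, lam                          # projections Phi U = K A

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                      # synthetic samples (illustration only)
K = center_kernel(rbf_kernel(X, X, sigma=1.0))
Z, A, lam = kpca(K, n_components=3)
```

Only the $n \times n$ matrix `K` is ever decomposed, so the cost depends on the number of samples rather than on the dimension of the feature space.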
Kernel MNF Transformation
KMNF through kernel trick
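Find projections maximizing the SNR of the mapped data $[\phi(x_1), \ldots, \phi(x_n)]^\top$
Replace $X \in \mathbb{R}^{n \times N}$ with $\Phi \in \mathbb{R}^{n \times N_H}$
Replace $N \in \mathbb{R}^{n \times N}$ with $\Phi_n \in \mathbb{R}^{n \times N_G}$
$C_{xx} U = C_{nn} U D$ ⇒ $\Phi^\top \Phi U = \Phi_n^\top \Phi_n U D$
Not solvable: matrices $\Phi^\top \Phi$ and $\Phi_n^\top \Phi_n$ are $N_H \times N_H$ and $N_G \times N_G$
Left multiply both sides by $\Phi$, and use the representer theorem, $U = \Phi^\top A$:
$\Phi \Phi^\top \Phi \Phi^\top A = \Phi \Phi_n^\top \Phi_n \Phi^\top A D$ → $K_{xx} K_{xx} A = K_{xn} K_{xn}^\top A D$
Now matrix $A$ is $n \times n$!!! (eigendecomposition of $K_{xx}$ with respect to $K_{xn}$)
$K_{xx} = \Phi \Phi^\top$ is symmetric with elements $K(x_i, x_j)$
$K_{xn} = \Phi \Phi_n^\top = K_{nx}^\top$ is non-symmetric with elements $K(x_i, n_j)$
Easy and simple to program!
Potentially useful when signal and noise are nonlinearly related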
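A minimal sketch of the kernel generalized eigenproblem is given below. Several choices are assumptions made here and not taken from the slides: the RBF kernel, the small ridge added to the noise-side matrix so that it is positive definite for `scipy.linalg.eigh`, the omission of kernel centering, and the synthetic one-dimensional sequence whose noise is estimated by differencing in the input space.

```python
import numpy as np
from scipy.linalg import eigh

def rbf_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def noise_matrix(Kxn, ridge):
    """K_xn K_xn^T plus a ridge (an assumption, for numerical stability)."""
    B = Kxn @ Kxn.T
    B = (B + B.T) / 2.0                           # remove rounding asymmetry
    return B + ridge * np.trace(B) / B.shape[0] * np.eye(B.shape[0])

def kmnf(Kxx, Kxn, n_components, ridge=1e-3):
    """Solve K_xx K_xx A = (K_xn K_xn^T + ridge) A D, largest eigenvalues first."""
    M = Kxx @ Kxx
    M = (M + M.T) / 2.0
    lam, A = eigh(M, noise_matrix(Kxn, ridge))    # generalized symmetric eigenproblem
    order = np.argsort(lam)[::-1][:n_components]
    A = A[:, order]
    return Kxx @ A, A, lam[order]                 # projections K_xx A

# Synthetic smooth sequence plus noise; noise estimated by differencing.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 3.0, 40)
X = np.column_stack([np.sin(t), np.cos(t), t]) + 0.05 * rng.normal(size=(40, 3))
N_est = X[1:] - X[:-1]
Kxx = rbf_kernel(X, X, sigma=1.0)
Kxn = rbf_kernel(X, N_est, sigma=1.0)
Z, A, lam = kmnf(Kxx, Kxn, n_components=3)
```

The function only needs the two kernel matrices, so any way of building `Kxn` can be plugged in without changing the solver.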
SNR in Hilbert spaces
Implicit KMNF: noise estimate in the input space
Estimate the noise directly in the input space: $N = X - X_r$
Signal-to-noise kernel:
$K_{xn} = \Phi \Phi_n^\top$ → $K(x_i, n_j)$
with $\Phi_n^\top = [\phi(n_1), \ldots, \phi(n_n)]$
Kernels $K_{xx}$ and $K_{xn}$ deal with objects of different nature → 2 parameters
Two different kernel spaces → the eigenvalues no longer have the meaning of SNR
Explicit KMNF: noise estimate in the feature space
Estimate the noise explicitly in the Hilbert space: $\Phi_n = \Phi - \Phi_r$
Signal-to-noise kernel:
$K_{xn} = \Phi \Phi_n^\top = \Phi (\Phi - \Phi_r)^\top = \Phi \Phi^\top - \Phi \Phi_r^\top = K_{xx} - K_{xr}$
Again it is not symmetric: $K(x_i, r_j) \neq K(r_i, x_j)$
Advantage: same kernel parameter for $K_{xx}$ and $K_{xn}$
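The explicit signal-to-noise kernel identity can be checked numerically with a kernel whose feature map is finite. The degree-2 polynomial kernel and the choice of the previous sample as reference are assumptions made only so that the mapped data can be formed explicitly; they are not the kernel or reference used in the experiments.

```python
import numpy as np

def phi(X):
    """Explicit feature map of K(x, y) = (x . y)^2; row i holds phi(x_i)."""
    return np.einsum('ij,ik->ijk', X, X).reshape(X.shape[0], -1)

def kernel(X, Y):
    """Kernel matrix with elements (x_i . y_j)^2."""
    return (X @ Y.T) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
X_r = np.roll(X, 1, axis=0)            # hypothetical reference: the previous sample

Phi, Phi_r = phi(X), phi(X_r)
Phi_n = Phi - Phi_r                    # noise estimated in feature space
Kxn_explicit = Phi @ Phi_n.T           # Phi (Phi - Phi_r)^T
Kxn_kernel = kernel(X, X) - kernel(X, X_r)   # K_xx - K_xr, kernel evaluations only
```

Both matrices coincide up to rounding, and neither is symmetric, in line with the remark that $K(x_i, r_j) \neq K(r_i, x_j)$ in general.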
SNR in Hilbert spaces
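Explicit KMNF: nearest reference
Differentiation in feature space: $\phi_{n_i} \approx \phi(x_i) - \phi(x_{i,d})$
$(K_{xn})_{ij} \approx \langle \phi(x_i), \phi(x_j) - \phi(x_{j,d}) \rangle = K(x_i, x_j) - K(x_i, x_{j,d})$
[Figure: pixel $x$ and its nearest reference pixel $x_r$]
Explicit KMNF: averaged reference
Difference to a local average in feature space (e.g. 4-connected neighboring pixels): $\phi_{n_i} \approx \phi(x_i) - \frac{1}{D} \sum_{d=1}^{D} \phi(x_{i,d})$
$(K_{xn})_{ij} \approx \langle \phi(x_i), \phi(x_j) - \frac{1}{D} \sum_{d=1}^{D} \phi(x_{j,d}) \rangle = K(x_i, x_j) - \frac{1}{D} \sum_{d=1}^{D} K(x_i, x_{j,d})$
[Figure: pixel $x$ and its 4-connected reference pixels $x_r$]
Explicit KMNF: autoregression reference
Weight the relevance of each kernel in the summation:
$(K_{xn})_{ij} \approx \langle \phi(x_i), \phi(x_j) - \sum_{d=1}^{D} w_d \phi(x_{j,d}) \rangle = K(x_i, x_j) - \sum_{d=1}^{D} w_d K(x_i, x_{j,d})$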
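The three reference schemes differ only in the weights applied to the neighbour kernels, so one function can cover them. The sketch below assumes an RBF kernel, an image cube of shape (rows, cols, bands), and 4-connected neighbours of interior pixels; the names `interior_and_neighbours` and `explicit_kxn` are hypothetical.

```python
import numpy as np

def rbf_kernel(X, Y, sigma):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def interior_and_neighbours(cube):
    """Interior pixels as rows of X, plus their up/down/left/right neighbours."""
    bands = cube.shape[2]
    X = cube[1:-1, 1:-1, :].reshape(-1, bands)
    neighbours = [cube[:-2, 1:-1, :].reshape(-1, bands),   # up
                  cube[2:, 1:-1, :].reshape(-1, bands),    # down
                  cube[1:-1, :-2, :].reshape(-1, bands),   # left
                  cube[1:-1, 2:, :].reshape(-1, bands)]    # right
    return X, neighbours

def explicit_kxn(X, neighbours, sigma, weights=None):
    """(K_xn)_ij = K(x_i, x_j) - sum_d w_d K(x_i, x_{j,d}); default w_d = 1/D."""
    if weights is None:
        weights = np.full(len(neighbours), 1.0 / len(neighbours))
    Kxn = rbf_kernel(X, X, sigma)
    for w, X_d in zip(weights, neighbours):
        Kxn = Kxn - w * rbf_kernel(X, X_d, sigma)
    return Kxn

rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 7, 3))                  # synthetic image (illustration only)
X, neighbours = interior_and_neighbours(cube)
Kxn_avg = explicit_kxn(X, neighbours, sigma=1.0)                          # averaged
Kxn_nearest = explicit_kxn(X, neighbours, sigma=1.0, weights=[1, 0, 0, 0])  # nearest
```

Passing arbitrary `weights` gives the autoregression variant; how those weights would be fitted is not specified in the slides and is left open here.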
Experimental results
Data material
AVIRIS hyperspectral image (220 bands): Indian Pines test site
145 × 145 pixels, 16 crop type classes, 10366 labeled pixels
The 20 noisy bands in the water absorption region are intentionally kept
Experimental setup
PCA, MNF, KPCA, and two versions of KMNF (implicit and explicit)
The 220 bands transformed into a lower dimensional space of 18 features
Visual inspection: extracted features in descending order of relevance
[Figure: extracted features shown in groups 1–3, 4–6, 7–9, 10–12, 13–15 and 16–18 (columns) for PCA, MNF, KPCA, implicit KMNF and explicit KMNF (rows)]
Analysis of the eigenvalues: signal variance and SNR of transformed data
Analysis of the eigenvalues:
Signal variance of the transformed data for PCA
SNR of the transformed data for MNF and KMNF
[Figure: eigenvalue / SNR versus feature number for PCA, MNF and KMNF]
The proposed approach provides the highest SNR!
LDA classifier: land-cover classification accuracy
[Figure: kappa statistic (κ) versus number of features for PCA, MNF, KPCA, KMNF and iKMNF; left panel: original hyperspectral image; right panel: multiplicative random noise (10%)]
Best results: linear MNF and the proposed KMNF
The proposed KMNF method outperforms MNF when the image is corrupted with non-additive noise
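The accuracy measure on the vertical axis is the kappa statistic. Assuming this refers to Cohen's kappa as commonly used in land-cover classification, it can be computed from the confusion matrix as $\kappa = (p_o - p_e)/(1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance. The function name and the toy labels below are illustrative only and are unrelated to the reported results.

```python
import numpy as np

def cohen_kappa(y_true, y_pred, n_classes):
    """Cohen's kappa from integer class labels in {0, ..., n_classes - 1}."""
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1                                  # confusion matrix
    n = C.sum()
    p_o = np.trace(C) / n                             # observed agreement
    p_e = (C.sum(axis=1) @ C.sum(axis=0)) / n**2      # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

# Toy labels: three of four predictions are correct.
kappa = cohen_kappa([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
```

For these toy labels the observed agreement is 0.75 and the chance agreement is 0.5, so `kappa` is 0.5; kappa penalizes agreement that class frequencies alone would produce.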
LDA classifier: land-cover classification maps
[Figure: land-cover classification maps for LDA-MNF and LDA-KMNF]
Conclusions and open questions
Conclusions
Kernel method for nonlinear feature extraction maximizing the SNR
Good theoretical and practical properties for extracting noise-free features:
  Deals with non-linear relations between the noise and signal
  The only parameter is the width of the kernel
  Knowledge about noise can be encoded in the method
Simple optimization problem → eigendecomposition of the kernel matrix
Noise estimation in the kernel space with different levels of sophistication
Simple feature extraction toolbox (SIMFEAT) soon at http://isp.uv.es
Open questions and Future Work
Pre-images of transformed data in the input space
Learn kernel parameters in an automatic way
Test KMNF in more remote sensing applications: denoising, unmixing, ...