Post on 28-Nov-2014
ESTIMATION OF HYPERSPECTRAL COVARIANCE MATRICES
Avishai Ben-David1 and Charles E. Davidson2
1Edgewood Chemical Biological Center, USA.
2Science and Technology Corporation, USA.
Outline
• Why are covariance matrices important?
• What is the difficulty in estimation?
• Our approach
• Example for hyperspectral detection
Why covariance matrices are important
• The covariance matrix C is the engine of most multivariate detection algorithms
• Examples:
Matched Filter: score = αᵀ·C⁻¹·t
Anomaly Detector: score = αᵀ·C⁻¹·α
α = measurement vector, t = target vector
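The two detector scores above can be sketched with NumPy on toy data (the target signature and measurement here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 5, 1000                          # spectral bands, background samples
Z = rng.normal(size=(p, n))             # background measurements under H0
Z = Z - Z.mean(axis=1, keepdims=True)   # center (mean subtraction)
C = Z @ Z.T / (n - 1)                   # sample covariance matrix

t = np.ones(p)                          # hypothetical target signature
alpha = t + rng.normal(size=p)          # a measurement containing the target

Cinv = np.linalg.inv(C)
mf_score = alpha @ Cinv @ t             # Matched Filter: alpha^T C^-1 t
ad_score = alpha @ Cinv @ alpha         # Anomaly Detector: alpha^T C^-1 alpha
```

Both scores hinge on C⁻¹, which is why a poorly estimated (ill-conditioned) C degrades the detectors.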
How do we compute C?
• z is a measurement vector with p spectral bands (i.e., z is a p-by-1 vector), measured when the target is absent (i.e., the H0 hypothesis)
• We acquire n z-vectors, construct a p-by-n matrix Z, and center it (mean subtracted): Z ← Z − E(Z)
• C = cov(Z) = E(ZZᵀ) = UΛUᵀ (C follows Wishart statistics), where Λ is the estimated eigenvalue matrix and U is the estimated eigenvector matrix, obtained from the SVD decomposition.
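The construction of C and its decomposition C = UΛUᵀ via the SVD of the centered data matrix can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 5, 200                               # p spectral bands, n measurements

Z = rng.normal(size=(p, n))                 # n background z-vectors (H0)
Z = Z - Z.mean(axis=1, keepdims=True)       # center: Z <- Z - E(Z)
C = Z @ Z.T / (n - 1)                       # C = cov(Z)

# Eigen-decomposition via SVD of the centered data matrix Z = U S V^T,
# so that C = U (S^2 / (n-1)) U^T = U Lambda U^T
U, s, _ = np.linalg.svd(Z, full_matrices=False)
Lam = np.diag(s**2 / (n - 1))               # estimated eigenvalue matrix Lambda
print(np.diag(Lam))                         # sampled eigenvalues, descending
```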
What is the difficulty in estimation?
• The problem is that there are not enough measurements of z-vectors (n is too small)
• Example: sampled eigenvalues from a sampled C (average of 1000 matrices)
• 5 spectral bands (p), i.e., C is a 5-by-5 matrix (very small) with true eigenvalues Λ = [1 2 3 4 5]
(a) n = 50 measurements, n/p = 10 (e.g., for p = 150, typical in hyperspectral, this would require n ≈ 1,500): Λ̂ = [0.9 1.8 2.8 4.0 5.6]
(b) n = 10 measurements, n/p = 2 (e.g., the RMB rule in radar): Λ̂ = [0.4 1.1 2.1 4.0 7.3]
(Reed, Mallett & Brennan, 1974: the average SNR loss for the matched filter is ×2)
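The eigenvalue spreading in the example above is easy to reproduce: average the sorted sample eigenvalues over many trials with a known diagonal population covariance (numbers will vary slightly with the random seed):

```python
import numpy as np

rng = np.random.default_rng(2)
true_eigs = np.array([1., 2., 3., 4., 5.])   # population eigenvalues
p = true_eigs.size
C_sqrt = np.diag(np.sqrt(true_eigs))         # square root of the true C

def mean_sampled_eigs(n, trials=1000):
    """Average sorted eigenvalues of the sample covariance over many trials."""
    acc = np.zeros(p)
    for _ in range(trials):
        Z = C_sqrt @ rng.normal(size=(p, n))
        Z = Z - Z.mean(axis=1, keepdims=True)
        C = Z @ Z.T / (n - 1)
        acc += np.sort(np.linalg.eigvalsh(C))
    return acc / trials

e50 = mean_sampled_eigs(50)   # n/p = 10: mildly spread around [1..5]
e10 = mean_sampled_eigs(10)   # n/p = 2: small eigenvalues biased low, large biased high
print(e50)
print(e10)
```

The smaller n/p is, the more the sampled spectrum spreads: the smallest eigenvalues are underestimated and the largest overestimated, exactly as in cases (a) and (b).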
Our solution (general overview)
• Objective: to find a simple transformation from the sampled eigenvalues Λx to the population (truth) eigenvalues ΛΩ:
Λ = f(Λx) ≈ ΛΩ
• The improved covariance matrix is computed as C = UΛUᵀ. We replace the sampled eigenvalues Λx with the improved estimate Λ, and use the sampled eigenvectors U (for lack of knowledge of the population eigenvectors).
• Our solution involves two steps.
The 1st step is interpreted as adding energy spectrally.
The 2nd step balances the energy in two big blocks: the small- and large-eigenvalue regions.
Thus, we “redistribute” energy among the eigenvalues.
• We use the theory of the statistical distribution of eigenvalues of Wishart matrices, bounds on the magnitude of the eigenvalues, and an energy-conservation constraint.
We view the sampled eigenvalues “as if” they can be represented by the diagonals of p block matrices, each obeying the Marcenko-Pastur law.
Sampled eigenvalues are “as if” sampled from the mode (i.e., the highest-probability location).
Sampled eigenvalues are “shifted” (1st step) toward the population eigenvalues.
We impose energy conservation (2nd step) on the solution, because the sum of eigenvalues (trace) is unbiased, i.e., trace(Λx) = trace(Λ).
The trace is the signal “energy” (total variation).
Our solution (detailed view): how simple is it?
Multiplication of 3 diagonal matrices:
Λ = f(Λx, n) = E·F·Λx
• Λx is the sampled eigenvalue matrix, Λx = eig(C)
• F = diag(F_i) shifts the sampled eigenvalues based on the Marcenko-Pastur mode, with F_i = 1/mode(σ² = 1; k_i = p_i/n) and apparent multiplicity p_i
• E = diag(E_large·I, E_small·I) is block-diagonal: it balances the energy between the large- and small-eigenvalue blocks so that the trace is conserved
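The structure Λ = E·F·Λx can be sketched as follows. This is one plausible reading of the slides, not the paper's exact recipe: the shift factor F_i = (1 + p_i/n)/(1 − p_i/n)² is the inverse Marcenko-Pastur mode for unit variance and ratio k_i = p_i/n, and E is taken here to restore the trace within each of the two blocks; the multiplicities, split index, and spectrum are illustrative:

```python
import numpy as np

def correct_eigs(lam_x, p_mult, n, t):
    """Sketch of lam = E * F * lam_x (eigenvalues sorted descending).

    lam_x  : sampled eigenvalues
    p_mult : apparent multiplicity p_i of each eigenvalue (assumed given)
    n      : number of measurements
    t      : index splitting the large block (first t) from the small block
    """
    k = p_mult / n
    F = (1 + k) / (1 - k)**2              # step 1: shift by the inverse MP mode
    lam_f = F * lam_x
    # step 2: rescale each block so the energy (trace) is conserved
    E_large = lam_x[:t].sum() / lam_f[:t].sum()
    E_small = lam_x[t:].sum() / lam_f[t:].sum()
    lam = lam_f.copy()
    lam[:t] *= E_large
    lam[t:] *= E_small
    return lam

lam_x = np.array([7.3, 4.0, 2.1, 1.1, 0.4])      # sampled spectrum (n/p = 2 example)
lam = correct_eigs(lam_x, p_mult=np.array([1, 2, 3, 3, 2]), n=10, t=2)
print(lam)
```

Because F differs per eigenvalue while E only rescales blocks, energy is redistributed among the eigenvalues, yet trace(Λ) = trace(Λx).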
Regularization aspect of the solution (enhanced stability)
• The solution is a nonlinear transformation of the sampled eigenvalues Λx
• We can also write the solution in the framework of traditional regularization:
Λ = E·F·Λx = Λx + Δ, with Δ = (E·F − I)·Λx
• Our correction Δ is potentially different for each eigenvalue (it is a single offset, εI, in traditional regularization).
• With our method the condition number of Λ improves (decreases), because the magnitudes of the small sampled eigenvalues tend to increase.
Thus, cond(Λ) < cond(Λx).
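The condition-number claim is easy to check numerically; here the "corrected" spectrum is illustrative (borrowed from the slides' n/p = 10 example), standing in for the method's output:

```python
import numpy as np

lam_x = np.array([7.3, 4.0, 2.1, 1.1, 0.4])   # sampled eigenvalues (descending)
lam   = np.array([5.6, 4.0, 2.8, 1.8, 0.9])   # improved estimate (illustrative)

def cond(v):
    """Condition number of a diagonal matrix with positive diagonal v."""
    return v.max() / v.min()

print(cond(lam_x), cond(lam))                 # the corrected spectrum is better conditioned
```

Raising the small eigenvalues is what shrinks the ratio max/min, and hence stabilizes C⁻¹ in the detectors.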
Eigenvalue estimation for a diagonal matrix: the Marcenko-Pastur law
• C is a p-by-p diagonal matrix with C = σ²I (multiplicity of the p eigenvalues, each σ²), and the sampled covariance is Wishart distributed: Ĉ ~ n⁻¹·W_p(σ²I, n)
• The pdf of the sampled eigenvalues is known analytically.
• There is a relationship between the mode of the pdf and the true (population) eigenvalue; the mode is the ML position:
mode(s_x) = σ²(1 − k)²/(1 + k), k = p/n, so F = (1 + k)/(1 − k)²
• Based on the mode location, the sampled eigenvalue is shifted upward (step 1 of the process) toward the population value (the mean).
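The mode location can be checked by simulation: for population C = I the sampled eigenvalues follow the Marcenko-Pastur law, and a histogram of them peaks near (1 − k)²/(1 + k), below the population value of 1 (the histogram-based mode estimate is crude, and p is finite, so only approximate agreement is expected):

```python
import numpy as np

rng = np.random.default_rng(3)
p, n = 50, 100
k = p / n                                   # k = p/n = 0.5

eigs = []
for _ in range(200):
    Z = rng.normal(size=(p, n))             # population C = I (sigma^2 = 1)
    C = Z @ Z.T / n
    eigs.append(np.linalg.eigvalsh(C))
eigs = np.concatenate(eigs)

mode_theory = (1 - k)**2 / (1 + k)          # Marcenko-Pastur mode for sigma^2 = 1
hist, edges = np.histogram(eigs, bins=60)
i = hist.argmax()
mode_empirical = 0.5 * (edges[i] + edges[i + 1])
print(mode_theory, mode_empirical)
```

Since the mode sits below σ², dividing a sampled eigenvalue by (1 − k)²/(1 + k) shifts it upward, which is exactly the step-1 correction.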
Apparent multiplicity p_i for nondiagonal matrices
• We use the theory of bounds on the sampled eigenvalues:
a(i) ≤ s_i(Λx) ≤ b(i), with a(i) = s_i(Λx)·(1 − √k)², b(i) = s_i(Λx)·(1 + √k)², k = p/n
• We count the number p_i of overlapping eigenvalue intervals [a_i, b_i] for each sampled eigenvalue, and use p_i in the shift factor F_i = 1/mode(σ² = 1; k_i = p_i/n)
• Example: the multiplicity of the 4th eigenvalue is 3 (two neighbors, the 2nd & 3rd, plus itself)
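The multiplicity counting can be sketched as interval overlaps, taking the bounds a_i, b_i as the Marcenko-Pastur support edges scaled by each sampled eigenvalue (this bound form, and the example spectrum, are assumptions for illustration):

```python
import numpy as np

def apparent_multiplicity(lam_x, n):
    """Count overlapping intervals [a_i, b_i] around each sampled eigenvalue."""
    p = lam_x.size
    k = p / n
    a = lam_x * (1 - np.sqrt(k))**2    # assumed lower bound (MP support edge)
    b = lam_x * (1 + np.sqrt(k))**2    # assumed upper bound (MP support edge)
    # intervals i and j overlap iff a_i <= b_j and b_i >= a_j (each overlaps itself)
    return np.array([np.sum((a <= b[j]) & (b >= a[j])) for j in range(p)])

lam_x = np.array([10.0, 5.0, 1.2, 1.0, 0.9])   # illustrative sampled spectrum
print(apparent_multiplicity(lam_x, n=500))
```

Here the three nearly equal eigenvalues at the bottom of the spectrum each get multiplicity 3, while the two well-separated large eigenvalues get multiplicity 1.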
Examples
1. Simulations with many analytical functions & statistics for the population eigenvalues (normal, uniform, Gamma)
2. Field data: hyperspectral sensors SEBASS & TELOPS
Figures of merit — ratio of improvement of the solution over the data:
• Re = residual
• RA = area
• Rcond = condition #
• Rd = distance in probability
All figures of merit are greater than 1; hence, our solution improves over the data.
[Figure: eigenvalue spectra — truth, solution, and data — for SEBASS field data, n/p = 2, p = 115, and for a second data set, n/p = 2, p = 135]
Probability density functions for TELOPS measurements, for selected eigenvalues:
• Drastic improvement: panels 3, 4, 6, 7, 8, 9 (eigenvalues # 30, 40, 80, 100, 120, 135)
• No difference: panels 1, 5 (eigenvalues # 1, 50)
• Failure: panel 2 (eigenvalue # 10)
Application to Hyperspectral Detection
• Matched Filter: score = αᵀ·C⁻¹·t
α = measurement vector, t = target vector
• Random target directions; four cases compared: data, solution, truth (known population), and clairvoyant (known population and directions)
• From data: Pd < 50%
• With our solution: Pd > 60%
• With known eigenvalues (& sampled eigenvectors): Pd > 65%
• With known covariance (true eigenvalues & eigenvectors): Pd > 80%
Summary
• We presented a method to estimate the eigenvalues of a sampled covariance matrix (Wishart distributed) from few samples.
• The method is practical, quick, and simple to implement: a multiplication of three diagonal matrices.
• The method achieves two objectives: improved estimation of the eigenvalues & an improved condition number (i.e., regularization).
• With the method we improve detection (the ROC curve).