Geometry and Invariance in Signal Processing
Steven T. Smith*
*MIT Lincoln Laboratory, Lexington, MA 02420; [email protected]. This work was sponsored by the United States Air Force under Air Force contract F19628-00-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the author and are not necessarily endorsed by the United States Government.
Covariance Matrix Estimation Bounds
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
Applications That Use Covariance Matrix Estimation

Air and Ground Surveillance
• Space-time adaptive processing
• SAR/GMTI
• Tracking
Algorithms and systems analysis for detection, location, and classification of difficult signals all rely on covariance-based methods.
Signals Intelligence
• Spectral analysis
• Superresolution

Robust Navigation
• Adaptive beamforming

Undersea Surveillance
• Adaptive beamforming
• Spectral analysis
• Tracking

Advanced Communications
• Adaptive beamforming
• Spectral analysis
• Speech
What’s Known About Covariance Matrix Estimation Quality?
• The sample covariance matrix (SCM) is the maximum-likelihood covariance estimate
– The SCM looks like: R̂ = K⁻¹XXᴴ (X is the N-by-K “data matrix”)
 The “sample support” is K samples
– The SCM is unbiased: E[R̂] = R
– The SCM is “efficient”: Cov(R̂ − R) is as small as possible
– The SCM is a lousy estimate at low sample support and low SNR
 Subspace and ad hoc methods like “diagonal loading” are necessary
[Figure: Reed-Mallett-Brennan-Kelly-Boroson detection loss; loss (dB, 0 to 6) vs sample support / N (1 to 20); average SINR loss (K−N+2)/(K+1), about 3 dB at K = 2N.]

[Figure: SCM eigenvalues (the “deformed quarter-circle law”); eigenvalue (dB, −12 to 6) vs index for K = 2N.]
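To make the low-sample-support effect concrete, here is a minimal NumPy sketch (an added illustration, not from the original deck) that forms the SCM at K = 2N and prints its eigenvalue spread; the true covariance is the identity, so every exact eigenvalue is 1:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 6, 12                                  # dimension N, sample support K = 2N
# Complex Gaussian data matrix X (N-by-K), true covariance R = I:
X = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
R_hat = X @ X.conj().T / K                    # sample covariance matrix
# All true eigenvalues are 1, but the SCM eigenvalues spread widely:
print(np.sort(np.linalg.eigvalsh(R_hat)))
```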
Geometry is the Foundation of Signal Processing
Signal processing steps: Physical Modeling → Measurements → Filtering + Adaptation → Detection → Estimation → Tracking

Geometric objects arise at every step:
• Covariance matrices: Hermitian positive definite
• Signal subspaces: Euclidean space, Grassmann manifold, Stiefel manifold
• Scaling: magnitude, phase
• Invariance testing
• Statistical models ƒ(z|θ): parameter space, Cramér-Rao bounds
• Spectral estimation: array manifolds
What’s the Average Value of a Circle? (Uniform Distribution)

• θ uniformly distributed on the circle S¹: does E[θ] = 0?
• What about this circle? Or this one? Should the embedding matter?
• Shouldn’t E[θ] live on the circle? Yes! But where?
Some Literature (Very Partial List)
• Array manifolds / Spectral estimation– Schmidt ’79, Roy & Kailath ’89, Swindlehurst et al. ’92
• Maximum likelihood (geometric approach)– Amari ’85, Douglas & Amari ’97
• Phase-only nulling– Baird & Rassweiler ’76, Steyskal ’83, Hirasawa ’88, Smith ’94
• Structured covariance matrix estimation– Burg, Luenberger & Wenger ’82, Fuhrmann & Barton ’90, Barton & Smith ’96
• Fourier analysis on homogeneous spaces (spheres etc.)– Healy ’95, Rockmore ’95
• Invariant hypothesis testing / Detection– Fisher ’53, Kay & Scharf ’84, Bose ’94, Scharf ’95
• Ambiguity functions– Rendas & Moura ’98
• Estimation bounds– Rao ’45, Gorman & Hero ’90, Smith ’00, ’05, Bhattacharya and Patrangenaru ’02
• Communications– Douglas ’00, Rahbar ’01, Zheng & Tse ’02, Cichocki, Amari & Georgiev ’02, Xavier ’02
• Pose estimation– Srivastava & Grenander ’99, Ma et al. ’01, Adler et al. ’02, Srivastava & Klassen ’02
Proving Wegener’s Theory of Continental Drift
[Figure credits: Margaret Hanson (U Cincinnati); Gary Glatzmaier (UCSC); www.ucmp.berkeley.edu/geology/tectonics.html; www.itis-molinari.mi.it/Boundaries.html, based on Vine (1966). Panels show plate reconstructions at 730 MYA and 65 MYA.]

• Do magnetic polarities here and here have the same statistical distribution?
– “Dispersion on a sphere” (Fisher, 1953)
• Fisher’s famous paper actually analyzed data from Iceland
Outline
• Introduction
• Geometric background
– Spheres, subspaces, and covariance matrices
Non-Euclidean examples in signal processing
– Manifolds, derivatives, geodesics
• Nonlinear estimation theory
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
What Is the Average of Two Points on a Circle?
• Average(θ₁, θ₂) = w₁·θ₁ + w₂·θ₂
– What does multiplication by a weight mean?
– What does addition mean?
• These operations only make sense if the circle “lives” in some Euclidean space
• Are there “intrinsic” or “natural” equivalents of these ideas so that all operations take place on the circle?
– w₁·θ₁ = some other point on the circle
– θ₁ + θ₂ = some other point on the circle
• Same questions for spheres, n-spheres
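A minimal sketch of the intrinsic (“Karcher”) average on the circle, assuming points are parametrized by angle; the helper names are illustrative:

```python
import numpy as np

def log_map(base, theta):
    """Inverse exponential map on the circle: signed arc from base to theta."""
    return np.angle(np.exp(1j * (theta - base)))

def exp_map(base, v):
    """Exponential map on the circle: move arc length v from base."""
    return np.angle(np.exp(1j * (base + v)))

def intrinsic_average(thetas, weights, iters=20):
    """Weighted intrinsic mean: iterate mu <- exp_mu(sum_i w_i exp_mu^-1(theta_i))."""
    mu = thetas[0]
    for _ in range(iters):
        mu = exp_map(mu, sum(w * log_map(mu, t) for w, t in zip(weights, thetas)))
    return mu

# The average of two points stays on the circle, unlike the chordal average:
print(np.degrees(intrinsic_average(np.radians([10.0, 170.0]), [0.5, 0.5])))  # 90.0
```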
What Is the Average of Two Subspaces?
• Average(Y₁, Y₂) = w₁·Y₁ + w₂·Y₂
– What do these operations mean?
– Intrinsic explanations required
– w₁·Y₁ = some other subspace
– Y₁ + Y₂ = some other subspace
• No obvious way to embed the space of subspaces Gn,p (the Grassmann manifold) in Euclidean space
– Y an n-by-p matrix with orthonormal columns, but only the column span matters: YA and Y have the same column span
– The n-by-n projection matrix YYᵀ
– Neither gives a way to compute w₁·Y₁ + w₂·Y₂

Different manifold, same questions.

[Figure: two subspaces Y₁ and Y₂ drawn in coordinates X₁, X₂, X₃.]
Covariance Matrix Estimation
• Sample covariance matrix (SCM): R̂ = K⁻¹XXᴴ (data matrix X)
• What’s the average value of the SCM?
– E[R̂] = ∫ R̂ ƒ(X|R) dX = R
– If w₁·R₁ + w₂·R₂ makes sense, then the integral makes sense
– May we treat covariance matrices as vectors?
• Question: What do you get when you subtract one covariance matrix from another?
• Answer: Not a covariance matrix!

  [2 0; 0 1] − [1 0; 0 2] = [1 0; 0 −1]

The covariance matrices are not a vector space.

[Figure: cone of Hermitian positive definite matrices containing R₁ and R₂. R is a covariance; so is αR, α > 0.]
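The example is one line of NumPy (added illustration):

```python
import numpy as np

R1 = np.diag([2.0, 1.0])        # a covariance matrix
R2 = np.diag([1.0, 2.0])        # another covariance matrix
D = R1 - R2
print(np.linalg.eigvalsh(D))    # [-1.  1.]: indefinite, so R1 - R2
                                # is not a covariance matrix
```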
Comparing Points on Manifolds
• Compare points using geodesic curves [the exponential map]:
– Equate points on the manifold with tangent vectors at θ
• Average(θ₁, θ₂) = exp_θ(w₁·exp_θ⁻¹θ₁ + w₂·exp_θ⁻¹θ₂)
– The intrinsic average “lives” on the manifold
• Estimator bias b(θ) depends upon the choice of geodesics:
  b(θ) = E[exp_θ⁻¹θ̂] = exp_θ⁻¹E[θ̂]

[Figure: parameter manifold with estimator θ̂, E[θ̂], geodesic curves, and the tangent plane at θ (exp_θ⁻¹).]
Natural Geodesics on Quotient Manifolds

Spheres: great circles
• Spheres = U(n)/U(n−1) = the part of U(n) that rotates the north pole
• Distances measured in radians

Subspace geodesics:
• Y(t) = YV cos(Σt)Vᴴ + U sin(Σt)Vᴴ, with UΣVᴴ the compact SVD of the tangent direction
• distance = 2-norm of acos(singular values)
• Subspaces = U(n)/(U(p) × U(n−p)) = the part of U(n) that doesn’t give in-plane or co-plane rotations

Covariance geodesics:
• R(t) = R^{1/2} expm(R^{−1/2} Dt R^{−1/2}) R^{1/2}
• distance = 2-norm of log(eigenvalues); distances measured in decibels
• Compare to flat geodesics R(t) = R + tD
• Covariance matrices = Gl(n,C)/U(n) = the Hermitian part of the matrix polar decomposition

These are all quotients of Lie groups.

[Figure: sphere with a great circle; subspaces Y₁, Y₂ in coordinates X₁, X₂, X₃; covariance cone with R₁, R₂.]
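A sketch of the covariance geodesic and natural distance (added illustration, assuming Hermitian positive definite inputs); scipy.linalg.eigvalsh(A, B) solves the generalized eigenproblem used by the distance:

```python
import numpy as np
from scipy.linalg import expm, sqrtm, eigvalsh

def cov_geodesic(R, D, t):
    """Natural geodesic R(t) = R^(1/2) expm(t R^(-1/2) D R^(-1/2)) R^(1/2)
    through the covariance R with Hermitian tangent direction D."""
    Rh = sqrtm(R)
    Rih = np.linalg.inv(Rh)
    return Rh @ expm(t * (Rih @ D @ Rih)) @ Rh

def cov_distance(Ra, Rb):
    """Natural distance: 2-norm of the log generalized eigenvalues of (Ra, Rb)."""
    return np.linalg.norm(np.log(eigvalsh(Ra, Rb)))
```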
Derivatives

• The differential
  (d/dt) ƒ(x(t)) = ∂ƒ/∂x₁ dx₁/dt + ∂ƒ/∂x₂ dx₂/dt + … = (∂ƒ/∂x)(dx/dt)
• The gradient
– The direction grad ƒ that solves the equation
  ⟨grad ƒ, dx/dt⟩ = (∂ƒ/∂x)(dx/dt), i.e., grad ƒ = G⁻¹(∂ƒ/∂x)ᵀ
– Same derivation as the Wiener filter equation w = R⁻¹v
• The Hessian / covariant differentiation
  (d²/dt²) ƒ(x(t)) = ∇²ƒ(dx/dt, dx/dt) (x(t) a geodesic)
  ∇²ƒ(∂/∂xᵢ, ∂/∂xⱼ) = ∂²ƒ/∂xᵢ∂xⱼ − ∑ₖ Γᵏᵢⱼ ∂ƒ/∂xₖ (“curvature” terms)

Think Cramér-Rao bound.

[Figure: curve x(t) on a manifold with coordinates x₁, x₂.]
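As a small added illustration with arbitrary numbers, the gradient with respect to a metric G solves G·grad ƒ = ∂ƒ/∂x, the same normal-equation structure as the Wiener filter w = R⁻¹v:

```python
import numpy as np

G = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # metric (e.g., a Fisher information matrix)
dfdx = np.array([1.0, -1.0])      # Euclidean partial derivatives of f
grad_f = np.linalg.solve(G, dfdx) # gradient w.r.t. G: solves G grad = df/dx
print(grad_f)
```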
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
– Intrinsic Cramér-Rao bounds
– Covariance matrix estimation
– Subspace estimation accuracy
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
The Fisher Information Matrix (1922)

The covariance of a Gaussian estimate is inversely proportional to the negative mean Hessian of the log-likelihood function.
– “On the mathematical foundations of theoretical statistics” (Fisher, 1922)
Intrinsic Cramér-Rao Lower Bound: Unbiased Euclidean Case

CRB looks like:  C ≥ G⁻¹
(error covariance ≥ inverse Fisher information matrix)

C ≥ beamwidth²/SNR – the inverse-FIM term

[Figure: parameter space with estimator θ̂, E[θ̂] = θ.]
Intrinsic Cramér-Rao Lower Bound: Biased Euclidean Case

CRB looks like:  C ≥ (I + ∂b/∂θ)G⁻¹(I + ∂b/∂θ)ᵀ
(G⁻¹ = inverse Fisher information matrix; ∂b/∂θ = derivative of the bias vector b)

C ≥ beamwidth²/SNR – the inverse-FIM + bias term

[Figure: parameter space with estimator θ̂, E[θ̂] = θ + b(θ).]
Intrinsic Cramér-Rao Lower Bound: Unbiased Riemannian Case

CRB looks like:
  C ≥ G⁻¹ − (1/3)(Rm(G⁻¹)G⁻¹ + G⁻¹Rm(G⁻¹))
(G⁻¹ = inverse Fisher information matrix; Rm = mean Riemannian curvature)

C ≥ (beamwidth²/SNR)(1 − beamwidth²·curvature/SNR) + O(SNR⁻³)

• Inverse FIM term – really care about this term
• Local curvature term – SNR⁻² term with Riemannian curvature; not sure that I care: an open question
• Higher-order terms – I know that I don’t care: the CRB is an asymptotic bound

[Figure: parameter manifold with estimator θ̂, E[θ̂] = θ.]
Intrinsic Cramér-Rao Lower Bound: Biased Riemannian Case

CRB looks like:
  C ≥ M_bG⁻¹M_bᵀ − (1/3)(Rm(C)M_bG⁻¹M_bᵀ + M_bG⁻¹M_bᵀRm(C))
  M_b = I − (1/3)‖b‖²K(b) + ∇b
(G⁻¹ = inverse Fisher information matrix; Rm = mean Riemannian curvature; K(b) = sectional curvature along the bias b and basis; ∇b = covariant differential of the bias vector field b)

C ≥ (beamwidth²/SNR)(1 − beamwidth²·curvature/SNR) + O(SNR⁻³)

• Inverse FIM + bias term – really care about this term
• Local curvature term – SNR⁻² term with Riemannian curvature; not sure that I care: an open question
• Higher-order terms – I know that I don’t care: the CRB is an asymptotic bound

[Figure: parameter manifold with estimator θ̂, E[θ̂], bias b(θ).]
Intrinsic Cramér-Rao Lower Bound

• The Cramér-Rao lower bound is the smallest possible covariance of the estimation error
• The CRB depends upon the choice of geodesics

Error covariance ≥ ƒ(inverse Fisher information matrix)

bias: b(θ) = E[exp_θ⁻¹θ̂] = exp_θ⁻¹E[θ̂]

[Figure: parameter manifold with estimator θ̂, E[θ̂], tangent vector exp_θ⁻¹θ̂, and the estimation error covariance.]
Cramér-Rao Bound in Four Easy Steps: A New Proof of the CRB [Euclidean Case]

Fact 1. ℓ = log ƒ(z|θ), ℓ′ = ∂ℓ/∂θ, G = E[ℓ′ᵀℓ′]
Lemma 1. E[ℓ′] = 0 (differentiate the equality ∫ ƒ(z|θ) dz = 1)
Fact 2. θ̂ a biased estimator of θ, E[θ̂] = θ + b(θ)
Lemma 2. E[(θ̂ − θ − b)ℓ′] = I + b′ (differentiate Fact 2)
Theorem. E[(θ̂ − θ − b)(θ̂ − θ − b)ᵀ] ≥ (I + b′)G⁻¹(I + b′)ᵀ
Proof. Consider the covariance of the random variable v = (θ̂ − θ − b) − (I + b′)G⁻¹ℓ′ᵀ. E[v] = 0 by Lemma 1 and Fact 2. By Lemma 2, E[vvᵀ] = E[(θ̂ − θ − b)(θ̂ − θ − b)ᵀ] − (I + b′)G⁻¹(I + b′)ᵀ ≥ 0. QED

“The new part”: the treatment of the bias b′.
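A quick Monte Carlo sanity check of the theorem in the simplest case (added sketch, toy numbers): the sample mean of Gaussian data is unbiased (b = 0) and sits exactly on the bound G⁻¹ = σ²/K:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, K, trials = 1.0, 2.0, 10, 200_000
z = rng.normal(theta, sigma, size=(trials, K))
theta_hat = z.mean(axis=1)          # unbiased, efficient estimator of theta
print(theta_hat.var())              # ~ 0.4 empirically
print(sigma**2 / K)                 # CRB: G^-1 = sigma^2 / K = 0.4
```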
Intrinsic Efficiency: Unbiased Euclidean Case

Consider the random tangent vector:
  v = θ̂ − θ − grad ℓ
(vector from θ to θ̂; grad ℓ = gradient (w.r.t. the FIM) of the log-likelihood ℓ = log ƒ)

Facts: E[v] = 0 and E[vvᵀ] ≥ 0 (this is the CRB)

Efficiency: the estimator achieves the CRB iff
  θ̂ = θ + grad ℓ
Intrinsic Efficiency: Biased Euclidean Case

Consider the random tangent vector:
  v = θ̂ − θ − b(θ) − (I + ∂b/∂θ)(grad ℓ)
(vector from θ to θ̂; b = bias vector; ∂b/∂θ = derivative of the bias vector; grad ℓ = gradient (w.r.t. the FIM) of the log-likelihood ℓ = log ƒ)

Facts: E[v] = 0 and E[vvᵀ] ≥ 0 (this is the CRB)

Efficiency: the estimator achieves the CRB iff
  θ̂ = θ + b + (I + ∂b/∂θ)(grad ℓ)
Intrinsic Efficiency: Unbiased Riemannian Case

Consider the random tangent vector:
  v = exp_θ⁻¹θ̂ − (I − (1/3)Rm(C))(grad ℓ)
(tangent from θ to θ̂; Rm(C) = mean Riemannian curvature of exp_θ⁻¹θ̂ and basis; grad ℓ = gradient (w.r.t. the FIM) of the log-likelihood ℓ = log ƒ)

Facts: E[v] = 0 and E[vvᵀ] ≥ 0 (this is the CRB)

Efficiency: the estimator achieves the CRB iff
  exp_θ⁻¹θ̂ = (I − (1/3)Rm(C))(grad ℓ)
Intrinsic Efficiency: Biased Riemannian Case

Consider the random tangent vector:
  v = exp_θ⁻¹θ̂ − b − (I − (1/3)‖b‖²K(b) − (1/3)Rm(C) + ∇b)(grad ℓ)
(tangent from θ to θ̂; b = bias vector; K(b) = sectional curvatures between b and basis; Rm(C) = mean Riemannian curvature of exp_θ⁻¹θ̂ and basis; ∇b = covariant differential of the bias vector field b; grad ℓ = gradient (w.r.t. the FIM) of the log-likelihood ℓ = log ƒ)

Facts: E[v] = 0 and E[vvᵀ] ≥ 0 (this is the CRB)

Efficiency: the estimator achieves the CRB iff
  exp_θ⁻¹θ̂ = b + (I − (1/3)‖b‖²K(b) − (1/3)Rm(C) + ∇b)(grad ℓ)
What Are the Sectional and Riemannian Curvatures? And Do They Matter?

• Bias term M_b depends upon ‖b‖²K(b)
– Small for biases small relative to (max |K|)^{−1/2}
• Covariance term Rm(C) equals the mean curvature E[R(exp_θ⁻¹θ̂, ·)·, exp_θ⁻¹θ̂]
– Small for errors small relative to (max |K|)^{−1/2}
 ≈ 9 dB for covariance matrices, 1 radian for subspaces

Riemannian curvature: parallel-translate Z around a small loop with sides tX, tY, −tX, −tY; then
  R(X,Y)Z = lim_{t→0} (Z − Z(t))/t²

Sectional curvature: compare the area A of a small geodesic disc of radius r with the flat area A₀ = πr²; then
  K = lim_{r→0} 12(A₀ − A)/(r²A₀)
  K(X∧Y) = ⟨R(X,Y)Y, X⟩ / ‖X∧Y‖²
   = tr([X,Y])²/4 ≤ 0 for covariances
   = −tr([X,Y])²/2 ≥ 0 for subspaces
Metric Structure of a Statistical Model: Fisher Information Metric/Matrix

What is the distance between two distributions ƒ(z|θ) and ƒ(z|θ+dθ)? What’s the length of ds?

Statistical model S = { ƒ(z|θ) : θ in U }, with parameter space θ in U.

With coordinates (Rao ’45; e.g., angle, Doppler, etc.)
• ds² = g₁₁ dθ¹dθ¹ + 2g₁₂ dθ¹dθ² + g₂₂ dθ²dθ² + …
• gᵢⱼ = E[∂ℓ/∂θⁱ ∂ℓ/∂θʲ], ℓ(θ) = log ƒ(z|θ) = log-likelihood
• G = [gᵢⱼ] = Fisher information matrix (FIM)

Without coordinates (covariance, subspaces, etc.)
• g = E[dℓ dℓ] = Fisher information metric
The Fisher Information Metric and the Hessian

Fisher information metric:
  g ≡ E[dℓ dℓ] = −E[∇²ℓ]

Why this fact is useful and important:
• Computationally convenient
– Second derivatives of many distributions are more tractable than squares of first derivatives
• Independent of curvature of the parameter space
– The Fisher information matrix is independent of the arbitrary choice of affine connection and/or distance metric (“curvature” terms) for the parameter space
 Recall that ∇²ƒ(∂/∂xᵢ, ∂/∂xⱼ) = ∂²ƒ/∂xᵢ∂xⱼ − ∑ₖ Γᵏᵢⱼ ∂ƒ/∂xₖ (“curvature” terms)
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
– Intrinsic Cramér-Rao bounds
– Covariance matrix estimation
– Subspace estimation accuracy
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
Surprising and Useful Result! SCM Is a Biased and Inefficient Estimator

• Sample covariance matrix (SCM): R̂ = K⁻¹XXᴴ (data matrix X)

Covariance matrices flat:
• Geodesics R(t) = R + t(R̂ − R)
• E_R[R̂] = exp_R ∫ (exp_R⁻¹R̂) ƒ(X|R) dX = R + ∫ (R̂ − R) ƒ(X|R) dX = R
• R̂ is an unbiased and efficient (i.e., achieves the CRB) estimate of R
• Doesn’t account for extra estimation loss at low sample support
No surprise here.

Covariance matrices curved:
• Geodesics R(t) = R^{1/2} e^{R^{−1/2}Dt R^{−1/2}} R^{1/2}
• E_R[R̂] = e^{−β(N,K)} R
• R̂ is a biased and inefficient (error larger than the CRB) estimate of R
• The bias term β(N,K) corresponds to extra estimation loss at low sample support
Completely unexpected!

[Figure: cone of covariance matrices with R₁, R₂.]
Sample Covariance Matrix Estimation: Covariance RMSE vs Sample Support

An estimator θ̂ of θ is efficient (neglecting Riemannian curvature) iff:
  exp_θ⁻¹θ̂ = b(θ) + (I − (1/3)‖b‖²K(b) + ∇b) grad ℓ

Flat efficiency:
  θ̂ = θ + b(θ) + (I + ∂b/∂θ)G⁻¹(∂ℓ/∂θ)ᵀ

Is there a more efficient covariance estimator at low sample support? There is ≈ 10 dB difference between the SCM and the bound.

[Figure: covariance RMSE (dB) vs sample support/N, 6-by-6 Hermitian example, 1000 Monte Carlo trials. Curves: SCM (natural metric), biased natural CRB, unbiased natural CRB, SCM (flat metric), flat unbiased CRB. Asymptotes: 10/log 10 · N/√K dB and 10/log 10 · ƒ(R)/√K dB.]
CRBs for SCMs: Closed-Form Expressions

• Natural covariance metric/geodesics
– distance(R̂,R) = norm(log(eig(R̂,R)))
– mean-square distance(R̂,R) ≥ N²/K + N·β(N,K)²
  β(N,K) = N⁻¹(N·log K + N − ψ(K−N+1) + (K−N+1)ψ(K−N+2) + ψ(K+1) − (K+1)ψ(K+2)) [N-by-N Hermitian case]
  β(N,K) = log(K/2) − N⁻¹∑ᵢ ψ((K−i+1)/2) [N-by-N symmetric case]
  (ψ = digamma function = Γ′/Γ)
– Natural covariance metric invariant to R and complete
 A “whitened” covariance metric
• Flat covariance metric/geodesics
– distance(R̂,R) = norm(R̂ − R, ’fro’)
– mean-square distance(R̂,R) ≥ K⁻¹(∑ᵢ Rᵢᵢ² + 2∑ᵢ<ⱼ RᵢᵢRⱼⱼ) [Hermitian]
– mean-square distance(R̂,R) ≥ 2K⁻¹(∑ᵢ≤ⱼ Rᵢⱼ² + ∑ᵢ<ⱼ RᵢᵢRⱼⱼ) [symmetric]
– Flat covariance metric not invariant or complete
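These expressions are easy to evaluate numerically; a sketch using SciPy’s digamma, where beta_hermitian follows the Hermitian-case formula as reconstructed above (treat it as such):

```python
import numpy as np
from scipy.special import digamma
from scipy.linalg import eigvalsh

def beta_hermitian(N, K):
    """SCM bias term beta(N, K), N-by-N Hermitian case (see formula above)."""
    return (N * np.log(K) + N
            - digamma(K - N + 1) + (K - N + 1) * digamma(K - N + 2)
            + digamma(K + 1) - (K + 1) * digamma(K + 2)) / N

def natural_ms_crb(N, K):
    """Biased natural CRB on the mean-square natural distance."""
    return N**2 / K + N * beta_hermitian(N, K)**2

def natural_distance(R_hat, R):
    """distance(R_hat, R) = norm(log(eig(R_hat, R)))."""
    return np.linalg.norm(np.log(eigvalsh(R_hat, R)))
```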
Symmetric Space Geometry and SCM Bias

E_R[R̂] = e^{−β(N,K)}R ⇔ bias vector field B(R) = −β(N,K)R

Covariance matrices = Gl(n,C)/U(n) ≅ ℝ × Sl(n,C)/U(n)
(the Hermitian part of the matrix polar decomposition; a determinant part × the unit-determinant covariance matrices)

• Only the SCM’s determinant is biased
– Implies that the eigenvalues are biased
• The covariant differential and curvature of B vanish
– Because B(R) = −β(N,K)R
• Is there a connection with the symmetric space decomposition of the covariance matrices?
• Can something be said about the bias in other symmetric spaces, e.g., the Grassmann manifold?
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
– Intrinsic Cramér-Rao bounds
– Covariance matrix estimation
– Subspace estimation accuracy
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
Subspace Estimation Accuracy

• Standard subspace estimation method (SVD)
– [U,S,V] = svd(X,0) (Matlab notation)
– SVD-based subspace estimate: Yest = U(:,1:p)
• What is the error between this estimate and the truth?
– Natural subspace distance = √(∑ principal angles²):
  norm(acos(min(svd(orth(Yest)’*orth(Ytrue)),1)))
– Recall that the Fisher information metric is independent of this choice of error metric

[Figure: subspace distance between Ytrue and Yest.]
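A direct NumPy transcription of the Matlab expression above (helper name illustrative):

```python
import numpy as np

def subspace_distance(Y_est, Y_true):
    """Natural Grassmann distance: 2-norm of the principal angles
    between the column spans of Y_est and Y_true."""
    Q_est, _ = np.linalg.qr(Y_est)      # orth(Yest)
    Q_true, _ = np.linalg.qr(Y_true)    # orth(Ytrue)
    s = np.linalg.svd(Q_est.conj().T @ Q_true, compute_uv=False)
    return np.linalg.norm(np.arccos(np.minimum(s, 1.0)))
```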
Subspace Accuracy vs SNR and Sample Support

• The RMSE of the SVD-based estimator is a small constant fraction above the CRB for fixed sample support
• The RMSE of the SVD-based estimator approaches the CRB for large sample support
• SVD-based estimation provides nearly optimum performance
• Coordinate-free CRB analysis required for this conclusion

[Figure: subspace accuracy (rad) vs SNR (dB), 5-by-2 example, 10 snapshots, 1000 Monte Carlo trials: SVD-based subspace estimation vs the Cramér-Rao bound.]

[Figure: subspace accuracy (rad) vs sample support/2, 5-by-2 example, 21 dB SNR, 1000 Monte Carlo trials: SVD-based subspace estimation vs the Cramér-Rao bound. Asymptote: √(p(n−p))/√(K·SNR) rad (*white subspace covariance).]
Outline

• Introduction
• Geometric background
• Nonlinear estimation theory
• Nonlinear detection theory
– The detection problem
– Invariant analysis via fiber integration
– Mean and variance CFAR normalizers
• Geometric optimization and filtering theory
• Summary and conclusions
The Detection Problem: Unknown Background

• Detector design: threshold data to detect signals in the presence of noisy backgrounds with a low false alarm rate
• Problem: predict detection performance with unknown background mean and variance

[Figure: power (dB) vs time (µs), showing signal, noise, threshold, and a false alarm. (Almost) everything you need to know about the detection problem: the background mean and variance.]
Mean and Variance CFAR Normalizers

• The deflection ratio: x(z) = (z − μ̂)/σ̂
– Traditional method, the one most often encountered
– Minor nit: maps power-domain data onto the entire real line
 Easily handled with the remapping log(exp(z)+1) or (z + √(z²+1))/2
• The log-deflection ratio: x(z) = exp((log z − μ̂_log)/σ̂_log)
– Decibel-space version of the deflection ratio
– Power-domain method
• The log-Gamma CFAR method: x(z) = −log Γ(ν̂, ν̂·z/μ̂)/Γ(ν̂)
– Maps chi-squared backgrounds with ν̂ = μ̂²/σ̂² complex dof to exponential backgrounds (ν̂ = 1), with the SNR rescaled correspondingly
– Power-domain method
• Can compute closed-form statistics of all of these
– Fiber integration using the mapping x(z; η), η = (μ̂, σ̂)
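Hedged sketches of the three normalizers, assuming μ̂, σ̂ (and ν̂ = μ̂²/σ̂²) are estimated from training cells; gammaincc is SciPy’s regularized upper incomplete gamma:

```python
import numpy as np
from scipy.special import gammaincc

def deflection_ratio(z, mu, sigma):
    """x(z) = (z - mu)/sigma."""
    return (z - mu) / sigma

def log_deflection_ratio(z, mu_log, sigma_log):
    """Decibel-space deflection ratio mapped back to the power domain."""
    return np.exp((np.log(z) - mu_log) / sigma_log)

def log_gamma_cfar(z, mu, nu):
    """x(z) = -log Gamma(nu, nu z/mu)/Gamma(nu): exponential output
    for a chi-squared input with nu complex degrees of freedom."""
    return -np.log(gammaincc(nu, nu * z / mu))
```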
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
• Nonlinear detection theory
– The detection problem
– Invariant analysis via fiber integration
– Mean and variance CFAR normalizers
• Geometric optimization and filtering theory
• Summary and conclusions
Fiber Integration: Interpretation for CFAR Analysis

CFAR output X = function of the input data Z: x = X(Z).

  ƒ_X(x₀) = ∫_{x⁻¹(x₀)} ƒ_Z(ζ)/det(∂x/∂y) dy

The input data z yielding CFAR output x₀ form the “fiber” above x₀; fiber integration carries the input statistics over the fiber to the CFAR output statistics at x₀.

[Figure: mean-level CFAR example: radar data → matched filter → |·|² → tapped delay line with test cell → selection logic → ∑ → divide → CFAR output x; the fiber above x₀ in the input space.]
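Before carrying out the fiber integral analytically, the same output statistics can be estimated by brute force; a Monte Carlo sketch of the mean-level CFAR mapping X(Z), with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, trials = 2, 10, 200_000
# Background cells: chi-squared with N complex dof, unit mean = Gamma(N, 1/N).
train = rng.gamma(N, 1.0 / N, size=(trials, K)).mean(axis=1)  # noise estimate
test = rng.gamma(N, 1.0 / N, size=trials)                     # test cell
x = test / train                          # CFAR output X = X(Z)
print(np.quantile(x, [0.5, 0.9, 0.99]))   # empirical CFAR output statistics
```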
Joint pdf of Sample Mean and Variance

• First two sample moments: m₁ = (1/K)∑zₖ, m₂ = (1/K)∑zₖ²
– Sample mean and variance: μ̂ = m₁, σ̂² = m₂ − m₁²
– Mutually dependent random variables
– Will use this to our advantage to obtain finite integrals
• Apply fiber integration with the mapping (z₁, z₂, …, z_K) → (m₁, m₂)
– Results in an integral on the (K−1)-simplex (1/K)∑zₖ = 1
 Another application of invariance, to obtain a uniform density
– Easily performed by Monte Carlo integration for high relative accuracy

  ƒ(m₁, m₂) = [Kᴷ N^{NK} / (Γ(K)Γ(N)ᴷ)] · m₁^{NK−3} e^{−NKm₁} s′(m₂/m₁²)
  s(m₂) = (1/M)∑ ∏(zₖ)^{N−1} (Monte Carlo evaluation over the fiber)
Outline
• Introduction
• Geometric background
• Nonlinear estimation theory
• Nonlinear detection theory
– The detection problem
– Invariant analysis via fiber integration
– Mean and variance CFAR normalizers
 Deflection ratio: (z − μ̂)/σ̂
 Log-deflection ratio: (log z − μ̂_log)/σ̂_log
 Log-Gamma: −log Γ(ν̂, ν̂·z/μ̂)/Γ(ν̂)
• Geometric optimization and filtering theory
• Summary and conclusions
CFAR Distributions: Deflection Ratio and Log-Gamma

• Background statistics: chi-squared with N complex dof
• Sample support: K samples
• Signal model: nonfluctuating target with SNR S

Deflection ratio (m₂ = 1 + ν̂⁻¹, σ̂ = (m₂ − 1)^{1/2}):
  ƒ_dr(x) = [Kᴷ Γ(N(K+1)) / (Γ(K)Γ(N)ᴷ⁺¹)] ∫₁^{min(K, 1+1/x²)} e^{−S} (m₂−1)^{−1/2} (xσ̂+1)^{N−1} (xσ̂+1+K)^{−N(K+1)} ₁F₁(N(K+1); N; S(xσ̂+1)/(xσ̂+1+K)) s′(m₂) dm₂

Log-Gamma CFAR (m₂ = 1 + ν̂⁻¹; η defined by −log Γ(ν̂,η)/Γ(ν̂) = x):
  ƒ_lg(x) = [Kᴷ Γ(N(K+1)) / (Γ(K)Γ(N)ᴷ⁺¹)] ∫₁^K e^{−S−x} Γ(ν̂) η^{−NK} (1+N/η)^{−N(K+1)} ₁F₁(N(K+1); N; S/(1+N/η)) s′(m₂) dm₂
CFAR Density Example with Monte Carlo Comparisons

[Figure: pdf vs variate (0 to 150), log scale 10⁻⁸ to 10²; N = 2, K = 10, SNR = 9 dB. The exact deflection-ratio and log-Gamma densities overlay their Monte Carlo estimates.]
Receiver Operating Characteristics

[Figure: PD vs SNR (dB) for N = 1, K = 20, PFA = 10⁻⁴: matched filter, mean-level CFAR, deflection ratio, log-Gamma CFAR, log-deflection ratio.]

• Compare to mean-level CFAR (at PD = 50%)
– Extra CFAR loss for variance normalization
• Deflection ratio has 0.5 dB CFAR loss
• Log-Gamma has 1.2 dB CFAR loss
• Log-deflection ratio has 4 dB CFAR loss
• For sample support K = 10, these losses are 0.7 dB, 1.5 dB, and 8 dB
Outline

• Introduction
• Geometric background
• Nonlinear estimation theory
• Nonlinear detection theory
• Geometric optimization and filtering theory
• Summary and conclusions
Geometric Optimization

• Generalization of Euclidean optimization
• Reformulate classical algorithms
– Newton’s method
– Steepest descent
– Conjugate gradient method
• Perform optimization on the constraint surface: lines → geodesics, etc.

Benefits:
• Natural description of the problem
• Unifying viewpoint for algorithms
Caveats:
• Computationally infeasible in general
• Group invariance highly desirable

[Figure: Luenberger’s ant-on-surface analogy.]
Applied Optimization Problems

• Signal processing and detection
  H₀: z = n (interference+noise hypothesis)
  H₁: z = av + n (signal-plus-interference+noise hypothesis)
  R = covariance matrix = E[nnᴴ]
• Subspace tracking: maximize ƒ(Y) = tr YᴴRY such that YᴴY = I (Y n-by-p)
• Eigenvector tracking: maximize ƒ(Y) = tr YᴴRYN such that YᴴY = I
• Local density approximation of Schrödinger’s equation: maximize ƒ(Y) = tr YᴴHY + φ(Y) such that YᴴY = I (H = Hamiltonian)

Bradbury & Fletcher ’66, Alsén ’71, Ruhe ’74, Cullum ’78, Parlett et al. ’82, Comon & Golub ’90, Yang & Kaveh ’88, Yang ’95, Edelman, Arias & Smith ’98, many others
Common Structure Among Problems

• Constrained optimization
• Group invariance
– tr (YΘ)ᴴR(YΘ) = tr ΘᴴYᴴRYΘ = tr YᴴRYΘΘᴴ = tr YᴴRY for any unitary p-by-p matrix Θ (Y n-by-p)
• Pertinent fields of study
– Numerical linear algebra
– Optimization
– Differential and Riemannian geometry
– Lie groups and Lie algebras
– Homogeneous and symmetric spaces
– Adaptive filtering

Exploit the natural structure of the problem.
Newton’s Method

Quadratic convergence with Newton step = −(∇²ƒ)⁻¹(∇ƒ)

Euclidean: maximize ƒ(x), x in Rⁿ
Riemannian: maximize ƒ(x), x in M
Optimization on the Grassmann and Stiefel Manifolds

• Maximize ƒ(Y) = tr YᴴRY such that YᴴY = I (Y n-by-p)
– ƒ(YΘ) = ƒ(Y) for all unitary Θ
Newton’s Method on the Grassmann Manifold

Generalized Rayleigh quotient ƒ(Y) = tr YᴴRY, YᴴY = I.

Solve for H (Hessian of ƒ applied to H = −gradient of ƒ, with the tangent-vector constraint):
  (I − YYᴴ)RH − H(YᴴRY) = −(I − YYᴴ)RY;  YᴴH = 0

• A Sylvester equation for H
• Asymptotically equivalent to RQI (cubic convergence)
• Solution methods
– Direct O(n³p) method: uses Ritz vectors of Y; computationally unattractive
– Linear conjugate gradient: truncated Newton approach yields an O(n²p²) algorithm
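A sketch of the Newton step (added here; names are illustrative): parametrizing the tangent as H = QZ, with Q an orthonormal basis of the complement of span(Y), enforces YᴴH = 0 automatically and turns the constrained equation into an ordinary Sylvester equation:

```python
import numpy as np
from scipy.linalg import solve_sylvester, null_space

def grassmann_newton_step(R, Y):
    """Solve (I - YY^H) R H - H (Y^H R Y) = -(I - YY^H) R Y with Y^H H = 0.
    With H = Q Z this reduces to A Z + Z B = C; the solution is unique when
    the Ritz values of Y and the complementary spectrum are separated."""
    Q = null_space(Y.conj().T)          # n-by-(n-p) basis with Y^H Q = 0
    A = Q.conj().T @ R @ Q
    B = -(Y.conj().T @ R @ Y)
    C = -(Q.conj().T @ R @ Y)
    Z = solve_sylvester(A, B, C)        # solves A Z + Z B = C
    return Q @ Z                        # tangent H satisfying Y^H H = 0
```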
Numerical Results: Trace Maximization on G5,3
Interference Suppression: Rotating Phased Array Antenna

Problems:
• Maximize signal-to-interference-plus-noise ratio
• Track interference and/or signal subspace
Time Varying Adaptive Filter: Rotating Phased Array Antenna
The Subspace Tracking Equation

R(t) = covariance / Y(t) = principal invariant subspace.

Dynamics for H₀ = dY/dt (covariance dynamics drive the subspace dynamics, with the tangent-vector constraint):
  H₀(YᴴRY) − (I − YYᴴ)RH₀ = (I − YYᴴ)(dR/dt)Y;  YᴴH₀ = 0

• New solution “close” to old solution
• Approaches to subspace tracking:
– Rank-one updates versus full-rank updates:
  Rnew = Rold + xxᴴ versus Rnew = Rold + dR/dt
– Algebraic versus geometric approaches
 Algebraic (decomposition based): Fuhrmann ’88, Comon and Golub ’90, Yu ’91, Moonen, Van Dooren & Vandewalle ’92, Stewart ’92, Champagne ’94, Liu ’94
 Geometric (derivative based): Yang & Kaveh ’88, Yang ’95, Smith ’93, Smith ’96
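The tracking equation has the same Sylvester structure as the Newton step, with (I − YYᴴ)(dR/dt)Y on the right-hand side; a hedged sketch of one tracking step followed by a geodesic move (names illustrative):

```python
import numpy as np
from scipy.linalg import solve_sylvester, null_space

def subspace_track_step(R, dRdt, Y, dt):
    """Solve H0 (Y^H R Y) - (I - YY^H) R H0 = (I - YY^H)(dR/dt) Y for
    H0 = dY/dt with Y^H H0 = 0, then move along the geodesic
    Y(t) = Y V cos(St) V^H + U sin(St) V^H with tangent H0*dt."""
    Q = null_space(Y.conj().T)             # orthonormal complement of span(Y)
    A = -(Q.conj().T @ R @ Q)              # from the -(I - YY^H) R H0 term
    B = Y.conj().T @ R @ Y                 # from the H0 (Y^H R Y) term
    C = Q.conj().T @ dRdt @ Y              # reduced right-hand side
    H0 = Q @ solve_sylvester(A, B, C)      # A Z + Z B = C, then H0 = Q Z
    U, s, Vh = np.linalg.svd(H0 * dt, full_matrices=False)
    return (Y @ Vh.conj().T) @ np.diag(np.cos(s)) @ Vh \
           + U @ np.diag(np.sin(s)) @ Vh
```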
Summary and Conclusions
• Geometric invariance ubiquitous in signal processing
– Geometric properties can be exploited for solutions and insight
• The Cramér-Rao bound with bias is generalized to arbitrary manifolds without intrinsic (prescribed) coordinates
– Estimator bias and efficiency depend upon geometry
– SCM biased and inefficient from intrinsic perspective
 Bias term corresponds to known sample-support loss
• Derived formula bounding covariance and subspace accuracy
• Fiber integration applied to mean and variance CFAR analysis
– Previously unsolved performance analysis
– The deflection ratio outperforms the log-Gamma and log-deflection ratio tests (0.5 dB loss vs 1.2 dB vs 4 dB)
• Orthogonality constraints easily incorporated into many optimization problems (e.g., subspace tracking)
• Story incomplete—still the Age of Discovery
The End
Backups
Why Geometry for Signal Processing?
• The easy reasons
– Euclidean geometry and linear algebra used for just about everything in signal processing
Optimal Filtering theory / Least squares concepts
Detection and Estimation and Root-Mean-Square Error
Parseval’s theorem
• Abstract geometry is useful
– Use the right tool for the right job
 Many objects in signal processing are non-Euclidean: the space of covariance matrices, the space of subspaces
An accurate geometrical description is insightful
– Useful for solving new problems [e.g., mean and variance CFAR]
• Abstract geometry is beautiful
– Differential and Riemannian geometry / Lie groups and Lie algebras / Homogeneous spaces
The Intrinsic Cramér-Rao Bound

C ≥ G⁻¹ = E[dℓ dℓ]⁻¹ (unbiased, neglect curvature)
C ≥ (I + ∇b)G⁻¹(I + ∇b)ᵀ (biased, neglect curvature)
C ≥ M_bG⁻¹M_bᵀ + Rm(C) terms (biased, with curvature terms)

CRB assumptions:
• Parameter space an arbitrary manifold
• ℓ(θ) = log ƒ(z|θ) = log-likelihood function
• g = E[dℓ dℓ] = Fisher information metric
• Arbitrary coordinates θ = (θ¹, θ², …, θⁿ)
• G = [gᵢⱼ] = FIM w.r.t. arbitrary coordinates
• Any estimator θ̂ of θ with bias vector b(θ)
• Error covariance matrix C = E[(θ̂ − θ − b(θ))(θ̂ − θ − b(θ))ᵀ]
• Matrix inequality: A ≥ B iff A − B is positive semidefinite
Subspace Estimation

• Subspace estimation is frequently encountered in signal processing
• Consider the MIMO problem: z = An₁ + n₀
– Estimate the channel matrix A from several measurements of z
 The noisy inputs n₁ and n₀ are unknown
– Invariance: z is the same measurement under A → AM, n₁ → M⁻¹n₁ for an arbitrary invertible matrix M
• Only the column span (subspace) is unchanged by A → AM
– Can only measure the column span of A
• How does one make sense of the “error” or “bias” of the subspace estimate?
– The space of subspaces is non-Euclidean
Homogeneous Spaces, E.g.: Covariance Matrices

• Gl(n,C) = Lie group of complex invertible matrices (the set of allowable “directions”)
• U(n) = Lie group of unitary matrices: ΘᴴΘ = I (“directions” that don’t get me anywhere)
• Group action: R → MᴴRM (the rule for following “directions”)
– R is a covariance; so is MᴴRM:
  [1 2j; 0 3]ᴴ [1 0; 0 1] [1 2j; 0 3] = [1 2j; −2j 13]
• Identity matrix invariant to the group action by U(n): I → ΘᴴIΘ = I
  [c −s; s c]ᴴ [1 0; 0 1] [c −s; s c] = [1 0; 0 1]

Covariance matrices = Gl(n,C)/U(n) = the part of Gl(n,C) that doesn’t give invariance.

[Figure: cone of Hermitian positive definite matrices containing the identity matrix I.]
Space of Subspaces: The Grassmann Manifold

• Y = a p-dimensional subspace in Cⁿ, a point of Gn,p
• U(n) = Lie group of unitary matrices
• U(p) = Lie group of in-plane rotations
• U(n−p) = Lie group of co-plane rotations
• Group action: Y → ΘY (rotation)
• Subspace Y invariant to both in-plane and co-plane rotations

Subspaces = U(n)/(U(p) × U(n−p)) = the part of U(n) that doesn’t give in-plane or co-plane rotations.

[Figure: subspace Y in coordinates X₁, X₂, X₃.]
Riemannian Manifolds

• Manifold: a space that locally looks like Rⁿ or Cⁿ
• Riemannian manifold: a manifold with the distance metric
  ds² = g₁₁ dx₁dx₁ + 2g₁₂ dx₁dx₂ + g₂₂ dx₂dx₂ + … = dxᵀG dx = ⟨dx, dx⟩
• Examples
– Sphere: x² + y² + z² = 1; dimension = 3 − 1 = 2
– Orthogonal matrices: ΘᵀΘ = I; dimension = n² − (1/2)n(n+1) = (1/2)n(n−1)

[Figure: manifold with coordinates x₁, x₂; on a Riemannian manifold the length ds between x and x+dx is defined.]
Geodesics, a.k.a. the Exponential Map

• Geodesics are generalizations of straight lines
• They minimize the length between two points
• Geodesic equation (“exponential map”):
  d²xᵏ/dt² + ∑ᵢⱼ Γᵏᵢⱼ dxᵢ/dt dxⱼ/dt = 0 (Γ = “curvature” terms)
• Geodesics on homogeneous spaces very often may be expressed as matrix exponentials

Sphere: great circle

Covariance matrices:
  R(t) = R^{1/2} expm(R^{−1/2} Dt R^{−1/2}) R^{1/2}
  distance = 2-norm of log(eigenvalues)

Subspaces:
  Y(t) = YV cos(Σt)Vᴴ + U sin(Σt)Vᴴ
  distance = 2-norm of acos(singular values)
Subspace Estimation Accuracy Bounds

• Statistical model for subspace estimation (real case)
– Estimate Y given data x = Yn₁ + n₀
  E[n₁n₁ᴴ] = R₁ (unknown), E[n₀n₀ᴴ] = R₀ (known)
– X = (x₁, x₂, …, x_K) a matrix of iid snapshots, R̂ = K⁻¹XXᴴ
– Gaussian pdf ƒ(X|R₁, Y) = [exp(−(1/2)·tr R̂R₂⁻¹) / ((2π)ⁿ det R₂)^{1/2}]ᴷ; R₂ = YR₁Yᴴ + R₀
• Subspace estimation accuracy
– Estimate Y in the presence of unknown “nuisance” parameters R₁
– Subspace estimation Fisher information metric:
 Derivatives of the subspace Y look like matrices Δ such that YᴴΔ = 0
 Derivatives of the covariance R₁ are Hermitian matrices D
  g((D,Δ),(D,Δ)) = (1/2)·K tr((ΔR₁Yᴴ + YR₁Δᴴ + YDYᴴ)R₂⁻¹)²
– Derive accuracy bounds from the Fisher information metric/matrix
 See paper for details
Context of Mean and Variance CFAR Analysis

• “Deflection ratio” (z − μ̂)/σ̂
– Nuttall, “Operating characteristics of log-normalizer for Weibull and log-normal inputs,” NUSC TR 8075, 1987
 First closed-form analysis of mean and variance CFAR
 Assumes Gaussian power-domain statistics
 Independent sample mean and variance
– SAR speckle reduction
• Weibull clutter, many citations
• This talk
– Closed-form analysis (single finite integral)
– More-or-less arbitrary power-domain statistics
– More-or-less arbitrary form of CFAR normalizer
 E.g., the new log-Gamma CFAR: −log Γ(ν̂, ν̂·z/μ̂)/Γ(ν̂)
– Mutually dependent sample mean and variance
Detector Performance: Receiver Operating Characteristic (ROC) Curves

• PD depends upon the threshold and target SNR
• PFA depends upon the threshold
• ROC curves show the dependence (direct or implicit) of the PD on the target SNR and/or PFA
– PD vs PFA (fixed SNR, i.e., vary the threshold)
– PD vs SNR (fixed PFA, i.e., fix the threshold)
– SNR vs PFA (fixed PD, i.e., vary the threshold)

[Figure: probability density vs output value z, showing ƒ(z|no target present), ƒ(z|target present), the threshold, PFA, and PD.]

[Figure: PD (%) vs SNR (dB) at PFA = 10⁻⁶ for the matched filter (4 looks), Swerling II (4 looks), and Swerling I (4 looks).]
Constant False Alarm Rate (CFAR) Thresholding

• Problem: must know (or estimate) the noise floor to set the threshold
• Solution: estimate the noise floor using noise-only samples
– Adaptive thresholding
• CFAR thresholding: test cell / noise floor estimate > threshold
• The threshold depends upon the variance of the background

[Figure: power (dB) vs time (µs), showing the signal, noise floor, an absolute threshold, and a false alarm.]
CFAR Techniques
• Mean-Level CFAR
– Noise estimate = RMS of noise-only cells
– Optimum estimator under ideal assumptions
– Not robust to target in training cells, inhomogeneous clutter
• Greatest-Of Mean-Level CFAR
– Noise estimate = Greatest of left-hand and right-hand sides
– Robust to false alarms caused by clutter on either side of test cell
– Not robust to target in training cells
• Censored Greatest-Of Mean-Level CFAR
– Noise estimate = Remove M largest samples, then GO-MLCFAR
– Robust to M targets in training cells, inhomogeneous clutter
Mean Level CFAR Performance: Nonfluctuating Target

Closed-form analysis provides ROC curves of the mean-level CFAR.

  z = ∑ᵢ₌₁ᴺ |a + nᵢ|² / (K⁻¹∑ₖ₌₁ᴷ |nₖ|²)  (noncentral F-distribution)

  Kƒ(Kz) = C(N+K+1, N+1) zᴺ (1+z)^{−N−K} e^{−NS/(1+z)} ₁F₁(−K; N; −NSz/(1+z))

  PD(η, S) = (η/(K+η))^{N+K−1} ∑ₖ₌₀ᴷ C(N+K−1, N+k) (η/K)ᵏ G_{k+1}(NS/(1+η/K))

Nonfluctuating CFAR statistics, with Gₖ(z) ≡ γ(k,z)/Γ(k), threshold η, and SNR S; C(n,k) denotes the binomial coefficient.

[Figure: mean-level CFAR block diagram: radar data → matched filter → |·|² → tapped delay line with test cell → selection logic → compare → detection decision.]
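A Monte Carlo spot check of PD and PFA for the mean-level CFAR with N = 1 look (added sketch; the threshold value is arbitrary, chosen only for the demo):

```python
import numpy as np

rng = np.random.default_rng(2)
K, trials = 20, 500_000
S = 10 ** (12.0 / 10)                       # 12 dB nonfluctuating target
n = (rng.standard_normal(trials) + 1j * rng.standard_normal(trials)) / np.sqrt(2)
z_test = np.abs(np.sqrt(S) + n) ** 2        # |a + n|^2, unit-mean noise power
z_train = rng.exponential(1.0, (trials, K)).mean(axis=1)
z = z_test / z_train                        # mean-level CFAR statistic
eta = 9.0                                   # demo threshold
print("PD  ~", (z > eta).mean())
# Repeat with no target (S = 0) to estimate the corresponding PFA:
z0 = rng.exponential(1.0, trials) / rng.exponential(1.0, (trials, K)).mean(axis=1)
print("PFA ~", (z0 > eta).mean())
```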
CFAR Loss: Nonfluctuating Target

• CFAR loss is the extra SNR required to achieve the same PD at fixed PFA
• Smaller CFAR loss for higher sample support and number of looks

[Figure: PD (%) vs SNR (dB) at PFA = 10⁻⁶, nonfluctuating target, 4 looks, N = 1: noncoherent integration (NCI) versus mean-level CFAR with K = 1, 2, 5, 10, 48 samples per look.]

[Figure: CFAR loss (dB) relative to NCI vs CFAR sample support (per look), for PD = 95%, PFA = 10⁻⁶, 4 looks, nonfluctuating target.]
Fiber Integration*

*See the differential geometry literature, e.g., Abraham, Marsden & Ratiu.

• Given a joint pdf ƒ_Z(ζ), ζ is K-dimensional
– This will be either the joint background statistics, or the joint distribution of the sample mean and variance
• Given a mapping x(ζ), x is L-dimensional
• Want the pdf ƒ_X(x)
– This will be either the CFAR statistics, or the joint distribution of the sample mean and variance
• If L = K, standard change of variables: ƒ_X(x) = ƒ_Z(ζ)/det(∂x/∂ζ)
– Very boring
• If L < K, fiber integration
– Very cool: invariant to the choice of coordinates, inner product

  ƒ_X(x₀) = ∫_{x⁻¹(x₀)} ƒ_Z(ζ)/det(∂x/∂y) dy  or  ƒ_X(x₀) = ∫_{x⁻¹(x₀)} ƒ_Z(ζ)/‖∂x/∂ζ‖ dS
Geodesics on the Grassmann Manifold: Length-Minimizing Curves

Geodesic curve Y(θ) starting at Y in direction H:
  Y(θ) = (YV cos(Σθ) + U sin(Σθ))Vᴴ
  UΣVᴴ := H (compact SVD of H), YᴴY = I, YᴴH = 0

• Maintains the Grassmann constraint Y(θ)ᴴY(θ) = I
• Computational complexity O(np²)
• Approximation using the QRD of Y + θH possible
• Perform the optimization line search along Y(θ)
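A sketch of the geodesic formula in NumPy (function name illustrative); the constraint Y(θ)ᴴY(θ) = I can be verified numerically:

```python
import numpy as np

def grassmann_geodesic(Y, H, theta):
    """Y(theta) = (Y V cos(S theta) + U sin(S theta)) V^H, where
    U S V^H is the compact SVD of the tangent H (Y^H Y = I, Y^H H = 0)."""
    U, s, Vh = np.linalg.svd(H, full_matrices=False)
    return (Y @ Vh.conj().T) @ np.diag(np.cos(s * theta)) @ Vh \
           + U @ np.diag(np.sin(s * theta)) @ Vh
```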
RQI Is Cubically Convergent: Sketch of Proof

  r(θ) = λ₁cos²θ + λ₂sin²θ

• Newton’s method is cubically convergent on even functions
• RQI and Newton have the same order:
  Newton: θnext = θ − (1/2)tan(2θ) = −(4/3)θ³ + O(θ⁵)
  RQI: θnext = θ − tan⁻¹((1/2)tan(2θ)) = −θ³ + O(θ⁵)
The Subspace Tracking Problem

Problem: determine the principal invariant subspace of a time-varying covariance R(t).

• New solution “close” to old solution
• Approaches to subspace tracking:
– Rank-one updates versus full-rank updates:
  Rnew = Rold + xxᴴ versus Rnew = Rold + dR/dt
– Algebraic versus geometric approaches
 Algebraic (decomposition based): Fuhrmann ’88, Comon and Golub ’90, Yu ’91, Moonen, Van Dooren & Vandewalle ’92, Stewart ’92, Champagne ’94, Liu ’94
 Geometric (derivative based): Yang & Kaveh ’88, Yang ’95, Smith ’93, Smith ’96
– Exploitation of the time dynamics of R(t)
 No direct use of dR/dt before
 Many authors use the structure of the xxᴴ update