L. De Lathauwer
Block Component Analysis
A New Concept for Blind Source Separation
(Higher-Order Tensors and Blind Signal Separation)
Lieven De Lathauwer
KU Leuven
Belgium
[Selected slides]
Factor analysis and blind source separation
• Decompose a data matrix in meaningful rank-1 terms
$T = A \cdot B^T = a_1 b_1^T + \cdots + a_R b_R^T$
• Mixing vectors and sources
• Decomposition in rank-1 terms is not unique
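$T = A \cdot B^T = (AM) \cdot (M^{-1} B^T) = \tilde{A} \cdot \tilde{B}^T$ for any nonsingular $M$
$T = \tilde{a}_1 \tilde{b}_1^T + \cdots + \tilde{a}_R \tilde{b}_R^T$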
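The ambiguity is easy to check numerically. The following is a minimal NumPy sketch; the sizes, the seed, the particular matrix M and all variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)           # illustrative seed
I, J, R = 6, 5, 3                        # illustrative sizes

A = rng.standard_normal((I, R))          # mixing vectors as columns
B = rng.standard_normal((J, R))          # sources as columns
T = A @ B.T                              # T = A · B^T

# Any nonsingular R x R matrix will do; this one has determinant 3.
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
A_tilde = A @ M                          # A~ = A M
B_tilde = np.linalg.solve(M, B.T).T      # B~^T = M^{-1} B^T

# Same data matrix, but different rank-1 terms.
same_matrix = np.allclose(T, A_tilde @ B_tilde.T)
same_first_term = np.allclose(np.outer(A[:, 0], B[:, 0]),
                              np.outer(A_tilde[:, 0], B_tilde[:, 0]))
print(same_matrix, same_first_term)
```

Both factorizations reproduce T, yet their individual rank-1 terms differ, so the terms carry no meaning without extra constraints.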
Principal Component Analysis and Singular Value Decomposition
PCA, SVD: uniqueness thanks to orthogonality constraints
$T = U \cdot \Sigma \cdot V^T = \sum_r \sigma_r u_r v_r^T$
U, V orthogonal, Σ diagonal
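As a small illustration, the SVD can be written out as a sum of rank-1 terms with NumPy. The matrix size and seed below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)                       # illustrative seed
T = rng.standard_normal((6, 5))                      # illustrative data matrix

U, s, Vt = np.linalg.svd(T, full_matrices=False)     # T = U diag(s) V^T

# Sum of rank-1 terms sigma_r u_r v_r^T
T_sum = sum(s[r] * np.outer(U[:, r], Vt[r, :]) for r in range(len(s)))

# Orthogonality constraints that make the terms essentially unique
orthonormal_U = np.allclose(U.T @ U, np.eye(U.shape[1]))
orthonormal_V = np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0]))
print(np.allclose(T, T_sum), orthonormal_U, orthonormal_V)
```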
Motivating example: excitation-emission fluorescence in chemometrics
Matrix approach
row vector ∼ emission spectrum
column vector ∼ excitation spectrum
$T = a_1 b_1^T + \cdots + a_R b_R^T$
NMF not unique in general
Tensor solution: CP Analysis
Tensorization: one matrix → several matrices, stacked in tensor
row vector ∼ emission spectrum
column vector ∼ excitation spectrum
coefficients ∼ concentrations
$\mathcal{T} = a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$ ($\circ$: outer product)
[Smilde, Bro, Geladi ’04]
Tensor rank and Canonical Polyadic Decomposition
Rank: minimal number of rank-1 terms [Hitchcock, 1927]
Canonical Polyadic Decomposition (CPD): decomposition in minimal
number of rank-1 terms [Harshman ’70], [Carroll and Chang ’70]
$\mathcal{T} = a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$ ($\circ$: outer product)
• Unique under mild conditions on number of terms and differences
between terms
• Orthogonality not required
• Uniqueness in “underdetermined” case
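One classical way to see why the rank-1 terms of a tensor can be identified without orthogonality is to compare two frontal slices. The sketch below handles only an exact, noiseless, square case with hand-picked factor matrices (all values are illustrative assumptions). It is not the algorithm used in the slides and it does not cover the underdetermined case.

```python
import numpy as np

# Illustrative factor matrices (assumed values); A and B are nonsingular.
A = np.array([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
C = np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]])    # K = 2 slices, R = 3 terms

# T[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# Frontal slices: T_k = A diag(C[k, :]) B^T
T1, T2 = T[:, :, 0], T[:, :, 1]

# T1 T2^{-1} = A diag(C[0] / C[1]) A^{-1}, so its eigenvectors are the
# columns of A, up to scaling and permutation.
eigvals, eigvecs = np.linalg.eig(T1 @ np.linalg.inv(T2))
eigvals, eigvecs = np.real(eigvals), np.real(eigvecs)

def normalize(X):
    """Scale every column to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=0)

# |cosine| between every true column and every estimated column
match = np.abs(normalize(A).T @ normalize(eigvecs))
recovered = np.allclose(match.max(axis=1), 1.0)
print(recovered)
```

The mixing vectors are recovered although A is not orthogonal; the third mode supplies the extra information that a single matrix lacks.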
Tensor data:
• telecommunications
• higher-order statistics
• annotated graphs
• hyperlink data
• matrices (deliberately) measured under different conditions / at different
time instances / . . .
• matrices depending on parameter(s)
• . . .
Alternative representation: tensor diagonalization
Alternative representation: joint matrix diagonalization
Also underdetermined case
Motivating example: EEG
[Figure: multichannel EEG recording; channels T1, T2, P3, C3, F3, O1, T5, T3, F7, Fp1, Pz, Cz, Fz, P4, C4, F4, O2, T6, T4, F8, Fp2; horizontal axis: Time (sec), 0 to 10; amplitude scale: 236 µV]
Tensorization: biorthogonal wavelet
Components: eye blink and epileptic activity
[Figure, four panels: temporal atom of eye-blink activity and temporal atom of seizure activity (horizontal axis: Time (sec), 0 to 10); frequency distribution of eye-blink activity and frequency distribution of seizure activity (horizontal axis: Freq (Hz), 0 to 45)]
[De Vos et al., Neuroimage ’07], [Acar et al., Bioinformatics ’07]
Independent Component Analysis (ICA)
Model: X = AS
[Figure: source signals s1, s2, s3 and their mixture]
Sources statistically independent
ICA: basic equations
Model:
X = AS
Second order:
$C_X^{(2)} = E\{X X^T\} = A \cdot C_S^{(2)} \cdot A^T$
uncorrelated sources: $C_S^{(2)}$ is diagonal
$C_X^{(2)} = \sigma_{s_1}^2 a_1 a_1^T + \sigma_{s_2}^2 a_2 a_2^T + \cdots + \sigma_{s_R}^2 a_R a_R^T$
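The second-order relation can be checked on simulated data. This is a minimal sketch under assumed settings: Laplacian sources, a random mixing matrix, and sample averages in place of expectations; none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)        # illustrative seed
R, N = 3, 10000                       # illustrative: 3 sources, 10000 samples

S = rng.laplace(size=(R, N))          # independent zero-mean sources (assumed Laplacian)
A = rng.standard_normal((4, R))       # illustrative mixing matrix
X = A @ S                             # model X = A S

C_S = S @ S.T / N                     # sample estimate of E{S S^T}
C_X = X @ X.T / N                     # sample estimate of E{X X^T}

# For sample estimates the relation C_X = A C_S A^T holds exactly.
identity_holds = np.allclose(C_X, A @ C_S @ A.T)

# Keeping only the diagonal of C_S gives the sum of rank-1 terms
# sigma_r^2 a_r a_r^T, which matches C_X up to estimation error.
variances = np.diag(C_S)
C_X_rank1 = sum(variances[r] * np.outer(A[:, r], A[:, r]) for r in range(R))
rel_err = np.linalg.norm(C_X - C_X_rank1) / np.linalg.norm(C_X)
print(identity_holds, rel_err)
```

Note that this symmetric matrix decomposition alone does not fix A: any orthogonal rotation of the whitened mixing vectors fits equally well, which is why higher-order or time-lagged statistics are brought in.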
Higher order:
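$C_X^{(N)} = C_S^{(N)} \cdot_1 A \cdot_2 A \cdot_3 \cdots \cdot_N A$
independent sources: $C_S^{(N)}$ is diagonal
$C_X^{(N)} = c_{s_1}^{(N)} \, a_1 \circ a_1 \circ \cdots \circ a_1 + c_{s_2}^{(N)} \, a_2 \circ a_2 \circ \cdots \circ a_2 + \cdots + c_{s_R}^{(N)} \, a_R \circ a_R \circ \cdots \circ a_R$ ($N$ factors per term)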
Tensorization: decomposition data matrix → CPD cumulant tensor
[Comon ’94], [Cardoso ’93]
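For N = 3, the equivalence between the mode-n products with a diagonal source cumulant tensor and a symmetric CPD can be verified directly. The sketch below uses made-up cumulant values and a random mixing matrix (illustrative assumptions); it checks the algebraic identity only and does not estimate cumulants from data.

```python
import numpy as np

rng = np.random.default_rng(0)               # illustrative seed
R = 3
A = rng.standard_normal((4, R))              # illustrative mixing matrix
c = np.array([1.0, -0.5, 2.0])               # illustrative source cumulants

# Diagonal third-order source cumulant tensor
C_S = np.zeros((R, R, R))
C_S[np.arange(R), np.arange(R), np.arange(R)] = c

# C_X = C_S ·1 A ·2 A ·3 A
C_X = np.einsum('pqr,ip,jq,kr->ijk', C_S, A, A, A)

# The same tensor as a sum of symmetric rank-1 terms c_r a_r ∘ a_r ∘ a_r
C_X_cpd = np.einsum('r,ir,jr,kr->ijk', c, A, A, A)

print(np.allclose(C_X, C_X_cpd))
```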
ICA based on second-order statistics
Condition: sources mutually uncorrelated, but individually correlated in time
$C_X^{(2)}(\tau) = E\{X(t) X(t+\tau)^T\} = A \cdot C_S^{(2)}(\tau) \cdot A^T$
Tensorization: stack covariance matrices in 3rd-order tensor
[Belouchrani et al. ’97], [Yeredor ’02], . . .
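The stacking step can be sketched as follows. The helper `lagged_covariances`, the choice of lags and the test signals are hypothetical illustrations, not taken from the cited papers; expectations are replaced by sample averages.

```python
import numpy as np

def lagged_covariances(X, lags):
    """Stack sample estimates of E{X(t) X(t + tau)^T} along the third mode."""
    n_channels, n_samples = X.shape
    C = np.empty((n_channels, n_channels, len(lags)))
    for k, tau in enumerate(lags):
        C[:, :, k] = X[:, :n_samples - tau] @ X[:, tau:].T / (n_samples - tau)
    return C

rng = np.random.default_rng(0)        # illustrative seed
N = 2000
t = np.arange(N)
# Illustrative temporally correlated sources: two sinusoids and smoothed noise
S = np.vstack([np.sin(0.05 * t),
               np.sin(0.31 * t),
               np.convolve(rng.standard_normal(N + 4), np.ones(5) / 5, mode='valid')])
A = rng.standard_normal((4, 3))       # illustrative mixing matrix
X = A @ S

lags = [0, 1, 2, 5, 10]               # illustrative lags
C_X = lagged_covariances(X, lags)     # 4 x 4 x 5 tensor
C_S = lagged_covariances(S, lags)

# Every slice shares the same mixing matrix: C_X(tau) = A C_S(tau) A^T
slices_match = all(np.allclose(C_X[:, :, k], A @ C_S[:, :, k] @ A.T)
                   for k in range(len(lags)))
print(C_X.shape, slices_match)
```

When the sources are mutually uncorrelated, each C_S slice is close to diagonal, so the stacked tensor approximately follows a CPD in which A appears in the first two modes.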
Tensor rank and Canonical Polyadic Decomposition
Rank: minimal number of rank-1 terms [Hitchcock, 1927]
Canonical Polyadic Decomposition (CPD): decomposition in minimal
number of rank-1 terms
$\mathcal{T} = a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$
[Harshman ’70], [Carroll and Chang ’70]
Unique under mild conditions
Decomposition in rank-(L,L, 1) terms
$\mathcal{T} = (A_1 \cdot B_1^T) \circ c_1 + \cdots + (A_R \cdot B_R^T) \circ c_R$
Unique under mild conditions
[DL ’08]
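To make the structure concrete, the sketch below builds a tensor from rank-(L, L, 1) terms and inspects a few ranks. Sizes, seed and the rank threshold are illustrative assumptions; the code only constructs such a tensor and does not compute the decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)    # illustrative seed
I, J, K = 8, 7, 4                 # illustrative tensor size
R, L = 2, 3                       # illustrative: two terms of rank (L, L, 1)

A = [rng.standard_normal((I, L)) for _ in range(R)]
B = [rng.standard_normal((J, L)) for _ in range(R)]
c = [rng.standard_normal(K) for _ in range(R)]

# Each term is the outer product of a rank-L matrix A_r B_r^T and a vector c_r.
terms = [np.einsum('ij,k->ijk', A[r] @ B[r].T, c[r]) for r in range(R)]
T = sum(terms)

tol = 1e-9                        # assumed absolute threshold for numerical rank

# Every frontal slice of a single term is a multiple of A_r B_r^T: rank L.
slice_ranks = [int(np.linalg.matrix_rank(term[:, :, 0], tol=tol)) for term in terms]

# Mode-3 unfolding of T (K rows, I*J columns): rank R for generic factors.
mode3_rank = int(np.linalg.matrix_rank(T.reshape(I * J, K).T, tol=tol))
print(slice_ranks, mode3_rank)
```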
Decomposition in rank-(R1, R2, R3) terms
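$\mathcal{T} = \mathcal{D}_1 \cdot_1 A_1 \cdot_2 B_1 \cdot_3 C_1 + \cdots + \mathcal{D}_R \cdot_1 A_R \cdot_2 B_R \cdot_3 C_R$
where each $\mathcal{D}_r$ is an $R_1 \times R_2 \times R_3$ core tensor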
Unique under mild conditions
[DL ’08]
Alternative representation: tensor block diagonalization
Decomposition in rank-(R1, R2, •) terms
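$\mathcal{T} = \mathcal{D}_1 \cdot_1 A_1 \cdot_2 B_1 + \cdots + \mathcal{D}_R \cdot_1 A_R \cdot_2 B_R$
where each core tensor $\mathcal{D}_r$ has $R_1 \times R_2$ frontal slices and is unconstrained in the third mode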
[DL ’08]
Alternative representation: joint block diagonalization
Block Component Analysis
Demo
[Figure: six 3-D panels; axis ranges -2 to 2, -2 to 2 and -1 to 1]
Exponentials, sinusoids, polynomials, exponential polynomials
Principle: map every row of $T = A \cdot B^T$ to a Hankel matrix
Hankel matrices are often very ill-conditioned
Hankel matrices generated by exponential polynomials are exactly low-rank
[DL ’11]
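The low-rank property and the effect of Hankelizing a mixture can be illustrated numerically. In the sketch below, the helper `hankel_matrix`, the signals, the mixing matrix and the rank threshold are all illustrative assumptions; this is a demonstration of the principle, not the procedure of [DL ’11].

```python
import numpy as np

def hankel_matrix(x):
    """Square Hankel matrix H[i, j] = x[i + j] from a signal of odd length."""
    n = (len(x) + 1) // 2
    idx = np.arange(n)[:, None] + np.arange(n)[None, :]
    return np.asarray(x)[idx]

t = np.arange(11)                           # 11 samples -> 6 x 6 Hankel matrices

two_exponentials = 0.9 ** t + 0.5 ** t      # sum of two exponentials
exp_polynomial = t * 0.8 ** t               # degree-1 polynomial times exponential
sinusoid = np.cos(0.7 * t)                  # sum of two complex exponentials

tol = 1e-10   # explicit threshold, since Hankel matrices can be ill-conditioned
ranks = {name: int(np.linalg.matrix_rank(hankel_matrix(x), tol=tol))
         for name, x in [('two_exponentials', two_exponentials),
                         ('exp_polynomial', exp_polynomial),
                         ('sinusoid', sinusoid)]}

# Mixture: rows of T_mat = A · B^T, with the sources as columns of B.
B = np.column_stack([two_exponentials, exp_polynomial])
A = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]])    # illustrative mixing matrix
T_mat = A @ B.T

# Hankelization is linear, so the stacked Hankel matrices form a sum of
# terms H_r ∘ a_r, where H_r is the low-rank Hankel matrix of source r.
T_hankel = np.stack([hankel_matrix(row) for row in T_mat], axis=2)
T_terms = sum(np.einsum('ij,k->ijk', hankel_matrix(B[:, r]), A[:, r])
              for r in range(B.shape[1]))
print(ranks, np.allclose(T_hankel, T_terms))
```

Each of the three test signals yields a Hankel matrix of rank 2, and the tensor of stacked Hankel matrices splits into one low-rank block term per source.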
[Figure: numerical demo example]
theoretical values: (L1, L2) = (2, 3)
perfect separation: (L1, L2) = (2, 3), (3, 3), (2, 4), (3, 4), (4, 4)
good separation: (L1, L2) = (2, 2), (1, 2)
[Figure: six signal panels; horizontal axis 0 to 1]
501 samples, SNR = 5 dB
good separation: (L1, L2) = (1, 2), (2, 2), (2, 3)
theoretical values: L1 = 2, L2 = 251
[Figure: four signal panels; horizontal axis 0 to 500]
theoretical values: L1 = 2, L2 = 251
results: L1 = 2, L2 = 2, 3, . . . , 7
[Figure: four signal panels; horizontal axis 0 to 500]
Toy example: audio
[Figure: chirp (top) and train (bottom) signal, 31 samples]
[Figure: singular values of Hankel matrices generated by chirp (left) and train (right); top: 31 samples; bottom: 1000 samples]
L1 / L2 |  1   2   3   4   5   6   7
--------+---------------------------
1       | 20  48  49  37  20  15  15
2       | 48  47  49  48  44  17  16
3       | 49  49  49  47  23  20  19
4       | 37  48  47  47  47  20  18
5       | 20  44  23  47  45  29  16
6       | 15  17  20  20  29  25  33
7       | 15  16  19  18  16  33  24
mean SIR [dB] (Hankel, noiseless) (ICA: COM2: 15 dB, JADE: 14 dB)
L1 / L2 |  1   2   3   4   5   6   7
--------+---------------------------
1       | 49  47  49  51  51  19  13
2       | 47  47  50  49  51  38  22
3       | 49  50  49  48  49  47  45
4       | 51  49  48  47  48  46  44
5       | 51  51  49  48  48  46  44
6       | 19  38  47  46  46  46  47
7       | 13  22  45  44  44  47  44
median SIR [dB] (Hankel, noiseless) (ICA: COM2: 15 dB, JADE: 14 dB)
Results for noisy data:
[Figure: SIR [dB] versus SNR [dB] (0 to 35 dB) for BCA Hankel (L = 1, 2, 3, 4), BCA wavelet (L = 1, 2, 3, 4) and ICA COM2]
Foundation: BCA exploits low intrinsic dimensionality
intrinsic dimensionality ∼ multilinear rank
Related: Pareto analysis
compressed sensing
scientific computing
. . .
Tensorization: Hankel, wavelet, time-frequency, . . .
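The multilinear rank can be read off from the ranks of the unfoldings of a tensor. The following sketch contrasts a structured tensor with an unstructured one; the helper `multilinear_rank`, the sizes, the seed and the threshold are illustrative assumptions.

```python
import numpy as np

def multilinear_rank(T, tol=1e-9):
    """Ranks of the mode-n unfoldings of a tensor, one per mode."""
    ranks = []
    for mode in range(T.ndim):
        # Bring the chosen mode to the front and flatten the remaining modes.
        unfolding = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
        ranks.append(int(np.linalg.matrix_rank(unfolding, tol=tol)))
    return tuple(ranks)

rng = np.random.default_rng(0)                      # illustrative seed

# Structured tensor: a small 2 x 3 x 2 core expanded by factor matrices
core = rng.standard_normal((2, 3, 2))
U1 = rng.standard_normal((10, 2))
U2 = rng.standard_normal((9, 3))
U3 = rng.standard_normal((8, 2))
structured = np.einsum('pqr,ip,jq,kr->ijk', core, U1, U2, U3)

# Unstructured tensor of the same size: no low-dimensional structure
unstructured = rng.standard_normal((10, 9, 8))

print(multilinear_rank(structured), multilinear_rank(unstructured))
```

The structured tensor has multilinear rank (2, 3, 2) even though it has the same 10 x 9 x 8 size as the unstructured one, whose unfoldings all have full rank.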
[Figure: unstructured signal; horizontal axis 0 to 500]
Analogy:
CPD: splitting in “atoms” (pure frequencies)
$\mathcal{T} = a_1 \circ b_1 \circ c_1 + \cdots + a_R \circ b_R \circ c_R$
BTD: splitting in “molecules” (sounds)
$\mathcal{T} = (A_1 \cdot B_1^T) \circ c_1 + \cdots + (A_R \cdot B_R^T) \circ c_R$
Conclusion
• BCA: separation based on low intrinsic dimensionality
• Intrinsic dimensionality measured by (multilinear) rank
• Rank-1 hypothesis sometimes questionable
• Related to Pareto, compressed sensing, etc.
• Related to Sparse Component Analysis, etc.
• Tensorization: HOS, sets of matrices, Hankel, . . .
• Hankel: separation of exponential polynomials
• Low complexity variants of current tensorization-based schemes
• PCA, ICA, CPA, NMF, . . . : easier to use but assumptions should hold
• Constrained BCA: nonnegativity, sparsity, orthogonality, statistical
independence, . . .
Related work: CPA with orthogonality constraint [Sørensen, DL et al.]
CPA with independence constraint [De Vos, Van Huffel, DL]
Thanks: to Laurent Sorber for helping with figures
L. De Lathauwer, “Block Component Analysis, a New Concept for Blind
Source Separation,” in F. Theis, A. Cichocki, A. Yeredor, M. Zibulevsky
(Eds.): Latent Variable Analysis and Signal Separation, 10th International
Conference, LVA/ICA 2012, Tel-Aviv, Israel, March 2012, Proceedings,
LNCS 7191, Springer, Heidelberg, 2012, pp. 1-8.