Non-Negative Blind Source Separation using Convex Analysis
Wing-Kin (Ken) Ma, The Chinese University of Hong Kong (CUHK)
Course on Convex Optimization for Wireless Comm. and Signal Proc., jointly taught by Daniel P. Palomar and Wing-Kin (Ken) Ma
National Chiao Tung Univ., Hsinchu, Taiwan
December 19-21, 2008
Acknowledgement: Tsung-Han Chan, Chong-Yung Chi, and Yue Wang
Blind source separation (BSS): Problem statement
Signal model: a real-valued, N-input, M-output linear mixing model:

$$x_i = \sum_{j=1}^{N} a_{ij} s_j, \quad i = 1, \ldots, M$$

where

$$x_i = \begin{bmatrix} x_i[1] \\ \vdots \\ x_i[L] \end{bmatrix}, \quad s_i = \begin{bmatrix} s_i[1] \\ \vdots \\ s_i[L] \end{bmatrix}$$

are the observation & true source vectors.
[Figure: mixing network — sources s_1, s_2 feed observations x_1, x_2, x_3 through the coefficients a_{11}, a_{12}, a_{21}, a_{22}, a_{31}, a_{32}.]
Problem: extract {s_1, ..., s_N} from {x_1, ..., x_M} without knowledge of the mixing matrix A = {a_{ij}}.
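As a toy illustration of the model (not from the slides; all sizes and names below are arbitrary), the mixing can be simulated in a few lines of NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

L, N, M = 1000, 2, 3               # data length, # sources, # observations (toy sizes)
S = rng.random((L, N))             # non-negative sources s_1, ..., s_N as columns
A = rng.random((M, N))             # unknown mixing matrix {a_ij}

X = S @ A.T                        # columns are x_i = sum_j a_ij s_j
```

Each observation X[:, i] is the linear combination sum_j a_ij S[:, j]; BSS asks us to recover the columns of S from X alone.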
W.-K. Ma 1
BSS: A biomedical imaging example
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) assessments of breast cancer, captured at different times. Courtesy of Yue Wang [Wang et al. 2003].
[Figure: time activity curves (TAC) over time for the fast flow, slow flow, and plasma compartments.]
Illustration of the source pattern mixing process. The signals represent a summation of vascular permeability with various diffusion rates. The goal is to separate the distributions of multiple biomarkers with the same diffusion rate.
BSS techniques
• A BSS approach is based on some assumptions on the characteristics of {s_1, ..., s_N} and/or A.

• There are two aspects in developing a BSS approach:

– a criterion established from the assumptions made, &
– optimization methods for fulfilling the criterion.

• The suitability of the assumptions (& the approach as a result) depends much on the applications under consideration.

Example: Independent component analysis (ICA), a well-known BSS technique, typically assumes that each s_i[n] is random, non-Gaussian, & mutually independent. Mutual independence is a good assumption in speech & wireless commun., but not so in hyperspectral imaging.
Non-negative blind source separation (nBSS)
• In some applications, source signals are non-negative by nature, e.g., in imaging.

• nBSS approaches exploit the signal non-negativity characteristic (plus some additional assumptions).

• Applications: biomedical imaging, hyperspectral imaging, & analytical chemistry.

• Some existing nBSS approaches:

– non-negative ICA (nICA) [Plumbley 2003]
– non-negative matrix factorization (NMF) [Lee-Seung 1999].
• nICA is a statistical approach adopting the mutual independence assumption.
• NMF is a deterministic approach that may cope with correlated sources.
• Essentially NMF deals with the optimization

$$\min_{S \in \mathbb{R}^{L \times N},\, A \in \mathbb{R}^{M \times N}} \; \| X - S A^T \|_F^2 \quad \text{s.t.} \;\; S \succeq 0, \; A \succeq 0 \;\; \text{(elementwise non-negative)}$$

where X = [x_1, ..., x_M] and S = [s_1, ..., s_N].

NMF may not be a unique factorization, however.
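As an illustrative sketch (not the benchmarked implementation), the classical Lee-Seung multiplicative updates for a factorization X ≈ S Aᵀ, with X = [x_1, ..., x_M] ∈ R^{L×M}, can be written as:

```python
import numpy as np

def nmf(X, N, iters=200, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates for min ||X - S A^T||_F^2, S, A >= 0.

    X: (L, M) non-negative data; N: number of sources. A toy sketch with
    arbitrary defaults, not the reference implementation from the slides."""
    rng = np.random.default_rng(seed)
    L, M = X.shape
    S = rng.random((L, N))
    A = rng.random((M, N))
    for _ in range(iters):
        # each update keeps its factor non-negative and does not increase the error
        S *= (X @ A) / (S @ A.T @ A + eps)
        A *= (X.T @ S) / (A @ S.T @ S + eps)
    return S, A
```

Because (S, A) and (SQ, A Q⁻ᵀ) give the same product for suitable Q, the factorization found this way is generally not unique — the non-uniqueness noted above.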
CAMNS:Convex analysis of mixtures of non-negative sources
• CAMNS [Chan-Ma-Chi-Wang 2008] is a deterministic nBSS approach.
• In addition to utilizing source non-negativity, CAMNS employs a special deterministic assumption called local dominance.

• What is local dominance? Intuitively, signals with many 'zeros' are likely to satisfy local dominance (mathematical definition available soon).
• Appears to be a good assumption for sparse or high-contrast images.
An intuitive illustration of how CAMNS works
[Figure: source images s_1, s_2, s_3 and their mixtures, the observations x_1, x_2, x_3.]
How can we extract {s1, . . . , sN} from {x1, . . . ,xM} without knowing {aij}?
An intuitive illustration of how CAMNS works (cont’d)
[Figure: the observations x_1, x_2, x_3 and the polyhedral set constructed from them.]
Based on some assumptions (e.g., signal non-negativity & local dominance) & by convex analysis, we use {x_1, ..., x_M} to construct a polyhedral set.
An intuitive illustration of how CAMNS works (cont’d)
[Figure: the polyhedral set with the sources s_1, s_2, s_3 at its 'corners'.]
We show that the 'corners' (formally speaking, extreme points) of this polyhedral set are exactly {s_1, ..., s_N} (rather surprisingly).
An intuitive illustration of how CAMNS works (cont’d)
Using LP, we can locate the 'corners' of the polyhedral set effectively. As a result, perfect separation can be achieved.
A quick review of some convex analysis concepts
Affine hull of a given set of vectors {s_1, ..., s_N} ⊂ R^L:

$$\operatorname{aff}\{s_1, \ldots, s_N\} = \left\{ x = \sum_{i=1}^{N} \theta_i s_i \;\middle|\; \theta \in \mathbb{R}^N, \; \sum_{i=1}^{N} \theta_i = 1 \right\}.$$
• An affine hull can always be represented by

$$\operatorname{aff}\{s_1, \ldots, s_N\} = \left\{ x = C\alpha + d \;\middle|\; \alpha \in \mathbb{R}^P \right\}$$

for some (non-unique) d ∈ R^L and C ∈ R^{L×P}, where P ≤ N − 1 is the affine dimension.

• If {s_1, ..., s_N} is affinely independent (i.e., {s_1 − s_N, ..., s_{N−1} − s_N} is linearly independent), then P = N − 1.
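As a toy numerical check (the helper name is illustrative, not from the slides), the affine dimension can be computed as the rank of the difference vectors:

```python
import numpy as np

def affine_dim(points):
    """Affine dimension of a point set: the rank of the differences
    s_1 - s_N, ..., s_{N-1} - s_N. `points` holds one vector per row."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:-1] - pts[-1]          # row i is s_i - s_N
    return np.linalg.matrix_rank(diffs)
```

For N affinely independent points the result is N − 1; for collinear points it drops to 1.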
Convex hull of a given set of vectors {s_1, ..., s_N} ⊂ R^L:

$$\operatorname{conv}\{s_1, \ldots, s_N\} = \left\{ x = \sum_{i=1}^{N} \theta_i s_i \;\middle|\; \theta \in \mathbb{R}^N_+, \; \sum_{i=1}^{N} \theta_i = 1 \right\}$$

• A point x ∈ conv{s_1, ..., s_N} is an extreme point of conv{s_1, ..., s_N} if x is not any nontrivial convex combination of {s_1, ..., s_N}.

• If {s_1, ..., s_N} is affinely independent, then {s_1, ..., s_N} is the set of all extreme points of its convex hull.
Example of 3-dimensional signal space geometry with N = 3. In this example, aff{s_1, s_2, s_3} is a plane passing through s_1, s_2, s_3, & conv{s_1, s_2, s_3} is a triangle with corners (extreme points) s_1, s_2, s_3.
The assumptions in CAMNS
Recall the model $x_i = \sum_{j=1}^{N} a_{ij} s_j$. Our assumptions:

(A1) Source non-negativity: For each j, s_j ∈ R^L_+.

(A2) Local dominance: For each i ∈ {1, ..., N}, there exists an (unknown) index ℓ_i such that s_i[ℓ_i] > 0 and s_j[ℓ_i] = 0, ∀j ≠ i.

(A reasonable assumption for sparse or high-contrast signals.)

(A3) Unit row sum: For all i = 1, ..., M, $\sum_{j=1}^{N} a_{ij} = 1$.

(Already satisfied in MRI; can be relaxed.)

(A4) M ≥ N and A is of full column rank. (Standard BSS assumption.)
How to enforce (A3), if it does not hold
The unit row sum assumption (A3) may be relaxed.
Suppose that $x_i^T 1 \neq 0$ (where 1 is an all-one vector) for all i.

Consider a normalized version of x_i:

$$\bar{x}_i = \frac{x_i}{x_i^T 1} = \sum_{j=1}^{N} \underbrace{\left( \frac{a_{ij}\, s_j^T 1}{x_i^T 1} \right)}_{\triangleq\, \bar{a}_{ij}} \underbrace{\left( \frac{s_j}{s_j^T 1} \right)}_{\triangleq\, \bar{s}_j}.$$

One can show that $(\bar{a}_{ij})$ satisfies (A3).
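The normalization is a one-liner in NumPy; the helper name and the toy verification below are illustrative, not from the slides:

```python
import numpy as np

def normalize_observations(X):
    """Rescale each observation x_i by x_i^T 1, so that the effective mixing
    coefficients a_ij (s_j^T 1) / (x_i^T 1) sum to 1 over j.
    X holds the observations as columns; assumes every column sum is nonzero."""
    return X / X.sum(axis=0, keepdims=True)
```

One can verify numerically that the effective mixing matrix has unit row sums, since $\sum_j a_{ij}\, s_j^T 1 = x_i^T 1$ by linearity of the model.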
CAMNS
Since $\sum_{j=1}^{N} a_{ij} = 1$ [(A3)], we have, for each observation,

$$x_i = \sum_{j=1}^{N} a_{ij} s_j \in \operatorname{aff}\{s_1, \ldots, s_N\}.$$

This implies

$$\operatorname{aff}\{s_1, \ldots, s_N\} \supseteq \operatorname{aff}\{x_1, \ldots, x_M\}.$$
In fact, we can show that
Lemma. Under (A3) and (A4), aff{s1, . . . , sN} = aff{x1, . . . ,xM}.
• Consider the representation

$$\operatorname{aff}\{s_1, \ldots, s_N\} = \operatorname{aff}\{x_1, \ldots, x_M\} = \left\{ x = C\alpha + d \;\middle|\; \alpha \in \mathbb{R}^{N-1} \right\} \triangleq \mathcal{A}(C, d)$$

for some (C, d) ∈ R^{L×(N−1)} × R^L with rank(C) = N − 1.
• Let us consider determining the source affine set parameters (C,d) from{x1, . . . ,xM}.
• The solution is simple for M = N:

$$d = x_N, \qquad C = [\, x_1 - x_N, \ldots, x_{N-1} - x_N \,].$$
• For M > N , we use an affine set fitting solution.
Affine set fitting problem:

$$(C, d) = \arg \min_{\substack{C,\, d \\ C^T C = I}} \; \sum_{i=1}^{M} \underbrace{\min_{x \in \mathcal{A}(C, d)} \| x - x_i \|_2^2}_{\text{proj. error of } x_i \text{ onto } \mathcal{A}(C, d)} \qquad (*)$$

where $\mathcal{A}(C, d) = \{\, x = C\alpha + d \mid \alpha \in \mathbb{R}^{N-1} \,\}$.

Proposition. Problem (∗) has a closed-form solution

$$d = \frac{1}{M} \sum_{i=1}^{M} x_i, \qquad C = [\, q_1(UU^T),\, q_2(UU^T),\, \ldots,\, q_{N-1}(UU^T) \,]$$

where $U = [\, x_1 - d, \ldots, x_M - d \,] \in \mathbb{R}^{L \times M}$, and $q_i(R)$ denotes the eigenvector associated with the ith principal eigenvalue of R.
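A minimal sketch of this closed-form solution (function name illustrative; the principal eigenvectors of U Uᵀ are taken from the SVD of U, not the authors' code):

```python
import numpy as np

def affine_set_fit(X, N):
    """Closed-form affine set fitting: d is the mean of the observations and
    C collects the N-1 principal eigenvectors of U U^T, obtained here as the
    leading left singular vectors of U. X is (L, M), observations as columns."""
    d = X.mean(axis=1)
    U = X - d[:, None]
    Q, _, _ = np.linalg.svd(U, full_matrices=False)   # left singular vectors
    return Q[:, :N - 1], d
```

When the observations lie exactly on an (N − 1)-dimensional affine set, projecting them onto the fitted A(C, d) reproduces them with zero error.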
Be reminded that s_i ∈ R^L_+. Hence, it is true that

$$s_i \in \operatorname{aff}\{s_1, \ldots, s_N\} \cap \mathbb{R}^L_+ = \mathcal{A}(C, d) \cap \mathbb{R}^L_+ \triangleq \mathcal{S}.$$

The following lemma arises from local dominance (A2):

Lemma. Under (A1) and (A2),

$$\mathcal{S} = \operatorname{conv}\{s_1, \ldots, s_N\}.$$

Moreover, the set of all its extreme points is {s_1, ..., s_N}.
Summarizing the above results, a new nBSS criterion is as follows:

Theorem 1. (CAMNS criterion) Under (A1)-(A4), the polyhedral set

$$\mathcal{S} = \left\{ x \in \mathbb{R}^L \;\middle|\; x = C\alpha + d \succeq 0, \; \alpha \in \mathbb{R}^{N-1} \right\},$$

where (C, d) is obtained from the observation set {x_1, ..., x_M} by the affine set fitting procedure in the Proposition, has N extreme points given by the true source vectors s_1, ..., s_N.
Practical realization of CAMNS
• CAMNS boils down to finding all the extreme points of an observation-constructed polyhedral set.
• In the optimization context this is known as vertex enumeration.
• In CAMNS, there is one important problem structure that we can take full advantage of; that is,

Property implied by (A2): s_1, ..., s_N are linearly independent.

• By exploiting this property, we can locate all the extreme points by solving a sequence of LPs (≈ 2N LPs at worst).
Consider the following LP:

$$p^\star = \min_{s} \; r^T s \quad \text{s.t.} \;\; s \in \mathcal{S} \qquad (\dagger)$$

for an arbitrary r ∈ R^L. From basic LP theory, the solution of (†) is

• one of the extreme points of S (that is, one of the s_i), or

• a point on a face of S (intuitively, this looks rather unlikely).
We can prove that getting a non-extreme-point solution is very unlikely:

Lemma. Suppose that r is randomly generated following N(0, I_L). Then, with probability 1, the solution of

$$p^\star = \min_{s} \; r^T s \quad \text{s.t.} \;\; s \in \mathcal{S}$$

is uniquely given by s_i for some i ∈ {1, ..., N}.
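A sketch of solving this LP with scipy.optimize.linprog, working in the α-parameterization s = Cα + d so that s ∈ S reduces to the linear constraint Cα + d ⪰ 0 (toy data and helper name below are illustrative, not the authors' code):

```python
import numpy as np
from scipy.optimize import linprog

def extreme_point_lp(C, d, r):
    """Solve p* = min_s r^T s  s.t.  s = C @ alpha + d >= 0, as an LP in alpha.
    r^T d is constant, so the objective over alpha is (C^T r)^T alpha, and
    s >= 0 becomes -C @ alpha <= d. alpha itself is unconstrained."""
    res = linprog(C.T @ r, A_ub=-C, b_ub=d,
                  bounds=[(None, None)] * C.shape[1], method="highs")
    assert res.success, "LP should be solvable when S is a bounded polyhedron"
    return C @ res.x + d             # map the optimal alpha back to s
```

For the toy N = 2 pair s_1 = (1, 0, 0.5), s_2 = (0, 1, 0.5) (which satisfies local dominance), taking C = s_1 − s_2 and d = s_2 describes S, and the LP returns one of the two sources depending on the direction r.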
• Suppose that we have found l extreme points, say, {s_1, ..., s_l}.

• We can find the other extreme points by using the linear independence of {s_1, ..., s_N} to 'annihilate' the old extreme points.

Lemma. Suppose r = Bw, where w ∼ N(0, I_{L−l}) & B ∈ R^{L×(L−l)} is such that

$$B^T [\, s_1, \ldots, s_l \,] = 0, \qquad B^T B = I_{L-l}.$$

Then, with probability 1, at least one of the LPs

$$p^\star = \min_{s \in \mathcal{S}} \; r^T s, \qquad q^\star = \max_{s \in \mathcal{S}} \; r^T s$$

finds a new extreme point, i.e., s_i for some i ∈ {l + 1, ..., N}. The 1st LP finds a new extreme point if |p^⋆| ≠ 0; the 2nd LP finds a new extreme point if |q^⋆| ≠ 0.
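The annihilation step above can be sketched with a null-space computation (helper and argument names are illustrative, not the authors' code):

```python
import numpy as np
from scipy.linalg import null_space

def annihilating_direction(S_found, rng):
    """Draw r = B w, where the columns of B are an orthonormal basis of the
    orthogonal complement of span{s_1, ..., s_l}; then r^T s_i = 0 for every
    already-found extreme point. S_found is (L, l) with the found s_i as columns."""
    B = null_space(S_found.T)            # columns satisfy S_found^T B = 0, B^T B = I
    w = rng.standard_normal(B.shape[1])  # w ~ N(0, I_{L-l})
    return B @ w
```

Since r is orthogonal to every found source, the found extreme points score zero in the LP objective, so a nonzero optimal value must come from a new extreme point.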
On alternative ways of implementing CAMNS
• We have another theorem that converts S ⊂ R^L to another polyhedral set on R^{N−1}, denoted by F below.

[Figure: via the mapping x = Cα + d, the simplex F in the (α_1, α_2) domain corresponds to S, which contains the sources s_1, s_2 and the observations x_1, x_2, x_3.]

• The set F has a smaller vector dimension (note that L ≫ N). Also, it is a simplex with extreme points related to those of S in a one-to-one manner.

• For N = 2, F is a line segment on R and there is a closed form for locating its extreme points.

• For N = 3, F is a triangle on R^2 and there is also a simple way of locating its extreme points.
Simulation example 1: Dual energy X-Ray
Original sources
Observations
Separated sources by CAMNS
Separated sources by nICA (a benchmark nBSS method)
Separated sources by NMF (another benchmark nBSS method)
Simulation example 2: Human faces
Original sources
Observations
Separated sources by CAMNS
Separated sources by nICA
Separated sources by NMF
Simulation example 3: Ghosting
Original sources
Observations
Separated sources by CAMNS
Separated sources by nICA
Separated sources by NMF
Simulation example 4: Five of my students
Original sources
Observations
Separated sources by CAMNS
Simulation example 5: Monte Carlo performance for N = 3
[Plot: average sum squared error $e(S, \hat{S})$ in dB (15-50 dB range) versus SNR (25-40 dB) for CAMNS-LP, CAMNS-geometric, nLCA, nICA, and NMF.]

Average sum squared errors of the sources with respect to SNR.
Conclusion
• A convex analysis framework, called CAMNS, has been developed for nBSS.

• CAMNS guarantees perfect separation of the true sources, by determining the extreme points of an observation-constructed polyhedral set (under several assumptions).

• A systematic LP-based method has been proposed to realize CAMNS. Its complexity is polynomial (specifically, O(L^{1.5}(N − 1)^2)).

• A number of simulation results indicate that CAMNS performs very well even in the presence of dependent sources.

• The source code is available at http://www.ee.cuhk.edu.hk/~wkma
References
[Wang et al. 2003] Y. Wang, J. Xuan, R. Srikanchana, & P. L. Choyke, "Modeling and reconstruction of mixed functional and molecular patterns," Int'l Journal Biomedical Imaging, pp. 1–9, 2005.

[Plumbley 2003] M. D. Plumbley, "Algorithms for nonnegative independent component analysis," IEEE Trans. Neural Networks, vol. 14, no. 3, pp. 534–543, May 2003.

[Lee-Seung 1999] D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, pp. 788–791, Oct. 1999.

[Chan-Ma-Chi-Wang 2008] T.-H. Chan, W.-K. Ma, C.-Y. Chi, & Y. Wang, "A convex analysis framework for blind separation of non-negative sources," IEEE Trans. Signal Process., vol. 56, no. 10, pp. 5120–5134, Oct. 2008.