Introduction to blind source separation - Uppsala University · April - 2006 Signaler & System...


April 2006, Signaler & System, Uppsala universitet

Blind source separation

Introduction to blind source separation

Mathias Johansson

Lecture notes for the course “Adaptive signal processing”

Mixed signals are recorded

We want to demix the recordings and find s1(t) and s2(t)

Blind source separation

• A number, M, of microphones record a mixture of N source signals, for example:

– several people talking in a room (the cocktail party effect),

– radio signals emitted from several mobile terminals and received by others,

– electromagnetic signals from different brain regions recorded by sensors on the head.

• The job is to estimate the individual source signals, i.e. to demix the mixture, without knowledge about the actual sources.

Is this even possible?

• Have you ever been to Orvars?

– Yes, it is possible (within reason)

• But still, how to do it in a computer?

– There are different approaches, but they all make use of assumptions or knowledge regarding the mixing process or the signals, e.g.

• ICA (Independent Component Analysis) assumes only that the sources are statistically independent

• DUET assumes that the signals are non-overlapping in the frequency domain, and that the mixing takes place in an anechoic environment.

Possible approaches

• Beamforming:

– By delaying and attenuating the signals impinging on the recording array, a beam can be focused towards a certain direction

Beamforming example

[Figure: Recording 1 and Recording 2, each containing Signal 1 and Signal 2. In Recording 2 the signals have different delays due to different angles of arrival.]

Beamforming example

[Figure: A: Recording 1 delayed 500 samples. B: Recording 2. Beamforming: average of A and B.]

Signal 1 is amplified whereas signal 2 is attenuated. The beamforming suppresses signals from certain angles.
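The delay-and-average step in the example can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than the data behind the slides: the two sinusoidal test signals, the arrival delays `d1` and `d2`, and the helper names `delay` and `power`. The delay difference is chosen as half a period of signal 2, which is the best case where signal 2 cancels completely; in general it is only attenuated.

```python
import math

def delay(x, d):
    """Delay x by d whole samples, padding the start with zeros."""
    return [0.0] * d + x[:len(x) - d]

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

n = 400
s1 = [math.sin(2 * math.pi * t / 40) for t in range(n)]  # signal 1, period 40
s2 = [math.sin(2 * math.pi * t / 16) for t in range(n)]  # signal 2, period 16

d1, d2 = 5, 13  # assumed arrival delays (in samples) at microphone 2

rec1 = [u + v for u, v in zip(s1, s2)]
rec2 = [u + v for u, v in zip(delay(s1, d1), delay(s2, d2))]

# Steer towards signal 1: delay recording 1 by d1 (A), then average with recording 2 (B).
a = delay(rec1, d1)
out = [(u + v) / 2 for u, v in zip(a, rec2)]

# What is left besides the (delayed) signal 1 is the residue of signal 2.
target = delay(s1, d1)
residual = [o - t for o, t in zip(out, target)]

# Skip the first d2 samples, where the zero padding is still visible.
print("signal 2 power in recording 2:", power(delay(s2, d2)[d2:]))
print("signal 2 power after beamforming:", power(residual[d2:]))
```

The sketch assumes the delay of the desired source is known; in practice it has to be estimated.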

Beamforming disadvantages

• Can only distinguish signals that have well-separated angles of arrival

• Ad hoc technique relying on linear processing; not optimal in general.

• Need to estimate the delay (and attenuation) of the desired source.

Independent Component Analysis

• An approach that relies on assuming (almost) only that the sources are statistically independent

• Assumes a linear, instantaneous, static mix: each recording is a fixed weighted sum of the source signals, with the weights collected in a mixing matrix

• Using Bayes’ rule the most likely mixing matrix is computed

• Using the inverse of the estimated mixing matrix, the source signals are estimated
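A linear instantaneous static mix can be written x(t) = A s(t), with A the mixing matrix. The sketch below only illustrates the last step, demixing with the inverse of the matrix, for the 2-by-2 case. The matrix `A`, the sample values and the helper names are made up for illustration, and `A` is assumed known here; estimating it from the recordings is the actual ICA problem and is not shown.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

A = [[1.0, 0.6],
     [0.4, 1.0]]                    # hypothetical mixing matrix

s1 = [1.0, -1.0, 2.0, 0.5]          # made-up source samples
s2 = [0.3, 0.8, -1.0, 2.0]
sources = [s1, s2]

# Mix: every time instant is mixed with the same matrix (instantaneous, static).
mixed = [mat_vec(A, [u, v]) for u, v in zip(s1, s2)]

# Demix with the inverse of the (here: exactly known) mixing matrix.
A_inv = inv2(A)
demixed = [mat_vec(A_inv, x) for x in mixed]
recovered = [[p[0] for p in demixed], [p[1] for p in demixed]]

print(recovered[0])
print(recovered[1])
```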

Independent Component Analysis

• Works well for instantaneous static mixing

• Generalizations to deal with echoes and convolutive mixes exist, but exhibit poor performance

• We will not investigate ICA in the project

DUET

• A relatively new approach that assumes the following:

– Signals are non-overlapping in the frequency domain

– Mixing model: attenuation and delay (i.e. anechoic mixtures)

• Can even handle cases when there are more signals than recordings

DUET algorithm outline

1. Take the DFT of a block of the recordings; by the assumption above, only one signal is present in each frequency bin

2. Estimate the time delay and attenuation in each bin

3. Label the bins according to delay and attenuation

4. Each source signal is estimated by inverse-transforming the bins that have the same labels

DUET assumption

[Figure: the time axis divided into blocks; FFT(x(1:N)) for the first block, FFT of the next block, etc.]

At each time instant, each frequency bin contains only one signal, or no signal at all.
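As a minimal sketch of what this assumption means for one block, the Python snippet below takes the DFT of two test signals and checks that no frequency bin is occupied by both. The two cosines, the block length of 32 and the threshold are illustrative assumptions; real speech only satisfies the assumption approximately.

```python
import cmath
import math

def dft(x):
    """Naive DFT, adequate for a short illustrative block."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

n = 32
s1 = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]  # occupies bins 3 and n-3
s2 = [math.cos(2 * math.pi * 7 * t / n) for t in range(n)]  # occupies bins 7 and n-7

S1, S2 = dft(s1), dft(s2)

THRESHOLD = 1e-6  # magnitudes below this are treated as "no signal"
active1 = [k for k in range(n) if abs(S1[k]) > THRESHOLD]
active2 = [k for k in range(n) if abs(S2[k]) > THRESHOLD]
overlap = sorted(set(active1) & set(active2))

print("bins occupied by signal 1:", active1)
print("bins occupied by signal 2:", active2)
print("bins occupied by both:", overlap)
```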

DUET algorithm

• The mixing model describes each recording as a sum of the source signals, each subject to an attenuation and a delay

• N is the number of sources
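For two recordings, the anechoic mixing model used in [1] is commonly written as below. The symbols a_i (attenuation) and d_i (delay) are notation assumed here, both taken relative to the first recording, which serves as the reference.

```latex
x_1(t) = \sum_{i=1}^{N} s_i(t), \qquad
x_2(t) = \sum_{i=1}^{N} a_i \, s_i(t - d_i)
```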

DUET algorithm

• Taking the DFT of both recorded signals, each recording reduces to a single term at each frequency ω, since only one signal, say the ith one, is non-zero there.

– Recall that the Fourier transform of a delayed signal is: DFT(s(t-d)) = exp(-jωd)S(ω)
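Assuming the two-recording model of [1], in which the first recording is the plain sum of the sources and the second contains source i with attenuation a_i and delay d_i (notation assumed here), the delay property gives the following at a frequency ω where only source i is active:

```latex
X_1(\omega) = S_i(\omega), \qquad
X_2(\omega) = a_i \, e^{-j\omega d_i} \, S_i(\omega)
```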

DUET algorithm

• We can then compute the amplitude and delay of signal i from the two DFTs.
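One way to carry out this step, following [1], is to form the ratio R(ω) = X2(ω)/X1(ω) in a bin where only source i is active. Its magnitude is the attenuation, and its phase divided by -ω is the delay. The sketch below checks this on a synthetic signal. The test signal, the values `true_a` and `true_d`, and the use of a circular delay (so that the DFT delay property holds exactly) are assumptions for illustration. The delay is only unambiguous while |ω·d| < π.

```python
import cmath
import math

def dft_bin(x, k):
    """Value of the DFT of x at bin k (naive sum)."""
    n = len(x)
    return sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))

n, k = 64, 3                 # block length and the bin where the source lives
true_a, true_d = 0.7, 2      # assumed attenuation and delay (in samples)

s = [math.cos(2 * math.pi * k * t / n + 0.4) for t in range(n)]
x1 = s                                                   # reference recording
x2 = [true_a * s[(t - true_d) % n] for t in range(n)]    # attenuated, circularly delayed

omega = 2 * math.pi * k / n            # frequency of bin k in radians per sample
ratio = dft_bin(x2, k) / dft_bin(x1, k)

a_hat = abs(ratio)                     # attenuation estimate
d_hat = -cmath.phase(ratio) / omega    # delay estimate in samples

print("attenuation estimate:", a_hat)
print("delay estimate:", d_hat)
```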

DUET algorithm

• For each block of data and each frequency, we get an amplitude and delay estimate.

• We keep track of old estimates and form a two-dimensional histogram of the amplitude-delay estimates

• There will be N clusters in the histogram, each one corresponding to a specific source

A histogram taken from [1]
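Forming such a histogram and reading off its peaks can be sketched as follows. The estimates are synthetic: two made-up (amplitude, delay) pairs with bounded jitter plus a few scattered outliers. The bin widths `A_STEP` and `D_STEP` are arbitrary choices. Taking the N most populated bins as the peaks is a simplification that works here because each cluster falls inside a single bin; real data would need a proper peak search.

```python
import random
from collections import Counter

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
true_pairs = [(0.7, 2.0), (1.2, -1.0)]   # made-up (amplitude, delay) per source

# Simulated per-bin estimates: tight clusters around each source ...
estimates = []
for a, d in true_pairs:
    for _ in range(200):
        estimates.append((a + rng.uniform(-0.03, 0.03),
                          d + rng.uniform(-0.1, 0.1)))
# ... plus a few scattered outliers.
for _ in range(20):
    estimates.append((rng.uniform(0.0, 2.0), rng.uniform(-4.0, 4.0)))

A_STEP, D_STEP = 0.1, 0.5                # histogram bin widths (arbitrary)

# Two-dimensional histogram: count the estimates falling in each (amplitude, delay) cell.
hist = Counter((round(a / A_STEP), round(d / D_STEP)) for a, d in estimates)

# Take the centres of the most populated cells as the cluster peaks.
n_sources = len(true_pairs)
peaks = [(ia * A_STEP, idx * D_STEP)
         for (ia, idx), _count in hist.most_common(n_sources)]

print(sorted(peaks))
```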

DUET algorithm

• We then label each frequency bin according to the peak in the histogram that is closest to the current estimate

• To reconstruct a source signal i corresponding to a certain amplitude-delay peak in the histogram, set all bins that are not labelled i to zero.

– Finally take the inverse Fourier transform!
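The labelling and reconstruction steps can be sketched for a single block as below. This is a toy illustration under stated assumptions, not the project implementation: the two sources are sinusoids in disjoint bins, the delays are circular so the DFT delay property is exact, the histogram peaks in `peaks` are assumed already known, and the first recording is taken as the reference (a_i = 1, d_i = 0 there).

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [(sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

n = 64
s1 = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
s2 = [0.5 * math.sin(2 * math.pi * 9 * t / n) for t in range(n)]
peaks = [(0.7, 2.0), (1.2, 3.0)]   # (attenuation, delay) peaks, assumed already found

def shifted(s, d):
    """Circular delay by d samples."""
    return [s[(t - d) % n] for t in range(n)]

x1 = [u + v for u, v in zip(s1, s2)]
x2 = [0.7 * u + 1.2 * v for u, v in zip(shifted(s1, 2), shifted(s2, 3))]

X1, X2 = dft(x1), dft(x2)
masked = [[0j] * n for _ in peaks]   # one masked spectrum per source

for k in range(n):
    if abs(X1[k]) < 1e-9:            # empty bin: nothing to label
        continue
    signed_k = k if k <= n // 2 else k - n   # bins above n/2 are negative frequencies
    if signed_k == 0:                # the delay is undefined at DC
        continue
    omega = 2 * math.pi * signed_k / n
    ratio = X2[k] / X1[k]
    a_hat, d_hat = abs(ratio), -cmath.phase(ratio) / omega
    # Label the bin with the closest histogram peak.
    label = min(range(len(peaks)),
                key=lambda i: (a_hat - peaks[i][0]) ** 2 + (d_hat - peaks[i][1]) ** 2)
    masked[label][k] = X1[k]         # all other bins for this source stay zero

# Inverse-transform each masked spectrum to get the source estimates.
estimates = [idft(M) for M in masked]

err1 = max(abs(e - s) for e, s in zip(estimates[0], s1))
err2 = max(abs(e - s) for e, s in zip(estimates[1], s2))
print("max reconstruction errors:", err1, err2)
```

The plain Euclidean distance in the (attenuation, delay) plane is one simple choice of "closest"; how the two axes should be weighted against each other is left open here.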

Extensions

• S. Rickard and co-workers have also developed other versions of the algorithm

– One is specially suited for real-time implementations; check the project report from 2004 for more info.

– However, the current algorithm should work in real-time too.

Project focus 2006

• The project set-up:

– 2 microphones, 2 acoustic source signals (speakers)

• Aim:

– Based on results from 2004, implement a working BSS system in real-time using Matlab

• Compare the real-time algorithm used in 2004 with the histogram-based method

Project focus 2006

• First, read the report from 2004 and the background material [1]-[3] carefully

• Begin with artificial mixes, i.e. mixes generated in Matlab

• Then try anechoic recordings (use the lab on floor 2 at Magistern)

References

[1] A. Jourjine, S. Rickard, Ö. Yilmaz, "Blind separation of disjoint orthogonal signals: demixing N sources from 2 mixtures", ICASSP 2000.

[2] S. Rickard, R. Balan, J. Rosca, "Real-time time-frequency based blind source separation", ICA 2001.

[3] Ö. Yilmaz, S. Rickard, "Blind separation of speech mixtures via time-frequency masking", IEEE Trans. on Signal Processing, vol. 52, no. 7, July 2004.
