
Motivation Introduction ICA Application ICA Dependencies Summary

Independent Component Analysis for Feature Extraction

Carmen Klaussner

LCT Language and Communication Technology, University of Groningen

April 25th, 2013


Independent Component Analysis (ICA)

Independent Component Analysis (ICA) is an unsupervised statistical technique used for:

- separating a multivariate signal into independent subcomponents (blind source separation, BSS)

- revealing underlying latent concepts in feature extraction


ICA and the Cocktail-Party Problem

- Imagine two different speakers in a room

- Two microphones are placed at different locations about the room

- The microphones record mixtures of the various speech signals



ICA and the Cocktail-Party Problem cont’d

Figure: ICA Model



ICA Model

The recordings yield the mixed signals x1(t) and x2(t):

x1(t) = a11 s1(t) + a12 s2(t)

x2(t) = a21 s1(t) + a22 s2(t)

- where x1 and x2 are the amplitudes and t is the time index

- each recorded signal is a weighted sum of the original speech signals of the two speakers, denoted by s1(t) and s2(t)

- a11, a12, a21, and a22 are parameters that depend on the distances of the microphones from the speakers

- assume that s1(t) and s2(t), at each time instant t, are statistically independent

- given only the mixed signals x1, x2 ⇒ retrieve the original speech signal of each speaker: s1(t), s2(t)


ICA Model cont’d

x = As

with:

- x = (x1, x2, ..., xn)^T is the vector of observed random variables

- s = (s1, s2, ..., sn)^T is the vector of latent variables (the independent components)

- A is the unknown constant mixing matrix

- the number of components is arbitrary, at most equal to the number of samples

Aim of the algorithm: find W = A^(-1), so that we obtain the independent components by:

s = Wx
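The two-speaker model above can be simulated end to end. The following sketch (not from the slides) uses the FastICA implementation in scikit-learn; the source waveforms and the mixing coefficients a11..a22 are made-up assumptions:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)           # time index t

# Two statistically independent sources s1(t), s2(t) (assumed waveforms)
s1 = np.sin(2 * t)                    # "speaker 1"
s2 = np.sign(np.sin(3 * t))           # "speaker 2"
S = np.c_[s1, s2]
S += 0.02 * rng.normal(size=S.shape)  # slight noise keeps whitening stable

# Hypothetical mixing matrix A with entries a11..a22
A = np.array([[1.0, 0.5],
              [0.6, 1.2]])
X = S @ A.T                           # x = As, one row per time instant

# Estimate W = A^-1 from the mixtures alone and recover s = Wx
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)

# Up to order, sign, and scale, each estimate matches one source
for est in S_est.T:
    best = max(abs(np.corrcoef(est, src)[0, 1]) for src in S.T)
    print(round(best, 3))             # close to 1
```

Note that the recovered components come back in arbitrary order and sign, which is why the check above compares each estimate against both sources.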


ICA Feature Extraction on Text Documents

x =

              D1023.txt   D1392.txt   D1394.txt   D1400.txt   D1406.txt   ...
able               73          2           1          32          7       ...
about             684         10          32         319         40       ...
above              51          2           4          31          4       ...
abroad             13          1           0          10          0       ...
absence            14          0           0           6          0       ...
absolutely          6          0           0           7          1       ...
accept             23          0           1           5          2       ...
accepted           14          1           0           7          2       ...
accident           11          0           1           9          0       ...
...

ICA Interpretation

- documents are linear mixtures of concepts

- each term is a mixed signal/observation xi at a different time index t (here: time index = document)

- the source signals s are the latent concepts (independent components)

- aim is to find the latent concepts − a new document representation


ICA on Text Documents

x = As becomes:

X[term × document] = A[term × concept] * S[concept × document]

s = Wx becomes:

S[concept × document] = W[concept × term] * X[term × document]

- S is a new data representation that combines terms into latent concepts

- A, the mixing matrix, assigns a weight to each term in each component

- the term-by-document matrix is unmixed to yield the ’original’ concept-by-document mapping
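As a minimal sketch of this setup (the counts and the number of concepts are invented assumptions, and scikit-learn's FastICA stands in for whatever implementation produced the slides' output):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy term-by-document count matrix (rows = terms, columns = documents);
# the counts below are invented for illustration
terms = ["able", "about", "above", "abroad", "absence", "absolutely"]
X = np.array([
    [73.0,   2,  1,  32,  7],
    [684.0, 10, 32, 319, 40],
    [51.0,   2,  4,  31,  4],
    [13.0,   1,  0,  10,  0],
    [14.0,   0,  0,   6,  0],
    [6.0,    0,  0,   7,  1],
])

# sklearn expects samples in rows, so transpose: each document is one
# "time point" and each term is one observed mixture x_i
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
S = ica.fit_transform(X.T).T    # S: concept-by-document representation
A = ica.mixing_                 # A: term-by-concept mixing weights

# The largest-magnitude weights in each column of A indicate which
# terms contribute most to each latent concept
for c in range(A.shape[1]):
    top = [terms[i] for i in np.argsort(-np.abs(A[:, c]))[:3]]
    print(f"concept {c}: {top}")
```

Here `S` plays the role of S[concept × document] and `ica.mixing_` the role of A[term × concept] from the factorisation above.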


ICA Output Example

x =

              D1023.txt   D1392.txt   D1394.txt   D1400.txt   D1406.txt   ...
able               73          2           1          32          7       ...
about             684         10          32         319         40       ...
above              51          2           4          31          4       ...
abroad             13          1           0          10          0       ...
absence            14          0           0           6          0       ...
absolutely          6          0           0           7          1       ...
...

=

A =

                    c1             c2             c3             c4             c5         ...
able           0.076106808   −0.014558451   −0.10733842    −0.091537869   −0.0712187592   ...
about          0.168884358   −0.013135861   −0.04944864    −0.045366695   −0.0675653686   ...
above         −0.087822012   −0.025989498    0.05227958    −0.002340966   −0.0181397638   ...
abroad        −0.141542609    0.020390763    0.07750117    −0.040127687    0.0002770738   ...
absence       −0.002402465   −0.134321250    0.04981664     0.140644925   −0.1017302731   ...
absolutely     0.002845907   −0.004149262   −0.01830506     0.047701236   −0.0910047210   ...
...

*

s =

        D1023.txt    D1392.txt    D1394.txt    D1400.txt    D1406.txt   ...
c1      1.000000    −1.000053     1.000000     1.000000     1.000000    ...
c2     −1.068787    −1.026944     0.9187293   −1.068788    −1.068790    ...
c3     −1.000675    −0.9531389    0.9558447    1.002504    −1.000675    ...
c4     −1.038625    −0.8975203    0.8958735   −1.151772     0.9906527   ...
c5     −0.9303368    0.9171785   −0.9577544    1.164191     1.081455    ...
...


My Master Thesis: Dickens’ Style Analysis

Find characteristic terms of Charles Dickens compared to his contemporary writer Wilkie Collins.

Dickens’ keywords: lot, release, answering, ive, sunk, softened, beside, examined, seven, brothers, wear, eleven, correct, path, watched, sorrow, treated, sounds, masters, oclock, upon, lean, reality, song...

Collins’ keywords: gentle, fate, sweet, contrast, forth, whom, changes, strong, art, disturb, ventured, sorrow, blessing, parties, faded, imagination, towards, moon, portrait, daily, guide, game, although, lot, building, learn, visits, pay, animal, humanity...


ICA Characteristic Terms Extraction

1. use ICA on term-by-document matrix to extract term concepts

2. extract the weights for each keyword in each document

3. select characteristic terms for each document set

4. test generalisation ability of each term list

terms in documents (individual) ⇒ concepts in documents (global)

⇓

terms over authors


Principal Component Analysis (PCA)

- Principal component analysis (PCA) finds directions of maximum variance in data

- Reduction of the feature space by selecting those directions explaining most of the variance

- Decorrelation of features, so that the new data representation only varies within each feature

- Works best on Gaussian distributions
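These properties can be seen in a few lines of NumPy. The sketch below (not from the slides, with a made-up covariance matrix) computes PCA via the SVD of centered data and checks that the projected features are decorrelated:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D Gaussian data with correlated features (assumed covariance)
X = rng.multivariate_normal([0, 0], [[3.0, 1.2], [1.2, 1.0]], size=500)

# PCA via SVD of the centered data: the rows of Vt are the directions
# of maximum variance, ordered by explained variance
Xc = X - X.mean(axis=0)
U, sing, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_var = sing**2 / (len(X) - 1)

# Project onto the principal directions: the new features are decorrelated
Z = Xc @ Vt.T
cov_Z = np.cov(Z.T)
print(np.round(cov_Z, 3))   # off-diagonal entries ~ 0
```

Dropping all but the first few columns of `Z` gives the variance-based dimensionality reduction mentioned above.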


ICA and PCA: a comparison

- ICA is computationally superior to PCA

- may not generally be superior (depending on the application)

- PCA acts as a preprocessing method for ICA

Figure: PCA vs. ICA



ICA Ambiguity

- components are extracted “randomly” depending on the initial weights

- components are not ranked as in PCA

- ambiguity of the variance (scale) and sign of the ICs

- how many components to extract for a given application?

Given only the mixed signals and the assumption of statistical independence of the estimated signals ⇒ ICA retrieves the original sources


Objective Function and Statistical Independence

Statistical independence of two random variables y1, y2:

p(y1, y2) = p(y1)p(y2).

Measures of Statistical Independence:

- Minimization of mutual information
  - Kullback-Leibler divergence and maximum entropy

- Maximization of non-Gaussianity
  - Kurtosis and negentropy
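Kurtosis, the simplest of these measures, is easy to illustrate. This sketch (mine, not from the slides) computes the excess kurtosis of Gaussian, super-Gaussian, and sub-Gaussian samples, showing why it can serve as a non-Gaussianity score:

```python
import numpy as np

rng = np.random.default_rng(0)

def excess_kurtosis(x):
    """Fourth standardized moment minus 3; zero for a Gaussian."""
    x = (x - x.mean()) / x.std()
    return (x**4).mean() - 3.0

n = 200_000
g = excess_kurtosis(rng.normal(size=n))    # Gaussian: ~ 0
l = excess_kurtosis(rng.laplace(size=n))   # super-Gaussian (peaked): ~ +3
u = excess_kurtosis(rng.uniform(size=n))   # sub-Gaussian (flat): ~ -1.2
print(g, l, u)
```

Since the Gaussian sits at zero, maximizing |kurtosis| (or the more robust negentropy) pushes an estimated component away from Gaussianity, which is the working principle behind FastICA's objective.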


What is Statistical Independence?

- Intuitively, statistical independence of two signals means that at each time point, signal 1 does not give any information about the position of signal 2, and vice versa

⇒ consequently: permuting the values of one signal, and thus changing the mapping at each time point, should not have any effect

Figure: Mapping of two independent signals
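The permutation intuition can be checked numerically. In this sketch (an illustration with arbitrarily chosen distributions), shuffling one of two independently generated signals leaves a joint statistic such as the empirical correlation unchanged at (near) zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independently generated signals (hypothetical sources)
s1 = rng.normal(size=100_000)
s2 = rng.laplace(size=100_000)

# A simple joint statistic: the empirical correlation
c_orig = abs(np.corrcoef(s1, s2)[0, 1])

# Permuting one signal changes the time-point mapping but, for
# independent signals, joint statistics like this stay the same
s2_perm = rng.permutation(s2)
c_perm = abs(np.corrcoef(s1, s2_perm)[0, 1])

print(c_orig, c_perm)   # both ~ 0
```

Correlation only captures second-order dependence; full statistical independence is the stronger condition p(y1, y2) = p(y1)p(y2) from the previous slide, but the permutation behaviour is the same.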


So... Independent Component Analysis...

- is a method for blind source separation and feature extraction

- given only the mixed signals and the statistical independence assumption, estimates the original sources or latent variables

- is computationally expensive, so it is best to try similar but simpler methods first


Where to find ICA

There are different implementations of ICA: Infomax, JADE, ..., FastICA

Implementations of the FastICA algorithm:

- For R: http://cran.r-project.org/web/packages/fastICA/index.html
- For Matlab: http://research.ics.aalto.fi/ica/fastica/


Thank You!

Questions?

