MEG Preprocessing

Posted on 21-Jun-2015


Description

Slides from an invited talk I gave at the MEG Basics series in the winter of 2012. Covers the theory behind signal processing techniques used in magnetoencephalography (MEG), including:

- Signal Space Projection (SSP)
- Signal Space Separation (SSS)
- Temporally-extended Signal Space Separation (tSSS)
- Principal Component Analysis (PCA)
- Independent Component Analysis (ICA)

Transcript

Magnetoencephalography Preprocessing and Noise Reduction Techniques

Eliezer Kanal

2/20/2012, MEG Basics Course

About Me

• 2005 - 2009: University of Pittsburgh, PhD, Bioengineering

• 2009 - 2011: Carnegie Mellon University, Postdoctoral fellow, CNBC

• 2011 - current: PNC Financial Services, Quantitative Analyst, Risk Analytics

Dealing with Noisy Data

• Overview of MEG Noise

• Noise Reduction

- Averaging, thresholding, frequency filters

- SSP

- SSS/tSSS

• Source Extraction

- PCA

- ICA


MEG Noise

Breathing

[Figure: breathing artifact in raw MEG data]

Frequency

[Figure: frequency-domain view of MEG noise]

Time-Frequency

[Figure: time-frequency decomposition of MEG noise]

Vigário, Jousmäki, Hämäläinen, Hari, & Oja (1997)

Biological Noise

[Figure: examples of biological noise in MEG recordings]

Line Noise

[Figure: subject and empty-room recordings, both showing 50 Hz line noise (60 Hz in the USA)]

Bad Channels

Find the bad one:

[Figure: multichannel traces, one channel visibly bad]


Noise from nearby construction


Noise Reduction Techniques

• Averaging, thresholding, frequency filters

• SSP

• SSS/tSSS


Averaging

• Removes noise that is not time-locked to the stimulus

• Requires:

- Time-locked block paradigm design

- Temporal or low-frequency analyses

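To make that concrete, here is a minimal simulated single-channel example (mine, not from the slides; all values are illustrative): averaging event-locked epochs attenuates non-time-locked noise roughly by a factor of the square root of the trial count.

```python
import numpy as np

# Simulated epochs: an evoked response buried in noise.
rng = np.random.default_rng(0)
sfreq = 1000                     # sampling rate (Hz)
n_trials, n_samples = 100, 500   # 100 trials of 500 ms each

t = np.arange(n_samples) / sfreq
evoked = 1e-13 * np.exp(-((t - 0.1) ** 2) / (2 * 0.02 ** 2))  # peak near 100 ms
trials = evoked + 2e-13 * rng.standard_normal((n_trials, n_samples))

# Averaging attenuates non-time-locked noise by ~sqrt(n_trials).
average = trials.mean(axis=0)
```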

Thresholding

• Discard trials/channels whose maximum signal intensity exceeds some user-defined value

• Removes most “data blips”

• Rudimentary; a better technique is to simply examine each trial/channel by eye
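A hedged sketch of threshold-based rejection (array shapes and the cutoff value are assumptions for illustration, not from the talk):

```python
import numpy as np

# Reject whole trials whose peak-to-peak amplitude exceeds a cutoff.
rng = np.random.default_rng(1)
trials = 1e-13 * rng.standard_normal((100, 306, 500))  # trials x channels x samples
threshold = 4e-13  # user-defined cutoff (tesla); tune per dataset

ptp = trials.max(axis=-1) - trials.min(axis=-1)  # peak-to-peak per trial/channel
bad = (ptp > threshold).any(axis=1)              # any channel over threshold
clean_trials = trials[~bad]
print(f"kept {clean_trials.shape[0]} of {trials.shape[0]} trials")
```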

Frequency Filter

• Very good first step: remove data you won’t analyze (don’t waste time cleaning what you won’t examine)

• Use more advanced techniques for specific noise signals

Filter      Removes…
High-pass   Lower frequencies
Low-pass    Higher frequencies
Band-pass   Everything outside the specified band
Notch       A narrow band around the specified frequency
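For illustration, these filter types map directly onto SciPy calls; a minimal band-pass-plus-notch sketch on simulated data (parameter choices are assumptions, not the talk's):

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

sfreq = 1000.0
t = np.arange(0, 10, 1 / sfreq)
rng = np.random.default_rng(2)
x = (np.sin(2 * np.pi * 10 * t)              # 10 Hz "signal"
     + 0.5 * np.sin(2 * np.pi * 60 * t)      # 60 Hz line noise
     + 0.1 * rng.standard_normal(t.size))

# Band-pass 1-40 Hz: keep only the band you actually plan to analyze.
b, a = butter(4, [1, 40], btype="bandpass", fs=sfreq)
x_bp = filtfilt(b, a, x)

# Notch at 60 Hz (USA line noise; use 50 Hz elsewhere).
b_n, a_n = iirnotch(60, Q=30, fs=sfreq)
x_clean = filtfilt(b_n, a_n, x_bp)
```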


Signal Space Projection


Signal Space Projection

• Overview: SSP uses the difference between source orientations and locations to differentiate distinct sources.

• Theory: Since the field pattern from a single source is (1) unique and (2) time-invariant, we can differentiate sources by examining the angle between their “signal space representations”, and project noise signals out of the dataset.


Signal Space Projection

• In general,

    m(t) = Σᵢ aᵢ(t) sᵢ + n(t),  i = 1 … M

  where m(t) is the measured signal, aᵢ(t) is the amplitude of source i, sᵢ is source i, n(t) is noise, and M is the total number of channels.

• SSP states that s can be split in two:

  - s‖ = signals from known sources:  s‖ = P‖m

  - s⟂ = signals from unknown sources:  s⟂ = P⟂m

  where P‖ and P⟂ are projection operators acting on the MEG signal m.

• Worth mentioning that s‖ + s⟂ = s.

Signal Space Projection

How do we find P‖ and P⟂?

• Ingenious application of the magic¹ technique of Singular Value Decomposition (SVD)

• Let K = {s₁, s₂, …, s_k} be a matrix of all known sources. Using SVD, we find a basis for s‖, and therefore P‖.²

¹ Not really magic

² Let K = UΣVᵀ. By the properties of the SVD, the first k columns of U form an orthonormal basis for the column space of K, so we can define

    P‖ = UₖUₖᵀ
    P⟂ = I − P‖

since s‖ + s⟂ = P‖m + P⟂m = s.

Signal Space Projection

• Recall m(t) = Σᵢ aᵢ(t) sᵢ + n(t). To find a(t), invert s‖:

    m(t) = a(t) s‖
    a(t) = s‖⁻¹ m(t)
    a(t) = VΣ⁻¹Uᵀ m(t)        (recall that K = {s₁, s₂, …, s_k} = UΣVᵀ)

• In practice, s‖ often consists of known noise signals specific to a particular MEG scanner. The final step is simply to project those out of m(t), leaving only unknown (and presumably neural) sources in s.
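A minimal numpy sketch of that projection step (illustrative only; the random matrix K stands in for a scanner's known noise field patterns):

```python
import numpy as np

# SSP sketch: project known noise patterns out of the measurement
# using an SVD-derived orthonormal basis.
rng = np.random.default_rng(3)
n_channels, n_known, n_samples = 306, 3, 1000

K = rng.standard_normal((n_channels, n_known))   # stand-in for known noise patterns
U, S, Vt = np.linalg.svd(K, full_matrices=False)

P_par = U @ U.T                       # projector onto the known-noise subspace
P_perp = np.eye(n_channels) - P_par   # projector onto its complement

m = rng.standard_normal((n_channels, n_samples))  # stand-in for MEG data
m_clean = P_perp @ m                  # data with known noise projected out
```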

Signal Space Separation (SSS)


Signal Space Separation

• Overview: Separate MEG signal into sources (1) outside and (2) inside the MEG helmet

• Theory: Analyzing the MEG data using a basis which expresses the magnetic field as a “gradient of the harmonic scalar potential” (defined below) allows the field to be separated into internal and external components.

By simply dropping the external component, we can significantly reduce the MEG signal noise.

[Figures: the same MEG data shown raw, after SSP, and after SSS]

Signal Space Separation

• Begin with Maxwell’s laws:

    ∇ × H = J        (1)
    ∇ × B = μ₀J      (2)
    ∇ · B = 0        (3)

  where B is the magnetic field and J represents the sources.

• Note that on the surface of the sensor array, J = 0 (i.e., no sources!). As such,

    ∇ × H = 0  on the array surface

• Defining H = ∇Ψ, we obtain the identity ∇ × ∇Ψ = 0 in (1). This term (∇Ψ) is called the “scalar potential.”

  - “Scalar potential” has no physical correlate.

  - Often written with a negative sign (–∇Ψ) for convenience.

  - H = –∇Ψ → B = –μ₀∇Ψ … used interchangeably

• Substituting the scalar potential into (3), we obtain the Laplacian:

    ∇ · ∇Ψ = ∇²Ψ = 0

Taulu et al., 2005

Signal Space Separation

• Substituting the scalar potential into (3), we obtain the Laplacian:

    ∇ · B = 0  →  ∇ · ∇Ψ = ∇²Ψ = 0

• We can express the scalar potential using spherical coordinates ( Ψ(r, θ, φ) ), separate the variables ( Ψ(r, θ, φ) = R(r)Θ(θ)Φ(φ) ), and solve the resulting harmonic equation

    (1 / (r² sin θ)) [ sin θ ∂/∂r ( r² ∂Ψ/∂r ) + ∂/∂θ ( sin θ ∂Ψ/∂θ ) + (1 / sin θ) ∂²Ψ/∂φ² ] = 0

  to obtain

    B(r) = −μ₀ Σ_{l,m} αₗₘ νₗₘ(θ, φ) / r^(l+1)  −  μ₀ Σ_{l,m} βₗₘ r^l ωₗₘ(θ, φ)  ≡  Bα(r) + Bβ(r)

  with l = 0 … ∞ and m = −l … l. The first term is the internal signal; the second is the external signal.
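As a toy illustration of the separation algebra (radial magnetometers on a sphere, real spherical harmonics via SciPy; a large simplification of the full vector-harmonic treatment in Taulu et al., 2005, with all sizes and data as stand-ins):

```python
import numpy as np
from scipy.special import sph_harm

rng = np.random.default_rng(4)
n_sensors, r0, l_max = 200, 0.12, 6   # sensor radius ~12 cm, expansion order

# Random sensor directions on the upper hemisphere (helmet-like).
phi = np.arccos(rng.uniform(0, 1, n_sensors))   # polar angle
theta = rng.uniform(0, 2 * np.pi, n_sensors)    # azimuth

def radial_basis(internal):
    """Radial field of each (l, m) solid-harmonic potential at the sensors."""
    cols = []
    for l in range(1, l_max + 1):
        for m in range(-l, l + 1):
            Y = np.real(sph_harm(m, l, theta, phi))
            if internal:          # d/dr of Y / r^(l+1)  -> internal term
                cols.append(-(l + 1) * Y / r0 ** (l + 2))
            else:                 # d/dr of r^l * Y      -> external term
                cols.append(l * r0 ** (l - 1) * Y)
    return np.column_stack(cols)

S = np.hstack([radial_basis(True), radial_basis(False)])
n_in = S.shape[1] // 2

b = rng.standard_normal(n_sensors)        # stand-in measurement
x = np.linalg.lstsq(S, b, rcond=None)[0]  # fit multipole amplitudes
b_internal = S[:, :n_in] @ x[:n_in]       # keep only the internal part
```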


Temporally-extended Signal Space Separation (tSSS)

Conceptually very simple:

• Recall that the SSS algorithm ends with two signal components – Bα(r) and Bβ(r), or Bin(r) and Bout(r) – and we discard the Bout(r) component

  - Rationale: signals originating outside the MEG sensor helmet cannot be brain signal

• tSSS looks for correlations between Bout(r) and Bin(r) and projects those correlations out of Bin(r)

  - Rationale: any internal signal correlated with the external noise component must represent noise that leaked into the Bin(r) component
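A rough sketch of that idea (a simplification, not the published algorithm: intersect the temporal row-spaces of the two reconstructions via SVD and project the shared part out of the internal data; tsss_clean and the stand-in arrays are hypothetical):

```python
import numpy as np

def tsss_clean(x_in, x_out, corr_limit=0.98):
    """Simplified tSSS step: remove temporal components of x_in that
    are strongly correlated with the external reconstruction x_out."""
    # Orthonormal temporal bases (right singular vectors).
    _, _, Vt_in = np.linalg.svd(x_in, full_matrices=False)
    _, _, Vt_out = np.linalg.svd(x_out, full_matrices=False)

    # Correlations between the two temporal subspaces.
    U, s, Wt = np.linalg.svd(Vt_in @ Vt_out.T, full_matrices=False)
    common = (U.T @ Vt_in)[s > corr_limit]   # correlated temporal directions

    # Project the shared (noise-leak) time courses out of x_in.
    return x_in - (x_in @ common.T) @ common

# Hypothetical usage with stand-in data from the SSS step:
rng = np.random.default_rng(5)
x_in = rng.standard_normal((80, 1000))    # internal reconstruction
x_out = rng.standard_normal((15, 1000))   # external reconstruction
cleaned = tsss_clean(x_in, x_out)
```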

Temporally-extended Signal Space Separation

• From the original article: [figures]

• Without tSSS: [figure]

• With tSSS: [figure]

Source Separation Algorithms


Principal Component Analysis (PCA)


• Ordinary Least Squares (OLS) regression of X to Y [plot]

(The following five plots are from http://stats.stackexchange.com/a/2700/2019)

• Ordinary Least Squares (OLS) regression of Y to X [plot]

• The regression lines are different! [plot]

• PCA minimizes the error orthogonal to the model line [plot]

(Yes, this is a different dataset)

• The “most accurate” regression line for the data [plot]

(Yes, this is another different dataset)

Principal Component Analysis

PCA – Formal Definition

http://stat.ethz.ch/~maathuis/teaching/fall08/Notes3.pdf
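Since the formal slides aren't reproduced here, a compact numpy sketch of the standard construction (center the data, eigendecompose the covariance, project; the data matrix is a stand-in):

```python
import numpy as np

# PCA sketch: principal components are the eigenvectors of the
# data covariance, ordered by explained variance.
rng = np.random.default_rng(6)
X = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10))  # samples x features

Xc = X - X.mean(axis=0)                 # center each feature
C = (Xc.T @ Xc) / (len(Xc) - 1)         # sample covariance
eigvals, eigvecs = np.linalg.eigh(C)    # ascending order for symmetric C

order = np.argsort(eigvals)[::-1]       # sort descending by variance
components = eigvecs[:, order]
scores = Xc @ components                # data expressed in PC coordinates
explained = eigvals[order] / eigvals.sum()
```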

PCA shortcomings

• Will only detect orthogonal signals

• Cannot detect polymodal distributions

Appl. Environ. Microbiol., May 2007, vol. 73, no. 9, 2878-2890
“A Tutorial on Principal Component Analysis”, Jonathon Shlens, April 2009

Independent Component Analysis (ICA)

Independent Component Analysis

• Assumptions: Each signal is…

  1. Statistically independent

  2. Non-gaussian

• Recall the Central Limit Theorem: “Given independent random variables x + y = z, z is more gaussian than x or y.”

• Theory: We can find S by iteratively identifying and extracting the most independent and non-gaussian components of X
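A quick numeric illustration of that CLT intuition (my example, not the talk's): the excess kurtosis of a sum of uniforms moves toward 0, the Gaussian value.

```python
import numpy as np
from scipy.stats import kurtosis

# Sum of independent non-gaussian variables is "more gaussian":
# excess kurtosis moves toward 0 (the Gaussian value).
rng = np.random.default_rng(7)
x = rng.uniform(-1, 1, 100_000)   # uniform: excess kurtosis ~ -1.2
y = rng.uniform(-1, 1, 100_000)
z = x + y                         # triangular: excess kurtosis ~ -0.6

print(kurtosis(x), kurtosis(y), kurtosis(z))
```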

ICA in FieldTrip package


ICA – Mixing matrix

[Figure: two source signals s₁ and s₂, and the two mixed sensor signals x₁ and x₂]

    x₁ = a₁₁s₁ + a₁₂s₂
    x₂ = a₂₁s₁ + a₂₂s₂
    ⟺  x = As

Goal: Separate s₁ and s₂ using information from x₁ and x₂
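A two-source toy version of that setup (stand-in sine and square-wave sources, a random mixing matrix A):

```python
import numpy as np

# Toy mixing demo: two sources, two sensors, x = A s.
rng = np.random.default_rng(8)
t = np.linspace(0, 1, 1000)
s = np.vstack([np.sin(2 * np.pi * 5 * t),             # source 1: sine
               np.sign(np.sin(2 * np.pi * 3 * t))])   # source 2: square wave

A = rng.standard_normal((2, 2))   # unknown mixing matrix
x = A @ s                         # what the sensors actually record

# If we knew A, unmixing would be trivial; ICA's job is to
# recover something like this without knowing A:
s_hat = np.linalg.inv(A) @ x
```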

Independent Component Analysis

• Consider the general mixing equation:

    x₁ = a₁₁s₁ + … + a₁ₙsₙ
        ⋮
    xₙ = aₙ₁s₁ + … + aₙₙsₙ
    ⟺  x = As

  where the xᵢ are the sensors, the sᵢ are the sources, and A is the mixing matrix.

• If we could find one of the rows of A⁻¹ (let’s call that vector w), we could reconstruct a row of s. Mathematically:

    wᵀx = Σᵢ wᵢxᵢ = y

  where w is some row from A⁻¹, and y is one of the ICs (independent components) that make up S.

Independent Component Analysis

• Working through the math… let

    z = Aᵀw

  (recall that x = As, that w is some row from A⁻¹, and that wᵀx = Σᵢ wᵢxᵢ = y)

• So,

    y = wᵀx = wᵀAs = zᵀs

• y (an IC) is a linear combination of s, with weights zᵀ.

• Recall the Central Limit Theorem: “Given independent random variables x + y = z, z is more gaussian than x or y.”

  zᵀs is therefore more gaussian than any of the sᵢ, and is least gaussian when equal to one of the sᵢ.

• We want to take wᵀ as a vector that maximizes the nongaussianity of wᵀx, ensuring that wᵀx = zᵀs is one of the ICs.

Independent Component Analysis

• How can we find wᵀ so as to maximize the nongaussianity of wᵀx?

• Numerous methods:

  - Kurtosis

  - Negentropy

  - Approximations of negentropy

• Once found, proceed much as in PCA: find the best wᵀ, remove that component, find the next best wᵀ, and repeat until no more sensors are available (see the sketch below).
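A bare-bones deflation sketch using kurtosis as the nongaussianity measure (plain gradient ascent, assuming pre-whitened data; real toolboxes such as FastICA use faster fixed-point updates and negentropy approximations):

```python
import numpy as np

def ica_deflation(x, n_iter=2000, lr=0.01, seed=0):
    """Toy ICA by kurtosis maximization with deflation.
    x: (channels, samples), assumed already centered and whitened."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    W = []
    for _ in range(n):
        w = rng.standard_normal(n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ x
            # Gradient of E[y^4] with respect to w is 4 E[y^3 x].
            grad = 4 * (y ** 3) @ x.T / x.shape[1]
            # Step toward larger |kurtosis| (kurt = E[y^4] - 3 when whitened).
            w = w + lr * np.sign(np.mean(y ** 4) - 3) * grad
            # Deflate: stay orthogonal to already-found components.
            for v in W:
                w -= (w @ v) * v
            w /= np.linalg.norm(w)
        W.append(w)
    return np.array(W)   # rows estimate rows of A^-1 (up to order/sign)
```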

ICA in Fieldtrip (2)

[Figures: ICA artifact identification examples from Mantini, Franciotti, Romani, & Pizzella (2007)]

ICA – Method Comparison

Zavala-Fernández, Sander, Burghoff, Orglmeister, & Trahms (2006)


Summary

• Examine your data in as many ways as possible

• Use SSS & tSSS to clean the data as thoroughly as possible

• Use ICA to find specific artifacts

• Always check your data!

Questions?