Magnetoencephalography Preprocessing and Noise Reduction Techniques
Eliezer Kanal
2/20/2012, MEG Basics Course
1
About Me
• 2005 - 2009: University of Pittsburgh, PhD in Bioengineering
• 2009 - 2011: Carnegie Mellon University, Postdoctoral fellow, CNBC
• 2011 - current: PNC Financial Services, Quantitative Analyst, Risk Analytics
2
Dealing with Noisy Data
• Overview of MEG Noise
• Noise Reduction
- Averaging, thresholding, frequency filters
- SSP
- SSS/tSSS
• Source Extraction
- PCA
- ICA
3
MEG Noise
4
Breathing
5
Breathing
6
Frequency
7
Frequency
8
Time-Frequency
9
Vigário, Jousmäki, Hämäläinen, Hari, & Oja (1997)
Biological Noise
10
Line Noise
Subject
Empty Room
50 Hz Line Noise (60 Hz in USA)
11
Bad Channels
Find the bad one:
12
Noise from nearby construction
13
Noise Reduction Techniques
• Averaging, thresholding, frequency filters
• SSP
• SSS/tSSS
14
Averaging
• Removes non-timelocked noise
• Requires:
- Time-locked block paradigm design
- Temporal or low-frequency analyses
15
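As a toy illustration (not from the slides), averaging N trials shrinks non-time-locked noise by roughly 1/√N while leaving the time-locked response intact:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples = 100, 500
evoked = np.sin(np.linspace(0, 2 * np.pi, n_samples))    # time-locked response
# each trial = response + independent (non-time-locked) noise
trials = evoked + rng.standard_normal((n_trials, n_samples))

avg = trials.mean(axis=0)
resid_single = np.std(trials[0] - evoked)   # noise level in one trial (~1.0)
resid_avg = np.std(avg - evoked)            # noise level after averaging (~0.1)
```

With 100 trials the residual noise drops by about a factor of 10, which is why time-locked block paradigms with many repetitions are required.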
Thresholding
• Discarding trials/channels with maximum signal intensity greater than some user-defined value
• Removes most “data blips”
• Rudimentary; the better technique is to simply examine each trial/channel
16
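A minimal sketch of threshold-based rejection (illustrative only; the threshold value and data here are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
trials = rng.standard_normal((20, 300))   # 20 trials x 300 samples
trials[3] += 8.0                          # a trial-long artifact
trials[11, 150] = 25.0                    # a single-sample "data blip"

threshold = 6.0
# keep trials whose peak absolute amplitude stays below the threshold
keep = np.abs(trials).max(axis=1) < threshold
clean = trials[keep]
```

As the slide notes, this is rudimentary: it cannot distinguish a blink from a genuinely large response, so inspecting each rejected trial remains the better practice.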
Frequency Filter
• Very good first step: remove data you won’t analyze (don’t waste time cleaning what you won’t examine)
• Use more advanced techniques for specific noise signals
Filter      Removes…
High-pass   Lower frequencies
Low-pass    Higher frequencies
Band-pass   Frequencies outside the specified band
Notch       A narrow specified band (e.g., line noise)
17
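The filter table above can be sketched with SciPy (this example is not from the slides; `scipy.signal.butter`/`filtfilt` stand in for whatever filtering your analysis package provides):

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 500.0                        # sampling rate, Hz
t = np.arange(0, 4, 1 / fs)
# 10 Hz "brain" rhythm plus 60 Hz line noise
sig = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

# 4th-order Butterworth band-pass, 1-40 Hz; filtfilt applies the filter
# forwards and backwards for zero phase shift
nyq = fs / 2
b, a = butter(4, [1 / nyq, 40 / nyq], btype="band")
clean = filtfilt(b, a, sig)
```

Zero-phase (forward-backward) filtering matters here: a one-pass filter would shift component latencies, which is fatal for temporal analyses.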
Signal Space Projection
20
Signal Space Projection
• Overview: SSP uses the difference between source orientations and locations to differentiate distinct sources.
• Theory: Since the field pattern from a single source is
1) unique
2) time-invariant,
we can differentiate sources by examining the angle between their “signal space representations”, and project noise signals out of the dataset.
21
Signal Space Projection
• In general,

  m(t) = \sum_{i=1}^{M} a_i(t) \, s_i + n(t)

  where m(t) is the measured signal, a_i(t) the source amplitude, s_i source i, n(t) noise, and M the total number of channels.
• SSP states that s can be split in two:
  - s_\parallel = signals from known sources
  - s_\perp = signals from unknown sources

  s_\parallel = P_\parallel m
  s_\perp = P_\perp m

  where m is the MEG signal and P_\parallel, P_\perp are projection operators. Worth mentioning that s_\parallel + s_\perp = s.
24
Signal Space Projection
How do we find P_\parallel and P_\perp?
• Ingenious application of the magic¹ technique of Singular Value Decomposition (SVD)
• Let K = \{s_1, s_2, \ldots, s_k\}, a matrix of all known sources. Using SVD, we find a basis for s_\parallel, and therefore P_\parallel.²

¹ Not really magic
² Let K = U \Lambda V^T. By the properties of the SVD, the first k columns of U form an orthonormal basis for the column space of K, so we can define

  P_\parallel = U_k U_k^T
  P_\perp = I - P_\parallel

  since s_\parallel + s_\perp = P_\parallel m + P_\perp m = s.
25
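The footnote’s construction of P_\parallel and P_\perp is easy to verify numerically. Below is a toy numpy sketch (not from the slides; the channel count and field patterns are invented) for a single known noise source:

```python
import numpy as np

rng = np.random.default_rng(0)
n_ch = 10
noise_pattern = rng.standard_normal(n_ch)   # known noise field pattern (one s_i)
brain_pattern = rng.standard_normal(n_ch)   # an unknown (brain) source

K = noise_pattern[:, None]                  # matrix of known sources (one column)
U, _, _ = np.linalg.svd(K, full_matrices=False)
P_par = U @ U.T                             # projector onto the known-source subspace
P_perp = np.eye(n_ch) - P_par               # projector onto its complement

m = 2.0 * brain_pattern + 5.0 * noise_pattern   # measured signal
m_clean = P_perp @ m                            # noise pattern projected out
```

`m_clean` contains no component along the known noise pattern; the price is that any part of the brain signal lying along that direction is removed with it.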
Signal Space Projection
• Recall m(t) = \sum_{i=1}^{M} a_i(t) \, s_i + n(t). To find a(t), invert s_\parallel:

  m(t) = a(t) \, s_\parallel
  a(t) = s_\parallel^{-1} m(t)

  Recalling that K = U \Lambda V^T, this gives a = V \Lambda^{-1} U^T m(t).
• In practice, s_\parallel often consists of known noise signals specific to a particular MEG scanner. The final step is simply to project those out of m(t), leaving only unknown (and presumably neural) sources in s.
26
Signal Space Separation (SSS)
27
Signal Space Separation
• Overview: Separate MEG signal into sources (1) outside and (2) inside the MEG helmet
• Theory: Analyzing the MEG data using a basis which expresses the magnetic field as a “gradient of the harmonic scalar potential” (defined below) allows the field to be separated into internal and external components.
By simply dropping the external component, we can significantly reduce the MEG signal noise.
28
MEG data – raw
29
MEG data – SSP
30
MEG data – SSS
31
Signal Space Separation
• Begin with Maxwell’s laws:

  \nabla \times H = J \quad (1)
  \nabla \times B = \mu_0 J \quad (2)
  \nabla \cdot B = 0 \quad (3)

  where J denotes the sources and B the magnetic field.
• Note that on the surface of the sensor array, J = 0 (i.e., no sources!). As such,

  \nabla \times H = 0 on the array surface

• Defining H = \nabla\Psi, we obtain the identity \nabla \times \nabla\Psi = 0 in (1). This term (\nabla\Psi) is called the “scalar potential.”
  • “Scalar potential” has no physical correlate.
  • Often written with a negative sign (-\nabla\Psi) for convenience.
  • H = -\nabla\Psi \rightarrow B = -\mu_0\nabla\Psi … used interchangeably
• Substituting the scalar potential into (3) we obtain the Laplacian:

  \nabla \cdot \nabla\Psi = \nabla^2\Psi = 0

Taulu et al, 2005
32
Signal Space Separation
• Substituting the scalar potential into (3), \nabla \cdot B = 0, we obtain the Laplacian:

  \nabla \cdot \nabla\Psi = \nabla^2\Psi = 0

• We can express the scalar potential using spherical coordinates ( \Psi(\phi, \theta, r) ), separate the variables ( \Psi(\phi, \theta, r) = \Phi(\phi)\Theta(\theta)R(r) ), writing the Laplacian as

  \nabla^2\Psi = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2 \frac{\partial \Psi}{\partial r}\right) + \frac{1}{r^2 \sin\theta}\frac{\partial}{\partial \theta}\!\left(\sin\theta\, \frac{\partial \Psi}{\partial \theta}\right) + \frac{1}{r^2 \sin^2\theta}\frac{\partial^2 \Psi}{\partial \phi^2} = 0,

  and solve the harmonic to obtain

  B(r) = -\mu_0 \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \alpha_{lm}\, \frac{\nu_{lm}(\theta, \phi)}{r^{l+1}} \;-\; \mu_0 \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \beta_{lm}\, r^{l}\, \omega_{lm}(\theta, \phi) \;\equiv\; B_\alpha(r) + B_\beta(r)

  where B_\alpha(r) is the internal signal and B_\beta(r) the external signal.
33
Signal Space Separation
34
Temporally-extended Signal Space Separation (tSSS)
35
Temporally-extended Signal Space Separation
Conceptually very simple:
• Recall that the SSS algorithm ends with two signal components – Bα(r) and Bβ(r), or Bin(r) and Bout(r) – and we discard the Bout(r) component
- Rationale: signals originating outside MEG sensor helmet cannot be brain signal
• tSSS looks for correlations between Bout(r) and Bin(r) and projects those correlations out of Bin(r)
- Rationale: Any internal signal correlated with the external noise component must represent noise that leaked into the Bin(r) component
36
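The correlation-projection idea can be shown in one dimension with a deliberately simplified numpy sketch (not from the slides; real tSSS correlates the temporal subspaces of the multichannel B_in(r) and B_out(r), not single channels):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
noise = rng.standard_normal(n)                  # external noise waveform
brain = np.sin(np.linspace(0, 20 * np.pi, n))   # internal signal of interest
b_out = noise                                   # SSS external component
b_in = brain + 0.3 * noise                      # internal component with leaked noise

# least-squares estimate of the leak, then project it out of b_in
leak = (b_in @ b_out) / (b_out @ b_out)
b_clean = b_in - leak * b_out
```

After the projection, `b_clean` is uncorrelated with the external component while the brain signal is essentially untouched, which is exactly the rationale on the slide above.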
Temporally-extended Signal Space Separation
• From the original article:
37
Temporally-extended Signal Space Separation
• From the original article:
38
Temporally-extended Signal Space Separation
• Without tSSS:
39
Temporally-extended Signal Space Separation
• With tSSS:
40
Source Separation Algorithms
41
Principal Component Analysis (PCA)
42
• Ordinary Least Squares (OLS) regression of X to Y
Following five plots from http://stats.stackexchange.com/a/2700/2019
43
• Ordinary Least Squares (OLS) regression of Y to X
44
• Regression lines are different!
45
• PCA minimizes error orthogonal to the model line
(Yes, this is a different dataset)
46
• “Most accurate” regression line for the data
(Yes, this is another different dataset)
Principal Component Analysis
47
PCA – Formal Definition
48
PCA – Formal Definition
http://stat.ethz.ch/~maathuis/teaching/fall08/Notes3.pdf
49
PCA shortcomings
• Will only detect orthogonal signals
• Cannot detect polymodal distributions
Appl. Environ. Microbiol. May 2007 vol. 73 no. 9 2878-2890
“A Tutorial on Principal Component Analysis”, Jonathon Shlens, April 2009
50
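For reference, the formal definition two slides back amounts to an eigendecomposition of the data covariance. A minimal numpy sketch (data and variable names invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# correlated 2-D data: variance is largest along the (1, 1) direction
latent = rng.standard_normal(n)
X = np.c_[latent + 0.1 * rng.standard_normal(n),
          latent + 0.1 * rng.standard_normal(n)]

Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(Xc.T @ Xc / n)   # eigendecomposition of covariance
order = np.argsort(evals)[::-1]                # strongest component first
components = evecs[:, order]
scores = Xc @ components                       # data in principal-component space
```

The shortcomings above apply directly to this construction: the eigenvectors are forced to be orthogonal, whatever the data’s true source geometry.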
Independent Component Analysis (ICA)
51
Independent Component Analysis
• Assumptions: Each signal is…
1. Statistically independent
2. Non-gaussian
• Recall the Central Limit Theorem:
  “Given independent random variables x + y = z, z is more gaussian than x or y.”
• Theory: We can find S by iteratively identifying and extracting the most independent and non-gaussian components of X
52
ICA in FieldTrip package
53
ICA – Mixing matrix
• Two sources s_1, s_2 reach two sensors x_1, x_2:

  x_1 = a_{11} s_1 + a_{12} s_2
  x_2 = a_{21} s_1 + a_{22} s_2
  \Longleftrightarrow \; x = As

• Goal: Separate s_1 and s_2 using information from x_1 and x_2
54
Independent Component Analysis
• Consider the general mixing equation:

  x_1 = a_{11} s_1 + \ldots + a_{1n} s_n
  \vdots
  x_n = a_{n1} s_1 + \ldots + a_{nn} s_n
  \Longleftrightarrow \; x = As

  where x are the sensors, s the sources, and A the mixing matrix.
• If we could find one of the rows of A^{-1} (let’s call that vector w), we could reconstruct a row of s. Mathematically:

  w^T x = \sum_i w_i x_i = y

  where w is some row from A^{-1} and y is one of the ICs (independent components) that make up S.
55
Independent Component Analysis
• Working through the math… recall x = As (with mixing matrix A) and w^T x = \sum_i w_i x_i = y (with w some row from A^{-1}), and let z = A^T w.
• So,

  y = w^T x = w^T A s = z^T s

  i.e., y (an IC) is a linear combination of s, with weights z^T.
• Recall the Central Limit Theorem: “Given independent random variables x + y = z, z is more gaussian than x or y.”
  Thus y = z^T s is more gaussian than any of the s_i, and is least gaussian when equal to one of the s_i.
• We therefore take w as the vector that maximizes the nongaussianity of w^T x, ensuring that w^T x = z^T s equals one of the ICs.
56
Independent Component Analysis
• How can we find wT so as to maximize the nongaussianity of wTx?
• Numerous methods:
- Kurtosis
- Negentropy
- Approximations of Negentropy
• Once found, proceed as in PCA: find the best wᵀ, remove that component, find the next best wᵀ, remove, and repeat until no more components remain.
57
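As a toy illustration of the kurtosis approach (this is not any particular package’s algorithm; a real implementation like FastICA uses fixed-point iteration rather than this grid search), note that for two whitened mixtures the sources are exactly one rotation away, so we can search for the angle that maximizes total nongaussianity:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
s1 = np.sign(np.sin(np.linspace(0, 40 * np.pi, n)))  # square wave (sub-gaussian)
s2 = rng.laplace(size=n)                             # Laplacian noise (super-gaussian)
S = np.vstack([s1, s2])
A = np.array([[1.0, 0.6], [0.5, 1.0]])               # mixing matrix
X = A @ S                                            # x = As

# whiten: after whitening, the sources are one rotation away
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / n)
Z = np.diag(d ** -0.5) @ E.T @ Xc

def excess_kurtosis(y):
    return np.mean(y ** 4) - 3 * np.mean(y ** 2) ** 2

def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]]) @ Z

# grid-search the rotation angle maximizing total |excess kurtosis|
best = max(np.linspace(0, np.pi / 2, 181),
           key=lambda th: sum(abs(excess_kurtosis(y)) for y in rotate(th)))
Y = rotate(best)   # recovered ICs, up to order/sign/scale
```

The recovered rows of `Y` match the original sources up to permutation, sign, and scale, which is the inherent ambiguity of ICA.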
ICA in Fieldtrip (2)
58
Mantini, Franciotti, Romani, & Pizzella (2007)
59
Mantini, Franciotti, Romani, & Pizzella (2007)
60
Mantini, Franciotti, Romani, & Pizzella (2007)
61
ICA – Method Comparison
Zavala-Fernández, Sander, Burghoff, Orglmeister, & Trahms (2006)
62
Summary
• Examine your data in as many ways as possible
• Use SSS & tSSS for the most effective overall cleaning
• Use ICA to find specific artifacts
• Always check your data!
63
Questions?
64