signal processing for diagnosis and treatment of …introduction preview diagnosis of ad treatment...
TRANSCRIPT
Signal Processing forDiagnosis and Treatment of Brain Disorders
Justin Dauwels
www.dauwels.com
Laboratory for Information and Decision Systems (LIDS)Massachusetts Institute of Technology (MIT)
March 18, 2009
“With all the tools available to modern medicine—the blood tests,M.R.I.’s and endoscopes—you might think that misdiagnosis hasbecome a rare thing. But you would be wrong.”
Why doctors so often get it wrong, David Leonhardt, New York Times, Feb. 22, 2006
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Proposed: Signal processing aided decision making
medical/imaging data
treatment
diagnosis
feature extraction decision making
3 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Proposed: Signal processing aided decision making
medical/imaging data
treatment
diagnosis
feature extraction decision making
• Signal processing “microscope”discovery of features/patterns in data associated withcertain disorders
• Algorithmic decision makingcombine evidence “optimally” → diagnosis and treatment
• More reliable and earlier diagnosis
• More effective treatment (clinical outcome, time and cost)
4 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Synchrony of brain signals is an important feature
• Loss of synchrony of brain signals → brain disorder
• Normal synchrony of brain signals
• Increased synchrony of brain signals → brain disorder
Often subtle effects, requires signal processing
5 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Preview: Topics
Signal Processing Aided Diagnosis
Loss in EEG synchrony → diagnosis of early-Alzheimer’s
Signal Processing Aided Treatment
Increased EEG synchrony → localization of epileptic brain tissue
Future Work
How do we plan to improve our methods?
Electroencephalogram (EEG)= electrical activity along the scalp or brain
6 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Alzheimer’s disease is complex, devastating, and common
• Symptoms: memory loss, language breakdown, loss of motorcontrol, apathy, . . .
• Causes: not well understood, complex molecular mechanisms
• 2-5% of people over 65 years old
7 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Our goal: early diagnosis of AD using EEG
Early diagnosis of AD is important
• medication and other therapies most useful when used early
More precisely: diagnosis of MCI and mild AD
• predementia (a.k.a. mild cognitive impairment, MCI)
• mild–moderate–severe AD
8 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Our goal: early diagnosis of AD using EEG
Standard approach
mental testing, MR, CT, SPECT, PET, corticospinal fluid
Scalp EEG is attractive complementary technology
• simple, affordable, mobile
• useful for screening
signs of AD?scalp EEG MR, CT, . . .Yes
9 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
EEG signals of AD patients are less coherent
• Loss of neurons → weaker brain connectivity
Healthy subject AD patient
• EEG from different brain areas is less correlated in AD patients
• Loss in EEG synchrony → loss in brain connectivity → AD
• How to detect loss in EEG synchrony?
10 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
How to measure EEG synchrony?
Classical techniques
• correlation coefficient, coherence
• phase synchrony
• Granger causality
• information-theoretic measures
• . . .
Few of them applied on single data sets of early-AD patients
11 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
How to measure EEG synchrony?
Classical techniques
• correlation coefficient, coherence
• phase synchrony
• Granger causality
• information-theoretic measures
• . . .
Few of them applied on single data sets of early-AD patients
Systematic study
30+ measures on two data sets (and more to follow)
12 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
How to measure EEG synchrony?
New approach: “Stochastic Event Synchrony (SES)”
• Extracts “events” from EEG signals and aligns those events
• Provides complementary information about synchrony
• Promising results for diagnosing AD
• General formalism for similarity of point processes:• pairs of point processes and n > 2 point processes• one-dimensional and multi-dimensional point processes• time-varying similarity
In this talk: pairs of two-dimensional point processes (EEG)
13 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Local synchronous activity leads to “events” in EEG
Non-synchronized activity
EEG
LFP 1LFP 2LFP 3LFP 4LFP 5LFP 6
Synchronized activity
EEG
LFP 1LFP 2LFP 3LFP 4LFP 5LFP 6
Neuroscience: Exploring the Brain, M. Bear et al.14 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Events can clearly be observed in time-frequency map
f [Hz]
t [s]Fourier power
amplitude
time-frequency events“bumps”
time-frequency transform |X(t,f)|2
wavelet transform
EEG x(t)
EEG X(f)
15 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Oscillatory events extracted from time-frequency map
10.000 coefficients → 50 parameters
• Bump models are point processes on time-frequency plane
• 5 bump parameters:timing, frequency, width in time and frequency, amplitude
• Lossy compression: “background” activity neglected
F. Vialatte et al., Neural Networks 200716 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
We determine EEG synchrony from bump models
1 Group electrodes in regions
2 Extract bump model for each region
3 Determine similarity of each pair of bump models
4 Average similarity over all pairs
Misalignment of bump models → loss of brain connectivity → AD
17 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Stochastic Event Synchrony
Bump models of two EEG signals Coincident bumps
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]1 How many bumps coincide?
2 How well do the coincident bumps align?
18 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Stochastic Event Synchrony (ρ, δt , δf , st , sf )
Bump models of two EEG signals Coincident bumps
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
1 How many bumps coincide?
• fraction of non-coincident bumps ρ
2 How well do the coincident bumps align?• average t and f offset of coincident bumps (δt and δf )• t and f jitter of coincident bumps (st and sf )
19 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Stochastic Event Synchrony (ρ, δt , δf , st , sf )
Bump models of two EEG signals Coincident bumps
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
1 How many bumps coincide?
• fraction of non-coincident bumps ρ
2 How well do the coincident bumps align?• average t and f offset of coincident bumps (δt and δf )• t and f jitter of coincident bumps (st and sf )
It is natural to iterate Step 1 and 2
20 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
How can we derive such iterative procedure?
Wish list
• Should give the “best” alignment
• Should converge
• Should be easily extendable
21 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
SES parameters computed by statistical inference
Maximum a posteriori estimation
(c∗, θ∗) = argmaxc,θ p(e, e′, c; θ)
• e and e′ are the given bump models
• ckk′= 1 if (ek ,e′k′ ) coincident; ckk′= 0 otherwise
• θ = (δt , δf , st , sf ), ρ follows from c
Questions
• What is the model p?
• How do carry out MAP estimation practically?
22 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Statistical model p relates two bump models
Two given bump models are generated from “mother” model
f
tmin tmax
t
fmin
fmax
1 “Mother” bumps uniformly at random within rectangle
23 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Statistical model p relates two bump models
Two given bump models are generated from “mother” model
eke′k′
f
tmin tmax
(− δt
2 ,− δf
2 )
( δt
2 , δf
2 )
t
fmin
fmax
1 “Mother” bumps uniformly at random within rectangle
2 Generate two copies of each “mother” bump and perturbbump positions randomly:Gaussian RVs with mean (±δt/2,±δf /2) and var (st/2, sf /2)
24 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Statistical model p relates two bump models
Two given bump models are generated from “mother” model
eke′k′
f
tmin tmax
(− δt
2 ,− δf
2 )
( δt
2 , δf
2 )
t
fmin
fmax
1 “Mother” bumps uniformly at random within rectangle
2 Generate two copies of each “mother” bump and perturbbump positions randomly:Gaussian RVs with mean (±δt/2,±δf /2) and var (st/2, sf /2)
3 Delete bumps at random → non-coincident bumps (ρ)
25 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Intermediate summary
Problem
Given two bump models e and e′, infer ρ and θ = (δt , δf , st , sf )
Approach: Maximum a posteriori estimation
(c∗, θ∗) = argmaxc,θp(e, e′, c; θ)
26 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
We alternate bump alignment and parameter estimation
Problem
Given two bump models e and e′, infer ρ and θ = (δt , δf , st , sf )
Approach: Maximum a posteriori estimation
(c∗, θ∗) = argmaxc,θp(e, e′, c; θ)
Iterative algorithm: coordinate descent
c(k+1) = argmaxcp(e, e′, c; θ(k)) Bump Alignment
θ(k+1) = argmaxθp(e, e′, c(k+1); θ) Parameter Estimation
• Alignment is equivalent to max-weight bipartite matching
• Can be solved efficiently (belongs to P)
• Overall algorithm is guaranteed to converge
• Easily extendable 27 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
We applied SES to EEG data
2 data sets, different patients/hospitals
• 22 MCI and 38 Control subjects
• 17 Mild AD and 24 Control subjects
Setup
• spontaneous EEG, in rest with eyes closed
• 21 electrodes (10-20 international system)
• electrodes grouped in 5 zones
• bandpass filtered between 4 and 30Hz
• 20s of artifact-free EEG
• 30+ synchrony measures, including SES
28 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Promising and consistent numerical results
MCI vs. Control Mild AD vs. Control
0.045 0.05 0.055 0.060.2
0.3
0.4
0.5
0.6
0.7
MCICTR
ρ
ffDTF0.04 0.05 0.06 0.07 0.08
0.4
0.45
0.5
MiADCTR
ρ
ffDTF
• Only full-frequency direct transfer function (ffDTF) and ρstrongly significant (p < 0.001), for BOTH data sets
• MCI vs. Control harder than Mild AD vs. Control
• EEG synchrony loss seems to be strong indicator of early AD
• SES improves diagnosis of AD (for those two data sets)
29 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Want to know more? Want to try it out?
http://www.dauwels.com/SESToolbox/SES.html
30 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Preview: Topics
Signal Processing Aided Diagnosis
Loss in EEG synchrony → diagnosis of early-Alzheimer’s
We needed to develop novel measures of synchrony
Signal Processing Aided Treatment
Increased EEG synchrony → localization of epileptic brain tissue
Existing synchrony measures proved to be effective
Future Work
How do we plan to improve our methods?
31 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Epilepsy is common, complex, and devastating
• 2.5 million patients in U.S., 300’000 in U.K., 50 million world
• Heterogeneous group of central nervous system disorderscharacterized by recurrent unprovoked seizures
• Symptoms: disturbance of consciousness or awareness,alterations of bodily movement, sensation or posture, . . .
• Causes: genetic/metabolic abnormalities, tumors, . . .
32 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Medication is usual treatment but not always effective
• Medication is most common treatment
• However, medication alone not effective for 30% of patients
33 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
If medication alone fails, surgery may be solution
Resection if medication alone fails
• Resection may be alternative for location-related epilepsy
• Objective: remove seizure onset zone
• How to determine the seizure onset zone?• from clinical behavior (e.g., left arm shaking) or scalp EEG• from intracranial EEG, if everything else fails
risky, costly, uncomfortable
34 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Standard procedure relies mostly on “seizure” EEG
5 days or more
rest EEG seizure EEG
35 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Our approach: exploit rest EEG to shorten hospitalization
5 days or more
rest EEG seizure EEG
We wish to make intracranial recordings as short as possible
10min – 1hour
rest EEG
ObjectiveSeizure onset zone from rest EEG only → shorter hospitalization
HypothesisSeizure onset zone is characterized by hypersynchronous activityeven while at rest
36 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Differences with diagnosis of AD project
• Intracanial EEG vs. scalp EEG
• Signal processing to guide surgery vs. diagnosis
• Increased EEG synchrony vs. EEG synchrony loss
37 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Three main questions
HypothesisSeizure onset zone is characterized by hypersynchronous activityeven while at rest
1 Does hypersynchronous activity occur in rest EEG?
2 Do hypersynchronous areas correlate with seizure onset zone?
3 How to determine seizure onset zone using hypersynchrony?
38 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Existing literature
• Most existing studies: scalp EEG
• Few studies: intracranial grid electrodes, no depth electrodes
• Few synchrony measures
39 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Our approach
• Most existing studies: scalp EEG
• Few studies: intracranial grid electrodes, no depth electrodes
• Few synchrony measures
• 4 patients with grid electrodes, 2 with depth electrodes
• multiple synchrony measures
• 1 hour segment of intracranial rest EEG
• 48 hours between segment and seizures
• bandpass filtered between 4-30Hz, no further preprocessing
40 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Some brain areas are consistently hypersynchronous
Strong local correlation in anterior temporal lobe
Patient 1 correlation coeff c histogram
Variability over time is small (less than 15%)
time [min]
ant
tem
ppos
tte
mp
41 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Some brain areas are consistently hypersynchronous
Strong local correlation in anterior temporal lobe
Patient 1 correlation coeff c histogram
Very similar results for other synchrony measures
Phase Synchrony Coherence Granger causality (DTF)
42 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Hypersynchrony correlates strongly with seizure onset zone
Strong local correlation in anterior temporal lobe
Patient 1 correlation coeff c histogram
Very similar results for other synchrony measures
Phase Synchrony Coherence Granger causality (DTF)
43 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Same phenomenon also occurs in other patients
Patient 2 (left hemisphere) correlation coefficient
Patient 3 (right hemisphere)
Patient 4 (right hemisphere) (right - left)
44 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Is this a counterexample?
Patient 5 (left hemisphere) correlation coefficient
Considered as “counter example” in earlier studies
45 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Depth electrodes provide crucial information . . .
Patient 5 (left hemisphere) correlation coefficient
Considered as “counter example” in earlier studies
46 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Depth electrode reveals actual seizure onset zone
seizure propagation
seizure onset
hypersynchrony
47 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Three main questions
1 Does hypersynchronous activity occur in rest EEG?
Yes! (in the 6 patients analyzed so far)
2 Do hypersynchronous areas correlate with seizure onset zone?
Yes! (in the 6 patients analyzed so far)
3 How to determine seizure onset zone using hypersynchrony?
Statistical decision making, using a graphical modelWork in progress . . .
Signal processing of rest EEG has high potential→ drastically shorter hospitalization
48 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Topics
Signal Processing Aided Diagnosis
Loss in EEG synchrony → diagnosis of early-Alzheimer’s
We needed to develop novel measures of synchrony
Signal Processing Aided Treatment
Increased EEG synchrony → localization of epileptic brain tissue
Existing synchrony measures proved to be effective
Future Work
How do we plan to improve our methods?
49 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Future work: Early diagnosis of AD
Analysis of additional EEG data
Signal Processing
• Extensions of SES
• Alternative approaches• spectral properties• source reconstruction• non-negative tensor decomposition
50 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Future work: Treatment of epilepsy
• Additional features• high-frequency ripples (80–300Hz) [Worrell et al., Staba et al.]
• slowing effect
• Additional data• Scalp EEG, MEG, SPECT, PET, CT, MR, DTI• Clinical behavior
• Fundamentals: seizure genesis/hypersynchrony/resection• biophysical models• animal models
• Real-time signal processing in Operating Room
• Beyond resection: intracortical stimulation and alternatives51 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Back to the big picture . . .
medical/imaging data
treatment
diagnosis
feature extraction decision making
• Signal processing “microscope”discovery of features/patterns in data associated withcertain disorders
• Algorithmic decision makingcombine evidence “optimally” → diagnosis and treatment
• More reliable and earlier diagnosis
• More effective treatment (clinical outcome, time and cost)
52 / 54
Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion
Acknowledgements
The SES Team
• Francois Vialatte (RIKEN)
• Theo Weber (MIT)
• Jordi Sole-Casals (U Vic)
• Andrzej Cichocki (RIKEN)
Diagnosis of AD
• Toshimitsu Musha (Tokyo Tech)
• Jaeseung Jeong (KAIST)
• Corinna Hanschel(MPI&U Frankfurt/U Bangor)
• Johannes Pantel(MPI&U Frankfurt)
SES: Applications beyond AD
• Monique Maurice (RIKEN)
• Tomek Rutkowski (RIKEN)
• Danilo Mandic (Imperial)
• Shin Ishii (Kyoto University)
• Yuichi Sakumura (NAIST)
Localization of seizure onset
• Sydney Cash (MGH/Harvard)
• Alan Willsky (MIT)
• Emery Brown(MIT/MGH/Harvard)
• Emad Eskandar (MGH/Harvard)
53 / 54
Signal Processing forDiagnosis and Treatment of Brain Disorders
Justin Dauwels
www.dauwels.com
Laboratory for Information and Decision Systems (LIDS)Massachusetts Institute of Technology (MIT)
March 18, 2009
Analogy: Waiting for a train
• Train may not arrive (e.g., mechanical problem)= event reliability ρ
• Train may or may not be on time= timing precision st
55 / 54
Priors for the SES parameters
12
3
00 5 10 15 20
5
10
15
20
25
30
t [s]
f[H
z]
Scaled inverse chi-square distributions
p(st) =(s0,tνt/2)νt /2
Γ(νt/2)
e−νt s0,t/2st
s1+νt /2t
p(sf ) =(s0,f νf /2)νf /2
Γ(νf /2)
e−νf s0,f /2sf
s1+νf /2f
where νt and νf are the degrees of freedom and Γ(x) is the Gamma function
0 0.2 0.4 0.60
5
10
15
20
25
30
ν = 20ν = 40ν = 60ν = 80ν = 100
s
p(s
)
56 / 54
Ambiguity in the matching process
5
6
f
tmin tmax
t
fmin
fmax
1’
1
2’
2
3
3’
4’
4
5
6
f
tmin tmax
t
fmin
fmax
1’
1
2’
2
3
3’
4’
4
57 / 54
Alignment is equivalent to bipartite max-weight matching
p(e, e ′, c , θ) ∝n∏
k=1
n′
∏
k′=1
(
N(
t ′k′ − tk ; δt , st)
N(
f ′
k′ − fk ; δf , sf)
β−2)ckk′
· I (c)p(δt)p(st)p(δf )p(sf ),
with
I (c) =
n∏
k=1
(
δ[
n′
∑
k′=1
ckk′
]
+ δ[
n′
∑
k′=1
ckk′ − 1]
)
·n′
∏
k′=1
(
δ[
n∑
k=1
ckk′
]
+ δ[
n∑
k=1
ckk′ − 1]
)
.
58 / 54
Alignment is equivalent to bipartite max-weight matching
c(i+1) = argmaxc
log p(e, e ′, c , θ(i))
= argmaxc
∑
kk′
wkk′ ckk′ + log I (c),
wkk′ = −
(
t ′k′ − tk − δ(i)t
)2
2st−
(
f ′k′ − fk − δ(i)f
)2
2sf− 2 logβ
− 1/2 log 2πst − 1/2 log 2πsf
wnn′
w1111
n
n′
11
n
n′
Find the heaviest disjoint set of edges in the bipartite graph59 / 54
Alignment is equivalent to bipartite max-weight matching
Algorithms for imperfect max-weight bipartite matching
• Edmond-Karp or auction algorithm
• tight LP relaxation to the integer programming formulation
• max-product algorithm
60 / 54
Alignment is solved efficiently using a graphical model
ttt
p(δt , st , δf , sf ) = p(δt)p(st)p(δf )p(sf )
δ[bn +∑n′
k′=1 cnk′ − 1]
δ[b′n′ +
∑nk=1 ckn′ − 1]
δ[bn] + β δ[bn − 1]
δ[b′n′ ] + β δ[b′
n′ − 1]
(
N(
t ′n′ − tn; δt , st)
N(
f ′n′ − fn; δf , sf)
)cnn′=
== === = ===
ΣΣ ΣΣ ΣΣ
θ = (δt , st , δf , sf )
N NNNNNNNN
βββββ β
. . .
. . .
. . .
. . .
. . .
. . .
. . . . . .
C11 C12 C1n′ C21 C22 C2n′ Cn1 Cn2 Cnn′
B1 B2 BnB ′1 B ′
2 B ′n′
p(e, e′, b, b
′, c, θ) ∝ p(δt )p(st )p(δf )p(sf )
n∏
k=1
(β δ[bk − 1] + δ[bk ])
n′∏
k′=1
(β δ[b′
k − 1] + δ[b′
k ])
·n∏
k=1
n′∏
k′=1
(
N(
t′
k′− tk ; δt , st
)
N(
f′
k′− fk ; δf , sf
)
)ckk′
61 / 54
Alignment is solved efficiently using a graphical model
tt
(
N(
t ′n′ − tn; δ(i)t , s
(i)t
)
N(
f ′n′ − fn; δ(i)f , s
(i)f
)
)cnn′
µ↑′′
µ↑′
µ↓′′
µ↓′
µ↑ µ↓
== === = ===
ΣΣ ΣΣ ΣΣ
N NNNNNNNN
βββββ β
. . .
. . .
. . .
. . .
. . .
. . .
. . . . . .
C11 C12 C1n′ C21 C22 C2n′ Cn1 Cn2 Cnn′
B1 B2 BnB ′1 B ′
2 B ′n′
62 / 54
Table of the max-product computation rules
1
...
X1
X2
Xm
Y
µ(y) ∝ maxx1,x2,...,xm g(x1, . . . , xm, y)µ(x1) . . . µ(xm)
g(x1, x2, . . . xm, y)
2 Y µ(y) ∝ g(y)
g(y)
3
=...
X1
X2
Xm
Y (
µ(y = 0)µ(y = 1)
)
∝
(
µ(x1 = 0)µ(x2 = 0) . . . µ(xm = 0)µ(x1 = 1)µ(x2 = 1) . . . µ(xm = 1)
)
δ[x1 − x2] . . . δ[xm − y ]
X1, . . . ,Xm,Y ∈ 0, 1
4
B
Σ
. . .X1 Xm
(
µ↓(xk = 0)µ↓(xk = 1)
)
∝
(
max(
µ↓(b=1)µ↓(b=0) ,maxℓ 6=k
µ↑(xℓ=1)µ↑(xℓ=0)
)
1
)
δ[
b +∑m
k=1 xk − 1]
(
µ↑(b = 0)µ↑(b = 1)
)
∝
(
maxkµ↑(xk=1)µ↑(xk=0)
1
)
B ,X1, . . . ,Xm ∈ 0, 1
63 / 54
SES algorithm for pairs of bump models
INPUT: Models e and e′, parameters β, νt , νf , s0,t , s0,f , δ(0)t , δ
(0)f ,
s(0)t , and s
(0)f
ALGORITHM: Iterate the following two steps until convergence:
1 Update the alignment c by max-product message passing
Initialize messages µ↓′(ckk′) = 1 = µ↓′′(ckk′ )
Iterative until convergence:
a. Upward messages:
µ↑′(ckk′ )∝µ↓′′(ckk′)gN (ckk′ ; θ(i))
µ↑′′(ckk′ )∝µ↓′(ckk′)gN (ckk′ ; θ(i)),where
gN (ckk′ ; θ(i)) =
(
N(
t ′k′ − tk ; δ(i)t , s
(i)t
)
N(
f ′k′ − fk ; δ(i)f , s
(i)f
)
)ckk′
,
with δ(i)t = δ
(i)t (∆tk + ∆t ′k′), δ
(i)f = δ
(i)f (∆fk + ∆f ′k′),
s(i)t = s
(i)t (∆tk + ∆t ′k′)2, s
(i)f = s
(i)f (∆fk + ∆f ′k′)2
b. Downward messages:
(
µ↓′(ckk′ = 0)µ↓′(ckk′ = 1)
)
∝
(
max(
β,maxℓ′ 6=k′ µ↑′(ckℓ′ = 1)/µ↑′(ckℓ′ = 0))
1
)
(
µ↓′′(ckk′ = 0)µ↓′′(ckk′ = 1)
)
∝
(
max (β,maxℓ 6=k µ↑′′(cℓk′ = 1)/µ↑′′(cℓk′ = 0))1
)
Compute marginalsp(ckk′) ∝ µ↓′(ckk′ )µ↓′′(ckk′)gN (ckk′ ; θ(i))
Compute decisions ckk′ = argmaxckk′p(ckk′ )
2 Update the SES parameters:
δ(i+1)t =
1
n(i+1)
n(i+1)∑
k=1
t′(i+1)k − t
(i+1)k
∆t(i+1)k + ∆t
′(i+1)k
δ(i+1)f =
1
n(i+1)
n(i+1)∑
k=1
f′(i+1)k − f
(i+1)k
∆f(i+1)k + ∆f
′(i+1)k
s(i+1)t =
νts0,t + n(i+1)s(i+1)t,sample
νt + n(i+1) + 2
s(i+1)f =
νf s0,f + n(i+1)s(i+1)f ,sample
νf + n(i+1) + 2,
OUTPUT: Alignment c and SES parameters ρ, δt , δf , st , sf
64 / 54
Surrogate data: σf = 0Hzℓ0 = 40 ℓ0 = 100
0 0.2 0.40
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd0 0.2 0.4
0
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
pd
σ[σ
t]
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[ρ
]
pd0 0.2 0.4
0
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
pd
σ[ρ
]
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd65 / 54
Surrogate data: σf = 2.5Hzℓ0 = 40 ℓ0 = 100
0 0.2 0.40
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd0 0.2 0.4
0
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
pd
σ[σ
t]
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[ρ
]
pd0 0.2 0.4
0
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
pd
σ[ρ
]
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd66 / 54
Surrogate data: σf = 5Hzℓ0 = 40 ℓ0 = 100
0 0.2 0.40
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd0 0.2 0.4
0
10
20
30
40
50
60
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
pd
σ[σ
t]
0 0.2 0.40
0.2
0.4
0.6
0.8
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0 0.2 0.40
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[ρ
]
pd0 0.2 0.4
0
0.1
0.2
0.3
0.4
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
pd
σ[ρ
]
0.1 0.2 0.3 0.40
0.1
0.2
0.3
0.4
0.5
σt = 10
σt = 30
σt = 50
E[σ
t][m
s]
pd67 / 54
Box plots for SES parameters
0.26
0.28
0.3
0.32
0.34
0.36
CTR MCI
s t
0.15
0.2
0.25
0.3
0.35
0.4
0.45
ρCTR MCI
st (p =0.19) ρ (p = 0.00012)
68 / 54
How do p-values depend on initial SES parameters?
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−1.5
−1.4
−1.3
−1.2
−1.1
−1
−0.9
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26
−2
−1.8
−1.6
−1.4
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−3
−2.5
−2
−1.5
s(0
)t
s(0)f
β = 0.01, 3 zones β = 0.001, 3 zones β = 0.0001, 3 zones
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−2.5
−2.4
−2.3
−2.2
−2.1
−2
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−2.8
−2.6
−2.4
−2.2
−2
−1.8
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26
−3.5
−3
−2.5
s(0
)t
s(0)f
β = 0.01, 5 zones β = 0.001, 5 zones β = 0.0001, 5 zones
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25 −2
−1.8
−1.6
−1.4
−1.2
−1
−0.8
s(0
)t
s(0)f
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25
−2.2
−2
−1.8
−1.6
−1.4
s(0
)t
s(0)f
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25
−2.5
−2
−1.5
−1
s(0
)t
s(0)f
β = 0.01, 7 zones β = 0.001, 7 zones β = 0.0001, 7 zones 69 / 54
Number of iterations
0 10 20 30 40 500
0.05
0.1
0.15
0.2
0.25
0.3
0.35
0.4
number of iterations
freq
uen
cy
70 / 54
Extension of SES from 2 to n > 2 bump models
Clusters appear naturally (n = 5)
f
t
n models similar if:
• few deletions → number of bumps/cluster close to n
• little jitter in t and f → clusters have small dispersion
71 / 54
Extension from 2 to n > 2 bump models
n bump models are generated from “mother” model
f
t
• distribution pc of number of bumps per cluster
• average t and f offset (δt,i and δf ,i), i = 1, . . . , n
• t and f jitter (st,i and sf ,i), i = 1, . . . , n
72 / 54
We alternate clustering and parameter estimation
Problem
Given n bump models e, infer pc and θ = (δt, δf , st, sf)
Approach: Maximum a posteriori estimation
(c∗,θ∗) = argmaxc,θp(e, c;θ)
Iterative algorithm: coordinate descent
c(k+1) = argmaxcp(e, c; θ(k)
) Clustering
θ(k+1)
= argmaxθp(e, c(k+1);θ) Parameter Estimation
• Clustering is equivalent to max-weight k-partite matching
• NP-hard in general, but can be solved efficiently in practice
• Overall algorithm is provably convergent
73 / 54
Various extensions of SES
Point processes in other domains: (t, x), (x , y , z), . . .
• Analysis of spike trains (t)
• Study of cell migration using molecular imaging (t, x , y)
Key: meaningful events in signal (spike, burst, peak, etc.)
Time-dependent SES → dynamics of synchrony
no stimulusno stimulus stimulus
tst
f
More complex “parent” structure (future work)mother process
n given processes n given processes
. . . . . .
“parent” processes
74 / 54
Want to know more? Want to try it out?
http://www.dauwels.com/SESToolbox/SES.html75 / 54
SES will also be included in Spike Train Analysis Toolkit
http://neuroanalysis.org/toolkit/ 76 / 54
Standard invasive approach requires long hospitalization
• Implant electrodes in the brain
• Record EEG until sufficient seizures obtained (5 days–more)
• Determine seizure onset zone from “seizure” EEG
Critical issues
• risk (infection, prolonged seizures, damage to the brain)
• discomfort
• cost
77 / 54
Univariate measures normal in hypersynchronous areas
78 / 54
How to determine seizure onset area using hypersynchrony?
Adaptive threshold required
Patient 1 correlation coeff c histogram
Graphical model
binary variable h (hypersynchronous?)
spatial correlation
local evidence (synchrony s)
h = argmaxhp(s,h)79 / 54
Future work : Framework for clinical decision making
• Develop graphical model for each source of information• Features obtained through signal processing
e.g., hypersynchrony in rest EEG• Clinical behavior• Biophysical prior knowledge
• Combine those models for decision making
d = argmaxd log p(i,d) − q(d)
decision d, “information” i, statistical model p, risk q
• Graphical models → efficient optimization algorithms
80 / 54
Similarity measures: correlation and beyond
• Correlation coefficient
r
=1
N
N∑
k=1
(x(k) − x)
σx
(y(k) − y)
σy
• Magnitude square coherence
c(f )
=|〈X (f )Y ∗(f )〉|2
|〈X (f )〉| |〈Y (f )〉|
• Phase coherence
φ(f )
= arg〈X (f )Y ∗(f )〉
• Corr-entropy coefficient
rE
=
1N
∑Nk=1 κ(x(k), y(k)) − 1
N2
∑Nk,ℓ=1 κ(x(k), y(ℓ))
√
KX − 1N2
∑Nk,ℓ=1 κ(x(k), x(ℓ))
√
KY − 1N2
∑Nk,ℓ=1 κ(y(k), y(ℓ))
,
where
KX =1
N
N∑
k=1
κ(x(k), x(k)) KY =1
N
N∑
k=1
κ(y(k), y(k)),
and where κ is a symmetric positive definite kernel81 / 54
Similarity measures: Granger causality
• Multivariate autoregressive (MVAR) model
x(k) =
p∑
ℓ=1
A(j)x(k − ℓ) + e(k),
where x(k)
= (x1(k), x2(k), . . . , xn(k))T , p is model order, the modelcoefficients A(j) are n × n matrices, and e(k) is a zero-mean Gaussian randomvector of size n; can be rewritten as:
e(k) =
p∑
ℓ=0
A(j)x(k − ℓ),
where A(0) = I (identity matrix) and A(j)
= −A(j) for j > 0
• Transform into the frequency domain (by applying the z-transform and by
substituting z
= e−2πif ∆t , where 1/∆t is the sampling rate):
X (f ) = A−1(f )E(f )
= H(f )E(f )
• Power spectrum matrix of the signal x(k) is determined as
S(f )
= E[
X (f )X (f )†]
= H(f )VH†(f ),82 / 54
Similarity measures: Granger causality
• Partial coherence (PC)
Cij (f )
=Mij(f )
√
Mii (f )√
Mjj(f )∈ C,
where Mij is minor (i , j) of S = determinant of S with row i , column j removed
• Directed transfer function (DTF)
γ2ij (f )
=|Hij(f )|2
∑mj=1 |Hij(f )|2
∈ [0, 1]
= fraction of inflow to channel i stemming from channel j
• Full frequency directed transfer function (ffDTF)
F 2ij (f )
=|Hij(f )|2
∑
f
∑mj=1 |Hij(f )|2
∈ [0, 1]
• Partial directed coherence (PDC)
Pij (f )
=Aij(f )
√
∑mi=1 |Aij(f )|2
∈ C,
= fraction of outflow from channel j to channel i .
• Direct directed transfer function (dDTF)
χ2ij(f )
= F 2ij (f )C2
ij (f ) ∈ [0, 1]83 / 54
Phase synchrony
• Interdependence between the instantaneous phases φx and φy of two signals x
and y
• Instantaneous phases may be strongly synchronized even when the amplitudes ofx and y are statistically independent
• Instantaneous phase φWx (k, f )
φWx (k, f )
= arg[X (k, f )],
where X (k, f ) is time-frequency transform of x
• Phase synchrony index
γ =∣
∣
⟨
e i(nφx−mφy )⟩∣
∣ ∈ [0, 1],
where n and m are integers (usually n = 1 = m)
84 / 54
Information theoretic measures
• Mutual information X and Y :
I (X ;Y )
= H(X ) + H(Y ) − H(X , Y ),
where H(X ) and H(Y ) is the Shannon entropy of X and Y respectively, andH(X , Y ) is the joint entropy of X and Y
• Mutual information in time-frequency domain
IW (Cx , Cy ,Cxy ) =∑
k,f
Cxy (k, f ) logCxy (k, f )
Cx (k, f )Cy (k, f ),
where the (normalized) cross time-frequency distribution of x and y is defined as
Cxy (k, f )
=|X (k, f )Y ∗(k, f )|
∑
k,f |X (k, f )Y ∗(k, f )|.
• Kullback-Leibler Divergence
K(Cx , Cy ) =∑
k,f
Cx (k, f ) logCx (k, f )
Cy (k, f ).
• Renyi Divergence
Dα(Cx ,Cy ) =1
α − 1log∑
k,f
[Cx (k, f )]α[Cy (k, f )](1−α)
85 / 54
State space based measures
• Mean Euclidian distance RMk
(X ) between X (k) to K nearest neighbors
• Mean Euclidian distance RMk
(X |Y ) between delay vector X (k) and the X -delayvectors corresponding to the M nearest neighbors of Y (k)
• Measures by Quiroga et al.
Sk(X |Y ) =1
N
N∑
k=1
RMk
(X )
RMk
(X |Y )∈ (0, 1]
Hk(X |Y ) =1
N
N∑
k=1
logRk(X )
RMk
(X |Y )
Nk(X |Y ) =1
N
N∑
k=1
Rk(X ) − RMk
(X |Y )
Rk(X ) 86 / 54
Correlation between (dis)similarity measures
5 10 15 20 25 30
5
10
15
20
25
30
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
state space
corr/coh
mut inf
divergence
Granger
SES
phase
Nk(X |Y )
Nk(X
|Y)
Nk(Y |X )
Nk(Y
|X)
Sk(X |Y )
Sk(X
|Y)
Sk(Y |X )
Sk(Y
|X)
Hk(X |Y )
Hk(X
|Y)
Hk(Y |X )
Hk(Y
|X)
Ses
t
Sest
Ω
Ω
r
rc
c
rE
r E
wE
wE
IW
I W
I
I
γH
γH
γW
γW
φ
φ
EMA
EM
A
IPA
IPA
K (Y |X )
K(Y
|X)
K (X |Y )
K(X
|Y)
K
K
Dα
Dα
J
J
Jα
Jα
Kij
Kij
Cij
Cij
Pij
Pij
γij
γij
GFS
GFS
Fij
Fij
χij
χij
s t
stρ
ρ
87 / 54
p-values for synchrony measures (MCI vs. Control)
88 / 54
How do p-values depend on initial SES parameters?
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−1.5
−1.4
−1.3
−1.2
−1.1
−1
−0.9
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26
−2
−1.8
−1.6
−1.4
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−3
−2.5
−2
−1.5
s(0
)t
s(0)f
β = 0.01, 3 zones β = 0.001, 3 zones β = 0.0001, 3 zones
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−2.5
−2.4
−2.3
−2.2
−2.1
−2
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26−2.8
−2.6
−2.4
−2.2
−2
−1.8
s(0
)t
s(0)f
0.05 0.1 0.15
0.14
0.16
0.18
0.2
0.22
0.24
0.26
−3.5
−3
−2.5
s(0
)t
s(0)f
β = 0.01, 5 zones β = 0.001, 5 zones β = 0.0001, 5 zones
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25 −2
−1.8
−1.6
−1.4
−1.2
−1
−0.8
s(0
)t
s(0)f
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25
−2.2
−2
−1.8
−1.6
−1.4
s(0
)t
s(0)f
0.05 0.1 0.15
0.05
0.1
0.15
0.2
0.25
−2.5
−2
−1.5
−1
s(0
)t
s(0)f
β = 0.01, 7 zones β = 0.001, 7 zones β = 0.0001, 7 zones 89 / 54
p-values for ffDTF (MCI vs. Control)
90 / 54