signal processing for diagnosis and treatment of …introduction preview diagnosis of ad treatment...

Signal Processing forDiagnosis and Treatment of Brain Disorders

Justin Dauwels

www.dauwels.com

Laboratory for Information and Decision Systems (LIDS)Massachusetts Institute of Technology (MIT)

March 18, 2009

“With all the tools available to modern medicine—the blood tests,M.R.I.’s and endoscopes—you might think that misdiagnosis hasbecome a rare thing. But you would be wrong.”

Why doctors so often get it wrong, David Leonhardt, New York Times, Feb. 22, 2006

Introduction Preview Diagnosis of AD Treatment of Epilepsy Future Work Conclusion

Proposed: Signal processing aided decision making

medical/imaging data

treatment

diagnosis

feature extraction decision making

3 / 54


Proposed: Signal processing aided decision making


treatment

diagnosis


• Signal processing “microscope”discovery of features/patterns in data associated withcertain disorders

• Algorithmic decision makingcombine evidence “optimally” → diagnosis and treatment

• More reliable and earlier diagnosis

• More effective treatment (clinical outcome, time and cost)

4 / 54


Synchrony of brain signals is an important feature

• Loss of synchrony of brain signals → brain disorder

• Normal synchrony of brain signals

• Increased synchrony of brain signals → brain disorder

Often subtle effects, requires signal processing

5 / 54


Preview: Topics

Signal Processing Aided Diagnosis

Loss in EEG synchrony → diagnosis of early-Alzheimer’s

Signal Processing Aided Treatment

Increased EEG synchrony → localization of epileptic brain tissue

Future Work

How do we plan to improve our methods?

Electroencephalogram (EEG)= electrical activity along the scalp or brain

6 / 54


Alzheimer’s disease is complex, devastating, and common

• Symptoms: memory loss, language breakdown, loss of motorcontrol, apathy, . . .

• Causes: not well understood, complex molecular mechanisms

• 2-5% of people over 65 years old

7 / 54


Our goal: early diagnosis of AD using EEG

Early diagnosis of AD is important

• medication and other therapies most useful when used early

More precisely: diagnosis of MCI and mild AD

• predementia (a.k.a. mild cognitive impairment, MCI)

• mild–moderate–severe AD

8 / 54


Our goal: early diagnosis of AD using EEG

Standard approach

mental testing, MR, CT, SPECT, PET, corticospinal fluid

Scalp EEG is attractive complementary technology

• simple, affordable, mobile

• useful for screening

signs of AD?scalp EEG MR, CT, . . .Yes

9 / 54


EEG signals of AD patients are less coherent

• Loss of neurons → weaker brain connectivity

Healthy subject AD patient

• EEG from different brain areas is less correlated in AD patients

• Loss in EEG synchrony → loss in brain connectivity → AD

• How to detect loss in EEG synchrony?

10 / 54


How to measure EEG synchrony?

Classical techniques

• correlation coefficient, coherence

• phase synchrony

• Granger causality

• information-theoretic measures

• . . .

Few of them applied on single data sets of early-AD patients

11 / 54



Classical techniques

• correlation coefficient, coherence

• phase synchrony

• Granger causality

• information-theoretic measures

• . . .

Few of them applied on single data sets of early-AD patients

Systematic study

30+ measures on two data sets (and more to follow)

12 / 54



New approach: “Stochastic Event Synchrony (SES)”

• Extracts “events” from EEG signals and aligns those events

• Provides complementary information about synchrony

• Promising results for diagnosing AD

• General formalism for similarity of point processes:• pairs of point processes and n > 2 point processes• one-dimensional and multi-dimensional point processes• time-varying similarity

In this talk: pairs of two-dimensional point processes (EEG)

13 / 54


Local synchronous activity leads to “events” in EEG

Non-synchronized activity

EEG

LFP 1LFP 2LFP 3LFP 4LFP 5LFP 6

Synchronized activity

EEG

LFP 1LFP 2LFP 3LFP 4LFP 5LFP 6

Neuroscience: Exploring the Brain, M. Bear et al.14 / 54


Events can clearly be observed in time-frequency map

f [Hz]

t [s]Fourier power

amplitude

time-frequency events“bumps”

time-frequency transform |X(t,f)|2

wavelet transform

EEG x(t)

EEG X(f)

15 / 54


Oscillatory events extracted from time-frequency map

10.000 coefficients → 50 parameters

• Bump models are point processes on time-frequency plane

• 5 bump parameters:timing, frequency, width in time and frequency, amplitude

• Lossy compression: “background” activity neglected

F. Vialatte et al., Neural Networks 200716 / 54


We determine EEG synchrony from bump models

1 Group electrodes in regions

2 Extract bump model for each region

3 Determine similarity of each pair of bump models

4 Average similarity over all pairs

Misalignment of bump models → loss of brain connectivity → AD

17 / 54


Stochastic Event Synchrony

Bump models of two EEG signals Coincident bumps

00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]1 How many bumps coincide?

2 How well do the coincident bumps align?

18 / 54


Stochastic Event Synchrony (ρ, δt , δf , st , sf )


00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

1 How many bumps coincide?

• fraction of non-coincident bumps ρ

2 How well do the coincident bumps align?• average t and f offset of coincident bumps (δt and δf )• t and f jitter of coincident bumps (st and sf )

19 / 54


Stochastic Event Synchrony (ρ, δt , δf , st , sf )


00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

1 How many bumps coincide?

• fraction of non-coincident bumps ρ

2 How well do the coincident bumps align?• average t and f offset of coincident bumps (δt and δf )• t and f jitter of coincident bumps (st and sf )

It is natural to iterate Step 1 and 2

20 / 54


How can we derive such iterative procedure?

Wish list

• Should give the “best” alignment

• Should converge

• Should be easily extendable

21 / 54


SES parameters computed by statistical inference

Maximum a posteriori estimation

(c∗, θ∗) = argmaxc,θ p(e, e′, c; θ)

• e and e′ are the given bump models

• ckk′= 1 if (ek ,e′k′ ) coincident; ckk′= 0 otherwise

• θ = (δt , δf , st , sf ), ρ follows from c

Questions

• What is the model p?

• How do carry out MAP estimation practically?

22 / 54


Statistical model p relates two bump models

Two given bump models are generated from “mother” model

f

tmin tmax

t

fmin

fmax

1 “Mother” bumps uniformly at random within rectangle

23 / 54




eke′k′

f

tmin tmax

(− δt

2 ,− δf

2 )

( δt

2 , δf

2 )

t

fmin

fmax


2 Generate two copies of each “mother” bump and perturbbump positions randomly:Gaussian RVs with mean (±δt/2,±δf /2) and var (st/2, sf /2)

24 / 54




eke′k′

f

tmin tmax

(− δt

2 ,− δf

2 )

( δt

2 , δf

2 )

t

fmin

fmax


2 Generate two copies of each “mother” bump and perturbbump positions randomly:Gaussian RVs with mean (±δt/2,±δf /2) and var (st/2, sf /2)

3 Delete bumps at random → non-coincident bumps (ρ)

25 / 54


Intermediate summary

Problem

Given two bump models e and e′, infer ρ and θ = (δt , δf , st , sf )

Approach: Maximum a posteriori estimation

(c∗, θ∗) = argmaxc,θp(e, e′, c; θ)

26 / 54


We alternate bump alignment and parameter estimation

Problem

Given two bump models e and e′, infer ρ and θ = (δt , δf , st , sf )


(c∗, θ∗) = argmaxc,θp(e, e′, c; θ)

Iterative algorithm: coordinate descent

c(k+1) = argmaxcp(e, e′, c; θ(k)) Bump Alignment

θ(k+1) = argmaxθp(e, e′, c(k+1); θ) Parameter Estimation

• Alignment is equivalent to max-weight bipartite matching

• Can be solved efficiently (belongs to P)

• Overall algorithm is guaranteed to converge

• Easily extendable 27 / 54


We applied SES to EEG data

2 data sets, different patients/hospitals

• 22 MCI and 38 Control subjects

• 17 Mild AD and 24 Control subjects

Setup

• spontaneous EEG, in rest with eyes closed

• 21 electrodes (10-20 international system)

• electrodes grouped in 5 zones

• bandpass filtered between 4 and 30Hz

• 20s of artifact-free EEG

• 30+ synchrony measures, including SES

28 / 54


Promising and consistent numerical results

MCI vs. Control Mild AD vs. Control

0.045 0.05 0.055 0.060.2

0.3

0.4

0.5

0.6

0.7

MCICTR

ρ

ffDTF0.04 0.05 0.06 0.07 0.08

0.4

0.45

0.5

MiADCTR

ρ

ffDTF

• Only full-frequency direct transfer function (ffDTF) and ρstrongly significant (p < 0.001), for BOTH data sets

• MCI vs. Control harder than Mild AD vs. Control

• EEG synchrony loss seems to be strong indicator of early AD

• SES improves diagnosis of AD (for those two data sets)

29 / 54


Want to know more? Want to try it out?

http://www.dauwels.com/SESToolbox/SES.html

30 / 54


Preview: Topics



We needed to develop novel measures of synchrony



Existing synchrony measures proved to be effective

Future Work


31 / 54


Epilepsy is common, complex, and devastating

• 2.5 million patients in U.S., 300’000 in U.K., 50 million world

• Heterogeneous group of central nervous system disorderscharacterized by recurrent unprovoked seizures

• Symptoms: disturbance of consciousness or awareness,alterations of bodily movement, sensation or posture, . . .

• Causes: genetic/metabolic abnormalities, tumors, . . .

32 / 54


Medication is usual treatment but not always effective

• Medication is most common treatment

• However, medication alone not effective for 30% of patients

33 / 54


If medication alone fails, surgery may be solution

Resection if medication alone fails

• Resection may be alternative for location-related epilepsy

• Objective: remove seizure onset zone

• How to determine the seizure onset zone?• from clinical behavior (e.g., left arm shaking) or scalp EEG• from intracranial EEG, if everything else fails

risky, costly, uncomfortable

34 / 54


Standard procedure relies mostly on “seizure” EEG

5 days or more

rest EEG seizure EEG

35 / 54


Our approach: exploit rest EEG to shorten hospitalization

5 days or more

rest EEG seizure EEG

We wish to make intracranial recordings as short as possible

10min – 1hour

rest EEG

ObjectiveSeizure onset zone from rest EEG only → shorter hospitalization

HypothesisSeizure onset zone is characterized by hypersynchronous activityeven while at rest

36 / 54


Differences with diagnosis of AD project

• Intracanial EEG vs. scalp EEG

• Signal processing to guide surgery vs. diagnosis

• Increased EEG synchrony vs. EEG synchrony loss

37 / 54


Three main questions

HypothesisSeizure onset zone is characterized by hypersynchronous activityeven while at rest

1 Does hypersynchronous activity occur in rest EEG?

2 Do hypersynchronous areas correlate with seizure onset zone?

3 How to determine seizure onset zone using hypersynchrony?

38 / 54


Existing literature

• Most existing studies: scalp EEG

• Few studies: intracranial grid electrodes, no depth electrodes

• Few synchrony measures

39 / 54


Our approach

• Most existing studies: scalp EEG

• Few studies: intracranial grid electrodes, no depth electrodes

• Few synchrony measures

• 4 patients with grid electrodes, 2 with depth electrodes

• multiple synchrony measures

• 1 hour segment of intracranial rest EEG

• 48 hours between segment and seizures

• bandpass filtered between 4-30Hz, no further preprocessing

40 / 54


Some brain areas are consistently hypersynchronous

Strong local correlation in anterior temporal lobe

Patient 1 correlation coeff c histogram

Variability over time is small (less than 15%)

time [min]

ant

tem

ppos

tte

mp

41 / 54


Some brain areas are consistently hypersynchronous



Very similar results for other synchrony measures

Phase Synchrony Coherence Granger causality (DTF)

42 / 54


Hypersynchrony correlates strongly with seizure onset zone



Very similar results for other synchrony measures

Phase Synchrony Coherence Granger causality (DTF)

43 / 54


Same phenomenon also occurs in other patients

Patient 2 (left hemisphere) correlation coefficient

Patient 3 (right hemisphere)

Patient 4 (right hemisphere) (right - left)

44 / 54


Is this a counterexample?


Considered as “counter example” in earlier studies

45 / 54


Depth electrodes provide crucial information . . .


Considered as “counter example” in earlier studies

46 / 54


Depth electrode reveals actual seizure onset zone

seizure propagation

seizure onset

hypersynchrony

47 / 54


Three main questions

1 Does hypersynchronous activity occur in rest EEG?

Yes! (in the 6 patients analyzed so far)

2 Do hypersynchronous areas correlate with seizure onset zone?

Yes! (in the 6 patients analyzed so far)

3 How to determine seizure onset zone using hypersynchrony?

Statistical decision making, using a graphical modelWork in progress . . .

Signal processing of rest EEG has high potential→ drastically shorter hospitalization

48 / 54


Topics



We needed to develop novel measures of synchrony



Existing synchrony measures proved to be effective

Future Work


49 / 54


Future work: Early diagnosis of AD

Analysis of additional EEG data

Signal Processing

• Extensions of SES

• Alternative approaches• spectral properties• source reconstruction• non-negative tensor decomposition

50 / 54


Future work: Treatment of epilepsy

• Additional features• high-frequency ripples (80–300Hz) [Worrell et al., Staba et al.]

• slowing effect

• Additional data• Scalp EEG, MEG, SPECT, PET, CT, MR, DTI• Clinical behavior

• Fundamentals: seizure genesis/hypersynchrony/resection• biophysical models• animal models

• Real-time signal processing in Operating Room

• Beyond resection: intracortical stimulation and alternatives51 / 54


Back to the big picture . . .


treatment

diagnosis


• Signal processing “microscope”discovery of features/patterns in data associated withcertain disorders

• Algorithmic decision makingcombine evidence “optimally” → diagnosis and treatment

• More reliable and earlier diagnosis

• More effective treatment (clinical outcome, time and cost)

52 / 54


Acknowledgements

The SES Team

• Francois Vialatte (RIKEN)

• Theo Weber (MIT)

• Jordi Sole-Casals (U Vic)

• Andrzej Cichocki (RIKEN)

Diagnosis of AD

• Toshimitsu Musha (Tokyo Tech)

• Jaeseung Jeong (KAIST)

• Corinna Hanschel(MPI&U Frankfurt/U Bangor)

• Johannes Pantel(MPI&U Frankfurt)

SES: Applications beyond AD

• Monique Maurice (RIKEN)

• Tomek Rutkowski (RIKEN)

• Danilo Mandic (Imperial)

• Shin Ishii (Kyoto University)

• Yuichi Sakumura (NAIST)

Localization of seizure onset

• Sydney Cash (MGH/Harvard)

• Alan Willsky (MIT)

• Emery Brown(MIT/MGH/Harvard)

• Emad Eskandar (MGH/Harvard)

53 / 54

Signal Processing forDiagnosis and Treatment of Brain Disorders

Justin Dauwels

www.dauwels.com

Laboratory for Information and Decision Systems (LIDS)Massachusetts Institute of Technology (MIT)

March 18, 2009

Analogy: Waiting for a train

• Train may not arrive (e.g., mechanical problem)= event reliability ρ

• Train may or may not be on time= timing precision st

55 / 54

Priors for the SES parameters

12

3

00 5 10 15 20

5

10

15

20

25

30

t [s]

f[H

z]

Scaled inverse chi-square distributions

p(st) =(s0,tνt/2)νt /2

Γ(νt/2)

e−νt s0,t/2st

s1+νt /2t

p(sf ) =(s0,f νf /2)νf /2

Γ(νf /2)

e−νf s0,f /2sf

s1+νf /2f

where νt and νf are the degrees of freedom and Γ(x) is the Gamma function

0 0.2 0.4 0.60

5

10

15

20

25

30

ν = 20ν = 40ν = 60ν = 80ν = 100

s

p(s

)

56 / 54

Ambiguity in the matching process

5

6

f

tmin tmax

t

fmin

fmax

1’

1

2’

2

3

3’

4’

4

5

6

f

tmin tmax

t

fmin

fmax

1’

1

2’

2

3

3’

4’

4

57 / 54

Alignment is equivalent to bipartite max-weight matching

p(e, e ′, c , θ) ∝n∏

k=1

n′

∏

k′=1

(

N(

t ′k′ − tk ; δt , st)

N(

f ′

k′ − fk ; δf , sf)

β−2)ckk′

· I (c)p(δt)p(st)p(δf )p(sf ),

with

I (c) =

n∏

k=1

(

δ[

n′

∑

k′=1

ckk′

]

+ δ[

n′

∑

k′=1

ckk′ − 1]

)

·n′

∏

k′=1

(

δ[

n∑

k=1

ckk′

]

+ δ[

n∑

k=1

ckk′ − 1]

)

.

58 / 54


c(i+1) = argmaxc

log p(e, e ′, c , θ(i))

= argmaxc

∑

kk′

wkk′ ckk′ + log I (c),

wkk′ = −

(

t ′k′ − tk − δ(i)t

)2

2st−

(

f ′k′ − fk − δ(i)f

)2

2sf− 2 logβ

− 1/2 log 2πst − 1/2 log 2πsf

wnn′

w1111

n

n′

11

n

n′

Find the heaviest disjoint set of edges in the bipartite graph59 / 54


Algorithms for imperfect max-weight bipartite matching

• Edmond-Karp or auction algorithm

• tight LP relaxation to the integer programming formulation

• max-product algorithm

60 / 54

Alignment is solved efficiently using a graphical model

ttt

p(δt , st , δf , sf ) = p(δt)p(st)p(δf )p(sf )

δ[bn +∑n′

k′=1 cnk′ − 1]

δ[b′n′ +

∑nk=1 ckn′ − 1]

δ[bn] + β δ[bn − 1]

δ[b′n′ ] + β δ[b′

n′ − 1]

(

N(

t ′n′ − tn; δt , st)

N(

f ′n′ − fn; δf , sf)

)cnn′=

== === = ===

ΣΣ ΣΣ ΣΣ

θ = (δt , st , δf , sf )

N NNNNNNNN

βββββ β

. . .

. . .

. . .

. . .

. . .

. . .

. . . . . .

C11 C12 C1n′ C21 C22 C2n′ Cn1 Cn2 Cnn′

B1 B2 BnB ′1 B ′

2 B ′n′

p(e, e′, b, b

′, c, θ) ∝ p(δt )p(st )p(δf )p(sf )

n∏

k=1

(β δ[bk − 1] + δ[bk ])

n′∏

k′=1

(β δ[b′

k − 1] + δ[b′

k ])

·n∏

k=1

n′∏

k′=1

(

N(

t′

k′− tk ; δt , st

)

N(

f′

k′− fk ; δf , sf

)

)ckk′

61 / 54

Alignment is solved efficiently using a graphical model

tt

(

N(

t ′n′ − tn; δ(i)t , s

(i)t

)

N(

f ′n′ − fn; δ(i)f , s

(i)f

)

)cnn′

µ↑′′

µ↑′

µ↓′′

µ↓′

µ↑ µ↓

== === = ===

ΣΣ ΣΣ ΣΣ

N NNNNNNNN

βββββ β

. . .

. . .

. . .

. . .

. . .

. . .

. . . . . .

C11 C12 C1n′ C21 C22 C2n′ Cn1 Cn2 Cnn′

B1 B2 BnB ′1 B ′

2 B ′n′

62 / 54

Table of the max-product computation rules

1

...

X1

X2

Xm

Y

µ(y) ∝ maxx1,x2,...,xm g(x1, . . . , xm, y)µ(x1) . . . µ(xm)

g(x1, x2, . . . xm, y)

2 Y µ(y) ∝ g(y)

g(y)

3

=...

X1

X2

Xm

Y (

µ(y = 0)µ(y = 1)

)

∝

(

µ(x1 = 0)µ(x2 = 0) . . . µ(xm = 0)µ(x1 = 1)µ(x2 = 1) . . . µ(xm = 1)

)

δ[x1 − x2] . . . δ[xm − y ]

X1, . . . ,Xm,Y ∈ 0, 1

4

B

Σ

. . .X1 Xm

(

µ↓(xk = 0)µ↓(xk = 1)

)

∝

(

max(

µ↓(b=1)µ↓(b=0) ,maxℓ 6=k

µ↑(xℓ=1)µ↑(xℓ=0)

)

1

)

δ[

b +∑m

k=1 xk − 1]

(

µ↑(b = 0)µ↑(b = 1)

)

∝

(

maxkµ↑(xk=1)µ↑(xk=0)

1

)

B ,X1, . . . ,Xm ∈ 0, 1

63 / 54

SES algorithm for pairs of bump models

INPUT: Models e and e′, parameters β, νt , νf , s0,t , s0,f , δ(0)t , δ

(0)f ,

s(0)t , and s

(0)f

ALGORITHM: Iterate the following two steps until convergence:

1 Update the alignment c by max-product message passing

Initialize messages µ↓′(ckk′) = 1 = µ↓′′(ckk′ )

Iterative until convergence:

a. Upward messages:

µ↑′(ckk′ )∝µ↓′′(ckk′)gN (ckk′ ; θ(i))

µ↑′′(ckk′ )∝µ↓′(ckk′)gN (ckk′ ; θ(i)),where

gN (ckk′ ; θ(i)) =

(

N(

t ′k′ − tk ; δ(i)t , s

(i)t

)

N(

f ′k′ − fk ; δ(i)f , s

(i)f

)

)ckk′

,

with δ(i)t = δ

(i)t (∆tk + ∆t ′k′), δ

(i)f = δ

(i)f (∆fk + ∆f ′k′),

s(i)t = s

(i)t (∆tk + ∆t ′k′)2, s

(i)f = s

(i)f (∆fk + ∆f ′k′)2

b. Downward messages:

(

µ↓′(ckk′ = 0)µ↓′(ckk′ = 1)

)

∝

(

max(

β,maxℓ′ 6=k′ µ↑′(ckℓ′ = 1)/µ↑′(ckℓ′ = 0))

1

)

(

µ↓′′(ckk′ = 0)µ↓′′(ckk′ = 1)

)

∝

(

max (β,maxℓ 6=k µ↑′′(cℓk′ = 1)/µ↑′′(cℓk′ = 0))1

)

Compute marginalsp(ckk′) ∝ µ↓′(ckk′ )µ↓′′(ckk′)gN (ckk′ ; θ(i))

Compute decisions ckk′ = argmaxckk′p(ckk′ )

2 Update the SES parameters:

δ(i+1)t =

1

n(i+1)

n(i+1)∑

k=1

t′(i+1)k − t

(i+1)k

∆t(i+1)k + ∆t

′(i+1)k

δ(i+1)f =

1

n(i+1)

n(i+1)∑

k=1

f′(i+1)k − f

(i+1)k

∆f(i+1)k + ∆f

′(i+1)k

s(i+1)t =

νts0,t + n(i+1)s(i+1)t,sample

νt + n(i+1) + 2

s(i+1)f =

νf s0,f + n(i+1)s(i+1)f ,sample

νf + n(i+1) + 2,

OUTPUT: Alignment c and SES parameters ρ, δt , δf , st , sf

64 / 54

Surrogate data: σf = 0Hzℓ0 = 40 ℓ0 = 100

0 0.2 0.40

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd0 0.2 0.4

0

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

pd

σ[σ

t]

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[ρ

]

pd0 0.2 0.4

0

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

pd

σ[ρ

]

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd65 / 54

Surrogate data: σf = 2.5Hzℓ0 = 40 ℓ0 = 100

0 0.2 0.40

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd0 0.2 0.4

0

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

pd

σ[σ

t]

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[ρ

]

pd0 0.2 0.4

0

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

pd

σ[ρ

]

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd66 / 54

Surrogate data: σf = 5Hzℓ0 = 40 ℓ0 = 100

0 0.2 0.40

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd0 0.2 0.4

0

10

20

30

40

50

60

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

pd

σ[σ

t]

0 0.2 0.40

0.2

0.4

0.6

0.8

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0 0.2 0.40

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[ρ

]

pd0 0.2 0.4

0

0.1

0.2

0.3

0.4

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

pd

σ[ρ

]

0.1 0.2 0.3 0.40

0.1

0.2

0.3

0.4

0.5

σt = 10

σt = 30

σt = 50

E[σ

t][m

s]

pd67 / 54

Box plots for SES parameters

0.26

0.28

0.3

0.32

0.34

0.36

CTR MCI

s t

0.15

0.2

0.25

0.3

0.35

0.4

0.45

ρCTR MCI

st (p =0.19) ρ (p = 0.00012)

68 / 54

How do p-values depend on initial SES parameters?

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−1.5

−1.4

−1.3

−1.2

−1.1

−1

−0.9

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26

−2

−1.8

−1.6

−1.4

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−3

−2.5

−2

−1.5

s(0

)t

s(0)f

β = 0.01, 3 zones β = 0.001, 3 zones β = 0.0001, 3 zones

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−2.5

−2.4

−2.3

−2.2

−2.1

−2

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−2.8

−2.6

−2.4

−2.2

−2

−1.8

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26

−3.5

−3

−2.5

s(0

)t

s(0)f


0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25 −2

−1.8

−1.6

−1.4

−1.2

−1

−0.8

s(0

)t

s(0)f

0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25

−2.2

−2

−1.8

−1.6

−1.4

s(0

)t

s(0)f

0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25

−2.5

−2

−1.5

−1

s(0

)t

s(0)f

β = 0.01, 7 zones β = 0.001, 7 zones β = 0.0001, 7 zones 69 / 54

Number of iterations

0 10 20 30 40 500

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0.4

number of iterations

freq

uen

cy

70 / 54

Extension of SES from 2 to n > 2 bump models

Clusters appear naturally (n = 5)

f

t

n models similar if:

• few deletions → number of bumps/cluster close to n

• little jitter in t and f → clusters have small dispersion

71 / 54

Extension from 2 to n > 2 bump models

n bump models are generated from “mother” model

f

t

• distribution pc of number of bumps per cluster

• average t and f offset (δt,i and δf ,i), i = 1, . . . , n

• t and f jitter (st,i and sf ,i), i = 1, . . . , n

72 / 54

We alternate clustering and parameter estimation

Problem

Given n bump models e, infer pc and θ = (δt, δf , st, sf)


(c∗,θ∗) = argmaxc,θp(e, c;θ)

Iterative algorithm: coordinate descent

c(k+1) = argmaxcp(e, c; θ(k)

) Clustering

θ(k+1)

= argmaxθp(e, c(k+1);θ) Parameter Estimation

• Clustering is equivalent to max-weight k-partite matching

• NP-hard in general, but can be solved efficiently in practice

• Overall algorithm is provably convergent

73 / 54

Various extensions of SES

Point processes in other domains: (t, x), (x , y , z), . . .

• Analysis of spike trains (t)

• Study of cell migration using molecular imaging (t, x , y)

Key: meaningful events in signal (spike, burst, peak, etc.)

Time-dependent SES → dynamics of synchrony

no stimulusno stimulus stimulus

tst

f

More complex “parent” structure (future work)mother process

n given processes n given processes

. . . . . .

“parent” processes

74 / 54

Want to know more? Want to try it out?

http://www.dauwels.com/SESToolbox/SES.html75 / 54

SES will also be included in Spike Train Analysis Toolkit

http://neuroanalysis.org/toolkit/ 76 / 54

Standard invasive approach requires long hospitalization

• Implant electrodes in the brain

• Record EEG until sufficient seizures obtained (5 days–more)

• Determine seizure onset zone from “seizure” EEG

Critical issues

• risk (infection, prolonged seizures, damage to the brain)

• discomfort

• cost

77 / 54

Univariate measures normal in hypersynchronous areas

78 / 54

How to determine seizure onset area using hypersynchrony?

Adaptive threshold required


Graphical model

binary variable h (hypersynchronous?)

spatial correlation

local evidence (synchrony s)

h = argmaxhp(s,h)79 / 54

Future work : Framework for clinical decision making

• Develop graphical model for each source of information• Features obtained through signal processing

e.g., hypersynchrony in rest EEG• Clinical behavior• Biophysical prior knowledge

• Combine those models for decision making

d = argmaxd log p(i,d) − q(d)

decision d, “information” i, statistical model p, risk q

• Graphical models → efficient optimization algorithms

80 / 54

Similarity measures: correlation and beyond

• Correlation coefficient

r

=1

N

N∑

k=1

(x(k) − x)

σx

(y(k) − y)

σy

• Magnitude square coherence

c(f )

=|〈X (f )Y ∗(f )〉|2

|〈X (f )〉| |〈Y (f )〉|

• Phase coherence

φ(f )

= arg〈X (f )Y ∗(f )〉

• Corr-entropy coefficient

rE

=

1N

∑Nk=1 κ(x(k), y(k)) − 1

N2

∑Nk,ℓ=1 κ(x(k), y(ℓ))

√

KX − 1N2

∑Nk,ℓ=1 κ(x(k), x(ℓ))

√

KY − 1N2

∑Nk,ℓ=1 κ(y(k), y(ℓ))

,

where

KX =1

N

N∑

k=1

κ(x(k), x(k)) KY =1

N

N∑

k=1

κ(y(k), y(k)),

and where κ is a symmetric positive definite kernel81 / 54

Similarity measures: Granger causality

• Multivariate autoregressive (MVAR) model

x(k) =

p∑

ℓ=1

A(j)x(k − ℓ) + e(k),

where x(k)

= (x1(k), x2(k), . . . , xn(k))T , p is model order, the modelcoefficients A(j) are n × n matrices, and e(k) is a zero-mean Gaussian randomvector of size n; can be rewritten as:

e(k) =

p∑

ℓ=0

A(j)x(k − ℓ),

where A(0) = I (identity matrix) and A(j)

= −A(j) for j > 0

• Transform into the frequency domain (by applying the z-transform and by

substituting z

= e−2πif ∆t , where 1/∆t is the sampling rate):

X (f ) = A−1(f )E(f )

= H(f )E(f )

• Power spectrum matrix of the signal x(k) is determined as

S(f )

= E[

X (f )X (f )†]

= H(f )VH†(f ),82 / 54

Similarity measures: Granger causality

• Partial coherence (PC)

Cij (f )

=Mij(f )

√

Mii (f )√

Mjj(f )∈ C,

where Mij is minor (i , j) of S = determinant of S with row i , column j removed

• Directed transfer function (DTF)

γ2ij (f )

=|Hij(f )|2

∑mj=1 |Hij(f )|2

∈ [0, 1]

= fraction of inflow to channel i stemming from channel j

• Full frequency directed transfer function (ffDTF)

F 2ij (f )

=|Hij(f )|2

∑

f

∑mj=1 |Hij(f )|2

∈ [0, 1]

• Partial directed coherence (PDC)

Pij (f )

=Aij(f )

√

∑mi=1 |Aij(f )|2

∈ C,

= fraction of outflow from channel j to channel i .

• Direct directed transfer function (dDTF)

χ2ij(f )

= F 2ij (f )C2

ij (f ) ∈ [0, 1]83 / 54

Phase synchrony

• Interdependence between the instantaneous phases φx and φy of two signals x

and y

• Instantaneous phases may be strongly synchronized even when the amplitudes ofx and y are statistically independent

• Instantaneous phase φWx (k, f )

φWx (k, f )

= arg[X (k, f )],

where X (k, f ) is time-frequency transform of x

• Phase synchrony index

γ =∣

∣

⟨

e i(nφx−mφy )⟩∣

∣ ∈ [0, 1],

where n and m are integers (usually n = 1 = m)

84 / 54

Information theoretic measures

• Mutual information X and Y :

I (X ;Y )

= H(X ) + H(Y ) − H(X , Y ),

where H(X ) and H(Y ) is the Shannon entropy of X and Y respectively, andH(X , Y ) is the joint entropy of X and Y

• Mutual information in time-frequency domain

IW (Cx , Cy ,Cxy ) =∑

k,f

Cxy (k, f ) logCxy (k, f )

Cx (k, f )Cy (k, f ),

where the (normalized) cross time-frequency distribution of x and y is defined as

Cxy (k, f )

=|X (k, f )Y ∗(k, f )|

∑

k,f |X (k, f )Y ∗(k, f )|.

• Kullback-Leibler Divergence

K(Cx , Cy ) =∑

k,f

Cx (k, f ) logCx (k, f )

Cy (k, f ).

• Renyi Divergence

Dα(Cx ,Cy ) =1

α − 1log∑

k,f

[Cx (k, f )]α[Cy (k, f )](1−α)

85 / 54

State space based measures

• Mean Euclidian distance RMk

(X ) between X (k) to K nearest neighbors

• Mean Euclidian distance RMk

(X |Y ) between delay vector X (k) and the X -delayvectors corresponding to the M nearest neighbors of Y (k)

• Measures by Quiroga et al.

Sk(X |Y ) =1

N

N∑

k=1

RMk

(X )

RMk

(X |Y )∈ (0, 1]

Hk(X |Y ) =1

N

N∑

k=1

logRk(X )

RMk

(X |Y )

Nk(X |Y ) =1

N

N∑

k=1

Rk(X ) − RMk

(X |Y )

Rk(X ) 86 / 54

p-values for synchrony measures (MCI vs. Control)

88 / 54

How do p-values depend on initial SES parameters?

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−1.5

−1.4

−1.3

−1.2

−1.1

−1

−0.9

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26

−2

−1.8

−1.6

−1.4

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−3

−2.5

−2

−1.5

s(0

)t

s(0)f


0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−2.5

−2.4

−2.3

−2.2

−2.1

−2

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26−2.8

−2.6

−2.4

−2.2

−2

−1.8

s(0

)t

s(0)f

0.05 0.1 0.15

0.14

0.16

0.18

0.2

0.22

0.24

0.26

−3.5

−3

−2.5

s(0

)t

s(0)f


0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25 −2

−1.8

−1.6

−1.4

−1.2

−1

−0.8

s(0

)t

s(0)f

0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25

−2.2

−2

−1.8

−1.6

−1.4

s(0

)t

s(0)f

0.05 0.1 0.15

0.05

0.1

0.15

0.2

0.25

−2.5

−2

−1.5

−1

s(0

)t

s(0)f

β = 0.01, 7 zones β = 0.001, 7 zones β = 0.0001, 7 zones 89 / 54

p-values for ffDTF (MCI vs. Control)

90 / 54

signal processing for diagnosis and treatment of …introduction preview diagnosis of ad treatment...

Documents