csee4840 embedded system design - columbia …sedwards/classes/2015/4840/reports/embedd… ·...

Online Unsupervised Spike Sorting Accelerator Zhewei Jiang, Jamis Johnson CSEE4840 Embedded System Design


Page 1: CSEE4840 Embedded System Design - Columbia …sedwards/classes/2015/4840/reports/Embedd… · CSEE4840 Embedded System Design. Motivation for Hardware Spike Sorting •Motivated by

Online Unsupervised Spike Sorting Accelerator

Zhewei Jiang, Jamis Johnson

CSEE4840 Embedded System Design

Page 2:

Motivation for Hardware Spike Sorting

• Motivated by transmission cost: high energy consumption, throughput limitation

• Challenges: computation complexity (feature extraction, clustering) and memory requirement (feature distribution, cluster centroid)

[Figure labels: Prosthesis, dozens/hundreds of channels; Spike Sorting; 32~64 × 8 bits/spike before sorting, 3~4 bits/spike after sorting]

Page 3:

Algorithm

Stage             | Algorithm                                | Main Module(s)
Distribution      | 1D Kernel Density Estimation             | DISTR (main FSM)
Sample Criterion  | Laplacian                                | DISTR (Laplace FSM)
Training          | Bimodal prediction & replacement policy  | CAM_CLUSTER, GRIDFIND
Sorting           | Closest Grid (grid granularity)          | CAM_CLUSTER, GRIDFIND, WTA

Page 4:

Algorithm - Kernel Density Estimation

• Approximate Bayesian Optimal Kernel Density Estimation: Marr wavelet as kernel

• Hardware cost, assuming 8-bit resolution: 256² memory entries, updates per spike ∝ kernel smoothing parameter

Page 5:

Algorithm - Kernel Density Estimation

• Histogram: trivial kernel

• Hardware cost: 256² memory entries, one update per spike

Page 6:

Algorithm - Kernel Density Estimation

• Histogram: trivial kernel

• Hardware cost: 256² memory entries, one update per spike

• 64² memory entries, one update per spike
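The histogram case can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two feature axes, the 8-bit input range and the mapping to 64 bins by dropping the two low bits are guesses consistent with the 256² and 64² figures above, and the names `bin_of` and `update_histogram` are hypothetical.

```python
BINS_PER_AXIS = 64   # assumed: 64 x 64 entries, as in the 64² figure
FEATURE_BITS = 8     # assumed input resolution (the 8-bit case)
SHIFT = FEATURE_BITS - 6  # drop low bits to go from 256 levels to 64 bins


def bin_of(value):
    """Map an 8-bit feature value to a bin index in 0..63."""
    if not 0 <= value < (1 << FEATURE_BITS):
        raise ValueError("feature value out of 8-bit range")
    return value >> SHIFT


def update_histogram(hist, fx, fy):
    """Trivial kernel: exactly one memory entry is incremented per spike."""
    hist[bin_of(fx)][bin_of(fy)] += 1


hist = [[0] * BINS_PER_AXIS for _ in range(BINS_PER_AXIS)]
for fx, fy in [(0, 0), (3, 1), (4, 0), (255, 128)]:
    update_histogram(hist, fx, fy)

total_updates = sum(sum(row) for row in hist)
```

A smoothing kernel would instead touch every entry inside its support on each spike, which is why the update count grows with the smoothing parameter while the histogram stays at one update.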

Page 7:

Kernel Density Estimation – Trivial Case

• 2D Kernel Density Estimation: Laplacian of Gaussian (Marr wavelet)

• 1D Kernel Density Estimation: Gaussian (or LoG, similar hardware cost if precomputed)

• Histogram: smoothing with bin width

Page 8:

Laplacian

Convolution of the histogram with the negative Laplace operator [-1, 2, -1]

• The downward trend of the 2nd derivative is used to segment the distribution space into informative and uninformative regions

• Only data in the shaded area are used for training
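As a rough illustration of this step, the sketch below convolves a 1D histogram with [-1, 2, -1] and marks a bin as informative when the result exceeds a threshold. The zero-valued endpoints, the `threshold` parameter and the function names are assumptions made for the example, not details taken from the design.

```python
def negative_laplacian(hist):
    """Convolve hist with [-1, 2, -1]; endpoints are left at 0 (assumed)."""
    out = [0] * len(hist)
    for i in range(1, len(hist) - 1):
        out[i] = 2 * hist[i] - hist[i - 1] - hist[i + 1]
    return out


def informative_mask(hist, threshold=0):
    """True where the negated second derivative exceeds the threshold."""
    return [v > threshold for v in negative_laplacian(hist)]


# Toy histogram with two peaks, at bins 3 and 8.
hist = [0, 1, 4, 9, 4, 1, 0, 2, 6, 2, 0]
lap = negative_laplacian(hist)
informative_bins = [i for i, m in enumerate(informative_mask(hist)) if m]
```

With a threshold of 0 the test is equivalent to checking 2·hist[i] > hist[i-1] + hist[i+1], which needs only a shift, an add and a compare per bin.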

Page 9:

Informative Sample Vector Mask

• The informative sample vector mask reduces sensitivity to outliers during the training phase

• Certain features can share distribution space if no overlap can occur (e.g. peak & hyperpolarization)

Page 10:

Bayesian Classification

Copyright @ Yaroslav Bulatov

At this point, classification can be done through:

• Distance metric (e.g. Probabilistic Neural Network, MCK, …)

• Boundary metric (Bayesian boundary)

Page 11:

Top-Level Hardware Architecture

[Block diagram labels]

• Interface signals: Spk_f[11:0], Valid, IDX[2], IDX[1], IDX[0], ker, fin, laplace, clear, infm

• Memories: MEM1: DISTRIBUTION; MEM2: INFO SAMPLE; MEM3: GRID; MEM4: Grid_index / Comb: grid match

• Combinational blocks: GRIDFIND; WTA (min distance)

• Internal buses: Informative[63:0], boundary[41:0], Valid_grid[6:0], Grid_idx[5:0], V[7:0], D[1:0] x8

Page 12:

S/H Interface

[Interface diagram labels]

• HARDWARE: Spk_f[11:0], Valid, IDX[2], IDX[1], IDX[0], ker, fin, laplace, clear, Leak, Update

• AVALON BUS: Address, Writedata, Readdata; Clk, reset, read, write, readdata, writedata; Clk, reset, read, write

• SOFTWARE: scheduling, commands

Page 13:

Modules [Distribution]

[Block diagram labels: FEATURE, +1, w, KDE, w, Boundary_thr, Laplacian, convolve, Boundary, is_f, CADDR, Ctrl]

Page 14:

Modules [Distribution]

[Datapath diagram labels: KDE[i]<<1, SUB, SUB, Comparator, Boundary_thr, Comparator, three registers (Register), informative, Boundary, convolve]

Page 15:

Modules [Training & Sorting]

[Dataflow diagram labels]

• FIS (Informative Features)

• SEGS (Grid Boundary – Peak) with GRIDFIND

• SEGS (Grid Boundary – Hyperpolarization) with GRIDFIND

• FEATURES

• CAM_CLUSTER (Content Associative Memory, grid indexes as cluster identifier)

• ATTR, WTA, Output

Page 16:

Modules [GRIDFIND]

[Schematic labels]

• Inputs: Feature, Boundary, Load_b

• Four 8-bit registers (Register), each feeding a Comparator; four flip-flops; output Index

• Peak instance: Fp, Boundary_fp, Load_bp, Idx_p

• Hyperpolarization instance: Fh, Boundary_fh, Load_bh, Idx_h
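One way to read the GRIDFIND schematic is as a bank of parallel comparators whose results are reduced to a grid index. The sketch below assumes ascending boundaries, a "feature >= boundary" comparison and up to seven boundaries (a guess from the boundary[41:0] and Valid_grid[6:0] widths on the top-level slide); the function name `grid_index` is hypothetical.

```python
def grid_index(feature, boundaries, valid):
    """Count the valid boundaries at or below the feature value.

    Each (boundary, valid) pair models one comparator; the count of
    comparators that fire is the grid index, so seven boundaries give
    indexes 0..7, which fits in 3 bits.
    """
    if len(boundaries) != len(valid):
        raise ValueError("one valid bit is needed per boundary")
    return sum(1 for b, v in zip(boundaries, valid) if v and feature >= b)


# Four learned boundaries; the remaining three slots are unused.
boundaries = [10, 20, 30, 40, 0, 0, 0]
valid = [True, True, True, True, False, False, False]

idx_low = grid_index(5, boundaries, valid)
idx_mid = grid_index(25, boundaries, valid)
idx_high = grid_index(63, boundaries, valid)
```

Masking with the valid bits keeps unused boundary slots, which hold 0 in this example, from firing for every feature value.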

Page 17:

Modules [CAM_CLUSTER]

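[Schematic labels]

• Two registers (Register) loaded by Load_C, each feeding a SUB with inputs F_p and F_h; ADD produces Distance

• Idx_p and Idx_h (3b each) produce Match

• Usefulness states: Strong Cluster, Weak Cluster, Outlier, Free; transition label: Update & (miss|leak)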

Each cluster is associated with a bimodal updater: every matching spike strengthens the usefulness of the cluster, and periodic leakage weakens it. False clusters caused by outliers are cleared this way.
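The updater behaves much like a small saturating counter. The following sketch is a guess at its behaviour, not the actual state machine: the ordering Free < Outlier < Weak < Strong, the single-step transitions, and the priority of a match over a weakening event are all assumptions, and `weaken` stands in for whatever the (miss|leak) condition is in hardware.

```python
from enum import IntEnum


class Usefulness(IntEnum):
    FREE = 0      # entry can be reused for a new cluster
    OUTLIER = 1   # seen once, not yet trusted
    WEAK = 2
    STRONG = 3


def step(state, match, weaken):
    """Saturating update: up on a match, down on a weakening event."""
    if match:
        return Usefulness(min(state + 1, Usefulness.STRONG))
    if weaken:
        return Usefulness(max(state - 1, Usefulness.FREE))
    return state


# A real cluster keeps matching and saturates at STRONG.
s = Usefulness.FREE
for _ in range(5):
    s = step(s, match=True, weaken=False)
real_cluster = s

# A false cluster created by one outlier is freed by the next leak.
s = step(Usefulness.FREE, match=True, weaken=False)
s = step(s, match=False, weaken=True)
false_cluster = s
```

Under these assumptions an entry needs repeated matches between leaks to stay allocated, which is what lets outlier-induced clusters drain away.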

Page 18:

Modules [WTA]

[Schematic labels]

• Inputs: Candidate[0] to Candidate[7] with distances DIST0 to DIST7

• Tree of seven compare cells (W)

• Outputs: Cluster_idx[0], Cluster_idx[1], VALID

• Compare cell W: inputs DA, DB, vA, vB; comparator (>) drives sel; outputs D, v
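The winner-take-all tree can be modelled as a tournament over (distance, valid, index) triples. This is a behavioural sketch only: resolving ties toward the lower index and the names `compare_cell` and `winner_take_all` are assumptions, and the input count must be a power of two.

```python
def compare_cell(a, b):
    """One W cell: pass on the closer valid candidate.

    a and b are (distance, valid, index) triples. b wins only if it is
    valid and either a is invalid or b is strictly closer.
    """
    da, va, _ = a
    db, vb, _ = b
    if vb and (not va or db < da):
        return b
    return a


def winner_take_all(dists, valids):
    """Reduce candidates pairwise; 8 inputs use 7 compare cells."""
    layer = [(d, v, i) for i, (d, v) in enumerate(zip(dists, valids))]
    while len(layer) > 1:
        layer = [compare_cell(layer[k], layer[k + 1])
                 for k in range(0, len(layer), 2)]
    _, valid, index = layer[0]
    return (index if valid else None), valid


dists = [5, 3, 9, 3, 7, 1, 8, 2]
all_valid = winner_take_all(dists, [True] * 8)
one_masked = winner_take_all(dists, [True] * 5 + [False] + [True] * 2)
none_valid = winner_take_all(dists, [False] * 8)
```

Carrying the valid bit alongside the distance lets unused cluster entries lose every comparison without needing a reserved maximum distance value.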

Page 19:

Design – Timing

Sequential components:

• FSM (for histogram generation in DISTR)

• FSM (for Laplacian in DISTR)

• Usefulness updater (bimodal SM for outlier handling in CAM_CLUSTER)

Posedge triggers the state transition, State <= nState, and the FSM outputs (registered outputs), which are based on nState.

Negedge triggers the nState transition: nState <= f(state).

The combinational circuit in each FSM has half a cycle to propagate, which guarantees no set-up/hold violation from clock jitter or skew.
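The two-edge scheme can be mimicked in software to see the ordering of events. The class below is a behavioural sketch, not RTL: the `inputs` argument to the next-state function is an assumption (the slide writes only f(state)), and the IDLE/RUN/DONE machine is a made-up example.

```python
class TwoPhaseFSM:
    """nState is computed on the falling edge, State and outputs on the rising edge."""

    def __init__(self, next_state_fn, output_fn, reset_state):
        self.state = reset_state
        self.n_state = reset_state
        self.output = output_fn(reset_state)
        self._next_state_fn = next_state_fn
        self._output_fn = output_fn

    def negedge(self, inputs):
        # nState <= f(state): next-state logic settles in the low half cycle.
        self.n_state = self._next_state_fn(self.state, inputs)

    def posedge(self):
        # State <= nState, and registered outputs are derived from nState.
        self.state = self.n_state
        self.output = self._output_fn(self.n_state)


def next_state(state, start):
    if state == "IDLE":
        return "RUN" if start else "IDLE"
    if state == "RUN":
        return "DONE"
    return "IDLE"


def outputs(state):
    return {"busy": state == "RUN", "fin": state == "DONE"}


fsm = TwoPhaseFSM(next_state, outputs, "IDLE")
trace = []
fin_trace = []
for start in (False, True, False, False):
    fsm.negedge(start)
    fsm.posedge()
    trace.append(fsm.state)
    fin_trace.append(fsm.output["fin"])
```

Because the outputs are registered from nState on the same rising edge as State, they change together with the state rather than one combinational delay later.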

Page 20:

Design – Timing (Legacy)

[Schematic labels: negedge-triggered FF; level-sensitive latch L (non-clk enable); Comb (bimodal); WRITE; CLK]

A retention latch is used in a power-gating version of the design, where no retention flip-flop is available. A non-retention FF trails the latch so that the FSM does not need an extra cycle to read its state back from L after power gating.

Page 21:

Design – Timing (Legacy)

[Schematic labels, first circuit: negedge-triggered FF; level-sensitive latch L (non-clk enable); Comb (bimodal); WRITE; CLK]

[Schematic labels, second circuit: single flip-flop; Comb (bimodal); WRITE; CLK]

Page 22:

Design – Timing of Module DISTR

• Internal FSM timing as in the previous slides

• Ack-based interface (signal Fin): communication is locked until software reads out "Fin"

• Command Ker: [read Max & incr][write][read Min & incr][write][Fin]

• Special case on memory overflow: [read & >> 1][write]…(64 entries)

• Command Laplace: [read][read][read]…(64 entries)…[Fin]

• A secondary FSM finds informative sample regions based on (entry[i]<<1) > (entry[i-1]+entry[i+1])
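The Ker command and its overflow case can be sketched as follows. Several details are assumptions: 8-bit counters, "Max" and "Min" read as the histogram bins of the peak and hyperpolarization features sharing one 64-entry memory, and halving performed before the increment that would overflow. The name `ker_command` is hypothetical.

```python
NUM_ENTRIES = 64
COUNTER_MAX = 255  # assumed 8-bit memory entries


def ker_command(hist, max_bin, min_bin):
    """Increment the two addressed bins; halve all entries on overflow.

    Returns True if the overflow path ([read & >> 1][write] over all
    64 entries) was taken at least once.
    """
    halved = False
    for b in (max_bin, min_bin):
        if hist[b] == COUNTER_MAX:
            # Right-shift every entry so relative bin heights are kept.
            for i in range(len(hist)):
                hist[i] >>= 1
            halved = True
        hist[b] += 1
    return halved


hist = [0] * NUM_ENTRIES
normal = ker_command(hist, 5, 40)
after_normal = (hist[5], hist[40])

hist = [0] * NUM_ENTRIES
hist[10], hist[20] = COUNTER_MAX, 4
overflowed = ker_command(hist, 10, 40)
after_overflow = (hist[10], hist[20], hist[40])
```

Halving every entry preserves the shape of the distribution, so a comparison of the form (entry[i]<<1) > (entry[i-1]+entry[i+1]) is largely unaffected apart from rounding.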

Page 23:

Design – Timing of Module GC

[Dataflow diagram labels: FIS (Informative Features), SEGS, GRIDFIND, FEATURES, CAM_CLUSTER, ATTR, WTA, Output; two SR flip-flops (S, R, Q)]

Page 24:

RTL Sim (Design Compiler)

Page 25:

Issues & Lessons Learned

• Testability: individual modules, overall functionality

• Debugging