
From tens to millions of neurons

Computer Architecture Group

Paul Fox

How computer architecture can help

What hinders the scaling of neural computation?

Neural Computation = Communication + Data Structures + Algorithms

But almost everybody ignores the first two!

What is Computer Architecture?

• Designing computer systems that are appropriate for their intended use

• Relevant design points for neural computation are:

• Memory hierarchy

• Type and number of processors

• Communication infrastructure

Just the things that existing approaches don’t consider!

Our approach

Bluehive system

• Vast communication and memory resources
• Reprogrammable hardware using FPGAs

Can explore different system designs and see what is most appropriate for neural computation

Organisation of data for spiking neural networks
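The transcript doesn't reproduce the layout shown on this slide, but as a rough, hypothetical sketch (not the actual Bluehive format), a spiking network's data might be kept in flat per-neuron state arrays plus a CSR-style synapse table and a spike queue, so that walking all neurons produces long sequential memory accesses:

```c
/* Hypothetical data layout for a spiking neural network - an
 * illustrative sketch only, not the format used on Bluehive. */
#include <stdint.h>

/* Per-neuron state kept in flat arrays so that iterating over all
 * neurons produces sequential bursts a wide memory bus can service. */
typedef struct {
    uint32_t num_neurons;
    float   *v;        /* membrane potentials                       */
    float   *u;        /* recovery variables                        */
    float   *i_in;     /* input current accumulated this timestep   */
} NeuronState;

/* Synapses in CSR-like form: each source neuron owns one contiguous
 * block of (target, weight) pairs, located via the offset array. */
typedef struct {
    uint32_t *offset;  /* num_neurons + 1 entries                   */
    uint32_t *target;  /* target neuron index per synapse           */
    float    *weight;  /* synaptic weight per synapse               */
} SynapseTable;

/* Spikes emitted in the current timestep, delivered in the next one. */
typedef struct {
    uint32_t  count;
    uint32_t *source;  /* indices of neurons that fired             */
} SpikeQueue;
```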

First approach – Custom FPGA pipeline

[Figure: system running 256k neurons]

• Real-time performance for at least 256k neurons over 4 boards

• Saturates memory bandwidth

• Plenty of FPGA area left, so could use a more complex neuron model

• But only if it doesn’t need more data

• But developing it is time-consuming, and the result is not really usable by non-computer-scientists

Can we use more area to make something that is easier to program but still attains performance approaching the custom pipeline?

Single scalar processor

[Diagram: DDR2 RAM (driven from the 200 MHz FPGA) connects to several block RAMs over a 256-bit data bus; the block RAMs connect to the processor over a data bus of any width, but a scalar processor makes only one 32-bit transfer at a time]
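Izhikevich.c itself isn't shown in the transcript, but a scalar update step for the standard Izhikevich model gives a feel for the access pattern: every state variable is fetched and written back one 32-bit word at a time, leaving most of the 256-bit bus idle. A sketch under that assumption, not the code from the talk:

```c
/* Illustrative scalar Izhikevich update (standard model equations;
 * not the actual Izhikevich.c from the talk). Every field access is
 * a separate 32-bit load/store, so a 256-bit memory bus is mostly idle. */
void update_neurons_scalar(int n, float *v, float *u, const float *i_in,
                           const float *a, const float *b,
                           const float *c, const float *d,
                           unsigned char *fired)
{
    const float dt = 1.0f;                       /* 1 ms timestep */
    for (int i = 0; i < n; i++) {
        float vi = v[i], ui = u[i];
        vi += dt * (0.04f * vi * vi + 5.0f * vi + 140.0f - ui + i_in[i]);
        ui += dt * (a[i] * (b[i] * vi - ui));
        fired[i] = (vi >= 30.0f);
        if (fired[i]) {                          /* spike: reset state */
            vi = c[i];
            ui += d[i];
        }
        v[i] = vi;
        u[i] = ui;
    }
}
```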

Multicore scalar processor

[Diagram: the same 256-bit DDR2-to-block-RAM path, now with many scalar processors sharing the block RAMs. Splitting neurons across processors ruins spatial locality and requires inter-processor communication]

Vector processor – many words at a time

[Diagram: the same DDR2 RAM and block RAMs, now feeding a single vector processor that transfers many words at a time]
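The BlueVec instruction set isn't given in the transcript, so the following is a plain-C sketch of the same update written block-at-a-time; each block move stands in for one wide transfer that fills the 256-bit bus instead of a stream of single 32-bit loads.

```c
/* Block-at-a-time Izhikevich update, as a vector processor would see it.
 * Plain-C sketch only: BlueVec's real ISA is not shown in the talk.
 * Assumes n is a multiple of VLEN and shared a, b, c, d for brevity. */
#include <string.h>

#define VLEN 8   /* 8 x 32-bit lanes = 256 bits, matching the data bus */

void update_neurons_vector(int n, float *v, float *u, const float *i_in,
                           float a, float b, float c, float d,
                           unsigned char *fired)
{
    const float dt = 1.0f;   /* 1 ms timestep */
    for (int i = 0; i < n; i += VLEN) {
        float vv[VLEN], vu[VLEN], vin[VLEN];
        memcpy(vv,  &v[i],    sizeof vv);   /* one wide load */
        memcpy(vu,  &u[i],    sizeof vu);   /* one wide load */
        memcpy(vin, &i_in[i], sizeof vin);  /* one wide load */

        for (int lane = 0; lane < VLEN; lane++) {   /* element-wise ops */
            vv[lane] += dt * (0.04f * vv[lane] * vv[lane] + 5.0f * vv[lane]
                              + 140.0f - vu[lane] + vin[lane]);
            vu[lane] += dt * (a * (b * vv[lane] - vu[lane]));
            fired[i + lane] = (vv[lane] >= 30.0f);
            if (fired[i + lane]) { vv[lane] = c; vu[lane] += d; }
        }

        memcpy(&v[i], vv, sizeof vv);       /* one wide store */
        memcpy(&u[i], vu, sizeof vu);       /* one wide store */
    }
}
```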

Productivity vs. Performance

[Chart: runtime (s) vs. lines of code for three implementations]

Izhikevich.c on a NIOS II: ~200 lines, ~125 s
IzhikevichVec.c on a dual-core NIOS II + BlueVec: ~500 lines, ~12 s
NeuronSimulator/*.bsv in Bluespec System Verilog (custom pipeline): 5k-10k lines

The vector version doesn't have much more code than the original, but gives a massive performance improvement.

Example for LIF character recognition

                    LIF.c (324 lines of code)     LIFVec.c (496 lines of code)
                    Time (ms)       %             Time (ms)       %
I-values            331.7           83            7.9             42
Gain/Bias           39.2            9             3.6             18
Neuron updates      26.8            6             5.8             30
Total               397.7                         18.9

LIF simulator on FPGA running a Nengo model
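LIF.c isn't reproduced in the transcript; the sketch below shows what the three stages named in the table might look like, assuming a Nengo/NEF-style pipeline (encoder projection for the I-values, then gain/bias, then the LIF membrane update). Names and sizes here are illustrative, not from the talk.

```c
/* Hypothetical sketch of the three stages in the table above, assuming
 * a Nengo/NEF-style LIF pipeline - not the actual LIF.c from the talk. */
#define N_NEURONS 1024     /* illustrative sizes, not from the talk */
#define N_DIMS    16

/* Stage 1 - "I-values": project the D-dimensional represented value x
 * onto each neuron's encoder. This O(N*D) loop dominates the runtime
 * (83% in the scalar version above). */
void compute_i_values(float i_val[N_NEURONS],
                      const float enc[N_NEURONS][N_DIMS],
                      const float x[N_DIMS])
{
    for (int i = 0; i < N_NEURONS; i++) {
        float acc = 0.0f;
        for (int d = 0; d < N_DIMS; d++)
            acc += enc[i][d] * x[d];
        i_val[i] = acc;
    }
}

/* Stage 2 - "Gain/Bias": turn the encoded value into an input current. */
void apply_gain_bias(float current[N_NEURONS], const float i_val[N_NEURONS],
                     const float gain[N_NEURONS], const float bias[N_NEURONS])
{
    for (int i = 0; i < N_NEURONS; i++)
        current[i] = gain[i] * i_val[i] + bias[i];
}

/* Stage 3 - "Neuron updates": leaky integrate-and-fire dynamics
 * (refractory period omitted for brevity). */
void update_lif(float v[N_NEURONS], unsigned char fired[N_NEURONS],
                const float current[N_NEURONS])
{
    const float dt = 0.001f, tau_rc = 0.02f, v_th = 1.0f;
    for (int i = 0; i < N_NEURONS; i++) {
        v[i] += (dt / tau_rc) * (current[i] - v[i]);  /* leaky integration */
        if (v[i] < 0.0f) v[i] = 0.0f;
        fired[i] = (v[i] >= v_th);
        if (fired[i]) v[i] = 0.0f;                    /* reset after a spike */
    }
}
```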

Conclusion

• When designing a neural computation system you need to think about every part of the computation, not just the algorithm

• Some form of vector processor is likely to be most appropriate

Or write your model in NeuroML and let us do the hard work!

Questions?
