Post on 31-Jan-2016
From tens to millions of neurons
Computer Architecture Group
Paul Fox
How computer architecture can help
What hinders the scaling of neural computation?
Neural Computation = Communication
+ Data Structures + Algorithms
But almost everybody ignores the first two!
What is Computer Architecture?
• Designing computer systems that are appropriate for their intended use
• Relevant design points for neural computation are:
• Memory hierarchy
• Type and number of processors
• Communication infrastructure
Just the things that existing approaches don’t consider!
Our approach
Bluehive system
• Vast communication and memory resources
• Reprogrammable hardware using FPGAs
We can explore different system designs and see which is most appropriate for neural computation
Organisation of data for spiking neural networks
First approach – Custom FPGA pipeline
Running 256k Neurons
• Real-time performance for at least 256k neurons over 4 boards
• Saturates memory bandwidth
• Plenty of FPGA area left, so could use a more complex neuron model
• But only if it doesn’t need more data
• But time-consuming to build, and not really usable by non-computer scientists
Can we use more area to make something that is easier to program but still attains performance approaching the custom pipeline?
Single scalar processor
[Diagram: DDR2 RAM (from a 200 MHz FPGA) feeds block RAMs over a 256-bit data bus; the on-chip block-RAM bus can be any width, but a single scalar processor consumes only one 32-bit transfer at a time]
Multicore scalar processor
[Diagram: the same DDR2-to-block-RAM hierarchy, now shared by many scalar processors; spreading neurons across cores ruins spatial locality and makes inter-processor communication necessary]
Vector processor – many words at a time
[Diagram: the same DDR2-to-block-RAM hierarchy feeding a single vector processor, which consumes many words per transfer]
Productivity vs. Performance
[Chart: runtime (s) against lines of code]
• Izhikevich.c (NIOS II): ~200 lines of code, 125 s runtime
• IzhikevichVec.c (dual-core NIOS II + BlueVec): ~500 lines of code, 12 s runtime
• NeuronSimulator/*.bsv (Bluespec System Verilog): 5k-10k lines of code
The vector version doesn’t have much more code than the original, yet gives a massive performance improvement.
Example: LIF character recognition

LIF.c (324 lines of code):
                 Time (ms)    %
I-values            331.7    83
Gain/Bias            39.2     9
Neuron updates       26.8     6
Total               397.7

LIFVec.c (496 lines of code):
                 Time (ms)    %
I-values              7.9    42
Gain/Bias             3.6    18
Neuron updates        5.8    30
Total                18.9
LIF simulator on FPGA running a Nengo model
Conclusion
• When designing a neural computation system you need to think about every part of the computation, not just the algorithm
• Some form of vector processor is likely to be most appropriate
Or write your model in NeuroML and let us do the hard work!
Questions?