christopher batten - chess.eecs.berkeley.edu · power constrains all modern systems mit csail...
TRANSCRIPT
Manycore Vector-Thread ArchitecturesChristopher Batten
Computer Science and Artificial Intelligence LaboratoryMassachusetts Institute of Technology
Parallel Computing LaboratoryUniversity of California, Berkeley
Spring 2009
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
The Transition to Multicore
MIT CSAIL Christopher Batten 2 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Power Constrains All Modern Systems
MIT CSAIL Christopher Batten 3 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
How does multicore help reduce power?
MIT CSAIL Christopher Batten 4 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Vector-Threading Is Energy-Efficient and Flexible
MIT CSAIL Christopher Batten 5 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation
Vector-Thread Architectures
Scale VT Processor
Maven VT Processor
Future Research Directions
MIT CSAIL Christopher Batten 6 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Multithreaded Architectures
MIT CSAIL Christopher Batten 7 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Multithreaded Architectures
MIT CSAIL Christopher Batten 7 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Vector Architectures
MIT CSAIL Christopher Batten 8 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Vector Architectures
MIT CSAIL Christopher Batten 8 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Vector-Thread Architectures
MIT CSAIL Christopher Batten 9 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Energy-Efficiency for Various Kinds of Kernels
MIT CSAIL Christopher Batten 10 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Energy-Efficiency for Various Kinds of Kernels
MIT CSAIL Christopher Batten 10 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Mapping FFT to VT Programming Model
MIT CSAIL Christopher Batten 11 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Mapping Dither to VT Programming Model
MIT CSAIL Christopher Batten 12 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Mapping IP Lookup to VT Programming Model
MIT CSAIL Christopher Batten 13 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation
Vector-Thread Architectures
Scale VT Processor
Maven VT Processor
Future Research Directions
MIT CSAIL Christopher Batten 14 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation for Scale VT Processor
Goal: Evaluate energy-efficiencyand flexibility of vector-threadingby building a prototype chip withone control processor andseveral lanes
Initial implementation targetsmodern embedded systems
• Heavily power constrained
• Increasingly requirehigh-performance
• Support growing numberof features
set-top boxes • game consolesmedia players • network routers
wireless base stations • smart phones
MIT CSAIL Christopher Batten 15 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Atomic Instruction Block Interleaving
MIT CSAIL Christopher Batten 16 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Explicit Thread Fetches
MIT CSAIL Christopher Batten 17 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Scale Features
• Atomic Instruction Blocks
• Explicit Thread Fetches
• Clustering – reduce vector register file area, energy
• Cross-VP Ring Network – loop carried dependencies
• Configurable Vector Length – more elements, fewer vector registers
• Shared Vector Registers – temporaries, reductions, and constants
• AIB Caches – reduce fetch energy and amplify fetch issue bandwidth
• Vector Refill/Access Decoupling – better hide memory latencies
• Vector Segment Accesses – two-dimensional access patterns
International Symposium on Computer Architecture (ISCA), 2004International Symposium on Microarchitecture (MICRO), 2004
MIT CSAIL Christopher Batten 18 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Scale VT Processor Prototype Chip
MIT CSAIL Christopher Batten 19 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing FFT on Scale
MIT CSAIL Christopher Batten 20 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing FFT on Scale
MIT CSAIL Christopher Batten 20 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing Dither on Scale
MIT CSAIL Christopher Batten 21 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing Dither on Scale
MIT CSAIL Christopher Batten 21 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing IP Lookup on Scale
MIT CSAIL Christopher Batten 22 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing IP Lookup on Scale
MIT CSAIL Christopher Batten 22 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Benchmarks
MIT CSAIL Christopher Batten 23 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Energy-Efficiency of the Scale VT Processor
MIT CSAIL Christopher Batten 24 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
The Scale VT Processor Team at MIT
Project SupervisorProfessor Krste Asanovic
Scale Architecture and Chip ImplementationChristopher Batten, Ronny Krashinsky
VLSI Toolflow Development and Test Board Bring-UpKen Barr, Asif Khan, Albert Ma, Jaime Quinonez
Compiler DevelopmentMark Hampton
Benchmark and Test Case DevelopmentJared Casper, Jeff Cohen, Steve Gerding
DRAM Model DevelopmentBrian Pharris
MIT CSAIL Christopher Batten 25 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation
Vector-Thread Architectures
Scale VT Processor
Maven VT Processor
Future Research Directions
MIT CSAIL Christopher Batten 26 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation for the Maven VT Processor
MIT CSAIL Christopher Batten 27 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
A Manycore Vector-Thread Architecture
MIT CSAIL Christopher Batten 28 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Maven Lane Microarchitecture
MIT CSAIL Christopher Batten 29 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Maven Lane Microarchitecture
MIT CSAIL Christopher Batten 29 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Maven Lane Microarchitecture
MIT CSAIL Christopher Batten 29 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Maven Lane Microarchitecture
MIT CSAIL Christopher Batten 29 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing FFT on Maven
MIT CSAIL Christopher Batten 30 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing Dither on Maven
MIT CSAIL Christopher Batten 31 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Executing IP Lookup on Maven
MIT CSAIL Christopher Batten 32 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Related Work
MIT CSAIL Christopher Batten 33 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
The Maven VT Processor Team at UC Berkeley
Project SupervisorProfessor Krste Asanovic
Maven Architecture DevelopmentChristopher BattenDr. Hidetaka Aoki (visitor from Hitachi)Yunsup Lee
MIT CSAIL Christopher Batten 34 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Motivation
Vector-Thread Architectures
Scale VT Processor
Maven VT Processor
Future Research Directions
MIT CSAIL Christopher Batten 35 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Techniques which reduce the energy per operationand simplifying manycore parallel programming
Core Architecture
• Hardware support for managingcooperating control processors
On-Chip Network Architecture
• High-radix topologies• Custom circuits for specialized
networks (e.g. global barriers)
Leveraging EmergingTechnologies
• 3D stacking• Silicon photonics
MIT CSAIL Christopher Batten 36 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Silicon Photonic Chip-Level Networks
MIT CSAIL Christopher Batten 37 / 38
Motivation Vector-Threading Scale VT Processor Maven VT Processor Future Research
Conclusions
• Manycore is not enough – architectsshould work towards reducing the energyper operation while also improvingperformance.
• Vector-Threading combines theenergy-efficiency of vector execution withthe flexibility of multithreaded execution.
• The Scale VT Processor is a firstimplementation which illustrates theadvantages of vector-threading, and theMaven VT Processor will extendvector-threading to tens or hundreds oflanes on a chip.
MIT CSAIL Christopher Batten 38 / 38