Introduction to the Cell Processor
Post on 29-Nov-2014
Why Cell Processor
- Performance improvement with increase in frequency
  - Possible due to increase in transistor density
  - Clock frequency is the timing reference for a processor
- Power density
  - Leakage currents increase as transistors shrink and transistor density rises
  - This increases the idle power consumption
History of Cell Processor
- A powerful processor for the successor to the PS2
- Powerful multimedia and broadband network interface
- IBM's contribution in shaping the concept of the Cell processor
- Collaboration with Toshiba
- STI Alliance (Sony, Toshiba, IBM)
History of Cell Processor
Development of Cell
- 1999: Sony proposed a partnership with IBM for the successor to the PS2
- 2001: STI alliance initiated the development of Cell
- 2004: First prototype of Cell
- 2005: Sony unveils the PS3 at E3
- 2006: Official release of the PS3; Cell SDK released by IBM
- 2008: IBM Roadrunner becomes the fastest supercomputer in the world (1.026 petaflops)
Overview of Cell
[Figure: overview of the Cell architecture. Source: Matthew Scarpino]
Overview of Cell
[Figure: overview of the Cell architecture. Source: 6.189 IAP 2007, MIT]
Cell components
- Memory Interface Controller (MIC)
- Bus Interface Controller (BIC)
- PowerPC Processor Element/Unit (PPE/PPU)
- Synergistic Processing Element/Unit (SPE/SPU)
- Element Interconnect Bus (EIB)
- Input/Output Interface (IOIF)
Cell components
MIC
- Connects the processor with system memory
- Two channels to system memory
- Extreme Data Rate Dynamic Random Access Memory (XDR DRAM)
  - Can support 8 data transfers per clock cycle
  - Provides high data flow at low frequency
- PS3 contains 256 MB of XDR DRAM
Cell components
PPU
- Based on the IBM PowerPC architecture
- RISC architecture
- Cell control center
  - Runs the operating system
  - Manages interrupts
  - Manages the shared L2 cache
  - Issues work to the SPUs
Cell components
[Figure: PPU]
Cell components
PPU
- 64-bit architecture
- Supports SIMD
- Supports Cell-related functions
- Dual-threaded processor
- Computation power is reduced
  - The PPU is not the main computational element in Cell
  - Reduces power consumption
Cell components
[Figure: Functional units of PPU]
Cell components
- Instruction Unit (IU)
  - Fetches, decodes, and issues instructions
- Load and Store Unit (LSU)
  - Receives memory access requests
- Vector/Scalar Unit (VSU)
  - Contains the floating-point unit
  - Performs FP operations on individual or multiple operands
Cell components
- Fixed-Point Unit (FXU)
  - Performs fixed-point operations
  - Arithmetic and logical operations
- Memory Management Unit (MMU)
  - Performs virtual memory management
- PPU registers
  - Provide quick access to operands
  - Some functional units can access only processor registers
Cell components
- 32 general-purpose registers
- 32 floating-point registers
- Link register
  - Holds the branch address of the upcoming target
- Count register
  - Holds the branch address of the upcoming target, or
  - Holds the loop counter
- Fixed-point exception register
  - Holds carry and overflow bits for fixed-point operations
Cell components
- Condition register
  - Holds the status of arithmetic, logical, or comparison operations
- Floating-point status and control register
  - Status of scalar FP operations
- Vector registers
  - Contain data for vector operations
- Vector status and control register
  - Holds the saturation bit for vector operations
- Vector register save and restore register
  - Identifies the vector registers to save and restore on a context switch
Cell components
SPU
- Basic workhorse of Cell
- Designed to execute SIMD instructions
- Separate instruction set
- Takes work from the PPU
- Does not have any cache
- No virtual memory
- Each SPU has only 256 KB of local memory
Cell components
SPU
- An SPU can directly access only its own 256 KB memory
- Direct Memory Access (DMA) is required to transfer the required data to the SPU
- Memory alignment is required to pass data to the SPU
- Different methods are available to communicate with the PPU and other memory
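The alignment and local-memory constraints listed for the SPU can be illustrated with a few address checks. This is only a sketch in plain Python, not Cell SDK code: the 16-byte alignment value and the helper names (`is_aligned`, `align_up`, `fits_in_local_store`) are assumptions made for illustration.

```python
DMA_ALIGNMENT = 16             # assumed alignment requirement, in bytes
LOCAL_STORE_SIZE = 256 * 1024  # 256 KB of SPU local memory, as stated above


def is_aligned(address: int, alignment: int = DMA_ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment (a power of two)."""
    return address & (alignment - 1) == 0


def align_up(address: int, alignment: int = DMA_ALIGNMENT) -> int:
    """Round address up to the next multiple of alignment."""
    return (address + alignment - 1) & ~(alignment - 1)


def fits_in_local_store(ls_offset: int, size: int) -> bool:
    """Return True if [ls_offset, ls_offset + size) lies inside the local memory."""
    return ls_offset >= 0 and size >= 0 and ls_offset + size <= LOCAL_STORE_SIZE
```

A buffer at address 0x1004 would fail the check and would have to be moved to 0x1010 before being handed to the SPU under this assumed rule.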
Cell components
- Purpose of SPU
  - Take 128-bit data into a local register
  - Apply an operation to it
  - Save the result to local memory
- Two distinct pipelines
  - Even pipeline handles mathematical operations
  - Odd pipeline handles everything else
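The load, operate, store pattern described for the SPU can be modeled in a few lines. The sketch below assumes the 128-bit value is treated as four 32-bit floats (one common interpretation); the function names are hypothetical and this is a model of the idea, not SPU code.

```python
import struct

LANES = 4  # assumed layout: four 32-bit floats in one 128-bit register


def pack_register(values):
    """Pack four floats into 16 bytes, standing in for one 128-bit register."""
    if len(values) != LANES:
        raise ValueError("a 128-bit register holds exactly four 32-bit floats")
    return struct.pack(">4f", *values)


def simd_add(reg_a: bytes, reg_b: bytes) -> bytes:
    """Add two packed registers lane by lane, as one SIMD instruction would."""
    a = struct.unpack(">4f", reg_a)
    b = struct.unpack(">4f", reg_b)
    return struct.pack(">4f", *(x + y for x, y in zip(a, b)))
```

One call to `simd_add` stands for a single instruction that produces four results at once, which is where the SPU gets its throughput.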
Cell components
- SPU Control Unit (SCN)
  - Fetches and dispatches instructions
  - Performs branching and other control operations
- SPU even fixed-point unit
  - Handles logic/arithmetic operations
  - Performs comparisons and reciprocal operations for FP
- SPU odd fixed-point unit
  - Performs bit-level shifts, rotations, and shuffling
Cell components
- SPU floating-point unit
  - Performs floating-point operations
- SPU load/store unit
  - Performs loads and stores
  - Manages branch targets and DMA to the local store
- SPU channel and DMA unit
  - Communicates with the Memory Flow Controller
  - Controls DMA transfers
Cell components
- SPU registers
  - 128 general-purpose registers
  - Floating-point status and control registers
    - Contain the status and results of floating-point operations
- SPU Local Store (LS)
  - Each SPU contains a very low-latency 256 KB memory
  - It plays the role of a local cache for the SPU, but it is managed by software
  - All data transfer is the responsibility of the programmer
Cell components
SPU Local Store (LS)
- Not a cache, just an SRAM
- Only one read/write operation per cycle
- Operations accessing the LS
  - DMA: transfers data from main memory to the LS
  - SPU load/store: reads/writes 16 bytes at a time
  - Instruction fetch: reads 128 bytes of the LS at once
Cell components
SPU Local Store (LS)
- Does not support virtual memory
- Tradeoff between cache coherence and fetching the data to the LS
  - The LS is low-latency memory
  - Cache coherence protocols are used for other processors
  - Data is transferred to the LS over the high-throughput EIB via DMA instead of cache coherence protocols
  - This makes the hardware simple
Cell components
Communication between the SPU and the rest of the system
- DMA
- Mailboxes
- Events and signals
Cell components
DMA
- Transfers data to the LS
- Asynchronous in nature
  - The SPU continues its operation while the DMA transfer is in progress
- Transfers data in chunks of 1, 2, 4, 8, or 16 bytes, or a multiple of 16 bytes
- Provides controls to manage and synchronize the data transfer
- A single DMA transfer can move at most 16 KB
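Because a single DMA transfer is capped at 16 KB, a larger buffer has to be moved as a sequence of requests. The following sketch shows only the splitting arithmetic; `split_transfer` is a hypothetical helper, and issuing the actual requests through the MFC is outside its scope.

```python
MAX_DMA_SIZE = 16 * 1024  # per-transfer limit stated above, in bytes


def split_transfer(total_bytes: int, max_chunk: int = MAX_DMA_SIZE):
    """Split one large transfer into (offset, size) pieces of at most max_chunk bytes."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(max_chunk, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks
```

For a 40 KB buffer this yields two full 16 KB pieces followed by one 8 KB piece. Since the transfers are asynchronous, the SPU could keep computing on one piece while the next is in flight.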
Cell components
EIB
- Connects all the system components
- Consists of four data rings (two clockwise and two counter-clockwise)
- One ring is for control signals
- One bus cycle can transfer 16 bytes of data
- Each ring can carry three DMA requests simultaneously
- Each DMA takes at least 8 cycles to complete
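The EIB figures above combine into a rough upper bound on throughput. The sketch below is idealized arithmetic using only the numbers from this slide (four data rings, three concurrent transfers per ring, 16 bytes per bus cycle, at least 8 cycles per DMA); it ignores arbitration and ring conflicts, and the function names are hypothetical.

```python
DATA_RINGS = 4            # data rings, from the slide
TRANSFERS_PER_RING = 3    # concurrent DMA requests per ring
BYTES_PER_BUS_CYCLE = 16  # bytes moved per transfer per bus cycle
MIN_CYCLES_PER_DMA = 8    # minimum duration of one DMA, in bus cycles


def peak_bytes_per_bus_cycle() -> int:
    """Idealized peak: every ring fully occupied on every bus cycle."""
    return DATA_RINGS * TRANSFERS_PER_RING * BYTES_PER_BUS_CYCLE


def bus_cycles_for(n_bytes: int) -> int:
    """Bus cycles one transfer occupies, never fewer than the 8-cycle minimum."""
    full_cycles = -(-n_bytes // BYTES_PER_BUS_CYCLE)  # ceiling division
    return max(MIN_CYCLES_PER_DMA, full_cycles)
```

Under these assumptions the peak is 192 bytes per bus cycle, and a transfer only uses its minimum 8 cycles efficiently when it moves at least 128 bytes.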
Cell components
Memory Flow Controller (MFC)
- Coprocessor that handles communication between the SPU and the EIB
- Processes data transfers without interrupting the SPU
  - The SPU requests the MFC to get the data
  - The MFC handles the rest of the data transfer
Cell components
Mailboxes
- Simplest way to transfer data between the PPU and an SPU
- Can transfer only 4 bytes of data
- Provide one-to-one communication
- Mailbox channels
  - Outgoing mailbox
  - Outgoing interrupt mailbox
    - Holds the data for the outside world and causes an interrupt if applicable
  - Incoming mailbox
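A mailbox can be thought of as a small bounded queue of 4-byte messages. The model below is a sketch only: the `Mailbox` class is hypothetical, and the queue depths used in the usage note (four entries for the incoming mailbox, one for an outgoing mailbox) are assumptions not stated on the slide.

```python
from collections import deque


class Mailbox:
    """Bounded FIFO of 32-bit messages, modeling one mailbox channel."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._queue = deque()

    def try_write(self, value: int) -> bool:
        """Queue one 4-byte message; return False if the mailbox is full."""
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("a mailbox message must fit in 4 bytes")
        if len(self._queue) >= self.capacity:
            return False
        self._queue.append(value)
        return True

    def try_read(self):
        """Return the oldest message, or None if the mailbox is empty."""
        return self._queue.popleft() if self._queue else None
```

With `Mailbox(capacity=4)` as the assumed incoming mailbox, a fifth write fails until the reader drains a message, which is why mailboxes suit short status words and handshakes rather than bulk data.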
Cell components
Events and signals
- Commonly used for DMA notifications
- Signals can be sent directly to the outside world
- Signals can provide one-to-many style communication
Software development of Cell
- Different instruction sets for SPU and PPU
- Different compilers are required to compile the code for each
- The SPU code is embedded in the PPU executable
Software development of Cell
Tools to compile the application for Cell
- PPU compiler: ppu-gcc
- SPU compiler: spu-gcc
- Embed SPU code into PPU: ppu-embedspu
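The three tools are typically run in sequence: compile the SPU program, wrap it into a PPU-linkable object, then compile and link the PPU program. The sketch below only assembles the command lines and does not run them. The file names and the `spu_program` handle are hypothetical, the `ppu-embedspu` argument order (symbol name, input file, output file) is stated from memory, and linking with `-lspe2` assumes the PPU code uses the libspe2 runtime library.

```python
def build_commands(spu_source, ppu_source, output, handle="spu_program"):
    """Return the three build steps as argv lists (nothing is executed)."""
    spu_elf = handle + ".elf"            # hypothetical name for the SPU executable
    embedded_obj = handle + "_embed.o"   # hypothetical name for the wrapped object
    return [
        # 1. Compile the SPU code with the SPU compiler.
        ["spu-gcc", spu_source, "-o", spu_elf],
        # 2. Wrap the SPU executable in an object file the PPU linker accepts.
        ["ppu-embedspu", handle, spu_elf, embedded_obj],
        # 3. Compile the PPU code and link in the embedded SPU program.
        ["ppu-gcc", ppu_source, embedded_obj, "-lspe2", "-o", output],
    ]
```

Calling `build_commands("spu.c", "ppu.c", "app")` gives a list that could be passed, one entry at a time, to `subprocess.run` on a machine with the Cell SDK installed.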
Software development of Cell
Cell simulator
- Full System Simulator
- Emulates all system components
- Can provide cycle-accurate information
- Provides a graphical interface to see and interact with system components
Software development of Cell
[Figure. Source: IBM Full System Simulator user guide]
Software development of Cell
- Three modes
  - Fast mode
  - Simple mode
  - Cycle mode
- Graphical visualization of SPU and PPU
- Provides debugging and profiling information
- Provides system utilization information