Copyright Agrawal, 2007. ELEC6270 Spring 15, Lecture 9
ELEC 5270/6270 Spring 2015
Low-Power Design of Electronic Circuits
Memory and Multicore Design
Vishwani D. Agrawal
James J. Danaher Professor
Dept. of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal/COURSE/E6270_Spr15course.html
Memory Architecture
[Figure: memory of N words (Word 0 to Word N-1), each M bits wide and built from storage cells, with an M-bit input-output port. Select signals S0 to SN-1 each enable one word. In the second version, a decoder driven by K address lines A0 to AK-1 generates the select signals.]
$K = \log_2 N$, $N = 2^K$
Memory Organization
[Figure: cell array of $2^{K-L}$ rows by $M \cdot 2^L$ columns, with storage cells at the crossings of word lines and bit lines. A row decoder takes the (K – L)-bit row address AL to AK–1. A column decoder takes the L-bit column address A0 to AL–1. Sense amplifiers/drivers connect the bit lines to the M-bit input-output.]
$N = 2^K$ M-bit words
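The row/column split in the figure can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes the low L address bits (A0 to AL–1) select the column while the high K – L bits select the row, as the figure labels suggest.

```python
def split_address(address: int, k: int, l: int) -> tuple[int, int]:
    """Split a K-bit word address into (row, column) fields.

    Assumption: the low L bits form the column address and the
    remaining K - L high bits form the row address.
    """
    if not 0 <= l <= k:
        raise ValueError("need 0 <= L <= K")
    if not 0 <= address < (1 << k):
        raise ValueError("address must fit in K bits")
    column = address & ((1 << l) - 1)  # keep the low L bits
    row = address >> l                 # drop the low L bits
    return row, column


def array_shape(k: int, l: int, m: int) -> tuple[int, int]:
    """Rows and columns of the cell array: 2^(K-L) by M * 2^L."""
    return 1 << (k - l), m * (1 << l)


# Example: N = 2^8 words of M = 16 bits, with L = 3 column-address bits.
row, column = split_address(0b10110110, k=8, l=3)
rows, cols = array_shape(k=8, l=3, m=16)
print(row, column, rows, cols)
```

Whatever value of L is chosen, rows times columns stays equal to N times M; L only changes the aspect ratio of the array and therefore the word-line and bit-line lengths.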
An SRAM Cell
[Figure: SRAM cell storing $bit$ and $\overline{bit}$, powered from VDD, with word line WL and bit lines $BL$ and $\overline{BL}$.]
Read Operation
[Figure: SRAM cell with VDD, word line WL, and bit lines $BL$ and $\overline{BL}$.]
1. Precharge the bit lines to VDD.
2. Set WL = logic 1.
3. The sense amplifier converts the BL swing to a logic level.
Precharge Circuit
[Figure: SRAM cell with word line WL and bit lines $BL$ and $\overline{BL}$. Precharge devices controlled by PC connect the bit lines to VDD, an equalization device sits between the bit lines, and a differential sense amplifier reads them.]
Reading 1 from Cell
[Figure: timing diagram against time showing the precharge signal, WL, $BL$, $\overline{BL}$, and the sense amplifier output. Annotation: pulsed to save bit line charge.]
Write Operation, 1 → 0
[Figure: SRAM cell with VDD, word line WL, and bit lines driven to 0 and 1.]
1. Set $BL$ = 0, $\overline{BL}$ = 1.
2. Set WL = 1.
Cell Array Power Management
Smaller transistors
Low supply voltage
Lower voltage swing (0.1V – 0.3V for SRAM)
    Sense amplifier restores the full voltage swing for outside use.
Power-down and sleep modes
Sense Amplifier
[Figure: sense amplifier with inputs $bit$ and $\overline{bit}$, supply VDD, enable input SE or CLK, and a full voltage swing output.]
Sense amplifier enable: low when the bit lines are precharged and equalized.
Sense Amplifier: Precharge
[Figure: sense amplifier with $bit$ = 1, $\overline{bit}$ = 1, SE = 0. Node labels: VDD, 0, VDD. Transistor states: OFF, ON, ON.]
Sense Amplifier: Reading 0
[Figure: sense amplifier with $bit$ = 1 – ∆, $\overline{bit}$ = 1, SE = 1, supply VDD. Output labels: 1, 0. Transistor states: ON, OFF, ON.]
Sense Amplifier: Reading 1
[Figure: sense amplifier with $bit$ = 1, $\overline{bit}$ = 1 – ∆, SE = 1, supply VDD. Output labels: 0, 1. Transistor states: ON, OFF, ON.]
Block-Oriented Architecture
A single cell array may contain 64 Kbits to 256 Kbits.
Larger arrays become slow and consume more power.
Larger memories are block oriented.
Hierarchical Organization
[Figure: Block 0, Block 1, ..., Block P-1 share a global data bus that feeds a global amplifier/driver and the I/O. Each block receives the row address and column address. The block address drives a block selector, alongside the control circuitry.]
Power Saving
Block-oriented memory
    Lengths of local word and bit lines are kept small.
    Block address is used to activate the addressed block.
    Unaddressed blocks are put in power-saving mode:
        Sense amplifier and row/column decoders are disabled.
        Cell array is put in power-saving mode.
Static Power
[Figure: leakage current (amperes, axis from 100n to 1.3μ) versus supply voltage (0.0 to 1.8) for an 8-kbit SRAM in 0.13μ CMOS and 0.18μ CMOS. Annotation: 7x increase.]
Power Saving Modes
Power-down: Disconnect supply. Data is not retained. Must be refreshed before use. Example: caches.
Increasing thresholds by body biasing: Negative bias on nonactive cells reduces leakage.
Sleep mode:
    Insert resistance in leakage path; retain data.
    Lower supply voltage.
Adding Resistance in Leakage Path
[Figure: SRAM cells connected between internal rails VDD.int and VSS.int. Low-threshold transistors controlled by the sleep signals connect VDD.int to VDD and VSS.int to GND.]
Lowering Supply Voltage
[Figure: SRAM cells between the supply rail and GND. The sleep signal switches the supply between VDD and the lower voltage VDDL.]
VDDL ≥ 100mV for 0.13μ CMOS
Sleep = 1, data retention mode
Parallelization of Memories
[Figure: Mem 1 holds instr. A, instr. C, instr. E, ... and Mem 2 holds instr. B, instr. D, instr. F, ..., each clocked at f/2. A MUX with select input (0/1) at f/2 alternates between the two memories.]
Power = $C' (f/2) V_{DD}^2$
C. Piguet, “Circuit and Logic Level Design,” pp. 124-125 in W. Nebel and J. Mermet (Eds.), Low Power Design in Deep Submicron Electronics, Springer, 1997.
References
K. Itoh, VLSI Memory Chip Design, Springer-Verlag, 2001.
J. M. Rabaey, A. Chandrakasan and B. Nikolić, Digital Integrated Circuits, Upper Saddle River, New Jersey: Pearson Education, Inc., 2003, Chapter 12.
S.-M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits Analysis and Design, New York: McGraw-Hill, 1996, Chapter 10.
Low-Power Datapath Architecture
Lower supply voltage
    This slows down circuit speed
    Use parallel computing to gain the speed back
Works well when threshold voltage is also lowered.
About 60% reduction in power obtainable.
Reference: A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic Publishers (Now Springer), 1995.
A Reference Datapath
[Figure: Input, register, combinational logic, register, Output. Both registers are clocked by CK. The total switched capacitance is labeled Cref.]
Supply voltage = $V_{ref}$
Total capacitance switched per cycle = $C_{ref}$
Clock frequency = $f$
Power consumption: $P_{ref} = C_{ref} V_{ref}^2 f$
A Parallel Architecture
[Figure: the Input feeds N registers, each followed by a copy of the combinational logic (Copy 1, Copy 2, ..., Copy N) clocked at f/N. An N to 1 multiplexer and an output register clocked at f produce the Output. A multiphase clock generator and mux control block is driven by CK.]
Each copy processes every Nth input and operates at reduced voltage.
Supply voltage: $V_N \le V_1 = V_{ref}$
N = degree of parallelism
Level Converter: L to H
[Figure: level converter with input Vin_L, output Vout_H, and supplies VDDH and VDDL. Annotation: transistors with thicker oxide and longer channels.]
N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section 12.4.3, Addison-Wesley, 2005.
Level Converter: Input 0
[Figure: L to H level converter with Vin_L = 0, supplies VDDH and VDDL, output Vout_H. Node labels: 1L, 0, VDDH. Transistor states: two marked short, two marked open.]
Level Converter: Input 1
[Figure: L to H level converter with Vin_L = 1L, supplies VDDH and VDDL, output Vout_H. Node labels: 0, VDDH, 0. Transistor states: two marked short, two marked open.]
DVF4: Dual VTH Feedback Type 4-Transistor Level Converter
[Figure: four-transistor level converter with input Vin_L, output Vout_H, supplies VDDH and VDDL, and one transistor marked as a high VTH transistor.]
References for DVF4
K. N. Jayaraman, “DVF4: A Dual Vth Feedback Based 4-Transistor Level Converter,” Master’s thesis, Auburn University, Dec 2013.
K. N. Jayaraman and V. D. Agrawal, “A Four-Transistor Level Converter for Dual-Voltage Low-Power Design,” J. Low Power Electronics, vol. 10, no. 4, pp. 617–628, Dec 2014.
Level Converter: H to L
[Figure: level converter with input Vin_H, output Vout_L, and supply VDDL. Annotation: transistors with thicker oxide and longer channels.]
N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Section 12.4.3, Addison-Wesley, 2005.
Control Signals, N = 4
[Figure: timing diagram showing CK and the four derived clock phases, Phase 1 to Phase 4.]
Power
$P_N = P_{proc} + P_{overhead}$
$P_{proc} = N(C_{inreg} + C_{comb})V_N^2 f/N + C_{outreg}V_N^2 f$
$\quad = (C_{inreg} + C_{comb} + C_{outreg})V_N^2 f$
$\quad = C_{ref}V_N^2 f$
$P_{overhead} = C_{overhead}V_N^2 f \approx \delta C_{ref}(N - 1)V_N^2 f$
$P_N = [1 + \delta(N - 1)]C_{ref}V_N^2 f$
$\frac{P_N}{P_1} = [1 + \delta(N - 1)]\frac{V_N^2}{V_{ref}^2}$
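The final ratio is easy to evaluate numerically. The sketch below is a direct transcription of that formula; the function name is hypothetical, and the sample values (a 2.9 V supply for N = 2 against a 5 V reference, with δ = 0.1) are illustrative assumptions rather than results derived here.

```python
def power_ratio(n: int, v_n: float, v_ref: float, delta: float) -> float:
    """Return P_N / P_1 = [1 + delta * (N - 1)] * (V_N / V_ref)**2.

    n      degree of parallelism N
    v_n    reduced supply voltage V_N of the parallel design
    v_ref  supply voltage V_ref of the reference (N = 1) design
    delta  overhead capacitance per extra copy, as a fraction of C_ref
    """
    if n < 1:
        raise ValueError("N must be at least 1")
    return (1.0 + delta * (n - 1)) * (v_n / v_ref) ** 2


# Illustrative values only: two copies at 2.9 V versus one at 5 V.
ratio = power_ratio(n=2, v_n=2.9, v_ref=5.0, delta=0.1)
print(f"P_2 / P_1 = {ratio:.3f}")
```

The overhead term grows linearly in N while the voltage term shrinks, so the ratio only falls as long as the voltage can keep dropping.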
Alpha-Power Law Model
Variation of delay with supply voltage:
    delay $\propto V_{DD} / (V_{DD} - V_{TH})^{\alpha}$
    $V_{TH}$ = threshold voltage
    $\alpha$ = 1 for short-channel devices, ≈ 2 for long-channel devices
T. Sakurai and A. R. Newton, “Delay analysis of series-connected MOSFET circuits,” IEEE Journal of Solid-State Circuits, Vol. 26, pp. 122–131, Feb. 1991.
T. Sakurai and A. R. Newton, “A simple MOSFET model for circuit analysis,” IEEE Transactions on Electron Devices, Vol. 38, No. 4, pp. 887–894, Apr. 1991.
T. Sakurai, “High-speed circuit design with scaled-down MOSFETs and low supply voltage (invited),” Proc. IEEE ISCAS, pp. 1487–1490, Chicago, May 1993.
T. Sakurai, “Alpha-Power Law MOS Model,” IEEE Solid-State Circuits Society Newsletter, Vol. 9, No. 4, pp. 4–5, Oct. 2004.
Voltage vs. Speed
Circuit delay, $T \approx \frac{k V_{ref}}{(V_{ref} - V_{TH})^2}$
where k is a technology constant and $V_{TH}$ is the threshold voltage.
[Figure: normalized gate delay T (axis 0.0 to 4.0) versus supply voltage for 1.2μ CMOS, from VTH up to Vref = 5V. Marked points: N = 1 at Vref = 5V, N = 2 at V2 = 2.9V, N = 3 at V3.]
Voltage reduction slows down as we get closer to VTH.
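One way to find the reduced supply for a given N is to solve the delay equation numerically. The sketch below assumes that each of the N copies may be N times slower than the reference, so it looks for the voltage whose delay is N times the delay at Vref. The function names and the bisection approach are illustrative choices, not part of the lecture.

```python
def delay(v: float, v_th: float, alpha: float = 2.0) -> float:
    """Gate delay up to the technology constant k: V / (V - V_TH)**alpha."""
    return v / (v - v_th) ** alpha


def reduced_supply(n: int, v_ref: float, v_th: float, alpha: float = 2.0) -> float:
    """Find V_N such that delay(V_N) = N * delay(V_ref), by bisection.

    Assumes alpha > 1 (or V_TH > 0), so that delay falls monotonically
    as the supply rises from V_TH to V_ref.
    """
    target = n * delay(v_ref, v_th, alpha)
    lo, hi = v_th + 1e-9, v_ref
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if delay(mid, v_th, alpha) > target:
            lo = mid   # too slow at this voltage: move up
        else:
            hi = mid   # fast enough: try a lower voltage
    return 0.5 * (lo + hi)


# With V_TH = 0 and alpha = 2, delay is proportional to 1/V, so V_N = V_ref / N.
for n in (1, 2, 3):
    print(n, round(reduced_supply(n, v_ref=5.0, v_th=0.0), 3))
```

Raising `v_th` in this sketch shows the effect noted on the slide: the closer the supply gets to the threshold, the less voltage reduction each additional copy buys.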
Increasing Multiprocessing
[Figure: $P_N/P_1$ (axis 0.0 to 1.0) versus N (1 to 12) for 1.2μ CMOS with Vref = 5V. Three curves: VTH = 0V (extreme case), VTH = 0.4V, and VTH = 0.8V.]
Extreme Cases: VTH = 0
Delay, $T \propto 1/V_{ref}$
For N processing elements, delay = $NT$ → $V_N = V_{ref}/N$
$\frac{P_N}{P_1} = [1 + \delta(N - 1)]\frac{1}{N^2} \to 1/N$
For negligible overhead, $\delta \to 0$:
$\frac{P_N}{P_1} \approx \frac{1}{N^2}$
For $V_{TH} > 0$, power reduction is less and there will be an optimum value of N.
Example: Multiplier Core
Specification:
    200MHz clock
    15W dissipation @ 5V
    Low voltage operation, $V_{DD}$ ≥ 1.5 volts
    For threshold voltage $V_t$ = 0.5V, clock frequency = $\frac{(V_{DD} - 0.5)^2}{20.25\,V_{DD}}$ GHz
Problem:
    Integrate multiplier core on a SOC
    Power budget for multiplier ~ 5W
A Multicore Design
[Figure: the Input feeds five registers, each followed by a multiplier core (Multiplier Core 1, Multiplier Core 2, ..., Multiplier Core 5) clocked at 40MHz. A 5 to 1 mux and an output register clocked at 200MHz produce the Output. A multiphase clock generator and mux control block is driven by the 200MHz CK.]
Core clock frequency = 200/N MHz
How Many Cores?
For N cores:
    Clock frequency = 200/N MHz
    Supply voltage, $V_{DDN}$:
        $(V_{DDN} - 0.5)^2 = 4.05\,V_{DDN}/N$
        $V_{DDN}^2 - (1 + 4.05/N)\,V_{DDN} + 0.25 = 0$
Assuming 10% overhead per core,
    Power dissipation = $15\,[1 + 0.1(N - 1)]\left(\frac{V_{DDN}}{5}\right)^2$ watts
Design Tradeoffs
Number of cores, N   Clock (MHz)   Core supply VDDN (volts)   Total power (watts)
1                    200           5.00                       15.0
2                    100           2.94                       5.70
3                    66.67         2.24                       3.61
4                    50            1.88                       2.76
5                    40            1.66                       2.32
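The table can be regenerated from the multiplier example's quadratic for VDDN and its power expression. The sketch below takes the larger root of the quadratic, since the smaller root lies below the 0.5 V threshold, and then picks the fewest cores that meet the 5 W budget and the 1.5 V minimum supply. Function and constant names are hypothetical, and computed values should agree with the table to within rounding.

```python
import math

V_REF = 5.0      # volts, supply of the single-core design
P_REF = 15.0     # watts, dissipation at 5 V and 200 MHz
V_TH = 0.5       # volts, threshold voltage
OVERHEAD = 0.10  # 10% overhead per additional core
V_MIN = 1.5      # volts, lowest permitted supply


def core_supply(n: int) -> float:
    """Larger root of V^2 - (1 + 4.05/N) V + 0.25 = 0."""
    b = 2 * V_TH + 4.05 / n
    c = V_TH ** 2
    return (b + math.sqrt(b * b - 4 * c)) / 2


def total_power(n: int) -> float:
    """15 * [1 + 0.1 (N - 1)] * (V_DDN / 5)^2, in watts."""
    return P_REF * (1 + OVERHEAD * (n - 1)) * (core_supply(n) / V_REF) ** 2


def smallest_n_within(budget_watts: float, max_cores: int = 5):
    """Fewest cores meeting both the power budget and the V_MIN limit."""
    for n in range(1, max_cores + 1):
        if core_supply(n) >= V_MIN and total_power(n) <= budget_watts:
            return n
    return None


for n in range(1, 6):
    print(f"N={n}  clock={200 / n:6.2f} MHz  "
          f"VDD={core_supply(n):.2f} V  P={total_power(n):.2f} W")
print("fewest cores within 5 W:", smallest_n_within(5.0))
```

Under these assumptions two cores land slightly above the 5 W budget, so three cores is the smallest configuration that satisfies it.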
Power Reduction in Processors
Just about everything is used.
Hardware methods:
    Voltage reduction for dynamic power
    Dual-threshold devices for leakage reduction
    Clock gating, frequency reduction
    Sleep mode
Architecture:
    Instruction set
    Hardware organization
Software methods
Parallel Architecture
[Figure: a single processor clocked at f between Input and Output, compared with two processors each clocked at f/2 whose results are combined at rate f between Input and Output.]
Single processor: Capacitance = C, Voltage = V, Frequency = f, Power = $CV^2f$
Two parallel processors: Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = $0.396\,CV^2f$
Pipeline Architecture
[Figure: a single processor clocked at f, followed by a register, between Input and Output, compared with a two-stage pipeline of ½ Proc., register, ½ Proc., register, clocked at f.]
Single processor: Capacitance = C, Voltage = V, Frequency = f, Power = $CV^2f$
Two-stage pipeline: Capacitance = 1.2C, Voltage = 0.6V, Frequency = f, Power = $0.432\,CV^2f$
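Both power figures follow from scaling $CV^2f$ by the stated capacitance, voltage, and frequency factors. A minimal check, using a hypothetical helper name:

```python
def relative_power(cap_factor: float, volt_factor: float, freq_factor: float) -> float:
    """Power relative to C * V^2 * f after scaling C, V, and f."""
    return cap_factor * volt_factor ** 2 * freq_factor


# Two parallel processors: 2.2C, 0.6V, 0.5f
parallel = relative_power(2.2, 0.6, 0.5)
# Two-stage pipeline: 1.2C, 0.6V, f
pipeline = relative_power(1.2, 0.6, 1.0)

print(f"parallel: {parallel:.3f} CV^2f, pipeline: {pipeline:.3f} CV^2f")
```

In both cases the saving comes from the squared voltage factor. The parallel version pays for it with more than double the capacitance, the pipeline with about 20% more.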
Approximate Trend
              n-parallel proc.   n-stage pipeline proc.
Capacitance   nC                 C
Voltage       V/n                V/n
Frequency     f/n                f
Power         $CV^2f/n^2$        $CV^2f/n^2$
Chip area     n times            10-20% increase
G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Springer, 1998.
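The two power entries can be checked by substituting each column of the trend table into $P = CV^2f$. This is only the idealized trend; it ignores overhead capacitance and assumes the voltage can scale as $V/n$.

```latex
\begin{align*}
P_{\text{parallel}} &= (nC)\left(\frac{V}{n}\right)^{2}\frac{f}{n}
                     = \frac{CV^{2}f}{n^{2}} \\
P_{\text{pipeline}} &= C\left(\frac{V}{n}\right)^{2} f
                     = \frac{CV^{2}f}{n^{2}}
\end{align*}
```

Both approaches give the same idealized power; they differ in chip area, which grows n times for the parallel design and by 10-20% for the pipeline.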
Multicore Processors
[Figure: performance, based on SPECint2000 and SPECfp2000 benchmarks, versus year (2000, 2004, 2008), with one curve for multicore and one for single core.]
Computer, May 2005, p. 12
Multicore Processors
D. Geer, “Chip Makers Turn to Multicore Processors,” Computer, vol. 38, no. 5, pp. 11-13, May 2005.
A. Jerraya, H. Tenhunen and W. Wolf, “Multiprocessor Systems-on-Chips,” Computer, vol. 38, no. 7, pp. 36-40, July 2005; this special issue contains three more articles on multicore processors.
S. K. Moore, “Winner Multimedia Monster – Cell’s Nine Processors Make It a Supercomputer on a Chip,” IEEE Spectrum, vol. 43, no. 1, pp. 20-23, January 2006.
Cell - Cell Broadband Engine Architecture
[Photo, left to right: Atsushi Kameyama, Toshiba; James Kahle, IBM; Masakazu Suzoki, Sony. © IEEE Spectrum, January 2006]
Nine-processor chip: 192 Gflops
Cell’s Nine-Processor Chip
[Figure: chip photo. © IEEE Spectrum, January 2006]
Eight identical processors
f = 5.6GHz (max)
44.8 Gflops