Toshiba
Standard Cell Architecture for High Frequency Operation
Peter Hsu, Ph.D.
Chief Architect
Microprocessor Development
Toshiba America Electronics Components, Inc.
Created 14 March 2001 at the University of Wisconsin in Madison
Layout Architecture for High Frequency Operation 2
Disclaimer
The ideas, data and conclusions presented here are solely those of the Author, and do not in any way represent Toshiba Corporation policy or strategy.

Introduction
High Frequency is Difficult!
– Many Issues:
• Signal Integrity, Power Dissipation, ...
– My Approach:
• Disciplined Methodology
• Global Optimization
Outline
– Layout
– Circuits
– Analysis

Layout Strategy
Leverage Advanced Technologies
– Local Interconnect
– Flip-Chip Area Array I/O
CAD Tool Compatibility
– Parasitic Estimation, Extraction
Complex, High Frequency Designs
– Robust Power Grid
– Flexible Macro Embedding

Metal Usage
[Figure: cross-section of the metal stack. Legend: VSS, VDD, Signal, Via, Contact, Clock (2x). Wire classes labeled: Global Wires, Short. Dimensions labeled: 150nm, 200nm, 300nm, 450nm, 600nm, 900nm.]
Local Interconnect (M0): Tungsten, Aluminum or Copper
Top Metal: Flip-Chip Solder Pads
Dimensions are for a nominal 0.12µm generation process

Standard Cell Layout
[Figure: standard cell rows with cells U1 and U2 (pins U1.A, U1.Z, U2.A, U2.Z), VDD and VSS rails, and an Unrelated Wire crossing the cells. Annotations:]
– Cell Row Power Vias (1 every 6 Tracks)
– Crosspoint Power Vias
– Pins Must Stagger
– Minimum Cell 3 Tracks
– Minimum Power Rail 6 Tracks From Edge
– Minimum Pin Width 2 Tracks
– Local Interconnect
– Smallest Cell 13 Tracks
– Double Height Cell

Area Array I/O
[Figure: flip-chip area array with an SRAM macro placed among the I/O sites. Labels: Decoder; Sense Amp. (two); Cell Array, 256 Rows × 256 Columns (two); Core VDD, Core VSS, I/O VDD, I/O VSS, Signal. Dimensions labeled: 56µm, 102µm, 307µm, 538µm, 640µm, 670µm.]
– 225µm × 225µm pitch
– 5 I/O Macro (50K µm²)
– Largest SRAM Macro without sacrificing I/O (16 KBytes)
– Cell: 1.2µm, 2.5µm²
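The figures quoted on this slide can be cross-checked with a little arithmetic. The sketch below assumes the garbled units are micrometres, that each cell stores one bit, and that the macro holds the two 256 × 256 cell arrays shown; the function and constant names are illustrative and not from the slides.

```python
# Cross-check of the numbers on the Area Array I/O slide.
# Assumptions (not stated explicitly in the source): units are micrometres,
# one bit per cell, and the macro contains two 256 x 256 cell arrays.

IO_PITCH_UM = 225                  # area-array pitch in each direction
ROWS, COLS, ARRAYS = 256, 256, 2   # cell array dimensions from the figure


def io_site_area_um2(pitch_um):
    """Area available under one site of a square area array."""
    return pitch_um * pitch_um


def sram_capacity_bytes(rows, cols, arrays):
    """Capacity of the macro, assuming one bit per cell."""
    return rows * cols * arrays // 8


print(io_site_area_um2(IO_PITCH_UM))            # 50625, i.e. about 50K square µm
print(sram_capacity_bytes(ROWS, COLS, ARRAYS))  # 16384, i.e. 16 KBytes
```

Both results agree with the "50K" and "16 KBytes" annotations, which supports reading the pitch as 225µm × 225µm.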

I/O Macro Cell
Self-Contained
– 5 Signals
– VDDQ, VSSQ
– ESD Protection
– Latch-Up Ring
SoC Flexibility
– Many I/O Types
– Different Voltages
Routing Porosity
– 50% Channels Free in Global Wiring Layers
– Short Output Trace on Top Metal (Electromigration)
[Figure: metal usage of the I/O macro. Labels: I/O Macro Uses M0+M1+M2; M3; M4; M5; Top Metal M6; Free Routing Channels.]

SRAM Metal Usage
[Figure: metal usage over the SRAM cell array. Labels: 6-Transistor Cell (1.2 × 2.1 µm); Bit Lines; VSS; VDD; Word Line #1; Word Line #2; M1; M2.]
– M3 Global Wires (1 or 2 Pitch): Signals, VDD, VSS
– SRAM Macro Uses M0+M1+M2
– CAD Tool Inserts M3:M2 Power Vias

Word Line Shielding
[Figure: SRAM macro (Cell Array, Decoder, Sense Amp.) beneath M3 Global Wires carrying Signals, VDD and VSS. Bit Lines and Blocked Tracks are marked.]
Zigzag Minimizes Coupling from M3 Signals to M2 Word Lines when SRAM is Rotated

Rationale
“Effective Area”
– Actual Footprint + Routing Disturbance
– Larger, More Porous Layout is Faster
• Bigger Transistors
• More Space around Bit Lines
• Shielding
SoC
– Complex Microarchitecture
– Many Small SRAMs
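The "Effective Area" definition above is a simple sum, and a small sketch shows why a larger but more porous macro can still come out ahead. All numbers below are hypothetical and chosen only to illustrate the comparison; the slides give no such data.

```python
# Sketch of the "Effective Area" idea: footprint plus routing disturbance.
# The example values are hypothetical, not taken from the slides.

def effective_area(footprint_um2, routing_disturbance_um2):
    """Actual footprint plus the routing area the macro makes unusable."""
    return footprint_um2 + routing_disturbance_um2


# A dense macro that blocks most of the global routing tracks above it ...
dense = effective_area(footprint_um2=100_000, routing_disturbance_um2=60_000)
# ... versus a larger, more porous macro that leaves most tracks free.
porous = effective_area(footprint_um2=120_000, routing_disturbance_um2=15_000)

print(dense, porous)
```

With these illustrative values the porous macro has the smaller effective area even though its footprint is 20% larger.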

Circuit Design
Building Blocks
– Latch Array
• Malleable, Porous, Multi-Port SRAM
– Dynamic Wire-OR Gate
• High Fan-in, Safe, CAD Compatible
Power Dissipation
– Double Edge Flipflop
• 50% Clock Tree, 30% Peak Chip-Wide
– Interpolation Cells

Latch Array
[Figure: latch array schematic. Labels: Latch + Tristate Driver cells (pins G, D, Q, E); flipflops (pins CK, D, Q); Decoder (two); Read Address; Write Address; Write Data; Write Enable; Write Pulse Generator; Read Data; Test Mode.]
– Combinatorial Read Path
– May Buffer during Place&Route
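A behavioural sketch may help in reading the latch array schematic. It models only what the labels suggest: a decoded, pulse-gated write into one row of latches and an unclocked read through the selected row's tristate drivers. The class and method names are hypothetical, and test mode, multiple ports and timing are left out.

```python
# Behavioural sketch of a latch array (illustrative, not the author's design).
# Writes are gated by a write pulse; reads are purely combinatorial.

class LatchArray:
    def __init__(self, words, width):
        self.width = width
        self.cells = [0] * words   # one entry per row of latches

    def write_pulse(self, write_enable, write_addr, write_data):
        """Model one write cycle: when enabled, the write decoder opens
        exactly one row of latches for the duration of the pulse."""
        if write_enable:
            mask = (1 << self.width) - 1
            self.cells[write_addr] = write_data & mask

    def read(self, read_addr):
        """Combinatorial read: the read decoder enables the tristate
        drivers of one row, so no clock is involved."""
        return self.cells[read_addr]


array = LatchArray(words=4, width=8)
array.write_pulse(True, 2, 0xAB)
print(hex(array.read(2)))  # 0xab
```

Because the read path is ordinary combinatorial logic, a place-and-route tool is free to buffer it like any other net, which is the point of the "May Buffer during Place&Route" annotation.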

Dynamic Wire-OR Gate
Highest Leverage
– Dynamic vs. Static
Safe, CAD Compatible
– Limit Wire Length using Timing Driven Placement
– No Dynamic Inputs
[Figure: dynamic wire-OR schematic. Labels: Driver Cell (Input D1 ... Input DN); Receiver Cell; Keeper; latches (pins G, D, Q); Clock; Output; Sized for Max. Length; Sized for Max-Fanout; Sized for 1.]
– Max. Length by Max-Load, Max-Transition Spec.
– Limit Max. N by Max-Fanout Spec.
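The logical behaviour of a precharged wire-OR can be sketched in a few lines. This is a generic precharge/evaluate model and not a reconstruction of the schematic on the slide; the `max_fanin` limit stands in for the Max-Fanout spec and its default value is hypothetical.

```python
# Generic behavioural model of a dynamic (precharged) wire-OR gate.
# Assumption: the shared wire is precharged high while the clock is low,
# and any driver whose input is 1 discharges it during evaluation.
from typing import List, Tuple


def wire_or(clock: int, inputs: List[int], node: int = 1,
            max_fanin: int = 8) -> Tuple[int, int]:
    """Return (wire_node, output) for one clock phase.

    max_fanin is a hypothetical stand-in for the Max-Fanout spec.
    """
    if len(inputs) > max_fanin:
        raise ValueError("too many drivers on the shared wire")
    if clock == 0:
        node = 1                 # precharge phase
    elif any(inputs):
        node = 0                 # evaluate: at least one driver pulls down
    # otherwise the keeper holds the node at its previous value
    return node, 1 - node        # the receiver inverts the wire


print(wire_or(0, [1, 0, 0]))  # (1, 0) during precharge
print(wire_or(1, [0, 0, 1]))  # (0, 1) when any input is high
print(wire_or(1, [0, 0, 0]))  # (1, 0) when no input is high
```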

Double-Edge Flipflop
Low Power
– Clock ½ Frequency
– Light Clock Load
• 2 Large + 4 Small
Small, Fast§
– 15P + 15N Transistors
Safe, Flexible
– Fully Static
– Supports Scan
[Figure: flipflop schematic with D, Q and Ck. Annotation: Switching Nodes with Constant “1” Data.]
______
§ B. Nikolic et al., “Sense Amplifier-Based Flip-Flop,” ISSCC 1999.
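Why a double-edge flipflop allows the clock to run at half frequency can be seen from a behavioural comparison. The sketch below is a sampled-waveform model with hypothetical function names; it says nothing about the transistor-level circuit on the slide.

```python
# Behavioural comparison of single-edge and double-edge flipflops.
# clock and data are equal-length lists of 0/1 samples; the first sample
# only establishes the initial clock level.

def single_edge_ff(clock, data):
    """Capture D on rising clock edges only."""
    q, out, prev = 0, [], clock[0]
    for ck, d in zip(clock[1:], data[1:]):
        if prev == 0 and ck == 1:
            q = d
        out.append(q)
        prev = ck
    return out


def double_edge_ff(clock, data):
    """Capture D on every clock transition, rising or falling."""
    q, out, prev = 0, [], clock[0]
    for ck, d in zip(clock[1:], data[1:]):
        if ck != prev:
            q = d
        out.append(q)
        prev = ck
    return out


clock = [0, 1, 0, 1, 0]
data = [0, 1, 0, 1, 1]
print(single_edge_ff(clock, data))  # [1, 1, 1, 1]
print(double_edge_ff(clock, data))  # [1, 0, 1, 1]
```

With the same clock waveform, the double-edge model captures a new value on every transition, so it sustains the same data rate as a single-edge flipflop clocked twice as fast.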

Interpolation Cells
Same Footprint, Shorter Transistors
[Figure: 1X Cell, 2X Cell and 4X Cell, with variants labeled 2/3 Power, 5/6 Power and Full Power.]
For Post Route In-Place Optimization
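Because the interpolation variants share a footprint, a post-route optimizer can swap them without disturbing placement or routing. The sketch below shows one plausible selection rule: pick the lowest-power variant that still meets the required delay. The relative power values come from the slide; the variant names and delays are hypothetical.

```python
# Sketch of in-place cell swapping among same-footprint variants.
# Relative power (2/3, 5/6, full) is from the slide; names and delays
# are hypothetical placeholders.
from fractions import Fraction

VARIANTS = [
    # (name, relative power, delay in ps)
    ("two_thirds", Fraction(2, 3), 120.0),
    ("five_sixths", Fraction(5, 6), 105.0),
    ("full", Fraction(1), 100.0),
]


def pick_variant(required_delay_ps, variants=VARIANTS):
    """Return the lowest-power variant meeting the delay, or None."""
    feasible = [v for v in variants if v[2] <= required_delay_ps]
    if not feasible:
        return None
    return min(feasible, key=lambda v: v[1])


print(pick_variant(110.0))
print(pick_variant(130.0))
```

A path with ample slack gets the 2/3 power variant, while a tighter path keeps a stronger one; no cell moves, so the extracted parasitics remain valid.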

Analysis
Signal Integrity
– Parasitics “Accurate By Construction”
• Uniform Metal Density
• Majority Coupling to Power Rails (Shielding)
Speed Yield
– Balanced with Resources
• Area, Power, Design Time
– Goal: Adequate Confidence

Uniform Metal Density
Algorithmically Generated Filled Metal
Uniform Density on all Layers (except Local Interconnect)
[Figure: Post Route Metal Usage over cells U1 and U2 (pins A, Z) between VDD and VSS rails.]
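A toy model of track-based fill illustrates the idea of algorithmically generated fill metal. It assumes a gridded layer in which every unused track segment can receive fill; real fill generation must also honour spacing rules and tie the fill to a power net, which this sketch ignores. The grid encoding is hypothetical.

```python
# Toy model of post-route metal fill on a track grid.
# '#' = routed metal, '.' = empty track segment, 'F' = generated fill.

def fill_tracks(tracks):
    """Place fill metal in every empty track segment."""
    return [row.replace(".", "F") for row in tracks]


def density(tracks):
    """Fraction of track segments occupied by metal (routed or fill)."""
    cells = "".join(tracks)
    return sum(c != "." for c in cells) / len(cells)


routed = ["##..", "....", "#.#."]
filled = fill_tracks(routed)
print(filled)            # ['##FF', 'FFFF', '#F#F']
print(density(filled))   # 1.0
```

After fill, every track in the model has the same metal density regardless of how much routing it originally carried, which is what makes capacitance predictable before detailed routing is known.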

Advantages
Design
– Accurate Estimation
• Capacitance has Low Variance
– Known Coupling
• 50% to Adjacent Power Line
– Quick Feedback
• Interconnect-Only Extraction is Accurate
Manufacturing
– Uniform Etch Resist Loading
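The benefit of known coupling to a power line can be made concrete with the usual Miller-factor bound: coupling to a quiet rail counts once, while coupling to a switching signal neighbour counts anywhere from 0 to 2 times. The sketch below assumes the 50% figure refers to the share of coupling capacitance that goes to the power line; the capacitance values are hypothetical.

```python
# Range of effective wire capacitance under the standard Miller-factor
# bound (0x to 2x) for coupling to switching neighbours.
# Assumption: power_fraction is the share of coupling capacitance that
# goes to a quiet power rail. Capacitance values are hypothetical.

def effective_cap_range(c_ground, c_coupling, power_fraction):
    """Return (min, max) effective capacitance seen by a switching wire."""
    c_fixed = c_ground + power_fraction * c_coupling
    c_signal = (1.0 - power_fraction) * c_coupling
    return c_fixed, c_fixed + 2.0 * c_signal


print(effective_cap_range(1.0, 2.0, 0.5))  # (2.0, 4.0)
print(effective_cap_range(1.0, 2.0, 0.0))  # (1.0, 5.0)
```

With half of the coupling tied to a power line, the spread between best and worst case in this example falls from 4.0 to 2.0 units, which is one way to read "Capacitance has Low Variance".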

Asymmetric Rise-Fall Delays
[Figure: pulse waveforms through gates built with Same Size P Transistors and Same Size N Transistors. Annotations: Slow; Shrink; Slow; Elongates; Delay; Duty Cycle; Same.]
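How asymmetric rise and fall delays distort a pulse can be sketched with a simple edge-delay model. It assumes a chain of identical inverters with equal loads, each adding `t_rise_ps` to an output rising edge and `t_fall_ps` to an output falling edge; the function name and numbers are hypothetical and not read from the figure.

```python
# Pulse width through a chain of identical inverters with unequal
# rise and fall delays (generic model with hypothetical values).

def pulse_widths(width_ps, t_rise_ps, t_fall_ps, stages):
    """Return the pulse width after each inverter stage.

    The input is a high-going pulse. Each inverter flips the polarity:
    a high pulse becomes a low pulse whose leading edge is delayed by
    t_fall and whose trailing edge is delayed by t_rise, and vice versa.
    """
    widths = []
    high_pulse = True
    w = width_ps
    for _ in range(stages):
        if high_pulse:
            w = w + t_rise_ps - t_fall_ps
        else:
            w = w + t_fall_ps - t_rise_ps
        high_pulse = not high_pulse
        widths.append(w)
    return widths


print(pulse_widths(500.0, 60.0, 40.0, 4))  # [520.0, 500.0, 520.0, 500.0]
```

In this model the distortion cancels after every pair of stages. Paths that do not alternate so neatly accumulate width error instead.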

Pros and Cons
Advantages
– More Compact Cells, Faster Circuits
Disadvantages
– Need Careful Analysis, Greater Margin
Strategy:
– Main Library
• Asymmetric, “No Wasted Space”
– Symmetric Subset
• Gated Clocks, Write Pulse Buffering, ...

Speed Yield Management
[Figure: process variation plot with axes Slow N to Fast N and Slow P to Fast P Transistors. Labels: Maximum Process Variation; “Four Corner” Analysis; Mature Process Variation; Process Center; Setup Time Failures; Hold Time Failures; Correct Operation.]
– Target Design and Characterize Library Here
– Possibly Impossible to Meet Performance Goal, or Needlessly High Effort
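A "four corner" analysis checks timing at each combination of slow and fast N and P transistors. The sketch below enumerates the corners and applies the usual setup and hold inequalities. The delay scale factors, path delays and the averaging of N and P scale factors are all hypothetical simplifications, not data from the slide.

```python
# Minimal four-corner setup/hold check (illustrative assumptions only).
from itertools import product

SCALE = {"slow": 1.2, "fast": 0.8}   # hypothetical delay multipliers


def four_corners():
    """All (N, P) transistor speed combinations."""
    return list(product(("slow", "fast"), repeat=2))


def check_corner(n, p, max_path_ps, min_path_ps,
                 period_ps, setup_ps, hold_ps):
    """Return (setup_ok, hold_ok) at one process corner.

    Delay is scaled by the average of the N and P factors, a crude
    stand-in for corner-specific library characterization.
    """
    scale = (SCALE[n] + SCALE[p]) / 2
    setup_ok = max_path_ps * scale + setup_ps <= period_ps
    hold_ok = min_path_ps * scale >= hold_ps
    return setup_ok, hold_ok


for n, p in four_corners():
    print(n, p, check_corner(n, p, 800.0, 35.0, 1000.0, 50.0, 30.0))
```

With these values the slow-slow corner fails setup and the fast-fast corner fails hold, mirroring the failure regions labeled in the figure.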

Conclusions
“Precision Physical Design”
– Global
• Power Grid
• Macro Routing Porosity
– Methodical
• Signal Integrity
• Parasitic Extraction
• Timing Uncertainties (Coupling)
– Confident
• Correctness and Speed