Closing the Gap Between ASIC and Custom: Chapters 12, 13, 14
Chapter 12: Semi-Custom Methods in a High-Performance Microprocessor Design
Custom Processor Design
IBM eServer zSeries (S/390 mainframe)
Physical design makes extensive use of hierarchy
  Each functional unit is partitioned as a macro
  Each macro unit is fully floorplanned
  Global wiring is done hierarchically
  Macros are characterized for timing, noise…
  Timing rules are generated using static transistor-level simulation
Circuit and physical design start as soon as sufficient logic is defined
Custom Processor Design
As the design matures, emphasis shifts from functional verification to logic modification and repartitioning to achieve timing closure
Efficiency, turnaround time and flexibility are as important as cycle time
Three types of macros
  Arrays
  Synthesized random logic macros (RLMs)
  Full-custom dataflow
    Done predominantly in static logic, with dynamic circuitry reserved for extremely critical functions
Custom Processor Design
Custom design is very effective when elements are identical across the bit range of the data stack
Complex numerical functions are usually far less regular across the stack
  Require more effort to produce as full custom
  Often timing critical
  Circuit architecture might evolve
  Good candidates for semi-custom design
Semi-Custom Design
Basic building block is a set of parameterized gates
  Forms a basis set capable of covering most of the design space
  No directly associated layout
Circuit Tuning
Tools can be divided into dynamic and static tuning
  Dynamic tuning involves simulation with explicit waveforms and measures
  Static tuning formulates optimization through static timing, optimizing slack in the presence of timing assertions
Large, non-bitslice circuits are impractical for dynamic tuning, but well suited to static tuning
Tool used here is EinsTuner
  Built on top of a static transistor-level timing tool (EinsTLT), which combines a fast event-driven simulator (SPECS) with a timing tool (EinsTimer)
Cell Generation
Create layouts corresponding to the parameterized gates
The authors use their own tool, C-cell, a script-based system designed to produce optimal layout
Tool supports semi-custom design
  Generates a set of layouts from cell specs
  Parses a schematic
  Converts between parameterized and standard (RLM library) cells
  Has an integrated floorplanning aid
  Layout post-processing (flattening, shape trimming)
Conclusion
Faster method than full custom
Performance is comparable to full custom
Sometimes better performance, if the architecture selection for full custom is done non-optimally
Easy adaptation to global timing convergence is an advantage of semi-custom design
Chapter 13: Controlling Uncertainty in High-Frequency Designs
Uncertainty Defined
Process uncertainty
  For example, in-die variation
Tool uncertainty
  Inaccuracy in the simulation and extraction tools
  For example: if inductance is not extracted, the predicted frequency of the design will be optimistic relative to the actual frequency
Design uncertainty
  Unpredictable variations in the design process between design iterations
  Variations in the execution of design methods across the chip
Uncertainty Defined
Uncertainty in the manufacturing, tool, and design processes causes a gap between the predicted and actual frequencies, thus reducing the cycle time available for logic functionality
Uncertainty and Frequency
When the process contains uncertainty
  Time and energy are wasted on non-critical parts of the design
  Reduced frequency or delay in time to market
Focused Methodology Development
Reducing the uncertainty will minimize the number of paths in the WNS (worst negative slack) bucket and thus the effort required to address them
It has been shown that the closer the correlation between the predicted frequency and the actual frequency for the paths in the WNS bucket, the higher the actual frequency will be
Methods for Removing Paths from the Uncertainty Window
Traditionally, CAD algorithms optimize WNS
  Results in a large number of paths in the WNS bucket
Better to use a total negative slack (TNS) algorithm
  TNS is defined as the sum of the negative slacks
  Tries to improve all negative paths until they reach zero slack
Goal is to remove all paths from the negative region; design goal is to remove all paths from the WNS bucket
Design frequency should be set so that all paths in the WNS bucket are in the negative region
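The difference between the two cost functions can be illustrated with a minimal Python sketch. The function names, the slack values, and the definition of the "WNS bucket" as all paths within a chosen window of the worst slack are assumptions made for illustration; the slides do not define the bucket numerically.

```python
def slack_metrics(slacks):
    """Return (WNS, TNS) for a list of path slacks (e.g. in picoseconds).

    WNS is the most negative slack (0 if no path is negative);
    TNS is the sum of all negative slacks.
    """
    negative = [s for s in slacks if s < 0]
    wns = min(negative, default=0)
    tns = sum(negative)
    return wns, tns


def wns_bucket(slacks, window):
    """Paths whose slack lies within `window` of the worst slack.

    This bucket definition is an assumption for illustration only.
    """
    worst = min(slacks)
    return [s for s in slacks if s <= worst + window]


# Hypothetical path slacks in picoseconds.
slacks = [-50, -48, -45, -10, 5, 20]
wns, tns = slack_metrics(slacks)
bucket = wns_bucket(slacks, window=5)
```

A WNS-driven optimizer only sees the single worst value, so the -48 and -45 paths stay in the bucket until the worst path has been pulled past them. A TNS-driven cost credits improvement on every negative path.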
Conclusion
Gap between actual and predicted frequency implies uncertainty in the manufacturing, tool, and design processes
  Design teams will work on the wrong paths
If the uncertainty in the analysis of the design is minimized, resources can be managed better and the gain from costly local optimizations is higher
Conclusion
To control design and tool uncertainty, take the following steps
  List all sources of uncertainty
  Develop an uncertainty plan
  Reduce the guard band as much as possible
  Use a TNS-based cost model
  Tune the design frequency
  Toward the end of the design, treat all paths within sigma of the design as equal; reduce uncertainty by reducing automation
  Push CAD vendors into algorithm development
Finally, remember that any gap between predicted and actual frequency lowers the actual frequency
Chapter 14: Increasing Circuit Performance through Statistical Design Techniques
Process Variability
As CMOS technology keeps scaling, the magnitude of process variability will increase
Systematic variability arises from the interaction between the manufacturing process and the properties of the design
  Optical proximity effects cause polysilicon features to vary depending on the surrounding local layout
  Inter-layer dielectric thickness varies because CMP depends on the local wire density
Ability to improve manufacturing tolerances is limited
  Mask fabrication
  Overlay control
Process Variability
Intra-chip variation should be taken into account
A recent study shows that for 0.13um CMOS, 35% of the variation in MOS channel length is due to intra-chip variation
  For 0.07um CMOS the intra-chip share would be 60%
Intra-chip variation is caused by the emergence of a number of variation-generating mechanisms located at the interface between design and process
Identifying Sources of Variation
Needed to decide which of the multiple sources and patterns of variation deserve the most attention
  For example, by impact on path delay
The exact variability contribution of a process parameter is defined by
  Sensitivity of the circuit performance to that parameter
  Magnitude of the variation
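One way to rank the sources is sketched below in Python. It assumes a first-order model in which the contribution of a parameter is the product of the delay sensitivity and the standard deviation of the parameter; the slide names the two factors but not how they combine, so the product is an assumption. The parameter names and numbers are hypothetical, not data from the chapter.

```python
def rank_contributions(params):
    """Rank process parameters by first-order contribution to delay variation.

    params maps a parameter name to (sensitivity, sigma), where sensitivity
    is the change in path delay per unit of the parameter and sigma is the
    standard deviation of the parameter. The contribution is taken as
    |sensitivity| * sigma, which is an assumed first-order model.
    """
    contributions = {
        name: abs(sensitivity) * sigma
        for name, (sensitivity, sigma) in params.items()
    }
    return sorted(contributions.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: (delay sensitivity in ps per unit, sigma in that unit).
example = {
    "channel_length": (2.0, 4.0),   # ps/nm, nm
    "threshold_voltage": (0.25, 12.0),  # ps/mV, mV
    "ild_thickness": (0.5, 3.0),    # ps/nm, nm
}
ranking = rank_contributions(example)
```

A parameter with a small sigma can still dominate if the circuit is very sensitive to it, which is why both factors are needed before deciding where to spend modeling effort.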
Increasing Performance through Probabilistic Timing Modeling
How does intra-chip variation differ from inter-chip variation?
Usually in high-performance chips, delay is optimized by shifting it from the critical path to paths with slack
  Results in a chip with a large number of paths near the maximum delay
Inter-chip variation affects every path similarly
The effect of intra-chip variation depends on the surroundings and die position
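The distinction matters because the chip delay is the maximum over all near-critical paths. The Python sketch below is a small Monte Carlo experiment under simplifying assumptions that are not in the slides: all paths share the same nominal delay, inter-chip variation is a single Gaussian shift per chip, and intra-chip variation is an independent Gaussian term per path.

```python
import random


def mean_max_delay(n_paths, sigma_inter, sigma_intra,
                   nominal=1.0, trials=2000, seed=0):
    """Average of the slowest-path delay over many simulated chips.

    Assumed model: delay = nominal + chip-wide shift + per-path shift,
    with both shifts Gaussian and independent of each other.
    """
    rng_inter = random.Random(seed)      # chip-to-chip shifts
    rng_intra = random.Random(seed + 1)  # path-to-path shifts
    total = 0.0
    for _ in range(trials):
        chip_shift = rng_inter.gauss(0.0, sigma_inter)
        worst = max(
            nominal + chip_shift + rng_intra.gauss(0.0, sigma_intra)
            for _ in range(n_paths)
        )
        total += worst
    return total / trials


# Only intra-chip variation: more near-critical paths raise the mean worst delay.
one_path = mean_max_delay(1, sigma_inter=0.0, sigma_intra=0.05)
many_paths = mean_max_delay(100, sigma_inter=0.0, sigma_intra=0.05)
```

With only inter-chip variation the number of paths makes no difference to the mean worst delay, because every path moves together. With intra-chip variation the mean worst delay grows with the number of near-critical paths.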
Increasing Performance through Probabilistic Timing Modeling
Conservatism of the traditional timing tools is more disadvantageous for ASICs
  No full-speed testing as in custom circuits
By implementing a probabilistic timing analysis methodology, the conservatism built into standard ASIC design can be reduced
  By lowering yield, performance could be improved
  A yield of 98% (instead of 99.99%) reduces conservatism by 17%
It has been noted that an ASIC chip produced in a foundry can run up to 40% faster than predicted by standard timing analysis
Vendors would trade yield for performance if the revenue from faster chips justifies the additional expense in lost yield and testing overhead
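The yield-versus-conservatism trade can be sketched in Python with a normal quantile. This assumes the chip delay is normally distributed with a known mean and sigma, which is a simplification introduced here; the mean and sigma values are hypothetical, and the sketch is not meant to reproduce the 17% figure quoted above.

```python
from statistics import NormalDist


def timing_bound(mean_delay, sigma, yield_target):
    """Delay that a fraction `yield_target` of chips is expected to meet.

    Assumes a normally distributed chip delay (an illustrative assumption).
    """
    z = NormalDist().inv_cdf(yield_target)
    return mean_delay + z * sigma


# Hypothetical delay distribution: mean 1000 ps, sigma 50 ps.
strict = timing_bound(1000.0, 50.0, 0.9999)
relaxed = timing_bound(1000.0, 50.0, 0.98)
saving = (strict - relaxed) / strict  # fraction of cycle time recovered
```

Relaxing the yield target moves the sign-off point from roughly 3.7 sigma to roughly 2.05 sigma above the mean, so the rated cycle time shrinks at the cost of discarding or down-binning more parts.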
Increasing Performance through Design for Manufacturability Techniques
As mentioned, intra-chip variation is affected by layout
Most techniques presented are already in use in full-custom design
  Optical proximity correction (OPC)
    Covers a wide range of reticle enhancement techniques
    Geometrical structures are added to the mask
    Critical dimension (L) and resolution variable
    Corner rounding and line pull-back
  Phase shifting mask (PSM)
Increasing Performance through Design for Manufacturability Techniques
Currently (2002), a significant effort is under way to provide cell libraries that are OPC- and PSM-compliant
  Allows ASIC designers to benefit from those techniques
Inserting dummy features into regions of lesser density will increase uniformity
  Improves process uniformity of CMP
  Downside is increased coupling capacitance, with the attendant delay and signal integrity dangers
  Might be better to use a better model instead of adding metal fill, or to use both
Systematic spatially correlated variation caused by lens aberrations would need a mask-level spatial correlation algorithm performed in conjunction with OPC
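The dummy-fill idea can be sketched as a window-based density check in Python. The grid of per-window metal densities, the minimum-density target, and the function names are all hypothetical; a real fill flow also has to account for the coupling capacitance downside noted above, which this sketch ignores.

```python
def add_dummy_fill(density, min_density):
    """Raise every layout window to at least `min_density`.

    density is a 2D list of metal densities (0.0 to 1.0), one per window.
    Returns (filled, added): the densities after fill and the amount of
    dummy metal added per window.
    """
    filled = [[max(d, min_density) for d in row] for row in density]
    added = [[max(min_density - d, 0.0) for d in row] for row in density]
    return filled, added


def density_range(grid):
    """Spread between the densest and the sparsest window."""
    values = [d for row in grid for d in row]
    return max(values) - min(values)


# Hypothetical 2x2 grid of window densities.
grid = [[0.1, 0.5],
        [0.3, 0.7]]
filled, added = add_dummy_fill(grid, min_density=0.3)
```

Fill is added only where the density falls below the target, so the spread between the sparsest and densest windows narrows, which is the uniformity that CMP benefits from.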
Increasing Performance through Design for Manufacturability Techniques
Is the parameter variation systematic or random?
  Systematic variation can be modeled deterministically
  Random variation (or variation too complex to model deterministically) is best described by statistical means
Conclusion
Intra-chip variation of process parameters is increasing
  Makes timing estimates provided by standard design methodology overly conservative
  Downgrades the speed
New methods are needed for timing analysis
ASICs suffer more from these effects
  No full-speed tests, or trading yield for speed