CMP Design Space Exploration Subject to Physical Constraints
Yingmin Li, Benjamin Lee, David Brooks, Zhigang Hu, Kevin Skadron
HPCA’06
01/27/2010
Issues
• Power and thermal issues are critical to architectural design
• Design space exploration under physical constraints
  – Core count, pipeline depth, superscalar width, L2 cache, and voltage and frequency, under area and thermal constraints
• Prior work
  – Exclusively on performance or on single-core designs
Contributions
• Various new observations on CMP design under physical constraints
• An experimental methodology that greatly reduces the cost of design space exploration
Approach
• There are many design parameters to optimize and co-optimize
• In this paper, several methods are used
  – Modeling and approximation
    • Performance, power and area scaling
    • Temperature
  – Decoupled core and interconnect/cache simulations; the simulation infrastructure is modular
  – SimPoint for representative simulation points
Approach
• Modeling
  – Formulas model how power, performance, and area scale with pipeline width and depth
  – Temperature is modeled at the granularity of a core
• Decoupled simulation
  – Use IBM’s Turandot/PowerTimer to generate L2 cache-access traces (a one-time cost)
  – Feed the traces to Zauber, a cache simulator
  – Interpolation
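The slide does not say which quantities are interpolated. As a minimal sketch, assume a metric such as L2 misses per thousand instructions has been simulated at a few cache sizes and an estimate is wanted for a size in between. The function name, the data points, and the choice of piecewise-linear interpolation are all assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear estimate of y at x.

    xs must be sorted ascending and contain at least two points.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the simulated range")
    # Index of the right-hand neighbour, clamped so x == xs[-1] works.
    i = min(bisect_right(xs, x), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical simulated points: L2 size in MB -> misses per 1000 instructions.
l2_mb = [2, 4, 8, 16]
mpki = [12.0, 8.0, 5.0, 4.0]

estimate_6mb = interpolate(l2_mb, mpki, 6)  # halfway between the 4 MB and 8 MB points
```

Interpolating between a handful of simulated points avoids running the cache simulator for every candidate configuration, at the cost of some accuracy between points.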
Approaches
• DVFS
• Workloads
  – SPEC 2000
  – CPU bound and memory bound
• Constraints
  – 200 + LR + MEMORY (Area + Thermal + CPU/Memory)
• Performance and power/performance efficiency
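A rough sketch of how DVFS and an efficiency metric interact, using the standard approximation that dynamic power scales with V² · f. The slide does not name the efficiency metric; BIPS³/W is used here as an assumption, and all numbers are hypothetical.

```python
def scaled_dynamic_power(p_nominal, v_nominal, f_nominal, v, f):
    """Dynamic power after DVFS, assuming P_dyn is proportional to V^2 * f."""
    return p_nominal * (v / v_nominal) ** 2 * (f / f_nominal)


def bips3_per_watt(bips, watts):
    """Power/performance efficiency as BIPS^3 per watt (assumed metric)."""
    return bips ** 3 / watts


# Hypothetical chip: 100 W at 1.0 V and 4.0 GHz, scaled to 0.5 V and 2.0 GHz.
power_scaled = scaled_dynamic_power(100.0, 1.0, 4.0, 0.5, 2.0)  # 12.5 W

# Hypothetical throughput of 2.0 BIPS at the scaled operating point.
efficiency = bips3_per_watt(2.0, power_scaled)
```

Halving both voltage and frequency cuts dynamic power to one eighth in this model, which is why DVFS is a natural knob for meeting a power or thermal budget.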
Results
• Without constraints
• CPU-bound benchmarks favor deeper pipelines
• Memory-bound benchmarks favor shallower pipelines
With Area Constraints
• To meet the area constraints,
  – Workloads
    • Decrease the cache size for CPU-bound workloads
    • Decrease the number of cores for memory-bound workloads
  – Pipeline dimensions
    • Shifting to narrower widths provides greater area impact
• CPU-bound and memory-bound workloads have different, incompatible optima
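The area-constrained search can be sketched as: enumerate configurations, keep those that fit the area budget, then pick the best one per workload. Everything below is a toy under stated assumptions: the area constants and both score functions are hypothetical, chosen only so the two workload classes end up at different optima, and are not the paper's models or results.

```python
from itertools import product

CORE_AREA_MM2 = 20.0   # hypothetical area of one core
L2_AREA_PER_MB = 10.0  # hypothetical area of 1 MB of L2 cache


def chip_area(cores, l2_mb):
    """Toy area model: cores plus L2, ignoring interconnect."""
    return cores * CORE_AREA_MM2 + l2_mb * L2_AREA_PER_MB


def best_under_area(budget_mm2, score):
    """Return the (cores, l2_mb) pair with the highest score within budget."""
    candidates = [
        (cores, l2_mb)
        for cores, l2_mb in product([2, 4, 8, 16], [2, 4, 8, 16])
        if chip_area(cores, l2_mb) <= budget_mm2
    ]
    return max(candidates, key=lambda cfg: score(*cfg))


def cpu_bound_score(cores, l2_mb):
    """Toy score: throughput tracks core count, cache helps only slightly."""
    return cores * (1.0 + 0.01 * l2_mb)


def memory_bound_score(cores, l2_mb):
    """Toy score: throughput tracks cache size, cores help only slightly."""
    return cores ** 0.5 * l2_mb


cpu_choice = best_under_area(200.0, cpu_bound_score)
mem_choice = best_under_area(200.0, memory_bound_score)
```

With these toy scores and a 200 mm² budget, the CPU-bound workload spends its area on cores and the memory-bound workload spends it on cache, so no single configuration is optimal for both.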
Results
Optimal Configurations with Varying Pipeline Width, Fixed Depth (18FO4)
Results
Optimal Configurations with Varying Pipeline Depth, Fixed Width (4D)
With Thermal Constraints
• To meet the thermal constraints
  – Decrease the cache size for CPU-bound workloads
  – Decrease the number of cores for memory-bound workloads
Thermal Constraints
• Thermal constraints exert great influence on the optimal design configurations
• Thermal constraints should be considered early in the design process
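One way to see why the thermal limit shapes the design is a lumped steady-state model, T = T_ambient + R_thermal · P, used to find the fastest operating point that stays under a temperature limit. This is a simplified sketch, not the paper's thermal model; the function name, operating points, thermal resistances, and temperatures are all hypothetical.

```python
def max_frequency_within_limit(operating_points, r_thermal, t_ambient, t_limit):
    """Highest frequency whose steady-state temperature stays within t_limit.

    operating_points: iterable of (frequency_ghz, power_w) pairs.
    r_thermal: package thermal resistance in degrees C per watt.
    Returns None when no operating point is thermally feasible.
    """
    feasible = [
        freq
        for freq, power in operating_points
        if t_ambient + r_thermal * power <= t_limit
    ]
    return max(feasible) if feasible else None


# Hypothetical DVFS operating points: (GHz, W).
points = [(4.0, 100.0), (3.6, 72.9), (3.2, 51.2)]

good_package = max_frequency_within_limit(points, 0.3, 45.0, 85.0)
mid_package = max_frequency_within_limit(points, 0.5, 45.0, 85.0)
poor_package = max_frequency_within_limit(points, 1.0, 45.0, 85.0)
```

In this toy setup the same chip runs at full speed with a low-resistance package, must throttle with a mid-range one, and has no feasible operating point with a poor one, which is the kind of sensitivity that argues for considering thermal limits early.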
Conclusions
• Joint optimization across multiple design variables is necessary
• Thermal constraints appear to dominate other physical constraints and tend to favor shallower pipelines and narrower cores