Fast Timing-Model Independent Buffered Clock-Tree Synthesis Xin-Wei Shih and Yao-Wen Chang

Post on 27-Dec-2015

TRANSCRIPT

Page 1: Xin-Wei Shih and Yao-Wen Chang.  Introduction  Problem formulation  Algorithms  Experimental results  Conclusions

Fast Timing-Model Independent Buffered Clock-Tree Synthesis

Xin-Wei Shih and Yao-Wen Chang

Page 2:

Introduction

Problem formulation

Algorithms

Experimental results

Conclusions

Outline

Page 3:

Skew-minimized buffered clock-tree synthesis plays an important role in high-performance VLSI designs for synchronous circuits.

Because existing timing models are not accurate enough for modern chip designs, embedding simulation into the clock-tree synthesis flow becomes inevitable.

Introduction

Page 4:

A possible way to improve the speed is to perform the clock construction by structure optimization.
◦ Mesh

In this paper, a novel timing-model independent buffered clock-tree synthesis method is proposed.
◦ Buffering and wiring structures of all paths from the clock source to its sinks are almost the same.

Introduction

Page 5:

Problem: Buffered Clock-Tree Synthesis (BCTS)

Instance: Given a set of clock sinks, a slew-rate constraint, and a library of buffers.

Question: Construct a buffered clock tree to minimize its skew, subject to no slew-rate violation.

Problem formulation
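
The BCTS question above can be written compactly as follows. This is only a notational sketch: the symbols T (the buffered clock tree), S (the sink set), t(s) (the clock latency at sink s), slew(v), and slew_max are assumptions introduced here and do not appear on the slides. Skew is taken in the usual sense, as the gap between the largest and smallest sink latency.

```latex
\[
\min_{T}\ \mathrm{skew}(T)
  \;=\; \max_{s \in S} t(s) \;-\; \min_{s \in S} t(s)
\qquad \text{subject to} \qquad
\mathrm{slew}(v) \le \mathrm{slew}_{\max} \quad \forall\, v \in T
\]
```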

Page 6:

Algorithm

Page 7:

The number of leaves (sinks) can be treated as the product of a sequence of branch numbers.
◦ This multiplication sequence exactly forms a factorization.

Then, the BNP (branch-number plan) is arranged in non-increasing order of the factorization list.

Branch-Number Planning
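
A minimal sketch of the planning step described above, assuming the factorization meant is the prime factorization of the sink count (the slides say only "factorization"). The function name `branch_number_plan` is hypothetical.

```python
def branch_number_plan(num_sinks):
    """Return the prime factors of num_sinks in non-increasing order.

    Assumption: the factorization used for planning is the prime
    factorization; the slides do not state this explicitly.
    """
    if num_sinks < 1:
        raise ValueError("num_sinks must be a positive integer")
    factors = []
    remaining = num_sinks
    divisor = 2
    # Trial division: strip each divisor as many times as it divides.
    while divisor * divisor <= remaining:
        while remaining % divisor == 0:
            factors.append(divisor)
            remaining //= divisor
        divisor += 1
    if remaining > 1:
        factors.append(remaining)  # leftover prime factor
    return sorted(factors, reverse=True)


print(branch_number_plan(12))  # [3, 2, 2], since 12 = 3 * 2 * 2
```

The product of the returned list always equals the sink count, so every leaf is reached through the same sequence of branch numbers.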

Page 8:

Branch-Number Planning

Page 9:

Existing partitioning methods, whether top-down like [10] or bottom-up like [7, 11], can hardly be applied to non-binary tree structures.

Therefore, we propose a novel partitioning method, which can not only handle non-binary tree structures, but also achieve good quality in terms of the cluster diameter.
◦ Cluster diameter: the maximum distance among sub-trees within the same cluster.

Tree Construction-Partitioning
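
The cluster-diameter definition can be illustrated with a short sketch. Two assumptions are made here that the slides do not state: each sub-tree is represented by a single (x, y) point, such as its root location, and distance is measured in the Manhattan metric. The names `manhattan` and `cluster_diameter` are hypothetical.

```python
from itertools import combinations


def manhattan(p, q):
    """Manhattan distance between two (x, y) points (assumed metric)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def cluster_diameter(points):
    """Maximum pairwise distance among the points of one cluster.

    Each point stands in for one sub-tree (an assumption of this sketch).
    A cluster with fewer than two points has diameter 0.
    """
    if len(points) < 2:
        return 0
    return max(manhattan(p, q) for p, q in combinations(points, 2))


print(cluster_diameter([(0, 0), (3, 4), (1, 1)]))  # 7, between (0, 0) and (3, 4)
```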

Page 10:

We borrow the idea of cake cutting, i.e., slicing a cake into pieces from the center of the cake.

Partitioning
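
One way to picture the cake-cutting idea is an angular sweep around a center point. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the center is taken as the centroid of the points, the first cut is fixed at the angle where `atan2` wraps around, and the clusters are forced to equal size. The slides do not say how the cuts are actually chosen. The function name `cake_cut` is hypothetical.

```python
import math


def cake_cut(points, k):
    """Split points into k equal-size angular slices around their centroid.

    Assumptions of this sketch: centroid as the cake center, a fixed
    starting angle, and len(points) divisible by k.
    """
    if k < 1 or not points or len(points) % k != 0:
        raise ValueError("need a non-empty point set whose size is divisible by k")
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Sort by polar angle around the center, then cut into contiguous runs.
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    size = len(points) // k
    return [ordered[i * size:(i + 1) * size] for i in range(k)]


slices = cake_cut([(1, 0), (0, 1), (-1, 0), (0, -1)], 2)
print(slices)  # [[(0, -1), (1, 0)], [(0, 1), (-1, 0)]]
```

A fuller version would presumably try several cut positions and keep the one with the smallest cluster diameter, since cluster diameter is the stated quality measure.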

Page 11:

Embedding-Region Construction

Page 12:

Node Embedding

Page 13:

Since the symmetrical structure requires identical branch numbers at the same level, a pseudo sink should be transformed into a dangling wire to maintain the symmetry.

For partitioning, we relax the size requirement so that the sizes of clusters in a partition can differ by at most one in the first recursion.

For node embedding, we let the embedding regions of pseudo sinks cover the entire chip.

Pseudo Sink Handling
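
The relaxed size rule can be sketched as follows. The idea that each smaller cluster is then padded with pseudo sinks up to the largest cluster size is an assumption of this example, inferred from the symmetry requirement rather than stated on the slides. Both function names are hypothetical.

```python
def balanced_cluster_sizes(num_sinks, num_clusters):
    """Cluster sizes that differ by at most one and sum to num_sinks."""
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    base, extra = divmod(num_sinks, num_clusters)
    # The first `extra` clusters take one additional sink each.
    return [base + 1] * extra + [base] * (num_clusters - extra)


def pseudo_sinks_needed(sizes):
    """Pseudo sinks per cluster to pad all clusters to the largest size.

    Assumption: padding to equal size is how symmetry is restored.
    """
    target = max(sizes)
    return [target - size for size in sizes]


sizes = balanced_cluster_sizes(10, 3)
print(sizes)                       # [4, 3, 3]
print(pseudo_sinks_needed(sizes))  # [0, 1, 1]
```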

Page 14:

Buffers are inserted in a top-down manner.
◦ Tracing along the tree edges, once the slew rate is about to violate the constraint, identical buffers are inserted on all branches.

◦ Buffers inserted at the same level are identical in type and size.

◦ The slew rate is approximated by the capacitance accumulated since the most recently inserted buffer.

Buffer Insertion
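
The capacitance-based rule above can be sketched along one source-to-sink path. Because the tree is symmetrical, the same positions would apply to every branch at that level. The model is deliberately simplified and rests on assumptions not in the slides: the path is a list of wire-segment capacitances, the slew constraint is replaced by a single hypothetical threshold `cap_limit`, and branching and buffer input capacitance are ignored.

```python
def plan_buffers(segment_caps, cap_limit):
    """Return indices of segments that get a buffer placed just before them.

    segment_caps: wire capacitance of each segment, listed from the
                  source toward the sink (hypothetical units).
    cap_limit:    assumed capacitance budget standing in for the
                  slew-rate constraint.
    """
    positions = []
    accumulated = 0
    for index, cap in enumerate(segment_caps):
        # Insert a buffer when adding this segment would exceed the budget.
        if accumulated > 0 and accumulated + cap > cap_limit:
            positions.append(index)
            accumulated = 0  # restart from the newly inserted buffer
        accumulated += cap
    return positions


print(plan_buffers([4, 4, 4, 4, 4], 10))  # [2, 4]
```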

Page 15:

Buffer Insertion

Page 16:

Implemented in the C++ programming language on a 2.6 GHz AMD-64 workstation.

Four ISPD’09 Clock Network Synthesis Contest benchmarks with no blockages [17] and the IBM benchmarks [19].

Quality is evaluated with ngspice [13] simulation based on a 45 nm process technology [14].

Experimental results

Page 17:

clock skew (skew)

clock-latency range (CLR)

total resource usage (usage)

Experimental results

Page 18:

Experimental results

Page 19:

Experimental results

Page 20:

We have presented a fast timing-model independent buffered clock tree synthesis method to construct a symmetrical clock tree with little wiring overhead.

By symmetrically constructing a clock tree, the clock skew can be minimized without referring to simulation information.

Conclusions