
Order Statistics for Correlated Random Variables and Its Application to At-Speed Testing

YIYU SHI, Missouri University of Science and Technology
JINJUN XIONG and VLADIMIR ZOLOTOV, IBM T. J. Watson Research Center
CHANDU VISWESWARIAH, IBM Microelectronics

Although order statistics have been studied for several decades, most of the results are based on the assumption of independent and identically distributed (i.i.d.) random variables. In the literature, how to compute the mth order statistic of n correlated random variables is still an open problem. This article proposes a recursive algorithm based on statistical min/max operations to compute order statistics for general correlated and not necessarily identically distributed random variables. The algorithm has O(mn) time complexity and O(m + n) space complexity. A binary tree-based data structure is further developed to allow selective update of the order statistics in O(nm²) time. As a vehicle to demonstrate the algorithm, we apply it to the path selection algorithm in at-speed testing. A novel metric, the multilayer process space coverage metric, is proposed to quantitatively gauge the quality of path selection. We then show that such a metric is directly linked to the order statistics, and our recursive algorithm can thus be applied. By employing a branch-and-bound path selection algorithm with these techniques, this article shows that selecting an optimal set of paths for a multimillion-gate design can be performed efficiently. Compared to the state of the art, experimental results show both the efficiency of our algorithms and better quality of our path selection.

Categories and Subject Descriptors: G.3 [Mathematics of Computing]: Probability and Statistics

General Terms: Theory, Algorithms, Reliability

Additional Key Words and Phrases: Order statistics, path selection, process space coverage, at-speed testing

ACM Reference Format:
Shi, Y., Xiong, J., Zolotov, V., and Visweswariah, C. 2013. Order statistics for correlated random variables and its application to at-speed testing. ACM Trans. Des. Autom. Electron. Syst. 18, 3, Article 42 (July 2013), 20 pages.
DOI: http://dx.doi.org/10.1145/2491477.2491486

1. INTRODUCTION

It has been the dream of every gambler to be able to judge precisely the chances of his wagers. In particular, for horse races, this would require that the gambler be able to rank the horses according to their past performance and make his/her bet based on the ranking. This seemingly difficult problem, from the view of statisticians, is an ideal application of the order statistics theory.

For random variables s1, . . . , sn, we take a large number of sample sets, and in each set we sort the n samples in order of their magnitude. We can then use s[i] (1 ≤ i ≤ n) to denote a random variable whose samples are the collection of the ith smallest in each

Preliminary results of this article appeared in Proceedings of the IEEE/ACM Design Automation Conference (DAC’09) [Xiong et al. 2009].
Authors’ addresses: Y. Shi, Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO 65409; email: [email protected]; J. Xiong and V. Zolotov, IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598; C. Visweswariah, IBM Microelectronics, Hopewell Junction, NY 12533.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2013 ACM 1084-4309/2013/07-ART42 $15.00


ACM Transactions on Design Automation of Electronic Systems, Vol. 18, No. 3, Article 42, Pub. date: July 2013.


sorted set, that is, we have

s[1] ≤ · · · ≤ s[n]. (1)

Then s[i] is called the ith order statistic (i = 1, . . . , n) [David and Nagaraja 2003]. The subject of order statistics deals with the properties and applications of these ordered random variables and of functions involving them, such as the extremes s[1] and s[n], the range W = s[n] − s[1], the extreme deviate (from the sample mean) s[n] − s̄, and, for a random sample from a Gaussian distribution N(μ, σ²), the studentized range W/S_ν, where S_ν is a root-mean-square estimator of σ based on ν degrees of freedom. All these statistics have important and universal applications. We have seen that the extremes can be useful for gamblers. In addition, they can also be used in the statistical study of floods and droughts, in problems of breaking strength and fatigue failure, and in auction theory; the range can be used in product quality control; the extreme deviate is a fundamental detector for outliers; and the studentized range is of critical importance in the ranking of treatment means in analysis of variance situations.

Given the importance of the order statistics, extensive research has been conducted in the past several decades. Monte-Carlo simulation can provide accurate results, but the huge number of samples required renders the method of little practical value. To establish an analytical solution, Khatri [1962] examined the joint distribution of any s[i] and s[j] of independent and identically distributed (i.i.d.) random variables from a discrete parent. Gan and Bain [1995] obtained the distribution of any ith order statistic and also conditional distributions of discrete order statistics from a general discrete parent by tie-runs. Reiss [1989] considered the joint and marginal distributions of the ith order statistic of i.i.d. random variables under both continuous and discontinuous distribution functions. It also considered the distribution of bivariate order statistics by marginal ordering of bivariate i.i.d. random vectors with a continuous distribution function by means of multinomial probabilities of appropriate cell frequency vectors, defining multivariate order statistics by marginal ordering of i.i.d. random vectors with a continuous distribution. Guilbaud [1982] expressed the probability of the functions of independent and not necessarily identically distributed (i.n.n.i.d.) random vectors as a linear combination of probabilities of the functions of i.i.d. random vectors and thus also for order statistics of random variables. Balakrishnan [1994a, 1994b, 1995] and Childs and Balakrishnan [1998, 2006] studied the order statistics for i.n.n.i.d. random variables following some special distributions.

However, all these methods either assume that the random variables are independent, or that they follow some special correlation structures. It is still not clear how to compute order statistics efficiently for general correlated random variables. In fact, we do not even know how to compute the order statistics for n correlated and not necessarily identically distributed Gaussian variables. This is surprising, because Gaussian distributions demonstrate many nice properties and are generally considered to be one of the simplest distributions.

In this article, we propose an elegant recursive formula for computing the order statistics for correlated Gaussian variables, a hitherto open problem, with the possibility of extending it to any type of random variables. By leveraging the recently developed approximation algorithm for statistical min/max operation from the statistical static timing analysis (SSTA) community, we show that the computation of the mth order statistic of n random variables can be done in O(mn) time and O(m + n) space. In addition, we develop a novel data structure to help efficiently update the order statistics. As a vehicle to demonstrate the power of the algorithm, we apply it to path selection in at-speed testing, with consideration of unsensitizable paths (with details to be discussed in Sections 4 and 5). Experimental results show that with the new order statistics method, both the path selection algorithm efficiency and the quality of the selected paths can be improved significantly.

The remainder of the article is organized as follows: Section 2 reviews existing work on order statistics; Section 3 proposes our order statistic algorithms with linear space and time complexity and verifies them against the Monte-Carlo simulation; Section 4 discusses the application of the algorithm to the path selection problem in at-speed testing; the corresponding experimental results are provided in Section 5, and concluding remarks are given in Section 6.

2. PRELIMINARIES

In this section, we briefly review the general method for computing order statistics for i.i.d. random variables.

Suppose s1, . . . , sn are n independent random variables, each following the same cumulative distribution function (cdf) F(t). Let F[r](t) denote the cdf of the rth order statistic s[r]. Then the cdf of the largest order statistic s[n] can be obtained [Weiss 2007] as

$$F_{[n]}(t) = P(s_{[n]} \le t) = P(s_i \le t,\ \forall i) = F^{n}(t). \tag{2}$$

Note that this equality only holds under the assumption of independent random variables.

Similarly, we can get the cdf of the smallest (first) order statistic [Weiss 2007] as

$$F_{[1]}(t) = P(s_{[1]} \le t) = P(s_i \le t,\ \exists i) = 1 - P(s_i > t,\ \forall i) = 1 - (1 - F(t))^{n}. \tag{3}$$

In general, F[r](t) can be obtained [Weiss 2007] as

$$F_{[r]}(t) = P(s_{[r]} \le t) = P(\text{at least } r \text{ of the } s_i \text{ are not greater than } t) = \sum_{i=r}^{n} \binom{n}{i} F^{i}(t)\,(1 - F(t))^{n-i}. \tag{4}$$

The last equality comes from the fact that each term in the summation corresponds to the probability that exactly i of the s1, . . . , sn are not greater than t.
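As a small illustration of the i.i.d. case, the binomial sum for the rth order statistic can be evaluated directly. The sketch below is not part of the original article; the function name `order_stat_cdf` and the sample values are chosen here for illustration only. It checks that the general formula reduces to the expressions for the largest and smallest order statistics.

```python
from math import comb

def order_stat_cdf(F_t: float, n: int, r: int) -> float:
    """CDF of the r-th order statistic of n i.i.d. variables.

    F_t is the common parent cdf F(t) evaluated at the point of interest.
    The sum adds the probabilities that exactly i of the n variables are <= t,
    for i = r, ..., n.
    """
    return sum(comb(n, i) * F_t**i * (1.0 - F_t) ** (n - i) for i in range(r, n + 1))

# Boundary cases (illustrative values, not taken from the article).
F_t, n = 0.3, 5
largest = order_stat_cdf(F_t, n, n)    # reduces to F(t)^n
smallest = order_stat_cdf(F_t, n, 1)   # reduces to 1 - (1 - F(t))^n
```

Because the formula only needs the scalar F(t), it is cheap in the i.i.d. setting; it does not carry over to correlated variables, where the joint distribution matters.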

As can be easily inferred from these derivations, it is impossible to obtain a similar expression for correlated random variables, because we would have to consider the joint distributions between all combinations of these random variables, which quickly becomes computationally infeasible as the number of variables increases or if we need repeated evaluations.

3. ORDER STATISTICS FOR CORRELATED VARIABLES

3.1. General Framework

In this section, we propose a general framework for computing the order statistics of n correlated and not necessarily identically distributed random variables based on statistical min/max operations. For clarity of presentation, we will first discuss how our framework works for Gaussian random variables and then generalize it to arbitrary distributions.



It is well established that the min/max of two Gaussian variables can be linearly approximated as a Gaussian variable through Clark’s formula [Clark 1961; Visweswariah et al. 2004a]. Specifically, for two Gaussian variables s1 and s2 with means μ1, μ2 and variances σ1², σ2², the mean and variance of their maximum s = max(s1, s2) can be derived as

$$\mu_s = \mu_1 Q + \mu_2 (1 - Q) + \theta P; \tag{5}$$

$$\sigma_s^2 = \left(\mu_1^2 + \sigma_1^2\right) Q + \left(\mu_2^2 + \sigma_2^2\right)(1 - Q) + (\mu_1 + \mu_2)\,\theta P - \mu_s^2, \tag{6}$$

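where θ = σ(s1 − s2) is the standard deviation of s1 − s2, and P and Q are the pdf and cdf of a standard normal distribution evaluated at λ = μ(s1 − s2)/σ(s1 − s2), that is, the mean of s1 − s2 divided by its standard deviation:

$$P(\lambda) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{\lambda^2}{2}\right), \tag{7}$$

$$Q(\lambda) = \int_{-\infty}^{\lambda} P(x)\,dx. \tag{8}$$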
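The moment computation in Clark's formula can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the covariance of s1 and s2 is passed in explicitly, and the minimum is obtained through the identity min(s1, s2) = −max(−s1, −s2). The sketch returns only the matched mean and variance; a full SSTA implementation would also propagate correlations with other variables, which is omitted here.

```python
import math

def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def clark_max(mu1, var1, mu2, var2, cov=0.0):
    """Mean and variance of max(s1, s2) for jointly Gaussian s1, s2."""
    theta = math.sqrt(max(var1 + var2 - 2.0 * cov, 0.0))  # std of s1 - s2
    if theta == 0.0:
        # s1 - s2 is deterministic, so the max is the variable with larger mean
        return (mu1, var1) if mu1 >= mu2 else (mu2, var2)
    lam = (mu1 - mu2) / theta
    P, Q = std_normal_pdf(lam), std_normal_cdf(lam)
    mean = mu1 * Q + mu2 * (1.0 - Q) + theta * P
    second = (mu1**2 + var1) * Q + (mu2**2 + var2) * (1.0 - Q) + (mu1 + mu2) * theta * P
    return mean, second - mean**2

def clark_min(mu1, var1, mu2, var2, cov=0.0):
    """min(s1, s2) = -max(-s1, -s2); negation flips means and keeps (co)variances."""
    mean, var = clark_max(-mu1, var1, -mu2, var2, cov)
    return -mean, var

# Two independent standard normal variables: mean 1/sqrt(pi), variance 1 - 1/pi.
m_max, v_max = clark_max(0.0, 1.0, 0.0, 1.0)
m_min, v_min = clark_min(0.0, 1.0, 0.0, 1.0)
```

For two independent standard normal variables the formulas give the known closed-form moments of the maximum, which makes a convenient sanity check.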

For simplicity of presentation, given n random variables s1, . . . , sn, we denote the mth order statistic of the n random variables as $s^{n}_{[m]}$. In this work, we propose an efficient way of computing $s^{n}_{[m]}$ based on the following theorem.

THEOREM 1. Given n random variables s1, s2, . . . , sn (n > 1), the mth order statistic $s^{n}_{[m]}$ is given by

$$s^{n}_{[m]} = \min\left(s^{n-1}_{[m]},\ \max\left(s_n,\ s^{n-1}_{[m-1]}\right)\right). \tag{9}$$

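PROOF. Equivalently, we need to prove that ∀t,

$$P\left(s^{n}_{[m]} \le t\right) = P\left(\min\left(s^{n-1}_{[m]},\ \max\left(s_n,\ s^{n-1}_{[m-1]}\right)\right) \le t\right). \tag{10}$$

Note that

$$P\left(s^{n}_{[m]} \le t\right) = P(\text{at least } m \text{ of the } s_i \text{ are not greater than } t), \tag{11}$$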

which can be further divided into two situations.

(a) At least m of the first n − 1 random variables are already not greater than t. This situation can be described as $s^{n-1}_{[m]} \le t$ with probability $P(s^{n-1}_{[m]} \le t)$.
(b) At least m − 1 of the first n − 1 random variables are not greater than t, and the last random variable is not greater than t. This situation can be described as $s^{n-1}_{[m-1]} \le t$, $s_n \le t$ with probability $P(s^{n-1}_{[m-1]} \le t,\ s_n \le t)$.

Also note that situations a and b are not mutually exclusive, and accordingly,

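$$\begin{aligned} P\left(s^{n}_{[m]} \le t\right) &= P(A) + P(B) - P(A \cap B) \\ &= P\left(s^{n-1}_{[m]} \le t\right) + P\left(\max\left(s^{n-1}_{[m-1]},\ s_n\right) \le t\right) - P\left(\max\left(s^{n-1}_{[m]},\ s^{n-1}_{[m-1]},\ s_n\right) \le t\right). \end{aligned} \tag{12}$$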

On the other hand,

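$$\begin{aligned} P\left(\min\left(s^{n-1}_{[m]},\ \max\left(s_n,\ s^{n-1}_{[m-1]}\right)\right) \le t\right) &= P\left(s^{n-1}_{[m]} \le t \ \text{or}\ \max\left(s_n,\ s^{n-1}_{[m-1]}\right) \le t\right) \\ &= P\left(s^{n-1}_{[m]} \le t\right) + P\left(\max\left(s_n,\ s^{n-1}_{[m-1]}\right) \le t\right) - P\left(\max\left(s^{n-1}_{[m]},\ s_n,\ s^{n-1}_{[m-1]}\right) \le t\right). \end{aligned} \tag{13}$$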



Comparing Eq. (12) with Eq. (13), we can conclude that Eq. (9) holds.
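The recursion of Theorem 1 also holds sample by sample when min and max are evaluated exactly, which gives a quick numerical sanity check. The following sketch is illustrative only (the helper names are hypothetical): it draws random samples and compares the mth smallest value of n numbers with the right-hand side of the recursion evaluated on the first n − 1 numbers and the last one.

```python
import random

def kth_smallest(values, k):
    """k-th smallest (1-indexed) of a list of numbers."""
    return sorted(values)[k - 1]

def theorem1_rhs(values, m):
    """min(s^{n-1}_[m], max(s_n, s^{n-1}_[m-1])) evaluated on one sample."""
    head, last = values[:-1], values[-1]
    return min(kth_smallest(head, m), max(last, kth_smallest(head, m - 1)))

rng = random.Random(0)
mismatches = 0
for _ in range(1000):
    n = rng.randint(3, 8)
    m = rng.randint(2, n - 1)   # keeps both (n-1)-variable order statistics defined
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    if theorem1_rhs(sample, m) != kth_smallest(sample, m):
        mismatches += 1
```

No mismatch is expected, since the three cases (the new value falls below, between, or above the two order statistics of the first n − 1 values) all select the same element on both sides.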

Eq. (9) can recursively reduce the computation of $s^{n}_{[m]}$ to the boundary cases of $s^{k_1}_{[1]}$ (2 ≤ k1 ≤ n) and $s^{k_2}_{[k_2]}$ (2 ≤ k2 ≤ m), which are the minimum and maximum of a set of random variables, respectively, and can be computed recursively again as

$$s^{k_1}_{[1]} = \min(s_1, s_2, \ldots, s_{k_1}) = \min\left(s^{k_1-1}_{[1]},\ s_{k_1}\right); \tag{14}$$

$$s^{k_2}_{[k_2]} = \max(s_1, s_2, \ldots, s_{k_2}) = \max\left(s^{k_2-1}_{[k_2-1]},\ s_{k_2}\right). \tag{15}$$

Based on Theorem 1, we can develop an efficient algorithm to compute all order statistics $s^{n}_{[1]}, \ldots, s^{n}_{[m]}$, as shown in Algorithm 1.

ALGORITHM 1: Find 1st- to mth-Order Statistics of n Correlated Random Variables
INPUT: Random variables s1, . . . , sn, and order m;
OUTPUT: Order statistics $s^{n}_{[1]}, \ldots, s^{n}_{[m]}$;

a[1] = s1;
for i = 2; i ≤ m; i++ do
    a[i] = max(a[i − 1], s_i);
end for
for j = 2; j ≤ n − m + 1; j++ do
    a[1] = min(a[1], s_j);
    for i = 2; i ≤ m; i++ do
        a[i] = min(a[i], max(a[i − 1], s_{i+j−1}));
    end for
end for
for j = n − m + 2; j ≤ n; j++ do
    a[1] = min(a[1], s_j);
    for i = 2; i ≤ n − j + 1; i++ do
        a[i] = min(a[i], max(a[i − 1], s_{i+j−1}));
    end for
end for
for i = 1; i ≤ m; i++ do
    $s^{n}_{[i]}$ = a[i];
end for
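A compact way to experiment with the recursion is to write it with pluggable two-argument min/max operators. The sketch below is an illustration, not the authors' implementation: the name `order_statistics` and the `smin`/`smax` parameters are assumptions made here, the two outer loops of the pseudocode are merged into one loop with a bounded inner range, and indices are 0-based. With Python's built-in `min` and `max` the result is exact for plain numbers; a statistical min/max (for example one based on Clark's formula) would be passed in for random variables.

```python
def order_statistics(s, m, smin=min, smax=max):
    """Return [s^n_[1], ..., s^n_[m]] for the n inputs in `s`.

    After stage j, a[i] holds the (i+1)-th order statistic of the first
    i + j + 1 inputs, so only m intermediate results are kept at a time.
    """
    n = len(s)
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    a = [s[0]]
    for i in range(1, m):                   # diagonal boundary: max of first i+1 inputs
        a.append(smax(a[i - 1], s[i]))
    for j in range(1, n):
        a[0] = smin(a[0], s[j])             # boundary: min of first j+1 inputs
        for i in range(1, min(m, n - j)):   # stop once input index i + j would pass n - 1
            a[i] = smin(a[i], smax(a[i - 1], s[i + j]))
    return a

# With exact min/max the output equals the m smallest values in ascending order.
example = order_statistics([3.0, 1.0, 2.0, 5.0, 0.5, 4.0], 3)
```

The inner loop runs at most m − 1 times per stage, so the number of min/max operations is O(mn), in line with the complexity stated for Algorithm 1.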

The idea behind the algorithm can be understood with a simple example, as shown in Figure 1 with n = 6 and m = 3. To compute $s^{6}_{[3]}$ according to Eq. (9), we need two inputs $s^{5}_{[3]}$ and $s^{5}_{[2]}$ and two min/max operations; this holds similarly for $s^{5}_{[3]}$ and $s^{5}_{[2]}$. By organizing the data needed for each computation into a directed graph and sharing intermediate results among different computations, we obtain the pattern in Figure 1. Each node’s two input nodes and edges correspond to the two inputs and two min/max operations, respectively, which are needed to compute the results to be stored at that node according to Eq. (9) (or Eqs. (14) and (15) for those nodes with one input edge). By performing the computation along the dashed lines bottom-up and only keeping those results necessary for the next level of computation, we can compute all required order statistics ($s^{6}_{[3]}$, $s^{6}_{[2]}$, and $s^{6}_{[1]}$) in linear time O(nm) and space O(n + m).

3.2. Selective Update

It may also be of practical interest to seek the new mth order statistic if one of the random variables needs to be replaced with a new random variable. If we know which random variable needs to be replaced, such an update has a complexity of O(mn) by calling Algorithm 1.



Fig. 1. Computation of order statistics.

However, in many applications, we would like to know which random variable to replace in order to obtain the best mth order statistic in a certain sense (e.g., smallest deviation or smallest mean). We refer to this kind of operation as selective update in this article. A naive approach for selective update would go through each random variable, replace it with the new one, and compute the updated mth order statistic. Such an operation has a complexity of O(mn²). If the number of variables, n, is large and such an operation is frequent, then the complexity is too high. In the following, we propose a more efficient way for selective update of order statistics with a complexity that is linear in n.

We utilize a binary tree data structure similar to that discussed in Zolotov et al. [2010], with the following major modifications. For the retained n slacks, we organize them into a balanced binary tree with each leaf node corresponding to one random variable si. At every node D in the tree, we store two sets of auxiliary random variables: the node order statistics set $S_D$ and the complement node order statistics set $\bar{S}_D$. The former contains the 1st- to mth-order statistics of the random variables in all its downstream leaf nodes, while the latter contains the 1st- to mth-order statistics of all random variables that are not in its downstream leaf nodes.

We denote $g_m(s_1, \ldots, s_r)$ as the set of 1st- to mth-order statistics from the algorithm, as illustrated in Figure 1, that is,

$$g_m(s_1, \ldots, s_r) = \begin{cases} s^{r}_{[1]}, \ldots, s^{r}_{[m]}, & \text{if } r \ge m; \\ s^{r}_{[1]}, \ldots, s^{r}_{[r]}, & \text{otherwise.} \end{cases} \tag{16}$$

Then we can compute the node order statistics set $S_D$ and the complement node order statistics set $\bar{S}_D$ as follows.

$$S_D = g_m\left(S_{D_{left}} \cup S_{D_{right}}\right), \tag{17}$$

$$\bar{S}_D = g_m\left(\bar{S}_{D_{parent}} \cup S_{D_{sibling}}\right), \tag{18}$$

where $D_{left}$, $D_{right}$, $D_{parent}$, and $D_{sibling}$ are node D’s left child, right child, parent, and sibling nodes, respectively.

To fill up the binary tree with proper node information, we need two traversals of this tree. The first traversal is bottom-up from leaves to the root by computing all the node order statistics sets via Eq. (17); the second traversal is top-down from the root to leaves by computing all the complement node order statistics sets via Eq. (18). As the input to $g_m(\cdot)$ is limited to 2m random variables (binary tree), the complexity of constructing the binary tree is O(nm²).



Table I. Comparison of Order Statistics from Our Algorithm and Monte-Carlo Simulation for 1,000 Random Variables and mth Order Statistic

| m | Monte-Carlo μ | Monte-Carlo σ | Monte-Carlo runtime (sec) | Our method μ (diff) | Our method σ (diff) | Our method runtime (sec) (speedup) |
|---|---|---|---|---|---|---|
| 1 | 0.472 | 0.119 | 184.5 | 0.471 (−0.2%) | 0.117 (−0.2%) | 0.123 (687×) |
| 2 | 0.633 | 0.235 | 192.6 | 0.631 (−0.3%) | 0.234 (−0.4%) | 0.174 (1106×) |
| 3 | 0.749 | 0.200 | 204.8 | 0.752 (0.4%) | 0.200 (0.0%) | 0.202 (1014×) |
| 4 | 0.852 | 0.197 | 221.4 | 0.851 (−0.1%) | 0.196 (−0.5%) | 0.221 (1002×) |
| 5 | 0.911 | 0.392 | 248.0 | 0.911 (0.0%) | 0.331 (0.6%) | 0.257 (965×) |

Once the binary tree has been constructed, the new order statistics, after replacing one random variable si (in a leaf node) with a new random variable s̃, can be updated in O(m²) time by calling $g_m(\bar{S}_i \cup \tilde{s})$ with the complement leaf order statistics set $\bar{S}_i$ already known: the mth order statistic of $g_m(\bar{S}_i \cup \tilde{s})$ gives the updated $s^{n}_{[m]}$. To decide which random variable should be replaced to achieve the best mth order statistic, we only need to loop through all the leaf nodes once; hence, the complexity is O(nm²).

Once we find out the random variable to be replaced, we need to update the random variable in the corresponding leaf node with the new one and then do a bottom-up traversal from this leaf node to the root in order to update the node order statistics set via Eq. (17) and a top-down traversal from the root to all leaves to update the complement node order statistics set via Eq. (18). Again, the complexity is O(nm²).

To summarize, by utilizing the binary tree type of data structure and maintaining proper node order statistics sets and complement node order statistics sets, we have reduced the complexity of selective update from O(n²m) to O(nm²). This is of great importance, especially in the applications where the number of random variables n is much larger than the order of interest m. The example to be discussed in Section 4 will illustrate clearly the advantage of such a method.
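The two-pass construction and the leaf-by-leaf query can be sketched on plain numbers, where the m smallest values of a union can be taken exactly from the m smallest values of each part. Everything below is a simplified illustration under stated assumptions: `g` stands in for the order statistics routine and simply sorts, the class and function names are hypothetical, and "best" is taken to mean the largest updated mth order statistic. With random variables, `g` would instead apply the recursive algorithm with statistical min/max operations.

```python
def g(values, m):
    """Stand-in for g_m: the m smallest values in ascending order."""
    return sorted(values)[:m]

class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi       # leaf index range [lo, hi)
        self.left = self.right = None
        self.stats = []                 # order statistics of the leaves below this node
        self.comp = []                  # order statistics of all other leaves

def build(s, m, lo=0, hi=None):
    """Bottom-up pass: combine the children's sets."""
    hi = len(s) if hi is None else hi
    node = Node(lo, hi)
    if hi - lo == 1:
        node.stats = [s[lo]]
    else:
        mid = (lo + hi) // 2
        node.left, node.right = build(s, m, lo, mid), build(s, m, mid, hi)
        node.stats = g(node.left.stats + node.right.stats, m)
    return node

def fill_complements(node, m):
    """Top-down pass: complement = parent's complement plus sibling's set."""
    if node.left is None:
        return
    node.left.comp = g(node.comp + node.right.stats, m)
    node.right.comp = g(node.comp + node.left.stats, m)
    fill_complements(node.left, m)
    fill_complements(node.right, m)

def leaves(node):
    if node.left is None:
        yield node
    else:
        yield from leaves(node.left)
        yield from leaves(node.right)

def best_replacement(s, new_value, m):
    """(updated m-th order statistic, leaf index) maximizing the updated statistic."""
    root = build(s, m)
    fill_complements(root, m)           # the root's complement set stays empty
    return max((g(leaf.comp + [new_value], m)[m - 1], leaf.lo) for leaf in leaves(root))

values = [5.0, 1.0, 4.0, 2.0, 3.0, 0.5, 6.0]
best_value, best_index = best_replacement(values, 10.0, 3)
```

Each call to `g` sees at most 2m values, so building the tree and scanning all leaves touches O(n) small sets rather than recomputing over all n variables for every candidate.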

3.3. Validation against Monte-Carlo Simulation

In this section, we compare the results obtained from the proposed algorithm with those from the Monte-Carlo simulation (100K trials) on Gaussian random variables. The 100K trials are necessary in order to achieve a 0.1% relative error bound based on our experiments. When performing the min/max operations, we use the analytical formula discussed in Visweswariah et al. [2004a].

We first study the runtime and accuracy comparison for different order statistics m on 1,000 correlated Gaussian random variables. To create those random variables, we first generate 30 independent Gaussian variables with mean and variance randomly selected between 0 and 1. We then use linear combinations of those independent Gaussian variables, with the coefficients randomly chosen between 0 and 1, to generate the 1,000 correlated Gaussian random variables. The results are reported in Table I. From the table, we can see that the reported relative error of the distributions obtained from our method is at most 0.6% in both mean (μ) and standard deviation (σ). In terms of runtime, the reported speedup over the Monte-Carlo simulation ranges from 687× to 1106×.
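The experimental setup described above can be mimicked at a small scale. The sketch below is an assumption-laden illustration rather than the authors' test harness: the function names are hypothetical, the sizes (50 variables, 1,000 trials) are reduced so that pure Python finishes quickly, and only the Monte-Carlo reference side is shown.

```python
import random
import statistics

def make_linear_forms(n_vars, n_sources, rng):
    """Each correlated variable is sum_k c[k] * g_k with c[k] drawn from [0, 1)."""
    return [[rng.random() for _ in range(n_sources)] for _ in range(n_vars)]

def monte_carlo_order_stat(coeffs, src_mean, src_std, m, trials, rng):
    """Sample mean and standard deviation of the m-th smallest variable."""
    picks = []
    for _ in range(trials):
        sources = [rng.gauss(mu, sd) for mu, sd in zip(src_mean, src_std)]
        values = [sum(c * x for c, x in zip(row, sources)) for row in coeffs]
        picks.append(sorted(values)[m - 1])
    return statistics.fmean(picks), statistics.stdev(picks)

rng = random.Random(1)
n_sources = 30
src_mean = [rng.random() for _ in range(n_sources)]          # means in [0, 1)
src_std = [rng.random() ** 0.5 for _ in range(n_sources)]    # variances in [0, 1)
coeffs = make_linear_forms(50, n_sources, rng)                # reduced from 1,000 variables
mean3, std3 = monte_carlo_order_stat(coeffs, src_mean, src_std, 3, 1000, rng)
```

Sharing the same 30 sources across all linear forms is what introduces correlation between the generated variables.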

Similarly, we also study the runtime and accuracy comparison for different numbers of Gaussian random variables for the third order statistic and report the results in Table II. From the table, we can see that our method results in a relative error of at most 0.4% in mean and standard deviation while achieving over 800× speedup over the Monte-Carlo simulation.

Finally, we verify the accuracy of our selective update algorithm against Monte Carlo simulation. We iteratively update one random variable in 1,000 correlated random variables, find the 15th order statistic, and compare it with the Monte Carlo trials to



Table II. Comparison of Order Statistics from Our Algorithm and Monte-Carlo Simulation for n Random Variables and 3rd Order Statistic

| n | Monte-Carlo μ | Monte-Carlo σ | Monte-Carlo runtime (sec) | Our method μ (diff) | Our method σ (diff) | Our method runtime (sec) (speedup) |
|---|---|---|---|---|---|---|
| 1000 | 0.381 | 0.250 | 210.5 | 0.380 (−0.3%) | 0.249 (−0.4%) | 0.203 (1037×) |
| 2000 | 0.495 | 0.386 | 367.4 | 0.493 (−0.4%) | 0.386 (0.0%) | 0.390 (942×) |
| 3000 | 0.638 | 0.671 | 488.5 | 0.638 (0.0%) | 0.670 (−0.1%) | 0.512 (954×) |
| 4000 | 0.971 | 0.245 | 634.8 | 0.969 (−0.2%) | 0.245 (0.0%) | 0.779 (815×) |
| 5000 | 1.324 | 0.397 | 751.3 | 1.324 (0.0%) | 0.397 (0.0%) | 0.906 (829×) |

Fig. 2. Relative error distributions of μ and σ from selective update.

find the relative error in mean and standard deviation. As such, we can obtain 1,000 relative errors in mean and standard deviation. We depict the corresponding histogram in Figure 2. From the figure, we can see that the maximum relative error (i.e., the x-label of the rightmost two rectangles) is always smaller than 1%.

3.4. Generalization to Non-Gaussian Random Variables

The preceding discussion establishes that the proposed framework can be applied to correlated Gaussian variables. It can also be extended easily to the general case of correlated and not necessarily identically distributed random variables of arbitrary distributions.

The proposed order statistics framework depends on the basic operations of min/max(s1, s2). As long as such operations can be done, our framework can be applied. Clark’s formula (Eqs. (5) and (6)) for correlated Gaussian variables is probably the most commonly used. For general correlated non-Gaussian variables, if similar analytical results exist for their min/max operations, our framework can be directly used. Otherwise, we can apply a power transform [Carroll and Ruppert 1981] to convert the variables to normal distributions. One of the most commonly used power transforms is the Box-Cox transform [Nishii 2001], which takes the following form.

$$s'_i = \begin{cases} \dfrac{s_i^{\lambda} - 1}{\lambda}, & \text{if } \lambda \ne 0; \\ \log(s_i), & \text{if } \lambda = 0, \end{cases} \tag{19}$$

where si is the original random variable and λ is a properly selected number so that the transformed variable s′i is approximately Gaussian. The same λ needs to be used for all the random variables so that their relative order will not change. If we know the joint distribution of the original random variables, it is easy to derive that of the new set of Gaussian variables after the transform, and thus their mean, variance, and covariance information. We then apply our framework to get the order statistics for this new set of variables, after which the inverse Box-Cox transform can be used to get the distribution of the order statistics of the original random variables.
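The transform and its inverse are short enough to write out. The sketch below is illustrative (function names are hypothetical, and the inputs are assumed positive, as the Box-Cox transform requires); it also shows that using one common λ preserves the ordering of the values, which is the property the framework relies on.

```python
import math

def box_cox(s: float, lam: float) -> float:
    """Box-Cox transform of a positive value s with parameter lam."""
    if s <= 0.0:
        raise ValueError("Box-Cox requires s > 0")
    return math.log(s) if lam == 0.0 else (s**lam - 1.0) / lam

def inverse_box_cox(y: float, lam: float) -> float:
    """Inverse transform; assumes lam * y + 1 > 0 when lam != 0."""
    return math.exp(y) if lam == 0.0 else (lam * y + 1.0) ** (1.0 / lam)

samples = [0.5, 2.0, 1.2, 3.7]            # illustrative positive values
lam = 0.5                                  # illustrative choice of the common parameter
transformed = [box_cox(s, lam) for s in samples]
recovered = [inverse_box_cox(y, lam) for y in transformed]

# The same monotone map is applied to every value, so the ranking is unchanged.
rank_before = sorted(range(len(samples)), key=samples.__getitem__)
rank_after = sorted(range(len(samples)), key=transformed.__getitem__)
```

Because the map is strictly increasing for any fixed λ, the mth smallest transformed value corresponds to the mth smallest original value.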

4. APPLICATION TO AT-SPEED TESTING

In this section, we use the path selection problem in at-speed testing to demonstrate how the proposed algorithm can be applied in such a context. It is understood, however, that the application of the algorithm should not be limited to one problem or one field. Indeed, it can be useful for all fields where order statistics are of interest.

4.1. Overview

Modern high-performance chips coming off the manufacturing line are tested at-speed, that is, a set of paths is selected to undergo at-speed tests to identify “bad” chips in which one or more of the selected paths fail timing requirements [Iyengar et al. 2006]. Selection of these paths is complicated by the presence of process variations, as different paths can be critical in different chips manufactured under different process conditions [Wang et al. 2004]. Selection of paths to cover various process conditions is important to ensure high quality of testing [Iyengar et al. 2007].

Most existing work targets transition delay faults that are large and localized changes of delay [Bushnell and Agrawal 2000], and they tend to find short paths through those fault sites rather than critical paths. Delay defects resulting from process variations are, however, small and distributed, and it is the accumulated delay variation along the signal propagation path that causes critical paths to fail. Sharma and Patel [2002] proposed finding the longest paths through each fault site, but this was based on a deterministic delay model. Other works [Liou et al. 2002; Qiu and Walker 2003; Wang et al. 2004; Iyengar et al. 2007] tried to take process variation into account, but their approaches either suffer from high algorithmic complexity or are inaccurate due to simplified statistical timing models. Thus existing path selection techniques are not effective in detecting delay faults caused by process variations.

Recently, Zolotov et al. [2010] proposed a metric called the test quality metric (TQM) for at-speed testing. Such a metric provides a measure of process space coverage. Guided by such a metric, the authors proposed a branch-and-bound path selection algorithm to find a set of paths to maximize TQM. There is a major drawback, however: the TQM metric is a single-layer process space coverage metric, which gives rise to two main objections. First, the statistical timing model is not perfect, and any model inaccuracy can easily cause the loss of process space coverage. Second, not all selected paths are sensitizable. In fact, as pointed out by Cheng et al. [1993], a large number of paths are not sensitizable. Because a path’s sensitizability is known only after an expensive procedure called ATPG (automatic test pattern generation), even if the TQM of the set of selected paths is high, many paths may turn out to be unsensitizable after ATPG, and the post-ATPG TQM may be significantly reduced.

A natural solution to the problem is to enforce that every point in the multidimensional space of process variations is covered multiple times. Towards this, in the remainder of this section we will first define a new metric, called multilayer process space coverage (mPCM), to quantitatively measure this multilayer process space coverage property for any given set of paths. We show that direct computation of such a metric has exponential complexity. A novel observation is made to relate the computation of mPCM to the order statistics of a set of correlated random variables. With the algorithm developed in the previous section, we are able to compute mPCM efficiently.



Then, under the guidance of mPCM, we extend the branch-and-bound path selection algorithm [Zolotov et al. 2010] to perform path selection. All these techniques enable us to optimally and efficiently select a set of paths to achieve the best multilayer process space coverage. With the proposed method, the subspace that is not covered by any sensitizable paths is likely to be reduced. As such, the number of iterations between path selection and ATPG, and thus the design turnaround time, can be reduced.

4.2. Review of Existing Works

Given a set of paths Π = {π1, π2, . . . , πn}, the test quality metric (TQM) of this set of paths [Zolotov et al. 2010] is defined as the probability that a tested chip has no timing violation conditional upon all paths in Π passing at-speed testing, that is,

$$Q(\Pi) = P(S_c \ge 0 \mid S_{\Pi} \ge 0), \tag{20}$$

where $S_c$ is the statistical chip slack, and $S_{\Pi}$ is the statistical slack of the corresponding path set Π, that is,

$$S_{\Pi} = \min(s_1, \ldots, s_n), \tag{21}$$

where si is the statistical slack of the corresponding path πi. Clearly, the larger the Q(Π), the better the testing quality, hence the better the path selection.

It has also been shown [Zolotov et al. 2010] that TQM Q(Π) is related to PCM (process space coverage metric) q(Π) through

$$Q(\Pi) = \frac{P(S_c \ge 0)}{1 - q(\Pi)}, \tag{22}$$

with PCM being defined as

$$q(\Pi) = P(S_{\Pi} \le 0) = P(\min(s_1, \ldots, s_n) \le 0). \tag{23}$$

For a given design, chip slack $S_c$ does not depend on the paths selected for testing. Hence maximizing TQM Q(Π) in Eq. (22) is equivalent to maximizing PCM q(Π) in Eq. (23).
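The relation between TQM and PCM can be checked numerically on a toy model. The sketch below rests on assumptions that are not taken from the article: the chip slack is modeled as the minimum slack over all paths of a small hypothetical design, each path slack is a simple Gaussian with one shared and one private component, and the function name and parameter values are illustrative.

```python
import random

def tqm_pcm_check(n_paths=20, selected=(0, 3, 7), trials=5000, seed=0):
    """Estimate PCM, TQM from its definition, and TQM from the PCM relation."""
    rng = random.Random(seed)
    chip_ok = sel_ok = both_ok = 0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0)                 # common process variation (assumed)
        slacks = [0.8 + 0.5 * shared + 0.5 * rng.gauss(0.0, 1.0) for _ in range(n_paths)]
        s_chip = min(slacks)                         # assumed: chip slack = min over all paths
        s_sel = min(slacks[i] for i in selected)     # slack of the selected path set
        chip_ok += s_chip >= 0
        sel_ok += s_sel >= 0
        both_ok += (s_chip >= 0) and (s_sel >= 0)
    q = 1.0 - sel_ok / trials                        # PCM: P(min of selected slacks <= 0)
    tqm_direct = both_ok / sel_ok                    # conditional probability from the definition
    tqm_from_pcm = (chip_ok / trials) / (1.0 - q)    # P(S_c >= 0) / (1 - q)
    return q, tqm_direct, tqm_from_pcm

q_est, tqm_direct, tqm_from_pcm = tqm_pcm_check()
```

Under this model a non-negative chip slack implies a non-negative slack for every selected path, so the two TQM estimates coincide up to rounding, and a larger q directly yields a larger TQM.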

Under the guidance of PCM (or equivalently TQM), Zolotov et al. [2010] proposed a branch-and-bound path selection algorithm to select the top n paths that give the largest (best) TQM among billions of paths from the design.

The algorithm traverses the timing graph in a depth-first manner, and the efficiency is achieved through an effective pruning strategy. Zolotov et al. [2010] proved that there exists an upper bound for PCM

$$q(\Pi) \le P\left(\min\left(s_1, \ldots, s_{i-1}, s_{\tilde{\Pi}}, s_{i+1}, \ldots, s_n\right) \le 0\right), \tag{24}$$

where slack si of path πi is replaced with slack $s_{\tilde{\Pi}}$ of a set of paths $\tilde{\Pi}$ with $\pi_i \subseteq \tilde{\Pi}$. In the context of path selection, the set of paths $\tilde{\Pi}$ corresponds to the partial path segment. During traversal of the partial path, any of its branches can be pruned immediately if its upper-bound metric is worse than the current metric, as there is no way the metric can be improved by continuing path search through that branch.

4.3. Multilayer Process Space Coverage Metric

4.3.1. Metric Definition. In statistical timing, all timing quantities, such as slack s, are represented as functions (e.g., linear [Visweswariah et al. 2004a] or quadratic [Zhang et al. 2005]) of the underlying process parameters ΔX, that is, s = F(ΔX), where ΔX is a k×1 vector containing normalized Gaussian random variables to model the variation of process parameters, including chip-to-chip, within-chip, and local random variations. The entire k-dimensional space spanned by ΔX is also called the process space and is denoted by $\Omega = \{\Delta X \mid \Delta X \in \mathbb{R}^k\}$.



[Figure: regions A to F of a two-dimensional process space, delimited by the boundaries of s1 < 0, s2 < 0, and s3 < 0.]

Fig. 3. Example of process space coverage.

The meaning of process space coverage and its metric PCM (Eq. (23)) is better explained by defining a mapping from the path slack si to a subspace ωi ⊆ Ω as

$$\omega_i = \{\Delta X \mid s_i = F_i(\Delta X) \le 0\}. \tag{25}$$

For a set of n paths, its corresponding subspace $\omega^{(1)}_n \subseteq \Omega$ is the union of subspaces defined by each individual path, that is,

$$\omega^{(1)}_n = \omega_1 \cup \ldots \cup \omega_n. \tag{26}$$

In other words, a set of paths defines (or covers) a subspace of the entire process space such that by testing this path set at-speed, we could sort out all bad chips manufactured under those process conditions. The corresponding PCM, as defined in Eq. (23), can thus be equally interpreted as

$$q(\Pi) = \frac{\left|\omega^{(1)}_n\right|}{|\Omega|}, \tag{27}$$

where | · | is a Lebesgue measure (i.e., probability-weighted area) [Dudley 2002] of the process space.

Figure 3 shows a small example with three path slacks s1, s2, and s3 in a two-dimensional process space. Within the three-sigma circular region, the subspace covered by slack s1 is C ∪ D ∪ E, the subspace covered by s2 is B ∪ C ∪ D ∪ F, and the subspace covered by s3 is A ∪ B ∪ C. In other words, subspaces A, E, and F are only covered by one slack. Subspaces B and D are covered by two slacks, and subspace C is covered by all three slacks. In general, if a subspace is covered by m slacks, we say that it has m-layer coverage. In this example, A, E, and F have one-layer coverage, B and D have two-layer coverage, and C has three-layer coverage.

For the same reason, we call the PCM [Zolotov et al. 2010] given by Eqs. (26) and (27) a single-layer process space coverage metric (sPCM) and the corresponding TQM a single-layer TQM (sTQM), because every point in the space is counted towards $\omega_n^{(1)}$ as long as at least one path covers that point. In other words, sPCM does not distinguish between subspaces that are covered by only one path (such as A in Figure 3) and those covered by multiple paths (such as C in Figure 3).

The so-defined sPCM is not desirable for path selection for at-speed testing. For example, if one of the paths cannot be sensitized, the subspace that is covered only by that path will be left untested. On the other hand, if we guarantee that all subspaces are covered multiple times (e.g., at least twice, as for B, C, and D in Figure 3), then even if some paths turn out to be unsensitizable after ATPG, their subspaces would likely still be covered by the remaining sensitizable paths. In other words, we want to find paths that maximize the multiply covered area. To capture this property, we propose a new metric called the multilayer process space coverage metric (mPCM) in the following.



We first note that the subspace covered $m$ times by a set of $m$ paths is given by

$$\omega_m^{(m)} = \omega_1 \cap \ldots \cap \omega_m. \qquad (28)$$

For a set of $n$ paths $\Pi = \{\pi_1, \pi_2, \ldots, \pi_n\}$ with the corresponding coverage subspaces $\{\omega_1, \omega_2, \ldots, \omega_n\}$, the subspace covered by this set of paths at least $m$ times, $\omega_n^{(m)} \subseteq \Omega$, is the union of subspaces $\omega_{m,i}^{(m)}$, each of which is covered $m$ times by a set of $m$ paths that is itself a subset of $\Pi$. There are a total of $\binom{n}{m}$ such $\omega_{m,i}^{(m)}$ (or equivalently, subsets of size $m$ from $\Pi$). Mathematically, this is given by

$$\omega_n^{(m)} = \bigcup_{\forall i \in \binom{n}{m}\ \text{combinations}} \omega_{m,i}^{(m)}. \qquad (29)$$

Similar to Eq. (27), the multilayer process space coverage metric (mPCM) is thus defined as

$$q^{(m)}(\Pi) = \frac{|\omega_n^{(m)}|}{|\Omega|}. \qquad (30)$$

Because for $m = 1$, Eqs. (29) and (30) give the same results as Eqs. (26) and (27), sPCM can be interpreted as a special case of mPCM.

For the example shown in Figure 3, to find the subspace covered by $n = 3$ path slacks at least $m = 2$ times, we can find the union of the subspace covered by both $s_1$ and $s_2$ (i.e., $C \cup D$), the subspace covered by both $s_1$ and $s_3$ (i.e., $C$), and the subspace covered by both $s_2$ and $s_3$ (i.e., $C \cup B$). The result is $B \cup C \cup D$.
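The enumeration in Eq. (29) can be replayed on the Figure 3 example by treating each lettered region as a set element. The sketch below is only an illustration on discrete regions; the function name `m_layer_subspace` is hypothetical, and the region memberships are copied from the description of Figure 3.

```python
from itertools import combinations

# Regions of the three-sigma circle covered by each slack in Figure 3.
omega = {
    "s1": {"C", "D", "E"},
    "s2": {"B", "C", "D", "F"},
    "s3": {"A", "B", "C"},
}

def m_layer_subspace(subspaces, m):
    """Union over all size-m subsets of the intersection of their subspaces (Eq. (29))."""
    result = set()
    for subset in combinations(subspaces, m):
        result |= set.intersection(*subset)
    return result

# Subspace covered at least m times, for m = 1, 2, 3.
layers = {m: m_layer_subspace(list(omega.values()), m) for m in (1, 2, 3)}
```

For $m = 1$ the result is the plain union of Eq. (26), which is why sPCM appears as the one-layer special case.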

4.3.2. Relationship to Order Statistics. From the process space coverage point of view, we have defined mPCM, as shown in Eq. (30). This definition is convenient for conceptual understanding, but not for computation. Therefore, we have the following lemma to help us perform the computation.

LEMMA 1. For a set of slacks $\{s_1, s_2, \ldots, s_m\}$ and its corresponding coverage subspaces $\{\omega_1, \omega_2, \ldots, \omega_m\}$, the union of these subspaces can be represented by the statistical minimum of all slacks, that is,

$$\omega_1 \cup \ldots \cup \omega_m = \{\vec{X} \mid \min(s_1, \ldots, s_m) \le 0\}, \qquad (31)$$

and the intersection of these subspaces can be represented by the statistical maximum of all slacks, that is,

$$\omega_1 \cap \ldots \cap \omega_m = \{\vec{X} \mid \max(s_1, \ldots, s_m) \le 0\}. \qquad (32)$$

In other words, the union operation on subspaces corresponds to the statistical minimum operation on slacks, while the intersection operation on subspaces corresponds to the statistical maximum operation on slacks.
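The lemma is stated without proof; one way to see it, using only the definition of $\omega_i$ in Eq. (25), is the following pointwise argument for a fixed process point $\vec{X}$ (a sketch, not the authors' own derivation):

```latex
\begin{align*}
\vec{X} \in \omega_1 \cup \ldots \cup \omega_m
  &\iff \exists\, i:\ s_i(\vec{X}) \le 0
   \iff \min(s_1(\vec{X}), \ldots, s_m(\vec{X})) \le 0, \\
\vec{X} \in \omega_1 \cap \ldots \cap \omega_m
  &\iff \forall\, i:\ s_i(\vec{X}) \le 0
   \iff \max(s_1(\vec{X}), \ldots, s_m(\vec{X})) \le 0.
\end{align*}
```

The smallest slack is non-positive exactly when some slack is, and the largest slack is non-positive exactly when all of them are.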

Since we know how to perform the statistical minimum and maximum operations as in any statistical timer,¹ according to Eqs. (31) and (32), the subspace in Eq. (29) can be obtained as

$$\omega_n^{(m)} = \{\vec{X} \mid f_{n,m}(s_1, \ldots, s_n) \le 0\}, \qquad (33)$$

with slack $f_{n,m}$ as a function of $s_1, \ldots, s_n$ given by

$$f_{n,m} = \min_{\forall i \in \binom{n}{m}\ \text{combinations}} \bigl(\max(s_{1,i}, \ldots, s_{m,i})\bigr), \qquad (34)$$

where $\{s_{1,i}, \ldots, s_{m,i}\}$ is one of the $\binom{n}{m}$ combinations from $\{s_1, s_2, \ldots, s_n\}$. Therefore, rather than using Eq. (30), we compute mPCM as follows.

$$q^{(m)}(\Pi) = P(f_{n,m}(s_1, \ldots, s_n) \le 0). \qquad (35)$$

Computing sPCM according to Eq. (23) is straightforward, but computation of mPCM according to Eq. (35) is not, because the latter requires enumeration of all $\binom{n}{m}$ combinations, which is on the order of exponential complexity $O((\frac{n}{m})^m)$.

¹ All derivations shown in this work are exact as long as the statistical min/max operations are exact. Only when approximated min/max operations are used (e.g., [Visweswariah et al. 2004a; Zhang et al. 2005]) will our computation incur some approximation error.

A closer examination of the metric leads us to the following theorem, which builds the link between the metric and order statistics.

THEOREM 2. The random variable $f_{n,m}$ as defined in Eq. (34) is the same as the $m$th-order statistic of the $n$ input random variables $s_1, \ldots, s_n$ ($n > 1$), that is,

$$f_{n,m}(s_1, \ldots, s_n) = s_{n[m]}. \qquad (36)$$

For example, when $m = 1$ or $m = n$, we have

$$f_{n,1} = \min(s_1, s_2, \ldots, s_n) = s_{n[1]}, \qquad (37)$$

$$f_{n,n} = \max(s_1, s_2, \ldots, s_n) = s_{n[n]}. \qquad (38)$$

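PROOF. From Eq. (34), we can easily see that $f_{n,m} \le t$ is equivalent to the statement that there is at least one $i$ such that $\max(s_{1,i}, \ldots, s_{m,i}) \le t$, which in turn is equivalent to the statement that there are at least $m$ of the $s_i$ that are not greater than $t$. The latter is exactly the event $s_{n[m]} \le t$; since this holds for every $t$, the two random variables coincide.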
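Theorem 2 can also be checked numerically on sampled values. The sketch below compares the enumeration of Eq. (34) with a direct sort on small, correlated, non-identically distributed samples; the sampling model (a shared component plus an independent one and a per-path offset) is an arbitrary assumption chosen only to avoid the i.i.d. case.

```python
import random
from itertools import combinations

def f_enumerated(slacks, m):
    """Eq. (34): minimum over all size-m subsets of the subset maximum."""
    return min(max(subset) for subset in combinations(slacks, m))

def order_statistic(slacks, m):
    """The m-th smallest value (1-indexed), i.e. the m-th order statistic."""
    return sorted(slacks)[m - 1]

rng = random.Random(2)
agree = True
for _ in range(500):
    n = rng.randint(2, 7)
    # Correlated samples: a shared component plus an independent one per path.
    shared = rng.gauss(0.0, 1.0)
    slacks = [0.6 * shared + 0.8 * rng.gauss(0.0, 1.0) + 0.3 * i for i in range(n)]
    for m in range(1, n + 1):
        agree = agree and f_enumerated(slacks, m) == order_statistic(slacks, m)
```

The enumeration is only practical for small $n$, which is the motivation for the recursive formulation discussed next.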

This theorem also gives us another intuitive explanation of Theorem 1, in the context of process coverage. Let us first look at a simple example with $n = 3$. Assume that we want to find mPCM for $m = 2$; then $f_{n,m}$ in Eq. (34) is given by

$$f_{3,2} = \min(\max(s_1, s_2), \max(s_1, s_3), \max(s_2, s_3)). \qquad (39)$$

When $s_1$, $s_2$, and $s_3$ are deterministic values, it is easily seen that $f_{3,2}$ gives the second smallest number among the inputs. When $s_1$, $s_2$, and $s_3$ are random variables, $f_{3,2}$ is the second-order statistic of the input set. Generally, from our understanding of the mapping between slacks $s_i$ and process space coverage $\omega_i$, we can see that $f_{n,m}$ corresponds to the space covered by the $\omega_i$ at least $m$ times, as shown in Eq. (29). Then instead of enumerating all $\binom{n}{m}$ combinations, the same $m$-layer subspace is obtained by the union of two subspaces: (1) the subspace covered by the first $n - 1$ slacks at least $m$ times, and (2) the intersection of the subspace covered by the last slack $s_n$ and the subspace covered by the first $n - 1$ slacks at least $m - 1$ times. Mathematically, this can be formalized as

$$\omega_n^{(m)}(s_1, \ldots, s_n) = \omega_{n-1}^{(m)}(s_1, \ldots, s_{n-1}) \cup \left(\omega_n \cap \omega_{n-1}^{(m-1)}(s_1, \ldots, s_{n-1})\right). \qquad (40)$$

For example, in Figure 3 with $n = 3$ and $m = 2$, the first subspace is $C \cup D$, covered by both $s_1$ and $s_2$; the second subspace is $B \cup C$, the part of $\omega_3$ that is also covered by $s_1$ or $s_2$; and the result is the same as before, that is, $B \cup C \cup D$. Note that Eq. (40) is in essence equivalent to Eq. (9), and this intuitively explains Theorem 1 and its proof.

Accordingly, with the algorithm proposed in Section 3, we can compute the $n$-path mPCM metric with $O(mn)$ time and $O(m + n)$ space.
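Applying Lemma 1 to Eq. (40) turns the set recursion into a recursion on slacks, $f_{n,m} = \min(f_{n-1,m}, \max(s_n, f_{n-1,m-1}))$. The sketch below runs this recursion on deterministic sample values, where min and max are ordinary scalar operations; the algorithm of Section 3 instead applies statistical min/max to random variables, which this sketch does not attempt. The boundary conventions ($-\infty$ for zero layers, $+\infty$ before any slack is seen) are assumptions made so the loop needs no special cases.

```python
import math
import random

def order_statistics_recursive(slacks, m):
    """Return [f_{n,1}, ..., f_{n,m}] using the recursion implied by Eq. (40).

    f[j] holds the j-th order statistic of the slacks seen so far.
    Assumed boundaries: f[0] = -inf (zero-layer coverage is the whole space)
    and f[j] = +inf for j >= 1 before any slack is processed (empty coverage).
    """
    f = [-math.inf] + [math.inf] * m
    for s in slacks:
        # Go from j = m down to 1 so f[j - 1] is still the value for n - 1 slacks.
        for j in range(m, 0, -1):
            f[j] = min(f[j], max(s, f[j - 1]))
    return f[1:]

rng = random.Random(3)
samples = [rng.gauss(0.0, 1.0) for _ in range(50)]
top5 = order_statistics_recursive(samples, 5)  # five smallest values, ascending
```

The double loop performs $mn$ min/max pairs and keeps $m + 1$ running values besides the input, in line with the $O(mn)$ time and $O(m + n)$ space quoted above.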

4.3.3. Relationship to Test Quality Metric. The preceding discussion establishes the extension from the single-layer process space coverage metric (sPCM) to the multilayer process space coverage metric (mPCM), based on the intuition that we would like each point in the process space to be covered by multiple paths. However, one question remains unanswered: How does the metric relate to the quality of the paths selected for at-speed testing, which is the eventual target we are interested in? In Section 4.2, Eq. (22) reveals the relationship between the process space coverage metric and the test quality metric, but only for single-layer coverage.

We can extend the test quality metric from single-layer to multilayer: the $m$-layer test quality metric (mTQM) for a set of $n$ paths $\Pi$ can be defined as the probability that a tested chip has no timing violation conditional upon at least $n - m + 1$ paths in $\Pi$ passing at-speed testing. The remaining paths, which number no more than $m - 1$, are unsensitizable. Accordingly, this can be expressed as

$$Q^{(m)}(\Pi) = P(S_c \ge 0 \mid \text{at least } n - m + 1 \text{ paths pass}) \qquad (41)$$

$$= \frac{P(S_c \ge 0,\ \text{at least } n - m + 1 \text{ paths pass})}{P(\text{at least } n - m + 1 \text{ paths pass})} \qquad (42)$$

$$= \frac{P(S_c \ge 0)}{P(\text{at least } n - m + 1 \text{ paths pass})}, \qquad (43)$$

where $S_c$ is the statistical chip slack. In Eq. (43), we have used the fact that $S_c \ge 0$ implies that all paths pass.

The condition that at least $n - m + 1$ paths pass is equivalent to the condition that no more than $m - 1$ paths fail, that is, the $m$th smallest slack is nonnegative. Using the notation in the previous section, such a condition can be expressed as

$$f_{n,m}(s_1, \ldots, s_n) \ge 0. \qquad (44)$$

Plugging Eq. (44) into Eq. (43), we have

$$Q^{(m)}(\Pi) = \frac{P(S_c \ge 0)}{P(f_{n,m}(s_1, \ldots, s_n) \ge 0)} \qquad (45)$$

$$= \frac{P(S_c \ge 0)}{1 - P(f_{n,m}(s_1, \ldots, s_n) \le 0)} \qquad (46)$$

$$= \frac{P(S_c \ge 0)}{1 - q^{(m)}(\Pi)}. \qquad (47)$$

Eq. (47) follows from Eq. (35), which establishes the link between mTQM and mPCM. Since $P(S_c \ge 0)$ is constant for a given chip, selecting a set of paths with maximum mTQM for at-speed testing is equivalent to selecting a set of paths with maximum mPCM.
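The chain from Eq. (41) to Eq. (47) can be sanity-checked by estimating both ends from the same Monte Carlo samples. In the sketch below, the chip is a hypothetical collection of 12 paths with linear slack models in 3 shared Gaussian parameters, the first 6 paths are taken as the selected set, and the chip slack is assumed to be the minimum over all paths; none of these choices come from the article's experiments.

```python
import random

def simulate(num_chip_paths=12, num_selected=6, m=2, trials=20000, seed=4):
    rng = random.Random(seed)
    # Hypothetical linear slack models: mean plus sensitivities to 3 shared parameters.
    models = [(rng.uniform(0.5, 2.0), [rng.uniform(-0.5, 0.5) for _ in range(3)])
              for _ in range(num_chip_paths)]
    chip_ok = cond = both = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(3)]
        slacks = [a0 + sum(a * xi for a, xi in zip(sens, x)) for a0, sens in models]
        s_chip = min(slacks)                         # assumed chip slack: worst path
        f_nm = sorted(slacks[:num_selected])[m - 1]  # m-th order statistic of selected paths
        chip_ok += s_chip >= 0.0
        cond += f_nm >= 0.0                          # at least n - m + 1 selected paths pass
        both += (s_chip >= 0.0) and (f_nm >= 0.0)
    q_m = 1.0 - cond / trials                        # sample estimate of mPCM, Eq. (35)
    by_definition = both / cond                      # Eqs. (41) and (42)
    by_formula = (chip_ok / trials) / (1.0 - q_m)    # Eq. (47)
    return by_definition, by_formula

by_definition, by_formula = simulate()
```

The two estimates agree up to floating-point rounding because every sample with a nonnegative chip slack also satisfies the conditioning event, which is the step used in Eq. (43).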

4.3.4. Path Selection for Multilayer Process Space Coverage. In this section, we extend Zolotov et al. [2010] to select the best $n$ paths such that mTQM (or, equivalently, mPCM) is maximized.

The key to the branch-and-bound algorithm is an effective pruning strategy that hinges on the computation of a metric bound. For sPCM, as defined in Eq. (23), its upper-bound metric is given by Eq. (24). In the case of mPCM, as defined in Eq. (35), we can also show that there exists a similar upper-bound metric

$$q^{(m)}(\Pi) \le P(f_{n,m}(s_1, \ldots, s_{i-1}, s_{\tilde{\Pi}}, s_{i+1}, \ldots, s_n) \le 0), \qquad (48)$$

where slack $s_i$ of path $\pi_i$ is replaced with slack $s_{\tilde{\Pi}}$ of a set of paths $\tilde{\Pi}$ with $\pi_i \subseteq \tilde{\Pi}$.

Once we know how to compute the upper bound for mPCM, we can utilize the same branch-and-bound algorithm framework, as discussed in Zolotov et al. [2010], by (1) replacing the optimization metric sPCM with mPCM, and (2) during path traversal, replacing the sPCM bound with the mPCM bound for pruning. Hence the entire branch-and-bound algorithm can be utilized to find the top $n$ paths with maximum mPCM. In the interest of space, we refer readers to Zolotov et al. [2010] for detailed discussion on the overall branch-and-bound algorithm framework.

During path selection, for every potential branch to grow the current partial path of interest, we obtain a new slack $\tilde{s}$ of the growing partial path, and we need to decide (1) whether we should continue to search through this branch for a new path (i.e., pruning), and (2) when a new path is found, which existing path should be replaced by this new one. All these decisions involve updating mPCM with the new slack $\tilde{s}$, and since we always keep the best $n$ slacks so far, we need to temporarily update mPCM $n$ times. This can be done by utilizing the binary tree data structure discussed in Section 3, which reduces the complexity of updating mPCM from $O(n^2 m)$ to $O(n m^2)$. Typically, the number of required paths $n$ is on the order of thousands, based on the observation in Zolotov et al. [2010]. On the other hand, even if we set the number of layers $m$ for coverage to be less than ten, a significant improvement in terms of path test quality can be achieved, as will be demonstrated later in experiments. Therefore, since $m \ll n$, we have effectively reduced the complexity in $n$ from quadratic $O(n^2)$ to linear $O(n)$.
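To give a feel for the two bounds, the following arithmetic plugs in $n = 8000$ paths with $m = 2$ and $m = 5$ layers, values that also appear in the experiments of Section 5. These are asymptotic operation counts with constants ignored, so they indicate scaling only and are not runtime predictions.

```latex
\begin{align*}
m = 2:\quad n^2 m &= 8000^2 \cdot 2 = 1.28 \times 10^{8}, &
  n m^2 &= 8000 \cdot 2^2 = 3.2 \times 10^{4}, &
  \frac{n^2 m}{n m^2} &= \frac{n}{m} = 4000, \\
m = 5:\quad n^2 m &= 8000^2 \cdot 5 = 3.2 \times 10^{8}, &
  n m^2 &= 8000 \cdot 5^2 = 2 \times 10^{5}, &
  \frac{n^2 m}{n m^2} &= \frac{n}{m} = 1600.
\end{align*}
```

The ratio $n/m$ makes explicit why the gain is large whenever the number of layers is much smaller than the number of paths.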

5. EXPERIMENTAL RESULTS

We have implemented the algorithm for mTQM (or, equivalently, mPCM) computation and the branch-and-bound based path selection algorithm in an in-house statistical timing tool [Visweswariah et al. 2004a]. The experiments are run on an AIX server with a 2 GHz CPU and 64 GB of RAM. Three industrial 90 nm and 65 nm designs are used for experiments; we denote them as D1, D2, and D3, and their sizes are 5K, 170K, and 3.2M gates, respectively. Note that these gate counts do not include memory or IP blocks, and only logic paths are considered. In addition, all the designs have undergone statistical timing optimization such that no single path has a much smaller slack than others in the top few hundred paths. All the path slacks are obtained through an industry-quality sign-off SSTA engine, EinStat [Visweswariah et al. 2004b]. Around 20 process variables are considered in our experiments, and the amount of process variation is obtained according to foundry rules for both frontend and backend process parameters.

5.1. mTQM Comparison

For a given set of $n$ path slacks, we compute its mTQM via Eq. (47) for different required layers $m$, and the mPCM is computed by Eq. (35) with $f_{n,m}$ given by Eq. (36). To check the accuracy of our algorithm, we also compute mTQM through Monte-Carlo simulation (100K trials). We report the mTQM comparison based on D2 in Table III for different $n$ and $m$. We observe that our approach computes mTQM with almost the same accuracy as Monte Carlo (differences within about 0.2%) while achieving a speedup of roughly 830× to 1,400×. This conclusion holds for different path numbers $n$ and layer numbers $m$, and the runtime grows almost linearly.

5.2. Quality of Path Selection

We run the branch-and-bound path selection algorithm, as discussed in Section 4.3.4, to select the top $n$ paths with the maximum mTQM, and we denote it as mTQM-BnB. To illustrate the value of multilayer coverage-based path selection, we compare the quality of path selection with that of Zolotov et al. [2010], which targets single-layer process space coverage and is denoted as sTQM-BnB.

We observe that (1) when $m = 1$, our path results are exactly the same as those of Zolotov et al. [2010], which is expected because our mTQM metric is the same as sTQM when $m = 1$; and (2) when $m$ is greater than one, that is, when we require more layers of process space coverage, our algorithm selects a different set of paths.



Table III. Comparison of mTQM Computation Between Monte-Carlo (MC) and Our Method (Ours) in Terms of Accuracy and Runtime

| Path n, Layer m | mTQM, MC | mTQM, Ours (diff) | Runtime (sec), MC | Runtime (sec), Ours (speedup) |
|---|---|---|---|---|
| n = 8K, m = 1 | 0.9868 | 0.9847 (−0.2%) | 577.3 | 0.621 (930×) |
| n = 8K, m = 2 | 0.8349 | 0.8359 (−0.1%) | 589.9 | 0.648 (910×) |
| n = 8K, m = 3 | 0.7902 | 0.7919 (0.2%) | 595.2 | 0.706 (843×) |
| n = 8K, m = 4 | 0.7493 | 0.7494 (0.0%) | 612.6 | 0.732 (837×) |
| n = 8K, m = 5 | 0.7343 | 0.7329 (−0.2%) | 637.4 | 0.768 (830×) |
| n = 1K, m = 2 | 0.7530 | 0.7523 (−0.1%) | 157.3 | 0.116 (1356×) |
| n = 3K, m = 2 | 0.8344 | 0.8359 (0.2%) | 241.6 | 0.172 (1405×) |
| n = 5K, m = 2 | 0.8359 | 0.8359 (0.0%) | 421.5 | 0.316 (1334×) |
| n = 7K, m = 2 | 0.8359 | 0.8359 (0.0%) | 583.0 | 0.457 (1276×) |
| n = 9K, m = 2 | 0.8358 | 0.8359 (0.0%) | 765.6 | 0.598 (1280×) |

Table IV. sTQM Comparison after Random Path Removal

| Design | Avg, sTQM-BnB | Avg, mTQM-BnB | Min, sTQM-BnB | Min, mTQM-BnB |
|---|---|---|---|---|
| D1 | 0.772 | 0.898 | 0.515 | 0.637 |
| D2 | 0.827 | 0.905 | 0.551 | 0.696 |
| D3 | 0.716 | 0.830 | 0.511 | 0.640 |

Note: 10K runs are used, and the average (Avg) and minimum (Min) sTQMs after random path removal are reported.

To compare the quality of these two sets of paths, ideally we could run ATPG on both sets of paths, find out which paths are sensitizable and which are not, and then recompute the sTQM for only the sensitizable paths. Obviously, the higher the post-ATPG sTQM, the better the results for testing. Such an experiment should be run for multiple designs, and the algorithm with the better likelihood of getting a higher post-ATPG sTQM wins.

However, the ATPG process is typically a time-consuming and cumbersome step, so instead, for most of the experiments in this article, we use a randomized approach that mimics the ideal experiment but with much less effort. For a given design, we run both algorithms to obtain two sets of $n$ paths; then we arbitrarily choose the same number of paths from each set, mark those paths as unsensitizable, and compute sTQM after random path removal for the remaining sensitizable paths. We repeat this arbitrary choice of paths many times (10K in our experiments) and report the comparison of the sTQM after random path removal from both algorithms. Table IV shows some statistics of the comparison, that is, the average and minimum sTQM after random path removal from both sTQM-BnB and mTQM-BnB (with $m = 2$). We observe that mTQM-BnB always achieves both higher average and higher minimum sTQMs. For example, for Design D1, if sTQM-BnB is used, the average sTQM after random path removal is 0.772. Considering sTQM, as defined in Eq. (20), this means that the probability that a chip has positive slack given that all the remaining selected paths have positive slacks is 0.772. The probability increases to 0.898 if mTQM-BnB is used. Such an observation is expected, because with mTQM-BnB, every point in the process space has been covered multiple times, and the loss of space coverage only happens if all the paths covering it are unsensitizable. Such an event is clearly less likely than for paths selected with only one-layer coverage.
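The randomized procedure described above can be sketched on sampled slacks as follows. Everything about the data is hypothetical (8 chip paths sharing one global parameter, the first 4 treated as the selected set, chip slack taken as the minimum over all paths), and ATPG itself is not modeled; only the remove-and-recompute loop is illustrated, with sTQM estimated as the fraction of samples with nonnegative chip slack among those where all kept paths pass.

```python
import random

def stqm(chip_slack, path_slacks, keep):
    """P(S_c >= 0 | all kept paths pass), estimated from paired samples."""
    passed = [t for t in range(len(chip_slack))
              if all(path_slacks[i][t] >= 0.0 for i in keep)]
    if not passed:
        return float("nan")
    return sum(chip_slack[t] >= 0.0 for t in passed) / len(passed)

def random_removal(chip_slack, path_slacks, num_removed, runs, seed=5):
    """Average and minimum sTQM after marking random paths unsensitizable."""
    rng = random.Random(seed)
    n = len(path_slacks)
    results = []
    for _ in range(runs):
        removed = set(rng.sample(range(n), num_removed))
        keep = [i for i in range(n) if i not in removed]
        results.append(stqm(chip_slack, path_slacks, keep))
    return sum(results) / len(results), min(results)

# Hypothetical data: 8 chip paths sharing one global parameter; 4 of them selected.
rng = random.Random(6)
trials = 2000
global_x = [rng.gauss(0.0, 1.0) for _ in range(trials)]
all_slacks = [[1.0 + 0.6 * g + 0.5 * rng.gauss(0.0, 1.0) for g in global_x]
              for _ in range(8)]
chip = [min(column) for column in zip(*all_slacks)]  # assumed chip slack per sample
selected = all_slacks[:4]
full = stqm(chip, selected, keep=range(4))           # sTQM with no path removed
avg, worst = random_removal(chip, selected, num_removed=1, runs=50)
```

Removing a path can only weaken the conditioning event, so each post-removal sTQM is at most the value obtained with the full set; comparing `avg` and `worst` across two candidate path sets mirrors the comparison reported in Table IV.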

We also show the histogram comparison for D2 in Figure 4. To obtain the figure, we collect the final sTQM metrics after random path removal for both the sTQM-BnB and mTQM-BnB methods. With 10K runs, we have 10K final sTQM metrics for both methods, from which the histogram is plotted. We observe that among the 10K experiments, the sTQM after random path removal varies significantly, ranging from 0.66 to 0.97. There is about a 25% chance of sTQM-BnB retaining the same sTQM as before (around 0.97), while mTQM-BnB has more than a 70% chance of retaining the same sTQM as before. In other words, mTQM-BnB has a much higher likelihood of achieving a higher post-ATPG sTQM than does sTQM-BnB. Hence, the quality of path selection based on mTQM-BnB is better, and this improvement would be even more significant when mTQM-BnB targets mTQM with larger $m$.

Fig. 4. Histogram of the final sTQM distribution from sTQM-BnB and mTQM-BnB.

Table V. Evaluation of mTQM for Different m

| mTQM evaluation | mTQM-BnB targeting m = 1 | targeting m = 2 | targeting m = 3 | targeting m = 4 |
|---|---|---|---|---|
| m = 1 | 0.881 | 0.881 | 0.881 | 0.871 |
| m = 2 | 0.650 | 0.763 | 0.738 | 0.728 |
| m = 3 | 0.562 | 0.553 | 0.591 | 0.578 |
| m = 4 | 0.513 | 0.503 | 0.533 | 0.558 |

For the required $m$-layer mTQM maximization, mTQM-BnB guarantees finding the best set of $n$ paths among all possible choices of paths. This can be confirmed by running mTQM-BnB targeting different $m$ to obtain different sets of paths, and re-evaluating these sets of paths in terms of mTQM for $m = 1, 2, \ldots$-layer coverage. One such experiment for D3 is shown in Table V. We see that, for each evaluated $m$, the best mTQM is achieved by the mTQM-BnB run targeting the same required layer $m$ (the diagonal cells in the table); for $m = 1$, the sets targeting $m = 2$ and $m = 3$ tie with the diagonal value. For example, mTQM-BnB targeting four-layer coverage has the best four-layer coverage mTQM (0.558) compared to the others.

5.3. Efficiency of Path Selection

We compare the runtime of our mTQM-BnB algorithm with and without employing the proposed incremental mTQM updating techniques. Results based on D2 are presented in Figure 5. The top plot shows the comparison for selecting 8,000 paths targeting mTQM with different $m$, and the lower plot shows the comparison for selecting different numbers of paths targeting two-layer mTQM. We see that mTQM-BnB without the incremental mTQM updating technique has roughly quadratic complexity, while mTQM-BnB with the speedup technique scales almost linearly with both the path number and the layer number, confirming our prior complexity analysis.

[Figure 5 plots: runtime (s) versus number of layers (top, 1 to 5) and runtime (s) versus number of paths (bottom, 1,000 to 9,000), each with and without the binary tree.]

Fig. 5. Runtime scalability for mTQM-BnB.

Table VI. sTQM Comparison after ATPG

| Design | sTQM-BnB | 2TQM-BnB | 3TQM-BnB | 4TQM-BnB |
|---|---|---|---|---|
| c1908 | 0.581 | 0.681 | 0.681 | 0.622 |
| c2670 | 0.289 | 0.292 | 0.337 | 0.367 |
| c3540 | 0.472 | 0.593 | 0.610 | 0.614 |
| c5315 | 0.231 | 0.315 | 0.322 | 0.301 |
| c6288 | 0.081 | 0.092 | 0.105 | 0.105 |
| c7552 | 0.315 | 0.388 | 0.459 | 0.396 |

We also compare the runtime of our two-layer mTQM-BnB algorithm with the single-layer sTQM-BnB algorithm. For designs D1, D2, and D3, sTQM-BnB has runtimes of 5.39, 19.76, and 22.85 CPU seconds, respectively, while the runtimes for mTQM-BnB are 8.87, 27.66, and 31.64 CPU seconds, respectively. Therefore, by selecting paths to target more layers of process space coverage, mTQM-BnB incurs about 65%, 40%, and 38% runtime overhead on D1, D2, and D3, respectively (under 50% on the two larger designs) compared to the existing path selection targeting single-layer coverage, and leads to superior path selection.

Finally, to show the impact of our mTQM-BnB on real ATPG, we employ ISCAS85 benchmark circuits with a 90 nm standard cell library. Spatial correlations are modeled through a 10 × 10 grid-based correlation matrix generated following the method presented in Xiong et al. [2007]. We then place the cells using a commercial tool and perform SSTA based on the resulting layout. After that, we use mTQM-BnB with different $m$ to select 9,000 paths. We then perform ATPG and report the sTQM metric for all the sensitizable paths, as shown in Table VI. From the table, we can see that mTQM-BnB increases the sTQM in all the designs, which demonstrates the efficacy of our approach.



6. CONCLUSIONS AND DISCUSSION

This article has proposed a recursive algorithm based on statistical min/max operations to compute order statistics for general correlated and not necessarily identically distributed random variables, which has been an open problem in the literature. The algorithm has $O(mn)$ time and $O(m + n)$ space complexity, and a binary tree-based data structure is further developed to allow selective update of the order statistics efficiently. As a vehicle to demonstrate the algorithm, we apply it to the path selection algorithm in at-speed testing. A novel metric, the multilayer process space coverage metric (mPCM), is proposed to quantitatively gauge the quality of path selection. We then show that such a metric is directly linked to the order statistics, and our recursive algorithm can thus be applied. By employing a branch-and-bound path selection algorithm with these techniques, this article shows that selecting an optimal set of paths for a multimillion-gate design can be performed efficiently. Compared to the state of the art, experimental results show both the efficiency of our algorithms and better quality of our path selection.

REFERENCES

BALAKRISHNAN, N. 1994a. On order statistics from non-identical right-truncated exponential random variables and some applications. Commun. Stat. - Theor. Meth. 23, 12, 3373–3393.

BALAKRISHNAN, N. 1994b. Order statistics from non-identical exponential random variables and some applications. Commun. Stat. - Theor. Meth. 18, 2, 203–253.

BALAKRISHNAN, N. 1995. Order statistics from non-identical power function random variables. Commun. Stat. - Theor. Meth. 24, 6, 1443–1454.

BUSHNELL, M. L. AND AGRAWAL, V. D. 2000. Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits. Kluwer Academic Publishers, Norwell, MA.

CARROLL, R. J. AND RUPPERT, D. 1981. On prediction and the power transformation family. Biometrika 68, 3, 609–615.

CHENG, K. T., DEVADAS, S., AND KEUTZER, K. 1993. Delay-fault test generation and synthesis for testability under a standard scan design methodology. IEEE Trans. Comput.-Aid. Des. ICs Syst. 12, 8, 1217–1231.

CHILDS, A. AND BALAKRISHNAN, N. 1998. Generalized recurrence relations for moments of order statistics from non-identical Pareto and truncated Pareto random variables with applications to robustness. In Order Statistics: Theory and Methods, Elsevier, Amsterdam, 403–438.

CHILDS, A. AND BALAKRISHNAN, N. 2006. Relations for order statistics from non-identical logistic random variables and assessment of the effect of multiple outliers on the bias of linear estimators. J. Stat. Plan. Infer. 136, 7, 2227–2253.

CLARK, C. E. 1961. The greatest of a finite set of random variables. Oper. Res. 9, 2, 145–162.

DAVID, H. A. AND NAGARAJA, H. N. 2003. Order Statistics. John Wiley and Sons, Hoboken, NJ.

DUDLEY, R. M. 2002. Real Analysis and Probability. Cambridge University Press, Cambridge, U.K.

GAN, G. AND BAIN, L. J. 1995. Distribution of order statistics for discrete parents with applications to censored sampling. J. Stat. Plan. Infer. 44, 1, 37–46.

GUILBAUD, O. 1982. Functions of non-i.i.d. random vectors expressed as functions of i.i.d. random vectors. Scand. J. Statist. 9, 4, 229–233.

IYENGAR, V., XIONG, J., VENKATESAN, S., ZOLOTOV, V., LACKEY, D., HABITZ, P. A., AND VISWESWARIAH, C. 2007. Variation-aware performance verification using at-speed structural test and statistical timing. In Proceedings of the IEEE International Conference on Computer-Aided Design. 405–412.

IYENGAR, V., YOKOTA, T., YAMADA, K., ANEMIKOS, T., BASSETT, R., DEGREGORIO, M., FARMER, R., GRISE, G., JOHNSON, M., MILTON, D., TAYLOR, M., AND WOYTOWICH, F. 2006. At-speed structural test for high-performance ASICs. In Proceedings of the International Test Conference.

KHATRI, C. G. 1962. Distributions of order statistics for discrete case. Ann. Instit. Stat. Math. 14, 1, 167–171.

LIOU, J.-J., KRSTIC, A., WANG, L.-C., AND CHENG, K.-T. 2002. False-path-aware statistical timing analysis and efficient path selection for delay testing and timing validation. In Proceedings of the Design Automation Conference. 566–569.

NISHII, R. 2001. Box-Cox Transformation. Springer, Berlin Heidelberg.

QIU, W. AND WALKER, D. M. H. 2003. An efficient algorithm for finding k longest testable paths through each gate in a combinational circuit. In Proceedings of the International Test Conference. 592–601.

REISS, R. D. 1989. Approximate Distributions of Order Statistics: With Applications to Nonparametric Statistics. Springer, Berlin Heidelberg.

SHARMA, M. AND PATEL, J. H. 2002. Finding a small set of longest testable paths that cover every gate. In Proceedings of the International Test Conference. 974–982.

VISWESWARIAH, C., RAVINDRAN, K., KALAFALA, K., WALKER, S. G., AND NARAYAN, S. 2004a. First-order incremental block-based statistical timing analysis. In Proceedings of the Design Automation Conference. 331–336.

VISWESWARIAH, C., RAVINDRAN, K., KALAFALA, K., WALKER, S. G., AND NARAYAN, S. 2004b. First-order incremental block-based statistical timing analysis. In Proceedings of the Design Automation Conference. 331–336.

WANG, L.-C., LIOU, J.-J., AND CHENG, K.-T. 2004. Critical path selection for delay fault testing based upon a statistical timing model. IEEE Trans. Comput.-Aid. Des. ICs Syst. 23, 11, 1550–1565.

WEISS, N. A. 2007. Introductory Statistics 8th Ed. Addison Wesley, Boston, MA.

XIONG, J., SHI, Y., ZOLOTOV, V., AND VISWESWARIAH, C. 2009. Statistical multilayer process space coverage for at-speed test. In Proceedings of the Design Automation Conference. 122–125.

XIONG, J., ZOLOTOV, V., AND HE, L. 2007. Robust extraction of spatial correlation. IEEE Trans. Comput.-Aid. Des. ICs Syst. 26, 4, 619–631.

ZHANG, L., CHEN, W., HU, Y., GUBNER, J. A., AND CHEN, C. C. 2005. Correlation-preserved non-gaussian statistical timing analysis with quadratic timing model. In Proceedings of the Design Automation Conference. 83–88.

ZOLOTOV, V., XIONG, J., FATEMI, H., AND VISWESWARIAH, C. 2010. Statistical path selection for at-speed test. IEEE Trans. Comput.-Aid. Des. ICs Syst. 29, 5, 749–759.

Received November 2011; revised September 2012, January 2013; accepted February 2013

