
370 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 23, NO. 3, AUGUST 2010

Statistical Detection of Defect Patterns Using Hough Transform

Qiang Zhou, Li Zeng, and Shiyu Zhou

Abstract: Surface defects on semiconductor wafers often exhibit particular spatial patterns. These patterns contain valuable information about the fabrication processes and can help engineers identify potential root causes. In this paper, we present a control chart technique to detect spatial patterns of surface defects by using the Hough transform. An approximate distribution model of the monitoring statistic is proposed, and a comprehensive control chart design method is developed. This method is characterized by its intuitive implication and a simple design procedure which relates the statistical performance of the control chart to the design parameters. A case study is presented to validate the effectiveness of this method.

Index Terms: Control chart, Hough transform (HT), spatial pattern, surface quality control.

I. Introduction

Surface quality plays a critical role in quality control of manufacturing processes involving products with flat surfaces. Typical examples include hot-rolling processes producing sheet metal, and fabrication processes for semiconductor wafers and liquid crystal display flat panels. Surface defects usually lower product quality and cause a higher scrap rate, and therefore should be controlled tightly. In the semiconductor industry, particularly, process yield has long been a primary concern, and surface defects on wafers generated during fabrication are generally the main reason for yield loss in integrated circuit (IC) fabrication [1]. Due to the increase of die sizes and decrease of feature sizes, chips become more vulnerable to surface defects. Furthermore, the increasing complexity of manufacturing processes leads to potentially larger process variations. Therefore, reduction of wafer surface defects has become a challenging issue.

Inspection systems can often provide not only the number of defects on a surface, but also their spatial locations. Wafer defect maps can often be readily obtained from optical quality inspection systems in semiconductor processes. On the wafer defect map, defects are marked as dots on a white circular background representing the wafer. The spatial distribution of surface defects contains valuable information about the manufacturing processes and can thus be used for root cause identification purposes [1]–[6]. Generally, surface defects generated in wafer fabrication processes can be categorized into two types [7], [8]: 1) globally scattered random defects (later referred to as background noise) caused by natural variation of the process; and 2) locally clustered defects due to certain assignable causes. The final surface defects are usually a superposition of these two types. The spatial distribution of the locally clustered defects can often provide hints on root causes. For example, one of the most widely seen patterns on wafers is the line segment, which could be the result of scratches during material handling. Another typical pattern, the edge ring, is normally due to etching problems. Thus, detection of such spatial patterns can help identify potential root causes.

Manuscript received February 4, 2009; revised October 4, 2009; accepted February 16, 2010. Date of publication April 22, 2010; date of current version August 4, 2010. This work was supported in part by the National Science Foundation, under Grants 0545600 and 092608. This paper was recommended by Associate Editor R.-S. A. Guo.

The authors are with the Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706 USA (e-mail: [email protected]; [email protected]; [email protected]).

Digital Object Identifier 10.1109/TSM.2010.2048959

The monitoring of wafer surface defects is traditionally conducted using various control charts based on summary measures [9], [10]. For example, a c chart is used to monitor the total number of defects on a wafer and raise alarms when plotted points fall outside the control limits. The underlying assumption for the c chart is that the occurrence of defects follows a Poisson distribution. However, it has been reported that defects on wafer maps tend to cluster [11]–[13]. To solve this problem, some revised distributions, such as the Neyman type-A distribution and the negative binomial distribution, have been proposed [14]–[19]. Generally, control chart techniques are conceptually intuitive and convenient to use for practitioners, but they cannot detect specific spatial patterns of the defects since only the count data are used as monitoring statistics.

Because of the importance of detecting spatial patterns, several techniques have been developed. These methods can be roughly put into three categories.

1) Spatial statistics based methods: In these methods, the theory of spatial statistics is utilized to analyze and detect nonrandom patterns [20]. For example, Hansen et al. [7], Friedman et al. [21], and Jeong et al. [22] developed statistics measuring spatial dependency of defects to detect systematic clustering, while Fellows et al. [23] compared two popular spatial randomness models. Although the presence of clusters can be detected, these methods cannot distinguish specific spatial patterns. Moreover, the monitoring statistics often bear relatively complex statistical properties, adding difficulty to implementation of these methods.

2) Data mining methods: These include neural networks, fuzzy rule-based inferences, and various clustering methods, for supervised learning [4], [24], [25], and unsupervised learning [1], [8], [26]–[30]. Many of these methods need a large training dataset, and due to the complexity of the algorithms, most of these methods do not provide a statistically rigorous evaluation of their performances (i.e., Type I error, also called false alarm probability, and Type II error, also called miss detection probability) in pattern detection.

3) Computer vision based methods: Point pattern matching and analysis has been an active research field in computer vision [31]. For spatial pattern detection, one of the most commonly used methods is the Hough transform (HT). As a technique for detecting spatial patterns from binary image data [32]–[34], it transforms the binary image into a parameter space and tries to detect the parameterized pattern through a voting process in which each point votes for all the possible patterns passing through it. The patterns with higher votes indicate a higher probability of occurrence on the map. As long as a parameterized model can be established for the spatial pattern of interest, this method can be applied. The HT has long been recognized as a reliable and versatile method with high detection power for spatial pattern detection, especially in noisy pictures [35]. However, most of the existing research focuses on performance improvement through deterministic parameterization, quantization of the parameters and image spaces, and computational load reduction. Limited work has been done on statistical evaluation of the performance of the HT under noisy conditions, which is critical in decision making for quality control.

In this paper, we propose a statistical pattern detection method based on the HT. This method aims to construct a control chart that is able to monitor the highest number of votes in the Hough space (Hough matrix). This statistic is calculated for each defect map, and an alarm will be raised when it is larger than the control limit, indicating that a pattern has been detected. An approximate distribution of this statistic is developed and the control limit can thus be estimated. Compared to existing techniques, this method has the following characteristics.

1) It focuses on detecting specific spatial patterns of defects instead of only detecting the existence of clustering.

2) The detection is realized through an easy-to-use control chart and the monitoring statistic is intuitive and easy to compute.

3) A quantitative design procedure is also provided, which relates the Type I and Type II errors to the design parameters.

In this way, requirements on the statistical performance of the control chart can be achieved through choosing proper values for design parameters. Essentially, this method can be used to detect any parameterized surface patterns. This paper focuses on the two most commonly observed patterns, linear and circular, as illustrative examples, while the treatment can be extended to other parametric patterns.

The rest of this paper is organized as follows. Section II presents the HT-based control chart design procedure for linear pattern detection. Section III describes the detection of circular patterns. Section IV provides a case study to validate this method, and Section V concludes the paper and discusses possible future work.

Fig. 1. Line detection by HT. (a) Accumulator array. (b) Defect map.

II. HT-Based Control Chart Design for Linear Pattern Detection

A. Basics of Hough Transform for Line Detection

First, we shall briefly introduce some principles of HT. The notation used here follows [33]. Using normal parameterization, a line in x–y plane can be uniquely defined by its distance ρ from the origin and the angle θ of its normal as

x cos θ + y sin θ = ρ (1)

where θ is restricted within [0, π). This parameterization maps every line in x–y plane to a unique point in θ–ρ plane. A point in x–y plane, e.g., P0(x0, y0), will be mapped to the θ–ρ plane as a sinusoidal curve defined as

x0 cos θ + y0 sin θ = ρ. (2)

Every point (θ, ρ) on the curve (2) corresponds to a unique line in x–y plane passing through P0. Therefore, for points lying on the same line in x–y plane, their corresponding sinusoidal curves in θ–ρ plane will pass through a common point.

Based on this mapping relationship, a voting process can be designed for detecting collinear points in x–y plane. In this process, θ is equally sampled and ρ is quantized into equal intervals, resulting in an array of accumulators in θ–ρ plane as illustrated in Fig. 1(a). Each accumulator, e.g., the shaded cell (θ = θi, ρj ≤ ρ < ρj+1), corresponds to a particular stripe area in the physical defect map shown in Fig. 1(b). Initially, all the accumulators are assigned with value "0." As previously mentioned, each point in Fig. 1(b) has a corresponding sinusoidal curve in θ–ρ plane in Fig. 1(a). When this curve passes through an accumulator, the accumulator will get a vote and its value increases by "1." As a result, the final value of each accumulator equals the number of points falling within its corresponding stripe. Therefore, if there is a line on the defect map, the accumulator corresponding to the stripe that overlaps the most with the line will get the highest vote.
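The voting process described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `hough_votes`, the toy point set, and the choice to clamp boundary values into the last cell are assumptions made here. It uses equal-width ρ intervals, as in the conventional HT described in this paragraph.

```python
import math

def hough_votes(points, radius, n_theta, n_rho):
    """Accumulate Hough votes for points lying in a disc of the given radius.

    theta_i = i * pi / n_theta (equally sampled in [0, pi)); rho is split into
    n_rho equal-width intervals covering [-radius, radius]. votes[i][j] ends up
    equal to the number of points inside the stripe of accumulator (i, j).
    """
    votes = [[0] * n_rho for _ in range(n_theta)]
    width = 2.0 * radius / n_rho
    for x, y in points:
        for i in range(n_theta):
            theta = i * math.pi / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)  # curve (2)
            j = int((rho + radius) // width)
            j = min(max(j, 0), n_rho - 1)  # assumption: clamp the rho = R edge
            votes[i][j] += 1
    return votes

# Toy map: five collinear points on the line x = 3 plus one stray point.
points = [(3.0, y) for y in (-4.0, -2.0, 0.0, 2.0, 4.0)] + [(-5.0, 1.0)]
votes = hough_votes(points, radius=10.0, n_theta=4, n_rho=10)
best = max((v, i, j) for i, row in enumerate(votes) for j, v in enumerate(row))
print("highest vote, theta index, rho index:", best)
```

In this toy case the five collinear points all fall into the stripe with θ = 0 and 2 ≤ ρ < 4, so that accumulator collects the highest vote, while every other accumulator holds at most two votes.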

Fig. 2. Maximum entropy quantization for equal-area stripes.

Assume background noise on the defect map follows a Poisson distribution with density λ. The existence of background noise poses a problem for conventional HT. Expected votes of some accumulators are intrinsically higher than others because a stripe near the origin is expected to contain more defects than one near the edge due to its larger area. To ensure a fair comparison among the accumulators, the stripes in Fig. 1(b) must have equal areas. Here, we adopt the maximum entropy quantization suggested in [32]. The angle θ is still sampled at equal intervals from 0 to π, i.e., θ = (θ1, θ2, ..., θ_{Nθ}), where Nθ, called the sampling parameter of θ, denotes the number of samples. Hence, we have θ1 = 0, θ2 = π/Nθ, ..., θ_{Nθ} = (Nθ − 1)π/Nθ. ρ will be divided in such a way, as shown in Fig. 2, to produce equal-area stripes. Let R be the radius of the defect map, then ρ will be partitioned into Nρ intervals by ρ0 < ρ1 < ··· < ρNρ, satisfying

ρ0 = −R, ρNρ = R (3)

and

$$\text{stripe area } A_s = \int_{\rho_i}^{\rho_{i+1}} 2\sqrt{R^2 - \rho^2}\, d\rho = \frac{\pi R^2}{N_\rho}. \tag{4}$$

Correspondingly, the vote in each accumulator follows the same Poisson distribution with density λAs. Here, Nρ is called the quantization parameter of ρ.
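Conditions (3) and (4) determine the boundaries ρi only implicitly. One way to obtain them, sketched below under assumptions of our own (bisection on the closed-form antiderivative of 2√(R² − ρ²); the function names are hypothetical and not from the paper), is to solve for the ρ at which the accumulated disc area reaches i·πR²/Nρ.

```python
import math

def area_left_of(rho, radius):
    """Area of the disc between the chord at -radius and the chord at rho.

    Uses the antiderivative of 2*sqrt(R^2 - rho^2):
    rho*sqrt(R^2 - rho^2) + R^2*asin(rho/R), shifted to vanish at rho = -R.
    """
    rho = max(-radius, min(radius, rho))
    return (rho * math.sqrt(radius**2 - rho**2)
            + radius**2 * math.asin(rho / radius)
            + math.pi * radius**2 / 2.0)

def equal_area_boundaries(radius, n_rho, tol=1e-10):
    """Return rho_0 < rho_1 < ... < rho_{n_rho} satisfying (3) and (4)."""
    total = math.pi * radius**2
    bounds = [-radius]
    for i in range(1, n_rho):
        target = i * total / n_rho
        lo, hi = -radius, radius
        while hi - lo > tol * radius:  # bisection: area_left_of is increasing
            mid = 0.5 * (lo + hi)
            if area_left_of(mid, radius) < target:
                lo = mid
            else:
                hi = mid
        bounds.append(0.5 * (lo + hi))
    bounds.append(radius)
    return bounds

bounds = equal_area_boundaries(radius=1.0, n_rho=4)
print([round(b, 4) for b in bounds])
```

The resulting intervals are widest near ρ = ±R, where the chords are short, and narrowest near ρ = 0, which is what makes every stripe cover the same area πR²/Nρ.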

Let hij denote the vote received by the accumulator (θ = θi, ρj ≤ ρ < ρj+1), i = 1, ..., Nθ; j = 0, ..., Nρ − 1. It is natural to establish a control chart to monitor the largest vote, (hij)max, among all the accumulators. When (hij)max exceeds a predefined threshold, i.e., control limit, an alarm will be raised to warn engineers of the potential presence of linear defect patterns. We call this chart an HT-based control chart. Clearly, for a given defect map with radius R and a background noise level λ, three parameters need to be decided: the upper control limit denoted as UCL, and parameters Nθ and Nρ. This will be discussed in the following sections.

B. Control Limit of the HT-Based Control Chart

This section deals with the following problem: given parameters (R, λ, Nθ, and Nρ), how can we specify the control limit for the HT-based chart at a predefined Type I error level (denoted by α)? A straightforward method is through Monte-Carlo simulations: simulate the voting process as described in the previous section for a large number of in-control defect maps (i.e., with only background noise), and then obtain the control limit by taking the 100(1 − α) percentile of the simulated values of the monitoring statistic. However, this method is computationally intensive, particularly considering that a different set of simulation runs is needed for each combination of Nθ and Nρ. In this section, we propose an approximation method which can avoid such simulations while providing a quite accurate control limit.

Fig. 3. Different degrees of correlation between accumulators.
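The Monte-Carlo route mentioned above can be sketched as follows. This is a minimal illustration under assumptions made here, not the simulation code behind the paper's results: the helper names are hypothetical, the Poisson sampler is Knuth's multiplication method (adequate only for moderate means), points are drawn uniformly in the disc by rejection sampling, and the returned limit is the plain empirical 100(1 − α) percentile without interpolation. The parameter values in the example are deliberately tiny so that it runs quickly.

```python
import bisect
import math
import random

def equal_area_bounds(radius, n_rho):
    """Boundaries rho_0..rho_{n_rho} giving equal-area stripes, as in (3)-(4)."""
    def area(rho):  # antiderivative of 2*sqrt(R^2 - rho^2)
        return rho * math.sqrt(radius**2 - rho**2) + radius**2 * math.asin(rho / radius)
    bounds = [-radius]
    for i in range(1, n_rho):
        target = -math.pi * radius**2 / 2 + i * math.pi * radius**2 / n_rho
        lo, hi = -radius, radius
        for _ in range(60):  # bisection
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if area(mid) < target else (lo, mid)
        bounds.append(0.5 * (lo + hi))
    return bounds + [radius]

def poisson(mean, rng):
    """Knuth's method; an assumption of this sketch, fine for moderate means."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def max_vote(points, bounds, n_theta):
    """Largest accumulator value (h_ij)_max for one defect map."""
    n_rho = len(bounds) - 1
    best = 0
    for i in range(n_theta):
        c, s = math.cos(i * math.pi / n_theta), math.sin(i * math.pi / n_theta)
        row = [0] * n_rho
        for x, y in points:
            j = bisect.bisect_right(bounds, x * c + y * s) - 1
            row[min(max(j, 0), n_rho - 1)] += 1
        best = max(best, max(row))
    return best

def simulate_ucl(radius, lam, n_theta, n_rho, alpha, n_maps, seed=0):
    """Empirical 100(1 - alpha) percentile of (h_ij)_max over in-control maps."""
    rng = random.Random(seed)
    bounds = equal_area_bounds(radius, n_rho)
    stats = []
    for _ in range(n_maps):
        n = poisson(lam * math.pi * radius**2, rng)
        pts = []
        while len(pts) < n:  # rejection sampling: uniform in the disc
            x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
            if x * x + y * y <= radius**2:
                pts.append((x, y))
        stats.append(max_vote(pts, bounds, n_theta))
    stats.sort()
    index = min(n_maps - 1, max(0, math.ceil((1 - alpha) * n_maps) - 1))
    return stats[index], stats

ucl, stats = simulate_ucl(radius=10, lam=0.05, n_theta=12, n_rho=8, alpha=0.05, n_maps=200)
print("simulated control limit:", ucl)
```

Even at this small scale, the cost grows with the number of maps, points, and θ samples, and the whole run has to be repeated for every (Nθ, Nρ) pair, which illustrates why a closed-form approximation is attractive.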

1) Approximate Distribution of the Monitoring Statistic: According to Section II-A, hij follows a Poisson distribution with density λAs when there is only background noise on the wafer, and thus, (hij)max is the maximum value of Nθ × Nρ identically distributed variables. If we assume that these variables are all independent of each other, then the cumulative distribution function of (hij)max can be expressed as

$$F_{\max}(x) = P[(h_{ij})_{\max} \le x] = P(\text{all } h_{ij} \le x) = \prod_{i=1}^{N_\theta} \prod_{j=0}^{N_\rho - 1} P(h_{ij} \le x) = \prod_{i=1}^{N_\theta} \prod_{j=0}^{N_\rho - 1} F(x) = [F(x)]^{N_\theta \times N_\rho} \tag{5}$$

where

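$$F(x) = \sum_{k=0}^{x} \frac{e^{-\lambda A_s} (\lambda A_s)^k}{k!} \tag{6}$$

is the cumulative distribution function of a Poisson distribution with density λAs. Plugging (6) into (5) yields

$$F_{\max}(x) = \left[ \sum_{k=0}^{x} \frac{e^{-\lambda A_s} (\lambda A_s)^k}{k!} \right]^{N_\theta \times N_\rho}. \tag{7}$$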

However, the hij's in the accumulator array are in fact correlated. This can be illustrated using the example in Fig. 3, where the four accumulators, A1, A2, A3, and A4, correspond to stripes S1, S2, S3, and S4, respectively. For example, h1 and h2 (the votes received by A1 and A2) are not independent of each other because S1 and S2 share a common area. From Fig. 3, we can summarize the following characteristics of the correlations among accumulators: 1) accumulators in the same column, i.e., with the same θ value, are independent, e.g., A1 and A4 (or S1 and S4); and 2) in general, closer accumulators have higher correlations, e.g., A1 and A3 (or S1 and S3), and the degree of correlation is determined by the ratio of the overlapped area over the stripe area.

Clearly, it is very difficult, if not impossible, to obtain the exact distribution of (hij)max by accounting for the complicated correlation structure of all the accumulators. Instead, we propose an approximate model based on the independent model given in (7)

$$F_{\max}(x) = \left[ \sum_{k=0}^{x} \frac{e^{-\lambda A_s} (\lambda A_s)^k}{k!} \right]^{N_\theta \times N_\rho \times K} \tag{8}$$

where K ∈ (0, 1] is an adjusting factor depending on the overall (average) degree of correlation among all the accumulators. A lower value of K represents a higher correlation among accumulators and K = 1 corresponds to the independent case. The idea is that the Nθ × Nρ correlated variables can be "equivalently" viewed as a smaller number, i.e., Nθ × Nρ × K, of i.i.d. variables when we want to find the maximum value of these variables.

Fig. 4. Correlation due to stripe overlapping.

To get some insight into the overall degree of correlation, the general case of two stripes' geometric relation is considered, as shown in Fig. 4. To be representative, assume that the two stripes are of the same length, L, the average length of all stripes, and the same width, Δρ = 2R/Nρ. The angle Δθ = π/Nθ is the smallest angle difference between two stripes. Consequently, the correlation can be described as

$$\text{correlation} = \frac{\text{overlapped area}}{\text{stripe area}} = \frac{\Delta\rho^2 / \sin\Delta\theta}{L\,\Delta\rho} = \frac{\Delta\rho / \sin\Delta\theta}{L}. \tag{9}$$

Recall that Δθ is the sampling resolution of θ, which should not be coarse in practice. Therefore, we limit it to Δθ ≤ π/36 = 5°, which results in the approximation sin Δθ ≈ Δθ. Moreover, the relationship between L and R is almost linear, i.e., L ≈ tR, where t is a constant. Hence, (9) can be further simplified as

$$\text{correlation} = \frac{\Delta\rho / \sin\Delta\theta}{L} \approx \frac{\Delta\rho / \Delta\theta}{tR} = \frac{(2R/N_\rho)/(\pi/N_\theta)}{tR} = \frac{2}{\pi t}\,\frac{N_\theta}{N_\rho} = C\,\frac{N_\theta}{N_\rho} \tag{10}$$

where C is a constant. Equation (10) suggests that K is related to the ratio Nθ/Nρ. Hence, we can estimate K as a function of Nθ/Nρ

K = f (Nθ/Nρ). (11)

Thus, we have

$$F_{\max}(x) = \left[ \sum_{k=0}^{x} \frac{e^{-\lambda A_s} (\lambda A_s)^k}{k!} \right]^{N_\theta \times N_\rho \times f(N_\theta/N_\rho)}. \tag{12}$$

2) Empirical Formula for Estimation of Control Limits: The function in (11) will be decided in this section. Let UCLα be the "true" control limit obtained through Monte-Carlo simulations with Type I error α, which can be expressed as a function

UCLα = g1(R, λ, Nθ, Nρ). (13)

Similarly, the estimated control limit, denoted as UCL′α, can be obtained based on (8) and expressed as

UCL′α = g2(R, λ, Nθ, Nρ, K). (14)

Fig. 5. Simulation results: Ln(K) against Nθ/Nρ at different α levels.

Consequently, K can be estimated by minimizing

| UCLα − UCL′α | . (15)

Through extensive simulations, we can obtain the relationship between the estimated K and Nθ/Nρ at different α levels, as shown in Fig. 5. In the simulations, R ∈ [50, 200], λ ∈ [0.0002, 0.01], Nθ ∈ [36, 180], Nρ ∈ [20, 200], and α = 0.01, 0.05, and 0.1. Clearly, Ln(K) has an approximately linear relationship with Nθ/Nρ. R and λ have little effect on K because they do not enter the correlation structure. Therefore, we can obtain the fitted results of K as

α = 0.01: Ln(K) = −0.091(Nθ/Nρ) − 0.002
α = 0.05: Ln(K) = −0.110(Nθ/Nρ) − 0.015
α = 0.10: Ln(K) = −0.123(Nθ/Nρ) − 0.038. (16)

At the 0.05 level, we further simplify it as

K = exp(−0.11Nθ/Nρ). (17)
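To make the fitted relations concrete, the short sketch below evaluates K from (16) and the "equivalent" number of i.i.d. variables Nθ × Nρ × K. The function and variable names are hypothetical, introduced only for illustration; the coefficients are copied from (16).

```python
import math

# (slope, intercept) of Ln(K) against N_theta/N_rho, copied from (16).
FITS = {0.01: (-0.091, -0.002), 0.05: (-0.110, -0.015), 0.10: (-0.123, -0.038)}

def adjusting_factor(n_theta, n_rho, alpha=0.05):
    """K from the fitted line in (16) for the given Type I error level."""
    slope, intercept = FITS[alpha]
    return math.exp(slope * n_theta / n_rho + intercept)

def effective_count(n_theta, n_rho, alpha=0.05):
    """Number of i.i.d. variables the correlated accumulators are treated as."""
    return n_theta * n_rho * adjusting_factor(n_theta, n_rho, alpha)

for n_theta, n_rho in [(180, 200), (180, 60), (180, 25)]:
    k = adjusting_factor(n_theta, n_rho)
    print(n_theta, n_rho, round(k, 3), round(effective_count(n_theta, n_rho)))
```

As the ratio Nθ/Nρ grows, K shrinks, so a finer θ sampling adds fewer "new" independent accumulators than its raw count suggests.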

Substituting (17) into (8) yields the approximate model

$$F_{\max}(x) = \left[ \sum_{k=0}^{x} \frac{e^{-\lambda A_s} (\lambda A_s)^k}{k!} \right]^{N_\theta \times N_\rho \times \exp(-0.11 N_\theta/N_\rho)}. \tag{18}$$

To show how good this approximation is, Table I compares UCLα and UCL′α at α = 0.05 for ten new sets of parameters. The estimated control limits from (18) track the simulated results closely, with absolute errors of at most 0.18. Moreover, compared with the i.i.d. model given in (7), whose absolute errors range from 0.23 to 0.86, the approximate model provides much better estimations, especially when Nθ/Nρ is large, i.e., the correlation is high. It should be noted that (hij)max follows a discrete distribution, and the values of the control limits in Table I are obtained through linear interpolation. For example, if P(x ≤ 6) = 0.91 and P(x ≤ 7) = 0.97 by (18), then UCL′α is approximately 6.67.

TABLE I
Comparison of Simulated and Estimated Control Limits

Parameter Set        1      2      3      4      5
Nθ/Nρ                1.8    2      2.4    3      3.6
Simulated UCLα       25.96  23.31  26.14  30.31  26.53
UCL′α from (18)      25.92  23.40  26.16  30.20  26.48
UCL′′α from (7)      26.19  23.67  26.55  30.70  26.92
|UCLα − UCL′α|       0.04   0.09   0.02   0.11   0.05
|UCLα − UCL′′α|      0.23   0.36   0.41   0.39   0.39

Parameter Set        6      7      8      9      10
Nθ/Nρ                4      4.5    5.1    6      6
Simulated UCLα       36.64  30.65  21.92  24.04  36.98
UCL′α from (18)      36.62  30.49  21.89  23.92  36.80
UCL′′α from (7)      37.31  31.13  22.57  24.74  37.84
|UCLα − UCL′α|       0.02   0.16   0.03   0.12   0.18
|UCLα − UCL′′α|      0.67   0.48   0.65   0.70   0.86

Fig. 6. Comparisons of estimated distributions. Histogram: simulated results. (a) * is calculated based on (7); (b) * is calculated based on (18), K = 0.47.

The strength of the approximate model can be better seen from Fig. 6, which shows a typical distribution of (hij)max. The parameters in this case are R = 100, λ = 0.01, Nθ = 180, and Nρ = 25. Clearly, the approximate model with an appropriate K value fits the simulated results better than the i.i.d. model, especially at the upper tail which is crucial for determining the control limit.

3) Modified Control Limit Due to Discretization: Although the Type I error of a control chart is specified as a fixed value, e.g., α = 0.05, this value can rarely be attained exactly since the underlying distribution of (hij)max is discrete. Therefore, directly using the estimated value UCL′α as the control limit may cause unexpected problems. For example, Fig. 7 shows the simulated distribution when R = 100, λ = 0.01, Nθ = 180, and Nρ = 200. In this distribution, P(x ≤ 9) = 82.46%, and P(x ≤ 10) = 97.26%. The estimated control limit from (18) is UCL′α = 9.85 (at α = 0.05). Using 9.85 as the control limit, the actual Type I error will be as high as

αactual = 1 − P(x ≤ 9.85) = 1 − P(x ≤ 9) ≈ 0.18. (19)
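The numbers in this example can be approximately reproduced from the model (18) itself. The sketch below is an illustration under assumptions made here (hypothetical function names; λAs is treated as the Poisson mean of each accumulator; the estimated limit is found by linear interpolation between adjacent integers, as described for Table I).

```python
import math

def poisson_cdf(x, mean):
    """P(X <= x) for X ~ Poisson(mean) and integer x, as in (6)."""
    if x < 0:
        return 0.0
    term = total = math.exp(-mean)
    for k in range(1, x + 1):
        term *= mean / k
        total += term
    return min(total, 1.0)

def fmax_cdf(x, radius, lam, n_theta, n_rho, slope=0.11):
    """Approximate CDF (18) of the largest vote (h_ij)_max."""
    mean = lam * math.pi * radius**2 / n_rho  # lambda * A_s
    exponent = n_theta * n_rho * math.exp(-slope * n_theta / n_rho)
    return poisson_cdf(x, mean) ** exponent

def estimated_ucl(radius, lam, n_theta, n_rho, alpha=0.05):
    """Interpolated limit: F(m) <= 1 - alpha < F(m + 1), then interpolate."""
    m = -1
    while fmax_cdf(m + 1, radius, lam, n_theta, n_rho) <= 1 - alpha:
        m += 1
    p1 = fmax_cdf(m, radius, lam, n_theta, n_rho)
    p2 = fmax_cdf(m + 1, radius, lam, n_theta, n_rho)
    return m + (1 - alpha - p1) / (p2 - p1)

ucl = estimated_ucl(100, 0.01, 180, 200)
alpha_actual = 1 - fmax_cdf(math.floor(ucl), 100, 0.01, 180, 200)
print("estimated limit:", ucl, "actual Type I error at that limit:", alpha_actual)
```

For R = 100, λ = 0.01, Nθ = 180, and Nρ = 200 the sketch gives an estimated limit close to 9.85 and a model-based Type I error of about 0.18 when that limit is used directly, in line with the values quoted above.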

To avoid such problems, we suggest using the followingrevised control limit:

UCLcc = [UCL′α] + 0.5 (20)

Fig. 7. Simulated distribution of (hij)max when R = 100, λ = 0.01, Nθ = 180, and Nρ = 200.

Fig. 8. Linear defect pattern on the wafer defect map.

where [x] denotes the integer that is closest to x, and if x is exactly in the middle of two integers, then either integer can be selected. By using (20), the actual Type I error αactual will be the closest possible Type I error to the given α. The benefit of selecting this control limit can be seen from the following lemma.

Lemma 1: Let X be a discrete random variable and α ∈ (0, 1). Assume m is a positive integer such that P1 = P(X ≤ m) ≤ 1 − α and P2 = P(X ≤ m + 1) > 1 − α. Define C = m + (1 − α − P1)/(P2 − P1) and CL = [C] + 0.5. Then, | 1 − α − P(X ≤ CL) | = min(1 − α − P1, P2 − (1 − α)) and | 1 − P(X ≤ CL) | ≤ 2α.

The proof of this lemma is given in Appendix I. In our case, we let (hij)max be the discrete random variable X and then from this lemma, we can see that CL provides a control limit whose actual Type I error is the closest to the specified value α. The value of CL is determined by the value of C in the lemma. In practice, the true value of C for (hij)max is unknown because the true distribution of (hij)max is unknown. However, as we discussed in the previous section, UCL′α is a very good approximation to C (which we called UCLα in previous sections) and thus, we can simply substitute C with UCL′α. Hence, we propose to use (20) as the final control limit. Using this control limit, αactual in the example in Fig. 7 becomes

αactual = 1 − P(x ≤ [9.85] + 0.5) = 1 − P(x ≤ 10.5) ≈ 0.0274 (21)

which is much closer to 0.05 than 0.18.
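The rounding rule (20) and the statement of Lemma 1 can be checked numerically with the two probabilities quoted for Fig. 7. The sketch below is illustrative only; `revised_limit` is a hypothetical helper, X is assumed to be integer-valued (as vote counts are), and ties in [C] are rounded upward, which the text permits.

```python
import math

def revised_limit(p1, p2, m, alpha):
    """Apply Lemma 1 with P1 = P(X <= m) <= 1 - alpha < P2 = P(X <= m + 1).

    Returns the interpolated limit C, the revised limit CL = [C] + 0.5,
    and the actual Type I error 1 - P(X <= CL).
    """
    c = m + (1 - alpha - p1) / (p2 - p1)
    cl = math.floor(c + 0.5) + 0.5      # [C] + 0.5, ties rounded up
    p_at_cl = p1 if cl < m + 1 else p2  # X integer-valued: CL is m+0.5 or m+1.5
    return c, cl, 1 - p_at_cl

# Simulated probabilities quoted for Fig. 7: P(x <= 9) and P(x <= 10).
p1, p2, alpha = 0.8246, 0.9726, 0.05
c, cl, alpha_actual = revised_limit(p1, p2, m=9, alpha=alpha)
print("C:", c, "CL:", cl, "actual Type I error:", alpha_actual)

# Lemma 1: the attained level is the closer of the two achievable levels.
closest_gap = min(1 - alpha - p1, p2 - (1 - alpha))
```

With these inputs the revised limit is 10.5 and the actual Type I error is 1 − 0.9726 = 0.0274, matching (21); the gap to the nominal level equals min(1 − α − P1, P2 − (1 − α)), as the lemma states.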

C. Selection of Nθ and Nρ for the HT-Based Control Chart

To decide the parameters Nθ and Nρ, we need to consider the Type II error of the control chart defined as β = P(no line detected | there is at least one line). For simplicity, we assume that there is at most one line on the wafer, and the profile of collinear points is rectangular, as shown in Fig. 8.

The following parameters are needed to uniquely define such a line as in Fig. 8:

1) point density of the line λl, λl > λ;
2) length of the line L;
3) width of the line d;
4) line direction γ with respect to x-axis;
5) location of the line center C(x, y).

Consequently

β = f (R, λ, Nθ, Nρ, λl, L, d, γ, C). (22)
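For simulation studies of the Type II error, a line pattern with these five parameters has to be generated. The sketch below shows one possible way to do so and rests on assumptions made here rather than stated in the text: the number of points in the L × d rectangle is drawn as Poisson with mean λl·L·d, the points are uniform within the rectangle, and points are not clipped to the wafer boundary. The function names and the example center are hypothetical.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for moderate means."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def line_pattern_points(lam_l, length, width, gamma, center, rng):
    """Sample defect points inside a rotated rectangle (the line profile).

    lam_l: point density of the line; length, width: L and d;
    gamma: direction with respect to the x-axis; center: (x, y) of C.
    """
    cx, cy = center
    c, s = math.cos(gamma), math.sin(gamma)
    points = []
    for _ in range(poisson(lam_l * length * width, rng)):
        u = rng.uniform(-length / 2, length / 2)  # position along the line
        v = rng.uniform(-width / 2, width / 2)    # position across the line
        points.append((cx + u * c - v * s, cy + u * s + v * c))
    return points

rng = random.Random(1)
pts = line_pattern_points(lam_l=0.03, length=100, width=5,
                          gamma=math.pi / 6, center=(10.0, 20.0), rng=rng)
print(len(pts), "points generated for the line pattern")
```

Superimposing such points on background noise of density λ gives an out-of-control defect map; repeating this many times and counting how often (hij)max stays below the control limit yields an estimate of β.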

Influences of R, λ, λl, L, d, γ and C on β are as follows.

1) The radius of a defect map is not truly a parameter because we can always normalize it as "1" in our analysis, e.g., a map with R = 100 and λ = 0.01 is essentially the same as a map with R = 1 and λ = 100.

2) Larger ratio λl/λ or larger values of L and d will lead to more easily detectable patterns and thus, result in smaller Type II errors.

3) Because of isotropy, γ and C will not have significant influence on Type II error, but may cause minor fluctuations due to discretization.
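The normalization in item 1) can be checked with one line of arithmetic. Assuming the background noise is spatially homogeneous, so that the expected number of defects on the map is λπR², both parameterizations give the same expected count, and therefore the same expected vote λAs = λπR²/Nρ per stripe for any given Nρ:

$$\lambda \pi R^2 = 0.01 \cdot \pi \cdot 100^2 = 100\pi = 100 \cdot \pi \cdot 1^2 \approx 314.2.$$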

The influences of Nθ and Nρ, however, are more complicated. As parameters of the HT, they can be arbitrarily chosen to achieve different performances of detection, leading to different Type II errors even for the same defect pattern. Essentially, the other seven parameters in (22) are determined once the wafer is given, and can thus be categorized as uncontrollable parameters, while Nθ and Nρ can be categorized as controllable parameters. How these parameters influence the Type II error will be discussed in the following.

1) Impact of Nθ and Nρ on Type II Errors: Extensive simulation results with various sets of parameters suggest that Nθ and Nρ have a very large influence on the Type II error. As a typical case, Fig. 9(a) shows their impact when R = 100, λ = 0.001, λl = 0.03, L = 100, d = 5, γ = π/6, C(130, 130) and Type I error α = 0.05. The Type II error varies widely (from 2% to 20% in this case) with different values of Nθ and Nρ. In the figure, each curve rises and then drops along the horizontal axis, and small Type II errors can always be found right after a curve drops. Taking advantage of this property, proper combinations of Nθ and Nρ can be identified to obtain small Type II errors.

Fig. 9(b) shows the corresponding control limits defined in (20). For example, when Nθ = 90 and Nρ = 110, from Fig. 9(b) we know that the control chart uses a control limit UCLcc = 5.5, and from Fig. 9(a) we know that its corresponding Type II error is approximately 0.16. A careful comparison of these two figures reveals that a Type II error curve in Fig. 9(a) always drops when its corresponding control limit curve in Fig. 9(b) drops. For example, when Nθ = 90 (line with triangles), both curves drop near Nρ = 60 and Nρ = 120. Extensive simulations show that this phenomenon is not merely a coincidence in this particular example, but common among all parameter sets.

Fig. 9. Illustration of Type II error depending on Nθ and Nρ (α = 0.05). (a) Type II error with respect to Nθ and Nρ. (b) Corresponding UCLcc with respect to Nθ and Nρ.

Fig. 10. Illustration of control limit change due to increase of Nρ. (a) Case 1: Nρ = n. (b) Case 2: Nρ = n + 1.

To understand this phenomenon, we need to take a close look at what happens when UCLcc drops. Fig. 10 shows the distributions of (hij)max of a wafer having a line pattern in two cases with Nρ = n and Nρ = n + 1 (all other parameters being the same). Assume that UCLcc drops from m + 0.5 to m − 0.5 when Nρ increases from n to n + 1, where both m and n are integers, and n should not be small, e.g., n > 20. The solid lines in Fig. 10 indicate the control limits used. Because the only difference between the two cases is that Nρ has a slight increase of 1, the two distributions of (hij)max will be almost identical, as can be seen clearly from Fig. 10. In case (a), the Type II error is β1 = P(X ≤ m + 0.5), and in case (b), β2 = P(X ≤ m − 0.5). Hence β2 is smaller than β1 by the amount P(X = m). Therefore, we can conclude that the sudden drops of the Type II error curves are caused by the loss of the portion P(X = m) in the Type II error when UCLcc drops from m + 0.5 to m − 0.5.
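The size of the drop can be checked numerically. The following Python sketch uses a made-up probability mass function for (hij)max under the alternative (the numbers are purely illustrative, not taken from the paper) and confirms that lowering the limit from m + 0.5 to m − 0.5 removes exactly P(X = m) from the Type II error.

```python
import numpy as np

def type2_error(pmf, ucl):
    """P(X <= ucl) for a discrete statistic with pmf[k] = P(X = k)."""
    k = np.arange(len(pmf))
    return float(pmf[k <= ucl].sum())

# Hypothetical alternative distribution of the maximum vote count.
pmf = np.array([0.0, 0.01, 0.04, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05])
m = 4

beta1 = type2_error(pmf, m + 0.5)   # limit before the drop
beta2 = type2_error(pmf, m - 0.5)   # limit after the drop
drop = beta1 - beta2                # equals pmf[m] up to rounding
```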

Based on this intuitive understanding, we can obtain the following insights.

1) UCLcc will keep dropping as Nρ increases. As Nρ increases, the area of each stripe decreases and so does the expected number of points falling within each stripe, i.e., E(hij). Therefore, the null (without lines) distribution of (hij)max will keep moving toward the left, and thus, UCLcc will become smaller.

2) The Type II error will keep rising with Nρ between two dropping points of UCLcc. UCLcc remains constant between two dropping points, while the alternative (with a line) distribution of (hij)max keeps moving toward the left as Nρ increases, causing larger Type II errors.

3) The Type I error can also change when UCLcc drops because the distribution of the monitoring statistic is discrete. In practice, we should keep the Type I error constant when we try to reduce the Type II error by adjusting the system parameters. But clearly in this case, the selection of Nθ and Nρ will influence both the Type I and Type II errors.

Fortunately, a close examination of the change pattern in the Type I error reveals the following fact: when UCLcc drops, the actual Type I errors before and after the dropping point deviate by approximately the same amount from the specified α level, and their difference is not larger than 2α. A brief description of the proof of this result is given in Appendix II. This result provides a bound on the change in Type I error and leads to the following guidelines for selecting Nθ and Nρ.

2) Guidelines on the Selection of Nθ and Nρ: According to the above analysis, given the radius of defect maps, R, and the background noise level, λ, the key to implementing the HT-based control chart is choosing proper values of the two parameters, Nθ and Nρ. This can be done following the guidelines below.

1) Based on (18) and (20), calculate UCLcc at the specified α level for different combinations of Nθ and Nρ, and plot them on a figure like Fig. 9(b).

2) Choose a proper combination of Nθ and Nρ. This can be done by identifying the dropping points of the UCLcc curves. The combination can be chosen at a point just after the curve drops and far from its next dropping point, to avoid a high Type II error. Other useful information, such as the observed width of the lines we intend to detect, can also be taken into consideration when selecting Nθ and Nρ.

After Nθ and Nρ are chosen, their corresponding UCLcc can be used as the control limit, and the actual Type I error can be estimated based on (18).
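Guideline 2) can be sketched in a few lines of Python. The helper below is hypothetical: it takes a precomputed UCLcc curve for a fixed Nθ and returns the Nρ values at which the limit has just dropped. The curve in the usage example is invented for illustration; in practice it would come from (18) and (20).

```python
import numpy as np

def candidates_after_drops(n_rho, ucl):
    """Return the N_rho values at which the control limit has just dropped."""
    n_rho = np.asarray(n_rho)
    ucl = np.asarray(ucl, dtype=float)
    dropped = np.diff(ucl) < 0      # True where ucl[i + 1] < ucl[i]
    return n_rho[1:][dropped]

# Hypothetical UCL curve over N_rho = 50, ..., 57 (illustrative values only).
n_rho = np.arange(50, 58)
ucl = [21.5, 21.5, 20.5, 20.5, 20.5, 20.5, 20.5, 19.5]
good = candidates_after_drops(n_rho, ucl)
```

Among the returned candidates, those far from the next dropping point and with a stripe width comparable to the expected line width would be preferred, as described in the guideline.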

III. HT-Based Control Chart Design for Circular Pattern Detection

In this section, we briefly present the extension of the proposed methodology for detecting circular patterns, including circles (rings) and arcs.

A. Basics of the Circular Hough Transform

To detect circular patterns, the circular Hough transform (CHT) will be used [33], [37]. A circle can be parameterized by its center O(a, b) and radius r as

$$(x - a)^2 + (y - b)^2 = r^2. \quad (23)$$

The voting process for circular pattern detection on the defect map is illustrated in Fig. 11. The accumulator $(O_i,\ r_j < r \le r_{j+1})$ in Fig. 11(a) uniquely corresponds to an annulus in Fig. 11(b) with Oi being the center, and rj and rj+1 being the inner and outer radii, respectively. In other words, the number of votes, hij, for this accumulator equals the number of defects falling within such an annulus. With the same assumption that background noise is Poisson distributed, hij is Poisson distributed with mean Acλ, where Ac is the area of the annulus, i.e., $A_c = \pi (r_{j+1}^2 - r_j^2)$.

Fig. 11. Circle detection by HT: (a) accumulator array; (b) annulus on the wafer.

Fig. 12. Quantization of circle centers.

Fig. 13. Imaginary extension for the wafer.
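The annulus-counting view of the CHT can be written down directly. The Python sketch below is one possible implementation under stated assumptions, not the authors' code: the function name is hypothetical, and the candidate centers and radius edges are supplied by the caller. Entry h[i, j] counts the defects whose distance to center Oi lies in (rj, rj+1].

```python
import numpy as np

def circular_hough_votes(points, centers, r_edges):
    """Accumulator h[i, j]: points with r_edges[j] < dist(O_i) <= r_edges[j + 1]."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    r_edges = np.asarray(r_edges, dtype=float)
    n_bins = len(r_edges) - 1
    votes = np.zeros((len(centers), n_bins), dtype=int)
    for i, (a, b) in enumerate(centers):
        dist = np.hypot(points[:, 0] - a, points[:, 1] - b)
        # side="left" returns idx with r_edges[idx - 1] < dist <= r_edges[idx].
        j = np.searchsorted(r_edges, dist, side="left") - 1
        inside = (j >= 0) & (j < n_bins)
        votes[i] = np.bincount(j[inside], minlength=n_bins)
    return votes

# Four points on a circle of radius 5 around the origin.
pts = [(5.0, 0.0), (0.0, 5.0), (-5.0, 0.0), (3.0, 4.0)]
h = circular_hough_votes(pts, centers=[(0.0, 0.0), (10.0, 0.0)], r_edges=[4.0, 5.0, 6.0])
```

The monitoring statistic for the circular chart is then the largest entry of the accumulator, `h.max()`.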

Similar to linear pattern detection, sampling of the center O and quantization of the radius r are needed. Without loss of generality, here we limit our interest to circular patterns with: 1) RL < r ≤ RH, 0 < RL < RH; and 2) O within the wafer, i.e., $a^2 + b^2 \le R^2$. The range of r to consider, i.e., (RL, RH], is divided into Nr intervals by $r_0 < r_1 < \dots < r_{N_r}$, where $r_0 = R_L$ and $r_{N_r} = R_H$. The area of every annulus is the same, namely $A_c = \pi (R_H^2 - R_L^2)/N_r$. Na and Nb are the quantization parameters along the a and b axes for the center O, and we let Na = Nb for simplicity. This gives a sampling resolution of Δ = 2R/Na, as shown in Fig. 12. The resulting centers are $O_1, O_2, \dots, O_{N_O}$, where NO is the total number of sampled points. When R is large compared to Δ, $N_O \approx \pi R^2/\Delta^2 = \pi N_a^2/4$.

One problem arises when implementing the above scheme: the annulus might be partially outside the wafer, as shown in Fig. 13, where the black solid circle represents the wafer. A simple solution is to add an imaginary extension to the wafer so that all the annuli of interest are encompassed. Under the two assumptions, i.e., RL < r ≤ RH and O within the defect map, the radius of the extended edge (the gray circle in Fig. 13) is R + RH. Background noise with density λ is added to the extended region, so that pattern detection is conducted within the gray circle as if we had a “wafer” with a radius of R + RH. A second implementation problem is that if a small value of r is allowed, the width of the corresponding annulus must be large in order to keep its area the same as that of the annuli with larger r. In this situation, a small cluster of points can be mistakenly included in the annulus and identified as a circle. To avoid this problem, we make the lower bound of the radius sufficiently large. In particular, in this paper we set RL = R/2. In semiconductor manufacturing, most circular patterns have large radii (e.g., the radius of an edge ring is close to R). Therefore, this assumption does not limit the applicability of the proposed method.
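The quantization and the imaginary extension can be sketched as follows. This Python fragment is illustrative only. The equal-area radius edges follow from requiring a constant difference of squared radii. Placing the candidate centers at grid-cell midpoints is an assumption, since the text fixes only the spacing Δ = 2R/Na. All function names are hypothetical.

```python
import numpy as np

def equal_area_radius_edges(r_low, r_high, n_r):
    """Edges r_0 < ... < r_{n_r} such that every annulus has the same area."""
    j = np.arange(n_r + 1)
    return np.sqrt(r_low**2 + j * (r_high**2 - r_low**2) / n_r)

def sampled_centers(R, n_a):
    """Candidate centers on a grid with spacing 2R/n_a, kept if inside the wafer."""
    delta = 2.0 * R / n_a
    g = -R + delta * (np.arange(n_a) + 0.5)   # cell midpoints (assumed placement)
    a, b = np.meshgrid(g, g)
    keep = a**2 + b**2 <= R**2
    return np.column_stack((a[keep], b[keep]))

def add_extension_noise(points, R, r_high, lam, rng):
    """Append Poisson background noise in the ring between R and R + r_high."""
    outer = R + r_high
    n = rng.poisson(lam * np.pi * (outer**2 - R**2))
    r = np.sqrt(rng.uniform(R**2, outer**2, size=n))   # uniform over the ring area
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)
    extra = np.column_stack((r * np.cos(phi), r * np.sin(phi)))
    return np.vstack((np.asarray(points, dtype=float).reshape(-1, 2), extra))

edges = equal_area_radius_edges(50.0, 100.0, 50)
centers = sampled_centers(100.0, 100)
```

With equal areas, the innermost annulus is wider than the outermost one (`np.diff(edges)[0] > np.diff(edges)[-1]`), which is the effect that motivates a large lower bound RL.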

B. Distribution of (hij)max in Circular Pattern Detection

Under the CHT defined in the previous section, the control limit in circular pattern detection is studied. Similar to line detection, an approximate distribution of the form (8) is assumed for (hij)max among the accumulators in Fig. 11(a), and the adjusting factor, K, is determined through simulations. Not surprisingly, the relationship between K and the two design parameters, Nr and Na, depends on the range of circular patterns of interest. When RL = R/2 and RH = R, K can be approximated by

$$\ln(K) = -0.000047\, N_a^2 / N_r - 0.2794 \quad (24)$$

at α = 0.05. Consequently, the approximate distribution model of (hij)max is

$$F_{\max}(x) = \left[ \sum_{k=0}^{x} \frac{e^{-\lambda A_c} (\lambda A_c)^k}{k!} \right]^{\frac{\pi}{4} N_a^2 N_r \exp\left(-0.000047\, N_a^2 / N_r - 0.2794\right)}. \quad (25)$$

This result shows that $N_a^2$ and Nr play similar roles in circular pattern detection as Nθ and Nρ do in linear pattern detection. We can also see that when $N_a^2/N_r$ is not very large, K can be approximated by the constant exp(−0.2794). The intuition is that in this case the overlap among annuli is relatively small, and thus, the correlation among accumulators is not sensitive to the design parameters.
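Equation (25) is straightforward to evaluate numerically. The sketch below uses SciPy's Poisson survival function and works in log space, because the exponent is large. The control limit rule, CL = [m + Δm] + 0.5 with Δm obtained by linear interpolation between P1 = F(m) and P2 = F(m + 1), is taken from the lemma in Appendix I; the bracketing condition P1 ≤ 1 − α < P2 is an assumption here, since (18) and (20) themselves lie outside this section. Function names are hypothetical, and this sketch is not guaranteed to reproduce the control limits reported in the case study.

```python
import numpy as np
from scipy.stats import poisson

def fmax_circular(x, lam, n_a, n_r, r_low, r_high):
    """Approximate CDF of the maximum vote count, following (24) and (25)."""
    area = np.pi * (r_high**2 - r_low**2) / n_r           # A_c
    log_k = -0.000047 * n_a**2 / n_r - 0.2794             # (24): R_L = R/2, R_H = R, alpha = 0.05
    exponent = np.pi / 4.0 * n_a**2 * n_r * np.exp(log_k)
    # log F = log(1 - sf); log1p keeps precision when sf is tiny.
    return float(np.exp(exponent * np.log1p(-poisson.sf(x, lam * area))))

def control_limit(cdf, alpha):
    """Half-integer control limit from the rounding rule used in Appendix I."""
    p_alpha = 1.0 - alpha
    if cdf(0) > p_alpha:
        return 0.5                                        # degenerate case (assumed handling)
    m = 0
    while cdf(m + 1) <= p_alpha:                          # largest m with F(m) <= 1 - alpha
        m += 1
    p1, p2 = cdf(m), cdf(m + 1)
    delta_m = (p_alpha - p1) / (p2 - p1)
    return m + 0.5 if delta_m <= 0.5 else m + 1.5

cdf = lambda x: fmax_circular(x, lam=0.0114, n_a=100, n_r=50, r_low=50.0, r_high=100.0)
ucl = control_limit(cdf, alpha=0.05)
actual_type1 = 1.0 - cdf(int(ucl - 0.5))
```

By the lemma, `actual_type1` should differ from the specified α by at most α.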

We also studied, through extensive simulations, the Type II error when using the control limits determined by (25) to detect circular patterns. Note that in implementing the control chart, an extended region as shown in Fig. 13 needs to be artificially added to each defect map. The simulation results exhibit characteristics similar to those in linear pattern detection, rising and dropping along Nr. Small Type II errors can be achieved by carefully choosing the combination of Na and Nr.

Fig. 14. Control limit UCLcc versus Nθ and Nρ .

Fig. 15. Estimated Type I error versus Nθ and Nρ .

IV. Case Study

In this section, we use a case study to validate the effectiveness of the proposed methodology. The defect maps are generated through simulations with the basic parameters extracted from real surface defect maps described in [24]. This is a common way to validate such methods, e.g., [37].

A. Control Charts for Detecting Linear and Circular Patterns

1) Detection of Linear Patterns: Let us first consider the use of the HT-based control chart for detecting linear patterns. In this example, R = 100 and λ = 0.0114. The linear pattern has parameters λl = 0.2444, L = 25, and d = 4. Two sets of defect maps are generated: one with only background noise, and the other with both background noise and a linear defect pattern. Each set has 5000 defect maps.

A proper range is assigned to Nρ, i.e., 50 ≤ Nρ ≤ 100, and the angle resolution is Δθ = 5°, i.e., Nθ = 180°/5° = 36. Together with parameters R, λ, and the specified α = 0.05, UCLcc can be calculated and the corresponding Type I errors can be estimated based on (18). The results are shown in Fig. 14 and Fig. 15. The choice of Nθ is based on the desired accuracy of how well a line can be matched in terms of its angle. Here, we choose Nθ = 180, i.e., Δθ = 1°. Therefore, based on Fig. 14 and the method described in Section II, proper Nρ values can be identified. In this example, potentially good choices are Nρ = 52, 57, 63, 69, 77, 87, and 100. When a combination of Nρ and Nθ has been selected, the corresponding Type I error can be identified from Fig. 15. To choose among these combinations, other information can be taken into consideration. For example, Nρ should not be too small (quantization is too coarse) or too large (unnecessary computational load). Our experience suggests that a rule of thumb in the selection of Nρ is to make the width of the quantized strip on the map comparable with the width of the line we intend to detect.

TABLE II
Results Based on Selected Parameters (Nθ = 180)

Nρ             52      57      63      69      78      87      100
Type I error   0.0502  0.0576  0.0594  0.0684  0.0604  0.073   0.0658
Type II error  0.0466  0.045   0.0528  0.0568  0.0734  0.076   0.10

Nρ             51      56      62      68      76      86      99
Type I error   0.0272  0.0216  0.021   0.0244  0.024   0.0208  0.0146
Type II error  0.08    0.0836  0.1046  0.1022  0.1306  0.1644  0.2132

Fig. 16. HT-based control chart for linear pattern detection (Nθ = 180, Nρ = 53).

Fig. 17. Corresponding defect maps of points I and II in Fig. 16.

To show the advantages of these chosen parameters, simulations have been conducted based on the generated defect maps, with UCLcc used as the control limit. For comparison, bad choices of Nρ, e.g., 51, 56, 62, 68, 76, 86, and 99, are also tested in the simulation. The results are given in Table II. As can be seen clearly from the table, the error rates vary considerably under different choices of Nρ, and the selected values provide significantly smaller Type II errors.

To construct the control chart, we should first select one of the suggested parameter combinations, e.g., Nθ = 180 and Nρ = 52. According to Fig. 14, the upper control limit is 20.5 in this case. The HT is then applied to each defect map. From the resulting accumulator array, we obtain the highest number of votes and mark it on the control chart. Fig. 16 shows the chart with 12 monitored statistics (showing two out-of-control points) and Fig. 17 shows their corresponding defect maps.
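The charting step can be sketched in Python as below. The linear HT is defined in Section II, which is outside this section, so the quantization used here is an assumption: the normal form ρ = x cos θ + y sin θ of Duda and Hart [33], with θ sampled at Nθ equally spaced angles in [0, π) and ρ divided into Nρ equal-width bins over [−R, R], for a wafer centered at the origin. The function name is hypothetical.

```python
import numpy as np

def linear_hough_max(points, R, n_theta, n_rho):
    """Largest vote count in the (theta, rho) accumulator for one defect map.

    `points` is an (n, 2) array of defect coordinates.
    """
    points = np.asarray(points, dtype=float)
    thetas = np.pi * np.arange(n_theta) / n_theta
    # rho[i, k] = x_k cos(theta_i) + y_k sin(theta_i), which lies in [-R, R].
    rho = np.outer(np.cos(thetas), points[:, 0]) + np.outer(np.sin(thetas), points[:, 1])
    bins = np.floor((rho + R) / (2.0 * R) * n_rho).astype(int)
    bins = np.clip(bins, 0, n_rho - 1)
    votes = np.zeros((n_theta, n_rho), dtype=int)
    for i in range(n_theta):
        votes[i] = np.bincount(bins[i], minlength=n_rho)
    return int(votes.max())

def chart_points(defect_maps, R, n_theta, n_rho, ucl):
    """Monitoring statistics and out-of-control flags for a sequence of maps."""
    stats = np.array([linear_hough_max(m, R, n_theta, n_rho) for m in defect_maps])
    return stats, stats > ucl

# Four nearly collinear points plus one outlier, on a coarse illustrative grid.
demo = np.array([(-10.0, 3.0), (0.0, 3.0), (10.0, 3.0), (20.0, 3.0), (0.0, 50.0)])
stats, alarms = chart_points([demo], R=100.0, n_theta=4, n_rho=20, ucl=3.5)
```

For the chart described above one would call `chart_points(maps, R=100, n_theta=180, n_rho=52, ucl=20.5)`, with `maps` being the list of defect maps.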

2) Detection of Circular Patterns: To define a circular pattern on the wafer, four parameters are needed:

1) point density of the circle λc (λc > λ);
2) center of the circle C(x, y);
3) inner radius rin;
4) outer radius rout;

where rin < rout. In the simulation, we consider a case with λc = 0.2444, C(100, 0), rin = 51, and rout = 51.8. Following similar steps, we have found an appropriate choice of design parameters to be Na = 100 and Nr = 50. With RL = R/2 and RH = R, the approximated control limit is 24.5. Fig. 18 shows the control chart with 12 samples and the corresponding defect map of the identified out-of-control point.

Fig. 18. Circular pattern detection (Na = 100, Nr = 50). (a) HT-based control chart. (b) Corresponding defect map of the out-of-control point.

TABLE III
Alarming Rates of the Two Detectors

                 None    1 Line   1 Circle  1 Line + 1 Circle
Line detector    1.46%   99.52%   5.92%     99.58%
Circle detector  2.72%   12.08%   95.02%    95.58%

B. Simultaneous Detection of Linear and Circular Patterns

In practice, linear patterns and circular patterns could coexist on a wafer. Thus, the HT-based control chart for linear patterns and that for circular patterns should be applied simultaneously. It needs to be pointed out that if multiple patterns of the same type, e.g., three lines, two circles, etc., are present, the most “obvious” pattern will be identified because the monitoring focuses on the accumulator with the highest votes in the parametric space.

To show the performance of the two detectors together, four sets of defect maps are generated: the first set with only background noise λ = 0.0114, the second set with one line, the third set with one circle, and the last set with both a line and a circle. The parameters of the line are: λl = 0.1629, L = 150, and d = 1; the parameters of the circle are: λc = 0.07, C(0, 0), rin = 99.5, and rout = 100. Each set contains 5000 defect maps. The linear detector and circular detector are applied to each set with design parameters Nθ = 180, Nρ = 200, and Na = 100, Nr = 75, respectively. The alarming rates (percentage of cases where the detector alarms) are listed in Table III. The results suggest that the two detectors are very powerful in detecting the corresponding patterns, and quite robust to other types of patterns. Nonetheless, in general, the performance of the two detectors is a complex function of the characteristics of the patterns on the wafer.
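An alarming rate of this kind is simply the fraction of monitoring statistics above the control limit. The short Python helper below is a hypothetical sketch of that computation. Applied to a noise-only set it estimates the Type I error, and on a set containing the target pattern one minus the rate estimates the Type II error. For instance, the line detector's entries in Table III correspond to an estimated Type I error of 1.46% and an estimated Type II error of 100% − 99.52% = 0.48%.

```python
import numpy as np

def alarm_rate(statistics, ucl):
    """Fraction of defect maps whose monitoring statistic exceeds the limit."""
    return float(np.mean(np.asarray(statistics) > ucl))

# Illustrative statistics only; real values come from applying the HT to each map.
rate = alarm_rate([3, 5, 7, 9], ucl=6.5)
```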

V. Conclusion and Discussion

In this paper, we proposed an HT-based control chart to statistically detect linear and circular patterns on a defect map. Using this method, the largest number in the accumulator array of the HT was monitored. An approximate distribution model for the monitoring statistic was developed and used to identify the control limits. The impact of two important parameters of the HT, Nθ and Nρ (Na and Nr for the CHT), on the Type I and Type II errors was also studied, and guidelines on how to choose their values were provided based on our findings. The effectiveness of the proposed technique was validated using a case study. In Section III, we extended the proposed method to detect circular patterns based on the CHT.

The proposed methodology can also be extended to other spatial patterns in a similar way. However, as the pattern gets more complex, it becomes increasingly hard to use the conventional HT method because of the high dimensionality of the parameters. A more promising way to extend this method for detecting various commonly observed patterns is to analyze “Hough signatures” of different patterns in the Hough matrix of the linear HT. The idea is that a particular type of defect pattern, after applying the HT, corresponds to a unique “signature” in the Hough matrix, and thus, the identification of such patterns can be realized through analysis of its signature pattern [38]. However, more statistical analysis of the properties of these signatures is needed to provide rigorous results of pattern detection, which will be studied in the future.

Acknowledgment

The authors gratefully thank the editors and referees for their valuable comments and suggestions.

Appendix I

Proof of Lemma

Let Pα = 1 − α and define Δm as (Pα − P1)/(P2 − P1). We have

$$\Delta m = \frac{P_\alpha - P_1}{P_2 - P_1} = \frac{P_\alpha - P_1}{(P_2 - P_\alpha) + (P_\alpha - P_1)} = \frac{1}{(P_2 - P_\alpha)/(P_\alpha - P_1) + 1}. \quad (A1)$$

If Pα − P1 ≤ P2 − Pα, then Δm ≤ 0.5. Consequently

$$|P_\alpha - P(X \le CL)| = P_\alpha - P(X \le m) = P_\alpha - P_1. \quad (A2)$$

If Pα − P1 > P2 − Pα, then Δm > 0.5, CL = [m + Δm] + 0.5 = m + 1.5, and

$$|P_\alpha - P(X \le CL)| = P(X \le m + 1) - P_\alpha = P_2 - P_\alpha. \quad (A3)$$

Therefore

$$|(1 - \alpha) - P(X \le CL)| = \min\bigl(1 - \alpha - P_1,\ P_2 - (1 - \alpha)\bigr). \quad (A4)$$

Furthermore, from 1 ≥ P2 > Pα we have

$$|P_\alpha - P_2| \le \alpha. \quad (A5)$$

Based on (A4) and (A5), we obtain

$$|1 - P(X \le CL) - \alpha| \le \alpha \quad (A6)$$

which means that the actual Type I error, 1 − P(X ≤ CL), is not more than 2α.
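As a quick numerical illustration of the lemma, take the hypothetical values P1 = 0.93, P2 = 0.98, and α = 0.05 (chosen for arithmetic convenience, not taken from the paper):

```latex
\begin{align*}
P_\alpha &= 1 - \alpha = 0.95, \\
\Delta m &= \frac{P_\alpha - P_1}{P_2 - P_1} = \frac{0.95 - 0.93}{0.98 - 0.93} = 0.4 \le 0.5
  \;\Rightarrow\; CL = m + 0.5, \\
|P_\alpha - P(X \le CL)| &= P_\alpha - P_1 = 0.02
  = \min(0.95 - 0.93,\ 0.98 - 0.95) \le \alpha, \\
1 - P(X \le CL) &= 1 - 0.93 = 0.07 \le 2\alpha = 0.10 .
\end{align*}
```

The actual Type I error (0.07) deviates from the nominal level by 0.02, the smaller of the two candidate gaps, and stays below 2α.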

Appendix II

Proof of the Result

Without loss of generality, we can take the two cases in Fig. 10 as an example. As mentioned previously, their null distributions are almost the same. In case (a), the actual Type I error is

$$\alpha_{\mathrm{actual1}} = 1 - P(X \le m + 0.5) = 1 - P(X \le m) \quad (A7)$$

while in case (b)

$$\alpha_{\mathrm{actual2}} = 1 - P(X \le m - 0.5) = 1 - P(X \le m - 1). \quad (A8)$$

Assume $\alpha_{\mathrm{actual1}} = \alpha - \delta$, where 0 ≤ δ ≤ α. Since UCLcc drops from m + 0.5 to m − 0.5 when Nρ has a slight increase of 1, the interpolated $\mathrm{UCL}'_\alpha$ must be slightly larger than m − 0.5 when Nρ = n, and slightly smaller than m − 0.5 when Nρ = n + 1, meaning $\mathrm{UCL}'_\alpha \approx m - 0.5$ in both cases. Since $\mathrm{UCL}'_\alpha$ is obtained by linear interpolation, this indicates

$$P(X \le m) \approx (1 - \alpha) + \delta \quad (A9)$$

and

$$P(X \le m - 1) \approx (1 - \alpha) - \delta. \quad (A10)$$

Therefore, $\alpha_{\mathrm{actual2}} \approx \alpha + \delta$ and

$$\alpha_{\mathrm{actual2}} - \alpha_{\mathrm{actual1}} \approx P(X = m) = 2\delta \le 2\alpha. \quad (A11)$$

References

[1] F. L. Chen and S. F. Liu, “A neural-network approach to recognize defect spatial pattern in semiconductor fabrication,” IEEE Trans. Semicond. Manuf., vol. 13, no. 3, pp. 366–373, Aug. 2000.

[2] S. P. Cunningham and S. MacKinnon, “Statistical methods for visual defect metrology,” IEEE Trans. Semicond. Manuf., vol. 11, no. 1, pp. 48–53, Feb. 1998.

[3] F. Lee, P. Wang, and R. Goodner, “Factory start-up and production ramp: Yield improvement through signature analysis and visual/electrical correlation,” in Proc. IEEE/SEMI Adv. Semicond. Manuf. Conf., 1995, pp. 267–270.

[4] F. Lee, A. Chatterjee, and D. Croley, “Advanced yield enhancement: Computer-based spatial pattern analysis: Part 1,” in Proc. IEEE/SEMI Adv. Semicond. Manuf. Conf., 1996, pp. 409–415.

[5] R. Ken, S. Brain, and H. Neil, “Using full wafer defect maps as process signatures to monitor and control yield,” in Proc. IEEE/SEMI Int. Semicond. Manuf. Sci. Symp., 1991, pp. 129–135.

[6] W.-J. Tsai, L.-I. Tong, and C.-H. Wang, “Developing a new defect cluster index,” J. Chin. Inst. Ind. Engineers, vol. 25, no. 1, pp. 18–30, 2008.

[7] M. H. Hansen, V. N. Nair, and D. J. Friedman, “Monitoring wafer map data from integrated circuit fabrication processes for spatially clustered defects,” Technometrics, vol. 39, pp. 241–253, Aug. 1997.

[8] J. Y. Hwang and W. Kuo, “Model-based clustering for integrated circuit yield enhancement,” Eur. J. Oper. Res., vol. 178, pp. 143–153, Apr. 2007.


[9] D. C. Montgomery, Introduction to Statistical Quality Control, 5th ed. Hoboken, NJ: Wiley, 2005.

[10] C. J. Spanos, “Statistical process control in semiconductor manufacturing,” Proc. IEEE, vol. 80, no. 6, pp. 819–830, Jun. 1992.

[11] C. H. Stapper, “The effects of wafer to wafer defect density variations on integrated-circuit defect and fault distributions,” Int. Bus. Machines J. Res. Develop., vol. 29, pp. 87–97, Jan. 1985.

[12] C. H. Stapper, “The defect-sensitivity effect of memory chips,” IEEE J. Solid-State Circuits, vol. 21, no. 1, pp. 193–198, Feb. 1986.

[13] C. H. Stapper, “Simulation of spatial fault distributions for integrated circuit yield estimations,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 8, no. 12, pp. 1314–1318, Dec. 1989.

[14] S. L. Albin and D. J. Friedman, “The impact of clustered defect distributions in IC fabrication,” Manage. Sci., vol. 35, pp. 1066–1078, Sep. 1989.

[15] A. V. Ferrisprabhu, “A cluster-modified Poisson model for estimating defect density and yield,” IEEE Trans. Semicond. Manuf., vol. 3, no. 2, pp. 54–59, May 1990.

[16] I. Koren, Z. Koren, and H. Stapper, “A unified negative-binomial distribution for yield analysis of defect-tolerant circuits,” IEEE Trans. Comput., vol. 42, no. 6, pp. 724–734, Jun. 1993.

[17] C. H. Stapper, “On yield, fault distributions, and clustering of particles,” Int. Bus. Mach. J. Res. Develop., vol. 30, pp. 326–338, May 1986.

[18] C. H. Stapper, F. M. Armstrong, and K. Saji, “Integrated-circuit yield statistics,” Proc. IEEE, vol. 71, no. 4, pp. 453–470, Apr. 1983.

[19] L. I. Tong, “Modified process control chart in IC fabrication using clustering analysis,” Int. J. Quality Reliab. Manage., vol. 15, no. 6, pp. 582–598, 1998.

[20] P. Diggle, Statistical Analysis of Spatial Point Patterns, 2nd ed. London, U.K.: Arnold, 2003.

[21] D. J. Friedman, M. H. Hansen, V. N. Nair, and D. A. James, “Model-free estimation of defect clustering in integrated circuit fabrication,” IEEE Trans. Semicond. Manuf., vol. 10, no. 3, pp. 344–359, Aug. 1997.

[22] Y. S. Jeong, S. J. Kim, and M. K. Jeong, “Automatic identification of defect patterns in semiconductor wafer maps using spatial correlogram and dynamic time warping,” IEEE Trans. Semicond. Manuf., vol. 21, no. 4, pp. 625–637, Nov. 2008.

[23] H. H. Fellows, C. M. Mastrangelo, and K. P. White, “An empirical comparison of spatial randomness models for yield analysis,” IEEE Trans. Electron. Packag. Manuf., vol. 32, no. 2, pp. 115–119, Apr. 2009.

[24] T. P. Karnowski, K. W. Tobin, S. S. Gleason, and F. Lakhani, “The application of spatial signature analysis to electrical test data: Validation study,” in Proc. Int. Soc. Opt. Eng., 1999, pp. 530–541.

[25] K. W. Tobin, S. S. Gleason, F. Lakhani, and M. H. Bennett, “Automated analysis for rapid defect sourcing and yield learning,” in Future Fab International, vol. 1. London, U.K.: Technology Publishing Ltd., 1997, p. 313.

[26] C. H. Wang, S. J. Wang, and W. D. Lee, “Automatic identification of spatial defect patterns for semiconductor manufacturing,” Int. J. Prod. Res., vol. 44, no. 23, pp. 5169–5185, Dec. 2006.

[27] C. H. Wang, “Recognition of semiconductor defect patterns using spatial filtering and spectral clustering,” Expert Syst. Appl., vol. 34, no. 3, pp. 1914–1923, Apr. 2008.

[28] C. H. Wang, “Separation of composite defect patterns on wafer bin map using support vector clustering,” Expert Syst. Appl., vol. 36, pp. 2554–2561, Mar. 2009.

[29] T. Yuan and W. Kuo, “A model-based clustering approach to the recognition of the spatial defect patterns produced during semiconductor fabrication,” IIE Trans., vol. 40, no. 2, pp. 93–101, Feb. 2008.

[30] T. Yuan and W. Kuo, “Spatial defect pattern recognition on semiconductor wafers using model-based clustering and Bayesian inference,” Eur. J. Oper. Res., vol. 190, pp. 228–240, Oct. 2008.

[31] B. Li, Q. Meng, and H. Holstein, “Point pattern matching and applications: A review,” in Proc. IEEE Int. Conf. Syst. Man Cybern., 2003, pp. 729–736.

[32] M. Cohen and G. T. Toussaint, “Detection of structures in noisy pictures,” Pattern Recognit., vol. 9, pp. 95–98, Jul. 1977.

[33] R. O. Duda and P. E. Hart, “Use of Hough transformation to detect lines and curves in pictures,” Commun. Assoc. Comput. Mach., vol. 15, pp. 11–15, Jan. 1972.

[34] J. Illingworth and J. Kittler, “A survey of the Hough transform,” Comput. Vision Graph. Image Process., vol. 44, pp. 87–116, Oct. 1988.

[35] V. F. Leavers, “Which Hough transform?” Comput. Vision Graph. Image Process.: Image Understanding, vol. 58, no. 2, pp. 250–264, 1993.

[36] H. K. Yuen, J. Princen, J. Illingworth, and J. Kittler, “Comparative study of Hough transform methods for circle finding,” Image Vision Comput., vol. 8, pp. 71–77, Feb. 1990.

[37] F. D. Palma, G. D. Nicolao, G. Miraglia, E. Pasquinetti, and F. Piccinini, “Unsupervised spatial pattern classification of electrical-wafer-sorting maps in semiconductor manufacturing,” Pattern Recognit. Lett., vol. 26, no. 12, pp. 1857–1865, Sep. 2005.

[38] K. P. White, B. Kundu, and C. M. Mastrangelo, “Classification of defect clusters on semiconductor wafers via the Hough transformation,” IEEE Trans. Semicond. Manuf., vol. 21, no. 2, pp. 272–278, May 2008.

Qiang Zhou received the B.Eng. degree in vehicle engineering and the M.Eng. degree in mechanical engineering, both from Tsinghua University, Beijing, China, in 2005 and 2007, respectively. He is currently working toward the Ph.D. degree in industrial engineering and the M.S. degree in statistics, both from the University of Wisconsin-Madison, Madison.

He is a member of INFORMS.

Li Zeng received the B.S. and M.S. degrees in optical engineering from Tsinghua University, Beijing, China, in 2002 and 2004, respectively, the M.S. degree in statistics, and the Ph.D. degree in industrial engineering, both from the University of Wisconsin-Madison, Madison, in 2007 and 2010, respectively.

She is currently a Research Associate with the Department of Industrial and Systems Engineering, University of Wisconsin-Madison. Her current research interests include statistical modeling and analysis in complex systems, and quality measure and control in healthcare.

Shiyu Zhou received the B.S. and M.S. degrees in mechanical engineering from the University of Science and Technology of China, Hefei, China, in 1993 and 1996, respectively, the M.S. degree in industrial engineering in 2000, and the Ph.D. degree in mechanical engineering, both from the University of Michigan, Ann Arbor.

He is currently an Associate Professor with the Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison. His current research interests are the in-process quality and productivity improvement methodologies by integrating statistics, system and control theory, and engineering knowledge.

Dr. Zhou’s research is sponsored by the National Science Foundation, Arlington, VA, the Department of Energy, Washington D.C., the National Institute of Standards and the Technology-Advanced Technology Program, Gaithersburg, MD, and industries. He was a recipient of the CAREER Award from the National Science Foundation in 2006. He is a member of the Institute of International Education, INFORMS, the American Society of Mechanical Engineers, and Society of Manufacturing Engineers.