
STATISTICAL ANALYSIS OF GROUND-WATER MONITORING

DATA AT RCRA FACILITIES

ADDENDUM TO INTERIM FINAL GUIDANCE

OFFICE OF SOLID WASTE
PERMITS AND STATE PROGRAMS DIVISION
U.S. ENVIRONMENTAL PROTECTION AGENCY
401 M STREET, S.W.
WASHINGTON, D.C. 20460

JULY 1992


DISCLAIMER

This document is intended to assist Regional and State personnel in evaluating ground-water monitoring data from RCRA facilities. Conformance with this guidance is expected to result in statistical methods and sampling procedures that meet the regulatory standard of protecting human health and the environment. However, EPA will not in all cases limit its approval of statistical methods and sampling procedures to those that comport with the guidance set forth herein. This guidance is not a regulation (i.e., it does not establish a standard of conduct which has the force of law) and should not be used as such. Regional and State personnel should exercise their discretion in using this guidance document as well as other relevant information in choosing a statistical method and sampling procedure that meet the regulatory requirements for evaluating ground-water monitoring data from RCRA facilities.

This document has been reviewed by the Office of Solid Waste, U.S. Environmental Protection Agency, Washington, D.C., and approved for publication. Approval does not signify that the contents necessarily reflect the views and policies of the U.S. Environmental Protection Agency, nor does mention of trade names, commercial products, or publications constitute endorsement or recommendation for use.


CONTENTS

1. CHECKING ASSUMPTIONS FOR STATISTICAL PROCEDURES

   1.1 Normality of Data
       1.1.1 Interim Final Guidance Methods for Checking Normality
       1.1.2 Probability Plots
       1.1.3 Coefficient of Skewness
       1.1.4 The Shapiro-Wilk Test of Normality (n≤50)
       1.1.5 The Shapiro-Francia Test of Normality (n>50)
       1.1.6 The Probability Plot Correlation Coefficient
   1.2 Testing for Homogeneity of Variance
       1.2.1 Box Plots
       1.2.2 Levene's Test

2. RECOMMENDATIONS FOR HANDLING NONDETECTS

   2.1 Nondetects in ANOVA Procedures
   2.2 Nondetects in Statistical Intervals
       2.2.1 Censored and Detects-Only Probability Plots
       2.2.2 Aitchison's Adjustment
       2.2.3 More Than 50% Nondetects
       2.2.4 Poisson Prediction Limits
       2.2.5 Poisson Tolerance Limits

3. NON-PARAMETRIC COMPARISON OF COMPLIANCE DATA TO BACKGROUND

   3.1 Kruskal-Wallis Test
       3.1.1 Adjusting for Tied Observations
   3.2 Wilcoxon Rank-Sum Test for Two Groups
       3.2.1 Handling Ties in the Wilcoxon Test

4. STATISTICAL INTERVALS: CONFIDENCE, TOLERANCE, AND PREDICTION

   4.1 Tolerance Intervals
       4.1.1 Non-parametric Tolerance Intervals
   4.2 Prediction Intervals
       4.2.1 Non-parametric Prediction Intervals
   4.3 Confidence Intervals

5. STRATEGIES FOR MULTIPLE COMPARISONS

   5.1 Background of Problem
   5.2 Possible Strategies
       5.2.1 Parametric and Non-parametric ANOVA
       5.2.2 Retesting with Parametric Intervals
       5.2.3 Retesting with Non-parametric Intervals

6. OTHER TOPICS

   6.1 Control Charts
   6.2 Outlier Testing


ACKNOWLEDGMENT

This document was developed by EPA's Office of Solid Waste under the direction of Mr. James R. Brown of the Permits and State Programs Division. The Addendum was prepared by the joint efforts of Mr. James R. Brown and Kirk M. Cameron, Ph.D., Senior Statistician at Science Applications International Corporation (SAIC). SAIC provided technical support in developing this document under EPA Contract No. 68-WO-0025. Other SAIC staff who assisted in the preparation of the Addendum include Mr. Robert D. Aaron, Statistician.


STATISTICAL ANALYSIS OF GROUND-WATER MONITORING DATA

AT RCRA FACILITIES

ADDENDUM TO INTERIM FINAL GUIDANCE

JUNE 1992

This Addendum offers a series of recommendations and updated advice concerning the Interim Final Guidance document for statistical analysis of ground-water monitoring data. Some procedures in the original guidance are replaced by alternative methods that reflect more current thinking within the statistics profession. In other cases, further clarification is offered for currently recommended techniques to answer questions and address public comments that EPA has received both formally and informally since the Interim Final Guidance was published.

1. CHECKING ASSUMPTIONS FOR STATISTICAL PROCEDURES

Because any statistical or mathematical model of actual data is an approximation of reality, all statistical tests and procedures require certain assumptions for the methods to be used correctly and for the results to have a proper interpretation. Two key assumptions addressed in the Interim Guidance concern the distributional properties of the data and the need for equal variances among subgroups of the measurements. In the Addendum, new techniques are outlined for testing both assumptions that offer distinct advantages over the methods in the Interim Final Guidance.

1.1 NORMALITY OF DATA

Most statistical tests assume that the data come from a Normal distribution. Its density function is the familiar bell-shaped curve. The Normal distribution is the assumed underlying model for such procedures as parametric analysis of variance (ANOVA), t-tests, tolerance intervals, and prediction intervals for future observations. Failure of the data to follow a Normal distribution at least approximately is not always a disaster, but can lead to false conclusions if the data really follow a more skewed distribution like the Lognormal. This is because the extreme tail behavior of a data distribution is often the most critical factor in deciding whether to apply a statistical test based on the assumption of Normality.


The Interim Final Guidance suggests that one begin by assuming that the original data are Normal prior to testing the distributional assumptions. If the statistical test rejects the model of Normality, the data can be tested for Lognormality instead by taking the natural logarithm of each observation and repeating the test. If the original data are Lognormal, taking the natural logarithm of the observations will result in data that are Normal. As a consequence, tests for Normality can also be used to test for Lognormality by applying the tests to the logarithms of the data.

Unfortunately, all of the available tests for Normality do at best a fair job of rejecting non-Normal data when the sample size is small (say, less than 20 to 30 observations). That is, the tests do not exhibit high degrees of statistical power. As such, small samples of untransformed Lognormal data can be accepted by a test of Normality even though the skewness of the data may lead to poor statistical conclusions later. EPA's experience with environmental concentration data, and ground-water data in particular, suggests that a Lognormal distribution is generally more appropriate as a default statistical model than the Normal distribution, a conclusion shared by researchers at the United States Geological Survey (USGS; Dennis Helsel, personal communication, 1991). There also appears to be a plausible physical explanation as to why pollutant concentrations so often seem to follow a Lognormal pattern (Ott, 1990). In Ott's model, pollutant sources are randomly diluted in a multiplicative fashion through repeated dilution and mixing with volumes of uncontaminated air or water, depending on the surrounding medium. Such random and repeated dilution of pollutant concentrations can lead mathematically to a Lognormal distribution.

Because the Lognormal distribution appears to be a better default statistical model than the Normal distribution for most ground-water data, it is recommended that all data first be logged prior to checking distributional assumptions. McBean and Rovers (1992) have noted that "[s]upport for the lognormal distribution in many applications also arises from the shape of the distribution, namely constrained on the low side and unconstrained on the high side.... The logarithmic transform acts to suppress the outliers so that the mean is a much better representation of the central tendency of the sample data."

Transformation to the logarithmic scale is not done to make "large numbers look smaller." Performing a logarithmic or other monotonic transformation preserves the basic ordering within a data set, so that the data are merely rescaled with a different set of units. Just as the physical difference between 80° Fahrenheit and 30° Fahrenheit does not change if the temperatures are rescaled or transformed to the numerically lower Celsius scale, so too the basic statistical relationships between data measurements remain the same whether or not the log transformation is applied. What does change is that the logarithms of Lognormally distributed data are more nearly Normal in character, thus satisfying a key assumption of many statistical procedures. Because of this fact, the same tests used to check Normality, if run on the logged data, become tests for Lognormality.

If the assumption of Lognormality is not rejected, further statistical analyses should be performed on the logged observations, not the original data. If the Lognormal distribution is rejected by a statistical test, one can either test the Normality of the original data, if it was not already done, or use a non-parametric technique on the ranks of the observations.

If no data are initially available to test the distributional assumptions, "referencing" may be employed to justify the use of, say, a Normal or Lognormal assumption in developing a statistical testing regimen at a particular site. "Referencing" involves the use of historical data or data from sites in similar hydrogeologic settings to justify the assumptions applied to currently planned statistical tests. These initial assumptions must be checked when data from the site become available, using the procedures described in this Addendum. Subsequent changes to the initial assumptions should be made if formal testing contradicts the initial hypothesis.

1.1.1 Interim Final Guidance Methods for Checking Normality

The Interim Final Guidance outlines three different methods for checking Normality: the Coefficient-of-Variation (CV) test, Probability Plots, and the Chi-squared test. Of these three, only Probability Plots are recommended within this Addendum. The Coefficient-of-Variation and the Chi-squared test each have potential problems that can be remedied by using alternative tests. These alternatives include the Coefficient of Skewness, the Shapiro-Wilk test, the Shapiro-Francia test, and the Probability Plot Correlation Coefficient.

The Coefficient-of-Variation is recommended within the Interim Guidance because it is easy to calculate and is amenable to small sample sizes. To ensure that a Normal model which predicts a significant fraction of negative concentration values is not fitted to positive data, the Interim Final Guidance recommends that the sample Coefficient of Variation be less than one; otherwise this "test" of Normality fails. A drawback to using the sample CV is that for Normally distributed data, one can often get a sample CV greater than one when the true CV is only between 0.5 and 1. In other words, the sample CV, being a random variable, often estimates the true Coefficient of Variation with some error. Even if a Normal distribution model is appropriate, the Coefficient of Variation test may reject the model because the sample CV (but not the true CV) is too large.
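For readers who automate these screening checks, the sample CV calculation is straightforward. The short Python sketch below is illustrative only (the helper name is not part of the guidance); it computes the sample CV for the nickel data of Example 1 and applies the "CV < 1" screening rule, which, as discussed above, can fail even for truly Normal data.

    import numpy as np

    def sample_cv(data):
        # Sample coefficient of variation: sample SD divided by sample mean
        x = np.asarray(data, dtype=float)
        return x.std(ddof=1) / x.mean()

    # Nickel data from Example 1 (ppb); the screening rule passes only if CV < 1
    nickel = [58.8, 1.0, 262, 56, 8.7, 19, 81.5, 331, 14, 64.4,
              39, 151, 27, 21.4, 578, 3.1, 942, 85.6, 10, 637]
    print(sample_cv(nickel))   # about 1.53, so the CV screening rule fails here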

Page 10: OFFICE OF SOLID WASTE PERMITS AND STATE PROGRAMS … › work › 05 › 89012.pdf · Because any statistical or mathematical model of actual data is an approximation of reality,

Draft 7/23/92

The real purpose of the CV is to estimate the skewness of a dataset, not to test Normality. Truly Normal data can have any non-zero Coefficient of Variation, though the larger the CV, the greater the proportion of negative values predicted by the model. As such, a Normal distribution with large CV may be a poor model for positive concentration data. However, if the Coefficient of Variation test is used on the logarithms of the data to test Lognormality, negative logged concentrations will often be expected, nullifying the rationale used to support the CV test in the first place. A better way to estimate the skewness of a dataset is to compute the Coefficient of Skewness directly, as described below.

The Chi-square test is also recommended within the Interim Guidance. Though an acceptable goodness-of-fit test, it is not considered the most sensitive or powerful test of Normality in the current literature (Gan and Koehler, 1990). The major drawback to the Chi-square test can be explained by considering the behavior of parametric tests based on the Normal distribution. Most tests like the t-test or Analysis of Variance (ANOVA), which assume the underlying data to be Normally distributed, give fairly robust results when the Normality assumption fails over the middle ranges of the data distribution. That is, if the extreme tails are approximately Normal in shape even if the middle part of the density is not, these parametric tests will still tend to produce valid results. However, if the extreme tails are non-Normal in shape (e.g., highly skewed), Normal-based tests can lead to false conclusions, meaning that either a transformation of the data or a non-parametric technique should be used instead.

The Chi-square test entails a division of the sample data into bins or cells representing distinct, non-overlapping ranges of the data values (see figure below). In each bin, an expected value is computed based on the number of data points that would be found if the Normal distribution provided an appropriate model. The squared difference between the expected number and observed number is then computed and summed over all the bins to calculate the Chi-square test statistic.

[Figure: Chi-square goodness of fit. The sample data are divided into bins, and the observed count in each bin is compared with the count expected under a fitted Normal distribution.]
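Purely as an illustration of the binning logic just described (this Addendum recommends the alternatives below instead), the following Python sketch compares observed bin counts with the counts expected under a Normal model fitted to the sample. The bin count and helper name are arbitrary choices, not part of the guidance.

    import numpy as np
    from scipy import stats

    def chi_square_normality(data, n_bins=5):
        x = np.asarray(data, dtype=float)
        mean, sd = x.mean(), x.std(ddof=1)
        # Observed counts in equal-width bins spanning the data
        observed, edges = np.histogram(x, bins=n_bins)
        # Expected counts for each bin under the fitted Normal model,
        # with the outer bins extended to minus/plus infinity
        cdf = stats.norm.cdf(edges, loc=mean, scale=sd)
        cdf[0], cdf[-1] = 0.0, 1.0
        expected = len(x) * np.diff(cdf)
        chi2 = ((observed - expected) ** 2 / expected).sum()
        # Degrees of freedom: bins minus 1, minus 2 estimated parameters (mean, SD)
        p_value = stats.chi2.sf(chi2, df=n_bins - 3)
        return chi2, p_value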


If the Chi-square test indicates that the data are not Normally distributed, it may not be clear what ranges of the data most violate the Normality assumption. Departures from Normality in the middle bins are given nearly the same weight as departures from the extreme tail bins, and all the departures are summed together to form the test statistic. As such, the Chi-square test is not as powerful for detecting departures from Normality in the extreme tails of the data, the areas most crucial to the validity of parametric tests like the t-test or ANOVA (Miller, 1986). Furthermore, even if there are departures in the tails, but the middle portion of the data distribution is approximately Normal, the Chi-square test may not register as statistically significant in certain cases where better tests of Normality would. Because of this, four alternative, more sensitive tests of Normality are suggested below which can be used in conjunction with Probability Plots.

1.1.2 Probability Plots

As suggested within the Interim Final Guidance, a simple, yet useful graphical test for Normality is to plot the data on probability paper. The y-axis is scaled to represent probabilities according to the Normal distribution and the data are arranged in increasing order. An observed value is plotted on the x-axis and the proportion of observations less than or equal to each observed value is plotted as the y-coordinate. The scale is constructed so that, if the data are Normal, the points when plotted will approximate a straight line. Visually apparent curves or bends indicate that the data do not follow a Normal distribution (see Interim Final Guidance, pp. 4-8 to 4-11).

Probability Plots are particularly useful for spotting irregularities within the data when compared to a specific distributional model like the Normal. It is easy to determine whether departures from Normality are occurring more or less in the middle ranges of the data or in the extreme tails. Probability Plots can also indicate the presence of possible outlier values that do not follow the basic pattern of the data and can show the presence of significant positive or negative skewness.

If a (Normal) Probability Plot is done on the combined data from several wells and Normality is accepted, it implies that all of the data came from the same Normal distribution. Consequently, each subgroup of the data set (e.g., observations from distinct wells) has the same mean and standard deviation. If a Probability Plot is done on the data residuals (each value minus its subgroup mean) and is not a straight line, the interpretation is more complicated. In this case, either the residuals are not Normal, or there is a subgroup of the data with a Normal distribution but a different mean or standard deviation than the other subgroups. The Probability Plot will indicate a deviation from the underlying Normality assumption either way.


The same Probability Plot technique may be used to investigate whether a set of data or residuals follows the Lognormal distribution. The procedure is the same, except that one first replaces each observation by its natural logarithm. After the data have been transformed to their natural logarithms, the Probability Plot is constructed as before. The only difference is that the natural logarithms of the observations are used on the x-axis. If the data are Lognormal, the Probability Plot (on Normal probability paper) of the logarithms of the observations will approximate a straight line.

Many statistical software packages for personal computers will construct Probability Plots automatically with a simple command or two. If such software is available, there is no need to construct Probability Plots by hand or to obtain special graph paper. The plot itself may be generated somewhat differently than the method described above. In some packages, the observed value is plotted as before on the x-axis. The y-axis, however, now represents the quantile of the Normal distribution (often referred to as the "Normal score of the observation") corresponding to the cumulative probability of the observed value. The y-coordinate is often computed by the following formula:

    y(i) = Φ⁻¹( i / (n+1) )

where Φ⁻¹ denotes the inverse of the cumulative Normal distribution, n represents the sample size, and i represents the rank position of the ith ordered concentration. Since the computer does these calculations automatically, the formula does not have to be computed by hand.
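If no packaged routine is at hand, the Normal scores can be generated directly from the formula above. The Python sketch below is illustrative (the function name is an assumption, not part of the guidance); scipy's norm.ppf plays the role of Φ⁻¹.

    import numpy as np
    from scipy import stats

    def probability_plot_coords(data):
        # x-coordinates: the ordered concentrations
        x = np.sort(np.asarray(data, dtype=float))
        n = len(x)
        # y-coordinates: Normal scores, the inverse Normal CDF of i/(n+1)
        cumulative_prob = np.arange(1, n + 1) / (n + 1)
        y = stats.norm.ppf(cumulative_prob)
        return x, y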

EXAMPLE 1

Determine whether the following data set follows the Normal distribution by using a Probability Plot.


Nickel Concentration (ppb)

Month     Well 1     Well 2     Well 3     Well 4
  1        58.8        19         39         3.1
  2         1.0        81.5      151       942
  3       262         331         27        85.6
  4        56          14         21.4      10
  5         8.7        64.4      578       637

SOLUTION

Step 1. List the measured nickel concentrations in order from lowest to highest.

        Nickel
     Concentration    Order                   Normal
         (ppb)         (i)    Probability    Quantile
          1.0            1          5         -1.645
          3.1            2         10         -1.28
          8.7            3         14         -1.08
         10              4         19         -0.88
         14              5         24         -0.706
         19              6         29         -0.55
         21.4            7         33         -0.44
         27              8         38         -0.305
         39              9         43         -0.176
         56             10         48         -0.05
         58.8           11         52          0.05
         64.4           12         57          0.176
         81.5           13         62          0.305
         85.6           14         67          0.44
        151             15         71          0.55
        262             16         76          0.706
        331             17         81          0.88
        578             18         86          1.08
        637             19         90          1.28
        942             20         95          1.645

Step 2. The cumulative probability is given in the third column and is computed as 100*(i/(n+1)), where n is the total number of samples (n=20). The last column gives the Normal quantiles corresponding to these probabilities.

Step 3. If using special graph paper, plot the probability versus the concentration for each sample. Otherwise, plot the Normal quantile versus the concentration for each sample, as in the plot below. The curvature found in the Probability Plot indicates that there is evidence of non-Normality in the data.


[Figure: Probability Plot of the nickel data. Normal quantile (y-axis) versus nickel concentration in ppb (x-axis); the plotted points show marked curvature.]

1.1.3 Coefficient of Skewness

The Coefficient of Skewness (γ₁) indicates to what degree a data set is skewed or asymmetric with respect to the mean. Data from a Normal distribution will have a Skewness Coefficient of zero, while asymmetric data will have a positive or negative skewness depending on whether the right- or left-hand tail of the distribution is longer and skinnier than the opposite tail.

Since ground-water monitoring concentration data are inherently nonnegative, one often expects the data to exhibit a certain degree of skewness. A small degree of skewness is not likely to affect the results of statistical tests based on an assumption of Normality. However, if the Skewness Coefficient is larger than 1 (in absolute value) and the sample size is small (e.g., n<25), statistical research has shown that standard Normal theory-based tests are much less powerful than when the absolute skewness is less than 1 (Gayen, 1949).

Calculating the Skewness Coefficient is useful and not much more difficult than computing the Coefficient of Variation. It provides a quick indication of whether the skewness is minimal enough to assume that the data are roughly symmetric and hopefully Normal in distribution. If the original data exhibit a high Skewness Coefficient, the Normal distribution will provide a poor approximation to the data set. In that case, γ₁ can be computed on the logarithms of the data to test for symmetry of the logged data.

Page 15: OFFICE OF SOLID WASTE PERMITS AND STATE PROGRAMS … › work › 05 › 89012.pdf · Because any statistical or mathematical model of actual data is an approximation of reality,

Draft 7/23/92

The Skewness Coefficient may be computed using the following formula:

    γ₁ = [ (1/n)·Σ(xi − x̄)³ ] / [ ((n−1)/n)^(3/2) · SD³ ]

where the numerator represents the average cubed residual and SD denotes the standard deviation of the measurements. Most statistics computer packages (e.g., Minitab, GEO-EAS) will compute the Skewness Coefficient automatically via a simple command.

EXAMPLE 2

Using the data in Example 1, compute the Skewness Coefficient to test for approximate symmetry in the data.

SOLUTION

Step 1. Compute the mean, standard deviation (SD), and average cubed residual for the nickel concentrations:

    x̄ = 169.52 ppb
    SD = 259.72 ppb
    (1/n)·Σ(xi − x̄)³ = 2.98923 × 10⁷ ppb³

Step 2. Calculate the Coefficient of Skewness using the previous formula to get γ₁ = 1.84. Since the skewness is much larger than 1, the data appear to be significantly positively skewed. Do not assume that the data follow a Normal distribution.

Step 3. Since the original data evidence a high degree of skewness, one can attempt to compute the Skewness Coefficient on the logged data instead. In that case, the skewness works out to be |γ₁| ≈ 0.24 < 1, indicating that the logged data values are slightly skewed, but not enough to reject an assumption of Normality in the logged data. In other words, the original data may be Lognormally distributed.

1.1.4 The Shapiro-Wilk Test of Normality (n≤50)

The Shapiro-Wilk test is recommended as a superior alternative to the Chi-square test for testing Normality of the data. It is based on the premise that if a set of data are Normally distributed, the ordered values should be highly correlated with corresponding quantiles taken from a Normal distribution (Shapiro and Wilk, 1965). In particular, the Shapiro-Wilk test gives substantial weight to evidence of non-Normality in the tails of a distribution, where the robustness of statistical tests based on the Normality assumption is most severely affected. The Chi-square test treats departures from Normality in the tails nearly the same as departures in the middle of a distribution, and so is less sensitive to the types of non-Normality that are most crucial. One cannot tell from a significant Chi-square goodness-of-fit test what sort of non-Normality is indicated.

The Shapiro-Wilk test statistic (W) will tend to be large when a Probability Plot of the data indicates a nearly straight line. Only when the plotted data show significant bends or curves will the test statistic be small. The Shapiro-Wilk test is considered to be one of the very best tests of Normality available (Miller, 1986; Madansky, 1988).

To calculate the test statistic W, one can use the following formula:

    W = [ b / ( SD·√(n−1) ) ]²

where the numerator is computed as

    b = Σ(i=1 to k) a(n-i+1) · [ X(n-i+1) − X(i) ]

In this last formula, X(j) represents the jth smallest ordered value in the sample and the coefficients a(j) depend on the sample size n. The coefficients can be found for any sample size from 3 up to 50 in Table A-1 of Appendix A. The value of k can be found as the greatest integer less than or equal to n/2.

Normality of the data should be rejected if the Shapiro-Wilk statistic is too low when compared to the critical values provided in Table A-2 of Appendix A. Otherwise one can assume the data are approximately Normal for purposes of further statistical analysis. As before, it is recommended that the test first be performed on the logarithms of the original data to test for Lognormality. If the logged data indicate non-Normality by the Shapiro-Wilk test, a re-test can be performed on the original data to test for Normality of the original concentrations.
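In practice the coefficients and critical values need not be handled manually; scipy, for example, provides a Shapiro-Wilk routine. The sketch below (an illustrative helper, not part of the guidance) follows the recommended order of testing, logged data first and then the original concentrations; the 0.05 cutoff mirrors the 5% critical values of Table A-2, though scipy reports a p-value rather than a table lookup.

    import numpy as np
    from scipy import stats

    def check_lognormal_then_normal(data, alpha=0.05):
        x = np.asarray(data, dtype=float)
        w_log, p_log = stats.shapiro(np.log(x))     # test Lognormality first
        if p_log >= alpha:
            return "log-transformed data acceptable as Normal", w_log
        w_raw, p_raw = stats.shapiro(x)             # re-test the original data
        if p_raw >= alpha:
            return "original data acceptable as Normal", w_raw
        return "use a non-parametric procedure", min(w_log, w_raw)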

EXAMPLE 3

Use the data of Example 1 to compute the Shapiro-Wilk test of Normality.


1.1.6 The Probability Plot Correlation Coefficient

One other alternative test for Normality that is roughly equivalent to the Shapiro-Wilk and Shapiro-Francia tests is the Probability Plot Correlation Coefficient test described by Filliben (1975). This test fits in perfectly with the use of Probability Plots, because the essence of the test is to compute the common correlation coefficient for points on a Probability Plot. Since the correlation coefficient is a measure of the linearity of the points on a scatterplot, the Probability Plot Correlation Coefficient, like the Shapiro-Wilk test, will be high when the plotted points fall along a straight line and low when there are significant bends and curves in the Probability Plot. Comparison of the Shapiro-Wilk and Probability Plot Correlation Coefficient tests has indicated very similar statistical power for detecting non-Normality (Ryan and Joiner, 1976).

The construction of the test statistic is somewhat different from the Shapiro-Wilk W, but not difficult to implement. Also, tabled critical values for the correlation coefficient have been derived for sample sizes up to n=100 (and are reproduced in Table A-4 of Appendix A). The Probability Plot Correlation Coefficient may be computed as

    r = Σ [ X(i) − X̄ ][ Mi − M̄ ] / [ Cn · SD · √(n−1) ]

where X(i) represents the ith smallest ordered concentration value, Mi is the median of the ith order statistic from a standard Normal distribution, and X̄ and M̄ represent the average values of X(i) and M(i). The ith Normal order statistic median may be approximated as Mi = Φ⁻¹(mi), where, as before, Φ⁻¹ is the inverse of the standard Normal cumulative distribution and mi can be computed as follows (given sample size n):

    mi = (i − 0.3175)/(n + 0.365)   for 1 < i < n
    m1 = 1 − (0.5)^(1/n)
    mn = (0.5)^(1/n)

Quantity Cn represents the square root of the sum of squares of the Mi's minus n times the square of the average value M̄, that is

    Cn = √( Σ Mi² − n·M̄² )


When working with a complete sample (i.e., containing no nondetects or censored values), the average value M̄ = 0, and so the formula for the Probability Plot Correlation Coefficient simplifies to

    r = Σ X(i)·Mi / [ Cn · SD · √(n−1) ]
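The computation is easy to script. The Python sketch below (an illustrative helper for complete samples only) implements Filliben's mi values, the Normal order statistic medians Mi, and the simplified formula for r; applied to the Example 1 nickel data it returns approximately 0.819.

    import numpy as np
    from scipy import stats

    def filliben_r(data):
        x = np.sort(np.asarray(data, dtype=float))
        n = len(x)
        # Filliben's approximation to the Normal order-statistic medians
        m = (np.arange(1, n + 1) - 0.3175) / (n + 0.365)
        m[0] = 1.0 - 0.5 ** (1.0 / n)
        m[-1] = 0.5 ** (1.0 / n)
        M = stats.norm.ppf(m)
        Cn = np.sqrt(np.sum(M ** 2))      # M-bar equals zero for a complete sample
        return np.sum(x * M) / (Cn * x.std(ddof=1) * np.sqrt(n - 1))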

EXAMPLE 4

Use the data of Example 1 to compute the Probability Plot Correlation Coefficient test.

SOLUTION

Step 1. Order the data from smallest to largest and list, as in the following table.

Step 2. Compute the quantities mi from Filliben's formula above for each i in column 2 and the order statistic medians, Mi, in column 3 by applying the inverse Normal transformation to column 2.

Step 3. Since this sample contains no nondetects, the simplified formula for r may be used. Compute the products X(i)*Mi in column 4 and sum to get the numerator of the correlation coefficient (equal to 3,836.81 in this case). Also compute Mi² in column 5 and sum to find quantity Cn² = 17.12.

   i      X(i)       mi         Mi        X(i)*Mi      Mi²
   1       1.0     .03406    -1.8242      -1.824      3.328
   2       3.1     .08262    -1.3877      -4.302      1.926
   3       8.7     .13172    -1.1183      -9.729      1.251
   4      10.0     .18082    -0.9122      -9.122      0.832
   5      14.0     .22993    -0.7391     -10.347      0.546
   6      19.0     .27903    -0.5857     -11.129      0.343
   7      21.4     .32814    -0.4451      -9.524      0.198
   8      27.0     .37724    -0.3127      -8.444      0.098
   9      39.0     .42634    -0.1857      -7.242      0.034
  10      56.0     .47545    -0.0616      -3.448      0.004
  11      58.8     .52455     0.0616       3.621      0.004
  12      64.4     .57366     0.1857      11.959      0.034
  13      81.5     .62276     0.3127      25.488      0.098
  14      85.6     .67186     0.4451      38.097      0.198
  15     151.0     .72097     0.5857      88.445      0.343
  16     262.0     .77007     0.7391     193.638      0.546
  17     331.0     .81918     0.9122     301.953      0.832
  18     578.0     .86828     1.1183     646.376      1.251
  19     637.0     .91738     1.3877     883.941      1.926
  20     942.0     .96594     1.8242    1718.408      3.328


SOLUTION

Step 1. Order the data from smallest to largest and list, as in the following table. Also list the data in reverse order alongside the first column.

Step 2. Compute the differences X(n-i+1) − X(i) in column 3 of the table by subtracting column 1 from column 2.

   i      X(i)     X(n-i+1)   X(n-i+1)-X(i)   a(n-i+1)      bi
   1       1.0      942.0         941.0         .4734      445.47
   2       3.1      637.0         633.9         .3211      203.55
   3       8.7      578.0         569.3         .2565      146.03
   4      10.0      331.0         321.0         .2085       66.93
   5      14.0      262.0         248.0         .1686       41.81
   6      19.0      151.0         132.0         .1334       17.61
   7      21.4       85.6          64.2         .1013        6.50
   8      27.0       81.5          54.5         .0711        3.87
   9      39.0       64.4          25.4         .0422        1.07
  10      56.0       58.8           2.8         .0140        0.04
  11      58.8       56.0          -2.8
  12      64.4       39.0         -25.4
  13      81.5       27.0         -54.5
  14      85.6       21.4         -64.2
  15     151.0       19.0        -132.0
  16     262.0       14.0        -248.0
  17     331.0       10.0        -321.0
  18     578.0        8.7        -569.3
  19     637.0        3.1        -633.9
  20     942.0        1.0        -941.0
                                                        b = 932.88

Step 3. Compute k as the greatest integer less than or equal to n/2. Since n=20, k=10 in this example.

Step 4. Look up the coefficients a(n-i+1) from Table A-1 and list in column 4. Multiply the differences in column 3 by the coefficients in column 4 and add the first k products to get quantity b. In this case, b = 932.88.

Step 5. Compute the standard deviation of the sample, SD = 259.72. Then

    W = [ 932.88 / (259.72 · √19) ]² = 0.679

Step 6. Compare the computed value of W = 0.679 to the 5% critical value for sample size 20 in Table A-2, namely W.05,20 = 0.905. Since W < 0.905, the sample shows significant evidence of non-Normality by the Shapiro-Wilk test. The data should be transformed using natural logs and rechecked using the Shapiro-Wilk test before proceeding with further statistical analysis. (Actually, the logged data should have been tested first. The original concentration data are used in this example to illustrate how the assumption of Normality can be rejected.)

1.1.5 The Shapiro-Francia Test of Normality (n>50)

The Shapiro-Wilk test of Normality can be used for sample sizes up to 50. When the sample is larger than 50, a slight modification of the procedure called the Shapiro-Francia test (Shapiro and Francia, 1972) can be used instead.

Like the Shapiro-Wilk test, the Shapiro-Francia test statistic (W') will tend to be large when a Probability Plot of the data indicates a nearly straight line. Only when the plotted data show significant bends or curves will the test statistic be small.

To calculate the test statistic W', one can use the following formula:

    W' = [ Σ mi·X(i) ]² / [ (n−1) · SD² · Σ mi² ]

where X(i) represents the ith ordered value of the sample and where mi denotes the approximate expected value of the ith ordered Normal quantile. The values for mi can be approximately computed as

    mi = Φ⁻¹( i / (n+1) )

where Φ⁻¹ denotes the inverse of the standard Normal distribution with zero mean and unit variance. These values can be computed by hand using a Normal probability table or via simple commands in many statistical computer packages.

Normality of the data should be rejected if the Shapiro-Francia statistic is too low when compared to the critical values provided in Table A-3 of Appendix A. Otherwise one can assume the data are approximately Normal for purposes of further statistical analysis. As before, the logged data should be tested first to see if a Lognormal model is appropriate. If these data indicate non-Normality by the Shapiro-Francia test, a re-test can be performed on the original data.
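The Shapiro-Francia statistic is simple enough to compute directly from the formula above, as in the hypothetical Python sketch below; the resulting W' would then be compared by hand against the Table A-3 critical values.

    import numpy as np
    from scipy import stats

    def shapiro_francia(data):
        x = np.sort(np.asarray(data, dtype=float))
        n = len(x)
        # Approximate expected Normal quantiles: m_i = inverse Normal CDF of i/(n+1)
        m = stats.norm.ppf(np.arange(1, n + 1) / (n + 1))
        numerator = np.sum(m * x) ** 2
        denominator = (n - 1) * x.var(ddof=1) * np.sum(m ** 2)
        return numerator / denominator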


Step 4. Compute the Probability Plot Correlation Coefficient using the simplified formula for r, where SD = 259.72 and Cn = 4.1375, to get

    r = 3,836.81 / [ (4.1375)(259.72)·√19 ] = 0.819

Step 5. Compare the computed value of r = 0.819 to the 5% critical value for sample size 20 in Table A-4, namely R.05,20 = 0.950. Since r < 0.950, the sample shows significant evidence of non-Normality by the Probability Plot Correlation Coefficient test. The data should be transformed using natural logs and the correlation coefficient recalculated before proceeding with further statistical analysis.

EXAMPLE 5

The data in Examples 1, 2, 3, and 4 showed significant evidence of non-Normality. Instead of first logging the concentrations before testing for Normality, the original data were used. This was done to illustrate why the Lognormal distribution is usually a better default model than the Normal. In this example, use the same data to determine whether the measurements better follow a Lognormal distribution.

Computing the natural logarithms of the data gives the table below.

Logged Nickel Concentrations log(ppb)

Month     Well 1     Well 2     Well 3     Well 4
  1        4.07       2.94       3.66       1.13
  2        0.00       4.40       5.02       6.85
  3        5.57       5.80       3.30       4.45
  4        4.03       2.64       3.06       2.30
  5        2.16       4.17       6.36       6.46

SOLUTION

Method 1. Probability Plots

Step 1. List the natural logarithms of the measured nickel concentrations in order from lowest to highest.


  Order      Log Nickel                     Normal
   (i)     Concentration    Probability    Quantile
             log(ppb)
    1          0.00              5          -1.645
    2          1.13             10          -1.28
    3          2.16             14          -1.08
    4          2.30             19          -0.88
    5          2.64             24          -0.706
    6          2.94             29          -0.55
    7          3.06             33          -0.44
    8          3.30             38          -0.305
    9          3.66             43          -0.176
   10          4.03             48          -0.05
   11          4.07             52           0.05
   12          4.17             57           0.176
   13          4.40             62           0.305
   14          4.45             67           0.44
   15          5.02             71           0.55
   16          5.57             76           0.706
   17          5.80             81           0.88
   18          6.36             86           1.08
   19          6.46             90           1.28
   20          6.85             95           1.645

Step 2. Compute the probability as shown in the third column by calculating 100*(i/(n+1)), where n is the total number of samples (n=20). The corresponding Normal quantiles are given in column 4.

Step 3. Plot the Normal quantiles against the natural logarithms of the observed concentrations to get the following graph. The plot indicates a nearly straight line fit (verified by calculation of the Correlation Coefficient given in Method 4). There is no substantial evidence that the data do not follow a Lognormal distribution. The Normal-theory procedure(s) should be performed on the log-transformed data.


[Figure: Probability Plot of the logged nickel data. Normal quantile (y-axis) versus LN(Nickel) in LN(ppb) (x-axis); the plotted points fall along a nearly straight line.]

Method 2. Coefficient of Skewness

Step 1. Calculate the mean, SD, and average cubed residual of the natural logarithms of the data:

    x̄ = 3.918 log(ppb)
    SD = 1.802 log(ppb)
    (1/n)·Σ(xi − x̄)³ = −1.325 log³(ppb)

Step 2. Calculate the Skewness Coefficient, γ₁:

    γ₁ = −1.325 / [ (0.95)^(3/2) · (1.802)³ ] = −0.244

Step 3. Compute the absolute value of the skewness, |γ₁| = |−0.244| = 0.244.

Step 4. Since the absolute value of the Skewness Coefficient is less than 1, the data do not show evidence of significant skewness. A Normal approximation to the log-transformed data may therefore be appropriate, but this model should be further checked.


Method 3. Shapiro-Wilk Test

Step 1. Order the logged data from smallest to largest and list, as in the following table. Also list the data in reverse order and compute the differences X(n-i+1) − X(i).

   i    LN(X(i))   LN(X(n-i+1))   a(n-i+1)     bi
   1      0.00         6.85         .4734      3.24
   2      1.13         6.46         .3211      1.71
   3      2.16         6.36         .2565      1.08
   4      2.30         5.80         .2085      0.73
   5      2.64         5.57         .1686      0.49
   6      2.94         5.02         .1334      0.28
   7      3.06         4.45         .1013      0.14
   8      3.30         4.40         .0711      0.08
   9      3.66         4.17         .0422      0.02
  10      4.03         4.07         .0140      0.00
  11      4.07         4.03
  12      4.17         3.66
  13      4.40         3.30
  14      4.45         3.06
  15      5.02         2.94
  16      5.57         2.64
  17      5.80         2.30
  18      6.36         2.16
  19      6.46         1.13
  20      6.85         0.00
                                            b = 7.77

Step 2. Compute k=10, since n/2=10. Look up the coefficients a(n-i+1) from Table A-1 and multiply by the first k differences between columns 2 and 1 to get the quantities bi. Add these 10 products to get b = 7.77.

Step 3. Compute the standard deviation of the logged data, SD = 1.8014. Then the Shapiro-Wilk statistic is given by

    W = [ 7.77 / (1.8014 · √19) ]² = 0.979

Step 4. Compare the computed value of W to the 5% critical value for sample size 20 in Table A-2, namely W.05,20 = 0.905. Since W = 0.979 > 0.905, the sample shows no significant evidence of non-Normality by the Shapiro-Wilk test. Proceed with further statistical analysis using the log-transformed data.

Method 4. Probability Plot Correlation Coefficient

Step 1. Order the logged data from smallest to largest and list below.


  Order     Log Nickel
   (i)    Concentration      mi         Mi       X(i)*Mi      Mi²
            log(ppb)
    1         0.00         .03406    -1.8242      0.000      3.328
    2         1.13         .08262    -1.3877     -1.568      1.926
    3         2.16         .13172    -1.1183     -2.416      1.251
    4         2.30         .18082    -0.9122     -2.098      0.832
    5         2.64         .22993    -0.7391     -1.951      0.546
    6         2.94         .27903    -0.5857     -1.722      0.343
    7         3.06         .32814    -0.4451     -1.362      0.198
    8         3.30         .37724    -0.3127     -1.032      0.098
    9         3.66         .42634    -0.1857     -0.680      0.034
   10         4.03         .47545    -0.0616     -0.248      0.004
   11         4.07         .52455     0.0616      0.251      0.004
   12         4.17         .57366     0.1857      0.774      0.034
   13         4.40         .62276     0.3127      1.376      0.098
   14         4.45         .67186     0.4451      1.981      0.198
   15         5.02         .72097     0.5857      2.940      0.343
   16         5.57         .77007     0.7391      4.117      0.546
   17         5.80         .81918     0.9122      5.291      0.832
   18         6.36         .86828     1.1183      7.112      1.251
   19         6.46         .91738     1.3877      8.965      1.926
   20         6.85         .96594     1.8242     12.496      3.328

Step 2. Compute the quantities mi and the order statistic medians Mi, according to the procedure in Example 4 (note that these values depend only on the sample size and are identical to the quantities in Example 4).

Step 3. Compute the products X(i)*Mi in column 4 and sum to get the numerator of the correlation coefficient (equal to 32.226 in this case). Also compute Mi² in column 5 and sum to find quantity Cn² = 17.12.

Step 4. Compute the Probability Plot Correlation Coefficient using the simplified formula for r, where SD = 1.8025 and Cn = 4.1375, to get

    r = 32.226 / [ (4.1375)(1.8025)·√19 ] = 0.991

Step 5. Compare the computed value of r = 0.991 to the 5% critical value for sample size 20 in Table A-4, namely R.05,20 = 0.950. Since r > 0.950, the logged data show no significant evidence of non-Normality by the Probability Plot Correlation Coefficient test. Therefore, Lognormality of the original data could be assumed in subsequent statistical procedures.


1.2 TESTING FOR HOMOGENEITY OF VARIANCE

One of the most important assumptions for the parametric analysis of variance (ANOVA) is that the different groups (e.g., different wells) have approximately the same variance. If this is not the case, the power of the F-test (its ability to detect differences among the group means) is reduced. Mild differences in variance are not too bad. The effect becomes noticeable when the largest and smallest group variances differ by a ratio of about 4 and becomes quite severe when the ratio is 10 or more (Milliken and Johnson, 1984).

The procedure suggested in the EPA guidance document, Bartlett's test, is one way to test whether the sample data give evidence that the well groups have different variances. However, Bartlett's test is sensitive to non-Normality in the data and may give misleading results unless one knows in advance that the data are approximately Normal (Milliken and Johnson, 1984). As an alternative to Bartlett's test, two procedures for testing homogeneity of the variances are described below that are less sensitive to non-Normality.

1.2.1 Box Plots

Box Plots were first developed for exploratory data analysis as a quick way to visualize the "spread" or dispersion within a data set. In the context of variance testing, one can construct a Box Plot for each well group and compare the boxes to see if the assumption of equal variances is reasonable. Such a comparison is not a formal test procedure, but is easier to perform and is often sufficient for checking the group variance assumption.

The idea behind a Box Plot is to order the data from lowest to highest and to trim off 25 percent of the observations on either end, leaving just the middle 50 percent of the sample values. The spread between the lowest and highest values of this middle 50 percent (known as the interquartile range or IQR) is represented by the length of the box. The very middle observation (i.e., the median) can also be shown as a line cutting the box in two.

To construct a Box Plot, calculate the median and the upper and lower quartiles of the data set (respectively, the 50th, 75th, and 25th percentiles). To do this, calculate k = p(n+1)/100, where n = number of samples and p = percentile of interest. If k is an integer, let the kth ordered or ranked value be an estimate of the pth percentile of the data. If k is not an integer, let the pth percentile be equal to the average of the two values closest in rank position to k. For example, if the data set consists of the 10 values {1, 4, 6.2, 10, 15, 17.1, 18, 22, 25, 30.5}, the position of the median would be found as 50*(10+1)/100 = 5.5. The median would then be computed as the average of the 5th and 6th ordered values, or (15+17.1)/2 = 16.05.

Likewise, the position of the lower quartile would be 25*(10+1)/100 = 2.75. Calculate the average of the 2nd and 3rd ordered observations to estimate this percentile, i.e., (4+6.2)/2 = 5.1. Since the upper quartile is found to be 23.5, the length of the Box Plot would be the difference between the upper and lower quartiles, or (23.5−5.1) = 18.4. The box itself should be drawn on a graph with the y-axis representing concentration and the x-axis denoting the wells being plotted. Three horizontal lines are drawn for each well, one line each at the lower and upper quartiles and another at the median concentration. Vertical connecting lines are drawn to complete the box.

Most statistics packages can directly calculate the statistics needed to draw a Box Plot, and many will construct the Box Plots as well. In some computer packages, the Box Plot will also have two "whiskers" extending from the edges of the box. These lines indicate the positions of extreme values in the data set, but generally should not be used to approximate the overall dispersion.

If the box length for each group is less than 3 times the length of the shortest box, the sample variances are probably close enough to assume equal group variances. If, however, the box length for any group is at least triple the length of the box for another group, the variances may be significantly different (Kirk Cameron, SAIC, personal communication). In that case, the data should be further checked using Levene's test described in the following section. If Levene's test is significant, the data may need to be transformed or a non-parametric rank procedure considered before proceeding with further analysis.
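A short Python sketch of the percentile rule and box-length comparison described above is given below (helper names are illustrative, and the example uses three of the well groups from Example 6 below). Note that the p(n+1)/100 rank convention used here can differ slightly from the default percentile method in some software.

    import numpy as np

    def percentile_pn1(data, p):
        # pth percentile by the p*(n+1)/100 rank rule; average the two
        # bracketing ranked values when the rank is fractional
        x = np.sort(np.asarray(data, dtype=float))
        k = p * (len(x) + 1) / 100.0
        lo = min(max(int(np.floor(k)), 1), len(x)) - 1
        hi = min(max(int(np.ceil(k)), 1), len(x)) - 1
        return 0.5 * (x[lo] + x[hi])

    def box_length(data):
        # Interquartile range: the length of the box for one well group
        return percentile_pn1(data, 75) - percentile_pn1(data, 25)

    # Rule of thumb from the text: flag the groups if any box is at least
    # three times as long as another (then follow up with Levene's test)
    lengths = [box_length(w) for w in ([22.9, 3.09, 35.7, 4.18],    # Well 1
                                       [2.0, 109.4, 4.5, 2.5],      # Well 3
                                       [0.34, 4.78, 2.85, 1.2])]    # Well 6
    print(max(lengths) >= 3 * min(lengths))                         # True here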

EXAMPLE 6

Construct Box Plots for each well group to test for equality of variances.

Arsenic Concentration (ppm)

Month    Well 1    Well 2    Well 3    Well 4    Well 5    Well 6
  1       22.9      2.0        2.0      7.84      24.9      0.34
  2        3.09     1.25     109.4      9.3        1.3      4.78
  3       35.7      7.8        4.5     25.9        0.75     2.85
  4        4.18    52.0        2.5      2.0       27.0      1.2


SOLUTION

Step 1. Compute the 25th, 50th, and 75th percentiles for the data in each well group. To calculate the pth percentile by hand, order the data from lowest to highest. Calculate p*(n+1)/100 to find the ordered position of the pth percentile. If necessary, interpolate between sample values to estimate the desired percentile.

Step 2. Using Well 1 as an example, n+1=5 (since there are 4 data values). To calculate the 25th percentile, compute its ordered position (i.e., rank) as 25*5/100=1.25. Average the 1st and 2nd ranked values at Well 1 (i.e., 3.09 and 4.18) to find an estimated lower quartile of 3.64. This estimate gives the lower end of the Box Plot. The upper end or 75th percentile can be computed similarly as the average of the 3rd and 4th ranked values, or (22.9+35.7)/2=29.3. The median is the average of the 2nd and 3rd ranked values, giving an estimate of 13.54.

Step 3. Construct Box Plots for each well group, lined up side by side on the same axes.

[Figure: Box Plots of Well Data. Arsenic concentration (ppm) on the y-axis, Wells 1 through 6 on the x-axis.]

Step 4. Since the box length for Well 3 is more than three times the box lengths for Wells 4 and 6, there is evidence that the group variances may be significantly different. These data should be further checked using Levene's test described in the next section.


1.2.2 Levene's Test

Levene's test is a more formal procedure than Box Plots for testing homogeneity of variance that, unlike Bartlett's test, is not sensitive to non-Normality in the data. Levene's test has been shown to have power nearly as great as Bartlett's test for Normally distributed data and power superior to Bartlett's for non-Normal data (Milliken and Johnson, 1984).

To conduct Levene's test, first compute the new variables

    zij = | xij − x̄i |

where xij represents the jth value from the ith well and x̄i is the ith well mean. The values zij represent the absolute values of the usual residuals. Then run a standard one-way analysis of variance (ANOVA) on the variables zij. If the F-test is significant, reject the hypothesis of equal group variances. Otherwise, proceed with analysis of the xij's as initially planned.
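Because Levene's test is just a one-way ANOVA on the absolute residuals, it is easy to script. The sketch below (an illustrative helper) uses scipy's f_oneway on the zij values; applied to the arsenic data of Example 7 it reproduces F of about 4.56. For what it is worth, scipy.stats.levene(..., center='mean') performs the same calculation directly.

    import numpy as np
    from scipy import stats

    def levene_via_anova(*groups):
        # Absolute residuals z_ij = |x_ij - group mean|, then a one-way ANOVA on them
        residuals = [np.abs(np.asarray(g, dtype=float) - np.mean(g)) for g in groups]
        return stats.f_oneway(*residuals)

    wells = ([22.9, 3.09, 35.7, 4.18],     # Well 1
             [2.0, 1.25, 7.8, 52],         # Well 2
             [2.0, 109.4, 4.5, 2.5],       # Well 3
             [7.84, 9.3, 25.9, 2.0],       # Well 4
             [24.9, 1.3, 0.75, 27],        # Well 5
             [0.34, 4.78, 2.85, 1.2])      # Well 6
    F, p = levene_via_anova(*wells)        # F is about 4.56, p about 0.007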

EXAMPLE 7

Use the data from Example 6 to conduct Levene's test of equal variances.

SOLUTION

Step 1. Calculate the group mean for each well (x̄i):

    Well 1 mean = 16.47     Well 4 mean = 11.26
    Well 2 mean = 15.76     Well 5 mean = 13.49
    Well 3 mean = 29.60     Well 6 mean =  2.29


Step 2. Compute the absolute residuals zij in each well and the well means of the residuals (z̄i).

                               Absolute Residuals
Month        Well 1    Well 2    Well 3    Well 4    Well 5    Well 6
  1            6.43     13.76     27.6       3.42     11.41      1.95
  2           13.38     14.51     79.8       1.96     12.19      2.49
  3           19.23      7.96     25.1      14.64     12.74      0.56
  4           12.29     36.24     27.1       9.26     13.51      1.09

Well Mean
  (z̄i)       12.83     18.12     39.9       7.32     12.46      1.52

Overall Mean (z̄) = 15.36

Step 3. Compute the sums of squares for the absolute residuals:

    SS(TOTAL) = (N−1)·SDz² = 6300.89
    SS(WELLS) = Σ ni·z̄i² − N·z̄² = 3522.90
    SS(ERROR) = SS(TOTAL) − SS(WELLS) = 2777.99

Step 4. Construct an analysis of variance table to calculate the F-statistic. The degrees of freedom (df) are computed as (# groups − 1) = (6−1) = 5 df and (# samples − # groups) = (24−6) = 18 df.

ANOVA Table

Source             Sum-of-Squares     df     Mean-Square     F-Ratio      P
Between Wells          3522.90         5        704.58         4.56     0.007
Error                  2777.99        18        154.33
Total                  6300.89        23

Step 5. Since the F-statistic of 4.56 exceeds the tabulated value of F.05(5,18) = 2.77, the assumption of equal variances should be rejected. Since the original concentration data are used in this example, the data should be logged and retested.


2. RECOMMENDATIONS FOR HANDLING NONDETECTS

The basic recommendations within the Interim Final Guidance for handling nondetect analyses include the following (see p. 8-2): 1) if less than 15 percent of all samples are nondetect, replace each nondetect by half its detection or quantitation limit and proceed with a parametric analysis, such as ANOVA, Tolerance Limits, or Prediction Limits; 2) if the percent of nondetects is between 15 and 50, either use Cohen's adjustment to the sample mean and variance in order to proceed with a parametric analysis, or employ a non-parametric procedure by using the ranks of the observations and by treating all nondetects as tied values; 3) if the percent of nondetects is greater than 50 percent, use the Test of Proportions.

As to the first recommendation, experience at EPA and research at the United States Geological Survey (USGS; Dennis Helsel, personal communication, 1991) has indicated that if less than 15 percent of the samples are nondetect, the results of parametric statistical tests will not be substantially affected if nondetects are replaced by half their detection limits. When more than 15 percent of the samples are nondetect, however, the handling of nondetects is more crucial to the outcome of statistical procedures. Indeed, simple substitution methods tend to perform poorly in statistical tests when the nondetect percentage is substantial (Gilliom and Helsel, 1986).

Even with a small proportion of nondetects, however, care should be taken when choosing between the method detection limit (MDL) and the practical quantitation limit (PQL) in characterizing "nondetect" concentrations. Many nondetects are characterized by analytical laboratories with one of three data qualifier flags: "U," "J," or "E." Samples with a "U" data qualifier represent "undetected" measurements, meaning that the signal characteristic of that analyte could not be observed or distinguished from "background noise" during lab analysis. Inorganic samples with an "E" flag and organic samples with a "J" flag may or may not be reported with an estimated concentration. If no concentration is estimated, these samples represent "detected but not quantified" measurements. In this case, the actual concentration is assumed to be positive, but somewhere between zero and the PQL. Since all of these nondetects may or may not have actual positive concentrations between zero and the PQL, the suggested substitution for parametric procedures is to replace each nondetect by one-half the PQL (note, however, that "E" and "J" samples reported with estimated concentrations should be treated, for statistical purposes, as valid measurements. Substitution of one-half the PQL is not recommended for these samples).
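As a simple illustration of this substitution rule (the helper and flag names below are hypothetical, and the PQL is analyte- and laboratory-specific), a data set intended for parametric analysis could be prepared as follows when nondetects are few:

    import numpy as np

    def substitute_half_pql(values, is_nondetect, pql):
        # Replace each nondetect (a "U" result, or an unquantified "J"/"E" result)
        # by one-half the PQL; detected and quantified results are left unchanged
        out = np.asarray(values, dtype=float).copy()
        out[np.asarray(is_nondetect, dtype=bool)] = 0.5 * pql
        return out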


In no case should nondetect concentrations be assumed to be bounded above by the MDL. The MDL is estimated on the basis of ideal laboratory conditions with ideal analyte samples and does not account for matrix or other interferences encountered when analyzing specific, actual field samples. For this reason, the PQL should be taken as the most reasonable upper bound for nondetect concentrations.

It should also be noted that the distinction between "undetected" and "detected but not quantified" measurements has more specific implications for rank-based non-parametric procedures. Rather than assigning the same tied rank to all nondetects (see below and in Section 3), "detected but not quantified" measurements should be given larger ranks than those assigned to "undetected" samples. In fact, the two types of nondetects should be treated as two distinct groups of tied observations for use in the Wilcoxon and Kruskal-Wallis non-parametric procedures.

2.1 NONDETECTS IN ANOVA PROCEDURES

For a moderate to large percentage of nondetects (i.e., over 15%), the handling of nondetectsshould vary depending on the statistical procedure to be run. If background data from one or moreupgradient wells are to be compared simultaneously with samples from one or more downgradientwells via a t-test or ANOVA type procedure, the simplest and most reliable recommendation is toswitch to a non-parametric analysis. The distributional assumptions for parametric procedures canbe rather difficult to check when a substantial fraction of nondetects exists. Furthermore, the non-parametric alternatives described in Section 3 tend to be efficient at detecting contamination whenthe underlying data are Normally distributed, and are often more powerful than the parametricmethods when the underlying data do not follow a Normal distribution.

Nondetects are handled easily in a nonparametric analysis. All data values are first orderedand replaced by their ranks. Nondetects are treated as tied values and replaced by their midranks(see Section 3). Then a Wilcoxon Rank-Sum or Kruskal-Wallis test is run on the ranked datadepending on whether one or more than one downgradient well is being tested.
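
As a brief illustration of the ranking step (the data below are hypothetical), coding every nondetect with a single placeholder value below all detects causes them to share one midrank when the observations are ranked:

    from scipy import stats

    values = [0.5, 0.5, 0.5, 7.2, 9.8, 0.5, 12.1]   # 0.5 stands in for "<DL"
    print(stats.rankdata(values))   # the four nondetects each receive midrank 2.5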

The Test of Proportions is not recommended in this Addendum, even if the percentage of nondetects is over 50 percent. Instead, for all two-group comparisons that involve more than 15 percent nondetects, the non-parametric Wilcoxon Rank-Sum procedure is recommended. Although acceptable as a statistical procedure, the Test of Proportions does not account for potentially different magnitudes among the concentrations of detected values. Rather, each sample is treated as a 0 or 1 depending on whether the measured concentration is below or above the detection limit.


The Test of Proportions ignores information about concentration magnitudes, and hence is usually less powerful than a non-parametric rank-based test like the Wilcoxon Rank-Sum, even after adjusting for a large fraction of tied observations (e.g., nondetects). This is because the ranks of a dataset preserve additional information about the relative magnitudes of the concentration values, information which is lost when all observations are scored as 0's and 1's.

Another drawback to the Test of Proportions, as presented in the Interim Final Guidance, is that the procedure relies on a Normal probability approximation to the Binomial distribution of 0's and 1's. This approximation is recommended only when the quantities n x (%NDs) and n x (1-%NDs) are no smaller than 5. If the percentage of nondetects is quite high and/or the sample size is fairly small, these conditions may be violated, leading potentially to inaccurate results.

Comparison of the Test of Proportions to the Wilcoxon Rank-Sum test shows that for small to moderate proportions of nondetects (say 0 to 60 percent), the Wilcoxon Rank-Sum procedure adjusted for ties is more powerful in identifying real concentration differences than the Test of Proportions. When the percentage of nondetects is quite high (at least 70 to 75 percent), the Test of Proportions appears to be slightly more powerful in some cases than the Wilcoxon, but the results of the two tests almost always lead to the same conclusion, so it makes sense to simply recommend the Wilcoxon Rank-Sum test in all cases where nondetects constitute more than 15 percent of the samples.

2.2 NONDETECTS IN STATISTICAL INTERVALS

If the chosen method is a statistical interval (Confidence, Tolerance, or Prediction limit) used to compare background data against each downgradient well separately, many options are available for handling moderate proportions of nondetects. The basis of any parametric statistical interval limit is the formula x̄ ± K·s, where x̄ and s represent the sample mean and standard deviation of the (background) data and K depends on the interval type and characteristics of the monitoring network. To use a parametric interval in the presence of a substantial number of nondetects, it is necessary to estimate the sample mean and standard deviation. But since nondetect concentrations are unknown, simple formulas for the mean and standard deviation cannot be computed directly. Two basic approaches to estimating or "adjusting" the mean and standard deviation in this situation have been described by Cohen (1959) and Aitchison (1955).

The underlying assumptions of these procedures are somewhat different. Cohen's adjustment (which is described in detail on pp. 8-7 to 8-11 of the Interim Final Guidance) assumes


that all the data (detects and nondetects) come from the same Normal or Lognormal population, but that nondetect values have been "censored" at their detection limits. This implies that the contaminant of concern is present in nondetect samples, but the analytical equipment is not sensitive to concentrations lower than the detection limit. Aitchison's adjustment, on the other hand, is constructed on the assumption that nondetect samples are free of contamination, so that all nondetects may be regarded as zero concentrations. In some situations, particularly when the analyte of concern has been detected infrequently in background measurements, this assumption may be practical, even if it cannot be verified directly.

Before choosing between Cohen's and Aitchison's approaches, it should be cautioned that Cohen's adjustment may not give valid results if the proportion of nondetects exceeds 50%. In a case study by McNichols and Davis (1988), the false positive rate associated with the use of t-tests based on Cohen's method rose substantially when the fraction of nondetects was greater than 50%. This occurred because the adjusted estimates of the mean and standard deviation are more highly correlated as the percentage of nondetects increases, leading to less reliable statistical tests (including statistical interval tests).

On the other hand, with less than 50% nondetects, Cohen's method performed adequately in the McNichols and Davis case study, provided the data were not overly skewed and that more extensive tables than those included within the Interim Final Guidance were available to calculate Cohen's adjustment parameter. As a remedy to the latter caveat, a more extensive table of Cohen's adjustment parameter is provided in Appendix A (Table A-5). It is also recommended that the data (detected measurements and nondetect detection limits) first be log-transformed prior to computing either Cohen's or Aitchison's adjustment, especially since both procedures assume that the underlying data are Normally distributed.

2.2.1 Censored and Detects-Only Probability Plots

To decide which approach is more appropriate for a particular set of ground water data, two separate Probability Plots can be constructed. The first is called a Censored Probability Plot and is a test of Cohen's underlying assumption. In this method, the combined set of detects and nondetects is ordered (with nondetects being given arbitrary but distinct ranks). Cumulative probabilities or Normal quantiles (see Section 1.1) are then computed for the data set as in a regular Probability Plot. However, only the detected values and their associated Normal quantiles are actually plotted. If the shape of the Censored Probability Plot is reasonably linear, then Cohen's assumption that nondetects have been "censored" at their detection limit is probably


acceptable and Cohen's adjustment can be made to estimate the sample mean and standard deviation. If the Censored Probability Plot has significant bends and curves, particularly in one or both tails, one might consider Aitchison's procedure instead.

To test the assumptions of Aitchison's method, a Detects-Only Probability Plot may be constructed. In this case, nondetects are completely ignored and a standard Probability Plot is constructed using only the detected measurements. Thus, cumulative probabilities or Normal quantiles are computed only for the ordered detected values. Comparison of a Detects-Only Probability Plot with a Censored Probability Plot will indicate that the same number of points and concentration values are plotted on each graph. However, different Normal quantiles are associated with each detected concentration. If the Detects-Only Probability Plot is reasonably linear, then the assumptions underlying Aitchison's adjustment (i.e., that "nondetects" represent zero concentrations, and that detects and nondetects follow separate probability distributions) are probably reasonable.

If it is not clear which of the Censored or Detects-Only Probability Plots is more linear, Probability Plot Correlation Coefficients can be computed for both approaches (note that the correlations should only involve the points actually plotted, that is, detected concentrations). The plot with the higher correlation coefficient will represent the most linear trend. Be careful, however, to use other, non-statistical judgments to help decide which of Cohen's and Aitchison's underlying assumptions appears to be most reasonable based on the specific characteristics of the data set. It is also likely that these Probability Plots may have to be constructed on the logarithms of the data instead of the original values, if in fact the most appropriate underlying distribution is the Lognormal instead of the Normal.
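
A short sketch of the two sets of plotting positions is given below. It is not part of the guidance; the function name is illustrative. It computes the correlation between the ordered detects and the Normal quantiles obtained under each convention (detects ranked within the full data set for the Censored plot, and among the detects alone for the Detects-Only plot). Applied to the zinc data of Example 8, it should closely reproduce the correlations reported there.

    import numpy as np
    from scipy import stats

    def probability_plot_correlations(detects, n_nondetects):
        """Correlations for the Censored and Detects-Only Probability Plots."""
        detects = np.sort(np.asarray(detects, dtype=float))
        n_total = detects.size + n_nondetects
        # Censored plot: detects occupy the upper ranks of the combined data set
        q_cens = stats.norm.ppf(np.arange(n_nondetects + 1, n_total + 1) / (n_total + 1.0))
        # Detects-only plot: ranks computed from the detected values alone
        q_det = stats.norm.ppf(np.arange(1, detects.size + 1) / (detects.size + 1.0))
        return np.corrcoef(detects, q_cens)[0, 1], np.corrcoef(detects, q_det)[0, 1]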

EXAMPLE 8

Create Censored and Detects-Only Probability Plots with the following zinc data to determine whether Cohen's adjustment or Aitchison's adjustment is most appropriate for estimating the true mean and standard deviation.


Zinc Concentrations (ppb) at Background Wells

Sample    Well 1    Well 2    Well 3    Well 4    Well 5
  1         <7        <7        <7      11.69       <7
  2       11.41       <7      12.85     10.90       <7
  3         <7      13.70     14.20       <7        <7
  4         <7      11.56      9.36     12.22     11.15
  5         <7        <7        <7      11.05     13.31
  6       10.00       <7      12.00       <7      12.35
  7       15.00     10.50       <7      13.24       <7
  8         <7      12.59       <7        <7       8.74

SOLUTION

Step 1. Pool together the data from the five background wells and list in order in the table below.

Step 2. To construct the Censored Probability Plot, compute the probabilities i/(n+1) using the combined set of detects and nondetects, as in column 3. Find the Normal quantiles associated with these probabilities by applying the inverse standard Normal transformation, as in column 4.

Step 3. To construct the Detects-Only Probability Plot, compute the probabilities in column 5 using only the detected zinc values. Again apply the inverse standard Normal transformation to find the associated Normal quantiles in column 6. Note that nondetects are ignored completely in this method.


Order    Zinc Conc.   Censored    Normal      Detects-Only   Normal
 (i)       (ppb)       Probs.     Quantiles      Probs.      Quantiles
  1         <7          .024       -1.971
  2         <7          .049       -1.657
  3         <7          .073       -1.453
  4         <7          .098       -1.296
  5         <7          .122       -1.165
  6         <7          .146       -1.052
  7         <7          .171       -0.951
  8         <7          .195       -0.859
  9         <7          .220       -0.774
 10         <7          .244       -0.694
 11         <7          .268       -0.618
 12         <7          .293       -0.546
 13         <7          .317       -0.476
 14         <7          .341       -0.408
 15         <7          .366       -0.343
 16         <7          .390       -0.279
 17         <7          .415       -0.216
 18         <7          .439       -0.153
 19         <7          .463       -0.092
 20         <7          .488       -0.031
 21        8.74         .512        0.031        .048         -1.668
 22        9.36         .537        0.092        .095         -1.309
 23       10.00         .561        0.153        .143         -1.068
 24       10.50         .585        0.216        .190         -0.876
 25       10.90         .610        0.279        .238         -0.712
 26       11.05         .634        0.343        .286         -0.566
 27       11.15         .659        0.408        .333         -0.431
 28       11.41         .683        0.476        .381         -0.303
 29       11.56         .707        0.546        .429         -0.180
 30       11.69         .732        0.618        .476         -0.060
 31       12.00         .756        0.694        .524          0.060
 32       12.22         .780        0.774        .571          0.180
 33       12.35         .805        0.859        .619          0.303
 34       12.59         .829        0.951        .667          0.431
 35       12.85         .854        1.052        .714          0.566
 36       13.24         .878        1.165        .762          0.712
 37       13.31         .902        1.296        .810          0.876
 38       13.70         .927        1.453        .857          1.068
 39       14.20         .951        1.657        .905          1.309
 40       15.00         .976        1.971        .952          1.668

Step 4. Plot the detected zinc concentrations versus each set of probabilities or Normal quantiles, as per the procedure for constructing Probability Plots (see figures below). The nondetect values should not be plotted. As can be seen from the graphs, the Censored Probability Plot indicates a definite curvature in the tails, especially the lower tail. The Detects-Only Probability Plot, however, is reasonably linear. This visual impression is bolstered by calculation of a Probability Plot Correlation Coefficient for each set of


detected values: the Censored Probability Plot has a correlation of r=.969, while the Detects-Only Probability Plot has a correlation of r=.998.

Step 5. Because the Detects-Only Probability Plot is substantially more linear than the Censored Probability Plot, it may be appropriate to consider detects and nondetects as arising from statistically distinct distributions, with nondetects representing "zero" concentrations. Therefore, Aitchison's adjustment may lead to better estimates of the true mean and standard deviation than Cohen's adjustment for censored data.

[Figure: Censored Probability Plot, Normal quantiles versus zinc concentration (ppb)]


[Figure: Detects-Only Probability Plot, Normal quantiles versus zinc concentration (ppb)]

2.2.2 Aitchison's Adjustment

To actually compute Aitchison's adjustment (Aitchison, 1955), it is assumed that the detected samples follow an underlying Normal distribution. If the detects are Lognormal, compute Aitchison's adjustment on the logarithms of the data instead. Let d = the number of nondetects and let n = the total number of samples (detects and nondetects combined). Then if x̄* and s* denote, respectively, the sample mean and standard deviation of the detected values, the adjusted overall mean can be estimated as

    μ̂ = (1 − d/n) · x̄*

and the adjusted overall standard deviation may be estimated as the square root of the quantity

    σ̂² = [(n − d − 1)/(n − 1)] · (s*)² + [d(n − d)/(n(n − 1))] · (x̄*)²

The general formula for a parametric statistical interval adjusted for nondetects by Aitchison's method is given by μ̂ ± K·σ̂, with K depending on the type of interval being constructed.


EXAMPLE 9

In Example 8, it was determined that Aitchison's adjustment might lead to more appropriate estimates of the true mean and standard deviation than Cohen's adjustment. Use the data in Example 8 to compute Aitchison's adjustment.

SOLUTION

Step 1. The zinc data consist of 20 nondetects and 20 detected values; therefore d=20 and n=40 in the above formulas.

Step 2. Compute the average x̄* = 11.891 and the standard deviation s* = 1.595 of the set of detected values.

Step 3. Use the formulas for Aitchison's adjustment to compute estimates of the true mean and standard deviation:

    μ̂ = (1 − 20/40) × 11.891 = 5.95

    σ̂² = [(40 − 20 − 1)/39] × (1.595)² + [(20 × 20)/(40 × 39)] × (11.891)² ≈ 37.5, so that σ̂ ≈ 6.1

If Cohen's adjustment is mistakenly computed on these data instead, with a detection limit of 7 ppb, the estimates become μ̂ ≈ 7.63 and σ̂ ≈ 4.83. Thus, the choice of adjustment can have a significant impact on the upper limits computed for statistical intervals.
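
A minimal sketch of the calculation, assuming the formulas quoted in this section and using illustrative function and variable names, is given below.

    import numpy as np

    def aitchison_adjustment(detects, n_total):
        """Adjusted mean and standard deviation, treating nondetects as zero concentrations."""
        detects = np.asarray(detects, dtype=float)
        d = n_total - detects.size                      # number of nondetects
        xbar, s = detects.mean(), detects.std(ddof=1)   # statistics of the detects only
        mu_adj = (1.0 - d / n_total) * xbar
        var_adj = ((n_total - d - 1) / (n_total - 1)) * s**2 \
                  + (d * (n_total - d) / (n_total * (n_total - 1))) * xbar**2
        return mu_adj, np.sqrt(var_adj)

    # With the 20 detected zinc values of Example 8 and n = 40, this returns a mean
    # near 5.95 ppb and a standard deviation near 6.1 ppb (the square root of 37.5).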

2.2.3 More Than 50% Nondetects

If more than 50% but less than 90% of the samples are nondetect, or the assumptions of Cohen's and Aitchison's methods cannot be justified, parametric statistical intervals should be abandoned in favor of non-parametric alternatives (see Section 3 below). Non-parametric statistical intervals are easy to construct and apply to ground-water data measurements, and no special steps need be taken to handle nondetects.

When 90% or more of the data values are nondetect (as often occurs when measuring volatile organic compounds [VOCs] in ground water, for instance), the detected samples can often be modeled as "rare events" by using the Poisson distribution. The Poisson model describes the behavior of a series of independent events over a large number of trials, where the probability of occurrence is low but stays constant from trial to trial. The Poisson model is similar to the Binomial model in that both models represent "counting processes." In the Binomial case, nondetects are counted as 'misses' or zeroes and detects are counted (regardless of contamination level) as 'hits' or ones;


in the case of the Poisson, each particle or molecule of contamination is counted separately but cumulatively, so that the counts for detected samples with high concentrations are larger than counts for samples with smaller concentrations. As Gibbons (1987, p. 574) has noted, it can be postulated

...that the number of molecules of a particular compound out of a much larger number of molecules of water is the result of a Poisson process. For example, we might consider 12 ppb of benzene to represent a count of 12 units of benzene for every billion units examined. In this context, Poisson's approach is justified in that the number of units (i.e., molecules) is large, and the probability of the occurrence (i.e., a molecule being classified as benzene) is small.

For a detect with a concentration of 50 ppb, the Poisson count would be 50. Counts for nondetects can be taken as zero or perhaps equal to half the detection limit (e.g., if the detection limit were 10 ppb, the Poisson count for that sample would be 5). Unlike the Binomial (Test of Proportions) model, the Poisson model has the ability to utilize the magnitudes of detected concentrations in statistical tests.

The Poisson distribution is governed by the average rate of occurrence, λ, which can be estimated by summing the Poisson counts of all samples in the background pool of data and dividing by the number of samples in the pool. Once the average rate of occurrence has been estimated, the formula for the Poisson distribution is given by

    P(x) = e^(−λ) · λ^x / x!

where x represents the Poisson count and λ represents the average rate of occurrence. To use the Poisson distribution to predict concentration values at downgradient wells, formulas for constructing Poisson Prediction and Tolerance limits are given below.

2.2.4 Poisson Prediction Limits

To estimate a Prediction limit at a particular well using the Poisson model, the approach described by Gibbons (1987b) and based on the work of Cox and Hinkley (1974) can be used. In this case, an upper limit is estimated for an interval that will contain all of k future measurements of an analyte with confidence level 1−α, given n previous background measurements.

To do this, let Tn represent the sum of the Poisson counts of n background samples. The goal is to predict Tk*, representing the total Poisson count of the next k sample measurements. As


Cox and Hinkley show, if Tn has a Poisson distribution with mean μ and if no contamination has occurred, it is reasonable to assume that Tk* will also have a Poisson distribution, but with mean cμ, where c depends on the number of future measurements being predicted.

In particular, Cox and Hinkley demonstrate that the quantity

    [Tk* − c(Tn + Tk*)/(1 + c)] / sqrt[ c(Tn + Tk*)/(1 + c)² ]

has an approximate standard Normal distribution. From this relation, an upper prediction limit for Tk* is calculated by Gibbons to be approximately

    Tk* ≤ c·Tn + (c·t²)/2 + t·sqrt[ c·Tn·(1 + c) + (c²·t²)/4 ]

where t = t(n−1, α) is the upper (1−α) percentile of the Student's t distribution with (n−1) degrees of freedom. The quantity c in the above formulas may be computed as k/n, where, as noted, k is the number of future samples being predicted.

EXAMPLE 10

Use the following benzene data from six background wells to estimate an upper 99% Poisson Prediction limit for the next four measurements from a single downgradient well.

Benzene Concentrations (ppb)

Month    Well 1    Well 2    Well 3    Well 4    Well 5    Well 6
  1        <2        <2        <2        <2        <2        <2
  2        <2        <2        <2       15.0       <2        <2
  3        <2        <2        <2        <2        <2        <2
  4        <2       12.0       <2        <2        <2        <2
  5        <2        <2        <2        <2        <2       10.0
  6        <2        <2        <2        <2        <2        <2


SOLUTION

Step 1. Pooling the background data yields n=36 samples, of which 33 (92%) are nondetect. Because the rate of detection is so infrequent (i.e., <10%), a Poisson-based Prediction limit may be appropriate. Since four future measurements are to be predicted, k=4, and hence, c=k/n=1/9.

Step 2. Set each nondetect to half the detection limit or 1 ppb. Then compute the Poisson count of the sum of all the background samples, in this case, Tn = 33(1) + (12.0+15.0+10.0) = 70.0. To calculate an upper 99% Prediction limit, the upper 99th percentile of the t-distribution with (n−1)=35 degrees of freedom must be taken from a reference table, namely t35,.01 = 2.4377.

Step 3. Using Gibbons' formula above, calculate the upper Prediction limit as:

    Tk* ≤ 70(1/9) + (1/9)(2.4377)²/2 + 2.4377·sqrt[ 70(1/9)(1 + 1/9) + (1/9)²(2.4377)²/4 ] ≈ 15.3

Step 4. To test the upper Prediction limit, the Poisson count of the sum of the next four downgradient measurements should be calculated. If this sum is greater than 15.3 ppb, there is significant evidence of contamination at the downgradient well. If not, the well may be regarded as clean until the next testing period.
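
The arithmetic of Steps 1 through 3 can be scripted as follows. This is only a sketch using the prediction limit formula quoted above, with illustrative function names, not a prescribed implementation.

    from scipy import stats

    def poisson_prediction_limit(t_n, n, k, alpha=0.01):
        """Upper (1 - alpha) Poisson prediction limit for the total count of k future samples."""
        c = k / n
        t = stats.t.ppf(1.0 - alpha, df=n - 1)          # upper (1 - alpha) t-percentile
        return c * t_n + c * t**2 / 2.0 + t * (c * t_n * (1 + c) + c**2 * t**2 / 4.0) ** 0.5

    # Example 10: T_n = 70, n = 36 background samples, k = 4 future measurements
    print(round(poisson_prediction_limit(70.0, 36, 4), 1))   # approximately 15.3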

The procedure for generating Poisson prediction limits is somewhat flexible. The value k above, for instance, need not represent multiple samples from a single well. It could also denote a collection of single samples from k distinct wells, all of which are assumed to follow the same Poisson distribution in the absence of contamination. The Poisson distribution also has the desirable property that the sum of several Poisson variables also has a Poisson distribution, even if the individual components are not identically distributed. Because of this, Gibbons (1987b) has suggested that if several analytes (e.g., different VOCs) can all be modeled via the Poisson distribution, the combined sum of the Poisson counts of all the analytes will also have a Poisson distribution, meaning that a single prediction limit could be estimated for the combined group of analytes, thus reducing the necessary number of statistical tests.

A major drawback to Gibbons' proposal of establishing a combined prediction limit for several analytes is that if the limit is exceeded, it will not be clear which analyte is responsible for "triggering" the test. In part, this problem explains why the ground-water monitoring regulations mandate that each analyte be tested separately. Still, if a large number of analytes must be regularly tested and the detection rate is quite low, the overall facility-wide false positive rate may be unacceptably high. To remedy this situation, it is probably wisest to do enough initial testing of background and facility leachate and waste samples to determine those specific parameters present at levels substantially greater than background. By limiting monitoring and statistical tests to a few


parameters meeting the above conditions, it should be possible to contain the overall facility-wide false positive rate while satisfying the regulatory requirements and assuring reliable identification of ground-water contamination if it occurs.

Though quantitative information on a suite of VOCs may be automatically generated as a consequence of the analytical method configuration (e.g., SW-846 method 8260 can provide quantitative results for approximately 60 different compounds), it is usually unnecessary to designate all of these compounds as leak detection indicators. Such practice generally aggravates the problem of many comparisons and results in elevated false positive rates for the facility as a whole. This makes accurate statistical testing especially difficult. EPA therefore recommends that the results of leachate testing or the waste analysis plan serve as the primary basis for designating reliable leak detection indicator parameters.

2.2.5 Poisson Tolerance Limits

To apply an upper Tolerance limit using the Poisson model to a group of downgradient wells, the approach described by Gibbons (1987b) and based on the work of Zacks (1970) can be taken. In this case, if no contamination has occurred, the estimated interval upper limit will contain a large fraction of all measurements from the downgradient wells, often specified at 95% or more.

The calculations involved in deriving Poisson Tolerance limits can seem non-intuitive, primarily because the argument leading to a mathematically rigorous Tolerance limit is complicated. The basic idea, however, uses the fact that if each individual measurement follows a common Poisson distribution with rate parameter λ, the sum of n such measurements will also follow a Poisson distribution, this time with rate nλ.

Because the Poisson distribution has the property that its true mean is equal to the rate parameter λ, the concentration sum of n background samples can be manipulated to estimate this rate. But since we know that the distribution of the concentration sum is also Poisson, the possible values of λ can actually be narrowed to within a small range with fixed confidence probability (γ).

For each "possible" value of λ in this confidence range, one can compute the percentile of the Poisson distribution with rate λ that would lie above, say, 95% of all future downgradient measurements. By setting as the "probable" rate that λ which is greater than all but a small


percentage α of the most extreme possible λ's, given the values of n background samples, one can compute an upper tolerance limit with, say, 95% coverage and (1−α)% confidence.

To actually make these computations, Zacks (1970) shows that the most probable rate λTn can be calculated approximately as

    λTn = (1/(2n)) · χ²γ[2Tn + 2]

where, as before, Tn represents the Poisson count of the sum of n background samples (setting nondetects to half the method detection limit), and χ²γ[2Tn + 2] represents the γ percentile of the Chi-square distribution with (2Tn + 2) degrees of freedom.

To find the upper Tolerance limit with β% coverage (e.g., 95%) once a probable rate λTn has been estimated, one must compute the Poisson percentile that is larger than β% of all possible measurements from that distribution, that is, the β% quantile of the Poisson distribution with mean rate λTn, denoted by P⁻¹(β; λTn). Using a well-known mathematical relationship between the Poisson and Chi-square distributions, finding the β% quantile of the Poisson amounts to determining the least positive integer k such that

    χ²(1−β)[2k + 2] ≥ 2·λTn

where, as above, the quantity [2k + 2] represents the degrees of freedom of the Chi-square distribution. By calculating two times the estimated probable rate λTn on the right-hand side of the above inequality, and then finding the smallest degrees of freedom so that the (1−β) percentile of the Chi-square distribution is bigger than 2λTn, the upper tolerance limit k can be determined fairly easily.

Once the upper tolerance limit, k, has been estimated, it will represent an upper Poisson Tolerance limit having approximately β% coverage with γ% confidence in all comparisons with downgradient well measurements.


EXAMPLE 11

Use the benzene data of Example 10 to estimate an upper Poisson Tolerance limit with 95% coverage and 95% confidence probability.

SOLUTION

Step 1. The benzene data consist of 33 nondetects with detection limit equal to 2 ppb and 3 detected values, for a total of n=36. By setting each nondetect to half the detection limit as before, one finds a total Poisson count of the sum equal to Tn=70.0. It is also known that the desired confidence probability is γ=.95 and the desired coverage is β=.95.

Step 2. Based on the observed Poisson count of the sum of background samples, estimate the probable occurrence rate λTn using Zacks' formula above as

    λTn = (1/72) · χ².95[142] = 2.37

Step 3. Compute twice the probable occurrence rate as 2λTn = 4.74. Now using a Chi-square table, find the smallest degrees of freedom (2k+2) such that

    χ².05[2k + 2] ≥ 4.74

Since the 5th percentile of the Chi-square distribution with 12 df equals 5.23 (but only 4.57 with 11 df), it is seen that (2k+2) = 12, leading to k = 5. Therefore, the upper Poisson Tolerance limit is estimated as k = 5 ppb.

Step 4. Because the estimated upper Tolerance limit with 95% coverage equals 5 ppb, any detected value among downgradient samples greater than 5 ppb may indicate possible evidence of contamination.
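
The search for the smallest k described above is easy to automate. The sketch below assumes the Zacks formula quoted in this section and uses illustrative function names; it reproduces the limit of Example 11.

    from scipy import stats

    def poisson_tolerance_limit(t_n, n, coverage=0.95, confidence=0.95):
        """Upper Poisson tolerance limit with the stated coverage and confidence."""
        # Probable occurrence rate: chi-square percentile with 2*T_n + 2 df, divided by 2n
        lam = stats.chi2.ppf(confidence, df=2 * t_n + 2) / (2.0 * n)
        # Smallest k whose chi-square lower percentile with 2k + 2 df exceeds 2*lambda
        k = 0
        while stats.chi2.ppf(1.0 - coverage, df=2 * k + 2) < 2.0 * lam:
            k += 1
        return k

    # Example 11: T_n = 70 and n = 36 give an upper tolerance limit of 5 ppb
    print(poisson_tolerance_limit(70, 36))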


3. NON-PARAMETRIC COMPARISON OF COMPLIANCE WELL DATA TO BACKGROUND

When concentration data from several compliance wells are to be compared with concentration data from background wells, one basic approach is analysis of variance (ANOVA). The ANOVA technique is used to test whether there is statistically significant evidence that the mean concentration of a constituent is higher in one or more of the compliance wells than the baseline provided by background wells. Parametric ANOVA methods make two key assumptions: 1) that the data residuals are Normally distributed and 2) that the group variances are all approximately equal. The steps for calculating a parametric ANOVA are given in the Interim Final Guidance (pp. 5-6 to 5-14).

If either of the two assumptions crucial to a parametric ANOVA is grossly violated, it is recommended that a non-parametric test be conducted using the ranks of the observations rather than the original observations themselves. The Interim Final Guidance describes the Kruskal-Wallis test when three or more well groups (including background data, see pp. 5-14 to 5-20) are being compared. However, the Kruskal-Wallis test is not amenable to two-group comparisons, say of one compliance well to background data. In this case, the Wilcoxon Rank-Sum procedure (also known as the Mann-Whitney U Test) is recommended and explained below. Since most situations will involve the comparison of at least two downgradient wells with background data, the Kruskal-Wallis test is presented first with an additional example.

3.1 KRUSKAL-WALLIS TEST

When the assumptions used in a parametric analysis of variance cannot be verified, e.g., when the original or transformed residuals are not approximately Normal in distribution or have significantly different group variances, an analysis can be performed using the ranks of the observations. Usually, a non-parametric procedure will be needed when a substantial fraction of the measurements are below detection (more than 15 percent), since then the above assumptions are difficult to verify.

The assumption of independence of the residuals is still required. Under the null hypothesis that there is no difference among the groups, the observations are assumed to come from identical distributions. However, the form of the distribution need not be specified.


A non-parametric ANOVA can be used in any situation that the parametric analysis of variance can be used. However, because the ranks of the data are being used, the minimum sample sizes for the groups must be a little larger. A useful rule of thumb is to require a minimum of three well groups with at least four observations per group before using the Kruskal-Wallis procedure.

Non-parametric procedures typically need a few more observations than parametric procedures for two reasons. On the one hand, non-parametric tests make fewer assumptions concerning the distribution of the data, and so more data is often needed to make the same judgment that would be rendered by a parametric test. Also, procedures based on ranks have a discrete distribution (unlike the continuous distributions of parametric tests). Consequently, a larger sample size is usually needed to produce test statistics that will be significant at a specified alpha level such as 5 percent.

The relative efficiency of two procedures is defined as the ratio of the sample sizes needed by each to achieve a certain level of power against a specified alternative hypothesis. As sample sizes get larger, the efficiency of the Kruskal-Wallis test relative to the parametric analysis of variance test approaches a limit that depends on the underlying distribution of the data, but is always at least 86 percent. This means roughly that in the worst case, if 86 measurements are available for a parametric ANOVA, only 100 sample values are needed to have an equivalently powerful Kruskal-Wallis test. In many cases, the increase in sample size necessary to match the power of a parametric ANOVA is much smaller or not needed at all. The efficiency of the Kruskal-Wallis test is 95 percent if the data are really Normal, and can be much larger than 100 percent in other cases (e.g., it is 150 percent if the residuals follow a distribution called the double exponential).

These results concerning efficiency imply that the Kruskal-Wallis test is reasonably powerful for detecting concentration differences despite the fact that the original data have been replaced by their ranks, and can be used even when the data are Normally distributed. When the data are not Normal or cannot be transformed to Normality, the Kruskal-Wallis procedure tends to be more powerful for detecting differences than the usual parametric approach.

3.1.1 Adjusting for Tied Observations

Frequently, the Kruskal-Wallis procedure will be used when the data contain a significant fraction of nondetects (e.g., more than 15 percent of the samples). In these cases, the parametric assumptions necessary for the usual one-way ANOVA are difficult or impossible to verify, making


the non-parametric alternative attractive. However, the presence of nondetects prevents a unique ranking of the concentration values, since nondetects are, up to the limit of measurement, all tied at the same value.

To get around this problem, two steps are necessary. First, in the presence of ties (e.g., nondetects), all tied observations should receive the same rank. This rank (sometimes called the midrank (Lehmann, 1975)) is computed as the average of the ranks that would be given to a group of ties if the tied values actually differed by a tiny amount and could be ranked uniquely. For example, if the first four ordered observations are all nondetects, the midrank given to each of these samples would be equal to (1+2+3+4)/4 = 2.5. If the next highest measurement is a unique detect, its rank would be 5, and so on until all observations are appropriately ranked.

The second step is to compute the Kruskal-Wallis statistic as described in the Interim Final Guidance, using the midranks computed for the tied values. Then an adjustment to the Kruskal-Wallis statistic must be made to account for the presence of ties. This adjustment is described on page 5-17 of the Interim Final Guidance and requires computation of the formula:

    H' = H / { 1 − [ Σᵢ (tᵢ³ − tᵢ) ] / (N³ − N) }

where g equals the number of groups of distinct tied observations and tᵢ is the number of observations in the ith tied group.

EXAMPLE 12

Use the non-parametric analysis of variance on the following data to determine whether there is evidence of contamination at the monitoring site.

                    Toluene Concentration (ppb)
             Background Wells            Compliance Wells
Month       Well 1      Well 2      Well 3     Well 4     Well 5
  1           <5          <5          <5         <5         <5
  2          7.5          <5        12.5       13.7       20.1
  3           <5          <5         8.0       15.3       35.0
  4           <5          <5          <5       20.2       28.2
  5          6.4          <5        11.2       25.1       19.0


SOLUTION

Step 1. Compute the overall percentage of nondetects. In this case, nondetects account for 48 percent of the data. The usual parametric analysis of variance would be inappropriate. Use the Kruskal-Wallis test instead, pooling both background wells into one group and treating each compliance well as a separate group.

Step 2. Compute ranks for all the data including tied observations (e.g., nondetects) as in the following table. Note that each nondetect is given the same midrank, equal to the average of the first 12 unique ranks.

                         Toluene Ranks
             Background Wells            Compliance Wells
Month       Well 1      Well 2      Well 3     Well 4     Well 5
  1           6.5         6.5         6.5        6.5        6.5
  2          14           6.5        17         18         21
  3           6.5         6.5        15         19         25
  4           6.5         6.5         6.5       22         24
  5          13           6.5        16         23         20

Rank Sum        Rb = 79             R3 = 61    R4 = 88.5  R5 = 96.5
Rank Mean       R̄b = 7.9            R̄3 = 12.2  R̄4 = 17.7  R̄5 = 19.3

Step 3. Calculate the sums of the ranks in each group (Rᵢ) and the mean ranks in each group (R̄ᵢ). These results are given above.

Step 4. Compute the Kruskal-Wallis statistic H using the formula on p. 5-15 of the Interim Final Guidance

    H = [12/(N(N+1))] · Σᵢ (Rᵢ²/Nᵢ) − 3(N+1)

where N = total number of samples, Nᵢ = number of samples in the ith group, and K = number of groups. In this case, N=25, K=4, and H can be computed as

    H = [12/(25·26)] · [79²/10 + 61²/5 + 88.5²/5 + 96.5²/5] − 3(26) = 10.56

Step 5. Compute the adjustment for ties. There is only one group of distinct tied observations, containing 12 samples. Thus, the adjusted Kruskal-Wallis statistic is given by:


    H' = 10.56 / [1 − (12³ − 12)/(25³ − 25)] = 11.87

Step 6. Compare the calculated value of H' to the tabulated Chi-square value with (K−1) = 3 df, χ²3,.05 = 7.81. Since the observed value of 11.87 is greater than the Chi-square critical value, there is evidence of significant differences between the well groups. Post-hoc pairwise comparisons are necessary.

Step 7. Calculate the critical difference for compliance well comparisons to the background using the formula on p. 5-16 of the Interim Final Guidance document. Since the number of samples at each compliance well is five, the same critical difference can be used for each comparison, namely,

    Cᵢ = z(.05/3) · sqrt{ [25·26/12] · [1/10 + 1/5] } = 8.58

Step 8. Form the differences between the average ranks of each compliance well and the background and compare these differences to the critical value of 8.58.

    Well 3: R̄3 − R̄b = 12.2 − 7.9 = 4.3

    Well 4: R̄4 − R̄b = 17.7 − 7.9 = 9.8

    Well 5: R̄5 − R̄b = 19.3 − 7.9 = 11.4

Since the average rank differences at wells 4 and 5 exceed the critical difference, there is significant evidence of contamination at wells 4 and 5, but not at well 3.
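
For reference, the same result can be obtained with standard statistical software. The sketch below (the nondetect coding is hypothetical) relies on the fact that routines such as scipy's Kruskal-Wallis test assign midranks to ties and apply the tie correction automatically when every nondetect is coded with a common placeholder value below all detects.

    from scipy import stats

    ND = 2.5   # placeholder below the 5 ppb detection limit, so all nondetects tie
    background = [ND, 7.5, ND, ND, 6.4, ND, ND, ND, ND, ND]   # Wells 1 and 2 pooled
    well3 = [ND, 12.5, 8.0, ND, 11.2]
    well4 = [ND, 13.7, 15.3, 20.2, 25.1]
    well5 = [ND, 20.1, 35.0, 28.2, 19.0]

    h_adj, p_value = stats.kruskal(background, well3, well4, well5)
    print(round(h_adj, 2))   # tie-adjusted H', approximately 11.87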

3.2 WILCOXON RANK-SUM TEST FOR TWO GROUPS

When a single compliance well group is being compared to background data and a non-parametric test is needed, the Kruskal-Wallis procedure should be replaced by the Wilcoxon Rank-Sum test (Lehmann, 1975; also known as the two-sample Mann-Whitney U test). For most ground-water applications, the Wilcoxon test should be used whenever the proportion of nondetects in the combined data set exceeds 15 percent. However, to provide valid results, do not use the Wilcoxon test unless the compliance well and background data groups both contain at least four samples each.

To run the Wilcoxon Rank-Sum Test, use the following algorithm. Combine the compliance and background data and rank the ordered values from 1 to N. Assume there are n compliance samples and m background samples so that N = m + n. Denote the ranks of the compliance samples


by Cᵢ and the ranks of the background samples by Bⱼ. Then add up the ranks of the compliance samples and subtract n(n+1)/2 to get the Wilcoxon statistic W:

    W = Σᵢ Cᵢ − n(n+1)/2

The rationale of the Wilcoxon test is that if the ranks of the compliance data are quite large relative to the background ranks, then the hypothesis that the compliance and background values came from the same population should be rejected. Large values of the statistic W give evidence of contamination at the compliance well site.

To find the critical value of W, a Normal approximation to its distribution is used. The expected value and standard deviation of W under the null hypothesis of no contamination are given by the formulas

    E(W) = mn/2;    SD(W) = sqrt[ mn(N+1)/12 ]

An approximate Z-score for the Wilcoxon Rank-Sum Test then follows as:

    Z = [W − E(W) − 1/2] / SD(W)

The factor of 1/2 in the numerator serves as a continuity correction, since the discrete distribution of the statistic W is being approximated by the continuous Normal distribution.

Once an approximate Z-score has been computed, it may be compared to the upper 0.01 percentile of the standard Normal distribution, z.01 = 2.326, in order to determine the statistical significance of the test. If the observed Z-score is greater than 2.326, the null hypothesis may be rejected at the 1 percent significance level, suggesting that there is significant evidence of contamination at the compliance well site.

EXAMPLE 13

The table below contains copper concentration data (ppb) found in water samples at a monitoring facility. Wells 1 and 2 are background wells and Well 3 is a single compliance well suspected of contamination. Calculate the Wilcoxon Rank-Sum Test on these data.


              Copper Concentration (ppb)
              Background              Compliance
Month      Well 1      Well 2           Well 3
  1          4.2         5.2              9.4
  2          5.8         6.4             10.9
  3         11.3        11.2             14.5
  4          7.0        11.5             16.1
  5          7.3        10.1             21.5
  6          8.2         9.7             17.6

SOLUTION

Step 1. Rank the N=18 observations from 1 to 18 (smallest to largest) as in the following table.

              Ranks of Copper Concentrations
              Background              Compliance
Month      Well 1      Well 2           Well 3
  1           1           2                8
  2           3           4               11
  3          13          12               15
  4           5          14               16
  5           6          10               18
  6           7           9               17

Step 2. Compute the Wilcoxon statistic by adding up the compliance well ranks and subtracting n(n+1)/2, so that W = 85 − 21 = 64.

Step 3. Compute the expected value and standard deviation of W.

    E(W) = mn/2 = 36

    SD(W) = sqrt[ mn(N+1)/12 ] = sqrt(114) = 10.677

Step 4. Form the approximate Z-score.

    Z = [W − E(W) − 1/2] / SD(W) = (64 − 36 − 0.5)/10.677 = 2.576


Step 5. Compare the observed Z-score to the upper 0.01 percentile of the Normal distribution. Since Z = 2.576 > 2.326 = z.01, there is significant evidence of contamination at the compliance well at the 1 percent significance level.
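
A compact sketch of the calculation, using the formulas above with illustrative function names, is shown below.

    import numpy as np
    from scipy import stats

    def wilcoxon_rank_sum(compliance, background):
        """Wilcoxon statistic W and continuity-corrected Z-score (no-ties formula)."""
        compliance = np.asarray(compliance, dtype=float)
        background = np.asarray(background, dtype=float)
        n, m = compliance.size, background.size
        ranks = stats.rankdata(np.concatenate([background, compliance]))
        w = ranks[m:].sum() - n * (n + 1) / 2.0       # Wilcoxon statistic W
        expected = m * n / 2.0
        sd = np.sqrt(m * n * (m + n + 1) / 12.0)
        z = (w - expected - 0.5) / sd                 # continuity correction of 1/2
        return w, z

    # Example 13: copper (ppb); background = Wells 1 and 2, compliance = Well 3
    background = [4.2, 5.8, 11.3, 7.0, 7.3, 8.2, 5.2, 6.4, 11.2, 11.5, 10.1, 9.7]
    compliance = [9.4, 10.9, 14.5, 16.1, 21.5, 17.6]
    print(wilcoxon_rank_sum(compliance, background))   # W = 64, Z about 2.58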

3.2.1 Handling Ties in the Wilcoxon Test

Tied observations in the Wilcoxon test are handled in similar fashion to the Kruskal-Wallis procedure. First, midranks are computed for all tied values. Then the Wilcoxon statistic is computed as before but with a slight difference. To form the approximate Z-score, an adjustment is made to the formula for the standard deviation of W in order to account for the groups of tied values. The necessary formula (Lehmann, 1975) is:

    SD*(W) = sqrt{ [mn/12] · [ (N+1) − Σᵢ (tᵢ³ − tᵢ)/(N(N−1)) ] }

where, as in the Kruskal-Wallis method, g equals the number of groups of distinct tied observations and tᵢ represents the number of tied values in the ith group.


4. STATISTICAL INTERVALS: CONFIDENCE, TOLERANCE, AND PREDICTION

Three types of statistical intervals are often constructed from data: Confidence intervals, Tolerance intervals, and Prediction intervals. Though often confused, the interpretations and uses of these intervals are quite distinct. The most common interval encountered in a course on statistics is a Confidence interval for some parameter of the distribution (e.g., the population mean). The interval is constructed from sample data and is thus a random quantity. This means that each set of sample data will generate a different Confidence interval, even though the algorithm for constructing the interval stays the same every time.

A Confidence interval is designed to contain the specified population parameter (usually the mean concentration of a well in ground-water monitoring) with a designated level of confidence or probability, denoted as 1−α. The interval will fail to include the true parameter in approximately α percent of the cases where such intervals are constructed.

The usual Confidence interval for the mean gives information about the average concentration level at a particular well or group of wells. It offers little information about the highest or most extreme sample concentrations one is likely to observe over time. Often, it is those extreme values one wants to monitor to be protective of human health and the environment. As such, a Confidence interval generally should be used only in two situations for ground-water data analysis: (1) when directly specified by the permit or (2) in compliance monitoring, when downgradient samples are being compared to a Ground-Water Protection Standard (GWPS) representing the average of onsite background data, as is sometimes the case with an Alternate Concentration Limit (ACL). In other situations it is usually desirable to employ a Tolerance or Prediction interval.

A Tolerance interval is designed to contain a designated proportion of the population (e.g., 95 percent of all possible sample measurements). Since the interval is constructed from sample data, it also is a random interval. And because of sampling fluctuations, a Tolerance interval can contain the specified proportion of the population only with a certain confidence level. Two coefficients are associated with any Tolerance interval. One is the proportion of the population that the interval is supposed to contain, called the coverage. The second is the degree of confidence with which the interval reaches the specified coverage. This is known as the tolerance coefficient. A Tolerance interval with coverage of 95 percent and a tolerance coefficient of 95 percent is constructed to contain, on average, 95 percent of the distribution with a probability of 95 percent.


Tolerance intervals are very useful for ground-water data analysis, because in many situations one wants to ensure that at most a small fraction of the compliance well sample measurements exceed a specific concentration level (chosen to be protective of human health and the environment). Since a Tolerance interval is designed to cover all but a small percentage of the population measurements, observations should very rarely exceed the upper Tolerance limit when testing small sample sizes. The upper Tolerance limit allows one to gauge whether or not too many extreme concentration measurements are being sampled from compliance point wells.

Tolerance intervals can be used in detection monitoring when comparing compliance data to background values. They also should be used in compliance monitoring when comparing compliance data to certain Ground-Water Protection Standards. Specifically, the Tolerance interval approach is recommended for comparison with a Maximum Contaminant Level (MCL) or with an ACL if the ACL is derived from health-based risk data.

Prediction intervals are constructed to contain the next sample value(s) from a population or distribution with a specified probability. That is, after sampling a background well for some time and measuring the concentration of an analyte, the data can be used to construct an interval that will contain the next analyte sample or samples (assuming the distribution has not changed). A Prediction interval will thus contain a future value or values with specified probability. Prediction intervals can also be constructed to contain the average of several future observations.

Prediction intervals are probably most useful for two kinds of detection monitoring. The first kind is when compliance point well data are being compared to background values. In this case the Prediction interval is constructed from the background data and the compliance well data are compared to the upper Prediction limits. The second kind is when intrawell comparisons are being made on an uncontaminated well. In this case, the Prediction interval is constructed on past data sampled from the well, and used to predict the behavior of future samples from the same well.

In summary, a Confidence interval usually contains an average value, a Tolerance interval contains a proportion of the population, and a Prediction interval contains one or more future observations. Each has a probability statement or "confidence coefficient" associated with it. For further explanation of the differences between these interval types, see Hahn (1970).

One should note that all of these intervals assume that the sample data used to construct the intervals are Normally distributed. In light of the fact that much ground-water concentration data is better modeled by a Lognormal distribution, it is recommended that tests for Normality be run on


the logarithms of the original data before constructing the random intervals. If the data follow the Lognormal model, then the intervals should be constructed using the logarithms of the sample values. In this case, the limits of these intervals should not be compared to the original compliance data or GWPS. Rather, the comparison should involve the logged compliance data or logged GWPS. When neither the Normal nor the Lognormal model can be justified, a non-parametric version of each interval may be utilized.

4.1 TOLERANCE INTERVALS

In detection monitoring, the compliance point samples are assumed to come from the same distribution as the background values until significant evidence of contamination can be shown. To test this hypothesis, a 95 percent coverage Tolerance interval can be constructed on the background data. The background data should first be tested to check the distributional assumptions. Once the interval is constructed, each compliance sample is compared to the upper Tolerance limit. If any compliance point sample exceeds the limit, the well from which it was drawn is judged to have significant evidence of contamination (note that when testing a large number of samples, the nature of a Tolerance interval practically ensures that a few measurements will be above the upper Tolerance limit, even when no contamination has occurred. In these cases, the offending wells should probably be resampled in order to verify whether or not there is definite evidence of contamination.)

If the Tolerance limit has been constructed using the logged background data, the compliance point samples should first be logged before comparing with the upper Tolerance limit. The steps for computing the actual Tolerance interval in detection monitoring are detailed in the Interim Final Guidance on pp. 5-20 to 5-24. One point about the table of factors K used to adjust the width of the Tolerance interval is that these factors are designed to provide at least 95% coverage of the population. Applied over many data sets, the average coverage of these intervals will often be close to 98% or more (see Guttman, 1970). To construct a one-sided upper Tolerance interval with average coverage of (1−β)%, the K multiplier can be computed directly with the aid of a Student's t-distribution table. In this case, the formula becomes

    K = sqrt(1 + 1/n) · t(n−1, 1−β)

where the t-value represents the (1−β)th upper percentile of the t-distribution with (n−1) degrees of freedom.
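
The K multiplier is simple to compute directly; the following sketch (illustrative names, assuming the formula just given) reproduces the factor used in Example 14 below.

    import numpy as np
    from scipy import stats

    def tolerance_factor_average_coverage(n, coverage=0.95):
        """One-sided K multiplier giving the stated average coverage for n samples."""
        return np.sqrt(1.0 + 1.0 / n) * stats.t.ppf(coverage, df=n - 1)

    # n = 4 background samples and 95% average coverage give K of about 2.631
    print(round(tolerance_factor_average_coverage(4), 3))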


In compliance monitoring, the Tolerance interval is calculated on the compliance point data, so that the upper one-sided Tolerance limit may be compared to the appropriate Ground-Water Protection Standard (i.e., MCL or ACL). If the upper Tolerance limit exceeds the fixed standard, and especially if the Tolerance limit has been constructed to have an average coverage of 95% as described above, there is significant evidence that as much as 5 percent or more of all the compliance well measurements will exceed the standard, and consequently that the compliance point wells are in violation of the facility permit. The algorithm for computing Tolerance limits in compliance monitoring is given on pp. 6-11 to 6-15 of the Interim Final Guidance.

EXAMPLE 14

The table below contains data that represent chrysene concentration levels (ppb) found in water samples obtained from the five compliance wells at a monitoring facility. Compute the upper Tolerance limit at each well for an average of 95% coverage with 95% confidence and determine whether there is evidence of contamination. The alternate concentration limit (ACL) is 80 ppb.

                  Chrysene Concentration (ppb)
Month     Well 1     Well 2     Well 3     Well 4     Well 5
  1        19.7       10.2       68.0       26.8       47.0
  2        39.2        7.2       48.9       17.7       30.5
  3         7.8       16.1       30.1       31.9       15.0
  4        12.8        5.7       38.1       22.2       23.4

Mean      19.88       9.80      46.28      24.65      28.98
SD        13.78       4.60      16.40       6.10      13.58

SOLUTION

Step 1. Before constructing the tolerance intervals, check the distributional assumptions. The algorithm for a parametric Tolerance interval assumes that the data used to compute the interval are Normally distributed. Because these data are more likely to be Lognormal in distribution than Normal, check the assumptions on the logarithms of the original data given in the table below. Since each well has only four observations, Probability Plots are not likely to be informative. The Shapiro-Wilk or Probability Plot Correlation Coefficient tests can be run, but in this example only the Skewness Coefficient is examined to ensure that gross departures from Lognormality are not missed.


            Logged Chrysene Concentration [log(ppb)]
Month     Well 1     Well 2     Well 3     Well 4     Well 5
  1        2.98       2.32       4.22       3.29       3.85
  2        3.67       1.97       3.89       2.87       3.42
  3        2.05       2.78       3.40       3.46       2.71
  4        2.55       1.74       3.64       3.10       3.15

Mean       2.81       2.20       3.79       3.18       3.28
SD         0.68       0.45       0.35       0.25       0.48

Step 2. The Skewness Coefficients for each well are given in the following table. Since none of the coefficients is greater than 1 in absolute value, approximate Lognormality (that is, Normality of the logged data) is assumed for the purpose of constructing the tolerance intervals.

Well     Skewness     |Skewness|
 1         .210          .210
 2         .334          .334
 3         .192          .192
 4        -.145          .145
 5        -.020          .020

Step 3. Compute the tolerance interval for each compliance well using the logged concentration data. The means and SDs are given in the second table above.

Step 4. The tolerance factor for a one-sided Normal tolerance interval with an average of 95% coverage and n=4 observations is given by

    K = sqrt(1 + 1/4) · t(3, .95) = 2.6314

The upper tolerance limit is calculated below for each of the five wells.

    Well 1:  2.81 + 2.631(0.68) = 4.61 log(ppb)
    Well 2:  2.20 + 2.631(0.45) = 3.38 log(ppb)
    Well 3:  3.79 + 2.631(0.35) = 4.71 log(ppb)
    Well 4:  3.18 + 2.631(0.25) = 3.85 log(ppb)
    Well 5:  3.28 + 2.631(0.48) = 4.54 log(ppb)


Step 5. Compare the upper tolerance limit for each well to the logarithm of the ACL, that islog(80)»4.38. Since the upper tolerance limits for wells 1, 3, and 5 exceed the loggedACL of 4.38 log(ppb), there is evidence of chrysene contamination in wells 1, 3, and 5.

4.1 .1 Non- parametric Tolerance Intervals

When the assumptions of Normality and Lognormality cannot be justified, especially when asignificant portion of the samples are nondetect, the use of non-parametric tolerance intervalsshould be considered The upper Tolerance limit in a non-parametric setting is usually chosen asan order statistic of the sample data (see Gunman, 1970), commonly the maximum value or maybethe second largest value observed. As a consequence, non-parametric intervals should beconstructed only from wells that are not contaminated. Because the maximum sample value isoften taken as the upper Tolerance limit, non-parametric Tolerance intervals are very easy toconstruct and use. The sample data must be ordered, but no ranks need be assigned to theconcentration values other than to determine the largest measurements. This also means thatnondetects do not have to be uniquely ordered or handled in any special manner.

One advantage to using the maximum concentration instead of assigning ranks to the data is that non-parametric intervals (including Tolerance intervals) are sensitive to the actual magnitudes of the concentration data. Another plus is that unless all the sample data are nondetect, the maximum value will be a detected concentration, leading to a well-defined upper Tolerance limit.

Once an order statistic of the sample data (e.g., the maximum value) is chosen to represent the upper tolerance limit, Guttman (1970) has shown that the coverage of the interval, constructed repeatedly over many data sets, has a Beta probability density with cumulative distribution

F(t) = [n! / ((n-m)! (m-1)!)] ∫₀ᵗ u^(n-m) (1-u)^(m-1) du

where n = the number of samples in the data set and m = [(n+1) - (rank of the upper tolerance limit value)]. If the maximum sample value is selected as the tolerance limit, its rank is equal to n and so m=1. If the second largest value is chosen as the limit, its rank would be equal to (n-1) and so m=2.

Since the Beta distribution is closely related to the more familiar Binomial distribution, Guttman has shown that in order to construct a non-parametric tolerance interval with at least β% coverage and (1-α) confidence probability, the number of (background) samples must be chosen such that

1 - α ≤ Σ (from i=0 to n-m) C(n,i) β^i (1-β)^(n-i)

where C(n,i) denotes the binomial coefficient.


Table A-6 in Appendix A provides the minimum coverage levels with 95% confidence for various choices of n, using either the maximum sample value or the second largest measurement as the tolerance limit. As an example, with 16 background measurements, the minimum coverage is β=83% if the maximum background value is designated as the upper Tolerance limit and β=74% if the Tolerance limit is taken to be the second largest background value. In general, Table A-6 illustrates that if the underlying distribution of concentration values is unknown, more background samples are needed compared to the parametric setting in order to construct a tolerance interval with sufficiently high coverage. Parametric tolerance intervals do not require as many background samples precisely because the form of the underlying distribution is assumed to be known.

Because the coverage of the above non-parametric Tolerance intervals follows a Beta distribution, it can also be shown that the average (not the minimum as discussed above) level of coverage is equal to 1 - [m/(n+1)] (see Guttman, 1970). In particular, when the maximum sample value is chosen as the upper tolerance limit, m=1, and the expected coverage is equal to n/(n+1). This implies that at least 19 background samples are necessary to achieve 95% coverage on average.
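Because the coverage follows a Beta distribution, the coverage figures quoted above can be computed directly. The sketch below (an illustration, not part of the guidance) uses scipy to reproduce the minimum coverage values cited for n=16 (about 83% for the maximum and 74% for the second largest) and the 19-sample requirement for 95% average coverage.

```python
# Sketch: coverage of a non-parametric upper tolerance limit taken as the m-th
# largest of n background samples. The coverage follows a Beta(n-m+1, m)
# distribution, so the minimum coverage achieved with 95% confidence is the
# 5th percentile of that distribution.
from scipy.stats import beta

def min_coverage(n, m=1, confidence=0.95):
    """Minimum coverage (with the given confidence) when the tolerance limit
    is the m-th largest of n background measurements."""
    return beta.ppf(1.0 - confidence, n - m + 1, m)

def avg_coverage(n, m=1):
    """Expected (average) coverage, equal to 1 - m/(n+1)."""
    return 1.0 - m / (n + 1.0)

# Reproduces the values quoted in the text for 16 background samples:
print(round(min_coverage(16, m=1), 2))   # ~0.83 using the maximum
print(round(min_coverage(16, m=2), 2))   # ~0.74 using the second largest
# At least 19 samples are needed for 95% average coverage with the maximum:
print(avg_coverage(19))                  # 0.95
```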

EXAMPLE 15

Use the following copper background data to establish a non-parametric upper Tolerance limit and determine if either compliance well shows evidence of copper contamination.

                    Copper Concentration (ppb)

Background Wells:
Month        Well 1      Well 2      Well 3
  1           <5          9.2         <5
  2           <5          <5          5.4
  3           7.5         <5          6.7
  4           <5          6.1         <5
  5           <5          8.0         <5
  6           <5          5.9         <5
  7           6.4         <5          <5
  8           6.0         <5          <5

Compliance Wells:
Well 4:  6.2, <5, 7.8, 10.4
Well 5:  <5, <5, 5.6, <5


SOLUTION

Step 1. Examine the background data in Wells 1, 2, and 3 to determine that the maximum observed value is 9.2 ppb. Set the 95% confidence upper Tolerance limit equal to this value. Because 24 background samples are available, Table A-6 indicates that the minimum coverage is equal to 88% (the expected average coverage, however, is equal to 24/25 = 96%). To increase the coverage level, more background samples would have to be collected.

Step 2. Compare each sample in compliance Wells 4 and 5 to the upper Tolerance limit. Since none of the measurements at Well 5 is above 9.2 ppb, while one sample from Well 4 is above the limit, conclude that there is significant evidence of copper contamination at Well 4 but not at Well 5.

4.2 PREDICTION INTERVALS

When comparing background data to compliance point samples, a Prediction interval can be constructed on the background values. If the distributions of background and compliance point data are really the same, all the compliance point samples should be contained below the upper Prediction interval limit. Evidence of contamination is indicated if one or more of the compliance samples lies above the upper Prediction limit.

With intrawell comparisons, a Prediction interval can be computed on past data to contain a specified number of future observations from the same well, provided the well has not been previously contaminated. If any one or more of the future samples falls above the upper Prediction limit, there is evidence of recent contamination at the well. The steps to calculate parametric Prediction intervals are given on pp. 5-24 to 5-28 of the Interim Final Guidance.

EXAMPLE 16

The data in the table below are benzene concentrations measured at a ground-water monitoring facility. Calculate the Prediction interval and determine whether there is evidence of contamination.

Benzene Concentration (ppb)

Background Well Data:
    Month 1:  12.6, 30.8, 52.0, 28.1
    Month 2:  33.3, 44.0,  3.0, 12.8
    Month 3:  58.1, 12.6, 17.6, 25.3
    (n=12, Mean=27.52, SD=17.10)

Compliance Well Data:
    Month 4:  48.0, 30.3, 42.5, 15.0    (n=4, Mean=33.95, SD=14.64)
    Month 5:  47.6,  3.8,  2.6, 51.9    (n=4, Mean=26.48, SD=26.94)

SOLUTION

Step 1. First test the background data for approximate Normality. Only the background data are included since these values are used to construct the Prediction interval.

Step 2. A Probability Plot of the 12 background values is given below. The plot indicates an overall pattern that is reasonably linear with some modest departures from Normality. To further test the assumption of Normality, run the Shapiro-Wilk test on the background data.

[Figure: Probability Plot of the 12 background benzene values]

Step 3. List the data in ascending and descending order as in the following table. Also calculate the differences x(n-i+1) - x(i) and multiply them by the coefficients a(n-i+1) taken from Table A-1 to get the components bi used to calculate the Shapiro-Wilk statistic (W).


  i       x(i)      x(n-i+1)     a(n-i+1)        bi
  1        3.0        58.1        0.548        30.167
  2       12.6        52.0        0.333        13.101
  3       12.6        44.0        0.235         7.370
  4       12.8        33.3        0.159         3.251
  5       17.6        30.8        0.092         1.217
  6       25.3        28.1        0.030         0.085
  7       28.1        25.3
  8       30.8        17.6
  9       33.3        12.8
 10       44.0        12.6
 11       52.0        12.6
 12       58.1         3.0
                                           b = 55.191

Step 4. Sum the components bi in column 5 to get the quantity b. Compute the standard deviation of the background benzene values. Then the Shapiro-Wilk statistic is given as

W = [b / (s√(n-1))]² = [55.191 / (17.10 × √11)]² = 0.947

Step 5. The critical value at the 5% level for the Shapiro-Wilk test on 12 observations is 0.859. Since the calculated value of W=0.947 is well above the critical value, there is no evidence to reject the assumption of Normality.
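For readers working in software rather than by hand, the same check can be run with a library routine. The following sketch (not part of the guidance) applies scipy's Shapiro-Wilk test to the 12 background benzene values; because library coefficients carry more precision than Table A-1, the reported W may differ slightly from the hand-computed 0.947.

```python
# Sketch: checking the hand-computed Shapiro-Wilk statistic for the 12
# background benzene values with a library routine.
from scipy.stats import shapiro

background = [12.6, 30.8, 52.0, 28.1, 33.3, 44.0,
              3.0, 12.8, 58.1, 12.6, 17.6, 25.3]

W, p_value = shapiro(background)
print(f"W = {W:.3f}, p = {p_value:.3f}")
# A W near the hand-computed 0.947 and a p-value well above 0.05 are consistent
# with not rejecting the assumption of Normality.
```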

Step 6. Compute the Prediction interval using the original background data. The mean and standard deviation of the 12 background samples are given by 27.52 ppb and 17.10 ppb, respectively.

Step 7. Since there are two future months of compliance data to be compared to the Prediction limit, the number of future sampling periods is k=2. At each sampling period, a mean of four independent samples will be computed, so m=4 in the prediction interval formula (see Interim Final Guidance, p. 5-25). The Bonferroni t-statistic, t(11, 2, .95), with k=2 and 11 df is equivalent to the usual t-statistic at the .975 level with 11 df, i.e., t(11, 2, .95) = 2.201.

Step 8. Compute the upper one-sided Prediction limit (UL) using the formula:

UL = x̄ + t(n-1, k, .95) · s · √(1/m + 1/n)

Then the UL is given by:

UL = 27.52 + (17.10)(2.201) · √(1/4 + 1/12) = 49.25 ppb

Step 9. Compare the UL to the compliance data. The means of the four compliance well observations for months 4 and 5 are 33.95 ppb and 26.48 ppb, respectively. Since the mean concentrations for months 4 and 5 are below the upper Prediction limit, there is no evidence of recent contamination at the monitoring facility.
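The full prediction-limit calculation of Steps 6 through 9 can be scripted as follows. This Python sketch (not part of the guidance) recomputes the background mean and SD, the Bonferroni t-statistic, and the upper Prediction limit, and then compares the two compliance-period means against it.

```python
# Sketch: upper Prediction limit for the means of the next k=2 sampling periods
# of m=4 samples each, based on n=12 background benzene values.
import numpy as np
from scipy.stats import t

background = np.array([12.6, 30.8, 52.0, 28.1, 33.3, 44.0,
                       3.0, 12.8, 58.1, 12.6, 17.6, 25.3])
n = len(background)
k, m = 2, 4                         # future periods and samples per period
alpha = 0.05

x_bar = background.mean()
s = background.std(ddof=1)
# Bonferroni t-statistic: t at the (1 - alpha/k) level with n-1 df
t_stat = t.ppf(1 - alpha / k, n - 1)              # ~2.201 for 11 df
upper_limit = x_bar + t_stat * s * np.sqrt(1.0/m + 1.0/n)
print(f"UL = {upper_limit:.2f} ppb")              # ~49.25 ppb

for month, mean in [("Month 4", 33.95), ("Month 5", 26.48)]:
    print(month, "exceeds UL" if mean > upper_limit else "is below UL")
```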

4.2.1 Non-parametric Prediction Intervals

When the parametric assumptions of a Normal-based Prediction limit cannot be justified, often due to the presence of a significant fraction of nondetects, a non-parametric Prediction interval may be considered instead. A non-parametric upper Prediction limit is typically constructed in the same way as a non-parametric upper Tolerance limit, that is, by estimating the limit to be the maximum value of the set of background samples.

The difference between non-parametric Tolerance and Prediction limits is one of interpretation and probability. Given n background measurements and a desired confidence level, a non-parametric Tolerance interval will have a certain coverage percentage. With high probability, the Tolerance interval is designed to miss only a small percentage of the samples from downgradient wells. A Prediction limit, on the other hand, involves the confidence probability that the next future sample or samples will definitely fall below the upper Prediction limit. In this sense, the Prediction limit may be thought of as a 100% coverage Tolerance limit for the next k future samples.

As Guttman (1970) has indicated, the confidence probability associated with predicting that the next single observation from a downgradient well will fall below the upper Prediction limit -- estimated as the maximum background value -- is the same as the expected coverage of a similarly constructed upper Tolerance limit, namely (1-α) = n/(n+1). Furthermore, it can be shown from Gibbons (1991b) that the probability of having k future samples all fall below the upper non-parametric Prediction limit is (1-α) = n/(n+k). Table A-7 in Appendix A lists these confidence levels for various choices of n and k. The false positive rate associated with a single Prediction limit can be computed as one minus the confidence level.

Balancing the ease with which non-parametric upper Prediction limits are constructed is the fact that, given fixed numbers of background samples and future sample values to be predicted, the maximum confidence level associated with the Prediction limit is also fixed. To increase the level of confidence, the only choices are to 1) decrease the number of future values to be predicted at any testing period, or 2) increase the number of background samples used in the test. Table A-7 can be used along these lines to plan an appropriate sampling strategy so that the false positive rate can be minimized and the confidence probability maximized to a desired level.


EXAMPLE 17

Use the following arsenic data from a monitoring facility to compute a non-parametric upper Prediction limit that will contain the next 2 monthly measurements from a downgradient well, and determine the level of confidence associated with the Prediction limit.

                   Arsenic Concentrations (ppb)

Background Wells:
Month        Well 1      Well 2      Well 3
  1           <5           7          <5
  2           <5           6.5        <5
  3            8          <5          10.5
  4           <5           6          <5
  5            9          12          <5
  6           10          <5           9

Compliance Well (Well 4):
Month 1:   8
Month 2:  14

SOLUTION

Step 1. Determine the maximum value of the background data and use this value to define the upper Prediction limit. In this case, the Prediction limit is set to the maximum value of the n=18 samples, or 12 ppb. As is true of non-parametric Tolerance intervals, only uncontaminated wells should be used in the construction of Prediction limits.

Step 2. Compute the confidence level and false positive rate associated with the Prediction limit. Since two future samples are being predicted and n=18, the confidence level is found to be n/(n+k) = 18/20 = 90%. Consequently, the Type I error or false positive rate is equal to (1-.90) = 10%. If a lower false positive rate is desired, the number of background samples used in the test must be enlarged.

Step 3. Compare each of the downgradient samples against the upper Prediction limit. Since the value of 14 ppb for month 2 exceeds the limit, conclude that there is significant evidence of contamination at the downgradient well at the 10% level of significance.
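A minimal sketch of the non-parametric Prediction limit test in Example 17 is given below (not part of the guidance). Nondetects are entered at the quantitation limit of 5 ppb purely so the maximum can be taken; this choice does not affect the limit because every detected value exceeds 5 ppb.

```python
# Sketch: non-parametric upper Prediction limit taken as the maximum of n
# background samples, with confidence n/(n+k) for the next k samples.
background = [5, 5, 8, 5, 9, 10,      # Well 1 (<5 entered as 5)
              7, 6.5, 5, 6, 12, 5,    # Well 2
              5, 5, 10.5, 5, 5, 9]    # Well 3
compliance = [8, 14]                  # next k=2 monthly samples at Well 4

n, k = len(background), len(compliance)
prediction_limit = max(background)            # 12 ppb
confidence = n / (n + k)                      # 18/20 = 90%
print(prediction_limit, round(confidence, 2))

exceedances = [x for x in compliance if x > prediction_limit]
print("evidence of contamination" if exceedances else "no exceedances")
```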

4.3 CONFIDENCE INTERVALS

Confidence intervals should only be constructed on data collected during compliance monitoring, in particular when the Ground-Water Protection Standard (GWPS) is an ACL computed from the average of background samples. Confidence limits for the average concentration levels at compliance wells should not be compared to MCLs. Unlike a Tolerance interval, Confidence limits for an average do not indicate how often individual samples will exceed the MCL. Conceivably, the lower Confidence limit for the mean concentration at a compliance well could fall below the MCL, yet 50 percent or more of the individual samples might exceed the MCL. Since an MCL is designed to set an upper bound on the acceptable contamination, this would not be protective of human health or the environment.

When comparing individual compliance wells to an ACL derived from average background levels, a lower one-sided 99 percent Confidence limit should be constructed. If the lower Confidence limit exceeds the ACL, there is significant evidence that the true mean concentration at the compliance well exceeds the GWPS and that the facility permit has been violated. Again, in most cases, a Lognormal model will approximate the data better than a Normal distribution model. It is therefore recommended that the initial data checking and analysis be performed on the logarithms of the data. If a Confidence interval is constructed using logged concentration data, the lower Confidence limit should be compared to the logarithm of the ACL rather than the original GWPS. Steps for computing Confidence intervals are given on pp. 6-3 to 6-11 of the Interim Final Guidance.


5. STRATEGIES FOR MULTIPLE COMPARISONS

5.1 BACKGROUND OF PROBLEM

Multiple comparisons occur whenever more than one statistical test is performed during any given monitoring or evaluation period. These comparisons can arise as a result of the need to test multiple downgradient wells against a pool of upgradient background data or to test several indicator parameters for contamination on a regular basis. Usually the same statistical test is performed in every comparison, each test having a fixed level of confidence (1-α) and a corresponding false positive rate, α.

The false positive rate (or Type I error) for an individual comparison is the probability that the test will falsely indicate contamination, i.e., that the test will "trigger," though no contamination has occurred. If ground-water data measurements were always constant in the absence of contamination, false positives would never occur. But ground-water measurements typically vary, either due to natural variation in the levels of background concentrations or to variation in lab measurement and analysis.

Applying the same test to each comparison is acceptable if the number of comparisons is small, but when the number of comparisons is moderate to large, the false positive rate associated with the testing network as a whole (that is, across all comparisons involving a separate statistical test) can be quite high. This means that if enough tests are run, there will be a significant chance that at least one test will indicate contamination, even if no actual contamination has occurred. As an example, if the testing network consists of 20 separate comparisons (some combination of multiple wells and/or indicator parameters) and a 99% confidence level Prediction interval limit is used on each comparison, one would expect an overall network-wide false positive rate of over 18%, even though the Type I error for any single comparison is only 1%. This means there is nearly 1 chance in 5 that one or more comparisons will falsely register potential contamination even if none has occurred. With 100 comparisons and the same testing procedure, the overall network-wide false positive rate jumps to more than 63%, adding additional expense to verify the lack of contamination at falsely triggered wells.
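The network-wide rates quoted above follow from the expression 1 - (1 - α)^m for m independent comparisons, each run at an individual false positive rate α. A short sketch (not part of the guidance):

```python
# Sketch: network-wide false positive rate when m independent comparisons are
# each run at individual Type I error alpha.
def network_false_positive_rate(m, alpha=0.01):
    return 1.0 - (1.0 - alpha) ** m

print(round(network_false_positive_rate(20), 3))    # ~0.182 (over 18%)
print(round(network_false_positive_rate(100), 3))   # ~0.634 (more than 63%)
```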

To lower the network-wide false positive rate, there are several important considerations. As noted in Section 2.2.4, only those constituents that have been shown to be reliable indicators of potential contamination should be statistically tested on a regular basis. By limiting the number of tested constituents to the most useful indicators, the overall number of statistical comparisons that must be made can be reduced, lowering the facility-wide false alarm rate. In addition, depending on the hydrogeology of the site, some indicator parameters may need to be tested only at one (or a few adjacent) regulated waste units, as opposed to testing across the entire facility, as long as the permit specifies a common point of compliance, thus further limiting the number of total statistical comparisons necessary.

One could also try to lower the Type I error applied to each individual comparison. Unfortunately, for a given statistical test in general, the lower the false positive rate, the lower the power of the test to detect real contamination at the well. If the statistical power drops too much, real contamination will not be identified when it occurs, creating a situation not protective of the environment or human health. Instead, alternative testing strategies can be considered that specifically account for the number of statistical comparisons being made during any evaluation period. All alternative testing strategies should be evaluated in light of two basic goals:

1. Is the network-wide false positive rate (across all constituents and wells being tested) acceptably low? and

2. Does the testing strategy have adequate statistical power to detect real contamination when it occurs?

To establish a standard recommendation for the network-wide overall false positive rate, it should be noted that for some statistical procedures, EPA specifications mandate that the Type I error for any individual comparison be at least 1%. The rationale for this minimum requirement is motivated by statistical power. For a given test, if the Type I error is set too low, the power of the test will dip below "acceptable" levels. EPA was not able to specify a minimum level of acceptable power within the regulations because to do so would require specification of a minimum difference of environmental concern between the null and alternative hypotheses. Limited current knowledge about the health and/or environmental effects associated with incremental changes in concentration levels of Appendix IX constituents greatly complicates this task. Therefore, minimum false positive rates were adopted for some statistical procedures until more specific guidance could be recommended. EPA's main objective, however, as in the past, is to approve tests that have adequate statistical power to detect real contamination of ground water, and not to enforce minimum false positive rates.

This emphasis is evident in §264.98(g)(6) for detection monitoring and §264.99(i) for compliance monitoring. Both of these provisions allow the owner or operator to demonstrate that the statistically significant difference between background and compliance point wells, or between compliance point wells and the Ground-Water Protection Standard, is an artifact caused by an error in sampling, analysis, statistical evaluation, or natural variation in ground-water chemistry. To make the demonstration that the statistically significant difference was caused by an error in sampling, analysis, or statistical evaluation, re-testing procedures that have been approved by the Regional Administrator can be written into the facility permit, provided their statistical power is comparable to the EPA Reference Power Curve given below.

For large monitoring networks, it is almost impossible to maintain a low network-wide overall false positive rate if the Type I errors for individual comparisons must be kept above 1%. As will be seen, some alternative testing strategies can achieve a low network-wide false positive rate while maintaining adequate power to detect contamination. EPA therefore recommends that instead of the 1% criterion for individual comparisons, the overall network-wide false positive rate (across all wells and constituents) of any alternative testing strategy should be kept to approximately 5% for each monitoring or evaluation period, while maintaining statistical power comparable to the procedure below.

The other goal of any testing strategy should be to maintain adequate statistical power for detecting contamination. Technically, power refers to the probability that a statistical testing procedure will register and identify evidence of contamination when it exists. However, power is typically defined with respect to a single comparison, not a network of comparisons. Since some testing procedures may identify contamination more readily when several wells in the network are contaminated as opposed to just one or two, it is suggested that all testing strategies be compared on the following more stringent, but common, basis. Let the effective power of a testing procedure be defined as the probability of detecting contamination in the monitoring network when one and only one well is contaminated with a single constituent. Note that the effective power is a conservative measure of how a testing regimen will perform over the network, because the test must uncover one contaminated well among many clean ones (i.e., like "finding a needle in a haystack").

To establish a recommended standard for the statistical power of a testing strategy, it must be understood that the power is not a single number, but rather a function of the level of contamination actually present. For most tests, the higher the level of contamination, the higher the statistical power; likewise, the lower the contamination level, the lower the power. As such, when increasingly contaminated ground water passes a particular well, it becomes easier for the statistical test to distinguish background levels from the contaminated ground water; consequently, the power is an increasing function of the contamination level.


Perhaps the best way to describe the power function associated with a particular testing procedure is via a graph, such as the example below of the power of a standard Normal-based upper Prediction limit with 99% confidence. The power in percent is plotted along the y-axis against the standardized mean level of contamination along the x-axis. The standardized contamination levels are in units of standard deviations above the baseline (estimated from background data), allowing different power curves to be compared across indicator parameters, wells, and so forth. The standardized units, Δ, may be computed as

Δ = [(Mean Contamination Level) - (Mean Background Level)] / (SD of Background Data)

In some situations, the probability that contamination will be detected by a particular testing procedure may be difficult if not impossible to derive analytically and will have to be simulated on a computer. In these cases, the power is typically estimated by generating Normally-distributed random values at different mean levels and repeatedly simulating the test procedure. With enough repetitions a reliable power curve can be plotted (e.g., see figure below).

[Figure: EPA Reference Power Curve (16 background samples) -- power (%) plotted against Δ, the standardized units above background, for Δ = 0 to 4]


Notice that the power at Δ=0 represents the false positive rate of the test, because at that point no contamination is actually present and the curve is indicating how often contamination will be "detected" anyway. As long as the power at Δ=0 is approximately 5% (except for tests on an individual constituent at an individual well, where the false positive rate should approximate 1%) and the rest of the power curve is acceptably high, the testing strategy should be adequately comparable to EPA standards.

To determine an acceptable power curve for comparison to alternative testing strategies, the following EPA Reference Power Curve is suggested. For a given and fixed number of background measurements, and based on Normally-distributed data from a single downgradient well generated at various mean levels above background, the EPA Reference Power Curve will represent the power associated with a 99% confidence upper prediction limit on the next single future sample from the well (see figure above for n=16).
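One way such a reference curve can be approximated is by Monte Carlo simulation, as described above. The sketch below (an illustration, not part of the guidance) assumes standardized Normal background data with n=16, computes a standard 99% upper Prediction limit for the next single sample, and estimates the detection probability at several values of Δ.

```python
# Sketch: Monte Carlo approximation of a power curve for a 99% confidence
# Normal upper Prediction limit on the next single sample, with n=16
# background measurements (one clean constituent at one well).
import numpy as np
from scipy.stats import t

def simulated_power(delta, n=16, alpha=0.01, n_sims=20000, seed=0):
    rng = np.random.default_rng(seed)
    t_factor = t.ppf(1 - alpha, n - 1) * np.sqrt(1 + 1.0/n)
    detections = 0
    for _ in range(n_sims):
        background = rng.standard_normal(n)          # standardized background data
        limit = background.mean() + t_factor * background.std(ddof=1)
        future = rng.standard_normal() + delta       # next sample, shifted by delta
        detections += future > limit
    return detections / n_sims

for delta in [0, 1, 2, 3, 4]:
    print(delta, round(simulated_power(delta), 3))
# The value at delta=0 approximates the 1% false positive rate; the curve rises
# toward 1 as the standardized contamination level increases.
```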

Since the power of a test depends on several factors, including the background sample size, the type of test, and the number of comparisons, a different EPA Reference Power Curve will be associated with each distinct number of background samples. Power curves of alternative tests should only be compared to the EPA Reference Power Curve using a comparable number of background measurements. If the power of the alternative test is at least as high as the EPA reference, while maintaining an approximate 5% overall false positive rate, the alternative procedure should be acceptable.

With respect to power curves, keep in mind three important considerations: 1) The power of any testing method can be increased merely by relaxing the false positive rate requirement, letting α become larger than 5%. This is why an approximate 5% alpha level is suggested as the standard guidance, to ensure fair power comparisons among competing tests and to limit the overall network-wide false positive rate. 2) The simulation of alternative testing methods should incorporate every aspect of the procedure, from initial screens of the data to final decisions concerning the presence of contamination. This is especially applicable to strategies that involve some form of retesting at potentially contaminated wells. 3) When the testing strategy incorporates multiple comparisons, it is crucial that the power be gauged by simulating contamination in one and only one indicator parameter at a single well (i.e., by measuring the effective power). As noted earlier, EPA recommends that power be defined conservatively, forcing any test procedure to find "the needle in the haystack."


5.2 POSSIBLE STRATEGIES

5.2.1 Parametric and Non-parametric ANOVA

As described in the Interim Final Guidance, ANOVA procedures (either the parametric method or the Kruskal-Wallis test) allow multiple downgradient wells (but not multiple constituents) to be combined into a single statistical test, thus enabling the network-wide false positive rate for any single constituent to be kept at 5% regardless of the size of the network. The ANOVA method also maintains decent power for detecting real contamination, though only for small to moderately-sized networks. In large networks, even the parametric ANOVA has a difficult time finding the "needle in a haystack." The reason for this is that the ANOVA F-test combines all downgradient wells simultaneously, so that "clean" wells are mixed together with the single contaminated well, potentially masking the test's ability to detect the source of contamination.

Because of these characteristics, the ANOVA procedure may have poorer power for detecting a narrow plume of contamination which affects only one or two wells in a much larger network (say 20 or more comparisons). Another drawback is that a significant ANOVA test result will not indicate which well or wells is potentially contaminated without further post-hoc testing. Furthermore, the power of the ANOVA procedure depends significantly on having at least 3 to 4 samples per well available for testing. Since the samples must be statistically independent, collection of 3 or more samples at a given well may necessitate a several-month wait if the natural ground-water velocity at that well is low. In this case, it may be tempting to look for other strategies (e.g., Tolerance or Prediction intervals) that allow statistical testing of each new ground-water sample as it is collected and analyzed. Finally, since the simple one-way ANOVA procedure outlined in the Interim Final Guidance is not designed to test multiple constituents simultaneously, the overall false positive rate will be approximately 5% per constituent, leading to a potentially high overall network-wide false positive rate (across wells and constituents) if many constituents need to be tested.

5.2.2 Retesting with Parametric Intervals

One strategy alternative to ANOVA is a modification of approaches suggested by Gibbons (1991a) and Davis and McNichols (1987). The basic idea is to adopt a two-phase testing strategy. First, new samples from each well in the network are compared, for each designated constituent parameter, against an upper Tolerance limit with pre-specified average coverage (note that the upper Tolerance limit will be different for each constituent). Since some constituents at some wells in a large network would be expected to fail the Tolerance limit even in the absence of contamination, each well that triggers the Tolerance limit is resampled, and only those constituents that "triggered" the limit are retested via an upper Prediction limit (again differing by constituent). If one or more resamples fails the upper Prediction limit, the specific constituent at that well failing the test is deemed to have a concentration level significantly greater than background. The overall strategy is effective for large networks of comparisons (e.g., 100 or more comparisons), but also flexible enough to accommodate smaller networks.

To design and implement an appropriate pair of Tolerance and Prediction intervals, one must know the number of background samples available and the number of comparisons in the network. Since parametric intervals are used, it is assumed that the background data are either Normal or can be transformed to an approximate Normal distribution. The tricky part is to choose an average coverage for the Tolerance interval and a confidence level for the Prediction interval such that the twin goals are met of keeping the overall false positive rate to approximately 5% and maintaining adequate statistical power.

To derive the overall false positive rate for this retesting strategy, assume that when no contamination is present each constituent and well in the network behaves independently of other constituents and wells. Then if Ai denotes the event that well i is triggered falsely at some stage of the testing, the overall false positive rate across m such comparisons can be written as

total α = Pr{A1 or A2 or ... or Ai or ... or Am} = 1 - ∏ (i=1 to m) Pr{Āi}

where Āi denotes the complement of event Ai. Since Pr{Āi} is the probability of not registering a false trigger at uncontaminated well i, it may be written as

Pr{Āi} = Pr{Xi ≤ TL} + Pr{Xi > TL} × Pr{Yi ≤ PL | Xi > TL}

where Xi represents the original sample at well i, Yi represents the concentrations of one or more resamples at well i, TL and PL denote the upper Tolerance and Prediction limits respectively, and the right-most probability is the conditional event that all resample concentrations fall below the Prediction limit when the initial sample fails the Tolerance limit.

Letting x = Pr{Xi ≤ TL} and y = Pr{Yi ≤ PL | Xi > TL}, the overall false positive rate across m constituent-well combinations can be expressed as

total α = 1 - [x + (1 - x) · y]^m


As noted by Guttman (1970), the probability that any random sample will fall below the upper Tolerance limit (i.e., the quantity x above) is equal to the expected or average coverage of the Tolerance interval. If the Tolerance interval has been constructed to have average coverage of 95%, x=0.95. Then, given a predetermined value for x, a fixed number of comparisons m, and a desired overall false positive rate α, we can solve for the conditional probability y as follows:

y = [(1 - α)^(1/m) - x] / (1 - x)
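A small sketch (not part of the guidance) evaluates this expression for several network sizes, using x=0.95 and an overall target of α=5%; as discussed next, the value of y obtained this way is a starting point rather than the confidence level that should actually be used.

```python
# Sketch: given Tolerance-interval average coverage x, m constituent-well
# comparisons, and target overall false positive rate alpha, solve for the
# conditional Prediction-limit probability y from the equation above.
def required_prediction_probability(m, x=0.95, alpha=0.05):
    return ((1.0 - alpha) ** (1.0 / m) - x) / (1.0 - x)

for m in [5, 20, 50, 100]:
    print(m, round(required_prediction_probability(m), 3))
# As the text notes below, the Prediction-limit confidence actually chosen
# should be set somewhat higher than this solution for y.
```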

If the conditional probability y were equal to the probability that the resample(s) for the ith constituent-well combination falls below the upper Prediction limit, one could fix α at, say, 5%, and construct the Prediction interval to have confidence level y. In that way, one could guarantee an expected network-wide false positive rate of 5%. Unfortunately, whether or not one or more resamples falls below the Prediction limit depends partly on whether the initial sample for that comparison eclipsed the Tolerance limit. This is because the same background data are used to construct both the Tolerance limit and the Prediction limit, creating a statistical dependence between the tests.

The exact relationship between the conditional probability y and the unconditional probability Pr{Yi ≤ PL} is not known; however, simulations of the testing strategy suggest that when the confidence level for the Prediction interval is equated to the above solution for y, the overall network-wide false positive rate turns out to be higher than 5%. How much higher depends on the number of background samples and also the number of downgradient comparisons. Even with a choice of y that guarantees an expected facility-wide false positive rate of 5%, the power characteristics of the resulting testing strategy are not necessarily equivalent to the EPA Reference Power Curve, again depending on the number of background samples and the number of monitoring well-constituent combinations in the network.

In practice, to meet the selection criteria of 1) establishing an overall false positive rate of approximately 5% and 2) maintaining adequate statistical power, the confidence level chosen for the upper Prediction limit should be somewhat higher than the solution y to the preceding equation. The table below provides recommended choices of expected coverage and confidence levels for the Tolerance interval-Prediction interval pair when using specific combinations of numbers of downgradient comparisons and background samples. In general, one should pick lower coverage Tolerance limits for smaller networks and higher coverage Tolerance limits for larger networks. That way (as can be seen in the table), the resulting Prediction limit confidence levels will be low enough to allow the construction of Prediction limits with decent statistical power.

                     PARAMETRIC RETESTING STRATEGIES

# COMPARISONS   # BG SAMPLES   TOLERANCE COVERAGE (%)   PREDICTION LEVEL (%)   RATING
      5               8                 95                      90               **
      5              16                 95                      90               **
      5              16                 95                      85               *
      5              24                 95                      85               *
      5              24                 95                      90               *
     20               8                 95                      98               *
     20              16                 95                      97               **
     20              24                 95                      97               **
     50              16                 98                      97               **
     50              16                 99                      92               *
     50              24                 98                      95               **
     50              24                 99                      90               **
    100              16                 98                      98               *
    100              24                 99                      95               *
    100              24                 98                      98               *

Note:  ** = strongly recommended     * = recommended

Only strategies that approximately met the selection criteria are listed in the table. It can be seen that some, but not all, of these strategies are strongly recommended. Those that are merely "recommended" failed in the simulations to fully meet one or both of the selection criteria. The performance of all the recommended strategies, however, should be adequate to correctly identify contamination while maintaining a modest facility-wide false positive rate.

Once a combination of coverage and confidence levels for the Tolerance-Prediction interval pair is selected, the effective power of the testing strategy should be estimated in order to compare it with the EPA Reference Power Curve (particularly if the testing scenario is different from those computed in this Addendum). Simulation results have suggested that the above method for choosing a two-phase testing regimen can offer statistical power comparable to the EPA Reference for almost any sized monitoring network (see power curves in Appendix B).


Several examples of simulated power curves are presented in Appendix B. The range of downgradient wells tested is from 5 to 100 (note that the number of wells could actually represent the number of constituent-well combinations if testing multiple parameters), and each curve is based on either 8, 16, or 24 background samples. The y-axis of each graph measures the effective power of the testing strategy, i.e., the probability that contamination is detected when one and only one constituent at a single well has a mean concentration higher than the background level. For each case, the EPA Reference Power Curve is compared to two different two-phase testing strategies. In the first case, wells that trigger the initial Tolerance limit are resampled once. This single resample is compared to a Prediction limit for the next future sample. In the second case, wells that trigger the Tolerance limit are resampled twice. Both resamples are compared to an upper Prediction limit for the next two future samples at that well.

The simulated power curves suggest two points. First, with an appropriate choice of coverage and prediction levels, the two-phase retesting strategies have comparable power to the EPA Reference Power Curve, while maintaining low overall network-wide false positive rates. Second, the power of the retesting strategy is slightly improved by the addition of a second resample at wells that fail the initial Tolerance limit, because the sample size is increased.

Overall, the two-phase testing strategy defined above -- i.e., first screening the network of wells with a single upper Tolerance limit, and then applying an upper Prediction limit to resamples from wells which fail the Tolerance interval -- appears to meet EPA's objectives of maintaining adequate statistical power for detecting contamination while limiting network-wide false positive rates to low levels. Furthermore, since each compliance well is compared against the interval limits separately, a narrow plume of contamination can be identified more efficiently than with an ANOVA procedure (e.g., no post-hoc testing is necessary to identify the offending wells, and the two-phase interval testing method has more power against the "needle-in-a-haystack" contamination hypothesis).

5.2.3 Retesting with Non-parametric Intervals

When parametric intervals are not appropriate for the data at hand, either due to a large fraction of nondetects or a lack of fit to Normality or Lognormality, a network of individual comparisons can be handled via retesting using non-parametric Prediction limits. The strategy is to establish a non-parametric prediction limit for each designated indicator parameter, based on background samples, that accounts for the number of well-constituent comparisons in the overall network.


In order to meet the twin goals of maintaining adequate statistical power and a low overall rate of false positives, a non-parametric strategy must involve some level of retesting at those wells which initially indicate possible contamination. Retesting can be accomplished by taking a specific number of additional, independent samples from each well in which a specific constituent triggers the initial test and then comparing these samples against the non-parametric prediction limit for that parameter.

Because more independent data are added to the overall testing procedure, retesting of additional samples, in general, enables one to make more powerful and more accurate determinations of possible contamination. Retesting does, however, involve a trade-off. Because the power of the test increases with the number of resamples, one must decide how quickly resamples can be collected to ensure 1) quick identification and confirmation of contamination and yet 2) the statistical independence of successive resamples from any particular well. Do not forget that the performance of a non-parametric retesting strategy depends substantially on the independence of the data from each well.

Two basic approaches to non-parametric retesting have been suggested by Gibbons (1990 and 1991b). Both strategies define the upper Prediction limit for each designated parameter to be the maximum value of that constituent in the set of background data. Consequently, the background wells used to construct the limits must be uncontaminated. After the Prediction limits have been calculated, one sample is collected from each downgradient well in the network. If any sample constituent value is greater than its upper prediction limit, the initial test is "triggered" and one or more resamples must be collected at that downgradient well on that constituent for further testing.

At this point, the similarity between the two approaches ends. In his 1990 article, Gibbons computes the probability that at least one of m independent samples taken from each of k downgradient wells will be below (i.e., pass) the prediction limit. The m samples include both the initial sample and (m-1) resamples. Because retesting only occurs when the initial well sample fails the limit, a given well fails the overall test (initial comparison plus retests) only if all (m-1) resamples are above the prediction limit. If any resample passes the prediction limit, that well is regarded as showing no significant evidence of contamination.

Initially, this first strategy may not appear to be adequately sensitive to mild contamination at a given downgradient well. For example, suppose two resamples are to be collected whenever the initial sample fails the upper prediction limit. If the initial sample is above the background maximum and one of the resamples is also above the prediction limit, the well can still be classified as "clean" if the other resample is below the prediction limit. Statistical power simulations (see Appendix B), however, suggest that this strategy will perform adequately under a number of monitoring scenarios. Still, EPA recognizes that a retesting strategy which might classify a well as "clean" when the initial sample and a resample both fail the upper Prediction limit could offer problematic implications for permit writers and enforcement personnel.

A more stringent approach was suggested by Gibbons in 1991. In that article (1991b), Gibbons computes, as "passing behavior," the probability that all but one of m samples taken from each of k wells pass the upper prediction limit. Under this definition, if the initial sample fails the upper Prediction limit, all (m-1) resamples must pass the limit in order for the well to be classified as "clean" during that testing period. Consequently, if any single resample falls above the background maximum, that well is judged as showing significant evidence of contamination.
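The difference between the two retesting rules can be illustrated by simulation. The sketch below (an illustration, not part of the guidance) estimates the facility-wide false positive rate of each rule for an assumed network of 20 uncontaminated wells, 24 Normally distributed background samples, and 2 resamples; the parameter choices and helper name are illustrative only.

```python
# Sketch: Monte Carlo comparison of the facility-wide false positive rates of
# the two non-parametric retesting rules described above, using the maximum of
# the background samples as the Prediction limit.
import numpy as np

def network_false_positive(n_bg=24, n_wells=20, n_resamples=2,
                           rule="A", n_sims=10000, seed=0):
    rng = np.random.default_rng(seed)
    false_triggers = 0
    for _ in range(n_sims):
        limit = rng.standard_normal(n_bg).max()       # all wells uncontaminated
        flagged = False
        for _ in range(n_wells):
            if rng.standard_normal() <= limit:        # initial sample passes
                continue
            resamples = rng.standard_normal(n_resamples)
            if rule == "A":       # Gibbons (1990): any passing resample clears the well
                flagged |= bool(np.all(resamples > limit))
            else:                 # Gibbons (1991b): any failing resample flags the well
                flagged |= bool(np.any(resamples > limit))
        false_triggers += flagged
    return false_triggers / n_sims

print(network_false_positive(rule="A"))   # less stringent rule, lower false positive rate
print(network_false_positive(rule="B"))   # more stringent rule, higher false positive rate
```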

Either non-parametric retesting approach offers the advantage of being extremely easy to implement in field testing of a large downgradient well network. In practice, one has only to determine the maximum background sample to establish the upper prediction limit against which all other comparisons are made. Gibbons' 1991 retesting scheme offers the additional advantage of requiring less overall sampling at a given well to establish significant evidence of contamination. Why? If the testing procedure calls for, say, two resamples at any well that fails the initial prediction limit screen, retesting can end whenever either one of the two resamples falls above the prediction limit. That is, the well will be designated as potentially contaminated if the first resample fails the prediction limit, even if the second resample has not yet been collected.

In both of his papers, Gibbons offers tables that can be used to compute the overall network-wide false positive rate, given the number of background samples, the number of downgradient comparisons, and the number of retests for each comparison. It is clear that there is less flexibility in adjusting a non-parametric as opposed to a parametric prediction limit to achieve a certain Type I error rate. In fact, if only a certain number of retests are feasible at any given well (e.g., in order to maintain independence of successive samples), the only recourse to maintain a low false positive rate is to collect a larger number of background samples. In this way, the inability to make parametric assumptions about the data illustrates why non-parametric tests are on the whole less efficient and less powerful than their parametric counterparts.

Unfortunately, the power of these non-parametric retesting strategies is not explored in detail by Gibbons. To compare the power of both Gibbons' strategies against the EPA Reference Power Curve, Normally distributed data were simulated for several combinations of numbers of background samples and downgradient wells (again, if multiple constituents are being tested, the number of wells in the simulations may be regarded as the number of constituent-well combinations). Up to three resamples were allowed in the simulations for comparative purposes. EPA recognizes, however, that it will be feasible in general to collect only one or two independent resamples from any given well. Power curves representing the results of these simulations are given in Appendix B. For each scenario, the EPA Reference Power Curve is compared with the simulated powers of six different testing strategies. These strategies include collection of no resamples, one resample, two resamples under Gibbons' 1990 approach (designated as A on the curves) and his 1991 approach (labelled as B), and three resamples (under approaches A and B). Under the one-resample strategy, a potentially contaminated compliance well is designated as "clean" if the resample passes the retest and "contaminated" otherwise.

The following table lists the best-performing strategies under each scenario. As with the use of parametric intervals for retesting, the criteria for selecting the best-performing strategies required 1) an approximate 5% facility-wide false positive rate and 2) power equivalent to or better than the EPA Reference Power Curve. Because Normal data were used in these power simulations, more realistically skewed data would likely result in greater advantages for the non-parametric retesting strategies over the EPA Reference test.

Examination of the table and the power curves in Appendix B shows that the number of background samples has an important effect on the recommended testing strategy. For instance, with 8 background samples in a network of at least 20 wells, the best performing strategies all involve collection of 3 resamples per "triggered" compliance well (EPA regards such a strategy as impractical for permitting and enforcement purposes at most RCRA facilities). It tends to be true that as the number of available background samples grows, fewer resamples are needed from each potentially contaminated compliance well to maintain adequate power. If, as is expected, the number of feasible, independent retests is limited, a facility operator may have to collect additional background measurements in order to establish an adequate retesting strategy.


                 NON-PARAMETRIC RETESTING STRATEGIES

# WELLS    # BG SAMPLES    STRATEGY             REFERENCE
    5            8         1 Resample
    5            8         2 Resamples (A)      Gibbons, 1990
    5           16         1 Resample
    5           16         2 Resamples (B)      Gibbons, 1991
    5           24         2 Resamples (B)      Gibbons, 1991
   20            8         2 Resamples (A)      Gibbons, 1990
   20           16         1 Resample
   20           16         2 Resamples (A)      Gibbons, 1990
   20           24         1 Resample
   20           24         2 Resamples (B)      Gibbons, 1991
   20           32         1 Resample
   20           32         2 Resamples (B)      Gibbons, 1991
   50           16         2 Resamples (A)      Gibbons, 1990
   50           24         1 Resample
   50           24         2 Resamples (A)      Gibbons, 1990
   50           32         1 Resample
  100           16         2 Resamples (A)      Gibbons, 1990
  100           24         2 Resamples (A)      Gibbons, 1990
  100           32         1 Resample

6. OTHER TOPICS

6.1 CONTROL CHARTS

Control Charts are an alternative to Prediction limits for performing either intrawell comparisons or comparisons to historically monitored background wells during detection monitoring. Since the baseline parameters for a Control Chart are estimated from historical data, this method is only appropriate for initially uncontaminated compliance wells. The main advantage of a Control Chart over a Prediction limit is that a Control Chart allows data from a well to be viewed graphically over time. Trends and changes in the concentration levels can be seen easily, because all sample data are consecutively plotted on the chart as they are collected, giving the data analyst an overview of the pattern of contamination. Prediction limits allow only point-in-time comparisons between the most recent data and past information, making long-term trends difficult to identify.

More generally, intrawell comparison methods eliminate the need to worry about spatial variability between wells in different locations. Whenever background data are compared to compliance point measurements, there is a risk that any statistically significant difference in concentration levels is due to spatial and/or hydrogeological differences between the wells rather than contamination at the facility. Because intrawell comparisons involve but a single well, significant changes in the level of contamination cannot be attributed to spatial differences between wells, regardless of whether the method used is a Prediction limit or a Control Chart.

Of course, past observations can be used as baseline data in an intrawell comparison only if the well is known to be uncontaminated. Otherwise, the comparison between baseline data and newly collected samples may negate the goal in detection monitoring of identifying evidence of contamination. Furthermore, without specialized modification, Control Charts do not efficiently handle truncated data sets (i.e., those with a significant fraction of nondetects), making them appropriate only for those constituents with a high frequency of occurrence in monitoring wells. Control Charts tend to be most useful, therefore, for inorganic parameters (e.g., some metals and geochemical monitoring parameters) that occur naturally in the ground water.

The steps to construct a Control Chart can be found on pp. 7-3 to 7-10 of the Interim Final Guidance. The way a Control Chart works is as follows. Initial sample data are collected (from the specific compliance well in an intrawell comparison, or from background wells in comparisons of compliance data with background) in order to establish baseline parameters for the chart, specifically, estimates of the well mean and well variance. These samples are meant to characterize the concentration levels of the uncontaminated well, before the onset of detection monitoring. Since the estimate of well variance is particularly important, it is recommended that at least 8 samples be collected (say, over a year's time) to estimate the baseline parameters. Note that none of these 8 or more samples is actually plotted on the chart.

As future samples are collected, the baseline parameters are used to standardize the data. At each sampling period, a standardized mean is computed using the formula below, where m represents the baseline mean concentration, s represents the baseline standard deviation, and x̄i is the mean of the ni samples collected during the ith period:

Zi = √ni (x̄i - m) / s

A cumulative sum (CUSUM) for the ith period is also computed, using the formula Si = max{0, (Zi - k) + Si-1}, where Zi is the standardized mean for that period and k represents a pre-chosen Control Chart parameter.

Once the data have been standardized and plotted, a Control Chart is declared out-of-control if the sample concentrations become too large when compared to the baseline parameters. An out-of-control situation is indicated on the Control Chart when either the standardized means or the CUSUMs cross one of two pre-determined threshold values. These thresholds are based on the rationale that if the well remains uncontaminated, new sample values standardized by the original baseline parameters should not deviate substantially from the baseline level. If contamination does occur, the old baseline parameters will no longer accurately represent concentration levels at the well and, hence, the standardized values should significantly deviate from the baseline levels on the Control Chart.

In the combined Shewhart-cumulative sum (CUSUM) Control Chart recommended by the Interim Final Guidance (Section 7), the chart is declared out-of-control in one of two ways. First, the standardized means (Zi) computed at each sampling period may cross the Shewhart control limit (SCL). Such a change signifies a rapid increase in well concentration levels among the most recent sample data. Second, the cumulative sum (CUSUM) of the standardized means may become too large, crossing the "decision interval value" (h). Crossing the h threshold can mean either a sudden rise in concentration levels or a gradual increase over a longer span of time. A gradual increase or trend is particularly indicated if the CUSUM crosses its threshold but the standardized mean Zi does not. The reason for this is that several consecutive small increases in Zi will not trigger the SCL threshold, but may trigger the CUSUM threshold. As such, the Control Chart can indicate the onset of either sudden or gradual contamination at the compliance point.

As with other statistical methods, Control Charts are based on certain assumptions about the sample data. The first is that the data at an uncontaminated well (i.e., a well process that is "in control") are Normally distributed. Since estimates of the baseline parameters are made using initially collected data, these data should be tested for Normality using one of the goodness-of-fit techniques described earlier. Better yet, the logarithms of the data should be tested first, to see if a Lognormal model is appropriate for the concentration data. If the Lognormal model is not rejected, the Control Chart should be constructed solely on the basis of the logged data.

The methodology for Control Charts also assumes that the sample data are independently distributed from a statistical standpoint. In fact, these charts can easily give misleading results if the consecutive sample data are not independent. For this reason, it is important to design a sampling plan so that distinct volumes of water are analyzed each sampling period and that duplicate sample analyses are not treated as independent observations when constructing the Control Chart.


The final assumption is that the baseline parameters at the well reflect current background concentration levels. Some long-term fluctuation in background levels may be possible even though contamination has not occurred at a given well. Because of this possibility, if a Control Chart remains "in control" for a long period of time, the baseline parameters should be updated to include more recent observations as background data. After all, the original baseline parameters will often be based only on the first year's data. Much better estimates of the true background mean and variance can be obtained by including more data at a later time.

To update older background data with more recent samples, a two-sample t-test can be run to compare the older concentration levels with the concentrations of the proposed update samples. If the t-test does not show a significant difference at the 5 percent significance level, proceed to re-estimate the baseline parameters by including the more recent data. If the t-test does show a significant difference, the newer data should not be characterized as background unless some specific factor can be pinpointed explaining why background levels on the site have naturally changed.
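As an illustration of this check (not part of the guidance itself), the comparison can be run with any standard statistics package; the sketch below uses SciPy's two-sample t-test, and the concentration values shown are hypothetical.

    import numpy as np
    from scipy import stats

    old_background = np.array([12.1, 15.3, 11.8, 14.2, 13.5, 12.9, 16.0, 14.7])   # hypothetical ppb values
    proposed_update = np.array([13.0, 14.8, 12.5, 15.1, 13.9, 14.4])              # hypothetical ppb values

    t_stat, p_value = stats.ttest_ind(old_background, proposed_update)

    if p_value > 0.05:
        # No significant difference at the 5 percent level: pool the data and
        # re-estimate the baseline mean and standard deviation.
        pooled = np.concatenate([old_background, proposed_update])
        baseline_mean, baseline_sd = pooled.mean(), pooled.std(ddof=1)
    else:
        # Significant difference: do not treat the newer data as background unless
        # a specific natural cause for the change can be documented.
        baseline_mean, baseline_sd = old_background.mean(), old_background.std(ddof=1)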

EXAMPLE 18

Construct a control chart for the 8 months of data collected below.

mu = 27 ppb     sigma = 25 ppb

Nickel Concentration (ppb)

   Month     Sample 1     Sample 2
     1         15.3         22.6
     2         41.1         27.8
     3         17.5         18.1
     4         15.7         31.5
     5         37.2         32.4
     6         25.1         32.5
     7         19.9         27.5
     8         99.3         64.2

SOLUTION

Step 1.   The three parameters necessary to construct a combined Shewhart-CUSUM chart are h = 5, k = 1, and SCL = 4.5 in units of standard deviation (SD).

Step 2.   List the sampling periods and monthly means, as in the following table.


   Month    T_i    Mean (ppb)     Z_i     Z_i - k     S_i
     1       1        19.0       -0.45     -1.45     0.00
     2       2        34.5        0.42     -0.58     0.00
     3       3        17.8       -0.52     -1.52     0.00
     4       4        23.6       -0.19     -1.19     0.00
     5       5        34.8        0.44     -0.56     0.00
     6       6        28.8        0.10     -0.90     0.00
     7       7        23.7       -0.19     -1.19     0.00
     8       8        81.8        3.10      2.10     2.10

Step 3.   Compute the standardized means Z_i and the quantities S_i, and list them in the table above. Each S_i is computed for consecutive months using the formula on p. 7-8 of the EPA guidance document:

    S1 = max {0, -1.45 + 0} = 0.00
    S2 = max {0, -0.58 + 0} = 0.00
    S3 = max {0, -1.52 + 0} = 0.00
    S4 = max {0, -1.19 + 0} = 0.00
    S5 = max {0, -0.56 + 0} = 0.00
    S6 = max {0, -0.90 + 0} = 0.00
    S7 = max {0, -1.19 + 0} = 0.00
    S8 = max {0,  2.10 + 0} = 2.10

Step 4.   Plot the control chart as given below. The combined chart indicates that there is no evidence of contamination at the monitoring facility, because neither the standardized means nor the CUSUM statistics exceed their respective thresholds (SCL and h) for the months examined.


[Control Chart for the nickel data (MU = 27 ppb, SIGMA = 25 ppb): the standardized means (Z) and the CUSUMs are plotted by sampling period, with the SCL and h thresholds drawn as horizontal reference lines.]

Note: In the above Control Chart, the CUSUMs are compared to threshold h, while the standardized means (Z) are compared to the SCL threshold.

6.2 OUTLIER TESTING

Formal testing for outliers should be done only if an observation seems particularly high (by orders of magnitude) compared to the rest of the data set. If a sample value is suspect, one should run the outlier test described on pp. 8-11 to 8-14 of the EPA guidance document. It should be cautioned, however, that this outlier test assumes that the rest of the data values, except for the suspect observation, are Normally distributed (Barnett and Lewis, 1978). Since Lognormally distributed measurements often contain one or more values that appear high relative to the rest, it is recommended that the outlier test be run on the logarithms of the data instead of the original observations. That way, one can avoid classifying a high Lognormal measurement as an outlier just because the test assumptions were violated.

If the test designates an observation as a statistical outlier, the sample should not be treated as such until a specific reason for the abnormal measurement can be determined. Valid reasons may, for example, include contaminated sampling equipment, laboratory contamination of the sample, or errors in transcription of the data values. Once a specific reason is documented, the sample should be excluded from any further statistical analysis. If a plausible reason cannot be found, the sample should be treated as a true but extreme value, not to be excluded from further analysis.

EXAMPLE 19

The table below contains data from five wells measured over a 4-month period. The value 7066 is found in the second month at well 3. Determine whether there is statistical evidence that this observation is an outlier.

Carbon Tetrachloride Concentration (ppb)

   Month     Well 1     Well 2     Well 3     Well 4     Well 5
     1         1.69       302        16.2       199        275
     2         3.25        35.1     7066         41.6        6.5
     3         7.3         15.6      350         75.4       59.7
     4        12.1         13.7       70.14      57.9       68.4

SOLUTION

Step 1.   Take logarithms of each observation. Then order and list the logged concentrations.


   Order     Concentration (ppb)     Logged Concentration
     1              1.69                    0.525
     2              3.25                    1.179
     3              6.5                     1.872
     4              7.3                     1.988
     5             12.1                     2.493
     6             13.7                     2.617
     7             15.6                     2.747
     8             16.2                     2.785
     9             35.1                     3.558
    10             41.6                     3.728
    11             57.9                     4.059
    12             59.7                     4.089
    13             68.4                     4.225
    14             70.1                     4.250
    15             75.4                     4.323
    16            199                       5.293
    17            275                       5.617
    18            302                       5.710
    19            350                       5.878
    20           7066                       8.863

Step 2.   Calculate the mean and SD of all the logged measurements. In this case, the mean and SD are 3.789 and 1.916, respectively.

Step 3.   Calculate the outlier test statistic T_20 as

          T_20 = (largest logged value - mean) / SD = (8.863 - 3.789) / 1.916 = 2.648.

Step 4.   Compare the observed statistic T_20 with the critical value of 2.557 for a sample size n = 20 and a significance level of 5 percent (taken from Table 8 on p. B-12 of the Interim Final Guidance). Since the observed value T_20 = 2.648 exceeds the critical value, there is significant evidence that the largest observation is a statistical outlier. Before excluding this value from further analysis, a valid explanation for this unusually high value should be found. Otherwise, treat the outlier as an extreme but valid concentration measurement.
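The calculation in this example can be reproduced with a short script (an illustration only; the critical value 2.557 is taken from the table cited above rather than computed):

    import numpy as np

    concentrations = [1.69, 3.25, 6.5, 7.3, 12.1, 13.7, 15.6, 16.2, 35.1, 41.6,
                      57.9, 59.7, 68.4, 70.1, 75.4, 199, 275, 302, 350, 7066]   # ppb

    logged = np.log(concentrations)
    mean, sd = logged.mean(), logged.std(ddof=1)

    t_20 = (logged.max() - mean) / sd     # (8.863 - 3.789) / 1.916 = 2.648
    is_outlier = t_20 > 2.557             # critical value for n = 20 at the 5 percent level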


REFERENCES

Aitchison, J. (1955) On the distribution of a positive random variable having a discrete probability mass at the origin. Journal of the American Statistical Association, 50(272): 901-8.

Barnett, V. and Lewis, T. (1978) Outliers in Statistical Data. New York: John Wiley & Sons.

Cohen, A.C., Jr. (1959) Simplified estimators for the normal distribution when samples are singly censored or truncated. Technometrics, 1: 217-37.

Cox, D.R. and Hinkley, D.V. (1974) Theoretical Statistics. London: Chapman & Hall.

Davis, C.B. and McNichols, R.J. (1987) One-sided intervals for at least p of m observations from a normal population on each of r future occasions. Technometrics, 29(3): 359-70.

Filliben, J.J. (1975) The probability plot correlation coefficient test for normality. Technometrics, 17: 111-7.

Gan, F.F. and Koehler, K.J. (1990) Goodness-of-fit tests based on P-P probability plots. Technometrics, 32(3): 289-303.

Gayen, A.K. (1949) The distribution of "Student's" t in random samples of any size drawn from non-normal universes. Biometrika, 36: 353-69.

Gibbons, R.D. (1987a) Statistical prediction intervals for the evaluation of ground-water quality. Ground Water, 25(4): 455-65.

Gibbons, R.D. (1987b) Statistical models for the analysis of volatile organic compounds in waste disposal sites. Ground Water, 25(5): 572-80.

Gibbons, R.D. (1990) A general statistical procedure for ground-water detection monitoring at waste disposal facilities. Ground Water, 28(2): 235-43.

Gibbons, R.D. (1991a) Statistical tolerance limits for ground-water monitoring. Ground Water, 29(4): 563-70.

Gibbons, R.D. (1991b) Some additional nonparametric prediction limits for ground-water detection monitoring at waste disposal facilities. Ground Water, 29(5): 729-36.

Gilliom, R.J. and Helsel, D.R. (1986) Estimation of distributional parameters for censored trace level water quality data: Part 1, estimation techniques. Water Resources Research, 22(2): 135-46.

Guttman, I. (1970) Statistical Tolerance Regions: Classical and Bayesian. Darien, Connecticut: Hafner Publishing.

Hahn, G.J. (1970) Statistical intervals for a normal population: Part 1, tables, examples, and applications. Journal of Quality Technology, 2(3): 115-25.

Lehmann, E.L. (1975) Nonparametrics: Statistical Methods Based on Ranks. San Francisco: Holden-Day, Inc.

Madansky, A. (1988) Prescriptions for Working Statisticians. New York: Springer-Verlag.

McBean, E.A. and Rovers, F.A. (1992) Estimation of the probability of exceedance of contaminant concentrations. Ground Water Monitoring Review, Winter, 115-9.

McNichols, R.J. and Davis, C.B. (1988) Statistical issues and problems in ground water detection monitoring at hazardous waste facilities. Ground Water Monitoring Review, Fall.

Miller, R.G., Jr. (1986) Beyond ANOVA: Basics of Applied Statistics. New York: John Wiley & Sons.

Milliken, G.A. and Johnson, D.E. (1984) Analysis of Messy Data: Volume 1, Designed Experiments. Belmont, California: Lifetime Learning Publications.

Ott, W.R. (1990) A physical explanation of the lognormality of pollutant concentrations. Journal of the Air and Waste Management Association, 40: 1378-83.

Ryan, T.A., Jr. and Joiner, B.L. (1990) Normal probability plots and tests for normality. Minitab Statistical Software: Technical Reports, November, 1-1 to 1-14.

Shapiro, S.S. and Wilk, M.B. (1965) An analysis of variance test for normality (complete samples). Biometrika, 52: 591-611.

Shapiro, S.S. and Francia, R.S. (1972) An approximate analysis of variance test for normality. Journal of the American Statistical Association, 67(337): 215-6.

Zacks, S. (1970) Uniformly most accurate upper tolerance limits for monotone likelihood ratio families of discrete distributions. Journal of the American Statistical Association, 65(329): 307-16.


TABLE A-1.   COEFFICIENTS {a(n-i+1)} FOR THE W TEST OF NORMALITY, FOR N=2(1)50

[Tabled coefficients a(n-i+1), i = 1 to n/2, used in computing the Shapiro-Wilk W statistic, for sample sizes n = 2 through 50.]


TABLE A-2.   PERCENTAGE POINTS OF THE W TEST FOR N=3(1)50

      n      0.01     0.05
      3     0.753    0.767
      4     0.687    0.748
      5     0.686    0.762
      6     0.713    0.788
      7     0.730    0.803
      8     0.749    0.818
      9     0.764    0.829
     10     0.781    0.842
     11     0.792    0.850
     12     0.805    0.859
     13     0.814    0.866
     14     0.825    0.874
     15     0.835    0.881
     16     0.844    0.887
     17     0.851    0.892
     18     0.858    0.897
     19     0.863    0.901
     20     0.868    0.905
     21     0.873    0.908
     22     0.878    0.911
     23     0.881    0.914
     24     0.884    0.916
     25     0.888    0.918
     26     0.891    0.920
     27     0.894    0.923
     28     0.896    0.924
     29     0.898    0.926
     30     0.900    0.927
     31     0.902    0.929
     32     0.904    0.930
     33     0.906    0.931
     34     0.908    0.933
     35     0.910    0.934
     36     0.912    0.935
     37     0.914    0.936
     38     0.916    0.938
     39     0.917    0.939
     40     0.919    0.940
     41     0.920    0.941
     42     0.922    0.942
     43     0.923    0.943
     44     0.924    0.944
     45     0.926    0.945
     46     0.927    0.945
     47     0.928    0.946
     48     0.929    0.947
     49     0.929    0.947
     50     0.930    0.947


TABLE A-3.   PERCENTAGE POINTS OF THE W' TEST FOR LARGE SAMPLES

      n       .01      .05
     35     0.919    0.943
     50     0.935    0.953
     51     0.935    0.954
     53     0.938    0.957
     55     0.940    0.958
     57     0.944    0.961
     59     0.945    0.962
     61     0.947    0.963
     63     0.947    0.964
     65     0.948    0.965
     67     0.950    0.966
     69     0.951    0.966
     71     0.953    0.967
     73     0.956    0.968
     75     0.956    0.969
     77     0.957    0.969
     79     0.957    0.970
     81     0.958    0.970
     83     0.960    0.971
     85     0.961    0.972
     87     0.961    0.972
     89     0.961    0.972
     91     0.962    0.973
     93     0.963    0.973
     95     0.965    0.974
     97     0.965    0.975
     99     0.967    0.976


TABLE A-4.   PERCENT POINTS OF THE NORMAL PROBABILITY PLOT CORRELATION COEFFICIENT FOR N=3(1)50(5)100

[Lower percentage points of the probability plot correlation coefficient at the .01, .025, and .05 levels, for sample sizes n = 3 through 100.]


TABLE A-5.   VALUES OF LAMBDA FOR COHEN'S METHOD

[Values of Cohen's adjustment factor lambda, tabulated by gamma = .01 to 6.00 (rows) and by the percentage of nondetects, .01 to .50 (columns).]


TABLE A-6.   MINIMUM COVERAGE (BETA) OF 95% CONFIDENCE NON-PARAMETRIC UPPER TOLERANCE LIMITS

      N     beta (maximum)     beta (2nd largest)
      1          5.0                  --
      2         22.4                  2.6
      3         36.8                 13.6
      4         47.3                 24.8
      5         54.9                 34.2
      6         60.7                 41.8
      7         65.2                 48.0
      8         68.8                 53.0
      9         71.7                 57.0
     10         74.1                 60.6
     11         76.2                 63.6
     12         77.9                 66.2
     13         79.4                 68.4
     14         80.7                 70.4
     15         81.9                 72.0
     16         82.9                 73.6
     17         83.8                 75.0
     18         84.7                 76.2
     19         85.4                 77.4
     20         86.1                 78.4
     21         86.7                 79.4
     22         87.3                 80.2
     23         87.8                 81.0
     24         88.3                 81.8
     25         88.7                 82.4
     26         89.1                 83.0
     27         89.5                 83.6
     28         89.9                 84.2
     29         90.2                 84.6
     30         90.5                 85.2
     31         90.8                 85.6
     32         91.1                 86.0
     33         91.3                 86.4
     34         91.6                 86.8
     35         91.8                 87.2
     36         92.0                 87.4
     37         92.2                 87.8
     38         92.4                 88.2
     39         92.6                 88.4
     40         92.8                 88.6
     41         93.0                 89.0
     42         93.1                 89.2
     43         93.3                 89.4
     44         93.4                 89.6
     45         93.6                 89.8
     46         93.7                 90.0
     47         93.8                 90.2
     48         93.9                 90.4
     49         94.1                 90.6
     50         94.2                 90.8
     55         94.7                 91.6
     60         95.1                 92.4
     65         95.5                 93.0
     70         95.8                 93.4
     75         96.1                 93.8
     80         96.3                 94.2
     85         96.5                 94.6
     90         96.7                 94.8
     95         96.9                 95.0
    100         97.0                 95.4


I. CONSTRUCTION OF POWER CURVES

To construct power curves for each of the parametric and non-parametric retesting strategies, random standard Normal deviates were generated on an IBM mainframe computer using SAS. The background mean concentration level was set to zero, while the alternative mean concentration level was incremented in steps of Delta = 0.5 standardized units above the background level. At each increment, 5000 iterations of the retesting strategy were simulated; the proportion of iterations indicating contamination at any one of the wells in the downgradient monitoring network was designated as the effective power of the retesting strategy (for that Delta and configuration of background samples and monitoring wells).

Power values for the EPA Reference Power Curves were not simulated, but represent analytical calculations based on the non-central t-distribution with non-centrality parameter Delta. SAS programs for simulating the effective power of any of the parametric or non-parametric retesting strategies are presented below.
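For readers who do not use SAS, the same Monte Carlo idea can be rendered compactly in Python. The sketch below (an illustration, not one of the appendix programs) simulates the simplest non-parametric rule from the programs that follow: a well triggers only if its first sample and a single resample both exceed the background maximum; the background and well counts shown are illustrative.

    import numpy as np

    rng = np.random.default_rng(2020)

    def effective_power(delta, n_background=16, n_wells=20, n_iter=5000):
        """Percent of iterations in which at least one well triggers."""
        hits = 0
        for _ in range(n_iter):
            limit = rng.standard_normal(n_background).max()     # non-parametric limit: background maximum
            shifts = np.zeros(n_wells)
            shifts[-1] = delta                                  # one contaminated well, shifted by delta
            first = rng.standard_normal(n_wells) + shifts
            resample = rng.standard_normal(n_wells) + shifts
            if np.any((first > limit) & (resample > limit)):    # trigger only if the resample confirms
                hits += 1
        return 100.0 * hits / n_iter

    for delta in np.arange(0, 5.5, 0.5):
        print(f"delta = {delta:3.1f}   effective power = {effective_power(delta):5.1f}%")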

//*************************************************************
//*  PARAMETRIC SIMULATIONS
//*
//*  DESCRIPTION:
//*  This program produces power curves for 35 different curve
//*  simulations (refer to the %LET statements below).  Delta
//*  ranges from 0 to 5 by 0.5.  The variable list is as follows
//*  for the input parameters:
//*
//*     BG - Background
//*     WL - Well
//*     TL - Tolerance Limit
//*     PL - Prediction Limit
//*************************************************************
// EXEC SAS
// OUTSAS DD DSN=XXXXXXX.GWT03000.SJA3092.CURVES,
//           DISP=OLD
// SYSIN DD *

OPTIONS LS=132 PS=57;
%LET ISTART=1;
%LET CURVENUM=35;
%LET RSEED=2020;
%LET REPEAT=5000;
%LET ITPRINT=1000;

%LET BG1 =24;  %LET WL1 =5;    %LET TL1 =0.95;  %LET PL1 =0.80;
%LET BG2 =24;  %LET WL2 =5;    %LET TL2 =0.95;  %LET PL2 =0.85;
%LET BG3 =8;   %LET WL3 =5;    %LET TL3 =0.95;  %LET PL3 =0.80;
%LET BG4 =8;   %LET WL4 =5;    %LET TL4 =0.95;  %LET PL4 =0.85;
%LET BG5 =24;  %LET WL5 =20;   %LET TL5 =0.95;  %LET PL5 =0.95;
%LET BG6 =24;  %LET WL6 =20;   %LET TL6 =0.95;  %LET PL6 =0.97;
%LET BG7 =8;   %LET WL7 =20;   %LET TL7 =0.95;  %LET PL7 =0.95;
%LET BG8 =8;   %LET WL8 =20;   %LET TL8 =0.95;  %LET PL8 =0.97;
%LET BG9 =24;  %LET WL9 =50;   %LET TL9 =0.95;  %LET PL9 =0.98;
%LET BG10=24;  %LET WL10=50;   %LET TL10=0.95;  %LET PL10=0.99;


%LET BG11=24;  %LET WL11=50;   %LET TL11=0.99;  %LET PL11=0.90;
%LET BG12=24;  %LET WL12=50;   %LET TL12=0.99;  %LET PL12=0.93;
%LET BG13=24;  %LET WL13=50;   %LET TL13=0.99;  %LET PL13=0.94;
%LET BG14=24;  %LET WL14=50;   %LET TL14=0.98;  %LET PL14=0.95;
%LET BG15=24;  %LET WL15=50;   %LET TL15=0.98;  %LET PL15=0.97;
%LET BG16=24;  %LET WL16=100;  %LET TL16=0.98;  %LET PL16=0.97;
%LET BG17=24;  %LET WL17=100;  %LET TL17=0.98;  %LET PL17=0.99;
%LET BG18=24;  %LET WL18=100;  %LET TL18=0.99;  %LET PL18=0.95;
%LET BG19=24;  %LET WL19=100;  %LET TL19=0.99;  %LET PL19=0.97;
%LET BG20=24;  %LET WL20=100;  %LET TL20=0.99;  %LET PL20=0.98;
%LET BG21=8;   %LET WL21=20;   %LET TL21=0.95;  %LET PL21=0.98;
%LET BG22=8;   %LET WL22=5;    %LET TL22=0.95;  %LET PL22=0.90;
%LET BG23=16;  %LET WL23=5;    %LET TL23=0.95;  %LET PL23=0.85;
%LET BG24=16;  %LET WL24=5;    %LET TL24=0.95;  %LET PL24=0.90;
%LET BG25=24;  %LET WL25=5;    %LET TL25=0.95;  %LET PL25=0.90;
%LET BG26=16;  %LET WL26=20;   %LET TL26=0.95;  %LET PL26=0.95;
%LET BG27=16;  %LET WL27=20;   %LET TL27=0.95;  %LET PL27=0.97;
%LET BG28=16;  %LET WL28=50;   %LET TL28=0.98;  %LET PL28=0.95;
%LET BG29=16;  %LET WL29=50;   %LET TL29=0.98;  %LET PL29=0.97;
%LET BG30=16;  %LET WL30=50;   %LET TL30=0.99;  %LET PL30=0.90;
%LET BG31=16;  %LET WL31=50;   %LET TL31=0.99;  %LET PL31=0.92;
%LET BG32=24;  %LET WL32=100;  %LET TL32=0.98;  %LET PL32=0.98;
%LET BG33=16;  %LET WL33=100;  %LET TL33=0.98;  %LET PL33=0.98;
%LET BG34=16;  %LET WL34=100;  %LET TL34=0.99;  %LET PL34=0.95;
%LET BG35=16;  %LET WL35=100;  %LET TL35=0.99;  %LET PL35=0.96;

%MACRO PARSIM;
DATA ITERATE;
*** Set changing simulation variable to common variable names;
BG=&&BG&I;
WL=&&WL&I;
TL=&&TL&I;
PL=&&PL&I;

DO DELTA=0 TO 5 BY 0.5;
*** Initialize TP0, TP1 & TP2 to 0 before entering simulation;
TP0=0; TP1=0; TP2=0;

DO J=1 TO &REPEAT;
*** Initialize CNT0, CNT1 & CNT2 to 0;
CNT0=0; CNT1=0; CNT2=0;

XB=RANNOR(&RSEED)/SQRT(BG);
SB=SQRT(2*RANGAM(&RSEED,(BG-1)/2)/(BG-1));

PL2=XB+SB*SQRT(1+1/BG)*TINV((1-(1-PL)/2),(BG-1));
PL1=XB+SB*SQRT(1+1/BG)*TINV((1-(1-PL)),(BG-1));
PL0=XB+SB*SQRT(1+1/BG)*TINV((1-(1-TL)),(BG-1));
TLIM=XB+SB*SQRT(1+1/BG)*TINV((1-(1-TL)),(BG-1));

DO K=1 TO WL;
IF K<WL THEN DO;
X1=RANNOR(&RSEED);
X2=RANNOR(&RSEED);
X3=RANNOR(&RSEED);
END;
ELSE DO;
X1=RANNOR(&RSEED)+DELTA;
X2=RANNOR(&RSEED)+DELTA;


X3=RANNOR(&RSEED)+DELTA;
END;
IF X1>TLIM THEN DO;
CNT0=CNT0+1;
IF X2>PL1 THEN CNT1=CNT1+1;
IF X2>PL2 OR X3>PL2 THEN CNT2=CNT2+1;
END;

END;

IF CNT0>0 THEN TP0=TP0+100/&REPEAT;
IF CNT1>0 THEN TP1=TP1+100/&REPEAT;
IF CNT2>0 THEN TP2=TP2+100/&REPEAT;

*** Print iteration information every 100 iterations;
I=&I;
IF MOD(J,&ITPRINT)=0 THEN
PUT '>>> CURVE ' I ', ITERATION ' J ', BG=' BG ', WL=' WL ', TL=' TL ', PL=' PL
    ', DELTA=' DELTA ', TP0=' TP0 ', TP1=' TP1 ', TP2=' TP2 ' <<<';

END;
OUTPUT;
END;
RUN;

DATA OUTSAS.PCURVE&I; SET ITERATE(KEEP=BG WL TL PL TP0 TP1 TP2 DELTA);
RUN;

PROC PRINT DATA=OUTSAS.PCURVE&I;
FORMAT TP0 TP1 TP2 8.4;
TITLE1 "TEST PRINT OF PARAMETRIC SIMULATION PCURVE&I";
TITLE2 "NUMBER OF ITERATIONS = &REPEAT";
RUN;

%MEND PARSIM;

%MACRO CURVE;
%DO I=&ISTART %TO &CURVENUM;
%PARSIM
%END;
%MEND CURVE;
%CURVE

//*************************************************************
//*  DESCRIPTION:
//*
//*  This program produces power curves for 15 different curve
//*  simulations (refer to the %LET statements below).  Delta ranges
//*  from 0 to 5 by 0.5.  The variable list is as follows for the
//*  input parameters:
//*
//*     BG - Background
//*     WL - Well
//*************************************************************
// EXEC SAS
// OUTSAS DD DSN=XXXXXXX.GWT03000.SJA3092.CURVES,DISP=OLD
// SYSIN DD *

OPTIONS LS=132 PS=57;
%LET ISTART=1;
%LET CURVENUM=15;
%LET RSEED=3030;
%LET REPEAT=5000;
%LET ITPRINT=1000;


%LET BG1 =8;   %LET WL1 =5;
%LET BG2 =16;  %LET WL2 =5;
%LET BG3 =24;  %LET WL3 =5;
%LET BG4 =8;   %LET WL4 =20;
%LET BG5 =16;  %LET WL5 =20;
%LET BG6 =24;  %LET WL6 =20;
%LET BG7 =8;   %LET WL7 =50;
%LET BG8 =16;  %LET WL8 =50;
%LET BG9 =24;  %LET WL9 =50;
%LET BG10=8;   %LET WL10=100;
%LET BG11=16;  %LET WL11=100;
%LET BG12=24;  %LET WL12=100;
%LET BG13=32;  %LET WL13=100;
%LET BG14=32;  %LET WL14=20;
%LET BG15=32;  %LET WL15=50;

%MACRO NPARSIM;
DATA ITERATE;
*** Set changing simulation variable to common variable names;
BG=&&BG&I;
WL=&&WL&I;

DO DELTA=0 TO 5 BY 0.5;
*** Initialize PLx variables to 0 before entering simulation;
PL0=0; PL1=0; PL2A=0; PL2B=0; PL3A=0; PL3B=0;

DO J=1 TO &REPEAT;
*** Initialize CNTx variables to 0;
CNT0=0; CNT1=0; CNT2=0; CNT3=0; CNT4=0; CNT5=0;

DO K=1 TO BG;
TEST=RANNOR(&RSEED);
IF K=1 THEN MAX=TEST;
ELSE IF TEST>MAX THEN MAX=TEST;
END;

DO L=1 TO WL;
IF L<WL THEN DO;
X1=RANNOR(&RSEED);
X2=RANNOR(&RSEED);
X3=RANNOR(&RSEED);
X4=RANNOR(&RSEED);
END;
ELSE DO;
X1=RANNOR(&RSEED)+DELTA;
X2=RANNOR(&RSEED)+DELTA;
X3=RANNOR(&RSEED)+DELTA;
X4=RANNOR(&RSEED)+DELTA;
END;

IF X1>MAX THEN DO;
CNT0=CNT0+1;
IF X2>MAX THEN CNT1=CNT1+1;


IF X2>MAX & X3>MAX THEN CNT2=CNT2+1;
IF X2>MAX OR X3>MAX THEN CNT3=CNT3+1;
IF X2>MAX & X3>MAX & X4>MAX THEN CNT4=CNT4+1;
IF X2>MAX OR X3>MAX OR X4>MAX THEN CNT5=CNT5+1;
END;

END;

IF CNT0>0 THEN PL0=PL0+100/&REPEAT;
IF CNT1>0 THEN PL1=PL1+100/&REPEAT;
IF CNT2>0 THEN PL2A=PL2A+100/&REPEAT;
IF CNT3>0 THEN PL2B=PL2B+100/&REPEAT;
IF CNT4>0 THEN PL3A=PL3A+100/&REPEAT;
IF CNT5>0 THEN PL3B=PL3B+100/&REPEAT;

*** Print iteration information every X iterations;
I=&I;
IF MOD(J,&ITPRINT)=0 THEN
PUT '>>> CURVE ' I ', ITERATION ' J ', BG=' BG ', WL=' WL ', DELTA=' DELTA
    ', PL0=' PL0 ', PL1=' PL1 ', PL2A=' PL2A ', PL2B=' PL2B ', PL3A=' PL3A ', PL3B=' PL3B ' <<<';

END;
OUTPUT;
END;
RUN;

DATA OUTSAS.NCURVE&I; SET ITERATE(KEEP=BG WL PL0 PL1 PL2A PL2B PL3A PL3B DELTA);
RUN;

PROC PRINT DATA=OUTSAS.NCURVE&I;
FORMAT PL0 PL1 PL2A PL2B PL3A PL3B 8.4;
TITLE1 "TEST PRINT OF NON-PARAMETRIC SIMULATION NCURVE&I";
TITLE2 "NUMBER OF ITERATIONS = &REPEAT";
RUN;

%MEND NPARSIM;

%MACRO CURVE;
%DO I=&ISTART %TO &CURVENUM;
%NPARSIM
%END;
%MEND CURVE;
%CURVE



II. PARAMETRIC RETESTING STRATEGIES

[Figures B-7 through B-15: simulated power curves for the parametric retesting strategies. Each plot shows effective power (%) against Delta (units above background) for a given combination of tolerance limit, prediction limit, number of background samples, and number of wells, with the EPA reference power curve and the zero-, one-, and two-resample strategies overlaid. The combinations shown include: 95% tolerance with 90% prediction limit (8 and 24 background samples); 95% tolerance with 85% prediction limit; 95% tolerance with 97% and 98% prediction limits; 98% tolerance with 95%, 97%, and 98% prediction limits (50 and 100 wells); and 99% tolerance with 90%, 92%, and 95% prediction limits.]


NON-PARAMETRIC RETESTING STRATEGIES

[Figures B-16 through B-18: simulated power curves for the non-parametric prediction limits, beginning with 8 background samples and 5 wells. Each plot shows effective power (%) against Delta (units above background).]


TABLE A-7.   CONFIDENCE LEVELS FOR NON-PARAMETRIC PREDICTION LIMITS FOR N=1(1)100

[Confidence levels (%) of the non-parametric prediction limit, tabulated by the number of background samples N = 1 to 100 (rows) and the number of future samples k = 1 to 8 (columns).]
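The tabulated levels agree with the standard exact result for a prediction limit set at the maximum of N background samples: the confidence that all k future samples fall below it is 100*N/(N + k) percent. The short check below (not part of the original appendix) reproduces a few entries.

    # Spot-check Table A-7 entries against confidence = 100 * N / (N + k).
    for n, k in [(1, 1), (2, 1), (10, 2), (40, 8)]:
        print(n, k, round(100 * n / (n + k), 1))   # 50.0, 66.7, 83.3, 83.3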


[Figures B-19 through B-38: simulated power curves for the non-parametric prediction limits. Each plot shows effective power (%) against Delta (units above background) for a given number of background samples (8, 16, 24, or 32) and number of wells (5, 20, 50, or 100), with the EPA reference power curve and the resampling strategies, including the three-resample rules (A) and (B), overlaid.]