AS TG5 - Uncertainty of Measurement in Chemical and Microbiological Testing


Technical Guide

Uncertainty of Measurement, Precision and Limits of Detection in Chemical and Microbiological Testing Laboratories

Published by:

International Accreditation New Zealand
626 Great South Road, Greenlane, Auckland 1005
Private Bag 28 908, Remuera, Auckland 1136, New Zealand

Telephone 64 9 525 6655
Facsimile 64 9 525 2266
Email: [email protected]
Website: http://www.ianz.govt.nz

AS TG5, May 2004

ISBN: 09086 11 60 9

Copyright International Accreditation New Zealand 2003



Contents

1 Introduction
2 Some Definitions
3 IANZ Policy
4 General Interpretation And Guidance
5 Estimating Uncertainty Of Measurement Step By Step
6 Expanding Uncertainty And Confidence Limits
7 Comparing A Test Result With A Fixed Value
8 Comparing A Test Result With A Natural Population
9 Comparing Two Test Results
10 Criterion Of Detection
11 Limit Of Detection
12 Limit Of Quantification
13 Acknowledgement
14 References



1 Introduction

Decisions relating to industrial production, and also legal decisions, are often based on test results. They can involve large sums of money or even a person's liberty. It is therefore important, not only for the analyst but also for their clients and the decision makers, to know how reliable test results are. What confidence can be placed in results which indicate that a product is unsatisfactory or that a person has committed an offence? Some New Zealand chemical and microbiological testing laboratories do not know what confidence they can place in their results, and the resulting misinterpretation of insignificant differences is leading to incorrect business decisions.

When analysts look at the results, they are faced with a long list of possible sources of variation, some causing a systematic bias in the results and some being random in nature. The possible combined extent of these variations needs to be known for each test method at typical analyte levels before the analyst can know whether their method and results are fit for purpose.

Known biases should be corrected (unless that is not the convention for the method). The remaining components of variation should be evaluated so that an overall uncertainty estimate may be made.

The general accreditation criteria for laboratories, ISO/IEC 17025:1999, state that: "Testing laboratories shall have and shall apply procedures for estimating uncertainty of measurement. In certain cases the nature of the test method may preclude rigorous, metrologically and statistically valid calculation of uncertainty of measurement. In these cases the laboratory shall at least attempt to identify all the components of uncertainty and make a reasonable estimation, and shall ensure that the form of reporting of the results does not give a wrong impression of the uncertainty. Reasonable estimation shall be based on knowledge of the performance of the method and on the measurement scope and shall make use of, for example, previous experience and validation data."

This International Accreditation New Zealand (IANZ) guide explains the steps which may be taken to identify uncertainty components and to estimate the uncertainty of measurements in the fields of Chemistry and Microbiology.

It also explains the use of precision in such estimates. Limits of detection and compliance with specification are also discussed. Later sections contain some useful formulae and examples for these fields.

Readers are referred to the Eurachem / CITAC Guide for more details and further examples on this subject. This document is obtainable from the Eurachem web site.

The Eurachem document suggests two approaches that may be used for the estimation of uncertainty in chemical measurements. The first is to identify each source of uncertainty, estimate the uncertainty of each source separately and combine them in the correct way to arrive at an overall uncertainty. The second approach is to look for available precision data, or obtain suitable data, then to identify the sources of uncertainty which are not included in these data, estimate the uncertainty of each additional source and then combine these with the precision standard deviation.
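As an illustration of the second approach, the sketch below combines a precision standard deviation with additional components by root sum of squares. It assumes that every component is an independent standard uncertainty expressed in the same units; the function name and all numerical values are hypothetical and are not taken from this guide.

```python
import math


def combined_standard_uncertainty(precision_sd, additional_components):
    """Combine a precision standard deviation with extra standard uncertainties.

    Assumption: all inputs are independent standard uncertainties in the
    same units, so their variances add (root sum of squares).
    """
    variances = [precision_sd ** 2] + [u ** 2 for u in additional_components]
    return math.sqrt(sum(variances))


# Hypothetical figures for illustration only (e.g. mg/L).
precision_sd = 0.30   # intermediate precision standard deviation
extra = [0.40]        # e.g. a reference material uncertainty missing from the precision data
u_c = combined_standard_uncertainty(precision_sd, extra)
print(f"combined standard uncertainty = {u_c:.2f}")
```

Because variances rather than standard deviations are added, the combined value (0.50 here) is smaller than the simple sum of the two components (0.70).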

Since the mid 1980s, Telarc / IANZ has been encouraging chemical testing laboratories to obtain precision data and to estimate standard deviations, confidence limits and limits of detection from these data. It is therefore assumed that most such laboratories in New Zealand will have estimates of precision, and preferably intermediate precision, for their tests. Many will also have precision data from inter-laboratory comparison programmes in which they participated or from published data.

Microbiology laboratories will have, or can obtain, test results for replicate samples analysed preferably by different people, media, incubators, etc. but usually on the same day (morning then afternoon). Different days and inter-laboratory comparisons are a problem in microbiology because of the lack of stability of samples.

Because of the availability of these data, the main approach described here for the estimation of uncertainty of measurement in chemical and microbiological testing is the second Eurachem one, starting with precision data and combining this with any missing components. Where chemistry laboratories are involved in characterising reference materials or require a more rigorous approach, the estimation of uncertainty for each contributing item and then the combining of these results will be the preferred approach.



Many New Zealand chemical and microbiological testing laboratories are also involved in monitoring environmental samples. In most cases, either the Health Department or a regional council will have set limits for the levels of pollutants that samples may contain. Frequently, laboratory results indicate that none of a particular pollutant was detected. Such information is, of course, useless if the limit of detection for the method is above the limit set by the authority. This situation may occur when a laboratory has not correctly calculated the limits of detection for its methods.

The uncertainty estimates for low levels may readily be extended to the calculation of limits of detection, and this is presented in the later sections of this guide.

2 Some Definitions

2.1 Uncertainty of Measurement (ISO / VIM 1993)

A parameter associated with the result of a measurement that characterizes the dispersion of the values that could reasonably be attributed to the measurand.

Notes

1. The parameter may be, for example, a standard deviation (or a given multiple of it), or the half-width of an interval having a stated level of confidence.

2. Uncertainty of measurement comprises, in general, many components. Some of these components may be evaluated from the statistical distribution of the results of series of measurements and can be characterized by experimental standard deviations. The other components, which can also be characterized by standard deviations, are evaluated from assumed probability distributions based on experience or other information.

3. It is understood that the result of the measurement is the best estimate of the value of the measurand and that all components of uncertainty, including those arising from systematic effects such as components associated with corrections and reference standards, contribute to the dispersion.

2.2 Measurement Traceability (ISO / VIM 1993)

The property of the result of a measurement whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons all having stated uncertainties.

ISO/IEC 17025 requires that:

For testing laboratories the requirements given for calibration laboratories apply for measuring and test equipment with measuring functions used, unless it has been established that the associated contribution from the calibration contributes little to the total uncertainty of the test result. When this situation arises, the laboratory shall ensure that the equipment used can provide the uncertainty of measurement needed.

Where traceability of measurements to SI units is not possible and/or relevant, the same requirements for traceability to, for example, certified reference materials, agreed methods and/or consensus standards are required as for calibration laboratories.

2.3 Measurement

The set of operations having the objective of determining a value of a quantity with a stated level of uncertainty.

Where a test yields, or is based on, a result which is expressed in numerical terms, the testing process will be regarded as a measurement for the purposes of this document and for the application of the uncertainty clause in ISO/IEC 17025.

2.4 Measurand

The particular quantity that is subject to the measurement.

2.5 Empirical Test Method

A test method where the result is not traceable to an SI unit, but depends on the defined steps of the method. A different test method (often for the same named analyte) may give a different result.

An empirical test method is a test method intended to measure a property which is dependent on the test method used to measure it. Different methods may return different results, which may not be related. In many cases, the method cannot be verified using another test method. Examples of empirical test methods are the leachable concentration of chemicals and the hardness of a material. For the former, different extractant and leaching conditions will produce different results.



For the latter, different indenter shapes and applied forces will produce different results.

A rational test method is a test method intended to measure a property which is defined independently of any test methods. There is an objective true value to that property and the method can be verified using other test methods. Examples of rational test methods are those where the total concentration of a compound in a sample is measured. It is recognised that although there is an objective true value, it may be very difficult to measure that value.

All microbiology test methods are empirical, as are many (most) chemical test methods (particularly where there is a digestion, extraction, chemical reaction or clean-up step, or where steps in the method involve incomplete recovery of an analyte and there is no correction for bias). Corrections for spike recovery results or use of similar compound internal standards are attempts at correcting for bias and making the method closer to a rational method.

2.6 True Value

The value that would be obtained by a perfect measurement. This is indeterminate.

2.7 Conventional True Value

The value attributed to a particular quantity and accepted, sometimes by convention, as having an uncertainty appropriate for a given purpose.

Frequently, a number of results of measurements of a quantity are used to establish a conventional true value.

2.8 Accuracy or Trueness of a Measurement

This is the closeness of the agreement between the result of a measurement and a true value of the measurand.

Accuracy, and particularly the extent of interferences, must be assessed primarily during the selection and, if necessary, development of a test method. This requires detailed consideration of the ways in which various sample types may affect the measurement.

For well-researched methods, these aspects should have been covered for the range of sample types specified and therefore specific checks within each laboratory using that method may not be required. However, it is prudent to include accuracy checks such as running spiked samples and reference samples, or split samples in inter-laboratory trials, for which the conventional true values are well established. When introducing a well-researched method, the laboratory must still check that it gives reliable results, usually by use of certified reference materials and replicates.

Where a new test method is developed or a method is applied to sample types for which its accuracy has not been assessed, detailed interference and accuracy checks are necessary. These may include:

- Detailed consideration of the chemistry and physics of the method in relation to various sample types
- Cross-checking of results with completely independent methods
- Checking the effect of adding interfering substances to known samples
- Spike recovery tests
- Reanalysing samples from which the measurand has been removed
- Inter-laboratory comparisons
- Certified Reference Materials

2.9 Bias

The systematic difference between results of measurements and the true value of the measurand.

2.10 Precision

Closeness of agreement amongst results of successive measurements of the same measurand.

Precision increases as random variations decrease. It is possible to have results which are precise but not accurate.

They may all be close to a mean or average value but, because of some systematic effect, there is a bias and all are much higher (or lower) than the true value.

2.11 Repeatability r

Closeness of the agreement between the results of successive measurements of the same measurand carried out under the same measurement conditions (same laboratory, same sample, same method, same equipment, same materials, same staff, same time), e.g. from duplicates within the same batch.



2.12 Reproducibility R

Closeness of agreement between the results of measurements of the same measurand carried out under changed conditions of measurement (different laboratory, same sample, same method, different equipment, different materials, different staff, different time), e.g. from inter-laboratory comparison results.

2.13 Intermediate Precision

Closeness of agreement between the results of measurements of the same measurand carried out under changed conditions of measurement but within the one laboratory (same laboratory, same sample, same method, different equipment, different materials, different staff, different time), e.g. from replicate analysis of Quality Control samples or matrix reference materials over time.

2.14 Criterion of Detection

This is the lowest result at which the analyst may have (say 95%) confidence that some analyte has been detected rather than none. This is about 1.7 times the standard deviation of low-level results (the single-sided statistic is used).

2.15 Limit of Detection

If a test result is just below the criterion of detection then the analyst cannot say that they have found nothing. The true value will be expected (say with 95% confidence) to lie between zero and two times the criterion of detection (the confidence interval around the criterion of detection). Therefore all that can be said is that the true result is less than this upper confidence limit, viz. twice the criterion of detection.

Therefore the Limit of Detection is twice the Criterion of Detection, or about 3.4 times the standard deviation of low-level results.

2.16 Limit for Quantification

For a numerical value to be placed on a low-level test result, the lower confidence limit associated with this value should be significantly above zero.

For a result at the limit of detection (say 10 units), for a normal distribution and reasonable degrees of freedom, the 95% confidence limits will be plus and minus 2/3.4 of the result (e.g. 10 units +/- 5.9 units). Such a result may not be helpful to the client.

It may therefore be reasonable to specify that the confidence limits for a reported numerical test result should be no more than, say, one quarter of that result.

Since the 95% confidence limits are about plus and minus 2 standard deviations, achieving this requires the limit for quantification to be about 8 times the standard deviation of low-level results.
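The multipliers given in 2.14 to 2.16 can be applied directly to a low-level standard deviation. The following is a minimal sketch only: the function name and the example standard deviation are hypothetical, and the factors 1.7, 2 and 8 are the approximate values quoted above (normal distribution, reasonable degrees of freedom).

```python
def detection_limits(sd_low_level):
    """Return (criterion of detection, limit of detection, limit for quantification).

    Uses the approximate multipliers quoted in sections 2.14 to 2.16:
      criterion of detection   ~ 1.7 s  (one-sided, about 95 % confidence)
      limit of detection       = 2 x criterion, i.e. ~ 3.4 s
      limit for quantification ~ 8 s    (so that +/- 2 s is a quarter of the result)
    """
    criterion = 1.7 * sd_low_level
    limit_of_detection = 2 * criterion
    limit_for_quantification = 8 * sd_low_level
    return criterion, limit_of_detection, limit_for_quantification


# Hypothetical standard deviation of low-level results, in result units.
s = 0.5
cod, lod, loq = detection_limits(s)
print(f"criterion of detection   = {cod:.2f}")
print(f"limit of detection       = {lod:.2f}")
print(f"limit for quantification = {loq:.2f}")

# Relative half-width of the ~95 % interval (+/- 2 s) at each level.
print(f"at the limit of detection:       +/- {2 * s / lod:.0%} of the result")
print(f"at the limit for quantification: +/- {2 * s / loq:.0%} of the result")
```

The last two lines show why a result at the limit of detection carries confidence limits of roughly 59% of its value, while a result at the limit for quantification carries 25%.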

2.17 Certified Reference Material

Reference material accompanied by a certificate, one or more of whose property values are certified by a procedure which establishes traceability to an accurate realisation of the unit in which the property values are expressed, and for which each certified value is accompanied by an uncertainty at a stated level of confidence.

3 IANZ Policy

It is the policy of IANZ that accredited testing laboratories shall comply with the requirements of ISO/IEC 17025 in relation to the estimation and reporting of uncertainty of measurement in testing.

The requirements are stated mainly in Clauses 5.4.6 and 5.10.3.1 c) of ISO/IEC 17025.

Procedures and recommendations stated in the documents referred to in the notes following Clause 5.4.6 are not requirements.

The measurement uncertainty definition in ISO VIM 1993 [2] applies.

4 General Interpretation And Guidance

The following guidance and interpretation are based on the APLAC TC 005 Guide.

4.1 Tests for which Uncertainty Applies

Where a test produces numerical results, or the reported result is based on a numerical result, the uncertainty of measurement for those numerical results shall be estimated. In cases where the nature of the test method precludes rigorous, metrologically and statistically valid estimation of the measurement uncertainty, a testing laboratory shall identify all significant components of uncertainty and make a reasonable attempt to estimate the overall uncertainties of those measurements. This applies whether the test methods are rational or empirical.



Where results of tests are not numerical (e.g., pass/fail, positive/negative or other qualitative expressions), estimates of uncertainty or other variability are not required. However, laboratories are encouraged to have an understanding of the variability of the results where possible, and especially the possibility of false negatives or false positives.

The significance of the uncertainty of qualitative test results is recognised, and so is the fact that the statistical machinery required to handle the calculation of such uncertainty exists. However, in view of the complexity of the issue and the lack of agreed approaches, IANZ does not require laboratories to estimate the uncertainty of qualitative test results at this time but will keep this under review.

4.2 Defining the Measurand

It is recognised that in chemical and microbiological testing, the measurand is often defined in terms of the method (empirical method) and is not directly traceable to SI. Care is needed in defining the measurand to ensure that all uncertainty components will be identified and accounted for.

Traceability for chemical test methods often has several components. Balances, thermometers and volumetric glassware are traceable to SI units and full uncertainty budgets should be available if such measurements make a significant contribution to the overall uncertainty. Traceability of the final result calculation is usually to a reference material which, although often not complying with the definition (VIM) of a certified reference material, should be the best available. Where specified sample preparations, extractions, digestions, chemical reactions, clean-ups, temperatures, etc. are included in the method and no correction for method bias (e.g. recoveries) is specified, the method is regarded as empirical and is traceable to its specified instructions. Different methods would give different results.

4.3 Identifying the Components of Uncertainty

The laboratory shall identify all the significant components of uncertainty for each test. An individual component contributing less than 1/5 to 1/3 of the total uncertainty of the measurement will not have much impact on the total uncertainty of the measurement. However, it would become significant if the method included a number of components of uncertainty of this size.
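The effect of small components can be checked numerically. The sketch below assumes independent components combined by root sum of squares and, for simplicity, expresses each small component as a fraction of a single dominant component rather than of the total; the function name and the values are hypothetical.

```python
import math


def relative_increase(dominant, small, count=1):
    """Fractional increase in combined uncertainty caused by `count` small components.

    Assumption: components are independent standard uncertainties and are
    combined by root sum of squares.
    """
    total = math.sqrt(dominant ** 2 + count * small ** 2)
    return total / dominant - 1.0


# One small component, one fifth and one third the size of the dominant one.
print(f"one component at 1/5:   {relative_increase(1.0, 1 / 5):.1%}")
print(f"one component at 1/3:   {relative_increase(1.0, 1 / 3):.1%}")

# Several components of the same small size are no longer negligible.
print(f"five components at 1/3: {relative_increase(1.0, 1 / 3, count=5):.1%}")
```

Under these assumptions a single component at one third of the dominant one raises the combined uncertainty by about 5%, whereas five such components raise it by about 25%.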

Even where reliance is to be placed on overall precision data, the laboratory shall at least attempt to identify all significant components. This will provide information to confirm that the approach taken is reasonable and all significant components have been accounted for.

Flowcharting the steps of the test method and using fish-bone diagrams to present the uncertainty components provide useful approaches.

In some cases, groups of test method steps (similar sample type, same procedure, same weights and same volumes) may be common to several different test methods. Once an estimate of uncertainty has been obtained for that group of steps, it may be used in the estimates of uncertainties for all methods where the group applies, e.g. sub-sampling and sample preparation components applied to a specific sample type.

4.4 Approaches to the Estimation of Uncertainty

There are various published approaches to the estimation of uncertainty and/or variability in testing. ISO/IEC 17025 does not specify any particular approach. Laboratories are encouraged to use statistically valid approaches. All approaches which give a reasonable estimate and are considered valid within the relevant technical discipline are equally acceptable, and no one approach is favoured over the others. The following paragraphs summarise two approaches, which are described in the Eurachem document "Quantifying Uncertainty in Analytical Measurement".

Both the intermediate precision and reproducibility (from inter-laboratory comparisons) described in ISO 5725 (see clause 5.4.6.3 Note 3 of ISO/IEC 17025) may be used in estimating testing uncertainty. However, these alone may omit some uncertainty sources which, if significant, should also be estimated and combined.

Method validation and verification data from repeat analysis of matrix reference materials, in-house standards, replicate analyses, inter-laboratory comparison programmes, etc., will be useful in establishing method precision. For chemical testing, a well-designed intermediate precision result or the precision from an inter-laboratory comparison will often incorporate the major components of uncertainty.



For microbiological testing, in most practical cases, precision will be the only significant component and the only one which in practice can be readily estimated.

The uncertainty of physical measurements, the purity of calibration reference materials and their uncertainties, the uncertainties associated with recovery (bias) trials (when recovery factors are applied to results), as well as precision data shall all be considered in the evaluation of measurement uncertainty for chemical testing.

The ISO Guide to the Expression of Uncertainty in Measurement (GUM) (see clause 5.4.6.3 Note 3 of ISO/IEC 17025) is regarded as the more rigorous approach to the estimation of uncertainty. However, in certain cases, the validity of GUM estimates from a particular mathematical model may need to be verified, e.g., through inter-laboratory comparisons.

The "Number of Significant Figures" approach and Note 2 of ISO/IEC 17025 are not considered suitable for the evaluation of measurement uncertainty in chemical or microbiological testing.

4.5 Degree of Rigour

The degree of rigour and the method to be used for estimating uncertainty shall be determined by the laboratory in accordance with Note 1 of Clause 5.4.6.1 of ISO/IEC 17025.

To do this, the laboratory shall:

- Consider the requirements and limitations of the test method and the need to comply with good practice in the particular sector
- Ensure that it understands the requirements of the client (see clause 4.4.1a of ISO/IEC 17025). It is often the case that the client understands its problem but does not know what tests it requires or their uncertainty and needs guidance on the tests required for solving its problem
- Use methods, including methods for estimating uncertainty, which meet the needs of the client (see clause 5.4.2 of ISO/IEC 17025). It should be noted that what clients want may not be what they need
- Consider the narrowness of limits on which decisions on conformance with specification are to be made
- Consider the cost effectiveness of the approach adopted.

In general, the degree of rigour needed isrelated to the level of risk that can be tolerated.

Rigorous consideration of individual sources of uncertainty, combined with mathematical combination to produce a measurement uncertainty, is considered appropriate for the most critical work, including the characterisation of reference materials. However, if an inappropriate model is used, this approach will provide an inadequate measurement uncertainty and is not necessarily better than the following approach.

Estimation of measurement uncertainty based on the overall estimate of precision through inter-laboratory studies, method validation or other quality control data, taking into consideration additional uncertainty sources, will be commonly used for chemical and microbiological testing. Additional sources that need to be considered may include sample homogeneity and stability, calibration/reference material, bias/recovery, and equipment uncertainty (where only one item of equipment was used in obtaining the precision data).

In general, if less rigour is exercised in estimating measurement uncertainty, the estimated measurement uncertainty value should be larger than an estimate obtained from a more rigorous approach. Semi-quantitative measurements require less rigorous treatment of measurement uncertainty.

When a compliance decision is clear, then a less rigorous approach to measurement uncertainty estimation may be justified.

If the estimated uncertainty in reported results means that results will be unacceptable to the laboratory's client or decision maker, or will be too large for determination of compliance with the specification, the laboratory should endeavour to reduce the uncertainty, e.g., through identification of the largest contributors to uncertainty and working on reducing these.

4.6 Collaborative Trials and Proficiency Testing

Proficiency testing precision may not always provide complete measurement uncertainty data if significant aspects (e.g. sampling or sample homogeneity) have not been taken into account.



Three examples are:

- Matrix differences may occur between the proficiency test samples and the samples routinely tested by a laboratory.
- Result levels may not be the levels routinely tested in a laboratory and/or may not cover the full range of levels encountered in routine laboratory work.
- Participating laboratories may use a variety of empirical methods (different measurand) to produce the PT results.

Statistical analysis of PT results may give an indication of the precision obtainable from a particular method.

4.7 Uncertainty Arising from Sampling

Measurement uncertainty strictly applies only to the result of a specific measurement on an individual specimen.

During contract review, there shall be consideration and agreement with the client as to whether the test result and uncertainty are to be applied to the specific sample tested or to the bulk from which it came.

Where sampling (or sub-sampling) is to be treated as part of the test (measurand), the uncertainty arising from such sampling shall be considered by the laboratory. Estimating the representativeness of a sample or set of samples from a larger population requires additional statistical analysis.

Where a test method includes specific sampling procedures designed to characterise a batch, lot or larger population, the measurement uncertainties for individual measurements are often insignificant relative to the statistical variation of the batch, lot or larger population. In cases where the measurement uncertainty of individual measurements is significant in relation to the sampling variation, the measurement uncertainty shall be taken into consideration when characterising the batch, lot or larger population.

Where the test method includes a specific sub-sampling procedure, it is necessary to analyse the representativeness of the sub-sample as part of the measurement uncertainty estimation. Where there is doubt about the representativeness of a sub-sample, it is recommended that multiple sub-samples be taken and tested to evaluate the representativeness of each.

Where only one sample is available and is destroyed during the test, the precision of sampling cannot be determined directly. However, the precision of the measurement system shall be considered. A possible method for estimation of the precision of sampling is to test a batch of homogeneous samples for a highly repeatable measurand and calculate the sampling standard deviation from the results obtained.

4.8 Reporting Measurement Uncertainty

If the laboratory has not estimated the uncertainties of its measurements then it will not know whether its test results are valid for the intended use of the client. This is why ISO/IEC 17025 makes it a requirement for laboratories to estimate uncertainties for all measurements. However, there will be occasions when the laboratory decides not to report its uncertainties to its clients. ISO/IEC 17025, however, requires reporting of uncertainties in some specific circumstances.

For quantitative test results, measurement uncertainty shall be reported where required by clause 5.10.3.1 c) of ISO/IEC 17025, which includes the following circumstances:

- When it is relevant to the validity or application of the result
- When a client's instructions so require
- When the uncertainty affects compliance with a specification limit

When measurement uncertainty is not reported under the provision of the third paragraph of clause 5.10.1 of ISO/IEC 17025, its absence shall not affect the accuracy of the conclusion or the clarity of the reported information, nor introduce any ambiguity in the information provided to the client.

Measurement uncertainty associated with results below the limit of quantification shall not be reported, as currently no reliable convention for this purpose exists.

The requirement to report measurement uncertainty when it is relevant to the validity or application of the test result will often need to be interpreted. In such cases, the client's needs and the ability of the client to use the information may be taken into consideration. Although in the short term some clients will not be in a position to make use of measurement uncertainty data, this situation can be expected to improve.



In this case, the laboratory may need to provide an interpretation of the results if the client could be misled by the numerical results alone.

When reporting measurement uncertainty, the reporting format described in the GUM is recommended. The results of the uncertainty estimations would normally be reported based on a level of confidence of 95%. The indiscriminate use of a coverage factor of 2 is not recommended. Not all combined uncertainties are normally distributed and, where practicable, the uncertainty appropriate to the 95% confidence level for the appropriate distribution should be used. The coverage factor used for calculating the expanded uncertainty should also be reported.
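One common case where a coverage factor of 2 is too small is a standard uncertainty based on few degrees of freedom. The sketch below assumes that Student's t distribution is the appropriate distribution and takes the coverage factor from a short table of standard two-sided 95% t values; the function names, the table selection and the example figures are illustrative assumptions, not requirements of this guide.

```python
# Two-sided 95 % Student's t values for selected degrees of freedom
# (standard table values, rounded to two decimals).
T_95 = {
    1: 12.71, 2: 4.30, 3: 3.18, 4: 2.78, 5: 2.57,
    6: 2.45, 7: 2.36, 8: 2.31, 9: 2.26, 10: 2.23,
    15: 2.13, 20: 2.09, 30: 2.04,
}


def coverage_factor(degrees_of_freedom):
    """Return a ~95 % coverage factor k for the given degrees of freedom.

    Uses the largest tabulated entry not exceeding the input, which errs
    on the conservative (larger k) side between and beyond table entries.
    """
    if degrees_of_freedom < 1:
        raise ValueError("degrees of freedom must be at least 1")
    usable = [dof for dof in T_95 if dof <= degrees_of_freedom]
    return T_95[max(usable)]


def expanded_uncertainty(standard_uncertainty, degrees_of_freedom):
    """Return (k, U) where U = k * u is the expanded uncertainty."""
    k = coverage_factor(degrees_of_freedom)
    return k, k * standard_uncertainty


# Hypothetical example: u = 0.40 units estimated from 11 replicates (10 degrees of freedom).
k, U = expanded_uncertainty(0.40, 10)
print(f"k = {k}, U = {U:.2f} units (approximately 95 % confidence)")
```

Reporting k alongside U, as in the final line, lets the client recover the standard uncertainty if needed.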

When reporting the test result and its uncertainty, the use of excessive numbers of significant figures shall be avoided. Unless otherwise specified, the primary result shall be rounded to the number of significant figures consistent with the measurement uncertainty. When the test method prescribes rounding to a level that implies greater uncertainty than the actual measurement uncertainty, the uncertainty implied by this rounding should be reported as the measurement uncertainty of the reported result.
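A minimal sketch of one way to round consistently is given below. It assumes the common convention of quoting the uncertainty to two significant figures and rounding the result to the same decimal place; the convention, the function name and the example values are assumptions for illustration only.

```python
import math


def round_to_uncertainty(result, uncertainty, sig_figs=2):
    """Round the uncertainty to `sig_figs` significant figures and the
    result to the same decimal place.

    Assumption: two significant figures in the uncertainty is an
    illustrative convention, not a requirement stated in this guide.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Position of the leading digit of the uncertainty (e.g. 0.234 -> -1).
    exponent = math.floor(math.log10(uncertainty))
    decimals = sig_figs - 1 - exponent
    return round(result, decimals), round(uncertainty, decimals)


# Hypothetical examples.
print(round_to_uncertainty(12.3456, 0.234))   # result quoted to two decimal places
print(round_to_uncertainty(1234.7, 56.7))     # result quoted to the nearest unit
```

A result of 12.3456 with an expanded uncertainty of 0.234 would therefore be reported as 12.35 +/- 0.23 rather than with all the digits produced by the calculation.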

Laboratories shall have the competence to interpret measurement results and their associated measurement uncertainty to their clients.

At present, microbiological laboratories may decide not to report uncertainty unless required by clients. This is because uncertainties are very large and clients are not yet prepared to understand and accept these.

When reporting microbiological uncertainties, a description of the procedure used to estimate the uncertainty should also be included because, of the various methods used around the world, some give significant underestimates of the actual uncertainty.

4.9 Determining Compliance with Specification

Decisions on when and how to report compliance or non-compliance vary according to requirements of the client and other interested parties. However, the laboratory shall take its measurement uncertainty into consideration appropriately when making compliance decisions / statements, and clients shall not be misled in relation to the reliability of such decisions.

In general, the principles described in APLAC TC 004 shall be followed.

Difficulties may arise where:

- Specifications do not quote the empirical method to be used, and a number of methods that give different results are available for the analyte.

- Results are close to a specification limit. One laboratory may state that a sample complies with the specification (their confidence interval for the result excludes the specification limit) whereas another laboratory may state that there is doubt as to whether or not that same sample complies with the specification (their confidence interval includes the specification limit).
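The second difficulty can be sketched as a simple decision rule. The function below is only an illustration for an upper specification limit; its name, its wording of the outcomes and the example figures are hypothetical, and it is not a statement of the APLAC TC 004 rules.

```python
def compliance_statement(result, expanded_uncertainty, upper_limit):
    """Classify a result against an upper specification limit.

    The interval result +/- expanded_uncertainty is compared with the
    limit; if the interval contains the limit, compliance is in doubt.
    """
    low = result - expanded_uncertainty
    high = result + expanded_uncertainty
    if high < upper_limit:
        return "complies"
    if low > upper_limit:
        return "does not comply"
    return "doubt: interval includes the specification limit"

# Two laboratories report the same result with different uncertainties
print(compliance_statement(9.5, 0.3, 10.0))  # complies
print(compliance_statement(9.5, 0.8, 10.0))  # doubt: interval includes the specification limit
```

The same sample and the same result lead to different statements purely because the two laboratories quote different uncertainties.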

4.10 Assessment for Accreditation

During assessment and surveillance of IANZ accredited laboratories, the assessment team shall evaluate the capability of the laboratory to estimate the uncertainty of measurements for all tests included in its accreditation scope. They shall check that the estimation methods applied are valid, all significant uncertainty components have been included, and all the IANZ criteria are met. The assessment team shall also ensure that the laboratory can achieve the claimed limits of detection.

4.11 Additional Interpretations for Chemical Testing

For empirical methods, the method bias is by definition zero and only individual laboratory and measurement standards bias effects need to be considered. Where results are corrected for recovery (from spiking experiments), the uncertainty associated with these spiking results should be included.

It is important to ensure that all appropriate effects are covered but not double accounted. Where appropriate, effects of matrix type and changes in concentration should be included in the measurement uncertainty.

Where proficiency testing precision data are used, it is important to ensure that the data used to estimate the measurement uncertainty are relevant. In particular, they should relate to the same measurand (same test method and matrix).



There may be issues if the proficiency programme includes laboratories with a wide range of backgrounds and skill levels.

In chemical testing, it is customary to evaluate the uncertainty at various selected levels of the analyte. However, when a measurement is being made to test for compliance with limit values, it is necessary to use an uncertainty value for test results close to the compliance limit. Therefore, it is useful to select the limit values as the levels at which the uncertainty is evaluated. This approach is most likely to provide the best estimate of the measurement uncertainty at levels adjacent to the limit values.

Professional judgement may be used for estimating the magnitude of uncertainty attributed to certain components where better estimates are not available or readily obtainable. In such cases, at least a short-term precision estimate of the component should be included in the evaluation. Professional judgement should not normally be used for significant uncertainty components. Where professional judgement has to be used for significant components, it must be based on objective evidence or previous experience. Measurement uncertainty estimates containing significant components evaluated by professional judgement shall not be used for applications demanding the most rigorous evaluations of uncertainty.

4.12 Additional Interpretations for Microbiological Testing

There are four main types of microbiological tests:

- General quantitative procedures
- MPN procedures
- Qualitative procedures
- Specialist tests, e.g. pharmaceuticals.

Various approaches to estimating uncertainty are available for general quantitative testing.

Quantitative microbiological methods are empirical because results are dependent on method-defined conditions such as temperature, incubation period and media. Therefore, they do not have a method bias contributor to their measurement uncertainty.

A few certified reference materials are available for quantitative microbiological tests. Where they are available, their certified results are mostly obtained from collaborative studies. Therefore only consensus values relating to the specified method are available and, as with all other microbiological test methods, these methods are also empirical. These reference materials are useful in demonstrating competence of the laboratory but should not be used to quantify bias.

4.12.1 Use of Precision Data

The use of precision data as described in the Eurachem / CITAC Guide is applicable for microbiology. Intermediate precision as set out in ISO 5725 is considered to incorporate most if not all of the significant uncertainty components of microbiological tests.

Any components of uncertainty not included in intermediate precision, such as performance of different batches of media, variation in incubator conditions (where only one incubator is available), etc., may be examined for significance by other statistical means or by redesign of the precision exercise.

The distribution of results from plate count tests is not normal but skewed (long right tail). Such data may first be transformed by taking the logarithm (log10) of each result, to obtain a close to normal distribution. The log standard deviation and confidence limits may then be calculated before anti-logging each confidence limit separately.

Where other components of uncertainty such as sampling or equipment variations need to be combined with a precision result that was estimated using logs and anti-logs, sophisticated mathematical calculations may be necessary.

For those plate count tests where test results are less than about 20 CFU, the result may well be below the limit of quantification (see Section 12) and laboratories are cautioned about reporting numerical values for such results.

At this stage, very little method performance data are available from collaborative trials or proficiency testing, although this situation may change in the future. Proficiency testing data may not always provide suitable measurement uncertainty data because significant aspects may not have been taken into account:

- Matrix differences may occur between the proficiency test samples and the samples routinely tested by a laboratory. Samples may have been specially homogenised and stabilised.

- The population levels may not be the levels routinely tested in a laboratory and/or may not cover the full range of population levels encountered in routine laboratory work.

- Participating laboratories may use a variety of empirical methods (different measurand) to produce the PT results.

However, statistical analysis of PT results may give an indication of the precision obtainable from a particular method.

4.12.2 Most Probable Number (MPN) Procedures

For MPN procedures it is traditional to refer to McCrady's Tables to obtain the test result as well as the 95% confidence limits. These data have been established statistically but possibly without considering all sources of uncertainty. Some versions of these tables also contain significant errors arising from the rounding of data during their preparation.

Laboratories are encouraged to identify unusual combinations of positive tubes and to reject such results. If this is done effectively, then the uncertainties quoted in the tables will, in the meantime, be regarded as a reasonable estimate of uncertainty for these methods. This may be regarded as an application of Note 2 of clause 5.4.6.2 of ISO/IEC 17025.

4.12.3 Note 2 of Clause 5.4.6.2 of ISO/IEC 17025

Note 2 may also be applicable to some areas of specialist testing, e.g. pharmaceutical microbiological assays, as the methods concerned include validation of the assay parameters, specify limits to the values of the major sources of uncertainty of measurement and define the form of presentation of calculated results.

4.12.4 Poisson Distribution

The Poisson distribution and confidence limit approaches as described in BS 5763 and ISO 7218 may significantly underestimate uncertainty as not all sources of uncertainty are taken into account. The distribution of particles / organisms in a liquid may be described using the Poisson distribution but other components of uncertainty associated with the test procedure are not included. Indeed some components of uncertainty such as dilutions and consistency of reading plates will not be described by a Poisson distribution.

For one-off analyses, the Poisson distribution approach will give a quick estimate of the uncertainty (see Note to 10.1.6 of BS 5763). However, this will likely give a significant underestimate of the actual uncertainty.

4.12.5 Negative Binomial Model

The negative binomial model described in ISO/TR 13843 may be more appropriate than Poisson alone as it covers the Poisson distribution plus over-dispersion factors.
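To see why Poisson alone tends to understate the uncertainty, the sketch below compares the relative standard deviation of a colony count under the two models. It uses the textbook forms only: Poisson variance equal to the mean, and negative binomial variance equal to mean + mean^2 / k. The dispersion parameter k and the example values are hypothetical, and the parameterisation used in ISO/TR 13843 may differ.

```python
import math

def rel_sd_poisson(mean_count):
    """Relative standard deviation when variance equals the mean."""
    return 1.0 / math.sqrt(mean_count)

def rel_sd_negative_binomial(mean_count, k):
    """Relative standard deviation when variance = mean + mean**2 / k.

    k is the dispersion parameter (hypothetical value in the example);
    smaller k means more over-dispersion.
    """
    return math.sqrt(1.0 / mean_count + 1.0 / k)

count = 100
print(rel_sd_poisson(count))                # 0.1
print(rel_sd_negative_binomial(count, 25))  # larger than the Poisson value
```

For a count of 100 colonies the Poisson term contributes 10% relative standard deviation; any over-dispersion from dilution, plating and reading adds to this rather than replacing it.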

5 Estimating Uncertainty Of Measurement Step By Step

5.1 Specify the measurand and decide if it is empirical or traceable to SI

A clear and unambiguous statement of what is being measured must be prepared.

Where the method is an empirical method (many chemical tests and all microbiological tests), some careful thought is needed because what is actually being measured is usually not what the name of the method implies.

For example, 'fat in milk' by gravimetric determination (Roese Gottlieb) is actually all the substances which extract from the milk sub-sample using the specified solvent, time, temperature and equipment and which remain after the drying stage. If some oil or grease had spilled into the sample these would also be measured as 'fat', as would any oily contaminations from reagents or equipment. Some volatile 'fat' components may also be lost during the drying stage. Maybe the client wants to know the fat in the sample and not the sub-sample. This is a different measurand and the uncertainty of sub-sampling will need to be taken into consideration. If the client wants to know the fat in the shipment then this is a different measurand again and homogeneity of the shipment will need to be taken into consideration. It is also noted that not all countries define fat in terms of lipid content. They may also define fat in terms of fatty acid content.

Another example is 'total petroleum hydrocarbons in soil'. Perhaps the measurand is the material in the sub-sample which is extracted by the specified solvent and which absorbs infra-red light at the specified wavelength and under the specified conditions of dilution, cell length etc as compared with a specified mixture of hydrocarbons.

One should not forget the material which is removed (presumably fats) if the clean-up step is invoked. But wasn't the client interested in the whole sample and not the sub-sample?

A microbiological example may be 'Listeria in Milk'. The measurand may be the result of the specified calculation from the colonies which are counted when the sub-sample is resuscitated, diluted, plated, incubated (temperature and time) and otherwise treated as specified in the method.

Further examples of typical tests done in chemical testing laboratories, which are empirical and which require careful specification of the measurand, are:

- Permanganate value
- Chemical Oxygen Demand
- Biochemical Oxygen Demand of Water
- Total Suspended Solids of Water
- Weak Acid Extractable or Lead in Soil
- Available Phosphorus in Soil
- Crude or Dietary Fibre in Food
- Lean Meat in Sausages
- Total Organic Nitrogen in Sausages
- Free Sulphur Dioxide in Wine
- Mercury in Fish
- Aflatoxins in Peanuts
- Arsenic, Antimony, Cadmium in Ceramics
- Flash Point of Avgas
- Acid and Base Number of Oil
- Cholesterol in Blood
- Cyanide in Stomach Contents
- Accelerants in Fire Debris.

The list goes on, covering the large majority of all chemical test methods accredited by IANZ, which are empirical and are not traceable to the mole. Change the method and one has a different measurand giving a different answer.

All require careful definition before one can decide which components of uncertainty should be included in uncertainty estimates.

Questions also arise about whether the blank is pre-determined and subtracted and whether spike experiments give recovery results which are used to correct for method bias in an attempt to provide some traceability to the mole.

5.2 Write the formula for calculating the results

An example equation for a pesticide in bread being analysed by solvent extraction followed by gas liquid chromatography using an external standard is:

$$R = \frac{I_{samp} \cdot c_{ref} \cdot V_{extr}}{I_{ref} \cdot Rec \cdot m_{samp}} \ \mu\text{g/g} \qquad \text{(Formula 1)}$$

where
R = pesticide result in sample (mg/kg)
I_samp = peak intensity of sample extract
I_ref = peak intensity of reference standard
c_ref = reference standard concentration (µg/ml)
V_extr = final volume of extract (ml)
Rec = recovery
m_samp = mass of investigated sub-sample (g)
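Formula 1 can be written directly as a small function. This is a sketch only: the function name and the input values are hypothetical, and recovery is assumed to be entered as a fraction (0.85 for 85%).

```python
def pesticide_result(i_samp, i_ref, c_ref, v_extr, rec, m_samp):
    """Formula 1: pesticide result in ug/g (equivalent to mg/kg).

    i_samp, i_ref : peak intensities of sample extract and reference
    c_ref         : reference standard concentration (ug/ml)
    v_extr        : final volume of extract (ml)
    rec           : recovery as a fraction (assumed convention)
    m_samp        : mass of sub-sample (g)
    """
    return (i_samp * c_ref * v_extr) / (i_ref * rec * m_samp)

# Hypothetical inputs, for illustration only
r = pesticide_result(i_samp=1200, i_ref=1000, c_ref=0.5,
                     v_extr=10.0, rec=0.85, m_samp=2.0)
print(round(r, 2))  # 3.53
```

Writing the formula out this way makes it easier to see later which inputs carry their own uncertainty and which are already covered by the precision data.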

5.3 Identify and list all possible sources of uncertainty

To assist with this process, it may be helpful to flowchart the entire method from sampling or sub-sampling to the final result, including listing all materials and equipment used.

Items to note are:

- Start at the beginning: is sampling included in the measurand?
- Consider all inputs such as set-up of equipment and preparation of reagents
- Note control items such as masses, volumes, temperatures, times, pressures, concentrations
- Note sub-sampling and sample preparation / dilution / digestion / extraction steps
- Note reference materials and their stated purities / values
- Are you using standard additions or spikes? Are you correcting for recoveries or using these only as quality controls?
- Are you doing duplicates and reporting their means?
- Are you correcting for an average blank or is a blank run with each sample or batch?
- Are dilutions made prior to presentation of samples to the instrument?
- What are the plate counting processes?

Each step of the method should be examined to identify possible sources of uncertainty.

Variations that relate to possible bias:

- Instrument calibration errors or drift
- Calibration curve fitting errors
- Balances, glassware or thermometer calibrations
- Reference material impurities
- Recoveries and blanks.

Random variations:

- Inadequate sample storage
- Sample homogeneity, sampling, sub-sampling, clumping, dilutions
- Sample storage and transport variations
- Sub-sampling
- Random sample matrix effects, stability, variability, form of binding of analyte (different to spike)
- Day to day differences
- Reagent variations
- Reaction stoichiometry: departures from expected chemistry or incomplete reactions
- Measuring and weighing
- Measurement conditions: temperature and humidity effects on measurements and sample stability, etc
- Errors in equivalence point detection
- Variations in volumetric glassware calibrations (tolerances)
- Repeatability of the method
- Instrument settings; readings; other effects
- Electronic bleeps on instruments and computers
- Computational effects: straight line or curved calibration and rounding too soon
- Blank correction: blank uncertainty and appropriateness to subtract blank
- Operator effects, colour changes, plate readings
- Other random effects (always include this).

Errors which are not considered for uncertainty estimates but which should be avoided or corrected:

- Incorrect sample
- Wrong client information
- Containers or exhibits mislabelled
- Samples preserved incorrectly
- Wrong method choice
- Errors on method card
- Wrong dilutions or dilution factors
- Basic chemistry wrong
- Determining incorrect species
- Matrix interference
- Wrong recovery factor applied
- Wrong chemicals or low quality chemicals
- Wrong units used or reported
- Incorrect standard solutions
- Calculation programme errors
- Computer programme errors
- Reporting wrong species
- Number transfer errors
- Calculation or data entry errors
- Sample tube mix-ups
- Typing errors.

5.4 List available data

Such data as calibration certificate uncertainties, inter-laboratory, in-house or published precision data, purity of reference materials, method validation data, etc should be identified.

Examples are:

- Method validation / verification data such as replicates of various samples or matrix reference materials under various conditions (intermediate precision)
- QC replicates or house standard repeat analyses under various conditions (intermediate precision) or same conditions (repeatability r)
- Inter-laboratory comparison statistics (reproducibility R)
- Published precision data from method validations (r and R)
- Certificates of instrument or reference material calibrations
- In-house equipment calibration data
- Published experimental research data.

5.5 Select suitable precision data

Select available precision data covering the maximum number of sources of uncertainty and express these as a standard deviation.

ILCP precision (R) will probably include variations from reference materials and instrument calibrations. However, it may not include real sample and sub-sampling variations as samples are usually specially prepared, homogenised and stabilised. Results from outlier laboratories may distort the R value.

Intermediate precision will probably not include variations in reference materials (same used for long periods) or larger equipment (only one in the lab) or uncertainty in the calibration of equipment. If intermediate precision is determined on real samples and samples are split before the sub-sampling procedure then sub-sampling variation will be included.

Precision data may not include the full variations in control points that are permissible within the defined method, because most labs tend to set these points to the centre of the permissible range (except during robustness experiments for method validation). Some measure of this may be desirable. When relying on published precision data (e.g. published R), the laboratory must demonstrate that its own precision (r) is comparable with the published r.



If results are to be corrected for method bias / recovery (using results from spiking studies) then the uncertainty of these spike tests should be estimated and incorporated.

Precision frequently varies significantly with the level of the result. The precision data should be adjusted for concentration of the analyte or a relationship established between precision and concentration.

For microbiological testing, precision data derived from the skewed distribution will have the upper confidence limit significantly further from the result than the lower one.

5.6 Calculate intermediate precision

5.6.1 Data quality

The design of the data collection procedure is the most important aspect of any estimation of intermediate precision. Data must include variations caused by different days, operators, equipment, laboratories, reagents etc if realistic estimations of intermediate precision are to be obtained.

Replicates analysed by the same analyst at the same time are likely to give an unrealistically small precision result (repeatability (r) only).

For estimating reproducibility, results obtained from several different laboratories, each analysing a sub-sample from the same sample or batch, are required.

Intermediate precision may be obtained from analyses by different analysts on different days using, where possible, different equipment, materials, standards, etc and using standard procedures usually applied in the laboratory.

Use of a statistical formula without careful thought is undesirable. There needs to be careful attention to the sources of variability that should be considered in getting a realistic design for the collection of data to estimate intermediate precision.

Because R and intermediate precision include r in addition to other significant components of precision, an estimation of one or both of these will be the most useful.

In some circumstances a laboratory may be asked to certify the composition of a batch of material. Here, sampling variation and product uniformity within the batch become important and these variations may be so large as to render analytical variations insignificant.

5.6.2 Assumptions of normality

It is assumed throughout this paper that replicate results form a normal distribution about a mean. There should, however, be coarse checks that this assumption is reasonable. If there are few results, they should be plotted out along a line. If many, a histogram or similar display may be drawn (see Figure 1). Alternatively, the use of normal probability plots can demonstrate normality of a data set.

Figure 1: Distribution of replicate results, showing the mean and the standard deviation.

Indications that the assumption is not valid are:

- several results lie well away from the rest
- a scatter of values that is quite asymmetric, e.g. values may be closely bunched at the high or low end of the scale.

Microbiological replicate results are often found to be bunched at the low end (long tail at the upper end) of the scale. Such results may be transformed by taking log10 values of each test result to give a log distribution which is closer to normal. Normal statistics may then be applied for the calculation of precision.

5.6.3 Replicate analyses in one laboratory

In a good routine testing laboratory, it is normal for at least a selection of analyses to be carried out in duplicate or triplicate (n = 2 or 3). This allows only a very inaccurate estimate of standard deviation to be calculated for each sample. However, where a material such as a matrix reference material or a house standard is analysed repeatedly, and especially under varying conditions, the estimation of a standard deviation is more reliable.

The formula is:

$$S^2 = \frac{\sum (y - \bar{y})^2}{n - 1} \qquad \text{(Formula 2)}$$

where
S^2 = the variance
S = estimated standard deviation
n = number of results
y = an individual test result
$\bar{y}$ = mean of individual results = $\sum y / n$
$y - \bar{y}$ = the deviation of each result from the mean

The calculation of the standard deviation using Formula 2 is illustrated with the following examples.

Example 1
The determination of nickel in an alloy

The sample gave results of 4.26, 4.18, 4.23 and 4.27% by mass.

Table 1

| Results | Deviation from average (absolute) | Deviation squared |
|---|---|---|
| 4.26 | 0.025 | 0.000625 |
| 4.18 | 0.055 | 0.003025 |
| 4.23 | 0.005 | 0.000025 |
| 4.27 | 0.035 | 0.001225 |
| Sum 16.94 | 0.12 | 0.0049 |

Sample size (n) = 4
Mean ($\bar{y}$) = 4.235

$$S^2 = \frac{0.0049}{3}$$

S = 0.0404

More replicates will give a more reliable estimate of precision.

Example 2
The determination of coliform bacteria in ground beef

The sample gave the following CFU per gram results:

3.0x10^5, 2.7x10^5, 3.5x10^5, 1.0x10^6, 3.3x10^5, 2.5x10^5, 3.1x10^5, 3.3x10^5

Table 2

| Results | Log10 results | Log10 difference | Log10 difference squared |
|---|---|---|---|
| 300000 | 5.477 | -0.070 | 0.00493 |
| 270000 | 5.431 | -0.116 | 0.01346 |
| 350000 | 5.544 | -0.003 | 0.00001 |
| 1000000 | 6.000 | 0.453 | 0.20488 |
| 330000 | 5.519 | -0.029 | 0.00083 |
| 250000 | 5.398 | -0.149 | 0.02233 |
| 310000 | 5.491 | -0.056 | 0.00314 |
| 330000 | 5.519 | -0.029 | 0.00083 |

| Quantity | Value |
|---|---|
| Log10 mean | 5.547 |
| Sum of log differences squared | 0.25041 |
| Log standard deviation | 0.18914 |
| Log confidence limits (95%) | +/- 0.37827 |
| Log upper limit | 5.926 |
| Log lower limit | 5.169 |
| Upper limit | 842624 |
| Lower limit | 147600 |
| Mean | 352663 |

Thus the mean is 3.5x10^5 and the 95% confidence interval is from 1.5x10^5 to 8.4x10^5.
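The two examples can be reproduced with Python's standard library. This is a sketch only: the helper name is hypothetical, and the coverage factor of 2 is taken from Table 2, where the log confidence limits are twice the log standard deviation.

```python
import math
import statistics

def log_confidence_limits(counts, k=2.0):
    """Return (lower, centre, upper) after a log10 transformation.

    The limits are calculated on the log scale and each one is
    anti-logged separately, as in Example 2. k = 2 is the coverage
    factor used in Table 2.
    """
    logs = [math.log10(c) for c in counts]
    log_mean = statistics.mean(logs)
    log_sd = statistics.stdev(logs)  # n - 1 in the denominator (Formula 2)
    lower = 10 ** (log_mean - k * log_sd)
    upper = 10 ** (log_mean + k * log_sd)
    return lower, 10 ** log_mean, upper

# Example 1: nickel in an alloy (% by mass)
nickel = [4.26, 4.18, 4.23, 4.27]
print(statistics.mean(nickel), statistics.stdev(nickel))

# Example 2: coliform bacteria in ground beef (CFU per gram)
coliforms = [3.0e5, 2.7e5, 3.5e5, 1.0e6, 3.3e5, 2.5e5, 3.1e5, 3.3e5]
lower, centre, upper = log_confidence_limits(coliforms)
print(round(lower), round(centre), round(upper))
```

The anti-logged limits are not symmetric about the centre value: the upper limit lies much further from it than the lower one, which is the behaviour described for skewed microbiological data.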

5.6.4 Alternative data collection

To avoid the need to analyse one sample many times for improving standard deviation estimates, data from replicate (e.g. duplicate) analyses may be used. Replicates may include those from samples with (slightly) different results or those from the same sample analysed in different laboratories. Such routine replicates, which encompass all variations, can then be used to calculate much better estimates of precision than where only a few replicates on one sample are available.

An example model for data collection for intermediate precision, which incorporates many of the components of uncertainty for a test method, is presented in Table 3.

The sample results are paired up to form 15 sets of duplicates (30 test results).



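Example 3
Duplicates for free sulphur dioxide in wine

Table 4

| Sample | Result 1 | Result 2 | y_i1 - y_i2 | (y_i1 - y_i2)^2 |
|---|---|---|---|---|
| 1 | 42 | 45 | -3 | 9 |
| 2 | 68 | 63 | 5 | 25 |
| 3 | 50 | 51 | -1 | 1 |
| 4 | 38 | 41 | -3 | 9 |
| 8 | 43 | 39 | 4 | 16 |
| 3 | 48 | 47 | 1 | 1 |
| 5 | 29 | 25 | 4 | 16 |
| 6 | 60 | 57 | 3 | 9 |
| 1 | 44 | 46 | -2 | 4 |
| 2 | 66 | 67 | -1 | 1 |
| 1 | 47 | 46 | 1 | 1 |
| 5 | 27 | 28 | -1 | 1 |
| 7 | 18 | 20 | -2 | 4 |
| 8 | 44 | 43 | 1 | 1 |
| 9 | 71 | 62 | 9 | 81 |

$$\sum (y_{i1} - y_{i2})^2 = 179$$

Thus: p = 15

$$S_R^2 = \frac{\sum (y_{i1} - y_{i2})^2}{2p} = \frac{179}{30} = 5.97$$

and S_R = 2.44

Example 4
Duplicates for Aerobic Plate Count at 35 °C on ready-to-eat meals:

Table 5

| Duplicate 1 | Duplicate 2 | Log10 of duplicate 1 | Log10 of duplicate 2 | Difference | Difference squared |
|---|---|---|---|---|---|
| 1300000 | 1900000 | 6.114 | 6.279 | -0.165 | 0.0272 |
| 16000000 | 14000000 | 7.204 | 7.146 | 0.058 | 0.0034 |
| 4200000 | 5000000 | 6.623 | 6.699 | -0.076 | 0.0057 |
| 1700000 | 1600000 | 6.230 | 6.204 | 0.026 | 0.0007 |
| 13000 | 9600 | 4.114 | 3.982 | 0.132 | 0.0173 |
| 2800 | 4100 | 3.447 | 3.613 | -0.166 | 0.0274 |
| 4700000 | 3200000 | 6.672 | 6.505 | 0.167 | 0.0279 |
| 1300000 | 1000000 | 6.114 | 6.000 | 0.114 | 0.0130 |
| 360000 | 270000 | 5.556 | 5.431 | 0.125 | 0.0156 |
| 2900000 | 1400000 | 6.462 | 6.146 | 0.316 | 0.1000 |
| 2110000 | 1600000 | 6.324 | 6.204 | 0.120 | 0.0144 |
| 100000 | 120000 | 5.000 | 5.079 | -0.079 | 0.0063 |
| 2700000 | 2000000 | 6.431 | 6.301 | 0.130 | 0.0170 |
| 730000 | 1400000 | 5.863 | 6.146 | -0.283 | 0.0800 |
| 640000 | 440000 | 5.806 | 5.643 | 0.163 | 0.0265 |
| 2200000 | 1500000 | 6.342 | 6.176 | 0.166 | 0.0277 |
| 280000 | 450000 | 5.447 | 5.653 | -0.206 | 0.0425 |
| 54000 | 64000 | 4.732 | 4.806 | -0.074 | 0.0054 |
| 15000 | 11000 | 4.176 | 4.041 | 0.135 | 0.0181 |
| 81000 | 100000 | 4.908 | 5.000 | -0.092 | 0.0084 |
| 25000 | 17000 | 4.398 | 4.230 | 0.167 | 0.0281 |

| Quantity | Value |
|---|---|
| P | 20 |
| Sum of differences squared | 0.5125 |
| Log S_R^2 (divide by 2p) | 0.0128 |
| Log S_R | 0.1132 |
| Log CL (95%, t = 2) | +/- 0.226 |
| Upper limit (log) | 5.828 |
| Overall mean (log) | 5.601 |
| Lower limit (log) | 5.375 |
| Upper limit | 672474 |
| Mean | 399291 |
| Lower limit | 237084 |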
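The duplicate calculation used in Examples 3 and 4 (sum of squared differences divided by 2p) can be sketched as follows. The function name is hypothetical; the data are the fifteen pairs from Table 4.

```python
import math

def sd_from_duplicates(pairs):
    """Standard deviation estimated from p pairs of duplicate results.

    Implements S^2 = sum((y1 - y2)^2) / (2p), as used in Examples 3 and 4.
    """
    p = len(pairs)
    sum_sq = sum((y1 - y2) ** 2 for y1, y2 in pairs)
    return math.sqrt(sum_sq / (2 * p))

# Table 4: free sulphur dioxide in wine
wine = [(42, 45), (68, 63), (50, 51), (38, 41), (43, 39),
        (48, 47), (29, 25), (60, 57), (44, 46), (66, 67),
        (47, 46), (27, 28), (18, 20), (44, 43), (71, 62)]

print(sum((a - b) ** 2 for a, b in wine))  # 179
print(round(sd_from_duplicates(wine), 2))  # 2.44
```

For plate counts, the same function would be applied to the log10 values of each pair, as in Table 5.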



5.6.6 Outlying results

Extreme outlying results may be identified by statistical means, for example Cochran's maximum variance test described in ISO 5725.

Where outliers or stragglers are identified, steps should be taken to explain them by identifying some technical error, computational error, clerical error, wrong sample, etc and to correct them. Action should then be taken to prevent their recurrence.

Results should only be rejected when there is a valid explanation that they are outliers. Otherwise all results should remain in the data to be used.
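As an illustration of the maximum variance idea, the sketch below computes Cochran's statistic (the largest within-group variance divided by the sum of all within-group variances) for the duplicate data of Table 4. The function name is hypothetical. The critical values needed to decide whether a result is a straggler or an outlier must be taken from the tables in ISO 5725 and are not reproduced here.

```python
import statistics

def cochran_statistic(groups):
    """Largest within-group variance divided by the sum of variances.

    Each group holds the replicate results for one sample; all groups
    are assumed to contain the same number of replicates.
    """
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / sum(variances)

# Duplicates from Table 4 (free sulphur dioxide in wine)
wine = [(42, 45), (68, 63), (50, 51), (38, 41), (43, 39),
        (48, 47), (29, 25), (60, 57), (44, 46), (66, 67),
        (47, 46), (27, 28), (18, 20), (44, 43), (71, 62)]

c = cochran_statistic(wine)
print(round(c, 3))  # 0.453, driven by the last pair (71, 62)
```

A large statistic only flags a pair for investigation; as stated above, the result should still be retained unless a valid explanation is found.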

5.7 List the sources of uncertainty not covered by selected precision data

By studying the design of the project used to obtain the data for the precision calculations and considering the list of all sources of uncertainty, the analyst can decide which components have been covered by the precision data and which have not.

Assuming a well designed precision experiment where replicate samples are carried through the whole procedure including sub-sampling, then some common items which may not have been included are:

- Balance uncertainty: Where the laboratory always uses the same balance for weighing, say, a small quantity of the reference material (used for calibrating the equipment), this may be a significant item not included in the precision estimate. For example the lab may weigh out 2 mg of material on a five-place balance with an uncertainty (on the calibration certificate) of +/- 0.08 mg, to make up a standard. This reference material may be used for a few weeks or months before it is replaced.

- Purity of reference material: The certificate with the reference material may state that it is better than (say) 90% pure. This can be taken to mean that the purity is somewhere (anywhere) between 90 and 100% but not outside these limits.

- Equipment: Where the laboratory always uses only the same item of equipment for the particular tests, the uncertainty (unknown bias) associated with that equipment will not be included in the precision estimate.

- Recoveries and blanks: Where test results are corrected for a fixed recovery result or for a fixed blank value, the uncertainty associated with these values will not be included in the precision estimate. These uncertainties may be as large as the precision for the actual test and will need to be estimated.

- Sample preparation and sub-sampling: If replicates for the precision project are taken after sample preparation and/or sub-sampling, then the uncertainty associated with sample preparation and/or sub-sampling will not be included in the precision estimate.

There may be other significant items on the particular list, which also need to be identified and their uncertainties estimated separately.

5.8 Estimate or locate information on the uncertainties of each of these identified additional sources and express each as a standard deviation.

Taking the above items:

- Balance: The uncertainty for the balance will be on the balance calibration certificate. It should be as a standard deviation or may be converted to one by dividing by the coverage factor, which should also be on the certificate if the expanded uncertainty has been quoted.

The best accuracy (95% confidence limit) of 0.00016 g is divided by the coverage factor of 2 to obtain the standard deviation for this component of 0.00008 g or 0.08 mg.

- Reference material: If the certificate on the reference material indicates that it is better than 90% pure and the laboratory always uses this same reference material to calibrate its equipment, then the uncertainty allowance would be based on treating this as a rectangular distribution (see Figure 2) from 90 to 100%.

Figure 2: Rectangular distribution from 90% to 100%, of full width 2a, with standard deviation a / √3.

The rule for rectangular distributions for calculating the standard deviation is:

- divide the interval by 2 and then by the square root of 3,
- i.e. (100 - 90) / 2 / √3 = 2.89% for the mean value of 95% purity.

- Recovery: From a series of spiking trials, the recovery may have been found to be 85% +/- 6% at 95% confidence. Assuming the coverage factor (often referred to by the symbol k) was 2 (that is, the reported confidence limits are about 95% confidence), then the standard deviation for this contribution will be 3% for the 85% recovery result.

- Predetermined blank values may be similarly evaluated.

- Homogeneity: Previous special trials to assess the homogeneity of a sample may have been conducted by analysing a component of the sample which is stable and gives results with very small test method uncertainty values.

For example, the precision (as a standard deviation) for the selected stable component, in a reference solution, may be 1% of the test result while its precision (as a standard deviation) for a series of sub-samples from the same primary sample may be 5% of the result. In this case, the standard deviation attributable to the sub-sampling would be √(5^2 - 1^2) = 4.9% of the result (relative standard deviation).
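The conversions used in this section can be collected into three small helpers. This is a sketch with hypothetical function names; the numbers are those quoted above for the balance, the reference material purity, the recovery and the homogeneity trial.

```python
import math

def sd_from_expanded(expanded_uncertainty, k=2.0):
    """Convert an expanded uncertainty to a standard deviation."""
    return expanded_uncertainty / k

def sd_rectangular(lower, upper):
    """Standard deviation of a rectangular distribution: half-width / sqrt(3)."""
    return (upper - lower) / 2.0 / math.sqrt(3.0)

def sd_remaining(total_sd, known_sd):
    """Component left after removing a known contribution in quadrature."""
    return math.sqrt(total_sd ** 2 - known_sd ** 2)

print(sd_from_expanded(0.16))             # balance, mg
print(round(sd_rectangular(90, 100), 2))  # purity, %      -> 2.89
print(sd_from_expanded(6.0))              # recovery, %    -> 3.0
print(round(sd_remaining(5.0, 1.0), 1))   # sub-sampling, % -> 4.9
```

Each helper returns a standard deviation, which is the form required before the components can be combined in section 5.10.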

5.9 Identify how each component of uncertainty fits into the formula for calculating the test results.

An example of a formula for calculating the test results is extended from Formula 1 to

$$R = \frac{I_{sample} \cdot C_{ref} \cdot V_{ext} \cdot P \cdot H}{I_{ref} \cdot Rec \cdot m_{sample}} \ \mu\text{g/g}$$

Where components of uncertainty are not in the equation, for example overall precision P and sample homogeneity H (sub-sampling), these need to be included appropriately. In this case they are both included as multipliers. Other components are already included within the equation.

Once the precision function is included as a multiplier (result x 1, as it does not change the result), any uncertainty components in the equation which are already included in the precision experiment need to be deleted.

In this case the two I components (peak heights) and the V (volume) component will be included in precision.

The mass of sample component will be insignificant as (say) a four-place balance was used to weigh 2 grams of sample. However, the uncertainty of the mass of the reference material used to make up the standard solution will be significant (see section 5.7 above). This would come under the C_ref component.

The uncertainty associated with the purity of the reference material would also come under the C_ref component.

5.10 Estimate the overall uncertainty forthe method by combining eachsource value using the appropriateformulae. The rules for combininguncertainties are as follows. Alluncertainty components areexpressed as standard deviations.

Rule 1:Where the calculation of themeasurand (test result) involves only adding orsubtracting quantities, (e.g. y = p + q - r) thecombined uncertainty is calculated using

Formula 5.

......u(r)u(q)u(p)u 222c ++= Formula 5

Rule 2:Where the calculation of themeasurand involves multiplying or dividingquantities (e.g. y = p x q / r) the combineduncertainty is calculated using formula 6 tocombine the relative uncertainties.

......[u(r)/r][u(q)/q][u(p)/p]yu 222c ++= Formula 6

Where there is a mixture of additions and multiplications in the measurand formula, the combined uncertainties of the components added or subtracted are first calculated (Formula 5) and then the overall combined uncertainty is calculated for the remaining (combined or single) multiplied or divided components using their relative uncertainties (Formula 6).

For combining precision and homogeneity components, use Rule 2 (Formula 6).

In the example measurand equation above, all components are either multiplied or divided and therefore Rule 2 (Formula 6) will be used.
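The two combination rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `combine_absolute` and `combine_relative` and the numbers passed to them are hypothetical and do not come from this guide.

```python
import math

def combine_absolute(uncertainties):
    """Rule 1 (Formula 5): root sum of squares of standard uncertainties,
    for results formed only by adding or subtracting quantities."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))

def combine_relative(values_and_uncertainties):
    """Rule 2 (Formula 6): relative combined uncertainty u_c / y,
    for results formed only by multiplying or dividing quantities.
    Each item is a (value, standard uncertainty) pair."""
    return math.sqrt(sum((u / x) ** 2 for x, u in values_and_uncertainties))

# Hypothetical inputs, chosen only to show the arithmetic.
u_sum = combine_absolute([0.3, 0.4])                 # y = p + q
rel_u = combine_relative([(10.0, 0.1), (20.0, 0.4)])  # y = p * q
print(round(u_sum, 3), round(rel_u, 4))
```

For a mixed formula, the added or subtracted terms would be combined first with `combine_absolute`, and the result then passed, together with its value, into `combine_relative`.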

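The calculation for overall uncertainty is

$$\frac{U_R}{R} = \sqrt{\left[\frac{U(mass)}{mass}\right]^2 + \left[\frac{U(ref)}{ref}\right]^2 + \left[\frac{U(Prec)}{Prec}\right]^2 + \left[\frac{U(Homog)}{Homog}\right]^2 + \left[\frac{U(Rec)}{Rec}\right]^2}$$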

This includes relative uncertainties for the mass of reference material weighed, the purity of the reference material, the precision, the homogeneity of the sample (ability to sub-sample) and the recovery (from spike experiments). The uncertainty of the balance used for weighing the sample is usually not significant but this should be checked.

Using some results from above (section 5.8):

Balance Calibration: ± 0.08 mg
Mass of Reference Material (say): 2.0 mg
Uncertainty of RM Purity: ± 2.89 %
Reference Material Purity: 95 %
Precision as SD (say): ± 0.1 µg/g
Typical Test Result (say): 2.0 µg/g
Homogeneity (2 µg/g × 4.9%): ± 0.098 µg/g
Recovery Precision as SD: ± 3 %
Recovery: 85 %

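$$\frac{U_R}{2} = \sqrt{\left[\frac{0.08}{2.0}\right]^2 + \left[\frac{2.89}{95}\right]^2 + \left[\frac{0.1}{2}\right]^2 + \left[\frac{0.098}{2}\right]^2 + \left[\frac{3}{85}\right]^2}$$

$$u_R = 2 \times \sqrt{0.0016 + 0.00093 + 0.0025 + 0.0024 + 0.0013} \ \mu g/g$$

$$u_R = 0.18 \ \mu g/g$$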

The largest components here are the precision and the sample homogeneity (sub-sampling). If these were the only components incorporated, then the uncertainty result would have been 0.14 µg/g.

Components that are less than about 1/3 of the major components have little effect on the overall uncertainty.

A well-designed precision experiment could have included sub-sampling in the precision component and thus the precision alone would have been a reasonable (but a little small) estimate of uncertainty. For many chemical tests, this will be the case but until the laboratory has gone through the exercise they are not in a position to know that.
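The worked example in this section can be checked with a short script. The sketch below assumes the input values listed above; the dictionary keys and the function name `combined_uncertainty` are illustrative only. Evaluated without intermediate rounding, the full combination comes to roughly 0.186 µg/g, close to the 0.18 µg/g quoted above, and the precision-plus-homogeneity figure comes to 0.14 µg/g.

```python
import math

# (value, standard uncertainty) pairs taken from the worked example.
components = {
    "reference mass (mg)": (2.0, 0.08),
    "reference purity (%)": (95.0, 2.89),
    "precision (ug/g)": (2.0, 0.1),
    "homogeneity (ug/g)": (2.0, 0.098),
    "recovery (%)": (85.0, 3.0),
}
RESULT = 2.0  # typical test result, ug/g

def combined_uncertainty(result, parts):
    """Formula 6: combine relative uncertainties, then scale by the result."""
    relative_squared = sum((u / x) ** 2 for x, u in parts)
    return result * math.sqrt(relative_squared)

u_all = combined_uncertainty(RESULT, components.values())
u_major = combined_uncertainty(
    RESULT, [components["precision (ug/g)"], components["homogeneity (ug/g)"]]
)
print(f"all components: {u_all:.3f} ug/g")
print(f"precision and homogeneity only: {u_major:.3f} ug/g")
```

Dropping the three smaller components changes the estimate by about a quarter here, which shows why the minor terms matter less than the two dominant ones but are still worth checking once.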

5.11 Uncertainty for Microbiological Tests

For microbiological tests, because of the complications of the skewed distribution, the estimate of uncertainty from a well-designed intermediate precision experiment will be regarded as a reasonable estimate of the uncertainty of results.

Where MPN tables are used to obtain microbiological results, the 95% confidence limits quoted in the tables will be regarded as a reasonable estimate of the uncertainty of these results in the meantime.

6 Expanded Uncertainty and Confidence Limits

The confidence interval is the range within which the true or best possible result is likely to lie at the stated confidence (e.g. 95%).

The confidence limits are the bounds of the confidence interval. The confidence limits may be referred to as expanded uncertainty.

If the typical method uncertainty as a standard deviation $S_R$, for a single result (or mean of replicates), has been calculated as in the previous sections using Formulae 2 or 3 for microbiological testing and Formulae 5 and/or 6 for chemical testing, then the confidence limits can be determined from Formula 7.

$$\text{Confidence Limits} = \bar{y} \pm \frac{t \cdot S_R}{\sqrt{n}} \qquad \text{(Formula 7)}$$

where:

$\bar{y}$ = mean of replicates (e.g. duplicates) done on this new sample
n = number of replicates done on this new sample
t = coverage factor from statistical t-tables

This formula applies where the overall uncertainty relates to a normal distribution of test results and the major component of uncertainty is precision, which is the case for chemical and logged microbiological results.

Formula 7 shows that the mean of a duplicate is $\sqrt{2}$ times more precise than an individual test result ($\sqrt{n}$ for n replicates). However, this only applies to the precision component of the uncertainty and therefore replicate testing cannot be used to reduce uncertainty to negligible proportions.

At this stage the analyst must decide what level of confidence is required, i.e. how likely it is that the confidence interval will contain the true result.

For most purposes a 95% confidence is acceptable. This means that the confidence interval for a particular result would contain the true result for 95 out of 100 tests. The 95% level is the confidence level recommended for use in the ISO 5725 standard, and by the UK Water Research Centre TR 66.



This method of calculating the confidence interval gives a 95% probability of capturing the true mean result, which for this purpose is assumed to have some particular fixed but unknown value. The 95% probability is conditional on this.

Having chosen the confidence level required, the coverage factor (t) can now be found in a statistical table such as Table 6.

Table 6: Probability Points of the t-Distribution (Double-sided Test)

Degrees of    Coverage factors (t) for Pr % (double sided)
freedom       0.5      1        5        10

1             127      63.7     12.7     6.31
2             14.1     9.92     4.30     2.92
3             7.45     5.84     3.18     2.35
4             5.60     4.60     2.78     2.13
5             4.77     4.03     2.57     2.01
6             4.32     3.71     2.45     1.94
7             4.03     3.50     2.36     1.89
8             3.83     3.36     2.31     1.86
9             3.69     3.25     2.26     1.83
10            3.58     3.17     2.23     1.81
11            3.50     3.11     2.20     1.80
12            3.43     3.05     2.18     1.78
13            3.37     3.01     2.16     1.77
14            3.33     2.98     2.14     1.76
15            3.29     2.95     2.13     1.75
16            3.25     2.92     2.12     1.75
17            3.22     2.90     2.11     1.74
18            3.20     2.88     2.10     1.73
19            3.17     2.86     2.09     1.73
20            3.15     2.85     2.09     1.72
21            3.14     2.83     2.08     1.72
22            3.12     2.82     2.07     1.72
23            3.10     2.81     2.07     1.71
24            3.09     2.80     2.06     1.71
25            3.08     2.79     2.06     1.71
26            3.07     2.78     2.06     1.71
27            3.06     2.77     2.05     1.70
28            3.05     2.76     2.05     1.70
29            3.04     2.76     2.05     1.70
30            3.03     2.75     2.04     1.70
40            2.97     2.70     2.02     1.68
60            2.91     2.66     2.00     1.67
120           2.86     2.62     1.98     1.66
Infinity      2.81     2.58     1.96     1.64

Note that Pr % is the probability or risk of error for this statistical method and the degrees of freedom relate to the number of replicates for the original calculation of $S_R$.

Continuing the example from Section 5.10, consider 5% risk (95% confidence).

$S_R$ was 0.18
Degrees of freedom = 30 - 15 = 15 (for say 30 duplicates used to obtain the precision)
Coverage factor = 2.13 (double sided 5% from Table 6)

Thus the confidence limits ($CL_{95}$) which bound the 95% confidence interval in which the true result is expected to lie, for a new sample analysed in duplicate giving say 1.94 and 2.00 ppm of pesticide, are found as follows:

$$\bar{y} = \frac{y_1 + y_2}{2} = 1.97, \qquad n = 2$$

$$CL_{95} = \bar{y} \pm \frac{t \cdot S_R}{\sqrt{n}} = 1.97 \pm \frac{2.13 \times 0.18}{\sqrt{2}} = 1.97 \pm 0.27$$

Thus the 95% confidence interval is from 1.70 to 2.24, and the true result is expected to lie between 1.70 and 2.24 at the 95% confidence level.

This may create a problem for decision makers if the legal limit for pesticide is 2.00 ppm.

Note from Table 6 that for a 5% risk of error, t ≈ 2 and for a 0.5% risk of error, t ≈ 3, where there are a reasonable number of degrees of freedom.
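The Formula 7 calculation for the duplicate pesticide result can be reproduced as follows. This is a minimal sketch: the function name `confidence_limits` is an assumption, and the coverage factor is simply typed in from Table 6 rather than computed.

```python
import math

def confidence_limits(replicates, s_r, t):
    """Formula 7: mean of the replicates +/- t * S_R / sqrt(n)."""
    n = len(replicates)
    mean = sum(replicates) / n
    half_width = t * s_r / math.sqrt(n)
    return mean - half_width, mean + half_width

# Duplicate results from the example; S_R = 0.18, and t = 2.13 is the
# double-sided 5% coverage factor for 15 degrees of freedom (Table 6).
low, high = confidence_limits([1.94, 2.00], s_r=0.18, t=2.13)
print(f"95% confidence interval: {low:.2f} to {high:.2f} ppm")
```

Because the interval straddles 2.00 ppm, the result alone cannot settle whether a 2.00 ppm limit has been met.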

Figure 3: Confidence Interval. [Distribution curve of results centred on $\bar{y}$, with the two 95% confidence limits marked at a distance of t · S/√n on either side.]

The normal distribution curve is about the true or best possible result, which is unknown. However, the true result is expected to lie within the confidence interval for 95% of tests. One cannot view the centre of a confidence interval as the best point estimate of the true result.

The interval simply defines the range of plausible values (see Bhattacharyya and Johnston; Statistical Concepts and Methods; Section 8.3).



7 Comparing a Test Result with a Fixed Value

7.1 General

To approach this situation, we must consider where the observed test result (or mean of a replicate) $\bar{y}$ is in relation to the true result and the fixed value. The diagram for this can be drawn as in Figure 4.

Figure 4: Comparing a Result With a Standard. [Distribution of results about $\bar{y}$, lying below the standard or regulation upper limit. The true result is to be below the standard value on 95% of occasions.]

For the sample to comply, the true result must be somewhere below the standard / regulation upper limit at a confidence of say 95%. For this to be the case, there should be only a 5% risk that the true result could be above the standard or regulation limit and a 95% probability that it may be anywhere below that limit. This means that the single-sided statistic (Table 7) should be used to calculate the upper confidence limit for the test result.

We calculate the upper confidence limit thus:

$$\bar{y} + \frac{t \cdot S_R}{\sqrt{n}}$$

where t is taken from a single-sided statistical table (Table 7).

Note from Table 7 that for 95% confidence, t is about 1.7 for a reasonable number of degrees of freedom.

From the previous example, pesticide must not exceed 2 ppm. What is the mean of a duplicate below which an analyst may accept the sample as compliant, with 95% confidence?

Degrees of freedom = 15 (say)
Pr = 5% single-sided
t = 1.75 single-sided
$S_R$ = 0.18
n = 2

Therefore

$$2.00 \text{ ppm} = \bar{y} + \frac{1.75 \times 0.18}{\sqrt{2}}$$

or 2.00 = $\bar{y}$ + 0.22 and $\bar{y}$ = 1.78 ppm

Thus for the duplicate of 1.94 and 2.00 (mean 1.97), the laboratory could not make a decision about compliance or non-compliance. Only if the mean was 1.78 or less could the laboratory say with 95% confidence that the sample was in compliance with the requirement limit of 2.00 ppm.

Note that if a regulator wanted to be 95% confident that a sample of bread contained greater than 2.00 ppm of pesticide before taking regulatory action, then the single-sided statistic is still used but the lower confidence limit is calculated.

$$2.00 \text{ ppm} = \bar{y} - \frac{t \cdot S_R}{\sqrt{n}} \quad \text{and} \quad \bar{y} = 2.22$$

At values above 2.22 ppm it would be 95% safe for the regulator to take action.
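The two single-sided decisions above (accept as compliant, or act on non-compliance) can be sketched as one small routine. The names `guard_band` and `classify` and the three outcome labels are assumptions made for illustration; the coverage factor 1.75 is the single-sided 5% value for 15 degrees of freedom.

```python
import math

def guard_band(s_r, n, t):
    """Single-sided allowance t * S_R / sqrt(n) applied on either side of the limit."""
    return t * s_r / math.sqrt(n)

def classify(mean, limit, s_r, n, t):
    """Classify a mean result against an upper limit at the confidence implied by t."""
    band = guard_band(s_r, n, t)
    if mean <= limit - band:
        return "compliant"
    if mean > limit + band:
        return "non-compliant"
    return "undecided"

band = guard_band(s_r=0.18, n=2, t=1.75)
print(f"accept at or below {2.00 - band:.2f} ppm, act above {2.00 + band:.2f} ppm")
print(classify(1.97, 2.00, 0.18, 2, 1.75))
```

The duplicate mean of 1.97 falls in the "undecided" zone between the two thresholds, which matches the conclusion reached in the text.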



Table 7: Probability Points of the t-Distribution (Single-sided Test)

Degrees of    Coverage factors (t) for Pr % (single sided)
freedom       0.5      1        5        10

1             63.7     31.8     6.31     3.08
2             9.92     6.96     2.92     1.89
3             5.84     4.54     2.35     1.64
4             4.60     3.75     2.13     1.53
5             4.03     3.36     2.01     1.48
6             3.71     3.14     1.94     1.44
7             3.50     3.00     1.89     1.42
8             3.36     2.90     1.86     1.40
9             3.25     2.82     1.83     1.38
10            3.17     2.76     1.81     1.37
11            3.11     2.72     1.80     1.36
12            3.05     2.68     1.78     1.36
13            3.01     2.65     1.77     1.35
14            2.98     2.62     1.76     1.34
15            2.95     2.60     1.75     1.34
16            2.92     2.58     1.75     1.34
17            2.90     2.57     1.74     1.33
18            2.88     2.55     1.73     1.33
19            2.86     2.54     1.73     1.33
20            2.85     2.53     1.72     1.32
21            2.83     2.52     1.72     1.32
22            2.82     2.51     1.72     1.32
23            2.80     2.50     1.71     1.32
24            2.80     2.49     1.71     1.32
25            2.79     2.48     1.71     1.32
26            2.78     2.48     1.71     1.32
27            2.77     2.47     1.70     1.31
28            2.76     2.47     1.70     1.31
29            2.76     2.46     1.70     1.31
30            2.75     2.46     1.70     1.31
40            2.70     2.42     1.68     1.30
60            2.66     2.39     1.67     1.30
120           2.62     2.36     1.66     1.29
Infinity      2.58     2.33     1.64     1.28

7.2 Results where Compliance or Non-compliance Cannot be Stated with 95% Confidence

In the above example, the test result was 1.97 ± 0.27 at 95% confidence. With this result, one cannot say, with 95% confidence, whether the sample complied or did not comply with the legal limit of 2.00, even though the result was less than 2.00.

However, it is possible to state that the sample is likely to comply with the limit but at a lesser percentage confidence. The APLAC TC 004 report allows for this procedure. However, if this result is to be reported as "likely to comply" with the standard, the APLAC report requires that the laboratory must also state the actual percentage confidence it has in this compliance statement.

The procedure for calculating the actual confidence in the compliance statement is based on working backwards through the t-distribution tables as follows:

From Formula 7 the t value may be calculated because all other information is known.

$$\text{Confidence Limits (CL)} = \frac{t \cdot S_R}{\sqrt{n}} \quad \text{or} \quad t = \frac{CL \cdot \sqrt{n}}{S_R}$$

Here the confidence limit (CL) is the difference between the result and the legal limit.

$S_R$ = 0.18
CL = 2.00 - 1.97 = 0.03
n = 2
Degrees of freedom (from original standard deviation estimate) were 15.

Therefore

$$t = \frac{0.03 \times \sqrt{2}}{0.18} = 0.24$$

Now from a more extensive single-sided t-distribution table, the value closest to t = 0.24 for 15 degrees of freedom is found in the column headed 40 (a 40% risk of error).

This means that the laboratory is only 60 percent confident that the test result indicates compliance with the legal limit.

Decision makers who are not given this information but simply receive the test result of 1.97 ppm may unknowingly be making a very risky decision because of a lack of this vital information.
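Where an extended t-table is not to hand, the same back-calculation can be approximated numerically. The sketch below is not part of the guide's procedure: it integrates the Student t density with Simpson's rule using only the Python standard library, and the function names are assumptions. It should give a confidence of roughly 59%, consistent with the 60% read from the table.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t-distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=1000):
    """P(T <= t), by Simpson's rule on [0, |t|] plus symmetry about zero."""
    if t < 0:
        return 1.0 - t_cdf(-t, df, steps)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def compliance_confidence(mean, limit, s_r, n, df):
    """Rearranged Formula 7: t = CL * sqrt(n) / S_R, then single-sided confidence."""
    t = (limit - mean) * math.sqrt(n) / s_r
    return t, t_cdf(t, df)

t_value, confidence = compliance_confidence(1.97, 2.00, s_r=0.18, n=2, df=15)
print(f"t = {t_value:.2f}, confidence of compliance = {confidence:.0%}")
```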

7.3 Suitability of Test Method for Stating that a Product Test Result is Within Upper and Lower Product Specification Limits

Where a product is being tested to determine if the result is within upper and lower product specification limits, the laboratory must choose a test method with a suitable uncertainty, capable of giving a compliance statement for a reasonable percentage of results which fall within the specification limits.

To do this, the uncertainty of results must be significantly smaller than the product's specified allowable range (upper minus lower product limits).

APLAC TC 004 suggests that for an expanded uncertainty of measurement U and a specified product allowable range 2T (2T = upper limit - lower limit), the ratio U : T is a measure of the ability of the test method to distinguish compliance from non-compliance.

For example, consider a product limit interval of 2T and the expanded uncertainty for the test result of U. For a compliance statement to be made, the test result would need to be U (single sided) above the lower limit and U (single sided) below the upper limit. If U : T were 1 : 3 then the interval between the lower limit plus U and the upper limit minus U would be 66.7% of the total product allowable range interval 2T. In such cases, 66.7% of possible results between the upper and lower specification limits could be stated with 95% confidence as complying.

The APLAC document therefore concludes that a ratio of 1 : 3 or better is reasonable for a laboratory to use in deciding on the suitability of a test method in these circumstances.
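The 66.7% figure follows from (2T - 2U) / 2T = 1 - U/T. A minimal sketch of that arithmetic, with the hypothetical function name `decidable_fraction`:

```python
def decidable_fraction(u, t_half_range):
    """Fraction of the allowable range 2T in which compliance can be stated,
    i.e. (2T - 2U) / 2T = 1 - U/T. Returns 0 when U is as large as T."""
    if u >= t_half_range:
        return 0.0
    return 1.0 - u / t_half_range

for ratio in (1 / 3, 1 / 2, 1 / 10):
    print(f"U:T = {ratio:.2f} -> {decidable_fraction(ratio, 1.0):.1%} of the range")
```

A method with U : T of 1 : 2 would leave only half of the specification range decidable, which illustrates why a ratio of 1 : 3 or better is suggested.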

8 Comparing a Test Result with a Natural Population

If a single test result (or the mean of replicates for a single sample) is being compared with previously obtained results that make up a population of acceptable results for a product such as fruit juice, the analyst must answer the question, "does the result belong inside or outside the acceptable population?" and, for instance, can they conclude that the juice has been watered?

The fruit juice population will have (say) a normal distribution such as Figure 1, from which uncertainty as a standard deviation, $S_J$, has been calculated using Formula 2.

The test sample replicates will have a standard deviation $S_R$ calculated using Formulae 5 and/or 6.

When the difference between the normal fruit juice population mean $\bar{x}$ and the sample test result (or mean of replicates) $\bar{y}$ is greater than half their combined confidence intervals, then there is less than Pr% probability that the sample belongs to the population of fruit juices and the analyst may well say, "the sample is not fruit juice".

The combined standard deviation $S_M$ can be calculated from Formula 8.

$$S_M^2 = S_J^2 + \frac{S_R^2}{n} \qquad \text{(Formula 8)}$$

Therefore for a 5% or less risk of rejecting the sample when it is actually fruit juice,

$$\bar{x} - \bar{y} \ge t \cdot S_M \qquad \text{(Formula 9)}$$

where t equals 1.7 for a single-sided statistic and degrees of freedom are taken as the smallest of those used for the $S_J$ or $S_R$ calculations.

For example, magnesium in fruit juice is typically at a level of about 80 ppm with a standard deviation of 5 ppm. Tap water contains negligible magnesium. The lowest level of magnesium an unknown sample may have, before it is considered with 95% confidence to have been adulterated, may be calculated.

The method for magnesium determination has a standard deviation of (say) 2.5 ppm and duplicate results are determined on each sample.

$$S_M^2 = 5^2 + \frac{2.5^2}{2} = 25 + 3.1, \qquad S_M = 5.3$$

Thus for $\bar{x} - \bar{y} \ge t \cdot S_M$:

$$80 - \bar{y} \ge 1.7 \times 5.3 \quad \text{(5% risk, single sided)}$$

$$80 - \bar{y} \ge 9$$

$$\bar{y} \le 71$$

Therefore, a sample with a magnesium result of 71 ppm or less is not a pure fruit juice because it is outside the plausible values for fruit juice at 95% confidence.

Note that for a sample $\bar{y}$ result just above 71 ppm of magnesium, the sample may be judged to be not inconsistent with fruit juice with 95% confidence. The $S_J$ is the standard deviation of the fruit juice population and must have been derived from samples representing the likely geographical origins and varieties of the fruits in the suspect juice.
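The magnesium example can be reproduced with Formulae 8 and 9 as follows. The function names are illustrative assumptions, and t = 1.7 is the approximate single-sided 5% coverage factor used in the text.

```python
import math

def combined_sd(s_population, s_method, n):
    """Formula 8: S_M = sqrt(S_J^2 + S_R^2 / n)."""
    return math.sqrt(s_population ** 2 + s_method ** 2 / n)

def lowest_consistent_result(pop_mean, s_population, s_method, n, t):
    """Formula 9 rearranged: results at or below this value are judged
    to lie outside the population (single sided)."""
    return pop_mean - t * combined_sd(s_population, s_method, n)

s_m = combined_sd(s_population=5.0, s_method=2.5, n=2)
threshold = lowest_consistent_result(80.0, 5.0, 2.5, n=2, t=1.7)
print(f"S_M = {s_m:.1f} ppm, threshold = {threshold:.0f} ppm")
```

Note how little the method's own standard deviation contributes here: the population spread of 5 ppm dominates the combined value of about 5.3 ppm.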



9 Comparing Two Test Results

This situation is similar to the previous one except that usually the two results have the same uncertainty and thus what was $S_J$ becomes $S_R / \sqrt{n}$, where each sample has n replicates.

Now Formula 8 simplifies to

$$S_M = \frac{\sqrt{2} \cdot S_R}{\sqrt{n}}$$

or for duplicate test results $S_M = S_R$.

Therefore, for the two test (R) results $\bar{y}$ and $\bar{x}$, if

$$|\bar{y} - \bar{x}| \ge \frac{t \cdot \sqrt{2} \cdot S_R}{\sqrt{n}} \quad \text{(or } t \cdot S_R \text{ for duplicates)}$$

then the two test (or mean of replicate) results will be judged as different because the true result for each is outside the plausible range for the other, at the confidence level (e.g. 95%) chosen for the t value.
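A sketch of this comparison is given below. The function names and the second sample means (2.40 and 2.20 ppm) are hypothetical, chosen only to show one pair judged different and one not; $S_R$ = 0.18 and the double-sided 5% coverage factor of 2.13 are carried over from the earlier pesticide example as an assumption.

```python
import math

def critical_difference(s_r, n, t):
    """Smallest difference between two means (n replicates each) judged
    significant: t * sqrt(2) * S_R / sqrt(n)."""
    return t * s_r * math.sqrt(2.0 / n)

def results_differ(mean_a, mean_b, s_r, n, t):
    """True when the two means differ by at least the critical difference."""
    return abs(mean_a - mean_b) >= critical_difference(s_r, n, t)

limit = critical_difference(s_r=0.18, n=2, t=2.13)
print(f"critical difference for duplicates: {limit:.2f} ppm")
print(results_differ(1.97, 2.40, 0.18, 2, 2.13))  # difference 0.43
print(results_differ(1.97, 2.20, 0.18, 2, 2.13))  # difference 0.23
```

For duplicates the factor sqrt(2/n) equals 1, so the critical difference reduces to t times $S_R$, as stated above.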

10 Criterion of Detection

This is the minimum value that a single test result (or mean of replicates) may have for the analyst to say that something is present with 95% confidence. The term is defined and discussed in the UK Water Research Centre Technical Report TR 66.

It can often be assumed that the "blank" results for a method will vary and form a population distribution identical to that of results for a sample with the measurand close to the blank (low level). This will occur whether the blank contains traces of the measurand or not.

Therefore, the population of the blank and the

population of a low level sam
