Validation in Chemical Measurement



Paul De Bièvre · Helmut Günzler (Eds.)


Validation in Chemical Measurement


Library of Congress Control Number: 2004114858

ISBN 3-540-20788-0 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag Berlin Heidelberg 2005
Printed in Germany

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: design & production GmbH, Heidelberg
Production: LE-TEX Jelonek, Schmidt & Vöckler GbR, Leipzig

Printed on acid-free paper 52/3111/YL - 5 4 3 2 1 0

Prof. Dr. Paul De Bièvre
Duineneind 9
2460 Kasterlee
Belgium

Prof. Dr. Helmut Günzler
Bismarckstrasse 11
69469 Weinheim
Germany


Preface

Validation of measurement methods has been used for a very long time in chemistry. It is mostly based on the examination of a measurement procedure for its characteristics such as precision, accuracy, selectivity, sensitivity, repeatability, reproducibility, detection limit, quantification limit and more.

When focussing on quality, comparability and reliability in chemical measurement, the fields of interest to this Journal, one stumbles into various interpretations of the term validation. It is one more example of a term which is used sometimes very consistently, sometimes very loosely or indeed ambiguously. Since the term is very common in the chemical community, it is important that its meaning be clear. Turning to the 2nd edition of the International Vocabulary of Basic and General Terms in Metrology (VIM) (1993), surprisingly we do not find a definition. Webster's Dictionary of the English Language (1992) tells us that validation is 'making or being made valid'. Obviously validation has to do with valid. The same Webster indicates the meaning of the corresponding verb: to validate seems 'to make valid or binding, to confirm the validity of (Latin: validare)', where valid means: 'seen to be in agreement with the facts or to be logically sound'. We certainly can build on this to have a 'valid' discussion. Validation of a method clearly seems to mean making 'valid' the measurement results obtained by this method. The first definition, 'seen to be in agreement with the facts', is rather difficult to apply. The second definition, however, tells us that 'validation of a method is to make the method to be seen as logically sound'. It looks as if validation of a method is a process whereby it is tested and demonstrated by somebody or some authority to be logically sound. Such a validation should enable everybody to use it. That implies a list of methods 'validated' by competent authorities in the field concerned, which sounds possible and useful. Is that not what AOAC does?

Sometimes, the notion of validating a measurement result also shows up. Apparently it means to make a result 'valid', and even binding, i.e. confirming its 'validity'. Since valid means 'seen to be in agreement with the facts', that almost sounds like a synonym for 'accurate'. That makes sense, and there seems to be no argument as to whether a method or a result can be validated (they can). An important question arises: does a validated method automatically give a validated measurement result, i.e. a quantity value¹ with associated measurement uncertainty? The answer must be: no. There can never be a mechanism or recipe for producing automatically 'valid' results, because one can never eliminate the skills, the role and the responsibility of the analyst.

ISO 9000:2000, item 3.8.5 defines validation as 'confirmation by examination and provision of objective evidence that the requirements for an intended use are fulfilled'. The revised edition of the VIM ('VIM3') is likely to fine-tune this definition of the concept 'validation' to be 'confirmation through examination of a given item and provision of objective evidence that it fulfills the requirements for a stated intended use'.

Looking at simple practice, many people are looking for a formal decision that a given measurement method automatically gives them 'valid', i.e. reliable, results. One wonders what this has to do with 'stated intended use'. Reliability clearly is a property of a measurement result. Checking whether that result fulfills the requirement for a stated intended use seems to be a totally different matter. That requires the formulation of a requirement a priori, i.e. before the measurement is made, and derived from the need for a measurement result, not from the result itself.

This anthology contains 31 outstanding papers published in the journal Accreditation and Quality Assurance since its inception, but mostly in the period 2000–2003, on the topic 'validation'. They reflect the latest understanding – or lack thereof – of the concept, and possibly some rationale(s) for the answer to the question why it is important to integrate the concept of 'validation' into the standard procedures of every measurement laboratory.

It is hoped that this anthology is of benefit to both the producers and the users of results of chemical measurements: the basic concepts and the basic thinking in measurement are the same for both.

Prof. Dr. P. De Bièvre
Editor-in-Chief, Accreditation and Quality Assurance
Kasterlee, 2004-04-02

¹ quantity (German: 'Messgrösse', French: 'grandeur de mesure', Dutch: 'meetgrootheid') is not used here in the meaning 'amount', but as the generic term for the quantities we measure: concentration, volume, mass, temperature, time, etc., as defined in the VIM.


Contents

Bioanalytical method validation and its implications for forensic and clinical toxicology – A review . . . 1

Validation of a computer program for atomic absorption analysis . . . 10

Instrumental validation in capillary electrophoresis and checkpoints for method validation . . . 14

Qualification and validation of software and computer systems in laboratories. Part 1: Validation during development . . . 24

Clinical reference materials for the validation of the performance of photometric systems used for clinical analyses . . . 31

Measurement uncertainty and its implications for collaborative study method validation and method performance parameters . . . 37

Qualification and validation of software and computer systems in laboratories. Part 2: Qualification of vendors . . . 42

Qualification and validation of software and computer systems in laboratories. Part 3: Installation and operational qualification . . . 46

Qualification and validation of software and computer systems in laboratories. Part 4: Evaluation and validation of existing systems . . . 51

Analytical validation in practice at a quality control laboratory in the Japanese pharmaceutical industry . . . 56

Validation of the uncertainty evaluation for the determination of metals in solid samples by atomic spectrometry . . . 62

Validation requirements for chemical methods in quantitative analysis – horses for courses? . . . 68

Validation criteria for developing ion-selective membrane electrodes for analysis of pharmaceuticals . . . 73

Is the estimation of measurement uncertainty a viable alternative to validation? . . . 76

Validation of an HPLC method for the determination of p-dichlorobenzene and naphthalene in moth repellents . . . 80

The evaluation of measurement uncertainty from method validation studies. Part 1: Description of a laboratory protocol . . . 84

The evaluation of measurement uncertainty from method validation studies. Part 2: The practical application of a laboratory protocol . . . 91

Automated ion-selective measurement of lithium in serum. A practical approach to result-level verification in a two-way method validation . . . 101

Determination of hydrocarbons in water – interlaboratory method validation before routine monitoring . . . 107

Interlaboratory quality audit program for potable water – assessment of method validation done on inductively coupled plasma atomic emission spectrometer (ICP-AES) . . . 112

A need for clearer terminology and guidance in the role of reference materials in method development and validation . . . 116

Validation of salt spray corrosion test . . . 121

Method validation and reference materials . . . 128

Method validation of modern analytical techniques . . . 134

Validation of test methods. General principles and concepts . . . 139

Marketing Valid Analytical Measurement . . . 143

Analytical procedure in terms of measurement (quality) assurance . . . 148

Flexibilisation of the scope of accreditation: an important asset for food and water microbiology laboratories . . . 153


Different approaches to legal requirements on validation of test methods for official control of foodstuffs and of veterinary drug residues and element contaminants in the EC . . . 159

Methods validation: AOAC's three validation systems . . . 163

Observing validation, uncertainty determination and traceability in developing Nordtest test methods . . . 167


Contributors

Hassan Y. Aboul-Enein, Biological and Medical Research Department (MBC-03), King Faisal Specialist Hospital & Research Centre, P.O. Box 3354, Riyadh 11211, Saudi Arabia

G. Anand Kumar, VIMTA Labs Ltd., Hyderabad–500 051, India

Boris Anisimov, The National Physical Laboratory of Israel, Danciger A. Bldg., Givat Ram, Jerusalem 91904, Israel

Elke Anklam, European Commission, Directorate General Joint Research Centre, Institute for Reference Materials and Measurements, Food Safety and Quality Unit, Retieseweg 111, B-2440 Geel, Belgium

Vicki J. Barwick, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Ricardo J. N. Bettencourt da Silva, CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

M. Buzoianu, National Institute of Metrology, Sos. Vitan-Bârzesti 11, 75669 Bucharest, Romania

M. Filomena G. F. C. Camões, CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

J. J. Cuello, University of Zaragoza, Analytical Chemistry Department, Crtra. Zaragoza s/n, 22071 Huesca, Spain

R. C. Díaz, University of Zaragoza, Analytical Chemistry Department, Crtra. Zaragoza s/n, 22071 Huesca, Spain

Stephen L. R. Ellison, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Rattanjit S. Gill, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Alison Gillespie, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Mykolas Gladkovas, Institute of Chemistry, Goštauto, 92600 Vilnius, Lithuania

H. Hey, Official Food and Veterinary Laboratory of Schleswig-Holstein, 24537 Neumünster, Germany

D. Brynn Hibbert, Department of Analytical Chemistry, University of New South Wales, Sydney, NSW 2052, Australia

Magnus Holmgren, SP Swedish National Testing and Research Institute, P.O. Box 857, SE-50115 Borås, Sweden

Ludwig Huber, Hewlett-Packard GmbH Waldbronn, P.O. Box 1280, 76337 Waldbronn, Germany

Rouvim Kadis, D. I. Mendeleyev Institute for Metrology, 19 Moskovsky pr., 198005 St. Petersburg, Russia

H. Kohno, Quality Assurance Department, Sandoz Pharmaceuticals Ltd., No. 1 Takeno, Kawagoe City, Saitama, Japan

Reinhard Kreßner, Federal Environmental Agency, P.O. Box 33 00 22, 14191 Berlin, Germany

Stephan Küppers, Schering AG, In-Process-Control, Müllerstrasse 170–178, 13342 Berlin, Germany

Ilya Kuselman, The National Physical Laboratory of Israel, Danciger A. Bldg., Givat Ram, Jerusalem 91904, Israel


Margreet Lauwaars, European Commission, Directorate General Joint Research Centre, Institute for Reference Materials and Measurements, Food Safety and Quality Unit, Retieseweg 111, B-2440 Geel, Belgium

Alexandre Leclercq, Institut Pasteur de Lille, Department of Food Microbiology and Safety, Rue du Professeur Calmette 1, 59000 Lille, France

Alex Lepek, Tech Projects, P.O. Box 9769, Jerusalem 91091, Israel

Peter Lepom, Federal Environmental Agency, P.O. Box 33 00 22, 14191 Berlin, Germany

Solveig Linko, Medix Diacor Labservice, Department of Quality Assurance, Nihtisillankuja 1, 00260 Espoo, Finland

Nineta Majcen, Metrology Institute of the Republic of Slovenia (MIRS), Laboratory Centre Celje, Tkalska 15, SI-3000 Celje, Slovenia

Hans H. Maurer, Department of Experimental and Clinical Toxicology, Institute of Experimental and Clinical Pharmacology and Toxicology, University of Saarland, 66421 Homburg, Germany

S. Othake, Quality Assurance Department, Sandoz Pharmaceuticals Ltd., No. 1 Takeno, Kawagoe City, Saitama, Japan

Frank T. Peters, Department of Experimental and Clinical Toxicology, Institute of Experimental and Clinical Pharmacology and Toxicology, University of Saarland, 66421 Homburg, Germany

Mark J. Q. Rafferty, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Eugenija Ramoškiene, Institute of Chemistry, Goštauto, 92600 Vilnius, Lithuania

C. Ríos, University of Zaragoza, Analytical Chemistry Department, Crtra. Zaragoza s/n, 22071 Huesca, Spain

Gordon A. Ross, Hewlett-Packard GmbH Waldbronn, 76337 Waldbronn, Germany

H. S. Sahoo, VIMTA Labs Ltd., Hyderabad–500 051, India

Mudis Šalkauskas, Institute of Chemistry, Goštauto, 92600 Vilnius, Lithuania

M. A. Sarasa, University of Zaragoza, Analytical Chemistry Department, Crtra. Zaragoza s/n, 22071 Huesca, Spain

S. K. Satpathy, VIMTA Labs Ltd., Hyderabad–500 051, India

Joao Seabra e Barros, Instituto Nacional de Engenharia e Tecnologia Industrial, Estrada do Paço do Lumiar, P-1699 Lisbon Codex, Portugal

S. Seno, Quality Assurance Department, Sandoz Pharmaceuticals Ltd., No. 1 Takeno, Kawagoe City, Saitama, Japan

Avinoam Shenhar, The National Physical Laboratory of Israel, Danciger A. Bldg., Givat Ram, Jerusalem 91904, Israel

Raluca-Ioana Stefan, Department of Analytical Chemistry, Faculty of Chemistry, University of Bucharest, Blvd. Republicii No. 13, 70346 Bucharest-3, Romania

G. Swarnabala, VIMTA Labs Ltd., Hyderabad–500 051, India

Sue Upton, Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

Robert J. Wells, Australian Government Analytical Laboratories, 1 Suakin Street, Pymble, NSW 2073, Australia

Herbert Wiederoder, Hewlett-Packard GmbH Waldbronn, P.O. Box 1280, 76337 Waldbronn, Germany

Alex Williams, 19 Hamesmoor Way, Mytchett, Camberley, Surrey, GU16 6JG, UK

Peter Woitke, Federal Environmental Agency, P.O. Box 33 00 22, 14191 Berlin, Germany


Received: 16 June 2002
Accepted: 12 July 2002

Part of this review was published in the communications of the International Association of Forensic Toxicologists (TIAFT; TIAFT Bulletin 32 (2002): 16–23) and of the Society for Forensic and Toxicologic Chemistry (GTFCH; Toxichem and Krimtech 68 (2001): 116–126).

Abstract The reliability of analytical data is very important to forensic and clinical toxicologists for the correct interpretation of toxicological findings. This makes (bio)analytical method validation an integral part of quality management and accreditation in analytical toxicology. Therefore, consensus should be reached in this field on the kind and extent of validation experiments as well as on acceptance criteria for validation parameters. In this review, the most important papers published on this topic since 1991 have been reviewed. Terminology, theoretical and practical aspects as well as implications for forensic and clinical toxicology of the following validation parameters are discussed: selectivity (specificity), calibration model (linearity), accuracy, precision, limits, stability, recovery and ruggedness (robustness).

Keywords Method · Validation · Bioanalysis · Toxicology

Accred Qual Assur (2002) 7:441–449
DOI 10.1007/s00769-002-0516-5

© Springer-Verlag 2002

Frank T. Peters · Hans H. Maurer

Bioanalytical method validation and its implications for forensic and clinical toxicology – A review

Introduction

The reliability of analytical findings is a matter of great importance in forensic and clinical toxicology, as it is a prerequisite for the correct interpretation of toxicological findings. Unreliable results might not only be contested in court, but could also lead to unjustified legal consequences for the defendant or to wrong treatment of the patient. The importance of validation, at least of routine analytical methods, can therefore hardly be overestimated. This is especially true in the context of quality management and accreditation, which have become matters of increasing importance in analytical toxicology in recent years. It is also reflected in the increasing requirements of peer-reviewed scientific journals concerning method validation. Therefore, this topic should be extensively discussed on an international level to reach a consensus on the extent of validation experiments and on acceptance criteria for validation parameters of bioanalytical methods in forensic and clinical toxicology.

Over the last decade, similar discussions have been going on in the closely related field of pharmacokinetic studies for the registration of pharmaceuticals. This is reflected by the number of publications on this topic published in the last decade, of which the most important are discussed here.

Important publications on validation (1991 to present)

A review on validation of bioanalytical methods, intended to provide guidance for bioanalytical chemists, was published by Karnes et al. in 1991 [1].

F.T. Peters (✉) · H.H. Maurer, Department of Experimental and Clinical Toxicology, Institute of Experimental and Clinical Pharmacology and Toxicology, University of Saarland, 66421 Homburg, Germany
e-mail: [email protected]
Tel.: +49-6841-1626050
Fax: +49-6841-1626051


One year later, Shah et al. published their report on the conference on "Analytical Methods Validation: Bioavailability, Bioequivalence and Pharmacokinetic Studies" held in Washington in 1990 (Conference Report) [2]. During this conference, consensus was reached on which parameters of bioanalytical methods should be evaluated, and some acceptance criteria were established. In the following years, this report was actually used as guidance by bioanalysts. Despite the fact, however, that some principal questions had been answered during this conference, no specific recommendations on practical issues like experimental designs or statistical evaluation were made. In 1994, Hartmann et al. analysed the Conference Report, performing statistical experiments on the established acceptance criteria for accuracy and precision [3]. Based on their results, they questioned the suitability of these criteria for practical application. From 1995 to 1997, application issues like experimental designs and statistical methods for bioanalytical method validation were discussed in a number of publications by Dadgar et al. [4, 5], Wieling et al. [6], Bressolle et al. [7] and Causon [8]. An excellent review on validation of bioanalytical chromatographic methods was published by Hartmann et al. in 1998, in which theoretical and practical issues were discussed in detail [9]. In an update of the Washington Conference in 2000, experiences and progress since the first conference were discussed. The results were again published by Shah et al. in a report (Conference Report II) [10], which has also been used as a template for guidelines drawn up by the U.S. Food and Drug Administration (FDA) for their own use [11]. In addition, it should be mentioned that some journals like the Journal of Chromatography B [12] or Clinical Chemistry have established their own criteria for validation. Two other documents that seem important in this context have been developed by the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) and approved by the regulatory agencies of the European Union, the United States of America and Japan. The first, approved in 1994, concentrated on the theoretical background and definitions [13]; the second, approved in 1996, on methodology and practical issues [14]. Both can be downloaded from the ICH homepage free of charge (www.ich.org). Finally, in 2001 Vander Heyden et al. published a paper on experimental designs and evaluation of robustness/ruggedness tests [15]. Although the three last-mentioned publications were not especially focussed on bioanalytical methods, they still contain helpful guidance on some principal questions and definitions in the field of analytical method validation.

The aim of our review is to present and compare the contents of the above-mentioned publications on (bio)analytical method validation, and to discuss possible implications for forensic and clinical toxicology.

Terminology

The first problem encountered when studying the literature on method validation is the different sets of terminology employed by different authors. A detailed discussion of this problem can be found in the review of Hartmann et al. [9]. Therein, it was proposed to adhere, in principle, to the terminology established by the ICH [13], except for accuracy, for which the use of a more detailed definition was recommended (cf. Accuracy). However, the ICH terminology lacked a definition for stability, which is an important parameter in bioanalytical method validation. Furthermore, the ICH definition of selectivity did not take into account interferences that might occur in bioanalysis (e.g. from metabolites). For both parameters, however, reasonable definitions were provided by Conference Report II [10].

Validation parameters

There is general agreement that at least the following validation parameters should be evaluated for quantitative procedures: selectivity, calibration model (linearity), stability, accuracy (bias, precision) and limit of quantification. Additional parameters which might have to be evaluated include limit of detection, recovery, reproducibility and ruggedness (robustness) [2, 4–10, 12].

Selectivity (specificity)

In Conference Report II, selectivity was defined as follows: "Selectivity is the ability of the bioanalytical method to measure unequivocally and to differentiate the analyte(s) in the presence of components, which may be expected to be present". Typically, these might include metabolites, impurities, degradants, matrix components, etc. [10]. This definition is very similar to the one established by the ICH [13], but takes into account the possible presence of metabolites, and thus is more applicable for bioanalytical methods.

There are two points of view on when a method should be regarded as selective. One way to establish method selectivity is to prove the lack of response in blank matrix [1, 2, 4–10, 12, 14]. The requirement established by the Conference Report [2] to analyse at least six different sources of blank matrix has become state of the art. However, this approach has been subject to criticism in the review of Hartmann et al., who stated from statistical considerations that relatively rare interferences will remain undetected with a rather high probability [9]. For the same reason, Dadgar et al. proposed to evaluate at least 10–20 sources of blank samples [4]. However, in Conference Report II [10], even analysis of only one source of blank matrix was deemed acceptable if hyphenated mass spectrometric methods are used for detection.

The second approach is based on the assumption that small interferences can be accepted as long as precision and bias remain within certain acceptance limits. This approach was preferred by Dadgar et al. [4] and Hartmann et al. [9]. Both publications proposed analysis of up to 20 blank samples spiked with analyte at the lower limit of quantification (LLOQ) and, if possible, with interferents at their highest likely concentrations. In this approach, the method can be considered sufficiently selective if precision and accuracy data for these LLOQ samples are acceptable. For a detailed account of experimental designs and statistical methods to establish selectivity, see Ref. [4].

Whereas the selectivity experiments for the first approach can be performed during a pre-validation phase (no need for quantification), those for the second approach are usually performed together with the precision and accuracy experiments during the main validation phase.

At this point it must be mentioned that the term specificity is used interchangeably with selectivity, although in a strict sense specificity refers to methods which produce a response for a single analyte, whereas selectivity refers to methods that produce responses for a number of chemical entities, which may or may not be distinguished [1]. Selective multi-analyte methods (e.g. for different drugs of abuse in blood) should of course be able to differentiate all analytes of interest from each other and from the matrix.

Calibration model (linearity)

The choice of an appropriate calibration model is necessary for reliable quantification. Therefore, the relationship between the concentration of analyte in the sample and the corresponding detector response must be investigated. This can be done by analysing spiked calibration samples and plotting the resulting responses versus the corresponding concentrations. The resulting standard curves can then be further evaluated by graphical or mathematical methods, the latter also allowing statistical evaluation of the response functions.

Whereas there is general agreement that calibration samples should be prepared in blank matrix and that their concentrations must cover the whole calibration range, recommendations on how many concentration levels should be studied with how many replicates per concentration level differ significantly [5–10, 12]. In Conference Report II, a sufficient number of standards to define adequately the relationship between concentration and response was demanded. Furthermore, it was stated that at least five to eight concentration levels should be studied for linear and maybe more for non-linear relationships [10]. However, no information was given on how many replicates should be analysed at each level. The guidelines established by the ICH and those of the Journal of Chromatography B also required at least five concentration levels, but again no specific requirements for the number of replicates at each level were given [12, 14]. Causon recommended six replicates at each of six concentration levels, whereas Wieling et al. used eight concentration levels in triplicate [6, 8]. Based on studies by Penninckx et al. [16], Hartmann et al. proposed in their review to rather use fewer concentration levels with a greater number of replicates (e.g. four evenly spread levels with nine replicates) [9]. This approach not only allows the reliable detection of outliers, but also a better evaluation of the behaviour of variance across the calibration range. The latter is important for choosing the right statistical model for the evaluation of the calibration curve. The often-used ordinary least squares model for linear regression is only applicable to homoscedastic data sets (constant variance over the whole range), whereas in the case of heteroscedasticity (significant difference between variances at the lowest and highest concentration levels) the data should be mathematically transformed or a weighted least squares model should be applied [6–10]. Usually, linear models are preferable but, if necessary, the use of non-linear models is not only acceptable but even recommended. However, more concentration levels are needed for the evaluation of non-linear models than for linear models [2, 9, 10].
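To make the weighting concrete, the following Python sketch contrasts an ordinary least squares fit with a weighted fit of the same calibration data. The data and the 1/x² weighting scheme are illustrative assumptions, not values from the publications discussed:

```python
import numpy as np

# Hypothetical calibration data: six levels in triplicate
# (concentration in ng/mL, response as peak-area ratio).
conc = np.repeat([5.0, 10.0, 50.0, 100.0, 500.0, 1000.0], 3)
resp = np.array([0.051, 0.048, 0.053, 0.098, 0.102, 0.095,
                 0.490, 0.510, 0.470, 1.020, 0.970, 1.010,
                 4.800, 5.100, 4.900, 9.700, 10.200, 9.900])

# Ordinary least squares: only justified for homoscedastic data.
slope_ols, intercept_ols = np.polyfit(conc, resp, 1)

# Weighted least squares with 1/x^2 weights, a common choice when the
# standard deviation grows roughly in proportion to concentration.
# np.polyfit squares the supplied weights, so pass w = 1/x.
slope_wls, intercept_wls = np.polyfit(conc, resp, 1, w=1.0 / conc)

print(f"OLS: y = {slope_ols:.6f} x + {intercept_ols:.6f}")
print(f"WLS: y = {slope_wls:.6f} x + {intercept_wls:.6f}")
```

With heteroscedastic data, the weighted fit prevents the large absolute scatter at the top of the range from dominating the low calibrators, which matter most near the LLOQ.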

After outliers have been purged from the data and a model has been evaluated visually and/or by, e.g., residual plots, the model fit should also be tested by appropriate statistical methods [2, 6, 9, 10, 14]. The fit of unweighted regression models (homoscedastic data) can be tested by the ANOVA lack-of-fit test [6, 9]. A detailed discussion of alternative statistical tests for both unweighted and weighted calibration models can be found in Ref. [16]. The widespread practice of evaluating a calibration model via its coefficients of correlation or determination is not acceptable from a statistical point of view [9].
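The ANOVA lack-of-fit test partitions the residual scatter into pure error (from the replicates) and lack of fit. A minimal sketch, assuming an unweighted straight-line fit with replicate measurements at each level (the function name and data layout are illustrative):

```python
import numpy as np
from scipy import stats

def lack_of_fit_test(x, y, slope, intercept):
    """ANOVA lack-of-fit F-test for an unweighted straight-line fit;
    x and y must contain replicate measurements at each level."""
    resid = y - (slope * x + intercept)
    ss_resid = np.sum(resid ** 2)

    # Pure-error sum of squares from the replicates at each level.
    levels = np.unique(x)
    ss_pure = sum(np.sum((y[x == lv] - y[x == lv].mean()) ** 2)
                  for lv in levels)

    n, m, p = len(x), len(levels), 2        # points, levels, parameters
    ss_lof = ss_resid - ss_pure             # lack-of-fit sum of squares
    f_stat = (ss_lof / (m - p)) / (ss_pure / (n - m))
    p_value = stats.f.sf(f_stat, m - p, n - m)
    return f_stat, p_value

# Using the conc/resp arrays and OLS coefficients from the previous
# sketch, a p-value below 0.05 would indicate significant lack of fit:
# f, p = lack_of_fit_test(conc, resp, slope_ols, intercept_ols)
```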

However, one important point should be kept in mind when statistically testing the model fit: the higher the precision of a method, the higher the probability of detecting a statistically significant deviation from the assumed calibration model [1, 6, 9]. Therefore, the practical relevance of the deviation from the assumed model should also be taken into account. If the accuracy data (bias and precision) are within the required acceptance limits, or an alternative calibration model is not applicable, slight deviations from the assumed model may be neglected [6, 9].

Once a calibration model has been established, the calibration curves for other validation experiments (precision, bias, stability, etc.) and for routine analysis can be prepared with fewer concentration levels and fewer or no replicates [6, 9].


Accuracy

The accuracy of a method is affected by systematic (bias) as well as random (precision) error components [3, 9]. This fact has been taken into account in the definition of accuracy as established by the International Organization for Standardization (ISO) [17]. However, it must be mentioned that accuracy is often used to describe only the systematic error component, i.e. in the sense of bias [1, 2, 6–8, 10, 12, 13]. In the following, the term accuracy will be used in the sense of bias, which will be indicated in brackets.

Bias

According to ISO, bias is the difference between the expectation of the test results and an accepted reference value [17]. It may consist of more than one systematic error component. Bias can be measured as a percent deviation from the accepted reference value. The term trueness expresses the deviation of the mean value of a large series of measurements from the accepted reference value. It can be expressed in terms of bias.

Due to the high workload of analysing such large series, trueness is usually not determined during method validation, but rather from the results of a great number of quality control (QC) samples during routine application or in interlaboratory studies.

Precision

According to the ICH, precision is the closeness of agreement (degree of scatter) between a series of measurements obtained from multiple sampling of the same homogeneous sample under the prescribed conditions, and may be considered at three levels: repeatability, intermediate precision and reproducibility [13]. Precision is usually measured in terms of imprecision, expressed as an absolute or relative standard deviation, and does not relate to reference values.

Repeatability. Repeatability expresses the precision under the same operating conditions over a short interval of time. Repeatability is also termed intra-assay precision [13]. Within-run or within-day precision are also often used to describe repeatability.

Intermediate precision. Intermediate precision expresses within-laboratory variations: different days, different analysts, different equipment, etc. [13]. The ISO definition used the term "M-factor different intermediate precision", where the M-factor expresses how many and which factors (time, calibration, operator, equipment or combinations of those) differ between successive determinations [17]. In a strict sense, intermediate precision is the total precision under varied conditions, whereas the so-called inter-assay, between-run or between-day precision only measure the precision components caused by the respective factors. However, the latter terms are not clearly defined and are obviously often used interchangeably with each other and also with the term intermediate precision.

Reproducibility. Reproducibility expresses the precision between laboratories (collaborative studies, usually applied to standardization of methodology) [13]. Reproducibility only has to be studied if a method is supposed to be used in different laboratories.

Unfortunately, some authors have also used the term reproducibility for within-laboratory studies at the level of intermediate precision [8, 12]. This should, however, be avoided in order to prevent confusion.

As already mentioned above, precision and bias can be estimated from the analysis of QC samples under specified conditions. As both precision and bias can vary substantially over the calibration range, it is necessary to evaluate these parameters at least at three concentration levels (low, medium and high relative to the calibration range) [1, 2, 9, 10, 14]. In Conference Report II, it was further defined that the concentration of the low QC sample must be within three times the LLOQ [10]. The Journal of Chromatography B requirement is to study precision and bias at two concentration levels (low and high), whereas in the experimental design proposed by Wieling et al. four concentration levels (LLOQ, low, medium, high) were studied [6, 12]. Causon also suggested estimating precision at four concentration levels [8]. Several authors have specified acceptance limits for precision and/or accuracy (bias) [2, 7, 8, 10, 12]. Both Conference Reports required precision to be within 15% relative standard deviation (RSD), except at the LLOQ, where 20% RSD was accepted. Bias was required to be within ±15% of the accepted true value, except at the LLOQ, where ±20% was accepted [2, 10]. These requirements have been subject to criticism in the analysis of the Conference Report by Hartmann et al. [3]. They concluded from statistical considerations that it is not realistic to apply the same acceptance criteria at different levels of precision (repeatability, reproducibility), as RSD under reproducibility conditions is usually considerably greater than under repeatability conditions. Furthermore, if precision and bias estimates are close to the acceptance limits, the probability of rejecting an actually acceptable method (β-error) is quite high. Causon proposed the same acceptance limits of 15% RSD for precision and ±15% for accuracy (bias) for all concentration levels [8].

The guidelines established by the Journal of Chromatography B required precision to be within 10% RSD for the high QC samples and within 20% RSD for the low QC sample. Acceptance criteria for accuracy (bias) were not specified therein [12].


Again, the proposals on how many replicates should be analysed at each concentration level vary considerably. The Conference Reports and the Journal of Chromatography B guidelines required at least five replicates at each concentration level [2, 10, 12]. However, one would assume that these requirements only apply to repeatability studies; at least, no specific recommendations were given for studies of intermediate precision or reproducibility. Some more practical approaches to this problem have been described by Wieling et al. [6], Causon [8] and Hartmann et al. [9]. In their experimental design, Wieling et al. analysed three replicates at each of four concentration levels on each of 5 days. Similar approaches were suggested by Causon (six replicates at each of four concentrations on each of four occasions) and Hartmann et al. (two replicates at each concentration level on each of 8 days). All three used or proposed one-way ANOVA to estimate repeatability and time-different precision components. In the design proposed by Hartmann et al., the degrees of freedom for both estimations are most balanced, namely 8 for within-run precision and 7 for between-run precision. In the information for authors of the Clinical Chemistry journal, an experimental design with two replicates per run, two runs per day, over 20 days for each concentration level is recommended, which has been established by the NCCLS [18]. This not only allows estimation of within-run and between-run standard deviations, but also of within-day, between-day and total standard deviations, which are in fact all estimations of precision at different levels. However, it seems questionable whether the additional information provided by this approach can justify the high workload and costs compared to the other experimental designs.
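As a sketch of the one-way ANOVA evaluation referred to above, the following hypothetical code estimates within-run (repeatability) and between-run variance components from a balanced run-by-replicate layout; the data and names are invented for illustration:

```python
import numpy as np

def precision_components(data):
    """One-way ANOVA variance components for a balanced design.
    data: 2-D array, rows = runs (e.g. days), columns = replicates."""
    data = np.asarray(data, dtype=float)
    k, n = data.shape
    run_means = data.mean(axis=1)
    grand_mean = data.mean()

    ms_within = np.sum((data - run_means[:, None]) ** 2) / (k * (n - 1))
    ms_between = n * np.sum((run_means - grand_mean) ** 2) / (k - 1)

    var_repeat = ms_within                                # within-run
    var_between = max(0.0, (ms_between - ms_within) / n)  # between-run
    var_intermediate = var_repeat + var_between           # time-different

    to_rsd = lambda v: 100.0 * np.sqrt(v) / grand_mean
    return to_rsd(var_repeat), to_rsd(var_between), to_rsd(var_intermediate)

# Hypothetical QC results at one level: two replicates on each of
# 8 days, i.e. the design of Hartmann et al. (8 and 7 degrees of
# freedom for the within-run and between-run estimates).
qc = [[98.2, 101.5], [97.8, 99.0], [103.1, 100.4], [99.9, 98.7],
      [101.2, 102.8], [96.5, 98.1], [100.3, 99.5], [102.0, 100.9]]
print(precision_components(qc))
```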

Daily variations of the calibration curve can influence bias estimation. Therefore, bias estimation should be based on data calculated from several calibration curves [9]. In the experimental design of Wieling et al., the results for QC samples were calculated via daily calibration curves. Therefore, the overall means of these results at the different concentration levels reliably reflect the average bias of the method at the corresponding concentration level. Alternatively, as described in the same paper, the bias can be estimated using confidence limits around the calculated mean values at each concentration [6]. If the calculated confidence interval includes the accepted true value, one can assume the method to be free of bias at a given level of statistical significance. Another way to test the significance of the calculated bias is to perform a t-test against the accepted true value.

However, even methods exhibiting a statistically significant bias can still be acceptable if the calculated bias lies within previously established acceptance limits. Other methods for bias evaluation can be found in Ref. [9].
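Both checks described above, the confidence interval around the mean and the t-test against the accepted true value, can be sketched as follows (hypothetical data; a 95% interval is assumed):

```python
import numpy as np
from scipy import stats

def bias_assessment(results, true_value, alpha=0.05):
    """Percent bias, confidence interval of the mean, and one-sample
    t-test of the mean against the accepted true value."""
    results = np.asarray(results, dtype=float)
    mean, sem = results.mean(), stats.sem(results)
    ci = stats.t.interval(1 - alpha, len(results) - 1, loc=mean, scale=sem)
    t_stat, p_value = stats.ttest_1samp(results, true_value)
    bias_pct = 100.0 * (mean - true_value) / true_value
    return bias_pct, ci, p_value

# Hypothetical QC means (ng/mL) from eight daily calibration curves,
# nominal value 100 ng/mL; the method is considered free of bias at
# the chosen significance level if the CI includes 100.
print(bias_assessment([97.5, 102.1, 99.0, 101.4, 98.2, 100.8, 96.9, 99.6],
                      100.0))
```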

Limits

Lower limit of quantification (LLOQ)

The LLOQ is the lowest amount of an analyte in a sample that can be quantitatively determined with suitable precision and accuracy (bias) [10, 13]. There are different approaches for the determination of the LLOQ.

LLOQ based on precision and accuracy (bias) data [2, 7–10, 13, 14]. This is probably the most practical approach and defines the LLOQ as the lowest concentration of a sample that can still be quantified with acceptable precision and accuracy (bias). In the Conference Reports, the acceptance criteria for these two parameters at the LLOQ are 20% RSD for precision and ±20% for bias. Only Causon suggested 15% RSD and ±15%, respectively [8]. It should be pointed out, however, that these parameters must be determined using an LLOQ sample independent from the calibration curve. The advantage of this approach is the fact that the estimation of the LLOQ is based on the same quantification procedure used for real samples.

LLOQ based on signal-to-noise ratio (S/N) [12, 14]. This approach can only be applied if there is baseline noise, e.g. to chromatographic methods. Signal and noise can then be defined as the height of the analyte peak (signal) and the amplitude between the highest and lowest point of the baseline (noise) in a certain area around the analyte peak. For the LLOQ, S/N is usually required to be equal to or greater than 10.

The estimation of baseline noise can be quite difficult for bioanalytical methods if matrix peaks elute close to the analyte peak.
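A minimal sketch of such a peak-height S/N estimate on a baseline-corrected trace; the index ranges for the peak and the baseline region are illustrative assumptions and must be chosen with the above caveat in mind:

```python
import numpy as np

def signal_to_noise(trace, peak_slice, baseline_slice):
    """S/N as defined above: analyte peak height divided by the
    peak-to-peak amplitude of the baseline near the peak."""
    signal = np.max(trace[peak_slice])
    baseline = trace[baseline_slice]
    noise = np.max(baseline) - np.min(baseline)
    return signal / noise

# Example: peak in points 480-520, noise taken from points 300-460;
# S/N >= 10 would be required at the LLOQ (>= 3 at the LOD).
# sn = signal_to_noise(trace, slice(480, 520), slice(300, 460))
```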

LLOQ based on standard deviation of the response from blank samples [14]. Another definition of the LLOQ is the concentration that corresponds to a detector response that is k times greater than the estimated standard deviation of blank samples (SD_bl). From the detector signal, the LLOQ can be calculated using the slope of the calibration curve (S) with the following formula (for blank-corrected signals):

LLOQ = k · SD_bl / S

This approach is only applicable to methods where SD_bl can be estimated from replicate analysis of blank samples. It is therefore not applicable to most quantitative chromatographic methods, as here the response is usually measured in terms of peak area units, which of course cannot be measured in a blank sample analysed with a selective method.
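As a worked example of the formula (all numbers invented for illustration):

```python
# LLOQ = k * SD_bl / S for blank-corrected signals.
k = 10          # factor required for quantification (3 would give the LOD)
sd_blank = 0.8  # standard deviation of replicate blank responses
slope = 4.0     # calibration slope S, response units per ng/mL
lloq = k * sd_blank / slope
print(lloq)     # 2.0 ng/mL
```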

LLOQ based on a specific calibration curve in the range of the LLOQ [14]. In this approach, a specific calibration curve is established from samples containing the analyte in the range of the LLOQ. One must not use the calibration curve over the whole range of quantification for this determination. The standard deviation of the blank can then be estimated from the residual standard deviation of the regression line or the standard deviation of the y-intercept. The calculations of the LLOQ are basically the same as described under the heading "LLOQ based on standard deviation of the response from blank samples". This approach is also applicable for chromatographic methods.

Upper limit of quantification (ULOQ)

The upper limit of quantification is the maximum analyte concentration of a sample that can be quantified with acceptable precision and accuracy (bias). In general, the ULOQ is identical to the concentration of the highest calibration standard [10].

Limit of detection (LOD)

Quantification below the LLOQ is by definition not acceptable [2, 5, 9, 10, 13, 14]. Therefore, below this value a method can only produce semiquantitative or qualitative data. However, it can still be important to know the LOD of the method. According to the ICH, it is the lowest concentration of analyte in a sample which can be detected but not necessarily quantified as an exact value. According to Conference Report II, it is the lowest concentration of an analyte in a sample that the bioanalytical procedure can reliably differentiate from background noise [10, 13].

The approaches for estimation of the LOD are basically the same as those described for the LLOQ under the headings "LLOQ based on signal-to-noise ratio (S/N)" to "LLOQ based on a specific calibration curve in the range of the LLOQ". However, for the LOD an S/N or k-factor equal to or greater than three is usually chosen [1, 6, 9, 14]. If the calibration curve approach is used for determination of the LOD, only calibrators containing the analyte in the range of the LOD must be used.

Stability

The definition according to Conference Report II was as follows: "The chemical stability of an analyte in a given matrix under specific conditions for given time intervals" [10]. Stability of the analyte during the whole analytical procedure is a prerequisite for reliable quantification. Therefore, full validation of a method must include stability experiments for the various stages of analysis, including storage prior to analysis.

Long-term stability

The stability in the sample matrix should be established under storage conditions, i.e. in the same vessels, at the same temperature and over a period at least as long as the one expected for authentic samples [1, 2, 4, 5, 9, 10, 12].

Freeze/thaw stability

As samples are often frozen and thawed, e.g. for re-analysis, the stability of the analyte during several freeze/thaw cycles should also be evaluated. The Conference Reports require a minimum of three cycles at two concentrations in triplicate, which has also been accepted by other authors [2, 4, 6, 9, 10].

In-process stability

The stability of the analyte under the conditions of sample preparation (e.g. ambient temperature over the time needed for sample preparation) is evaluated here. There is general agreement that this type of stability should be evaluated to find out if preservatives have to be added to prevent degradation of the analyte during sample preparation [4, 9, 10].

Processed sample stability

Instability can occur not only in the sample matrix, but also in prepared samples. It is therefore important to also test the stability of an analyte in the prepared samples under the conditions of analysis (e.g. autosampler conditions for the expected maximum duration of an analytical run). One should also test the stability in prepared samples under storage conditions, e.g. in a refrigerator, in case prepared samples have to be stored prior to analysis [4–6, 9, 10].

For more details on the experimental design and statistical evaluation of stability experiments, see Refs. [4, 5, 9].

Stability can be tested by comparing the results of QC samples analysed before (comparison samples) and after (stability samples) being exposed to the conditions for stability assessment. It has been recommended to perform stability experiments at least at two concentration levels (low and high) [4–6, 9]. For both comparison and stability samples, analysis of at least six replicates was recommended [9]. Ratios between comparison and stability samples of 90–110%, with 90% confidence intervals within 80–120% [9] or 85–115% [4], were regarded as acceptable. Alternatively, the mean of the stability samples can be tested against a lower acceptance limit corresponding to 90% of the mean of the comparison samples [8, 9].
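The ratio criterion described above can be sketched as follows; the helper is hypothetical, and the confidence interval uses a simple first-order approximation (an exact interval would follow Fieller's theorem):

```python
import numpy as np
from scipy import stats

def stability_ratio(comparison, stability, conf=0.90):
    """Percent ratio of stability-sample mean to comparison-sample
    mean, with an approximate confidence interval."""
    c = np.asarray(comparison, dtype=float)
    s = np.asarray(stability, dtype=float)
    ratio = 100.0 * s.mean() / c.mean()
    rel_se = np.sqrt((stats.sem(s) / s.mean()) ** 2 +
                     (stats.sem(c) / c.mean()) ** 2)
    t = stats.t.ppf(0.5 + conf / 2.0, len(c) + len(s) - 2)
    return ratio, (ratio * (1 - t * rel_se), ratio * (1 + t * rel_se))

# Acceptable if the ratio lies within 90-110% and the 90% CI within
# 80-120% [9] (or 85-115% [4]):
# ratio, ci = stability_ratio(comparison_results, stability_results)
```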


Recovery

As already mentioned above, recovery is not among the validation parameters regarded as essential by the Conference Reports. Most authors agree that the value for recovery is not important as long as the data for the LLOQ, (LOD), precision and accuracy (bias) are acceptable [1, 5, 7–10]. It can be calculated as the percentage of the analyte response after sample workup compared to that of a solution containing the analyte at the theoretical maximum concentration. Therefore, absolute recoveries usually cannot be determined if the sample workup includes a derivatization step, as the derivatives are often not available as reference substances. Nevertheless, the guidelines of the Journal of Chromatography B require the determination of the recovery for analyte and internal standard at high and low concentrations [12].

Ruggedness (robustness)

Ruggedness is a measure of the susceptibility of a method to small changes that might occur during routine analysis, like small changes of pH values, mobile phase composition, temperature, etc. Full validation need not necessarily include ruggedness testing; it can, however, be very helpful during the method development/pre-validation phase, as problems that may occur during validation are often detected in advance. Ruggedness should be tested if a method is supposed to be transferred to another laboratory [1, 9, 13–15]. A detailed account of and helpful guidance on experimental designs and evaluation of ruggedness/robustness tests can be found in Ref. [15].

Implications for forensic and clinical toxicology

Almost all of the above-mentioned publications referred to bioanalytical methods for bioavailability, bioequivalence or pharmacokinetic studies. This field is of course very closely related to forensic and clinical toxicology, especially if only routine methods are considered. Therefore, it seems reasonable to base the discussion concerning method validation in toxicological analysis on the experiences and consensus described above and not to start the whole discussion anew. In the following, possible implications for forensic and clinical toxicology will be discussed.

Terminology

As already mentioned above, there are several sets of terminology in the literature. It is therefore strongly recommended to adopt, in principle, one of these sets for validation in forensic and clinical toxicology and to add slight modifications where necessary. The definitions established by the ICH seem a reasonable choice, as they are consensus definitions of an international conference and are easily available on the ICH homepage (www.ich.org).

Validation parameters

Selectivity (specificity)

During pharmacokinetic studies, (therapeutic) drugs are usually ingested under controlled conditions, so there is no need to prove the ingestion of the drug. For this reason, the selectivity evaluation can be based on the acceptability of precision and accuracy data at the LLOQ. This approach is quite problematic for forensic and clinical toxicology, where analysis is often performed mainly to prove the ingestion of an (illicit) substance and, therefore, qualitative data are also important. Here, the approach of proving selectivity by the absence of signals in blank samples makes much more sense. The allowance in Conference Report II [10] to study only one source of blank matrix for methods employing MS detection does not seem reasonable for toxicological applications, because of the great importance of selectivity in this field. However, discussion is needed on how many sources of blank samples should be analysed, and on whether this should depend on the detection method.

It also seems reasonable to check for interferences from other xenobiotics that can be expected to be present in authentic samples (e.g. other drugs of abuse for methods to determine MDMA, other neuroleptics for methods to determine olanzapine). This can be accomplished by spiking these possible interferents at their highest expectable concentrations into blank matrix and checking for interferences after analysis. Another way to exclude interferences from such compounds is to check authentic samples containing these, but not the analyte, for interfering peaks. The latter approach is preferable if the possibly interfering substance is known to be extensively metabolized, as it also allows exclusion of interferences from such metabolites, which are usually not available as pure substances.

Calibration model

The use of matrix-based calibration standards also seems important in toxicological analysis, in order to account for matrix effects during sample workup and measurement (e.g. by chromatographic methods). Consensus should be reached on how many concentration levels and how many replicates per level should be analysed. From our own experience, six levels with six replicates each seems reasonable. Weighted calibration models will generally be the most appropriate in toxicological analysis, as the concentration ranges of analytes in toxicological samples are usually much greater than in samples for pharmacokinetic studies. Homoscedasticity, a prerequisite for unweighted models, can however only be expected for small calibration ranges.

Accuracy (precision and bias)

There is no obvious reason to evaluate these parameters differently than has been described above. Due to the often higher concentration ranges, it might be reasonable to also validate the analysis of QC samples containing concentrations above the highest calibration standard, after dilution or after reduction of sample volumes, as described by Wieling et al. [6] and Dadgar et al. [5]. The latter have also described the use of QC samples with concentrations below that of the lowest calibration standard, using greater sample volumes.

Limits

The same approaches and criteria as those described above under "Limits" could be used. All approaches have been described to a lesser or greater extent in international publications, especially for the determination of the LOD. Nevertheless, it seems important to reach consensus on this matter, at least for forensic and clinical toxicology, as reliable detection of a substance is one of the most important issues in toxicological analysis. At this point it must be stressed that, for the estimation of the LOD and LLOQ via a special calibration curve, the calibration samples must only contain the analyte at concentrations close to the LOD and LLOQ. Use of the calibration curve over the whole range may lead to overestimation of these limits.

Stability

The biggest problem encountered during stability testing for bioanalytical methods in forensic and clinical toxicology is the fact that there is a great number of different sampling vessels. Furthermore, the anticoagulants used also differ. Both facts make it difficult to assess long-term stability, as the workload to analyse all possible combinations of vessels and anticoagulants is of course far too great. However, for some analytes relevant to forensic and clinical toxicology (e.g. cocaine), stability problems with different sampling vessels have been reported [19]. Therefore, the relevance of this parameter to forensic and clinical toxicology has to be discussed extensively. Agreement on a single type of vessel to use for sampling of toxicological samples would probably be the easiest solution. Another problem is the fact that storage conditions prior to arrival in the laboratory are not known, so this matter will also have to be addressed.

Recovery

Recovery does not seem to be a big issue for forensic and clinical toxicologists as long as precision, accuracy (bias), LLOQ and, especially, LOD are satisfactory. However, during method development one should of course try to optimize recovery.

Ruggedness

There is no obvious reason to treat this matter differently than described above under "Ruggedness (robustness)".

Measurement uncertainty

Measurement uncertainty, a parameter characterizing the dispersion of the values that could reasonably be attributed to the measurand (e.g. concentration), is considered an important concept in analytical chemistry [20]. It comprises many components and is generally reported as a standard deviation or as a confidence interval. However, measurement uncertainty was not explicitly addressed in any of the publications on bioanalytical method validation discussed in this review. One probable reason might be that the measurement uncertainties of modern analytical methods are small compared to the differences encountered between individual subjects in pharmacokinetic studies. Nevertheless, knowledge about the uncertainty of measurements can be very helpful or even important for the interpretation of bioanalytical data, especially in forensic toxicology. As bioanalytical methods are usually rather complex and assessment of the contribution of individual components to the combined uncertainty of the results would therefore be time consuming and costly, estimation of measurement uncertainty from validation data, especially precision and accuracy data, would be preferable. The more the design of validation experiments accounts for conditions during routine application of a method, i.e. the more individual components are comprised in one validation parameter, the better the estimation of measurement uncertainty from those parameters. Consequently, reproducibility data from interlaboratory studies or intermediate precision data with M-factors ≥ 3 from intralaboratory experiments should adequately reflect the measurement uncertainty of bioanalytical methods. However, further components not covered adequately by validation experiments might play a role for measurement uncertainty. A detailed guide on the quantification of analytical measurement uncertainty has been published by EURACHEM/CITAC [20].
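
A minimal sketch of such a top-down estimate from validation data, in the spirit of the EURACHEM/CITAC guide: a reproducibility standard deviation is combined with a bias term. The quadrature formula is a simplification and all numbers are invented for illustration.

# Sketch: top-down measurement uncertainty from validation data.
# u_combined^2 ~ s_R^2 + bias^2 + u(bias)^2 is a simplified combination;
# all values below are hypothetical.
import math

s_R = 0.042     # relative reproducibility SD from interlaboratory data
bias = 0.015    # observed relative bias against a reference material
u_bias = 0.008  # uncertainty of that bias estimate

u_combined = math.sqrt(s_R**2 + bias**2 + u_bias**2)
U_expanded = 2 * u_combined   # coverage factor k = 2 (~95 % confidence)
print(f"relative expanded uncertainty U ~ {U_expanded:.1%}")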

Conclusion

There are only a few principal differences concerning validation of bioanalytical methods in the fields of pharmacokinetic studies and forensic and clinical toxicology. Therefore, it seems reasonable to base the discussion on validation in the field of toxicology on the experience and consensus already existing in the closely related field of pharmacokinetic studies for registration of pharmaceuticals, and to focus the discussion on those parameters which are of special importance for toxicologists, i.e. selectivity, LOD, LLOQ and stability.

References

1. Karnes HT, Shiu G, Shah VP (1991) Pharm Res 8:421–426
2. Shah VP, Midha KK, Dighe S, McGilveray IJ, Skelly JP, Yacobi A, Layloff T, Viswanathan CT, Cook CE, McDowall RD, Pittman KA, Spector S (1992) Pharm Res 9:588–592
3. Hartmann C, Massart DL, McDowall RD (1994) J Pharm Biomed Anal 12:1337–1343
4. Dadgar D, Burnett PE (1995) J Pharm Biomed Anal 14:23–31
5. Dadgar D, Burnett PE, Choc MG, Gallicano K, Hooper JW (1995) J Pharm Biomed Anal 13:89–97
6. Wieling J, Hendriks G, Tamminga WJ, Hempenius J, Mensink CK, Oosterhuis B, Jonkman JH (1996) J Chromatogr A 730:381–394
7. Bressolle F, Bromet PM, Audran M (1996) J Chromatogr B 686:3–10
8. Causon R (1997) J Chromatogr B 689:175–180
9. Hartmann C, Smeyers-Verbeke J, Massart DL, McDowall RD (1998) J Pharm Biomed Anal 17:193–218
10. Shah VP, Midha KK, Findlay JW, Hill HM, Hulse JD, McGilveray IJ, McKay G, Miller KJ, Patnaik RN, Powell ML, Tonelli A, Viswanathan CT, Yacobi A (2000) Pharm Res 17:1551–1557
11. U.S. Department of Health and Human Services, Food and Drug Administration (2001) Guidance for industry, bioanalytical method validation. http://www.fda.gov/cder/guidance/index.htm
12. Lindner W, Wainer IW (1998) J Chromatogr B 707:1–2
13. International Conference on Harmonization (ICH) (1994) Validation of analytical methods: definitions and terminology. ICH Q2 A
14. International Conference on Harmonization (ICH) (1996) Validation of analytical methods: methodology. ICH Q2 B
15. Vander Heyden Y, Nijhuis A, Smeyers-Verbeke J, Vandeginste BG, Massart DL (2001) J Pharm Biomed Anal 24:723–753
16. Penninckx W, Hartmann C, Massart DL, Smeyers-Verbeke J (1996) J Anal At Spectrom 11:237–246
17. International Organization for Standardization (ISO) (1994) Accuracy (trueness and precision) of measurement methods and results: ISO/DIS 5725-1 to 5725-3. ISO, Geneva, Switzerland
18. NCCLS (1999) Evaluation of precision performance of clinical chemistry devices; approved guideline. NCCLS, Wayne, Pa., USA
19. Toennes SW, Kauert GF (2001) J Anal Toxicol 25:339–343
20. EURACHEM/CITAC (2000) Quantifying uncertainty in analytical measurement. http://www.eurachem.bam.de/guides/quam2.pdf


Accred Qual Assur (1997) 2:234–237 © Springer-Verlag 1997 GENERAL PAPER

Ilya Kuselman · Boris Anisimov · Avinoam Shenhar · Alex Lepek

Validation of a computer program for atomic absorption analysis

Received: 25 January 1997 / Accepted: 14 March 1997

I. Kuselman (✉) · B. Anisimov · A. Shenhar
The National Physical Laboratory of Israel, Danciger A. Bldg, Givat Ram, Jerusalem 91904, Israel
Tel: +972-2-6536534, Fax: +972-2-6520797, e-mail: [email protected]

A. Lepek
Tech Projects, P.O. Box 9769, Jerusalem 91091, Israel

Abstract The approach to validation of a computer program for an analytical instrument as a component of the analytical method (using this instrument with the program) is discussed. This approach was used for validating a new program for atomic absorption analysis. The validation plan derived from this approach was based on minimising the influence of all steps of the analytical procedure on the analytical results obtained by the method. In this way significant changes in the results may be caused only by replacement of the previous program by the new one. The positive validation conclusion was based on the comparison of the results of the analysis of suitable reference materials obtained with the new program and with its precursor in the same conditions, and also on comparison of their deviations from the accepted reference values for these materials, with the corresponding uncertainties.

Key words Validation · Computer program · Analytical instrument · Measurement uncertainty · Atomic absorption analysis

Introduction

The validation of analytical methods is a well-known problem in the analytical community [1]. The international guidance for equipment qualification (EQ) of analytical instruments and their validation is in the development stage [2–4]. At this time validation of computer systems for analytical instruments is less elaborated [5–7]. The term "computer system" comprises computer hardware, peripherals, and software that includes application programs and operating environments (MS-DOS, MS-Windows and others) [5, 6]. Since programs, software and the whole computer system are elements of the instrument used by the analyst according to the analytical method, successful validation of the method as a "black box" [8] means successful validation of the instrument, computer system, software and programs. On the other hand, the same instrument may also be calibrated and validated as a smaller (included) "black box" [9], with a corresponding validation conclusion about the computer system and its elements. Next, the computer system is again a smaller included "black box" [10, 11], etc. It is like a matreshka – a Russian wooden doll with successively smaller ones fitted into it. Therefore the approach to the validation of a computer program as a validation of a component of the analytical method is sound. In the framework of this approach, not all the method validation parameters (performance characteristics [10, 11]) may be relevant for the program validation. For example, the validation parameter "accuracy" (bias) is meaningful in this context, while "selectivity", "specificity" or "ruggedness" are not. The bias may be evaluated as the difference between the results of the analysis of suitable reference materials obtained with the new program and with its precursor in the same conditions, and also by evaluation of their deviation from the accepted reference values for these materials in comparison to the corresponding uncertainties [12]. Certainly, the program validated in this way for the specific analytical method cannot be considered as validated for other applications (methods).

In this paper, the present approach is used for validating the new program GIRAF for atomic absorption analysis with the Perkin-Elmer 5000 spectrophotometer equipped with the graphite furnace and/or flame facilities. The program was developed by Tech Projects according to specifications of the National Physical Laboratory of Israel. The purpose of the development was to replace the out-of-date Perkin-Elmer HGA Graphics Software Package with Data System-10 (the operating environment is PETOS) and the obsolete Perkin-Elmer computer 3600 Data Station used with the graphite furnace since 1983. Moreover, the technique for atomic absorption analysis with flame facilities included visual reading of absorbance or emission values from the spectrophotometer display, which did not correspond to the Good Laboratory Practice requirements. In addition to the Perkin-Elmer equipment, an IBM-compatible PC with the program Quattro-Pro for Windows was used routinely by us for the linear regression analysis and uncertainty calculations. The same calculations were done with the raw data of the graphite furnace analysis (absorbance peak area or peak height) obtained from the Perkin-Elmer HGA Graphics Software Package with Data System-10.

GIRAF program description

The operating computer environment is MS-DOS, and the hardware is an IBM-compatible PC. GIRAF reads, displays and stores data from the spectrophotometer (absorbance or emission) as a function of time at a rate of 50 readings per second. The raw data can be displayed after filtering. The filter is an exponential one with a time constant of 0.2 s. The stored data are used to calculate the calibration curve and the analysis results including uncertainties in accordance with [12]. The calculations are based on the linear regression of data with replicate measurements. If the graphite furnace is used, the absorbance peak area and peak height are calculated as the data for regression analysis; if the flame facilities are used, the average absorbance or emission over the defined time are the data. The program also provides for analysis using the standard addition method.
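
As an illustration of the filtering step, the Python sketch below applies a standard first-order exponential (low-pass) recurrence to a 50 Hz signal with a 0.2 s time constant. The recurrence is an assumption about one common implementation; the actual GIRAF code is not published.

# Sketch: first-order exponential filter, as GIRAF-style smoothing of the
# 50 Hz absorbance stream (time constant 0.2 s) might be implemented.
def exp_filter(samples, dt=0.02, tau=0.2):
    alpha = dt / (tau + dt)          # smoothing factor from the time constant
    filtered = []
    y = samples[0]
    for x in samples:
        y = y + alpha * (x - y)      # y[i] = y[i-1] + alpha * (x[i] - y[i-1])
        filtered.append(y)
    return filtered

noisy = [0.10, 0.12, 0.09, 0.11, 0.35, 0.33, 0.36, 0.34]  # hypothetical readings
print(exp_filter(noisy))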

Validation plan

In spite of the replacement of all elements of the computer system, it is obvious that in our case only the programs can influence the analytical results. So, the plan of GIRAF validation consisted of a comparison of the results of the analysis of the reference materials obtained with the program to be validated and with the previous, routinely used ones. The objects for analysis, commonly accepted as the simplest, were chosen to minimise the influence of all the steps of the analytical procedure on the analytical results. These objects are aqueous solutions of lead for the graphite furnace, of copper for flame atomic absorption, and of sodium for flame atomic emission analysis. The preparation of the test solutions and the solutions for the spectrophotometer calibration (five in each calibration range) was planned to be done from the same sources. Differences between the results of the analysis obtained for the test solutions with the GIRAF program (Cnew) and with its precursors (no change of any other conditions) (Cpre), and also their deviations from the accepted reference values for these solutions, were planned as the data for the validation conclusion. The conclusion can be positive if these data correspond to the following acceptance criteria:

A. |Cnew − Cpre| ≤ (Unew² + Upre²)^1/2, where Unew and Upre are the uncertainties (95% confidence) of Cnew and Cpre, correspondingly;

B. |C − Ctr| ≤ (U² + Utr²)^1/2, where C and U are Cnew and Unew or Cpre and Upre, while Ctr and Utr are the "true" concentration of the analyte in the reference material and its uncertainty (95% confidence).
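
The two criteria translate directly into code; a minimal Python sketch (function names are my own):

# Sketch: acceptance criteria A and B as defined above.
import math

def criterion(c1, u1, c2, u2):
    """True if |c1 - c2| <= sqrt(u1^2 + u2^2); u are 95 % uncertainties."""
    return abs(c1 - c2) <= math.sqrt(u1**2 + u2**2)

def validate(c_new, u_new, c_pre, u_pre, c_tr, u_tr):
    a = criterion(c_new, u_new, c_pre, u_pre)           # criterion A
    b = (criterion(c_pre, u_pre, c_tr, u_tr) and
         criterion(c_new, u_new, c_tr, u_tr))           # criterion B
    return a and b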

Experimental

Reagents

Merck standard solutions were used: lead(II) nitrate in 0.5 M nitric acid, certified lead concentration 998±2 mg/L, shelf life up to 31.03.98; copper(II) nitrate in 0.5 M nitric acid, certified copper concentration 1000±2 mg/L, shelf life up to 31.10.98; sodium nitrate in water, certified concentration 1000±2 mg/L, shelf life up to 01.06.97. Nitric acid (1 vol%) as the solution for dilution of the standard solutions was prepared from Merck nitric acid Suprapur (65%) and deionised water.

Mg(NO3)2·6H2O Suprapur and (NH4)2HPO4 Pro analysis, both from Merck, were used for matrix modifier preparation for lead determination by graphite furnace analysis.

CsCl Suprapur from Merck was used for matrix modifier preparation for sodium determination by flame atomic emission analysis.

Reference materials

The reference materials for calibration of the spectrophotometer with the graphite furnace system were prepared from the lead standard solution by its dilution with nitric acid 1% solution: blank (nitric acid 1% solution), 5, 10, 20, 30 and 40 µg/L (ppb). A solution with a lead concentration of 25 ppb ("true" value) was used as a test solution, i.e. a sample with "unknown" analyte concentration.

The reference materials for calibration of the spectrophotometer for flame atomic absorption analysis were prepared from the copper standard solution by its dilution with nitric acid 1% solution: blank (nitric acid 1% solution), 1, 2, 3, 4 and 5 mg/L (ppm). A solution with a copper concentration of 2.5 ppm ("true" value) was used as a test solution.

Table 2 Acceptance criteria

Kind of analysis    Response      Criterion A                        Criterion B
                                  |Cnew−Cpre|  (Unew²+Upre²)^1/2     |Cpre−Ctr|  (Upre²+Utr²)^1/2   |Cnew−Ctr|  (Unew²+Utr²)^1/2
Graphite furnace    Peak height   0.1          2.5                   1.4         2.2                1.3         1.3
                    Peak area     0.4          1.8                   0.8         1.3                1.2         1.3
Flame               Absorbance    0.01         0.13                  0.00        0.09               0.01        0.11
                    Emission      0.01         0.03                  0.02        0.03               0.03        0.03

Table 1 Results of the experiment (average values of three replicates) for validation of the new program GIRAF. Conc., concentrations; Prev., previous; prog., program; Test, test solution

Graphite furnace

Pb conc.,   Peak height              Peak area
ppb         Prev. prog.  New prog.   Prev. prog.  New prog.
Blank       0.011        0.015       0.006        0.008
5           0.056        0.060       0.019        0.019
10          0.104        0.113       0.033        0.033
20          0.188        0.188       0.058        0.057
30          0.267        0.281       0.083        0.084
40          0.337        0.370       0.106        0.105
Test        0.234        0.248       0.072        0.072
C±U         26.4±2.1     26.3±1.2    25.8±1.2     26.2±1.2

Flame

Cu conc.,   Absorbance               Na conc.,   Emission
ppm         Prev. prog.  New prog.   ppm         Prev. prog.  New prog.
Blank       0.000        0.997       Blank       −0.004       0.103
1           0.040        1.037       0.1         0.110        0.216
2           0.083        1.078       0.2         0.240        0.349
3           0.126        1.120       0.4         0.480        0.585
4           0.167        1.161       0.6         0.705        0.824
5           0.205        1.196       1.0         1.157        1.274
Test        0.103        1.098       Test        0.605        0.735
C±U         2.50±0.08    2.49±0.10   C±U         0.52±0.02    0.53±0.02


The reference materials for the calibration of the spectrophotometer for flame atomic emission analysis were prepared from the sodium standard solution by its dilution with water and addition of the matrix modifier: blank (CsCl 0.1% solution in water), 0.1, 0.2, 0.4, 0.6 and 1.0 ppm. A solution with a sodium concentration of 0.5 ppm ("true" value) was used as a test solution.

It may be shown that, according to the preparation procedure of the test solutions, the relative expanded uncertainty [12] of the corresponding analyte concentrations is approximately 1% (rel.) at 95% confidence, i.e. the correct concentrations Ctr and their uncertainties Utr are 25.0±0.3 ppb for lead, 2.50±0.03 ppm for copper, and 0.50±0.01 ppm for sodium.

Technique of the experiment

All measurements were performed with the program GIRAF and with the previous ones, without shutdown of the instrument, at the same analytical conditions (according to the Perkin-Elmer "cookbook"), by the same analyst and with the same reference materials applied as calibration and test solutions. The matrix modifier for lead determination by graphite furnace analysis was used separately from the reference materials, i.e. was not mixed with them. The matrix modifier for sodium determination by flame atomic emission analysis was introduced into all corresponding solutions (the blank and the reference materials for the calibration and test). Three replicates of all measurements were made.

Results and discussion

Results of the analysis (C) are presented in Table 1. Usually the display of the Perkin-Elmer 5000 instrument in flame atomic absorption analysis is autozeroed for the blank solution. For this reason the values read visually from the display and those stored by GIRAF differ by the value of the blank, a fact that has no influence on the results of the analysis. The uncertainties of the results (U) are also shown in Table 1. The uncertainties of the calculated analyte concentrations in the test solutions Utr described in the Reference materials section are approximately one third of the corresponding uncertainties of the results of the analysis (for sodium it is not obvious because of rounding), and so the use of the term "true" values in the context of the validation is permissible. All these data were used for calculation of the acceptance criteria A and B formulated in the Validation plan (see Table 2). The criteria are satisfied for both graphite furnace and flame analysis.
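
Using the copper (flame absorbance) figures from Table 1, the criteria can be checked in a few lines; a sketch with the values exactly as tabulated:

# Worked check with the Cu (flame absorbance) data of Table 1.
import math

c_new, u_new = 2.49, 0.10   # new program
c_pre, u_pre = 2.50, 0.08   # previous program
c_tr,  u_tr  = 2.50, 0.03   # "true" value of the test solution

print(abs(c_new - c_pre) <= math.sqrt(u_new**2 + u_pre**2))  # A: 0.01 <= 0.13
print(abs(c_pre - c_tr)  <= math.sqrt(u_pre**2 + u_tr**2))   # B: 0.00 <= 0.09
print(abs(c_new - c_tr)  <= math.sqrt(u_new**2 + u_tr**2))   # B: 0.01 <= 0.10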

It is worthwhile to note also that the program GIRAF shortens the analysis time and is easier to use than the previous programs, but evaluation of these parameters was not included in the validation.


Conclusions

1. The approach to the validation of a computer program as a validation of a component of the analytical method (including this program as a "matreshka") is a productive one because of the use of the final results of the program application, i.e. the analytical results, for the validation conclusion.

2. The performance characteristics of the program (validation parameters) should be chosen from the corresponding characteristics of the method and supplemented, if needed, with other ones specific to the program.

3. The validated program GIRAF corresponds to the requirements of international guides to quality in analytical chemistry [10, 11] and may be used for graphite furnace analysis as well as for flame absorption and emission analysis with the Perkin-Elmer 5000 spectrophotometer.

Acknowledgement The authors express their gratitude to Prof. E. Schoenberger for helpful discussions.

References

1. Green JM (1996) Anal Chem 68:305A–309A
2. Bedson P, Sargent M (1996) Accred Qual Assur 1:265–274
3. Huber L (1996) Accred Qual Assur 1:24–34
4. Burgess C (1996) VAM Bulletin 15:18–20
5. Huber L, Thomas M (1995) LC-GC 13:466–473
6. Huber L (1996) LC-GC 14:502–508
7. Samways K, Sattler L, Wherry B (1996) Pharm Technol 20:45–49
8. Kuselman I, Shenhar A (1997) Accred Qual Assur 2 (in press)
9. Guidelines for the Determination of Recalibration Intervals of Measuring Equipment Used in Testing Laboratories (1984) International Document No. 10, OIML, 1st edn. Bureau International de Métrologie Légale, Paris
10. Accreditation for Chemical Laboratories. Guidance on the Interpretation of the EN 45000 Series of Standards and ISO/IEC Guide 25 (1993) WELAC Guidance Document No. WGD2/EURACHEM Guidance Document No. 1, 1st edn. Available from local EURACHEM offices
11. International Guide to Quality in Analytical Chemistry. An Aid to Accreditation (1995) CITAC Guide 1, 1st edn. ISBN 0-948926-09-0, Teddington
12. Quantifying Uncertainty in Analytical Measurement (1995) EURACHEM, 1st edn. ISBN 0-948926-08-2, Teddington


Accred Qual Assur (1997) 2:275–284 © Springer-Verlag 1997 GENERAL PAPER

Gordon A. Ross

Instrumental validation in capillary electrophoresis and checkpoints for method validation

Abstract Capillary electrophoresis (CE) is increasingly being used in regulated and testing environments which demand validation. The design, development and production of CE instrumentation should be governed by qualifications which ensure the quality of the finished product. The vendor should therefore provide guidelines and procedures which assist the user in ensuring the adequate operation of the instrumentation and especially in designing installation qualification (IQ) and operational qualification/performance verification (OQ/PV) procedures. OQ/PV should test those functions of an instrument which directly affect the CE analysis, i.e. voltage, temperature, injection precision and detector function. In validation of CE methods care should be taken that those aspects which directly affect the precision of peak parameters are appreciated. The relationship between CE instrumentation, chemistry and validation parameters is discussed and guidelines are presented for definition of a CE method for submission to regulatory authorities.

Key words Instrument · Method · Validation · Capillary · Electrophoresis

Abbreviations CZE capillary zone electrophoresis · MECC or MEKC micellar electrokinetic chromatography · CGE capillary gel electrophoresis · CIEF capillary isoelectric focusing · CEC capillary electrochromatography · DAD diode-array detection

Received: 15 January 1997 / Accepted: 10 February 1997

G. A. Ross (✉)
Hewlett-Packard GmbH, 76337 Waldbronn, Germany
Tel.: +49-7243-602663, Fax: +49-7243-602-149/501, e-mail: gordon [email protected]

Introduction

Capillary electrophoresis (CE) is being used increasingly for quantitative analyses in regulated environments, testing laboratories and food and environmental analyses [1]. With the maturing of the technique, instrumental automation and increasing familiarity of workers with its operation, an increasing number of reports show that acceptable precision can be readily achieved, and rugged validated methods can be developed [2–9].

Validation of any instrumental analysis demands the validation of the component parts of the analysis and their combined function. This entails validation of the instrumentation, the analytical method developed on that instrumentation and their suitability for routinely performing the intended analysis. This report is intended to highlight instrumental and methodological validation issues specifically applied to CE and to discuss the relationship between vendors and users of CE instrumentation in environments demanding validation of such instrumental analysis.

Validation of capillary electrophoresis instrumentation

Comprehensive validation of any instrumental method of analysis is not achieved solely in the user's analytical laboratory. The roles of the vendor and end user of the instrumentation are quite different. However, these roles do overlap significantly at the point of instrumental validation, and there are areas where the vendor's contribution can be an invaluable aid to the user.

The vendor develops and delivers the instrumentation, while the user sets criteria and specifies tests to determine whether the instrumentation is suitable for the intended use. The differing responsibilities for setting and assessing such qualification, and their overlap, are shown in Fig. 1. In order to supply reliable instrumentation, its development, manufacture and the manufacturing process itself must be qualified. The process begins with the manufacture of instrumentation by vendors operating to ISO 9001 standards. For GLP purposes, any equipment includes the instrument (hardware) and any computer-operating system used with it (software). The process of validation may be viewed as three qualification phases. First is the design qualification (DQ) phase, where the instrument is defined, designed and developed; second is the installation qualification/operational qualification (IQ/OQ) phase, where the instrument is installed and calibrated. Operational qualification is also referred to as performance verification. Last is the performance qualification (PQ) phase, where the suitability of the analytical method for its intended use is monitored.

Design qualification (DQ)

Hardware

Instrumental specifications set by the vendor may be quite different from those required by the user. Vendor specifications, while helpful in indicating to the user whether an instrument is suitable for the analysis, are actually intended to indicate whether the instrument conforms to the definition of the instrument during its design, development and manufacture. The vendor has sole responsibility during the DQ phase of an instrument's lifecycle (Fig. 1). It is at the end of this DQ phase that the instrument is tested for conformity to the specifications set during this phase. These are the instrumental specifications. The vendor should therefore supply some form of qualification to indicate that the instrument conforms to the instrumental design specifications, although, as already stated, these can be quite different from the user's specifications. Manufacturers are often best placed to provide guidelines and procedures for operation and maintenance of the instrument and should work closely with the user to determine procedures such that the equipment's operation can be validated. However, GLP/GMP-bound end users should implement appropriate inspection and testing programs to test instruments for conformity to their own specifications, which dictated the choice of the instrument purchased in the first instance. The vendor can assist in this process and reduce the burden on the user if validation certificates can be provided with each instrument upon delivery. In this case, the need for the user to perform these tests is removed. A declaration of conformity from the vendor certifies the type of tests which the instrument has undergone.

Fig. 1 The relationship between vendor and user during the qualification stages of instrumental and method validation

Software

The majority of analytical processes in regulated environments rely on computer control for automation and for data capture and evaluation. It is therefore important that not only the instrument is validated but also the computer and operational software. Validation of software is more complex than hardware validation, especially since source codes for many analytical processes can run to several hundred thousand lines. In this case the responsibility for validation still remains with the user, who should have the assurance that the computer system has been validated during and at the end of the DQ phase. The vendor should provide documentation with a declaration that the software has been validated to these or similar guidelines to ensure software quality standards. Further, the vendor should provide access to the source codes if required by the user.

Installation qualification/operational qualification (IQ/OQ)

Installation qualification (IQ)

Installation qualification (IQ) for instrumentation is defined by the vendor. IQ provides documentation which shows that the instrumentation was complete upon delivery, correctly installed and functional. The IQ is performed by the vendor and should include a single test run, the results of which should be stored with the IQ documentation.

Operational qualification/performance verification (OQ/PV)

Manufacturers can greatly assist the user by providing operational qualification/performance verification (OQ/PV) procedures to be performed on site to check the instrument's performance. When an OQ/PV test procedure is developed for an instrument it is important to identify and test those functions of the instrument which have a direct effect on its ability to perform the analyses it was intended to undertake. This is sometimes confused with testing the instrument for conformity with instrumental specifications, but as previously stated the instrumental specifications are linked to the DQ phase and not to subsequent use. The instrumental functions tested in OQ/PV are those which affect the qualitative and quantitative data required from the analysis.

The OQ/PV should be performed on installation, after extended use (e.g. at least once per year) and after major repair. This may also be extended to performing an OQ/PV before starting to validate or optimise a method on that instrumentation. OQ/PV for a capillary electrophoresis instrument should therefore be directed towards those specific instrumental functions which can affect the CE analysis.

A CE instrument comprises a high-voltage source and electrodes, containers for buffers, provision for holding and injecting samples, a thermostatted capillary housing and a detector. Most CE instruments use UV-vis absorbance detection, sometimes with spectral analysis capabilities. In instrumental separation sciences quantitative data are derived from the peak elution/migration time and peak dimensions; therefore any instrumental function which can affect these must be tested.

Parameters for testing in capillary electrophoresis OQ/PV

The parameters to be tested in OQ/PV for capillary electrophoresis are temperature stability, voltage stability, injection precision, detector performance (noise, drift and wavelength accuracy), and integrator functionality. Additionally, the migration time of a test analyte, using a well-defined method, may be monitored as an indication of the overall function of the instrument. Where the test procedure is performed using a test analyte and method, the usual validation parameters of the method should be determined. In all cases, care should be taken to develop a test procedure which directly tests the parameter in question and is not inferred from nor dependent on other instrumental or chemical parameters. The parameters to be tested are discussed in detail below.

Fig. 2 The effect of capillary temperature on (a) peak migration time and (b) amount injected by pressure. Conditions: buffer 20 mM borate pH 9.2, sample 0.1 mM p-hydroxyacetophenone, capillary 48 cm (40 cm eff) × 50 µm, detection 192 nm, injection 250 mbar/s, voltage 30 kV

Temperature stability

The capillary temperature can affect both peak migration time and peak area (Fig. 2a, b). The effects are due to temperature-mediated viscosity changes in the buffer. The electrophoretic mobility of an analyte (μe) is proportional to its charge (q) and inversely related to its hydrodynamic radius (r) and to the viscosity (η) of the medium through which it travels (Eq. 1):

μe = q / (6 π r η)   (1)

Therefore, as the viscosity decreases the mobility increases and the migration time gets shorter. The volume injected via hydrodynamic injection (pressure or vacuum) is also inversely related to the buffer viscosity through the Hagen-Poiseuille equation (Eq. 2):

volume injected = ΔP · d^4 · π · t / (128 η L)   (2)

Therefore, as the viscosity of the buffer in the capillary decreases, a larger volume of sample is moved into the capillary for the same pressure differential.
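
Equation 2 can be made concrete with typical CE magnitudes; a Python sketch in which every numeric value is an illustrative assumption, not a figure from this paper:

# Sketch: injected volume from the Hagen-Poiseuille relation (Eq. 2).
import math

dP  = 5000.0   # applied pressure difference, Pa (= 50 mbar), assumed
d   = 50e-6    # capillary internal diameter, m
t   = 4.0      # injection time, s
eta = 1.0e-3   # buffer viscosity, Pa*s (water-like, ~20 C)
L   = 0.485    # total capillary length, m

v_inj = dP * d**4 * math.pi * t / (128 * eta * L)   # m^3
print(f"injected volume ~ {v_inj * 1e12:.1f} nL")   # ~6 nL here
# Warming the buffer a few degrees lowers eta by roughly 2 % per K,
# raising the injected volume by about the same fraction.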

The stability of the temperature setting is of more importance to the long-term reproducibility of an assay than its absolute accuracy, although the latter is important for method transfer. Capillary temperature can also affect the selectivity in MECC or CEC analyses and also the resolution of a separation. Direct recording of the temperature of the thermostatting medium (air or liquid) is of more use than using inferred measurements, e.g. current or migration times, since these are also dependent upon non-instrumental parameters, e.g. buffer ionic strength and capillary history. Most CE instruments will provide a record of the thermostat temperature derived from probes which are part of the temperature control mechanism. If this is not possible, then an external temperature probe should be used. Typically the capillary temperature should be stable to within 0.1 °C and accurate to within ±1 °C.

Voltage stability

The applied voltage (V) divided by the capillary total length (L) is the field strength (E), which affects the peak migration time through its direct relationship to the analyte's apparent mobility (μapp), the sum of the analyte's electrophoretic mobility (μe) and electroosmotic mobility (μeo):

migration velocity = (μe + μeo) V / L = μapp E

Reproducible migration times are therefore instrumentally dependent on a reproducible field strength. Using a fixed capillary length, the voltage may be used as a direct indication of the stability and accuracy of the field strength. Directly recording the applied voltage is of more benefit than using inferred measurements such as migration time or current, since these are dependent on non-instrumental chemical parameters, e.g. buffer concentration. Where the instrument provides a record of the voltage applied, this may be used, but the vendor should indicate the source of the voltage measurement. Typically voltage supplies should be stable to within ±0.1% and accurate to within ±1% of the set voltage.
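
A quick numeric illustration of this relation; the mobilities and dimensions below are assumed, literature-scale values:

# Sketch: migration time from apparent mobility (all values illustrative).
mu_e  = 2.0e-8   # electrophoretic mobility, m^2/(V*s)
mu_eo = 5.0e-8   # electroosmotic mobility, m^2/(V*s)
V     = 26000.0  # applied voltage, V
L     = 0.485    # total capillary length, m
l_eff = 0.40     # length from inlet to detector, m

E = V / L                        # field strength, V/m
velocity = (mu_e + mu_eo) * E    # apparent migration velocity, m/s
t_mig = l_eff / velocity         # migration time to the detector, s
print(f"E ~ {E:.0f} V/m, migration time ~ {t_mig:.0f} s")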

Injector precision

The ability of CE instrumentation to repeatedly inject a fixed volume of sample is fundamental to achieving a precise quantitative assay. Injector precision may be determined using a defined method (chemical and instrumental parameters), in conjunction with a test sample, by examining the precision of the reported corrected peak area (peak area/migration time). It is important to use the corrected peak area since this will remove minor fluctuations in migration time, due to chemistry, from the analysis of injector precision. While it is useful for the sample to be traceable to national or international standards, this is not mandatory provided that the supplier operates to quality standards such as ISO 9001. The sample concentration should be such that it falls within the linear range of detection for that sample with that method and should provide sufficient signal-to-noise so that injector precision is not coloured by integration fluctuations. Optimally this should provide a peak with a signal-to-noise ratio of around 500. Other instrumental functions must be verified prior to performing this test where they can affect the amount injected or the reported peak area, i.e. detector linearity, capillary temperature and integrator function. Most CE instruments inject sample by applying a fixed or variable differential pressure for a fixed time. Their ability to precisely reproduce the injection is compromised for very short (ca. 1 s) injection times; therefore the time of injection should be greater than 3–4 s. Simply dipping a capillary into a sample will result in a small volume being injected. In order to minimise the effects of this extraneous injection on the analysis of injector precision, a reasonable injection volume should be used to provide data. An injected volume of around 1–5% of the total capillary volume is recommended.
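
The corrected-peak-area calculation is simple enough to sketch; the replicate values below are hypothetical:

# Sketch: injector precision via corrected peak areas (area / migration time).
import statistics

areas = [1520.0, 1534.0, 1511.0, 1528.0, 1519.0, 1525.0]  # peak areas, replicates
times = [102.1, 102.9, 101.8, 102.5, 102.0, 102.3]        # migration times, s

corrected = [a / t for a, t in zip(areas, times)]
mean = statistics.mean(corrected)
rsd = 100 * statistics.stdev(corrected) / mean            # relative SD, %
print(f"corrected-area RSD ~ {rsd:.2f} %")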

Detector performance

The majority of CE instruments are equipped with UV-vis absorption detection. The most important characteristics of UV-vis detectors which should be tested are the noise and drift. These should first be determined in the absence of a capillary using the ASTM method, so that the detector is appraised directly and the measurements are not affected by the capillary. Fixed and variable wavelength detectors should be verified for accuracy of the detection wavelength. DAD and rapid-scanning detectors should be verified for wavelength accuracy over the defined spectral range by checking for accuracy at a minimum of three wavelengths which cover this range. This may be performed using an internal filter (e.g. holmium oxide) or using an analyte with a well-characterised spectrum. Detector linearity should be determined for the test analyte to ensure that the concentration used for determination of injector reproducibility is within the linear range of that analyte on the instrument being tested. Therefore the detector should be subjected to different tests: (1) noise and drift without a capillary present, (2) wavelength accuracy without a capillary present, (3) wavelength accuracy using a test analyte, and (4) linearity of response vs test analyte concentration over the range which covers the concentration used for the injector precision test.

Integrator functionality

The long-term stability of the integrator function can be determined using a test electropherogram. Integration parameters such as migration time, peak area, corrected area and baseline may be used to determine and monitor the functionality of the integration software. It is useful if the test electropherogram is appropriate to the analysis being undertaken by the instrument, and therefore the user should have the opportunity to define the test electropherogram.

Instrument-specific functions

Should a manufacturer supply an instrument with a specialised or uncommon feature which can affect the performance of an assay, e.g. buffer replenishment, then its functionality should also be assessed, especially if the function is part of the user's specification for suitability of the instrument to perform the intended analysis.

Performance criteria and limits

Setting the limits for determining whether the instrument is still within acceptable performance criteria should be entirely the responsibility of the end user. The vendor can assist in providing guidelines as to how the instrument should function on delivery, but the operational qualification is intended to determine whether the instrument is still operating to within the specifications which the user defined upon purchase. For example, if the intended analysis is quantitation of a main component in a drug formulation, then detector noise, which affects sensitivity, may be of lesser importance than injection precision. Conversely, where the instrument is used for impurity determination, especially where the reported data is (area/area)%, then sensitivity and therefore detector noise may be of more importance than injector precision. If limits are indicated by the manufacturer, then these may be used directly if appropriate, but the user should be aware that the decision is his or hers. The limits for an OQ/PV should not automatically be defined from the instrument's specifications.

Checkpoints for validation of capillary electrophoresis methods

Analytical method validation has developed within the pharmaceutical industry over the years in order to produce an assurance of the capabilities of an analytical method. A recent text on validation of analytical techniques has been published by the International Conference on Harmonisation (ICH) [19]. This discusses the four most common analytical procedures: (1) identification test, (2) quantitative measurements for content of impurities, (3) limit test for the control of impurities and (4) quantitative measurement of the active moiety in samples of drug substance or drug product or other selected components of the drug product. As in any analytical method, the characteristics of the assay are determined and used to provide quantitative data which demonstrate the analytical validation. The reported validation data for CE are identical to those produced by an LC or GC method [11] and are derived from the same parameters, i.e. peak time and response. Those validation parameters featured by the ICH (Table 1) are derived from the peak data generated by the method. Table 1 also indicates those aspects of a CE method (instrumentation and chemistry), peculiar to the technique, which can affect the peak data and highlights factors which can assist the user in demonstrating the validation parameters.

Selectivity

Selectivity is demonstrated by identifying a peak as a single analyte. This may be achieved by characterising the peak by its migration time, provided that it can be demonstrated that no other components are co-migrating with that peak. This may be demonstrated by co-injection of the sample with a reference standard. Spectral analysis software capable of providing UV-vis absorbance spectra and determining peak purity can be very useful in demonstrating the selectivity of an assay, especially in identification tests.

Peak purity

Using DAD or scanning detectors it is possible to determine the purity of a peak by spectral peak purity analysis. Peak homogeneity can be demonstrated by a variety of methods. However, the most commonly used technique normalises and compares UV spectra from various peak sections.


Table 1 The relationship between instrument and chemistry in CE analyses: the reported data and validation parameters

Validation parameters: Selectivity (Specificity)
  Reported data: Migration time (preferably in conjunction with peak purity)
  Instrument:    Detection wavelength; peak spectra
  Chemistry:     Buffer pH; buffer components, additives and their concentrations

Validation parameters: Precision (Accuracy, Stability, Ruggedness, Robustness)
  Reported data: Migration time precision
  Instrument:    Capillary temperature; applied voltage
  Chemistry:     Buffer pH, capillary treatment, capillary temperature, buffer ionic strength, organic modifiers

  Reported data: Peak area precision
  Instrument:    Capillary temperature; injection mechanism: pressure/vacuum applied, time of applied pressure/vacuum, height of vial and transition time

Validation parameters: Linearity, Range
  Reported data: Linearity
  Instrument:    Linear range of detector

Validation parameters: LOD, LOQ
  Reported data: Sensitivity
  Instrument:    Detector noise, data analysis, amount injected

Peak identity

This may be determined by acquiring the UV spectrum of a peak and comparing this with a known standard. The process may be optimised by constructing a spectral library and comparing the analyte spectra with that of a known substance in the spectral library. Using suitable software, this may be performed automatically and with a high degree of accuracy. Such spectral analysis of analytes is one of the main techniques used for peak identification in pharmaceutical, food and environmental analysis. Peak identity may also be determined by spiking the sample with a pure preparation of the presumed analyte and observing an increase in peak height or area. Care should be taken that the increases in peak areas do not result from erroneous analyte identification and co-migration of two different species. Increases in peak widths or changes in the peak shape of the spike compared with the sample may indicate a co-migration of two peaks, although in CE peak shapes are frequently concentration dependent. Peak purity analysis on the spiked sample can be used to confirm the presence of a single analyte.

Migration time precision

Migration time precision may be used for peak identification and is important in assessing the overall performance of a method, since many other parameters impinge on its reproducibility. Recent studies have demonstrated not only that CE is capable of exhibiting good precision in a quantitative assay [3, 6, 7, 12] but that this precision can be preserved when methods are transferred between laboratories and between instruments [4, 5].

As we have seen from Eq. 1, the mobility is dependent upon the buffer viscosity, assuming that the buffer make-up is constant and its pH is stable. Therefore, from an instrumental point of view, the electric field and the capillary temperature are the two main parameters which must be stabilised to effect reproducible migration times. Where a buffer is selected for properties other than its buffering capacity, e.g. indirect detection techniques, it may be necessary to replace the buffer frequently in order to ensure reproducible migration times.

Peak area precision

Peak area precision is of great importance in the development of a quantitative assay. The reproducibility of the peak area is dependent upon a number of parameters, all of them instrumentally derived. Chemistry has little impact upon the quantitative reproducibility of an assay except where electrolytic changes in buffer pH can affect the absorption spectra of an analyte or where wall adsorption of the analyte can occur [17]. Most commercial instrumentation allows the use of electrokinetic and pressure injection modes for introduction of sample.

Electrokinetic injection with positive polarity introduces a bias towards the more cationic components of a sample [18]. Electrokinetic injection is, however, often necessary when performing CGE separations, since pressure injection may disturb the gel. In CGE analysis of nucleic acids and SDS proteins, smaller analytes will be preferentially loaded since they migrate faster into the gel. Other components of the sample matrix can also affect the amount of sample/analyte injected using electrokinetic loading. For quantitative analysis, the fluctuations and dependence of electrokinetic injection techniques on an analyte's charge and sample matrix generally preclude its use in quantitative assays. Injection by hydrodynamic displacement (pressure or vacuum or hydrostatic displacement) is a more useful technique and is more appropriate to quantitative analysis.

The precision of the amount injected by pressure is also dependent on the reproducibility of other instrumental parameters. The injected amount depends on the precision of the applied pressure, the time control and also the rise and fall times of the applied pressure. In order to optimise the precision of the injected amount it may be necessary to use post-sample plugs of buffer or voltage ramps to minimise thermal expansion of the buffer plug and sample loss. Precise sample injection also depends on the viscosity of both the sample and the buffer, as indicated by Eq. 2. Since viscosity is temperature dependent, thermostatting the sample tray or operating in an environment where ambient temperature is controlled will eliminate the effects of temperature-induced sample viscosity variations. However, even with constant temperature in the sample tray, temperature-induced variations in the viscosity of the buffer in the capillary can also affect the amount injected (Fig. 2b). In this case a stable capillary temperature is important. Where a viscous sample is being analysed, care should be taken that calibration standards are of similar viscosity or, ideally, constructed using blank sample matrix [8]. Although the amount injected is determined by these instrumental factors, the reported area of the integrated peak is dependent upon the functioning of the integration software. Reproducibility of peak area generally degenerates as the concentration decreases (Fig. 3). This is in part due to integration effects, where the integrator can have some difficulty in deciding the start and end of a peak which has a height of only two to three times that of the baseline noise.

The geometry of the capillary end plays an important part in ensuring reproducible peak shape. If the end of the capillary is ragged or broken, this can lead to carry-over effects [14]. Figure 4 shows the effects of capillary end geometry on the peak shape of thiourea analysed using CEC. The rise in baseline after the peak caused by a ragged capillary end can lead to errors in integration and therefore irreproducibility of the reported peak area. Ideally the capillary end should be flat and a short section of polyimide should be removed from it. Capillaries may be cut using a variety of means, e.g. a sapphire cutting tool or a ceramic stone. After the capillary is scribed, the break should be made by pulling the capillary apart and not by bending it. Alternatively, precision-cut capillaries may be purchased.

Fig. 3 The relationship between signal-to-noise ratio and precision of reported peak area

Fig. 4 The effects of capillary end geometry and capillary end washes on peak shape

Detector linearity, limit of quantitation and limit of detection

The linearity of an analytical method is its ability to produce test results which are directly, or by means of a well-defined mathematical transformation, proportional to the concentrations of analytes within a given range. Usually it is important to demonstrate that a linear relationship exists between detector response and analyte concentration. Linearity should be determined by injection of at least ten standards of varying concentrations which cover 50–150% of the expected working range of the assay. In CE, using direct detection methods, a linear range covering three to four orders of magnitude is obtainable. However, in indirect detection modes, linear ranges of two orders of magnitude are common. A linear regression analysis applied to the results should have an intercept not significantly different from zero. Errors associated with each concentration level should be determined by multiple injection at each concentration level. Linear ranges of three to four orders of magnitude are of great utility in the determination of low (<0.1%) impurity levels. The limit of detection of an assay using UV detection can be determined as that signal which gives an S/N ratio of 3. Therefore, with a knowledge of the molar absorptivity of the analyte, this can be determined purely from a knowledge of the noise of the detector. A caveat to this is that this noise should be the noise of the "system", i.e. the detector noise at the detection wavelength during a run.
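
A minimal sketch of both checks, assuming invented concentrations and responses; the ten-standard design follows the recommendation above, and the S/N = 3 convention gives the detection limit:

# Sketch: linearity check and S/N-based detection limit (values invented).
import numpy as np

conc = np.array([5, 10, 20, 40, 60, 80, 100, 120, 140, 150], float)  # ten standards
resp = 0.021 * conc + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.2,
                                -0.1, 0.3, -0.4, 0.1])               # response, mAU

slope, intercept = np.polyfit(conc, resp, 1)
r = np.corrcoef(conc, resp)[0, 1]
print(f"slope {slope:.4f}, intercept {intercept:.3f}, r {r:.4f}")

noise = 0.05                 # baseline noise of the running system, mAU (assumed)
c_lod = 3 * noise / slope    # concentration giving S/N = 3
print(f"LOD ~ {c_lod:.1f} (same units as conc)")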

The limit of detection (LOD) is that concentration which can be differentiated from a blank, while the limit of quantitation (LOQ) is the lowest concentration which can be determined with an acceptable accuracy and precision. It is important to realise that with most instruments, when a capillary is replaced, in most cases the detector flow cell is also replaced, since detection takes place on-column. In this case, although the entire method need not be re-validated, the limits of detection and limits of quantitation, if these are appropriate to the method validation being undertaken, should be re-determined. Minor variations in capillary alignment can be sufficient to produce significant changes in detector response and sensitivity. This may not be necessary when using a decoupled detection cell.

In contrast to OQ/PV testing, the detector baseline noise should be determined using the analytical buffer with the operating voltage applied. For fixed-wavelength detectors, the wavelength used for detection and quantification should be chosen such that it lies on a plateau of UV absorption for both analyte and internal standard. With the use of a DAD, where multiple wavelengths can be monitored, the sample detection wavelength may be different from that used for the internal standard. This provides a greater flexibility in the choice of internal standard, if used, and in the choice of detection wavelength to provide maximum absorption.

Definition of a CE method for submission to a regulatory agency

When detailing a method, care should be taken that the appropriate information is provided such that not only can the method be reproduced on a similar instrument by a different operator, but also there is sufficient detail to facilitate method transfer between instruments from different manufacturers. These details should also be included when submitting to a regulatory agency. Ideally the method description should take a standard form. A checklist of method parameters for submission of a CE method is shown in Table 2.

Table 2 Checklist for detailing a CE method for submission to regulatory authorities

Parameter     Required information
Purpose       Nature of the analysis; reasons for performing the analysis; analyte being determined
Instrument    Instrument manufacturer; model
CE mode       Separation mechanism: CZE, MECC, CGE, CIEF, CEC
Buffer        Buffer constituents, concentrations and pH; weight of each component identified by chemical formula as well as common names; volume and concentration of acid or base used to adjust the pH; total volume of solvent (e.g. water); ready-made buffers: source, batch number and specifications
Capillary     Total capillary length (inlet to outlet, cm); effective or detection length (inlet to detection point, cm); internal diameter (including extended lightpaths if used); type: bare fused silica, coated (type of coating and vendor), packed (packing material and vendor)
Injection     Injection mechanism. Electrokinetic: applied voltage and time of application. Hydrodynamic: units of pressure or vacuum applied or height of displacement and time of application
Voltage       Applied voltage (field strength, i.e. volts per cm of capillary); expected current (i)
Temperature   Capillary thermostat temperature; sample tray temperature
Detection     Type of detector (e.g. DAD, fixed wavelength etc.); detection wavelength, band width, rise time and sampling rate; reference wavelength if using DAD
Capillary     Prior to first use; pre-analysis; inter-analysis; storage;
treatment     indicate conditioning solutions, flush pressure and time of treatment
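
Purely as an illustration, such a checklist lends itself to a structured record; the sketch below uses Python only as a notation, and every value is a hypothetical example rather than a recommended method:

# Sketch: a CE method record following the Table 2 checklist.
# Every value is a hypothetical example, not a recommended method.
ce_method = {
    "purpose":     "assay of main component in drug substance",
    "instrument":  {"manufacturer": "ExampleVendor", "model": "CE-1000"},
    "ce_mode":     "CZE",
    "buffer":      {"constituents": "50 mM sodium phosphate", "pH": 7.0,
                    "ph_adjustment": "1 M NaOH, 0.4 mL per 100 mL"},
    "capillary":   {"total_cm": 48.5, "effective_cm": 40.0,
                    "id_um": 50, "type": "bare fused silica"},
    "injection":   {"mode": "hydrodynamic", "pressure_mbar": 50, "time_s": 4},
    "voltage":     {"applied_kV": 26, "expected_current_uA": 30},
    "temperature": {"capillary_C": 25, "sample_tray_C": 20},
    "detection":   {"type": "DAD", "wavelength_nm": 200, "bandwidth_nm": 10},
    "capillary_treatment": {"pre_analysis": "2 min 0.1 M NaOH, 3 min buffer"},
}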


All materials used to construct the buffer should be made from electrophoresis grade materials. It is of some importance to note that where surfactants or organic additives are used the buffer should have its pH adjusted before their addition, and this should be noted. The construction of the buffer should take the form of a standard operating procedure (SOP). Where ready-made buffers are used for the analysis, the source, batch number and specifications of the buffer should be indicated. Pre-made buffers should be supplied by the vendor with appropriate documentation indicating assay information and verification of purity. Since instruments from different manufacturers use different units of pressure or vacuum for sample injection, it is of specific importance to specify these. In particular, for method transfer, the pressure units should be translated into the pressure units of the instrument the method is being transferred to. Similarly, in instruments of different manufacture, the standard capillary dimensions, i.e. total and effective lengths, may vary. It is also very useful to indicate the expected current with the indicated buffer, field strength and temperature settings, since this may highlight errors in any of these parameters.


Table 3 EOF variations with silica batch and pre-treatment

Silica batch   Not pre-treated           Pre-treated^a
               mean* EOF mm/s   % RSD    mean* EOF mm/s   % RSD
QQT11A         0.2307           1.31     0.2837           0.86
QNRO1          0.2353           1.18     0.2828           0.78
MNZ04B         0.2303           5.95     0.2811           1.48
KZG01A         0.1886           23.97    0.2580           1.71
MSZ01          0.1538           11.32    0.2195           0.95
KYL12          0.2185           13.8     0.2517           0.42
KYL05          0.1406           19.82    0.2900           0.97
All data       0.1997           19.65    0.2667           9.47

* All n = 10
^a Pre-treatment: 5 min methanol, 5 min NaOH, 5 min water, 20 min run buffer (flush pressure 1 bar)
Conditions: capillary 48.5 cm (40 cm eff) × 50 µm id, buffer 50 mM phosphate pH 7.00, temperature 20 °C, injection 65 mbar/s, voltage 26 kV, sample 0.2% DMSO in water


The capillary pre-treatments should be documented as part of the method. It is generally sufficient to indicate the time of treatment and the conditioning solutions used. However, with transfer of a method between instruments the number of column volumes should be indicated, since different instruments use different pressures to flush capillaries. The reasons for detailing capillary pre-treatments lie in the influence on analyte mobility of the electroosmotic flow (EOF), which is ubiquitous in bare fused silica capillaries. Since the mobility of an analyte is composed of its own mobility plus that of the EOF, this must also be stable in order to produce reproducible migration times. Stability of the EOF can be achieved by appropriate capillary pre-conditioning techniques which should be suited to the analysis being undertaken, e.g. preconditioning with 10% v/v phosphoric acid prior to peptide separations using phosphate buffer at low pH, or cleaning the capillary between runs with 1 N NaOH when analysing samples where either analyte or sample matrix exhibits some wall adhesion. This is necessary because of the hysteresis of the relationship between pH and EOF when increasing or decreasing the buffer pH [13–15].

There are various types of capillary pre-treatment. One is the prior-to-first-use treatment; other treatments include preparative conditioning performed either daily or after extended use, and capillary treatment between analyses. Conditioning between analyses may or may not be necessary, although minimally this should include a buffer flush to eliminate the possibility of inter-analysis carry-over. Finally, details of capillary treatment prior to storage should be documented if the analytical method is used intermittently.

Table 3 shows the EOF associated with various batches of silica with and without pre-treatment. Pre-conditioning the capillary with the indicated wash regimen serves to improve reproducibility of the EOF in all cases, although there is still a marked variability between silica batches. Sufficient time should be allowed for the capillary to equilibrate with the run buffer. For conditioning between analyses, flushing with buffer only may be sufficient. More elaborate techniques, such as applying a short duration of voltage after flushing with run buffer, have been found to improve migration time reproducibility with some analyses [16]. When using coated capillaries it is usually not necessary to use any wash technique, and it is sometimes inadvisable. However, buffer should be replaced between runs to eliminate the risk of fouling or carry-over, and storage conditions where appropriate should be detailed. In all cases one capillary should be used for one method only.


Summary

The requirements for validation of CE instrumentation have been described. Although these are broadly similar to the requirements for validation of any instrumentation used in GLP/GMP-compliant laboratories, some aspects are peculiar to CE, especially in regard to OQ/PV testing. Those features of a CE instrument which should be tested in an OQ/PV include temperature and voltage stability, detector function and injector precision. These should be directly assessed and not inferred from other measurements dependent upon one or more of these parameters or upon chemistry. It is important that the contributions to the performance of an analysis are identified as instrumental and/or chemical. Capillary electroseparation methods are fully capable of being validated similarly to LC and GC methods, with identical validation criteria. When defining the parameters of a CE method, all operational parameters should be noted, including accurate instructions for construction of the buffer and also the capillary pre-treatment used. Guidelines for detailing a method for submission to regulatory agencies must include sufficient detail for reproducibility studies at all levels as well as for method transfer between instruments.

Acknowledgement The author would like to thank Ludwig Huber, Hewlett-Packard, Waldbronn, for critical reading of the manuscript and productive discussions.

References

1. Altria KD, Kersey MT (1995) LC-GC 8(4):201–208
2. Altria KD (1993) J Chromatogr 634:323–328
3. Altria KD (1993) J Chromatogr 646:245–257
4. Altria KD, Harden RC, Hart M, Hevizi J, Hailey PA, Makwana JV, Portsmouth MJ (1993) J Chromatogr 641:147–153
5. Altria KD, Clayton NG, Hart M, Harden RC, Hevizi J, Makwana JV, Portsmouth MJ (1994) Chromatographia 39:180–184
6. Prunonosa J, Obach R, Diez-Cascon A, Gouesclou L (1992) J Chromatogr 574:127–133
7. Thomas BR, Fang XG, Chen X, Tyrrell RJ, Ghodbane S (1994) J Chromatogr 657:383–394
8. Perrett D, Ross GA (1995) J Chromatogr 700:179–186
9. Thomas BR, Ghodbane S (1993) J Liq Chromatogr 16:1983
10. Huber L (1994) Good Laboratory Practice and current good manufacturing practice. A primer. Hewlett-Packard publication number 12-5963-2115E
11. Altria KD, Rudd DR (1995) Chromatographia 41:325–331
12. Altria KD, Fabre H (1995) Chromatographia 40(5/6):313–320
13. Lambert WJ, Middleton DL (1990) Anal Chem 62:1585–1587
14. Ross GA (1994) Biomedical applications of capillary electrophoresis including the analysis of structural haemoglobinopathies. PhD thesis, University of London, UK
15. Coufal P, Stulik K, Claessens HA, Cramers CA (1994) J High Res Chromatogr 17:325–334
16. Ross GA (1995) J Chromatogr 718:444–447
17. Soga T, Ross GA (1997) J Chromatogr (in press)
18. Huang XH, Gordon MJ, Zare RN (1988) Anal Chem 60:375–377
19. Pharmeuropa Vol. 5.4 (Dec. 1993) 343–345


Accred Qual Assur (1997) 2:360–366 © Springer-Verlag 1997 GENERAL PAPER

Ludwig Huber · Herbert Wiederoder

Qualification and validation of software and computer systems in laboratories. Part 1: Validation during development

Received: 17 May 1997 / Accepted: 30 June 1997

L. Huber (✉) · H. Wiederoder, Hewlett-Packard Waldbronn, PO Box 1280, D-76337 Waldbronn, Germany. Tel.: +49-7243-602209; Fax: +49-7802-981948; e-mail: Ludwig_Huber@hp.com

Abstract Software and computer systems are tested during all development phases. The user requirements and functional specifications documents are reviewed by programmers and typical anticipated users. The design specifications are reviewed by peers in one- to two-day sessions, and the source code is inspected by peers, if necessary. Finally, the function and performance of the system is tested by typical anticipated users outside the development department in a real laboratory environment. All development phases, including test activities and the final release, follow a well-documented procedure.

Key words Validation · Qualification · Computers · Software · Analytical · Laboratories

Introduction

Proper functioning and performance of equipment play a major role in obtaining consistency, reliability and accuracy of analytical data. Therefore equipment should be properly selected, designed, installed, and operated, and the correct function and performance should be verified before and during operation. This holds for equipment hardware, computer hardware, hardware interfaces, and software and computer systems. Qualification of equipment hardware is well established and has been described by several authors [1–4], and typically users in analytical laboratories are quite familiar with testing equipment for hardware specifications.

Software and computer system validation differs from hardware validation in that it is harder to specify absolute performance criteria and functional specifications for software and to define tests and acceptance criteria. There are many publications that deal with the validation of software and computer systems in one way or another. However, either they are not specific to the requirements of analytical laboratories or they are not specific to computers in the laboratory.

This series of articles should fill this gap. It was felt that the subject is so complex that it would not fit into a single article. Therefore, four articles are currently planned. The first deals with validation and qualification during development. The second deals with the tasks that should be performed during installation and prior to operation. Article number three covers the validation tasks during routine use. While the first three articles deal with the validation of new systems, article number four gives recommendations on how to retrospectively evaluate and validate existing systems.

This first article describes the validation and qualification of computer systems such as those for instrument control and data evaluation during development. Development validation of computer systems purchased from a vendor typically is done at the vendor's site, and even though most computer systems and software in analytical laboratories are purchased from vendors it was felt that such an article makes sense for users of such systems for two reasons:
1. From a regulatory point of view, computer systems must be validated. The user has the ultimate responsibility for validation, but he can delegate some parts, for example validation during development, to the vendor. If he does this, the user should have some proof that his software has been validated at the vendor's site. Using software development practices at Hewlett-Packard Waldbronn as an example, this article should help users to understand what should be done during development, and this information may be used to ask the right questions of the vendor.

2. Users of computer systems may develop software on their own, either as a stand-alone package, such as software for specific statistics, or as an add-on to the standard software. This software should be validated, and programmers should get ideas and advice on how to validate.

Because there is still misunderstanding of terms such as validation, qualification and verification, these will be explained right at the beginning. It is also important to understand the terms computer system and computerized system and the different types of software loaded on a computer, but other publications should be consulted for this type of information.

Readers of the article who have to comply with regulations should know about and understand them. Relevant information can be found in the existing literature [7, 8].

Definitions

The terms validation, verification and qualification are frequently used interchangeably. Validation and verification have been defined in EN ISO 8402:1995 [6].

Verification: Confirmation by examination and provision of objective evidence that the requirements have been fulfilled.

Validation: Confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled.

Even though the definitions look similar, there is a distinct difference. While verification is of a general nature, validation refers to a 'specific intended use'. In this sense a computer system that is developed for multiple users with multiple applications is verified rather than validated at the vendor's site. When the system is installed at the user's site for a specific task and the system is tested to meet the previously specified requirements, this process is defined as validation. If the system is intended to be used for different applications and more generic tests are done in the sense of EN ISO 8402:1995, this process again is called verification.

In practice, vendors of computer systems and user firms use the terms differently. The pharmaceutical industry uses the term validation in the sense of EN ISO 8402:1995, and the term verification is hardly known. Instead, the term "qualification" is used.

The entire qualification process is broken down into
– Design qualification (DQ) for setting the functional and performance specifications (operational specification)
– Installation qualification (IQ) for performing and documenting the installation in the selected user's environment
– Operational qualification (OQ) for testing the equipment to ensure that it meets the previously defined functional and performance specifications
– Performance qualification (PQ) for testing that the system performs as intended for the selected application.

OQ is similar to performance verification. PQ is most similar to validation as defined by EN ISO 8402:1995, because PQ always includes the user's equipment and method, and therefore is very specific.

Instrument vendors use the term qualification for installation, and both verification and qualification for testing the equipment hardware and software against documented specifications prior to operation. For software development, typically the term validation is used, although the term verification would be more appropriate.

The confusion is made nearly complete by the introduction of new terms by official committees. For example, the OECD has used the term acceptance testing [9], which is a subset of OQ and should be performed before the full OQ as a criterion of whether the system will be accepted by the user's firm.

Software product life cycle

Software development often takes several years, and it is impossible to ensure a certain quality standard simply by testing the program at the end of its development process. Quality cannot be designed into the software after the code is written; it must be designed and programmed into software prior to and during its development phases by following written development standards, including the use of appropriate test plans and test methods.

The product life cycle approach as illustrated in Fig. 1 has been widely accepted for the validation of computerized systems during their entire life. The product life is divided into phases:
– Setting user requirements and functional specifications
– Design and implementation, with code generation and inspection
– Test of subsystems (build a system and test as a system)
– Installation and qualification of the system before it can be put into routine use
– Monitoring performance of the system during its entire use
– Maintenance and recording history of changes

Fig. 1 The Hewlett-Packard CAG product life cycle model. ITQS: Information Technology Quality System

The Hewlett-Packard product life cycle starts with the proposal to initiate a new project. Proposals are based on business and user needs. These describe how users fulfill these needs now and how the new software can provide better solutions. In the investigation phase, system reference specifications (user requirements) and external reference specifications (functional specifications) are developed and reviewed. In the design phase, the design specifications are developed and reviewed. The implementation phase includes writing the code and code inspections, if necessary. In the test phase, functional testing is performed with test cases for each function as specified in the external reference specifications document. After the product is shipped and used, feedback from users is recorded and documented, and changes to the software are made following a documented change control procedure. Each phase is completed, reviewed and approved before the subsequent phase is started.

Checkpoint meetings

Each phase ends with a checkpoint meeting. This is prepared by the project leader and attended by all members of the project team and managers from different departments. Team members report on their activities during the phase and how they could meet the requirements as written in the development and validation plan. The team and management go through the checklist and discuss each point to determine whether and how the checkpoint items have been fulfilled. An action plan with people assignments is put together as part of the phase exit report for those items that are not yet closed. If all issues are resolved, the checklist is signed off by the management.

Proposal and investigation phases

The software life cycle starts with a requirements analysis and product definition. These define the requirements that the product must meet for functionality, compatibility with existing systems, usability, performance, reliability, supportability and security. The goal is to specify both the problem and the constraints upon the solution. Planning activities include project plans, budgets, schedules and validation, verification and testing. During this phase the project team is established, usually comprising representatives from system development, product marketing, product support, quality assurance, manufacturing and application chemists, who represent the users and are deeply involved in the development of a functional requirements specification and in the user interface prototyping. A project team leader is appointed to manage the project, and a project notebook is created and maintained through the entire development phase.

Users from all application segments are interviewed by team members to discover their needs. Finally, a list with all proposed functional requirements is drawn up and evaluated by the project team. Usually the list is too long for all requirements to be implemented within a reasonable time-frame, so the requirements are prioritized into three categories: "Musts", "Wants" and "Nice to haves". "Must" requirements are considered to be those that are a prerequisite to the success of the software and are always included in the final specifications. "Wants" and "Nice to haves" are of lesser importance and are included only if their implementation does not appreciably delay the project.

The external reference specifications (ERS) document is developed. This includes an overview of the scope and benefits of the project and a detailed description of the complete product from a user's perspective. The document is reviewed by the project team. It is the starting point for system design and also the basis for
– The design specifications document
– Functional testing
– The functional specifications document that is available for users to write their own requirement specifications and operational qualifications or acceptance testing
– The user documentation (e.g., user manual, on-line help).

Feasibility studies are done if necessary. The software engineering tools are determined and the software 'make' process is designed.

Deliverables for this phase include:
– System reference specifications (user requirement specifications)
– External reference specifications (functional specifications)
– Risk assessment
– Quality plan


Fig. 2 Extract from a design review

Design phase

The goal here is to design a solution that satisfies 100% of the defined 'must' requirements and falls within the set constraints. Alternative solutions are formulated and analyzed, and the best are selected. Verification activities during the design phase include inspecting the design specifications for completeness, correctness and consistency, and checking that the design directly correlates with the defined requirements. Thorough design inspections are of the utmost importance because correction of defects detected in this phase is much less costly than is the case when these are detected in a later life cycle phase.

Details of system screen designs, report layouts, the data dictionary with data flow diagram, system configurations, system security, file design, system limitations and memory requirements are laid out by system developers and are usually formally inspected with members of the development team. Major outputs of this phase are the internal design documents and prototypes. The design documents are based on the ERS and can be used as a source for the technical support documentation.

In this phase, operators from different backgrounds test the user interface in a process called usability testing, to determine the ease of access to the intended functions and the operators' understanding of the interaction concepts.

The design specifications as prepared by individual programmers are inspected by peers. This is done in a meeting organized and led by a specially trained moderator, who may be from the development department or the quality assurance department. Programmers review the design specifications document for consistency and correctness. Any findings are discussed and recorded as shown in Fig. 2. The procedure is repeated until the team no longer finds any defects in the document. Deliverables for this phase include:
– Internal design document (design specifications)
– Reports on design inspections/reviews
– Usability test report
– GLP validation/documentation plan
– Chemical performance and application test plan
– QA plan update

Implementation phase

In the implementation phase, the detailed design is implemented in source code, following written and approved software coding standards, and results in a program that is ready to be tested. After certain groups of functions have been programmed, they are tested individually by the programmers before they are integrated into a larger unit or into the complete system. Verification includes code and internal documentation, test designs, and all activities that determine whether the specified requirements have been met. Concurrently with the implementation phase, system documentation, such as user manuals and the electronic help system, is prepared. Documentation also includes a description of the algorithms used by the program.

In this phase, the system also undergoes a rigorous usability evaluation with testers from different backgrounds. The goal is for an experienced user to be able to perform the basic functions without the need for formal instruction (the so-called plug-and-play approach).

Deliverables for this phase include:
– Source code with documentation
– Code inspection/walkthrough reports, where necessary
– Documentation of test cases in preparation for the test phase.

Testing

Thorough testing and verification are most important for any validation and qualification. For a software project, testing and verification are done throughout all life cycle phases. The goal is to detect errors, if any, as early as possible. Requirements specifications and the design are reviewed or inspected during the definition and design phases, and the code is tested and may be formally inspected by the programmers during code implementation. Proper functioning of the software together with the equipment hardware is verified in the test phase and during operation.

Types of testing

Software testing can be classified as being either structural (white box) or functional (black box) (Fig. 3).

Fig. 3 Testing and verification are done throughout all life cycle phases

Fig. 4 Test coverage matrix

Structural testing (white box) of software is the detailed examination of the internal structure of code, including low- and high-level path analysis, branch flow analysis and inspection for adherence to software development procedures. It tests logical paths through the software by providing test cases that exercise specific sets of conditions. Besides the source code, other documentation, such as logic diagrams, branch flow analysis reports, descriptions of modules, definitions of all variables, specifications, and test suites of all inputs and outputs, is required.

Functional testing (black box) of software evaluates the outputs of a program compared to the expected output values for a range of input values. For a computer-controlled analytical system, functional testing should always include the analytical hardware to verify proper parameter communication and data flow. Source code is not required, but a full set of system specifications and a description of functional routines, such as calibration algorithms, must be available.

Structural testing is done in the development department and starts during the implementation phase. Code modules are checked individually by the programmers and may be formally inspected by peers, if appropriate. Modules are tested by programmers with specific test suites and then linked together into a system and tested as a whole for proper functionality, to ensure that designs are correctly implemented and the specified requirements satisfied.

Written in advance, the test plan defines all test procedures with their pass/fail criteria, expected test results, test tasks, the test environment for equipment and computers, criteria for acceptance and release to manufacturing, and the persons responsible for conducting these tests. The test plan also specifies those functions excluded from testing, if any. Individual tasks cover functional testing, simulation of incomplete functions as integration proceeds, mathematical proof of results, records of discrepancies, classification of defects and corrective actions.

A test coverage matrix (Fig. 4) is created which shows the linkages of test cases to the design specifications and to the external (functional) and system (user) requirement specifications. This matrix ensures that all functions are tested through all phases.
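As an illustration of the idea (hypothetical requirement and test-case identifiers, not HP's actual tooling), such a matrix can be represented and checked mechanically:

```python
# Minimal test-coverage-matrix check (illustrative only).
# Keys are requirement IDs from the user/functional specifications;
# values are the test cases that exercise each requirement.
coverage = {
    "REQ-001 buffer table editing": ["TC-12", "TC-13"],
    "REQ-002 data acquisition":     ["TC-21"],
    "REQ-003 report generation":    [],          # not yet covered
}

def uncovered(matrix):
    """Return requirement IDs that no test case exercises."""
    return [req for req, cases in matrix.items() if not cases]

if __name__ == "__main__":
    missing = uncovered(coverage)
    if missing:
        print("Requirements without test cases:", ", ".join(missing))
    else:
        print("Every requirement is linked to at least one test case.")
```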

α-Testing

After the programmers have completed the engineering tests, the system undergoes functional testing under typical operating conditions, so-called α-testing. Over several weeks, groups of chemists and other professionals conduct the testing, using test cases defined for each person in a test book that must be signed off by the individuals on completion of the test. The objective is to test the complete computerized system for functionality, usability, reliability, performance and supportability as stated in the requirement specifications document.

Test cases that will be handed out to the test person are prepared by the quality assurance (QA) department together with the development team. The documents include the scope of the test with a link to the requirement specifications document, the system configuration, background information on the function to be tested, a detailed test description and the expected results. During and after testing, the test person completes the sheet with his name, the actual results and any comments on findings.

The system is tested not only under typical operating conditions but also at the limits under which it will be required to operate – an approach known variously as worst-case testing, gray-box testing or testing of boundary conditions. Testing boundary conditions is important because most software errors occur around the boundary limits. Combinations of several worst cases are also tested. For example, if a system is specified to acquire data from multiple instruments and the data acquisition rate can be varied, test cases include acquisition from the maximum number of instruments at the highest data rate.
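A minimal sketch of boundary-condition testing, assuming a hypothetical acquisition function specified for at most four instruments at up to 20 Hz (the names and limits are invented for illustration):

```python
# Illustrative boundary-condition tests for a hypothetical data-acquisition
# function specified for at most MAX_INSTRUMENTS at up to MAX_RATE_HZ.
MAX_INSTRUMENTS = 4
MAX_RATE_HZ = 20.0

def acquire(n_instruments, rate_hz):
    """Hypothetical stand-in: reject parameters outside the specification."""
    if not (1 <= n_instruments <= MAX_INSTRUMENTS):
        raise ValueError("instrument count outside specification")
    if not (0.1 <= rate_hz <= MAX_RATE_HZ):
        raise ValueError("data rate outside specification")
    return True

# The worst case inside the specification must succeed ...
assert acquire(MAX_INSTRUMENTS, MAX_RATE_HZ)

# ... while inputs just beyond each boundary must be rejected gracefully
# with an error rather than corrupting data or crashing.
for bad in [(MAX_INSTRUMENTS + 1, MAX_RATE_HZ), (MAX_INSTRUMENTS, MAX_RATE_HZ * 2)]:
    try:
        acquire(*bad)
    except ValueError:
        pass  # expected: graceful rejection at the boundary
    else:
        raise AssertionError("out-of-range input was accepted: %r" % (bad,))
```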

Software testing also includes so-called stress testing. Inputs with inappropriate character types (alphabetic characters instead of numeric ones, for example, or inappropriate character length and character composition) are made, and instrument parameters that lie outside the instrument's operational limits are entered. The expectation is that these inputs will not damage data or disrupt system and software operation, and that the system will recover after producing error messages.

Table 1 Extract from an α-test summary sheet, with columns for version (Vxxx to Vyyy), test time (hours), defects, cumulative sum of defects, defect discovery rate (defects/hour) and line fit

The test environment reflects as many system configurations as possible. This includes different equipment that is controlled by the computer, different peripherals, such as printers and CD-ROM drives, different amounts of internal memory (RAM), and different operating systems, for example, Windows 95 and NT.
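The combinations making up such an environment matrix can be enumerated mechanically; the axes below are illustrative stand-ins for the configurations named above:

```python
from itertools import product

# Illustrative configuration axes for an environment test matrix.
operating_systems = ["Windows 95", "Windows NT"]
ram_mb = [16, 32, 64]
peripherals = ["printer", "CD-ROM"]

# Every combination becomes one test environment to be covered.
for env in product(operating_systems, ram_mb, peripherals):
    print("test environment:", env)
```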

Test cases reflect typical user applications with manual interactive and automated operation. Automated sequences typically run over 24 h or more, where methods are changed between runs. Data files of different sizes are generated to make sure that the system can handle large files.

The user manual is prepared before the α-test to allow test personnel to verify its accuracy and usefulness. At least one test case requires installation of the software and hardware according to the installation instructions.

Defect tracking system

Documenting software errors is important, and problems should not be casually reported for repair by the programmer on an ad hoc basis. Problems found during testing are tracked using the HP internal defect control system (DCS).

Defects are classified by the test person as low, medium, serious or critical. Defect density, discovery rate and defect summaries are recorded for each test cycle and for the entire α-test. Summary sheets include information on version number, test hours, number of defects, total number of defects, defects per hour and the linear fit. An example is shown in Table 1.

Fig. 5 Example of Shooman plots and reliability model, with defect discovery rate versus number of defects. The total number of defects is estimated for the average (AVG), best and worst case

The test results are evaluated using the Shooman plot [10]. The discovery rate is plotted versus the total number of defects discovered (Fig. 5). A linear regression fit is calculated and plotted together with maximum and minimum fits, which by definition have a confidence interval of 5%. From the Shooman reliability model (Fig. 5), the number of remaining defects can be estimated. This information is useful to forecast the number of test cycles that are still necessary and a possible release date.
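The extrapolation described here is easy to reproduce. In the sketch below, the discovery rate is regressed on the cumulative number of defects, and the x-intercept of the fitted line estimates the total defect count; the numbers are illustrative only, not data from Table 1:

```python
import numpy as np

# Cumulative defects found and the discovery rate (defects/hour)
# observed in successive test cycles -- illustrative numbers only.
cum_defects = np.array([80.0, 149.0, 232.0])
rate = np.array([0.94, 0.86, 0.82])

# Fit rate = a + b * cum_defects (b < 0 as the code is cleaned up).
b, a = np.polyfit(cum_defects, rate, 1)

# The fitted rate reaches zero at the estimated total defect count.
total_estimated = -a / b
remaining = total_estimated - cum_defects[-1]
print(f"estimated total defects: {total_estimated:.0f}, "
      f"still to find: {remaining:.0f}")
```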

The number of critical defects after the last test cycle must be zero for the software to pass the release criteria. This and other release criteria are specified in the quality plan.

β-Testing

Once software defects and usability discrepancies reported during α-testing have been corrected, the software may be tested at selected customers' sites (the so-called β-test). The key feature of β-testing is that it is conducted in a customer environment and supervised by a person not involved in the development process. One of the objectives of β-testing is to test the HP product delivery and support channel. A trained HP applications engineer (AE) assists the customer with installation and checks the software installation procedure.

Deliverables for the test phase include:
– Test plans with acceptance criteria and test cases
– Test results
– Validation documents
– Defect density report
– User training material
– System status bulletin (SSB).

Release for production and installation

After the testing is complete and the code is corrected for errors, the software is released for production and distribution. The product is considered ready for release when it has met all the criteria specified in the quality plan and after formal sign-off by product line, quality assurance and manufacturing management. A prerequisite for this is sufficient training of service engineers, who must be able not only to install and operate the software but also to train users and answer users' questions. Availability of user documentation in the form of on-line help and printed reference material is also a release criterion.

The manufacturing department ships the product in accordance with manufacturing guidelines, based on the receipt of valid purchase orders. The product documentation includes a Declaration of System Validation with statements from Hewlett-Packard that the software was developed and tested according to the Hewlett-Packard Analytical Software Life Cycle, a process that has been certified for compliance with the ISO 9001 quality standard and with the Information Technology Quality System (ITQS), and that has been inspected by representatives from both computer validation companies and the pharmaceutical industry.

Summary

Software development and validation activities follow the product life cycle concept, where development is divided into different phases. Each phase is completed, reviewed and signed off by management before the next phase starts. The life cycle starts with a proposal and investigation phase where the user requirements and functional specifications (external reference specifications) are set. During the design phase, the design specifications that define the internal code structure and formulas are developed. In the implementation phase, the code that meets the user requirements and design specifications is written. During all phases, the documents are formally reviewed by peers, and the correct function of the software is tested by anticipated users in the test phase. Test activities follow a test plan with test cases for each function as specified in the external reference specifications document. Software release for distribution follows criteria which are specified in a quality plan.

References

1. Bedson P, Sargent M (1996) Accred Qual Assur 1:265–274
2. Freeman HM, Leng M, Morrison D, Munden RP (1995) Pharm Tech Europe 40–46
3. Huber L, Equipment qualification in practice (to be published)
4. Burgess C (1996) Eur Pharm Rev 52–54
5. Huber L, Welebob L (1997) Accred Qual Assur 2:316–322
6. DIN EN ISO 8402 (1995) Definitions: quality management, in English, French and German. Beuth, Berlin, Germany
7. Huber L (1996) LC-GC Magazine 14:502–508
8. Huber L (1995) Validation of computerized analytical instruments. Interpharm, Buffalo Grove, IL, USA. ISBN 0-935184-75-9, Hewlett-Packard part number 5959-3879
9. OECD (1992) Compliance of laboratory suppliers with GLP principles. Series on principles of good laboratory practice and compliance monitoring, No. 5, GLP consensus document, Environment Monograph No. 49, Paris
10. Shooman M (1973) Operational testing and software reliability estimation during program development. Proc Int Symp Computer Reliability, pp 51–57


Accred Qual Assur (1997) 2:375–380 © Springer-Verlag 1997 GENERAL PAPER

M. Buzoianu · Hassan Y. Aboul-Enein

Clinical reference materials for the validation of the performance of photometric systems used for clinical analyses

Received: 23 April 1997 / Accepted: 7 July 1997

M. Buzoianu, National Institute of Metrology, Sos. Vitan-Bârzesti No. 11, 75669 Bucharest, Romania. Tel.: +40-1-6344030; Fax: +40-1-3301533

H. Y. Aboul-Enein (✉), Biological and Medical Research Department (MBC-03), King Faisal Specialist Hospital & Research Centre, P.O. Box 3354, Riyadh 11211, Saudi Arabia. Fax: +966-1-4427858; e-mail: enein@kfshrc.edu.sa

Abstract There is a wide variety of spectrophotometric devices nowadays used in health services, with various qualities of manufacture, methods of measurement and metrological characteristics for performing the necessary measurements. Therefore, to meet the accuracy and repeatability requirements needed in medical diagnosis and treatment, the validation of the performance of such systems by clinical chemistry laboratories is essential. However, the validation of a spectrophotometric system for clinical analyses requires several reference materials, according to the end use of the measurement results. This paper discusses some characteristics required of the clinical reference materials needed and used by the Romanian Institute of Metrology for validation work. Types of clinical reference materials developed in the national area for this purpose are also presented.

Key words Clinical spectrophotometry · Quality assurance · Validation · Reference materials · Spectrophotometric standards

Introduction

The measurement of components of biological samples faces several problems which influence the quality of the final outcome. Thus, quality control becomes a practice of increasing importance in clinical chemistry.

As is well known, the quality of a measurement result depends on different factors, such as the sampling, the measurement method, the instrumentation and the reference materials used for the calibration, etc. This is why it is important that all critical parameters of the instrument and the reference materials should be calibrated and regularly controlled. Experience has shown that some manufacturers may not offer sufficient field calibration and testing, and may not be willing to release calibration functions or allow access to raw measurement data. So, as the complexity of instruments increases, it is essential to establish other means to reliably calibrate and control instrumentation. On the other hand, as the instruments used in clinical chemistry laboratories are most diverse and have different degrees of accuracy and repeatability, the validation of the photometric devices used for human blood analyses is very important and currently a subject of considerable interest.

According to the Romanian laws and ordinances issued recently in the field of metrology, all the instruments used in public health are subject to pattern approval, to verification or to compulsory calibration. In this framework, the validation work has a special place.

Generally, validation is a process by which a sample, measurement method, instrument, or piece of data is deemed to be useful for a specific purpose.

The validation of an instrument demands a procedure and a number of standards (reference materials included) with the appropriate characteristics. The definition and use of the standards and reference materials available to clinical chemistry is still a problem in this field. Although there are several clinical reference materials, the problems of using them should not be underestimated (stability, preservation, contamination, use of aqueous solutions, etc.).


A large number of routine clinical chemistry laboratories perform analyses with multichannel analyzers of both the continuous-flow and the direct-sample type. Various types of spectrophotometric reference materials have been recommended to validate the photometric accuracy and linearity, wavelength accuracy or stray light radiation of photometric systems used for clinical analyses. In this respect, much has been done in the national area. The problem we are facing now concerns the reference materials required for the validation of the concentration accuracy of this instrumentation, so widely used in clinical chemistry laboratories.

Aqueous standard solutions containing a mixture of all the substances commonly determined in the routine clinical chemistry laboratory are not available. Multichannel systems are therefore calibrated with commercially supplied calibration sera for which the values for each analyte have been assigned by the manufacturers. The same manufacturer also provides control sera for the validation of the system. Wide differences may occur between values obtained on the same analyzer with sera from different manufacturers, and even between lots from the same manufacturer (due to variation in the weight or homogeneity of the material). In addition, the 'certificates' of the calibration and/or control sera often do not provide information about uncertainties, the method of measurement or the influence parameters. Usually, only the target value, or the range within which it lies, is indicated for each analyte.

The role of the Standard Reference Materials of the National Institute of Standards and Technology (NIST SRMs) is clearly defined in the validation work. An alternative is to use certified sera whose values have been assigned using NIST SRMs. It was of interest to find that the major requirements for concentration reference materials used to validate the photometric systems for clinical analyses have not been well specified in major regulatory guidelines [1]. It seemed appropriate, therefore, in this communication to present some aspects of the Romanian experience on this topic.

The validation of the instrumental performance of photometric systems used for clinical analyses

The quality of the end result fundamentally depends on the sample, the data, the measurement process, the transformation of data to information and, finally, the display and transformation of the information. It is obvious that the role of the instrument in providing the integrity of data is also fundamental to the end result. Written requirements for instrumental performance are not sufficient for assuring the reliability of the result unless they are tested and validated.

The technical specifications identified and described by most of the manufacturers of absorption photometers for medical use include wavelength accuracy, spectral half-width of the spectral radiation flux at the detector, photometric accuracy, percentage of wavelength integrated, false radiation, and photometric short-time repeatability. As discussed previously [2], the Instrumental Performance Validation Procedures issued by serious manufacturers of analytical instruments indicate the methods and the reference materials required to test and to maintain optimum spectrometer performance in daily routine analysis.

When using automatic continuous-flow instrumentation, identifying and describing the performance of the photometric system is rather more difficult. The basic photometric requirements seem clear enough:
1. Given a chemistry which is linear, it is expected that the photometric output will exhibit basic conformance to the Beer-Bouguer law over the absorbance range 0–2.
2. The sensitivity should be such that the desired or optimum absorbance-concentration relationship is achieved and sustained from determination to determination.

In our experience, the various automatic photometric systems yield adequate linear calibration lines through an absorbance of one. However, linearity should not be considered apart from sensitivity, since considerable flexibility exists with regard to the possible ratios of sample to reagent. In this respect, identification of such performance requirements is neither a sufficient prerequisite nor a pragmatic way to validate the instrument, since the automatic flow and the specific method and reagents used prevent testing of the accuracy and repeatability of the photometric system in routine work. Furthermore, the output signal displayed is a concentration value instead of a photometric quantity. As recommended in [3], the evaluation of instruments for automatic analysis in clinical biochemistry laboratories should take into account the accuracy and repeatability of the concentration measurement using at least five different analytes from appropriate sera reference materials.

For this reason, it is necessary to focus on the meaning of the term 'appropriate' when applied to reference materials, and also on the practical procedure in the metrological assurance of clinical concentration measurements in the national area.

Clinical reference materials for the validation of the performance of photometric systems used for clinical analyses

With their values known, according to De Bièvre [4], clinical reference materials are designated:

Fig. 1 Measurement uncertainty components for the determination of potassium by flame photometry (S1), calcium by atomic absorption spectrometry (S2), magnesium by molecular spectrophotometry (S3) and glucose by molecular spectrophotometry (S4)

– to validate a measurement procedure (i.e. to assure that the procedure, which includes chemical preparation, instrument calibration, measurement, data acquisition and data treatment, is suitable and performs properly);
– to validate the measurement instrument (i.e. to assure that it provides, properly and reproducibly, stable and accurate information);
– to validate the data acquisition and data treatment of the measurement procedure (i.e. to assure that the associated software leads to the proper data).

This validation process, required prior to measuring any unknown sample, enables the evaluation of the uncertainty contributions to the measurement procedure which are present in the measurement of real-life samples.

Photometric devices used for clinical analyses are calibrated against calibration sera, usually supplied commercially with the instrument, for which the values for each analyte have been assigned by the manufacturers. Wide differences may occur between values obtained on the same device with sera from different manufacturers, and even between lots from the same manufacturer (due to variation in the weight or homogeneity of the material). On the other hand, great care should be taken in the use of the information concerning the exact method used to assign values to the material.

SRMs supplied by NIST make the verification of commercial calibration sera possible if referee methods of analysis are available for use in conjunction with the SRMs, so the first important issue here is how to achieve traceability of the certified value of the clinical reference materials to the SI unit. Several traceability schemes for reference materials in analytical measurements have been recommended [4, 5]. Closely connected is the problem of the evaluation of the uncertainty of the clinical reference material. In this respect, a major aspect is the significance of this uncertainty for medical treatment or diagnosis, as it is widely recognized that it is rather difficult to state clearly and unambiguously the limit of accuracy required for a clinical result.

An attempt to evaluate the contribution of the uncertainty of the reference material to the overall uncertainty of the photometric measurement process is illustrated in Fig. 1. Note that the steps followed for evaluating and expressing the measurement uncertainty are in accordance with the metrological approach recently described by Buzoianu and Aboul-Enein [2].

Figure 1 represents four examples of the evaluation of measurement uncertainty, for potassium, calcium, magnesium and glucose, using flame photometry, atomic absorption spectrometry and molecular spectrometry (Mg determination with Titan Yellow and glucose determination with glucose oxidase). For the sake of simplicity, the component of uncertainty due to sampling is not represented in Fig. 1. Thus, the relative concentration uncertainties (uc/c) due to the photometric devices (1), to the reference materials used for calibration (2), to the calibration (3) and to the volumetric means of measurement (4) are illustrated.

It should be noted that when we used methods of measurement needing inorganic reference materials for calibration (such as flame photometry or atomic absorption spectrometry), the uncertainty due to the reference materials was considerably lower than that due to the photometric device. On the contrary, when we used a clinical reference material certified for its glucose concentration with a 10% (rel) uncertainty, this uncertainty exceeded twice the uncertainty due to the spectrophotometric device. When we determined Mg by a spectrophotometric method with Titan Yellow, we found that the uncertainty due to the reference material was approximately twice that due to the device, as we used a very accurate spectrophotometer.

The clinical reference materials developed for the validation of the performance of photometric systems

The major purposes followed by the Romanian Institute of Metrology in the development of synthetic clinical reference materials were: (1) to be able to handle large quantities of base material in a short time, (2) to prepare samples of correct and adequate size, (3) to prepare samples of the highest quality, and (4) to ensure, as far as possible, the traceability of the clinical measurements directly to SI units.

Table 1 Concentration values: synthetic-type reference material sera

Element   Concentration in mmol/dm3
          Type 14.01   Type 14.02   Type 14.03   Type 14.04
Na        148.5        126.5        158.4        134.4
K         3.7          6.2          4.2          3.5
Ca        2.3          3.3          3.6          3.0
Cl        170.0        150.0        180          150.0
Mg        0.86         1.8          1.4          1.2
Glucose   5.4          16.5         7.4          6.3
Urea      5.5          19.5         11.4         9.6

Table 2 Concentration values: synthetic-type reference material. Grav. M, gravimetric method; Flamp. M, flame photometry; Ion S. M, ion-selective method; Cont. F, continuous flow; Spp. M, spectrophotometric method; AAS. M, atomic absorption spectrometry; Vol. M, volumetric method; Gl. ox. M, glucose oxidase method; o-Tol. M, ortho-toluidine method

Element  Method      Concentration in mmol/dm3
                     Type 14.01   Type 14.02   Type 14.03   Type 14.04
Na       Grav. M     148.8±0.4    126.8±0.2    158.7±0.4    134.4±0.3
         Flamp. M    148.0±4.0    126.0±2.0    158.0±7.0    135.0±3.0
         Ion S. M    149.5±5.0    127.0±2.0    159.0±9.0    136.2±3.0
         Cont. F     149.0±4.0    127.3±3.0    159.9±8.0    137.0±4.0
K        Grav. M     3.70±0.30    6.20±0.20    4.20±0.20    3.50±0.20
         Flamp. M    3.72±0.20    6.10±0.50    4.30±0.30    3.70±0.20
         Ion S. M    3.75±0.20    6.05±0.40    4.15±0.20    3.65±0.20
         Cont. F     3.85±0.30    6.15±0.50    4.50±0.30    3.55±0.20
Ca       Grav. M     2.30±0.20    3.30±0.10    3.60±0.10    3.00±0.12
         Spp. M      2.40±0.10    3.28±0.10    3.75±0.10    3.10±0.10
         Cont. F     2.60±0.04    3.35±0.06    3.20±0.06    3.09±0.05
         AAS. M      2.26±0.30    3.25±0.30    3.55±0.30    2.90±0.30
         Ion S. M    2.30±0.08    3.40±0.10    3.65±0.10    3.15±0.09
         Vol. M      2.12±0.10    3.20±0.20    3.50±0.20    2.85±0.20
Mg       Grav. M     0.86±0.02    1.80±0.02    1.41±0.01    1.21±0.01
         Spp 1. M    0.84±0.07    1.79±0.08    1.38±0.06    1.19±0.07
         Spp 2. M    0.90±0.07    1.87±0.08    1.42±0.06    1.22±0.07
         Ion S. M    0.86±0.04    1.85±0.04    1.44±0.03    1.27±0.03
         AAS. M      0.88±0.15    1.83±0.15    1.39±0.15    1.25±0.15
Gl       Grav. M     5.40±0.05    16.70±0.20   7.47±0.08    6.36±0.06
         Gl. ox. M   5.43±0.58    14.63±1.00   7.73±0.65    6.60±0.67
         o-Tol. M    5.60±0.34    15.98±0.64   7.85±0.48    6.73±0.43
         Cont. F     5.27±0.30    16.26±0.41   7.22±0.38    6.25±0.32


Preparation of the clinical reference materials

Four types of reference materials for the validation of the instrumental performance of photometric devices used for clinical analyses were gravimetrically prepared, under the responsibility of the Romanian National Institute of Metrology, from high-purity reagents, as a synthetic material composed of deionized water (electrical conductivity 0.5 μS/cm, sterilised by a 0.2-μm filter and a UV continuous-flow filter) and suitable inorganic salts containing the declared cations and metals, and organic salts, with chloride anions. In general, Certified Reference Materials are expensive consumables. Because of the risk that samples may suffer from instability or degradation, or become contaminated, the RMs developed were bottled in 10-ml ampoules made of purified and pre-treated glass, and were intended for a limited term (one year maximum). In fact, the stability of each type of reference material was checked over 20 months.

The concentrations, in mmol/l, of the different components were checked by classical visible (VIS) spectrophotometry, ion-selective electrometry, atomic absorption spectrometry, continuous flow and flame photometry. The matrix and the molar concentrations of the reference materials developed are shown in Table 1.

Note that the molar concentration values are relevant to decision levels where accurate measurement is crucial for medical decisions. The category of synthetic-type reference material sera needs to be certified by means of primary methods. According to international regulations [6], the certification of the clinical reference materials followed several steps regarding the preparation of the material, homogeneity testing, performance of interlaboratory analyses, and assignment of the certified value and uncertainty.


For each analyte, the mean concentrations and their 95% confidence intervals were evaluated for each method employed, because the differences among the results obtained by the individual methods were significant. The assigned concentration values were established based on the interpretation of the data with respect to the methods of preparation and measurement (taking into account contributions of a statistical as well as a systematic nature). Accordingly, the mean values assigned for the different methods of measurement for five selected analytes from the synthetic sera reference materials developed are presented in Table 2.

Evaluation and uncertainty

The uncertainties of the assigned values were then evaluated [7] on the basis of the estimation of the Type A and Type B uncertainty components, due to the possible errors originating principally from the preparation as well as from the direct measurement (method included), as follows:

$$U_c = k\,\sqrt{u_A^{2}(c) + u_B^{2}(c)} \qquad (1)$$

where $U_c$ is the expanded uncertainty of the concentration ($c$) value and $k$ is the coverage factor. For an input quantity determined from independent repeated observations, the uncertainty of the estimate is the Type A standard uncertainty:

$$u_A(c) = \sqrt{\frac{1}{n(n-1)}\sum_{i=1}^{n}\left(c_i - \bar{c}\right)^{2}} \qquad (2)$$

where $c_i$ is the value of the measured concentration and $\bar{c}$ is the mean of the $n$ measurements performed under repeatability conditions.

For an estimate of an input quantity that has not been obtained from repeated observations, the associated estimated variance or standard uncertainty is evaluated by scientific judgement based on all the available information on its possible variability. This is the case of the Type B standard uncertainty. The pool of information may include previous data, experience with or general knowledge of the behaviour of relevant materials and instruments, manufacturers' specifications, data provided in calibration and other certificates, or uncertainties assigned to reference data taken from handbooks.

$$u_B(c) = \sqrt{\sum_{j=1}^{m} c_{c,x_j}^{2}\, u_{B,x_j}^{2}} \qquad (3)$$

where $c_{c,x_j}$ are the functions describing how the concentration estimate depends on the influence factors $x_j$, which in turn depend on the method used to certify each element of the reference material developed (such as mass measurement, volume measurement, spectrophotometric measurement, reference materials used for the calibration, etc.), and $u_{B,x_j}^{2}$ are the uncertainties associated with each of these functions.

Fig. 2 Analytical data obtained in the interlaboratory comparison of RM types 14.01–14.04

Each standard uncertainty for all the input quantities, evaluated by a Type A or a Type B procedure, was combined in the correct mathematical manner to evaluate the combined standard uncertainty $u_c(y)$, which characterizes the dispersion of the values that could reasonably be attributed to the considered measurand.

The additional measure of uncertainty that meets the requirement of providing such an interval is termed the expanded uncertainty, and it was obtained by multiplying the combined standard uncertainty by a coverage factor k (k = 2). Thus, the uncertainties obtained for the values assigned for sodium, potassium, calcium, magnesium and glucose concentration in the reference materials, for each method of measurement used, are given in Table 2. Narrower uncertainty limits were obtained for these conventionally certified values in comparison to the acceptable ranges stated for commercially available control sera of the same matrix. Furthermore, a comparison was made of the analyses of the synthetic clinical reference materials carried out by 15 selected laboratories (chosen according to their equipment and training in the field of clinical chemistry and to the geographic coverage area) using their respective routine analytical methods and instrumentation. Each laboratory tested the synthetic sera as unknown samples after calibration of their specific measurement means with control sera supplied from different sources. Whenever possible, the flame photometers, photometers and spectrophotometers were previously tested in accordance with the Legal Metrology Norms, issued in accordance with the requirements of the International Organization for Legal Metrology (OIML). The results were graphically compared with the assigned values as shown in Fig. 2. Note that the uncertainty of an assigned value is considerably smaller than the interlaboratory spread. The results showed the following spread: less than ±4.9% for Na, ±19% for K, ±26.1% for Ca, ±18.6% for Mg and ±15.6% for glucose, asymmetrically distributed around the assigned values. The graphs indicate the best possible result obtainable on the assigned value and what is obtained in practice under routine conditions. All methods used produced results with roughly the same spread. It may be seen that all the elements tested show some outliers, first on the high and then on the low concentration side.

No method was found to be superior. The best method used can give wrong results if incorrectly applied, even in the experimental laboratories. Therefore, it is better to impose good quality on a laboratory than to impose a specific method.

The main reasons for the current lack of comparability at the working level include an insufficient awareness of uncertainty and sources of error, a lack of high-quality reference materials, and no recognized system for intercomparison of traceable clinical chemistry measurements. In addition, these limit results were obtained in the absence of a reliable uncertainty budget and of sufficient QA procedures. In fact, in the national area the introduction of adequate QA procedures is only taking its first steps.

Conclusion

This paper discusses a number of practical problems arising from the request for, and use of, clinical reference materials for the validation of the performance of photometric systems used in national clinical chemistry laboratories. It shows that uncertainties in the measurement step of photometric analysis have largely been ignored. Uncertainties associated with this step can and do contribute significantly to the overall analytical uncertainty. Thus, for a knowledge of trueness and measurement uncertainty, an adequate certified reference materials system and an attempt at a traceability chain are of the utmost importance, since the quality of clinical chemistry results depends critically on the use of reliable reference materials and properly validated instruments.

References

1. Bronder Th, Rinneberg H (1992) PTB Laborbericht, proposal for international recommendation
2. Buzoianu M, Aboul-Enein HY (1997) Accred Qual Assur 2:11–17
3. Broughton PM, Buttolph MA, Gowenlock AH, Neill DW, Scentelbery RG (1969) J Clin Path 22:278–284
4. De Bièvre P (1996) Accreditation and quality assurance in analytical chemistry. Springer, Berlin Heidelberg New York, pp 159–193
5. Marschal A (1980) Bull BNM 39:33–39
6. Guide ISO 35 (1989) Certification des matériaux de référence – Principes généraux et statistiques
7. Guide to the expression of uncertainty in measurement (1993) International Organization for Standardization (ISO), Geneva


Accred Qual Assur (1998) 3:6–10 GENERAL PAPER

Stephen L.R. Ellison · Alex Williams

Measurement uncertainty and its implications for collaborative study method validation and method performance parameters
Received: 22 September 1997 / Accepted: 6 October 1997

S.L.R. Ellison (✉), Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

A. Williams, 19 Hamesmoor Way, Mytchett, Camberley, Surrey, GU16 6JG, UK

Abstract ISO principles of measurement uncertainty estimation are compared with protocols for method development and validation by collaborative trial and concomitant "top-down" estimation of uncertainty. It is shown that there is substantial commonality between the two procedures. In particular, both require a careful consideration and study of the main effects on the result. Most of the information required to evaluate measurement uncertainty is therefore gathered during the method development and validation process. However, the information is not generally published in sufficient detail at present; recommendations are accordingly made for future reporting of the data.

Introduction

One of the fundamental principles of valid, cost-effective analytical measurement is that methodology should demonstrably fit its intended purpose [1]. Technical fitness for purpose is usually interpreted in terms of the required "accuracy". To provide reasonable assurance of fitness for purpose, therefore, the analyst needs to demonstrate that the chosen method can be correctly implemented and, before reporting the result, needs to be in a position to evaluate its uncertainty against the confidence required.

Principles for evaluating and reporting measurement uncertainty are set out in the 'Guide to the expression of uncertainty in measurement' (GUM) published by ISO [2]. EURACHEM has also produced a document "Quantification of uncertainty in analytical measurement" [3], which applies the principles in this ISO Guide to analytical measurements. A summary has been published [4]. In implementing these principles, however, it is important to consider whether existing practice in analytical chemistry, based on collaborative trial [5–7], provides the information required. In this paper, we compare existing method validation guidelines with published principles of measurement uncertainty estimation, and consider the extent to which method development and validation studies can provide the data required for uncertainty estimation according to GUM principles.

Measurement uncertainty

There will always be an uncertainty about the correctness of a stated result. Even when all the known or suspected components of error have been evaluated and the appropriate corrections applied, there will be uncertainty on these corrections and there will be an uncertainty arising from random variations in end results.

The formal definition of "Uncertainty of Measurement" given by the GUM is "A parameter, associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand. Note (1): The parameter may be, for example, a standard deviation (or a given multiple of it) or the half width of an interval having a stated level of confidence."

For most purposes in analytical chemistry, the "measurand" is the concentration of a particular species. Thus, the uncertainty gives a quantitative indication of the range of the values that could reasonably be attributed to the concentration of the analyte and enables a judgement to be made as to whether the result is fit for its intended purpose.

Received: 22 September 1997 / Accepted: 6 October 1997

S.L.R. Ellison (✉), Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK

A. Williams, 19 Hamesmoor Way, Mytchett, Camberley, Surrey, GU16 6JG, UK


Fig. 1


Uncertainty estimation according to GUM principles is based on the identification and quantification of the effects of influence parameters, and requires an understanding of the measurement process, the factors influencing the result and the uncertainties associated with those factors. These factors include corrections for duly quantified bias. This understanding is developed through experimental and theoretical investigation, while the quantitative estimates of relevant uncertainties are established either by observation or from prior information (see below).

Method validation

For most regulatory applications, the method chosen will have been subjected to preliminary method development studies and a collaborative study, both carried out according to standard protocols. This process, and subsequent acceptance, forms the 'validation' of the method. For example, the AOAC/IUPAC protocol [5, 6] provides guidelines for both method development and collaborative study. Typically, method development forms an iterative process of performance evaluation and refinement, using increasingly powerful tests as development progresses, and culminating in collaborative study. On the basis of the results of these studies, standard methods are accepted and put into use by appropriate review or standardisation bodies. Since the studies undertaken form a substantial investigation of the performance of the method with respect to trueness, precision and sensitivity to small changes and influence effects, it is reasonable to expect some commonality with the process of uncertainty estimation.

Comparison of measurement uncertainty and method validation procedures

The evaluation of uncertainty requires a detailed examination of the measurement procedure. The steps involved are shown in Fig. 1. This procedure involves very similar steps to those recommended in the AOAC/IUPAC protocol [5, 6] for method development and validation, shown in Fig. 2. In both cases the same processes are involved: step 1 details the measurement procedure, step 2 identifies the critical parameters that influence the result, step 3 determines, either by experiment or by calculation, the effect of changes in each of these parameters on the final result, and step 4 their combined effect.

The AOAC/IUPAC protocol recommends that steps 2, 3 and 4 be carried out within a single laboratory, to optimise the method, before starting the collaborative trial. Tables 1 and 2 give a comparison of this part of the protocol [6] with an extract from corresponding parts of the EURACHEM Guide [3]. The two procedures are very similar.

Fig. 2


Table 1 Method development and uncertainty estimation

Method validation¹:
1.3.2 Alternative approaches to optimisation
(a) Conduct formal ruggedness testing for identification and control of critical variables.
(b) Use Deming simplex optimisation to identify critical steps.
(c) Conduct trials by changing one variable at a time.

Uncertainty estimation:
Having identified the possible sources, the next step is to make an approximate assessment of the size of the contribution from each source, expressed as a standard deviation. Each of these separate contributions is called an uncertainty component. Some of these components can be estimated from a series of repeated observations, by calculating the familiar statistically estimated standard deviation, or by means of subsidiary experiments which are carried out to assess the size of the component. For example, the effect of temperature can be investigated by making measurements at different temperatures. This experimental determination is referred to in the ISO Guide as "Type A evaluation".

¹ Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694–704. Copyright 1989, by AOAC INTERNATIONAL, Inc.
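As a minimal illustration of the "Type A evaluation" just quoted, the sketch below estimates a standard uncertainty from a series of repeated observations; the replicate values are invented for illustration.

```python
import statistics

# Type A evaluation (a minimal sketch): estimate an uncertainty component
# from n repeated observations. The replicate values are illustrative.
replicates = [10.12, 10.08, 10.15, 10.11, 10.09, 10.13]  # e.g. mmol/L

n = len(replicates)
s = statistics.stdev(replicates)   # experimental standard deviation
u = s / n ** 0.5                   # standard uncertainty of the mean

print(f"mean = {statistics.mean(replicates):.3f}, s = {s:.4f}, u = {u:.4f}")
```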

Section 1.3.2 of the method validation protocol is concerned with the identification of the critical parameters and the quantification of the effect on the final result of variations in these parameters; the experimental procedures (a) and (c) suggested are closely similar to experimental methodology for evaluating the uncertainty. Though the AOAC/IUPAC approach aims initially to test for significance of change of result within specified ranges of input parameters, this should normally be followed by closer study of the actual rate of change in order to decide how closely a parameter need be controlled. The rate of change is exactly what is required to estimate the relevant uncertainty contribution by GUM principles. The remainder of the sections in the extract from the protocol give guidance on the factors that need to be considered; these correspond very closely to the sources of uncertainty identified in the EURACHEM Guide. The data from method development studies required by existing method validation protocols should therefore provide much of the information required to evaluate the uncertainty from consideration of the main factors influencing the result.
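Where ruggedness testing has established such a rate of change, its conversion into an uncertainty contribution can be sketched as follows; the model function, parameter and all numerical values are illustrative assumptions, not taken from the protocol.

```python
# A sketch of turning a ruggedness "rate of change" into an uncertainty
# contribution: the sensitivity coefficient c = dy/dx is estimated
# numerically and multiplied by the standard uncertainty of the parameter.
def result(temperature_c: float) -> float:
    # stand-in for the measurement result as a function of one parameter
    return 100.0 + 0.8 * (temperature_c - 25.0)

x0 = 25.0   # nominal temperature, degrees C
u_x = 0.5   # standard uncertainty of the temperature
dx = 0.1    # small step for the numerical derivative

c = (result(x0 + dx) - result(x0 - dx)) / (2 * dx)  # sensitivity coefficient
u_y = abs(c) * u_x                                  # uncertainty contribution

print(f"c = {c:.3f} per degree C, contribution u_y = {u_y:.3f}")
```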

The possibility of relying on the results of a collaborative study to quantify the uncertainty has been considered [8], following from a general model of uncertainties arising from contributions associated with method bias, individual laboratory bias, and within- and between-batch variations. Collaborative trial is expected to randomise most of these contributions, with the exception of method bias. The latter would be addressed via combination of the uncertainties associated with a reference material or materials to which results are traceable with the statistical uncertainty associated with any estimation of bias using a finite number of observations. Note that the necessary investigation and reporting of bias and associated statistical uncertainty (i.e. excluding reference material uncertainty) are now recommended in existing collaborative study standards [7]. Where the method bias and its uncertainty are small, the overall uncertainty estimate is expected to be represented by the reproducibility standard deviation. The approach has been referred to as a "top-down" view. The authors concluded that such an approach would be feasible given certain conditions, but noted that demonstrating that the estimate was valid for a particular laboratory required appropriate internal quality control and assurance. Clearly, the controls required would relate particularly to the principal factors affecting the result. In terms of ISO principles, this requirement corresponds to control of the main contributions to uncertainty; in method development and validation terms, the requirement is that factors found to be significant in robustness testing are controlled within limits set, while factors not found individually significant remain within tolerable ranges. In either case, where the control limits on the main contributing factors, together with their influence on the result, are known to an individual laboratory, the laboratory can both check that its performance is represented by that observed in the collaborative trial and straightforwardly provide an estimate of uncertainty following ISO principles.
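A minimal sketch of the "top-down" combination outlined above, assuming the reproducibility standard deviation comes from the collaborative trial and the bias contribution is built from the reference-material uncertainty and the statistical uncertainty of the bias experiment; all values are illustrative.

```python
from math import sqrt

# "Top-down" combination (a sketch, not a prescription from the paper):
# the reproducibility standard deviation is combined in quadrature with
# the uncertainty components of the bias estimate. Values are illustrative.
s_R = 0.042          # reproducibility standard deviation (collaborative trial)
u_ref = 0.010        # standard uncertainty of the reference material value
u_bias_stat = 0.008  # statistical uncertainty of the bias estimate

u_combined = sqrt(s_R**2 + u_ref**2 + u_bias_stat**2)
U_expanded = 2 * u_combined   # coverage factor k = 2

print(f"u = {u_combined:.4f}, U (k=2) = {U_expanded:.4f}")
```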

The step-by-step approach recommended in the ISO Guide and the "top-down" approach have been seen as alternative and substantially different ways of evaluating uncertainty, but the comparison between method development protocols and the ISO approach above shows that they are more similar than appears at first sight. In particular, both require a careful consideration and study of the main effects on the result to obtain robust results accounting properly for each contribution to overall uncertainty. However, the top-down approach relies on that study being carried out during method development; to make use of the data in ISO GUM estimations, the detailed data from the study must be available.

Availability of validation data

Unfortunately, the necessary data are seldom readily available to users of analytical methods.


Table 2 Method performance and measurement uncertainty estimation. Note that the text is paraphrased for brevity and the numbers in parentheses refer to corresponding items in the EURACHEM guide (column 2)

Method validation protocol¹:
1.4 Develop within-laboratory attributes of the optimised method. (Some items can be omitted; others can be combined.)
1.4.1 Determine [instrument] calibration function ... to determine useful measurement range of method. (8, 9)
1.4.2 Determine analytical function (response vs concentration in matrix ...). (9)
1.4.3 Test for interference (specificity): (a) Test effects of impurities ... and other components expected ... (5) (b) Test non-specific effects of matrices. (3) (c) Test effects of transformation products ... (3)
1.4.4 Conduct bias (systematic error) testing by measuring recoveries ... (Not necessary when method itself defines the property or component.) (3, 10, 11)
1.4.5 Develop performance specifications ... and suitability tests ... to ensure satisfactory performance of critical steps ... (8)
1.4.6 Conduct precision testing ... [including] ... both between-run (between-batch) and within-run (within-batch) variability. (4, 6, 7, 8, 12)
1.4.7 Delineate the range of applicability to the matrices or commodities of interest. (1)
1.4.8 Compare the results of the application of the method with existing tested methods intended for the same purposes, if other methods are available.
1.4.9 If any of the preliminary estimates of the relevant performance characteristics are unacceptable, revise the method to improve them, and retest as necessary.
1.4.10 Have method tried by analyst not involved in its development. Revise method to handle questions raised and problems encountered.

EURACHEM guide:
The evaluation of uncertainty requires a detailed examination of the measurement procedure. The first step is to identify possible sources of uncertainty. Typical sources are:
1. Incomplete definition of the measurand (for example, failing to specify the exact form of the analyte being determined).
2. Sampling – the sample measured may not represent the defined measurand.
3. Incomplete extraction and/or pre-concentration of the measurand, contamination of the measurement sample, interferences and matrix effects.
4. Inadequate knowledge of the effects of environmental conditions on the measurement procedure or imperfect measurement of environmental conditions.
5. Cross-contamination or contamination of reagents or blanks.
6. Personal bias in reading analogue instruments.
7. Uncertainty of weights and volumetric equipment.
8. Instrument resolution or discrimination threshold.
9. Values assigned to measurement standards and reference materials.
10. Values of constants and other parameters obtained from external sources and used in the data reduction algorithm.
11. Approximations and assumptions incorporated in the measurement method and procedure.
12. Variations in repeated observations of the measurand under apparently identical conditions.

¹ Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694–704. Copyright 1989, by AOAC INTERNATIONAL, Inc.

The results of the ruggedness studies and the within-laboratory optimisation of the method are, perhaps owing to their strong association with the development process rather than the end use of the method, rarely published in sufficient detail for them to be utilised in the evaluation of uncertainty. Further, the range of critical parameter values actually used by participants is not available, leading to the possibility that the effect of permitted variations in materials and the critical parameters will not be fully reflected in the reproducibility data. Finally, bias information collected prior to collaborative study has rarely been reported in detail (though overall bias investigation is now included in ISO 5725 [7]), and the full uncertainty on the bias is very rarely evaluated; it is often overlooked that, even when investigation of bias indicates that the bias is not significant, there will be an uncertainty associated with taking the bias to be zero [9], and it remains important to report the uncertainty associated with the reference material value.

Recommendations

The results of the ruggedness testing and bias evaluation should be published in full. This report should identify the critical parameters, including the materials within the scope of the method, and detail the effect of variations in these on the final result. It should also include the values and relevant uncertainties associated with bias estimations, including both statistical and reference material uncertainties. Since it is a requirement of the validation procedure that this information should be available before carrying out the collaborative study, publishing it would add little to the cost of validating the method and would provide valuable information for future users of the method.

In addition, the actual ranges of the critical parameters utilised in the trial should be collated and included in the report so that it is possible to determine their effect on the reproducibility.


These parameters will have been recorded by the participating laboratories, who normally provide reports to trial co-ordinators; it should therefore be possible to include them in the final report.

Of course there will frequently be additional sources of uncertainty that have to be examined by individual laboratories, but providing this information from the validation study would considerably reduce the work involved.

Acknowledgements The preparation of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement (VAM) Programme [10].

References

1. Sargent M (1995) Anal Proc 32:201–202
2. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland. ISBN 92-67-10188-9
3. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London. ISBN 0-948926-08-2
4. Williams A (1996) Accred Qual Assur 1:14–17
5. Horwitz W (1995) Pure Appl Chem 67:331–343
6. AOAC recommendation (1989) J Assoc Off Anal Chem 72:694–704
7. ISO (1994) ISO 5725:1994 Precision of test methods. ISO, Geneva
8. Analytical Methods Committee (1995) Analyst 120:2303–2308
9. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, London
10. Fleming J (1995) Anal Proc 32:31–32


Accred Qual Assur (1998) 3:2–5 © Springer-Verlag 1998 GENERAL PAPER

Ludwig Huber
Qualification and validation of software and computer systems in laboratories. Part 2: Qualification of vendors

Abstract When software and computer systems are purchased from vendors, the user is still responsible for the overall validation. Because the development validation can only be done by the developers, the user can delegate this part to the vendor. The user's firm should have a vendor qualification program in place to check for this. The type of qualification depends very much on the type and complexity of software and can go from documented evidence of ISO 9001 or equivalent certification for off-the-shelf products to direct audit for software that has been developed on a contract basis. Using a variety of practical examples, the article will help to find the optimal qualification procedure.

Key words Validation · Qualification · Vendors · Computers · Analytical laboratories

Received: 8 August 1997 / Accepted: 12 September 1997

L. Huber (✉), Hewlett-Packard Waldbronn, P.O. Box 1280, D-76337 Waldbronn, Germany. Tel.: +49-7243-602 209, Fax: +49-7802-981948, e-mail: Ludwig [email protected]

Introduction

Software and computer systems should be validated and qualified during all phases of their life cycle. The validation and qualification tasks during development have been discussed in an earlier article of this series [1]. This article gives recommendations on how to select and qualify a vendor when software or complete systems are purchased.

When software and computerized systems are purchased from a vendor, a frequently asked question is who is responsible for the validation of such a system: the vendor or the user? The Organization for Economic Cooperation and Development (OECD) states clearly in consensus paper No. 5: "It is the responsibility of the user to ensure that the software program has been validated" [2]. This is also the practice of the US FDA (United States Food and Drug Administration) as specified by Tetzlaff, a former FDA investigator: "The responsibility for demonstrating that systems have been validated lies with the user" [3]. However, it is obvious that product quality cannot be achieved by testing in a user's laboratory. This must be incorporated during design and development, which can only be done by the vendor. Therefore the OECD makes a further statement in consensus paper No. 5: "It is acceptable for formal validation to be carried out by the supplier on behalf of the user". Furman et al. [4] of the US FDA also make it very clear: "All equipment manufacturers should have validated their software before releasing it".

A more recent OECD consensus paper [5] requires the development of software in a quality system environment: "There should be adequate documentation that each system was developed in a controlled manner and preferably to recognized quality and technical standards (e.g. ISO 9001)". Similarly, the European Union requires in Annex 11 of the EC guide to good manufacturing practice (GMP) for medicinal products [6]: "The user shall ensure that software has been produced in accordance with a system of quality assurance".

The objective of vendor qualification is to get assurance that the vendor's product development and manufacturing practices meet the requirements of the user's firm. For software development this usually means that the software is developed and validated following documented procedures. The requirements usually vary between user firms and, within companies, between different departments.


An example of a development and validation procedure as typically required by the pharmaceutical manufacturing industry has been published by Huber [1] in an earlier article of this series.

The dilemma of the user is that she or he should ensure that the software has been validated during development even though she/he typically has no insight into the vendor's practices and most of the time does not have the technical understanding of how software should be validated during development. So the question is: How can an analytical chemist, a good laboratory practice (GLP) study director, or a laboratory manager decide whether the software she or he is using has been validated during development? Ideally, there should be a software vendor qualification scheme that certifies software vendors to be in compliance with all regulations. Unfortunately, even though some software certification systems exist, e.g., the Information Technology Quality System (ITQS) [7] and TickIT [8], none of them is accepted by regulatory agencies as being always sufficient for vendor qualification.

This article should help to get around this dilemma and to establish an efficient vendor qualification program for different software categories with minimum effort. The computer systems will be classified as standard systems, and software or systems that have been developed on a contract basis specifically for the user. Special attention is paid to the requirements of users in the regulated pharmaceutical industry.

Standard systems

Vendor qualification is relatively easy if the software or computer system is sold in the same configuration to multiple users. Typical examples are computer systems for analytical instrument control, data acquisition and data evaluation. In this case there is much information available on the quality of the product and on the vendor's behavior in case of problems. This information may be available within the user's firm based on experience of previous versions within the same department or of the newly purchased version in other departments. Information can also be gleaned from colleagues in other companies. If such information is available, only a little additional information need be collected from the vendor. This can include questions like:

1. Has the software been developed and manufactured in accordance with a system of quality assurance? Usually, certification for ISO 9001 provides such evidence.

2. Can the vendor provide certified evidence that the software has been validated during development? Certification for ITQS [7] or TickIT [8] provides such evidence. Both systems certify compliance with ISO 9000-3 [9]. The TickIT scheme was jointly developed by the United Kingdom Department of Trade and Industry (DTI) and the British Computer Society. ITQS is an international mutual-recognition scheme for ISO 9000 registration in the software and hardware sections of the Information Technology and Telecommunications sector. A major objective of ITQS is to expand the international agreement on the registration of IT companies and to provide a global scheme for the recognition of certification in this area.

3. Can the vendor guarantee access to software development documents? This question is particularly important for those working in regulated environments. Shipment of a declaration of system validation with each software or computer system provides such evidence. This declares that each software product and its actual revisions have been validated and that additional information on product development can be made available.

In addition, regulatory agencies need to get assurance that the software source code can be made available in case it is required. Therefore, for standard software used in GLP and GMP environments, vendors should make a statement that the source code can be made accessible at the vendor's site. A statement on this should be included in the vendor's declaration of system validation.

Instrument vendors generally react positively to this. For example, most vendors develop and validate analytical products following documented product life cycles. Products are shipped with a "Declaration of System Validation" or similar documents that certify that the specific product was developed and validated following the product life cycle process. Most vendors are also certified for ISO 9001 and some also for ITQS or TickIT. Some vendors also make further information on development and testing available to the user on special request, and some guarantee accessibility of the source code to regulatory agencies.

If a vendor is not able or willing to provide documented evidence of validation, the user should consider selecting another vendor: "Companies should consider alternative vendors when they encounter suppliers who are unable or unwilling to share test data or evidence to support system performance" [3]. This recommendation is easy to follow if there are a number of competitors for the same or similar products. If this is not the case, for example when special software for an emerging technique is required, a user may purchase the software anyway, but should evaluate the vendor using the criteria usually applied for vendor qualification for non-standard software.



Table 1 Checklist for supplier assessments (from [10])

Company information
– Company history: how long has the company been in business?
– Financial status (obtain copy of annual report)?
– Is the company currently in the process of negotiation for sale?
– Size (number of employees)?
– What percentage of sales is invested in research and development of new products?
– Does the vendor have an established customer base in the user firm's market place?
– Are there company policies on quality, security etc.?
– Is there a Quality Management system?
– Is the vendor compliant to ISO 9001? (obtain copies of certificates)
– Is the vendor compliant to ISO 9000-3? (obtain copies of ITQS or TickIT certificates)
– Has the company been audited by other companies?

Organization
– Is there a formal quality assurance department (ask for an organization chart)?

Software development
– Does the vendor follow engineering standards?
– Is there a software quality assurance program?
– Is there a structured methodology (e.g. life cycle approach)?
– Are there life cycle checkpoint forms (checklists)?
– Is prototyping done before and during development?
– Is all development done at the vendor's site?
– If not, are third parties certified or regularly audited by the vendor (e.g. ISO 9001)?

Testing
– Who develops test plans?
– Are requirement specification reviews and design inspections done periodically?
– How is functional testing performed?
– Who is testing (outside the development department)?
– Are there test protocols?
– Is there a test traceability matrix to ensure that all user requirements and product functions are tested?
– Are there procedures for recording, storing and auditing test results?
– Who approves test plans/protocols?

Support/training
– How many support personnel does the company have?
– Does the vendor have formalized training programs in installation, operation and maintenance of systems?
– Which support systems are in place (phone, direct)?
– Where is the nearest office for support?
– Is a service contract available and what does it cover (installation, startup, performance verification, maintenance, training)?
– Does the company provide consulting and validation services?
– Do support people speak the local language?
– What is the average response time?
– Is the service organization compliant to an international quality standard (for example, ISO 9002 or ISO 9003)?
– How long are previous versions supported and at what level?
– How long is support and supply of parts guaranteed?
– Is training available on how to use the system? Location, frequency?
– Are training materials available (description, media)?

Failure reports/enhancement requests
– Is there a formal problem-reporting and feedback system in place?
– How are defects and enhancement requests handled?
– How are customers informed on failure handling?
– Are quality records and statistical failure evaluations in existence?

Change control
– Who initiates changes?
– Who authorizes changes?
– Are there procedures for change control?
– Do they include impact analysis, test plans?
– Is there a formal revision control procedure?
– Will all updates get new version numbers?
– Are there procedures for user documentation updates?
– How are customers informed on changes?

People qualification
– Do people have knowledge of regulatory compliance and programming science?
– Is there documentation on education, experience, training?

The product/project
– When did development of the software first begin?
– When was the initial version of the software first released?
– How many systems are installed?
– How often are software releases typically issued?
– How many employees are working on the project?
– Are there functional specifications?
– Are there samples of reports?
– Which vintage of data files can be processed with today's software?

User documentation
– Are there procedures and standards for the development and maintenance of user documentation?
– What documentation is supplied?
– For how long is the documentation supplied?
– For how long is the user documentation archived?

Archiving of software and documentation
– What is archived, for how long (software, revisions, source code, documentation)?
– Where is the source code archived?
– Can the source code be made accessible to regulatory agencies?
– Is a periodic check of data integrity ensured?

Security
– Is the developer's area secure?
– What type of security is provided to prevent unauthorized changes?
– Are there written procedures specifying who has access to software development and control?

Customer training
– Does a training manual exist?
– Do they provide tools for training, e.g., computer-based or video training?
– Do they offer operator training courses (frequency, language)?
– Is there documented evidence for the trainer's qualifications?


Users of larger computer systems in pharmaceutical development sometimes feel that they should do more detailed qualification than that recommended in the previous paragraphs, even for standard systems, and give consideration to a direct audit at the vendor's site, especially when specific software and computer systems are widely used throughout the company. Direct audits are expensive and time-consuming for both the user's and the vendor's firm. Therefore both parties should do their utmost to prevent such audits. The vendor can help by providing the user with appropriately detailed written information on the software quality system. The user can develop the checklists usually used during direct audits and send them to the vendor. An example is shown in Table 1. The checklists should be returned within about 3–4 weeks. In our experience, direct audits are not necessary for standard systems if the answers to these questions are positive, and we are not aware of any situation where a user's firm failed a regulatory inspection after vendors of standard software had been qualified as described in this article.

Non-standard software and computer systems

If standard software has been modified or new software has been developed on special request by a user, the vendor should be qualified through audits. This can be done through a detailed written questionnaire similar to the questions in Table 1 for smaller projects, or through a detailed site audit (second-party audit). Frequently the quality requirements are agreed upon before the programming is started as part of the purchase agreement. Compliance with the agreement is checked during and after development in direct audits. Questions covered during the audit are similar to those in Table 1. Here it is important to check selected examples of the documentation to make sure that this, too, is satisfactory. Details on how to conduct a software supplier audit have been published by Segalstad [11].

Summary

Software and computer systems must be validated before and during routine use. Also for purchased software, the user is responsible for the entire validation, but the vendor can do part of the validation on behalf of the user. The user should have a qualification process in place to assure that the vendor did develop and validate the software according to quality assurance standards. For standard off-the-shelf systems, certification to a general quality standard, e.g. ISO 9001, provides enough evidence of validation; for software, certification under a system such as ITQS or TickIT, together with the vendor's declaration of software validation and the assurance of accessibility to validation documents and the source code for the regulated industry, is sufficient. When special software is developed on behalf of the user, quality assurance measures with details of development validation can be part of the purchase agreement. In this case, the user may check compliance with the agreement by means of direct audits.

References

1. Huber L (1997) Accred Qual Assur 2:360–366
2. OECD (1992) Compliance of laboratory suppliers with GLP principles. (Series on principles of good laboratory practice and compliance monitoring, No. 5, GLP consensus document, environment monograph No. 49) Paris
3. Tetzlaff RF (1992) PharmTech: 71–82
4. Furman WB, Layloff TP, Tetzlaff RF (1994) J AOAC Intern 77:1314–1318
5. OECD (1995) The application of the principles of GLP to computerized systems. (Series on principles of good laboratory practice and compliance monitoring, GLP consensus document No. 10, environment monograph No. 116) Paris
6. EC guide to good manufacturing practice for medicinal products (1992) In: The rules governing medicinal products in the European Community, vol 4. Office for Official Publications for the European Communities, Luxembourg
7. European Information Technology Quality System Auditor Guide, January 1992, document 92/0001 (ITAGC) and SQA-68 (CTS-2). Available from ITQS Secretariat, Brussels, Belgium
8. The TickIT guide to software quality management system construction and certification using EN 29001, Issue 2, February 1992. Available from the DISC TickIT Office, London
9. ISO 9000-3:1991(E) Quality management and quality assurance standards, Part 3: Guidelines for the application of ISO 9001 to development, supply and maintenance. International Organization for Standardization, Geneva, Switzerland
10. Huber L (1995) Validation of computerized analytical instruments. Interpharm, Buffalo Grove, IL, USA. Hewlett-Packard Part Number: 5959-3879
11. Segalstad S (1996) Eur Pharm Rev: 37–44


Accred Qual Assur (1998) 3:140–144 © Springer-Verlag 1998 GENERAL PAPER

Ludwig Huber
Qualification and validation of software and computer systems in laboratories. Part 3: Installation and operational qualification

Received: 31 October 1997 / Accepted: 25 November 1997

L. Huber (✉), Hewlett-Packard Waldbronn, P.O. Box 1280, D-76337 Waldbronn, Germany. Tel.: +49-7243-602 209, Fax: +49-7802-981948, e-mail: Ludwig Huber6hp.com

Abstract Installation and operational qualification are important steps in the overall validation and qualification process for software and computer systems. This article guides users of such systems step by step through the installation and operational qualification procedures. It provides guidelines on what should be tested and documented during installation prior to routine use. The author also presents procedures for the qualification of software, using chromatographic data systems and a network server for central archiving as examples.

Key words Validation · Qualification · Computers · Software · Analytical laboratories

Introduction

Qualification and validation activities to assess the performance of software and computer systems cover the whole life-cycle of the products. The first article of this series [1] gave an overview of the validation process during development; the second [2] discussed qualification of a vendor when the software or computer system is purchased. This article will discuss the validation steps required during installation and the tests required prior to operation – processes called installation qualification and operational qualification (OQ) or acceptance testing, respectively. Operational qualification of a computer system is the most difficult qualification task. The reason is that even today there are no guidelines available on what and how much testing should be done. The basic question is: how much testing is enough? Too much testing can become quite expensive, and insufficient testing can be a problem during an audit. For example, we have seen test protocols of 200 and more pages that users of commercial computerized chromatographic systems have developed over several weeks. Each software function, such as switching the integrator on and off, had been verified as part of an OQ procedure, which is not necessary if such tests have been done to a large extent at the vendor's site.

This article gives practical validation steps for two types of computer systems: integrated computerized analytical systems and a network for file printing and archiving.

Pre-installation

Before a computerized analytical system arrives at the user's laboratory, serious thought must be given to its location and to environmental and space requirements. A comprehensive understanding of the requirements of the new computer system must be obtained from the vendor well in advance: required bench or floor space and environmental conditions such as humidity and temperature. Bench space should also include some space left and right of the equipment as required for proper operation. Care should be taken that all the environmental conditions and electrical grounding are within the limits specified by the vendor and that the correct cables are used. Environmental extremes of temperature, humidity, dust, power feed line voltage fluctuations, and electromagnetic interference should be avoided. If environmental conditions could influence the validity of test results, the laboratory should have facilities to monitor and record these conditions, either continuously or at regular intervals.


Table 1 Steps before installation

– Obtain manufacturer's recommendations for installation site requirements.
– Check the site for the fulfillment of the manufacturer's recommendations (space, utilities such as electricity, and environmental conditions such as humidity and temperature).
– Allow sufficient shelf space for SOPs, operating manuals and software.

Table 2 Steps during installation

– Compare computer system as received with purchase order (including software, accessories, spare parts).
– Check documentation for completeness (operating manuals, maintenance instructions, standard operating procedures for testing, safety and validation certificates).
– Check equipment for any damage.
– Install computer hardware and peripherals. Make sure that distances are within the manufacturer's specifications.
– Switch on the instruments and ensure that all modules power up and perform an electronic self-test.
– Install software on the computer following the manufacturer's recommendations.
– Verify correct software installation.
– Make a back-up copy of the software and installation verification files.
– Configure peripherals, e.g. printers and equipment modules.
– Identify and make a list with a description of all hardware; include drawings where appropriate.
– Make a list with a description of all operating and application software installed on the computer.
– List equipment manuals and SOPs.
– Prepare an installation report.

Installation

When the computer system arrives, the shipment should be checked by the user for completeness. It should be confirmed that the equipment ordered is what was in fact received. Besides the equipment hardware, other items should be checked, for example correct cables, other accessories and documentation. A visual inspection of the entire hardware should follow to detect any physical damage. For more complex instrumentation, for example when multiple computers are connected to a network, wiring diagrams should be produced, if not supplied by the vendor. Distances between the computers and peripherals such as printers and analytical equipment must be within the manufacturer's specifications. For example, long low-voltage electrical lines from analytical equipment to computers are vulnerable to electromagnetic interference, which may result in inaccurate input data to the computer. In addition, electrical lines should be shielded if motors or fluorescent light sources are nearby. At the end of the installation of the hardware, an electrical test of all computer modules and systems should follow.

Table 3 Form for computer system identification

Computer hardware
– Manufacturer, model
– Serial number
– Processor
– Internal memory (RAM)
– Graphics adapter
– Hard disk (type, partitions, memory sizes)
– Installed drives
– Pointing device (e.g., mouse)
– Space requirement

Monitor
– Manufacturer, model
– Serial number

Printer
– Manufacturer, model
– Serial number
– Space requirement

Instrument interface card
– Type, select code, slot number

Connected equipment hardware
– Hardware module 1
– Interface card setting

Modem
– Type, speed

Network connection
– Card type
– Network address

Operating software
– Operating system (version)
– User interface (version)

Application software 1
– Description
– Manufacturer/vendor
– Product number (version)
– Required disk space

Application software 2
– Description
– Manufacturer/vendor
– Product number (version)
– Required disk space

Computer hardware should be well documented with model number and serial and revision numbers, and software should be documented with model and revision numbers. For larger laboratories with many computer systems this should preferably be documented via a computer-based database.

Documentation should include items such as the size of the hard disk, internal memory (RAM), and the installed type and version of operating software, standard application software and user-contributed software, e.g., Macro programs. This information is important because all these items can influence the overall performance of a computer system. The information should be readily available in case a problem occurs with the computer system. Table 3 includes items for proper documentation of computer hardware and software.
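A computer-based inventory of the items in Table 3 can be as simple as one structured record per system. The following Python sketch shows one possible record layout; all field names and example values are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import List

# One record of a computer-system inventory, following the items of
# Table 3. Field names and example values are illustrative assumptions.
@dataclass
class SystemRecord:
    manufacturer: str
    model: str
    serial_number: str
    ram_mb: int
    hard_disk: str
    operating_system: str                      # name and version
    application_software: List[str] = field(default_factory=list)
    user_software: List[str] = field(default_factory=list)  # e.g. Macro programs

record = SystemRecord(
    manufacturer="Example Corp", model="XY-100", serial_number="SN-0042",
    ram_mb=64, hard_disk="2 GB, one partition",
    operating_system="ExampleOS 4.0",
    application_software=["ChromSoft 2.1 (data system)"],
)
print(record)
```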


When complex software is installed on a computer, the correctness and completeness of the installed program and data files should be verified. The problem is that a wrong or incomplete software installation may be identified as such not at installation or during initial testing but during routine use, when that particular program is used. Vendors can assist installation verification by supplying installation reference files and automated verification procedures. In this case, the integrity of each file is verified by comparing the cyclic redundancy check (CRC) of the installed file with the checksum of the original file recorded on the installation master. Modified or corrupt files have different checksums and are thus detected by the verification program. Verification reports should include a list of missing, changed and identical files (see Fig. 1). The result file should be stored as a reference file. The verification program will identify whether any file has accidentally been deleted. Therefore, whenever there is a problem with the software, the actual installation should first be checked against this reference file.
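Such a verification can be sketched in a few lines of Python. The example below uses SHA-256 digests as a modern stand-in for the CRC checksums described above and assumes the reference is a JSON map of relative path to digest; the directory and file names are illustrative assumptions.

```python
import hashlib
import json
from pathlib import Path

# File-integrity verification against a stored reference (a sketch).
# The reference file maps relative path -> SHA-256 digest.
def digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify(install_dir: Path, reference_file: Path) -> None:
    reference = json.loads(reference_file.read_text())
    installed = {str(p.relative_to(install_dir)): digest(p)
                 for p in install_dir.rglob("*") if p.is_file()}
    missing = sorted(set(reference) - set(installed))
    changed = sorted(f for f in set(reference) & set(installed)
                     if reference[f] != installed[f])
    identical = len(reference) - len(missing) - len(changed)
    # The report lists missing, changed and identical files, as in Fig. 1.
    print(f"missing: {missing}\nchanged: {changed}\nidentical: {identical} files")

verify(Path("C:/datasystem"), Path("install_reference.json"))  # illustrative paths
```

Re-running such a check against the stored reference file also detects files that were added or deleted later, which supports the practice described in the next paragraph.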

A new reference file should be generated whenever the user adds software that has been authorized. An example would be Macros to customize the system. In this way, it can always be checked whether Macros have been changed or added that are not authorized.

The generation and signing-off of the installation report should mark the completion of the installation. In the pharmaceutical industry this is referred to as the installation qualification (IQ) document. The document should be signed by the user's representative if the IQ was done by the user, and by the vendor's and the user's representatives if the IQ was done by the vendor.

Operational qualification and acceptance testing

After the installation of hardware and software, an operational test should follow, a process which is referred to as operational qualification (OQ). For the qualification of computer systems, the term "acceptance" is also used, which can be identical to or be a reduced set of the OQ. The purpose of OQ is to demonstrate that the system's hardware and software operate "as intended" in the user's environment.

Correct functioning of software loaded on a computer system should be checked in the user's laboratory under typical operating conditions within its intended ranges. Tests should be developed that execute the most important software functions. The type and number of tests depend on the complexity of the software and on the tests that have already been done during system development. For example, many more tests should be done if there is no evidence of extensive testing done during development. For details see [1].

Fig. 1 Installation verification report

Fig. 2 Examples of computer systems in analytical laboratories

In an analytical laboratory we can typically find computer systems as shown in Fig. 2.

1. Integrated computerized analytical systems, for example chromatographic systems with computer software for instrument control, data acquisition and data evaluation. Data are printed and electronically stored. Sometimes these computer systems employ spreadsheet programs or user-contributed Macros for customized data evaluation.

2. Networked computer systems with multiple computers and peripherals. Examples are servers for common printing and data storage, or client/server networks where the operating system and application software are loaded on the server and can be executed on the client computer. Another example is a laboratory information management system (LIMS) for the collection and management of data from multiple computers.

All computers may also employ office software, such as spreadsheets and word processing programs. These programs can operate with manual or automated data entry with on-line connection to the analytical data acquisition and evaluation system.

Testing procedures for categories one and two are very different. In both cases test cases should be developed and acceptance criteria should be defined.


Fig. 3 Testing of computer hardware and software as part of an integrated chromatographic system

Table 4 Qualification process of chromatographic software

Generation of master data
1. Generate one or more master chromatograms (the chromatograms should reflect typical samples).
2. In case the method uses spectral data, generate master spectra with spectral libraries.
3. Develop integration method, calibration method and procedure for spectral evaluation, for example, peak purity and/or identity checks.
4. Generate and print out master result(s).
5. Save results on paper and electronically.

Verification
1. Select master chromatogram (with spectra).
2. Select master method for data processing (integration, quantitation, spectral evaluation, etc.).
3. Run test manually or automatically. Automation is preferred because it is faster and leaves less chance for errors.
4. Compare test results with master data. Again, this can be done automatically if such software is built into the system.
5. Print and archive results.

The most efficient test for an integrated analytical system is to analyze a well-characterized chemical standard or sample (see Fig. 3). During this test, key functions of the software are executed, for example, in the case of chromatography:
– instrument control
– data acquisition
– peak integration
– quantitation
– file storage
– file retrieval
– printing

If the system has a spectral detector, spectral evaluation such as peak purity and compound identity should be part of the test. Additional tests should be developed if there are other software functions used in routine analysis that are not part of the sample or standard analysis test. If the system is used over a wide concentration range with multiple calibration points, the tests should be run over many concentrations to verify correct function of the calibration curve.

Most software functions of a computerized chromatographic system can also be tested by using well-characterized data files, without injecting a test sample. The advantage is that less time is required for the test. The concept has been described in great detail in [3] and is summarized in Table 4. The procedure is very useful after updating the computer system, for example after adding more internal memory or when changing to a new operating system. The test procedure is very generic and can also be used to test and verify the correct functioning of other software packages.

Well-characterized test chromatograms and spectra derived from standards or real samples are stored on disk as a master file. Chromatograms and spectra may be supplied by the vendor as part of the software package. These vendor-supplied chromatograms and spectra are only useful if they reflect the user's way of working; otherwise test chromatograms and spectra should be recorded by the user. This master data file passes through normal data evaluation, from spectral evaluation and integration to report generation. Results are stored on the hard disk. Exactly the same results should always be obtained when using the same data file and method for testing purposes. If the chromatographic software is used for different methods, a test should be set up for each method: for example, one test for an assay and another for impurity tests.
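The comparison of reprocessed results with the master results (step 4 of the verification in Table 4) can be automated along the following lines. This is a sketch assuming both result sets are exported as CSV files with "compound" and "amount" columns; the file names, column names and tolerance are illustrative assumptions.

```python
import csv

# Compare reprocessed results with master results (a sketch).
TOLERANCE = 1e-6  # relative; reprocessing the same data file should match exactly

def load(path):
    with open(path, newline="") as f:
        return {row["compound"]: float(row["amount"]) for row in csv.DictReader(f)}

master = load("master_results.csv")
test = load("test_results.csv")

failures = []
for compound, expected in master.items():
    actual = test.get(compound)
    if actual is None or abs(actual - expected) > TOLERANCE * abs(expected):
        failures.append(compound)

print("PASS" if not failures else f"FAIL: {failures}")
```

Deliberately changing the method or data file and rerunning such a script should produce a failure report, which is the self-check of the verification tool recommended in the next paragraph.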

Preferably, tests and documentation of results should be done automatically, always using the same set of test files. In this way users are encouraged to perform the tests more frequently, and user-specific errors are eliminated. In some cases, vendors provide test files and automated test routines for verification of a computer system's performance in the user's laboratory. Needless to say, the correct functioning of this software should also be verified. This can easily be done by changing the method or data file and rerunning the test. The report should indicate an error. If such automated verification software is not available, the execution of the tests and verification of actual results against prerecorded results can be done manually.

Successful execution of such a procedure ensures that
– the actual version of the application software works correctly for the tested functions
– executed program and data files are loaded correctly on the hard disk
– the actual computer hardware is compatible with the software
– the actual version of the operating system and user interface software is compatible with the application software.

For a networked computer system, OQ can mean, for example, verifying correct communication between the computers and the peripherals. Data sets should be developed and input at one part of the network. The output at some other part should be compared with the input.


Fig. 4 Qualification of a network server for data storage and printing

For example, if a server is used to secure and archive data from a chromatographic data station, results should be printed on
1. the chromatographic data system
2. the server, after storage and retrieval of the files.

The results should be compared, either manually or automatically. If the network links to other in-house systems, correct function of the linkage should be verified using well-characterized data sets.
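One way to automate this input/output comparison for a network archive is a byte-for-byte round-trip check, sketched below; the server share and file names are illustrative assumptions.

```python
import filecmp
import shutil
from pathlib import Path

# Round-trip check for a network archive (a sketch): store a
# well-characterized data file on the server, retrieve it again and
# compare the retrieved copy byte-for-byte with the original.
source = Path("master_run.csv")                     # well-characterized data set
archived = Path("//server/archive/master_run.csv")  # illustrative server share
retrieved = Path("master_run_retrieved.csv")

shutil.copyfile(source, archived)     # store on the server
shutil.copyfile(archived, retrieved)  # retrieve from the server

# shallow=False forces a content comparison, not just a comparison of file metadata
print("round trip OK" if filecmp.cmp(source, retrieved, shallow=False) else "MISMATCH")
```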

Some important points should be considered for the OQ of software and computer systems:

1. A validation team should be formed consisting of analytical experts from the laboratories affected, computer experts from IT departments and validation experts.

2. A validation plan should be developed that describes the purpose of the system including subsystems, responsible persons, test philosophy and a schedule for testing.

3. The intended use and functions of the network and all subsystems should be defined. For subsystems and some core functions of the network the vendor should provide a list with detailed functional specifications. From these specifications, the user can derive the functions the system will use in the user's laboratory.

4. Tests should be developed for each subsystem, and each subsystem should be validated. Again, the vendor should provide validated software to automatically execute these tests. An example was shown above for chromatographic data systems.

5. Acceptance criteria should be specified before the tests start.

6. For multi-user systems, some tests should be done while the maximum number of users are using the system.

7. All or at least some tests should be done under conditions of maximum data flow.

8. When there is a change to the system, the validation team should evaluate the possible impact of the change on other parts of the system. Based on this evaluation, a test plan should be developed that executes either all or part of the tests as specified in 3 above.

At the end of the OQ, documentation should be available or developed that includes a validation protocol with
– the description, intended use and unique identification of the equipment
– functional specifications
– test protocols with test items, acceptance criteria, actual test results, the date and time when the tests were performed, and the people who performed the tests, with names and signatures
– a summary of results and a statement on the validation status.

Conclusion

Installation and operational qualification (acceptance testing) should follow standardized procedures and the results should be documented. Installation qualification consists of checking the instrument and documentation for completeness. For complex software, the completeness and integrity of the installed program and data files should be checked.

During operational qualification, the software and computer system is verified against the functional specifications. For integrated analytical computerized systems the tests mainly consist of running well-characterized test samples, and the communication with the equipment hardware should be checked as well. For testing the computer system only, laboratory-specific data sets and evaluation procedures can be used.

For networked systems, each individual subsystem should be validated. After successful validation of the subsystems, network functions should be validated. The tests should be defined by a validation team that consists of expert members from various departments.

References

1. Huber L, Wiederoder H (1997) Accred Qual Assur 2:360–366
2. Huber L (1998) Accred Qual Assur 3:140–144
3. Huber L (1995) Validation of computerized analytical systems. Interpharm, Buffalo Grove, Ill. ISBN 0-935184-75-9, Hewlett-Packard Part number: 5959-3879


Accred Qual Assur (1998) 3:317–321 © Springer-Verlag 1998 GENERAL PAPER

L. Huber
Qualification and validation of software and computer systems in laboratories
Part 4. Evaluation and validation of existing systems

Abstract Existing software and computer systems in laboratories require retrospective evaluation and validation if their initial validation was not formally documented. The key steps in this process are similar to those for the validation of new software and systems: user requirements and system specification, formal qualification, and procedures to ensure ongoing performance during routine operation. The main difference is that frequently the qualification of an existing system is based primarily on reliable operation and proof of performance in the past rather than on qualification during development and installation.

Key words Validation · Qualification · Computers · Analytical laboratories · Existing systems

Received: 30 April 1998 / Accepted: 2 June 1998

L. Huber (✉)
Hewlett-Packard Waldbronn, P.O. Box 1280, D-76337 Waldbronn, Germany
Fax: +49-7802-981948
e-mail: ludwig [email protected]
http://www.t-online.de/home/HUBERL/validat.htm

Introduction

Software and computer systems used in analytical laboratories that must comply with good laboratory practice (GLP) and good manufacturing practice (GMP) regulations require formal validation to confirm that they meet the criteria for intended use. Both new and existing systems, regardless of age, must be validated for their suitability for intended use. Whereas the first three articles in this series [1–3] focused on validation during development, vendor qualification, and installation and operational qualification of new systems, this article describes the procedures for systems that have been purchased, installed, and used prior to formal validation and that must be newly documented.

The Organization for Economic Cooperation and Development (OECD) GLP consensus paper No. 10 [4] contains the following paragraph on retrospective evaluation: “There will be systems where the need for GLP compliance was not foreseen or not specified. Where this occurs there should be documented justification for use of the systems; this should involve a retrospective evaluation to assess suitability”.

Recommendations for the retrospective evaluation of computer systems, notably in pharmaceutical manufacturing, have been discussed in the literature. For example, Agalloco [5] has proposed guidelines for the validation of existing computer systems, and Hambloch [6] has suggested a strategy for the retrospective evaluation of computer systems in compliance with GLP and GMP. Representatives from the United States Food and Drug Administration as well as from the pharmaceutical industry gave their views on validation of existing legacy systems in a recent publication [7]. However, although this literature provides recommendations on how to retrospectively evaluate computer systems in general, there are no guidelines on the evaluation and validation of existing computerized analytical systems. In this paper, therefore, I attempt to explain how to develop and implement a strategy for the retrospective evaluation and formal validation of existing computerized analytical systems.

The entire process comprises three steps:
1. Identify all software and computerized systems in an analytical laboratory and assess their need for validation. Create a priority list and time schedule for implementation. The priority list can be based on a risk/exposure analysis.
2. Develop a standard operating procedure (SOP) and templates for implementation.


3. Develop a validation plan and evaluate individual systems according to the priority list from step 1 and the SOP from step 2.

Step 1 can be accomplished by defining the environment in which the system is used, by determining the type of data generated by the system, and by estimating the risk of generating inaccurate data or of losing data. For example, all systems used in a GLP or GMP environment to generate critical data should be validated. Such systems can include computers controlling analytical systems, such as gas chromatographs, spectrophotometers and mass spectrometers, but also spreadsheets, macro programs and servers for data collection, data archiving and print-out.

Step 2 is more difficult and can be quite complex. This paper therefore focuses on this second step in that it recommends a procedure for the retrospective evaluation. The specific limitations and advantages of older systems compared with new systems are briefly discussed.

Specific characteristics of older systems

Although most programmers and developers of computer systems test products prior to release, almost none of the older systems found in laboratories have been validated and documented according to the newest regulations. For example, the software may not have been developed in accordance with documented quality systems or with the most recent product life cycle guidelines. Testing may never have been formalized, acceptance criteria may not have been specified, the complete test documentation may not have been archived, or the required signatures for authorization may not have been collected, because at the time there was no real need to do so. As a result, vendors of software and computerized systems often cannot provide retrospective evidence that the software was properly verified and qualified during development. Even if initial tests have been performed and documented at the user's site, almost all computerized systems undergo changes during their lifetime that often are not recorded. Based on these observations, it would be easy to conclude that older systems can never attain validated status.

Fortunately, however, existing computerized systems have an advantage over new systems: experience gained over time. Most computer systems in analytical laboratories have been tested in one way or another; for example, the vendor's functional and performance specifications may have been verified when the system was installed, or the complete system may have been tested for suitability on a day-by-day basis. In the validation process, the evaluator can use this experience and these test results to review and assess the quality of analytical results from computer systems. Such a review and its thorough documentation may serve as sufficient evidence that the system has performed and continues to perform according to its initially specified intended use. Because not all the steps required for official validation may be executed and because the final system status can be determined based on a historical review, the process is known as retrospective evaluation.

Procedure for retrospective evaluation

The exact course of a retrospective evaluation depends on the complexity of the system and its use. Therefore, the effort involved will vary from system to system. However, in principle, each evaluation should follow the same procedure. A generic SOP and validation plan should be developed to document the individual steps, which are listed in Table 1. For smaller projects, some steps may be excluded, but the reason for their exclusion should be documented in the protocol for each evaluation.

The first step in the evaluation process is to define and document the current system use and user requirement specifications. If the system will be changed in the foreseeable future, any resulting changes in the intended use of the system should be described as well. The definition should include a list of system functions, operational parameters and performance limits. For chromatography software, required functions may include instrument control, data acquisition, peak integration through quantitation, file storage and retrieval, and print-out of methods and data. If the system also includes spectrophotometric detectors, the functions for spectral evaluation should be specified as well. Table 2 lists items that should be included in the system documentation.

After specifications have been set, general information on the system should be gathered. The main sources of such information are system users within the organization. Information can also be exchanged among users from different laboratories. Key issues in this phase are the extent of software use and the number of problems reported. This information is not only useful to test past and current performance but can also give some hints on potential problems if the intended use changes in the future.

The next step is to collect information and documentation on the history of the specific system under evaluation. This information should include not only the type and frequency of initial and ongoing testing but also the dates and installation procedures of both software and hardware updates to the system.

Finally, the documentation should be evaluated in relation to the user requirement specifications, functional specifications and performance limits previously defined.


Table 1 Recommended steps for retrospective evaluation and validation (from Ref. [8])

1. Describe and define the system.
(a) Summarize intended use, for example, automated qualitative and quantitative analysis of drug impurities.
(b) Specify user requirements for the system.
(c) Define specifications and operating limits of hardware modules, e.g. baseline noise of a high-performance liquid chromatography (HPLC) detector or flow-rate precision.
(d) List system functions and operating parameters and limits as used for current applications and for intended future use, for example: simultaneous control and data acquisition from two HPLC systems, peak integration, quantitation using an external standard method, peak purity checks using a diode-array detector or statistical evaluation of peak retention times and areas.
(e) List equipment hardware (e.g. spectrophotometer):
– asset number
– merchandising number or name
– manufacturer's name, address and phone number
– hardware serial number and firmware revision number
– date received in the laboratory, date put into operation
– location
– dates of any changes, for example software updates
(f) List computer and peripherals:
– manufacturer's name
– model and serial number
– internal memory
– graphics adapter and monitor
– hard disk
– interfaces
– printer
(g) List all software programs:
– operating system with product and version numbers
– standard applications software
– user-developed applications software
– databases, spreadsheet and word processing programs
(h) List accessories (cables, spare parts, etc.).
(i) Provide system drawings.
(j) Define network configurations.

2. Gather all available information and documentation on the standard system.
(a) Information on where, how and how often the same system is used within and outside the organization, and reports from users on its performance and reliability (should include both positive and negative comments).
(b) Validation certificates from vendors for purchased systems.
(c) Internal reports for systems developed in-house (development process, source code, quality assurance principles, test procedures and results).
(d) The vendor's performance specifications of equipment hardware, e.g. baseline noise of a UV-visible HPLC detector.
(e) The vendor's functional specifications for software, e.g. peak integration, quantification using external and internal standard methods.
(f) Formulae used for calculations, e.g. system linearity.
(g) User manuals and SOPs on operation, testing, calibration and maintenance.
(h) Records on system updates.
(i) Internal and external audit reports.

3. Collect information on the history of the specific system.
(a) Installation reports.
(b) Reports on initial testing, e.g. tests to verify that the system meets vendor equipment hardware specifications.
(c) System failure and repair reports.
(d) Equipment hardware maintenance logs, e.g. to record the change of an HPLC pump seal.
(e) Equipment hardware updates, for example, when a variable-wavelength detector was replaced by a diode-array detector.
(f) Software updates, for example, when a new revision of the operating system was installed.
(g) Hardware calibration records.
(h) Results of performance monitoring (e.g. results of software tests using test data sets and of system suitability tests or quality control charts using well-characterized standards or control samples).
(i) System feedback reports to the programmer or vendor and responses from the programmer or vendor.

4. Qualify the system.
(a) Check whether the documentation listed under items 2 and 3 is complete and up to date; for example, does the revision of the existing user manual comply with the currently installed firmware and software revisions?
(b) Evaluate the type of tests and test results and compare the results with user requirements, functional and performance specifications as defined in items 1b–d. For example, check if the tests cover only normal operating ranges or also upper and lower operational limits.
(c) Assess whether the equipment performs as expected and formally assign the system validated status.
(d) If there are not enough tests to prove that the system performs as intended, such tests should be developed, executed and the results documented.

5. Update documentation and develop and implement a plan for maintaining system performance.
(a) Update system description, specifications, drawings, appropriate SOPs, and user manual.
(b) Develop a preventive maintenance plan.
(c) Develop a calibration schedule.
(d) Establish procedures and a schedule for performance qualification.
(e) Develop an error recording, reporting, and remedial action plan.

There should be a direct link between the tests and each user requirement, functional and performance specification as listed in the first step (see Table 3). The tests should confirm that the system operates as intended over all specified ranges for each function. Tests should have been performed not only at typical operating ranges but also at the operational limits. For example, if the system runs multiple applications simultaneously, tests should verify system performance under such conditions.

If the tests yield enough documentation to confirm system performance according to specifications over all operational ranges, the system can be formally released as a validated system.


Table 2 Documentation of a computer for retrospective evaluation

System I.D.
Application
Location
Installation date
Installed with current configuration since (date)
Computer hardware
Monitor
Printer
Connected equipment hardware
Network
Operating software
Application software
System functions
Major repairs
Comments

Table 3 Test matrix

Function (user requirement)   Evidence of correct performance
Function 1                    yes, see protocol AW1
Function 2                    yes, see protocol AW2
Function 3                    yes, see protocol AX77
Function 4                    no, tests must be defined

A plan should then be developed to ensure that system performance is maintained in the future. Such a plan should include preventative maintenance steps and calibration and test schedules as well as a change control procedure.

If insufficient tests are available to verify proper system performance, a test plan should be developed to execute missing functions and to test functions over missing operational ranges. For example, if a UV-visible diode array is used in a high-performance liquid chromatography (HPLC) system to evaluate peak purity or to confirm peak identity, a test plan should include tests for these parameters if the original tests did not cover these functions. In such situations, serious consideration should be given to updating the system software. It often costs less to upgrade or replace the software than to validate a new system. In some cases, however, software updates can be rather expensive because the hardware typically must be updated or replaced completely as well.

Problems also arise if the new tests detect an error in the software. For example, an error in the algorithm for data evaluation will yield incorrect final data. Therefore, the SOP should also contain steps to be taken if the previously generated data is incorrect, such as notification procedures.

After evaluation and validation, the following minimum documentation should be available:
– a standard operating procedure for implementation and a validation protocol
– user requirement and functional specifications
– a description of the system hardware and software, including all functions
– historical logs of the hardware with system failure reports, maintenance logs and records, and calibration records
– test data demonstrating that the system performs according to user requirements and functional specifications
– SOPs and schedules for preventive maintenance and ongoing performance testing
– an error recording, reporting, and remedial action plan
– a formal statement with appropriate signatures that the system is formally validated
– up-to-date user information documentation such as operating and reference manuals
– a validation report

Conclusions

All software and computer systems used in accredited and GLP- and GMP-regulated analytical laboratories for the generation and analysis of critical data should be retrospectively evaluated and formally validated. As a first step, the user should list all systems in the laboratory, assess the need for validation for each system and develop an implementation plan. For validation of individual systems, the validation plan should describe user requirements and specific validation steps. It should also list the person or persons responsible for each step as well as the criteria that must be met to achieve validated status.

Retrospective evaluation begins with the collection of system information such as test results, logs of failures and repairs, and records of system changes. The user must then determine whether enough tests have been performed to verify proper system performance as described in the user requirements document. If there is not enough evidence that the system performs as intended, additional tests should be developed and executed. If the system does not pass all the requirements, it should be treated as a new system.


References

1. Huber L (1997) Accred Qual Assur 2:360–366
2. Huber L (1998) Accred Qual Assur 3:2–5
3. Huber L (1998) Accred Qual Assur 3:140–144
4. OECD (1995) Application of the principles of GLP to computerized systems. (Series on principles of good laboratory practice and compliance monitoring, No. 10, GLP consensus document, environment monograph No. 116) Paris
5. Agalloco J (1987) Pharm Tech, 38–40
6. Hambloch H (1994) Existing computer systems: a practical approach to retrospective evaluation. In: Computer validation practices. Interpharm, Buffalo Grove, Ill., USA, pp 93–112
7. The Gold Sheet, Quality Control Reports, vol 30, No. 7, July 1996. Chevy Chase, Md., USA, pp 3, 9, 10
8. Huber L (1995) Validation of computerized analytical systems. Interpharm, Buffalo Grove, Ill., USA


Accred Qual Assur (1997) 2:140–145 © Springer-Verlag 1997 PRACTITIONER'S REPORT

Reports and notes on experiences with quality assurance, validation and accreditation

S. Seno · S. Ohtake · H. Kohno
Analytical validation in practice at a quality control laboratory in the Japanese pharmaceutical industry

Abstract Analytical validation is required as the basis for any evaluation activities during manufacturing process validation, cleaning validation and validation of the testing method itself in the pharmaceutical industry according to good manufacturing practice (GMP) rules and guidelines. Validation of analytical methods and procedures in a quality control (QC) laboratory is implemented mainly at the time of transfer or introduction of the methods developed by the analytical development laboratory within group companies or elsewhere. However, it is sometimes necessary to develop a new or improved method of analysis for the QC laboratory's own use. In the first part of this report, a general description of analytical validation of the high performance liquid chromatography (HPLC) method, including preparation of documents, is presented based on the experience in our QC laboratory. A typical example of method validation of a robotic analysis system is then cited. Finally, the merits and demerits of these analytical validations for QC laboratories are summarized. The authors emphasize the importance of analytical validation and the responsibility of QC laboratory management for the effective design and implementation of validation activities.

Key words Analytical method and procedure · Validation · Quality control laboratory · Documentation · Robotic analysis

Received: 4 October 1996 / Accepted: 5 December 1996

S. Seno (✉) · S. Ohtake · H. Kohno
Quality Assurance Department, Sandoz Pharmaceuticals Ltd., No. 1 Takeno, Kawagoe City, Saitama, Japan

Introduction

In the pharmaceutical industry of most countries, validation of manufacturing processes and supporting functions is the fundamental issue for the quality assurance of products. Analytical methods and procedures are associated with most evaluation activities in this field. Analytical validation is an essential prerequisite for any evaluation work during process validation, cleaning validation and testing method validation itself according to GMP regulations, rules and guidelines [1–3].

In most quality control (QC) laboratories in this field, there are a number of types of analytical instruments and equipment in practical use. These must be suitable for the intended use in analytical work. Installation qualification (IQ) and operational qualification (OQ) are required before practical use.


Calibration of the instruments according to the preset schedule and preparation of the standard operating procedures (SOPs) must be implemented, and records must be kept.

Validation of analytical methods and procedures is one of the important duties of development laboratories in the R and D division of major pharmaceutical companies. Validation of the testing method in a QC laboratory is implemented mainly at the time of transfer or introduction of the method from the development laboratory of the group companies or from outside. However, it often becomes necessary to develop a new or improved method of analysis for the QC laboratories' own use.

Analytical validation also necessitates documentation prepared and approved for record keeping. It takes resources and time to collect data, to elaborate validation parameters to establish the method, and to compile documentation. It is the responsibility of the QC laboratory management to design and implement these validation activities efficiently before proceeding with routine analysis.

Definitions

The analytical method or procedure is a detailed description of the steps necessary to perform each analytical test. This may include (but is not limited to) preparation of the sample, reference standard and reagent, use of apparatus, generation of the calibration curve, use of the formulae for calculation, etc.

Validation of an analytical procedure is the process of confirming that the analytical procedure employed for a test of pharmaceuticals is suitable for its intended use [4].

Validation of an analytical method is the process by which it is established, by laboratory studies, that the performance characteristics of the method meet the requirements for the intended analytical applications [5].

The test methods, which are used for assessing compliance of pharmaceutical products with established specifications, must meet proper standards of accuracy and reliability [2].

Analytical method validation

As required by regulatory authorities [1–3, 6] and International Conference on Harmonisation (ICH) guidelines [7], analytical method validation is especially important in establishing the assay methods and procedures for quantitative or semi-quantitative measurement of target substances or compounds. Among the specification items for drug products, assay, content uniformity, and dissolution are typical of those which require almost full analytical validation.

The purity test, related substances test, and residual solvent test may be categorized as semi-quantitative assays which require somewhat reduced validation work. For manufacturing validation, a multi-point sample assay must be performed to verify the prescribed homogeneous distribution of an active drug substance and uniformity throughout a batch of the product. For cleaning validation, a semi-quantitative assay of low-level active substance and of the detergent used for cleaning is necessary. Various analytical techniques and methods can be used to meet the requirements. A typical and versatile analytical technique for this purpose is high performance liquid chromatography (HPLC).

Steps in the analytical validation of the HPLC method

Preliminary work

Availability of development manual

Documentation of policies, fundamental principles, and procedures for development is a prerequisite for the implementation of validation activities and should be available.

Selection of the analytical technique and instrument

The most important factor in the selection of the analytical technique and instrument is the objective of the analysis. The method must meet the requirements of guidelines issued by the regulatory authorities where the method is intended for compliance purposes.

Features of the sample and availability of reference materials are also key factors in determining an analytical procedure and detection method, which then relate to various analytical validation characteristics and parameters.

Cost effectiveness is a pressing factor even in QC laboratories, while at the same time quality standards must be kept as high as necessary. Time- and manpower-saving methods must be developed and applied in routine operations in the QC laboratory of the pharmaceutical company. There are many other limiting factors for establishing the procedure. The HPLC method is nowadays the most widely used versatile method in the pharmaceutical area.

IQ and OQ of the HPLC instrument

Elements of IQ

IQ is the verification, based on installation documents, that the instrument meets the requirements of the system and that all components of the system have been delivered, ready for installation in accordance with the standards and specifications.


Table 1 Characteristics to be validated in the HPLC method (example of format)

Characteristic                 Required?   Acceptance criteria                                               Remarks/Other criteria
Accuracy/trueness              Y, N        Recovery 98–102% (individual) with 80, 100, 120% spiked sample
Precision
  Repeatability                Y, N        RSD <2%
  Intermediate precision       Y, N        RSD <2%
Specificity/selectivity        Y, N        No interference
Detection limit                Y, N        S/N >2 or 3
Quantitation limit             Y, N        S/N >10
Linearity                      Y, N        Correlation coefficient r >0.999
Range                          Y, N        80–120% or QL–120% or 200%
Stability of sample solution   Y, N        >24 h or >12 h

The areas designated for equipment installation must have adequate space and appropriate utilities. A check list or a table format is often prepared to facilitate IQ for approval and for record keeping.

Elements of OQ

OQ is the verification that the installed system operates as prescribed for the intended purpose in accordance with the manufacturer's specification as well as the user company's standard. OQ includes calibration and verification of the unit components of the system under operating conditions. SOPs must be developed at this stage.

The UV detector, fluorescence detector, electrolytic detector, refractive index detector, and other detection systems must be checked for repeatability of response signal, sensitivity, and stability.

The pump must give reproducibility of flow rate and stability at maximum and minimum flow rates.

The auto-sampler must give reproducibility of injection volume and stability of controlled temperature.

The oven must have accuracy and stability of controlled temperature.

The integrator/data processor/PC or microcomputer must give reliability in terms of computer validation and standard output versus input.

Performance qualification (PQ) of an HPLC system

Analytical validation documentation

Validation documentation, consisting of a protocol, test data, and reports, must be reviewed and approved by responsible personnel. Examples of the contents of a protocol and a report are shown below:

1. Protocol
Method name: Validation number: Date:
Objective:
Summary of method:
Specific requirements:
Sample and analyte:
Reference standard:
Characteristics to be validated (see Table 1):
Detailed description of method/procedure:
Data sheet/format for each validation characteristic:
Revalidation and reason:
Prepared by: Date:
Reviewed by: Date:
Approved by: Date:

2. Report
Title/method name: Validation number: Date:
Protocol:
Data sheets and raw data including chromatograms:
Results versus acceptance criteria:
Evaluation and comments:
Conclusion:
Revalidation schedule:
Prepared by: Date:
Reviewed by: Date:
Approved by: Date:

System suitability test (SST)

The SST is a single overall test of system function, and is used to verify that the selectivity and repeatability of a chromatographic system are adequate for the analysis required. The SST must be performed before the analysis. The method/procedure should describe the frequency of the SST.


The terms are as defined in [4] and [5].
1. Selectivity: the system selectivity chromatogram must be attached.
– Retention time (t)
– Resolution (Rs)
– Selectivity (α)
– Number of theoretical plates (N)
– Symmetry factor (S) or tailing factor (T)
2. Repeatability of peak area or peak height
In assay: using the 100% reference solution with at least three injections (for validation purposes, not less than six injections), RSD <1%.
In purity test: using the 100% reference solution with at least three injections, RSD <1%; using the 1% reference solution with at least three injections, RSD <2%.
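The repeatability criteria above reduce to an RSD calculation over replicate injections. A minimal sketch follows; the peak areas are invented, and the limit shown is the assay criterion (RSD <1%).

```python
# Minimal sketch: SST repeatability check from replicate injections.
# Peak areas are invented for illustration.
import statistics

def rsd_percent(values):
    """Relative standard deviation in percent."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

peak_areas = [12453, 12467, 12441, 12460, 12449, 12455]  # six injections

rsd = rsd_percent(peak_areas)
print(f"RSD = {rsd:.2f}% ->", "SST passed" if rsd < 1.0 else "SST failed")
```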

Typical example of analytical validation in the QC laboratory: method validation of a robotic analysis system for ketotifen 1-mg capsules and tizanidine hydrochloride 1-mg tablets

A Zymate II Pye system for automated analysis by the Zymark Corporation (USA) with on-line HPLC has been set up and used for content uniformity tests and assays, including process validation, for a variety of drug products in our QC laboratory since January 1992. The robotic system automatically performs the following operations: preparation of sample and reference standard solutions, injection into HPLC, HPLC analyses, and generation of reports.

The robotic operation procedures were designed to simulate the manual ones (registered test methods) and were slightly modified to fit the robotic system. HPLC parameters and conditions are the same as those used in the routine manual methods.

Configuration of Zymate II Robotic System

Flow diagram of basic robotic assay procedure

Major validation elements

Master laboratory station (MLS) for pipetting and dilution with three (A, B, C) 10-ml syringes

– Accuracy
– Repeatability of delivered volume

Disposable 5-ml pipette tips for pipetting 1 ml, 2 ml and 4 ml of solvents

– Accuracy
– Repeatability of delivered volume

Filter cartridges

– Recovery (adsorption loss)

Content uniformity test of ketotifen 1-mg capsules and tizanidine hydrochloride 1-mg tablets

– Comparison of manual analysis and robotic analysis with the same procedure
– Comparison with registered method

Validation results

Master laboratory station (MLS)

Accuracy and repeatability of delivering 2 ml, 5 ml and 10 ml of three different solvents (100% water, 50% water/methanol, 100% methanol) were checked by weighing the delivered solvents. Accuracy was within the acceptance limit of 98.0% of the theoretical values, and the RSD was less than 2% for all 2-ml, 5-ml and 10-ml deliveries.
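Such a gravimetric check can be scripted in a few lines. In this sketch the replicate masses and the water density are invented values; the acceptance limits are those quoted above.

```python
# Minimal sketch: accuracy and repeatability of a delivery step,
# checked by weighing. Masses and density are illustrative values.
import statistics

target_volume_ml = 2.0
density_g_per_ml = 0.9982                        # water at 20 °C
masses_g = [1.995, 1.992, 1.998, 1.990, 1.996]   # replicate deliveries

volumes = [m / density_g_per_ml for m in masses_g]
accuracy_pct = 100 * statistics.mean(volumes) / target_volume_ml
rsd_pct = 100 * statistics.stdev(volumes) / statistics.mean(volumes)

print(f"accuracy = {accuracy_pct:.1f}% of nominal (limit: >= 98.0%)")
print(f"RSD      = {rsd_pct:.2f}% (limit: < 2%)")
```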

Disposable 5 ml pipette tips

Deliveries of preset volumes of 1 ml, 2 ml, and 4 ml of the same solvents used for the MLS were checked by weighing. For water and 50% water/methanol, the three levels of delivery were within acceptance limits and showed no difference from manual pipetting. However, 50% water/methanol and 100% methanol gave smaller volumes, which were out of limits for the 1-ml and 2-ml deliveries.

Filter cartridges

There was no detectable loss of ketotifen or tizanidine hydrochloride in the filter cartridges.

Content uniformity test

Results are shown in Table 2. There was no significant difference in the results obtained by manual operation, robotic operation and the registered method. All the data were within the acceptance limits.

The validation of the robotic method was successfully completed by finalizing the validation report and by approval of the documentation by the management of the Quality Assurance Department.


Fig. 1 Configuration of Zymate II Robotic System

Fig. 2 Flow diagram of basic robotic procedure for content uniformity assay:
Weighing (Mettler balance AE200)
↓
Addition of designated volume of solvent (MLS)
↓
Extraction of active substance (Vortex mixer)
↓
Centrifugation (Centrifuge)
↓
Filtration (Filter cartridge)
↓
Dilution (MLS and syringe hand with 1-ml, 5-ml pipette tips)
↓
Injection into HPLC
↓
HPLC analysis (Hitachi L-6000 with UV-VIS detector)
↓
Data processing (Hitachi L-2500 integrator)
↓
Report generation (NEC PC9801 DX)

Merits and demerits associated with analytical validation in the QC laboratory

Merits

1. Reliability of analytical results is assured through the validation.
2. The clear objective and performance capability of the method can be confirmed by all the staff who apply the method.
3. Awareness of the importance of predetermined protocols for the validation work and motivation for improvement can be enhanced.
4. Analytical validation gives a good opportunity for training the QC staff.

Demerits

1. Costs for validation are continually increasing.
2. Designing and scheduling of validation need experienced personnel.


Table 2 Comparison of the manual, robotic, and registered methods for content uniformity (Manual: unit operations in the robotic method were performed manually; Registered: manual assay method using HPLC cited in the registration document)

Product                   Lot No.   Percentage of nominal content found (RSD)
                                    Manual        Robotic       Registered
Ketotifen 1-mg capsules   66562     101.0 (0.9)   101.6 (0.4)   101.1 (1.1)
                          66564     101.8 (2.4)   102.3 (0.3)   100.2 (2.4)
                          66569     100.4 (1.5)   101.2 (0.6)   101.0 (1.5)
Tizanidine hydrochloride  66563      99.5 (1.0)    99.3 (0.4)    99.5 (0.3)
1-mg tablets              66565      99.5 (1.1)    99.3 (0.6)    98.3 (0.5)
                          66570      99.7 (1.5)    99.6 (1.6)    99.5 (0.9)

Conclusion

Reliability of analytical results generated in a QC laboratory depends on the analytical method and procedure used. Unless the method and procedure are validated, the numerical results remain an estimate and cannot be used for rigorous compliance judgment. The quality of the pharmaceutical product must be assured from the very beginning of the manufacturing activities, including acceptance testing of raw materials, processing operations, in-process controls, and product testing, together with established manufacturing processes and validated systems.

Analytical validation is the fundamental element supporting quality assurance activities in the pharmaceutical industry. It is the responsibility of QC laboratory management to design and implement these validation activities efficiently and effectively.

References

1. Regulations for Manufacturing Control and Quality Control of Drugs (GMP of Japan), amended in January 1994, Ministry of Health and Welfare, Japan
2. CFR Title 21 Part 211 – Current Good Manufacturing Practice for Finished Pharmaceuticals, revised in January 1995, Food and Drug Administration, Department of Health and Human Services, USA
3. Guide to Good Manufacturing Practice for Medicinal Products, The GMP Directives 91/356/EEC and 91/341/EEC (1991), European Commission
4. The Japanese Pharmacopeia, Thirteenth Edition (JP 13) (1995) Ministry of Health and Welfare, Japan
5. The United States Pharmacopeia, Twenty-third Revision (USP 23) (1995) United States Pharmacopeial Convention, Rockville, MD, USA
6. Guidelines for Establishing the Specifications and Analytical Procedures of New Drugs, Notification No. 586 (1994) by the director of the Pharmaceuticals and Cosmetic Division, MHW, Japan
7. ICH Harmonised Tripartite Guideline "Text on Validation of Analytical Procedures", October 1994, and Draft Consensus Guideline "Extension of the ICH Text on Validation of Analytical Procedures", November 1995, by the ICH Steering Committee, International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use


Accred Qual Assur (1998) 3:155–160 © Springer-Verlag 1998 PRACTITIONER'S REPORT

Reports and notes on experiences with quality assurance, validation and accreditation

Ricardo J. N. Bettencourt da Silva · M. Filomena G. F. C. Camões · João Seabra e Barros

Validation of the uncertainty evaluation for the determination of metals in solid samples by atomic spectrometry

Received: 3 November 1997 / Accepted: 2 January 1998

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29–30 September 1997

R. J. N. Bettencourt da Silva (✉) · M. F. G. F. C. Camões
CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

J. Seabra e Barros
Instituto Nacional de Engenharia e Tecnologia Industrial, Estrada do Paço do Lumiar, P-1699 Lisbon Codex, Portugal

Abstract Every analytical result should be expressed with some indication of its quality. The uncertainty as defined by Eurachem ("parameter associated with the result of a measurement that characterises the dispersion of the values that could reasonably be attributed to the, . . ., quantity subjected to measurement") is a good tool to accomplish this goal in quantitative analysis. Eurachem has produced a guide to the estimation of the uncertainty attached to an analytical result. Indeed, the estimation of the total uncertainty by using uncertainty propagation laws is component-dependent. The estimation of some of those components is based on subjective criteria. The identification of the uncertainty sources and of their importance, for the same method, can vary from analyst to analyst. It is important to develop tools which will support each choice and approximation. In this work, the comparison of an estimated uncertainty with an experimentally assessed one, through a variance test, is performed. This approach is applied to the determination by atomic absorption of manganese in digested samples of lettuce leaves. The total uncertainty estimation is calculated assuming 100% digestion efficiency with negligible uncertainty. This assumption was tested.

Key words Uncertainty · Validation · Quality control · Solid samples · Atomic spectrometry

Introduction

The presentation of an analytical result must be accompanied by some indication of the data quality. This information is essential for the interpretation of the analytical result. The comparison of two results cannot be performed without knowledge of their quality. Eurachem [1] defined uncertainty as the "parameter associated with the result of a measurement that characterises the dispersion of the values that could reasonably be attributed to the, . . ., quantity subjected to measurement" and presented it as a tool to describe that quality.

The Eurachem guide for "Quantifying uncertainty in analytical measurement", which is based on the application of the ISO guide [2] to the chemical problem, was observed. ISO aims at the estimation of uncertainty in the most exact possible manner, in order to avoid excess of confidence in overestimated results. The application of these guides turns out to be a powerful tool. The exact estimation of uncertainties is important for the detection of small trends in analytical data.


The time and effort used in such estimations can avoid many further doubts concerning observation of legal limits and protect the user of the analytical data from financial losses. The use of uncertainty instead of less informative percentage criteria brings considerable benefits to daily quality control.

Despite the analyst's experience, some analytical steps, like sampling and recovery, are particularly difficult to estimate. Mechanisms should be developed to support certain choices or approximations. The comparison of an estimated uncertainty with the experimentally assessed one can be of help.

In this work the Eurachem guide [1] was used for the estimation of uncertainties involved in the determination by electrothermic atomic absorption spectrometry (EAAS) of manganese in digested lettuce leaves. The total uncertainty estimation was calculated assuming a 100% digestion efficiency with negligible uncertainty. The experimental precision was compared with an estimated one for the purpose of validation of the proposed method of evaluation. After this validation the uncertainty estimation was used in an accuracy test and in routine analysis with the support of a spreadsheet programme.

The uncertainty estimation process

The uncertainty estimation can be divided into four steps [1]: (1) specification, (2) identification of uncertainty sources, (3) quantification of uncertainty components, and (4) total uncertainty estimation.

Specification

A dry-base content determination method is proposed, the sample moisture determination being done in parallel. Figure 1 represents the different steps of the analysis. The analytical procedure was developed for laboratory samples. Sampling uncertainties were not considered.

The dry-base correction factor, $f_{\mathrm{corr}}$, is calculated from the weights of the vial ($z$), vial plus non-dried sample ($x$) and vial plus dried sample ($y$):

$$f_{\mathrm{corr}} = 1 - \frac{x - y}{x - z} \qquad (1)$$

The sample metal content, $M$, is obtained from the interpolated concentration in the calibration curve, $C_{\mathrm{inter}}$, the mass of the diluted digested sample, $a$, and the dilution factor, $f_{\mathrm{dil}}$ (digested sample volume times dilution ratio):

$$M = \frac{C_{\mathrm{inter}} \cdot f_{\mathrm{dil}}}{a} \qquad (2)$$

Fig. 1 Proposed method for dry-base metal content determination in lettuce leaves

The dry-base content, $D$, is obtained by application of the correction factor, $f_{\mathrm{corr}}$, to the metal content, $M$:

$$D = f_{\mathrm{corr}} \cdot M \qquad (3)$$
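A worked numeric pass through Eqs. 1–3 may make the data flow concrete. All weights, concentrations and volumes below are invented for illustration.

```python
# Minimal sketch: Eqs. 1-3 with invented numbers.
x = 12.3456   # vial + non-dried sample (g)
y = 12.3398   # vial + dried sample (g)
z = 12.1456   # empty vial (g)
f_corr = 1 - (x - y) / (x - z)     # Eq. 1, dry-base correction factor

c_inter = 3.8       # interpolated concentration from the calibration curve
f_dil = 50.0 * 2    # digested sample volume times dilution ratio
a = 0.200           # mass of diluted digested sample (g)
m = c_inter * f_dil / a            # Eq. 2, sample metal content

d = f_corr * m                     # Eq. 3, dry-base content
print(f"f_corr = {f_corr:.4f}, M = {m:.0f}, D = {d:.0f}")
```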

Identification of uncertainty sources

The uncertainty associated with the determination of $f_{\mathrm{corr}}$ is estimated from the combination of the three weighing steps involved (Fig. 1a).

The uncertainty associated with the sample metal content is estimated from the weighing, dilution and interpolation sources (Fig. 1b). The model used for the calculation of the contribution from the interpolation source assumes negligible standards preparation uncertainty when compared with the instrumental random oscillation [5, 10].

Quantification of the uncertainty components

The quantification of the uncertainty is divided into equally treated operations:

Gravimetric operations

The weighing operations are present in the dry-base correction factor (three) and in the sample metal content (one). Two contributions to the associated uncertainty, $s_{\mathrm{Weighing}}$, were studied:
1. Uncertainty associated with the repeatability of the weighing operations, $s^{\mathrm{Balance}}_{\mathrm{Repeat}}$, is obtained directly from the standard deviation of successive weighing operations. The corresponding degrees of freedom are the number of replicates minus 1.


2. Uncertainty associated with the balance calibration, $s^{\mathrm{Balance}}_{\mathrm{Calib}}$, defined by

$$s^{\mathrm{Balance}}_{\mathrm{Calib}} = \frac{2 \times \mathrm{Tolerance}}{\sqrt{12}} \qquad (4)$$

where the Tolerance is obtained from the balance calibration certificate.

The Eurachem guide suggests that when the uncertainty components are described by a confidence interval, $a \pm b$, without information on degrees of freedom, the associated uncertainty is $2b/\sqrt{12}$, which represents the uncertainty of a rectangular distribution of amplitude $2b$. These uncertainties are designated type B. The number of degrees of freedom associated with the $s^{\mathrm{Balance}}_{\mathrm{Calib}}$ type B estimation, $\nu^{\mathrm{Balance}}_{\mathrm{Calib}}$, is approximately [1] equal to

$$\nu^{\mathrm{Balance}}_{\mathrm{Calib}} \approx \frac{1}{2}\left(\frac{s^{\mathrm{Balance}}_{\mathrm{Calib}}}{m^{\mathrm{Balance}}_{\mathrm{Calib}}}\right)^{-2} \qquad (5)$$

where $m^{\mathrm{Balance}}_{\mathrm{Calib}}$ is the mass associated with the balance calibration tolerance.

The two uncertainties are then combined:

$$s_{\mathrm{Weighing}} = \sqrt{(s^{\mathrm{Balance}}_{\mathrm{Calib}})^2 + (s^{\mathrm{Balance}}_{\mathrm{Repeat}})^2} \qquad (6)$$

The corresponding degrees of freedom are calculated by the Welch–Satterthwaite equation [1–3]. When the pairs (uncertainty, degrees of freedom)

$(s_a, \nu_a);\ (s_b, \nu_b);\ (s_c, \nu_c);\ (s_d, \nu_d);\ \ldots$

for the quantities $a, b, c, d, \ldots$ in a function $\omega = f(a, b, c, d, \ldots)$ are taken into account, then the effective number of degrees of freedom associated with $\omega$, $\nu_\omega$, is

$$\nu_\omega = \frac{s_\omega^4}{\left(\dfrac{\partial f}{\partial a}\right)^4 \dfrac{s_a^4}{\nu_a} + \left(\dfrac{\partial f}{\partial b}\right)^4 \dfrac{s_b^4}{\nu_b} + \left(\dfrac{\partial f}{\partial c}\right)^4 \dfrac{s_c^4}{\nu_c} + \left(\dfrac{\partial f}{\partial d}\right)^4 \dfrac{s_d^4}{\nu_d} + \cdots} \qquad (7)$$
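Equation 7 is straightforward to mechanize once each input is expressed as its contribution to the combined uncertainty, i.e. as the pair $((\partial f/\partial x_i)\,s_i,\ \nu_i)$. A minimal sketch with illustrative numbers:

```python
# Minimal sketch of Eq. 7 (Welch-Satterthwaite). Each component is
# given as its contribution to the combined uncertainty together
# with its degrees of freedom; the numbers are illustrative.
def welch_satterthwaite(components):
    """components: list of (u_i, nu_i) pairs.
    Returns (combined uncertainty, effective degrees of freedom)."""
    u_comb = sum(u**2 for u, _ in components) ** 0.5
    nu_eff = u_comb**4 / sum(u**4 / nu for u, nu in components)
    return u_comb, nu_eff

# e.g. a type B calibration term (very large nu) plus a repeatability term
u, nu_eff = welch_satterthwaite([(0.00012, 1e6), (0.00020, 9)])
print(f"combined u = {u:.5f}, effective degrees of freedom = {nu_eff:.0f}")
```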

The calculation of the uncertainty and of the degrees of freedom associated with the sample weight is by the direct application of Eqs. 4–7. The calculations for the dry-base correction factor are more elaborate.
3. Uncertainty associated with the dry-base factor
The dry-base correction factor is a function of three weighing operations (Eq. 1). To estimate the uncertainty, $s_{f_{\mathrm{corr}}}$, associated with $f_{\mathrm{corr}}$, the general equation (Eq. 8) was used [1]:

$$s_{f_{\mathrm{corr}}} = \sqrt{\left(\frac{\partial f_{\mathrm{corr}}}{\partial x}\right)^2 s_x^2 + \left(\frac{\partial f_{\mathrm{corr}}}{\partial y}\right)^2 s_y^2 + \left(\frac{\partial f_{\mathrm{corr}}}{\partial z}\right)^2 s_z^2} \qquad (8)$$

It is therefore

$$s_{f_{\mathrm{corr}}} = \sqrt{\left(\frac{y - z}{(x - z)^2}\right)^2 s_x^2 + \left(\frac{-1}{x - z}\right)^2 s_y^2 + \left(\frac{x - y}{(x - z)^2}\right)^2 s_z^2} \qquad (9)$$

The values of $s_x$, $s_y$ and $s_z$ are then calculated as described in the section "Gravimetric operations" above. The number of degrees of freedom is calculated by the Welch–Satterthwaite equation (Eq. 7). The application of a spreadsheet program available in the literature simplifies this task [3]. However, the classical approach is more flexible for different experimental configurations or for one or more dilution steps, and is also easily automated.
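One simple way to check an analytic propagation formula such as Eq. 9 is to compare it with Eq. 8 evaluated using numerical derivatives. The weights and uncertainties in this sketch are invented; the two results should agree to several significant figures.

```python
# Minimal sketch: checking Eq. 9 against Eq. 8 with numerical
# derivatives of Eq. 1. Weights and uncertainties are invented.
def f_corr(x, y, z):
    return 1 - (x - y) / (x - z)

x, y, z = 12.3456, 12.3398, 12.1456   # weights (g)
sx = sy = sz = 0.0002                 # weighing uncertainties (g)

# Analytic sensitivities, as in Eq. 9
s_analytic = (((y - z) / (x - z)**2)**2 * sx**2
              + (1 / (x - z))**2 * sy**2
              + ((x - y) / (x - z)**2)**2 * sz**2) ** 0.5

# General form of Eq. 8 with central finite differences
h = 1e-7
d_dx = (f_corr(x + h, y, z) - f_corr(x - h, y, z)) / (2 * h)
d_dy = (f_corr(x, y + h, z) - f_corr(x, y - h, z)) / (2 * h)
d_dz = (f_corr(x, y, z + h) - f_corr(x, y, z - h)) / (2 * h)
s_numeric = ((d_dx * sx)**2 + (d_dy * sy)**2 + (d_dz * sz)**2) ** 0.5

print(f"analytic: {s_analytic:.3e}   numeric: {s_numeric:.3e}")
```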

Volumetric operations

The uncertainties associated with the volumetric operations were calculated from the combination of two [(1) and (2) below] or three [(1), (2) and (3) below] components:
1. Uncertainty associated with volume calibrations, $s^{\mathrm{Vol}}_{\mathrm{Calib}}$:

$$s^{\mathrm{Vol}}_{\mathrm{Calib}} = \frac{2 \times \mathrm{Tolerance}}{\sqrt{12}} \qquad (10)$$

where the information on this tolerance is normally available with the instrument in the form: volumetric instrument volume ± tolerance. This type B uncertainty estimation has the same treatment as the one reported in Eq. 5 for the degrees of freedom:

$$\nu^{\mathrm{Vol}}_{\mathrm{Calib}} \approx \frac{1}{2}\left(\frac{s^{\mathrm{Vol}}_{\mathrm{Calib}}}{V}\right)^{-2} \qquad (11)$$

where $\nu^{\mathrm{Vol}}_{\mathrm{Calib}}$ is the number of degrees of freedom associated with $s^{\mathrm{Vol}}_{\mathrm{Calib}}$ for a certain volume $V$.

2. Uncertainty associated with volume repeatability tests, $s^{\mathrm{Vol}}_{\mathrm{Repeat}}$
The $s^{\mathrm{Vol}}_{\mathrm{Repeat}}$ and the corresponding degrees of freedom, $\nu^{\mathrm{Vol}}_{\mathrm{Repeat}}$, are extracted directly from the repeatability tests. Such tests consist of successive weighings of water volumes measured by the instrument. The observed standard deviation is a function of the analyst's expertise.
3. Uncertainty associated with the use of volumetric equipment at a temperature different from that of calibration, $s^{\mathrm{Vol}}_{\mathrm{Temp}}$
This third component corrects for errors associated with the use of material calibrated at 20 °C in solutions at 20 ± 3 °C. When two consecutive volumetric operations are performed at the same temperature, as is the case in dilution stages, they become self-corrected for this effect.

The glass instrument expansion coefficient is much smaller than that of the solution. For this reason we have calculated only the latter.


For a temperature oscillation of $\Delta T = \pm 3\,\mathrm{K}$ with a 95% significance level and for a volumetric expansion coefficient of pure water of $2.1 \times 10^{-4}\,°\mathrm{C}^{-1}$ (our solutions can be treated as pure water because of their low concentrations), the 95% volume confidence interval becomes $V \pm V \times 3 \times 2.1 \times 10^{-4}$. Dividing the expanded uncertainty by the Student $t$ value, $t(\infty, 95\%) = 1.96$, we obtain the temperature effect component uncertainty

$$s^{\mathrm{Vol}}_{\mathrm{Temp}} = \frac{V \times 3 \times 2.1 \times 10^{-4}}{1.96} \qquad (12)$$

The number of degrees of freedom due to the temperature effect can also be estimated as for $\nu^{\mathrm{Vol}}_{\mathrm{Calib}}$ (Eq. 11), substituting $s^{\mathrm{Vol}}_{\mathrm{Calib}}$ by $s^{\mathrm{Vol}}_{\mathrm{Temp}}$.

These components are then combined to calculate the volume uncertainty, $s_{\mathrm{Vol}}$:

$$s_{\mathrm{Vol}} = \sqrt{(s^{\mathrm{Vol}}_{\mathrm{Calib}})^2 + (s^{\mathrm{Vol}}_{\mathrm{Repeat}})^2 + (s^{\mathrm{Vol}}_{\mathrm{Temp}})^2} \qquad (13)$$

The number of degrees of freedom associated with $s_{\mathrm{Vol}}$ can also be calculated by the Welch–Satterthwaite equation.
4. Uncertainty associated with the dilution factor

Our analytical method has three volumetric steps that can be combined as a dilution factor, $f_{\mathrm{dil}}$, whose uncertainty, $s_{f_{\mathrm{dil}}}$, can easily be estimated by:

$$\frac{s_{f_{\mathrm{dil}}}}{f_{\mathrm{dil}}} = \sqrt{\left(\frac{s^{\mathrm{Vol}}_{\mathrm{DSV}}}{V_{\mathrm{DSV}}}\right)^2 + \left(\frac{s^{\mathrm{Vol}}_{\mathrm{P}}}{V_{\mathrm{P}}}\right)^2 + \left(\frac{s^{\mathrm{Vol}}_{\mathrm{V}}}{V_{\mathrm{V}}}\right)^2} \qquad (14)$$

where DSV, P and V stand respectively for the digested solution volume, the dilution operation pipette and the dilution operation vial; $s_{\mathrm{Vol}}$ and $V$ represent respectively each corresponding volumetric uncertainty and volume. As in the other cases, the degrees of freedom were calculated by the Welch–Satterthwaite equation.

Sample signal interpolation from a calibration curve

The mathematical model used to describe our calibration curve was validated by the method of Penninckx et al. [4]. At this stage we proved the good fitting properties of the unweighted linear model to our calibration curve. With this treatment we aimed not only at accuracy but also at the estimation of more realistic sample signal interpolation uncertainties. These uncertainties were obtained by the application of an ISO international standard [5].

The instrument was calibrated with four standards (0–2–4–6 mg/L for Mn) with three measurement replicates each [4]. Samples and a control standard (4 mg/L for Mn) were also measured three times. The control standard was analysed for calibration curve quality control (see "Quality control").

Total uncertainty estimation

The total uncertainty estimation, $s_T$, is a function of the dry-base correction factor uncertainty, $s_{f_{\mathrm{corr}}}$, of the uncertainty associated with the analysis sample weighing operation, $s^{\mathrm{Sample}}_{\mathrm{Weighing}}$, of the dilution factor uncertainty, $s_{f_{\mathrm{dil}}}$, and of the instrumental calibration interpolation uncertainty, $s_{C_{\mathrm{inter}}}$. These four quantities combine their uncertainties in the equation

$$\frac{s_T}{D} = \sqrt{\left(\frac{s^{\mathrm{Sample}}_{\mathrm{Weighing}}}{a}\right)^2 + \left(\frac{s_{f_{\mathrm{corr}}}}{f_{\mathrm{corr}}}\right)^2 + \left(\frac{s_{f_{\mathrm{dil}}}}{f_{\mathrm{dil}}}\right)^2 + \left(\frac{s_{C_{\mathrm{inter}}}}{C_{\mathrm{inter}}}\right)^2} \qquad (15)$$

where $D$ represents the dry-base sample metal content and $a$ has the same meaning as in Eq. 2. The other quantities have already been described.

The expanded uncertainty can then be estimated after the calculation of the effective number of degrees of freedom, df (Eq. 7). The coverage factor used was therefore the Student $t$ defined for that number of degrees of freedom and a 95% significance level, $t(\mathrm{df}, 95\%)$. The estimated confidence interval is defined by

$$D \pm s_T \cdot t(\mathrm{df}, 95\%) \qquad (16)$$
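A minimal sketch of Eqs. 15 and 16 follows; the relative contributions are invented, scipy is assumed to be available for the Student t quantile, and the effective degrees of freedom mirror the order of magnitude reported later in this paper.

```python
# Minimal sketch of Eqs. 15 and 16 with invented contributions.
from scipy.stats import t

D = 95.0                       # dry-base content (mg/kg)
relative_contributions = {     # (uncertainty / value) for each source
    "sample weighing": 0.0010,
    "f_corr":          0.0015,
    "f_dil":           0.0020,
    "interpolation":   0.0060,
}
s_T = D * sum(r**2 for r in relative_contributions.values()) ** 0.5  # Eq. 15
df = 57500                     # effective degrees of freedom (Eq. 7)
k = t.ppf(0.975, df)           # ~1.96 for large df, 95% level
print(f"D = {D:.1f} +/- {k * s_T:.2f} mg/kg (95% confidence level)")  # Eq. 16
```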

Quality control

Ideally, the readings of the instruments for each sample and for each standard should be random [6, 7]. Normally, the instrument software separates the calibration from the sample reading. Although this allows an immediate calculation, it can produce gross errors if the operator does not verify the drift of the instrument response. For this reason, the calibration curves should be tested from time to time by reading a well-known control standard. This standard can also be prepared from a different mother solution than the calibration standards, for stability and preparation checking.

Normally, laboratories use fixed and inflexible criteria for this control. They define a limit for the percentage difference between the expected and the obtained value, and in low-precision techniques they are obliged to increase this value. Assuming the uncertainty associated with the control standard preparation to be negligible when compared to the instrumental uncertainty, the case-to-case interpolation uncertainties can be used as a fit for each case. If the observed confidence interval does not include the expected value, there is reason to think that the system is not under control. The instrumental deviation from control can be used as a guide for instrumental checking or as a warning of the inadequacy of the chosen mathematical model for the calibration.
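A sketch of such a case-by-case control check follows; all values are invented, and the coverage factor of 1.96 assumes a large number of degrees of freedom.

```python
# Minimal sketch: control-standard check using the case-to-case
# interpolation uncertainty instead of a fixed percentage limit.
expected = 4.0      # control standard concentration (mg/L)
measured = 3.86     # value interpolated from today's calibration curve
s_inter = 0.06      # interpolation standard uncertainty
k = 1.96            # coverage factor, 95% level, large df assumed

low, high = measured - k * s_inter, measured + k * s_inter
status = "under control" if low <= expected <= high else "OUT OF CONTROL"
print(f"interval {low:.2f}-{high:.2f} vs expected {expected}: {status}")
```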


Fig. 2 Repeatability test. The confidence intervals are represented by the average value plus the estimated expanded uncertainty for a 95% confidence level

Validation of the uncertainty estimation

Method validation is the process of demonstrating the ability of a method to produce reliable results [8]. An analytical result should be expressed along with a confidence interval and a confidence level. The confidence interval can be described by a mean value and an interval width. Therefore the validation depends on the reliability of the confidence interval width estimation. The accuracy test can be performed exactly only after that step.

The statistical equivalence between the estimated and the observed values can be used to confirm that quality. The F-test [10] is a good tool for comparing (non-expanded) uncertainties.
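A sketch of this F-test comparison, assuming scipy is available; the uncertainties and degrees of freedom mirror those reported in the repeatability test below.

```python
# Minimal sketch: F-test on (non-expanded) uncertainties.
from scipy.stats import f

s_exp, df_exp = 0.82, 9        # experimental uncertainty (mg/kg)
s_est, df_est = 0.73, 57500    # estimated uncertainty (mg/kg)

F = (s_exp / s_est) ** 2                # larger variance on top here
F_crit = f.ppf(0.975, df_exp, df_est)   # upper critical value, 95% two-sided
verdict = "statistically equivalent" if F < F_crit else "not equivalent"
print(f"F = {F:.2f}, F_crit = {F_crit:.2f} -> {verdict}")
```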

Application of uncertainty validation schemes

The proposed uncertainty validation method was applied to the dry-base determination of manganese in digested lettuce leaves by electrothermic atomic absorption spectrometry. The proposed quality control scheme was also applied.

The 200-mg samples were digested with nitric acid in a microwave-irradiated closed system [11]. The instrumental determination was performed in a GBC atomic spectrometer with D2 lamp background correction. A Pd/Mg mixture [12] was used as chemical modifier. The dry-base correction factor was calculated by a parallel assay. The samples were dried in an oven at 60 °C under atmospheric pressure, and the CRM (certified reference material – NIST 1570a) was treated as specified by NIST [13].

Repeatability test

The estimated uncertainties were compared with the experimental ones by an F-test at the 95% confidence level [10]. Figure 2 represents the obtained experimental values associated with the estimated expanded uncertainty (95% confidence level). The coverage factor used was 1.96 for the average effective number of degrees of freedom, df, of 57500. The Eurachem [1] proposal of a coverage factor of 2 is adequate for this case.

The replicates 1 and 5 (REP1, REP5) are consecutive single outliers (Grubbs test) at a 95% confidence level [9]. Therefore, they have not been used for the experimental uncertainty calculation. The two uncertainties are statistically equivalent for the test used (experimental uncertainty: 0.82 mg/kg for 9 df; estimated uncertainty: 0.73 mg/kg for 57500 df) at the 95% confidence level.

Fig. 3 Accuracy test on spinach leaves NIST CRM. The obtained values (NIST 1, 2 and 3) were associated with a 95% confidence level expanded uncertainty. The certified value is also presented for the same confidence level

Accuracy test

The accuracy test was performed with spinach leaves(NIST 1570a) because of their claimed similarity withlettuce leaves in terms of proteins, carbohydrates, fibreand inorganic matter content [14]. The validated uncer-tainty estimation was used for the comparison of ob-tained values with certified ones (Fig. 3).

The loss of precision in EAAS with the time of useof furnace is taken into account in the case-to-case in-terpolation uncertainty calculation. The accuracy is re-tained with a larger confidence interval.

The overlapping of these intervals indicates that there is no reason to think that our method lacks accuracy. Our analytical method can be considered to perform as badly or as well as the NIST methods. This assumption seems sufficiently valid to consider the method validated.



Conclusions

The assumption of 100% efficient digestion with negligible uncertainty is valid for the total uncertainty estimation of the presented example. This uncertainty estimation proved to be a valuable criterion for method validation and quality control, which can be tested by a very simple procedure. The easy routine use of an exact treatment in a spreadsheet program can be useful for the more demanding situations. Nevertheless, further approximations can be easily tested.

Acknowledgements Thanks are due to JNICT for financial support, to CONTROLAB LDA for the instrumental facilities that made this work possible, as well as to the teams of INETI and CONTROLAB LDA for their support.

References

1. Eurachem (1995) Quantifying uncertainty in analytical measurement, version 6
2. ISO (1993) Guide to the expression of uncertainty in measurement, Geneva, Switzerland
3. Kragten J (1994) Analyst 119:2161–2165
4. Penninckx W, Hartmann C, Massart DL, Smeyers-Verbeke J (1996) J Anal At Spectrom 11:237–246
5. ISO International Standard 8466-1 (1990) Water quality – calibration and evaluation of analytical methods and estimation of performance characteristics – Part 1: Statistical evaluation of performance characteristics, Geneva
6. Analytical Methods Committee (1994) Analyst 119:2363–2366
7. Staats G (1995) Fresenius J Anal Chem 352:413–419
8. Taylor JK (1983) Anal Chem 55:600A–608A
9. Grubbs FE, Beck G (1972) Technometrics 14:847–854
10. Miller JC, Miller JN (1988) Statistics for analytical chemistry, 2nd edn. Wiley, UK
11. Deaker M, Maher W (1995) J Anal At Spectrom 10:423–431
12. Soares ME, Bastos ML, Carvalho F, Ferreira M (1995) At Spectrosc 4:149–153
13. NIST (1994) Certificate of Analysis SRM 1570a, Trace elements in spinach leaves
14. Penninckx W, Smeyers-Verbeke J, Vankeerberghen P, Massart DL (1996) Anal Chem 68:481–489


Accred Qual Assur (1998) 3:189–193 © Springer-Verlag 1998

Reports and notes on experiences with quality assurance, validation and accreditation

Robert J. Wells

Validation requirements for chemical methods in quantitative analysis – horses for courses?

Received: 22 September 1997
Accepted: 28 November 1997

R.J. Wells (✉)
Australian Government Analytical Laboratories, 1 Suakin Street, Pymble, NSW 2073, Australia

Abstract Although the validation process necessary to ensure that an analytical method is fit for purpose is universal, the emphasis placed on different aspects of that process will vary according to the end use for which the analytical procedure is designed. It therefore becomes difficult to produce a standard method validation protocol which will be totally applicable to all analytical methods. It is probable that far more than 30% of the methods in routine laboratory use have not been validated to an appropriate level to suit the problem at hand. This situation needs to change and a practical assessment of the degree to which a method requires to be validated is the first step to a reliable and cost effective analytical industry.

Key words Validation · Verification · Veterinary drugs · Anabolic steroids

Introduction and background

All methods used in analytical chemistry are subject to error. Therefore it is vital that each method should be evaluated and tested to ensure that it produces results which make it suitable for the intended purpose. Method validation and verification is the implementation of this evaluation process [1–5]. However, the extent and rigor with which a particular method is evaluated is dependent on the intended use and past experience with the method.

Method validation is the process in which every stage of a new analytical method is subjected to a rigorous series of tests to ensure that the method will be able to deliver all the outcomes required of it. The confidence that the method can deliver these outcomes is expressed in terms of statistical probability over the whole analyte concentration range established during the validation process. Verification of a method is a simplified form of the validation process. It involves the testing of a series of method parameters to ensure that a previously validated analytical procedure performs as reported when it is introduced into a new environment where, at the very least, equipment may not be identical to that employed in the initial validation.

Established methods must, as a minimum requirement, be verified when introduced into a laboratory for the first time. Verification, strictly speaking, is also necessary if the method is modified or applied to a new situation, for example a different sample matrix. A new method must be subject to a much more searching series of validation procedures, each one of which adds further confidence in the analytical results obtained. While method validation is mandatory for assurance of analytical quality, the cost to a laboratory is significant. It is therefore important for the financial well-being of a laboratory that validation should adopt no more than those procedures necessary to ensure the analytical quality demanded by a client.

In general, validation processes are universal, but the rigor with which these processes are applied will vary with the intended use of the method. This paper will outline validation principles and discuss the degree to which these principles must be implemented to achieve the necessary confidence in the analytical result obtained.



Table 1

Requirements for both screening and confirmatory methods:
– Specificity (selectivity)
– Sensitivity
– Calibration and linearity
– Accuracy (bias)
– Repeatability of method
– Range
– Limit of detection (LOD)

Additional requirements for confirmatory methods:
– Recovery
– Limit of quantitation (LOQ)
– Quality control
– Repeatability (between analysts)
– Reproducibility (between laboratories)
– Ruggedness
– System suitability test

Table 2

Preliminary stages
1. Identify requirements: needs of client
2. Method selection: literature search, recommendation of colleagues; novel approach, regulatory requirements; staff expertise and requirements for staff training
3. Development of candidate method: preliminary testing

Method validation
4. Calibration and linearity: goodness of fit
5. Accuracy (bias): comparison with reference method, spiked samples; certified reference materials, collaborative tests
6. Repeatability: within day, within lab
7. Reproducibility: day to day, between labs
8. Sensitivity, LOD and LOQ: instrumental specifications
9. Specificity: interference studies
10. Recovery: samples of known concentration, preparation efficiency
11. Quality control: spiked samples, blanks, critical control points
12. System suitability test: routine acceptance criteria, control charting

Validity of method to meet required analytical outputs, including calculation of uncertainty for all relevant stages, is now established.

13. Produce method documentation: write up experimental/validation work
14. Write user instructions: prepare standard operating procedure (SOP)
15. Management authorisation
16. Use method
17. Method review: updating SOPs


Types of method, screening vs confirmation

Methods can be classified in a number of ways, but in the present instance an important distinction should be made between screening and confirmatory methods. Confirmatory methods should include some or all of the parameters shown in Table 1. Screening methods require a limited sub-set of all of the parameters used for method validation, but should include those parameters indicated in the first column of Table 1.

Stages of validation

A distinction can be made between establishing the performance characteristics of a method and complete method validation. After the performance of a method has been assessed, complete validation requires evaluation during its application to the intended purpose, using reference methods for comparison, certified reference materials if available, and interlaboratory comparisons in order to obtain a realistic estimate of the uncertainty of routine results. It therefore follows that, where possible, laboratories should not work alone but collaborate in interlaboratory studies.

Important protocols for method validation in the literature have been derived, amongst others, from the Current Good Manufacturing Practice, Code of Federal Regulations, Food and Drug Administration, National Drug Administration, the United States Pharmacopoeia Convention, the American Public Health Association and the International Conference on Harmonization.

The scheme shown in Table 2 is the ideal validation procedure for most methods. However, this is a daunting list, which many laboratories would wish to abbreviate without compromising the required analytical quality.

Metrological requirements

A cornerstone of the development of a method is that all results are traceable. This has been discussed in detail in an earlier paper in this journal by Price [6]. Current requirements of QA and GLP are that all laboratory equipment should have maintenance and calibration records kept in a log book. Such calibration procedures must be traceable to a reference or primary standard. This is true of both general and specialized laboratory equipment, but traceability of some specialized laboratory equipment (e.g. mass spectrometers) may be problematical.




A more serious problem arises when the traceability of the whole analytical process is considered. Complete traceability through to the mole is an excellent ideal but nowhere near achievable at present. Comparability between chemical analyses by use of certified reference materials (CRMs) may be regarded as the initial step in the traceability chain to the mole. Even this relatively small first step is constrained by a number of factors. Some of these are:
– A CRM is usually certified using a specific method. Therefore the same method should be used unchanged for all analyses deemed to be traceable to that CRM.
– There are only a limited number of CRMs available.
– CRMs are often only available for a particular matrix at a particular analyte concentration. Any deviation from either sample matrix or a specific concentration would invalidate traceability.

Added to these constraints are the issues raised by the certainty of a result in terms of both analyte identification and quantitation. The uncertainty of a result is dependent on the analyte concentration. In trace analysis it might be argued that the traceability of a method in identifying a substance would have a traceability chain completely different from that of a method which quantitates the same substance. The traceability chain for a method which both quantitates and identifies a substance could be different again. This differentiation is important in many regulatory analyses in which a zero tolerance for a substance has been set. Under these circumstances, only the presence of the substance has to be established. This problem is discussed in more detail later in this paper.

Practical applications of the validation process

Thus far, this paper has simply summarized information covered in standard texts. Two different practical situations are now described in which it is necessary to emphasize certain parts of the validation process to ensure confidence in the results, while minimizing or even ignoring other processes which have no impact on the analytical outcome. In discussing these separate examples, we hope to demonstrate that a pragmatic rather than a strictly rigid approach to validation must be adopted to economically accommodate various needs.

Establishment and regulation of maximum residue limits (MRLs) for veterinary drugs

One of the two Joint WHO/FAO Expert Committees on Food Additives (JECFA) has the task of recommending appropriate MRLs for veterinary drug residues in edible animal tissues based on toxicological assessments of a particular drug coupled with an accurate knowledge of the depletion of the drug and/or its metabolites from various animal tissues over time. These recommendations then go forward to the Codex Alimentarius for ratification and adoption for regulatory purposes. However, the validation requirements for an acceptable method to generate residue depletion data may be distinctly different from the requirements for a satisfactory regulatory method.

A residue depletion method for a veterinary drug

This is usually developed and used in-house (under GLP) by a company for the specific purpose of measuring residues of the drug in tissues. Seldom is the method tested in another laboratory. Since the drug is administered under controlled conditions, interferences due to the presence of related drugs are never a problem. Indeed, related drugs are often used as surrogates or internal standards in such methods. Therefore only matrix interferences are important in the validation process. Analyte recovery and accuracy of quantification are of prime importance, but since no standard reference materials are available, these parameters must be assessed from spiked samples. The LOQ must be as low as possible and the CV of the method must also be low (ideally <10% at the LOQ) and certainly lower than the animal-to-animal variability. The complexity (and therefore the cost) of the method is not a vital factor, nor is the suitability of the method for routine application of great importance. A final correction for recovery is necessary in order to establish the actual residue concentrations in tissue. Often a liquid chromatography (LC) method using pre- or post-column derivatization is the method of choice to meet the analytical objective.

A regulatory method for a veterinary drug

This often has distinctly different requirements. Since residue testing is a factor in world trade, it is vital that there is comparability of results between national laboratories undertaking drug residue monitoring. Thus, an interlaboratory trial of the method is very desirable. Furthermore, it is most unlikely that all regulatory laboratories will be identically equipped, and a regulatory method must be robust and flexible enough to cope with such variation. Another major requirement of the regulator, which is of little concern to the drug manufacturer, is the strong preference for an economical multi-residue method suitable for the quantitation of all drugs of a particular class. An optimum method would combine high efficiency isolation and quantitation with the confirmation of a wide range of drugs of different classes from the same matrix. The selectivity of a method is very important to achieve this aim. The LOQs of each drug included in this ideal method would need to be only 4–10 times lower than the MRL, and the recovery and %CV at the LOQ need not be as stringent as those required in residue depletion studies. The use of related drugs of regulatory interest as internal standards or surrogates would be undesirable in this method, and a gas chromatography-mass spectrometry (GC-MS) or LC-MS method using deuterium-labeled internal standards would probably be the most cost effective way of achieving most or all of the analytical goals.




Clearly satisfactory validation of a residue depletion method would be significantly different in many respects from that demanded by a regulatory method. But who should develop the 'regulatory method' and how is it to be decided if a method is suitable for regulatory purposes?

Codex have requested that, during the evaluation process for recommendation of MRLs, JECFA also ensure that the sponsor of the drug provide a "validated method suitable for regulatory purposes". The drug manufacturer might argue that this is not their responsibility. Codex might reply, with some justification, that if a company seeks registration of a drug then the provision of a method to detect and regulate the usage patterns should be mandatory. The development of a new analytical method, particularly a multi-residue method suitable for all edible tissues, is not an inexpensive enterprise, and, not surprisingly, many companies are reluctant to develop a regulatory method which fully meets the desires of the regulators. This impasse has still to be resolved to the acceptance of each party, with JECFA caught in the middle as both arbiter and judge.

Detection of the use of anabolic steroids in sport

The detection of the illegal use of drugs in sport will be a major priority for the Australian Government Analytical Laboratories (AGAL) for the next 3 years. The analyses carried out by the Australian Sports Drug Testing Laboratory in AGAL (NSW) illustrate a further set of analytical problems which test, to the full, the implementation of a rigid validation protocol. The detection of anabolic steroids in urine will be used as an example.

International Olympic Committee requirements for drug testing are simple. If any banned non-endogenous anabolic agent is unambiguously detected then the sample is positive. Since the protocol employed requires any positive sample to be analysed three times and since the criteria used to declare a sample positive are conservative, false positives should be zero. False negatives are not as easy to control since a decision is made from the initial screen. On top of what is essentially a very sensitive qualitative analysis is a requirement to measure the ratio of testosterone to epitestosterone as a method to detect the illegal use of exogenous testosterone. Again this is not quantitative, but accurate measurement of ratios of analytes at very low levels requires extensive validation information.

Steroids are excreted as the glucuronides and sulfates of the steroid and its metabolites. Enzymic hydrolysis yields the free steroids, which are derivatised and determined by gas chromatography-mass spectrometry (GC-MS). Positive identification of two or three different metabolites is needed to declare a positive. However, only a very few deuterated anabolic steroids or their metabolites are available as pure standards. Indeed, few non-isotopically labeled steroid metabolites are commercially available. Therefore a urine prepared by mixing incurred samples of all the known anabolic agents is used with every batch analysed to monitor the complete analytical procedure. Deuterated testosterone and methyltestosterone are added as surrogate and internal standard respectively. Only the concentration of added substances is known. When the day-to-day or even sample-to-sample variability of GC-MS is recognized together with the fact that all urines are of different complexity, the validation process becomes increasingly difficult.

Deuterated testosterone and DES (diethylstilbestrol) glucuronide allow estimation of the success of the glucuronide deconjugation and the recovery of free steroids. Use of the composite urine ensures that derivative separation and sensitivity are within experimental protocols, and deuterated methyltestosterone tests instrumental performance. Although the LOD of added substances can be measured, those of all the metabolites cannot be accurately obtained. Moreover, in real samples, the LOD for each group of metabolites will be different according to the interfering substances which are present. This is why the criteria used to date to declare a sample positive are so conservative – the athlete always gets the benefit of any doubt. The "traditional" validation of the whole process can only be based on the composite urine used as external standard and is centered around the limit of detection. However, acceptance-rejection criteria based on obtaining satisfactory GC-MS spectra for all designated ions of all target metabolites in the GC-MS run must also be considered as part of the validation process.




The introduction of GC-high resolution mass spectrometry has added another dimension to the problem because spectra with much higher signal-to-noise ratios can be obtained and many interferences screened out. This beckons a new era where detection limits will fall and the number of confirmed positives could well rise. Also, will new problems arise in deciding if the method has been adequately validated? That is, is it demonstrably fit for the purpose for which it was designed?

Method validation vs performance criteria validation

The debate about comparability of methods based on performance criteria is still current. The AOAC stand is that results can only be validly compared between laboratories if the same method and preferably the same equipment is used. Although many good arguments can be advanced for this approach, the time and resources required to set up and evaluate interlaboratory tests are often extreme. Also, this approach leads to inertia and reluctance to commit to change or improvement. The obvious argument against performance criteria based validation is the question of what is the benchmark against which the criteria are measured. If results are compared with those obtained for an SRM, then that SRM was probably determined by a standard method. The ideal situation arises when two independent methods give comparable results. This certainly happens – sometimes.

Conclusions

This debate reinforces the need for a rigorous system for traceability in analytical chemistry. The time must come when clients can be confident in the fitness of their results for the purpose for which they were obtained. It has been estimated that at least 30% of all chemical analysis is not fit for purpose. In Australia alone, this probably represents $A 100 000 000 per annum spent on analytical work which is worthless.

However, it is probable that far more than 30% of the methods in routine laboratory use have not been validated to an appropriate level to suit the problem at hand. This situation needs to change and a practical assessment of the degree to which a method requires to be validated is the first step to a reliable and cost-effective analytical industry.

References

1. ISO/IEC Guide 25 (draft 5, 1996) General requirements for the competence of testing and calibration laboratories. International Organization for Standardization, Geneva
2. Taylor JK (1983) Anal Chem 55:600A–608A
3. Green JM (1996) Anal Chem 68:305A–309A
4. Thompson M, Wood R (1995) Pure Appl Chem 67:649–666
5. Guideline for collaborative study of procedure to validate characteristics of a method of analysis (1989) J Assoc Off Anal Chem 72:694–704
6. Price G (1996) Accred Qual Assur 1:57–66


Accred Qual Assur (1998) 3:194–196 © Springer-Verlag 1998

Reports and notes on experiences with quality assurance, validation and accreditation

Raluca-Ioana Stefan · Hassan Y. Aboul-Enein

Validation criteria for developing ion-selective membrane electrodes for analysis of pharmaceuticals

Abstract The problem of validation criteria for developing ion-selective membrane electrodes for the analysis of pharmaceuticals arises from the connection between the reliability of ion-selective membrane electrode construction and the reliability of the analytical information. Liquid membrane selective electrodes are more suitable for validation than the solid variety. The influence of the stability of ion pair complexes from the membrane on various parameters (e.g. response, limit of detection, and selectivity) is discussed. Validation criteria are proposed.

Key words Ion-selective membrane electrodes · Reliability of construction · Reliability of analytical information · Validation criteria · Analysis of pharmaceuticals

Received: 18 September 1997
Accepted: 17 November 1997

R.-I. Stefan
Department of Analytical Chemistry, Faculty of Chemistry, University of Bucharest, Blvd. Republicii No. 13, 70346 Bucharest-3, Romania

H.Y. Aboul-Enein (✉)
Bioanalytical and Drug Development Laboratory, Biological and Medical Research (MBC-03), King Faisal Specialist Hospital and Research Centre, P.O. Box 3354, Riyadh 11211, Saudi Arabia
Fax: +966-1-442 7858
e-mail: [email protected]

Introduction

For pharmaceutical analyses, it is necessary to have reliable methods. Results obtained using ion-selective membrane electrodes are the best because of the simplicity, rapidity, and accuracy of direct and continuous measurement of the activity of the ions in the solution. Another very important reason for the selection of electrochemical sensors for pharmaceutical assay is non-interference of by-products when the purity of a raw material is to be determined.

Taking into account the definition of the validation of an analytical method given in the US Pharmacopoeia, "Validation of an analytical method is the process by which it is established, by laboratory studies, that the performance characteristics of the method meet the requirements for the intended analytical applications. Performance characteristics are expressed in terms of analytical parameters" [1], we can conclude that the performance of the method can be evaluated through the analytical parameters of the chemical sensors and by recovery studies.

Using electrochemical sensors and considering the sources of uncertainty given by Pan [2], viz. homogeneity, recovery, analysis blank, measurement standard, calibration, matrix effect and interferences, measuring instrument, and data processing, the main uncertainty sources, i.e. the homogeneity and the matrix effect, are eliminated.

The validation criteria for ion-selective membrane electrodes (ISMEs) can be established only by taking into account their reliability and its effect on the reliability of the analytical information that is obtained.



The reliability of the construction of ISMEs

For the construction of ISMEs, solid membranes and liquid membranes are proposed. The construction of solid membrane ISEs is based on the insertion of electroactive material (ion pair complex), or only of a counter ion, into a polyvinyl chloride (PVC) matrix. The PVC type as well as the plasticizer used for the membrane construction affect the reliability of the electrode's response. Thomas proposed some rules for the construction of solid membranes based on a PVC matrix [3–5]. The procedure most often used for solid membrane construction is known as the Moody and Thomas procedure [5], as this assures the best reliability of solid membrane electrodes.

Reliable liquid membrane electrodes are obtained by impregnating a graphite rod of spectral purity with the solution that contains the ion pair complex dissolved in the appropriate solvent [6]. The rod is attached to the end of a polytetrafluoroethylene (PTFE) tube.

Why liquid membrane ion-selective electrodes are more suitable for validation than solid membrane ion-selective electrodes

The first principle for ISME validation is the reproducibility of their construction. The construction of the solid membrane is not reproducible because the electroactive material is not uniformly distributed in the membrane. On the other hand, the impregnation process in liquid membrane construction causes the electroactive material to be uniformly distributed on the graphite rod surface because of the uniformity of distribution of the pores in the graphite.

The second criterion for ion-selective electrodes is the simplicity of the membrane construction. For a solid membrane, small quantities of electroactive materials, plasticizers, and solvent (usually tetrahydrofuran) are used in a large quantity of PVC. Homogenization between them is not easy to achieve. However, after homogenization, it is necessary to satisfy many strict conditions to obtain a solid membrane with a given uniform thickness. This is very hard to achieve, and accordingly it affects the electrode response. The impregnation of the graphite rod with the ion pair complex solution is done by immersion of the graphite rod in the solution, and thus the thickness of the impregnation zone is constant for the same ion pair complex every time.

The third criterion for ISME construction is the rapidity of the construction process.

For solid membrane construction one must wait more than 24 h to obtain the membrane and another 24 h or more of immersion in the inner solution in order for it to be suitable for characterization, whereas ISEs with liquid membranes can be obtained and characterized in the same day, because 1 h is required for the extraction of the ion pair complex solution and 1–2 h for the impregnation of the graphite rod with the solution.

Taking into account the above-mentioned criteria, one must recognize that ISEs based on liquid membranes are more suitable for validation, because their construction is reliable, simple, and rapid.

The connection between the reliability of developed ISMEs and the reliability of obtained analytical information

The good relationship between the reliability of developed ion-selective membrane electrodes and the reliability of obtained analytical information assures the validation of the ISMEs for analysis.

To obtain accurate results it is necessary to use reference materials [7] and adequate procedures for the calibration of an ISME [8]. The response of the ISME is connected with the stability of the ion pair complex, because the stability of the complex is affected by the solvent used for ion pair complex extraction. In PVC matrix membranes, the PVC and plasticizer mixture plays the same role as the solvent for the ion pair complex, because it affects the stability of this complex. From this point of view, the PVC matrix membrane assures a response value of more than 55 mV/decade. The response value of the liquid membrane electrode is generally less than the Nernstian one. Usually it shows responses in the 50–55 mV/decade range. But a response having the 50 mV/decade value assures good accuracy and also the reliability of the analytical information.

The limit of detection is also related to the response of the ISME:

pX_L = E°/S

pX_L = −log[X_L]

where [X_L] is the limit of detection, E° is the standard potential, and S is the slope. It follows that a good limit of detection, of 10^-6 to 10^-10 mol/L, can be assured by a high E° value and a low S value. If we consider the minimum value for S, 50 mV/decade, the minimum E° value for a slope of 50 mV/decade must be 300 mV to obtain 10^-6 mol/L for the limit of detection. The E° value is given by the free energy of the ion pair complex formation, and it also depends on the stability constant of the ion pair complex.
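A one-line numerical check of this relation (a sketch, not from the paper):

    # pX_L = E°/S: with the minimum slope of 50 mV/decade, a standard
    # potential of 300 mV gives a detection limit of 1e-6 mol/L.
    E0, S = 300.0, 50.0        # standard potential (mV) and slope (mV/decade)
    print(10 ** -(E0 / S))     # -> 1e-06 mol/L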

The best limit of detection is assured by the ion-selective electrodes with a liquid membrane. The lower limits of detection are necessary for pharmaceutical analysis, especially for in vitro and in vivo dissolution tests, when low quantities of ions must be detected. The working range is also a very important parameter for ISMEs. It must be not less than two concentration decades.




Selectivity is one of the essential characteristics of the membrane electrodes. Accordingly, it is important that by-products, degradation products, metabolites, and compression components (excipients) do not interfere, and thus the ISMEs can be successfully used for raw material assays as well as for raw material determinations in pharmaceutical formulations. To improve the selectivity, it is necessary to choose another counter ion for membrane construction, because the selectivity of membrane electrodes can be explained through the stability constants of the ion pair complexes between the counter ion and the ions from the solution. To be selective over one ion from the solution it is necessary for the stability constant of its ion pair complex to be less than the stability constant of the ion pair complex obtained between the main ion and the counter ion.

The response time is one of the main factors for ISME validation, because these electrodes must be used to perform dissolution tests of pharmaceuticals and potentiometric titrations. Recovery studies are a very important means of documenting the reliability of a method. Recovery studies give information about the accuracy and repeatability of the analytical method. For accurate analytical information, both the repeatability and the reliability are given by relative standard deviation (RSD) values. When ISMEs are used for pharmaceutical analysis, the reliability must have the maximum value and RSD ≤ 1.5%.

Validation criteria of ISME

An ISME can be validated for pharmaceuticals analysis only if:
1. the best counter ion (giving the ion pair complex with the best stability) is chosen,
2. the best matrix (PVC, liquid-solvent) for the electroactive material is used,
3. the biocompatibility of materials is correlated with the response characteristics of the ISME,
4. the counter ion assures a slope of 50 mV/decade (minimum value), a limit of detection of 10^-6 mol/L (minimum value), a large working range, a response time less than 1 min, and the maximum selectivity in preference to the by-products, degradation products, compression compounds, and metabolites.

Conclusions

The validation of ion-selective membrane electrodes is based on the reproducibility of their development, the simplicity and rapidity of their construction, and their response characteristics. Typically, these response characteristics include: minimum 50 mV/decade for the slope, 10^-6 mol/L for the limit of detection, a large working range, and a low response time, which can be assured only by the best counter ion and matrix (PVC and plasticizer for solid membrane electrodes and solvent for liquid membrane electrodes).

An RSD of more than 1.5% for a recovery test cannot be permitted. RSD values assure both the accuracy and the reliability of the analytical information.

References

1. US Pharmacopoeia, USP XXIII (1994) <1225> Validation of compendial methods. United States Pharmacopeial Convention, Inc., Rockville, MD, USA, p 1982
2. Pan XR (1996) Accred Qual Assur 1:181–185
3. Thomas JDR (1994) Analyst 119:203–208
4. Ghauri MS, Thomas JDR (1994) Anal Proc Including Anal Comm 31:181–183
5. Craggs A, Moody GJ, Thomas JDR (1974) J Chem Educ 51:541–544
6. Cosofret VV (1978) Rev Roum Chim 23:1489–1493
7. Aboul-Enein HY (1996) Accred Qual Assur 1:176
8. Buck RP, Cosofret VV (1993) Pure Appl Chem 65:1849–1858


Accred Qual Assur (1998) 3:412–415 © Springer-Verlag 1998

Reports and notes on experiences with quality assurance, validation and accreditation

Stephan Küppers

Is the estimation of measurement uncertainty a viable alternative to validation?

Received: 30 April 1998
Accepted: 3 July 1998

Presented at: Analytica Conference 98, Symposium 2: "Uncertainty budgets in chemical measurements", Munich, 21–24 April 1998

S. Küppers (✉)
Schering AG, In-Process-Control, Müllerstrasse 170–178, D-13342 Berlin, Germany
Tel.: +49-30-468-1-7819
Fax: +49-30-468-9-7819
e-mail: stephan.kueppers@schering.de

Abstract Two examples of the use of measurement uncertainty in a development environment are presented and compared to the use of validation. It is concluded that measurement uncertainty is a good alternative to validation for chemical processes in the development stage. Some advantages of measurement uncertainty are described. The major advantages are that the estimations of measurement uncertainty are very efficient, and can be performed before analysis of the samples. The results of measurement uncertainty influence the type of analysis employed in the development process, and the measurement design can be adjusted to the need of the process.

Key words Chemical analysis · Development process · Measurement uncertainty · Validation · Practical examples

Introduction

The "know how" in quality assessment has grown over the last few years. Quality management systems are improving and validation procedures are being optimized. But the problem that validation procedures are time consuming still remains. Often validations are performed too late, especially in a development environment where the advance from one development step to the next is based on a decision resulting from the previous step. Therefore, valid analytical data is needed directly after each method development.

The estimation of measurement uncertainty is an alternative to validation if a quantitative result is used for the assessment of the development process. In this case only one parameter of the validation is needed. The precision of the analysis is sometimes taken as a quality figure instead of a validation. But precision cannot replace validation because in this case the environment, i.e. the sample preparation, is neglected and precision is only representative of one contribution to the uncertainty of measurement: in high performance liquid chromatography (HPLC) usually the uncertainty contribution of the sampler.

To illustrate the point, two examples taken from a process research environment in a pharmaceutical company are presented. It is demonstrated that measurement uncertainty has practical advantages compared to validation.

Examples of process development

The first example presents a typical process from our research environment, where a chemical synthesis is transferred from development to production. In this situation validation of the chemical process is performed throughout, including variations of process parameters for intermediates that are not isolated. For these intermediates a pre-selected analytical method is normally available, usually a method that has been employed for some time. Because no validation report is requested by regulatory authorities, no formal validation is performed.



Fig. 1 The synthesis and variations performed in a process research validation, shown schematically. HV stands for 'performed according to the manufacturing formula'. The example shown illustrates the first step of the chemical synthesis. Circles show the variations of the two parameters tested in the validation (LiCl and TOBO are internal names for chemicals used in the synthesis, eq stands for equivalent)

Table 1 First step in the estimation of the measurement uncertainty of high performance liquid chromatography (HPLC) analysis as a method used for the assessment of a chemical process

Uncertainty component                                          Min      Max

Inhomogeneity of the sample and
inhomogeneity of sampling                                      <0.2%    0.5%
Weighing of the reference materials or
the sample (about 50 mg)                                       <0.1%    0.1%
Uncertainty of the instrumentation (sampler
uncertainty) and reference material(s),
depending on the number of injections                          0.5%     1.5%
Evaluation uncertainty (uncertainty
caused by integration)                                         <0.1%    0.1%
Uncertainty of the reference material(s)                       –        –


An example of a typical validation scheme for a chemical synthesis in process research is given in Fig. 1. In this example the process research chemist sets up a number of experiments. The most important question presented by the chemist to the analytical department is: Is the variability of the analytical process small compared to the variation performed in the process validation [1, 2]?

The original manufacturing formula (HV) and five variations are performed in the first step of the synthesis. Six samples are analysed. The results of these six analyses are used to assess the validation of this process step. In this case validation of the analytical method is a prerequisite for any decision that is made about the validity of the process. This information is needed before the process research chemist can start variations of the process; otherwise it is possible that the data received cannot be assessed. The difficulty of assessing the data of the process validation results from the fact that the data is influenced by both the analytical method and the uncertainty of the chemical process. If the uncertainty of the analytical method is larger than or in the same range as the variations of the chemical process, assessment of the data is not possible.

To assess uncertainty, a type B estimation of uncertainty is performed. After analysis of the samples the type B estimation can be verified by a type A estimation analogous to the concept presented by Henrion et al. [3]. In our laboratory, a type B estimation can be performed with confidence because we have performed this type of analysis (an HPLC method with external standard calibration) about 50 times before. A control sample is included in every analysis and the results are plotted on a control chart. The control chart, representing the total measurement uncertainty for the analytical method, is then reviewed for the estimation of uncertainty. The standard deviation for the assay of a control sample for 50 analyses was 1.51%. An example of the way in which a control sample can be used for measurement uncertainty is presented in detail in [1]. The typical uncertainty components for this type of analysis are (Table 1):
— inhomogeneity of the sample and sampling
— weighing of reference materials (about 50 mg)
— weighing of samples (about 50 mg)
— uncertainty of the instrumentation (sampler uncertainty)
— evaluation uncertainty (uncertainty caused by integration)
— uncertainty of reference materials.

The case presented here is simple because all six samples can be analysed as one set of samples in one analytical run. For a single run the estimation of uncertainty can be reduced to the calibration of the method with an in-house reference material and the uncertainty of the analysis itself. In this example the inhomogeneity of the sample, the evaluation uncertainty and the uncertainty of the reference material (i.e. inhomogeneity) as given in Table 1 can be neglected because the results of only one analysis are compared. The weighing uncertainties can be neglected because they are small compared to the sampler uncertainty.
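The neglect of the small components is easily justified numerically: combined in quadrature, a few 0.1% terms barely move a total dominated by the 1.5% sampler term. A minimal sketch (the grouping of terms is ours, for illustration only):

    # Root-sum-of-squares combination: sub-0.1% terms are negligible
    # next to the dominant sampler uncertainty.
    import math

    dominant = 1.5                 # sampler uncertainty, %
    small = [0.1, 0.1, 0.1]        # weighing and integration terms, %
    total = math.sqrt(dominant**2 + sum(u**2 for u in small))
    print(round(total, 3))         # 1.51 -> practically indistinguishable from 1.5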

If the measurement uncertainty is estimated using the parameters in Table 1, but for a longer time period as covered, for example, in the control chart, the inhomogeneity of the sample has to be included in the estimation. Using the mean values between the min and max columns in Table 1, the uncertainty estimated by uncertainty propagation is 1.46%, which is close to the value found in the control chart. In the case presented here the estimation of measurement uncertainty can be performed in only two steps, as shown in Table 2. Calibration and analysis of samples represent the major uncertainties and combined they provide the complete uncertainty of our experiment; in this case the uncertainty of the HPLC sampler. The influence of the HPLC sampler is known; therefore, there are two options to perform the analysis.



Table 2 Results of the estimation of the measurement uncertainty for HPLC analysis

                                        Option 1a   Option 2b
Calibration                             1%          1%
Analysis of samples                     1.5%        1%
Total uncertainty (received from
uncertainty propagation)                1.8%        1.4%

a Option 1: two weights of the calibration standard with six injections for each of them, two weights of the sample with two injections for each weight
b Option 2: two weights of the sample with four injections for each weight

Table 3 Comparison of the results of the estimation and the analysis (type B estimation compared to type A estimation)

                                        Estimation   Found (one example)
Calibration                             1%           0.8%
Analysis of samples                     1%           0.8%
Total uncertainty (by uncertainty
propagation)                            1.4%         1.14%


Together with the customer it was decided to perform the analysis according to option 2. The samples were analysed in one analytical run. The result is shown in Table 3. The result of the estimation compares well with the "found" results. The found uncertainty is smaller than the estimation. Assessment of the results of the validation of the manufacturing formula becomes easier from the customer's point of view because the customer is able to decide if a variation of his result is related to his process or to the uncertainty of the analytical method. Additionally, the influence of the individual contributions to uncertainty becomes smaller because of the uncertainty propagation. Therefore, the difference between the estimated and found uncertainty becomes smaller with an increasing number of parameters that influence uncertainty.
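The two-step propagation behind Tables 2 and 3 reduces to a root-sum-of-squares; a sketch reproducing the quoted totals:

    # Combine calibration and sample-analysis uncertainties in quadrature.
    import math

    def combine(*u):
        return math.sqrt(sum(c**2 for c in u))

    print(round(combine(1.0, 1.5), 1))   # option 1 -> 1.8%
    print(round(combine(1.0, 1.0), 1))   # option 2 -> 1.4%
    print(round(combine(0.8, 0.8), 2))   # found    -> 1.13% (1.14% in Table 3)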

The estimation of uncertainty replaces a full validation of the analytical method. It generates the necessary information at the right time. The statistical information received from the analysis can be used for the interpretation of the data and finally the analysis is designed to the customer's needs. In this case measurement uncertainty is a good alternative to validation.

The second example illustrates the determination of water content, which is an important characteristic of chemical substances and is needed in many chemical reactions. It is usually determined by Karl Fischer (KF) titration. The water content determined in our laboratory ranges from <0.1% to about 30%. It is widely known that KF water titrations may be influenced by the sample and, depending on the range, some other parameters may significantly affect the uncertainty.

Because the concentration of a reagent is often determined on the basis of the water content of the reaction mixture, uncertainty information for the water determination is needed. The problem in a development environment is that various synthesis routes are tested. If salts are used in a chemical reaction it is usual that chemists test different counter ions for the optimization of the synthesis. However, different salts are rarely tested in the analytical department. One of the problems of the variation of counter ions is that the hygroscopicity of the salts is often different.

Two independent steps have to be followed:
1. Substance-dependent influences have to be observed. In most cases the chemical structure of the component is known and therefore serious mistakes can be avoided. Titration software and various KF reagents have to be available and standard operation procedures have to be established.
2. The individual uncertainty has to be considered. Therefore a preliminary specification has to be set and a type B estimation of uncertainty can be used to show if there is any problem arising from the data and the specification limit [4].

The first step is a general task using the appropriate equipment and has been established in our laboratory. However, the second part needs to be discussed in detail.

Suppose there is a chemical reaction with reagent A where:

A + B → C. (1)

The alternative reaction for A may also be with water, to D:

A + H2O → D. (2)

The reaction with water is often faster than with B. Because of the low molecular weight of water, 0.5% (w/w) of water in B may be 10 mol%. Therefore an excess of at least 10 mol% of A might be needed to complete the reaction. The water content is determined in the analytical department. For example, the value determined by KF titration is 0.22% (w/w) from two measurements with 0.20% (w/w) and 0.23% (w/w) as the individual values. The question that has to be asked is: Is 0.22% of the weight always smaller than 0.25% (w/w)? What we need is the measurement uncertainty added to the "found" value. If this value is smaller than the limit (0.25%) the pre-calculated amount of the reagents can be used.
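The 0.5% (w/w) ≈ 10 mol% figure is easy to verify for a reagent of moderate molar mass; the sketch below assumes a hypothetical molar mass of 360 g/mol for B, which is not given in the paper:

    # Back-of-envelope check of the 0.5% (w/w) ~ 10 mol% claim.
    M_H2O, M_B = 18.02, 360.0            # g/mol; M_B is an assumed value
    m_H2O, m_B = 0.5, 99.5               # g per 100 g of wet substance B
    print(round(100 * (m_H2O / M_H2O) / (m_B / M_B), 1))   # -> 10.0 mol%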



Table 4 Estimation of the measurement uncertainty for the titration of water (example performed manually)

Results from the titrations
(% (w/w) from the weight of the sample)     0.20%; 0.23%    0.215% + 0.0150%
Minimum uncertainty of 1.1%                 1.1%            +0.00236%
Hygroscopicity (estimated on the basis
of experience with amine compounds)         5%              +0.00621%
Uncertainty of the titer                    1%              +0.00124%
Reaction of the sample with the solvent     5%              +0.00621%
Result reported including uncertainty                       0.25%


The model set up consists of two terms:

Y = X · (1 + U)

where X is the mean value of the measurements plus the standard uncertainty, and U is the sum of the various influence parameters on measurement uncertainty:
— hygroscopicity
— uncertainty of the titer
— reaction of the KF solvent with the sample
— a minimum uncertainty constant of 1.1% (taken from the control chart of the standard reference material) covering balance uncertainties, the influence of water in the atmosphere and the instrument uncertainty with the detection of the end of titration.

The factors for the three contributions mentioned above are estimated on the basis of experience. The calculation is performed using a computer program [5]. This makes the decision easy and fast. In this case the type A estimation on the basis of the results and the type B estimation of the influence factors are combined. An example is given in Table 4 in compressed form.
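One reading of the numbers in Table 4 is that the 5% and 1% expert estimates are treated as rectangular distributions and divided by √3, while the 1.1% control-chart minimum and the standard error of the two titrations enter directly, all contributions then being summed according to the model above. The sketch below reproduces the reported 0.25% under that assumption:

    # Sketch of the decision calculation Y = X(1 + U) behind Table 4.
    # The sqrt(3) treatment of the 5% and 1% estimates is our reading
    # of the tabulated contributions, not stated in the paper.
    import math

    titrations = [0.20, 0.23]                  # % (w/w)
    mean = sum(titrations) / len(titrations)   # 0.215
    s = math.sqrt(sum((v - mean)**2 for v in titrations) / (len(titrations) - 1))
    se = s / math.sqrt(len(titrations))        # 0.0150

    rel = [0.011,                              # control-chart minimum
           0.05 / math.sqrt(3),                # hygroscopicity
           0.01 / math.sqrt(3),                # titer
           0.05 / math.sqrt(3)]                # reaction with the KF solvent
    y = mean + se + sum(r * mean for r in rel)
    print(round(y, 3), y <= 0.25)              # 0.246 True -> reagent amount OK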

The alternative would be an experimental validation. In this case the uncertainty estimation has proven to be a very useful alternative to validation, although, on the basis of experience, the estimate of hygroscopicity is difficult and may lead to incorrect values.

Conclusions

Method validation is a process used to confirm that an analytical procedure employed for a specific test is suitable for the intended use. The examples above show that the estimation of measurement uncertainty is a viable alternative to validation. The estimation of measurement uncertainty can be used to confirm that an analytical procedure is suitable for the intended use. If the estimation of measurement uncertainty is used together with validations, both the uncertainty estimation and the validation have their own place in a developmental environment. The major advantages of measurement uncertainty are that it is fast and efficient. Normally, if the analytical method is understood by the laboratory, very similar results are found for the estimation of uncertainty and for the classical variation of critical parameters, namely, validation. The decision on how to perform a validation should be made on a case to case basis depending on experience.

Acknowledgements Fruitful scientific discussions with Dr. P. Blaszkiewicz, Schering AG, Berlin and Dr. W. Hässelbarth, BAM, Berlin are gratefully acknowledged.

References

1. Küppers S (1997) Accred Qual Assur 2:30–35
2. Küppers S (1997) Accred Qual Assur 2:338–341
3. Henrion A, Dube G, Richter W (1997) Fresenius J Anal Chem 358:506–508
4. Renger B (1997) Pharm Tech Europe 9:36–44
5. Evaluation of uncertainty (1997) R. Metrodata GmbH, Grenzach-Wyhlen, Germany


Accred Qual Assur (1999) 4:473–476 © Springer-Verlag 1999

Reports and notes on experiences with quality assurance, validation and accreditation

R.C. Díaz · M.A. Sarasa · C. Ríos · J.J. Cuello

Validation of an HPLC method for the determination of p-dichlorobenzene and naphthalene in mothrepellents

Received: 21 December 1998
Accepted: 4 May 1999

R.C. Díaz (✉) · M.A. Sarasa · C. Ríos · J.J. Cuello
University of Zaragoza, Analytical Chemistry Department, Crtra. Zaragoza s/n, 22071 Huesca, Spain
e-mail: cdiaz@posta.unizar.es
Tel.: +34-9-74 239320
Fax: +34-9-74 239302

Abstract The determination of p-dichlorobenzene and naphthalene in commercial repellents used in Spain has been validated. This was done using an isocratic regime, to test the reverse-phase HPLC system with acetonitrile:water 65:35 (v:v) as the mobile phase, at 20 °C. This technique is proposed for the modular validation of the HPLC system. The results obtained with this method show good agreement with the results provided by the manufacturers of the mothrepellents.

Key words Validation · HPLC · p-Dichlorobenzene · Naphthalene · Mothrepellents

Introduction

The most frequently used commercial mothrepellents in Spain contain p-dichlorobenzene (p-DB) or naphthalene (N) as the active component (98–99% w/w), the rest being perfume, approximately 1–2%. These values are generally determined by gas chromatography [1–5]. We propose an alternative analytical method, based on determination of the active components by reverse-phase HPLC, using an isocratic regime, with diphenyl (DP) as an internal standard. The results obtained show good conformity with the certified values of the active components supplied by the manufacturers of the mothrepellents. The modular approximation [6] is proposed for the validation of the method, which includes the validation of each part of the system (pump, injector, etc.).

HPLC validation systems

The stages involved in the validation of the HPLC system, in chronological order, are as follows [7–10]:

Column

The column used for the validation was a reverse-phase Nucleosil 120 C-18. Methanol was used as the mobile phase, to achieve a resolution of Rs > 1.5. The mobile phase flow rate was 1 ml/min, the injection volume was 10 µl, with detection at 250 nm. The determination was accomplished at ambient laboratory temperature (20 °C). The trial was repeated 10 times. The mixture to determine the real flow consisted of nitromethane, anthracene, pyrene and perylene, dissolved in methanol. We recommend that the proportions are adjusted so that each peak corresponds to approximately 25% of the area with respect to normalized areas. As an illustration, for 0.5 units of absorbance (AU), the cited solution will contain: nitromethane (1000 µg ml–1), anthracene (2 µg ml–1), pyrene (33 µg ml–1) and perylene (12 µg ml–1). Uracil (5 µg ml–1) was also added as a non-retained reference component.



Table 1 Variation coefficients (in %) obtained for the determination of flow rate and reproducibility of injections

Parameter to determine          Equation                                               Tolerance

Short-term flow                 Global CV = Σ(CVi)norm. area / 4                       <1.5
Flow during a long run          Global CV = Σ(CVi)t′R / 4                              <1
Reproducibility of injection    Global CV = Σ[(CVabs. area)² − (CVnorm. area)²] / 4    <1.5


Pump

Flow accuracy

The flow volume exiting the detector was measured using water of chromatographic quality. The eluent volume was measured over a 10 min period in a graduated cylinder, and for different flow rates (0.5, 1.0 and 2.0 ml min–1) the time elapsed was measured with a precision chronometer. The flow rate was then calculated. This test was repeated every 10 days to evaluate the reproducibility of the flow. An accuracy of up to 3% of the theoretical value was recorded. The observed variations should be due to the critical parts of the pump (fundamentally the valves).

Calculation of the short-term flow

From the chromatogram obtained from the sample test, the variation coefficient of the normalized areas is calculated for each one of its components, over ten injections. The expression for the global variation coefficient of the mixture is shown in Table 1. The global variation coefficient turned out to be below 1.5%, which is the tolerance recommended in the literature.

Measurement of the flow during a long run

The variation coefficient of the corrected retention times (t′R = tR − turacil) for ten injections was calculated from the chromatogram obtained for the sample test. The global variation coefficient turned out to be below 1%. The equation used for the calculation appears in Table 1.

Injector

All determinations were carried out by manual injection.

Reproducibility

The reproducibility was calculated for each component of the test mixture by sextuplicate injections of 20 µl. The global variation coefficient was calculated using the equation shown in Table 1.

Linearity

A calibration curve was constructed using a solution of anthracene in methanol (2 µg ml–1), by quintuplicate injections of 10, 20 and 30 µl. The correlation coefficient was 0.9890 and the precision for each group of injections was 2.0%.

UV Detector

Accuracy of the wavelength

The cell of the detector was filled with a chromophore solution of uracil in methanol (15 µg ml–1). A sweep was performed between 250 and 265 nm. The wavelength corresponding to the maximum absorbance must be 257 ± 2 nm.

Linearity

In order to test the linear response range of the detector, a standard solution of propylparaben in methanol (25 µg ml–1) was prepared, which represents 2 AU. A calibration curve was constructed using 5, 10, 15, 20 and 25 µg ml–1, beginning with the most dilute solution.

Quantification limit

A standard solution of anthracene in methanol (2 µg ml−1) was used. The same methanol used for the mobile phase and as solvent of the test solution was used as a blank control. With the integrator set at minimum attenuation (maximum sensitivity), the noise was determined for six injections of methanol (10 µl), calculating s_b. The value 10 s_b gave a signal level related to the quantification limit, obtained from the standard solution of anthracene, with the necessary dilution to obtain a final concentration that satisfies the requirement 10 s_b.

Page 91: Validation in Chemical Measurement


Table 2 Expressions of the various contributions to total uncertainty

Contribution to uncertainty | Equation for uncertainty
Repetitions of the certified standard | $s_a^2 = s^2 (1/N + 1/n)$
Calibration curve | $s_c^2 = s_r^2\,[1/N - (\Sigma x_i)^2/(\Sigma x_i^2)]$
Precision of the HPLC equipment | $u_e^2 = s_e^2/3$
Precision of the CRMs | $u_{sc}^2 = (I_o/k)^2$
Dilution of the reference standard | $u_d^2 = s_d^2/3$

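A minimal sketch of the 10·s_b criterion, assuming the noise values from the six blank injections are available as plain numbers (the readings below are invented):

    import numpy as np

    # Hypothetical integrator noise readings for six blank (methanol) injections
    blank_noise = np.array([0.81, 1.12, 0.95, 1.04, 0.73, 1.18])  # arbitrary units

    s_b = blank_noise.std(ddof=1)   # standard deviation of the blank signal
    signal_lq = 10 * s_b            # signal level associated with the quantification limit
    # The anthracene standard is then diluted until its signal matches signal_lq;
    # that concentration is reported as the quantification limit.
    print(s_b, signal_lq)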

Data processing

Calibration of the software

The set of chromatograms corresponding to the four cited standards was processed, and the results obtained were compared and weighted against the previous calculations. The differences between the two calibration curves were determined [slope, intercept at the origin, R (correlation coefficient), R² and standard deviation].

Linearity

A correlation exists between the response of the converter of the analogue signal and the output of the detector. The response registered by the data acquisition system for each of the readings of the absorbance standards was analysed by comparing the values obtained from the data processing system with the readings taken directly from the detector.

Uncertainty calculation

Components of type A

They are associated with the calibration curve, s_c. The equation of the straight line obtained is y = b + mx, where y is the area of the peak of each standard in the chromatogram (expressed in area units), b is the intercept at the origin and x the concentration of the standard (expressed in µg ml−1). The corresponding standard deviation of the slope is $s_m = s_r/S^{1/2}$, with s_r the standard deviation of the differences between the obtained value and the real value, and S the corresponding sum for the concentrations.
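The type A evaluation can be reproduced with an ordinary least-squares fit. A minimal sketch, under the assumption that S denotes the sum of squared concentration deviations, with invented calibration data:

    import numpy as np

    x = np.array([10.0, 20.0, 30.0])       # standard concentrations (hypothetical)
    y = np.array([152.0, 301.0, 455.0])    # peak areas (hypothetical), y = b + m*x

    m, b = np.polyfit(x, y, 1)             # slope m and intercept b
    resid = y - (b + m * x)                # differences between obtained and fitted values
    s_r = np.sqrt((resid**2).sum() / (len(x) - 2))   # residual standard deviation
    S = ((x - x.mean())**2).sum()          # sum of squares for the concentrations
    s_m = s_r / np.sqrt(S)                 # standard deviation of the slope, s_m = s_r/S**0.5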

Components of type B

The contribution to the total uncertainty associated with the repetitions of the certified standard was determined by s_a.

Table 3 Chromatography and integrator conditions

Column: Nucleosil 120 C-18, 25 cm × 0.46 cm, 5 µm particle size
Mobile phase: acetonitrile/water, 65:35 (v/v)
Mobile phase flow: 1.2 ml min−1
Temperature: 20 °C
Injection: 10 µl
Attenuation: 7
Threshold: 5

Here N is the number of samples of certified reference materials (CRMs) and n the number of repetitions for each sample; both contribute through the term s_a. There is also an imprecision intrinsic to the HPLC equipment used, whose contribution to the total uncertainty is u_e. A further contribution is associated with the certified standards, according to the calibration certificate; it is represented by u_sc (I_o is the uncertainty interval and k the coverage factor, in this case 2). Finally, u_d is related to the successive dilutions carried out to prepare the standards for construction of the calibration curve, with s_d the standard deviation associated with the most dilute standard, since it is the one that contributes most to the uncertainty. The expressions for the corresponding uncertainties appear in Table 2.

The global variance is calculated from:

$s_y^2 = \frac{s_a^2 + u_e^2 + u_d^2 + u_{sc}^2}{m^2} + \frac{s_c^2}{m^2} + \frac{s_m^2 (y - b)^2}{m^4}$ .

The total uncertainty is calculated by the equation $I = k \cdot s_y$, where k = 2 (for a confidence level of 95%).
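A minimal sketch of this combination step, with the symbols of Table 2 passed in as plain numbers (the example values are invented, for illustration only):

    from math import sqrt

    def expanded_uncertainty(s_a2, u_e2, u_d2, u_sc2, s_c2, s_m2, y, b, m, k=2.0):
        """Global variance s_y^2 and total uncertainty I = k*s_y (k = 2, ~95%)."""
        s_y2 = (s_a2 + u_e2 + u_d2 + u_sc2) / m**2 + s_c2 / m**2 \
               + s_m2 * (y - b)**2 / m**4
        return k * sqrt(s_y2)

    # Hypothetical values:
    I = expanded_uncertainty(s_a2=0.02, u_e2=0.01, u_d2=0.005, u_sc2=0.004,
                             s_c2=0.03, s_m2=1e-4, y=150.0, b=2.0, m=1.5)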

Experimental procedure

Standard solutions of p-DB or N (3200, 5300, 7400 and 9300 µg ml−1) were prepared in 25-ml flasks. DP (2 ml) was added to each flask as an internal standard, and the final volume was made up to 25 ml with acetonitrile. The test samples were prepared with 3500 µg ml−1 of the corresponding mothrepellent, with 2 ml of DP, and made up to 25 ml with acetonitrile.

The optimized chromatographic and integrator conditions appear in Table 3.

Page 92: Validation in Chemical Measurement


Table 4 Results obtained (expressed in % w/w) in the determination of p-dichlorobenzene (p-DB) or naphthalene (N) in test and commercial samples of mothrepellents

Commercial name | Certified (p-DB) | Certified (N) | Result
Polil (a) | 98–99 | – | 98.35 ± 0.11
p-DB (b) | > 99 | – | 99.90 ± 0.06
Naftalina (c) | – | > 99 | 99.86 ± 0.04
Camfolina (d) | – | 97–98 | 97.36 ± 0.14

(a) Cruz Verde – Legrain; (b) Bayer AG; (c) Industrial Química del Nalón; (d) Vanossa

Results and discussion

The standard components were identified by their retention times (8.09 min for N, 9.63 min for p-DB and 11.28 min for DP) and quantified from their corresponding calibration graphs. The equation used for the determination of p-DB was: Area of the peak p-DB/Area of the peak DP = b + m[p-DB]/[DP], and Area of the peak N/Area of the peak DP = b + m[N]/[DP] was used for the determination of N. By interpolating the values obtained for the mothrepellent samples, the content of p-DB or N in the commercial products was determined (Table 4). The values

obtained can be compared with the values supplied by the manufacturers (for five repetitions of each sample) and are shown in Table 4.

Conclusions

A simple and fast HPLC method for the determination of p-DB and N in mothrepellents manufactured in Spain has been validated. This was achieved by modular validation of the HPLC system. The results obtained show good agreement with the values provided by the manufacturers.


Page 93: Validation in Chemical Measurement

Accred Qual Assur (2000) 5:47–53 © Springer-Verlag 2000 PRACTITIONER'S REPORT

Vicki J. Barwick · Stephen L.R. Ellison

The evaluation of measurement uncertainty from method validation studies. Part 1: Description of a laboratory protocol

Received: 10 April 1999 / Accepted: 24 September 1999

V.J. Barwick (✉) · S.L.R. Ellison
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk
Tel.: +44-20-8943 7421
Fax: +44-20-8943 2767

Abstract A protocol has been developed illustrating the link between validation experiments, such as precision, trueness and ruggedness testing, and measurement uncertainty evaluation. By planning validation experiments with uncertainty estimation in mind, uncertainty budgets can be obtained from validation data with little additional effort. The main stages in the uncertainty estimation process are described, and the use of trueness and ruggedness studies is discussed in detail. The practical application of the protocol will be illustrated in Part 2, with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness testing

Introduction

In recent years, the subject of the evaluation of measurement uncertainty in analytical chemistry has generated a significant level of interest and discussion. It is generally acknowledged that the fitness for purpose of an analytical result cannot be assessed without some estimate of the measurement uncertainty to compare with the confidence required. The Guide to the Expression of Uncertainty in Measurement (GUM) published by ISO [1] establishes general rules for evaluating and expressing uncertainty for a wide range of measurements. The guide was interpreted for analytical chemistry by EURACHEM in 1995 [2]. The approach described in the GUM requires the identification of all possible sources of uncertainty associated with the procedure; the estimation of their magnitude from either experimental or published data; and the combination of these individual uncertainties to give standard and expanded uncertainties for the procedure as a whole. Some applications of this approach to analytical chemistry have been published [3, 4]. However, the GUM principles are significantly different from the methods currently used in analytical chemistry for estimating uncertainty [5–8], which generally make use of "whole method" performance parameters, such as precision and recovery, obtained during in-house method validation studies or during method development and collaborative study [9–11]. We have previously described a strategy for reconciling the information requirements of formal (i.e. GUM) measurement uncertainty principles with the data generated from method validation studies [12–14]. The approach involves a detailed analysis of the factors influencing the result using cause and effect analysis [15]. This results in a structured list of the possible sources of uncertainty associated with the method. The list is then simplified and reconciled with existing experimental and other data. We now report the application of this approach in the form of a protocol for the estimation of measurement uncertainty from validation studies [16]. This paper outlines the key stages in the

Page 94: Validation in Chemical Measurement


Fig. 1 Flow chart summarising the uncertainty estimation process

protocol and discusses the use of data from trueness and ruggedness studies in detail. The practical application of the protocol will be described in Part 2 with reference to a high performance liquid chromatography (HPLC) procedure for the determination of markers in road fuel [17].

Principles of approach

The stages in the uncertainty estimation process are illustrated in Fig. 1. An outline of the procedure discussed in the protocol is presented in Fig. 2. The first stage of the procedure is the identification of sources of uncertainty for the method. Once the sources of uncertainty have been identified they require evaluation. The main tools for doing this are precision, trueness (or bias) and ruggedness studies. The aim is to account for as many sources of uncertainty as possible during the precision and trueness studies. Any remaining sources of uncertainty are then evaluated either from existing data (e.g. calibration certificates, published data, previous studies, etc.) or via ruggedness studies. Note that it may not be necessary to evaluate every source of uncertainty in detail, if the analyst has evidence to suggest that some are insignificant. Indeed, the EURACHEM Guide states that unless there are a large number of them, uncertainty components which are less than one-third of the largest need not be evaluated in detail. Finally, the individual uncertainty components for the method are combined to give standard and expanded uncertainties for the method as a whole. The use of data from trueness and ruggedness studies in uncertainty estimation is discussed in more detail below.

Trueness studies

In developing the protocol, the trueness of a method was considered in terms of recovery, i.e. the ratio of the observed value to the expected value. The evaluation of uncertainties associated with recovery is discussed in detail elsewhere [18, 19]. In general, the recovery, R, for a particular sample is considered as comprising three components:
– Rm is an estimate of the mean method recovery obtained from, for example, the analysis of a CRM or a spiked sample. The uncertainty in Rm is composed of the uncertainty in the reference value (e.g. the uncertainty in the certified value of a reference material) and the uncertainty in the observed value (e.g. the standard deviation of the mean of replicate analyses).
– Rs is a correction factor to take account of differences in the recovery for a particular sample compared to the recovery observed for the material used to estimate Rm.
– Rrep is a correction factor to take account of the fact that a spiked sample may behave differently to a real sample with incurred analyte.
These three elements are combined multiplicatively to give an estimate of the recovery for a particular sample, R, and its uncertainty, u(R):

$R = R_m \times R_s \times R_{rep}$ ,  (1)

$u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2}$ .  (2)

Rm and u(Rm) are calculated using Eq. (3) and Eq. (4):

$R_m = \frac{C_{obs}}{C_{CRM}}$ ,  (3)

$u(R_m) = R_m \times \sqrt{\left(\frac{s_{obs}}{C_{obs}}\right)^2 + \left(\frac{u(C_{CRM})}{C_{CRM}}\right)^2}$ ,  (4)

where C_obs is the mean of the replicate analyses of the reference material (e.g. CRM or spiked sample), s_obs is the standard deviation of the mean of the results, C_CRM is the concentration of the reference material and u(C_CRM) is the standard uncertainty in the concentration of the reference material.
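The two-step calculation of Eqs. (1)–(4) translates directly into code. A minimal Python sketch (not part of the protocol itself; function names are mine):

    from math import sqrt

    def mean_recovery(c_obs, s_obs, c_crm, u_crm):
        """Eqs. (3) and (4): Rm and u(Rm) from replicate analyses of a reference."""
        r_m = c_obs / c_crm
        u_rm = r_m * sqrt((s_obs / c_obs)**2 + (u_crm / c_crm)**2)
        return r_m, u_rm

    def recovery(r_m, u_rm, r_s=1.0, u_rs=0.0, r_rep=1.0, u_rrep=0.0):
        """Eqs. (1) and (2): combine Rm, Rs and Rrep multiplicatively."""
        r = r_m * r_s * r_rep
        u_r = r * sqrt((u_rm / r_m)**2 + (u_rs / r_s)**2 + (u_rrep / r_rep)**2)
        return r, u_r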

Page 95: Validation in Chemical Measurement


Fig. 2 Flow chart illustrating the stages in the method validation/measurement uncertainty protocol

To determine the contribution of Rm to the combined uncertainty for the method as a whole, the estimate is compared with 1, using an equation of the form:

$t = \frac{|1 - R_m|}{u(R_m)}$ .  (5)

To determine whether Rm is significantly different from 1, the calculated value of t is compared with the coverage factor, k = 2, which will be used to calculate the expanded uncertainty [19]. A t value greater than 2 suggests that Rm is significantly different from 1. However, if in the normal application of the method no correction is made to take account of the fact that the method recovery is significantly different from 1, the uncertainty associated with Rm must be increased to take account of this uncorrected bias. The relevant equation is:

$u(R_m)' = \sqrt{\left(\frac{1 - R_m}{k}\right)^2 + u(R_m)^2}$ .  (6)
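In code, the test of Eqs. (5) and (6) reduces to a few lines; a minimal sketch assuming k = 2:

    def u_rm_with_bias(r_m, u_rm, k=2.0):
        """Eq. (5): compare |1 - Rm|/u(Rm) with k; if the bias is significant
        but left uncorrected, inflate the uncertainty according to Eq. (6)."""
        t = abs(1.0 - r_m) / u_rm
        if t > k:
            return (((1.0 - r_m) / k)**2 + u_rm**2) ** 0.5
        return u_rm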

A special case arises when an empirical method is being studied. In such cases, the method defines the measurand (e.g. dietary fibre, extractable cadmium from ceramics). The method is considered to define the true value and is, by definition, unbiased. The presumption is that Rm is equal to 1 and that the only uncertainty is that associated with the laboratory's particular application of the method. In some cases, a reference

Page 96: Validation in Chemical Measurement


Fig. 2 Continued

Page 97: Validation in Chemical Measurement


material certified for use with the method may be available. Where this is so, a bias study can be carried out and the results treated as discussed above. If there is no relevant reference material, it is not possible to estimate the uncertainty associated with the laboratory bias. There will still be uncertainties associated with bias, but they will be associated with possible bias in the temperatures, masses, etc. used to define the method. In such cases it will normally be necessary to consider these individually.

Where the method scope covers a range of sample matrices and/or analyte concentrations, an additional uncertainty term Rs is required to take account of differences in the recovery of a particular sample type, compared to the material used to estimate Rm. This can be evaluated by analysing a representative range of spiked samples, covering typical matrices and analyte concentrations, in replicate. The mean recovery for each sample type is calculated. Rs is normally assumed to be equal to 1. However, there will be an uncertainty associated with this assumption, which appears in the spread of mean recoveries observed for the different spiked samples. The uncertainty, u(Rs), is therefore taken as the standard deviation of the mean recoveries for each sample type.

When a spiked sample, rather than a matrix reference material, has been used to estimate Rm it may be necessary to consider Rrep and its uncertainty. In general, Rrep is assumed to equal 1, indicating that the recovery observed for a spiked sample is truly representative of that for the incurred analyte. The uncertainty, u(Rrep), is a measure of the uncertainty associated with that assumption. In some cases it can be argued that a spike is a good representation of a real sample, for example in liquid samples where the analyte is simply dissolved in the matrix; u(Rrep) can therefore be assumed to be small. In other cases there may be reason to believe that a spiked sample is not a perfect model for a test sample and u(Rrep) may be a significant source of uncertainty. The evaluation of u(Rrep) is discussed in more detail elsewhere [18].

Evaluation of other sources of uncertainty

An uncertainty evaluation must consider the full range of variability likely to be encountered during application of the method. This includes parameters relating to the sample (analyte concentration, sample matrix) as well as experimental parameters associated with the method (e.g. temperature, extraction time, equipment settings, etc.). Sources of uncertainty not adequately covered by the precision and trueness studies require separate evaluation. There are three main sources of information: calibration certificates and manufacturers' specifications, data published in the literature, and specially designed experimental studies. One efficient method of experimental study is ruggedness testing, discussed below.

Ruggedness studies

Ruggedness tests are a useful way of investigating simultaneously the effect of several experimental parameters on method performance. The experiments are based on the ruggedness testing procedure described in the Statistical Manual of the AOAC [20]. Such experiments result in an observed difference, Δx_i, for each parameter studied, which represents the change in result due to varying that parameter. The parameters are tested for significance using a Student's t-test of the form [21]:

$t = \frac{\sqrt{n}\,\Delta x_i}{\sqrt{2}\,s}$ ,  (7)

where s is the estimate of the method precision, n is the number of experiments carried out at each level for each parameter (n = 4 for a seven-parameter Plackett-Burman experimental design), and Δx_i is the difference calculated for parameter x_i. The values of t calculated using Eq. (7) are compared with the appropriate critical values of t at 95% confidence. Note that the degrees of freedom for t_crit relate to the degrees of freedom for the precision estimate used in the calculation of t. For parameters identified as having no significant effect on the method performance, the uncertainty in the final result y due to parameter x_i, u(y(x_i)), is calculated using Eq. (8):

$u(y(x_i)) = \frac{\sqrt{2} \times t_{crit} \times s}{\sqrt{n} \times 1.96} \times \frac{d_{real}}{d_{test}}$ ,  (8)

where d_real is the change in the parameter which would be expected when the method is operating under control in routine use and d_test is the change in the parameter that was specified in the ruggedness study. In other words, the uncertainty estimate is based on the 95% confidence interval, converted to a standard deviation by dividing by 1.96 [1, 2]. The d_real/d_test term is required to take account of the fact that the change in a parameter used in the ruggedness test may be greater than that observed during normal operation of the method. For parameters identified as having a significant effect on the method performance, a first estimate of the uncertainty can be calculated as follows:

$u(y(x_i)) = u(x_i) \times c_i$ ,  (9)

$c_i = \frac{\text{observed change in result}}{\text{change in parameter}}$ ,  (10)

where u(x_i) is the uncertainty in the parameter and c_i is the sensitivity coefficient.
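The whole decision path of Eqs. (7)–(10) can be captured in one small helper. A minimal sketch (my own wrapper, not from the protocol); t_crit and, for significant parameters, u(x_i) must be supplied by the analyst:

    from math import sqrt

    def ruggedness_u(dx, s, n, t_crit, d_real, d_test, u_x=None):
        """Eqs. (7)-(10): uncertainty in y due to one ruggedness-test parameter.
        For significant parameters u_x (the parameter uncertainty) is required."""
        t = sqrt(n) * abs(dx) / (sqrt(2) * s)               # Eq. (7)
        if t <= t_crit:                                      # no significant effect
            return sqrt(2) * t_crit * s / (sqrt(n) * 1.96) * (d_real / d_test)  # Eq. (8)
        c_i = abs(dx) / d_test                               # Eq. (10), sensitivity coefficient
        return u_x * c_i                                     # Eq. (9)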

Page 98: Validation in Chemical Measurement


Fig. 3 Contributions to the measurement uncertainty for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil

The estimates obtained by applying Eqs. (8)–(10) are intended to give a first estimate of the measurement uncertainty associated with a particular parameter. If such estimates of the uncertainty are found to be a significant contribution to the overall uncertainty for the method, further study of the effect of the parameters is advised, to establish the true relationship between changes in the parameter and the result of the method. However, if the uncertainties are found to be small compared to other uncertainty components (i.e. the uncertainties associated with precision and trueness) then no further study is required.

Calculation of combined measurement uncertainty for the method

The individual sources of uncertainty, evaluated through the precision, trueness, ruggedness and other studies, are combined to give an estimate of the standard uncertainty for the method as a whole. Uncertainty contributions identified as being proportional to analyte concentration are combined using Eq. (11):

$\frac{u(y)}{y} = \sqrt{\left(\frac{u(p)}{p}\right)^2 + \left(\frac{u(q)}{q}\right)^2 + \left(\frac{u(r)}{r}\right)^2 + \ldots}$ ,  (11)

where the result y is affected by parameters p, q, r, ..., which each have uncertainties u(p), u(q), u(r), ... . Uncertainty contributions identified as being independent of analyte concentration are combined using Eq. (12):

Page 99: Validation in Chemical Measurement


$u(y)' = \sqrt{u(p)^2 + u(q)^2 + u(r)^2 + \ldots}$ .  (12)

The combined uncertainty in the result at a concentration y′ is calculated as follows:

$u(y') = \sqrt{(u(y)')^2 + \left(y' \times \frac{u(y)}{y}\right)^2}$ .  (13)
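As a sketch, the three combination rules translate directly; relative (proportional) and absolute (concentration-independent) contributions are kept in separate lists (function and argument names are mine):

    from math import sqrt

    def combined_u(y_prime, rel_contribs, abs_contribs):
        """Eqs. (11)-(13): standard uncertainty at concentration y'."""
        rel = sqrt(sum(c**2 for c in rel_contribs))   # Eq. (11): u(y)/y
        ab = sqrt(sum(c**2 for c in abs_contribs))    # Eq. (12): u(y)'
        return sqrt(ab**2 + (y_prime * rel)**2)       # Eq. (13)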

Discussion and conclusions

We have developed a protocol which describes how data generated from experimental studies commonly undertaken for method validation purposes can be used in measurement uncertainty evaluation. The main experimental studies required are for the evaluation of precision and trueness. These should be planned so as to cover as many of the possible sources of uncertainty identified for the method as possible. Any remaining sources are considered separately. If there is evidence to suggest that they will be small compared to the uncertainties associated with precision and trueness, then no further study is required. However, for uncertainty components where no prior information is available, further experimental study will be required. One useful approach is ruggedness testing, which allows the evaluation of a number of sources of uncertainty simultaneously. It should be noted that ruggedness testing really only gives a first estimate of uncertainty contributions. Further study is recommended to refine the estimates for any sources of uncertainty which appear to be a significant contribution to the total uncertainty.

The main disadvantage of this approach is that it may not readily reveal the main sources of uncertainty for a particular method. In previous studies we have typically found the uncertainty budget to be dominated by the precision and trueness terms [14]. In such cases, if the combined uncertainty for the method is too large, indicating that the method requires improvement, further study may be required to identify the stages in the method which contribute most to the uncertainty. However, the approach detailed here will allow the analyst to obtain, relatively quickly, a sound estimate of measurement uncertainty, with minimum experimental work beyond that required for method validation.

We have applied this protocol to the evaluation of the measurement uncertainty for a method for the determination of three markers (CI solvent red 24, CI solvent yellow 124 and quinizarin (1,4-dihydroxyanthraquinone)) in road fuel. The method requires the extraction of the markers from the sample matrix by solid phase extraction, followed by quantification by HPLC with diode array detection. The uncertainty evaluation involved four experimental studies which were also required as part of the method validation. The studies were precision, trueness (evaluated via the analysis of spiked samples) and ruggedness tests of the extraction and HPLC stages. The experiments and uncertainty calculations are described in detail in Part 2. A summary of the uncertainty budget for the method is presented in Fig. 3.

Acknowledgements The work described in this paper was supported under contract with the Department of Trade and Industry as part of the United Kingdom National Measurement System Valid Analytical Measurement (VAM) Programme.

References

1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist (LGC), London
3. Pueyo M, Obiols J, Vilalta E (1996) Anal Commun 33:205–208
4. Williams A (1993) Anal Proc 30:248–250
5. Analytical Methods Committee (1995) Analyst 120:2303–2308
6. Ellison SLR (1997) In: Ciarlini P, Cox MG, Pavese F, Richter D (eds) Advanced mathematical tools in metrology III. World Scientific, Singapore
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6–10
8. Ríos A, Valcárcel M (1998) Accred Qual Assur 3:14–29
9. IUPAC (1988) Pure Appl Chem 60:885
10. AOAC (1989) J Assoc Off Anal Chem 72:694–704
11. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
12. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101–105
13. Ellison SLR, Barwick VJ (1998) Analyst 123:1387–1392
14. Barwick VJ, Ellison SLR (1998) Anal Comm 35:377–383
15. ISO 9004-4:1993 (1993) Total quality management Part 2: Guidelines for quality improvement. ISO, Geneva
16. Barwick VJ, Ellison SLR (1999) Protocol for uncertainty evaluation from validation data. VAM Technical Report No. LGC/VAM/1998/088, available on LGC website at www.lgc.co.uk
17. Barwick VJ, Ellison SLR, Rafferty MJQ, Gill RS (1999) Accred Qual Assur
18. Barwick VJ, Ellison SLR (1999) Analyst 124:981–990
19. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, Cambridge
20. Youden WJ, Steiner EH (1975) Statistical manual of the Association of Official Analytical Chemists. Association of Official Analytical Chemists (AOAC), Arlington, Va.
21. Vander Heyden Y, Luypaert K, Hartmann C, Massart DL, Hoogmartens J, De Beer J (1995) Anal Chim Acta 312:245–262

Page 100: Validation in Chemical Measurement

Accred Qual Assur (2000) 5:104–113 © Springer-Verlag 2000 PRACTITIONER'S REPORT

Vicki J. Barwick · Stephen L.R. Ellison · Mark J.Q. Rafferty · Rattanjit S. Gill

The evaluation of measurement uncertainty from method validation studies. Part 2: The practical application of a laboratory protocol

Received: 10 April 1999 / Accepted: 24 September 1999

V.J. Barwick (✉) · S.L.R. Ellison · M.J.Q. Rafferty · R.S. Gill
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk
Tel.: +44-20-8943 7421
Fax: +44-20-8943 2767

Abstract A protocol has been developed illustrating the link between validation experiments and measurement uncertainty evaluation. The application of the protocol is illustrated with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples. The method requires the extraction of the markers from the sample matrix by solid phase extraction followed by quantification by high performance liquid chromatography (HPLC) with diode array detection. The uncertainties for the determination of the markers were evaluated using data from precision and trueness studies using representative sample matrices spiked at a range of concentrations, and from ruggedness studies of the extraction and HPLC stages.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness · High performance liquid chromatography

Introduction

In Part 1 [1] we described a protocol for the evaluation of measurement uncertainty from validation studies such as precision, trueness and ruggedness testing. In this paper we illustrate the application of the protocol to a method developed for the determination of the dyes CI solvent red 24 and CI solvent yellow 124, and the chemical marker quinizarin (1,4-dihydroxyanthraquinone), in road fuel. The analysis of road fuel samples suspected of containing rebated kerosene or rebated gas oil is required as the use of rebated fuels as road fuels or extenders to road fuels is illegal. To prevent illegal use of rebated fuels, HM Customs and Excise require them to be marked. This is achieved by adding solvent red 24, solvent yellow 124 and quinizarin to the fuel. A method for the quantitation of the markers was developed in this laboratory [2]. Over a period of time the method had been adapted to improve its performance and now required re-validation and an uncertainty estimate. This paper describes the experiments undertaken and shows how the data were used in the calculation of the measurement uncertainty.

Experimental

Outline of procedure for the determination of CI solvent red 24, CI solvent yellow 124 and quinizarin in fuel samples

Extraction procedure

The sample (10 ml) was transferred by automatic pipette to a solid phase extraction cartridge containing 500 mg silica. The cartridge was drained under vacuum until the silica bed appeared dry. The cartridge was then washed under vacuum with 10 ml hexane to remove residual oil. The markers were eluted from the cartridge under gravity with 10 ml butan-1-ol in hexane (10% v/v). The eluent was collected in a glass specimen vial and evaporated to dryness by heating to 50 °C under an air stream. The residue was dissolved in

Page 101: Validation in Chemical Measurement


Fig. 1 Cause and effect diagram illustrating sources of uncertainty for the method for the determination of markers in fuel oil

acetonitrile (2.5 ml) and the resulting solution placed in an ultrasonic bath for 5 min. The solution was then passed through a 0.45 µm filter prior to analysis by high performance liquid chromatography (HPLC).

HPLC conditions

The samples (50 µl) were analysed on a Hewlett Packard 1050 DAD system upgraded with a 1090 DAD optical bench. The column was a Luna 5 µm phenyl-hexyl, 250 mm × 4.6 mm, maintained at 30 °C. The flow rate was 1 ml min−1 using a gradient elution of acetonitrile and water as follows:

Time (min):      0   3   4   5   9  10  20  21  23
% Water:        40  40  30  10  10   2   2  40  40
% Acetonitrile: 60  60  70  90  90  98  98  60  60

Calibration was by means of a single standard in acetonitrile containing CI solvent red 24 and CI solvent yellow 124 at a concentration of approximately 20 mg l−1 and quinizarin at a concentration of approximately 10 mg l−1. CI solvent red 24 and quinizarin were quantified using data (peak areas) recorded on detector channel B (500 nm), whilst CI solvent yellow 124 was quantified using data recorded on detector channel A (475 nm). The concentration of the analyte, C in mg l−1, was calculated using Eq. (1):

$C = \frac{A_S \times V_F \times C_{STD}}{A_{STD} \times V_S}$ ,  (1)

where A_S is the peak area recorded for the sample solution, A_STD is the peak area recorded for the standard solution, V_F is the final volume of the sample solution (ml), V_S is the volume of the sample taken for analysis (ml) and C_STD is the concentration of the standard solution (mg l−1).
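Eq. (1) amounts to a single-point external-standard calibration. A minimal sketch with invented peak areas:

    def concentration(a_s, a_std, v_f, v_s, c_std):
        """Eq. (1): C = (A_S * V_F * C_STD) / (A_STD * V_S), in mg/l."""
        return a_s * v_f * c_std / (a_std * v_s)

    # Hypothetical areas: sample 980, standard 1000; V_F = 2.5 ml, V_S = 10 ml,
    # C_STD = 20 mg/l  ->  C = 4.9 mg/l
    c = concentration(980.0, 1000.0, 2.5, 10.0, 20.0)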

Experiments planned for validation and uncertainty estimation

A cause and effect diagram [3–5] illustrating the main parameters controlling the result of the analysis is presented in Fig. 1. Note that uncertainties associated with sampling are outside the scope of this study, as the uncertainty was required for the sample as received in the laboratory. The uncertainty contribution from sub-sampling the laboratory sample is represented by the "inhomogeneity" branch in Fig. 1. Initially, two sets of experiments were planned – a precision study and a trueness study. These were planned so as to cover as many sources of uncertainty as possible. Parameters not adequately covered by these experiments (i.e. not varied representatively) were evaluated separately using ruggedness tests or existing published data. Whilst these studies are required for the method validation process, it should be noted that they do not form a complete validation study [6].

Page 102: Validation in Chemical Measurement


Precision experiments

Samples of 3 unmarked fuel oils (A–C) were fortified with CI solvent red 24 at concentrations of 0.041, 1.02, 2.03, 3.05 and 4.06 mg l−1; quinizarin at concentrations of 0.040, 0.498, 0.996, 1.49 and 1.99 mg l−1; and CI solvent yellow 124 at concentrations of 0.040, 1.20, 2.40, 3.99 and 4.99 mg l−1, to give a total of 15 fortified samples. Oil B was a diesel oil, representing a sample of typical viscosity. Oil A was a kerosene and oil C was a lubricating oil. These oils are respectively less viscous and more viscous than oil B.

Initially, 12 sub-samples of oil B with a concentration of 2.03 mg l−1 CI solvent red 24, 0.996 mg l−1 quinizarin and 2.40 mg l−1 CI solvent yellow 124 were analysed. The extraction stage was carried out in two batches of six on consecutive days. The markers in all 12 sub-samples were quantified in a single HPLC run, with the order of the analysis randomised. This study was followed by the analysis, in duplicate, of all 15 samples. The sample extracts were analysed in three separate HPLC runs such that the duplicates for each sample were in different runs. For each HPLC run a new standard and a fresh batch of mobile phase was prepared.

In addition, the results obtained from the replicate analysis of a sample of BP diesel, prepared for the trueness study (see below), were used in the estimate of uncertainty associated with method precision.

Trueness experiments

No suitable CRM was available for the evaluation of recovery. The study therefore employed representative samples of fuel oil spiked with the markers at the required concentrations. To obtain an estimate of Rm and its uncertainty, a 2-l sample of unmarked BP diesel was spiked with standards in toluene containing CI solvent red 24, quinizarin and CI solvent yellow 124 at concentrations of 0.996 mg ml−1, 1.02 mg ml−1 and 1.97 mg ml−1, respectively, to give concentrations in the diesel of 4.06 mg l−1, 1.99 mg l−1 and 4.99 mg l−1, respectively. A series of 48 aliquots of this sample were analysed in 3 batches of 16. The estimate of Rs and its uncertainty, u(Rs), was calculated from these results plus the results from the analysis of the samples used in the precision study.

Evaluation of other sources of uncertainty: Ruggedness test

The effects of parameters associated with the extraction/clean-up stages and the HPLC quantification stage were studied in separate experiments. The parameters studied for the extraction/clean-up stage of the method and the levels chosen are shown in Table 1a. The ruggedness test was applied to the matrix B (diesel oil) sample containing 2.03 mg l−1 CI solvent red 24, 0.996 mg l−1 quinizarin and 2.40 mg l−1 CI solvent yellow 124 used in the precision study. The eight experiments were carried out over a short period of time and the resulting sample extracts were analysed in a single HPLC run. The HPLC parameters investigated and the levels chosen are given in Table 1b. For this set of experiments a single extract of the matrix B (diesel oil) sample, obtained under normal method conditions, was used. The extract and a standard were run under each set of conditions required by the ruggedness test. The effect of variations in the parameters was monitored by calculating the concentration of the markers observed under each set of parameters, using the appropriate standard.

Results and uncertainty calculations

Precision study

The results from the precision studies are summarised in Table 2. Estimates for the standard deviation for a single result were obtained from the results of the duplicate analyses of the 15 samples, by taking the standard deviation of the differences between the pairs and dividing by √2. Estimates of the relative standard deviations were obtained by treating the normalised differences in the same way [7]. The results from the analysis of the BP diesel sample represented three batches of 16 replicate analyses. An estimate of the total precision (i.e. within and between batch variation) was obtained via ANOVA [8]. The precision estimates cover different sources of variability in the method. The estimates obtained from the duplicate samples and the BP oil sample cover batch to batch variability in the extraction and HPLC stages of the method (including the preparation of new standards and mobile phase). The estimate obtained from matrix B does not cover batch to batch variability in the HPLC procedure as all the replicates were analysed in a single HPLC run. The precision studies also cover the uncertainty associated with sample inhomogeneity as they involved the analysis of a number of sub-samples taken from the bulk.
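A sketch of the duplicate-based estimate (the duplicate pairs below are invented; each row is one sample analysed in two different runs):

    import numpy as np

    dups = np.array([[2.01, 1.95],      # hypothetical duplicate results, mg/l
                     [0.99, 1.04],
                     [4.10, 3.98]])

    diffs = dups[:, 0] - dups[:, 1]
    s_single = diffs.std(ddof=1) / np.sqrt(2)   # SD for a single result
    norm = diffs / dups.mean(axis=1)            # normalised differences
    rsd = norm.std(ddof=1) / np.sqrt(2)         # relative standard deviation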

CI solvent red 24

No significant difference was observed (F-tests, 95% confidence) between the three estimates obtained for the relative standard deviation (0.0323, 0.0289 and 0.0414). However, the test was borderline and across the range studied (0.04 mg l−1 to 4 mg l−1) the method

Page 103: Validation in Chemical Measurement


Table 1 Results from the ruggedness testing of the procedure for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil. For each analyte the three entries are Δx_i (mg l−1), c_i and u(y(x_i)) (mg l−1).

(a) Ruggedness testing of the extraction/clean-up procedure

Parameter | Values | d_real/d_test | u(x_i) | CI solvent red 24 | Quinizarin | CI solvent yellow 124
Brand of silica cartridges | A: Varian / a: Waters | 1/1 (a) | – | 0.00750*, –, 0.0493 | −0.00750*, –, 0.0174 | −0.00250*, –, 0.0199
Sample volume | B: 10 ml / b: 12 ml | – | 0.04 ml | −0.353, 0.176, 0.00705 | −0.180, 0.090, 0.00360 | −0.423, 0.212, 0.00845
Rate of elution of oil with hexane | C: vacuum / c: gravity | 1/10 (a) | – | −0.0275*, –, 0.00493 | 0.070, –, 0.0070 (a) | −0.020*, –, 0.00199
Volume of hexane wash | D: 12 ml / d: 8 ml | – | 0.04 ml | 0.213, 0.0531, 0.00213 | 0.176, 0.0444, 0.00177 | 0.225, 0.0563, 0.00225
Concentration of butan-1-ol/hexane | E: 12% / e: 8% | 0.2% (v/v)/4% (v/v) | – | −0.0425*, –, 0.00247 | 0.0175*, –, 0.000868 | −0.010*, –, 0.0010
Volume of 10% butan-1-ol/hexane | F: 12 ml / f: 8 ml | 0.08 ml/4 ml | 0.04 ml | 0.0625*, –, 0.000986 | −0.0050*, –, 0.000347 | 0.080, 0.020, 0.00080
Evaporation temperature | G: 50 °C / g: 80 °C | 10 °C/30 °C | 2.89 °C | −0.0275*, –, 0.0164 | −0.0425, 0.00142, 0.00409 | 0.00750*, –, 0.00663

(a) See text for explanation
* No significant effect at the 95% confidence level

(b) Ruggedness testing of the HPLC procedure

Parameter | Values | d_real/d_test | u(x_i) | CI solvent red 24 | Quinizarin | CI solvent yellow 124
Type of acetonitrile in mobile phase | A: far-UV grade / a: HPLC grade | – | (a) | −0.0748 (a) | −0.101 (a) | −0.0228* (a)
Flow rate | B: 0.8 ml min−1 / b: 1.2 ml min−1 | – | 0.00173 ml min−1 | −0.124, 0.309, 0.000535 | 0.0283, 0.0707, 0.000122 | −0.0465, 0.116, 0.00020
Injection volume | C: 40 µl / c: 60 µl | – | 0.75 µl | 0.115, 0.00576, 0.00432 | −0.0406, 0.00203, 0.00152 | 0.0284, 0.00142, 0.00107
Column temperature | D: 25 °C / d: 35 °C | 2 °C/10 °C | 1 °C | −0.130, 0.0130, 0.00752 | 0.0201, 0.00201, 0.00116 | −0.0233*, –, 0.00282
Detector wavelength (A) | E: 465 nm / e: 485 nm | 4 nm/20 nm | 1.15 nm | 0.104, 0.00520, 0.00598 | 0.0239, 0.00120, 0.00138 | −0.0161*, –, 0.00282
Degassing of mobile phase | F: degassed / f: not degassed | – | (a) | 0.108 (a) | 0.0641 (a) | 0.00907* (a)
Detector wavelength (B) | G: 490 nm / g: 510 nm | 4 nm/20 nm | 1.15 nm | 0.105, 0.00525, 0.00604 | −0.0112*, –, 0.00154 | 0.0198*, –, 0.00282

(a) See text for explanation
* No significant effect at the 95% confidence level

Page 104: Validation in Chemical Measurement


Table 2 Summary of data used in the estimation of u(P)

Analyte/Matrix | n | Mean (mg l−1) | Standard deviation (mg l−1) | Relative standard deviation

CI solvent red 24
Matrix B | 12 | 1.92 | 0.0621 | 0.0323
BP diesel | 48 (a) | 3.88 | 0.112 | 0.0289
Matrices A–C | 15 (b) | – | 0.0376 | 0.0414

Quinizarin
Matrix B | 11 | 0.913 | 0.0216 | 0.0236
BP diesel | 48 (a) | 1.89 | 0.0256 | 0.0136
Matrices A–C | 15 (b) | – | 0.0470 | 0.0788

CI solvent yellow 124
Matrix B | 12 | 2.35 | 0.0251 | 0.0107
BP diesel | 48 (a) | 4.99 | 0.0618 | 0.0124
Matrices A–C | 15 (b) | – | 0.0247 | 0.0464

(a) Standard deviation and relative standard deviation estimated from ANOVA of 3 sets of 16 replicates (see text)
(b) Standard deviation and relative standard deviation estimated from duplicate results (15 sets) for a range of concentrations and matrices (see text)

Table 3 Results from the replicate analysis of a diesel oil spiked with CI solvent red 24, quinizarin and CI solvent yellow 124

Analyte | Target concentration, C_spike (mg l−1) | Mean, C_obs (mg l−1) | Standard deviation of the mean, s_obs (mg l−1) (a)
CI solvent red 24 | 4.06 | 3.88 | 0.0360
Quinizarin | 1.99 | 1.89 | 0.00370
CI solvent yellow 124 | 4.99 | 4.99 | 0.0167

(a) Estimated from ANOVA of 3 groups of 16 replicates according to ISO 5725:1994 [9]

precision was approximately proportional to analyte concentration. It was decided to use the estimate of 0.0414 as the uncertainty associated with precision, u(P), to avoid underestimating the precision for any given sample. This estimate was obtained from the analysis of different matrices and concentrations and is therefore likely to be more representative of the precision across the method scope.

Quinizarin

The estimates of the standard deviation and relative standard deviation were not comparable. In particular, the estimates obtained from the duplicate results were significantly different from the other estimates (F-tests, 95% confidence). There were no obvious patterns in the data so no particular matrix and/or concentration could be identified as being the cause of the variability. There was therefore no justification for removing any data and restricting the coverage of the uncertainty estimate, as in the case of CI solvent yellow 124 (see below). The results of the precision studies indicate that

the method is more variable across different matrices and analyte concentrations for quinizarin than for the other markers. The uncertainty associated with the precision was taken as the estimate of the relative standard deviation obtained from the duplicate results, 0.0788. This estimate should ensure that the uncertainty is not underestimated for any given matrix or concentration (although it may result in an overestimate in some cases).

CI solvent yellow 124

There was no significant difference between the estimates of the relative standard deviation obtained for samples at concentrations of 2.4 mg l−1 and 4.99 mg l−1. However, the estimate obtained from the duplicate analyses was significantly greater than the other estimates. Inspection of that data revealed that the normalised differences observed for the samples at a concentration of 0.04 mg l−1 were substantially larger than those observed at the other concentrations. Removing these data points gave a revised estimate of the relative standard deviation of 0.00903. This was in agreement with the other estimates obtained (F-tests, 95% confidence). The three estimates were therefore pooled to give a single estimate of the relative standard deviation of 0.0114. At present, the uncertainty estimate cannot be applied to samples with concentrations below 1.2 mg l−1. Further study would be required to investigate in more detail the precision at these low levels.

Trueness study

Evaluation of Rm and u(Rm)

The results are summarised in Table 3. In each case Rm was calculated using Eq. (2):

$R_m = \frac{C_{obs}}{C_{CRM}}$ ,  (2)

where C_obs is the mean of the replicate analyses of the spiked sample and C_CRM is the concentration of the

Page 105: Validation in Chemical Measurement


1 Detailed information on the estimation of uncertainties of this type is given in Ref. [7].

spiked sample. The uncertainty, u(Rm), was calculated using Eq. (3):

$u(R_m) = R_m \times \sqrt{\left(\frac{s_{obs}}{C_{obs}}\right)^2 + \left(\frac{u(C_{CRM})}{C_{CRM}}\right)^2}$ ,  (3)

where u(C_CRM) is the standard uncertainty in the concentration of the spiked sample. The standard deviation of the mean of the results, s_obs, was estimated from ANOVA of the data according to Part 4 of ISO 5725:1994 [9].

Using information on the purity of the material used to prepare the spiked sample, and the accuracy and precision of the volumetric glassware and analytical balance used, the uncertainty in the concentration of CI solvent red 24 in the sample, u(C_CRM), was estimated as 0.05 mg l−1.¹ The uncertainties associated with the concentrations of quinizarin and CI solvent yellow 124 were estimated as 0.025 mg l−1 and 0.062 mg l−1, respectively. The relevant values are:

CI solvent red 24: Rm = 0.957, u(Rm) = 0.0148
Quinizarin: Rm = 0.949, u(Rm) = 0.0121
CI solvent yellow 124: Rm = 1.00, u(Rm) = 0.0129

Applying Eq. (4):

$t = \frac{|1 - R_m|}{u(R_m)}$  (4)

indicated that the estimates of Rm obtained for CI solvent red 24 and quinizarin were significantly different from 1.0 (t > 2) [7, 10]. During routine use of the method, the results reported for test samples will not be corrected for incomplete recovery of the analyte. Equation (5) was therefore used to calculate an increased uncertainty for Rm to take account of the uncorrected bias:

$u(R_m)' = \sqrt{\left(\frac{1 - R_m}{k}\right)^2 + u(R_m)^2}$ ,  (5)

u(Rm)′ was calculated as 0.0262 for CI solvent red 24 and 0.0283 for quinizarin. The significance test for CI solvent yellow 124 indicated that Rm was not significantly different from 1.0. The uncertainty associated with Rm is the value of u(Rm) calculated above (i.e. 0.0129).
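These figures can be checked directly from Eqs. (4) and (5); a short sketch using the values quoted above (the small difference from the reported 0.0262 is rounding of Rm):

    r_m, u_rm, k = 0.957, 0.0148, 2.0                   # CI solvent red 24
    t = abs(1 - r_m) / u_rm                             # ~2.9 > 2: significant bias
    u_rm_b = (((1 - r_m) / k)**2 + u_rm**2) ** 0.5      # ~0.026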

u(Rs) is the standard deviation of the mean recoveries obtained for the samples analysed in the precision studies and the BP diesel sample used in the study of Rm. This gave estimates of u(Rs) of 0.0322 for CI solvent red 24, 0.0932 for quinizarin and 0.0138 for CI solvent yellow 124. The estimate for CI solvent yellow 124 only includes concentrations above 1.2 mg l−1, for the reason discussed in the section on precision.

Calculation of R and u(R)

The recovery, R, for a particular test sample and the corresponding uncertainty, u(R), are calculated using Eqs. (6) and (7):

$R = R_m \times R_s \times R_{rep}$ ,  (6)

$u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2}$ .  (7)

In this study a spiked sample can be considered a reasonable representation of test samples of marked fuel oils. There is therefore no need to correct the estimates of Rm and u(Rm) by including the Rrep and u(Rrep) terms. Both Rm and Rs are assumed to be equal to 1. R is therefore also equal to 1. Combining the estimates of u(Rm) and u(Rs), the uncertainty u(R) was calculated as 0.0415 for CI solvent red 24, 0.0974 for quinizarin and 0.0187 for CI solvent yellow 124.

Ruggedness test of extraction/clean-up procedure

The results from the ruggedness study of the extraction/clean-up procedure are presented in Table 1a. The precision of the method for the analysis of the sample used in the ruggedness study had been estimated previously as 0.0621 mg l−1 (n = 11) for CI solvent red 24, 0.0216 mg l−1 (n = 10) for quinizarin and 0.0251 mg l−1 (n = 11) for CI solvent yellow 124. Parameters were tested for significance using Eq. (8):

$t = \frac{\sqrt{n}\,\Delta x_i}{\sqrt{2}\,s}$ ,  (8)

where s is the estimate of the method precision, n is the number of experiments carried out at each level for each parameter (n = 4 for a seven-parameter Plackett-Burman experimental design), and Δx_i is the difference calculated for parameter x_i [1, 11]. The degrees of freedom for t_crit relate to the degrees of freedom for the precision estimate used in the calculation of t.

The parameters identified as having no significant effect on method performance, at the 95% confidence level, are highlighted in Table 1a. For these parameters the uncertainty in the final result was calculated using Eq. (9):

$u(y(x_i)) = \frac{\sqrt{2} \times t_{crit} \times s}{\sqrt{n} \times 1.96} \times \frac{d_{real}}{d_{test}}$ ,  (9)

where d_real is the change in the parameter which would be expected when the method is operating under

Page 106: Validation in Chemical Measurement


control in routine use and d_test is the change in the parameter that was specified in the ruggedness study. The estimates of d_real are given in Table 1a. For parameter A, brand of silica cartridge, the conditions of the test (i.e. changing between two brands of cartridge) were considered representative of normal operation of the method; d_real is therefore equal to d_test. The effect of the rate of elution of oil by hexane from the cartridge was investigated by comparing elution under vacuum with elution under gravity. In routine analyses, the oil will be eluted under vacuum. Variations in the vacuum applied from one extraction to another will affect the rate of elution of the oil and the amount of oil eluted. However, the effect of variations in the vacuum will be small compared to the effect of having no vacuum present. It can therefore be assumed that variations in the observed concentration of the markers, due to variability in the vacuum, will be small compared to the differences observed in the ruggedness test. As a first estimate, the effect of variation in the vacuum during routine application of the method was estimated as one-tenth of that observed during the ruggedness study. This indicated that the parameter was not a significant contribution to the overall uncertainty for CI solvent red 24 and CI solvent yellow 124, so no further study was required. The estimates of d_real for the concentration and volume of butan-1-ol in hexane used to elute the column were based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solution.

For the parameters identified as having a significant effect on method performance, the uncertainty was calculated using Eqs. (10) and (11):

$u(y(x_i)) = u(x_i) \times c_i$ ,  (10)

$c_i = \frac{\text{observed change in result}}{\text{change in parameter}}$ .  (11)

The estimates of the uncertainty in each parameter, u(x_i), are given in Table 1a. The uncertainties associated with the sample volume, volume of hexane wash and volume of the 10% butan-1-ol/hexane solution were again based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solutions. The uncertainty in the evaporation temperature was based on the assumption that the temperature could be controlled to ±5 °C. This was taken as a rectangular distribution and converted to a standard uncertainty by dividing by √3 [7]. As discussed previously, the effect on the final result of variations in the vacuum when eluting the oil from the cartridge with hexane was estimated as one-tenth that observed in the ruggedness test.
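The tabulated values can be reproduced from Eqs. (8)–(11); a minimal sketch using two entries from Table 1b, with t_crit assumed to be approximately 2.00 for 68 degrees of freedom:

    from math import sqrt

    # Column temperature, CI solvent yellow 124: not significant
    s, n, t_crit = 0.0196, 4, 2.00
    dx = -0.0233
    t = sqrt(n) * abs(dx) / (sqrt(2) * s)                        # ~1.7 < t_crit
    u = sqrt(2) * t_crit * s / (sqrt(n) * 1.96) * (2.0 / 10.0)   # ~0.0028 mg/l

    # Flow rate, CI solvent red 24: significant, so Eqs. (10) and (11) apply
    c_i = abs(-0.124) / 0.4          # observed change / change in parameter ~0.31
    u_flow = 0.00173 * c_i           # ~0.00054 mg/l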

The effects of all the parameters were considered to be proportional to the analyte concentration. The uncertainties were therefore converted to relative standard deviations by dividing by the mean of the results obtained from previous analyses of the sample under normal method conditions (see results for matrix B in Table 2).

Ruggedness test of the HPLC procedure

The results from the ruggedness study of the HPLC procedure, and the values of d_real and u(x_i) used in the uncertainty calculations, are presented in Table 1b. Replicate analyses of a standard solution of the three markers gave the following estimates of the precision of the HPLC system at the concentration of the sample used in the study: CI solvent red 24, s = 0.0363 mg l−1 (n = 69); quinizarin, s = 0.0107 mg l−1 (n = 69); CI solvent yellow 124, s = 0.0196 mg l−1 (n = 69). Parameters were tested for significance, at 95% confidence, using Eq. (8). The uncertainties for parameters identified as having no significant effect on the method performance were calculated using Eq. (9). Based on information from manufacturers' specifications for HPLC systems, the uncertainty associated with the column temperature was estimated as ±1 °C, giving an estimate of d_real of 2 °C. Again, based on manufacturers' specifications for DAD detectors, the uncertainty associated with the detector wavelengths was estimated as ±2 nm, giving a d_real value of 4 nm.

The uncertainties due to significant parameters were estimated using Eqs. (10) and (11). Information in the literature suggests that a typical variation in flow rate is ±0.3% [12]. The uncertainty in the flow rate was therefore estimated as 0.00173 ml min−1, assuming a rectangular distribution. Data in the literature gave 1.5% as a typical coefficient of variation for the volume delivered by an autosampler [13]. The uncertainty associated with the injection volume of 50 µl was therefore estimated as 0.75 µl.

Two remaining parameters merit further discussion: the type of acetonitrile used in the mobile phase and whether or not the mobile phase was degassed. The method was developed using HPLC grade acetonitrile. The ruggedness test indicated that changing to far-UV grade results in a lower recovery for all three analytes. The method protocol should therefore specify that for routine use, HPLC grade acetonitrile must be used. The ruggedness test also indicated that not degassing the mobile phase causes a reduction in recovery. The method was developed using degassed mobile phase, and the method protocol will specify that this must be the case during future use of the method. As these two parameters are being controlled in the method protocol, uncertainty terms have not been included.

The effects of all the parameters were considered to be proportional to the analyte concentration. The

Page 107: Validation in Chemical Measurement


Fig. 2 Contributions as relative standard deviations (RSDs) to the measurement uncertainty for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil

uncertainties were therefore converted to relative standard deviations by dividing by the mean of results obtained from previous analyses of the sample under normal method conditions (see results for matrix B in Table 2).

Other sources of uncertainty

The precision and trueness studies were designed to cover as many of the sources of uncertainty as possible (see Fig. 1), for example, by analysing different sample matrices and concentration levels, and by preparing new standards and HPLC mobile phase for each batch of analyses. Parameters which were not adequately varied during these experiments, such as the extraction and HPLC conditions, were investigated in the ruggedness tests. There are, however, a small number of parameters which were not covered by the above experiments. These generally related to the calibration of pipettes and balances used in the preparation of the standards and samples. For example, during this study the same pipettes were used in the preparation of all the working standards. Although the precision associated with the operation of the pipette is included in the overall precision estimate, the effect of the accuracy of the pipettes has not been included in the uncertainty budget so far. A pipette used to prepare the standard may typically deliver 0.03 ml above its nominal value. In the future a different pipette, or the same pipette

Page 108: Validation in Chemical Measurement


after re-calibration, may deliver 0.02 ml below the nominal value. Since this possible variation is not already included in the uncertainty budget it should be considered separately. However, previous experience [14] has shown us that uncertainties associated with the calibration of volumetric glassware and analytical balances are generally small compared to other sources of uncertainty such as overall precision and recovery. Additional uncertainty estimates for these parameters have not therefore been included in the uncertainty budgets.

Calculation of measurement uncertainty

The contributions to the uncertainty budget for each of the analytes are illustrated in Fig. 2. In all cases the sources of uncertainty were considered to be proportional to analyte concentration. Using Eq. (12):

u(y)/y = √[(u(p)/p)² + (u(q)/q)² + (u(r)/r)² + …]   (12)

the uncertainty in the final result, u(y), was calculated as 0.065 for CI solvent red 24, 0.13 for quinizarin and 0.024 for CI solvent yellow 124, all expressed as relative standard deviations. The expanded uncertainties, calculated using a coverage factor of k = 2, which gives a confidence level of approximately 95%, are 0.13, 0.26 and 0.048 for CI solvent red 24, quinizarin and CI solvent yellow 124, respectively.
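A small sketch (ours, not the authors') of how Eq. (12) and the k = 2 expansion are applied; the three relative contributions below are placeholder values, since the individual terms behind Fig. 2 are not tabulated here:

```python
import math

def combined_relative_uncertainty(rel_contributions):
    """Eq. (12): quadrature sum of relative standard uncertainties
    for a purely multiplicative measurement model."""
    return math.sqrt(sum(u * u for u in rel_contributions))

u_rel = combined_relative_uncertainty([0.05, 0.03, 0.02])  # placeholders
U_rel = 2 * u_rel  # expanded uncertainty, coverage factor k = 2 (~95%)
print(f"u(y)/y = {u_rel:.3f}, U/y = {U_rel:.3f}")
```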

Discussion

In the case of CI solvent red 24 and CI solvent yellow 124, the significant contributions to the uncertainty budget arose from overall precision and recovery, and from the brand of the solid phase extraction cartridge used. If a reduction in the overall uncertainty of the method were required, useful approaches would be to specify a particular brand of cartridge in the method protocol, or to adopt matrix-specific recovery corrections for test samples.

The combined uncertainty for quinizarin, which is significantly greater than that calculated for the other markers, is dominated by the precision and recovery terms. The results of the precision study indicated variable method performance across different matrices and analyte concentrations. The uncertainty, u(Rs), associated with the variation in recovery from sample to sample was the major contribution to the recovery uncertainty, u(R). This was due to the fact that the recoveries obtained for matrix B were generally higher than those obtained for matrices A and C. However, in this study, a single uncertainty estimate for all the matrices and analyte concentrations studied was required. It was therefore necessary to use "worst case" estimates of the uncertainties for precision and recovery to adequately cover all sample types. If this estimate were found to be unsatisfactory for future applications of the method, separate budgets could be calculated for individual matrices and concentration ranges.

Conclusions

We have developed a protocol which describes how data generated from experimental studies commonly undertaken for method validation purposes can be used in measurement uncertainty evaluation. This paper has illustrated the application of the protocol. In the example described, the uncertainty estimate for three analytes in different oil matrices was evaluated from three experimental studies, namely precision, recovery and ruggedness. These studies were required as part of the method validation, but planning the studies with uncertainty evaluation in mind allowed an uncertainty estimate to be calculated with little extra effort. A number of areas were identified where additional experimental work may be required to refine the estimates. However, the necessary data could be generated by carrying out additional analyses alongside routine test samples. Again, this would minimise the amount of laboratory effort required.

For methods which are already in routine use there may be historical validation data available which could be used, in the same way as illustrated here, to generate an uncertainty estimate. If no such data are available, the case study gives an indication of the type of experimental studies required. Again, with careful planning, it is often possible to undertake the studies alongside routine test samples.

Acknowledgments The work described in this paper was supported under contract with the Department of Trade and Industry as part of the United Kingdom National Measurement System Valid Analytical Measurement (VAM) Programme.


References

1. Barwick VJ, Ellison SLR (1999) Accred Qual Assur (in press)
2. May EM, Hunt DC, Holcombe DG (1986) Analyst 111:993–995
3. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101–105
4. Ellison SLR, Barwick VJ (1998) Analyst 123:1387–1392
5. ISO 9004–4:1993 (1993) Total quality management Part 4. Guidelines for quality improvement. ISO, Geneva, Switzerland
6. EURACHEM (1998) The fitness for purpose of analytical methods, a laboratory guide to method validation and related topics. Laboratory of the Government Chemist, London
7. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London
8. Farrant TJ (1997) Practical statistics for the analytical scientist: a bench guide. Royal Society of Chemistry, Cambridge
9. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva, Switzerland
10. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, Cambridge
11. Youden WJ, Steiner EH (1975) Statistical manual of the Association of Official Analytical Chemists. Association of Official Analytical Chemists, Arlington, Va
12. Brown PR, Hartwick RA (eds) (1989) High performance liquid chromatography. Wiley, New York
13. Dolan JW (1997) LC-GC International 10:418–422
14. Barwick VJ, Ellison SLR (1998) Anal Commun 35:377–383


Accred Qual Assur (2001) 6:31–36 © Springer-Verlag 2001 PRACTITIONER'S REPORT

Solveig Linko

Automated ion-selective measurement of lithium in serum. A practical approach to result-level verification in a two-way method validation

Abstract The Quality Assurance Department of Medix Diacor Labservice evaluated a two-way method validation procedure for serum lithium quantification in therapeutic drug monitoring. In the process of a company fusion and rationalization of two considerably large production lines, three independent ion-selective electrode (ISE) methods were surveyed, among many others. While tailoring the new medical laboratory production, subcontracting from a collaborating company was discontinued. Likewise, modernization of the ISE instrumentation was unavoidable to increase throughput and effectiveness. It was important that the new result levels should be comparable both with the former subcontractor's levels and with the levels reported from the previously existing instrumentation. The aim of this study was to evaluate the most crucial performance characteristics of a novel lithium method in comparison to the two ISE test methods being withdrawn. The standardized lithium test method was inspected in terms of linear measurement range, analytical variation, bias, past and on-going proficiency testing, in addition to method comparison, to achieve the desired analytical goals. Fulfilling the accreditation requirements in terms of the introduced method validation parameters is discussed.

Keywords Validation · Lithium · Performance · Proficiency testing · Analytical goals

Received: 19 April 2000
Accepted: 26 July 2000

S. Linko, Medix Diacor Labservice, Department of Quality Assurance, Nihtisillankuja 1, 00260 Espoo, Finland; e-mail: solveig.linko@medix.fi; Tel.: +358-9-5256-277; Fax: +358-9-5256-260

Introduction

Lithium is used as a therapeutic drug for patients suffering from manic-depressive illness and other mood disorders. Serum concentrations should be monitored regularly in all patients receiving lithium therapy. Therapeutic drug monitoring (TDM) and avoidance of intoxication are the key indications for serum lithium (S-Li) measurements. TDM is as important in the treatment of acute-phase psychiatric disorders as in long-term therapy. The therapeutic concentration range is relatively narrow, 0.60 to 1.2 mmol/l [1]. Adverse reactions may occur at concentration levels of 1.0 to 1.5 mmol/l and more severe reactions at higher concentrations [2, 3]. Thus, analytical quality specifications must be established and well known prior to routine production with any new measurement system. Accordingly, ensuring the trueness of the result level and appropriate analytical performance are fundamental points.

In 1990, Westgard et al. reported on the importance of understanding quality management science (QMS) in clinical chemistry [4]. Throughout the 1990s, many guidelines for implementing quality systems in medical laboratories and explanations of the criteria required for medical laboratory accreditation were introduced [5–7]. Today quality management with its various elements should be part of the routine work in a modern clinical laboratory, whether the laboratory is accredited or not.


The tremendous development of technology in the medical laboratory branch, as in many others, allows robust validation processes at many levels.

If new test methods combined with new instrumentation are introduced into a laboratory they should, self-evidently, be state of the art. On the other hand, if minor modifications are made to existing methods, there will obviously be fewer parameters to be validated. Therefore, the validation process should be defined and well planned in all cases. The European co-operation for Accreditation of Laboratories (EAL) has provided general guidelines on certain issues related to the validation of test methods [8]. According to paragraph 2.1.3, the required depth of the validation process depends on the maturity of the test method and the prevalence of its use. "A standardised test method" is exclusively defined by EAL. This interpretation can easily be applied to the medical laboratory field and has been discussed [9] in terms of the additional clarification needed in the new ISO/IEC 17025 standard [10]. Paragraph 5.4.5 states that the validation shall be as extensive as is necessary to meet the needs of the given application or field of application.

Ion-selective electrodes (ISEs) represent the current primary methodology in the quantification of S-Li [11–13]. Moreover, ISE modules are parts of large and fully automated clinical chemistry analysers. In practice, the validation parameters are most often chosen in terms of judging the acceptability of the new measurement system for daily use. For this reason, the first approach was to study whether the detected imprecision fulfilled the desired analytical quality specifications. Secondly, proficiency testing (PT) results from past samples were of great value in predicting future bias. The identity of the three ISE methods was evaluated using patient samples. The analytical performance was checked after 6 months of routine use. Without exception, method validation always means an extra economic burden. Therefore, the validation parameters chosen and processed have to be considered carefully.

The ISE method validation was performed in two directions (two-way), during the course of a laboratory fusion. The two laboratories involved in the fusion were both previously accredited according to EN 45001. Nevertheless, an S-Li test method was not included in either of the scopes of accreditation.

Background to the two-way method validation process

Originally three independent clinical laboratories were reporting S-Li results to clinicians, each using its own ISE methodology. Laboratory A (Lab A) subcontracted the S-Li tests from its co-operating laboratory B (Lab B) for many years. Meanwhile, laboratory C (Lab C) produced and reported its state-of-the-art S-Li results independently of Lab A and Lab B. In 1999, Lab A and Lab C fused to form one company. In the new laboratory union, the former Lab A was lacking an in-house S-Li test method, while the current instrumentation used for S-Li determinations was running out of its technical performance capabilities. Consequently, New Lab A (united Lab A+C) evaluated the production status in terms of S-Li and decided to set up a fully automated S-Li method within the capabilities of the available clinical chemistry automation and discontinue the subcontract with Lab B.

Two problems needed to be resolved: 1) How to perform an adequate method validation to verify the S-Li concentrations reported by the new laboratory? and 2) Which validation steps would fulfil the required criteria if the laboratory wanted to have the S-Li ISE method considered within its scope of accreditation in the future?

Experimental

Instrumentation

A Cobas Integra 700 clinical chemistry analyser combined with an ISE module from Roche Diagnostics (Germany) was used by the New Lab A. The subcontractor, Lab B, tested S-Li using a Microlyte 6 ISE analyser from Kone Instruments (Finland). Laboratory C applied a 654 Na+/K+/Li+ analyser (Chiron 654) from Ciba Corning Diagnostics Ltd (Chiron Instruments, UK) for its S-Li production.

Sample material

Sixty-two serum samples for method comparison were collected from the patient sample pool of Lab A. The S-Li concentrations varied between 0.1 mmol/l and 1.6 mmol/l, thus covering both the therapeutic range and the decision-making levels of S-Li. All three instruments were capable of measuring 56 samples quantitatively. Six samples were beyond the lower measurement ranges of the Microlyte 6 and Chiron 654 instruments. The samples were aliquoted into appropriate tubes and refrigerated until analysed.

The PT material was obtained from the national scheme organizer, Labquality (Helsinki, Finland), and from an international scheme organizer, Murex Biotechnology Ltd. (UK). The samples were of human origin and purchased either as liquid or lyophilized material. All PT samples were prepared and stored until analysed according to the instructions given by the supplier.


Reagents

The three ISE setups each required their own system solutions and calibrators, which were purchased from the respective manufacturers. A stock solution of 50 mM LiCl in deionized water was serially diluted in lithium-free fresh human serum to test the linearity of the Cobas Integra 700 ISE method. The tested concentration range was from 0.06 mmol/l to 4.01 mmol/l of lithium. LiCl, p.a., 99% purity, was purchased from Merck (Germany).

ISE methods

The ISE systems of all three analysers measured lithium in undiluted serum as direct measurements. Automated system calibration and calculation of the results were built into all the instruments. The lithium measurement range of the Cobas Integra 700 ISE module was from 0.10 mmol/l up to 4.00 mmol/l. According to the manufacturers' specifications the linear measurement range of the Kone Microlyte 6 ISE analyser was from 0.20 mmol/l to 4.00 mmol/l and that of the Chiron 654 ISE from 0.20 mmol/l to 5.00 mmol/l.

A long-term quality control serum, Longtrol (Labquality, Finland), was used by Lab B and Lab C for daily internal quality control. New Lab A used Daytrol as well as normal and pathological quality control samples from Roche Diagnostics for both method validation and routine use. Two different lots of Longtrol and the Roche controls were used in this study.

Statistical analysis

All statistical tests were run using Analyse-It with Microsoft Excel 5.0 for Windows (Analyse-It Software Ltd., 40 Castle Ings Gardens, Leeds, UK). Ordinary linear regression (OLR) was used in testing the linear measurement range of the novel ISE method. The normality of the method comparison data for the three ISE setups was checked by the Shapiro-Wilk W test. Agreement between the ISE instruments was elucidated with Altman-Bland (AB) plots. Passing and Bablok regression analysis was used for method comparisons. The relative biases from PT results were calculated according to the Marchandise equation [14].
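As a rough illustration of the normality check (our sketch; the data are invented and scipy is used in place of the Analyse-It package named above):

```python
import numpy as np
from scipy import stats

# Invented S-Li results (mmol/l); the paper's 56-sample set is not reproduced
s_li = np.array([0.31, 0.45, 0.52, 0.58, 0.64, 0.71,
                 0.80, 0.92, 1.05, 1.18, 1.31, 1.44])

w_stat, p_value = stats.shapiro(s_li)  # Shapiro-Wilk W test
print(f"W = {w_stat:.3f}, P = {p_value:.3f}")
# P < 0.05 would indicate a significant departure from normality,
# as was found for the Kone Microlyte 6 result group (P = 0.04)
```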

Results and discussion

Linearity

The mean values of the replicate measurements in eight serial dilutions were plotted against the theoretical lithium values. The method showed acceptable linearity between 0.10 and 4.00 mmol/l of lithium (y = 0.970x + 0.014, R² = 0.9996) and the results were comparable to the manufacturer's specifications. The 95% confidence intervals (CIs) for the intercept and slope are given in Fig. 1.

Fig. 1 The linearity of the Cobas Integra 700 ion-selective electrode (ISE) lithium method. S-Li serum lithium; linear regression line y = 0.970x + 0.014; R² = 0.9996; intercept = 0.014 (95% CI: −0.035 to 0.062); slope = 0.970 (95% CI: 0.949 to 0.990)
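A hedged sketch of such a linearity check (ours; the dilution data below are invented, and only the design, measured vs. theoretical concentration with 95% CIs on slope and intercept, follows the text; scipy is assumed):

```python
import numpy as np
from scipy import stats

# Invented dilution series (mmol/l): theoretical x, measured y
x = np.array([0.06, 0.13, 0.25, 0.50, 1.00, 2.00, 3.00, 4.01])
y = np.array([0.07, 0.14, 0.26, 0.50, 0.98, 1.95, 2.93, 3.90])

res = stats.linregress(x, y)
t_crit = stats.t.ppf(0.975, len(x) - 2)  # two-sided 95%

print(f"y = {res.slope:.3f}x + {res.intercept:.3f}, R^2 = {res.rvalue**2:.4f}")
print(f"slope 95% CI: {res.slope - t_crit * res.stderr:.3f}"
      f" to {res.slope + t_crit * res.stderr:.3f}")
print(f"intercept 95% CI: {res.intercept - t_crit * res.intercept_stderr:.3f}"
      f" to {res.intercept + t_crit * res.intercept_stderr:.3f}")
```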

Imprecision

The analytical variation of the Cobas Integra ISE method in New Lab A was studied in the validation process by running the two-level system quality control samples for 10 days, and Daytrol for 8 days. The intra-assay variations (n = 20) of the two system controls were much better than those quoted by the manufacturer. The within-batch variation of the lower system control was verified as 0.43%, compared with 0.81% given by the manufacturer. The higher level varied by 1.4%; the manufacturer quoted a respective variation of 2.5%.
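The CV% figures above follow from the usual definition; a minimal sketch (ours, with simulated replicates standing in for the unpublished raw data):

```python
import numpy as np

def cv_percent(values):
    """Coefficient of variation in percent: 100 * s / mean,
    using the sample standard deviation (n - 1 denominator)."""
    v = np.asarray(values, dtype=float)
    return 100.0 * v.std(ddof=1) / v.mean()

# Simulated n = 20 within-batch replicates of the lower system control
rng = np.random.default_rng(7)
control = rng.normal(loc=0.48, scale=0.002, size=20)
print(f"CV = {cv_percent(control):.2f}%")  # compare with the 0.43% above
```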

Lab B reported its inter-assay variation as the total variation of Daytrol only at the higher therapeutic level. Respectively, Lab C reported variation only at the lower therapeutic level (Table 1). As the national target coefficient of variation (CV%) for analytical variation is 2% for S-Li, all methods fulfilled this quality specification when the long-term control, Daytrol, was used. Inter-assay variation of the lower level system control was 3.5 times higher than quoted by the manufacturer. This stresses the point that every laboratory should check the performance specifications given by the manufacturer. The variation at the higher therapeutic level agreed with the obtained validation results.


Table 1 Inter-assay variation, CVmeas %, of the three ion-selective electrode (ISE) applications during method validation and at the 6-month checkpoint for the lower and higher therapeutic levels and the toxic level of serum lithium (S-Li)

ISE application: CVmeas % and (CVman %)a at the stated lithium level
Cobas Integra 700 "Method validation": 1.7b (0.48) at 0.48 mmol/l | 2.4c at 1.14 mmol/l | 1.5d (2.5) at 2.11 mmol/l
Cobas Integra 700 "Six-month checkpoint": 3.3b (0.48) at 0.53 mmol/l | 2.8c at 1.14 mmol/l | 4.5d (2.5) at 1.86 mmol/l
Kone Microlyte 6 "Method validation": – | 1.8c at 1.14 mmol/l | –
Chiron 654 "Method validation": 0.91e at 0.67 mmol/l | – | –

a CVman is the inter-assay variation specified by the manufacturer
b Cobas Integra 700 system control, normal level
c Daytrol, lot "a"
d Cobas Integra 700 system control, pathological level
e Daytrol, lot "b"

At the 6-month checkpoint the variations of both system controls showed much worse performance compared with the results measured during the validation process. Usually one expects better performance in variation over a 12 times longer time period. An explanation for the higher between-day variation might be differences between the two lots of the control samples and/or overall unsuitability of this control material. These results demonstrate that imprecision detected during any validation gives only an idea of the real variation. Further corrective actions should be taken to check the high variation of the system controls. A temporary instability of the lithium electrode could not explain this variation. The cumulative report of Daytrol from the first 6 months showed satisfactory performance. Ninety-six percent of the results were within the target of 2%, although the cumulative variation was 2.8%. Higher than allowed variation was detected during two summer months, which explains why the allowable target was exceeded. The proposed European specification for imprecision of S-Li measurements is 3.6% at the decision levels of 0.4 mmol/l and 1.5 mmol/l [15, 16]. Excluding the higher system control at the six-month checkpoint, the variations of all other controls in every ISE system were within this target.

Bias

The national allowable total analytical error, TEa, is 6%. According to the European specifications for inaccuracy, the respective allowable bias is maximally 4.2% [15, 16]. Table 2 shows the relative biases calculated from the PT results according to the Marchandise equation.

Table 2 Relative bias percentages of the three ISE methods. CP is 6-month checkpoint; QAP is Quality Assessment Programme

ISE instrumentation: Method validation, national QAP | CP, national QAP | CP, international QAP
Cobas Integra 700: 3.3 (n = 6) | 3.2 (n = 6) | 2.8 (n = 12)
Kone Microlyte 6: 5.7 (n = 6) | – | –
Chiron 654: 7.3 (n = 6) | – | –

Table 3 Comparison of bias percentages from reference method values (RMVs) and consensus mean values (CMVs) within two national short-term (ST) surveys

National QAP: Bias % from RMV | Bias % from CMV
ST survey A: 8.7 (0.957 mmol/l) | 6.1 (0.98 mmol/l)
ST survey B: 3.3 (2.40 mmol/l) | 2.5 (2.42 mmol/l)

Six past samples from Labquality were run on the Cobas Integra ISE and Chiron 654 analysers in the validation process. The subcontractor, Lab B, reported their results from the respective months. All biases of the national PT samples were related to consensus means (n = 72 ± 2) within the method group of "liquid chemistry". None of these past samples had reference method values (RMVs). It can be concluded from the calculated relative bias that the novel method showed excellent performance. The Chiron 654 results were much more biased and beyond the acceptable limits of both the national and European quality specifications. Neither of the relative biases of the novel method exceeded the target limits in either national or international schemes at the 6-month checkpoint. Two of the PT samples during that time period had values traceable to flame emission photometry (INSTAND Reference Laboratories, Düsseldorf, Germany) as the reference method. Unfortunately, the uncertainties from the two RMVs could not be estimated because the uncertainty, Ui, of the reference method was unknown. On the other hand, the Marchandise equation can only be applied if n ≥ 3. The results in Table 3 show that although the consensus mean values (CMVs) did not deviate more than 2.4% (survey A) and 0.83% (survey B) from the RMVs, the new method was only roughly within acceptable quality specifications in survey A when the results were compared to the CMV. However, the performance was good in survey B and passed both national and European operation criteria.


Table 4 Descriptive comparison of the three ISE instruments

ISE instrumentation: n | Median S-Li (mmol/l) | 95% CI of median S-Li (mmol/l)
Cobas Integra 700: 56 | 0.640 | 0.540 to 0.680
Kone Microlyte 6: 56 | 0.625 | 0.520 to 0.660
Chiron 654: 56 | 0.700 | 0.610 to 0.740

Fig. 2 Agreement between the Cobas Integra 700 and Kone Microlyte 6 methods as an Altman-Bland (AB) plot. The 95% limits of agreement are illustrated as dashed lines

Method comparison

Fifty-six of the sixty-two samples gave a quantitative lithium concentration with all three ISEs. Six samples were beyond the lower measurement level (0.20 mmol/l) of the Kone Microlyte 6 and Chiron 654 analysers. As these results were not comparable to the results obtained by the novel method, they were excluded from the statistical analysis and further discussion. When the normality of the result groups was tested, the probability values (P) of the Shapiro-Wilk W coefficients were 0.07 for the Cobas Integra instrument, 0.04 for the Kone Microlyte 6 and 0.09 for the Chiron 654. In the descriptive comparison of the three ISEs it was found that the new method and the Kone Microlyte 6 method gave comparable results, while the result level of the Chiron method was approximately 10% higher than those of the other two (Table 4). In view of the skewness detected in all result groups, it was concluded that none of them were normally distributed. The Altman-Bland (AB) agreements between the two comparison ISEs and the new method are shown in Figs. 2 and 3. A positive bias of 0.013 mmol/l (95% CI: 0.005 to 0.021) was detected between the Cobas Integra and Kone Microlyte 6 methods. This bias was not of clinical significance and only 3.6% of the 56 samples were beyond the upper 95% limit of agreement. The Cobas Integra method was negatively biased compared to the Chiron 654. From Fig. 3 it can be seen that three samples (5.4%) fell outside the 95% limits of agreement. The bias of −0.052 mmol/l (95% CI: −0.059 to −0.045) was to be expected given the largest relative bias of the PT samples (Table 2), despite the Chiron instrument showing the best imprecision (Table 1). In the physicians' opinion this deviation was not of great clinical significance. Unfortunately, neither of the ISE methods being withdrawn could be compared to RMVs to see the biases from the true values.
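The agreement statistics can be reproduced in a few lines; this sketch is ours, with invented paired results standing in for the study data:

```python
import numpy as np

def bland_altman(a, b):
    """Altman-Bland agreement: mean difference (bias) and the
    95% limits of agreement, bias +/- 1.96 * SD of differences."""
    d = np.asarray(a, float) - np.asarray(b, float)
    bias, s = d.mean(), d.std(ddof=1)
    return bias, bias - 1.96 * s, bias + 1.96 * s

# Invented paired S-Li results (mmol/l) from two analysers
cobas = [0.42, 0.55, 0.63, 0.71, 0.88, 1.02, 1.19]
kone = [0.41, 0.53, 0.62, 0.70, 0.86, 1.01, 1.17]
bias, lo, hi = bland_altman(cobas, kone)
print(f"bias = {bias:.3f} mmol/l, 95% limits: {lo:.3f} to {hi:.3f}")
```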

Passing and Bablok regression analysis between the methods resulted in biases as predicted by the AB agreement (Table 5). The CIs of the intercepts and the slopes did not overlap between the comparison instruments. It can be concluded that although a statistical difference was detected, this was not clinically significant either. Lab A previously reported a therapeutic range of 0.8–1.2 mmol/l, and Lab C a range of 0.5–1.5 mmol/l, with their patient results.

Fig. 3 Agreement between the Cobas Integra 700 and Chiron 654 methods as an Altman-Bland (AB) plot. The 95% limits of agreement are illustrated as dashed lines

Table 5 The summarized data of the ISE method comparison (n = 56)

Cobas Integra 700 vs. comparison method: AB agreement bias | Passing and Bablok intercept | Passing and Bablok slope
Kone Microlyte 6: 0.013 (95% CI: 0.005 to 0.021) | 0.041a | 0.951b
Chiron 654: −0.052 (95% CI: −0.059 to −0.045) | −0.084c | 1.041d

a Kone Microlyte 6: 95% CI of intercept 0.026 to 0.057
b Kone Microlyte 6: 95% CI of slope 0.925 to 0.977
c Chiron 654: 95% CI of intercept −0.099 to −0.055
d Chiron 654: 95% CI of slope 1.000 to 1.067


Thus, according to these results, a therapeutic range of 0.6–1.2 mmol/l and a toxic level of >1.6 mmol/l could reliably be established at New Lab A.

Conclusions

This method validation study indicates that the choice of validation parameters is crucial. Checking the method performance against the obtained validation results should not be omitted. The preliminary imprecision results do not necessarily predict the analytical variation once the method is taken into routine use. The suitability of system controls has to be considered carefully when patient results are accepted according to the general internal quality control principles. Further, surprisingly high variations should be clarified with the manufacturer. The use of past PT samples is highly recommended to predict the future inaccuracy, whenever possible. However, it would be of great value to estimate the true bias from RMVs with given uncertainties and more frequent availability. The deviations from the RMVs could consequently be used in uncertainty budgeting and the estimation of the total measurement uncertainty. Certified reference materials with traceable values cannot be used in single method validations due to their high expense.

It was concluded that the new ISE method presented for the determination of S-Li fulfilled the analytical quality specifications. The results from the 6-month checkpoint were satisfactory and could be used for further assessment by the national accreditation body in the future.

Acknowledgements The author wishes to give many thanks to Aino Salla for her technical assistance and to chief chemist Paavo Tammisto at Rinnekoti Foundation Laboratory (Espoo, Finland) for his many contributions to this study.

References

1. Tietz NW (1995) Clinical guide to laboratory tests, 3rd edn. W.B. Saunders, USA, p 840
2. Price LH, Heninger GR (1994) N Engl J Med 331:591–597
3. NIMH/NIH Consensus Development Conference Statement (1985) Am J Psychiatry 142:4
4. Westgard JO, Burnett RW, Bowers GN (1990) Clin Chem 36:1712–1716
5. Dybkær R, Jordal R, Jørgensen PJ, Hanson P, Hjelm M, Kaihola HL, Kallner A, Rustad P, Uldall A, de Verdier CH (1993) Scand J Clin Lab Invest 53 Suppl 212:60–84
6. European Communities Confederation of Clinical Chemistry (ECCCC) (1996) Essential criteria for quality systems of clinical biochemistry laboratories. In the EU Working Group on Harmonization of Quality Systems and Accreditation, EC4
7. European co-operation for Accreditation of Laboratories (EAL) (1997) Accreditation for medical laboratories, EAL-G25/ECLM-1, 1st edn. European co-operation for Accreditation (EA)
8. European co-operation for Accreditation of Laboratories (EAL) (1997) Validation of test methods. General principles and concepts, EAL-P11, 1st edn. European co-operation for Accreditation (EA)
9. Walsh MC (1999) Accred Qual Assur 4:365–368
10. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration laboratories. International Organisation for Standardisation (ISO), Geneva, Switzerland
11. Linder MW, Keck PE Jr (1998) Clin Chem 44:1073–1084
12. Sampson M, Ruddel M, Elin R (1994) Clin Chem 40:869–872
13. Kuhn T, Blum R, Hildner HP, Wahl HP (1994) Clin Chem 40:1059
14. Marchandise H (1993) J Anal Chem 34:82–86
15. Fraser CG, Petersen HP, Ricos C, Haeckel R (1992) Eur J Clin Chem Clin Biochem 30:311–317
16. Westgard JO, Seehafer JJ, Barry PL (1994) Clin Chem 40:1228–1232


Received: 31 October 2000
Accepted: 3 January 2001

Presented at the EURACHEM/EQUALM Workshop "Proficiency Testing in Analytical Chemistry, Microbiology and Laboratory Medicine", 24–26 September 2000, Borås, Sweden

Abstract The clarification of hydrocarbon input into the Baltic Sea via rivers is one of the priority issues of the 4th Pollution Load Compilation (PLC-4) within the framework of international Baltic Sea marine monitoring. An interlaboratory comparison was conducted to check the applicability of a new method for the determination of hydrocarbons by solvent extraction and gas chromatography. Surrogate oil solutions with known hydrocarbon content were distributed among the participants for preparation of water samples of different hydrocarbon concentration. Using these concentrations as assigned values and setting target values for precision, the proficiency of participating laboratories could be tested as a qualifying step before involvement in PLC-4 routine monitoring. The results of the exercise indicate that hydrocarbons in water samples can be monitored as a mandatory test item within the framework of PLC-4.

Keywords Hydrocarbons · Mineral oils · Water monitoring · Quality assurance · Interlaboratory comparison

Accred Qual Assur (2001) 6:173–177 © Springer-Verlag 2001 PRACTITIONER'S REPORT

Peter Woitke · Reinhard Kreßner · Peter Lepom

P. Woitke (✉) · R. Kreßner · P. Lepom, Federal Environmental Agency, P.O. Box 33 00 22, 14191 Berlin, Germany; e-mail: [email protected]; Tel.: +49 30 8903 2566; Fax: +49 30 8903 2285

Determination of hydrocarbons in water – interlaboratory method validation before routine monitoring

Introduction

Nowadays it is generally accepted that laboratory proficiency testing schemes are an important quality assurance tool in environmental monitoring programmes. Apart from using validated analytical methods and internal laboratory quality control procedures, the regular participation of laboratories in interlaboratory comparisons is required to ensure the accuracy and comparability of data.

Within the framework of the 4th Pollution Load Compilation (PLC-4) of Baltic Sea monitoring, the clarification of oil inputs to the Baltic Sea via rivers is one of the priority issues according to Helsinki Commission (HELCOM) recommendations [1]. Hence, the determination of hydrocarbons in some larger rivers and point sources is mandatory in PLC-4. Gas chromatographic (GC) determination of hydrocarbons after solvent extraction was chosen as the analytical procedure [2]. The method enables the determination of hydrocarbons at concentrations above 0.1 mg L−1, and encompasses a wide range of aliphatic, alicyclic, and aromatic compounds of interest. Compared with the frequently applied infrared spectrometric method, the GC procedure provides much additional information relating to the boiling point range and qualitative composition of the hydrocarbon contamination.

Oil inputs are to be measured within PLC-4 for the first time. To support external quality assurance measures and to obtain information on the proficiency of the nominated laboratories, an interlaboratory comparison on the determination of hydrocarbons in water was conducted before the start of routine monitoring.

Test materials

Because hydrocarbons, especially n-alkanes, in aqueous solution are susceptible to microbiological degradation, surrogate samples were used in the interlaboratory comparison. All participants were requested to prepare water samples from the solutions listed in Table 1 according to instructions distributed with the samples. All samples were weighed before shipment to enable monitoring of solvent evaporation.

Two concentration levels were to be analysed by the participants, one near the limit of determination and the other ca. ten times this level. The analytical procedure includes a clean-up step in which polar compounds, e.g. plant fats, fulvic acids, and surfactants, are removed from the extracts. To check the influence on analytical results of typical matrix constituents which might be present in river waters and waste waters, possible interfering compounds were added to a subset of the test materials (Table 1).

Because the hydrocarbon test samples were provided as solutions and were prepared from only one stock solution, a test of homogeneity was not necessary. A subset of sample aliquots was kept by the organiser for stability testing. The sample weight was recorded and was found to be constant during the time of the exercise.

Interlaboratory comparison

The hydrocarbon oil index within PLC-4 was determined in accordance with the draft standard ISO/DIS 9377–4, which has recently been adopted as ISO/FDIS 9377–2 [2]. Briefly, water samples are extracted with n-hexane or petroleum ether. After clean-up of the extracts on a Florisil column, hydrocarbons are detected and quantified by capillary gas chromatography with flame ionisation detection (GC-FID, Fig. 1).

Results from previous interlaboratory validation studies on the method showed accuracy to be acceptable – relative reproducibility standard deviations were between 20% and 40% depending on the hydrocarbon concentration and the amount of interfering compounds in the sample.

Table 1 Solutions for preparing surrogate samples distributed in the interlaboratory comparison on the determination of hydrocarbons in water

Sample (code): concentration | solvent | volume [mL] | weight of hydrocarbons [mg] (diesel fuel / lubricating oil) | weight of interfering compounds [mg] (Texapon NSO surfactant / dodecanoic acid methyl ester / river fulvic acid)
Standard solution (STD): 0.624 mg mL−1 | n-hexane | 50 | 20.2 / 11 | – / – / –
Surrogate sample S1: 0.224 mg L−1* | i-propanol | 250 | 33.9 / 16.4 | – / – / –
Surrogate sample S2: 2.22 mg L−1* | i-propanol | 250 | 247.8 / 252.3 | – / – / –
Surrogate sample S3: 0.474 mg L−1* | i-propanol | 250 | 72.2 / 34.4 | 253 / 255.6 / 41.2
Surrogate sample S4: 3.34 mg L−1* | i-propanol | 250 | 381.4 / 369.8 | 250.1 / 249.8 / 41.7

* If prepared according to the sample preparation instructions

Fig. 1 Integrated gas chromatogram of a hydrocarbon standard solution corrected for 'column bleed' (1 mg mL−1 in n-hexane, 3 µL on-column injection, 12 m × 0.32 mm BPX-5 column, 60 °C for 1 min, 20 °C/min to 360 °C, 360 °C for 15 min)


Recoveries between 80% and 100% were obtained for most samples. Only in the presence of surfactants did recoveries drop to 60%.

Because the number of PLC-4 laboratories officially nominated by the HELCOM countries was very limited, additional laboratories were invited to participate. Laboratories not familiar with the new method received several test samples to acquaint them with the procedure before announcement of the exercise. Samples, data report sheets, and a questionnaire referring to the experimental conditions were sent to 24 laboratories. The participants were requested to perform four replicate determinations. Sixteen laboratories in six different countries around the Baltic Sea sent in results to the organiser.

The data were assessed by the ISO 5725–2 protocol [3] implemented in the software package Prolab98 (Dr Uhlig, Quo Data, Dresden, Germany), which is routinely used by the German Federal Environmental Agency for evaluation of laboratory proficiency tests. Outliers were rejected by use of Grubbs' test (P = 1%) and Cochran's test (P = 1%).
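For orientation, a sketch of a two-sided Grubbs test at P = 1% (ours, not the Prolab98 implementation; the laboratory means below are invented):

```python
import numpy as np
from scipy import stats

def grubbs_outlier(values, alpha=0.01):
    """Two-sided Grubbs test for a single outlier. Returns the
    suspect value and whether it exceeds the critical G at the
    given significance level (P = 1% as used in the exercise)."""
    x = np.asarray(values, float)
    n = len(x)
    g = np.abs(x - x.mean()).max() / x.std(ddof=1)
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    suspect = x[np.abs(x - x.mean()).argmax()]
    return suspect, g > g_crit

# Invented laboratory means (mg/L) for one sample
labs = [0.20, 0.21, 0.19, 0.22, 0.20, 0.18, 0.21, 0.35]
suspect, reject = grubbs_outlier(labs)
print(f"suspect = {suspect}, rejected: {reject}")
```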

Results

Because samples S1–S4 were synthetic solutions of mineral oils, the overall means of the participants can be compared with assigned values, and recoveries can be calculated (Fig. 2). For sample S1 the recovery was in the expected range, above 85%. It was only 75% for sample S2, indicating incomplete extraction of higher concentrations of hydrocarbons. Recovery of ca. 70% was obtained for samples S3 and S4, which contained interfering compounds, irrespective of mineral oil concentration. To check the calibration functions established by the participants, a standard solution (STD) containing an unknown concentration of mineral oil for direct injection into the GC system was distributed. Evaluation of the results revealed that the correct concentration was found (Fig. 2), although the relatively high overall repeatability and reproducibility obtained for the standard solution must be regarded as insufficient.

Table 2 summarises the overall statistics of the interlaboratory comparison. Repeatability between 7% and 16% and reproducibility of ca. 30% were obtained for all the samples investigated, in good agreement with results from a recently conducted interlaboratory validation study arranged by ISO/TC147/SC2/WG15 [2]. The results indicate that mineral oil concentrations in water samples between 0.2 and 3 mg L−1 can be measured with adequate precision by the participating laboratories.

There were no significant differences between the repeatability or the reproducibility obtained for the standard solution and those obtained for the different water samples. Thus the variability of the sample preparation steps did not seem to contribute markedly to the overall uncertainty of the analytical procedure. It could be concluded that variability in the determination of hydrocarbons in water is mainly a result of the precision of the GC determination.
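The Sr and SR figures of Table 2 are ISO 5725-2 style estimates; a simplified balanced-design sketch (ours; the real PLC-4 data had unequal replication and outlier removal, and the numbers below are simulated):

```python
import numpy as np

def repeatability_reproducibility(data):
    """Balanced one-way ANOVA estimates in the spirit of
    ISO 5725-2: data has shape (labs, replicates). Returns
    (Sr, SR) as relative standard deviations in percent."""
    data = np.asarray(data, float)
    p, n = data.shape
    ms_within = data.var(axis=1, ddof=1).mean()       # repeatability variance
    ms_between = n * data.mean(axis=1).var(ddof=1)    # between-lab mean square
    s_r2 = ms_within
    s_l2 = max(0.0, (ms_between - ms_within) / n)     # between-lab variance
    grand = data.mean()
    return (100 * np.sqrt(s_r2) / grand,
            100 * np.sqrt(s_r2 + s_l2) / grand)

# Simulated results: 5 labs x 4 replicates for one sample (mg/L)
rng = np.random.default_rng(0)
sim = rng.normal(0.21, 0.05, size=(5, 1)) + rng.normal(0, 0.02, size=(5, 4))
print(repeatability_reproducibility(sim))
```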

Proficiency evaluation

Analysis of systematic errors in the determination of hydrocarbons in water can be achieved by use of Youden plots after transformation of the results to an overall mean of Mtotal = 0 and a standard deviation of Stotal = 1 (Fig. 3). Almost all laboratories are distributed around the 45° line, indicating that most of the variation was systematic rather than random, particularly at higher mineral oil concentrations (sample pair S2/S4, Fig. 3B). Results located within the interval Mtotal ± 2 Stotal indicate sufficient proficiency of the participating laboratories in performing the determination of hydrocarbons in water samples.

Fig. 2 Comparison of assigned values and participants' overall mean obtained for the standard solution, STD, and water samples S1–S4 (error bars indicate the 95% confidence interval)

Table 2 Summary statistics of the HELCOM interlaboratory comparison exercise on the determination of oil in water

Sample: no. of labs | no. of results | no. of outliers | overall mean [mg/L] | reference value [mg/L] | overall recovery [%] | Sr* [%] | SR* [%]
STD: 14 | 49 | 4 | 0.717** | 0.624** | 115 | 7.45 | 26.4
S1: 14 | 50 | 8 | 0.206 | 0.224 | 92.0 | 14.8 | 31.3
S2: 15 | 56 | 4 | 1.69 | 2.22 | 76.1 | 9.24 | 21.9
S3: 14 | 46 | 4 | 0.320 | 0.474 | 67.5 | 16.2 | 34.9
S4: 15 | 58 | 4 | 2.20 | 3.34 | 65.9 | 9.58 | 30.2

* Sr – relative repeatability standard deviation; SR – relative reproducibility standard deviation
** [mg mL−1]


For comparison purposes, results from another laboratory (participant K), which used a different method of determination (fluorescence spectroscopy), are included. For higher mineral oil concentrations, this method seems to have a negative systematic error.
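The transformation behind the Youden plot is a simple per-sample standardisation; a sketch with invented laboratory means (ours):

```python
import numpy as np

def standardise(results):
    """Transform one sample's laboratory results to an overall
    mean of 0 and a standard deviation of 1, as done before
    plotting the Youden diagram."""
    x = np.asarray(results, float)
    return (x - x.mean()) / x.std(ddof=1)

# Invented lab means for the sample pair S1/S3; points along the
# 45-degree line indicate systematic error, points off it random error
z_s1 = standardise([0.19, 0.21, 0.20, 0.23, 0.17, 0.25])
z_s3 = standardise([0.30, 0.33, 0.31, 0.36, 0.27, 0.38])
print(np.round(z_s1, 2), np.round(z_s3, 2))
```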

The calculation of the Z-scores to evaluate the laboratories' proficiency [4] was based on the known concentrations of hydrocarbons in the samples (Table 2) and on a target precision set by the PLC-4 programme:

– 20% for hydrocarbon concentrations above 1 mg L−1; and
– 30% for hydrocarbon concentrations below 1 mg L−1 (near the determination limit).

The Z-scores achieved by the participants are displayed in Fig. 4 for all the samples investigated. Eleven laboratories completed the interlaboratory comparison successfully (80% of Z-scores within |Z| < 2). The between-laboratory standard deviations are better than the target values for tolerable error set by PLC-4 (Table 2). Unsatisfactory Z-scores of individual laboratories were mainly a consequence of low recoveries, in particular for samples containing interfering substances. As a consequence, correction of analytical results for recovery, regularly determined by analysis of spiked water samples, will be conducted within the PLC-4 programme.
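A sketch of the Z-score computation under these PLC-4 targets (ours; the example reuses the assigned value and overall mean for S2 from Table 2 as an illustrative laboratory result):

```python
def plc4_target(assigned):
    """PLC-4 target relative precision: 20% above 1 mg/L, 30% below."""
    return 0.20 if assigned > 1.0 else 0.30

def z_score(result, assigned):
    """Z-score against the assigned value, scaled by the target SD."""
    return (result - assigned) / (plc4_target(assigned) * assigned)

z = z_score(1.69, 2.22)      # S2: overall mean vs. assigned value
print(f"Z = {z:.1f}")        # |Z| < 2 is satisfactory
```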

Summarising the results of this study, mineral oils can be determined in water samples by solvent extraction and gas chromatography. This test item will be mandatory within the framework of PLC-4. Nevertheless, further interlaboratory comparisons should be performed in the course of PLC-4 as an external quality-assurance measure within this monitoring programme.

Acknowledgements The authors would like to thank all laboratories for participating in the interlaboratory comparison. Financial support from HELCOM is gratefully acknowledged.

Fig. 3 Youden plot of the results (transformed to an overall mean of Mtotal = 0 and a standard deviation of Stotal = 1) for sample pair S1/S3 (A) and sample pair S2/S4 (B). Satisfactory performance is represented by the displayed box (Mtotal ± 2 Stotal). Each dot represents the results of an individual laboratory marked by a letter code (* – laboratories involved in PLC-4)

Fig. 4 Z-scores of the participating laboratories (A–X) for samples S1–S4 calculated using assigned concentrations and PLC-4 target values for precision (the critical Z-value |Z| = 2 is plotted as a bold line, * – laboratories involved in PLC-4)


References

1. Helsinki Commission, Baltic Marine Environment Protection Commission (1999) Guidelines for the Fourth Baltic Sea Pollution Load Compilation (PLC-4)
2. ISO/FDIS 9377–2 (2000) Water quality – Determination of hydrocarbon oil index – Part 2: Method using solvent extraction and gas chromatography
3. ISO 5725 (1994) Accuracy (trueness and precision) of measurement methods and results – Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method
4. ISO/IEC Guide 43–1 (1997) Proficiency testing by interlaboratory comparisons – Part 1: Development and operation of proficiency testing schemes


Received: 3 April 2002
Accepted: 4 September 2002

Abstract In an effort to assess the method validation done using ICP-AES in our laboratory for potable water, an exercise of the Environmental Laboratory Approval Program organized by the New York State Department of Health, Wadsworth Center, which provided the reference material, has been undertaken for 14 trace elements and seven other chemical constituents. The certified means for the reference material and the results obtained in our laboratory are compared. The comparisons helped us assess the quality of our work. All the data from the inductively coupled plasma atomic emission spectrometer (ICP-AES) fall into the ranges specified. These data are intended to depict the quality of chemical analysis being conducted in our laboratory and to increase the level of confidence of our clientele in accepting our test reports. It should further be noted that while the technique may not be new, the model is new, and the simultaneous detection of elements required validation for those of our clientele who are only familiar with sequential AAS and AES.

Keywords ICP-AES · Environmental Laboratory Approval Program – New York Department of Health, Wadsworth Center · Trace elements · Potable water

Accred Qual Assur (2003) 8:21–24 DOI 10.1007/s00769-002-0549-9 © Springer-Verlag 2003

G. Anand Kumar · H.S. Sahoo · S.K. Satpathy · G. Swarnabala

Interlaboratory quality audit program for potable water – assessment of method validation done on inductively coupled plasma atomic emission spectrometer (ICP-AES)

Introduction

Certain major, minor and trace elements tend to accumulate in potable water, causing toxicity. The quantitative measurement of these elements helps in the diagnosis and treatment of a range of disorders. Proficiency testing materials, when studied by an analytical method, help in method improvement and validation and lead to better results, increasing the confidence of the analysts in reporting the data to the customers. Due to the complex nature of the material in question and its variable elemental composition, the need to validate the method of analysis is mandatory on the part of the analyst. The analysis can thus be authenticated with the help of studies on proficiency testing materials, increasing the confidence level for the values reported.

In an effort to meet this demand, we have procured proficiency testing material for potable water from the Environmental Laboratory Approval Program conducted by the New York State Department of Health, Wadsworth Center. The chief of this institute has initiated the collaborative analysis program for analyzing the reference material for subsequent certification with respect to major, minor and trace element concentrations. The whole study is used to analyze waters of all kinds [1–6], such as drinking water – IS: 10500/1991, packaged natural drinking water – IS: 13428/1998, packaged drinking water – IS: 14543/1998, and water quality tolerances for the processed food industry – IS: 4251/1967. The characterization of these materials includes different sample preparation methods and analytical techniques being used by laboratories worldwide. Our laboratory is participating from India with NABL accreditation. We have determined 14 constituents by ICP-AES and seven by classical methods for the present study. The aim of this paper is to highlight the potential of ICP-AES in the generation of accurate analytical data for several trace elements in potable water samples and to increase the confidence level of the customers who send us these samples.

G. Anand Kumar · H.S. Sahoo · S.K. Satpathy · G. Swarnabala (✉), VIMTA Labs Ltd, Hyderabad – 500 051, India; e-mail: [email protected]; Tel.: +91-40-7264141; Fax: +91-40-7263657


Since the programme had all the constituents determined, the data for the classical methods are also presented for completeness.

Experimental

Instrumentation

The ICP-AES used is a Varian Radial Vista (Varian Instruments, Australia) associated with a microcomputer, software, operating parameters, etc. as given in reference [7].

Materials

The water sample is obtained from the Environmental Laboratory Approval Program. We have used an ICP-AES multi-element reference standard supplied by E. Merck with Lot No. 0C030033 for standard calibration. These standards have certified data obtained from NIST as third-generation traceability, assuring reasonable accuracy in the estimations (Certificate of Analysis, Certipur – Reference Material; 11355 ICP multielement standard IV made from NIST standard reference materials, Lot No. 0C030033. Dr. Harald Untenecker, Central Analytical Laboratory, Merck). An echellogram showing all the elements is attached.

Sample procedure

To minimize the possibility of change in concentration, the ampoule is opened just before the analysis. The sample is prepared as follows: the ampoule temperature is adjusted to 20 °C prior to analysis. Approximately 900 mL of reagent water is added to a 1-L volumetric flask. Reagent-grade nitric acid (5.0 mL) is added to the volumetric flask. The flask is swirled to mix. The ampoule ID no. is recorded. The ampoule is broken and 10 mL of the sample is transferred to the flask using a pipette. Then the flask is filled to the mark with Milli-Q water and thoroughly shaken. The 14 elements are measured and the data are tabulated in Table 1.

Standard preparation

Aqueous standards are prepared from 1000 ppm E. Merck standard reference material supplied with NIST traceability. The standards are made up using 18 MΩ Milli-Q water with 1% v/v high-quality nitric acid (65%, Fluka Chemie).

The following calibration standards for a five-point calibration curve have been made, keeping in view the need to cover the range of 10, 2, 1, 0.1, and 0.01 ppm:

Std. 1: Ag, Ba, Cd, Cr, Cu, Fe, Mn, Ni, Pb, and Zn.
Std. 2: Be.
Std. 3: As, Se, and Sb.

Rinse and calibration blank solutions were prepared from 18 MΩ Milli-Q water with 5% HNO3 as per the instructions provided by the approval program.

Methods used for chemical analysis

All the chemical analysis procedures are adopted from APHA methods [8]. These are specified in Table 2 along with the analyte.

Table 1 Validation data for potable water reference materials – Environmental Laboratory Approval Program analyzed for the following elements by ICP-AES

Parameter: result (µg/L) | APHA method | study mean | acceptance limits | score
Ag: 87.0 | 3120 | 82.2 | 73.8–90.5 | Sat
As: 57.3 | 3120 | 51.1 | 44.8–57.3 | Sat
Ba: 777 | 3120 | 817.0 | 696.0–942.0 | Sat
Be: 8.74 | 3120 | 7.42 | 6.6–8.94 | Sat
Cd: 10.0 | 3120 | 8.59 | 6.9–10.4 | Sat
Cr: 99.0 | 3120 | 88.4 | 75.1–102.0 | Sat
Cu: 300 | 3120 | 328.0 | 293.0–359.0 | Sat
Fe: 541 | 3120 | 523.0 | 478.0–568.0 | Sat
Mn: 70.5 | 3120 | 74.20 | 68.50–79.80 | Sat
Ni: 219 | 3120 | 240.0 | 201.0–273.0 | Sat
Pb: 31 | 3120 | 33.80 | 23.60–43.80 | Sat
Sb: 45.95 | 3120 | 39.30 | 27.40–51.0 | Sat
Se: 130.0 | 3120 | 48.50 | 37.80–56.60 | Unsat
Zn: 479 | 3120 | 479.0 | 439.0–518.0 | Sat

Sat, satisfactory; Unsat, unsatisfactory

Table 2 Validation data for potable water reference materials – Environmental Laboratory Approval Program analyzed for the following chemical parameters

Parameter: result (mg/L) | APHA method | study mean | acceptance limits | score
Na: 12.6 | 3500 | 12.50 | 11.30–13.70 | Sat
Nitrite: 1.52 | 4500 | 1.51 | 1.28–1.74 | Sat
Chloride: 62.9 | 4500 | 59.10 | 55.10–63.10 | Sat
Fluoride: 5.1 | 4500 | 5.46 | 4.94–6.04 | Sat
NO3 as N: 7.94 | 4500 | 7.91 | 7.19–8.79 | Sat
Sulfate: 44.4 | 4500 | 48.4 | 43.4–53.4 | Sat
TDS: 228 | 2540 | 244.0 | 162.0–326.0 | Sat

TDS, total dissolved salts; Sat, satisfactory


Results and discussion

ICP-AES offers rapid, multi-element determinations. Its sensitivity is lower than that of either ICP-MS or AA-GTA, but it can handle higher levels of total dissolved solids than ICP-MS and is much faster than AA-GTA. The results obtained are compared with those from all the equipment used by the about 100 laboratories participating in the Environmental Laboratory Approval Program. The quality of the performance relating to the equipment as well as the analyst in such international collaborative programs implies the use of certified reference materials for calibration and control samples, to ensure a certain level of traceability of the measurements and, finally, a degree of assurance of reliability.

The use of reference standards for calibration minimized several analytical errors. However, our working standards also gave results very similar to the reference standards. The results obtained for the study in question are summarized in Tables 1 and 2. The organizers collected the assigned value for individual elements and provided us with the mean or range of the submitted data for over 100 laboratories using highly reliable statistical procedures. The tables give the result, mean, method, range and reliability of the parameter in question. A careful observation of these data shows the precision with which the experimental work has been conducted in the present study. Figure 1 shows the echellogram of the 23 elements in the reference standard along with the wavelengths.

A typical example of the requirements for IS: 14543/1998 [4] is: Ba, 1.0; Cu, 0.05; Fe, 0.1; Mn, 0.1; Zn, 5.0; Ag, 0.01; Al, 0.03; Se, 0.01; Ca, 75.0; Mg, 30.0; Na, 200.0; Hg, 0.001; Cd, 0.01; As, 0.05; Pb, 0.01; Cr, 0.05; Ni, 0.02; Sb, 0.005; B, 5.0 µg/L separately. Routinely all the samples are checked against these requirements before reporting. The VISTA software has a built-in program with USEPA requirements [9, 10] for the aspiration of the standard reference solution, instrument check standard solution, interference check standard solution and QC standard solution for laboratory QC control. Continuing calibration verification is done every ten analytical samples run on the Auto Sampler, a facility that is not available when using manual runs of the samples. We have adopted the Varian instrument instructions to calibrate the equipment and to find the instrument detection limits and the method detection limits. The former values were obtained by aspirating on three consecutive days for all the standards, and the latter values were obtained for seven consecutive runs on the same day for all the standards. These were also reconfirmed using the "CHEMSW" software program available for the calculation of detection limits. The original "K-style" glass concentric nebulizer is used to obtain the raw data with the best possible sensitivity. For our method validation, we have adopted a seven-point calibration range for almost all the elements, ranging between 10–0.01 mg/L. The accuracy in the recovery of the standards ranged from 95%–105%, with a precision at 1 µg/L of ±12%. Matrix spiking results in recoveries ranging between 75%–125% as per USEPA/APHA. A vapour-generation accessory (VGA 77) is used for Hg measurements. Three standards, 0.001, 0.005, and 0.01 µg/L, are used for linearity or calibration. Stannous chloride (25%) and HCl (Fluka) are used for the reduction in the cold-vapour generation.
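For orientation, a simplified sketch of a USEPA-style method detection limit calculation from seven same-day replicates (ours, not the CHEMSW or VISTA implementation; the replicate values are invented):

```python
import numpy as np
from scipy import stats

def method_detection_limit(replicates):
    """USEPA-style MDL from replicate low-level measurements:
    MDL = t(n-1, 99%, one-sided) * s; the text describes seven
    consecutive runs on the same day for each standard."""
    x = np.asarray(replicates, float)
    t99 = stats.t.ppf(0.99, len(x) - 1)
    return t99 * x.std(ddof=1)

# Invented seven replicates of a 10 µg/L Pb standard
pb = [10.2, 9.8, 10.1, 9.7, 10.3, 9.9, 10.0]
print(f"MDL = {method_detection_limit(pb):.2f} µg/L")
```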

Conclusions

It may be concluded that the observed data for 13 elements (all except selenium) and the seven other constituents specified for potable water are in concurrence with those of all the laboratories whose data were submitted to the Environmental Laboratory Approval Program, New York State Department of Health. [After the constructive criticism from the reviewers, the unsatisfactory results for the Ag, Cd, Sb and Se elements were repeated. Cd values improved on changing the wavelength from 217 to 226 nm. For antimony we used a new standard and the value improved. For silver the number of washings before aspiration was tripled and the result improved. The selenium range is still not very satisfactory, but our efforts to improve upon this are in progress and we are positive about it.] The method detection limits, RSDs, and the instrumental detection limits obtained for our instrument, together with the method validation done using these data, confirm the validity of the method being adopted. Thus the simultaneous use of the instrument for routine analysis imparts a high degree of excellence to the data being reported.

Fig. 1 Echellogram showing all 23 elements in the standard along with the wavelengths (nm):

Ag 328.068, Al 396.152, B 249.772, Ba 455.403, Bi 223.061, Ca 317.933, Cd 214.439, Cd 226.502, Co 238.892, Cr 267.716, Cr 357.868, Cu 327.395, Fe 238.204, Ga 294.363, In 230.606, K 766.491, Li 670.783, Mg 279.553, Mn 257.610, Na 589.592, Ni 231.604, Pb 220.353, Sr 407.771, Tl 190.794, Zn 213.857


Acknowledgements G.S. is grateful to Dr. S. P. Vasireddi, Founder, Chairman and Managing Director, VIMTA Labs Ltd for giving permission to publish this work. We would like to thank the Environmental Laboratory Approval Program organized by New York State Department of Health, Wadsworth Center for permitting us to publish these data.

References

1. APHA (2000) Standard methods for the examination of water and waste water analysis, 20th edn. APHA
2. (1991) IS 10500 – Drinking water
3. (1998) IS 13428 – Packaged natural mineral water
4. (1998) IS 14543 – Packaged drinking water (other than the packaged natural mineral water)
5. (1992) IS 1070 – Reagent-grade water
6. (1967) IS 4251 – Quality tolerances from water for processed food industry
7. Satyanarayana KVS, Ravi Kumar Ch, Swarnabala G (2002) Atomic Spectrosc (submitted)
8. APHA (2000) Methods 3120, 3500, 4500 and 2540. In: Standard methods for the examination of water and waste water analysis, 20th edn. APHA
9. USEPA (1994) Determination of metal and trace elements in water and wastes by inductively coupled plasma–atomic emission spectrometry – Method 200.7. USEPA, Environmental Monitoring and Support Laboratory, Cincinnati, OH, May 1994
10. Bridger S, Knowles M (2000) ICP-AES at work – ICP-29. A complete method for environmental samples by simultaneous axially viewed ICP-AES following USEPA guidelines. Varian Australia Pty Ltd


Received: 2 November 2002
Accepted: 3 February 2003

Presented at the CERMM-3, Central European Reference Materials and Measurements Conference: The function of reference materials in the measurement process, May 30–June 1, 2002, Rogaška Slatina, Slovenia

Abstract The fact that the various definitions and terminology applied to measurements in analytical chemistry are not always consistent and straightforward, not only in answering the question "what" but also "how", leads to their various interpretations. This results in non-uniform implementation of very basic and essential metrological principles in chemistry. Such a diverse situation is not conducive to the endorsement of harmonised measurements all across the world, to serve as a tool for improving the quality of life, in its broadest sense, for all its citizens. The discussion in this paper is focused on problems associated with the terminology and definitions of 'reference material' and 'validation'. The role of reference materials in measurement processes for purposes other than calibration, and validation principles in analytical chemistry, are also discussed in this paper. Where possible, potential solutions are proposed; more often, questions of essential importance are raised in order to initiate international discussion which will hopefully lead to equally understandable answers.

Keywords Reference materials · Analytical chemistry · Measurements in chemistry · Validation · Method development

Accred Qual Assur (2003) 8:108–112
DOI 10.1007/s00769-003-0598-8

© Springer-Verlag 2003

Nineta Majcen

A need for clearer terminology and guidance in the role of reference materials in method development and validation

Introduction

Internationally understandable terminology and interpretation of definitions play an essential role in the analytical community worldwide [1–3]. Language, as a tool for communication, can considerably accelerate the unification of activities or, alternatively, act as a diversifying element that could keep analysts going in various directions and not approaching the same goal. As this effect is intensified by the various 'dimensions' of the different languages that we speak, it is even more important to make the original definition as non-confusing as possible and to provide additional explanations in written and oral form to make sure that it is implemented properly. It is thus important that the terms we use are 'traceable' to the same definition and its understanding/implementation in practice, as this will result in comparable measurement results.

Only on the basis of internationally accepted definitions and terminology can the intended uses of reference materials be discussed in a way uniformly understandable across all sectors where analytical chemistry is applied. While the production of reference materials and their storage are very well described in the literature, their intended use is somehow overlooked [4–13]. The consequence is that analysts are left to their own imagination as to how to use them, which, like a boomerang, again leads to different interpretations of results measured in various laboratories.

In order to make the exchange of information based on the results of measurements obtained in analytical laboratories reasonable, measurements should be traceable to the same stated unit and thus be comparable.

N. Majcen
Republic of Slovenia, Ministry of Education, Science and Sport, Metrology Institute (MIRS), Celje/Ljubljana, Slovenia

N. Majcen (✉)
Metrology Institute of the Republic of Slovenia (MIRS), Laboratory Centre Celje, Tkalska 15, SI-3000 Celje, Slovenia
e-mail: [email protected]
Tel.: +386-3-4280754
Fax: +386-3-4280760



Neither establishing nor demonstrating the traceability of measurement results is an easy task. Moreover, there is no general rule that can be proposed which is valid and implemented in general. Validation is thus an important way to investigate and describe the traceability and uncertainty of a concrete measurement result. Since up to now it has mainly not been considered as such, some updates related to interpretations of valid definitions are further discussed in the paper.

The problems associated with terminology

Reference material

As written in the Eurachem Guide, 'it is commonplace to confuse reference material with certified reference material' [14]. Unfortunately, the situation is even worse, as there are also some other terms used to describe the same, like 'standard material', 'reference standard', 'standard reference material' etc. It is evident from the literature that in the International Vocabulary of General Terms in Metrology (VIM) at least twelve different terms are associated with objects used to define, realise, conserve or reproduce a unit or a value of a quantity [1, 15]. Up to now, several approaches have already been made to unify terminology in this field, as it is essential for many reasons (traceability, and consequently comparability, being one of them), and it is hoped that the new revision of the VIM, which is due soon, will improve the situation. Some of the most often used definitions are given in Table 1.

As reference materials can serve different purposes, it is of essential importance to agree on the terminology and definitions for each specific term. As it stands now, the definitions might be interpreted in at least two different ways, as shown in Fig. 1.

Table 1 Reference material: definitions

Document: VIM [1]; ISO Guide 30:1992 [4]
Term: Reference material
Definition: Material or substance one or more of whose property values are sufficiently homogeneous and well established to be used for (a) the calibration of an apparatus, (b) the assessment of a measurement method, or (c) assigning values to materials.

Document: VIM [1]; ISO Guide 30:1992 [4]
Term: Certified reference material
Definition: Reference material, accompanied by a certificate, one or more of whose property values are certified by a procedure which establishes traceability to an accurate realization of the unit in which the property values are expressed, and for which each certified value is accompanied by an uncertainty at a stated level of confidence. NOTE (4 of 5): All CRMs lie within the definition of "measurement standards" or "etalons" given in the VIM.

Document: VIM [1]
Term: Measurement standard, etalon
Definition: Material measure, measuring instrument, reference material or measuring system intended to define, realize, conserve or reproduce a unit or one or more values of a quantity to serve as a reference. EXAMPLES (2 of 6): standard hydrogen electrode; reference solution of cortisol in human serum having a certified concentration.

Document: EURACHEM Guide on Validation, 1998 [14]
Term: Reference material
Definition: RM can be virtually any material used as a basis for reference, and could include laboratory reagents of known purity, industrial chemicals, or other artifacts. The property or analyte of interest needs to be stable and homogeneous, but the material does not need to have the high degree of characterisation, traceability and certification more properly associated with CRMs.

Fig. 1 Terminology: reference material (RM) vs. certified reference material (CRM). Inconsistent terminology is a fertile base for various interpretations of definitions: the definitions given in Table 1 can be interpreted as in case 1 (CRMs being part of a 'family' named RM) or as in case 2 (several different types of RM, one of them being CRM). The intended use of a reference standard depends on the accepted interpretation of the definitions

Page 127: Validation in Chemical Measurement

118 N. Majcen

If the definitions are understood as given in case 1, then reference materials can be used for calibration as well. If this is not the case, which is implied in case 2, then reference materials should not be used for calibration purposes. The problem is, of course, much more complicated, but is beyond the scope of this paper.

Method development

In the literature and in practice, any change in analyte, matrix or both is described as 'method development' (better: 'procedure development'). Furthermore, this term also describes minor or major changes to part of the measurement (analytical) procedure, e.g. a different dissolution step. Nowadays, it rarely happens that a certain measurement procedure is developed from scratch as a completely new one, i.e. where a certain 'new' technique is applied to measure a 'new' analyte in a 'new' matrix. However, there is no official definition of the term 'method development' in any international document, and it seems the situation may remain as it is, since it does not cause confusion, misunderstanding and/or misinterpretation.

Procedure validation

The situation becomes more complicated and less uniform when discussing validation. Several definitions (Table 2) are offered to analysts [14, 16–17]. The EURACHEM guide also gives some additional explanation of how the definition could be understood, as well as guidance on how it might be implemented in practice. It is important to emphasise that other interpretations are also possible and that the ones given in the above-stated Guide are by no means the best by definition. Additionally, it should be stressed that the term 'procedure validation' instead of 'method validation' is used in this paper, as this wording is more exact for how it should be understood within this context and for what is actually taking place in practice.

If in the past procedure validation tended to concentrate on the process of evaluating the procedure's performance capabilities, the modern approach also confirms that the procedure under consideration has performance capabilities consistent with what the application requires. Implicit in this is that it will be necessary to evaluate the procedure's performance capabilities.

It therefore depends on each specific case which performance parameters are to be confirmed before starting the analysis. The scope of validation thus might depend to a certain extent on the customers' needs. If they are very diverse (e.g. the laboratory has a customer who is interested in lower concentrations of a particular analyte, as well as a customer who brings a sample with a higher concentration of the same analyte but requires very low measurement uncertainty), it is worth broadening the scope of validation or confirmation; otherwise it would be sufficient to evaluate/confirm the key parameters for the intended use. The general rule should be that only those parameters that need to be fulfilled due to the customers' requirements must be confirmed. However, a lot of additional measurements are usually done to evaluate the procedure's performance parameters (Table 3) and the result's properties (traceability, measurement uncertainty).

Intended uses of reference materials for quality purposes, including validation

Due to the above-mentioned problems related to terminology, it is essential to decide on the nomenclature that will be used within this paper, in order to be in a position to give a clear explanation of the purposes for which reference materials are used.

Table 2 Validation: definitions

EURACHEM Guide on Validation, 1998 [14]: Method validation is the process of defining an analytical requirement, and confirming that the method under consideration has performance capabilities consistent with what the application requires.

ISO 9000:2000 [16]: Confirmation, through the provision of objective evidence (3.8.1), that the requirements (3.1.2) for a specific intended use or application have been fulfilled. NOTE 1: The term "validated" is used to designate the corresponding status. NOTE 2: The use conditions for validation can be real or simulated.

ISO 17025:1999 [17]: Validation is the confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled.

Table 3 Validation: list of possible performance parameters to be evaluated and properties of the results obtained via validation

Performance parameters: working range, linearity; sensitivity, detection limit; repeatability; reproducibility; recovery; robustness
Properties of the result: traceability; uncertainty



Thus 'reference materials' is used as a 'family name', i.e. as given in case 1 in Fig. 1. As such, reference materials can be used for (a) establishing traceability and (b) quality assurance, which is the issue of this paper. The following aspects of quality assurance are covered by reference materials: (a) validation, (b) internal quality control, (c) external quality assessment [18–20].

Validation, as one of the milestones of ISO/IEC 17025, is required to be performed each time non-standard methods, procedures developed in-house, or methods outside their intended scope are used. Additionally, according to the same standard, every laboratory should confirm that it can properly operate standard or validated methods (procedures) before first introducing the tests, which as a consequence again includes the validation principle. The scope of validation ('objective evidence'), i.e. which of the procedure's performance parameters are to be evaluated, depends on the specific intended use, e.g. (a) compliance with regulations, (b) maintaining quality and process control, (c) making regulatory decisions, (d) supporting national and international trade, (e) supporting research. As recommended by ISO/IEC 17025, validation and/or confirmation of the procedure should be done through an uncertainty evaluation made by systematic assessment of the quantities influencing the result. This can be done either via measurement of reference materials or by other means, e.g. participation in an interlaboratory comparison or comparison of results achieved with other procedures [21].

Various types of control charts, the tool most often used for internal quality control purposes, are often based on values given by reference materials. For this purpose, in-house reference materials should preferably be used, as certified reference materials are too expensive to be used to monitor the stability of the measurement process.

For external quality assessment purposes, a reference material with an undisclosed value is distributed to the participants to demonstrate their technical performance and skills. Comparison of the results of a certain laboratory with a reference range is a 'measure' of how well the laboratory is performing in a specific field. However, participation in interlaboratory comparisons (ILCs) should be a regular activity, and the results of regular participations should be traced and compared in time. It is important to be aware that demonstrating the improvement of performance is the key idea of external quality assessment; thus there is no need to be afraid of having 'bad' results in an ILC.

Conclusion

Terminology and definitions related to reference materials and validation, as well as their implementation in practice, have been discussed in this paper. Several attempts are under way on the international scene to harmonise the definitions and their implementation in practice, which should hopefully lead to comparable measurements in chemistry on an international scale. However, as the topic is very broad in its scope and extremely complicated, progress seems to be slower than desired, though it is expected to lead to complementary and consistent definitions. Therefore, it is not yet possible to give definite answers to all the questions stated in this paper, as they are still under discussion at the international level; nevertheless, the following conclusions can be given.

1. There is an urgent need to harmonise terminology related to reference materials. Inconsistency in this field has a tremendous effect on all measurements in analytical laboratories, as written standards, analytical measurement procedures and other documents are understood in different ways. Consequently, this often leads to results which are not comparable, as they are not traceable to the same stated unit.

2. The role of reference materials for quality purposes seems to be well implemented, but people are often not aware of their possibly different role in the calibration process, which might have important consequences with regard to the result of measurement. Thus, training of practitioners and education of students in the topics described in this paper are highly desirable.

3. Analysts and end users of measurement results should be aware of the new dimension of the procedure validation definition given in ISO/IEC 17025, which requires that a procedure's performance parameters are fit for a specific intended use. In other words, this means that the work of an analyst is not finished when the performance capabilities of a specific method (or preferably 'procedure') are evaluated; he/she has to go a step further and check whether these characteristics are in compliance with the client's needs. Of course, it is up to the client to specify his/her requirements and the properties the result should have. Furthermore, ISO/IEC 17025 introduces evaluation of measurement uncertainty as a means of performing validation through systematic assessment of all quantities influencing the result.

Acknowledgements The author is grateful to Aleš Fajgelj for his comprehensive comments on the topic described in this paper. Sincere thanks also to Philip Taylor, Ewa Bulska, Emilia Vassileva, Miloslav Suchanek and Margreet Lauwaars for their contribution during fruitful discussions on validation.



References

1. International Vocabulary of Basic and General Terms in Metrology, BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993)
2. Holcombe D (2000) Accred Qual Assur 5:159–164
3. Terms and definitions used in connection with reference materials, ISO Guide 30 (1992)
4. Calibration in analytical chemistry and use of certified reference materials, ISO Guide 32 (1997)
5. Use of certified reference materials, ISO Guide 33 (2000)
6. The role of reference materials, Achieving quality in analytical chemistry, ISO Information booklet (2000)
7. The use of CRMs in the fields covered by metrological control exercised by the national service of legal metrology, OIML D18 (2002)
8. Stoeppler M, Wolf WR, Jenks PJ (eds) (2001) Reference materials for chemical analysis, certification, availability and proper usage. Wiley-VCH
9. Fajgelj A, Parkany M (eds) (1999) The use of matrix reference materials in environmental analytical processes. RSC
10. De Bièvre P (2000) Accred Qual Assur 5:307
11. De Bièvre P (2000) Accred Qual Assur 5:224–230
12. Fajgelj A (2000) Spectroscopy 15(6):19–21
13. Günzler H (ed) Accreditation and quality assurance in analytical chemistry. Springer
14. The fitness for purpose of analytical methods, A laboratory guide to method validation and related topics, EURACHEM (1998)
15. Örnemark U, Discussion paper on terminology associated with reference materials, 4E/RM (2001)
16. Quality management systems – Fundamentals and vocabulary, ISO 9000 (2000)
17. General requirements for the competence of testing and calibration laboratories, ISO/IEC 17025 (1999)
18. Neidhart N, Wegscheider W (eds) (2001) Quality in chemical measurements. Springer
19. Harmonised guidelines for the use of recovery information in analytical measurement, IUPAC, ISO, AOAC, EURACHEM, Technical Report (1998)
20. Marschal A, Andrieux, Compagon PA, Fabre H (2002) Accred Qual Assur 7:42–49
21. TrainMiC material (2002) www.trainmic.org


Received: 14 September 2002
Accepted: 11 March 2003
Published online: 23 April 2003
© Springer-Verlag 2003

Presented at the Conference "Metrology and Measurement – 2001", 24–25 April 2001, Kaunas, Lithuania

Abstract Quality control of corrosion test results implies the validation of the corrosion test method and estimation of the uncertainty of the corrosion rate measurement. The corrosion test in an artificial atmosphere of salt spray mist needs evaluation of the corrosivity of the test cabinet by reference specimens. Such calibration of the corrosion environment raises very strict requirements for the method description and the details of all procedures and specimens used. Reliable corrosion measurements by spray tests require validation of the experimental device together with the experimental procedure and determination of the corrosivity uncertainty of the test cabinet environment.

Corrosion tests have been conducted for a long time, but there are only a few cases of corrosion data quality assessment or interlaboratory comparisons for such measurements. Each test method, when used in different laboratories, gives different results, as it is impossible to perform the whole procedure in exactly the same manner. Therefore, a very essential parameter of the method is its robustness. A proper validation of the corrosion test method means the evaluation of the impact of various environmental features and performance variations on the uncertainty of the test result.

Our aim was to present an experimental evaluation of the corrosivity of the salt spray corrosion test cabinet, to indicate the gaps in the description of the corrosion test method according to ISO 9227 and to estimate the main components of the uncertainty of the corrosivity measurement.

The validation results require changes in the salt spray test method description and maybe in its performance.

Keywords Validation of method · Corrosivity · Measurement uncertainty

Accred Qual Assur (2003) 8:235–241
DOI 10.1007/s00769-003-0624-x

Eugenija Ramoškiene
Mykolas Gladkovas
Mudis Šalkauskas

Validation of salt spray corrosion test

Introduction

Corrosion tests in artificial atmospheres [1, 2, 3] are used as comparative tests for the evaluation of the corrosivity of metals and metal alloys and the corrosion protection capability of various corrosion protection means, such as metal plating, varnishing and paint coating, as well as anodic and conversion coatings. Therefore, it is essential to know precisely the corrosivity of the test cabinet environment.

The standard method for the test cabinet corrosivity determination is described in ISO 9227¹ [1], but we failed to find any information about the validation of this method or the evaluation of its metrological parameters. On the other hand, it is necessary to determine from the experimental point of view whether this technique is reliable enough as a standard method. A procedure is described in ISO 5725-1 [4] for estimating whether the standard method is sufficiently detailed and can possibly be improved.

E. Ramoškiene (✉) · M. Gladkovas · M. Šalkauskas
Institute of Chemistry, Goštauto 9, 2600 Vilnius, Lithuania
e-mail: [email protected]; [email protected]

¹ Translated and accepted as LST, a national standard of the Republic of Lithuania.




ISO 9227 [1] does not specify in detail many necessary parameters and does not determine the precision of such a test method. The precision and accuracy of corrosion determination are influenced by many factors: preparation of specimens, conditioning, removal of corrosion products, cleaning, drying, etc. In the literature on corrosion tests we failed to find any information concerning the quality of corrosion test results. The aim of this paper is to call attention to the problems in corrosion measurement data quality and the necessity to evaluate the uncertainty of measurement results. We attempted to show the main components of the uncertainty of the result in such a measurement on the basis of the experimental evaluation of the corrosivity of the spray test corrosion cabinet by means of reference specimens.

Materials and methods

An accuracy experiment can often be considered as a practical test of the adequacy of the standard measurement method. One of the main purposes of standardization is to eliminate differences between users (laboratories) as far as possible, and the data provided by the experiment should reveal how effectively this purpose can be achieved.

According to the ISO 5725-1 requirements [4], the description of the measurement method is one of the main sources of uncertainty and is therefore essential for the traceability of results. Therefore, an analysis of the description of the standard test method was performed, and some ambiguous statements or missing information for experimental procedures were pinpointed as shortcomings. The details of the experiment which do not meet the requirements of the description of the reference method were marked off as well. The result of these attempts is presented in Table 1.

Evaluation of cabinet corrosivity. In order to determine the corrosivity of the corrosion cabinet environment, eight tests were performed [5] according to the standard method of the neutral salt spray test (Table 1). The corrosion rates of the RS and the main statistical parameters, such as the number of reference samples n, the average RS mass m, the mass loss ∆m of each RS, the average RS surface area S and the surface area Sn of each RS, the mean averages of all eight experiments and their standard deviations, are presented in Tables 2a and 2b.

Corrosion rate values were calculated from the mass loss of each test, presenting a statistical array and indicating the ordered-array average value v, mode value vmode, median value vmedian, and standard deviation s of the corrosion rate.

The statistical analysis of the data was performed for the determination of outliers by means of their interquartile range:

Table 1 Comparative analysis of the standard LST ISO 9227:1997 and the supplementary standards EN ISO 7384:1998 (1) and LST ISO 7253:1998 (2), and the possibilities of their experimental realization (standard requirement | experimental realization, including shortcomings in the standard)

Corrosion cabinet
– Cabinet capacity no less than 0.2 m³ and preferably not less than 0.4 m³ | Cabinet capacity 0.4 m³
– At least two fog collecting devices in the zones of the test specimens: glass cylinders with funnels | Four collecting devices, with a diameter of 100 mm, in the corners
– Inert plastic supports for the test specimens | PMMA plastic support for the five specimens
– Automatic registration of temperature and air humidity (1) | Automatic registration not provided (a)
– Determined humidity should be kept within ±5% (1) | No data about the humidity value
– If necessary, an air circulation system shall be provided | Not provided (a)

Spraying device
– Supply of clean air of controlled pressure [(70 to 170)±0.7] kPa | Compressed air pressure within the determined interval, but supplied without filtration
– Air shall be humidified in a saturation tower at a temperature several degrees higher than that of the cabinet | Not provided (a)
– The level of the water must be maintained automatically | Provided
– Atomizers shall be made of inert material | PMMA plastic
– Adjustable baffles to obtain uniform distribution of the spray | PMMA plastic
– The level of the salt solution shall be maintained automatically | Maintained automatically

Test solution
– Sodium chloride (50±5) g in distilled or deionised water with a conductivity not higher than 20 µS/cm at (25±2) °C | Test solution prepared from analytical-grade NaCl (GOST 4233-77); conductivity of the deionised water not checked
– NaCl shall contain less than 0.001% Ni, 0.001% Cu, 0.1% NaI and 0.5% (m/m) of total impurities; the specific gravity range of the solution is 1.0255 to 1.0400 and the pH within 6.0 to 7.0 at (25±2) °C | Cu and Ni determined by AAS, both meeting the requirements of ISO 9227; pH of the prepared solution measured with millivoltmeter pH-121

Reference specimens (RS)
– RS cut from cold-rolled plates or strips of CR4 grade steel (according to ISO 3574), (1±0.2) mm thick and (50×80) mm, with faultless surface and a mat finish Ra=1.3±0.4 µm | RS of the stated parameters cut from DC 04 AMO (EN 10130) steel plates (800×1000) mm; not clear what "mat finish" means; our surface Ra=0.67±0.4 µm
– Cut edge should not be sharp | RS marked on the back side by stamping with numbers and their cut edges ground; length and width of RS measured with a vernier (graduation value 0.05 mm); thickness measured with micrometer MK-0-25 mm (GOST 6507-78) (accuracy 0.01 mm); storage of RS before pretreatment not specified

Pretreatment of RS before the test
– Clean RS immediately before use in a vessel full of an appropriate organic solvent (hydrocarbon, with a boiling point between 60 °C and 120 °C) using a clean soft brush or an ultrasonic cleaning device | RS cleaned before use in hexane, b.p. 64 °C (not clear what "immediately" means)
– Rinse RS with fresh solvent, then dry them | Rinsed with hexane and dried by fan
– Keep RS in a desiccator with a proper drying agent for 24 h (1), or dry RS over a proper time interval at proper conditions (2) | RS kept in a desiccator 30 min to 1 h; not clear what "proper drying agent" and "proper time" mean
– Weigh RS to ±1 mg | RS weighed using a second precision class laboratory balance VLP-200 g
– Protect one face of RS with a removable coating | Adhesive plastic film used

Arrangement of the RS
– Position one RS on each of four different quadrants with the unprotected face upwards at an angle of 20°±5° from the vertical | Five RS on every cabinet quadrant according to the standard requirements
– The upper edge of RS shall be level with the top of the salt spray collector | Salt spray collectors ca. 5 cm lower than the upper edge of the RS

Operation conditions in the spray cabinet
– Temperature should be (35±2) °C and shall be measured at least 100 mm from the walls | Requirement met
– The average rate of collection of solution in each collecting device, measured at least over 24 h of continuous spraying, shall be 1 ml/h to 2 ml/h, with a sodium chloride concentration of (50±5) g/l and a pH value in the range 6.5 to 7.2 | Requirement met
– The test duration is 96 h | Requirement met

Treatment of RS after the test
– Remove the protective coating | Mode of removal not specified; the protective coating stripped off
– Remove the corrosion products by immersion in cleaning solution (500 ml hydrochloric acid and 3.5 g hexamethylenetetramine; distilled water to make 1000 ml) at 20 °C to 25 °C for 10 min (according to ISO 8407 [8]) | RS rinsed with cool running water with a soft sponge, then dipped for 10 min into the cleaning solution prepared according to ISO 8407
– Thoroughly clean the RS at ambient temperature with water, then with acetone | Requirement met
– Followed by drying | Duration not specified; RS dried using a fan for (5–10) min and kept in a desiccator
– Weigh the RS to ±1 mg and calculate the mass loss in g/m² | Weighed after ca. 30 min according to the requirements

Evaluation of cabinet corrosivity
– Operation of the test apparatus is satisfactory if the mass loss of each reference specimen is (140±40) g/m² | Meaning of "±40 g/m²" is ambiguous (it may mean a confidence interval), and there is no statement on the uncertainty of the result value

(a) As far as it was beyond the ability of our equipment and laboratory provision



Table 2a Primary data (n) of the salt spray test cabinet corrosivity evaluation in eight experiments: mass loss ∆m (g), geometric surface area S (mm²) and corrosion rate v (g/m²) of the RS (average RS mass m = 31225 mg) (first four experiments)

n    Experiment 1           Experiment 2           Experiment 3           Experiment 4
     ∆m     S      v        ∆m     S      v        ∆m     S      v        ∆m     S      v
1    0.348  3972   87.7     0.408  4054   100.6    0.391  3957   98.0     0.382  3982   96.0
2    0.390  3998   97.6     0.426  4102   103.9    0.393  4004   98.0     0.434  4071   106.6
3    0.403  4000   100.8    0.425  4090   103.9    0.413  4032   102.5    0.454  4096   107.6
4    0.403  3980   101.3    0.423  4054   104.2    0.413  3988   103.2    0.441  4038   109.3
5    0.403  3976   101.5    0.431  4129   104.3    0.425  3990   106.5    0.452  4073   110.0
6    0.406  3980   102.0    0.430  4102   104.7    0.430  4038   106.6    0.445  4026   110.5
7    0.404  3953   102.2    0.432  4085   105.7    0.429  4014   106.9    0.457  4096   111.7
8    0.410  3969   103.4    0.432  4074   106.0    0.434  3985   108.9    0.468  4176   112.1
9    0.421  3982   103.8    0.435  4060   107.2    0.436  3988   109.2    0.457  4073   112.2
10   0.417  4009   104.1    0.439  4070   107.8    0.437  3991   109.5    0.439  4049   112.4
11   0.416  3991   104.3    0.440  4066   108.2    0.434  3955   109.9    0.471  4186   112.6
12   0.419  3988   105.0    0.441  4053   108.8    0.446  4023   110.8    0.452  4002   112.8
13   0.421  3997   105.4    0.453  4116   110.1    0.446  4018   111.0    0.469  4091   114.6
14   0.428  4019   106.4    0.440  3993   110.3    0.447  4026   111.1    0.474  4120   114.9
15   0.427  3988   107.1    0.455  4109   110.7    0.443  3958   112.1    0.470  4083   115.1
16   0.429  3997   107.4    0.452  4079   110.9    0.478  4138   115.4    0.469  4056   115.6
17   0.436  3996   109.2    0.436  4056   110.9    0.457  3943   116.0
18   0.433  3947   109.7    0.460  4082   112.7    0.510  3960   128.8
19   0.444  4029   110.2    0.466  4083   114.1
20   0.477  3992   119.5    0.465  4070   114.3
Average             0.417  3988.2  104.4   0.439  4076.4  108.0   0.437  4000.4  109.1   0.452  4076.1  110.9
Standard deviation  0.02   19.2    5.9     0.01   28.7    3.7     0.03   43.7    6.8     0.02   52.8    4.6

IQR = Q3 − Q1, where Q3 is the third quartile and Q1 is the first quartile, when the median divides the experimental sample into two parts. Outliers were found in experiments No 1, 3 and 6 and might have been rejected, but we did not find good enough reason for that, and they were used in the calculations of the average corrosion rate value and its standard deviation.
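The paper does not state which fence rule was applied to the interquartile range; a minimal sketch assuming the common Tukey fences (k = 1.5), run on the experiment No 1 data of Table 2a, would be:

    import numpy as np

    def iqr_outliers(values, k=1.5):
        """Flag values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
        q1, q3 = np.percentile(values, [25, 75])
        iqr = q3 - q1
        lo, hi = q1 - k * iqr, q3 + k * iqr
        return [v for v in values if v < lo or v > hi]

    exp1 = [87.7, 97.6, 100.8, 101.3, 101.5, 102.0, 102.2, 103.4, 103.8, 104.1,
            104.3, 105.0, 105.4, 106.4, 107.1, 107.4, 109.2, 109.7, 110.2, 119.5]
    print(iqr_outliers(exp1))  # [87.7, 119.5] fall outside the fences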

There is rather significant scattering of the data, and some of the eight experiments may be regarded as outliers. Therefore we have tested the null hypothesis H0 of equality of the lowest mean corrosion rate, in experiment No 1, and the highest mean, in experiment No 8. The calculated value F = s1²/s8² = 5.9²/5.0² = 1.4 for the standard deviations was compared with the Fisher distribution statistical test values

F1−α/2(n1−1, n8−1) ≤ F ≤ Fα/2(n1−1, n8−1),

where α is the significance level 0.1, and n1 and n8 are the numbers of RS in tests No 1 and No 8.

The null hypothesis is acceptable, as F1−0.05(19,19) = 0.5 ≤ F = 1.4 ≤ 2.02 = F0.05(19,19). The null hypothesis of equality of means at equal standard deviations is not acceptable when t > tα/2(n1+n8−2), where t is the two-sample Student statistic.

Ordinary statistical analysis of the corrosivity tests showed that the mean corrosion rate of the eight experiment means, v = 111 g/m², and its standard deviation, s = 4.1 g/m², estimate the neutral salt spray test cabinet corrosivity as 111±3 g/m² (95% confidence). This shows that the corrosivity of our corrosion cabinet meets the requirements of ISO 9227, which states that the average value and its data scattering should be 140±40 g/m².

For the total number of tested RS, n = 148, the mean corrosion rate v = 112 and its standard deviation s = 7.5 differ insignificantly from the mean of the results of the eight experiments, which must be regarded as separate tasks, each of which may have a specific influence on the result. The null hypothesis of equality of means when the standard deviations are unequal may be rejected when t > tα/2(k); in our case t = 0.64 < t0.05(45) = 1.684.
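For reference, the variance-ratio check above can be sketched with SciPy (a sketch of ours, not from the paper; SciPy's critical values, about 0.46 and 2.17 for F0.05(19,19), differ somewhat from the rounded 0.5 and 2.02 quoted above, but the conclusion is unchanged):

    from scipy import stats

    # Standard deviations and sample sizes of experiments No 1 and No 8
    s1, n1 = 5.9, 20
    s8, n8 = 5.0, 20

    F = s1**2 / s8**2                                  # 1.39, i.e. ~1.4
    alpha = 0.1                                        # significance level used above
    f_lo = stats.f.ppf(alpha / 2, n1 - 1, n8 - 1)      # lower critical value
    f_hi = stats.f.ppf(1 - alpha / 2, n1 - 1, n8 - 1)  # upper critical value
    print(f_lo <= F <= f_hi)                           # True: equal variances not rejected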

For a more advanced analysis it is necessary to calculate the influence of many other factors and express it as the uncertainty of the result. The next part of this article deals with this task on the basis of our experimental data.

Estimation of corrosivity uncertainty. The corrosion rate (v) of the reference specimens (RS) at the stated conditions according to ISO 9227 in the neutral salt test cabinet may be expressed as follows:

v = ∆m/(S·t),

where ∆m is the mass loss during the test, S is the area of the reference specimen and t is the duration of the corrosion test (according to the standard method the test duration is 96 h).

Therefore, the corrosivity of the cabinet was evaluated as v = ∆m/S in g/m² per 96 h in the eight experiments. The analysis of the main uncertainty sources according to recommendations [6, 7] was performed (Fig. 1) for the evaluation of the possible value of the uncertainty of the corrosivity measurement result. Each main source of uncertainty (mass loss ∆m, surface area S and duration t) was analysed and calculated separately, and these components were used for the combined and expanded uncertainty calculation.
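As a numerical cross-check (our own sketch; the conversion from mm² to m² is the only assumption), the first specimen of experiment No 1 in Table 2a reproduces this expression:

    def corrosivity(delta_m_g, area_mm2):
        """Corrosion rate v = delta_m / S in g/m2 per 96-h test."""
        return delta_m_g / (area_mm2 * 1e-6)  # convert mm2 to m2

    print(round(corrosivity(0.348, 3972), 1))  # 87.6 g/m2 (Table 2a lists 87.7)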

Uncertainty of mass loss measurement. The standard method of the neutral salt spray test does not indicate the mass of the RS. The mass loss was found as the difference between the RS prepared for the corrosion test and the RS after the corrosion test and corrosion product stripping, as well as protective coating removal from the RS (Table 1). Such a mass loss determination is based on three components: (1) mass loss determination by weighing (accuracy 0.5 mg and standard deviation 0.3 mg) before the neutral salt spray test and after it, (2) determination of the difference and (3) corrosion product stripping (accuracy 0.5 mg and standard deviation 0.9 mg).




Table 2b Primary data (n) of the salt spray test cabinet corrosivity evaluation in eight experiments: mass loss ∆m (g), geometric surface area S (mm²) and corrosion rate v (g/m²) of the RS (average RS mass m = 31225 mg) (continuation, second four experiments)

n    Experiment 5           Experiment 6           Experiment 7           Experiment 8
     ∆m     S      v        ∆m     S      v        ∆m     S      v        ∆m     S      v
1    0.415  4022   103.2    0.404  4092   98.7     0.425  4017   105.7    0.455  4067   111.9
2    0.425  4005   105.5    0.421  4094   102.8    0.430  4160   103.4    0.464  4012   115.6
3    0.425  3968   106.7    0.427  4105   104.1    0.415  3968   104.7    0.496  4000   117.2
4    0.424  3972   106.8    0.417  3960   105.4    0.423  3994   105.9    0.468  3997   117.3
5    0.437  4042   108.2    0.437  4097   106.7    0.437  3986   109.6    0.468  3981   117.5
6    0.435  3982   109.2    0.446  4096   108.8    0.439  3969   110.7    0.467  3967   117.6
7    0.444  4064   109.2    0.450  4107   109.6    0.443  3950   112.2    0.479  3977   119.9
8    0.436  3966   110.0    0.449  4081   109.9    0.454  3995   113.6    0.485  4014   120.8
9    0.442  3963   111.6    0.453  4099   110.4    0.459  3985   115.3    0.483  3985   121.2
10   0.448  3966   112.1    0.452  4082   110.8    0.460  3976   115.6    0.491  3999   122.8
11   0.458  4027   113.8    0.453  4001   113.2    0.464  4009   115.7    0.502  4059   123.8
12   0.460  4018   114.4    0.472  4125   114.4    0.463  4003   115.8    0.499  4014   124.4
13   0.455  3971   114.5    0.479  4084   117.3    0.466  3986   116.8    0.496  3985   124.5
14   0.461  4011   114.9    0.487  4099   118.9    0.471  3957   119.0    0.500  4005   124.8
15   0.458  3953   116.1    0.484  4046   119.7    0.492  3945   124.8    0.498  3964   125.6
16   0.462  3964   116.6    0.499  4130   120.9    0.501  3993   125.6    0.501  3971   126.1
17   0.470  4018   117.1    0.492  4056   121.2                           0.522  4061   128.5
18   0.467  3960   117.9    0.500  4098   122.1                           0.521  4034   129.1
19   0.481  4017   119.7    0.513  4069   126.1                           0.515  3973   129.7
20                                                                        0.517  3971   129.8
Average             0.448  3994.2  112.0   0.460  4080.1  112.7   0.453  3993.3  113.4   0.491  4001.8  122.4
Standard deviation  0.02   31.8    4.5     0.03   40.0    7.3     0.02   47.4    6.4     0.02   31.2    5.0


Five samples were cleaned, dried and weighed (with ±1 mg accuracy) for the determination of RS unprotected surface dissolution in the corrosion product stripping solution containing the hydrochloric acid. After 10 min dipping in the hydrochloric acid solution, they were rinsed at room temperature with water and acetone and dried before weighing. The average mass loss was 2 mg and its standard deviation was 0.9 mg.

The value of such a systematic error, if used for the result correction, diminishes the mass loss value and influences the standard uncertainty of the result with its standard deviation (in our case 0.9 mg). Generally, it is possible to use the relative uncertainty uc = 0.4.

The uncertainty of the difference between the dissolution rates in the corrosion product stripping solution of the corroded surface and of the corresponding surface protected from corrosion but unprotected in the stripping solution was not determined. A special investigation should be conducted and instructions added to the method documentation for proper uncertainty determination. The calculated standard uncertainty u(∆m) = 0.027 g refers to the mass loss of 0.4 g determined as an average from 148 reference specimens after corrosion in the neutral salt spray cabinet within 96 h.

Fig. 1 Main uncertainty sources of corrosivity measurement as rate of mass loss in a neutral salt spray cabinet, expressed as v = ∆m/S




Uncertainty of surface area determination. According to the description of the standard test method, the RS should be plates (50×80×1 mm); the total geometric surface area of such samples is 8260 mm². The corrosion test in the neutral salt spray environment is performed only on one side of the plate, and the other side of the plate is protected by an adhesive plastic film; thus the treated geometric surface area is reduced to 4260 mm². About 6% of this area is the surface of the plate edges, the structure of which depends on the plate formation technique (cutting, slashing, clipping etc.). Thus, the corrosion rate of such a surface may be different from that of the plane surface.

The insulation tightness and adherence of the adhesive plastic film to the protected surface of the plate contribute to the total surface area under the corrosion test. It may be assumed that such unevenness of the surface protection occurs within a one-millimetre-wide zone along the whole perimeter, and it makes up ca. 6% of the total geometric surface area.

The measurement accuracy of the geometric surface area should be evaluated as ±2.2 mm² according to the standard method, where it is stated that the RS must be 50×80 mm, which may be interpreted as a measurement with accuracy ±0.1 mm.

For the uncertainty determination of the geometric surface area, calculations of the combined uncertainty were performed on the basis of the experimental measurements and the analysis of a full set of parameters (up to 10); each of them may contribute to the combined uncertainty of the total geometric area of the surface of the test plate according to the recommendation [7]. Experimentally we estimated only seven parameters contributing to the uncertainty: geometric surface area, length, width, thickness, completeness/evenness of insulation, surface roughness and its anisotropy [5].
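The combination referred to here is presumably the standard propagation law of [6, 7] for uncorrelated input quantities; writing the surface area as A = f(x1, …, xn) of the measured parameters, it reads:

\[ u_c^2(A) = \sum_{i=1}^{n} \left( \frac{\partial A}{\partial x_i} \right)^{2} u^{2}(x_i) \]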

The geometric surface area is only a basic parameter for the corrosion rate expression in the normalized unit g/m² within the normalized test duration. An actual surface area is mentioned [2], but not defined in the standards [1, 2]; therefore we use the geometric surface and try to evaluate the influence of various factors that increase the real surface area. The corrosion rate depends on such surface parameters as surface roughness and its anisotropy, evenness of material composition on the surface, pretreatment of the surface, storage of samples before the corrosion test, and so on. Inasmuch as these parameters have no proper units for quantitative estimation, it was impossible to evaluate their contribution to the combined uncertainty of the tested surface area. The nominal value of surface roughness in the standard method is indicated as Ra = 1.3±0.4 µm, but in our case Ra = 0.67±0.01 µm, with an anisotropy Rai = 0.60 µm (along the plate) and Ras = 0.74 µm (across the plate).

The combined uncertainty of the geometric surface area, calculated from the experimental data, is 2.4×10³ mm² or 2.4×10⁻³ m². It is of the same order as that calculated from the data given in the standard method, but the expanded uncertainty is twice as large as the combined uncertainty according to the indicated precision of measurements in the standard method. Thus, the multi-measurement total data scattering is larger than the uncertainty of measurements of a single plate.

The most significant components of the surface area uncertainty are the thickness of the corrosion test plate and the roughness of its surface; anisotropy may contribute up to 90%.

Uncertainty of corrosion duration. In the standard method the normal corrosion test duration is indicated as 96 h, which means that the measurement accuracy is ±1 h. Our experiments were performed with ±0.2 h accuracy, since the corrosion cabinet was opened for several minutes every 24 h to check the quantity of the collected mist. These interruptions were not included in the corrosion duration. The duration of the corrosion test in the neutral salt spray cabinet was measured from the moment when the RS, placed in the cabinet, reached the indicated corrosion conditions.

It was assumed that the uncertainty of the corrosion test duration comprised two components: (1) the accuracy which was indirectly indicated in the standard method description and (2) our experimental possibility to keep the total duration of the corrosion process in the neutral salt spray environment within limits of ±0.2 h.

The estimation of the corrosion process duration (without any consideration of the peculiarities of the corrosion process in the test cabinet environment and experimental conditions) gives the value of 0.14 h.

Uncertainty of reference specimen corrosion rate. According to the formula of specific uncertainty addition for the corrosion rate calculation, its uncertainty has three components:

– Uncertainty of mass loss measurement
– Uncertainty of surface area measurement
– Uncertainty of corrosion process duration

Therefore, in our case (Table 3), the combined uncertainty of the corrosion rate uc(v), obtained from the corresponding relative uncertainties uc(∆m), uc(S) and uc(t), has the value 0.97:
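Assuming the "formula of specific uncertainty addition" is the usual quadrature of relative uncertainties for the quotient model v = ∆m/(S·t), it reads:

\[ u_c(v) = \sqrt{u_c^{2}(\Delta m) + u_c^{2}(S) + u_c^{2}(t)} \]

(The three components of Table 3 alone combine in quadrature to about 0.84, so the quoted value of 0.97 presumably includes further contributions.)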

The expanded uncertainty U with a 95% confidence is U = 111 × 2 × 0.97 ≈ 215 g/m² for the experimentally determined average value of the corrosion rate of 111 g/m² within 96 h.

Results and discussion

Additional specifications from other corrosion test standards [2, 3] are required for the standard method of the accelerated corrosion test in the neutral salt spray test cabinet at 35±2 °C within 96 h. For the evaluation of corrosion data quality, the test should be performed according to the requirements of contemporary standards such as ISO/IEC 17025 [9] and, therefore, the corrosion test data uncertainty must be determined.

The experimental possibilities and technical conditions of the corrosion test cabinet allow us to present preliminary metrological parameters of such measurements.

Table 3 Uncertainty components (u: standard uncertainty; uc: relative uncertainty) of corrosivity measurement evaluated as rate of mass loss in a neutral salt spray test cabinet

Component                            Value (a)      u              uc
Mass loss (change in mass), ∆m       0.4 g          0.027 g        0.59
Geometric surface area, S            4×10⁻³ m²      2.4×10⁻³ m²    0.60
Duration of corrosion process, t     96 h           7.1×10⁻¹ h     7.4×10⁻³

(a) Data values are approximated to one significant digit



Many parameters are rather sensitive to some factors of the corrosion environment conditions and the performance procedure.

The experimental data reveal that the corrosivity of the test cabinet is greatly influenced by the procedure of performance and by the design parameters of the test cabinet. The quality of the hydrochloric acid used for the corrosion product stripping, as well as the other parameters of the materials used, which should be checked by additional experiments, also greatly influence the quality of the experimental data.

The analysis of the experimental data shows that the average value 111 g/m² of all corrosivity data (improved by rejecting outliers) corresponds to the value 140±40 g/m² indicated in the standard. The evaluation of the expanded combined uncertainty U with factor k = 2 gives, for the corrosivity measurement, the value of ±215 g/m² (at 95% confidence). This means that our data uncertainty is five times higher than that specified in the standard as the data scattering interval ±40 g/m², and seven times as wide compared with the statistical confidence interval of our own experimental corrosivity data (Tables 2a and 2b). The main components of the combined uncertainty are the mass loss and surface area determinations.

The analysis of the uncertainty components of the RS surface area shows that surface roughness (especially its anisotropy) greatly influences the resulting value. A disadvantage of the standard description is that there is no instruction on how to manage surface roughness when its value differs from Ra = 1.3±0.4 µm. The surface roughness of our specimens was ca. 0.7 µm, i.e. half of that stated in the standard [1]; therefore, it is possible that the average value of the established corrosivity, 111±215 g/m², corresponds to the lower indicated limit of the standard value 140±40 g/m², i.e. 100 g/m².

Conclusions

Corrosion testing by the salt spray standard procedure reveals that the prescriptions given in ISO 9227 need some improvements in specifications and cross-references to provide a link with ISO 7384 and ISO 7253.

The experimentally determined value of the spray cabinet corrosivity (in 96 h), 111±3 g/m², falls within the value interval 140±40 g/m² indicated in the standard method. The standard deviation of the experimental data scattering is rather small (s = 4.1 g/m²); the calculated expanded uncertainty (±215 g/m²), however, exceeds the interval indicated in the standard method.

The analysis of the uncertainty sources and components reveals that the uncertainty value of the spray cabinet corrosivity depends on the measurements of the surface area of the RS and the mass loss in the corrosion processes.

If it is necessary to minimize the uncertainty of the corrosivity result, the surface area determination and surface roughness specification should be improved, as well as the mass loss determination, which is based on a rather complicated procedure of corrosion product removal.

For a more comprehensive evaluation of the corrosivity measurement uncertainty, it is necessary (1) to carry out an experimental investigation of the influence of each particular factor of the corrosion test cabinet environment and experimental procedure, and (2) to construct a more detailed measurement model.

References

1. ISO (1990) Corrosion tests in artificial atmospheres – salt spray tests. ISO 9227, ISO, Geneva, Switzerland
2. ISO (1995) Corrosion test in artificial atmosphere – general requirements. EN ISO 7384, ISO, Geneva, Switzerland (ISO 7384:1986)
3. ISO (1996) Paints and varnishes – determination of resistance to neutral salt spray (fog). ISO 7253, ISO, Geneva, Switzerland
4. ISO (1994) Accuracy (trueness and precision) of measurement methods and results – Part 1: general principles and definitions. ISO 5725-1:1994/Cor.1:1998, ISO, Geneva, Switzerland
5. Ramoškiene E, Gladkovas M, Šalkauskas M (2001) Matavimai 2(18):13–17 (in Lithuanian)
6. EURACHEM/CITAC (2000) Quantifying uncertainty in analytical measurement, 2nd edn. EURACHEM/CITAC Guide, EURACHEM/CITAC, Teddington, UK
7. Barwick VJ, Ellison SL (2000) Protocol for uncertainty evaluation from validation data (LGC/VAM/1998/088). LGC, Teddington; www.vam.org.uk
8. ISO (1991) Corrosion of metals and alloys – removal of corrosion products from corrosion test specimens. ISO 8407, ISO, Geneva, Switzerland
9. ISO/IEC (1999) General requirements for the competence of testing and calibration laboratories. ISO/IEC 17025, ISO, Geneva, Switzerland


Received: 25 July 2003
Accepted: 27 November 2003
Published online: 5 February 2004
© Springer-Verlag 2004

Presented at BERM-9, Ninth International Symposium on Biological and Environmental Reference Materials, 15–19 June 2003, Berlin, Germany.

Abstract For implementation offood and feed legislation, there is astrong need for development andharmonisation of reliable, validatedand if possible, robust and simpleanalytical methods. In addition, pre-cise methods used for measuring theexposure of humans to certain typesof food contaminants and residues(natural, man-made or produced dur-ing technological treatment) such as,e.g. mycotoxins, acrylamide, pesti-cides and allergens have to be avail-able, in order to compare results de-rived from monitoring studies. Meth-ods should be validated (in-house orin a collaborative trial) according toharmonised protocols and good labo-

ratory practice must be in place inorder to be compliant with interna-tionally harmonised standards. Theway in which this is implementeddepends strongly on the analyte, in-terference within the food matrix andother requirements that need to bemet. Food and feed certified refer-ence materials, when matrix matchedand containing the appropriate con-centration of the certified substance,are an extremely useful tool in vali-dation of measurements.

Keywords Food analysis · Food safety and quality · Method validation · Certified reference materials

Accred Qual Assur (2004) 9:253–258
DOI 10.1007/s00769-004-0767-4

Margreet Lauwaars
Elke Anklam

Method validation and reference materials

Introduction

Food safety and quality is an issue that concerns every citizen in the European Union, and of course worldwide this is covered by the Joint FAO/WHO Codex Alimentarius Food Standards Programme. Food safety in its scientific meaning regards human health issues. Distinctions have to be made between food effects that impair the immediate health state of a person (i.e. cause illness within hours or days) and adverse effects which manifest themselves only after a prolonged period (months and years).

Food quality is one of the most important factors determining the consumer's perception and acceptance, attraction and purchase of a product. The consumer expects a wide range of competitively priced food products of consistently high quality. Market competition and consumer pressure to move away from the use of highly processed food products towards more "natural" food have motivated the food industry in the development of novel foods, ingredients and processes. Authenticity proof and detection of fraud, together with assessment of compliance with labelling, are therefore in the service of the consumer.

Demand-driven re-active and pro-active food control is crucial to ensure consumer protection. As animal feed is the prime source of contamination entering food, its safety is also important to ensure the health of livestock and the safety of animal-derived food products.

Therefore, food and feed legislation has been put in place at both national and European levels for some time. For its implementation, there is a strong need for the development and harmonisation of reliable, validated and, if possible, simple analytical methods.

In the European Union, the authorities in the Member States that are competent to perform official controls shall meet operational criteria that guarantee their efficiency, effectiveness and impartiality [1].

M. Lauwaars · E. Anklam (✉)
European Commission, Directorate General Joint Research Centre, Institute for Reference Materials and Measurements, Food Safety and Quality Unit, Retieseweg 111, 2440 Geel, Belgium
e-mail: [email protected]
Tel.: +32-14-571316
Fax: +32-14-571783



The Directive on the subject of additional measures concerning the official control of foodstuffs [2] (to be amended by [1]) demands that official food control laboratories use validated methods of analysis whenever possible. For this reason, analytical methods used by enforcement laboratories for the implementation of legislation must be subjected to validation procedures, in order to show that the method produces reliable results. Methods need to provide accurate, repeatable and reproducible results within and between different laboratories. This is extremely important in view of legal actions and trade specifications, as well as for monitoring or risk assessment studies. Method validation is done to check the performance of a method and assess its performance characteristics. The precision of the method is likewise important as a "fit for purpose" requirement.

Food and feed reference materials (FF-RMs), and especially certified reference materials (CRMs), play an important role in the verification of the accuracy of analytical measurements. They can be used as part of the measurement uncertainty estimation and to assess the traceability of the analytical result. CRMs are also used in several cases for the calibration of the instrumental set-up.

As food and feed commodities represent a very difficult matrix, it is a challenge for food analysts to be able to detect and quantify all kinds of food constituents, additives, residues and contaminants, including their metabolites, at all possible concentration levels. Moreover, sometimes the analytes are not even very well characterised, such as total fat, water, total carbohydrates and total proteins. These food constituents consist of a complex mixture of chemical substances (e.g. for fat it can be the sum of triglycerides, phospholipids, glycolipids, monoglycerides and diglycerides). In the case of so-called defining methods (e.g. the drying-oven method for water or the determination of total nitrogen content for proteins), the type of method applied itself defines the substance(s) analysed. Another difficulty in food analysis is that the analyte may be strongly bound to the matrix, which influences the extraction efficiency. Transformation of original food constituents during food processing or storage may also occur. Some feed additives may form metabolites in animals, and those substances therefore have to be looked for in animal-derived food products. There is a "cry" from the official food and feed control laboratories for many more CRMs, or at least test materials (TMs), than exist to date.

Impact of validated methods and reference materials on implementation of food safety legislation

Official food laboratories

As laid down in European legislation [1], analysis of food samples taken during official controls shall be carried out by laboratories designated for that purpose by the competent authority. Any laboratory that is assessed and accredited in accordance with European Standards developed by the European Committee for Standardisation (CEN) shall be eligible for designation as an official laboratory. According to [3], the standards are:

– EN ISO/IEC 17025 on "General requirements for the competence of testing and calibration laboratories"
– EN 45002 on "General criteria for the assessment of testing laboratories"
– EN 45003 on "Calibration and testing laboratory accreditation system—general requirements for operation and recognition".

The ISO 17025 standard [3] describes monitoring the quality assurance of test and calibration results by, amongst other means, the regular use of CRMs and/or internal quality control using secondary reference materials, and by participation in inter-laboratory comparisons or proficiency testing programmes.
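As a minimal illustration of such internal quality control, the sketch below checks a single result on a secondary reference material against Shewhart-style warning and action limits. All values are hypothetical, and the 2s/3s limits are a common control-charting convention rather than a requirement of ISO 17025 itself.

```python
# Minimal sketch (hypothetical values): judging a routine QC result against
# Shewhart-style warning (2s) and action (3s) limits around the long-term
# mean obtained on a secondary reference material.
reference_mean = 5.20   # mg/kg, long-term mean on the reference material
s = 0.08                # mg/kg, long-term standard deviation

def qc_status(result: float) -> str:
    deviation = abs(result - reference_mean)
    if deviation > 3 * s:
        return "action: stop and investigate"
    if deviation > 2 * s:
        return "warning: watch the following runs"
    return "in control"

print(qc_status(5.31))  # -> in control
print(qc_status(5.47))  # -> action: stop and investigate
```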

Methods

Validated food analysis methods are used to ensure compliance with food legislation in the internal EU market and in global trade. They serve likewise to detect fraud, to test the authenticity of specifically labelled food products and to monitor specific substances for exposure assessment (e.g. the EU pesticides programme).

Due to the demand for reliable and comparable methods, performance requirements have been established at national and international levels for the implementation of official methods, e.g. by European legislation, by CEN or AOAC International (the Association of Analytical Communities), and worldwide by the Codex Alimentarius Commission (CAC). Thus any method proposed for official purposes must be validated in a collaborative trial study, resulting in defined method performance characteristics [4]. The framework for the design and conduct of such collaborative trial studies, as well as their statistical evaluation, is also defined in appropriate protocols [5]. Any method that has been successfully validated according to these protocols can be recognised as an official method for use in legal cases or for international trade control purposes.

Food methods validated by a collaborative trial study, and those validated using the single-laboratory approach, have been adopted as national and international standards by, e.g., CEN, the International Organisation for Standardisation (ISO), AOAC International and the Joint FAO/WHO Codex Alimentarius Food Standards Programme. A number of EN Standards developed by CEN relate to the organisation of controls. It is, however, important to keep in mind that, in addition to the method performance criteria, economic and prevention-strategy aspects are also important in method development.



Demands for fast and efficient procedures and the possibility of automation have led to the development and validation of rapid screening methods, in addition to the well-established official confirmatory methods.

Reference materials

FF-RMs can be CRMs, proficiency test materials (PTMs), test materials and other standard materials. CRMs are used for in-house verification of an externally fully validated method, as is required by ISO 17025, as well as for the assessment of the analytical recovery of a method [6].

It is essential that FF-RMs are as similar as possible to the "real" samples, as also holds true for other CRMs, e.g. environmental samples [7]. This is often not possible owing to the very complex and delicate food and feed matrices, especially as CRMs must remain stable over a certain period of time. In food safety and quality, CRMs play an important role in the analysis of, e.g., microbial contamination of food, food contaminants such as mycotoxins, dioxins and polychlorinated biphenyls (PCBs), and food residues such as veterinary drug residues (hormones, antibiotics) and pesticides. CRMs for food quality control are important for the analysis of food constituents such as fat, sugar and protein content, or of typical indicators of food origin (e.g. stable isotopes in wine). In addition, FF-RMs are used in proficiency testing, although most of this testing is done with non-certified materials (PTMs).

An additional and extremely useful application of FF-RMs is the calibration of the analytical results achieved (e.g. in monitoring studies). Although this is sometimes considered an expensive solution, FF-RMs can save time, add to the reliability of validation data and thus prove of enormous economic value.

The use of reference materials that fulfil the requirements concerning matrix and analyte concentration provides a link to a stated reference. This traceability is the property of a result of being related to the International System of Units (SI) or another stated reference. A CRM is also a tool for determining the selectivity and specificity of a method.

When an FF-RM, and especially a CRM, is desired for the validation of a measurement procedure, such material is not always readily available. This is especially the case for CRMs with naturally incurred contaminants or residues. Unfortunately these are still rare and often not matrix-matched. Examples of areas in urgent need of such FF-RMs are given in the following section.

Also, the CRM needs to contain the certified component at the level required for the measurement, with a stated uncertainty appropriate to the intended use. A minimum sample portion may be recommended for a compatible CRM.

CRMs and method validation—selected examples in food safety and quality

Persistent chlorinated compounds

The occurrence of high concentrations of PCBs and dioxins in food and feed in Belgium a few years ago brought about a sudden drop in confidence in food safety. More extensive monitoring of the levels of PCBs and dioxins in food started in individual Member States immediately after the public announcement. This revealed a strong need for fast analytical methods for screening purposes. It was also found that Member States' enforcement laboratories applied varying analytical methods, which indicated the need for a comparison of results from different laboratories [8]. Some CEN standards are available that offer a number of strategies for the extraction, clean-up and quantitative analysis of PCBs [9]. However, these methods do not determine PCBs exclusively and are rather time-consuming. For this reason a simplified method based on gas chromatography with mass spectrometric detection (GC-MS) for the determination of the specific PCBs was developed and validated in-house on various food and feed matrices [10]. In this single-laboratory method validation study, both spiked food and feed samples and CRMs were used, namely BCR 349 (cod liver oil) and BCR 450 (milk powder), both certified for the content of PCB congeners. The recoveries of the individual PCB congeners varied from 88 to 107%, in good agreement with internationally recognised criteria for the performance characteristics of analytical methods. It must be noted that one of the CRMs applied was only a contaminated oil, whereas most contaminated food products in the study needed special treatment or fat extraction from the matrix prior to measurement. The sample treatment or extraction is obviously a major contributor to the estimated uncertainty. By avoiding the extraction step, the stated method uncertainty may be substantially lower than in reality.
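A minimal sketch of such a recovery check follows. The certified value and replicate results are made-up illustrations, not the actual BCR 349/450 data, and the 80–110% acceptance window is likewise only an example.

```python
# Minimal sketch (hypothetical values): verifying analytical recovery
# against the certified value of a CRM.
from statistics import mean

certified_value = 140.0                    # ng/g, hypothetical certified content
measured = [128.5, 133.2, 130.9, 135.1]    # ng/g, replicate results on the CRM

recovery_pct = 100.0 * mean(measured) / certified_value
print(f"Mean recovery: {recovery_pct:.1f}%")

if not 80.0 <= recovery_pct <= 110.0:      # example acceptance window only
    print("Recovery outside the acceptance window - investigate bias.")
```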

The GC-MS technique was applied in the Member States as much as possible, but its capacity for large numbers of samples was limited during this calamity and its costs were rather high. Therefore the need for a rapid screening method to detect PCBs and dioxins became apparent. In order to evaluate the potential of an immunoassay-based method for the rapid screening of PCBs in food and feed, an in-house validation study was carried out [11]. Again CRMs (cod liver oil CRM BCR 349 and mackerel oil CRM BCR 350) were used to ensure the traceability of the measurement results. The immunoassay incorporates a rapid sample-processing protocol, which has been optimised to detect concentrations of PCBs in animal fat at or above 200 ng/g. This immunoassay has been shown to be suitable for the rapid determination of PCBs with a high sample throughput and minimal hazardous solvent waste.



Due to the nature of both CRMs (oils), several extraction steps in the sample treatment for food and feed were omitted, which certainly leads to a lower measurement uncertainty than would be expected for a real and complex food or feed matrix.

It is interesting to note that there was a significant increase in the sales of PCB reference materials available from the Institute for Reference Materials and Measurements (IRMM) of the European Commission's Directorate General Joint Research Centre (JRC). The IRMM is responsible for the storage of all BCR CRMs and also produces in-house food- and feed-related CRMs as well as other materials. Sales of CRM BCR 349 (cod liver oil) increased more than tenfold in mid-1999 (just after the discovery of the Belgian dioxin crisis) in comparison to 1998. In order to provide a more realistic CRM, IRMM prepared, immediately after the announcement of the Belgian crisis, PCB-contaminated pork fat samples and PCB proficiency-testing calibration standards for the quality control of data produced by the analysing laboratories [12, 13]. In conclusion, it can be stated that different analytical methodologies can be applied to the determination of PCBs, provided certain quality criteria are fulfilled, and CRMs have proven to be a very useful tool in assessing these methods.

Cocoa butter equivalents in chocolate

The new European Chocolate Directive [14] allows the addition of up to 5% of vegetable fats other than cocoa butter (CB), the so-called cocoa butter equivalents (CBEs), in chocolate. CBEs resemble the chemical composition and physical properties of CB very closely, making them extremely difficult to quantify and, especially at very low levels, in some cases even to detect. There is a perceived need within official control laboratories for reliable analytical methods for the quantification of CBEs in chocolate (around the 5% level), as Member States' laws and administrative provisions needed to comply with the new Chocolate Directive before August 2003. All proposed analytical methods have been evaluated by the JRC in collaboration with EU expert laboratories [15]. The performance of several methods has been compared, and a final method based on the analysis of the main components, the triglycerides, has been proposed for further validation.

A cocoa butter CRM has been prepared in the course of this project in order to facilitate the work of the analytical chemist [16]. The CRM IRMM 801 aims to ensure high comparability of the analytical results achieved. It was used as a calibrant for the establishment of a standardised database containing data from more than 74 different CBs and 94 CBEs. This resulted in a simple equation that testing laboratories can apply to detect CBEs in mixtures with CB and in plain chocolate.

For two methods based on gas–liquid chromatography, standardised method descriptions have been prepared, and both methods were recently validated in a collaborative study [17, 18]. One method has been tailored to detect CBEs in CB and confectionery products down to a level of 0.4% relative to the final product, chocolate (assuming a fat content of 20%), thus limiting false-negative and false-positive results. The other method was aimed at the determination of CBEs at the 5% level in chocolate. In both validation studies, the CB CRM IRMM 801 was applied as a calibrant.

In conclusion, with the two validated methods and the cocoa butter CRM now at hand, the implementation of the EU Chocolate Directive has been made feasible, at least for the assessment of the amount of CBEs in plain chocolate.

Peanut allergens in food products

The European Commission has recognised the problem of food allergens and has recently proposed an amendment to the European Food Labelling Directive imposing that all ingredients intentionally added to food products must be included on the label [19].

However, correct labelling requires knowledge of the concentration of the allergens and of their behaviour during processing. In addition, the assessment of compliance with labelling requires suitable analytical procedures, which should be validated either by an internationally accepted in-house testing protocol or by a collaborative study.

For this reason, CEN has recently established a new working group on food allergens in order to standardise the analytical methodology available so far (CEN TC 275 WG 12). The working group concluded that there are currently no collaboratively studied methods available for the analysis of allergens in the low ppm range.

The detection, and especially the quantification, of allergens in processed food products can be very difficult, as they are often present in trace amounts only, or are masked by the food matrix. It has been shown that a level of 100 µg of peanut proteins can already trigger a mild reaction in a peanut-allergic person. This could be caused by the consumption of 100 g of chocolate or biscuits containing 1 mg/kg of peanuts.
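The dose arithmetic behind this example is straightforward (treating the labelled peanut content directly as protein, a worst-case simplification):

$$100\ \mathrm{g\ food} \times 1\ \mathrm{mg\ peanut/kg\ food} = 0.1\ \mathrm{kg} \times 1\ \mathrm{mg/kg} = 0.1\ \mathrm{mg} = 100\ \mu\mathrm{g\ peanut}$$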

The IRMM has recently started the validation of presently available methods (mostly based on enzyme-linked immunosorbent assays) in a collaborative trial study with real food samples, such as cookies and dark chocolate containing peanuts in minute amounts (1–20 mg/kg). These methods have already been validated in-house [20].

The capability to detect any unintentional contamination of food products that usually do not contain peanuts is especially important for peanut-allergic patients.



Detection limits for peanut allergens probably need to be 1 mg/kg or even lower.

Validated methods are just as important as the availability of reference materials for allergens [21]. As peanut allergy is highly prevalent and peanut products may enter into the production of various food matrices, e.g. chocolate, ice cream, biscuits and breakfast cereals, it is essential to have a peanut reference material for both research and routine analysis. Peanuts available in the food sector are derived from various sources, such as peanut varieties/types of different geographical origins, and are treated by various technological processes, such as dry and oil roasting at various temperatures and for various times.

The IRMM has launched the production of a basic peanut reference material using the peanut varieties most commonly used for food production. This material may be used for spiking food matrices for further method development, validation and proficiency testing, as well as for clinical tests and in vitro assays. This future peanut reference standard will take into account specific demands from the food industry and respect the various technological conditions mentioned above.

Future CRMs for food and feed analysis—outlook

For food safety and quality control, a number of other CRMs would be very welcome. Many requests are made for matrix-matched CRMs and PTMs. Official food and feed control laboratories, and moreover the Community and National Reference Laboratories, are especially dependent on fit-for-purpose validated analytical methods and real-matrix reference materials.

In order to match the method validation studies recently carried out, high on the list are CRMs for smoke flavourings, covering both toxic substances (e.g. polycyclic aromatic hydrocarbons) and their compositional characterisation; tissue of the central nervous system (e.g. brain in meat products); pesticides in food and feed for the EU pesticides monitoring programme; veterinary drug residues and hormones in animal-derived tissues; food allergens; and especially microbially contaminated food and feed matrices. Other highly desired CRMs are those containing mycotoxins in food at the latest legislative levels, e.g. patulin in apple products and baby food and fusarium toxins in cereals, as well as acrylamide in food, and products from organic farming for proof of authenticity.

Conclusions

Food and feed CRMs, when available and meeting the necessary conditions, are an extremely useful tool in the validation of measurements, for both calibration and traceability. The homogeneity and stability of a CRM, and the assignment of a certified reference value (with associated stated uncertainty), enhance the reliability of the method performance assessment. This has been recognised by international standards and by European legislators. Therefore the appropriate use of CRMs and validated methods has a strong impact on legal action levels. There is an immense need for many more FF-RMs than are available today. However, it should be clearly stated that validated methods and CRMs do not automatically guarantee the quality of analytical results.

References

1. European Commission (2003) Proposal for a Regulation of the European Parliament and of the Council on official feed and food controls. EU, Brussels

2. European Commission (1993) Council Directive 93/99/EEC of 29 October 1993 on the subject of additional measures concerning the official control of foodstuffs. Off J L 290, 24/11/1993, pp 14–17

3. International Organisation for Standardisation (1999) ISO/IEC 17025: General requirements for the competence of testing and calibration laboratories. ISO, Geneva

4. European Committee for Standardisation, CEN (1999) Food analysis—biotoxins—criteria of analytical methods for mycotoxins. Report reference no. CR 13505:1999 E

5. Horwitz W (1995) Pure Appl Chem 67:331–343

6. Barwick V, Burke S, Lawn R, Roper P, Walker R (2001) Applications of reference materials in analytical chemistry. LGC, Teddington, UK. ISBN 0-85404-448-5

7. Fajgelj A, Parkany M (1996) The use of matrix reference materials in environmental analytical processes. RSC, UK. ISBN 0-85404-739-5

8. von Holst C, Mueller A (2001) Fresenius J Anal Chem 371:994–1000

9. European Committee for Standardisation, CEN (1997) EN 1528-1, 1528-2, 1528-3 and 1528-4. Beuth Verlag, Berlin

10. von Holst C, Mueller A, Bjoerklund E, Anklam E (2001) Eur Food Res Technol 213:154–160

11. Jaborek-Hugo S, von Holst C, Allen R, Stewart T, Willey J, Anklam E (2001) Food Add Contam 18:121–127

12. Bester K, De Vos P, Le Guern L, Harbeck S, Kramer G (2000) The certification of PCBs in pig fat—IRMM 444, 445 and 446. EU Report EUR 19600 EN

13. Bester K, De Vos P, Le Guern L, Harbeck S, Kramer G (2001) Chemosphere 44:529

14. European Commission (2000) Directive 2000/36/EC of the European Parliament and of the Council of 23 June 2000 relating to cocoa and chocolate products intended for human consumption. Off J L 197, pp 19–25

15. Lipp M, Simoneau C, Ulberth F, Anklam E (1999) Possibilities and limits to detect cocoa butter equivalents in mixture with cocoa butter—devising analytical methods for the determination of cocoa butter and other vegetable fats in chocolate with the aim of checking compliance with existing Community labelling. EU Report EUR 18992 EN

16. Koeber R, Buchgraber M, Ulberth F, Bacarolo R, Bernreuther A, Schimmel H, Anklam E, Pauwels J (2003) The certification of the content of five triglycerides in cocoa butter. EU Report EUR 20781 EN

17. Buchgraber M, Anklam E (2003) Validation of a method for the detection of cocoa butter equivalents in cocoa butter and plain chocolate. Report on a validation study. EU Report EUR 20685 EN

18. Buchgraber M, Anklam E (2003) Validation of a method for the quantification of cocoa butter equivalents in cocoa butter and plain chocolate. EU Report EUR 20777 EN

19. European Commission (2000) Food labelling directive. Off J L 109, 6.5.2000

20. Poms RE, Lisi C, Summa C, Stroka J, Anklam E (2003) In-house validation of commercially available ELISA test kits for peanut allergens in food. EU Report EUR 20767 EN

21. Poms RE, Ulberth F, Klein CL, Anklam E (2003) GIT Lab J Eur 3:132–135


New ideas, controversial or even contentious papers – Please give us your views
Papers published in this section do not necessarily reflect the opinion of the Editors and the Editorial Board

Accred Qual Assur (1999) 4:352–356
© Springer-Verlag 1999

D. Brynn Hibbert

Method validation of modern analytical techniques

Abstract Validation of analytical methods of well-characterised systems, such as are found in the pharmaceutical industry, is based on a series of experimental procedures to establish: selectivity, sensitivity, repeatability, reproducibility, linearity of calibration, detection limit and limit of determination, and robustness. It is argued that these headings become more difficult to apply as the complexity of the analysis increases. Analysis of environmental samples is given as an example. Modern methods of analysis that use arrays of sensors challenge validation. The output may be a classification rather than a concentration of analyte; it may have been established by imprecise methods such as the responses of human taste panels; and the state space of possible responses is too large to cover in any experimental-design procedure. Moreover, the process of data analysis may be done by non-linear methods such as neural networks. Validation of systems that rely on computer software is well established. The combination of software validation with validation of the analytical responses of the hardware is the challenge for the analytical chemist. As with the validation of automated equipment, such as programmable logic controllers in the synthesis of pharmaceuticals, method developers may need to concentrate on the process of validation, as well as the minutiae of what is done.

Key words Method validation · Sensors · Electronic nose

Introduction

Method validation is an established process which is: "the provision of documentary evidence that a system fulfils its pre-defined specification" [1], or: "the process of proving that an analytical method is acceptable for its intended purpose" [2].

EURACHEM has offered a more long-winded definition that includes: "checks... to ensure that the performance characteristics of the method are understood and demonstrate that the method is scientifically sound under the conditions under which it is to be applied" [3].

In theory any method used by an accredited laboratory will have some validation documentation, either provided by the laboratory or because it is using an official method that has been validated. In the latter case the laboratory must provide evidence that it can carry out this method competently (verification). What is required to validate a method is clear in the case of regulated pharmaceutical laboratories, and although some of the accepted validation statistics may be questioned (for example r > 0.999 for calibration linearity [4]), the process may be said to produce the desired result, i.e. methods employed to analyse pharmaceuticals are fit for purpose.

The contention of this paper is that, apart from this example, methods used for other purposes are often not well validated, and as increasing use is made of multi-sensor methods involving intelligent data analysis, there may not even be a prospect of validation in the traditional sense.

Method validation of analytical methods for pharmaceuticals

In the context of this paper, there are three aspects of the validation of analytical methods used in the pharmaceutical industry that are of note. First, the product is of high cost, and the industry is very profitable. Secondly, it is one of the most regulated industries, often with separate agencies overseeing its activities (in Australia, this is the Therapeutic Goods Administration). Thirdly, although there is increasing interest in determining the nature and concentrations of possible excipients and breakdown products arising from the manufacture and storage of pharmaceuticals, the systems for analysis are essentially well known, both in terms of identification and concentration. Therefore the industry is required by law to validate, it has the money to do so, and the system is tractable. It is the contention of this paper that if validation is possible for any method, validation for the pharmaceutical industry should be the most straightforward. This does not, of course, underestimate the time and cost that goes into a full method validation. Guidelines for the validation of pharmaceutical methods have been published by a number of bodies [2, 3, 5].

Validation is a logical process that is conducted in parallel with method development. If a method that is being developed will ever be used in earnest, then it must be validated. The data obtained during method development may be used towards validation, and the outcome of validation may force changes in the method (which must then be re-validated). This iterative procedure is known as the 'development/validation cycle'. Methods that are submitted for regulatory approval must show evidence of validation of the attributes summarised in Table 1. Before the method is developed, minimum acceptance criteria must be determined, in consultation with the end-user, for each of the aspects of the method given in Table 1. The validation report will then document how the method has met the criteria.

The validation may be done in blocks, with the later validation steps, such as robustness, being the 'icing on the cake' of the method. In large companies, decisions will be made at different stages to continue work on the method and validation, or to cease development. Some aspects of the validation process are discussed below.

Specificity

The number and concentrations of impurities represent one of the few possibly indeterminate quantities in pharmaceutical method validation. However, careful research into all stages of manufacture, storage and use can reveal the range of compounds that are not the active product. Modern advances in liquid chromatography with diode array or mass spectrometric detection have aided the identification of impurities. Peak purity of the target analyte is assessed after the product has been deliberately 'stressed' by exposing it to high temperature, high humidity and high light levels. For bulk pharmaceuticals, exposure to acid, base and oxidising agents may also be studied. When validating impurity methods, the resolution of the impurity peaks, among themselves and from the active ingredient peak, is of importance.



Table 1 Validation of an analytical method, showing tasks and typical acceptance criteria for the analyte

Specificity: Analyte + placebo, synthesis intermediates, excipients, degradation impurities; versus pure analyte. Peak purity and resolution assessed. Resolution > 1.5.

Calibration linearity: Six levels, each with three replicates, from 50% to 150% of target concentration. r > 0.999.

Accuracy: Recovery of certified reference material. ±2% at concentrations of 80% to 120% of target concentration.

Precision: Instrument repeatability (ten replicate injections), RSD < 1%. Intra-assay precision (multiple analyses of aliquots of a sample during one day), RSD < 2%. Intermediate precision (multiple analysts, instruments, days in one laboratory). Reproducibility by inter-laboratory studies to detect bias.

Calibration range: Determined from accuracy and linearity studies.

Detection limit: Only for impurity methods. S/N = 3 from blank studies or calibration.

Quantitation limit: Only for impurity methods. Concentration to give a defined RSD (e.g. 10%), or S/N = 10.

Robustness: Experimental design establishing the effect of changing critical parameters (e.g. mobile phase composition, column temperature).

Calibration linearity

Although non-linear calibration is widely available, it is still considered mandatory that, across the range of likely use (80%–100%, or more widely 50%–150%, of target concentration), the calibration line establishing the relationship between the measured quantity (peak height, peak area) and the concentration of analyte should be linear. A commonly used criterion is that the correlation coefficient be greater than 0.999. The square of the correlation coefficient (sometimes known as the coefficient of determination) is the fraction of the variance in the dependent variable explained by the independent variable. As such it can be used for assessing the validity of a linear model, but it is not a good measure of scatter about a line for which linearity is established. Mulholland and Hibbert have shown that for slightly curved lines, even with r = 0.999, when the line was used for calibration, errors in the predicted concentration of 70% could be observed at the lower end of the range [4]. If such linearity is established, one-point or three-point calibration may then be used in practice. Mulholland and Hibbert [4] recommend inspection of residuals, and Green [2] the use of the response factor (y − a)/x for a linear relationship y = a + bx. Each of these methods highlights small deviations from linearity that are not apparent from a simple plot of y against x. The use of the calibration line to establish a detection limit has been promulgated by ISO [6] as a more realistic method than the generally used three times the standard deviation of the baseline.
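A minimal numerical sketch of these checks follows. The data are illustrative (a deliberately curved response), not taken from [4]; it demonstrates the point qualitatively: the correlation coefficient stays above 0.999 while residuals and response factors reveal the curvature.

```python
# Minimal sketch (illustrative data): why r > 0.999 alone does not prove linearity.
import numpy as np

x = np.array([50, 70, 90, 110, 130, 150], dtype=float)  # % of target concentration
y = 2.0 * x + 0.002 * x**2                               # slightly curved response

b, a = np.polyfit(x, y, 1)                               # least-squares line y = a + b*x
r = np.corrcoef(x, y)[0, 1]
residuals = y - (a + b * x)
response_factors = (y - a) / x                           # Green's check [2]

print(f"r = {r:.5f}")                                    # > 0.999 despite curvature
print("residuals:", np.round(residuals, 2))              # systematic pattern reveals it
print("response factors:", np.round(response_factors, 3))  # drift, not constant
```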

Precision

Precision is usually measured as the standard deviation of a set of data. Of importance is which set of data, taken under what circumstances. The more variability that is allowed in a set of experiments, the greater the variance that will result. Thus if ten aliquots of a homogeneous sample are injected into a single chromatograph by a single analyst, one after the other, the repeatability (as the standard deviation of the ten peak heights is known) would be expected to be no more than 1% of the average peak height. Unfortunately, in papers submitted to journals, this is often the only measure of a method's precision. Intra-assay, or intra-laboratory, precision measures the effects of different analysts or repeated sample preparation. It is the standard deviation (usually reported as a relative standard deviation, i.e. the standard deviation divided by the mean, also known as the coefficient of variation) of repeated analyses of aliquots of a single sample on one day in the laboratory. As part of a ruggedness study, an experimental design may be undertaken to vary a number of possible parameters with a view to determining which are important to the analytical method.

Highly fractionated designs (Taguchi or other orthogonal designs) are used, which establish main effects only with the minimum number of experiments.

Finally, as part of assessment by regulatory bodies, a laboratory may participate in a proficiency study in which its method will be used to analyse two unknown samples containing a stated analyte. The inter-laboratory precision is usually at least twice the intra-laboratory precision, and that after removal of outliers. It is now customary to use a robust statistic such as the robust z score, which is defined as

z = (x − median(x)) / normIQR

where normIQR is the normalised interquartile range, i.e. the range which encompasses the middle 50% of a set of data, multiplied by 0.7413. Essentially the robust z score reports how many standard deviations a sample is from the middle of the data set. Outliers are then identified as having z > 3. Such interlaboratory comparisons are a useful way of establishing conformity among a group of laboratories, but it must be stressed that unless certified samples are used, they do not establish accuracy. As De Bièvre has observed, the outlier has sometimes been shown to be the only laboratory near to the correct answer.
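A minimal sketch of this calculation, with hypothetical laboratory results (the factor 0.7413 scales the interquartile range to the standard deviation of a normal distribution; both tails are flagged here):

```python
# Minimal sketch (hypothetical results): robust z scores for an
# interlaboratory comparison.
import numpy as np

results = np.array([10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 13.7, 10.3])  # mg/kg

median = np.median(results)
q1, q3 = np.percentile(results, [25, 75])
norm_iqr = (q3 - q1) * 0.7413          # normalised interquartile range

z = (results - median) / norm_iqr
outliers = np.abs(z) > 3               # flag |z| > 3 (both tails)
print(np.round(z, 2), "-> outlier indices:", np.where(outliers)[0])
```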

An experimental determination of reproducibility should also coincide with a theoretical calculation based on the known, or estimated, precision of the individual steps of the method. This so-called 'bottom-up' method, as part of an uncertainty audit, is included in the principles of valid analytical measurement (VAM) [7–9].

Environmental analysis

Before considering the modern techniques that will prove almost impossible to validate in the sense described above, I shall comment on a sector of analytical chemistry that, while superficially similar to the pharmaceutical industry, by virtue of less regulation and more complex samples already provides examples of the difficulty of validation.

At a recent conference on analytical chemistry held by the Royal Australian Chemical Institute [10], an environmental consultant bemoaned the presence of analytical laboratories which, while being fully registered with the appropriate body (in Australia, the National Association of Testing Authorities – NATA), did not consistently provide results that were 'fit for purpose'. Costs were being cut to the point that, in the opinion of the speaker, the results were almost meaningless. What is the difference between an analytical method that is designed to determine the correct concentration of an active ingredient of a pharmaceutical product and a method that will determine whether a sample of soil exceeds the action limit for heavy metals?



First, the system in environmental analysis is less well defined and could be drawn from a great number of possibilities. As a consequence the matrix is not completely defined. Secondly, even the approximate concentration of the analyte may not be known. There is no target concentration; the amount of a given heavy metal could vary between zero and many thousands of parts per million. Sampling and sample pretreatment are also crucial for obtaining results that are meaningful. In terms of validation it is possible to comment on the effect of these differences.

Specificity

In terms of the sample, the specificity of the method is difficult to establish. It may be that the clean-up procedure allows the final sample to be analysed free of interferences, but the process of obtaining such a laboratory sample from the original material in the environment has a great number of uncontrollable variables. Obvious interferents may be known and procedures adopted to avoid them. An example is the presence of high levels of sodium chloride in sea water samples, which proves difficult for atomic spectroscopy methods.

Calibration linearity

The linearity of calibration should be established. However, in practice laboratories using ICP-OES or ICP-MS tend to calibrate over a very wide range, often using single-point calibration in the vicinity of the expected concentration of the sample. The problem arises with a batch of samples that show a wide variation in concentrations of analyte. Proper attention to the calibration protocols will require a number of ranges to be validated. Again, in practice, this may not be adhered to because of the pressure of the number of samples to be processed.

Accuracy

The nature of the samples means that a CRM to establish accuracy in a proper metrological way may not exist. Usually, recovery studies will be done with spiked samples. There is concern about the purity of the standards used here, and also about the speciation of redox-active species such as transition metal elements. Standard addition is a method that extrapolates beyond the calibration range, and is therefore prone to high uncertainty. The low levels of analyte make it difficult to establish a 'true' concentration.

Precision

Once a method is established, precision may be determined by suitably replicated experiments. However, it is in inter-laboratory trials that the problems with environmental methods often show up. It is accepted that for trace analysis RSD values of tens of percent are likely. In studies conducted in Western Australia on pesticide residues in bovine fat, RSD values were 12% for dieldrin and 28% for diazinon. It is typical to see a quarter of the laboratories in such a trial producing values that could be termed outliers. In the previously mentioned study, 5 laboratories out of 26 had z > 3 for aldrin. In a parallel study, RSD values for petroleum in dichloromethane and in water were 40% and 25%, respectively. The conclusion of these studies was that there was poor comparability because of the different methods used, that accreditation apparently made no difference to the quality of results, and that a lack of understanding of the definitions of the quantities to be analysed (for example 'gasoline range organics') caused non-method errors. In relation to methods, this is contrary to the conclusion of van Nevel et al., who asserted that the results of the IMEP round of analyses of trace elements in natural and synthetic waters showed no dependence on method [11]. If properly validated methods do yield comparable results, then one conclusion from the range of studies around the world is that many environmental methods are not validated. It may be that validated methods are indeed used, but not for exactly the systems for which they were validated.

Limits of detection and determination

Much is made of detection limits in environmental analysis. Much of the modern concern about chemicals in the environment stems from the ability of the analytical chemist to analyse ever lower concentrations. Less attention is given to a limit of determination. Usually the quoted RSD is the best possible for the given method, and because intra-laboratory precision, or even simple repeatability, is quoted, this is usually accepted.
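A sketch of the conventional blank-based estimates follows (illustrative values only; the 3s and 10s multipliers are the widely used conventions for detection and quantitation limits, with the calibration slope converting signal to concentration):

```python
# Minimal sketch (illustrative values): blank-based limits,
# LOD = 3 * s_blank / slope and, commonly, LOQ = 10 * s_blank / slope.
import statistics

blank_signals = [0.021, 0.018, 0.025, 0.020, 0.023, 0.019, 0.022]  # a.u.
slope = 0.045   # a.u. per (ug/L), from the calibration line

s_blank = statistics.stdev(blank_signals)
lod = 3 * s_blank / slope
loq = 10 * s_blank / slope
print(f"LOD = {lod:.2f} ug/L, LOQ = {loq:.2f} ug/L")
```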

Array sensors

The ultimate goal of analytical chemistry is not to analyse chemicals per se but to solve problems couched in the language of society. "Can we drink the water?" "Can a children's playground be built on the waste site?" "Does the patient have an overactive thyroid?" [12]. Society also wants the answers to questions like these

as quickly and as cheaply as possible. These pressures have given a boost to the development of portable sensors that yield answers intelligible to the untrained user. Computer control and in situ data analysis have made great strides, and it may be argued that the development of suitable chemical sensors is the limiting factor in the wider use of these devices [13]. Arrays of sensors that rely on sophisticated data analysis are typified by the probably misnamed "electronic nose" [14]. Portable sensors have been developed for monitoring atmospheric pollution, foods, wines and other beverages, and odours from factories, abattoirs and sewage plants [15].

Analytical methods using arrays of sensors

The transduction mechanisms of these sensors are based on the conduction of semiconductors such as tin oxide [16], or of polymers such as polypyrrole [17]. More sensitive are sensors that 'weigh' impinging molecules [18], and more sensitive still is the biological nose. Recently there has been a renewal of interest in optical sensors incorporating fluorescent molecules [19]. Typically a device will have 3 to 30 sensors, the output of each being a voltage. This may be measured at the steady state, or the time development of the voltages may be monitored. Humidity and temperature control is important for many sensors.

The array may target a particular volatile chemical, or it may be calibrated against less exact standards, for example the subjective scores given by a taste-testing panel.

Data treatment and analysis

The methods of data analysis depend on the nature of the final output. If the problem is one of classification, a number of multivariate classifiers are available, such as those based on principal components analysis (SIMCA), cluster analysis and discriminant analysis, or non-linear artificial neural networks. If the required output is a continuous variable, such as a concentration, then partial least squares regression or principal component regression is often used [20].

Using such an array to determine, for example, the provenance of a red wine claiming to be Chianti requires calibration with a number of samples that cover the range of genuine Chianti and all the ersatz versions that may be encountered. An unknown sample is presented to the array and the output (usually a number of voltage responses) is fed into the data analysis



software, which in due course returns the answer "The wine is Chianti" or "The wine is not Chianti". With more information during calibration, the response may offer further advice on the origin of the sample, or it may determine how near to a true Chianti a forgery may be.

Aspects of method validation

Given that an array sensor is being developed, how may it be validated? I know of no system that has undergone validation for accreditation purposes. Even compared with environmental analysis, these sensors present difficulties that need to be resolved before a consensus on suitable validation protocols is reached. Gardner and Bartlett have considered the problem of how to define the performance of electronic noses using standard odour mixtures [21]. They propose two indicators of performance: the range of different odours that may be detected by the array, and the ability to discriminate between similar odours (the resolving power).

Accuracy and precision

Precision and accuracy need to be redefined for classification problems. The accuracy is the percentage of correct classifications when a series of unknowns is presented to the calibrated instrument. For systems in which there is a difference in the consequence of false positives and false negatives (for example, medical diagnosis of an infectious disease), the test of accuracy has to be set up to reflect this difference. The repeatability of a classification system may also be determined by presenting the same sample a number of times and recording the percentage of correct classifications. The statistics of discrete events can be applied to the results to yield probability levels. Just as digital signal encoding has advantages over analogue methods, a discrete classifier may be expected to produce a higher percentage of correct answers.
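A minimal sketch of these redefined figures of merit follows. The confusion counts are hypothetical, and the binomial calculation is one simple way of applying the statistics of discrete events to repeat presentations; it is not a prescription from the paper.

```python
# Minimal sketch (hypothetical counts): accuracy and repeatability for a
# binary classifier, keeping false positives and false negatives separate.
from math import comb

# Confusion counts from presenting known samples to the calibrated array:
tp, tn, fp, fn = 46, 43, 7, 4

accuracy = (tp + tn) / (tp + tn + fp + fn)
false_positive_rate = fp / (fp + tn)
false_negative_rate = fn / (fn + tp)
print(f"accuracy {accuracy:.2%}, FPR {false_positive_rate:.2%}, "
      f"FNR {false_negative_rate:.2%}")

# Repeatability as a discrete-event statistic: probability of observing k or
# fewer correct answers in n repeat presentations if the true rate is p.
def binom_cdf(k: int, n: int, p: float) -> float:
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

print(f"P(<= 16 correct of 20 | p = 0.9) = {binom_cdf(16, 20, 0.9):.3f}")
```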

Calibration

Headings such as ‘calibration linearity’have no direct meaning for multivariatemethods that do not have a single ‘x’ val-ue. Indeed, multivariate regression meth-ods pose the calibration problem with the‘Y block’ being the target variable (e.g.concentration) and the ‘X block’, the ma-trix of measurands (e.g. voltage outputsof the array of sensors), this being the in-verse of a traditional analytical calibra-tion graph [22]. Multivariate calibration,because of the complexities of models

and the number of parameters, must be shown to be valid both in terms of the calibration model (i.e. how well the model fits the standards used in calibration) and in terms of its prediction ability. Two methods of establishing the prediction ability of a method are popular. First, if there are sufficient calibration sets of data, a tranche (say 10%) is set aside to validate the model: the validation set is presented to the model and the success of the prediction is assessed. Alternatively, a leave-one-out procedure known as cross-validation is adopted, in which the model is established with n−1 data points and the nth data set is predicted by the model. The nth set is returned, a different set is removed, the model is re-established and the removed set predicted. This continues until all of the data sets have been removed and predicted. Cross-validation does not work if the calibration set has been determined by a minimal experimental design, because each set is vital to the statistical integrity of the whole.
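A minimal sketch of the leave-one-out procedure follows. Ordinary least squares stands in for the PLS/PCR models named above purely to keep the sketch short, and the sensor data are simulated, not from a real array.

```python
# Minimal sketch: leave-one-out cross-validation of a multivariate calibration
# (simulated data; least squares substitutes here for PLS or PCR).
import numpy as np

rng = np.random.default_rng(0)
n, n_sensors = 12, 4
X = rng.normal(size=(n, n_sensors))        # sensor voltages: the 'X block'
y = X @ np.array([1.5, -0.7, 0.3, 2.0]) + rng.normal(scale=0.1, size=n)  # conc.

press = 0.0                                # predicted residual sum of squares
for i in range(n):
    keep = np.arange(n) != i               # leave sample i out
    coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    press += (y[i] - X[i] @ coef) ** 2

rmsecv = np.sqrt(press / n)                # root-mean-square error of CV
print(f"RMSECV = {rmsecv:.3f}")
```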

Robustness

The most difficult aspect of validation to establish for an array sensor is robustness, particularly in terms of the variation in the data sets that may be presented to the instrument. In the example used above, while a number of genuine Chianti vintages may be sourced and presented to the device, a similar library of fake Chiantis may not be available. The calibration depends greatly on the set of cases used, and this also impinges on the robustness of the model. If data from a human taste-testing panel are used to establish the model, then a number of different groups of tasters should be employed to establish the sensitivity of the model to the vagaries of human panels.

Validation of software

Most array sensors rely on computerised algorithms to build the calibration model. In-house software or customised commercial products are often used. The need to establish compliance of the software is evident, both in terms of its own specifications and in use with the instrument. Software should be certified to ISO 9001 under the TickIT scheme [23–25], and in addition to the basic performance requirements the following should be addressed in the user requirements specification:
– Alarm, event and message handling
– Audit, report and archiving requirements
– Transfer of data via networks
– Display and user-interface specification
– Maintenance, security and diagnostics
– System failure and recovery requirements to protect the integrity of data
– Upgrading and future requirements
– Year 2000 compliance.
Before acceptance, therefore, the above user requirements should be thoroughly tested according to an agreed protocol, and further tests must be undertaken on the use of the software with the instrument (field testing).

Discussion

The validation of analytical methods that are based on arrays of sensors cannot rely on the traditional pharmaceutical approach. Many of the validation headings have no equivalent, and important steps, such as validation of software, are not explicitly included. The pharmaceutical industry does, however, have a model for validation that could be used in modern analytical chemistry, namely the validation of programmable logic controller (PLC) [26] and supervisory control and data acquisition (SCADA) [27] systems. This was the subject of a recent issue of the journal Measurement and Control [1]. Of importance now is the process of validation, which may be broken down into a number of steps (Fig. 1).

The validation life cycle of Fig. 1 shows validation to be the driver of the development process, beginning before any development is commenced, with the different steps of qualification verifying a set of specifications. The final validation report closes the cycle and is the written evidence that the system has been validated and is therefore 'fit for purpose'. Parameters such as accuracy and precision now become requirements within one of the validation stages. Installation qualification ensures that all of the components of the method are present and installed according to the design specification. Operational qualification ensures that each part of the method functions according to specification when tested against a representative set of cases, and performance qualification tests the performance in a real environment, including the tests of software enumerated above.

Complex instruments giving advice as their output may never be validated to the extent of a simple analytical method. However, what validation will do is set the bounds of the responses and situations in which the instrument will give fit-for-purpose output. The users of the instrument will then need to establish whether their particular uses fall within the scope of the validated instrument. We look forward to a report of a fully validated electronic nose.



Fig. 1 The validation life cycle (adapted from [26])

Conclusion

In this paper we have reviewed the validation of analytical methods used in the pharmaceutical industry, and conclude that the methods do establish fitness for purpose. However, outside this tightly regulated industry, it is not so clear that properly validated analytical methods are used. Contemplation of methods of environmental analysis suggests that the complexity of the problems and the use of methods outside the confines for which they were originally validated may be contributing factors to the concern currently expressed about the quality of analytical results. Modern analytical methods employing arrays of sensors, using multivariate calibration models and providing discrete output (classification), add even more complexity to the validation problem. The validation of PLCs in the pharmaceutical industry is given as an example of the validation of a highly complex system, and it is suggested that the approach taken with PLC validation be adopted when validating modern analytical methods.

References

1. Wingate GAS (1998) Meas Control 31:5–10

2. Green JM (1996) Anal Chem 68:305A–309A

3. EURACHEM (1997) Guidance document No. 1/WELAC Guidance document No. WGD 2: Guidance on the interpretation of the EN 45000 series of standards and ISO/IEC Guide 25. Laboratory of the Government Chemist, Teddington, UK

4. Mulholland MI, Hibbert DB (1997) J Chromatogr 762:73–82

5. Food and Drug Administration (1996) Draft guidelines on the validation of analytical procedures: methodology. Federal Register 61:9316–9319

6. ISO 11095 (1996) Linear calibration using reference materials. International Organisation for Standardisation, Geneva

7. Sargent M (1996) Confidence in analysis. Laboratory of the Government Chemist, Teddington, UK. VAM Bulletin (Autumn 1996) 3–4

8. ISO (1995) Guide to the expression of uncertainty in measurement. International Organisation for Standardisation, Geneva

9. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, Teddington, UK

10. McLeod S (ed) (1997) Proceedings of the 14th Australian Symposium on Analytical Chemistry, Adelaide, July 1997. Analytical Chemistry Division, Royal Australian Chemical Institute, Melbourne

11. van Nevel L, Taylor PDP, Örnemark U, Moody JR, Heumann KG, De Bièvre P (1998) Accred Qual Assur 3:56–68

12. Mulholland MI, Hibbert DB, Haddad PR, Parslov P (1995) Chemometrics Intell Lab Syst 30:117–128

13. Haswell SJ, Walmsley AD (1991) Anal Proc London 28:115–117

14. Gardner JW, Bartlett PN (1994) Sens Actuators B 18:211–220

15. Hodgins D, Simmonds D (1995) J Autom Chem 17:179–185

16. Chiba A (1990) Chem Sens Technol 2:1–18

17. Hierlemann A, Weimar U, Kraus G, Schweizer-Berberich M, Goepel W (1995) Sens Actuators B 26:126–134

18. Barko G, Papp B, Hlavay J (1995) Talanta 42:475–482

19. Dickinson TA, White J, Kauer JS, Walt DR (1996) Nature 382:697–700

20. Hibbert DB (1996) In: Linskens HF, Jackson JF (eds) Modern methods of plant analysis, vol 19: Analysis of plant fragrance. Springer, Berlin, pp 119–140

21. Gardner JW, Bartlett PN (1996) Sens Actuators B 33:60–67

22. Martens H, Næs T (1989) Multivariate calibration. Wiley, Chichester, England

23. ISO 9001 (1994) Quality systems – Model for quality assurance in design, development, production, installation and servicing. International Organisation for Standardisation, Geneva

24. ISO (1991) Quality management and quality assurance standards – Part 3: Guidelines for the application of ISO 9001 to the development, supply and maintenance of software. International Organisation for Standardisation, Geneva

25. DISC TickIT Office (1995) The TickIT guide: a guide to software quality management system construction and certification using ISO 9001:1994, issue 3. British Standards Institution, London

26. Mitchell R, Roscoe S (1998) Meas Control 31:10–13

27. Coady PJ (1998) Meas Control 31:14–19

Presented at: CITAC International Symposium on Analytical Quality Assurance for the 21st Century, Sydney, Australia, 15–16 October 1998

D.B. Hibbert (✉)
Department of Analytical Chemistry, University of New South Wales, Sydney NSW 2052, Australia
e-mail: b.hibbert@unsw.edu.au
Tel.: +61-2-9385-4713
Fax: +61-2-9385-6141


Relevant accreditation policies and concepts in EU-, EFTA-, NAFTA- and other regions

Accred Qual Assur (1998) 3:29–32

Permanent Liaison Group between EAL (European co-operation for Accreditation of Laboratories) and EUROLAB

Validation of test methods
General principles and concepts

Foreword

EAL and Eurolab have a Permanent Liaison Group (PLG), which is a forum where EAL and Eurolab discuss matters of mutual interest. The PLG consists of five members from each organization.

This document has been prepared in the PLG and endorsed by both organizations.

The document is intended to give general views on certain issues related to the validation of test methods and should be seen as a common understanding and position of EAL and Eurolab. In order to define and describe the activities behind the concept "validation of test methods", more detailed guidance documents are needed. This document should be seen as a basis for such guidance documents.

Validation of test methods

Introduction

The definition used for "validation" in the ISO standard 8402 is "confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled". This definition gives the impression of confined and well-defined (exact) operations. Test methods are normally developed for an intended application range. The reference to particular requirements must in many cases be interpreted in a flexible way, as the requirements can be of a general nature.

Both standardized and non-standardized test methods are covered. They can be exact or associated with large uncertainties. Even novel test methods will be considered. The validation of a test method becomes in this context a way of demonstrating that the method is fit for its intended purpose. Fitness for purpose includes an assessment and a balancing of technological possibilities, risks and costs.

There are very few papers in the open literature dealing with the general principles of test method validation. On the other hand, many detailed descriptions of the validation of specific test methods are available. A brief overview of the concepts, aims and procedures in validation is given in this document.

General principles to be used in validation

In the validation process, an estimate is made of the representativeness, repeatability and reproducibility of the test method. The definitions are given in Annex 1.

In the validation process the ultimate aim is to ensure that the test methods are good enough with respect to representativeness, reproducibility and repeatability. How much effort should be spent on validation must be decided on a case-by-case basis. If large economic values or considerable health, safety and environmental issues are involved, much more emphasis must be placed on the validation of the test methods. The frequency of use of the test method should also be considered when determining the extent of validation. The total consequences of wrong results are of course larger for methods in extensive use than for test methods used occasionally.

The validation of test methods covers to a large extent the uncertainty, repeatability and reproducibility of the test method. As the factors affecting the results and contributing most to the uncertainty change from one technical sector to another, or even from one test method to another, a universal solution cannot be given. Guidance on the expression of uncertainties can be found, for example, in the international "Guide to the expression of uncertainty in measurement" and in the EAL guidance document "Expression of uncertainty in quantitative testing".

Standardized test methods should be considered validated for their intended application range and thus good enough for that purpose, although their repeatability and reproducibility are not known in detail. The testing laboratory must, however, check that it applies the method correctly. For non-standardized test methods it is up to the testing laboratories to determine how far they go in defining the level of repeatability and reproducibility.

To develop a representative test method, adequate knowledge is required of the practical use of the test results and of the real service conditions of the object of the test. Based on such knowledge, the “representative” properties to be determined by the test may be identified.

The factors affecting the test results and their uncertainty may be grouped into three main categories:

Instrumental and technical factors
– sampling
– homogeneity
– test method
– equipment

Human factors

Environmental factors
– testing environment

Instrumental and technical factors are related to the constructional and functional characteristics of the test and measurement equipment, as well as to other technical operations involved in the test (e.g. sampling, preparation of samples, test object homogeneity). Their effect may be minimized and kept under control by the following provisions:
– define the equipment as precisely as necessary
– provide a clear description of the test procedure as well as the equipment operation
– establish procedures for operational control and calibration
– ensure where applicable traceability of measurements to the SI units.

Whenever practical, the above provisions should be included in the description of the test method. References to internal procedures or applicable standards should be included.

Human factors are related to the competence of the staff and may be controlled through:
– education/basic knowledge
– on-the-job training/practical experience.

The qualification required for the personnel employed for a given test may be specified in the test method, or reference can be made to the applicable internal procedures.

Environmental factors are associated with the environment where the test is performed. Among others, the effect of the following parameters must be assessed and properly controlled:
– atmospheric conditions (temperature, pressure, humidity)
– pollution/contamination
– other environmental characteristics (e.g. EMC).

The effect of the above parameters should be described in the test method, or reference to other applicable documents should be made. However, for new test methods this information is often not available. In some cases the data base for method validation is so large that statistical methods should be applied.

The validation process must consider the expected or required uncertainty of the test results and their intended use.

Critical threshold values (e.g. in health and environment) cannot generally be technically justified with a small uncertainty. However, if a legal limit is set, there must be test methods suited for the purpose. Reference is made to a recent ILAC Guide.

The required depth of the validation process depends also on the maturity of the test method and the prevalence of its use. One can distinguish between the following categories:
– novel methods
– methods used by several laboratories
– modification of established methods
– standardized methods.

The ways in which the validation is performed in the different cases need not be clearly differentiated. If the fitness-for-purpose concept is maintained, it is often possible to validate at reasonable cost but with a higher degree of uncertainty.

The novel methods are first developed in one single laboratory, often on the basis of a special request from a customer or on ideas created in the laboratory. That customer cannot pay for a wide-range validation, nor can the laboratory itself. The aim of the validation of test methods must always be to demonstrate that the method is fit for the intended purpose and that the results have an acceptable uncertainty. It is important that the rules of validation of test methods do not prevent natural technological development from taking place. The laboratory does not expect (although it does want) outside financial help for validation of novel methods and in many cases tries to protect its new development from going to its competitors or from becoming generally available to all.

When a certain number of laboratories work in the same area, cooperation and inter-laboratory comparisons can be arranged. The coordination of such activities is an extra economic burden. In order to speed up the process, external financing is needed.

The testing laboratories need to update their existing test methods. The flexible scope of accreditation as agreed between EAL and Eurolab was also intended to allow modifications to be made to accepted (accreditation covered) test methods. This requires validation procedures applicable to method modifications. It is up to the laboratories to describe their procedures for validating modified test methods.

The most thorough validation procedure is required for test method standardization purposes. The work needed is considerable and covers proficiency testing, the determination of factors affecting the uncertainty, measuring range, etc. The financial burden cannot be laid on the laboratories but must be borne by the standardization organizations. Standardized test methods must be considered sufficiently validated for their intended application ranges. If they are not, they should be withdrawn.

The validation of test methods consists of two interrelated steps:

(i) suitability of the test to solve the problem (customer needs)
(ii) demonstration of the technical capability of the test method within the specified test range

i.e. measuring the right properties with a sufficiently reliable method.

The suitability or representativeness of a test method is in many cases an attribute which is difficult to define, especially for tests related to product acceptance. The test methods must be such that the results obtained correlate with the performance characteristics and operational experience of the product.

Validation procedure

Both testing laboratories and accreditation bodies are looking for procedures and guidelines for planning and controlling the test method validation process. However, the discussion above has clearly indicated that one single procedure cannot be developed. Consequently, a palette of different choices of validation techniques has to be developed. How detailed the validation will be depends on the circumstances (needs, costs, possibilities, risks, etc.).

The validation of the test methods is, of course, of interest also to the accreditation bodies. The principle to be applied should be that the laboratory describes the way it is validating the test methods and the accreditation body should make the judgement whether the procedure used is acceptable in that case. The different validation possibilities are built up around:
– utilization of calibration
– intercomparisons including the use of reference materials and reference methods
– well qualified staff and their professional judgement
– simulation and modelling
– other approaches.

Method validation is often based on the combined use of validation procedures. The validation used can be “direct” or comparative. The selection of the validation procedures should also be justified on a cost-benefit basis as long as the fitness-for-purpose is maintained. Focusing the effort on the most critical factors affecting the test method will lead to a different solution for the validation of “exact” physical and chemical test methods as compared to that for product or subjective testing. For example, in the validation of ergonomics and sensory test methods not all possibilities are applicable.

As said above, different validation procedures may be followed, their effectiveness and applicability depending on the type of test considered. They can be characterized as “scientific” or “comparative”:

Scientific approach

In the scientific approach the assessment of the representativeness, repeatability and reproducibility of the method is performed with reference to the different constitutive elements and features. Evidence should describe the representativeness of the selected properties and the associated uncertainty. This can be based on information published in the scientific and technical literature or on ad hoc investigations performed by the laboratory developing the method. The laboratory shall demonstrate that relevant influencing factors (instrumental and technical, human, environmental) have been analyzed and that they are under control within the uncertainty associated with the method.

Comparative approach

The test method is assessed by comparing its results to those obtained by means of another, already validated test method which has been developed for the same purposes. If this is not possible, the performance characteristics of the method may be assessed through interlaboratory comparisons. The method is “valid” if the results obtained by the different laboratories fall within the expected uncertainty limits. Deviations beyond such limits may indicate e.g. a lack of control of the influencing parameters. The causes of this behaviour should be clarified and the method redefined accordingly. The interlaboratory comparison does not always provide a comprehensive validation of the representativeness of the method, which may be accurate and stable, though physically “wrong”.
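In practice the criterion “results fall within the expected uncertainty limits” is often made operational with a normalized deviation, for example an E_n-type score in which each result carries an expanded uncertainty (k = 2). The sketch below is an illustration only; the laboratory names and figures are hypothetical.

# E_n-style consistency check for an interlaboratory comparison;
# |E_n| <= 1 indicates agreement within the stated uncertainties.

def e_n(x_lab, U_lab, x_ref, U_ref):
    return (x_lab - x_ref) / (U_lab ** 2 + U_ref ** 2) ** 0.5

x_ref, U_ref = 50.2, 1.0  # assigned value and its expanded uncertainty
results = {"Lab A": (50.8, 1.2), "Lab B": (53.1, 1.5)}

for lab, (x, U) in results.items():
    score = e_n(x, U, x_ref, U_ref)
    verdict = "consistent" if abs(score) <= 1 else "deviation to clarify"
    print(f"{lab}: E_n = {score:+.2f} ({verdict})")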

The acceptance procedure for new or modified test methods is either (i) determined internally in the laboratory, (ii) agreed upon between the customer and the laboratory, or (iii) accepted by the authorities and/or accreditation bodies. A higher degree of reliance is needed when safety, health and large economic values are involved. Calibration has been emphasized as an important element in the method validation process, but it is not necessarily the most dominating factor. The understanding of the testing method with its systematic and random errors is crucial. A scientific approach to analyze sources of error, as well as the competence of the personnel doing that job, is of great importance.

The laboratory should always describe the way the validation of test methods is done, and this description should be a part of the quality system/manual when appropriate.

As simplified validation procedures (“fast” validation methods) must be used in many cases, the capability to use professional judgement in assessing whether the validation is comprehensive enough becomes pronounced. However, even when talking about simplified or fast validation procedures, the validation must be done with such a depth that the method is fit for the intended use and acceptable to the customer and/or authorities. It is clear that the definition of the use and scope of the method and the assumption of uncertainty should not be misleading and too optimistic.

When the use of new test methods becomes more extensive, work describing the effect of changes in test parameters can be initiated in order to show the robustness of the method. Prenormative research should also be initiated.

The need for new or improved test methods arises when we lack methods or the existing ones are not complete, good or efficient enough. There is no need for the laboratory community to develop new methods if existing ones can be considered adequate.

Annex 1

Definitions

Repeatability (of results of measurement)

Closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement. (VIM)

Notes

1. These conditions are called repeatability conditions.

2. Repeatability conditions include:
– the same measurement procedure
– the same observer
– the same measuring instrument, used under the same conditions
– the same location
– repetition over a short period of time.

3. Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results.

Reproducibility (of results of measurements)

Closeness of the agreement between the results of measurements of the same measurand carried out under changed conditions of measurement. (VIM)

Notes

1. A valid statement of reproducibility requires specification of the conditions changed.

2. The changed conditions may include:
– principle of measurement
– method of measurement
– observer
– measuring instrument
– reference standard
– location
– conditions of use
– time.

3. Reproducibility may be expressed quantitatively in terms of the dispersion characteristics of the results.

4. Results are here usually understood to be corrected results.
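To illustrate Note 3 of both definitions, the repeatability and reproducibility standard deviations can be estimated from replicate results of several laboratories using the one-way analysis-of-variance scheme of ISO 5725. The sketch below assumes a balanced design; the data are hypothetical.

# Estimating s_r (repeatability) and s_R (reproducibility) from
# replicate results of several laboratories; data are hypothetical.
import statistics

data = {  # replicates obtained under repeatability conditions
    "Lab 1": [10.1, 10.3, 10.2],
    "Lab 2": [10.6, 10.4, 10.5],
    "Lab 3": [9.9, 10.0, 10.1],
}
n = 3  # replicates per laboratory (balanced design)

# Repeatability variance: within-laboratory variances, pooled
s_r2 = statistics.mean(statistics.variance(v) for v in data.values())

# Between-laboratory variance from the scatter of the laboratory means
lab_means = [statistics.mean(v) for v in data.values()]
s_L2 = max(statistics.variance(lab_means) - s_r2 / n, 0.0)

s_r = s_r2 ** 0.5           # repeatability standard deviation
s_R = (s_r2 + s_L2) ** 0.5  # reproducibility standard deviation
print(f"s_r = {s_r:.3f}, s_R = {s_R:.3f}")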

Uncertainty (of measurement)

Parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand. (BIPM/IEC/IFCC/ISO/IUPAC/IUPAP/OIML)

Notes

1. The parameter may be, for example, a standard deviation (or a given multiple of it), or the width of a confidence interval.

2. Uncertainty of measurement comprises, in general, many components. Some of these components may be evaluated from the statistical distribution of the results of series of measurements and can be characterized by experimental standard deviations. The other components, which can also be characterized by standard deviations, are evaluated from assumed probability distributions based on experience or other information.

3. It is understood that the result of the measurement is the best estimate of the value of the measurand, and that all components of uncertainty, including those arising from systematic effects, such as components associated with corrections and reference standards, contribute to the dispersion.

Validation

Confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled. (ISO 8402)

Notes

1. In design and development, validation concerns the process of examining a product to determine conformity with user needs.


2. Validation is normally performed on the final product under defined operating conditions. It may be necessary in earlier stages.

3. The term “validated” is used to designate the corresponding status.

4. Multiple validations may be carried out if there are different intended uses.

Verification

Confirmation by examination and provision of objective evidence that specified requirements have been fulfilled. (ISO 8402)

Notes

1. In design and development, verification concerns the process of examining the result of a given activity to determine conformity with the stated requirements for that activity.

2. The term “verified” is used to designate the corresponding status.

Annex 2

References and background material

ISO 5725 (1986): Precision of test methods – Determination of repeatability and reproducibility for a standard test method by inter-laboratory tests.
ISO 8402 (1994): Quality management and quality assurance – Vocabulary.
BIPM/IEC/IFCC/ISO/IUPAC/IUPAP/OIML (1995): Guide to the expression of uncertainty in measurement.
EAL-G 23 (1996): Expression of uncertainty in quantitative testing.
EAL-G 12 (1995): Traceability of measuring and test equipment to national standards.
Forstén, Jarl (1991): A view on the assessment of the technical competence of testing laboratories. Nordtest Technical Report 149, 46 p.
ILAC (1996): Guide on assessment and reporting of compliance with specification, based on measurements and tests in a laboratory.
VIM (1993): International vocabulary of basic and general terms in metrology.

Contact person: M. Holmgren
EUROLAB Secretariat, SP, Box 857, S-50115 Borås, Sweden

POLICIES AND CONCEPTS


Relevant accreditation policies and concepts in EU-, EFTA-, NAFTA- and other regions

Accred Qual Assur (1999) 4:45–49

Alison Gillespie · Sue Upton

Marketing Valid Analytical Measurement

Abstract The Valid Analytical Measurement (VAM) programme was set up by the Department of Trade and Industry as part of its support for the UK National Measurement System. This paper gives an overview of the VAM programme together with a description of the principles on which valid analytical measurement should be based. This is followed by a description of the work that has been carried out to market the results of the VAM programme to the analytical community.

Key words Valid analytical measurements · Marketing technology · Technology transfer

Introduction

Analytical measurement has a vital role in ensuring the quality of goods and commodities and in supporting Government in areas such as revenue collection, health and safety, environmental protection, agriculture and law enforcement. Clearly the data from analytical measurements need to be fit for their intended purpose. However, there is evidence to suggest that frequently there is a lack of agreement on the validity of the data. One estimate puts the costs to British industry of unreliable data at at least £1 billion per year.

Following a review of its support for the National Measurement System (NMS) [1] in 1989, the Department of Trade and Industry (DTI) set up the Valid Analytical Measurement (VAM) programme, recognising for the first time the need to include analytical measurements within the NMS. Previously it had been concerned mainly with physical measurements.

The objectives of the NMS are: to enable individuals and organisations in the United Kingdom to make measurements competently and accurately and to demonstrate the validity of such measurements; to co-ordinate the UK’s measurement system with measurement systems of other countries.

In the physical area the validity of measurements is established by means of an unbroken chain of comparisons from the working standards in day-to-day use through reference standards to the national and international primary standards, using recognised measurement techniques and methods. In this way the measurements are “traceable” back to the primary standards.

Unfortunately this infrastructure of working standards, reference standards and primary standards does not exist for analytical measurements. Although the Comité International des Poids et Mesures (CIPM) has started work on the development of such an infrastructure, it will take several years, if not decades, to develop. Thus the VAM programme is designed to meet the objectives of the NMS, i.e. to enable organisations to make measurements that are fit for their intended purpose, by: the use of validated methods which are calibrated using reference materials, carried out with proper quality assurance (QA) and quality control (QC) procedures, including participation in proficiency testing schemes, with the laboratory methods and procedures being subjected to third-party accreditation.

The aim of the VAM programme is to encourage the use of the above procedures to improve the quality of the analytical measurements in the United Kingdom. The work centres around three main areas of activity: defining and disseminating best analytical practice which will enable laboratories to deliver reliable results every time; developing the tools which enable laboratories to implement best analytical practice; working with analysts in other countries to ensure the comparability of analytical measurements across international boundaries.

Table 1 The six VAM principles of best analytical practice

1. Analytical measurements should be made to satisfy an agreed requirement.
2. Analytical measurements should be made using methods and equipment which have been tested to ensure they are fit for purpose.
3. Staff making analytical measurements should be both qualified and competent to undertake the task.
4. There should be a regular independent assessment of the technical performance of a laboratory.
5. Analytical measurements made in one location should be consistent with those elsewhere.
6. Organisations making analytical measurements should have well-defined quality control and quality assurance procedures.

Best practice through the VAM principles [2]

Best analytical practice has been defined by six VAM principles (see Table 1). Valid measurements and agreement between laboratories can be achieved by adhering to these basic principles. The VAM principles describe an approach which is consistent with the best of modern quality systems, presented in a way which is directly meaningful to those making analytical measurements. The principles are fundamental to VAM and are designed to control all factors that might affect the reliability of analytical results. Unlike formal quality management systems there is no accreditation process, and as such the philosophy of VAM is similar to that of a total quality management (TQM) system for analytical laboratories, which can be applied whatever the size of organisation or nature of work. The objective of the principles was to raise awareness of the importance of obtaining reliable “fit for purpose” measurements.

Addressing the aims of VAM

The work carried out under the VAM programme covers three broad technical themes – chemical measurement, physical measurement and biologically based analytical measurement. LGC is the lead contractor and works together with the National Physical Laboratory (NPL) and Atomic Energy Authority Technology (AEAT) in delivering the VAM programme.

Currently there are several technical projects being carried out at the Laboratory of the Government Chemist (LGC) in the areas of education and training, reference materials, quality systems and management, confidence and validation, out-of-laboratory measurement, DNA technology research and high accuracy chemical analysis. A number of these areas are covered in other articles in this issue of the journal. Two of the more recent areas of work that are creating a lot of interest are DNA technology research and out-of-laboratory measurement.

DNA technology, in particular, is having a revolutionary effect on a host of industrial and regulatory sectors. This area is rapidly developing and offers tremendous advantages and benefits to industry, but there is an urgent need for parallel validation of the analytical techniques employed in DNA-based measurements and development of tools to enhance validity, such as suitable reference standards. Analytical molecular biology has typically been developed, and is most often employed, in academic and medical research environments where there is little need to consider the more routine applicability, reliability and reproducibility of the methods. Evaluation of these factors and further validation of the methods is therefore necessary, particularly when such techniques are applied to the analysis of “real” samples.

Studies undertaken by LGC during the VAM 1994–1997 programme have demonstrated that a large number of generic problems need to be addressed before the potential power of nucleic-acid-based methods can be reliably applied. The main outputs of the DNA technology validation project in the new programme will be validated novel measurement systems which utilise DNA technological procedures, quality protocols, and reference materials.

Out-of-laboratory measurements are undertaken across a broad range of industrial and analytical sectors for a variety of reasons: in clinical and medical diagnostics; for the control of chemical and petrochemical production processes; and to monitor emissions and discharges to the environment. The validity of data derived from such measurements is clearly of vital importance, for example to demonstrate compliance with environmental legislation. However, it is particularly difficult to obtain valid and reliable measurements outside the laboratory. The inability to control the environment in which the measurements are made and the use of untrained operators both have potential to impact significantly on the reliability of data. The situation is made worse because of the lack of adequate QA and QC procedures, the shortage of reference materials and calibration standards, and the lack of adequate specifications and performance data for the procedures and equipment employed.


Table 2 VAM product definitions

Product description: VAM sets out six principles of good analytical practice, backed up by technical support and management guidance.

Product positioning: Self-regulated implementation of best practice for analytical science to produce reliable, consistent and comparable results for measurable commercial gain.

Selling proposition: Adoption of the principles of VAM will reduce the costs and risks associated with unreliable measurements.

Fig. 1 The VAM promotional model


The current state of practice also varies considerably across sectors. Under the VAM programme, the aims are to increase user confidence in the validity of data derived from portable measuring equipment and to improve the reliability and comparability of measurements made using portable equipment. The possibility of establishing formal QA and QC procedures and proficiency testing for out-of-laboratory measurements will also be explored.

Marketing VAM

A major part of the VAM programme has been the marketing and technology transfer of the work carried out to the market place itself. Marketing VAM poses particular challenges to the marketer in that it is a philosophy rather than a tangible product or a service. The intangibility of VAM also makes it a difficult concept for organisations to understand and sign up to. Promotion, and ultimately uptake, of VAM would be more effective if ways could be identified to make it more tangible.

Marketing strategy

The key starting point for the effective marketing of VAM was to agree on a product definition, proposition and positioning for VAM. The final definitions arrived at are outlined in Table 2 and describe the VAM product being marketed, the benefits of adopting VAM (namely reduced costs and risks to an organisation) and the positioning of VAM relative to other quality systems (self-regulated implementation). In addition to defining the VAM product it was also recognised that there was a need to secure adoption of the principles, rather than just awareness of them, and to identify the barriers to adoption of VAM. The target market also needed to be tracked through the various stages from adoption to implementation.

A promotional model to be used as a framework for the marketing of VAM was developed to address the issues outlined above (see Fig. 1). Development of the model also took place alongside a quantitative telemarketing survey conducted amongst identified decision makers in industry. The objectives of the survey were to: identify potential barriers to adoption of VAM; identify the communication objectives and key messages; benchmark the overall awareness of VAM.

The model indicated the various stages that a VAM “prospect” needed to move through towards adoption of VAM and attempted to define the communication objectives for each stage. For example, the initial objective is to create awareness of VAM amongst the target audience. Following on from this the task is to create understanding and an attitude of favourability towards VAM. The final step towards adoption of VAM is an expression of commitment to implementation. In the model, adoption of VAM is measured by uptake of VAM products, for example books, videos and certified reference materials. In effect the VAM products become the tools to aid implementation of the six VAM principles. The right-hand side of the model indicates that the new VAM clients, or implementers, also form a community that will create a requirement for future VAM products and services.

The target audience for the new VAM marketing strategy was defined as senior decision-makers in industry, that is, those persons towards the top of an organisation with influence to change attitudes and work practices and to realise the commercial benefits of adoption. The need to identify “early adopters” or “champions” for VAM was identified.

In order to implement the strategy a range of new materials were produced, namely:
1. VAM direct mail piece, used to raise awareness of VAM and elicit requests for further information.
2. VAM welcome pack “Better Measurements Mean Better Business”, which sets out the business benefits of adopting VAM to senior decision makers. This was produced to improve understanding and favourability towards VAM following an expression of interest.
3. “Managers Guide to VAM”, a more detailed guide on how to implement VAM in the analytical laboratory. The guide includes a self-assessment exercise, used to highlight priority areas for immediate attention, and details sources of further help and advice.
4. New VAM database used to track prospects from initial point of contact through to adoption or rejection of VAM.


Implementation of the VAM promotional model

Implementation of the model began with a campaign which aimed to generate awareness amongst senior decision-makers of the costs and risks of unreliable measurements. A direct mail campaign was commenced aimed at senior decision-makers in industry. In the first phase the conversion rate from direct mailshot to request for the VAM welcome pack was slightly higher than might be expected (7%) and the conversion rate from welcome pack to the Managers Guide to VAM was encouraging (35%), suggesting the message seemed to be striking a chord. A follow-up telemarketing campaign was also instigated, which resulted in an increase in response rate as a result of the more personalised nature of the contact. However, it was clear that many difficulties arose because of the poor quality of data contained on most databases. In this respect the effectiveness of direct mail as a technique for generating awareness and response proved difficult to evaluate. There is a requirement either to obtain a better-quality database and repeat the direct mail exercise, or to identify an alternative mechanism for communicating with senior decision-makers.

The model will only work if there is a convincing business case for the adoption of VAM. In practice so far this has proved quite difficult to demonstrate. High profile cases in the public domain have high impact but are often perceived to be politically sensitive. In the search for exemplar organisations there is a general unwillingness for companies to admit that bad practice was ever present in their organisation, no matter how much they have since improved. However, this whole area is a key to the recruitment of both producers and users to VAM and needs to be addressed.

The model identifies a mechanism by which VAM is adopted by producers of data. However, it falls short of explaining how communication with the bench analyst will take place and what should be the nature of this type of communication. In practice it is actually at this level that the VAM contractors are most experienced in communicating, since generally the content is technically, as opposed to commercially, biased.

Measurement of the adoption of VAM is not easy. Whilst uptake of VAM products and CRMs is one measure, in practice this information is difficult to pull together for an individual, especially when products can be purchased from a number of suppliers. An assumption has to be made that once a product is purchased it will also be used.

Future plans

It is proposed that the future plans for promoting VAM to producers of analytical data should concentrate on “the individual or department with responsibility for the production and overall quality of an analytical measurement”. In some cases these individuals may be senior decision makers, but in the majority of cases these are more likely to be bench analysts, quality managers and laboratory team leaders. This suggests that an appropriate base for segmentation of the “producer” audience for VAM is by job function, with the target segments ranging from those individuals who actually make the measurements (producer analysts) to those who have overall responsibility for their quality (senior decision-makers), but who may not actually carry out any analysis themselves.

Recognising that the producers of analytical data can be segmented as such enables the benefits of VAM to be positioned more appropriately for each target segment. Whilst the overall benefit of VAM to the producers of analytical data can be described as reduction in the costs and risks associated with unreliable measurements, this “business benefits” message is most likely to be of interest to the senior decision makers. The producer analyst is much more likely to be interested in the technical aspects of the VAM programme, i.e. their job is concerned with “how” to implement VAM as opposed to “why”. An interest in the “how” part of the adoption is likely to arise when: there is pressure to improve quality from within the organisation itself, i.e. via senior decision-makers convinced of the business benefits (self-regulation argument); there is pressure to improve quality from internal or external customers (user demand); the individual concerned is enlightened enough to see the benefits of adoption in their work without any external pressure being applied.

Whilst senior decision-makers who are producers of analytical data can still be encouraged to adopt VAM through the route suggested in the VAM promotional model, i.e. via the VAM welcome pack and manager’s guide, the model needs further adaptation in order to explain how the producer analysts can be addressed.

Promotion to users of analytical data

Over the last year or so, there has been increasing recognition that another important target market for VAM, as well as “producers” of analytical data, is the “users” of analytical data. Demand from these so-called users for quality in analytical measurements is believed to be an important driver towards improving the overall quality of analytical measurements made in the United Kingdom.

Essentially, there are two mechanisms by which the quality of analytical measurements made in the United Kingdom can be improved: by the producers improving the quality of the analytical measurements they supply; by the users of analytical measurements demanding an improvement in the quality of analytical measurements they procure.

Past thinking was based on the premise that it was preferable to equip the supplier of data with the tools required to satisfy demand before stimulating demand itself, and that the inherent value of quality would be a concept readily appreciated by such parties. Thus VAM has been promoted to producers of analytical data as self-implemented best practice, which will bring about measurable commercial gain to an organisation through a reduction in the costs and risks associated with unreliable measurements. However, without an external driver for adoption it is likely that some laboratories will simply choose to ignore VAM, taking a calculated risk or, in some cases, simply dismissing the issue altogether. There is a cost associated with implementing quality systems (both up-front and ongoing) and if those costs are to be justified the producers of data need to be able to realise the benefits as well. Users of analytical data also need to understand the implications when purchasing data in terms of an additional price they might be expected to pay.

Fundamental to the marketing of VAM to users is the need to describe what exactly is meant by a user of analytical data. There are a range of individuals or organisations who might commission analytical measurements directly from the providers. Organisations, such as banks and insurance companies, also rely on the results of analytical measurements, for example in risk assessment, without actually dealing with the producer of the measurements themselves, i.e. they operate through a third party. In addition there are organisations and individuals (e.g. consumers) who, perhaps without even knowing, depend on the results of analytical measurements.

It seems logical, therefore, to identify producers and users of analytical data according to their position in a supply chain of analytical data, which begins at the point where the measurement is made and leads up to the point where it is actually put to use. It is also important to recognise that:
1. a supply chain can exist within an organisation or between two or more organisations and individuals;
2. at different positions in the supply chain there will be different decisions taken and these may vary in the level of risk;
3. it is possible for an individual to fulfil two roles in the supply chain, for example as a user commissioning data and subsequently as a provider of the data (possibly with some value added interpretation) to the next individual or organisation in the chain;
4. the more remote an individual or organisation is from the producer, the less likely they are to be aware of the issues concerning quality in analytical measurement. In effect this means they are more likely to take analysis for granted or simply not even consider it at all.

A suggested definition for producers and users of analytical data is as follows:

A producer of analytical measurements is “the individual or department with responsibility for the production and overall quality of an analytical measurement”.

A user of analytical data is “an individual or group who makes a decision based on the results of an analytical measurement, be they internal or external to the producer organisation”.

These definitions imply that the “senior decision-makers” targeted previously in the promotional model for producers could in fact be both producers and users of analytical data. It is clear that, to market VAM effectively, a clear understanding of the benefits of VAM is required from the distinct viewpoints of users and producers. In the case of senior decision-makers, they will be in a position to take advantage of some or all of these, dependent on whether they are adopting a user or producer role.

Benefits of VAM to users and producers

In the VAM welcome pack and the manager’s guide the benefits of VAM are presented as a reduction in costs (both direct and hidden) and risks. These are defined as:
1. The direct costs of unreliable measurements are those costs incurred if an analysis has to be repeated because the original measurements are known, or suspected, to be unreliable.
2. The hidden costs of unreliable measurements are the costs which are hidden internally in the organisation, because the true impact of unreliable measurements is not recognised.
3. The risks of unreliable measurements are the risks associated with the release of unreliable measurements to a customer.

In recognising the distinction between a user and producer of analytical measurements and attempting to clarify the benefits of VAM to both parties, it is clear that the above definitions, whilst providing a useful framework for promoting VAM to producer organisations, do not go far enough. The reality of the situation is that the direct costs of unreliable measurements are only relevant to the producers of analytical measurements (unless they are passed on as an increased cost to the user, in which case the user needs to be aware of what they are or are not paying for). For users of analytical data the main concern is one of risk from poor decision-making.

Current situation

Promotion to users of analytical data is a complex and difficult task. Users are less likely to be aware of or educated in the issues relevant to the technically complex matters which affect quality in analytical measurement. They are also more difficult to identify and reach.

There is a body of anecdotal evidence that suggests awareness of VAM and best practice in analytical measurement amongst users of analytical data is low. Concern has been expressed regarding the way in which potential customers went about choosing a contract laboratory. Factors such as quality of data were rarely considered, with the exception of a request in some instances for the laboratory to be NAMAS (National Accreditation of Measurement and Sampling) accredited. Cost appears to be the main factor considered, with customers often selecting cheaper, less quality conscious laboratories, without consideration of the associated risks.

It is also important to recognise that whilst the risks associated with the use of unreliable measurements are undoubtedly real, there is a counter argument which says that most users have survived or managed that risk to date. This may be because the true impact has not been felt yet as, for example, in the case of contaminated land analysis, where there is a long time delay between the analysis being carried out and the consequences of poor quality analysis being felt. Also, analysis of samples may be repeated several times as a means of reducing risk. If the true factors impacting risk could be calculated then a more appropriate, “fit for purpose”, measurement could be carried out (possibly even at a reduced cost).

The primary objective of the promotion to users of analytical data is to improve the quality of analytical measurements made in the United Kingdom by stimulating demand for VAM from the users of analytical data. In developing a strategy to meet this objective it is recognised that VAM needs to be made more tangible and visible, and clear business benefits need to be identified and communicated persuasively to the relevant business audience.

References

1. Measuring up to the competition (1989) Her Majesty’s Stationery Office, London. Cm 728

2. VAM Bulletin 13:4–5 (Autumn 1995)

A. Gillespie (✉) · S. Upton
The Laboratory of the Government Chemist, Queens Road, Teddington, TW11 0LY, UK
e-mail: amg@lgc.co.uk
Tel.: +44-181-943 7000
Fax: +44-181-943 2767


POLICIES AND CONCEPTS

Accred Qual Assur (2002) 7:294–298
DOI 10.1007/s00769-002-0484-9
© Springer-Verlag 2002


Rouvim Kadis

Analytical procedure in terms of measurement (quality) assurance

Abstract In the ordinary sense the term “analytical procedure” means a description of what has to be done while performing an analysis without reference to quality of the measurement. A more sound definition of “analytical procedure” can be given in terms of measurement (quality) assurance, in which a specified procedure to be followed is explicitly associated with an established accuracy of the results produced.

The logic and consequences of such an approach are discussed, with background definitions and terminology as a starting point. Close attention is paid to the concept of measurement uncertainty as providing a single-number index of accuracy inherent in the procedure. The appropriateness of the uncertainty-based approach to analytical measurement is stressed in view of specific inaccuracy sources such as sampling and matrix effects, and methods for their evaluation are outlined. The question of a clear criterion for analytical procedure validation is also addressed from the standpoint of the quality requirement which measurement results need to meet as an end-product.

Keywords Accuracy · Analytical procedure · Measurement assurance · Measurement uncertainty · Quality assurance · Validation

Introduction

There are different ways in which quality assurance concepts play a role in analytical chemistry. Most of them, such as stipulating requirements for the competence and acceptance of analytical laboratories, writing quality manuals, performing systems audits, etc., can be viewed as something foreign to common analytical thinking, forced upon analysts by external authorities. Perhaps another possible way is to try to integrate quality assurance aspects into common analytical concepts and (re)define them in such a way as to explicitly include the quality matters required. This may facilitate an effective quality assurance strategy in analytical chemistry.

In ordinary usage the term “analytical procedure” hardly needs a referential definition, and maybe for this reason there are few official definitions of the term. The only definition quoted in the references [1] is rather diffuse:

“The analytical procedure refers to the way of performing the analysis. It should describe in detail the steps necessary to perform each analytical test. This may include but is not limited to: the sample, the reference standard and the reagents preparations, use of the apparatus, generation of the calibration curve, use of the formulae for the calculation, etc.” [2].

In brief, this simply means a description of all that should be done in order to perform the analysis.

Leaving aside some prolixity in the definition above, the main thing that is lacking is the goal requirement needed in considering quality matters. As is shown in this paper, a sound definition of an analytical procedure can be given in terms of measurement (quality) assurance. The case in point is not simply “the way of performing the analysis” but that which ensures obtaining the results of a specified quality. What this eventually means is a prescribed procedure to follow in producing results with a known uncertainty.

If we have indeed recognized chemical analysis to be measurement, though possessing its own peculiarities, we can apply the principles and techniques of quality assurance developed in measurement to analytical work. These principles and techniques constitute the field of measurement assurance [3], a system affording a confidence that all the measurements produced in a measurement process maintained in statistical control are good enough for their intended use. “Good enough” implies here nothing more than having an allowable uncertainty. Although measurement assurance was originally developed for instrument calibration, i.e. with emphasis on measurement traceability, it is reasonable to treat it more generally. One can say that a fixed measurement procedure is a means of assigning an uncertainty to a single measurement, and this is the essence of measurement assurance. This also reveals the role a prescribed (analytical) procedure plays in routine analytical measurement. We will focus on different aspects involved in the concept of an analytical procedure defined in terms of measurement assurance, such as terminology, content, evaluation, and validation.

Starting point

Chemical analysis generally consists of several operational stages, beginning with taking a sample representative of the whole mass of the material to be analysed and ending with calculation and reporting of results. In this sequence the measurement proper usually makes a relatively small contribution to the overall variability involved in the entire chemical measurement process (CMP) [4], the largest portion of which is concerned with “non-measurement” operations such as isolation, separation, and so on. Because of this, everything in the chain that affects the chemical measurement result must be predetermined as far as practically possible: the experimental operations, the apparatus and equipment, the materials and reagents, the calibration and data handling. Thus, a “complete analytical procedure, which is specified in every detail, by fixed working directions (order of analysis) and which is used for a particular analytical task” [5] – a concept presented by Kaiser and Specker as far back as the 1950s [6] – becomes a point of critical importance in obtaining meaningful and reproducible results. We use the term “analytical procedure”, or merely “procedure” for short, in the sense outlined above.

“Method”, “procedure”, or “protocol”

The importance of the correct usage of relevant terms, in particular the term “procedure” rather than “method”, is noteworthy. The terms actually correspond to different levels in the hierarchy of analytical methodology [7], expressed as a sequence from the general to the specific:

technique → method → procedure → protocol

Indeed, the procedure level provides the specific directions necessary to utilize a method, which is in line with the definition of measurement procedure given in the International Vocabulary of Basic and General Terms in Metrology (VIM): “set of operations, described specifically, used in the performance of particular measurements according to a given method” [8].

This nomenclature is however not always adhered to. In many cases, e.g. scientific publications, codes of practice, or official directives, an analytical procedure is virtually implied when an analytical method is spoken about. Commonly used expressions such as “validation of analytical methods” or “performance characteristics of analytical methods” are typical examples of incorrect usage. Such confusion appears even in the definition suggested by Wilson in 1970 for the term “analytical method” [9]. As he then put it, “an analytical method is to be regarded as the set of written instructions completely defining the procedure to be adopted by the analyst in order to obtain the required analytical result”. It is actually difficult to make a distinction between the two notions when one of them is defined in terms of the other.

On the other hand, there is normally no reason to differentiate the two most specific levels in the hierarchy above, carrying the term “procedure” over to the designated “protocol”. The latter was defined [7] as “a set of definitive directions that must be followed, without exceptions, if the analytical results are to be accepted for a given purpose”. So, written directions have to be faithfully followed in both cases. In many instances the term “procedure” actually signifies a document in which the procedure is recorded – this is specifically noted in VIM in respect to “measurement procedure”. Besides, the term “standard operating procedure” (SOP), especially applied to a procedure intended for repetitive use, is popular in quality assurance terminology.

A clear distinction needs to be drawn between analytical procedure as a generalized concept and its particular realization, i.e. an individual version of the procedure arising in specific circumstances. In practice, an analytical procedure exists as a variety of realizations, differing in terms of specimens, equipment, reagents, environmental conditions, and even the analyst’s own routine. Not distinguishing between these concepts can lead to a misinterpretation embodied, for instance, in the viewpoint that with a detailed specification the procedure will change “each time the analyst, the laboratory, the reagents or the apparatus changed” [9]. What will actually change is realizations of the procedure, provided only that all specified variables remain within the specification.

Also one cannot but note that the hierarchy of methodology above concerns, in fact, a level of specificity rather than the extent to which the entire CMP may be covered. Although sampling is the first (and even the most critical) step of the process, it is often treated as a separate issue when addressing analytical methodology. A “complete analytical procedure” may or may not include sampling, depending on the particular analytical problem to be solved and the scope of the procedure.

An analytical procedure yields the results of established accuracy

In line with Doerffel’s statement which refers to analytical science as “a discipline between chemistry and metrology” [10], one may define analytical service – as a sort of analytical industry, that is, practical activities directed to meeting customer needs – as based upon concepts of chemistry, metrology, and industrial quality control. The intention of any analytical methodology in service is to produce data of appropriate quality, i.e. those that are fit for their intended purpose. The question to answer is what kind of criteria should be addressed in characterizing fitness-for-purpose.

From the viewpoint of the objective of measurement, which is to estimate the true value of the quantity measured, and its applicability for decision-making, closeness of the result to the true value, no matter how it is expressed, should be such a criterion. If a measurement serves any practical need, it is to meet an adequate level of accuracy. It is compliance with an accuracy requirement that fundamentally defines the suitability of measurement results for a specific use, and hence corresponding demands are to be made on a measurement process that produces the results. Next it is assumed that the process is generated by the application of a measurement procedure and thus the accuracy requirements should finally be referred to in the procedure itself. (The “requirements sequence” first implies substantiation of the demands on accuracy in a particular application area, a problem that needs special consideration in chemical analysis [11].)

Following this basic pattern, it is reasonable to re-define Kaiser’s “complete analytical procedure” so that the fitness-for-purpose criterion is explicitly allowed for. There must be an accuracy constraint built into the definition so as to give a determining aspect of the notion. It is probably unknown to most analytical chemists worldwide that such a definition has long since been adopted in analytical terminology in Russia. This was formulated in 1975 by the Scientific Council on Analytical Chemistry of the Russian Academy of Sciences. As defined by the latter, an analytical procedure is “a detailed description of all the conditions and operations that ensure established characteristics of trueness and precision” [12]. This wording, which goes beyond the scope of analytical chemistry, specifically differs from the VIM definition of measurement procedure quoted above by including an accuracy requirement as a goal function. It clearly points out that adhering to such a fixed procedure ensures (at least conceptually) that the results obtained are of a guaranteed degree of accuracy.

Two basic statements underlie the definition above. First, an analytical procedure, when performed as prescribed, with the chemical measurement process operating in a state of control, has an inherent accuracy to be evaluated. Second, a measure of the accuracy can be transferred to the results produced, providing a degree of their confidence. In essence, the measure of accuracy typical of a given procedure is being assigned to future results generated by the application of the procedure under specified conditions. The justification for both propositions was given by Youden in his work on analytical performance [13, 14], where methods for determining accuracy in laboratories were discussed in detail.

As a prerequisite for practical implementation of the analytical procedure concept it is assumed that the chemical measurement process remains in a state of statistical control, being operated within the specifications. To furnish evidence for this, and to avoid the reporting of invalid data, the analytical system needs to be tested for continuing performance. A number of control tests may be used with this aim, for instance testing the difference between parallel determinations when they are prescribed to be carried out, duplicating complete analysis of a current test material, and analysis of a reference material. An important point is that the control tests are to be performed in such a manner and in such a proportion as a given measurement requires, and are an integral part of the whole analytical procedure. The control tests may be more specific in this case and relate to critical points of the measurement process. As examples, calibration stability control with even one reference sample or interference control by spiking provide useful means of expeditious control in an analytical procedure.
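The first of these control tests, checking the difference between parallel determinations, can be reduced to a simple rule: under repeatability conditions the difference should rarely exceed the repeatability limit r = 2.8 s_r of ISO 5725 (an approximately 95 % criterion). A minimal sketch, with hypothetical figures:

# Control test on parallel determinations against the repeatability
# limit r = 2.8 * s_r; figures are hypothetical.

def duplicates_in_control(x1, x2, s_r):
    return abs(x1 - x2) <= 2.8 * s_r

s_r = 0.15  # repeatability standard deviation established in validation
print(duplicates_in_control(12.31, 12.52, s_r))  # True: pair accepted
print(duplicates_in_control(12.31, 13.10, s_r))  # False: investigate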

A principal point in this scheme is that accuracy characteristics should be estimated before an analytical procedure is regularly used and should be the characteristics of any future result obtained by application of the procedure under specified conditions. Measurements of this type are most commonly performed (by technicians, not measurement scientists) in control engineering and are sometimes called “technical measurements”. It is such measurements that are usually referred to as routine in chemical analysis. In fact, the problems of evaluation of routine analyses faced by chemists are treated more generally in the “technical measurements” theory [15].

Uncertainty as an index of accuracy of an analytical procedure

It is generally accepted that accuracy, as a qualitative descriptor, can be quantified only if described in terms of precision and trueness, corresponding to random and systematic errors, respectively. Accordingly, the two measures of accuracy, the estimated standard deviation and the (bounds for) bias, taken separately, generally have to be evaluated and reported [16]. As the traditional theory of measurement errors holds, the two figures cannot be rigorously combined in any way to give an overall index of (in)accuracy. Notice that accuracy as such ("closeness of the agreement between the result of a measurement and a true value of the measurand" [8]) by no means involves any categorization of measurement errors.

On the other hand, it has long been recognized that the confidence to be placed in a measurement result is conveniently expressed by its uncertainty, which was thought, from the outset, to mean an estimate of the likely limits to the error of measurement.

So, uncertainty has traditionally been treated as "the range of values about the final result within which the true value of the measured quantity is believed to lie" [17]. However, there was no agreement on the best method for assessing uncertainty. Consistent with the traditional subdivision, the "random uncertainty" and the "systematic uncertainty", each arising from the corresponding sources, were to be kept separate in the evaluation of a measurement, and the question of how to combine them was an issue of debate for decades.

Now a unified and widely applicable approach to the uncertainty statement, set out in the ISO Guide (GUM) [18], is being accepted in many fields of measurement, particularly in analytical measurement thanks to its helpful adaptation in the EURACHEM Guide [19]. Some peculiarities of the new approach should be noted: specifically, the abandonment of the previous distinction between random and systematic uncertainties, treating all of them as standard-deviation-like quantities (after corrections for known systematic effects have been made), and their possible estimation by other than statistical means. Fundamental, however, is that any numerical measurement is not thought of in isolation, but in relation to the process which generates the measurements. Once all the factors operative in the process are defined, they virtually determine the relevant uncertainty sources, making it practicable to quantify them and finally derive the value of the total uncertainty. One can say that the measurement uncertainty methodology fits neatly the starting idea of a procedure specified in every detail, since the procedure itself defines the context to which the uncertainty statement refers.

This is true of the component-by-component ("bottom-up") method for evaluating uncertainty, which is directly in line with GUM. It is also true of the "top-down" approach [20], which provides a valuable alternative when poorly understood steps are involved in the CMP and a full mathematical model is lacking. An important point is that the top-down methodology implies a reconciliation of the information available with that required, based on a detailed analysis of the factors which affect the result. For both approaches to work advantageously, a clear specification of the analytical procedure is evidently a necessary condition.
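As an illustration of the bottom-up combination just described, the following sketch (invented names and numbers, assuming independent input quantities and a linearized model) combines standard-deviation-like contributions into a combined and an expanded uncertainty.

```python
# GUM-style "bottom-up" combination for independent inputs: each
# contribution enters as a standard uncertainty u(x_i) weighted by a
# sensitivity coefficient c_i = dy/dx_i, combined as a root sum of squares.
import math

def combined_standard_uncertainty(contributions):
    """contributions: iterable of (sensitivity_coefficient, u_x) pairs."""
    return math.sqrt(sum((c * u) ** 2 for c, u in contributions))

# Hypothetical budget: three input quantities with already-evaluated
# sensitivity coefficients and standard uncertainties.
u_c = combined_standard_uncertainty([(1.00, 0.12),
                                     (0.80, 0.05),
                                     (1.20, 0.03)])
U = 2 * u_c  # expanded uncertainty, coverage factor k = 2 (~95 %)
print(f"u_c = {u_c:.3f}, U = {U:.3f}")
```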

The break with the traditional subdivision of measurement errors has a crucial impact on the way accuracy may be quantified and expressed. In 1961, Youden wrote [14]: "There is no solution to the problem of devising a single number to represent the accuracy of a procedure". He was indeed right in the sense that a strict probability statement cannot be made about a combination of random and systematic errors. Today, thanks to the present uncertainty concept, we maintain the opposite opinion: such a solution does exist. It is measurement uncertainty that can be regarded as a single-number index of the accuracy inherent in the procedure. In doing so we must not be confused by the fact that the operational definition of measurement uncertainty that GUM presents, following a pragmatic philosophy, does not use the unknown "true value" of the measured quantity. The old definitions, and in particular that cited above, are equally valid and are now considered ideal.

Consequently, we can define an analytical procedure as one leading to results with a known uncertainty, as in Fig. 1, in which typical "constituents" to be specified in an analytical procedure are shown.

Specific inaccuracy sources in an analytical procedure

What has been said in the previous section generally refers to specified measurement procedures used in many fields of measurement. There are, however, some special reasons, specific to chemical analysis, that make the uncertainty methodology particularly appealing in analytical measurements. This is because of specific inaccuracy sources in an analytical procedure which are difficult to allow for otherwise. Two such sources, sampling and matrix effects, will be mentioned here, with an outline of the methods for their evaluation.

Sampling

Where sampling forms part of the analytical procedure, all operations in producing the laboratory sample, such as sampling proper, sample pre-treatment, carriage, and sub-sampling, require examination in order to be taken into account as possible sources contributing to the total uncertainty.

It is generally accepted that a reliable estimate of this uncertainty can be obtained empirically rather than theoretically. Accordingly, an appropriate methodology has been developed [e.g. 21, 22] aimed at separating the sampling contribution from the total variability of the measurement results in a specially designed experiment. This is not, however, the only way of quantifying uncertainty in sampling. Explicit use of scientific judgement is now equally approved when experimental data are unavailable. An illustrative example from the EURACHEM Guide (Ref. 19, Example A4) clearly demonstrates the potential of mathematical modelling of inhomogeneity as an alternative to the sampling assessment experiment.
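A minimal sketch of the empirical separation described above, in the spirit of the duplicate designs of [21, 22]: two samples are taken from each target and each sample is analysed twice; the balanced design and the data are assumptions for illustration only.

```python
# Separating sampling from analytical variability with duplicate
# samples and duplicate analyses (range-based variance estimates).

# data[target] = ((a1, a2), (b1, b2)): duplicate analyses of samples A, B
data = {
    "target1": ((10.2, 10.4), (11.0, 10.7)),
    "target2": (( 9.6,  9.9), ( 9.1,  9.4)),
    "target3": ((10.8, 10.5), (10.1, 10.3)),
}

pairs = [anal for samples in data.values() for anal in samples]
# analytical variance: an unbiased estimate from a duplicate pair is d^2/2
s2_anal = sum((a1 - a2) ** 2 / 2 for a1, a2 in pairs) / len(pairs)

means = [((a1 + a2) / 2, (b1 + b2) / 2) for (a1, a2), (b1, b2) in data.values()]
# variance of duplicate-sample means within a target = s2_samp + s2_anal/2
s2_means = sum((m1 - m2) ** 2 / 2 for m1, m2 in means) / len(means)
s2_samp = max(s2_means - s2_anal / 2, 0.0)

print(f"u(analysis) = {s2_anal ** 0.5:.2f}, u(sampling) = {s2_samp ** 0.5:.2f}")
```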

It is significant that, with the uncertainty methodology, both of the major analytical properties, "accuracy" and "representativeness" [23], on which the quality of analytical data relies, can be quantified and properly taken into account to give a single index of accuracy. This index expresses consistency between the measurement results and the true value that refers to a bulk sample of the material rather than to the test portion analysed.

Matrix effects

The problem of matrix mismatch is always attendant when one analyses an unknown sample "with the same matrix" using a fixed, previously determined calibration function. Not uncommonly, an analytical procedure is developed to cover a range of sample matrices in such a way that an "overall" calibration function can be used. An error due to matrix mismatch is therefore inevitable, if not necessarily significant. Commonly regarded as systematic for a sample with a particular matrix, the error becomes random when the population of samples to which the procedure applies is considered; this in fact constitutes an inherent part of the total variability associated with the analytical procedure.

Meanwhile, these effects are in no way included in the usual measures of accuracy as they result from a "method-performance study" conducted in accordance with the accepted protocols [24, 25]. The accuracy experiment defined by ISO 5725 (Ref. 24, Part 1, Section 4) does not presuppose any variable matrix-dependent contribution, being confined to identical test items. The underlying statistical model assumes that only laboratory components of bias and their distribution need be considered.

It is notable that such kinds of error sources are fairly treated using the concept of measurement uncertainty, which makes no distinction between "random" and "systematic". When simulated samples with known analyte content can be prepared, the effect of the matrix is a matter for direct investigation, in respect of its chemical composition as well as the physical properties that influence the result and may be at different levels for analytical samples and a calibration standard. It has long since been suggested in the examination of matrix effects [26, 27] that the influence of matrix factors be varied (at least) at two levels, corresponding to their upper and lower limits, in accordance with an appropriate experimental design. The results from such an experiment enable the main effects of the factors, and also the interaction effects, to be estimated as coefficients in a polynomial regression model, with the variance of matrix-induced error found by statistical analysis. This variance is simply the (squared) standard uncertainty we seek for the matrix effects.

In many ways, this approach is similar to ruggedness testing, aimed at the identification of operational (not matrix-related) conditions that are critical to the analytical performance.
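The two-level matrix-factor experiment can be sketched as follows (invented data; a full 2^2 design in coded units, with the further assumption that the matrix factors vary uniformly between their limits over the sample population):

```python
# Two matrix factors (e.g. fat and salt content) set at coded levels
# -1/+1 in a full 2^2 design; main and interaction effects estimated as
# regression coefficients, and the main effects' spread taken as the
# variance of the matrix-induced error.
import numpy as np

# design matrix columns: intercept, factor1, factor2, interaction
X = np.array([[1, -1, -1, +1],
              [1, +1, -1, -1],
              [1, -1, +1, -1],
              [1, +1, +1, +1]], dtype=float)
y = np.array([98.2, 101.5, 99.0, 103.1])  # recoveries, %

b, *_ = np.linalg.lstsq(X, y, rcond=None)  # b[1:] are effect coefficients

# If the coded factors vary at random between -1 and +1 over the sample
# population (uniform distribution, variance 1/3), the matrix-effect
# variance is approximately sum(b_i^2)/3 over the main effects
# (interaction neglected in this sketch).
u2_matrix = sum(bi ** 2 for bi in b[1:3]) / 3
print(f"effects: {b[1:].round(2)}, u(matrix) = {u2_matrix ** 0.5:.2f} %")
```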

“Method validation” in terms of measurement assurance

The concept of analytical procedure presented here offers a clear perspective on the problem of "method validation", which is an issue of great concern in quality matters. Validation is generally taken to mean a process of demonstrating that a methodology is suitable for its intended application. The question is how suitability should be assessed, based on customer needs.

It is commonly recommended [e.g. 2, 28-30] that a number of characteristics such as selectivity/specificity, limits of detection and quantitation, precision and bias, linearity and working ranges be considered as criteria of analytical performance and evaluated in the course of a validation study. In principle, they need to be compared to some standard; based on this, a judgement is made as to whether the procedure at issue is capable of meeting the specified analytical requirements, that is to say, whether the "method is fit-for-purpose" [28].

However, from the perspective of end-users of analytical results, what matters is only that the data be of the required quality and thus appropriate for their intended purpose. In other words, the matter of primary concern is the quality of analytical results as an end-product. In this respect, a procedure will be deemed suitable when the data produced are fit-for-purpose.

Fig. 1 Typical "constituents" to be specified within an analytical procedure, which ensures results with a known uncertainty

It follows that the common criteria of validation should be made more specific in terms of measurement assurance. It is (the index of) accuracy that requires overriding consideration among the characteristics of analytical performance if the quality of the results is primarily kept in mind. Other performance characteristics are desirable to ensure that a methodology is well-established and fully understood, but validation of an analytical procedure on those criteria seems impractical, also in view of the lack of corresponding requirements, as is commonly the case. (Strictly speaking, there is no validation unless a particular requirement has been set.)

We have every reason to consider the estimation of measurement uncertainty in an analytical procedure, followed by a judgement of compliance with a target uncertainty value, as a kind of validation. This is in full agreement with ISO 17025, which points to several ways of validation, among them "systematic assessment of the factors influencing the result" and "assessment of the uncertainty of the results..." [31]. In line with this is also a statistical modelling approach to the validation process that has recently been developed and exemplified as applied to in-house [32] and interlaboratory [33] validation studies.
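A minimal sketch of such a compliance judgement, assuming a coverage factor k = 2 and an externally set target expanded uncertainty (both values illustrative):

```python
# Validation as uncertainty assessment: the procedure is deemed fit for
# purpose if its expanded uncertainty does not exceed the target value
# derived from the customer's analytical requirement.

def validate_against_target(u_combined: float, u_target: float,
                            k: float = 2.0) -> bool:
    """Compare the expanded uncertainty U = k * u_c with the target."""
    return k * u_combined <= u_target

# Example: u_c = 0.9 mg/kg from the uncertainty budget; the requirement
# allows an expanded uncertainty of at most 2.5 mg/kg.
print(validate_against_target(0.9, 2.5))  # True: U = 1.8 <= 2.5
```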

A concrete example of such validation is worthy of notice. Certification (attestation) of analytical procedures used in regulated fields such as environmental control and safety is operative in the Russian state measurement assurance system as a process of establishing metrological properties and confirming their compliance with relevant requirements. (By metrological properties we mean herein the assigned measurement error characteristics, i.e. measurement uncertainty.) This is introduced by the Russian Federation state standard GOST R 8.563 [34], which also covers procedures for quantitative chemical analysis. This certification is, in fact, a legal metrology measure similar, to some extent, to pattern evaluation and approval of measuring instruments. Some scepticism concerning the efficiency of legal metrology practice in ensuring the quality of analytical measurements may be in order. Nevertheless, the conceptual (measurement assurance) basis of this approach to validation undoubtedly deserves attention.

Conclusions

This debate allows the following propositions to be made:

1. The term "analytical procedure", commonly used without reference to the quality of data, is best defined in terms of measurement (quality) assurance to explicitly include quality matters. This means a specified procedure which ensures results with an established accuracy.

2. The measurement uncertainty methodology neatly fits the idea of a specified measurement procedure and furthermore provides a tool for covering specific inaccuracy sources peculiar to analytical measurement. Uncertainty can be regarded as a single-number index of accuracy of an analytical procedure.

3. When an analytical procedure is so defined, uncertainty becomes the performance parameter that needs overriding consideration over and above all the others assessed during validation studies. This kind of validation gives a direct answer to the question of whether the data produced are of the required quality and thus appropriate for their intended use.

References
1. Holcombe D (1999) Accred Qual Assur 4:525-530
2. International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (1994) Text on validation of analytical procedures. ICH Quality topic Q2A: Definitions and terminology (http://www.ifpma.org/ich5q.html)
3. Cameron JM (1976) J Qual Technol 8:53-55
4. Currie LA (1978) Sources of error and the approach to accuracy in analytical chemistry. In: Kolthoff IM, Elving PE (eds) Treatise on analytical chemistry. Part I. Theory and practice, vol 1, 2nd edn. Wiley, New York, pp 95-242
5. Kaiser H (1978) Spectrochim Acta 33B:551-576
6. Kaiser H, Specker H (1956) Fresenius Z Anal Chem 149:46-66
7. Taylor JK (1983) Anal Chem 55:600A-604A, 608A
8. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology, 2nd edn. International Organization for Standardization (ISO), Geneva
9. Wilson AL (1970) Talanta 17:21-29
10. Doerffel K (1998) Fresenius J Anal Chem 361:393-394
11. Shaevich AB (1989) Fresenius Z Anal Chem 335:9-14
12. Terms, definitions, and symbols for metrological characteristics in analysis of substance (1975) Zh Anal Chim 30:2058-2063 (in Russian)
13. Youden WJ (1960) Anal Chem 32(13):23A-37A
14. Youden WJ (1961) Mat Res Stand 1:268-271
15. Zemelman MA (1991) Metrological foundations of technical measurements. Izdatelstvo standartov, Moscow (in Russian)
16. Eisenhart C (1963) J Res Nat Bur Stand 67C:161-187
17. Campion PJ, Burns JE, Williams A (1973) A code of practice for the detailed statement of accuracy. National Physical Laboratory, Her Majesty's Stationery Office, London
18. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
19. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn (http://www.eurachem.bam.de/guides/quam2.pdf)
20. Ellison SLR, Barwick VJ (1998) Analyst 123:1387-1392
21. Ramsey MH (1998) J Anal Atom Spectr 13:97-104
22. van der Veen AMH, Alink A (1998) Accred Qual Assur 3:20-26
23. Valcárcel M, Ríos A (1993) Anal Chem 65:781A-787A
24. ISO 5725 (1994) Accuracy (trueness and precision) of measurement methods and results. Parts 1-6. International Organization for Standardization, Geneva
25. IUPAC (1995) Protocol for the design, conduct and interpretation of method-performance studies. Pure Appl Chem 67:331-343
26. Makulov NA (1976) Zavod Lab 42:1457-1464 (in Russian)
27. Parczewski A, Rokosz A (1978) Chem Analityczna 23:225-230
28. EURACHEM (1998) The fitness for purpose of analytical methods. A laboratory guide to method validation and related topics. LGC, Teddington
29. Wegsheider W (1996) Validation of analytical methods. In: Günzler H (ed) Accreditation and quality assurance in analytical chemistry. Springer, Berlin, pp 135-158
30. Bruce P, Minkkinen P, Riekkola M-L (1998) Mikrochim Acta 128:93-106
31. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration laboratories. International Organization for Standardization, Geneva
32. Jülicher B, Gowik P, Uhlig S (1999) Analyst 124:537-545
33. van der Voet H, van Rhijn JA, van de Wiel HJ (1999) Anal Chim Acta 391:159-171
34. GOST R 8.563-96 State system for ensuring the uniformity of measurements. Procedures of measurements. Gosstandart of Russia, Moscow (in Russian)

R. Kadis ()
D. I. Mendeleyev Institute for Metrology, 19 Moskovsky pr., 198005 St. Petersburg, Russia
e-mail: [email protected]
Tel.: +7-812-3239644
Fax: +7-812-3279776


Accred Qual Assur (2002) 7:299-304
DOI 10.1007/s00769-002-0494-7
© Springer-Verlag 2002

A. Leclercq

Flexibilization of the scope of accreditation: an important asset for food and water microbiology laboratories

POLICIES AND CONCEPTS

Abstract Beltest, the Belgian accreditation body, has investigated flexibilization of the scope of accreditation for chemistry laboratories and for food and water microbiology laboratories. This flexibilization, synonymous with test-type accreditation, allows a laboratory to add new test methods or re-try previous test methods without having to undergo a new audit by Beltest. It has been used for nearly ten years by German and Swiss accreditation bodies. Flexibilization permits the validation of methods and results, given that the competence of the particular laboratory is already well established. This new concept in microbiology allows clients' needs to be adequately met, and facilitates the quick establishment of a method in several laboratories at once in case of a public health crisis. The first laboratory to participate in this investigation of the flexibilization concept, as a test of the concept, was the Belgian reference laboratory for food microbiology.

Keywords Accreditation · Food and water microbiology · Validation · Type of tests

Accreditation of Belgian laboratories is based on proof that they conform, as regards documentation and practices, to the requirements established in the standard guideline NBN-EN 45001 [1] and, if necessary, to applicable requirements of sector-related or national accreditation agencies [2, 3, 4].

The NBN-EN 45001 standard has been replaced by the more precise standard NBN-EN-ISO-CEI 17025 [5], approved in December 1999 with a phase-in period of two years for application, which takes into consideration the importance of continuous quality assurance enhancement. In the same way, certification of quality systems based on standards of the NBN-EN-ISO 9000 series [6, 7], which evaluated only the capacity of laboratories to establish a quality control system, is changing over to standards based on the NBN-EN-ISO 9001 (version 2000) standard [8], which also takes continuous quality assurance enhancement into account. The process of development of the NBN-EN-ISO-CEI 17025 standard incorporated all the requirements of NBN-EN-ISO 9001 [6] and 9002 [7] that were relevant to the scope of testing services covered by the laboratory's quality system. These requirements of NBN-EN-ISO 9001 and 9002 should be replaced by requirements of the NBN-EN-ISO 9001 (version 2000) standard during revision of the NBN-EN-ISO-CEI 17025 standard. Nevertheless, accreditation requires not only an evaluation of quality systems, but also an evaluation of the technical competence to perform specific tests.

The heart of the accreditation process is a scope of accreditation which is well defined and which avoids ambiguity or multiple interpretations. Accreditation must constitute a credible attestation of the qualifications of a laboratory [4, 9, 10]. This scope of accreditation can affect in many important ways the development of the technical competence of a laboratory. At the same time, though, laboratories need to be able to use test methods that meet the growing needs of clients.

As regards food and water microbiology, recent outbreaks in Europe have made apparent the necessity for official inspection agencies or ministry departments, and for food industries or water agencies, to get laboratories accredited quickly to perform tests in accordance with new test parameters, or to apply current methods to new ranges of food products. Accredited laboratories must be able to develop and validate new methods, or to derive methods from standards. Official recognition of these new test methods, or their official incorporation into the scope of accreditation, takes a long time, and audits are required for validation of these test methods.

A new concept, "flexibilization of the scope of accreditation" (also "accreditation of types of tests" or "scope in testing"), has emerged [11] in microbiology. It has been used by German and Swiss accreditation bodies for nearly ten years in other fields. Under this concept, an accredited laboratory which has shown solid technical competence in the past could, after establishment of a validation file, directly incorporate a newly developed test method or application into its scope of accreditation without a separate specific inspection. This laboratory could also at any moment re-try new test methods or applications. Surveillance audits then confirm (or fail to confirm) the applicability of the given test within the scope of accreditation. This is a modification of the requirements resulting from accreditation stipulated in paragraph 7, section b of the NBN-EN 45001 standard, and it is complementary to paragraphs 1.6, 5.4.3 and 5.4.4 of the NBN-EN-ISO-CEI 17025 standard. As regards food and water laboratories, three principal headings were drawn up: formulation of the scope of accreditation, flexibility with regard to the food products tested, and validation of test methods and results.

Definition of scope of accreditation

Accreditation by individual tests

The area of application of an accreditation, or scope of accreditation, must be clearly defined by reference to one or more tests or types of tests and, if applicable, test items (NBN-EN 45002 [12]). Accreditation takes into account only precise individual tests as described with reference to the particular products analyzed, to determinate parameters, according to the test method that is prescribed, and also with reference to the technical procedure for carrying out a test or for implementing a particular standard (NBN-EN 45001).

The scope of accreditation thus refers to a detailed list of individual tests in conformity with accreditation references and reflects the situation of a given laboratory at the time when an audit is conducted. Each test is specific to a product or type of products analyzed, to a set of parameters measured, to a range of measurement applicable to this particular test, and to the test methods used by a given laboratory for this particular test. "Product" as a concept could also be expressed as a "range of products". Accreditation for a test method could cover methods considered as equivalent if they are established via identical methodology.

This approach is clear, strict, and comprehensible to all clients. Laboratories which perform the same functions over and over, and which do not change the scope of their activity, can by this means have their competence clearly established for potential clients. Due to its rigidity, however, any significant modification under the "scope of accreditation" concept would require an extensive audit by the appropriate accreditation agency.

The concept of "scope of accreditation" presents differences which can be categorized in various ways with regard to customer satisfaction. Customer demands must be aligned with the real accreditation needs in scientific terms. Figure 1 illustrates the relationship between customer demand, real need and the availability (supply) of laboratory facilities, which applies to both fixed and flexible scopes of accreditation. For a fixed scope of accreditation, area A represents laboratory capacity in which quality is controlled; here the scope of accreditation is ideal. Area B represents capacity corresponding to the need and not to the demand; this is something that can improve customer satisfaction. The scope of accreditation offers new parameters for clients that could also offer them commercial advantages; this is an important part of the concept of flexible scope of accreditation. Area C is the part of the market where the concept fails to meet specifications and so fails to satisfy customers: the scope of accreditation does not cover certain tests that may be required by some clients. Area D represents excessive quality control. Area E represents unnecessary or unsatisfied requirements: customers who request things they do not need. Area F represents shortcomings on the part of the accredited laboratory; improvements are necessary to satisfy potential needs. This is another important part of the concept of flexible scope of accreditation. Area G represents scope parameters that are unnecessary (sub-quality, possible gain for the laboratory): laboratories have a scope of accreditation constituted through analyses not necessary for their market. Under a flexible scope of accreditation, the categories of need, demand and supply tend to merge: area A becomes predominant.

Accreditation by types of tests or flexible scope

Recently, Beltest, the Belgian accreditation agency, investigated a new concept called "flexibilization of the scope of accreditation" [11], which is an accreditation bearing on types of tests (e.g. enumeration of microorganisms) with reference to a test method (e.g. enumeration of Clostridium perfringens) and a test field (e.g. all foods, water), mainly associated with a product or a range of products. Flexibilization of the scope of accreditation is understood to comprise all accreditation procedures not directed exclusively at the accreditation of individual test methods [13]. A "test type" is defined as a group of tests characterized by the utilization of similar methods (including sample preparation, standardization, calibration and validation principles, and training fundamentals) applied in a particular sector of a testing field. Types of tests may be defined on a technology/methodology-related or application-related basis. For each type of tests, the following must be specified: the internal reference of the laboratory, the testing field(s) or products or range of products involved, the test method(s) used, and the eventual dates of beginning and end of the accreditation procedures in regard to the given test. The item(s) being tested could also be specified in some cases. A simplified example of flexible-scope accreditation is given in Table 1. The definitions of flexible-scope accreditation will vary from country to country and from sector to sector, depending on established procedures in the various sectors and countries, and on the requirements and needs of important customers of the laboratory.

Fig. 1 Relationship between the needs and demands of clients and the supply of the laboratory, for fixed and flexible scopes of accreditation

Table 1 Example of presentation of a flexible scope of accreditation

Internal code / test field | Samples (products, materials or type of activity tested) | Measured principle: type of tests (range of properties measured) | Description: test method (method, measurement, standard, validated in-house method) | Accredited (from, to)
FLEX 1 | All foods and water, except waste water | Enumeration of microorganismsa | Culture |
P20/I3.2. | All foods | Enumeration of total coliforms | NF V08-050b |
P20/I10.2. | All foods | Enumeration of Clostridium perfringens | NF V08-056 |
FLEX 2 | Water, except waste water | Enumeration of microorganismsa | Culture after filtration |
P20/I2.5 | Water | Enumeration of presumed Escherichia coli | ISO 9308/1 |
P20/I15.2 | Water | Enumeration of Pseudomonas aeruginosa | NF-T-90-421 |
FLEX 3 | All foods | Detection of microorganismsa | Culture |
P20/I49.1 | All foods | Detection of Salmonella spp | ISO 6579 |
P20/I49.2 | All foods | Detection of Listeria monocytogenes | ISO 11290-1 |
FLEX 4 | All foods | Detection of microorganismsa | Immunodiagnostic |
P20/I55.1 | All foods | Detection of Salmonella spp | AFNOR-BIO-12/1-04/94 Vidas Salmonella |
P20/I55.3 | All foods | Detection of Escherichia coli O157 | AFNOR-DYN-16/1-03/96 Dynabeads anti Escherichia coli O157 |

a Type of tests.
b NF: French standard; ISO: International Organization for Standardization; AFNOR: French association for standardization; BIO: bioMérieux; DYN: Dynal


This flexibility could take different forms, as indicated in Table 1:

1. Fixed scope with footnotes to the measured-principle column, indicating that no modification of the list of accredited methods is allowed; and flexible scope with footnotes to the measured-principle column ("Optimization of given test methods allowed (adaptation to clients' needs, new editions of test standards)", or "Development of additional test methods within the accredited types of tests allowed"), indicating the flexibility levels beyond the fixed scope.

2. Fixed scope parts are indicated by lists of accredited methods, and flexible scope parts are indicated by reference to documented in-house methods and procedures.

3. Fixed scope parts are given by lists of test methods. Flexible parts are indicated by reference to the type of tests. The rules of the accreditation body require competence in a minimum number of testing methods for the accreditation of "types of tests".

4. Fixed scope parts are given by lists of test methods. Flexible scopes are given by short lists of standards and in-house methods followed by the sentence: "Within the indicated test area the laboratory may modify, improve and newly develop test methods without prior information and consent of the accreditation body. The test methods given are examples only" [14].

At all times, the laboratory being accredited must maintain a current listing of the individual validations that have been made for a type of tests, available on request from the appropriate accreditation agency or its inspectors, and for external queries, i.e. from clients. Laboratories may introduce new test methods or modified methods, within the limits prescribed for the accredited type of tests, without the necessity of obtaining approval from the accreditation authority in each individual case. It might also be possible to obtain confirmation of competence with regard to non-routine tests on the basis of general procedures. A technical annex to the accreditation certificate could refer simultaneously to individual tests and types of tests. An extension and/or a surveillance audit could integrate, definitively or not, the new tests or types of tests validated by the laboratory. Information about modifications is provided at fixed surveillance intervals. All additions of new types of tests, testing fields and new test methods outside the accredited types of tests in the scope of accreditation must be reported to the accreditation authority as part of an extension.

The NBN-EN-ISO-CEI 17025 standard states in its paragraph 1.6 that "If testing and calibration laboratories comply with the requirements of this International Standard, they will operate a quality system for their testing and calibration activities that also meets the requirements of ISO 9001 when they engage in the design/development of new methods, and/or develop test programs combining standard and non-standard test and calibration methods, and ISO 9002 when they only use standard methods" [5]. This correspondence may be mentioned in the accreditation certificate or in its relevant annexes.

So, "flexibility of the scope of accreditation" means a group of measures taken so as not to limit accreditation to individual tests. It addresses testing proficiency and is not generally intended for research and/or product development laboratories, unless specified by a client and/or proficiency in a test method is critical to these functions.

Requirements and assessment of laboratory

Requirements: accreditation body aspect

A laboratory may be eligible for a flexible scope of accreditation, or accreditation by type of tests, only if it is accredited for individual tests and if it has passed at least one surveillance audit. Flexibility may be applied to test methods and/or to the types of products analyzed.

Accreditation by type of tests must satisfy particular requirements defined by a Beltest guide [11] and a demonstration by the given laboratory of its performance with regard to each type of tests. For each type of tests, the performance of the given laboratory in the following respects must be submitted to an appropriate accreditation agency:

1. A sufficient number of different individual tests, at least six, concerning the type of tests, and the corresponding validation or verification reports (in certain exceptional cases these could be routine tests, but in such a case a supplemental validation or verification procedure must be carried out);

2. General procedures of validation or verification applicable to the type of tests concerned;

3. Records for the corresponding validation or verification.

A list of the test methods currently covered by the existing accreditation must be available at all times in the testing laboratory on request of the accreditation body. As part of the monitoring procedure, this list must be submitted to the accreditation authority with the new or modified test methods identified.

Requirements: laboratory aspect

For a food or water laboratory, each type of tests which is to be accredited must satisfy several types of requirements:

1. Each test included in a given type of tests must be submitted to a validation procedure applied to the implementation of the methods in the laboratory, and must be regularly evaluated with triply redundant verification procedures;

2. When laboratories use outdated versions of standards, or fail to mention the version date of standards in their statement of accreditation scope, they must revise their procedures for testing (revise instructions) and train technicians in the new procedures for testing according to the new standards within a fixed short period after publication of the standards;

3. When the flexible-scope accreditation bears on methods for "all foods" or "all waters", laboratories must exercise strict control during validation of test results and at reception of samples. Test results depend completely upon the sampling carried out by the laboratory, and on the first steps of manipulation of the sample before the real test application. Validation of the implementation of an accredited test method is made with divergent types of matrix representing the main activities of the given laboratory. The laboratory must establish and scientifically document the proper handling of all the different types of food samples delivered by clients. Upon reception of a non-conventional sample, technicians must have definite procedural instructions for handling the sample. This step is necessary for the correct identification of the sample. Also, the availability of the sample material and the competence of the technician must be verified by an appropriate accrediting agency.

Assessment

Several points which arise during the preparation of accreditation files in accordance with this new concept, or which arise during inspection, must be reviewed:

1. Organizational aspects that the laboratory must monitor in relation to validation or verification of new or modified test methods;

2. Competence, experience and training (present and future) of staff (particularly of senior technical and management staff) in statistics and in relation to the particular type of testing and the particular products concerned;


3. Performance level of equipment;
4. Test procedures and/or instructions;
5. Performance level of the quality control system;
6. Recording of completed validation procedures;
7. Conformity level, by comparison with the accreditation requirements, as shown by the results of the most recent audits (and particularly control of technical aspects);
8. Results of participation in ring tests or collaborative trials;
9. Management specifications in cases of less frequent testing.

Inspection by an official accreditation agency must be based on a selection of certain test methods from both quantitative and qualitative points of view. This selection takes into consideration:

1. Evidence of the implementation of the quality management system, experience, capability and, if any, modification/development of testing methods;
2. The technical complexity of the tests;
3. The possible consequences of errors (possible risks) in the tests;
4. The frequency of use of the test methods;
5. The ratio of routine tests (standard methods) to non-routine tests (special specifications from clients, in-house methods, ...);
6. The ratio of complete observations of test performance and checks of test reports to inspections of test facilities.

The range of audited activities could make it possible to attest the capacity of a given laboratory to introduce new test methods or to modify current methods. However, the testing laboratory must not be made to incur unnecessary costs.

Validation of methods and results: the main point of the flexible scope

Laboratories which conform to the NBN-EN 45001 (paragraphs 5.4.1, 5.4.2, 6.2 and 6.3) and NBN-EN-ISO-CEI 17025 (paragraph 5.4.5) standards should use validated methods for testing, methods which produce reliable results; part of the results validation procedure includes an obligation to demonstrate the capacity to carry out this procedure.

Validation of methods

Validation is defined as confirmation by examination, including the provision of objective evidence, that the particular requirements for a specific intended use are fulfilled. It may include procedures of sampling, handling and/or transportation. The procedures themselves, and any assignment of tasks such as development, implementation and validation of new or in-house test methods, should be laid out in detail, by a person experienced in the documentation of quality control, in a flow-chart format. For complex methods such procedures may, at least to some extent, end up in project management plans.

This validation procedure must confirm that the method under examination is adapted to the test context, and is continuously in accordance with the client's requirements or needs, as regards the following parameters: trueness, precision (reproducibility and/or repeatability), uncertainty of the results, measurement range comprising limits of detection and/or quantification, specificity, and robustness against external influences and/or cross-sensitivity against interference from the matrix of the sample. Qualitative methods must satisfy the parameters of the matrix effect (and interference effect), which are very important when a flexible scope of accreditation is applied, because of the type of tests: that is, when "all foods" must be controlled. The level of validation is evaluated depending on the particular type of method used, and on the results. But the following must be assumed:

1. Standardized methods of testing (and methods validated by an official validation agency), used strictly (without adaptation of test parameters) and limited to specified tasks (type of products tested, range of measurement), do not require individual validation;

2. Modifications of standardized methods of testing (modification of test parameters and/or of specified tasks) must be validated with regard to the modified parameters;

3. Non-standard methods (published methods, new test methods, commercialized test systems, kits, ...) or laboratory-designed/developed methods must be thoroughly validated.

The laboratory shall record the results [15] obtained in validation testing, the procedure used for validation, and a statement as to whether the method is appropriate for the intended use.

Certain official methods which are not considered in this list present some problems as regards validation. These problems frequently occur because of a failure to use standardized methods, or from simplification of standardized methods, sanctioned by an expert consensus or written into certain regulations. Their official appearance seems to influence accreditation bureaus to consider standardized and official methods of validation as equivalent. In many cases, these official methods were originally established in emergency situations of national crisis (outbreak, epidemic, etc.). In the majority of cases, these official methods lack validation procedures. Onerous standardized validation procedures may be circumvented. In such cases laboratories must demonstrate at a minimum the qualitative recovery of a microorganism for enrichment procedures, which are replicated, and the quantitative recovery of a microorganism for an enumeration procedure carried out in replicate.

Validation of testing methods always involves a balance between costs, risks and the technical capabilities of a given laboratory (NBN-EN-ISO-CEI 17025, 1999). Laboratory administrators must establish minimum quality requirements before starting the process of validation and implementation, or before starting the whole development process. Procedures of validation for a small laboratory could not be the same as for the commercial validation of a new method, whether or not new kits from a manufacturer are used. There are many cases in which the range and uncertainty of the test result values can only be given in a simplified way due to lack of information.

Validation and verification of results

Results verification is totally different from results validation. Results validation (points 4.7.5 and 5.9 of the NBN-EN-ISO-CEI 17025 standard) shows, each year or whenever judged necessary, that a given laboratory has the capacity to apply a particular method repetitively, with respect to the data obtained during initial validation. Trueness and the statistical dispersion of results are the basis of the definition of the uncertainty of measurement [16] and, in some cases, the basis for the definition of the limits of detection and quantification. Management of data from validation results, as control charts, could permit the detection and control of eventual deviations. Validation of results is the internal quality control procedure which verifies the stability of performance of the methods for which accreditation is sought, in the limited-scope procedural context.
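As a sketch of the control-chart idea mentioned above (a conventional Shewhart-type chart with 2s warning and 3s action limits; the centre line and data are invented for illustration):

```python
# Results obtained on a stable control material are tracked against
# warning and action limits derived from the initial validation data,
# so that eventual deviation of method performance can be detected.

mean, s = 50.0, 1.2  # centre line and standard deviation from validation
warning = (mean - 2 * s, mean + 2 * s)  # ~95 % warning limits
action = (mean - 3 * s, mean + 3 * s)   # ~99.7 % action limits

for result in [50.3, 49.1, 52.6, 54.1]:  # routine control results
    if not action[0] <= result <= action[1]:
        status = "action: method out of control"
    elif not warning[0] <= result <= warning[1]:
        status = "warning"
    else:
        status = "in control"
    print(f"{result:5.1f}  {status}")
```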

Guideline for a validation scheme

Table 2 shows the requirements for validation of quantitative methods; these could be simplified in the case of qualitative methods.

So the methods characteristic of each test, which taken together comprise a type of tests, must undergo validation testing of their results. This is the implementation of the method, and the establishment of a standard for its performance. For the standardization of quantitative methods, this consists at a minimum of a determination of trueness, where blank utilization, certified reference materials (or reference materials, or spiking materials) or collaborative trials are used; repeatability (r), with repetition


Table 2 Validation parameters required for quantitative or qualitative test methods

Methods | Trueness | Repeatability | Intralaboratory reproducibility | Limit of detection | Limit of quantitation | Relative accuracy to official methods | Matrix effect | Linearity | Selectivity | Specificity | Sensibility | Robustness
New method | +a | + | + | (+)b | (+) | (+) | (+) | (+) | + | + | + | +
Classic method | + | + | + | (+) | (+) | + | (+) | -c | - | - | - | -
Modified method | + | + | + | (+) | (+) | (+) | (+) | (+) | (+) | (+) | (+) | (+)

a Always executed. b Executed if technically justified. c Not executed

or reference materials utilization; and reproducibility (R), with collaborative trials or certified reference material (or reference materials). For the standardization of qualitative methods, the minimum requisite is the determination of limits of detection, with analysis of checked samples near this limit. These results must be reported, based on a significant number of tests, and discussed in a validation report, which gives positive agreement for including new methods of testing as part of an accreditation program.
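For the quantitative case, here is a sketch of how r and R might be derived from a small collaborative trial in the manner of ISO 5725 (duplicate results from four laboratories, invented for illustration):

```python
# Repeatability variance = pooled within-laboratory variance; the
# between-laboratory component comes from the scatter of laboratory
# means; r and R are the conventional 2.8*s limits for two results.
from statistics import mean, variance

labs = [[4.2, 4.4], [4.0, 4.1], [4.6, 4.3], [4.1, 4.0]]  # duplicates per lab
n = 2  # replicates per laboratory

s2_r = mean((a - b) ** 2 / 2 for a, b in labs)  # within-lab variance
lab_means = [mean(res) for res in labs]
s2_means = variance(lab_means)                  # variance of lab means
s2_L = max(s2_means - s2_r / n, 0.0)            # between-lab component
s2_R = s2_L + s2_r                              # reproducibility variance

r, R = 2.8 * s2_r ** 0.5, 2.8 * s2_R ** 0.5
print(f"r = {r:.2f}, R = {R:.2f}")
```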

At the same time, internal quality control must be carried out to verify the stability of the limited-scope performance of the method. Triply redundant verification methods are carried out as controls of the first, second and/or third line, each set of methods being applied to a particular type of tests. The first line of verification involves pro forma repetition of all the steps of the test, in order to establish repeatability or reproducibility for quantitative tests, and to verify the range of sensitivity or detection for qualitative tests. This verification is performed by the experimenter himself, as part of the proper performance of the test. A second line of verification is put into operation by administrative decision, and includes testing with "blind" samples, repetition of samples, internal audit procedures, etc. If necessary, a third line of verification can be set up by the use of certified reference materials (or spiking materials), or through collaborative trials. These procedures are based on external cooperation. External audit procedures and complaint-handling procedures are also part of this third line of verification.

An annual schedule of validation of methods for each type of tests could be established as regards limited-scope accreditation. Some test methods similar among themselves may allow the use of previously established validation procedures, such as those used for reference testing, or those used for the establishment of routine standards for the enumeration of aerobic mesophilic microorganisms. The schedule could differentiate between methods that can be validated during collaborative trials in which the laboratory participates, and methods where substrates must be used as reference materials, if available.

Conclusion

Accreditation with a flexible scope allows laboratories that receive requests for evaluation and analysis to quickly provide customers with a service adapted to their actual needs (Fig. 2), and provides food and water microbiology authorities with new test methods which can be brought rapidly to bear in cases of public health crisis.

Its implementation is easy in small countries due to the centralization of management and the ease of surveillance of this new form of accreditation by a national accreditation body. Due to the long-standing establishment of validation of food microbiological methods in some countries, the beginning of validation of water microbiological methods, and the establishment of European standards for the validation of food microbiological test methods (Microval project), the flexible-scope type of accreditation must be adapted to certain national requirements, but could prove less complex than the standard requirements.

Acknowledgements I thank Doctor Yasmine Ghafir (Université Libre de Liège, Liège) and Nicole Vanlaethem (Beltest, Brussels) for their help.

Fig. 2 Testing laboratory, accreditation requirements and the client-laboratory relationship


References
1. CEN (1989a) NBN-EN 45001 - General criteria for the operation of testing laboratories. CEN, Brussels
2. Beltest (1991) Critères généraux et lignes directrices à l'usage des laboratoires candidats à une accréditation. Document L1/F-REV0. Beltest, Brussels
3. Eurachem (1996) Accreditation for laboratories performing microbiological testing. European cooperation for accreditation of laboratories, EAL-G18. Eurachem
4. Wilson B, Weir G (1995) Food and drink laboratory accreditation, a practical approach. Chapman & Hall, London
5. ISO (1999) NBN-EN-ISO-CEI 17025 - General requirements for the competence of testing and calibration laboratories. ISO, Geneva
6. ISO (1994) NBN-EN-ISO 9001 - Quality systems - Model for quality assurance in design/development, production, installation and servicing. ISO, Geneva
7. ISO (1994) NBN-EN-ISO 9002 - Quality systems - Model for quality assurance in production, installation and servicing. ISO, Geneva
8. ISO (2000) NBN-EN-ISO 9001:2000 - Quality management systems - Requirements. ISO, Geneva
9. Flowers RS, Gecan JS, Pusch DJ (1992) Laboratory quality assurance. In: Vanderzant C, Splittstoesser DF (eds) Compendium of methods for the microbiological examination of foods. APHA, Washington, DC
10. Lightfoot NF, Maier EA (1998) Microbiological analysis of food and water. Guidelines for quality assurance. Elsevier, Amsterdam
11. Beltest (2000) I015 - REV.0 - Critères auxquels doivent répondre les laboratoires accrédités demandeurs d'un domaine d'application flexible pour les analyses de pesticides. Beltest, Brussels
12. CEN (1989b) NBN-EN 45002 - General criteria for the assessment of testing laboratories. CEN, Brussels
13. Steinhorst A (1998) Accred Qual Assur 3:295
14. ILAC (2000) Draft 6 - The scope of accreditation and consideration of methods and criteria for the assessment of the scope in testing. ILAC, Rhodes, Australia
15. Beltest (1996) Lignes directrices relatives à la traçabilité des mesures. Document L8/REV0. Beltest, Brussels
16. ISO (1994) ISO 5725-1. Exactitude (justesse et fidélité) des résultats et méthodes de mesure - Partie 1: Principes généraux et définitions. ISO, Geneva

A. Leclercq ()
Institut Pasteur de Lille, Department of Food Microbiology and Safety, Rue du Professeur Calmette 1, 59000 Lille, France
e-mail: [email protected]
Tel.: +33-3-20-43-89-23
Fax: +33-3-20-43-89-26


Relevant legislation, legislation plans and related initiatives in different countries and continents

Accred Qual Assur (1998) 3:211-214
© Springer-Verlag 1998

H. Hey

Different approaches to legal requirements on validation of test methods for official control of foodstuffs and of veterinary drug residues and element contaminants in the EC

Abstract In order to ensure food consumer protection, as well as to avoid barriers to trade and unnecessary duplication of laboratory tests and to gain mutual recognition of results of analyses, the quality of laboratories and test results has to be guaranteed. For this purpose, the EC Council and the Commission have introduced provisions
- on measures for quality assurance for official laboratories concerning the analyses of foodstuffs on the one hand and animals and fresh meat on the other,
- on the validation of test methods to obtain results of sufficient accuracy.
This article deals with legal requirements in the European Union on basic principles of laboratory quality assurance for official notification to the EC Commission and on method validation concerning official laboratories. Widespread discussions and activities on measurement uncertainty are in progress, and the European validation standards for official purposes may serve as a basis for world-wide efforts on quality harmonization of analytical results. Although much time has already been spent, definitions and requirements have to be revised and further additions have to be made.

Key words Laboratories for official control of foodstuffs, veterinary drug residues and element contaminants · Validation of test methods · Legal requirements

Laboratories for official control of foodstuffs, veterinary drug residues and element contaminants

Official laboratories testing foodstuffs and stockfarming products of animal origin have to meet requirements of two separate and distinct legislative areas, including analytical quality assurance, in the EC. The first area concerns the official control of foodstuffs. It falls within the remit of the Directorate General III (DG III), Internal Market, and refers to Directives 89/397/EEC [1] and 93/99/EEC [2]. The second area is within the responsibility of the Directorate General VI (DG VI), Agriculture, and refers to a series of directives which until May 1996 regulated the prohibition of the use of certain substances having a pharmacological action and which are used in stockfarming to promote growth and productivity in livestock or for therapeutic purposes.

 | DG III, Internal Market (official food control) | DG VI, Agriculture (veterinary drug and element contaminant control)
Legislation | Preferably horizontal | Preferably vertical
Laboratory quality assurance system for notification to the EC Commission | Yes, Art. 3 of Directive 93/99/EEC: EN 45000 series, additional measures of OECD GLP principles | No
Methods of analysis | Preferably optional to the laboratories, regulations on validation | Preferably method standardisation, additional regulations on validation (going further than DG III)
Reference laboratory system | No | Yes
Regulations on proficiency testing | Yes, Art. 3 of Directive 93/99/EEC, International Harmonized Protocol [5] (according to Council minutes [6]) | No (indirect: collaborative trials, managed by reference laboratories)

In order to harmonize the legal regulations of the past on residues of illegal substances, a number of provisions required clarification in the interests of the effective application of control and residue detection in the Community. With a view to ensuring the immediate and uniform application of the official controls, the rules of the past and their amendments were assembled in the Council Directive 96/22/EC [3].

It is the purpose of the official food control, of the veterinary drug control and of the element contaminant control to pursue the free movement of foodstuffs, living animals and fresh meat within the EU and to guarantee safe products without health hazards. Additionally, the official food control shall prevent fraud on consumers, whereas a main task of veterinary drug control (official and self control) is to compel the stockfarming industry to take greater responsibility for the quality and safety of meat and to guarantee even odds. This is the reason why Art. 31 of 96/23/EC [4] provides for the charging of a fee to cover the official control measures on drug and element contaminant analysis.

Unfortunately, DG III and DG VI follow totally different approaches to legislation, laboratory quality assurance, and test method or test result validation, as the comparison above shows.

The official status of laboratories in food control is defined in Art. 7 of the Council Directive 89/397/EEC, which refers to both "first opinion" and "second opinion" (or referee) laboratories. "First opinion" stands for laboratories of the official food control authorities in the European Union. "Second opinion" means private third-party laboratories which undertake analyses on a "second part of samples" from enterprises which have been subject to official inspection and sampling by the enforcement authorities. This status is laid down in the basic food law of some EU member states, and the laboratories have to be officially acknowledged to the EC Commission not later than 31 October 1998. Their function is to enable food companies to defend their legitimate claims against the food enforcement authorities. The private laboratory community is not quite familiar with this clarification of the EU Commission and the EU Council in the Council minutes [6] on Art. 3 of the Council Directive 93/99/EEC.

First and second opinion institutions for the purpose of official food analysis have to be notified to the Commission as laboratories referred to in Art. 7 of the Directive 89/397/EEC. They have to meet the assessment criteria laid down in Art. 3 § 1 and § 2 of the Directive 93/99/EEC. The regulations consist of the general criteria for the operation of testing laboratories laid down in European Standard EN 45001, supplemented by standard operating procedures and the random audit of their compliance by quality assurance personnel. These supplements are in accordance with the OECD principles nos. 2 and 7 of good laboratory practice as set out in Section II of Annex 2 of the Decision of the Council of the OECD of 12 May 1981 concerning the mutual acceptance of data in the assessment of chemicals [7]. The measures and administrative provisions necessary to comply with this legal requirement have to be met before 1 November 1998. Accreditation pursuant to EN 45001 in its valid status alone does not meet the supplements (standard operating procedures and random audit by quality assurance personnel not involved in testing) required of official food and second opinion laboratories as mentioned above.

Concerning laboratories for the analysis of veterinary drug residues and element contaminants in animals and fresh meat, there are no detailed regulations on the assessment of laboratory quality assurance, in contrast to official food control laboratories; but Annex V, Chapter 2, no. 1b) of the Council Directive 96/23/EC refers to functions and activities of Community reference laboratories in this field. Function no. 1b) includes the support of the national residue reference laboratories in building up an appropriate system of quality assurance, which is based on the general criteria of good laboratory practice (GLP principles) and the standards series EN 45000. This system can be regarded as analogous to the legal requirements laid down for official food control laboratories in Art. 3 of Directive 93/99/EEC. So far, in contrast to official food laboratories, there are no legally binding provisions for the assessment and notification of official veterinary drug and element contaminant laboratories to the EC Commission. But if the acknowledgement of these laboratories is to become mandatory in the future, the basic system of quality assurance seems to be quite clear: it will consist of some elements of the OECD principles of GLP and of the standard series EN 45000. Besides the general principles of laboratory organization and quality assurance, there are detailed legal EC regulations on the validation of methods referring to both kinds of laboratories.

Validation requirements concerning official food control laboratories

Art. 4 of the Directive 93/99/EEC requires the member states of the EU to ensure the validation of methods of analysis used within the context of official control of foodstuffs "whenever possible". For this purpose, the laboratories have to comply with the provisions of paragraphs 1 and 2 of the Annex to Council Directive 85/591/EEC [8] concerning the introduction of Community methods of sampling and analysis for the monitoring of foodstuffs intended for human consumption.

In detail, paragraphs 1 and 2 of the Annex demand:

1. Methods of analysis which are to be considered for adoption under the provisions of the Directive shall be examined with respect to the following criteria:
(a) specificity,
(b) accuracy,
(c) precision; repeatability intra-laboratory (within laboratory) and reproducibility interlaboratory (within and between laboratories) variabilities,
(d) limit of detection,
(e) sensitivity (which means the slope of the calibration curve, and which is often mixed up with the level of the detection limit),
(f) practicability and applicability,
(g) other criteria which may be selected as required.

2. The precision values referred to in 1(c) shall be obtained from a collaborative trial which has been conducted in accordance with an internationally recognized protocol on collaborative trials [e.g. International Organization for Standardization, "Precision of Test Methods" (ISO 5725/1981)]. The repeatability and reproducibility values shall be expressed in an internationally recognized form (e.g. the 95% confidence intervals as defined by ISO 5725/1981). The results from the collaborative trial shall be published or freely available.

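As a worked note (not part of the Directive): the 95% limits referred to in ISO 5725 are conventionally derived from the repeatability and reproducibility standard deviations s_r and s_R obtained in the collaborative trial. For normally distributed results, the absolute difference between two single results is expected to exceed the repeatability limit r (or the reproducibility limit R) in only about 5% of cases:

    r = 1.96 · √2 · s_r ≈ 2.8 · s_r        R = 1.96 · √2 · s_R ≈ 2.8 · s_R

where √2 accounts for the difference of two results and 1.96 is the two-sided 95% quantile of the normal distribution.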
Validation requirements concerning official control laboratories on veterinary drug residues and element contaminants

Validation criteria referring to official laboratories on the control of veterinary drug residues are included in the following Commission Decisions:
– 93/256/EEC [9], which lays down the methods to be used for detecting residues of substances having a hormonal or a thyreostatic action (immunoassay, thin-layer chromatography, liquid chromatography, gas chromatography, mass spectrometry, spectrometry or other methods),
– 90/515/EEC [10], which lays down reference methods to be used for detecting heavy metals and arsenic.

Commission Decisions are strictly mandatory for those laboratories to whom they are addressed. Compared to a Commission Decision, a Council or Commission Directive does not have the same legally binding character, but the general aims of a Directive have to be transposed into the member states' national law.

Whereas the Decision 93/256/EEC itself is very brief, the annex includes
– definitions of terms related to analyses,
– general requirements, which in accordance with Directive 85/591/EEC shall apply to the examination of test methods, and
– criteria for the qualitative identification and quantification of residues.

In detail, the Decision 93/256/EEC in the first part of the annex demands the same validation criteria as the Directive 85/591/EEC, but beyond this there are the following additional requirements:
– sample blank determination,
– reagent blank determination,
– requirements for calibration curves,
– limit of determination (additional to the limit of detection in 85/591/EEC),
– co-chromatography (second run of the analytical procedure with standard addition),
– calibration curves,
– susceptibility to interference.

Pursuant to Art. 3 of 93/256/EEC, the validation criteria laid down in the Annex "are applicable to routine methods of analysis of residues of substances having a hormonal or a thyreostatic action". The Annex of 93/256/EEC distinguishes between "screening methods" (no. 1.2.2) and "confirmation methods" (no. 1.2.3). According to Art. 3 of the Commission Decision 93/256/EEC, the calibration criteria for confirmation methods are applicable to reference analyses on veterinary agents, which are laid down in Annex I of Directive 86/469/EEC [11] and which refer to the control of animals and fresh meat.

In comparison to the Directive 85/591/EEC and to the Decision 93/256/EEC, the Decision 90/515/EEC on the analysis of element contaminants does not contain further basic validation criteria. Both 93/256/EEC and 90/515/EEC demand that special requirements for the interpretation of results are fulfilled in order to ensure accurate results and/or to prevent false positive or negative signals.

Discussion

The Annex of 93/256/EEC is an almost complete guideline on the validation of physico-chemical methods of analysis. Although the title concerns the analysis of hormonal and thyreostatic agents in animals and fresh meat, the criteria should be applied to the routine testing of any analyte, irrespective of the kind and origin of samples and matrices. They should be restricted neither to substances having a hormonal or a thyreostatic action, nor to reference methods for official purposes, nor to laboratories serving official enforcement authorities. Besides the official laboratories, the private ones working in the field of food, veterinary drug and element contaminant analyses should focus their attention on the validation criteria mentioned above as well.

Since no guideline is perfect, the Annex to 93/256/EEC needs to be revised, because:

1. Some definitions are contradictory, meaningless, without benefit, or will cause much expenditure of personnel and measurement capacity, e.g. "Limit of determination. This is the smallest analyte content for which the method has been validated with specific accuracy and precision". Apart from the fact that "precision" is included in the explanation of "accuracy", this definition manifests a fundamental inability to give a definition which is fit for practice. A useful definition of the detection and quantification limits is based on a statistical approach to the confidence hyperbola of a method's calibration curve, elaborated by the Deutsche Forschungsgemeinschaft [12] (a minimal numerical sketch of this approach follows this list).

2. Some requirements cannot be fulfilled in routine analyses because they are too costly, too time-consuming and too large-scale, e.g. the general requirements on sample blanks and calibration curves, where, among others, the following procedures have to be followed and/or information must be given:

– In the case of nonlinearity, the mathematical formula which describes the curve.
– Acceptable ranges within which the parameters of the calibration curve may vary from day to day.
– Verification of the calibration curve by internal standards.
– A minimum of six calibration points is required.
– Details of the variance of the variable of the calibration curve should be given.
– Control samples have to be included in each assay, with concentration levels at zero and at the lower, middle and upper parts of the working range.
– The procedure of analysis shall include sample blank and reagent blank determinations.
– If it is to be expected that factors such as species, sex, age, feeding or other environmental factors may influence the characteristics of a method, a set of at least 20 blank samples is required for each individual homogeneous population to which the method is to be applied.

These requirements seem to be "hatched at the green table" (i.e. devised by administrators remote from laboratory practice), because, if they were to be fulfilled completely, the "practicability" requirements of the Commission Decision, such as high sample throughput and low cost, could never be met.

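To make the calibration-curve requirements listed above, and the confidence-hyperbola approach to detection and quantification limits mentioned under point 1, more concrete, the following is a minimal numerical sketch. The six-point design follows the Decision's minimum; the data, the significance level and the factors applied to the decision limit are illustrative assumptions in the spirit of the DFG/DIN 32645 calibration-curve approach, not values prescribed by the Decision:

```python
import numpy as np

# Hypothetical calibration data: six equidistant standards (the Decision
# demands a minimum of six calibration points).
conc = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])          # ug/kg
signal = np.array([1.10, 2.05, 3.18, 4.02, 5.11, 5.95])  # detector response

n = conc.size
b, a = np.polyfit(conc, signal, 1)           # slope b, intercept a
resid = signal - (a + b * conc)
s_yx = np.sqrt(np.sum(resid**2) / (n - 2))   # residual standard deviation
sxx = np.sum((conc - conc.mean()) ** 2)

t = 2.132  # one-sided Student t for alpha = 0.05, n - 2 = 4 degrees of freedom

# Critical (decision) concentration from the half-width of the calibration
# confidence hyperbola extrapolated to concentration zero.
x_c = t * (s_yx / b) * np.sqrt(1 + 1 / n + conc.mean() ** 2 / sxx)
x_det = 2 * x_c    # detection limit for alpha = beta = 0.05
x_quant = 3 * x_c  # quantification limit with the conventional factor k = 3

print(f"slope {b:.3f}, intercept {a:.3f}, s(y/x) {s_yx:.3f}")
print(f"decision limit {x_c:.2f}, detection {x_det:.2f}, "
      f"quantification {x_quant:.2f} ug/kg")
```

Verifying the day-to-day stability of the calibration (one of the bullet points above) then reduces to comparing each day's fitted slope and intercept with predefined acceptance ranges.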
For repeated analyses of a reference material, the Decision 93/256/EEC includes the following ranges for the deviation of the mean from the true value (accuracy) and for the coefficients of variation:



True value (mass fraction)      Deviation range
≤ 1 μg/kg                       −50% to +20%
> 1 μg/kg to 10 μg/kg           −30% to +10%
≥ 10 μg/kg                      −20% to +10%

Content (mass fraction)         Coefficient of variation
1 μg/kg                         45%
10 μg/kg                        32%
100 μg/kg                       23%
1 mg/kg                         16%

These deviation ranges concerning accuracy and precision should be completed by ranges for the recovery, e.g. in accordance with a report of the Netherlands Inspectorate of Health Protection on the validation of methods [13]:

Analyte content         Minimum recovery
> 0.5 mg/kg             80–110%
1 to 500 μg/kg          60–115%
< 1 μg/kg               50%

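The coefficients of variation tabulated above coincide with the values given by the Horwitz function, and the acceptance checks implied by the tables are simple to automate. A minimal sketch (the function names are illustrative; the thresholds are those of the tables above):

```python
import math

def horwitz_cv(mass_fraction: float) -> float:
    """Horwitz target CV in % for a dimensionless mass fraction
    (1 ug/kg = 1e-9); reproduces the tabulated 45/32/23/16%."""
    return 2 ** (1 - 0.5 * math.log10(mass_fraction))

def trueness_ok(true_ug_kg: float, mean_ug_kg: float) -> bool:
    """Check the deviation of the mean from the true value against
    the deviation ranges of Decision 93/256/EEC tabulated above."""
    dev = 100 * (mean_ug_kg - true_ug_kg) / true_ug_kg
    if true_ug_kg <= 1:
        low, high = -50, 20
    elif true_ug_kg < 10:
        low, high = -30, 10
    else:
        low, high = -20, 10
    return low <= dev <= high

print([round(horwitz_cv(c)) for c in (1e-9, 1e-8, 1e-7, 1e-6)])
# [45, 32, 23, 16]
print(trueness_ok(5.0, 4.1))    # -18% deviation: acceptable
print(trueness_ok(20.0, 15.0))  # -25% deviation: not acceptable
```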
Furthermore, there should be minimum requirements on the use of certified standard reference samples and the use of proficiency testing schemes as far as appropriate. Pursuant to Art. 3 § 2b of the Council Directive 93/99/EEC, it is mandatory for official food laboratories to carry out proficiency tests, which according to the "statement for the Council minutes" [6] have to comply with the International Harmonized Protocol for the Proficiency Testing of (Chemical) Analytical Laboratories [5].

Conclusion

Harmonization of requirements on analytical quality assurance and method validation between DG III and DG VI is urgent and should have absolute priority. The validation criteria from 93/256/EEC described and discussed above are partially mandatory for official EU laboratories for the analysis of foodstuffs and have to be fully met by official laboratories testing veterinary drug residues in animals and fresh meat. The examples given do not claim to be complete. They may serve as a guideline, and there are many good reasons for a general revision and for adding further requirements. Besides, the guideline may be a good basis for worldwide harmonization of physico-chemical method validation, particularly in the current discussion on measurement uncertainty.

References

1. OJ No L 186/23 of 30.06.1989
2. OJ No L 290/14 of 24.11.1993
3. OJ No L 125/3 of 23.05.1996
4. OJ No L 125/10 of 23.05.1996
5. Thompson M, Wood R (1993) J AOAC Int 76:926–940
6. Document 4241/94/EEC, statement for the Council minutes
7. Good Laboratory Practice in the Testing of Chemicals, OECD, Paris 1982
8. OJ No L 372/50 of 31.12.1985
9. OJ No L 118/64 of 14.05.1993
10. OJ No L 286/33 of 18.10.1990
11. OJ No L 275/36 of 26.09.1986
12. Rückstandsanalytik von Pflanzenschutzmitteln, Deutsche Forschungsgemeinschaft, 11th edn, 1991
13. Report number IGB/93-002, second version, Rijswijk, March 1993

H. Hey (✉)
Official food and veterinary laboratory of Schleswig-Holstein, D-24537 Neumünster, Germany
Tel.: +49-4321-904610
Fax: +49-4321-904619




Accred Qual Assur (1998) 3:32–35

Margreet Lauwaars

Methods validation: AOAC's three validation systems

Abstract Validated methods of analysis are needed for many purposes: enforcement of regulations, import/export control, in accredited laboratories, academia, institutions. The AOAC INTERNATIONAL Official Methods Program is designed to provide fully validated methods of analysis, based on interlaboratory testing by a minimum of eight laboratories. Another, lesser validation system is used for peer-verified methods of analysis, where two or three laboratories participate. The system for performance testing of test kits is specially designed for a thorough testing of manufacturer claims; certification can be obtained by submitting a kit for Performance Testing to the AOAC Research Institute.

Key words Validation of methods of analysis · Interlaboratory study · Peer-verified methods · Performance testing of test kits

Introduction

Since its beginning in 1884, AOAC INTERNATIONAL has been truly dedicated to the validation of analytical methods through trials in multiple laboratories. An early undertaking of AOAC is still its most important business: supporting the use of analytical methods in multiple laboratories through validation by interlaboratory studies.

What do validated methods mean to the laboratory? Why use methods that have been performance-characterized through interlaboratory study validation? Such methods give the analyst increased confidence in the results of the analysis. In addition, the user of analytical data, who frequently may not be the analyst or the laboratory, will have increased confidence in the results of the analysis. This is important, since regulatory actions or transactions from the local to the world level can be based upon these results.

The characteristics of the analytical method and its applicability to the work at hand are established through validation; they are key elements in understanding the quality of analytical data, and they assure confidence in that quality.

Why quality measurements are needed:
– compliance with regulations
– maintain quality and process control
– meet terms of procurement
– make regulatory decisions
– conduct national and international trade
– support research

Quality measurements have several elements. Quality assurance plans and quality control procedures are an essential beginning. In addition, it is necessary to have qualified scientists whose training needs to be documented and updated on a continuous basis. Quality measurements also require the proper use of reference materials where available, and laboratories must repeatedly test their ability to perform through taking part in proficiency testing schemes. The provision of another essential element in quality measurements, namely validated methods, is the primary contribution from the work of AOAC.

There are many reasons why quality measurements are needed. Products must be tested for compliance and content, for safety, and to meet a variety of regulations. Quality measurements are also needed to monitor quality during production and in process control. Contracts are written based on standards and terms of procurement that must be tested. Every day, regulatory agencies make decisions based on results which are only as good as the quality of the methods upon which they are based. Quality measurements support international and national trade and support research. Laboratories need to generate credible data, and to do so they must have valid methods.



Table 1 Data requirements of the AOAC Method Validation Programs

AOAC Official Methods:
1. Specificity
2. Sensitivity
3. False positives/negatives
4. Limit of detection
5. Precision
6. Accuracy
7. Matrixes
8. Ruggedness
9. Recovery
10. Intra-lab repeatability
11. Comparison of existing reference methods
12. Collaborative study (a)
13. Package insert review
14. Quality policy certification

AOAC Performance-Tested Methods:
1.–11. as for Official Methods
12. Laboratory reproducibility (b)
13. Package insert review
14. Quality policy certification

AOAC Peer-Verified Methods:
1.–11. as for Official Methods

(a) As defined in AOAC guidelines for conducting collaborative study
(b) As defined in AOAC-RI policy on laboratory verification

Validation of a method establishes, through systematic laboratory studies, that the performance characteristics of the method meet the specifications related to the intended use of the analytical results. Performance characteristics determined include: selectivity and specificity, range, linearity, sensitivity, limit of detection and limit of quantification, accuracy and precision.

Three method validation systems are operated by AOAC INTERNATIONAL:
– The AOAC Official Methods Program
– The AOAC Peer-Verified Methods Program
– The AOAC Performance-Tested Test Kit Program.

The AOAC Official Methods Program

The AOAC Official Methods Program is designed to provide analytical methods for which the performance characteristics have been validated to the highest degree of confidence recognized by an internationally harmonized protocol, through an independent multiple-laboratory study [1].

An acceptable interlaboratory study should be designed to support the intended scope of the method and to validate the performance across that scope. The number and type of materials sent to the collaborators is very important: a material is defined as a matrix/analyte combination, the aim being to have five matrices (or five concentrations) for each analyte. In the case of multianalyte methods, more than one analyte may be present in the same matrix, as a way of reducing the total number of matrices sent to collaborators. Each of the individual analytes must be present in those five matrices, as a minimum (i.e. one cannot send one matrix containing five analytes and consider that to be five materials).

For quantitative methods at least five materials must be included, a requirement which can be met by a combination of samples and levels of analyte. At least eight laboratories must take part and follow the interlaboratory study protocol. Because of possible outliers, it is recommended that more than this number of laboratories take part.

AOAC requires the following parameters to be included in the study for quantitative methods:
1. Reproducibility coefficient of variation, RSDR (CVR), at the appropriate concentration level
2. Repeatability (within-laboratory) standard deviation, sr
3. Reproducibility (including repeatability) standard deviation, sR
4. % recovery
5. Method applicability statement
6. Repeatability coefficient of variation, RSDr (CVr)
7. Repeatability limit, r (= 2.8 sr), and reproducibility limit, R (= 2.8 sR)
8. Mean
9. Number of laboratories
10. Number of outliers.

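As an illustration, the following minimal sketch extracts most of the parameters listed above (mean, sr, sR, the RSDs and the limits r and R) from a balanced collaborative trial. The data (eight laboratories, blind duplicates of one material) are invented for the example, and a real evaluation would first apply the outlier treatment of the harmonized protocol:

```python
import numpy as np

# Hypothetical collaborative trial: 8 laboratories, blind duplicates
# of one material (results in mg/kg).
data = np.array([
    [4.9, 5.1], [5.3, 5.0], [4.7, 4.8], [5.2, 5.4],
    [5.0, 4.9], [5.6, 5.3], [4.8, 5.0], [5.1, 5.2],
])
p, n = data.shape                        # laboratories, replicates per lab
mean = data.mean()

msw = data.var(axis=1, ddof=1).mean()    # within-laboratory mean square
msb = n * data.mean(axis=1).var(ddof=1)  # between-laboratory mean square

s_r = np.sqrt(msw)                       # repeatability standard deviation
s_L2 = max((msb - msw) / n, 0.0)         # between-laboratory variance
s_R = np.sqrt(s_L2 + msw)                # reproducibility standard deviation

print(f"mean {mean:.2f} mg/kg, {p} laboratories")
print(f"s_r {s_r:.3f}  RSDr {100 * s_r / mean:.1f}%")
print(f"s_R {s_R:.3f}  RSDR {100 * s_R / mean:.1f}%")
print(f"r = 2.8 s_r = {2.8 * s_r:.2f}, R = 2.8 s_R = {2.8 * s_R:.2f}")
```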
For qualitative methods providing a yes or no result, it is necessary to have analyte concentrations at two levels for each matrix. Each matrix must be represented by five samples. Likewise, five negative controls are required for each matrix. In total there would be no less than 15 samples in a qualitative study (two levels × five samples + five negative controls), and 15 laboratories must participate in this type of interlaboratory study.

AOAC requires the following parameters for qualitative methods:
1. Sensitivity rate
2. False negative rate
3. % agreement
4. Specificity rate (for non-microbiological methods)
5. False positive rate (for non-microbiological methods)
6. Number of laboratories
7. Number of outliers
8. Level of analyte (if known).

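A minimal sketch of how the rate parameters (items 1–5) follow from the pooled counts of a qualitative study; the helper function and the example counts are illustrative:

```python
def qualitative_rates(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Summary rates from pooled counts: tp/fn from known-positive
    samples, tn/fp from known-negative samples."""
    return {
        "sensitivity_%": 100 * tp / (tp + fn),
        "false_negative_%": 100 * fn / (tp + fn),
        "specificity_%": 100 * tn / (tn + fp),
        "false_positive_%": 100 * fp / (tn + fp),
        "agreement_%": 100 * (tp + tn) / (tp + fn + tn + fp),
    }

# Example: 75 known positives (72 found) and 75 known negatives (70 passed).
print(qualitative_rates(tp=72, fn=3, tn=70, fp=5))
```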
Since microbiology studies are treated in AOAC as qualitative studies, the large number of participating laboratories may cause problems. The AOAC Official Methods Board has on its agenda the development of a separate protocol for microbiology studies.

Ten steps to an AOAC Official Method:
1. Topic proposal to AOAC Office
2. Volunteer to conduct in-house validation
3. Review of in-house data and protocol
4. Recommendation for interlaboratory study
5. Recruit laboratories; conduct interlaboratory study
6. Analyse results; write report
7. Review results
8. Recommendation for adoption
9. Review by Official Methods Board (OMB)
10. OMB vote for adoption as First Action



What do AOAC validated methods mean to the laboratory?
– increases method user confidence
– increases confidence in the user of analytical data
– provides information on method performance characteristics, allowing the method user to make judgements about data quality

Guidelines

It is essential that the interlaboratory study should only be carried out after thorough preparation by the AOAC Associate Referee, the volunteer who initiates the topic in AOAC. The necessary steps are well described in the AOAC guidelines for interlaboratory studies, which describe how to optimize a new or available method, how to develop the within-laboratory attributes of the optimized method, how to prepare a description of the method that is not subject to bias, how to invite participation in the study, how to write instructions and prepare report forms for participants, and how to facilitate familiarization with the technique by sending practice samples.

Method status and publication
– adopted as Official Method
– publication in Official Methods of Analysis
– interlaboratory study report published in Journal of AOAC, or other scientific journal
– after 2 years, member ballot on final action

Design (samples, range, laboratories)

In order to prepare the design of the study, possible sources of variability must be identified and included. Only laboratories familiar with the technique used should be invited to participate. In view of possible outliers, it is recommended to invite ten or more laboratories to participate.

This interlaboratory study is a method performance study, not a study of the laboratory/analyst

Samples must be homogeneous and tested for homogeneity, and should be coded at random, including the two or more blind replicates. A blank or negative control and, if available, reference materials should be provided. Spiked materials are recommended for recovery studies, incurred materials for residue studies.

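A minimal sketch of the random blind coding described above (the material list and the code format are illustrative); only the study organizer keeps the key:

```python
import random

materials = ["low spike", "mid spike", "high spike", "incurred", "blank"]
# Two blind replicates of every material, shuffled before coding.
portions = [(m, rep) for m in materials for rep in (1, 2)]
random.shuffle(portions)

coding = {f"S{i + 1:02d}": portion for i, portion in enumerate(portions)}
for code, (material, rep) in coding.items():
    print(code, "->", material, f"(replicate {rep})")  # organizer's key only
```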
Obligations of participants

The participants in the interlaboratory study must analyze the samples as indicated and follow the method exactly, including the number of determinations as instructed, and not more! Individual values and blanks should be reported, and raw data and graphs should be supplied. The organizer should be called if any problems occur. A complete report should be submitted.

Frequent causes for outliers:
– calculation errors
– reporting errors
– incorrect standards
– contamination of reagents, equipment, test samples

What is learned?
– The use of standardized terminology for method performance characteristics
– The scope, applicability and limitations of analytical methods
– To apply criteria for obtaining adequate qualitative and quantitative data
– To plan the conduct of a study
– To customize the design of a study
– To conduct multiple and single point calibrations
– To determine selectivity
– To assess method bias and variability
– To conduct a ruggedness test.

Changes in Official Methods
– editorial
– method extension
– analyte addition
– substantive revision
Supporting data information depends on the extent of modification. All modifications go through the same levels of review as required for adoption.

From the above it is clear that organizing or participating in an interlaboratory method performance study is a useful exercise, which may be costly and time-consuming. Therefore it is important to realize that there is much potential benefit for:

The Analyst: by being recognized as an expert through the publication of the study and the method, by contributing to the successful validation of a method which will probably be used worldwide, and by contributing to and being exposed through interaction with peers.

The Laboratory: the personnel time and costs spent provide a contribution to common regulatory and industry standards, support laboratory accreditation, afford likely entry into international organizations for standardizing methods of analysis, and strengthen confidence in the validity of the methods used in the participating laboratory.

The AOAC Peer-Verified Methods Program

For more than 100 years, AOAC had but one methods program. In response to the needs of analysts and laboratories for methods that were clearly candidates for use, as demonstrated by performance in more than one laboratory but not necessarily validated by a full interlaboratory study, AOAC has developed the Peer-Verified Methods (PVM) system.

Peer-Verified Methods Objective: To provide, for analytical methods that would not otherwise be evaluated, a level of validation acceptable for some laboratory users

This is intended to provide a class of tested methods which have not been the subject of a full interlaboratory study. Through a significantly less resource-demanding process, this program provides for the rapid movement of methods into, and recognition by, AOAC, and a level of validation for many methods which might not otherwise be evaluated. A Peer-Verified Method can always undergo a full interlaboratory study at a later stage, if so desired.

At least six samples at three concentrations per matrix (including incurred samples where applicable and possible), usually with at least one near the expected/normal specification level, are required.

The originating laboratory and at least one additional, independent laboratory must participate in the study. Replicate analyses are required to help establish within-laboratory repeatability (precision). PVM methods are announced in Inside Laboratory Management, AOAC's monthly magazine.

The 10 steps of a Peer-Verified Method:
1. Author develops in-house performance data.
2. Method write-up in format.
3. Design of protocol for second laboratory testing.
4. Select independent laboratory and arrange testing.
5. Independent laboratory conducts method performance testing.
6. Independent laboratory submits results to author.
7. Author analyzes all data and sends to AOAC.



8. All method information to Technical Referee.
9. Technical Referee selects at least two expert reviewers.
10. Based on reviews, Technical Referee grants acceptance as PVM method.

PVM status and publication
– accepted by AOAC as PVM method
– published as individual methods, on demand, by mail, fax, Internet
– report of study in the Journal of AOAC INTERNATIONAL
– duration of status: 5 years

Peer-Verified Methods may be modified as needed by users; modifications, with supporting data, may be submitted to AOAC and will be published as 'Notes' attached to the original method.

The AOAC Performance-Tested Test Kit Program

The AOAC Performance-Testing Program is designed essentially for test kits, to provide third-party verification of performance claims for commercial, proprietary test kit analytical methods. Operation of this program is under the AOAC Research Institute, a subsidiary of AOAC INTERNATIONAL.

Test Kit Performance Testing Objective: To provide third-party verification of performance claims for commercial, proprietary test kit analytical methods

A set of requirements for the verification of test kit performance has been developed and applied successfully. In 1996, because of financial constraints, the program was suspended temporarily; it is now again in full operation.

Test kit producer-generated data to verify performance claims are required according to an AOAC-approved protocol, generally including: ruggedness tests, calibration curves, accuracy, precision, cross-reactivity, test kit component stability, detection limit, limit of quantification, and rates of false positives and false negatives.

The verification of the producer's data on selected parameters must be performed by an independent laboratory, according to an AOAC-approved protocol.

When the AOAC Research Institute grants the one-year 'Performance-Tested' certificate, the AOAC RI Performance-Tested mark may be used under specific conditions.

Ten steps of the Test Kit Performance-Testing procedure:
1. Kit producer prepares required package of information.
2. Producer submits application and application fee.
3. AOAC RI staff performs preview.
4. Identification of Expert Reviewers and independent testing laboratory.
5. Expert Reviewers develop testing protocol.
6. Independent testing.
7. Expert Reviewers assess results.
8. Package insert review.
9. Certification.
10. License to use Performance-Tested mark for 1 year.

Changes in Performance-Tested kits
– graduated levels of re-validation study
– same levels of review as required for certification

The test kit is certified as an AOAC Performance-Tested Kit and carries a licensed-to-use certification mark, which is granted for 1 year.

Notice of certification and validation information is given in Inside Laboratory Management, AOAC's monthly magazine.

At present, a special Task Force in AOAC INTERNATIONAL is studying the integration of the three methods validation systems. The reason for doing this is the increasing number of proprietary techniques that are submitted for validation. To ensure that this is done in a desirable way, with possibilities for yearly checking and updating of scientific advances, of the validity of the technique itself, and of whether it is still on the market in the same format, a study is being made on how best to integrate the systems. This may give opportunities for extensive validation by a full interlaboratory study in order to be compatible with other organizations.

The conclusions of the Task Force are discussed at AOAC meetings, and this may take some time. Information is available at AOAC INTERNATIONAL, Department of Technical Services.

Acknowledgement The author thanks Nancy Palmer, former Director of AOAC's Department of Technical Services, for valuable contributions.

References

1. J Assoc Off Anal Chem (1989) 72:694–704
2. The Referee (1996) 20: No. 10, 18–19

M. Lauwaars (✉)
AOAC INTERNATIONAL, P.O. Box 153, 6720 AD Bennekom, The Netherlands
Tel.: +31-318-418-725
Fax: +31-318-418-359
e-mail: lauwaars@worldonline.nl




Accred Qual Assur (1999) 4:533 © Springer-Verlag 1999

Magnus Holmgren

Observing validation, uncertainty determination and traceability in developing Nordtest test methods

Nordtest has analysed and developed test methods for over 20 years, and presently there are over 1250 Nordtest methods in 15 different technical areas. More than 520 of these methods have been developed in Nordtest projects, while the rest are adopted and recommended for use by Nordtest. All test methods are listed in the Nordtest Register of Test Methods [1].

Requirements for test methods, as well as the way they are developed, are everlasting subjects of discussion. Examples of issues in connection with the development of methods are: validation of the method, measurement uncertainty and traceability of the method, documentation of the development work, as well as endorsement, revision and withdrawal of a method. Literature dealing with these issues is published through many channels, e.g. the proceedings from EUROLAB Symposia [2–4]. There are several articles of interest in these proceedings for those who develop test methods.

There are, however, few reports which deal with all parts of the development process. In Nordtest technical report 403, "Observing validation, uncertainty determination and traceability in developing Nordtest test methods" [5], the issues mentioned above are analysed and suggestions on how to handle them are made. The analysis is not limited to Nordtest methods but is generally valid for all types of test methods, e.g. in-house methods developed at one laboratory or international standards used world-wide.

The report is available from the Nordtest secretariat free of charge.

References

1. Nordtest (1998) Nordtest register of test methods. Nordtest Technical Reports and General Documents NT DOC, GEN 017. Nordtest, Espoo
2. (1992) Quality management and assurance in testing laboratories. 1st EUROLAB Symposium Proceedings. Strasbourg, France, January 28–30
3. (1994) Testing for the Years 2000. 2nd EUROLAB Symposium Proceedings. Florence, Italy, April 25–27
4. (1996) Testing and analysis for industrial competitiveness and sustainable development. 3rd EUROLAB Symposium Proceedings. Berlin, Germany, June 5–7
5. Holmgren M (1998) Observing validation, uncertainty determination and traceability in developing Nordtest test methods. NT Techn Report 403. Nordtest, Espoo

M. Holmgren
SP Swedish National Testing and Research Institute, P.O. Box 857, SE-50115 Borås, Sweden
e-mail: magnus.holmgren@sp.se
Tel.: +46-33-16 50 07
Fax: +46-33-16 50 10