Turning Data into Information Limitations and Solutions Richard Burrows


Post on 31-Dec-2015



TRANSCRIPT

Page 1: Turning Data into Information

Turning Data into Information

Limitations and Solutions

Richard Burrows

Page 2: Turning Data into Information

First, a little history

• Lead in Albacore: Guide to Lead Pollution in Americans
~ Science, Vol 207, March 1980 p1167
~ Typical results for fresh albacore muscle were around 400 ng/g Pb
~ Typical results for albacore muscle from lead soldered cans were around 700-1000 ng/g
• Therefore, the canning process approximately doubles the concentration of lead in tuna?

Page 3: Turning Data into Information

• Actually, when analyzed using clean preparation techniques and isotope dilution ICPMS, the concentration of lead in fresh albacore muscle was found to be approx 0.3 ng/g
~ Highly regarded government and commercial laboratories at the time were overestimating the concentration of lead in fresh tuna by over 1000X.

Pb (ng/g)                                       Lab 1    Lab 2
Pb in albacore muscle                           400      0.3
Pb in albacore muscle from lead soldered can    700      1400
Factor                                          1.75     4700
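The "Factor" row can be checked with a few lines of arithmetic. This is only a sketch using the four values shown in the table; the dictionary layout and the function name are illustrative choices, not part of the original study.

```python
# Lead results in ng/g, copied from the Lab 1 / Lab 2 comparison above.
results = {
    "Lab 1": {"fresh": 400.0, "canned": 700.0},
    "Lab 2": {"fresh": 0.3, "canned": 1400.0},
}


def enrichment_factor(fresh, canned):
    """Ratio of the canned result to the fresh result."""
    return canned / fresh


factors = {lab: enrichment_factor(v["fresh"], v["canned"]) for lab, v in results.items()}

# How far apart the two labs are on the same fresh tissue.
overestimate = results["Lab 1"]["fresh"] / results["Lab 2"]["fresh"]

for lab, factor in factors.items():
    print(f"{lab}: canned/fresh = {factor:.4g}")
print(f"Lab 1 fresh result is about {overestimate:.0f}x the Lab 2 result")
```

The Lab 2 ratio works out to roughly 4667, which the slide rounds to 4700.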

Page 4: Turning Data into Information
Page 5: Turning Data into Information

Issues with Detection Limits

• MDL
~ Short term
~ Small data set
~ No consideration of blank bias
~ Assumes constant variance

Page 6: Turning Data into Information

• If there is no blank bias, then no problem, but

Page 7: Turning Data into Information

.39

Spike Level ug/L   STD       N   Mean    Recovery %
0.007              0.03261   8   0.900   12860
0.01               0.02191   8   0.901   9013
0.015              0.02406   8   0.903   6021
0.02               0.04697   8   0.840   4201
0.03               0.25555   8   0.734   2448
0.04               0.23866   8   0.943   2357
0.07               0.03795   8   0.950   1357
0.1                0.07147   8   1.01    1007
0.15               0.16215   8   1.04    695
0.2                0.24956   8   1.16    582
0.3                0.05539   8   1.26    418
0.4                0.02806   8   1.08    270
0.7                0.02448   8   1.40    200
1                  0.19004   8   1.79    179
1.5                0.03847   8   2.24    149

EPA MDL 0.073

EPA ML 0.2

Episode 6000 data, Chromium by 200.8
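The recovery column is simply the mean result divided by the spike level. The sketch below recomputes it from the table and also subtracts the spike from the mean, which is one rough way (an assumption made here, not a step stated on the slide) to see how much of each result is not explained by the spike.

```python
# (spike level ug/L, mean measured ug/L) pairs from the chromium table above.
rows = [
    (0.007, 0.900), (0.01, 0.901), (0.015, 0.903), (0.02, 0.840),
    (0.03, 0.734), (0.04, 0.943), (0.07, 0.950), (0.1, 1.01),
    (0.15, 1.04), (0.2, 1.16), (0.3, 1.26), (0.4, 1.08),
    (0.7, 1.40), (1.0, 1.79), (1.5, 2.24),
]
EPA_MDL = 0.073  # ug/L, as listed with the table


def recovery_percent(spike, mean_result):
    """Mean measured result as a percentage of the spiked amount."""
    return 100.0 * mean_result / spike


recoveries = [recovery_percent(s, m) for s, m in rows]

# Mean result minus the spiked amount: the part of the signal not explained by the spike.
excess = [m - s for s, m in rows]

for (spike, _), rec, exc in zip(rows, recoveries, excess):
    print(f"spike {spike:>5} ug/L: recovery {rec:8.0f}%, excess {exc:.2f} ug/L")
print(f"smallest excess {min(excess):.2f} ug/L vs EPA MDL {EPA_MDL} ug/L")
```

At every level the unexplained part of the result is several times larger than the listed MDL, which is the blank bias problem the slide points at.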

Page 8: Turning Data into Information

• The more sensitive the method, the bigger the problem with ignoring blank bias in the detection limit determination
~ ICPMS
~ Method 1668 PCBs
~ Method 1631 Mercury
~ SIM analysis

Page 9: Turning Data into Information

Method blank detect rates, multi lab study

• 8270C 0.3%
• 8270 SIM 6.4%
• 8260B 2%
• 8021B 16%
• ICPMS 8%

Page 10: Turning Data into Information

Solutions to Detection Limits

• Consider long term variability
• Keep non-constant variance in mind
• Consider qualitative identification criteria

Take Blank Bias into Account!!

Page 11: Turning Data into Information

A Better MDL

• Use method blanks: DL = <X> + ts
• Estimate QL: at least 2X DL
• Check QL with spikes
• Check QL
~ All results > DL
~ RSD OK
~ Recovery OK
• Ongoing
~ Quarterly QL verification
~ Periodic reassessment of <X> + ts

Detection/Quantitation Federal Advisory Committee Procedure
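A minimal sketch of the DL = <X> + ts step, where <X> is the mean of the method blanks and s their standard deviation. Two things here are assumptions rather than part of the slide: t is taken as the one-sided 99% Student's t value for n - 1 degrees of freedom (the usual MDL convention), and the blank results are hypothetical numbers for illustration only.

```python
from statistics import mean, stdev

# One-sided 99% Student's t values by degrees of freedom (standard table values).
# The 99% level is an assumption; the slide only says "t".
T_99 = {6: 3.143, 7: 2.998, 8: 2.896, 9: 2.821, 10: 2.764}


def detection_limit(blanks):
    """DL = mean of blanks + t * standard deviation of blanks."""
    df = len(blanks) - 1
    return mean(blanks) + T_99[df] * stdev(blanks)


# Hypothetical method blank results in ug/L, not data from the presentation.
blanks = [0.85, 0.92, 0.88, 0.95, 0.90, 0.87, 0.93, 0.91]

dl = detection_limit(blanks)
ql_estimate = 2 * dl  # "Estimate QL: at least 2X DL"

print(f"mean blank = {mean(blanks):.3f} ug/L, s = {stdev(blanks):.3f} ug/L")
print(f"DL = {dl:.3f} ug/L, starting QL estimate = {ql_estimate:.3f} ug/L")
```

Because the blank mean is part of the sum, a biased blank raises the DL instead of being ignored, which is the point of the procedure.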

Page 12: Turning Data into Information
Page 13: Turning Data into Information

Issues with Quantitation Limits

• Until recently, no requirement for a prepped standard at the quantitation limit
• Based on MDL
• Precision based on statistical prediction
~ 3 times MDL, therefore 10% RSD

Page 14: Turning Data into Information

Solutions to Quantitation Limits

• Recent method and regulatory updates
~ SW-846 Update V
◦ LLOQ standard, prepped, per quarter at least, must be within 50% of true value, or generate in house limits
~ Drinking water methods
◦ 7 replicates initially, at MRL, prediction interval within 50-150%
~ Texas PQL
◦ 8 spikes at the PQL
◦ 10% RSD, metals; 20% RSD, volatiles; 30% RSD, semivolatiles

Page 15: Turning Data into Information

Solutions to Quantitation Limits

• DQFAC procedure
~ 7 replicates at QL, then quarterly verifications
~ Limits not defined in the procedure

Page 16: Turning Data into Information

DQFAC

• What we need a procedure to do:
~ Provide an explicit, verifiable estimate of bias at the quantitation limit
~ Provide an explicit, verifiable estimate of precision at the quantitation limit
~ Provide that qualitative identification criteria defined in the method are met at the quantitation limit
~ Assess multi and inter laboratory variability when data from more than one laboratory is used

Page 17: Turning Data into Information

A Better Quantitation Limit

• FACDQ Procedure
~ Estimate QL: at least 2X DL
~ Analyze a minimum of 7 replicates divided into at least 3 batches
• Texas PQL Procedure
~ Use TCEQ PQL level
~ Analyze a minimum of 8 replicates

Page 18: Turning Data into Information

A Better Quantitation Limit

• FACDQ Procedure
~ Assess results
~ Precision and accuracy better than any regulatory requirements
• Texas PQL Procedure
~ Assess results
~ Precision and accuracy better than TCEQ requirements
• Lowest expected result > detection limit

Page 19: Turning Data into Information

Ongoing verification

• FACDQ Procedure
~ At least 4 spikes at QL per year
~ Evaluate at least every 2 years
~ Precision and accuracy better than any regulatory requirements
• Texas PQL Procedure
~ At least 4 spikes at PQL per year
~ Evaluate at least once per year
~ Precision and accuracy better than TCEQ requirements

Page 20: Turning Data into Information

Ongoing verification assessment (continued)

Lowest expected result evaluation

Qualitative identification evaluation

FACDQ Procedure

Texas PQL Procedure

Page 21: Turning Data into Information

Determination of Precision and Accuracy Criteria

• Step 1: GUESS

            Metals     Volatiles   Semivolatiles
Precision   10% RSD    20% RSD     30% RSD
Accuracy    70-130%    70-130%     50-150%

There will be poor performers…..
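A sketch of how replicate spikes at a candidate quantitation limit might be scored against the Step 1 criteria. The criteria are the ones in the table above; the function name, the result layout, and the replicate values are hypothetical and only there to make the example runnable.

```python
from statistics import mean, stdev

# Step 1 starting criteria from the table: (max RSD %, low recovery %, high recovery %).
CRITERIA = {
    "metals": (10.0, 70.0, 130.0),
    "volatiles": (20.0, 70.0, 130.0),
    "semivolatiles": (30.0, 50.0, 150.0),
}


def evaluate_spikes(results, true_value, analyte_class):
    """Compute RSD and mean recovery and compare them with the class criteria."""
    max_rsd, low, high = CRITERIA[analyte_class]
    average = mean(results)
    rsd = 100.0 * stdev(results) / average
    recovery = 100.0 * average / true_value
    passes = rsd <= max_rsd and low <= recovery <= high
    return {"rsd": rsd, "recovery": recovery, "passes": passes}


# Hypothetical replicate results (ug/L) for an analyte spiked at 1.0 ug/L.
replicates = [0.92, 1.10, 0.85, 1.05, 0.98, 1.15, 0.88, 1.02]

report = evaluate_spikes(replicates, 1.0, "volatiles")
print(f"RSD {report['rsd']:.1f}%, recovery {report['recovery']:.1f}%, pass: {report['passes']}")
```

The same eight results pass the volatiles criteria but would fail the 10% RSD criterion for metals, which shows why the class-specific guess matters.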

Page 22: Turning Data into Information

Determination of Precision and Accuracy Criteria

• Step 2: VERIFY
• Spike at multiple levels around the anticipated quantitation limit

Analyte         Spike levels (ug/L)
Benzene         0.5    1    2    4    8
Acrylonitrile   12.5   25   50   100  200

Page 23: Turning Data into Information

Evaluation levels

• Metals

6010
Thallium   blank   1      2     4    8    16
Vanadium   blank   5      10    20   40   80

6020
Thallium   blank   0.5    1     2    4    8
Vanadium   blank   1.25   2.5   5    10   20

Page 24: Turning Data into Information

[Figure: Arsenic, RSD vs. True Concentration (T). x-axis: True Concentration (T), 0 to 6; y-axis: RSD, 1% to 1000% (log scale). Series: Data, Constant-SD Model, SL-SD Model, Expo-SD Model, Hybrid-SD Model, IQE10%, IQE20%, IQE30%.]

Page 25: Turning Data into Information

[Figure: o-Xylene, RSD vs. True Concentration (T). x-axis: True Concentration (T), 0 to 25; y-axis: RSD, 1% to 100% (log scale). Series: Data, Constant-SD Model, SL-SD Model, Expo-SD Model, Hybrid-SD Model, IQE10%, IQE20%, IQE30%.]

Page 26: Turning Data into Information

[Figure: Vinyl acetate, RSD vs. True Concentration (T). x-axis: True Concentration (T), 0 to 60; y-axis: RSD, 1% to 100% (log scale). Series: Data, Constant-SD Model, SL-SD Model, Expo-SD Model, Hybrid-SD Model, IQE10%, IQE20%, IQE30%.]

Page 27: Turning Data into Information

Objectives

• NOT the lowest quantitation limits that can be achieved
~ Reasonable limits that are relevant to groundwater monitoring criteria and can be achieved by most labs
~ PQLs that can be verified by data analyzed at the PQL

Page 28: Turning Data into Information

Next Steps

• Gather additional data bracketing expected quantitation limits
• 30 plus labs involved
• Attempt to mimic real world conditions
• Large data set will be available in about 6 months

Page 29: Turning Data into Information

Summary

• Determine limits using multi lab data
• Consider regulatory need
• Identify poor performers
• Individual labs demonstrate ability to meet limits
• Update limits if necessary
• Don’t need low ppb levels for minerals
• Some analytes will not meet desired MQOs
• Spiking at the PQL
• TCEQ collects ongoing verification data
• IQE was used – other procedures could be used

Page 30: Turning Data into Information

Problems with solutions to Quantitation limits

• Key points
~ Spike at or very close to the quantitation limit

Page 31: Turning Data into Information

Problems with solutions to Quantitation limits

• Precision is highly dependent on how the data is generated
• Method 8260
• 70 analytes, spiked at 0.2 ug/L, one batch
~ Average RSD = 8.2%
• Multiple batches, multiple instruments, spikes aged after preparation to simulate holding time

Spike level   1 ug/L   2 ug/L   5 ug/L   10 ug/L   20 ug/L
RSD           24.6%    15.7%    13.0%    13.4%     12.8%

Page 32: Turning Data into Information
Page 33: Turning Data into Information

Issues with Calibration

• Analyze at least 5 points
• RSD, linear regression, quadratic regression
• r, r2 > 0.990 (0.995)

Page 34: Turning Data into Information

The curve that cannot fail

Conc   Resp
1      0.00
2      0.00
3      0.00
4      0.00
5      0.00
10     0.00
100    117

slope 0.81564
corr 0.99679
int 4.16667
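The correlation coefficient on this slide can be reproduced directly. The sketch below reads the flattened table as concentrations 1, 2, 3, 4, 5, 10 and 100, with a response of 0.00 for the six lowest standards and 117 for the highest; that reading is an interpretation of the garbled transcript, and the function is a plain Pearson correlation written out by hand.

```python
from math import sqrt

# Calibration points as read from the slide (interpretation of the flattened table).
conc = [1, 2, 3, 4, 5, 10, 100]
resp = [0, 0, 0, 0, 0, 0, 117]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(conc, resp)
print(f"r = {r:.5f}, r^2 = {r * r:.4f}")
```

Six standards with no response at all still give r above 0.995 and r² above 0.990, because the single high point dominates the statistic.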

Page 35: Turning Data into Information

Calibration issues

r = 0.997, r2 = 0.994, RSE = 179%

Page 36: Turning Data into Information

Dalapon

RSE = 63%

Page 37: Turning Data into Information

Solutions to Calibration

• Calculate “readback” for each level
~ Recent drinking water methods
~ Recent SW-846 methods
• Pros
~ Provides an indication of the error introduced at each level
~ Conceptually straightforward
• Cons
~ Lots of numbers!
~ Difficult to compare different curve types
~ Need to be careful with criteria
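A sketch of a readback calculation: each calibration standard's response is run back through the fitted curve and the result is expressed as a percentage of the true concentration. The data are the "curve that cannot fail" points as read from the earlier slide (an interpretation of the transcript), and the fit is an unweighted least-squares line of response on concentration, which is an assumption since the slides do not say which fit was used.

```python
def fit_line(x, y):
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


def readback_percent(conc, resp):
    """Concentration predicted from each response, as a percent of the true value."""
    slope, intercept = fit_line(conc, resp)
    return [100.0 * ((r - intercept) / slope) / c for c, r in zip(conc, resp)]


conc = [1, 2, 3, 4, 5, 10, 100]
resp = [0, 0, 0, 0, 0, 0, 117]

readbacks = readback_percent(conc, resp)
for c, rb in zip(conc, readbacks):
    print(f"{c:>5} : {rb:6.1f}% of true value")
```

The highest standard reads back at close to 100% while the lowest reads back at roughly four times its true value, so a per-level readback exposes what r alone hides.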

Page 38: Turning Data into Information

Solutions to calibration

• RSE
~ Extends applicability of RSD (used for average curve) to all other curve types
• Pros
~ Allows easy comparison of curve types
~ Will indicate failing calibration if any point (high or low concentration) has a high deviation from the curve
~ Can use same criteria as RSD
• Cons
~ Not currently available in most chromatographic data systems

Page 39: Turning Data into Information

RSE examples

Error 1     20%   50%   100%   34%   50%   30%
Error 2     20%   20%   20%    28%   0%    10%
Error 3     20%   20%   20%    5%    0%    10%
Error 4     20%   20%   20%    3%    0%    10%
Error 5     20%   20%   20%    1%    0%    10%
Error 6     20%   20%   20%    6%    0%    10%
Error 7     20%   20%   20%    8%    0%    10%
RSE (RSD)   24%   31%   50%    20%   22%   17%

Page 40: Turning Data into Information

Guidelines Establishing Test Procedures for the Analysis of Pollutants Under the Clean Water Act; Analysis and Sampling Procedures

When a regression curve is calculated as an alternative to using the average response factor, the quality of the calibration may be evaluated using the Relative Standard Error (RSE). The acceptance criterion for the RSE is the same as the acceptance criterion for Relative Standard Deviation (RSD), in the method. RSE is calculated as:

$$\mathrm{RSE} = 100 \times \sqrt{\frac{\sum_{i=1}^{n}\left[\frac{C_i - PC_i}{C_i}\right]^2}{n - p}}$$
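A sketch of the RSE calculation applied to the six columns of the "RSE examples" table. It assumes the usual reading of the formula: the bracketed term is the relative error at each calibration level, n is the number of levels, and p is the number of terms in the curve equation. Taking p = 2 (a linear fit) is an assumption; it is the value that reproduces the RSE row of the table.

```python
from math import sqrt


def rse_percent(relative_errors_percent, p):
    """RSE in percent from per-level relative errors already expressed in percent."""
    n = len(relative_errors_percent)
    return sqrt(sum(e ** 2 for e in relative_errors_percent) / (n - p))


# Columns of the "RSE examples" table: relative error (%) at each of 7 levels.
columns = [
    [20, 20, 20, 20, 20, 20, 20],
    [50, 20, 20, 20, 20, 20, 20],
    [100, 20, 20, 20, 20, 20, 20],
    [34, 28, 5, 3, 1, 6, 8],
    [50, 0, 0, 0, 0, 0, 0],
    [30, 10, 10, 10, 10, 10, 10],
]

P_TERMS = 2  # assumed: linear fit with slope and intercept
rse_values = [rse_percent(col, P_TERMS) for col in columns]

for col, rse in zip(columns, rse_values):
    print(f"errors {col} -> RSE {rse:.0f}%")
```

A single 50% miss at one level with perfect readback everywhere else already gives an RSE of about 22%, so one bad point is enough to fail a 20% criterion.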

Page 41: Turning Data into Information
Page 42: Turning Data into Information

8081A

15 pesticides identified. Which are real?

Page 43: Turning Data into Information
Page 44: Turning Data into Information

8330B

Page 45: Turning Data into Information

Solutions to Sample Matrix

• ICPMS – instrumentation advances
• Complex chromatograms – possible techniques exist, but are not used because of cost – GC/GC
• Cleanups

Page 46: Turning Data into Information

2D GC

Page 47: Turning Data into Information

[Figure: Blank Acid Matrices and IPA in ICPMS, No Gas Mode. Spectrum of unspiked 5% HNO3 + 5% HCl + 1% H2SO4 + 1% IPA matrix. x-axis: Mass, 45 to 80; y-axis scale: 2E5 cps. Color of spectrum indicates which matrix gave each interfering peak.]

• Unspiked matrix – ALL peaks are due to polyatomic interferences
• Multiple polyatomic interferences affect almost every mass – interferences are matrix-dependent
• Peak labels shown in the figure: ClO, ArC, ArN, ArO, CaO, NaCl, S2, SO2, ArS, Cl2, Ar2, ArCl, ArOH, CaOH, NaS, Ca2, ArCa, S2O, SO3, Br, Ar2H, ArN2H, SO2H, ClN2, ArNa, NaClH, SO, SOH, CO2, SN, CO2H, Cl2H, ArCO, ArCN

Page 48: Turning Data into Information

[Figure: Blank Acid Matrices and IPA in He Mode. Spectrum of unspiked 5% HNO3 + 5% HCl + 1% H2SO4 + 1% IPA matrix. x-axis: Mass, 45 to 80; y-axis scale: 2E5 cps. Color of spectrum indicates which matrix gave each interfering peak.]

• ALL polyatomic interferences are removed in He Mode (same cell conditions)
• Is sensitivity still OK?

Page 49: Turning Data into Information

[Figure: Matrix Mix with Spike (10ppb) in He Mode. Spectrum of 10ppb spike in 5% HNO3 + 5% HCl + 1% H2SO4 + 1% IPA matrix. x-axis: Mass, 45 to 80; y-axis scale: 2E5 cps.]

• Consistent high sensitivity for all isotopes of all elements in He Mode
• Good signal for all spike elements at 10ppb spike. Perfect template fit for all elements – no residual interferences and no loss of analyte signal by reaction
• Consistent sensitivity and perfect template match for all elements

Page 50: Turning Data into Information

False Positive Probability In Real Data

Page 51: Turning Data into Information

Dataset

• 19 labs, one month of blank measurements
• 301,520 individual blank measurements
• 1,306 distinct analytes
• 9,991 above MDL (3.3%)
• 1,097 above RL (0.4%)
• One or more hits above the MDL in 302 analytes
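The percentages on this slide follow directly from the counts. The sketch below recomputes them and adds two derived figures that are not on the slide: the number of blank results that fall between the MDL and the RL, and the number of analytes with no hits at all.

```python
# Counts taken from the dataset slide.
total_blanks = 301_520
above_mdl = 9_991
above_rl = 1_097
distinct_analytes = 1_306
analytes_with_hits = 302

pct_above_mdl = 100.0 * above_mdl / total_blanks
pct_above_rl = 100.0 * above_rl / total_blanks

# Derived here, not stated on the slide.
between_mdl_and_rl = above_mdl - above_rl
analytes_without_hits = distinct_analytes - analytes_with_hits

print(f"above MDL: {pct_above_mdl:.1f}%")
print(f"above RL:  {pct_above_rl:.1f}%")
print(f"between MDL and RL: {between_mdl_and_rl} blank results")
print(f"analytes with no hits above MDL: {analytes_without_hits} of {distinct_analytes}")
```

Most analytes never show a blank hit, so the detects are concentrated in a minority of analytes, which is what the "good news" and "bad news" tables go on to show.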


Page 52: Turning Data into Information

The good news

Analyte Name Hits above MDL Number of blanks

Methyl tert-butyl ether 0 2521

1,1,1-Trichloroethane 0 2016

Chloroethane 0 2011

Trichlorofluoromethane 0 1958

Dichlorodifluoromethane 0 1922

2-Hexanone 0 1871

1,1,1,2-Tetrachloroethane 0 1787

Vinyl acetate 0 1700

Chlorodibromomethane 0 1678

2,2-Dichloropropane 0 1657

Acrylonitrile 0 1648

Bromobenzene 0 1642

sec-Butylbenzene 0 1642

1,1,2-Trichloro-1,2,2-trifluoroethane 0 1501

Chlorobromomethane 0 1421


Page 53: Turning Data into Information

Continued

Analyte Name Hits above MDL Number of blanks

2-Chloroethyl vinyl ether 0 1373

Isopropyl ether 0 1185

Acetonitrile 0 1060

Cyclohexane 0 1017

Methyl methacrylate 0 995

Propionitrile 0 951

Ethyl methacrylate 0 897

Methacrylonitrile 0 854

Isobutyl alcohol 0 844

Pentachloroethane 0 806

Acenaphthylene 0 768

3-Chloro-1-propene 0 726

4-Nitrophenol 0 646

Hexachloroethane 0 642


Page 54: Turning Data into Information

And so on…

Analyte Name Hits above MDL Number of blanks

2,4,6-Trichlorophenol 0 602

1-Chlorohexane 0 599

2,6-Dinitrotoluene 0 588

Hexachlorocyclopentadiene 0 547

Dimethyl phthalate 0 533

Isophorone 0 530

2,4,5-Trichlorophenol 0 523

2,4-Dichlorophenol 0 516

2-Chloronaphthalene 0 516

2,4-Dinitrophenol 0 515

2,4-Dimethylphenol 0 512

2-Chlorophenol 0 511

2-Nitrophenol 0 508

3,3'-Dichlorobenzidine 0 507

N-Nitrosodiphenylamine 0 502


Page 55: Turning Data into Information

And on…

Analyte Name Hits above MDL Number of blanks

4-Bromophenyl phenyl ether 0 500

N-Nitrosodi-n-propylamine 0 500

bis(2-Chloroethoxy)methane 0 500

4-Chlorophenyl phenyl ether 0 497

4-Chloro-3-methylphenol 0 494

2-Methylphenol 0 481

Tert-butyl ethyl ether 0 480

4,6-Dinitro-2-methylphenol 0 473

Carbazole 0 460

4-Chloroaniline 0 459

2-Nitroaniline 0 455

4-Nitroaniline 0 443

3-Nitroaniline 0 440

Bromodichloromethane 0 426


Page 56: Turning Data into Information

For another 60 pages if we wanted to go that long

Analyte Name Hits above MDL Number of blanks

gamma-BHC (Lindane) 0 423

Heptachlor epoxide 0 415

Methoxychlor 0 407

Pyridine 0 389

n-Heptane 0 388

4,4'-DDE 0 386

Dichlorofluoromethane 0 385

4,4'-DDD 0 383

Aldrin 0 383

Dieldrin 0 381

Endrin aldehyde 0 380

Endosulfan sulfate 0 380

Endosulfan I 0 378

Benzyl chloride 0 372

Ethyl acetate 0 370

Page 57: Turning Data into Information

The bad news

Analyte Name Hits above MDL Number of blanks % above MDL % above RL

Naphthalene 410 3021 13.6% 0.5%

Methylene Chloride 364 2006 18.1% 0.7%

Ca 323 1609 20.1% 3.0%

Si 315 753 41.8% 2.8%

SiO2 272 587 46.3% 2.4%

Zn 268 1912 14.0% 0.8%

Acetone 250 1921 13.0% 1.2%

2-Butanone (MEK) 228 1746 13.1% 3.8%

Cu 223 1873 11.9% 1.1%

Al 195 1516 12.9% 0.4%

Toluene 179 2794 6.4% 0.1%


Page 58: Turning Data into Information

Continued

Analyte Name Hits above MDL Number of blanks % above MDL % above RL

Hexachlorobutadiene 171 2182 7.8% 0.2%

B 165 1200 13.8% 5.8%

1,2-Dichlorobenzene 163 2823 5.8% 0.0%

Mo 160 1446 11.1% 3.9%

Cr 157 1800 8.7% 0.3%

Mn 145 1646 8.8% 0.3%

Ba 141 1654 8.5% 0.2%

K 139 1484 9.4% 2.8%

Na 137 1808 7.6% 0.9%

Benzene 136 2721 5.0% 0.0%


Page 59: Turning Data into Information

A few more

Analyte Name Hits above MDL Number of blanks % above MDL % above RL

1,2,4-Trichlorobenzene 135 2319 5.8% 0.0%

Pb 134 1974 6.8% 0.1%

Fe 132 1708 7.7% 0.3%

Tl 124 1627 7.6% 0.3%

Sb 123 1634 7.5% 1.0%

Bromomethane 122 1986 6.1% 0.0%

1,2,4-Trimethylbenzene 120 1959 6.1% 0.0%

Mg 118 1566 7.5% 0.0%

1,2,3-Trichlorobenzene 110 1679 6.6% 0.0%

Ethylbenzene 108 2683 4.0% 0.3%

1,3-Dichlorobenzene 104 2808 3.7% 0.0%

Ti 102 1094 9.3% 0.0%


Page 60: Turning Data into Information

Methods

• ICP 10%, ICPMS 8%
• 8021B 16%, 8260B 2%
• 8270C 0.3%, 8270 SIM 6.4%
• Various semivolatile hydrocarbon methods, 24% to 33%


Page 61: Turning Data into Information

• What do these hits in blanks tell us about the probability of a false positive in a sample?

Page 62: Turning Data into Information

[Figure: Compound “A”, results plotted by date from 09/09/2008 to 08/25/2009, y-axis from -0.2 to 0.4, with RL and MDL lines marked. Legend: X = method blanks, Y = samples. Annotation: 3% (method blanks) and 5% (samples) between LOD and LOQ.]

Page 63: Turning Data into Information

False Positive Rate for Samples

Assumption:

The false positive rate in samples = detect rate in blanks

Page 64: Turning Data into Information

Detect rate in blank = 3%

False positive rate in samples = 3%

Sample detects between MDL - RL = 5%

Chance that a hit in a sample is a F+ = (3% / 5%) x 100% = 60%

Page 65: Turning Data into Information


A Closer look at 4 labs

• Actual reported results from samples based on requirement to report to MDL

• 138,212 reported results


Page 66: Turning Data into Information


How many do we expect from the blanks?

• 5,043 reported results between MDL and RL, 3.6%

• Expected number based on blanks is 3,511, 2.3%

• If the frequency of false positives in samples is the same as that in blanks

• 3511 of these results would be false positives

• Of those results between MDL and RL, 70% might be false positives
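The 60% and 70% figures come from the same ratio: expected false positives (from the blank detect rate) divided by reported detects between the MDL and RL. The sketch below applies that ratio to the numbers quoted on the slides, under the stated assumption that samples produce false positives at the same rate as blanks; the function name is an illustrative choice.

```python
def false_positive_share(expected_false_positives, reported_detects):
    """Percent of reported detects that could be false positives."""
    return 100.0 * expected_false_positives / reported_detects


# Simple example: 3% detect rate in blanks, 5% of samples detected between MDL and RL.
simple_example = false_positive_share(3.0, 5.0)

# Four-lab data: counts quoted on the slide.
total_results = 138_212
reported_between_mdl_rl = 5_043
expected_from_blanks = 3_511

four_labs = false_positive_share(expected_from_blanks, reported_between_mdl_rl)
pct_reported = 100.0 * reported_between_mdl_rl / total_results

print(f"simple example: {simple_example:.0f}% of hits may be false positives")
print(f"four labs: {pct_reported:.1f}% of results fall between MDL and RL")
print(f"four labs: {four_labs:.0f}% of those may be false positives")
```

The result depends entirely on the assumption that blanks and samples share the same false positive frequency; it is an upper-level estimate, not a measured rate.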


Page 67: Turning Data into Information


YIKES!!!!!!!


Page 68: Turning Data into Information


What was that again?


If the frequency of false positives in samples is the same as that in blanks, then of those results between MDL and RL, 70% are likely to be false positives.

Page 69: Turning Data into Information

Questions?