Generalities & Qualitative Testing Plans
May 8-10, 2006
Iowa State University, Ames – USA
Jean-Louis Laffont
Kirk Remund
ISTA Statistics Committee

Objectives
• Introduce acceptance sampling
  – review assumptions
  – definitions
  – understand strengths & limitations
• Use with a qualitative assay
  – zero tolerance plans
  – plans that allow deviants
  – purity testing

Challenges: random sampling variability
[Figure: a seed lot and samples drawn from it, labeled 0.07%, 0.12%, 0.09%, 0.05%, 0.11%]

Challenges: Sampling & Assay Variability
[Figure: Seed Lot → Sample → Sample Prep → Assay (PCR). Labels: “Sampling Error” (seed lot to sample), “Assay System Error” (sample prep and assay). Values shown: 0.12%, 0.15%, 0.09%, < 0.10%]

Benefits of acceptance sampling approach
1. Manage sampling variability & assay errors
2. Maintain flexibility: seed pooling schemes, single or double stage testing
3. Maintain confidence in decisions
  – “We are 95% confident that the GMO presence in this lot is < 0.1%”

Assumption: “Representative” Sample
• Definition 1
  – “Obtain sample so that each seed has an equal and independent chance of being selected [called a simple random sample (SRS)]”
  – Index every seed, pick random numbers, obtain indexed seeds
  – Good idea?
• Definition 2: mimic SRS sample
  – bag sampling (ISTA rules)
  – probe sampling (uniform grid)
  – systematic sampling
[Figure: seeds indexed 1, 2, 3, 4, 5, ..., 1,000,000,000]

Probe sampling
• Sampling bulk containers (e.g., trucks or bins)
• Often a reasonable approach if heterogeneity occurs as horizontal or inverted-cone layers
• Sampling collection point: probe the depth of the container

Systematic sampling
• Sample a flow of seed at regular time intervals
  – flow from hopper bottom truck
  – flow from a silo
• More samples as heterogeneity increases
• Collect each sample by cutting through the entire stream of flowing seed
• Caution: make sure there is no cyclic behavior in the flow that correlates with the sampling interval

Obtaining Pools to Evaluate Bulk Characteristics
[Figure: seed lot → primary samples → composite sample → submitted sample → seed pools (bulks) for testing. Labels: “Obtain sample”, “Mix well!”]

Assumption: Seed lot is large
• Sample size should be no larger than 10% of population
• This condition must hold to use Seedcalc or Qalstat
• If this assumption is not met we must use methods based on the hypergeometric distribution
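To see why the large-lot condition matters, the minimal sketch below compares the binomial acceptance probability with the exact hypergeometric one. The lot sizes, the plan (400 seeds, accept at most 1 deviant), and the function names are illustrative assumptions; this is not how Seedcalc or Qalstat are implemented.

```python
from math import comb

def accept_prob_binomial(n, c, p):
    """P(X <= c) for X ~ Binomial(n, p): the large-lot approximation."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def accept_prob_hypergeom(lot_size, deviants, n, c):
    """P(X <= c) when n seeds are drawn without replacement from a finite lot."""
    ways = sum(comb(deviants, x) * comb(lot_size - deviants, n - x)
               for x in range(c + 1))
    return ways / comb(lot_size, n)

# Hypothetical plan: test 400 seeds, accept the lot if at most 1 is deviant.
n, c, p = 400, 1, 0.01
print("binomial: ", accept_prob_binomial(n, c, p))
# Sample is 0.04% of a 1,000,000-seed lot: the two models nearly agree.
print("large lot:", accept_prob_hypergeom(1_000_000, 10_000, n, c))
# Sample is 20% of a 2,000-seed lot: the binomial model is noticeably off.
print("small lot:", accept_prob_hypergeom(2_000, 20, n, c))
```

When the sample is a large share of the lot, sampling without replacement makes the deviant count less variable than the binomial model predicts, which is why the two results diverge.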

Acceptance sampling for qualitative assays
[Figure: SEED LOT → SAMPLE OF SEEDS → X DEVIANT SEEDS FOUND → if X ≤ C, ACCEPT LOT; if X > C, REJECT LOT]
The number of deviant seeds follows a binomial distribution.
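The decision rule and the binomial assumption can be sketched together: one function applies the accept/reject rule, one gives the exact binomial acceptance probability, and one checks it by simulation. The plan values (n=400, C=4, 0.5% impurity) and all function names are assumptions chosen for illustration.

```python
import random
from math import comb

def accept_lot(deviants_found, c):
    """Decision rule from the flowchart: accept when X <= C, otherwise reject."""
    return deviants_found <= c

def prob_accept(n, c, p):
    """Exact P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def simulate_acceptance(n, c, p, trials=2000, seed=1):
    """Estimate the acceptance rate by drawing `trials` random samples of n seeds."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(trials):
        # Each seed is deviant with probability p, independently of the others.
        deviants = sum(rng.random() < p for _ in range(n))
        accepted += accept_lot(deviants, c)
    return accepted / trials

n, c, p = 400, 4, 0.005
print("exact:    ", prob_accept(n, c, p))
print("simulated:", simulate_acceptance(n, c, p))
```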

Definitions
• LQL = lower quality limit
  – highest level of impurity that is acceptable to consumer
  – “95% confident that seed impurity is below 1%” (LQL=1%)
• AQL = acceptable quality level
  – level of impurity that is acceptable to producer and consumer
  – Some definitions
    • Conservative: producer can produce seed at this impurity level or below
    • Practical: process average
    • Set in relation to threshold
      – generally, AQL less than or equal to 1/2 LQL

Definitions, cont.
[Figure: distribution of production. x-axis: % impurity, starting at 0%; y-axis: % production. Marked values: 0.15% (“process average”), 0.2% (“Most production between 0% & this value”), 0.5% (LQL). AQL is also marked.]

Definitions, cont.
• Consumer Risk = chance of accepting “bad” lot (lot impurity = LQL)
  – also called beta (β)
• Producer Risk = chance of rejecting “good” lot (lot impurity = AQL)
  – also called alpha (α)

Operating characteristic (OC) curve
[Figure: ideal OC curve. x-axis: True Impurity in Lot, with AQL and LQL marked; y-axis: Chance of Accepting Lot, 0% to 100%. Annotations: “High chance of accepting lot at AQL (alpha)”; “High chance of rejecting lot at LQL (beta)”. Regions labeled “want these”, “whatever”, “don’t want these”.]

OC curves, cont.
[Figure: OC curves for three plans: n=400, c=1; large n; n=400, c=4. x-axis: True Impurity Level (%), 0.00% to 2.00%; y-axis: Probability of Accepting Lot (%), 0% to 100%. AQL=0.5% and LQL=1.0% are marked.]
• Poor Testing Plan: low producer risk, high consumer risk
• Poor Testing Plan: high producer risk, low consumer risk
• Good Testing Plan: low producer risk, low consumer risk
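The two risks for each plan can be read off the OC curve at AQL and LQL. The sketch below computes them under the binomial model for the two n=400 plans named on the slide. The third plan, (3000, 21), is a hypothetical stand-in for the unspecified “Large n” curve, and the function names are assumptions.

```python
from math import comb

def prob_accept(n, c, p):
    """Point on the OC curve: P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(c + 1))

def risks(n, c, aql, lql):
    """Return (producer risk at AQL, consumer risk at LQL) for plan (n, c)."""
    producer_risk = 1 - prob_accept(n, c, aql)  # reject a lot that is at AQL
    consumer_risk = prob_accept(n, c, lql)      # accept a lot that is at LQL
    return producer_risk, consumer_risk

AQL, LQL = 0.005, 0.01
plans = [
    (400, 1),    # from the slide
    (400, 4),    # from the slide
    (3000, 21),  # hypothetical stand-in for the "Large n" curve
]
for n, c in plans:
    alpha, beta = risks(n, c, AQL, LQL)
    print(f"n={n}, c={c}: producer risk={alpha:.1%}, consumer risk={beta:.1%}")
```

With only 400 seeds, lowering c trades consumer risk for producer risk; getting both risks low at once requires a larger sample.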

LQL & AQL in relation to threshold
[Figure: two OC-curve panels, each with curves labeled “Retest” and “Acceptance” and the threshold marked. x-axis: Actual % Impurity in Lot, 0 to 2.5; y-axis: Probability of Accepting Lot (%), 0% to 100%.]
Panel captions:
• LQL = threshold, AQL = what producer can deliver
• LQL = 2 x threshold, AQL = ½ x threshold (similar to tolerance approach)

Reducing Costs: Testing Seed Pools Rather than Individuals
[Figure: 5 seed pools, 300 seeds per pool]
• Works well in testing for adventitious presence
• Assay must be able to detect one GM seed in pool of all conventional seed with high confidence
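Pooling works because a pool tests positive as soon as it holds at least one deviant seed. A minimal sketch of that relationship follows, assuming an error-free assay and independent seeds; the function names are hypothetical.

```python
def prob_pool_positive(p, pool_size):
    """Chance that a pool contains at least one deviant seed (error-free assay)."""
    return 1 - (1 - p) ** pool_size

def estimate_impurity(positive_pools, n_pools, pool_size):
    """Point estimate of lot impurity from the fraction of positive pools."""
    return 1 - (1 - positive_pools / n_pools) ** (1 / pool_size)

# A pool of 300 seeds drawn from a lot with 0.1% impurity:
print(prob_pool_positive(0.001, 300))
# 1 positive pool out of 5 pools of 300 seeds:
print(estimate_impurity(1, 5, 300))
```

The estimate inverts the first formula: it solves for the impurity at which the expected fraction of positive pools equals the observed fraction.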

Challenge: setting the threshold
Option 1: require true zero threshold
  result: test all seed in entire lot…
Option 2: “zero tolerance” in sample
  result 1: hidden non-zero threshold
    Example: USDA recommendation for StarLink (Cry9C): testing 2400 seeds and allowing zero positives yields a 0.19% threshold rather than zero.
  result 2: high cost to producer
    Throw away a lot of good seed due to false positives and sampling variability
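The hidden threshold behind a zero-positives rule can be sketched with the binomial model: find the impurity at which a clean sample of n seeds would occur with probability 1 minus the confidence level. The slide does not state the confidence level behind its 0.19% figure; under this sketch that figure corresponds to 99% confidence, which is an inference rather than something the source says. The function name is hypothetical.

```python
def hidden_threshold(n_seeds, confidence):
    """Impurity level p at which a sample of n_seeds shows zero deviants
    with probability 1 - confidence, i.e. (1 - p)**n_seeds == 1 - confidence."""
    return 1 - (1 - confidence) ** (1 / n_seeds)

for confidence in (0.95, 0.99):
    p = hidden_threshold(2400, confidence)
    print(f"{confidence:.0%} confidence: impurity below {p:.3%}")
```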

Challenge: setting the threshold, cont.
Option 3: set reasonable non-zero threshold, allow for some positives
  result 1: manage consumer and producer risks to acceptable levels
  result 2: better manage impact of assay errors on results
  result 3: most seed approved for sale will be much lower than threshold (e.g., 3 or 10 times lower)

Zero Tolerance Plans
[Figure: OC curves. x-axis: True Impurity Level (%), 0.00% to 2.00%; y-axis: Probability of Accepting Lot (%), 0% to 100%. AQL=0.5% and LQL=1.0% are marked.]

The Perfect Plan
[Figure: chart over True Lot Impurity (0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00%; y-axis 0% to 50%), split at the 1% threshold into Accept and Reject regions.]
• Reject 0% of “Good” Lots
• Accept 0% of “Bad” Lots

Zero Tolerance Plan - Test one pool of 300
[Figure: chart over impurity levels 0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00% (y-axis 0% to 50%), split at the 1% threshold into Accept and Reject regions.]
• Reject ~20% of “Good” Lots
• Accept <1% of “Bad” Lots

Almost Perfect Plan: Test 6 pools of 300, accept 4 deviant pools or fewer
[Figure: chart over impurity levels 0.01%, 0.10%, 0.25%, 0.50%, 1.50%, 2.00% (y-axis 0% to 50%), split at the 1% threshold into Accept and Reject regions.]
• Reject 5% of “Good” Lots
• Accept <1% of “Bad” Lots

OC curves for two testing plans
[Figure: OC curves for “1 pool of 300” and “6 pools of 300”, with the threshold marked. x-axis: Actual % Impurity in Lot, 0 to 2.5; y-axis 0% to 100%.]
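Points on the two OC curves can be approximated with a short sketch, assuming an error-free assay and the binomial model applied to pools. The acceptance rules used here (zero positives for the single pool, at most 4 positive pools out of 6) follow the earlier plan slides; the function name is hypothetical.

```python
from math import comb

def prob_accept_pools(n_pools, pool_size, max_positive, p):
    """P(at most max_positive pools test positive) at true lot impurity p."""
    q = 1 - (1 - p) ** pool_size  # chance that a single pool is positive
    return sum(comb(n_pools, k) * q**k * (1 - q) ** (n_pools - k)
               for k in range(max_positive + 1))

print("impurity  1 pool of 300  6 pools of 300")
for pct in (0.01, 0.10, 0.25, 0.50, 1.00, 1.50, 2.00):
    p = pct / 100
    one = prob_accept_pools(1, 300, 0, p)  # zero tolerance: accept only if negative
    six = prob_accept_pools(6, 300, 4, p)  # accept with 4 positive pools or fewer
    print(f"{pct:7.2f}%  {one:13.1%}  {six:14.1%}")
```

Both plans rarely accept a lot at the 1% threshold, but the six-pool plan accepts low-impurity lots far more often, which is the producer-side benefit the curves illustrate.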
Hypothetical situation: “Ten seed pools of 300 seeds each are tested from a conventional seed lot and 5 pools test positive for adventitious presence. The lot is labeled as having less than 1% adventitious presence and it is shipped.”
Should they have shipped the lot?

Yes.
10 pools of 300 seeds each
  – Can see up to 7 positive pools and still have 95% confidence that the true lot impurity is below the 1% threshold
60 pools of 50 seeds each
  – Can see up to 17 positive pools and still have 95% confidence that the true lot impurity is below the 1% threshold
INTERPRET WITH CARE!!
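The largest allowable number of positive pools can be sketched as follows, assuming an error-free assay and the binomial model applied to pools. A pool of 300 seeds from a lot at 1% impurity is positive about 95% of the time, so seeing only 7 of 10 positive is already unlikely at the threshold. For ten pools of 300 this sketch reproduces the slide's value of 7; the 60-pool plan can be checked the same way. The function name is hypothetical.

```python
from math import comb

def max_positive_pools(n_pools, pool_size, threshold, confidence=0.95):
    """Largest c such that seeing <= c positive pools has probability
    <= 1 - confidence when the lot sits exactly at the threshold.
    Returns None if even zero positive pools does not give that confidence."""
    q = 1 - (1 - threshold) ** pool_size  # chance a pool is positive at threshold
    cumulative, best = 0.0, None
    for k in range(n_pools + 1):
        cumulative += comb(n_pools, k) * q**k * (1 - q) ** (n_pools - k)
        if cumulative > 1 - confidence:
            break
        best = k
    return best

print("10 pools of 300:", max_positive_pools(10, 300, 0.01))
print("60 pools of 50: ", max_positive_pools(60, 50, 0.01))
```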

OC curves for two testing plans
[Figure: OC curves for “60 pools of 50 seeds” and “10 pools of 300 seeds”, with the threshold marked. x-axis: Actual % Impurity in Lot, 0 to 2.5; y-axis 0% to 100%.]

More definitions
• False negative rate (FNR)
  – probability that a positive sample tests negative
  – PCR failures, DNA problems, …
• False positive rate (FPR)
  – probability that a negative sample tests positive
  – DNA contamination, …

Assay Error Impact (pool size = 1)
[Figure: OC curves under assay error. x-axis: % Deviants in Lot, 0 to 8; y-axis: Chance of Accepting Lot, 0 to 100. Curves: No Errors; 1% false positive rate; 2% false positive rate; 10% false negative rate; 20% false negative rate.]
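With pool size 1, assay errors change the chance that a tested seed reads positive, which shifts the OC curve. A minimal sketch follows, assuming errors act independently on each seed. The plan (n=100, c=2) is hypothetical, since the plan behind the figure is not stated, and the function names are assumptions.

```python
from math import comb

def prob_seed_tests_positive(p, fnr=0.0, fpr=0.0):
    """Chance that one randomly drawn seed tests positive (pool size = 1)."""
    return p * (1 - fnr) + (1 - p) * fpr

def prob_accept(n, c, p, fnr=0.0, fpr=0.0):
    """P(at most c positive results among n individually tested seeds)."""
    q = prob_seed_tests_positive(p, fnr, fpr)
    return sum(comb(n, x) * q**x * (1 - q) ** (n - x) for x in range(c + 1))

n, c = 100, 2  # hypothetical plan, not the one behind the slide's figure
print("% deviants  no errors  2% FPR  20% FNR")
for pct in (0, 2, 4, 6, 8):
    p = pct / 100
    print(f"{pct:10d}  {prob_accept(n, c, p):9.1%}"
          f"  {prob_accept(n, c, p, fpr=0.02):6.1%}"
          f"  {prob_accept(n, c, p, fnr=0.20):7.1%}")
```

False positives lower the chance of accepting clean lots (higher producer risk), while false negatives raise the chance of accepting impure lots (higher consumer risk).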
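
Double Stage Testing Plan
[Figure: flowchart]
Stage 1: test N1 seeds, find X1 deviants
  – X1 ≤ a: ACCEPT LOT
  – X1 ≥ b: REJECT LOT
  – a < X1 < b: go to stage 2
Stage 2: test N2 more seeds, find X2 deviants
  – X1 + X2 ≤ c: ACCEPT LOT
  – X1 + X2 > c: REJECT LOT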
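The acceptance probability of a double stage plan can be sketched under the binomial model. The exact inequalities are an assumption, since the operators in the flowchart did not survive extraction: this sketch accepts at stage 1 if X1 ≤ a, rejects if X1 ≥ b, and otherwise accepts after stage 2 if X1 + X2 ≤ c. The plan values and function names are hypothetical.

```python
from math import comb

def binom_pmf(n, x, p):
    """P(X == x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(n, c, p):
    """P(X <= c) for X ~ Binomial(n, p); zero when c is negative."""
    return sum(binom_pmf(n, x, p) for x in range(c + 1))

def prob_accept_double(n1, n2, a, b, c, p):
    """Acceptance probability under the assumed rule:
    stage 1: accept if X1 <= a, reject if X1 >= b, otherwise test n2 more seeds;
    stage 2: accept if X1 + X2 <= c."""
    accept = binom_cdf(n1, a, p)  # accepted outright at stage 1
    for x1 in range(a + 1, b):    # stage-1 outcomes that trigger stage 2
        accept += binom_pmf(n1, x1, p) * binom_cdf(n2, c - x1, p)
    return accept

# Hypothetical plan: 200 seeds per stage, a=1, b=4, c=4.
for pct in (0.25, 0.5, 1.0, 2.0):
    print(f"{pct}%: {prob_accept_double(200, 200, 1, 4, 4, pct / 100):.1%}")
```

The second sample is only tested when the first stage is inconclusive, so clearly good or clearly bad lots are decided at lower cost.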

Trait Purity Testing
• Example: testing that RR soybeans are above 98% trait purity
• Must test individual seeds
• DNA or protein assay detects the intended trait, rather than the unintended trait as in AP testing
• FNR has larger effect on testing plan than FPR
• Roles of FNR & FPR reverse in Seedcalc6 and Qalstat programs
No Pooling Allowed!!
Introduction to Seedcalc