Applied Problem Solving and Research Using Statistical Methods with NIST Examples
Adam L. Pintar
September, 2016
Introduction
About Me
Grew up in Kansas
Education
Pittsburg State University: Mathematics
: Statistics
Family
About NIST
NIST’s mission is to promote U.S. innovation and industrial competitiveness by advancing measurement science, standards, and technology in ways that enhance economic security and improve our quality of life.
NIST is the national metrology institute (NMI) for the United States. As an NMI, NIST
  Maintains primary measurement standards for the seven base units in the SI system of units and for derived units
  Offers calibration services and measurement standards to support international trade
  Develops new measurement technologies
The Institute for National Measurement Standards of the National Research Council is the Canadian counterpart
NIST Campus
About SED
Churchill Eisenhart was the first Chief
Originally Statistical Engineering Laboratory
1947
Lola Deming was a founding member too
More History
W. J. Youden (Right)
J. Cameron (Left)
Graeco-Latin Square
Enamel on metal
Outline
Experiment Design
  X-ray CT for detecting additive manufacturing (3D printing) defects
  Interactive discussion (choosing factors and levels)
  Accelerated degradation of polymeric materials
Exploratory Data Analysis
  Accelerated degradation of polymeric materials
  Volumetric versus RECIST length tumor measurements
  Standard reference material heterogeneity
Probability/Stochastic Modeling
  Standard reference material heterogeneity
  Distribution of peak pressure
  Standard reference material values
  Interactive discussion (choosing prior distributions)
Multidisciplinary projects
Localizing leaks of geologically sequestered CO2
Experiment Design
X-ray CT for detecting additive manufacturing (3D printing) defects
Goal
Best practices for measuring void size
Question 1
What factors influence the measurements?
List of Factors – Version 1
CT acquisition
  1. Voltage
  2. Current
  3. Filter type
  4. Filter thickness
  5. Magnification
  6. Frame rate
  7. Number of images per projection
  8. Detector type
  9. Pixel size
  10. Scintillator type
  11. Scintillator thickness
Reconstruction
  1. Algorithm
  2. Center of rotation
  3. Beam hardening correction
  4. Scattering correction
Artifact
  1. Material
  2. Flaw size
  3. Flaw shape
Image processing
  1. Smoothing filter
  2. Thresholding algorithm
20 factors!
Available Resources
Able to produce about 20 images
Need $2^{20} \approx 1$ million runs for a 2-level full factorial experiment
The closest we can get to 20 runs is a $2^{20-15}$ fractional factorial
  32 runs
  Main effects confounded with 2-factor interactions
  No estimate of pure error
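The run counts above are simple arithmetic and can be checked in R. This is a minimal sketch; the variable names are illustrative and do not come from the original analysis.

```r
# Number of two-level factors in Version 1 of the factor list
n_factors <- 20

# Runs in a two-level full factorial: 2^20
full_runs <- 2^n_factors

# Runs in the 2^(20-15) fractional factorial
frac_runs <- 2^(n_factors - 15)

# A regular two-level fraction with 16 runs can carry at most 15 factors,
# so 32 is the smallest power of two that accommodates all 20
next_smaller <- 2^(n_factors - 16)

cat("Full factorial runs:      ", full_runs, "\n")
cat("2^(20-15) fraction runs:  ", frac_runs, "\n")
cat("Next smaller fraction:    ", next_smaller, "runs (too few for 20 factors)\n")
```

With a budget of about 20 images, even the 32-run fraction is out of reach, which is one reason to trim the factor list.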
List of Factors – Version 2
CT acquisition
  1. Voltage (numeric)
  2. Current (numeric)
  3. Magnification (numeric)
  4. Frame rate (numeric)
  5. Number of images per projection (numeric)
Reconstruction
1. Algorithm (categorical)
Design
$2^{6-2}$ fractional factorial with 4 “center” runs
  Main effects not confounded with 2-factor interactions
  2 df pure error estimate
  Some information about 2-factor interactions
## volt curr mag fr n_img alg
## 1 0 0 0 0 0 -1
## 2 0 0 0 0 0 1
## 3 -1 -1 1 1 1 1
## 4 1 1 -1 1 -1 1
## 5 1 -1 1 -1 -1 1
## 6 -1 -1 1 -1 1 -1
## 7 1 1 1 1 1 1
## 8 -1 1 -1 -1 1 1
## 9 1 -1 1 1 -1 -1
## 10 -1 -1 -1 -1 -1 -1
## 11 0 0 0 0 0 -1
## 12 1 1 1 -1 1 -1
## 13 1 -1 -1 -1 1 1
## 14 -1 1 1 1 -1 -1
## 15 -1 1 -1 1 1 -1
## 16 1 1 -1 -1 -1 -1
## 17 1 -1 -1 1 1 -1
## 18 -1 1 1 -1 -1 1
## 19 -1 -1 -1 1 -1 1
## 20 0 0 0 0 0 1
Computation
R is a language and environment for statistical computing and graphics. . . .
2-level Fractional Factorial Designs in R
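FrF2 function from the R package FrF2
# Load the FrF2 package (install once with install.packages('FrF2'))
library(FrF2)
# Names of the six factors retained in Version 2
factor_names <- c('volt', 'curr',
                  'mag', 'fr',
                  'n_img', 'alg')
# Request a design of at least resolution IV, plus 4 center runs
my_design <- FrF2(factor.names = factor_names,
                  resolution = 4, ncenter = 4)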
## volt curr mag fr n_img alg
## 1 0 0 0 0 0 0
## 2 0 0 0 0 0 0
## 3 -1 -1 1 -1 1 -1
## 4 1 1 -1 1 -1 1
## 5 -1 -1 -1 1 -1 1
## 6 1 -1 -1 1 1 -1
## 7 1 -1 1 -1 -1 1
## 8 1 1 1 -1 1 -1
## 9 -1 -1 -1 -1 -1 -1
## 10 1 -1 1 1 -1 -1
## 11 0 0 0 0 0 0
## 12 -1 1 -1 -1 1 1
## 13 -1 1 1 -1 -1 1
## 14 1 1 1 1 1 1
## 15 1 -1 -1 -1 1 1
## 16 -1 1 -1 1 1 -1
## 17 -1 1 1 1 -1 -1
## 18 1 1 -1 -1 -1 -1
## 19 -1 -1 1 1 1 1
## 20 0 0 0 0 0 0
## class=design, type= FrF2.center
design.info(my_design)$alias
## $legend
## [1] "A=volt" "B=curr" "C=mag" "D=fr" "E=n_img" "F=alg"
##
## $main
## character(0)
##
## $fi2
## [1] "AB=CE=DF" "AC=BE" "AD=BF" "AE=BC" "AF=BD" "CD=EF"
## [7] "CF=DE"
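The alias listing can be reproduced by hand in base R. The sketch below assumes the generators n_img = volt × curr × mag and alg = volt × curr × fr; these are an assumption that happens to be consistent with the listing above, not something the FrF2 output states. It checks each listed alias pair and confirms that no main effect is correlated with any 2-factor interaction.

```r
# Full factorial in the four base factors (16 runs, coded -1/+1)
base_runs <- expand.grid(volt = c(-1, 1), curr = c(-1, 1),
                         mag = c(-1, 1), fr = c(-1, 1))

# Assumed generators for the two remaining factors (E = ABC, F = ABD)
design <- transform(base_runs,
                    n_img = volt * curr * mag,
                    alg   = volt * curr * fr)

# Two interaction columns are aliased when they are identical in every run
same_column <- function(x, y) all(x == y)

alias_checks <- with(design, c(
  "AB=CE" = same_column(volt * curr,  mag * n_img),
  "AB=DF" = same_column(volt * curr,  fr * alg),
  "AC=BE" = same_column(volt * mag,   curr * n_img),
  "AD=BF" = same_column(volt * fr,    curr * alg),
  "AE=BC" = same_column(volt * n_img, curr * mag),
  "AF=BD" = same_column(volt * alg,   curr * fr),
  "CD=EF" = same_column(mag * fr,     n_img * alg),
  "CF=DE" = same_column(mag * alg,    fr * n_img)
))
print(alias_checks)

# Resolution IV: every main-effect column is orthogonal to every
# 2-factor interaction column
mm <- model.matrix(~ .^2, data = design)
two_fi <- mm[, grepl(":", colnames(mm), fixed = TRUE)]
main_vs_2fi <- crossprod(as.matrix(design), two_fi)
print(all(main_vs_2fi == 0))
```

Because the aliased interaction columns are identical, the data can only estimate the sum of the effects in each alias string, which is why the design gives "some information" about 2-factor interactions rather than separate estimates.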
Interactive Discussion: Choosing Factor Levels
Accelerated Degradation of Polymeric Materials
Goal
Lab measurements
Field measurements
Two Experiments
Laboratory
Precisely control/measure a few factors
Field
  Environmental variation
  Mimic building movement
Focus on Laboratory
Lab Factors
Light intensity
Temperature
Humidity
Mechanical strain
Design
Light intensity – Fixed at max
Temperature – 4 levels
Humidity – Fixed
Mechanical strain – 4 levels
$4^2$ full factorial
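A full factorial with two 4-level factors can be enumerated with expand.grid in base R. In this sketch the level values are read off the panel titles of the raw-data figure later in the talk (Temp 21, 31, 41, 51; Strain 0, 5, 11, 21) and are treated as nominal settings; the column names are illustrative.

```r
# Nominal factor levels (assumed from the raw-data panel titles)
temperature_levels <- c(21, 31, 41, 51)  # degrees C
strain_levels      <- c(0, 5, 11, 21)    # percent

# Every combination of the two factors: 4 x 4 = 16 treatment combinations
lab_design <- expand.grid(temperature_C = temperature_levels,
                          strain_pct    = strain_levels)

print(lab_design)
cat("Number of treatment combinations:", nrow(lab_design), "\n")
```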
[Figure: design points plotted as Temperature (C), 20 to 50, against Mechanical Strain (%), 0 to 20]
Exploratory Data Analysis
Accelerated Degradation of Polymeric Materials
Raw Data
[Figure: Modulus Ratio (0.0 to 1.2) versus Exposure Days (0 to 80), one panel for each combination of Temp (21, 31, 41, 51) and Strain (0, 5, 11, 21); legend: Chamber 1, Chamber 2, Chamber 3, Chamber 4]
Volumetric versus RECIST Length Tumor Measurements
Experiment
Raw Data
[Figure: two panels, Mass (g) versus Volume (cm^3) and Mass (g) versus RECIST (mm)]
Color differentiates diapers
Expect straight lines with positive slopes
Standard Reference Material Heterogeneity
SRMs
SRM Certificate
Coal Bottle-to-Bottle Differences (Heterogeneity)
[Figure: Bromine (mg/kg, 3.2 to 4.4) by Bottle; legend: Observations, Grand mean]
Probabilistic/Stochastic Modelling
Standard Reference Material Heterogeneity
Bottle-to-Bottle Differences (Heterogeneity)
[Figure: Bromine (mg/kg, 3.0 to 4.5) by Bottle; legend: Observations, Grand mean, Grand mean CI, Bottle mean CI]
ANOVA p-value: 0.074
p-value
H0: Bottles all the same
HA: At least one bottle different
anova(lm(raw_data$mg_kg ~ factor(raw_data$bottle)))
## Analysis of Variance Table
##
## Response: raw_data$mg_kg
## Df Sum Sq Mean Sq F value Pr(>F)
## factor(raw_data$bottle) 9 1.62004 0.180005 2.6253 0.07438 .
## Residuals 10 0.68565 0.068565
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
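The F value and p-value in the table follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the printed (rounded) table entries, so the results should agree with the table only up to rounding; the variable names are illustrative.

```r
# Entries copied from the ANOVA table above
ss_between <- 1.62004; df_between <- 9    # bottle-to-bottle
ss_within  <- 0.68565; df_within  <- 10   # residual (within bottle)

# Mean squares are sums of squares divided by degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F statistic and its upper-tail probability under H0
f_value <- ms_between / ms_within
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

cat("F value:", round(f_value, 4), "\n")
cat("p-value:", round(p_value, 5), "\n")
```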
Power
Probability of concluding at least one bottle is different when that is the true state of nature
power.anova.test(groups = 10, n = 2,
between.var = c(0.5, 1, 1, 2), within.var = 1,
sig.level = c(0.01, 0.05, 0.15, 0.1) )
##
## Balanced one-way analysis of variance power calculation
##
## groups = 10
## n = 2
## between.var = 0.5, 1.0, 1.0, 2.0
## within.var = 1
## sig.level = 0.01, 0.05, 0.15, 0.10
## power = 0.08086632, 0.51463217, 0.77715565, 0.93728416
##
## NOTE: n is number in each group
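The power values can also be computed directly from the noncentral F distribution. The sketch below uses the noncentrality parameter (groups − 1) × n × between.var / within.var, which to my understanding is the quantity power.anova.test uses internally; treat that correspondence as an assumption. Variable names are illustrative.

```r
groups      <- 10
n           <- 2                        # observations per bottle
between_var <- c(0.5, 1, 1, 2)
within_var  <- 1
sig_level   <- c(0.01, 0.05, 0.15, 0.10)

df1 <- groups - 1          # numerator degrees of freedom
df2 <- (n - 1) * groups    # denominator degrees of freedom

# Noncentrality parameter under the alternative
lambda <- df1 * n * between_var / within_var

# Critical value of the F test at each significance level
f_crit <- qf(sig_level, df1, df2, lower.tail = FALSE)

# Power: probability that the noncentral F exceeds the critical value
power <- pf(f_crit, df1, df2, ncp = lambda, lower.tail = FALSE)
print(power)
```

Each element pairs one between-bottle variance with one significance level, matching the element-wise pairing in the call above.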
Peak Pressure
Structural Design
Experimental Data
[Figure: Pressure (−3.0 to 0.0) versus Time (s), 0 to 100]
Data Processing
[Figure: Pressure (0.0 to 3.0) versus Time (s), 20 to 100]
Simulated Data
[Figure: three panels of Pressure (0.0 to 3.0) versus Time (s), 0 to 200: Original Data Set, Fake Data Set #1, Fake Data Set #2]
Distribution of the Peak
[Figure: Distribution of the Peak Value; Density (0.0 to 1.5) versus Peak Value (3.0 to 4.5), with the Mean marked]
Uncertainty in the Distribution of the Peak
[Figure: Distribution of the Peak Value; Density (0.0 to 1.5) versus Peak Value (3.0 to 4.5), with the Mean, Bootstrap Replicates, and an 80% CI for the Mean]
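The slides do not spell out the resampling procedure, so the following is only a generic percentile-bootstrap sketch for an 80% interval on a mean, not the method used in the talk. The peak values are placeholder numbers drawn from a normal distribution purely for illustration; all names and settings are assumptions.

```r
set.seed(1)

# Placeholder "peak values" (made up for illustration, not real data)
peaks <- rnorm(200, mean = 3.7, sd = 0.3)

# Resample the data with replacement and record the mean each time
n_boot <- 2000
boot_means <- replicate(n_boot, mean(sample(peaks, replace = TRUE)))

# Percentile interval: the middle 80% of the bootstrap means
ci_80 <- quantile(boot_means, probs = c(0.10, 0.90))

cat("Sample mean:", round(mean(peaks), 3), "\n")
print(round(ci_80, 3))
```

For time-series data such as the pressure records, plain resampling of individual values ignores serial dependence, so a block or model-based bootstrap would usually be a more defensible choice.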
Standard Reference Material Values
Bayesian methods – high level
Two sources of information
  Data
  Subject matter expertise (Prior)
Bayes rule tells us how to combine them
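Bayes rule can be written compactly as follows. The notation is my own and is not taken from the slides: $\theta$ stands for the unknown quantity (for example, an SRM value), $y$ for the data, $p(\theta)$ for the prior, and $p(y \mid \theta)$ for the likelihood.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; p(y \mid \theta)\, p(\theta)
```

The posterior $p(\theta \mid y)$ is the likelihood reweighted by the prior, so a flat prior leaves the data to speak alone while an informative prior concentrates the posterior.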
Hard Rock Mine Waste
Governor Basin, Colorado
Ag (Silver) Raw Data
[Figure: Ag (mg/kg, 70 to 80) by measurement method: ICPMS Lab 1, ICPOES, ICPMS Lab 2, INAA]
Posterior With Flat Prior
[Figure: posterior Density (0.00 to 0.08) versus Ag (Silver) mg/kg, 40 to 120]
95% interval [55, 104]
The Problem
[Figure: Ag (mg/kg, 70 to 80) by measurement method: ICPMS Lab 1, ICPOES, ICPMS Lab 2, INAA]
Prior Information
[Figure: prior Density (0 to 5) versus Logarithmic Units (0.0 to 1.0); legend: Informative, Flat]
Quality control data (Log scale)
Posterior With Informative Prior
[Figure: posterior Density (0.00 to 0.10) versus Ag (Silver) mg/kg, 60 to 110]
95% interval [66, 83]
Interactive Discussion: Choosing Prior Distributions
Multidisciplinary Projects
Goal
Detect and locate leaks of CO2 stored in geological formations
Test Site in Montana
The data that were analyzed come from a similar site in Ft. Wayne, Indiana, using the same equipment
Likelihood at Ft. Wayne
Grid is clearly visible
Expected no signal, but found a strong one
Collaborators
Zachary H. Levine (NIST)
Jeremy T. Dobler (Harris Corp.)
Nathan Blume (Harris Corp.)
Michael Braun (Harris Corp.)
T. Scott Zaccheo (Atmospheric and Environmental Research)
Timothy G. Pernini (Atmospheric and Environmental Research)
Reference
Levine, Z. H., Pintar, A. L., Dobler, J. T., Blume, N., Braun, M., Zaccheo, T. S., Pernini, T. G., “The Detection of Carbon Dioxide Leaks Using Quasi-tomographic Laser Absorption Spectroscopy Measurements in Variable Wind,” *Atmospheric Measurement Techniques*, **9**, 1627–1636, 2016, http://www.atmos-meas-tech.net/9/1627/2016/amt-9-1627-2016.pdf.
Summary
Three main toolboxes
  Experiment design
  Exploratory data analysis
  Stochastic/Probabilistic modelling
Toolboxes interact with each other
Multidisciplinary teams
Questions
Question One
Thank you Michelle T.
Synopsis
  Purchase four physical standards for calibrating liquid chromatography instruments
  Combine the standards into a single new “secondary” standard
  How to verify that the combination procedure does not change the concentrations listed on the certificates
Simple Graphical Approach
Quality control checks when measuring candidate SRMs
[Figure: values in µg/kg (8 to 16) for Certificate and Measurements]
Formalization of the Simple Approach
H0: No difference
HA: Some difference
Review paper
  Rukhin, A. L., “Assessing Compatibility of Two Laboratories: Formulations as a Statistical Hypothesis Testing Problem,” Metrologia, 50, 49–59 (2013).
Potential problem
  Can find strong evidence of a difference but not strong evidence of equivalence
Second Formalization
H0: Difference is practically important
HA: Difference is not practically important
Reference
  Anderson-Cook, C. M. and Borror, C. M., “The Difference Between ‘equivalent’ and ‘not different,’” Quality Engineering, 28, 249–262 (2016).
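One common way to test hypotheses of this form is the two one-sided tests (TOST) procedure. The sketch below is a generic illustration and not necessarily the approach taken in the talk or in the reference; the certified value, the measurements, and the practical-importance margin delta are all hypothetical numbers.

```r
set.seed(2)

# Hypothetical inputs (made up for illustration), in µg/kg
certificate  <- 12                                 # certified value
measurements <- rnorm(8, mean = 12.1, sd = 0.4)    # new measurements
delta        <- 1                                  # margin of practical importance

# H0: |mean - certificate| >= delta   versus   HA: |mean - certificate| < delta
# Test each side of the margin with a one-sided t test
p_lower <- t.test(measurements, mu = certificate - delta,
                  alternative = "greater")$p.value
p_upper <- t.test(measurements, mu = certificate + delta,
                  alternative = "less")$p.value

# Equivalence is declared only if both one-sided tests reject
p_tost <- max(p_lower, p_upper)
cat("TOST p-value:", signif(p_tost, 3), "\n")

# Equivalent check at the 5% level: the 90% confidence interval for the mean
# must lie entirely inside (certificate - delta, certificate + delta)
ci_90 <- t.test(measurements, conf.level = 0.90)$conf.int
print(ci_90)
```

The choice of delta carries the subject-matter judgment: it states how large a change in concentration would matter in practice.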
Some modification necessary
[Figure: values in µg/kg (8 to 16) for Certificate and Measurements]