department of mathematics and computer science
2DS01
Statistics 2 for Chemical Engineering
http://www.win.tue.nl/~sandro/2DS01
Lecturers
• Marko Boon
• Dr. A. Di Bucchianico
• Ir. G.D. Mooiweer
• Drs. C.M.J. Rusch – Groot
Important to remember
• Web site for this course: http://www.win.tue.nl/~sandro/2DS01/
• No textbook, but handouts + Powerpoint sheets through web site
• Bring a notebook computer (laptop) to the fourth lecture (12th of April) and to the self-study sessions
• Software:
– Statgraphics (version 5.1). If not installed, install through
http://w3.tue.nl/nl/diensten/dienst_ict/organisatie/groepen/wins/campus_software/
– Java (at least version 1.4). Install through http://java.com.
Java is needed to run Statlab (http://www.win.tue.nl/statlab).
Important: In order to run Statlab during the exams, security settings have to
be adjusted!
Goals of this course
• teach students need for statistical basis of
experimentation
• teach students statistical tools for experimentation
– design of experiments (factorial designs, optimal designs)
– analysis of experiments (ANOVA)
– use of statistical software
• give students short introduction to recent
developments
Week schedule
Week 1: Introduction to Analysis of Variance
(ANOVA)
Week 2: Factorial designs: screening
Week 3: Factorial designs: optimisation
Week 4: Optimal experimental design
and mixture designs (by A. Di Bucchianico
– Bring your laptop!)
Detailed contents of week 1
• statistics and experimentation
• short recapitulation of regression analysis
• one-way ANOVA
• one-way ANOVA with blocks
• multiple comparisons
Statistics and experimentation
Chemical experiments often depend on several
factors (pressure, catalyst, temperature, reaction
time, ...)
Two important questions:
• which factors are really important?
• what are optimal settings for important factors?
Use of statistical experimentation in chemical engineering
• Chemical synthesis (synthetic steps; work-up and separation; reagents, solvents, catalysts; structure, reactivity and properties, ...)
• Biotech industry (drug design, analytical biochemistry, process optimization: fermentation, purification, ...)
• Process industry (process optimization and control: yield, purity, throughput time, pollution, energy consumption; product quality and performance: material strength, warp, color, taste, odour; ...)
• ...
Short history of statistics and experimentation
• 1920’s - ... introduction of statistical methods in
agriculture by Fisher and co-workers
• 1950’s - ... introduction in chemical engineering
(Box, ...)
• 1980’s - ... introduction in Western industry of Japanese
approach (Taguchi, robust design)
• 1990’s - ... combinatorial chemistry, high-throughput processing
Link to Statistics 1 for Chemical Engineering
• introduction to measurements
– data analysis
– error propagation
• regression analysis
• use of statistical software (Statgraphics)
Types of regression analysis
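Linear means linear in the coefficients, not linear functions!
• Simple linear regression: $Y = \beta_0 + \beta_1 x + \varepsilon$
• Multiple linear regression: $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \varepsilon$
• Non-linear regression, e.g. $Y = \beta_1 C^{\beta_2} + \varepsilon$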
Linear regression
Model: $Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \varepsilon_i$
Assumptions:
• the model is linear (+ enough terms)
• the $\varepsilon_i$'s are normally distributed with mean 0 and variance $\sigma^2$
• the $\varepsilon_i$'s are independent.
Specific warmth
• specific warmth (specific heat capacity) of vapour at constant pressure as a function of temperature
• data set from Perry’s Chemical Engineers’ Handbook
• thermodynamic theories say that a quadratic relation between temperature and specific warmth usually suffices:
$C_p = \beta_0 + \beta_1 T + \beta_2 T^2$
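As a hedged illustration of why this quadratic model still counts as linear regression: stacking the $n$ measurements $(T_i, C_{p,i})$ gives a matrix form that is linear in the coefficients, so ordinary least squares applies. The matrix symbols $\mathbf{y}$, $X$, $\boldsymbol{\beta}$ and $\boldsymbol{\varepsilon}$ are standard notation assumed here, not taken from the slides.

```latex
\[
\underbrace{\begin{pmatrix} C_{p,1} \\ C_{p,2} \\ \vdots \\ C_{p,n} \end{pmatrix}}_{\mathbf{y}}
=
\underbrace{\begin{pmatrix}
1 & T_1 & T_1^2 \\
1 & T_2 & T_2^2 \\
\vdots & \vdots & \vdots \\
1 & T_n & T_n^2
\end{pmatrix}}_{X}
\underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix}}_{\boldsymbol{\beta}}
+ \boldsymbol{\varepsilon},
\qquad
\hat{\boldsymbol{\beta}} = (X^{\top}X)^{-1}X^{\top}\mathbf{y}.
\]
```

The column $T^2$ is simply treated as a second regressor, which is why the model is non-linear in $T$ but linear in $\beta_0, \beta_1, \beta_2$.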
Scatter plot of specific warmth data
[Figure: Plot of Cp vs T, with T on the horizontal axis (250 to 400) and Cp on the vertical axis (1800 to 2200)]
Regression output specific warmth data
Polynomial Regression Analysis
Dependent variable: Cp

| Parameter | Estimate | Standard Error | T Statistic | P-Value |
|---|---|---|---|---|
| CONSTANT | 3590.36 | 76.3041 | 47.0533 | 0.0000 |
| T | -12.1386 | 0.454369 | -26.7153 | 0.0000 |
| T^2 | 0.0213415 | 0.000670762 | 31.8169 | 0.0000 |

Analysis of Variance

| Source | Sum of Squares | Df | Mean Square | F-Ratio | P-Value |
|---|---|---|---|---|---|
| Model | 169252.0 | 2 | 84626.2 | 6227.13 | 0.0000 |
| Residual | 285.388 | 21 | 13.5899 | | |
| Total (Corr.) | 169538.0 | 23 | | | |

R-squared = 99.8317 percent
R-squared (adjusted for d.f.) = 99.8156 percent
Standard Error of Est. = 3.68645
Mean absolute error = 2.94042
Durbin-Watson statistic = 0.310971 (P=0.0000)
Lag 1 residual autocorrelation = 0.640511
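The summary figures in this output can be cross-checked from the ANOVA table alone. The sketch below recomputes R-squared, adjusted R-squared, the standard error of estimate and the F-ratio from the printed sums of squares. The variable names are illustrative, and small differences from the printed values are to be expected because the table entries are rounded.

```python
import math

# Values copied from the ANOVA table of the regression output.
ss_model, df_model = 169252.0, 2
ss_resid, df_resid = 285.388, 21
ss_total, df_total = 169538.0, 23

# Mean squares: sum of squares divided by degrees of freedom.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid

# Fraction of the total variation explained by the model.
r_squared = 1 - ss_resid / ss_total
# Same idea, but comparing mean squares (penalises extra terms).
adj_r_squared = 1 - ms_resid / (ss_total / df_total)
# Estimated standard deviation of the error term.
std_error_est = math.sqrt(ms_resid)
# Test statistic for the significance of the model as a whole.
f_ratio = ms_model / ms_resid

print(f"R-squared            = {100 * r_squared:.4f} percent")
print(f"R-squared (adjusted) = {100 * adj_r_squared:.4f} percent")
print(f"Standard error est.  = {std_error_est:.5f}")
print(f"F-ratio              = {f_ratio:.2f}")
```

The recomputed numbers agree with the Statgraphics output up to rounding, which shows that these summary statistics carry no information beyond the ANOVA table itself.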
Issues in regression output
• significance of model
• significance of individual regression parameters
• residual plots:
– normality (density trace, normal probability plot)
– constant variance (against predicted values + each independent
variable)
– model adequacy (against predicted values)
– outliers
– independence
• influential points
Residual plot specific warmth data
This behaviour is visible in plot of fitted line only after rescaling!
[Figure: Residual Plot, studentized residual (-3.8 to 4.2) versus predicted Cp (1800 to 2200)]
Plot of fitted quadratic model for specific warmth data
[Figure: Plot of Fitted Model, Cp (1800 to 2200) versus T (250 to 450)]
Conclusion regression models for specific warmth data
• we need a third-order model (polynomial of degree 3)
• careful with extrapolation
• original data set contains influential points
• original data set contains potential outliers
Analysis of variance
• name refers to mathematical technique, not to
goal
• comparison of means (!!) using variances
(extension of t-test to more than 2 samples)
• samples usually are groups of measurements with
constant factor settings
Example: ANOVA
production of yarns: influence of fibre composition on
breaking tension
simplification:
one factor: % cotton
fixed factor levels: 15%, 20%, 25%, 30%, 35%
experimental design: produce on the same machine 5
threads of each type of fibre composition in random
order
Statistical setting
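Basic model: $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
• $\mu$: overall mean
• $\alpha_i$: influence of factor level $i$, $i = 1, 2, \dots, k$
• $\varepsilon_{ij}$: error term, normally distributed with mean 0 and variance $\sigma^2$, independent
• $j = 1, 2, \dots, n$: replications
Basic hypotheses:
• $H_0$: $\alpha_i = 0$ for all $i$
• $H_1$: $\alpha_i \neq 0$ for at least one $i$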
Expectation under H0 (= no effect of factor level)
• spread of observations with respect to group means: chance
• spread of group means with respect to overall mean: chance
Expectation under H1
• spread of observations with respect to group means: chance
• spread of group means with respect to overall mean: chance + systematic
Illustration of group means
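[Figure: group means $\bar{y}_{1\cdot}$, $\bar{y}_{2\cdot}$, $\bar{y}_{3\cdot}$ and overall mean $\bar{y}_{\cdot\cdot}$]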
Group means versus overall mean
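[Figure: group means $\bar{y}_{1\cdot}$, $\bar{y}_{2\cdot}$, $\bar{y}_{3\cdot}$ and overall mean $\bar{y}_{\cdot\cdot}$, with the distances $y_{3j} - \bar{y}_{3\cdot}$, $\bar{y}_{3\cdot} - \bar{y}_{\cdot\cdot}$ and $y_{3j} - \bar{y}_{\cdot\cdot}$ marked for an observation $y_{3j}$ in group 3]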
Conclusion
Comparison of both spreads yields an indication for H0 vs H1.
Spreads are converted into sums of squares:
$$\sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{\cdot\cdot}\right)^2 = n\sum_{i=1}^{k}\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^2 + \sum_{i=1}^{k}\sum_{j=1}^{n}\left(y_{ij}-\bar{y}_{i\cdot}\right)^2$$
total = treatment (between groups) + rest (within groups)
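A short derivation, added here as a sketch, of why the total sum of squares splits into exactly two parts with no cross term. It uses the usual dot notation, where $\bar{y}_{i\cdot}$ is the mean of group $i$ and $\bar{y}_{\cdot\cdot}$ the overall mean: write each deviation from the overall mean as a within-group part plus a between-group part and square it.

```latex
\begin{align*}
\sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij}-\bar{y}_{\cdot\cdot})^2
&= \sum_{i=1}^{k}\sum_{j=1}^{n}
   \bigl[(y_{ij}-\bar{y}_{i\cdot}) + (\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})\bigr]^2 \\
&= \sum_{i=1}^{k}\sum_{j=1}^{n} (y_{ij}-\bar{y}_{i\cdot})^2
 + n\sum_{i=1}^{k} (\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})^2
 + 2\sum_{i=1}^{k} (\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot})
   \underbrace{\sum_{j=1}^{n} (y_{ij}-\bar{y}_{i\cdot})}_{=\,0}.
\end{align*}
```

The last sum vanishes because deviations from a group's own mean always add up to zero, which leaves the within-groups and between-groups sums of squares.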
Mean Sums of Squares
Sums of squares differ with respect to the number of contributions.
For a fair comparison: divide by the degrees of freedom.
• we expect under H0: $MS_{\text{between}} \approx MS_{\text{within}}$
• we expect under H1: $MS_{\text{between}} \gg MS_{\text{within}}$
Summary in ANOVA table.
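To make the mean squares concrete, the following sketch builds a one-way ANOVA table by hand for a small balanced data set. The numbers are invented purely for illustration (they are not course data) and the function name `one_way_anova` is hypothetical. The degrees of freedom used are the standard k - 1 between groups and k(n - 1) within groups.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a balanced design.

    groups: list of k lists, each holding n observations.
    """
    k = len(groups)
    n = len(groups[0])
    assert all(len(g) == n for g in groups), "balanced design expected"

    grand_mean = sum(sum(g) for g in groups) / (k * n)
    group_means = [sum(g) / n for g in groups]

    # Spread of group means around the overall mean (treatment).
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    # Spread of observations around their own group mean (rest).
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )

    df_between = k - 1
    df_within = k * (n - 1)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within,
        "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }


# Invented illustration data: k = 3 groups, n = 3 replications each.
data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
table = one_way_anova(data)

print(f"{'Source':<10}{'SS':>8}{'Df':>4}{'MS':>8}{'F':>8}")
print(f"{'Between':<10}{table['ss_between']:>8.2f}{table['df_between']:>4}"
      f"{table['ms_between']:>8.2f}{table['f_ratio']:>8.2f}")
print(f"{'Within':<10}{table['ss_within']:>8.2f}{table['df_within']:>4}"
      f"{table['ms_within']:>8.2f}")
```

A large F-ratio points towards H1. The P-value follows from the F distribution with (k - 1, k(n - 1)) degrees of freedom, which statistical software such as Statgraphics reports.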
Completely Randomized One-factor Design
Experiment, in which one factor varies on k levels.
At each level n measurements are taken.
The order of all measurements is random.
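A randomized run order for such a design can be generated mechanically. The sketch below uses the yarn example (cotton percentages 15 to 35, 5 threads per level). The function name `randomized_run_order` and the fixed seed are illustrative assumptions.

```python
import random


def randomized_run_order(levels, n_replications, seed=None):
    """Return all runs of a one-factor design in random order.

    Every level appears n_replications times; the complete list of
    k * n runs is shuffled as a whole (complete randomization).
    """
    rng = random.Random(seed)
    runs = [level for level in levels for _ in range(n_replications)]
    rng.shuffle(runs)
    return runs


cotton_levels = [15, 20, 25, 30, 35]  # % cotton, from the yarn example
run_order = randomized_run_order(cotton_levels, n_replications=5, seed=1)

for run_number, level in enumerate(run_order, start=1):
    print(f"run {run_number:2d}: {level}% cotton")
```

Shuffling the full list, rather than producing the five threads of one composition in a row, keeps drift of the machine over time from being confused with the effect of the cotton percentage.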
Multiple comparisons
• ANOVA only indicates whether there are significantly different
group means
• ANOVA does not indicate which groups have different means
(although we may construct confidence intervals for differences)
• various methods exist for correctly performing pairwise
comparisons:
– LSD (Least Significant Difference) method
– HSD (Honestly Significant Difference) method
– Duncan
– Newman – Keuls
– ...
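To see why uncorrected pairwise tests are a problem, count the comparisons. With k groups there are k(k - 1)/2 pairs, and if each pair were tested independently at level α, the chance of at least one false alarm would be 1 - (1 - α)^m. Pairwise tests on the same data are not independent, so the sketch below gives a rough indication only. The 5 groups and α = 0.05 are example values, and the function name is hypothetical.

```python
import math


def familywise_error_rate(k, alpha):
    """Number of pairwise comparisons among k groups and the chance of at
    least one false positive, ASSUMING the comparisons are independent."""
    m = math.comb(k, 2)  # k * (k - 1) / 2 pairs
    return m, 1 - (1 - alpha) ** m


m, fwer = familywise_error_rate(k=5, alpha=0.05)
print(f"{m} pairwise comparisons, "
      f"approximate family-wise error rate {fwer:.3f}")
```

Methods such as LSD and HSD differ in how strictly they control this overall error rate.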
Randomized one-factor block design
Experiment with one factor and observations in blocks.
Blocks are levels of a noise factor.
In each block all treatments occur equally often; randomization within blocks.
Example
testing method for material hardness:
[Figure: a force presses a pressure pin (tip) into a strip of testing material]
practical problem: 4 types of pressure pins; do these yield the same results?
Experimental design 1
| | pin 1 | pin 2 | pin 3 | pin 4 |
|---|---|---|---|---|
| testing strips | 1, 2, 3, 4 | 5, 6, 7, 8 | 9, 10, 11, 12 | 13, 14, 15, 16 |
Problem: if the measurements of strips 5 through 8 differ, is
this caused by the strips or by pin 2?
Experimental design 2
Take 4 strips on which you measure (in random order) each pressure pin once:

| | strip 1 | strip 2 | strip 3 | strip 4 |
|---|---|---|---|---|
| pressure pins (in order of measurement) | 1, 3, 2, 4 | 1, 4, 3, 2 | 4, 3, 2, 1 | 2, 3, 1, 4 |
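Such a blocked layout can be produced by shuffling the pins separately for every strip. A minimal sketch, with the hypothetical helper `randomized_block_design` and an arbitrary seed:

```python
import random


def randomized_block_design(treatments, blocks, seed=None):
    """Return {block: treatments in random order}.

    Every treatment occurs exactly once in every block; the order is
    randomized independently within each block.
    """
    rng = random.Random(seed)
    design = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)
        design[block] = order
    return design


pins = [1, 2, 3, 4]
strips = ["strip 1", "strip 2", "strip 3", "strip 4"]
design = randomized_block_design(pins, strips, seed=2)

for strip, order in design.items():
    print(f"{strip}: pins in order {order}")
```

In contrast with complete randomization, the shuffle is restricted to one strip at a time, so each pin is guaranteed to meet each strip once.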
Blocking
Advantage of blocked experimental design 2: differences between strips are filtered out
Model: $Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
• $\alpha_i$: factor (pressure pin)
• $\beta_j$: block effect (strip)
• $\varepsilon_{ij}$: error term
• Primary goal: reduction of the error term
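How blocking reduces the error term can be seen from the sums of squares. In the sketch below the total variation is split into a pin part, a strip part and a remaining error part. The hardness readings are invented for illustration only, and the function name `block_anova_sums_of_squares` is hypothetical.

```python
def block_anova_sums_of_squares(y):
    """Sums of squares for a randomized block design with one
    observation per cell; y[i][j] is treatment i in block j."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    treat_means = [sum(row) / b for row in y]
    block_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    ss_total = sum((y[i][j] - grand) ** 2
                   for i in range(a) for j in range(b))
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    # What is left after removing both treatment and block effects.
    ss_error = sum(
        (y[i][j] - treat_means[i] - block_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    return ss_total, ss_treat, ss_block, ss_error


# Invented readings: rows = pressure pins 1..4, columns = strips 1..4.
hardness = [
    [10.0, 12.0, 14.0, 16.0],
    [11.0, 13.0, 14.0, 18.0],
    [9.0, 12.0, 13.0, 16.0],
    [12.0, 13.0, 16.0, 17.0],
]
ss_total, ss_treat, ss_block, ss_error = block_anova_sums_of_squares(hardness)

print(f"total     {ss_total:8.3f}")
print(f"pins      {ss_treat:8.3f}")
print(f"strips    {ss_block:8.3f}")
print(f"error     {ss_error:8.3f}")
print(f"error if strips were ignored {ss_block + ss_error:8.3f}")
```

Without blocking, the strip-to-strip variation would end up in the error sum of squares and could mask real differences between the pins.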
Summary
• completely randomized design
• randomized block design
• multiple comparisons
Reading material:
• Statgraphics lecture notes section 4.1 through 4.3
• http://www.acc.umu.se/~tnkjtg/chemometrics/editorial/aug2002.html