1 design and analysis of experiments (2) basic statistics kyung-ho park

1

Design and Analysis of Experiments (2)

Basic Statistics

Kyung-Ho Park

2

Descriptive Statistics: deals with procedures used to summarize the information contained in a set of measurements.

Inferential Statistics: deals with procedures used to make inferences (predictions) about a population parameter from information contained in a sample.

3

• Population – the totality of the observations with which we are concerned

• Sample – a subset of observations selected from a population

4

                     Population   Sample
Mean                 μ            x̄
Variance             σ²           s²
Standard deviation   σ            s

5

Descriptive statistics

Numerical Methods

Graphical Methods

6

Numerical methods

Measures of Central Tendency (Location)

1) sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

2) sample median: the middle number when the measurements are arranged in ascending order

3) sample mode: most frequently occurring value

i) the sample mean is sensitive to extreme values

ii) the median is insensitive to extreme values
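The two remarks about extreme values can be checked with a short Python sketch. The five-value sample below is hypothetical and was chosen only for illustration; it is not data from these slides.

```python
from statistics import mean, median, mode

# Hypothetical sample, used only to illustrate the three measures.
sample = [3, 5, 5, 6, 8]
# The same sample with its largest value replaced by an extreme value.
with_outlier = [3, 5, 5, 6, 80]

mean_a, mean_b = mean(sample), mean(with_outlier)
median_a, median_b = median(sample), median(with_outlier)
mode_a, mode_b = mode(sample), mode(with_outlier)

print("mean:  ", mean_a, "->", mean_b)      # the mean shifts with the extreme value
print("median:", median_a, "->", median_b)  # the median stays where it was
print("mode:  ", mode_a, "->", mode_b)
```

One extreme value is enough to pull the mean well away from the bulk of the data, while the median and the mode do not move.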

7

Numerical methods

Measures of Dispersion (Variability)

1) range: max – min

2) sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

3) sample standard deviation: $s = \sqrt{s^2}$

8

Numerical methods

Measures of Central Tendency (Location)   Measures of Dispersion (Variability)
1. Sample mean                            1. Range
2. Sample median                          2. Mean Absolute Deviation (MAD)
3. Sample mode                            3. Sample Variance
                                          4. Sample Standard Deviation

9

Graphical Methods

105 221 183 186 121 181 180 143

97 154 153 174 120 168 167 141

245 228 174 199 181 158 176 110

163 131 154 115 160 208 158 133

207 180 190 193 194 133 156 123

134 178 76 167 184 135 229 146

218 157 101 171 165 172 158 169

199 151 142 163 145 171 148 158

160 175 149 87 160 237 150 135

196 201 200 176 150 170 118 149

Table 1: Compressive Strength (in psi) of 80 Aluminum-Lithium Alloy Specimens
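As a rough check of the numerical methods on the Table 1 data, the following Python sketch computes the measures of location and dispersion with the standard library. The variable names are illustrative; `variance` and `stdev` use the n - 1 divisor of the sample formulas.

```python
from statistics import mean, median, mode, stdev, variance

# Compressive strength (psi) of the 80 specimens in Table 1, row by row.
strength = [
    105, 221, 183, 186, 121, 181, 180, 143,
    97, 154, 153, 174, 120, 168, 167, 141,
    245, 228, 174, 199, 181, 158, 176, 110,
    163, 131, 154, 115, 160, 208, 158, 133,
    207, 180, 190, 193, 194, 133, 156, 123,
    134, 178, 76, 167, 184, 135, 229, 146,
    218, 157, 101, 171, 165, 172, 158, 169,
    199, 151, 142, 163, 145, 171, 148, 158,
    160, 175, 149, 87, 160, 237, 150, 135,
    196, 201, 200, 176, 150, 170, 118, 149,
]

n = len(strength)
x_bar = mean(strength)                       # sample mean
x_med = median(strength)                     # sample median
x_mode = mode(strength)                      # most frequent value
data_range = max(strength) - min(strength)   # max - min
s2 = variance(strength)                      # sample variance (n - 1 divisor)
s = stdev(strength)                          # sample standard deviation

print(f"n = {n}, mean = {x_bar:.2f}, median = {x_med}, mode = {x_mode}")
print(f"range = {data_range}, s^2 = {s2:.1f}, s = {s:.2f}")
```

The mean and standard deviation should agree with the values the Minitab histogram of this data reports (Mean 162.7, StDev 33.77, N 80).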

10

Graphical Methods

Histogram

[Figure: Histogram of c1 with fitted normal curve (x-axis: c1, y-axis: Frequency; Mean 162.7, StDev 33.77, N 80)]

Dot Plot

[Figure: Dotplot of c1 (axis: c1, 75 to 250)]

Stem-and-Leaf Display: c1

Stem-and-leaf of c1   N = 80
Leaf Unit = 1.0

LO 76, 87

   3    9   7
   5   10   15
   8   11   058
  11   12   013
  17   13   133455
  25   14   12356899
  37   15   001344678888
(10)   16   0003357789
  33   17   0112445668
  23   18   0011346
  16   19   034699
  10   20   0178
   6   21   8
   5   22   189

HI 237, 245

11

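Box Plot

[Figure: Boxplot of c1 (vertical axis: c1, 100 to 250)]

The box extends from the first quartile to the third quartile, with a line at the second quartile (the median); the length of the box is the interquartile range (IQR).

Lower whisker: extends to the smallest data point within 1.5 interquartile ranges from the first quartile

Upper whisker: extends to the largest data point within 1.5 interquartile ranges from the third quartile

Outliers: points more than 1.5 IQR but no more than 3 IQR beyond the edge of the box

Extreme outliers: points more than 3 IQR beyond the edge of the box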
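The quantities behind a box plot can be sketched in Python for the Table 1 data. Quartile conventions differ between packages; this sketch assumes the default "exclusive" method of `statistics.quantiles`, which interpolates at position (n + 1)p, so results from other software may differ slightly.

```python
from statistics import quantiles

# Compressive strength (psi) of the 80 specimens in Table 1, row by row.
strength = [
    105, 221, 183, 186, 121, 181, 180, 143,
    97, 154, 153, 174, 120, 168, 167, 141,
    245, 228, 174, 199, 181, 158, 176, 110,
    163, 131, 154, 115, 160, 208, 158, 133,
    207, 180, 190, 193, 194, 133, 156, 123,
    134, 178, 76, 167, 184, 135, 229, 146,
    218, 157, 101, 171, 165, 172, 158, 169,
    199, 151, 142, 163, 145, 171, 148, 158,
    160, 175, 149, 87, 160, 237, 150, 135,
    196, 201, 200, 176, 150, 170, 118, 149,
]

# First, second (median) and third quartiles.
q1, q2, q3 = quantiles(strength, n=4)
iqr = q3 - q1

# Whiskers stop at the most extreme points inside these fences.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [x for x in strength if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Points beyond the fences are plotted individually as outliers.
outliers = sorted(x for x in strength if x < lower_fence or x > upper_fence)

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, IQR = {iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```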

12

Probability Plots

Graphical method for determining whether sample data conform to a hypothesized distribution, based on a subjective visual examination of the data

10 observations on the effective service life in minutes of batteries in a portable personal computer

176, 191, 214, 220, 205, 192, 201, 190, 183, 185

j    x(j)   (j-0.5)/10   z_j
1    176    0.05         -1.64
2    183    0.15         -1.04
3    185    0.25         -0.67
4    190    0.35         -0.39
5    191    0.45         -0.13
6    192    0.55          0.13
7    201    0.65          0.39
8    205    0.75          0.67
9    214    0.85          1.04
10   220    0.95          1.64

(j-0.5)/n = P(Z ≤ z_j)

[Figure: Probability Plot of c1, Normal (x-axis: c1, y-axis: Percent; Mean 195.7, StDev 14.03, N 10, AD 0.257, P-Value 0.636)]
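The plotting positions in the battery-life table can be reproduced with a minimal Python sketch. It starts from the ordered values x(j) of the table and uses `statistics.NormalDist().inv_cdf` to solve (j-0.5)/n = P(Z ≤ z_j) for z_j; the variable names are illustrative.

```python
from statistics import NormalDist, mean

# Ordered battery-life observations x(j), in minutes, as listed in the table.
ordered = [176, 183, 185, 190, 191, 192, 201, 205, 214, 220]
n = len(ordered)
std_normal = NormalDist()  # standard normal, mean 0 and standard deviation 1

rows = []
for j, x in enumerate(ordered, start=1):
    p = (j - 0.5) / n            # cumulative plotting position
    z = std_normal.inv_cdf(p)    # z_j such that P(Z <= z_j) = p
    rows.append((j, x, p, z))
    print(f"{j:2d}  {x}  {p:.2f}  {z:6.2f}")

x_bar = mean(ordered)
```

Plotting x(j) against z_j gives the normal probability plot; points that fall close to a straight line are consistent with a normal distribution.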

13

Probability Plots (table 1)

[Figure: Probability Plot of c1, Normal (x-axis: c1, y-axis: Percent; P-Value 0.668)]

14

                     Population   Sample
Mean                 μ            x̄
Variance             σ²           s²
Standard deviation   σ            s

Estimation: the sample statistics (x̄, s², s) are used to estimate the population parameters (μ, σ², σ)

15

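Normal Distribution

Distribution of a random variable (sampling): normal distribution

y: a normal random variable

the probability distribution of y:

$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\left[\frac{y-\mu}{\sigma}\right]^2\right\}$

$y \sim N(\mu, \sigma^2)$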

16

[Figure: Histogram of c1 with fitted normal curve (x-axis: c1, y-axis: Frequency; Mean 162.7, StDev 33.77, N 80)]

17

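Standard Normal Distribution

Special case: $\mu = 0$, $\sigma^2 = 1$

random variable: $z = \frac{y - \mu}{\sigma}$, $z \sim N(0, 1)$

Area under the normal curve: $\mu \pm 1\sigma$: 68.26%, $\mu \pm 2\sigma$: 95.44%, $\mu \pm 3\sigma$: 99.73%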
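The three percentages are the areas under the standard normal curve within 1, 2 and 3 standard deviations of the mean. A short Python sketch with `statistics.NormalDist` recomputes them; the exact areas round to 68.27%, 95.45% and 99.73%, and the slightly different last digits quoted on the slide are what a four-decimal normal table gives.

```python
from statistics import NormalDist

std_normal = NormalDist()  # z ~ N(0, 1)

# P(-k <= z <= k) for k = 1, 2, 3 standard deviations.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in coverage.items():
    print(f"within {k} sigma: {100 * area:.2f}%")
```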

18

Ex. 1 Suppose the current measurements in a strip of wire are assumed to follow a normal distribution with a mean of 10 milliamperes and a variance of 4 (milliamperes)². What is the probability that a measurement will exceed 13 milliamperes?

$P(X > 13) = P\left(\frac{X - 10}{2} > \frac{13 - 10}{2}\right) = P(Z > 1.5) = 0.06681$

MiniTab

Calc – Probability distribution – Normal

Mean=10.0, S.D=2, Input Constant=13.0

Cumulative Distribution Function
Normal with mean = 10 and standard deviation = 2
x    P( X <= x )
13   0.933193

Hence P(X > 13) = 1 - 0.933193 = 0.066807.
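The same probability can be obtained without Minitab. The Python sketch below uses `statistics.NormalDist` with the mean and standard deviation of Ex. 1; the variable names are illustrative.

```python
from statistics import NormalDist

# Current in the wire: normal with mean 10 mA and variance 4, so sigma = 2 mA.
current = NormalDist(mu=10, sigma=2)

z = (13 - 10) / 2             # standardized value, 1.5
p_below = current.cdf(13)     # P(X <= 13), the quantity Minitab reports
p_exceed = 1 - p_below        # P(X > 13)

print(f"z = {z}, P(X <= 13) = {p_below:.6f}, P(X > 13) = {p_exceed:.5f}")
```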

19

Confidence Interval (CI)

20


Sampling variability: the sample mean $\bar{x}$ varies from sample to sample

Interval estimate for a population parameter: confidence interval

A CI is constructed so that we have high confidence that it contains the unknown population parameter

If $\bar{x}$ is the sample mean of a random sample of size n from a normal population with known variance σ², a 100(1-α)% CI on μ is given by

$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$

where $z_{\alpha/2}$ is the upper 100α/2 percentage point of the standard normal distribution

21

Ex. 2 Ten measurements of impact energy (J) on specimens of A238 steel cut at 60℃ are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2 and 64.3. Assume that impact energy is normally distributed with σ = 1 J. We want to find a 95% CI for μ, the mean impact energy.

$\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$

$z_{\alpha/2} = z_{0.025} = 1.96$

n = 10, σ = 1

$\bar{x} = 64.46$

$64.46 - 1.96\,\frac{1}{\sqrt{10}} \le \mu \le 64.46 + 1.96\,\frac{1}{\sqrt{10}}$

$63.84 \le \mu \le 65.08$
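The interval of Ex. 2 can be recomputed with a few lines of Python. This is a minimal sketch: σ = 1 is the known standard deviation assumed in the example, and the critical value z_{α/2} is taken from `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist, mean

# Impact energy measurements (J) from Ex. 2.
energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
sigma = 1.0    # known population standard deviation (assumed in Ex. 2)
alpha = 0.05   # for a 95% confidence interval

n = len(energy)
x_bar = mean(energy)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 point, about 1.96
half_width = z_crit * sigma / sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

print(f"x_bar = {x_bar:.2f}, z = {z_crit:.2f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```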

22

(Example 2)

Stat -> Basic Stat -> 1 sample Z

C1: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3

One-Sample Z: C1

The assumed standard deviation = 1

Variable   N    Mean      StDev    SE Mean   95% CI
C1         10   64.4600   0.2271   0.3162    (63.8402, 65.0798)

23


If $\bar{x}$ and s are the mean and standard deviation of a random sample from a normal population with unknown variance σ², a 100(1-α)% CI on μ is given by

$\bar{x} - t_{\alpha/2,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\, s/\sqrt{n}$

where $t_{\alpha/2,n-1}$ is the upper 100α/2 percentage point of the t distribution with n-1 degrees of freedom

24

Ex. 3 An article describes the results of tensile adhesion tests on 22 U-700 alloy specimens. The load at specimen failure is as follows (in megapascals):

19.8 10.1 14.9 7.5 15.4 15.4 15.4 18.5 7.9 12.7 11.9 11.4 11.4 14.1 17.6 16.7 15.8 19.5 8.8 13.6 11.9 11.4

We want to find a 95% CI for μ

$\bar{x} = 13.71,\ s = 3.55$

n = 22, n-1 = 21, $t_{0.025,21} = 2.080$

$\bar{x} - t_{\alpha/2,n-1}\, s/\sqrt{n} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\, s/\sqrt{n}$

$13.71 - 2.080(3.55)/\sqrt{22} \le \mu \le 13.71 + 2.080(3.55)/\sqrt{22}$

$13.71 - 1.57 \le \mu \le 13.71 + 1.57$

$12.14 \le \mu \le 15.28$
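The t interval of Ex. 3 can be reproduced with the Python sketch below. The Python standard library has no t distribution, so the critical value t_{0.025,21} = 2.080 is hard-coded from the example; with another sample size it would have to be looked up again.

```python
from math import sqrt
from statistics import mean, stdev

# Load at specimen failure (megapascals) from Ex. 3.
load = [19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9, 12.7, 11.9,
        11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5, 8.8, 13.6, 11.9, 11.4]

t_crit = 2.080   # t_{0.025,21}, table value quoted in Ex. 3 (not computed here)

n = len(load)
x_bar = mean(load)
s = stdev(load)              # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)             # standard error of the mean
ci = (x_bar - t_crit * se, x_bar + t_crit * se)

print(f"n = {n}, x_bar = {x_bar:.4f}, s = {s:.4f}, SE = {se:.4f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Carrying full precision gives an upper limit near 15.29, while the hand calculation with rounded inputs gives 15.28.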

25

Variable   N    Mean      StDev    SE Mean   95% CI
C1         22   13.7136   3.5536   0.7576    (12.1381, 15.2892)

[Figure: Boxplot of C1 (vertical axis: C1, 6 to 20)]

[Figure: Probability Plot of C1, Normal (x-axis: C1, y-axis: Percent; P-Value 0.838)]

26

Hypothesis Test

27

Hypothesis Test

We illustrated how to construct a confidence interval estimate of a parameter from sample data

Many problems in engineering require that we decide whether to accept or reject a statement about some parameter : Hypothesis

Decision-making procedure about the hypothesis : hypothesis testing

Hypothesis testing and CI estimation of parameters : Data analysis stage of a comparative experiment

28

Tensile adhesion tests on 22 U-700 alloy specimens (Example 3)

We are interested in deciding whether or not the mean tensile adhesion is 14 megapascals

H0: μ = 14 megapascals    Null hypothesis

H1: μ ≠ 14 megapascals    Alternative hypothesis

H1: μ ≠ 14    Two-sided alternative hypothesis

H1: μ < 14 or H1: μ > 14    One-sided alternative hypothesis

29

Probability of making a type I error: significance level, (α-error)

α=0.05, 0.01 (confidence level : 95.0, 99.0)

α = P(type I error) = P(reject H0 when H0 is true)

β = P(type II error) = P(fail to reject H0 when H0 is false)

30

Hypotheses Tests for a Single Sample

MiniTab

Stat - Basic statistics - 1t

Test mean = 15

Option

Confidence level: 95.0, Alternative: not equal

One-Sample T: C1

Test of mu = 15 vs not = 15

Variable   N    Mean      StDev    SE Mean   95% CI               T       P
C1         22   13.7136   3.5536   0.7576    (12.1381, 15.2892)   -1.70   0.104
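The T value in the one-sample output is (x̄ - μ0) / (s/√n). The Python sketch below recomputes it for the Example 3 data with μ0 = 15, the value tested in the Minitab run. The standard library has no t distribution, so the sketch does not compute the P-value; it compares |T| with the table value t_{0.025,21} = 2.080 quoted in Example 3 instead.

```python
from math import sqrt
from statistics import mean, stdev

# Load at specimen failure (megapascals), Example 3 data.
load = [19.8, 10.1, 14.9, 7.5, 15.4, 15.4, 15.4, 18.5, 7.9, 12.7, 11.9,
        11.4, 11.4, 14.1, 17.6, 16.7, 15.8, 19.5, 8.8, 13.6, 11.9, 11.4]

mu0 = 15.0       # hypothesized mean used in the Minitab run
t_crit = 2.080   # t_{0.025,21}, table value for a two-sided test at alpha = 0.05

n = len(load)
se = stdev(load) / sqrt(n)        # standard error of the mean
t_stat = (mean(load) - mu0) / se  # one-sample t statistic
reject_h0 = abs(t_stat) > t_crit  # two-sided decision rule

print(f"T = {t_stat:.2f}, reject H0 at alpha = 0.05: {reject_h0}")
```

Because |T| is below the critical value, H0 is not rejected at α = 0.05, which agrees with the reported P-value of 0.104 and with 15 lying inside the 95% CI.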

31

Hypotheses Tests for Two Samples

Table. Catalyst Yield Data

Number    Catalyst 1   Catalyst 2
1         91.50        89.19
2         94.18        90.95
3         92.18        90.46
4         95.39        93.21
5         91.79        97.19
6         89.07        97.04
7         94.72        91.07
8         89.21        92.75
Average   92.255       92.733
s         2.39         2.98

[Figure: Boxplot of Catalyst 1, Catalyst 2 (vertical axis: Data, 89 to 98)]

[Figure: Probability Plot of Catalyst 1, Catalyst 2, Normal (x-axis: Data, y-axis: Percent)]

Variable     Mean    StDev   N   AD      P
Catalyst 1   92.26   2.385   8   0.292   0.516
Catalyst 2   92.73   2.983   8   0.454   0.194

32

MiniTab


Stat - Basic statistics - 2t (2-sample t)

Sample in different columns

Option

- Confidence level: 95.0, - Alternative: not equal

Two-Sample T-Test and CI: Catalyst 1, Catalyst 2

Two-sample T for Catalyst 1 vs Catalyst 2

             N   Mean    StDev   SE Mean
Catalyst 1   8   92.26   2.39    0.84
Catalyst 2   8   92.73   2.98    1.1

Difference = mu (Catalyst 1) - mu (Catalyst 2)
Estimate for difference: -0.477500
95% CI for difference: (-3.394928, 2.439928)
T-Test of difference = 0 (vs not =): T-Value = -0.35  P-Value = 0.729  DF = 13
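The two-sample output for the catalyst data can be approximated with the Python sketch below. It assumes the unequal-variance (Welch) form of the test, which is consistent with the reported DF = 13 for two samples of size 8. The critical value t_{0.025,13} = 2.160 is a standard table value supplied here as an assumption, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Catalyst yield data from the table.
cat1 = [91.50, 94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21]
cat2 = [89.19, 90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75]

n1, n2 = len(cat1), len(cat2)
v1 = variance(cat1) / n1      # s1^2 / n1
v2 = variance(cat2) / n2      # s2^2 / n2

diff = mean(cat1) - mean(cat2)   # estimate for mu1 - mu2
se = sqrt(v1 + v2)               # standard error of the difference
t_stat = diff / se

# Welch-Satterthwaite degrees of freedom, rounded down to an integer.
df = int((v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)))

t_crit = 2.160   # t_{0.025,13}, table value (assumption, not from the slides)
ci = (diff - t_crit * se, diff + t_crit * se)

print(f"difference = {diff:.4f}, T = {t_stat:.2f}, DF = {df}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The interval contains 0 and |T| is far below the critical value, so the data give no evidence of a difference in mean yield between the two catalysts.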

33

Hypotheses Tests for Two Paired Samples

Ex. 5 Data for Hardness Testing Experiment

Specimen   Tip1   Tip2
1          7      6
2          3      3
3          3      5
4          4      3
5          8      8
6          3      2
7          2      4
8          9      9
9          5      4
10         4      5

[Figure: Boxplot of Tip1, Tip2 (vertical axis: Data, 2 to 9)]

[Figure: Probability Plot of Tip1, Tip2, Normal (x-axis: Data, y-axis: Percent)]

Variable   Mean   StDev   N    AD      P
Tip1       4.8    2.394   10   0.542   0.120
Tip2       4.9    2.234   10   0.337   0.425

34

MiniTab


Stat-Basic statistics t-t (paired t)

Sample in different columns

Option

-Confidence level:95.0, - Alternative: not equal

35

Hypotheses Tests for Two Paired Samples

Paired T for Tip1 - Tip2

             N    Mean        StDev      SE Mean
Tip1         10   4.80000     2.39444    0.75719
Tip2         10   4.90000     2.23358    0.70632
Difference   10   -0.100000   1.197219   0.378594

95% CI for mean difference: (-0.956439, 0.756439)
T-Test of mean difference = 0 (vs not = 0): T-Value = -0.26  P-Value = 0.798

Two-sample T for Tip1 vs Tip2

       N    Mean   StDev   SE Mean
Tip1   10   4.80   2.39    0.76
Tip2   10   4.90   2.23    0.71

Difference = mu (Tip1) - mu (Tip2)
Estimate for difference: -0.100000
95% CI for difference: (-2.284675, 2.084675)
T-Test of difference = 0 (vs not =): T-Value = -0.10  P-Value = 0.924  DF = 17
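The contrast between the paired and the two-sample analysis of the hardness data can be sketched in Python. The paired test works on the within-specimen differences, which removes specimen-to-specimen variability from the standard error. The critical value t_{0.025,9} = 2.262 is a standard table value supplied here as an assumption, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hardness readings for the two tips on the same 10 specimens.
tip1 = [7, 3, 3, 4, 8, 3, 2, 9, 5, 4]
tip2 = [6, 3, 5, 3, 8, 2, 4, 9, 4, 5]
n = len(tip1)

# Paired analysis: one difference per specimen.
d = [a - b for a, b in zip(tip1, tip2)]
d_bar = mean(d)
se_paired = stdev(d) / sqrt(n)
t_paired = d_bar / se_paired

t_crit = 2.262   # t_{0.025,9}, table value (assumption, not from the slides)
ci_paired = (d_bar - t_crit * se_paired, d_bar + t_crit * se_paired)

# Two-sample analysis of the same data, ignoring the pairing.
se_unpaired = sqrt(variance(tip1) / n + variance(tip2) / n)
t_unpaired = (mean(tip1) - mean(tip2)) / se_unpaired

print(f"paired:     SE = {se_paired:.4f}, T = {t_paired:.2f}")
print(f"two-sample: SE = {se_unpaired:.4f}, T = {t_unpaired:.2f}")
print(f"paired 95% CI: ({ci_paired[0]:.3f}, {ci_paired[1]:.3f})")
```

The standard error of the paired analysis is well under half that of the two-sample analysis, which is why the paired confidence interval in the Minitab output is so much narrower.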
