statistics and spreadsheets harris chapter 4mtweb.mtsu.edu/nchong/statistics-harris.pdf ·...
Post on 12-Mar-2018
Statistics and Spreadsheets Harris Chapter 4
• Gaussian Distribution
• Confidence Intervals
• Student’s t-Tests
• Q-test
• Control Charts
• Spreadsheets
Gaussian Distribution (random!)
• Mean Value:
– The arithmetic “average”
– For a set of data, the closer your mean is to the true value, the more accurate your results are!

$$\bar{x} = \frac{\sum_{i} x_i}{n}$$
Standard Deviation (reproducibility)
• Standard deviation rests on the assumption that errors are the result of RANDOM events.
• It is based on the shape and distribution of the Gaussian Curve
• A smaller standard deviation means that your results are more reproducible (they don’t vary as much from measurement to measurement).
The Gaussian Curve
• Plotting of random events
• Defines standard deviation
• Has a mathematical definition (formula for the curve)
• Discussed in more detail in the text
% of Events Affected by Random Error that Occur Within a Given # of Standard Deviations from the Mean
+/- 1 STD DEV: 68.3 %
+/- 2 STD DEV: 95.5 %
+/- 3 STD DEV: 99.7 %
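The coverage percentages for the Gaussian curve can be checked numerically. The sketch below is only an illustration, assuming an ideal Gaussian distribution; the helper name `fraction_within` is made up for this example. It uses the standard relation that the fraction of events within +/- k standard deviations of the mean is erf(k/√2).

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of an ideal Gaussian population within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Print the coverage for 1, 2 and 3 standard deviations
for k in (1, 2, 3):
    print(f"+/- {k} STD DEV: {100 * fraction_within(k):.2f} %")
```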
Calculating a STD DEV (by hand)
• Based on the difference between each value and the mean.
• Also based on the degrees of freedom
– Number of measurements minus one
– n-1

$$s = \sqrt{\frac{\sum_{i} (x_i - \bar{x})^2}{n-1}}$$
Let’s do it manually once, together!
• M&M’s Results Handed Out to All!
• Calculate mean and standard deviation
• Set up a simple table
• Use table to keep track of the squared terms!
• LEARN TO DO THIS USING YOUR CALCULATOR AND MS EXCEL (STDEV is the correct function)
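The table-of-squared-terms approach can be sketched in a few lines of Python. The M&M class data are not part of this transcript, so the example borrows the five zinc results from the confidence interval slide as stand-in values. Python's `statistics.stdev` divides by n-1, so it serves as a cross-check in the same way the spreadsheet STDEV function would.

```python
import math
import statistics

# Stand-in data: the five [Zn] results (ppm) from the confidence interval example
values = [1.20, 1.40, 1.50, 1.10, 1.10]

n = len(values)
mean = sum(values) / n

# Build the table: each value, its deviation from the mean, and the squared deviation
print(f"{'x':>6} {'x - mean':>10} {'(x - mean)^2':>14}")
sum_sq = 0.0
for x in values:
    dev = x - mean
    sum_sq += dev ** 2
    print(f"{x:6.2f} {dev:10.2f} {dev ** 2:14.4f}")

# Divide the sum of squares by the degrees of freedom (n - 1), then take the square root
s = math.sqrt(sum_sq / (n - 1))
print(f"mean = {mean:.2f}, sum of squares = {sum_sq:.4f}, s = {s:.4f}")

# Cross-check against the library routine, which also uses n - 1
print(f"statistics.stdev = {statistics.stdev(values):.4f}")
```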
Confidence Intervals
How Certain Are You?????
• Confidence intervals allow us to calculate a range of values in which we can be confident, at some level, that the “true” value lies
• Originally based on the growth of yeast in beer!
• One of the most important tools in evaluating data!
• Back to Elementary School: draw a number line to see how this works!
Calculating a Confidence Interval
• Determine the Mean
• Determine the Standard Deviation
• Determine the degrees of freedom (n-1)
• Decide how confident you want to be in your data (80%, 90%, 95%, etc.)
• Calculate using appropriate formula.

$$\mu = \bar{x} \pm \frac{t \times s}{\sqrt{n}}$$

t is the value of Student’s t from a t-table (Figure 4-20
n is the # of observations
s is the standard deviation
Confidence Interval Calculation: John C. Schaumloffel
Calculate the [Zn] at the 95% confidence interval
u = mean +/- (t x s)/(n^0.5)

[Zn] ppm
1.20
1.40
1.50
1.10
1.10

1.26     mean
0.1817   STDEV
5        n
4        n-1 (degrees of freedom)
2.776    t-value, n=5, 95% confident (Harris Table 4-2)
0.2255   is the range of the confidence interval (the +/- value)

Confidence Interval = 1.26 +/- 0.23 ppm Zn
Therefore, we are 95% confident that the "true" value for the concentration of Zinc is between 1.03 and 1.49 ppm.
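The zinc calculation can be reproduced with a short script. This is a sketch rather than the spreadsheet itself. The t-value is typed in by hand from the slide (2.776 for 4 degrees of freedom at 95%) because the Python standard library has no t-table lookup.

```python
import math
import statistics

zn_ppm = [1.20, 1.40, 1.50, 1.10, 1.10]  # [Zn] results from the slide, in ppm
t_95 = 2.776  # Student's t, 4 degrees of freedom, 95% confidence (value quoted on the slide)

n = len(zn_ppm)
mean = statistics.mean(zn_ppm)
s = statistics.stdev(zn_ppm)          # sample standard deviation (n - 1)
half_width = t_95 * s / math.sqrt(n)  # the +/- value of the confidence interval

print(f"mean = {mean:.2f} ppm, s = {s:.4f}, +/- = {half_width:.4f}")
print(f"95% confidence interval: {mean - half_width:.2f} to {mean + half_width:.2f} ppm Zn")
```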
Comparison of Means w/Student’s t
• We can compare two sets of data to determine how confident we are that they are either
– Statistically similar
– Statistically different
• This is ONLY a statistical test; you can also rely on
– Your intuition as a chemist
– Your practical experience
• But, statistical tests are what win in court!
• We will concentrate on Harris’ “Case Two”
– A quantity is measured multiple times by two different techniques. Each technique gives a mean and standard deviation for the quantity. Are these similar?
• Steps….
– Calculate a pooled standard deviation
– Calculate a t-value using the pooled standard deviation
– Compare the tcalculated to the correct t-value from the table (ttable)
– If tcalc > ttable, the results are statistically different
– If tcalc < ttable, the results are statistically similar
Are the [Pu] in the contaminated soil samples from Chemist #1 and Chemist #2 statistically different?
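The pooled-standard-deviation comparison can be sketched as follows. The actual [Pu] results from the slide are not in this transcript, so the two data sets below are hypothetical, and the function name `pooled_t` is made up for illustration. The sketch assumes the usual pooled form: s_pooled = sqrt(((n1-1)s1² + (n2-1)s2²)/(n1+n2-2)) and tcalc = |mean1 - mean2|/s_pooled × sqrt(n1·n2/(n1+n2)). The table value is for n1+n2-2 degrees of freedom.

```python
import math
import statistics

# Hypothetical replicate results; the real [Pu] data are not in the transcript
chemist_1 = [10.1, 10.3, 10.2, 10.4, 10.0]
chemist_2 = [10.6, 10.8, 10.7, 10.9, 10.5]


def pooled_t(a, b):
    """Return (pooled standard deviation, calculated t) for two sets of replicates."""
    n1, n2 = len(a), len(b)
    s1, s2 = statistics.stdev(a), statistics.stdev(b)
    s_pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    diff = abs(statistics.mean(a) - statistics.mean(b))
    t_calc = diff / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return s_pooled, t_calc


s_pooled, t_calc = pooled_t(chemist_1, chemist_2)
t_table = 2.306  # Student's t at 95% confidence for 5 + 5 - 2 = 8 degrees of freedom

verdict = "statistically different" if t_calc > t_table else "statistically similar"
print(f"s_pooled = {s_pooled:.4f}, t_calc = {t_calc:.2f}, t_table = {t_table}: {verdict}")
```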
Q-test to Eliminate Outliers
• Used when you have a set of data with one or more suspect values (“out of whack”)
• A statistical test you can use to provide evidence to eliminate an outlier from the data set
• ONLY a statistical test….
Are any of the soil [Pu] values outliers? Let’s check using the Q-test.
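A minimal sketch of the Q-test follows, where Q = gap / range: the gap is the distance from the suspect value to its nearest neighbor, and the range is the total spread of the data. The soil [Pu] values from the slide are not in this transcript, so the data are hypothetical, and `q_test` is a made-up helper name. The critical values are the commonly tabulated 90% confidence values; treat them as an assumption and check them against the table in your text.

```python
# Hypothetical [Pu] results; the slide's real values are not in the transcript
pu_values = [8.2, 8.4, 8.3, 8.5, 11.0]

# Commonly tabulated critical Q values at 90% confidence (assumption: verify against your text)
Q_TABLE_90 = {4: 0.76, 5: 0.64, 6: 0.56, 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}


def q_test(values):
    """Return (suspect value, Q calculated) for the more extreme end of the data."""
    ordered = sorted(values)
    data_range = ordered[-1] - ordered[0]
    gap_low = ordered[1] - ordered[0]     # gap between lowest value and its neighbor
    gap_high = ordered[-1] - ordered[-2]  # gap between highest value and its neighbor
    if gap_high >= gap_low:
        return ordered[-1], gap_high / data_range
    return ordered[0], gap_low / data_range


suspect, q_calc = q_test(pu_values)
q_table = Q_TABLE_90[len(pu_values)]

# If Q calculated exceeds Q table, the suspect value may be discarded
verdict = "may be rejected" if q_calc > q_table else "should be retained"
print(f"suspect = {suspect}, Q_calc = {q_calc:.2f}, Q_table = {q_table}: {verdict}")
```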
Control Charts
• A graph showing the mean value for a result collected over a period of time
• Ranges for +/- 1, 2, 3 or more standard deviations are shown on the graph
• Used to visually see if data are falling out of a range which would be defined by RANDOM error
– Instrumental Fluctuations
– Standards or Samples Degrading
– Instrument Operator Changing….
• In most regulatory and industrial settings, the mean +/- 2 STDEV is considered acceptable
– Warning limit
• Outside of +/- 3 STDEV is considered the action limit
– You must correct the situation in this case…..
• Usually, repeated analysis of a known standard is used to develop a control chart.
[Hg] in Quality Control Sample….
MEAN   0.143
STDEV  0.068969
UWL/LWL = upper/lower warning limit (mean +/- 2 STDEV); UAL/LAL = upper/lower action limit (mean +/- 3 STDEV)

Day   [Hg] ppb   UWL        LWL        UAL        LAL
1     0.1        0.280937   0.005063   0.349906   -0.06391
2     0.12       0.280937   0.005063   0.349906   -0.06391
3     0.12       0.280937   0.005063   0.349906   -0.06391
4     0.13       0.280937   0.005063   0.349906   -0.06391
5     0.08       0.280937   0.005063   0.349906   -0.06391
6     0.09       0.280937   0.005063   0.349906   -0.06391
7     0.11       0.280937   0.005063   0.349906   -0.06391
8     0.17       0.280937   0.005063   0.349906   -0.06391
9     0.2        0.280937   0.005063   0.349906   -0.06391
10    0.31       0.280937   0.005063   0.349906   -0.06391
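The limits in the table can be recomputed from the ten daily results. The sketch below takes the warning limits as mean +/- 2 STDEV and the action limits as mean +/- 3 STDEV, which is what the tabulated UWL/LWL and UAL/LAL values correspond to. It then lists the days that fall outside each band. The variable names are illustrative only.

```python
import statistics

# Daily [Hg] results (ppb) for the quality control sample, days 1 to 10
hg_ppb = [0.10, 0.12, 0.12, 0.13, 0.08, 0.09, 0.11, 0.17, 0.20, 0.31]

mean = statistics.mean(hg_ppb)
s = statistics.stdev(hg_ppb)  # sample standard deviation (n - 1)

uwl, lwl = mean + 2 * s, mean - 2 * s  # upper/lower warning limits
ual, lal = mean + 3 * s, mean - 3 * s  # upper/lower action limits

# Days (numbered from 1) whose result falls outside each band
warning_days = [day for day, x in enumerate(hg_ppb, start=1) if not lwl <= x <= uwl]
action_days = [day for day, x in enumerate(hg_ppb, start=1) if not lal <= x <= ual]

print(f"mean = {mean:.3f}, s = {s:.6f}")
print(f"UWL = {uwl:.6f}, LWL = {lwl:.6f}, UAL = {ual:.6f}, LAL = {lal:.6f}")
print(f"outside warning limits: days {warning_days}")
print(f"outside action limits: days {action_days}")
```

With these data only day 10 (0.31 ppb) falls outside the warning limits, and no day falls outside the action limits.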
[Figure: [Hg] Control Chart (spectrophotometry). y-axis: [Hg] (ng/mL); x-axis: Analysis Day]