rr anova 38
August 2012
This month's newsletter is the first in a multi-part series on using
the ANOVA method for a Gage R&R study. This
method simply uses analysis of variance to analyze the results of
a gage R&R study instead of the classical average and range
method. The two methods do not generate the same results, but they will (in most
cases) be similar.
This newsletter focuses on part of the ANOVA table and how it is developed for the
Gage R&R study. In particular it focuses on the sum of squares and degrees of
freedom. Many people do not understand how the calculations work and the
information that is contained in the sum of squares and the degrees of freedom. In
the next few issues, we will put together the rest of the ANOVA table and complete
the Gage R&R calculations.
In this issue:
Sources of Variation
Example Data
The ANOVA Table for Gage R&R
The ANOVA Results
Total Sum of Squares and Degrees of Freedom
Operator Sum of Squares and Degrees of Freedom
Parts Sum of Squares and Degrees of Freedom
Equipment (Within) Sum of Squares and Degrees of Freedom
Interaction Sum of Squares and Degrees of Freedom
Summary
Any gage R&R study is a study of variation. This means you have to have variation
in the results. On occasion, I get a phone call from a customer wondering why their
Gage R&R study is not giving them any useful information. And, in looking at the
results, I discover that each result is the same - for each part and for each
operator. There is no variation. I am asked - Isn't it good that there is no variation in
the results? No, not in a gage R&R study. It means that the measurement process
cannot tell the difference between the samples. So remember, a gage R&R study
is a study in variation - this means that there must be variation.
If you are not familiar with how to conduct a Gage R&R study, please see our
December 2007 newsletter. This newsletter also includes how to analyze the
results using the average and range method.
As usual, please feel free to leave comments at the end of the newsletter.
Sources of Variation
Suppose you are monitoring a process by pulling samples
of the product at some regular interval and measuring one
critical quality characteristic (X). Obviously, you will not
always get the same result when you measure X. Why not?
There are many sources of variation in the process.
However, these sources can be grouped into three
categories:
variation due to the process itself
variation due to sampling
variation due to the measurement system
These three components of variation are related by the following:
$$\sigma_t^2 = \sigma_p^2 + \sigma_s^2 + \sigma_{ms}^2$$
where σt² is the total variance; σp² is the process variance; σs² is the
sampling variance and σms² is the measurement system variance. Note that the
relationship is linear in terms of the variance (which is the square of the standard
deviation), not the standard deviation.
For our purposes here, we will ignore the variance due to sampling (or more
correctly, just include it as part of the process itself). However, for some processes,
sampling variation can greatly impact the results. Thus, we will consider the total
variance to be:
$$\sigma_t^2 = \sigma_p^2 + \sigma_{ms}^2$$
Remember geometry? The right triangle? The Pythagorean Theorem? The above
equation can be represented by the triangle below.
The total standard deviation, σt, for a measurement is equal to the length of the
hypotenuse. The process standard deviation, σp, is equal to the length of one side
of the triangle and the measurement system standard deviation, σms, is equal to the
length of the remaining side.
You can easily see from this triangle what happens as the variation in the product
and measurement system changes. If the product standard deviation is larger than
the measurement standard deviation, it will have the larger impact on the total
standard deviation. However, if the measurement standard deviation becomes too
large, it will begin to have the largest impact.
Thus, the objective of improving a measurement system is to minimize the %
variance due to the measurement system:
% Variance due to measurement system = 100(σms²/σt²)
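The triangle relationship and the % variance formula above can be sketched in a few lines of Python. The function names and the standard deviations used here (4 and 3) are hypothetical, chosen only to make the arithmetic easy to follow; they are not from any real study.

```python
import math


def total_sigma(sigma_p, sigma_ms):
    """Total standard deviation: the hypotenuse of the right triangle."""
    return math.hypot(sigma_p, sigma_ms)


def pct_variance_due_to_ms(sigma_p, sigma_ms):
    """% of the total variance that is due to the measurement system."""
    var_ms = sigma_ms ** 2
    var_total = sigma_p ** 2 + var_ms  # variances add, standard deviations do not
    return 100.0 * var_ms / var_total


# Hypothetical values for illustration only.
print(total_sigma(4.0, 3.0))             # 5.0
print(pct_variance_due_to_ms(4.0, 3.0))  # 36.0
```

Note that the measurement standard deviation is 60% of the total standard deviation in this made-up case, but only 36% of the total variance. The percentages only add to 100 when they are based on variances.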
The gage R&R study focuses on σms². In a gage R&R study, you can break down
σms² into its two components:
$$\sigma_{ms}^2 = \sigma_{\text{repeatability}}^2 + \sigma_{\text{reproducibility}}^2$$
Repeatability is the ability of the measurement system to repeat the same
measurements on the same sample under the same conditions. It represents an
assessment of the ability to get the same measurement result each time.
Reproducibility is the ability of the measurement system to return consistent
measurements while varying the measurement conditions (different operators,
different parts, etc.). It represents an assessment of the ability to reproduce the
measurement of other operators.
In this series, we will take a look at how the repeatability and reproducibility are
determined using the ANOVA method for Gage R&R.
Example Data
We will re-use the data from our December 2007 newsletter on the average and
range method for Gage R&R. In this example, there were three operators who
tested five parts three times. A picture of part of the Gage R&R design is shown
below.
Operator 1 will test 5 parts three times each. In the figure above, you can see that
Operator 1 has tested Part 1 three times. What are the sources of variation in
these three trials? It is the measurement equipment itself. The operator is the same
and the part is the same. The variation in these three trials is a measure of the
repeatability. It is also called the equipment variation in Gage R&R studies or the
"within" variation in ANOVA studies.
Operator 1 also runs Parts 2 through 5 three times each. The variation in those
results includes the variation due to the parts as well as the equipment variation.
Operators 2 and 3 also test the same 5 parts three times each. The variation in all
results includes the equipment variation, the part variation, the operator variation
and the interaction between operators and parts. The variation due to the operators
is the reproducibility.
The data from the December 2007 newsletter are shown in the table below.
| Operator | Part | Trial 1 | Trial 2 | Trial 3 |
|---|---|---|---|---|
| A | 1 | 3.29 | 3.41 | 3.64 |
| A | 2 | 2.44 | 2.32 | 2.42 |
| A | 3 | 4.34 | 4.17 | 4.27 |
| A | 4 | 3.47 | 3.5 | 3.64 |
| A | 5 | 2.2 | 2.08 | 2.16 |
| B | 1 | 3.08 | 3.25 | 3.07 |
| B | 2 | 2.53 | 1.78 | 2.32 |
| B | 3 | 4.19 | 3.94 | 4.34 |
| B | 4 | 3.01 | 4.03 | 3.2 |
| B | 5 | 2.44 | 1.8 | 1.72 |
| C | 1 | 3.04 | 2.89 | 2.85 |
| C | 2 | 1.62 | 1.87 | 2.04 |
| C | 3 | 3.88 | 4.09 | 3.67 |
| C | 4 | 3.14 | 3.2 | 3.11 |
| C | 5 | 1.54 | 1.93 | 1.55 |
The operator is listed in the first column and the part numbers in the second column.
The next three columns contain the results of the three trials for that operator and
part number. For example, the three trial results for Operator A and Part 1 are
3.29, 3.41 and 3.64.
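For readers who want to follow the calculations in code, one possible way to hold the data above is a Python dictionary keyed by operator. The name `DATA` and the nested-list layout are assumptions made for this sketch, not something the SPC for Excel software uses.

```python
# Results from the table above: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def to_records(data):
    """Flatten to (operator, part, trial, result) tuples, numbering from 1."""
    return [
        (op, part, trial, x)
        for op, parts in data.items()
        for part, trials in enumerate(parts, start=1)
        for trial, x in enumerate(trials, start=1)
    ]


records = to_records(DATA)
print(len(records))  # 3 operators x 5 parts x 3 trials
print(records[0])    # first result: Operator A, Part 1, Trial 1
```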
We will now take a look at the ANOVA table, which is used as a starting point for
analyzing the results.
The ANOVA Table for Gage R&R
In most cases, you will use computer software to do the calculations. Since this is a
relatively simple Gage R&R, we will show how the calculations are done. This
helps understand the process better. The software usually displays the results in
an ANOVA table. The basic ANOVA table is shown in the table below for the
following:
k = number of appraisers
r = number of replications
n = number of parts
The first column is the source of variability. Remember that a Gage R&R study is a
study of variation. There are five sources of variability in this ANOVA approach: the
operator, the part, the interaction between the operator and part, the equipment
and the total.
The second column is the degrees of freedom associated with the source of
variation. The degrees of freedom are simply the number of values of a statistic
that are free to vary. For example, suppose you have a sample that contains n
observations. We use the sample to estimate something - usually an average.
When we want to estimate something, it costs us one degree of freedom. So, if we
have n observations and want to estimate the average, then we have n - 1 degrees
of freedom left.
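The degrees of freedom for every row of the ANOVA table follow from k, n and r alone. The sketch below uses the standard formulas for a balanced crossed study (every operator measures every part r times); the function name is hypothetical.

```python
def gage_rr_degrees_of_freedom(k, n, r):
    """Degrees of freedom for a balanced Gage R&R ANOVA.

    k = number of operators, n = number of parts, r = number of replications.
    """
    return {
        "Operator": k - 1,
        "Part": n - 1,
        "Operator by Part": (k - 1) * (n - 1),
        "Equipment": n * k * (r - 1),
        "Total": n * k * r - 1,
    }


# Three operators, five parts, three trials, as in the example data.
df = gage_rr_degrees_of_freedom(k=3, n=5, r=3)
print(df)
```

The first four entries always add up to the total, which is a quick check on any ANOVA table.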
The third column is the sum of squares (SS) associated with the source of
variation. The sum of squares is a measure of variation. It measures the squared
deviations around an average. Remember what the equation for the variance is?
The variance of a set of n numbers is given by:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
The sum of squares for the source of variation is very similar to the numerator. You
just take the sum of squares around different averages depending on the source of
variation.
The fourth column is the mean square associated with the source of variation. The
mean square is the estimate of the variance for that source of variability based on
the amount of data we have (the degrees of freedom). So, the mean square is the
sum of squares divided by the degrees of freedom. Note the similarity to the
formula for the variance above.
The fifth column is the F value. This is the statistic that is calculated to determine if
the source of variability is statistically significant. It is the ratio of two variances (or
mean squares in this case).
The ANOVA Results
The data above were analyzed using the SPC for Excel software. The resulting
ANOVA table is shown below.

| Source | df | SS | MS | F | p Value |
|---|---|---|---|---|---|
| Operator | 2 | 1.630 | 0.815 | 100.322 | 0.0000 |
| Part | 4 | 28.909 | 7.227 | 889.458 | 0.0000 |
| Operator by Part | 8 | 0.065 | 0.008 | 0.142 | 0.9964 |
| Equipment | 30 | 1.712 | 0.057 | | |
| Total | 44 | 32.317 | | | |
Let's see where the numbers come from.
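Total Sum of Squares and Degrees of Freedom
The total sum of squares (SST) is the sum of the sums of squares for the other
sources of variability. So,
SST = SSO + SSP + SSO*P + SSE
The total sum of squares is the sum of the squared deviations of each individual
result from the overall average - the average of all results. The overall average of
the 45 results is:
$$\bar{X} = \frac{1}{nkr} \sum_{i=1}^{k} \sum_{j=1}^{n} \sum_{m=1}^{r} X_{ijm} = \frac{132.47}{45} = 2.9438$$
The total sum of squares is then given by:
$$SS_T = \sum_{i=1}^{k} \sum_{j=1}^{n} \sum_{m=1}^{r} (X_{ijm} - \bar{X})^2$$
where $X_{ijm}$ is the result for the ith operator running the jth part for the mth trial. This
equation is simply a fancy way of saying that you subtract the average from an
individual result and square that result. This is shown in the figure below for the
squared deviation of the first result. For that first result (Operator A, Part 1, Trial 1),
the squared deviation is (3.29 - 2.9438)² = 0.120.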
If you do this for each point and add up the results, you will obtain the following:
SST = 32.317
The calculations are shown in the table below.
| Operator | Part | Trial 1 | Trial 2 | Trial 3 | Squared Deviation Trial 1 | Squared Deviation Trial 2 | Squared Deviation Trial 3 |
|---|---|---|---|---|---|---|---|
| A | 1 | 3.29 | 3.41 | 3.64 | 0.120 | 0.217 | 0.485 |
| A | 2 | 2.44 | 2.32 | 2.42 | 0.254 | 0.389 | 0.274 |
| A | 3 | 4.34 | 4.17 | 4.27 | 1.949 | 1.504 | 1.759 |
| A | 4 | 3.47 | 3.5 | 3.64 | 0.277 | 0.309 | 0.485 |
| A | 5 | 2.2 | 2.08 | 2.16 | 0.553 | 0.746 | 0.614 |
| B | 1 | 3.08 | 3.25 | 3.07 | 0.019 | 0.094 | 0.016 |
| B | 2 | 2.53 | 1.78 | 2.32 | 0.171 | 1.354 | 0.389 |
| B | 3 | 4.19 | 3.94 | 4.34 | 1.553 | 0.992 | 1.949 |
| B | 4 | 3.01 | 4.03 | 3.2 | 0.004 | 1.180 | 0.066 |
| B | 5 | 2.44 | 1.8 | 1.72 | 0.254 | 1.308 | 1.498 |
| C | 1 | 3.04 | 2.89 | 2.85 | 0.009 | 0.003 | 0.009 |
| C | 2 | 1.62 | 1.87 | 2.04 | 1.752 | 1.153 | 0.817 |
| C | 3 | 3.88 | 4.09 | 3.67 | 0.877 | 1.314 | 0.527 |
| C | 4 | 3.14 | 3.2 | 3.11 | 0.039 | 0.066 | 0.028 |
| C | 5 | 1.54 | 1.93 | 1.55 | 1.971 | 1.028 | 1.943 |

Sum of all squared deviations = 32.317
There were a total of 45 results. We calculated the overall average for these
results. So the degrees of freedom associated with the total sum of squares are 45
- 1 = 44. This can also be calculated as nkr - 1.
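The total sum of squares calculation can be reproduced with a short Python sketch. The `DATA` dictionary and the function name are assumptions made for illustration; the numbers themselves are the 45 results from the example data.

```python
# Results from the example data: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def total_sum_of_squares(data):
    """Return (overall average, SST, total degrees of freedom)."""
    results = [x for parts in data.values() for trials in parts for x in trials]
    grand_mean = sum(results) / len(results)
    # Squared deviation of every individual result from the overall average.
    ss_total = sum((x - grand_mean) ** 2 for x in results)
    # One degree of freedom is used up by estimating the overall average.
    return grand_mean, ss_total, len(results) - 1


grand_mean, ss_total, df_total = total_sum_of_squares(DATA)
print(f"overall average = {grand_mean:.4f}")
print(f"SST = {ss_total:.3f}, df = {df_total}")  # should agree with SST = 32.317, df = 44
```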
Operator Sum of Squares and Degrees of Freedom
As mentioned before, you obtain the sum of squares by determining the squared
deviations between two numbers. With the operator source of variability, you will
obtain the squared deviations between the operator average and the overall
average. Algebraically, this is given by:
$$SS_O = nr \sum_{i=1}^{k} (\bar{X}_{i..} - \bar{X})^2$$
where nr represents the number of results for operator i, the "i.." subscript
means the average over all parts and trials for operator i, and $\bar{X}$ is the overall average.
In this example, n = 5 and r = 3, so there are 15 results for each operator. The
table below shows how the calculations are done:
| Operator | Part | Trial 1 | Trial 2 | Trial 3 | Operator Average | Squared Deviation for Operator |
|---|---|---|---|---|---|---|
| A | 1 | 3.29 | 3.41 | 3.64 | 3.1567 | 0.0453 |
| A | 2 | 2.44 | 2.32 | 2.42 | | |
| A | 3 | 4.34 | 4.17 | 4.27 | | |
| A | 4 | 3.47 | 3.5 | 3.64 | | |
| A | 5 | 2.2 | 2.08 | 2.16 | | |
| B | 1 | 3.08 | 3.25 | 3.07 | 2.9800 | 0.0013 |
| B | 2 | 2.53 | 1.78 | 2.32 | | |
| B | 3 | 4.19 | 3.94 | 4.34 | | |
| B | 4 | 3.01 | 4.03 | 3.2 | | |
| B | 5 | 2.44 | 1.8 | 1.72 | | |
| C | 1 | 3.04 | 2.89 | 2.85 | 2.6947 | 0.0621 |
| C | 2 | 1.62 | 1.87 | 2.04 | | |
| C | 3 | 3.88 | 4.09 | 3.67 | | |
| C | 4 | 3.14 | 3.2 | 3.11 | | |
| C | 5 | 1.54 | 1.93 | 1.55 | | |

Sum of squared deviations = 0.1087
15 x (sum of squared deviations) = 1.6304
Thus,
SSO = 1.6304
So, you can see that the sum of squares due to the operators is based on how the
operator averages deviate from the overall average. There are three operator
averages. Since we calculated the overall average, we lost one degree of freedom.
The degrees of freedom associated with the operators are 3 - 1 = 2, or k -1 = 2.
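The operator calculation can be sketched the same way. As before, `DATA` and the function name are assumptions for this sketch rather than anything taken from the software.

```python
# Results from the example data: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def operator_sum_of_squares(data):
    """Return (operator averages, SSO, operator degrees of freedom)."""
    results = [x for parts in data.values() for trials in parts for x in trials]
    grand_mean = sum(results) / len(results)

    operator_means = {}
    for op, parts in data.items():
        op_results = [x for trials in parts for x in trials]
        operator_means[op] = sum(op_results) / len(op_results)

    results_per_operator = len(results) // len(data)  # n * r = 15 here
    ss_operator = results_per_operator * sum(
        (m - grand_mean) ** 2 for m in operator_means.values()
    )
    return operator_means, ss_operator, len(data) - 1


means, ss_operator, df_operator = operator_sum_of_squares(DATA)
for op, m in means.items():
    print(f"Operator {op}: average = {m:.4f}")
print(f"SSO = {ss_operator:.4f}, df = {df_operator}")  # should agree with SSO = 1.6304, df = 2
```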
The variability chart below shows the results by operator by part. The horizontal
blue line is the average for the operator. The horizontal green line is the overall
average. The difference between those two lines is the deviation.
Parts Sum of Squares and Degrees of Freedom
The sum of squares due to the parts is calculated in the same manner as for the
operators, except that the averages you are focusing on are the part averages.
Algebraically, the equation for SSP is:
$$SS_P = kr \sum_{j=1}^{n} (\bar{X}_{.j.} - \bar{X})^2$$
where kr is the number of results for a given part (3 operators, 3 trials), the
subscript ".j." denotes the average of the results for part j across all operators and
trials, and $\bar{X}$ is the overall average.
The table below shows the calculations. The original data has been sorted by part.
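| Part | Trial 1 | Trial 2 | Trial 3 | Part Average | Squared Deviation for Part |
|---|---|---|---|---|---|
| 1 | 3.29 | 3.41 | 3.64 | 3.1689 | 0.0507 |
| 1 | 3.08 | 3.25 | 3.07 | | |
| 1 | 3.04 | 2.89 | 2.85 | | |
| 2 | 2.44 | 2.32 | 2.42 | 2.1489 | 0.6318 |
| 2 | 2.53 | 1.78 | 2.32 | | |
| 2 | 1.62 | 1.87 | 2.04 | | |
| 3 | 4.34 | 4.17 | 4.27 | 4.0989 | 1.3343 |
| 3 | 4.19 | 3.94 | 4.34 | | |
| 3 | 3.88 | 4.09 | 3.67 | | |
| 4 | 3.47 | 3.5 | 3.64 | 3.3667 | 0.1788 |
| 4 | 3.01 | 4.03 | 3.2 | | |
| 4 | 3.14 | 3.2 | 3.11 | | |
| 5 | 2.2 | 2.08 | 2.16 | 1.9356 | 1.0165 |
| 5 | 2.44 | 1.8 | 1.72 | | |
| 5 | 1.54 | 1.93 | 1.55 | | |

Sum of squared deviations = 3.2122
9 x (sum of squared deviations) = 28.9094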
Thus,
SSP = 28.9094
Again, you can see how the sum of squares due to parts is based on how the part
averages deviate from the overall average. There are five parts. Again, we
calculated the overall average, so one degree of freedom is lost. There are n - 1 =
5 - 1 = 4 degrees of freedom associated with the parts sum of squares.
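A matching sketch for the parts sum of squares is shown below. `DATA` and the function name are again assumptions for illustration only.

```python
# Results from the example data: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def part_sum_of_squares(data):
    """Return (part averages, SSP, part degrees of freedom)."""
    results = [x for parts in data.values() for trials in parts for x in trials]
    grand_mean = sum(results) / len(results)
    n_parts = len(next(iter(data.values())))

    part_means = []
    for j in range(n_parts):
        # All results for part j, across every operator and trial.
        part_results = [x for parts in data.values() for x in parts[j]]
        part_means.append(sum(part_results) / len(part_results))

    results_per_part = len(results) // n_parts  # k * r = 9 here
    ss_part = results_per_part * sum((m - grand_mean) ** 2 for m in part_means)
    return part_means, ss_part, n_parts - 1


part_means, ss_part, df_part = part_sum_of_squares(DATA)
print([round(m, 4) for m in part_means])
print(f"SSP = {ss_part:.4f}, df = {df_part}")  # should agree with SSP = 28.9094, df = 4
```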
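Equipment (Within) Sum of Squares and Degrees of Freedom
The equipment sum of squares uses the deviation of the three trials for a given part
and a given operator from the average for that part and operator. This can be
expressed as:
$$SS_E = \sum_{i=1}^{k} \sum_{j=1}^{n} \sum_{m=1}^{r} (X_{ijm} - \bar{X}_{ij.})^2$$
where the "ij." subscript denotes the average of the r trials for operator i and part j.
Doing this for all 15 operator and part combinations gives SSE = 1.712, the value in
the ANOVA table above. Each combination has r - 1 = 2 degrees of freedom (one is
lost to its own average), so there are nk(r - 1) = 30 degrees of freedom associated
with the equipment sum of squares.
The calculations are shown in the table below.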
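The equipment (within) calculation can be sketched as a loop over each operator and part combination. `DATA` and the function name are assumptions made for this sketch.

```python
# Results from the example data: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def equipment_sum_of_squares(data):
    """Return (SSE, equipment degrees of freedom)."""
    ss_equipment = 0.0
    df_equipment = 0
    for parts in data.values():
        for trials in parts:
            # Average of the repeated trials for one operator and one part.
            cell_mean = sum(trials) / len(trials)
            ss_equipment += sum((x - cell_mean) ** 2 for x in trials)
            # Each combination loses one degree of freedom to its own average.
            df_equipment += len(trials) - 1
    return ss_equipment, df_equipment


ss_equipment, df_equipment = equipment_sum_of_squares(DATA)
print(f"SSE = {ss_equipment:.3f}, df = {df_equipment}")  # should agree with SSE = 1.712, df = 30
```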
Interaction Sum of Squares and Degrees of Freedom
We will make use of the equality stated earlier to find the interaction sum of
squares. This equality was:
SST = SSO + SSP + SSO*P + SSE
SSO*P = SST - (SSO + SSP + SSE)
SSO*P = 32.317 - (1.630 + 28.909 + 1.712)
SSO*P = 0.065 (using the unrounded sums of squares)
The same equality holds for the degrees of freedom:
dfO*P = dfT - (dfO + dfP + dfE)
dfO*P = 44 - (2 + 4 + 30)
dfO*P = 8
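As a cross-check, the interaction sum of squares can be computed two ways: by subtraction, as above, and directly from the operator and part combination averages. The direct formula used here is the standard one for a balanced two-factor design and is not given in the newsletter; `DATA` and the function name are assumptions for this sketch.

```python
# Results from the example data: operator -> five parts -> three trials.
DATA = {
    "A": [[3.29, 3.41, 3.64], [2.44, 2.32, 2.42], [4.34, 4.17, 4.27], [3.47, 3.50, 3.64], [2.20, 2.08, 2.16]],
    "B": [[3.08, 3.25, 3.07], [2.53, 1.78, 2.32], [4.19, 3.94, 4.34], [3.01, 4.03, 3.20], [2.44, 1.80, 1.72]],
    "C": [[3.04, 2.89, 2.85], [1.62, 1.87, 2.04], [3.88, 4.09, 3.67], [3.14, 3.20, 3.11], [1.54, 1.93, 1.55]],
}


def interaction_sum_of_squares(data):
    """Return (SSO*P by subtraction, SSO*P computed directly, df)."""
    ops = list(data)
    k, n, r = len(ops), len(data[ops[0]]), len(data[ops[0]][0])
    results = [x for op in ops for trials in data[op] for x in trials]
    grand = sum(results) / len(results)
    op_mean = {op: sum(sum(t) for t in data[op]) / (n * r) for op in ops}
    part_mean = [sum(sum(data[op][j]) for op in ops) / (k * r) for j in range(n)]

    ss_total = sum((x - grand) ** 2 for x in results)
    ss_operator = n * r * sum((m - grand) ** 2 for m in op_mean.values())
    ss_part = k * r * sum((m - grand) ** 2 for m in part_mean)

    ss_equipment = 0.0
    ss_direct = 0.0
    for op in ops:
        for j in range(n):
            cell = sum(data[op][j]) / r
            ss_equipment += sum((x - cell) ** 2 for x in data[op][j])
            # What is left of the cell average after removing both main effects.
            ss_direct += r * (cell - op_mean[op] - part_mean[j] + grand) ** 2

    ss_by_subtraction = ss_total - (ss_operator + ss_part + ss_equipment)
    df = (n * k * r - 1) - ((k - 1) + (n - 1) + n * k * (r - 1))
    return ss_by_subtraction, ss_direct, df


by_subtraction, direct, df_interaction = interaction_sum_of_squares(DATA)
print(f"by subtraction = {by_subtraction:.4f}, direct = {direct:.4f}, df = {df_interaction}")
```

The two routes agree to within floating-point error, which is why the subtraction shortcut is safe for a balanced study like this one.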
Summary
This is the first of a multi-part series on using ANOVA to analyze a Gage R&R
study. It focused on providing a detailed explanation of how the calculations are
done for the sum of squares and degrees of freedom. We will finish out the ANOVA
table as well as complete the Gage R&R calculations in the coming issues.
September 2012
This month's newsletter is the second in a multi-part series on
using the ANOVA method for a Gage R&R study. This method simply uses
analysis of variance to analyze the results of a gage R&R study instead of the
classical average and range method. The two methods do not generate the same
results, but they will (in most cases) be similar. With the ANOVA method, we will
break down the variance into four components: parts, operators, interaction
between parts and operators and the repeatability error due to the measurement
system (or gage) itself.
The first part of this series focused on part of the ANOVA table. We took an in-
depth look at how the sum of squares and degrees of freedom were determined.
Many people do not understand how the calculations work and the information that
is contained in the sum of squares and the degrees of freedom. In this issue we will
complete the ANOVA table and show how to determine the % of total variance that
is due to the measurement system (the % GRR).
In this issue:
The Data
The ANOVA Table for Gage R&R
The ANOVA Table Results
Expected Mean Squares
The Variances of the Components
The % Gage R&R
Summary
As always, please feel free to leave a comment at the bottom of the newsletter.
The Data
We are using the data from our December 2007 newsletter on the average and
range method for Gage R&R. This newsletter also explains how to set up a gage
R&R study. In this example, there were three operators who tested five parts three
times. A partial picture of the Gage R&R design is shown below.
Operator 1 tested each of the 5 parts three times. In the figure above, you can see that
Operator 1 has tested Part 1 three times. What are the sources of variation in
these three trials? It is the measurement equipment itself. The operator is the same
and the part is the same. The variation in these three trials is a measure of the
repeatability. It is also called the equipment variation in Gage R&R studies or
just the “within” variation in ANOVA studies.
Operator 1 also runs Parts 2 through 5 three times each. The variation in those
results includes the variation due to the parts as well as the equipment variation.
Operator 2 and 3 also test the same 5 parts three times each. The variation in all
results includes the equipment variation, the part variation, the operator variation
and the interaction between operators and parts. An interaction can exist if the
operator and parts are not independent. The variation due to operators is called the
reproducibility. The data we are using are shown in the table below.
| Operator | Part | Trial 1 | Trial 2 | Trial 3 |
|---|---|---|---|---|
| A | 1 | 3.29 | 3.41 | 3.64 |
| A | 2 | 2.44 | 2.32 | 2.42 |
| A | 3 | 4.34 | 4.17 | 4.27 |
| A | 4 | 3.47 | 3.5 | 3.64 |
| A | 5 | 2.2 | 2.08 | 2.16 |
| B | 1 | 3.08 | 3.25 | 3.07 |
| B | 2 | 2.53 | 1.78 | 2.32 |
| B | 3 | 4.19 | 3.94 | 4.34 |
| B | 4 | 3.01 | 4.03 | 3.2 |
| B | 5 | 2.44 | 1.8 | 1.72 |
| C | 1 | 3.04 | 2.89 | 2.85 |
| C | 2 | 1.62 | 1.87 | 2.04 |
| C | 3 | 3.88 | 4.09 | 3.67 |
| C | 4 | 3.14 | 3.2 | 3.11 |
| C | 5 | 1.54 | 1.93 | 1.55 |
The ANOVA Table for Gage R&R
In most cases, you will use computer software to do the calculations. Since this is a
relatively simple Gage R&R, we will show how the calculations are done. This
helps understand the process better. The software usually displays the results in
an ANOVA table. The basic ANOVA table is shown in the table below, where
k = number of operators, r = number of replications, and n = number of parts.
The first column is the source of variability. Remember that a Gage R&R study is a
study of variation. There are five sources of variability in this ANOVA approach: the
operator, the part, the interaction between the operator and part, the equipment
and the total.
The second column is the degrees of freedom associated with the source of
variation. The third column is the sum of squares. The calculations with these two
columns were covered in the first part of this series.
The fourth column is the mean square associated with the source of variation. The
mean square is the estimate of the variance for that source of variability (not
necessarily by itself) based on the amount of data we have (the degrees of
freedom). So, the mean square is the sum of squares divided by the degrees of
freedom. We will use the mean square information to estimate the variance of
each source of variation – this is the key to analyzing the Gage R&R results.
The fifth column is the F value. This is the statistic that is calculated to determine if
the source of variability is statistically significant. It is based on the ratio of two
variances (or mean squares in this case).
The ANOVA Table Results
The data were analyzed using the SPC for Excel software. The results for the
ANOVA table are shown below.

| Source | df | SS | MS | F | p Value |
|---|---|---|---|---|---|
| Operator | 2 | 1.630 | 0.815 | 100.322 | 0.0000 |
| Part | 4 | 28.909 | 7.227 | 889.458 | 0.0000 |
| Operator by Part | 8 | 0.065 | 0.008 | 0.142 | 0.9964 |
| Equipment | 30 | 1.712 | 0.057 | | |
| Total | 44 | 32.317 | | | |
Note that there is an additional column in this output – the p values. This is the
column we want to examine first. If the p value is less than 0.05, it means that the
source of variation has a significant impact on the results. As you can see in the
table, the “operator by part” source is not significant. Its p value is 0.9964. Many
software packages contain an option to remove the interaction if the p value is
above a certain value – most often 0.25. In that case, the interaction is rolled into
the equipment variation. We will keep it in the calculations here – though it has little
impact since its mean square is so small.
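The mean square and F columns can be rebuilt from the sum of squares and degrees of freedom columns with the sketch below. The F values in the table are consistent with dividing the operator and part mean squares by the interaction mean square, and the interaction mean square by the equipment mean square; that choice of denominators is inferred from the reported numbers rather than stated in the newsletter. The function name is hypothetical, and because the inputs are the rounded table values, the F values come out slightly different from the table. The p values need the F distribution and are not computed here.

```python
def mean_squares_and_f(ss, df):
    """Return (mean squares, F values) from sums of squares and degrees of freedom."""
    ms = {source: ss[source] / df[source] for source in ss}
    f = {
        # Denominators inferred from the reported table values.
        "Operator": ms["Operator"] / ms["Operator by Part"],
        "Part": ms["Part"] / ms["Operator by Part"],
        "Operator by Part": ms["Operator by Part"] / ms["Equipment"],
    }
    return ms, f


# Rounded values copied from the ANOVA table above.
SS = {"Operator": 1.630, "Part": 28.909, "Operator by Part": 0.065, "Equipment": 1.712}
DF = {"Operator": 2, "Part": 4, "Operator by Part": 8, "Equipment": 30}

ms, f = mean_squares_and_f(SS, DF)
for source in SS:
    f_text = f"{f[source]:.3f}" if source in f else ""
    print(f"{source:<17} MS = {ms[source]:.3f}  F = {f_text}")
```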
The next column we want to look at is the mean square column. This column is an
estimate of the variance due to the source of variation. So,
MSOperators = 0.815
MSParts = 7.227
MSOperators*Parts = 0.008
MSEquipment = 0.057
You might be tempted to assume, for example, that the variance due to the
operators is 0.815. However, this would be wrong. There are other sources of
variation present in all but one of these variances. We must use the Expected
Mean Square to find out what other sources of variation are present. We will use σ²
to denote a variance due to a single source.
Expected Mean Squares
As stated above, the mean square column contains a variance that is related to the
source of variation in the first column. To find the variance of each source of
variation, we have to use the expected mean square (EMS). The expected mean
square represents the variance that the mean square column is estimating.
There are algorithms that allow you to generate the expected mean squares. This
is beyond the scope of this newsletter. So, we will just present the expected mean
squares.
Let’s start at the bottom with the equipment variation. This is really the within
variation (also called error). It is the repeatability portion of the Gage R&R study.
The expected mean square for equipment is the repeatability variance. The
repeatability variance is the mean square of the equipment from the ANOVA table.
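Now consider the interaction expected mean square, which is given by:
$$EMS_{\text{Operators*Parts}} = \sigma^2_{\text{Repeatability}} + r\,\sigma^2_{\text{Operators*Parts}}$$
where r is the number of replications. Note that the EMS for the interaction term
contains the repeatability variance as well as the variance of the interaction between
the operators and parts. This is what is estimated by the mean square of the
interaction. The parts expected mean square is shown below.
$$EMS_{\text{Parts}} = \sigma^2_{\text{Repeatability}} + r\,\sigma^2_{\text{Operators*Parts}} + kr\,\sigma^2_{\text{Parts}}$$
where k is the number of operators. Note that the EMS for parts contains the
variances for repeatability, the interaction and parts. This is what is estimated by the
mean square for parts. And last, the expected mean square for the operators is
given by:
$$EMS_{\text{Operators}} = \sigma^2_{\text{Repeatability}} + r\,\sigma^2_{\text{Operators*Parts}} + nr\,\sigma^2_{\text{Operators}}$$
where n is the number of parts. The EMS for operators contains the variances for
repeatability, the interaction and operators. This is what the mean square for
operators is estimating.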
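The Variances of the Components
We can solve the above equations for each individual σ². Repeatability is already
related directly to the mean square for equipment so we don’t need to do anything
there. The other three can be solved as follows:
$$\sigma^2_{\text{Operators*Parts}} = \frac{MS_{\text{Operators*Parts}} - MS_{\text{Equipment}}}{r}$$
$$\sigma^2_{\text{Parts}} = \frac{MS_{\text{Parts}} - MS_{\text{Operators*Parts}}}{kr}$$
$$\sigma^2_{\text{Operators}} = \frac{MS_{\text{Operators}} - MS_{\text{Operators*Parts}}}{nr}$$
We can now do the calculations to estimate each of the variances.
$$\sigma^2_{\text{Repeatability}} = MS_{\text{Equipment}} = 0.0571$$
$$\sigma^2_{\text{Operators*Parts}} = \frac{0.008 - 0.057}{3} = -0.0163$$
$$\sigma^2_{\text{Parts}} = \frac{7.227 - 0.008}{(3)(3)} = 0.8021$$
$$\sigma^2_{\text{Operators}} = \frac{0.815 - 0.008}{(5)(3)} = 0.0538$$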
Note that the value of the variance for the interaction between the operators and
parts is actually negative. If this happens, the variance is simply set to zero.
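The step from mean squares to variance components can be sketched as follows. The function name is hypothetical, the inputs are the rounded mean squares from the ANOVA table, and the rule of setting a negative estimate to zero is applied to every component here, although the text only mentions it for the interaction.

```python
def variance_components(ms_operator, ms_part, ms_interaction, ms_equipment, k, n, r):
    """Estimate each variance from the mean squares of a balanced Gage R&R ANOVA.

    k = number of operators, n = number of parts, r = number of replications.
    """
    def floor_at_zero(value):
        # A variance cannot be negative, so a negative estimate is set to zero.
        return max(0.0, value)

    return {
        "Equipment (Repeatability)": ms_equipment,
        "Operators (Reproducibility)": floor_at_zero((ms_operator - ms_interaction) / (n * r)),
        "Interaction": floor_at_zero((ms_interaction - ms_equipment) / r),
        "Parts": floor_at_zero((ms_part - ms_interaction) / (k * r)),
    }


# Rounded mean squares from the ANOVA table; 3 operators, 5 parts, 3 trials.
components = variance_components(0.815, 7.227, 0.008, 0.057, k=3, n=5, r=3)
for source, variance in components.items():
    print(f"{source:<28} {variance:.4f}")
```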
% Gage R&R
The Measurement Systems Analysis manual published by AIAG (www.aiag.org)
provides the following definition: The measurement system variation for
repeatability and reproducibility (or GRR) is defined as the following:
GRR² = EV² + AV²
where EV² is the equipment (repeatability) variance and AV² is the appraiser (or
operator) variance. Thus:
$$\sigma^2_{GRR} = \sigma^2_{\text{Repeatability}} + \sigma^2_{\text{Operators}} = 0.0571 + 0.0538 = 0.1109$$
The total variance is the sum of the components:
$$\sigma^2_t = \sigma^2_{GRR} + \sigma^2_{\text{Operators*Parts}} + \sigma^2_{\text{Parts}} = 0.1109 + 0 + 0.8021 = 0.9130$$
We can use the total variance to determine the % contribution of each source to
the total variance. This is done by dividing the variance for each source by the total
variance. For example, the % variation due to GRR is given by:
$$\%GRR = 100\left(\sigma^2_{GRR} / \sigma^2_t\right) = 12.14\%$$
The results for all the sources of variation are shown in the table below.
| Source | Variance | % of Total Variance |
|---|---|---|
| GRR | 0.1109 | 12.14% |
| Equipment (Repeatability) | 0.0571 | 6.25% |
| Operators (Reproducibility) | 0.0538 | 5.89% |
| Interaction | 0.0000 | 0.00% |
| Parts | 0.8021 | 87.86% |
| Total | 0.9130 | 100.00% |
Based on this analysis, the measurement system is responsible for 12.14% of the
total variance. This may or may not be acceptable depending on the process and
what your customer needs and wants. Note that this result is based on the total
variance. It is very important that the parts you use in the Gage R&R study
represent the range of values you will get from production.
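The % of total variance column can be reproduced from the component variances with a sketch like the one below. The function name is hypothetical. The inputs are the four-decimal variances from the table, so the percentages can differ from the table in the second decimal place.

```python
def percent_of_total(variances):
    """Return {source: (variance, % of total variance)} including GRR and Total rows."""
    grr = variances["Equipment (Repeatability)"] + variances["Operators (Reproducibility)"]
    total = grr + variances["Interaction"] + variances["Parts"]
    rows = {"GRR": grr}
    rows.update(variances)
    rows["Total"] = total
    return {source: (v, 100.0 * v / total) for source, v in rows.items()}


# Component variances from the table above.
table = percent_of_total({
    "Equipment (Repeatability)": 0.0571,
    "Operators (Reproducibility)": 0.0538,
    "Interaction": 0.0,
    "Parts": 0.8021,
})
for source, (variance, pct) in table.items():
    print(f"{source:<28} {variance:.4f}  {pct:6.2f}%")
```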
One of the major problems people have with Gage R&R studies is selecting
samples that do not truly reflect the range of production. If you have to do that, you
can begin to look at how the results compare to specifications. We will take a look
at that next month as we compare the ANOVA method to the Average and Range
method for analyzing a Gage R&R experiment. You could also use a variance
calculated directly from a month's worth of production in place of the total variance
in the analysis.
Summary
In this newsletter, we continued our exploration of using ANOVA to analyze a
Gage R&R experiment. We completed the ANOVA table, presented the expected
mean squares and how to use those to estimate the variances of the components,
and showed how to determine the %GRR as a percent of the total variance.
In the next newsletter, we will compare the ANOVA method to the Average and
Range method for Gage R&R.