03 - measure current state
-
Session 03 - Measuring Current State
Pat Hammett, University of Michigan 1
1
Six Sigma Measure Phase
Measuring the Current State of a Process
2
Case Study: Scanner Mfg
Problem: Weld Defects between Mylar Motor and Attachment Bracket (Ultrasonic Weld Operation)
Key Output Variables (Ys)
  Weld Shear Force (from destructive test)
    Specification: Shear Force > 13 lbs
  Visual Weld Inspection (binary: pass/fail)
Process Variables (Xs)
  Material (melt flow index)
  Surface condition
  Press force
  Clamping force
  Temperature
3
Topics
I. Review Types of Data
II. Review of Exploring Data Patterns and Descriptive Statistics
III. Six Sigma Metric Calculations*
  Yield
  Defects Per Million (DPM)
  Defects Per Million Opportunities (DPMO)
  DPM based on Variable (Numerical) Data
* Note: Other metrics will be discussed in future lectures
4
I. Types of Data Variables
Selection of analysis method/tool depends on type of data
Discrete / Continuous Variables (Numerical/Quantitative Data)
  Discrete variables - vary by whole units (e.g., # of customers)
  Continuous variables - vary to any degree, limited only by precision of measurement system
    Time to complete a task
    Manufactured hole diameter: measurement may be 10 mm, 10.0 mm, 10.01 mm, 10.008 mm
Qualitative (Categorical) Variables (Attribute Data)
  Binary (pass/fail; defective/not defective)
  Ordinal (ordered classification system such as survey rating systems)
  Nominal (non-ordered groups or classifications)
5
Qualitative (Categorical) Data
To analyze qualitative data, we typically assign discrete numerical values and/or use them to stratify or group other numerical data by categories
Some examples are:
Binary Variables - assign discrete binary outcome (0/1)
  Examples: On Time Delivery, Service Quality
  Binary Attribute: On Time (0) / Late (1); OK (0) / NOK (1)
Ordinal Variables - assign discrete ordinal scale to classify responses
  Ordinal Attribute: natural order is implied between categories but the magnitude of difference is unknown
  Example 1: Variable = Size
    Small, Medium, and Large
  Example 2: Variable = Survey Response to Question (with ordinal attribute scale)
    Strongly Disagree (1), Disagree (2), Neutral (3), ... Strongly Agree (5)
Nominal (Categorical or Grouping) Variables - use to stratify or group data
  Variable Example: Distribution Center
    Nominal Attributes: Northeast, Southwest, Central
  Other Examples: Shift (e.g., Day or Night); Plant; Department; Model Type
6
II. Review of Exploring Data Patterns and Descriptive Statistics
To characterize a variable, we typically observe a Sample from a Population and run statistical analysis (e.g., compute Statistics).
Some common statistical analyses/tools to characterize a variable include:
A. Data patterns regardless of time order
  Common Tools (Sample size, N > 30): Histogram, Box Plot
  If small sample size (e.g., N < 30): use Dot Plot
B. Data patterns in time order (i.e., to evaluate process stability over time)
  Run Chart (also known as trend chart or time series plot)
  Statistical Process Control (SPC) Chart (refer to SPC lecture)
C. Descriptive Statistics Summary Table - common statistics to report include:
  Sample Size, N
  Location Statistics: Mean and Median
  Dispersion (Variation) Statistics: St Dev, Variance, Range (with Min and Max)
  Symmetry and Peakedness of Distribution Shape: Skewness and Kurtosis
  Additional Statistics: Trimmed Mean, Quartiles, or Percentiles
7
A. Histogram Example
Typical Y-Axis: frequency or relative frequency
  May use relative frequency (%) if sample size is large
May create using Excel or Minitab
Minitab Commands: >> Graph >> Histogram >> Select Variable: ShearForce
[Figure: Histogram of ShearForce; x-axis: ShearForce (ticks 0 to 24), y-axis: Frequency (ticks 0 to 16)]
Note: Requirement is Shear Force >= 13 (Lower Specification Limit (LSL) = 13)
8
Normal Vs. Skewed Data
Does shear force data appear normally distributed or another shape (e.g., skewed right, skewed left, or bi-modal)? Is this likely a natural phenomenon?
[Figure: example distribution shapes: Normal, Skewed Right, Skewed Left, Bi-Modal]
[Figure: Histogram of ShearForce; x-axis: ShearForce (ticks 0 to 24), y-axis: Frequency (ticks 0 to 16)]
9
Statistical Test - Normality
We may use Minitab to test for Normality
  Null Hypothesis (Ho): Data are Normal; Alternative Hypothesis (Ha): Data are not Normal
  Test Conclusion: p-value is ~0.000 (note: if p-value < alpha, reject Ho). Here, p-value < 0.05, so we reject Ho and conclude the data are not Normal
Minitab Commands: >> Stat >> Basic Statistics >> Normality Test >> Select Variable: ShearForce
  Note: Selected Anderson Darling Test
  Default: alpha error = 0.05
[Figure: Normal probability plot of ShearForce; x-axis: ShearForce (ticks 0 to 40), y-axis: Percent]
10
Box Plot Calculations
  Q3 = 75th Percentile (third quartile)
  Median = 50th Percentile
  Q1 = 25th Percentile (first quartile)
  fs = Q3 - Q1
  Upper Limit: Q3 + 1.5 fs
  Lower Limit: Q1 - 1.5 fs
  Upper Whisker: highest value within upper limit
  Lower Whisker: lowest value within lower limit
  Mild Outlier(s): Q3 + 1.5 fs < mild outlier < Q3 + 3.0 fs, or Q1 - 3.0 fs < mild outlier < Q1 - 1.5 fs
  Extreme Outlier(s): extreme outlier > Q3 + 3.0 fs, or extreme outlier < Q1 - 3.0 fs
Excel Command (e.g., Q3): =percentile(data array, 0.75)
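The box plot calculations above can be sketched in Python as follows. This is a minimal illustration, not the Minitab procedure: the function name `box_plot_summary` and the sample values are hypothetical, and the choice of `method="inclusive"` is an assumption intended to mirror the interpolation used by Excel's percentile function. Minitab may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Compute quartiles, limits, whiskers, and outliers for a box plot."""
    data = sorted(values)
    # Assumption: "inclusive" interpolation (min = 0th, max = 100th percentile).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    fs = q3 - q1

    lower_limit, upper_limit = q1 - 1.5 * fs, q3 + 1.5 * fs
    lower_extreme, upper_extreme = q1 - 3.0 * fs, q3 + 3.0 * fs

    # Values inside the 1.5 fs limits determine the whiskers.
    within = [x for x in data if lower_limit <= x <= upper_limit]
    # Mild outliers lie between the 1.5 fs and 3.0 fs limits.
    mild = [
        x for x in data
        if lower_extreme <= x < lower_limit or upper_limit < x <= upper_extreme
    ]
    # Extreme outliers lie beyond the 3.0 fs limits.
    extreme = [x for x in data if x < lower_extreme or x > upper_extreme]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "fs": fs,
        "lower_limit": lower_limit,
        "upper_limit": upper_limit,
        "lower_whisker": min(within),
        "upper_whisker": max(within),
        "mild_outliers": mild,
        "extreme_outliers": extreme,
    }


# Hypothetical values for illustration only (not the shear force data).
sample = [1, 4, 5, 6, 7, 8, 9, 17, 30]
summary = box_plot_summary(sample)
print(summary)
```

With these hypothetical values, Q1 = 5 and Q3 = 9, so fs = 4, the upper limit is 15, and the 3.0 fs boundary is 21. The value 17 is flagged as a mild outlier and 30 as an extreme outlier.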
11
Box Plot - Shear Force
What does this box plot suggest?
Minitab Commands: >> Graph >> Boxplot >> Select Variable: Y = ShearForce
[Figure: Boxplot of ShearForce; y-axis: ShearForce (ticks 0 to 30)]
12
Histogram Vs. Box Plot
Box plots provide a similar representation of distribution as a Histogram (for Normal, skewed right, skewed left)
  Exception: must show multi-modal with histogram
[Figures: Histogram of ShearForce; Boxplot of ShearForce]
13
Outlier Analysis (Extreme Values)
Box plots provide an effective tool to identify possible outliers
Outliers are non-representative values in a data set and generally result from:
  measurement or data entry error (e.g., record using wrong units)
  observation being obtained under a different set of circumstances (e.g., special cause)
  data recorded during peak volume versus typical conditions
Outliers may significantly affect descriptive statistics such as mean/standard deviation and other statistics (e.g., correlation between two variables)
14
Outliers: Good Or Bad?
A Data Analysis Trap is to automatically exclude outliers
  Outliers may suggest that a better set of operating conditions is available
Unfortunately, deciding whether to include or exclude outliers is an experience-developed skill
  Try to understand the source of outliers before discarding
If you decide to remove an outlier, some typical strategies are:
  With a large sample size, remove the entire observation
  For smaller samples (N < 100) where you collect data on several variables, you may want to keep the sample (observation). Here, we typically replace the outlier value with the median value for that variable. Why Median?
15
Multiple Box Plots
During the analyze phase, we often stratify Box Plot Results for Y output by grouping variables (e.g., Nominal Variables)
  Is shear force consistent across all batches of incoming material?
Minitab Commands: >> Graph >> Boxplot >> Select Graph Variable: Y = ShearForce, X = Batch
[Figure: Boxplot of ShearForce vs Production Batch*; x-axis: Production Batch* (P1, P2, P3), y-axis: ShearForce (ticks 0 to 30)]
16
B. Run Chart (Time Series Plot)
If time sequence is available, we often like to examine data by time (look for time trends)
Minitab Commands: >> Graph >> Time Series Plot >> Select Graph Variable: Y = ShearForce
[Figure: Time Series Plot of Shear Force (lb); x-axis: Index (1 to 60), y-axis: Shear Force (lb) (ticks 0 to 30)]
17
C. Minitab - Descriptive Statistics
Another common analysis to perform during the measure phase is to compute descriptive statistics for Y (if Y may be evaluated as a continuous variable)
Descriptive Statistics: ShearForce
  Minitab Command: >> Stat >> Basic Statistics
Descriptive Statistics: Shear Force (lb)
Variable          N   N*  Mean    SE Mean  StDev  Minimum  Q1      Median
Shear Force (lb)  60  0   17.670  0.883    6.841  1.400    11.350  20.200
Variable          Q3      Maximum  Skewness  Kurtosis
Shear Force (lb)  23.275  26.900   -0.75     -0.53
Or, use Excel to create a table with: N, Mean, StDev, Min, Max, Range, Skew
Questions:
  What does a skewness of -0.75 suggest?
  Why does the median differ from the mean for these data?
18
Stratification Analysis of Descriptive Statistics
May wish to stratify an output by an X variable
Descriptive Statistics: ShearForce
  Minitab Command: >> Stat >> Basic Statistics; By Variable: Batch
What do these data suggest?
Descriptive Statistics: ShearForce
Production Batch*  N   Mean    TrMean  StDev  Minimum  Median  Maximum
P1                 20  22.170  22.272  2.859  16.200   22.450  26.300
P2                 20  16.30   16.47   7.07   2.60     18.05   26.90
P3                 20  14.55   14.71   7.32   1.40     12.30   24.70
19
III. Six Sigma Metric Calculations
1. Yield (e.g., Simple Quality Yield)
2. Defects Per Million (DPM) (Attribute Data) Note: DPM also known as PPM for parts per million defective
3. Defects Per Million Opportunities (DPMO)
4. Defects per Million (Observed DPM)
5. Defects per Million (Expected DPM)
Note: Other Six Sigma Metrics are covered later in the course: Process Capability, Reliability, Rolled Throughput Yield
20
Specifications
To calculate Yield (or % defective, DPM, DPMO) we need standards or specification limits
  LSL: Lower Specification Limit; USL: Upper Specification Limit
Specification limits identify acceptance levels.
Unilateral Specification Limit Examples
  Process time (maximum allowed, i.e., USL only)
  Shear Force >= 13 lbs (LSL = 13)
Bilateral Specification Limit Examples
  30 +/- 5 days (Nominal = 30; LSL = 25; USL = 35)
  Width 1000 +/- 0.5 mm
21
1. Quality Yield (% Acceptable)
Quality Yield = (# Good Units) / (Total # Units) x 100%
  Unit: part, service, customer, document, procedure, etc.
Or, Yield = (1 - Fraction Defective) x 100%
  Where Fraction Defective = # Defective / Total # Units
  # Defective is a binary assessment (e.g., 0-not late; 1-late); the typical convention for binary data is to let defect = 1
Example: Suppose 232 of 1034 bills are late (802 are on-time); calculate the Quality Yield
  Quality Yield = 802/1034 = 77.6%
22
DPM and DPMO Methods
Depending on type of data, we often convert Yield to defects per million (DPM) or defects per million opportunities (DPMO)
  Method used varies based on type of data/assumptions
23
2. Defective Method for DPM
Suppose you have a process where each unit is classified as defective or not defective
DPM = Fraction Defective x 1 Million
  Note: Yield = 1 - Fraction Defective
  Fraction Defective = (Total # Defective) / (Total # Units Inspected)
Suppose you fabricate 4000 welds and find that 35 are defective. What is the DPM?
  Fraction Defective: 35 / 4000 = 0.008750
  DPM = 0.008750 x 1,000,000 = 8,750
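As a quick check of the Quality Yield and DPM arithmetic, the two definitions can be sketched in Python. The function names are hypothetical and chosen only for this illustration. The inputs are the late-bills example (232 of 1034) and the weld example (35 of 4000) from the slides.

```python
def quality_yield_percent(num_defective, total_units):
    """Quality Yield (%) = (# good units / total # units) x 100."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return (total_units - num_defective) * 100 / total_units


def dpm(num_defective, total_units):
    """Defects per million = fraction defective x 1,000,000."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return num_defective * 1_000_000 / total_units


# Late bills example: 232 late out of 1034
print(round(quality_yield_percent(232, 1034), 1))

# Weld example: 35 defective out of 4000
print(dpm(35, 4000))
```

The same fraction defective feeds both metrics, so a yield can always be converted to DPM as (100 - yield %) x 10,000.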
24
3. # Defects per Unit Opportunity Method (DPMO)
Use if a particular inspection unit or part may have 1 or more defects (multiple opportunities)
Example: Suppose we visually inspect a weld manufacturing process for various conditions
  A: Excess Part Deflection after welding
  B: Poor weld penetration
  C: Poor weld appearance (e.g., excess flash)
Note: each weld (unit) could have 0 - 3 defects
25
Defects per Million Opportunity (DPMO)
Here, we use opportunities to summarize the total number of possible chances for error (i.e., defects) in system
DPMO = (Total # Defects) / (Total Opportunities (TOP)) x 1 Million
Where:
  Total # Defects = Total # defects across all units
  Total # Opportunities (TOP) = sum of Opportunities(i) over all i, where i = defect category
26
DPMO Example
Given the following data set of three features per unit:
  Suppose you have 1,000 welds (TOP = 3 x 1,000 = 3,000)
Part Feature  Defects
A             22
B             19
C             18
Total         59
Fraction nonconforming = 59/3,000 = 1.967%
DPMO = 19,667
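The DPMO calculation for the weld example can be reproduced with a short Python sketch. The function name `dpmo` and the dictionary layout are assumptions made for illustration. The defect counts and the 1,000 opportunities per feature come from the example above.

```python
def dpmo(defects_by_category, opportunities_by_category):
    """DPMO = total defects / total opportunities (TOP) x 1,000,000."""
    total_defects = sum(defects_by_category.values())
    # TOP is the sum of opportunities over all defect categories.
    top = sum(opportunities_by_category.values())
    if top <= 0:
        raise ValueError("total opportunities must be positive")
    return total_defects * 1_000_000 / top


# Weld example: 1,000 welds inspected for three features (A, B, C)
weld_defects = {"A": 22, "B": 19, "C": 18}
weld_opportunities = {"A": 1000, "B": 1000, "C": 1000}

weld_dpmo = dpmo(weld_defects, weld_opportunities)
print(round(weld_dpmo))
```

Keeping opportunities as a per-category mapping, rather than a single "features per unit" multiplier, also handles cases where the number of opportunities differs by category.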
27
DPMO Hotel Survey Example - Varying Opportunities per Unit
The number of opportunities may vary by unit (customer)
  In the hotel example below, not all guests may use the hotel meal service
  Here, the total opportunities is obtained by summing the opportunities for each category
Given the following data set, what is the DPMO?
Concern                     Guests  Defects (Not Satisfied)  Opportunities
Poor Meal Service*          447     111                      447
Poor House Keeping          1000    82                       1000
Problems with Reservations  1000    34                       1000
Long Check In               1000    96                       1000
Long Check Out              1000    58                       1000
Total                               381 (# defects)          4447 (TOP)
* Note: not all guests used a hotel meal service
28
Overall DPMO For Multiple Groups (Facilities)
DPMO also may be used to summarize multiple groups (e.g., departments, facilities)
  Note: Opportunity per group also provides a measure of complexity
  For example, perhaps one of the hotels does not offer any meal services
Defects by concern:
Hotel  Poor Meal Service*  Poor House Keeping  Problems with Reservations  Long Check In  Long Check Out  Total Defects  TOP
A      111                 82                  34                          96             58              381            4447
B      120                 89                  37                          102            62              410            5114
C      n/a                 75                  28                          90             70              263            4225
TOTAL                                                                                                     1054           13786
DPMO = 1054/13786 * 1M
OVERALL DPMO = 76,454
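The overall DPMO for multiple groups can be sketched as below. The variable and function names are hypothetical. The (total defects, TOP) pairs are taken from the hotel table. Note that the overall figure pools defects and opportunities across hotels. It is not the simple average of the three hotel DPMO values, because each hotel has a different number of opportunities.

```python
# (total defects, total opportunities) per hotel, from the table above
hotel_totals = {
    "A": (381, 4447),
    "B": (410, 5114),
    "C": (263, 4225),  # Hotel C has no meal service, so fewer opportunities
}


def group_dpmo(totals):
    """Return (DPMO per group, pooled overall DPMO)."""
    per_group = {
        name: defects * 1_000_000 / top
        for name, (defects, top) in totals.items()
    }
    all_defects = sum(defects for defects, _ in totals.values())
    all_top = sum(top for _, top in totals.values())
    overall = all_defects * 1_000_000 / all_top
    return per_group, overall


per_hotel, overall_dpmo = group_dpmo(hotel_totals)
for name, value in per_hotel.items():
    print(name, round(value))
print("Overall", round(overall_dpmo))
```

Hotel A's own DPMO (381 / 4447 x 1M) is also the answer to the single-hotel survey example earlier in this section.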
29
DPMO - The Denominator Game
Suppose we measure 200,000 units with 1 feature per unit. What happens to the DPMO as the # of features (concerns) with NO defects increases?
  NOTE: Features MUST BE Customer Related and should not just be added to improve DPMO
Feature  # Defects  # Opportunities  DPMO
A        3          200,000          15
B        0          200,000          0
C        0          200,000          0
D        0          200,000          0
E        0          200,000          0
Total Defects: 3; Total Opportunities: 1,000,000
Combined DPMO = 3 (3 / 1M)
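The effect of the denominator game can be made concrete with a small Python sketch. The function name is hypothetical. It assumes, as in the example, 3 defects found on one feature across 200,000 units, with each added feature contributing 200,000 more opportunities and no defects.

```python
def combined_dpmo(total_defects, units, num_features):
    """DPMO when every unit contributes one opportunity per feature."""
    total_opportunities = units * num_features
    return total_defects * 1_000_000 / total_opportunities


# Same 3 defects, but more and more zero-defect features are counted
for num_features in range(1, 6):
    print(num_features, combined_dpmo(3, 200_000, num_features))
```

The number of defects never changes, yet the reported DPMO falls from 15 to 3 as the feature count goes from 1 to 5. This is why added opportunities should reflect concerns that matter to the customer.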
30
Denominator Game Example
Suppose you have a hole specification
Could you have one defect opportunity for oversized and another for undersized?
What if we added the category missing weld to our example? How might we include that in determining total opportunities?
31
4. Variable Data Method for Observed DPM
If we collect numerical measurements for a characteristic (dimension) of each unit, we may convert each observation to a binary result based on the specification limits of the characteristic and then compute DPM
  Either In-Specification or Out-of-Specification (Defect)
Here, fraction defective = # units observed out-of-specification / total # units
DPM = Fraction Defective x 1 Million
  Also known as parts per million (PPM) defective
32
DPM Example: Shear Force (based on Observed Out-of-Specification)
Specifications: OK, if shear force >= 13
To compute DPM, we need to convert each observed measurement to a binary output (0 = within specification; 1 = outside specification, i.e., a defect)
Note: Observed DPM also may be obtained using Minitab with the Process Capability Summary Analysis Tool
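The conversion from measurements to a binary outcome and then to Observed DPM can be sketched in Python. The function name and the sample values are hypothetical (the raw shear force measurements are not listed in these slides). Only the lower specification limit of 13 comes from the case study.

```python
def observed_dpm(measurements, lsl=None, usl=None):
    """Observed DPM: share of measurements outside the specification limits."""
    if not measurements:
        raise ValueError("measurements must not be empty")

    def is_defect(value):
        below = lsl is not None and value < lsl
        above = usl is not None and value > usl
        return below or above

    # 0 = within specification, 1 = outside specification (defect)
    flags = [1 if is_defect(value) else 0 for value in measurements]
    return sum(flags) * 1_000_000 / len(flags)


# Hypothetical shear force readings (lb), with LSL = 13 from the case study
sample = [10.2, 14.1, 20.5, 12.9, 23.0, 18.4]
print(observed_dpm(sample, lsl=13))
```

For reference, an Observed DPM of 316,667 with N = 60, as reported in the Minitab capability output for the shear force data, corresponds to 19 of the 60 measurements falling below the LSL.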
33
5. Variable Data Method for Expected DPM
Used when collecting variable data and data may be reasonably assumed to follow a known or assumed distribution (e.g., normal)
Use software to fit data to statistical distribution (e.g., Normal Distribution) and estimate the probability (Pr) of a defect based on the distribution and its properties
Expected (Predicted) DPM = Pr (Defect) x 1 Million
[Figure: Normal distribution example with bilateral tolerance; the DEFECT regions lie below the Lower Specification and above the Upper Specification]
34
Expected DPM Using Minitab
Capability Analysis: Minitab will compute expected DPM (based on assumed distribution). (Note: we will examine non-normal distributions in a later module, or see appendix.)
Note: Menu will vary based on Minitab Version Used (Version 14 shown)
Suppose we assume Normality
35
Minitab Process Capability Analysis (excellent all-in-one analysis tool**)
Minitab (Version 14) Command: Stat >> Quality Tools >> Capability Analysis (Normal)
  Variable: ShearForce; Subgroup Size ~ 1; LSL = 13
  (Minitab assumptions: unbiasing constants, average moving range method with length = 2)
Does the Normality Assumption matter in this example?
[Figure: Process Capability of ShearForce; histogram with LSL marked and Within / Overall fitted curves; x-axis ticks 0 to 30]
Process Data
  LSL             13
  Target          *
  USL             *
  Sample Mean     17.67
  Sample N        60
  StDev(Within)   4.56185
  StDev(Overall)  6.86963
Potential (Within) Capability
  Cp    *
  CPL   0.34
  CPU   *
  Cpk   0.34
  CCpk  0.34
Overall Capability
  Pp    *
  PPL   0.23
  PPU   *
  Ppk   0.23
  Cpm   *
Performance       Observed   Exp. Within  Exp. Overall
  PPM < LSL       316666.67  152986.54    248314.53
  PPM > USL       *          *            *
  PPM Total       316666.67  152986.54    248314.53
Observed DPM: 316,667
Expected (Predicted) DPM: 248,314
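The expected DPM figures in the capability output can be approximately reproduced with Python's standard library, assuming a Normal distribution. This is a sketch under stated assumptions, not Minitab's internal procedure: the function name is hypothetical, and the inputs are the rounded sample mean (17.67), the two standard deviation estimates, and LSL = 13 as printed in the output, so small differences from Minitab's values are expected.

```python
from statistics import NormalDist


def expected_dpm_normal(mean, stdev, lsl=None, usl=None):
    """Expected DPM = Pr(defect) x 1,000,000 under a Normal distribution."""
    dist = NormalDist(mu=mean, sigma=stdev)
    probability_defect = 0.0
    if lsl is not None:
        probability_defect += dist.cdf(lsl)        # area below the LSL
    if usl is not None:
        probability_defect += 1.0 - dist.cdf(usl)  # area above the USL
    return probability_defect * 1_000_000


# Inputs taken from the Process Data block (rounded values)
overall = expected_dpm_normal(17.67, 6.86963, lsl=13)
within = expected_dpm_normal(17.67, 4.56185, lsl=13)
print(round(overall), round(within))
```

The result depends heavily on which standard deviation estimate is used. With the overall standard deviation the prediction is close to 248,000 DPM. With the within (moving range) estimate it is close to 153,000 DPM. Both differ from the observed 316,667 because the Normal assumption is questionable for these data.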
36
Observed Vs. Expected DPM
If we collect variable data (e.g., continuous) and have specifications, we may always convert to a binary outcome and compute Observed DPM
Or, we can predict the DPM (Expected DPM) by fitting sample data to a distribution and then determining the probability of a defect x 1M.
Of note: neither is wrong; ultimately you want to use the most representative estimate. Rule of thumb:
  If data reasonably fit a distribution shape (e.g., Normal or Weibull), report the Expected (Predicted) DPM, particularly if data are from a smaller sample size (e.g., 30-100).
  If data do not reasonably fit a distribution and a large sample size is available (> 200), use Observed DPM.
  If not sure, report them both in current state
    Note: data often are not normal when assessing the current state during the measure phase, as some problems create non-normality
37
Summary
In the measure phase, we typically include:
  Histogram and/or Box Plot of raw data (if continuous data)
    May include Normality Test or Distribution ID Probability Plot Analysis (see appendix)
  Run Chart (or SPC Chart) to show any time series trends
  Summary Statistics (if continuous data)
    N, mean, median, standard deviation, variance, min, max, range, skew
  Estimate of Current State in terms of: Yield, DPM, or DPMO
    Calculations vary depending on type of data, best fit distribution, defect opportunity classification, # opportunities for defect per unit, etc.
    For numerical variables, use Expected DPM for smaller sample sizes (< 100), particularly if data reasonably fit a known distribution. For larger sample sizes, may use Observed DPM and/or Expected DPM (if good distribution fit).
    When identifying opportunities for DPMO, they should be important to the customer and independent of other categories (avoid denominator game).
38
Appendix: Distribution ID Plot
Minitab has a tool to help determine best distribution fit
  STAT >> Reliability/Survival >> Distribution Analysis (Right Censoring) >> Distribution ID Plot
Choose distribution with highest correlation coefficient / lowest AD score
Common Distribution Options:
  Weibull (best result)
  Exponential
  Lognormal
  Normal
  Others available
39
Shear Force Results
Best Result, look for:
  lowest AD score (based on maximum likelihood estimation)
  highest correlation coefficient (based on Least Squares Estimation)
Here, we do not have a good distribution fit for any of the options (recall, bi-modal)!
[Figure: Probability Plot for ShearForce, LSXY Estimates - Complete Data; four panels: Weibull, Lognormal, Exponential, Normal; x-axis: ShearForce, y-axis: Percent]
Correlation Coefficient
  Weibull      0.948
  Lognormal    0.865
  Exponential  *
  Normal       0.954
40
Use Best Fit Distribution to Estimate DPM
Note: topic covered in process capability analysis module
Select Desired Distribution