statistical techniques analysis fgcsolutions - pdf handout
Training Material for Statistical Analysis
Statistical Techniques Analysis
Presented by Frank G. Calucin, ISO Management Consultant & Trainer
Objectives
• Provide an understanding of the need for, and the application of, statistical techniques based on the requirements of the ISO 9001:2008 standard.
• Provide knowledge of the basics of statistics and of its use, benefits and application in an organization.
• Provide hands-on practice with selected statistical techniques.
Requirements
DAY 1:
• Simple calculator
• Graphing paper (if available)
• Ruler
DAY 2:
• Computer with Microsoft Office Excel
PASS Requirements:
• Complete the exercises
• Submit the assignment
QMS Requirements for Analysis
Brief History: ISO 9001:1994, Clause 4.20, Statistical Techniques
• The organization had to ESTABLISH PROCEDURES to implement and CONTROL the USE of STATISTICAL TECHNIQUES in the quality system. In other words, the organization had to establish the statistical techniques it needed to verify process capability and product characteristics.
• The clause also stated that the procedure should explain how the statistical techniques would be applied, and that the monitoring and control of these techniques had to be documented.
Brief History: ISO 9001:2000
• Unlike the 1994 version, ISO 9001:2000 does not have a separate clause for statistical techniques, which means a documented procedure is no longer required.
• However, the requirement was incorporated into Clause 8, Measurement, Analysis and Improvement, specifically Clause 8.4, Analysis of Data.
• In ISO 9001:2008, Clause 8.4, Analysis of Data, is largely unchanged except for items 8.4 b) and 8.4 d).
In summary, to comply with the minimum requirement of ISO 9001:2008 Clause 8.4, the organization should have evidence that analysis is done on the following four items:
1. Customer Satisfaction
2. Conformity to Product Requirements
3. Characteristics and Trends of Processes/Products, including Preventive Actions
4. Suppliers
Why Are You Attending This Course?
• The standard does not specifically require the company to attend training like this.
• However, the standard does require the organization to DETERMINE, COLLECT AND ANALYZE appropriate data.
• Clause 6.2.2 also requires that, where applicable, the organization provide training to achieve the necessary competence.
• Taken together with Clause 8.4, Analysis of Data, this means that the employees who collect and analyze the data must be competent in statistical analysis.
Data Analysis and Customer Satisfaction
• Quality is defined as the satisfaction of the needs and requirements of the customer.
• Customer satisfaction with the products and services the organization provides can only be gauged by deciding which data to collect, for what objective, and how to analyze it.
Customer Satisfaction
• Clause 8.2.1, Customer Satisfaction, states that the organization must MONITOR INFORMATION relating to CUSTOMER PERCEPTION as to whether the organization HAS MET the customer requirements.
• As such, the organization must establish a system to collect data for a specific purpose and analyze it to determine whether the products/services provided to the customer are indeed effective. This leads to the requirement of Clause 8.5.1, Continual Improvement.
BASICS OF STATISTICS (Statistics 101)
• STATISTICS – a branch of mathematics that deals with the theory and method of collecting, organizing, presenting, analyzing, and interpreting data.
• STATISTICAL DATA – numerical data such as sales, rejects, nonconformities, population, births, deaths, etc.
• DATA GATHERING – includes information gathered through surveys and interviews, and raw data from records such as the purchase and consumption of materials, etc.
Two Main Divisions of Statistics
• DESCRIPTIVE STATISTICS – refers to the collection, organization, presentation, computation and interpretation of data in order to describe the samples under investigation. Put simply, this type of statistics describes what the data look like.
• INFERENTIAL STATISTICS – seeks to give information, inferences or implications about a population by studying its representatives. In short, it draws conclusions about a population from a sample.
Population and Samples
Random Sampling: every unit in the population has an equal chance of being chosen.
A random sample should represent the population well, so sample statistics from a random sample should provide reasonable estimates of population parameters.
We use sample statistics as estimates of population parameters. The quality of all statistical analysis depends on the quality of the sample data.
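As a minimal sketch of random sampling, the Python snippet below draws a sample in which every unit has the same chance of being chosen. The population of unit IDs 1 to 1,000, the sample size of 100, the seed and the function name `simple_random_sample` are all assumptions made for illustration; they are not part of the course material.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit in the population has the same chance of being chosen.
    The seed is optional and only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(list(population), sample_size)


# Hypothetical population: unit IDs 1 to 1,000 (illustration only).
population = list(range(1, 1001))
sample = simple_random_sample(population, 100, seed=42)

print(len(sample))       # number of units drawn
print(sorted(sample)[:5])  # a few of the selected unit IDs
```

Because the draw is without replacement, no unit can appear twice in the sample.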
• POPULATION – the totality of objects, individuals or reactions that have common observed characteristics. Examples of populations are trainees, instructors, teachers, students, employees of a company, products and services offered, etc.
• SAMPLING METHOD – getting a small but representative cross section of the population. This representative part is called the SAMPLE. Sampling is discussed in more detail later on.
VARIABLES
• A variable is one of the basic concepts in statistics. It refers to an observed characteristic such as weight, height, sex, age, IQ, etc.
• Variables are the raw data for statistical analysis and are usually expressed as X, Y, Z, etc.
Two Types of Variables
• Discrete Variable – a variable obtained through COUNTING, like the number of deaths, births, students, trainees, instructors, etc. at any given time
• Continuous Variable – a variable whose values CAN NEVER BE EXACT no matter how carefully we measure, like age, height, weight, temperature, volume, area, time, etc., because this type of variable can assume any value on an interval of real numbers.
Two Groups of Variables
• Independent Variable – used as a predictor when the objective is to predict the value of another variable
• Dependent Variable – the variable whose value is predicted
To illustrate, if we want to predict a student's academic achievement in a certain course, say mathematics, we may have to analyze different factors such as gender, intelligence, study habits, interest, attitude, etc.
• These factors are what we call INDEPENDENT VARIABLES. The DEPENDENT VARIABLE, on the other hand, is the student's academic achievement in mathematics.
Two Types of Data
• QUALITATIVE DATA – are categorical data, which take the form of categories or attributes such as gender, course, race, religion, etc.
• QUANTITATIVE DATA – are numerical data obtained from measurements like height, weight, age, score, temperature, etc.
Scales of Measurement
• Qualitative data can be converted to quantitative data through a process called MEASUREMENT. In measurement, numbers are used to code objects so that they can be treated statistically.
• FOUR TYPES OF MEASUREMENT
1. Nominal Measurement
2. Ordinal Measurement
3. Interval Measurement
4. Ratio Measurement
Nominal Measurement
• Nominal Measurement – used for identification or classification purposes
• Nominal Data – the numbers are simply labels. You can count nominal data, but you cannot order or measure them.
Example: a group of students is classified according to course: 1) engineering; 2) information technology; 3) accounting; 4) nursing.
The magnitudes of the numbers assigned to the courses carry no meaning. The numbers serve only as codes.
Ordinal Measurement
• Ordinal Measurement – this type of measurement gives the order or rank of classes, items or objects.
• Ordinal Data – the values are ordered, but the differences between values are not meaningful
Examples: 1st prize, 2nd prize, 3rd prize; 1) very good 2) good 3) fair 4) poor 5) very poor
Interval Measurement
• Interval Measurement – numbers are assigned to the items or objects to identify and rank them, and the differences between the numbers are meaningful
Example: Jerry weighs 75 kilograms and George weighs 65 kilograms. The difference of 10 kilograms indicates that Jerry is 10 kilograms heavier than George.
Ratio Measurement
• Ratio Measurement – the ratios of the numbers assigned in the measurement are meaningful
Example: Jerry is 50 years old and George is 25 years old, so their ages may be expressed in the ratio of 2:1.
Sampling Method
• As defined earlier, sampling is getting a small but representative cross section of a population.
• Working on a representative sample of 100 is generally preferable to analyzing the total population of 1,000.
Sample Size
To find the sample size for a given population, use the following formula:
n = N / (1 + N e^2)
Where:
n = sample size
N = population size
e = margin of error
Example: Find the sample size a researcher should include in a study if the student population is 1,850 and 95% accuracy is required.
Solution: For 95% accuracy, the corresponding margin of error is 5%, or 0.05.
Applying the formula:
n = 1,850 / (1 + (1,850 x 0.05^2))
n = 1,850 / 5.625 = 328.89, rounded up to 329
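The calculation above can be checked with a few lines of Python. This is a minimal sketch: the function name `sample_size` is an assumption for illustration, and the result is rounded up to the next whole number, as in the worked example. The formula is commonly known as Slovin's formula.

```python
import math


def sample_size(population_size, margin_of_error):
    """Sample size n = N / (1 + N * e^2), rounded up to a whole unit."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)


# Worked example from the text: N = 1,850 and e = 0.05 (95% accuracy).
print(sample_size(1850, 0.05))  # 329
```

A smaller margin of error calls for a larger sample. For instance, `sample_size(1850, 0.01)` gives a much larger n than the 5% case.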
Margin of Error
• The margin of error is a statistic expressing the amount of random sampling error in a survey's results.
• The larger the margin of error, the less faith one should have that the poll's reported results are close to the "true" figures; that is, the figures for the whole population.
Frequency Distribution: Basic Definitions
A FREQUENCY DISTRIBUTION TABLE lists categories of scores along with their corresponding frequencies.
The FREQUENCY for a particular category or class is the number of original scores that fall into that class.
The CLASSES or categories are the groupings of a frequency table.
The RANGE is the difference between the highest value and the lowest value.
R = highest value – lowest value
In Excel: Highest Value = MAX, Lowest Value = MIN
LL = Lower Class Limit
UL = Upper Class Limit
The CLASS WIDTH is the difference between two consecutive lower class limits or class boundaries.
For example, if two consecutive lower class limits are 16 and 25, the class width is 25 – 16 = 9.
• The CLASS LIMITS are the smallest or the largest numbers that can actually belong to different classes.
• Lower class limits are the smallest numbers that can actually belong to the different classes.
• Upper class limits are the largest numbers that can actually belong to the different classes.
• CLASS MARK – the midpoint or middle value of a class interval. It is obtained by finding the average of the lower class limit and the upper class limit.
• The CLASS MARK of the class with limits 16 to 24 is (16 + 24) / 2 = 20
Guidelines for Making a Frequency Distribution (FD)
• There should be between 5 and 20 classes.
• The class width should be an odd number.
• The classes must be mutually exclusive (a value cannot belong to two different classes at the same time).
• The classes must be continuous (no gaps).
• The classes must be exhaustive (enough classes to cover every value).
• The classes must be equal in width.
Procedure for Making a Frequency Distribution Table (FDT)
• STEP 1: Determine the range. R = Highest Value – Lowest Value
• STEP 2: Determine the tentative number of classes (k) using k = 1 + 3.322 log N, where log is the base-10 logarithm and N is the number of observations. Always round up to the next whole number.
• Note: The number of classes should be between 5 and 20. The actual number of classes may be affected by convenience or other subjective factors.
• STEP 3: Find the class size by dividing the range by the number of classes. Always round up.
• STEP 4: Write the classes or categories starting with the lowest score. Stop when a class includes the highest score.
• STEP 5: Add the class width to the starting point to get the second lower class limit. Add the class width to the second lower class limit to get the third, and so on. List the lower class limits in a vertical column and enter the upper class limits, which can be easily identified at this stage.
• STEP 6: Determine the frequency for each class by referring to the tally column, and present the results in a table.
EXERCISE
Based on Philippine National Police records, a total of 464 men died in crime-related incidents during the first week of July 2010 in the Philippines. Here are the ages of 50 individuals randomly selected from that population.
Construct a frequency distribution table.
Note: The sample of 50 is for demonstration purposes only.
GET A PIECE OF PAPER TO MAKE YOUR FREQUENCY TABLE
Class   Tally   Freq   REL-F   Cum.F   CumF%   MP
N =
The following are the ages of these 50 men who died:
19  18  70  22  17
23  25  37  26  24
47  69  25  55  25
17  36  30  20  46
24  29  21  35  37
21  27  20  65  24
27  23  65  27  16
40  41  42  75  63
33  65  23  25  25
31  18  33  76  22
Step 1: Find the highest and lowest values.
Highest Value = 76, Lowest Value = 16
Step 2: Determine the range.
Range (R) = Highest Value – Lowest Value = 76 – 16 = 60
Step 3: Determine the tentative number of classes (k) using the formula k = 1 + 3.322 log N, where N = total number of samples = 50.
k = 1 + 3.322 log 50 = 1 + 3.322 (1.69897) = 6.64, rounded up to 7
Note: Round the result up to the next integer if the decimal part exceeds 0.
Step 4: Find the class width (size) by dividing the range by the number of classes. Always round up.
Class size = Range ÷ Number of Classes
c = R ÷ k = 60 ÷ 7 = 8.57, rounded up to 9
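Steps 3 and 4 can be sketched in Python as follows. The function names `number_of_classes` and `class_width` are assumptions for illustration. The formula k = 1 + 3.322 log N is commonly known as Sturges' rule, and both results are rounded up, as the note above instructs.

```python
import math


def number_of_classes(n):
    """Tentative number of classes: k = 1 + 3.322 * log10(N), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))


def class_width(lowest, highest, k):
    """Class width: range divided by the number of classes, rounded up."""
    return math.ceil((highest - lowest) / k)


k = number_of_classes(50)   # 1 + 3.322 * 1.69897 = 6.64, rounded up
c = class_width(16, 76, k)  # 60 / 7 = 8.57, rounded up

print(k, c)  # 7 9
```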
Step 5: Write the classes or categories starting with the lowest score. Add the class width to the starting point to get the second lower class limit. Add the class width to the second lower class limit to get the third, and so on. List the lower class limits in a vertical column as shown in the table.
Starting with the lowest value, 16, as the first lower limit, add the class width of 9 to get 25, which becomes the next lower limit.
1st Lower Limit = 16 (lowest value)
2nd Lower Limit = 16 + 9 = 25
Class (lower limits)
16
25
34
43
52
61
70
Step 5 (continued): For the first upper limit, subtract 1 from the class width of 9 to get 8, and then add this to the corresponding lower limit, 16, which gives 24. Continue until the class that includes the maximum value of 76 is reached, as shown in the table.
1st Upper Limit = 16 + (9 – 1) = 24
2nd Upper Limit = 25 + (9 – 1) = 33
3rd Upper Limit = 34 + (9 – 1) = 42
4th Upper Limit = 43 + (9 – 1) = 51
5th Upper Limit = 52 + (9 – 1) = 60
6th Upper Limit = 61 + (9 – 1) = 69
7th Upper Limit = 70 + (9 – 1) = 78
Class
16-24
25-33
34-42
43-51
52-60
61-69
70-78
Step 6: Tally the data and write the count of the tallies for each class in the frequency column. The frequencies should add up to the sample size, N = 50.
Class   Tally                   Freq
16-24   ///// ///// ///// ///   18
25-33   ///// ///// ////        14
34-42   ///// //                7
43-51   //                      2
52-60   /                       1
61-69   /////                   5
70-78   ///                     3
N = 50
Step 7: Find the relative frequency. To compute the relative frequency, divide the frequency by the total number of samples (N).
To get the relative frequency of class 16 – 24:
= (Freq / N) x 100 = (18 / 50) x 100 = 36%
Class   Freq   REL-F
16-24   18     36%
25-33   14     28%
34-42   7      14%
43-51   2      4%
52-60   1      2%
61-69   5      10%
70-78   3      6%
N = 50         100%
Step 8: Find the cumulative frequency. To compute the cumulative frequency, add each frequency to the running total of the classes before it. The total should reach 50 at the class 70 – 78.
To get the cumulative frequency of class 25 – 33:
Class 25 – 33: 18 + 14 = 32
Class 34 – 42: 32 + 7 = 39
Class   Freq   REL-F   Cum.F
16-24   18     36%     18
25-33   14     28%     32
34-42   7      14%     39
43-51   2      4%      41
52-60   1      2%      42
61-69   5      10%     47
70-78   3      6%      50
N = 50         100%
Step 9: Find the cumulative frequency percentages. To compute the cumulative frequency percentage, add each relative frequency to the running total of the classes before it. The total should reach 100% at the class 70 – 78.
To get the cumulative frequency percentage of class 25 – 33:
Class 25-33: 36 + 28 = 64
Class 34-42: 64 + 14 = 78
Class   Freq   REL-F   Cum.F   CumF%
16-24   18     36%     18      36%
25-33   14     28%     32      64%
34-42   7      14%     39      78%
43-51   2      4%      41      82%
52-60   1      2%      42      84%
61-69   5      10%     47      94%
70-78   3      6%      50      100%
N = 50         100%
Step 10: Find the class marks, or midpoints, of each class. To compute the class mark (the midpoint, which is used for constructing the frequency polygon), add the lower class limit and the upper class limit of each class, and then divide the sum by 2.
To get the midpoint of class 16 – 24, add the lower class limit of 16 and the upper class limit of 24, and then divide by 2:
MP of class 16 – 24 = (16 + 24) / 2 = 40 / 2 = 20
Class   Freq   REL-F   Cum.F   CumF%   MP
16-24   18     36%     18      36%     20
25-33   14     28%     32      64%     29
34-42   7      14%     39      78%     38
43-51   2      4%      41      82%     47
52-60   1      2%      42      84%     56
61-69   5      10%     47      94%     65
70-78   3      6%      50      100%    74
N = 50         100%
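Steps 5 to 10 can be reproduced in a short Python sketch. The function name `frequency_table` and the dictionary keys are assumptions for illustration; the ages, the lowest value of 16, the class width of 9 and the 7 classes come from the exercise above.

```python
# Ages of the 50 sampled men from the exercise.
ages = [
    19, 18, 70, 22, 17, 23, 25, 37, 26, 24,
    47, 69, 25, 55, 25, 17, 36, 30, 20, 46,
    24, 29, 21, 35, 37, 21, 27, 20, 65, 24,
    27, 23, 65, 27, 16, 40, 41, 42, 75, 63,
    33, 65, 23, 25, 25, 31, 18, 33, 76, 22,
]


def frequency_table(data, lowest, width, k):
    """Build one row per class: limits, frequency, relative and
    cumulative figures, and the class mark (midpoint)."""
    n = len(data)
    rows = []
    cumulative = 0
    for i in range(k):
        lower = lowest + i * width
        upper = lower + width - 1          # upper limit = lower + (width - 1)
        freq = sum(lower <= x <= upper for x in data)
        cumulative += freq
        rows.append({
            "class": f"{lower}-{upper}",
            "freq": freq,
            "rel_pct": 100 * freq / n,
            "cum_freq": cumulative,
            "cum_pct": 100 * cumulative / n,
            "midpoint": (lower + upper) / 2,
        })
    return rows


table = frequency_table(ages, lowest=16, width=9, k=7)
for row in table:
    print(row["class"], row["freq"], f'{row["rel_pct"]:.0f}%',
          row["cum_freq"], f'{row["cum_pct"]:.0f}%', row["midpoint"])
```

The printed rows should match the table above, class by class.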
VISUALIZING THE DATA
The three most commonly used graphs in research are
1. The histogram
2. The frequency polygon
3. The cumulative frequency graph, or ogive (pronounced o‐jive)
HISTOGRAM
The histogram is a graph that displays the data by using vertical bars of various heights to represent the frequencies.
Example of a histogram, based on the frequency distribution just presented:
[Figure: histogram titled "Number of Crime Deaths". X-axis: Age Brackets (16-24, 25-33, 34-42, 43-51, 52-60, 61-69, 70-78). Y-axis: Number of Occurrences. Bar values: 18, 14, 7, 2, 1, 5, 3.]
FREQUENCY POLYGON
A frequency polygon is a graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies give the heights of the points at the midpoints.
Example of a frequency polygon:
[Figure: frequency polygon titled "Crime Deaths Statistics". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: relative frequency, 0% to 40%. Plotted values: 36%, 28%, 14%, 4%, 2%, 10%, 6%.]
Cumulative Frequency Graph
A cumulative frequency graph, or ogive, is a graph that represents the cumulative frequencies for the classes in a frequency distribution.
Example of a cumulative frequency graph (ogive):
[Figure: ogive titled "Cumulative Frequency of Crime Deaths". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: Frequency Percentage, 30% to 110%. Plotted values: 36%, 64%, 78%, 82%, 84%, 94%, 100%.]
Interpretation
The graphs indicate that the 16-24 age bracket has the highest number of fatalities, at 36%, while the 52-60 age bracket has the lowest, at 2%. The average age of the individuals killed was about 34 (the mean of the 50 ages is 34.48).
This analysis suggests that younger individuals, those in the 16-24 age bracket, are now more involved in crime-related incidents, possibly because of factors such as lifestyle, materialistic indulgence and poverty.
INDIVIDUAL EXERCISE
Make sure that you have graphing paper, a ruler and a calculator.
In a Human Resource behavioural study of employees who smoke in a company, the researcher randomly sampled 40 employees who smoke at least 5 cigarettes per day. The table below shows the number of cigarette sticks smoked by each individual. Determine the average number of cigarettes smoked, the highest number of cigarettes consumed and the lowest number of sticks smoked. Present the data in a frequency distribution table with the corresponding histogram, frequency polygon and cumulative frequency graph.
The following are the numbers of cigarettes smoked by the randomly sampled employees:
10   6  13  14
22  17  15  12
11  18  27  22
13  15  16  18
 8  16  12  11
 8  14  25   7
13  25  16  12
 9  22   9   8
12  15   5  19
11  11  19   9
Frequency Distribution Using Microsoft Excel
Based on Philippine National Police records, a total of 464 men died in crime-related incidents during the first week of July 2010 in the Philippines. Here are the ages of 50 individuals randomly selected from that population. Construct a frequency distribution table.
19  18  70  22  17
23  25  37  26  24
47  69  25  55  25
17  36  30  20  46
24  29  21  35  37
21  27  20  65  24
27  23  65  27  16
40  41  42  75  63
33  65  23  25  25
31  18  33  76  22
Step 1: Start Excel. Enter the data (the steps below use cells J2:N11 for the ages) and create a table for the results.
Maximum Value
Step 2-1: Find the maximum and minimum values of the raw data. Starting with the maximum value, select cell B9, next to the MAX label, click the "fx" function, and search for the MAX function. The Function Arguments dialog will follow.
Step 2-2: When the Function Arguments dialog appears, highlight the ages in cells J2 to N11. The highlighted range J2:N11 will appear in the "Number 1" box. Click OK. You will get 76.
Minimum Value
Step 3-1: Find the minimum value by selecting cell B10, next to the MIN label, clicking the "fx" function, and searching for the MIN function. The Function Arguments dialog will follow.
Step 3-2: When the Function Arguments dialog appears, highlight the ages in cells J2 to N11. The highlighted range J2:N11 will appear in the "Number 1" box. Click OK. You will get 16.
Average Value (MEAN)
Step 4: Get the MEAN (average) in the same way: select cell B11, click the "fx" function and choose AVERAGE. You will get 34.48 (rounded off to 34).
Total Number (COUNT)
Step 5: Determine "n" by selecting cell B20, clicking the "fx" function and choosing COUNT. You will get 50.
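For readers who want to cross-check the Excel results outside the spreadsheet, the same four summary values can be computed in Python. This is a sketch only: the built-in `max`, `min` and `len` and `statistics.mean` stand in for Excel's MAX, MIN, COUNT and AVERAGE, and the variable names are assumptions for illustration.

```python
from statistics import mean

# Ages of the 50 sampled men (the values entered in cells J2:N11).
ages = [
    19, 18, 70, 22, 17, 23, 25, 37, 26, 24,
    47, 69, 25, 55, 25, 17, 36, 30, 20, 46,
    24, 29, 21, 35, 37, 21, 27, 20, 65, 24,
    27, 23, 65, 27, 16, 40, 41, 42, 75, 63,
    33, 65, 23, 25, 25, 31, 18, 33, 76, 22,
]

highest = max(ages)    # Excel: MAX
lowest = min(ages)     # Excel: MIN
average = mean(ages)   # Excel: AVERAGE
count = len(ages)      # Excel: COUNT

print(highest, lowest, round(average, 2), count)  # 76 16 34.48 50
```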
RANGE
Step 6: Determine the Range (R) by selecting cell D9 and entering the following formula in the formula bar: =B9-B10, which gives 76 – 16 = 60. Cell B9 holds the highest value, 76, and cell B10 holds the lowest value, 16.
NUMBER OF CLASSES
Step 7-1: Determine the tentative number of classes (k). Go to cell D10, click the "fx" function, select LOG, and then click OK.
Step 7-2: The Function Arguments dialog will appear. Click the symbol at the end of the "Number" box, click cell E20, then click the symbol below the X, and click OK.
Step 7-3: In the formula bar, add * to signify multiplication, then click cell E21, which holds the value 3.32.
Step 7-4: Enclose the formula LOG(E20)*E21 in parentheses, type "+", and then click cell F9, which holds the value 1. You will get 6.64. Round 6.64 up to 7 and type 7 in cell E10.
CLASS WIDTH
Step 8: Find the class width (size) by dividing the range by the number of classes. You will get 8.57. Round this up to 9, and then type 9 in cell E11.
LOWER LIMITS
Step 9-1: Write the classes or categories starting with the 1st lower limit (LL) by typing 16 in cell C13. For the 2nd lower limit, enter the formula =C13+$E$11 in cell C14, which gives 25. The dollar signs in $E$11 keep the reference to the value of 9 in E11 constant; they appear when you press F4 after clicking E11. Then copy cell C14 and paste it into cells C15:C19.
UPPER LIMITS
Step 9-2: For the first upper limit (UL), enter the following formula in cell D13: =$E$11-$F$9+C13, which gives 24. Again, the $ signs keep the references to cells E11 and F9, which hold the values 9 and 1, constant. Then copy cell D13 and paste it into cells D14:D19.
FREQUENCIES
Step 10-1: Determine the frequencies. Go to cell D2, type "=", and then click cell D13. Copy cell D2 and paste it into cells D3:D8.
Step 10-2: Highlight cells E2:E8, click the "fx" function, select FREQUENCY, and then click OK.
Step 10-3: After you click OK, the Function Arguments dialog appears with the "Data_array" and "Bins_array" boxes.
Step 10-4: Click the symbol at the end of "Data_array". In the dialog that appears, click the symbol below the X, then highlight cells J2:N11.
Step 10-5: For "Bins_array", click the symbol at the end of the box. In the dialog that appears, click the symbol under the X, then highlight cells D2:D8.
Step 10-6: Press Ctrl, Shift and Enter together, then release them at the same time. The frequencies will automatically appear in cells E2:E8.
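To make clear what the FREQUENCY function is doing with the bins, here is a rough Python sketch of the same counting rule: each value is counted in the first bin whose upper limit it does not exceed. The function name `frequency` is an assumption for illustration and this is not Excel's implementation. Like Excel's FREQUENCY, it returns one extra count at the end for values above the last bin.

```python
def frequency(data, bins):
    """Count values per bin, where each bin is given by its upper limit.

    A value x is counted in the first bin whose upper limit is >= x.
    The extra last element counts values above the highest bin.
    Assumes the bins are sorted in ascending order.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        for i, upper in enumerate(bins):
            if x <= upper:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # larger than every bin limit
    return counts


# Ages from cells J2:N11 and the upper class limits from cells D2:D8.
ages = [
    19, 18, 70, 22, 17, 23, 25, 37, 26, 24,
    47, 69, 25, 55, 25, 17, 36, 30, 20, 46,
    24, 29, 21, 35, 37, 21, 27, 20, 65, 24,
    27, 23, 65, 27, 16, 40, 41, 42, 75, 63,
    33, 65, 23, 25, 25, 31, 18, 33, 76, 22,
]
upper_limits = [24, 33, 42, 51, 60, 69, 78]

print(frequency(ages, upper_limits))  # [18, 14, 7, 2, 1, 5, 3, 0]
```

The final 0 confirms that no age falls above the last upper limit of 78.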
Step 10-7: Go to cell E13, type "=", and then click cell E2. Copy cell E13 and paste it into cells E14:E19.
Relative Frequencies
Step 11-1: To compute the relative frequency, divide the frequency by the total number of samples (N). In cell F13, enter the following formula: =E13/$E$20
Step 11-2: Copy cell F13 and paste it into cells F14:F19.
Cumulative Frequencies
Step 12-1: Starting with cell G13, type "=" and click F13, which gives 36%. In cell G14, enter the formula =G13+F14, which gives 64%.
Step 12-2: Copy cell G14 and paste it into cells G15:G19.
CLASS MARKS (midpoints)
Step 13-1: Starting with cell H13, use the AVERAGE function by clicking the "fx" function. Highlight cells C13:D13, and then click OK. The result is 20.
Step 13-2: Copy cell H13, which shows 20, and then paste it into cells H14:H19.
ANALYSIS
Step 14: Enter the data for the minimum range, 16-24 (lowest class), the maximum range, 70-78 (highest class), and the most frequent range, 16-24 (the class that has the highest frequency), together with their corresponding relative frequencies.
Based on the sample of 50 drawn from the 464 men who died in crime-related incidents during the first week of July 2010, the largest share of deaths, 36%, was among men between the ages of 16 and 24, and the smallest share, 2%, was among men between the ages of 52 and 60. The highest age class, 70 to 78, accounts for 6%. The cause of this has yet to be determined.
Step 15: Create a separate frequency distribution table for the purpose of plotting the graphs.
HISTOGRAM
To create a histogram, you need the values for CLASS and FREQUENCIES. Highlight the CLASS and FREQUENCIES columns, and then click the CHART WIZARD.
Select "Column" and the 3D type, and then click "Next".
A new dialog will appear showing what the graph looks like. Click "Next".
A new dialog will appear for you to type the title and the labels for the X axis and the Y or Z axis.
Your graph will look like this. It still needs some modifications:
[Figure: draft histogram titled "Number of Crime Deaths". X-axis: Age Brackets (16-24, 25-33, 34-42, 43-51, 52-60, 61-69, 70-78). Y-axis: Number of Occurrences, 0 to 20. Legend entry: "Freq".]
Delete the "Freq" legend on the right, and adjust the position of the "Number of Occurrences" axis title by clicking "Format Axis Title".
Click the "Alignment" tab, type 90 in the box for degrees, and then click OK.
Change the color of the wall, remove the grid lines and show the values, as in the final graph:
[Figure: final histogram titled "Number of Crime Deaths". X-axis: Age Brackets (16-24, 25-33, 34-42, 43-51, 52-60, 61-69, 70-78). Y-axis: Number of Occurrences, 0 to 18. Bar values: 18, 14, 7, 2, 1, 5, 3.]
FREQUENCY POLYGON
To create a frequency polygon, you need the values for RELATIVE FREQUENCIES and CLASS MARKS (MIDPOINTS). First, click the CHART WIZARD, select LINE, and click NEXT.
Highlight the values of Relative Frequency, and then click the SERIES tab.
On the SERIES tab, click the symbol in the box next to "Category (X) axis labels", highlight the midpoint values, and then click NEXT.
A new dialog will appear for you to enter the title and the x-axis and y-axis labels. Then click FINISH.
The following graph will appear. It still needs improvements such as changing the wall color and removing the "Series1" legend and the grid lines.
[Figure: draft frequency polygon titled "Crime Death Statistics". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: Frequency Percentage, 0% to 40%. Legend entry: "Series1".]
After making all the changes, your frequency polygon should look like this:
[Figure: final frequency polygon titled "Crime Deaths Statistics". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: 0% to 40%. Plotted values: 36%, 28%, 14%, 4%, 2%, 10%, 6%.]
Cumulative Frequency Graph
To create a cumulative frequency graph, you need the values for CUMULATIVE FREQUENCIES and CLASS MARKS (MIDPOINTS). First, click the CHART WIZARD, select LINE, and click NEXT.
Highlight the values of Cumulative Frequency, and then click the SERIES tab.
On the SERIES tab, click the symbol in the box next to "Category (X) axis labels", highlight the midpoint values, and then click NEXT.
A new dialog will appear for you to enter the title and the x-axis and y-axis labels. Then click FINISH.
The following graph will appear. It still needs improvements such as changing the wall color and removing the "Series1" legend and the grid lines.
[Figure: draft cumulative frequency graph titled "Crime Death Statistics". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: Cumulative Frequencies, 0% to 120%. Legend entry: "Series1".]
After making all the changes, your cumulative frequency graph should look like this:
[Figure: final ogive titled "Cumulative Frequency of Crime Deaths". X-axis: Age Brackets (midpoints 20, 29, 38, 47, 56, 65, 74). Y-axis: Frequency Percentage, 30% to 110%. Plotted values: 36%, 64%, 78%, 82%, 84%, 94%, 100%.]
EXERCISE
In 2009, 170 individuals took the ISO 9001: 2008 Internal Quality Audit (IQA) Training examination. At 95% accuracy, 120 samples were taken. The following table shows the IQA scores of the 120 samples. With a passing mark of 60%, determine the percentage of those who failed, the percentage of those who passed, and the score bracket with the highest frequency. Present the results in a frequency distribution table with the corresponding histogram, frequency polygon and cumulative frequency polygon.

62 78 56 83 91 82 87 74 80 83
85 79 89 91 69 89 88 81 83 79
92 87 60 87 83 81 82 78 76 69
78 81 66 83 87 73 83 84 54 74
76 83 75 89 85 95 87 55 78 94
86 88 86 88 83 87 93 78 89 68
88 65 78 97 65 86 66 85 72 75
82 86 79 93 87 82 78 88 67 83
87 48 81 95 81 82 86 87 71 81
71 81 88 77 78 79 85 79 65 75
69 82 63 89 91 93 84 89 88 78
88 86 85 81 78 86 87 72 85 77
EXERCISE
Enter your data in a layout similar to this.
Create your frequency table template for the computation.
ORIGINS OF PARETO
The Italian economist Vilfredo Pareto (1848 – 1923) first thought out the Pareto Diagram in 1897.
His famous observation, known as the 80/20 Rule, came from his general observation back in 1906 that 80% of the properties in Italy were owned by 20% of the population.
This forms the principle that 80% of the OUTCOMES come from 20% of the INPUTS.
It was Dr. J. M. Juran who popularized the principle and applied it in quality control to classify quality problems into the vital few and the trivial many, an approach known as PARETO ANALYSIS.
Juran stated that in many cases most defects, and their costs, arise from a relatively small number of causes, an idea that became the PARETO DIAGRAM.
Pareto Diagram DATA

Code  Complaint Cause             Freq.  Rel. Freq.  Cum. Freq. %
C1    Too long on hold              157      42.90%        42.90%
C2    No evening/weekend staff       83      22.68%        65.57%
C3    Not knowledgeable              41      11.20%        76.78%
C4    Not courteous                  22       6.01%        82.79%
C5    Transferred too many times     20       5.46%        88.25%
C6    Could not locate file          11       3.01%        91.26%
C7    No phone payment option        11       3.01%        94.26%
C8    Hard to understand              9       2.46%        96.72%
C9    Charged more than promised      7       1.91%        98.63%
C10   Others                          5       1.37%       100.00%
      Total                         366
Pareto Diagram
[Figure: Pareto diagram "Customer Service Complaints". X-axis: Complaints (C1 to C10). Y-axis: Cumulative Frequencies (0% to 100%). Cumulative line labels from C1 to C10: 42.90%, 65.57%, 76.78%, 82.79%, 88.25%, 91.26%, 94.26%, 96.72%, 98.63%, 100.00%.]
Analysis of Data
The sample Pareto Diagram indicates that the three biggest customer complaints are codes C1, C2 and C3. These top three categories have the most significant effect on customer service dissatisfaction: together they represent a cumulative total of 76.78% of all complaints.
Based on this analysis, the company needs to concentrate on these top three complaints to improve customer satisfaction.
In this case, the company can come up with corrective and preventive measures by analyzing the root causes of these top three complaints.
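The selection of the top complaints can be sketched in Python. This is only an illustration: the handout picks the top three categories by reading the chart, while the sketch assumes a cutoff of 80% taken from the 80/20 Rule, and the function name `vital_few` is hypothetical.

```python
# Cumulative percentages copied from the Pareto table in this handout.
cumulative_pct = {
    "C1": 42.90, "C2": 65.57, "C3": 76.78, "C4": 82.79, "C5": 88.25,
    "C6": 91.26, "C7": 94.26, "C8": 96.72, "C9": 98.63, "C10": 100.00,
}

CUTOFF_PCT = 80.0  # assumed cutoff; the handout does not state one


def vital_few(cum_pct, cutoff):
    """Return the leading categories whose cumulative share stays within the cutoff."""
    selected = []
    # Dictionaries keep insertion order, so categories are visited
    # from the most frequent complaint to the least frequent.
    for code, cum in cum_pct.items():
        if cum > cutoff:
            break
        selected.append(code)
    return selected


top = vital_few(cumulative_pct, CUTOFF_PCT)
print(top, f"{cumulative_pct[top[-1]]:.2f}%")
```

With this assumed cutoff the sketch selects C1, C2 and C3, the same three categories named in the analysis. A different cutoff would select a different set, so the choice should be agreed on before the data are analyzed.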
Corrective Action Measures
TOO LONG ON HOLD –
This could be due to a shortage of staff
This could be because of poor training of the customer service representatives (CSRs)
This could be due to poor supervision by team leaders
Investigate these three probable causes
Corrective Action Measures
NO EVENING/WEEKEND STAFF –
The company operates only during the daytime. Investigate the cost-benefit of extending working hours to include evening shifts and weekends
This might also ease the number 1 problem: if the hours of operation are longer, customers can spread out their calling times, putting less strain on the CSRs
Corrective Action Measures
NOT KNOWLEDGEABLE –
This relates to the first problem and its probable causes
As this is the third biggest customer complaint, it is imperative that the CSRs are trained and made aware of the importance of quality and of the impact on the business of not handling calls efficiently.
Pareto Chart using Excel
Start with the type of problems you want to investigate.
For the purpose of this illustration, we will use the example of the customer service complaints.
PROBLEM:
A newly established telecommunications company normally operates Monday – Friday, 8 AM to 5 PM. In its first 3 months of operation, the company logged 366 customer service related complaints.
These complaints were categorized, coded and arranged from the most frequent complaint to the least frequent, as shown below:
Pareto Chart using Excel
Compute the relative frequency. In cell D2, enter the formula =C2/$C$12, which gives 42.9%. Copy cell D2 and paste it into cells D3:D11. The relative frequencies should add up to 100%, as shown below.
Pareto Chart using Excel
Compute the cumulative frequency. In cell E2, type “=” and then click on D2, which gives 42.90%. In cell E3, enter the formula =E2+D3, which adds 42.90% and 22.68% to give 65.57% (Excel adds the unrounded values, so the displayed result can differ by 0.01 from the sum of the rounded figures). Copy cell E3 and paste it into cells E4:E11. The cumulative frequency should reach 100% at cell E11, as shown below.
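The two worksheet formulas can be cross-checked outside Excel. The following is a minimal Python sketch, assuming the complaint counts sit in cells C2:C11 with their total in C12 as described above; the variable names are illustrative only.

```python
# Complaint codes and counts, in the same order as rows 2 to 11.
codes = ["C1", "C2", "C3", "C4", "C5", "C6", "C7", "C8", "C9", "C10"]
counts = [157, 83, 41, 22, 20, 11, 11, 9, 7, 5]

total = sum(counts)  # plays the role of cell C12

# Column D: relative frequency, the equivalent of =C2/$C$12 filled down.
relative = [count / total for count in counts]

# Column E: the first cell equals D2, and each later cell adds the current
# relative frequency to the previous cumulative value (=E2+D3 filled down).
cumulative = []
running = 0.0
for rel in relative:
    running += rel
    cumulative.append(running)

# Round only for display, as Excel does with percentage formatting.
rel_pct = [round(100 * r, 2) for r in relative]
cum_pct = [round(100 * c, 2) for c in cumulative]

for code, count, rel, cum in zip(codes, counts, rel_pct, cum_pct):
    print(f"{code:<4}{count:>5}{rel:>8.2f}%{cum:>9.2f}%")
```

Because the running total is built from unrounded values, the second row prints 65.57% rather than 65.58%, which matches the worksheet.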
Pareto Chart using Excel
Make the Pareto Diagram (chart). Note that codes were used in the Pareto table for simplicity and for a cleaner graphical presentation. For the Pareto Diagram, you need the values for the categories, the relative frequencies and the cumulative frequencies. To start the chart, hold down CTRL while you highlight cells A2:A11, D2:D11 and E2:E11 as shown below.
Pareto Chart using Excel
Now click the CHART WIZARD icon, select COLUMN, and then click NEXT.
Pareto Chart using Excel
In the next dialog, type the chart title and the x-axis and y-axis labels as shown below, and then click FINISH.
Pareto Chart using Excel
After you click FINISH, the following chart will appear. Note that modifications are still needed to turn it into a Pareto diagram.
Pareto Chart using Excel
Click the cumulative frequency series on the graph as shown below. Then right-click and select CHART TYPE.
Pareto Chart using Excel
In the chart type dialog, select LINE as shown below, and then click OK.
Pareto Chart using Excel
After you click OK, the Pareto Chart will show up. Note that modifications are still needed on this chart.
Pareto Chart using Excel
After the modifications to the chart, the final Pareto Diagram will look like this:
[Figure: Pareto diagram "Customer Service Complaints". X-axis: Complaints (C1 to C10). Y-axis: Cumulative Frequencies (0% to 100%). Cumulative line labels from C1 to C10: 42.90%, 65.57%, 76.78%, 82.79%, 88.25%, 91.26%, 94.26%, 96.72%, 98.63%, 100.00%.]
EXERCISE
In a human resources behavioural study of tardiness, the following data were gathered on the reasons employees gave for coming late to work. Create the Pareto Table and Pareto Diagram (Chart) based on these data.

Reason           Number
Got lost              1
Bad weather          57
Sick                 17
Traffic              11
Appointment           4
No parking spot       2
Woke up late        103
Bus late              1
Flat tire             1
Lost keys             3
SOLUTION

Tardiness Reason  Number  % Total    Cum %
Woke up late         103   51.50%   51.50%
Bad weather           57   28.50%   80.00%
Sick                  17    8.50%   88.50%
Traffic               11    5.50%   94.00%
Appointment            4    2.00%   96.00%
Lost keys              3    1.50%   97.50%
No parking spot        2    1.00%   98.50%
Bus late               1    0.50%   99.00%
Flat tire              1    0.50%   99.50%
Got lost               1    0.50%  100.00%
Totals               200  100.00%
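The solution table can be reproduced from the raw exercise data with a short Python sketch. It is offered only as a cross-check on the Excel work; the variable names are illustrative, and breaking ties alphabetically is an assumption chosen to match the row order of the solution.

```python
# Raw exercise data, in the order given in the exercise.
reasons = {
    "Got lost": 1, "Bad weather": 57, "Sick": 17, "Traffic": 11,
    "Appointment": 4, "No parking spot": 2, "Woke up late": 103,
    "Bus late": 1, "Flat tire": 1, "Lost keys": 3,
}

# A Pareto table lists categories from most to least frequent.
# Ties are broken alphabetically (an assumption, see the note above).
rows = sorted(reasons.items(), key=lambda item: (-item[1], item[0]))

total = sum(reasons.values())

table = []
running = 0  # running count, so the cumulative share has no rounding drift
for reason, count in rows:
    running += count
    pct_total = round(100 * count / total, 2)
    cum_pct = round(100 * running / total, 2)
    table.append((reason, count, pct_total, cum_pct))

for reason, count, pct_total, cum_pct in table:
    print(f"{reason:<16}{count:>6}{pct_total:>8.2f}%{cum_pct:>8.2f}%")
print(f"{'Totals':<16}{total:>6}")
```

The first two rows already account for 80% of all late arrivals, which is the pattern the Pareto Diagram is meant to expose.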