© 2003 Prentice-Hall, Inc.
Chapter 2: Presenting Data in Tables and Charts
Business Statistics (9th Edition)
Chapter Topics
• Guidelines to Analyze Data
• Organizing Numerical Data
  • The Ordered Array and Stem-Leaf Display
• Tabulating and Graphing Univariate Numerical Data
  • Frequency Distributions: Tables, Histograms, Polygons
  • Describing Distribution: Shape, Center and Spread
  • Cumulative Distributions: Tables, the Ogive
• Graphing Bivariate Numerical Data
Chapter Topics (continued)
• Displaying Categorical Data
• Tabulating and Graphing Univariate Categorical Data
  • The Summary Table
  • Bar and Pie Charts, the Pareto Diagram
• Tabulating and Graphing Bivariate Categorical Data
  • Contingency Tables
  • Side by Side Bar Charts
• Case Study: Titanic Data
• Graphical Excellence and Common Errors in Presenting Data
Guidelines to Analyze Data
• First learn something about the context: What was measured? What are the units? How was the measurement carried out? Were the data measured for a particular purpose?
• Then make a picture. It is sometimes said that there are three rules for starting a data analysis: plot the data, plot the data, and plot the data.
• Look for an overall pattern and for deviations from that pattern. Such deviations are called outliers.
4.1 See a real world problem in which a distribution is needed.
Organizing Numerical Data
[Diagram: Numerical Data; Ordered Array; Stem and Leaf Display; Tables; Frequency Distributions; Cumulative Distributions; Histograms; Polygons; Ogive]
Raw data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-leaf display:
2 | 1 4 4 6 7 7
3 | 0 2 8
4 | 1
Organizing Numerical Data (continued)
Data in Raw Form (as Collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Data in Ordered Array from Smallest to Largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-Leaf Display:
2 | 1 4 4 6 7 7
3 | 0 2 8
4 | 1
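The stem-and-leaf display can be reproduced with a few lines of Python. This is a minimal sketch, assuming two-digit whole numbers so that the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative and not part of the slides.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); units digits are the leaves."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)

raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
display = stem_and_leaf(raw)

# Print one row per stem, leaves in ascending order.
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Sorting before grouping is what turns the raw data into the ordered array; the display is that array regrouped by stem.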
Tabulating and Graphing Numerical Data
[Diagram: Numerical Data; Ordered Array; Stem and Leaf Display; Tables; Frequency Distributions; Cumulative Distributions; Histograms; Polygons; Ogive. Thumbnail plots of an ogive (0 to 120 against 10 to 60) and a histogram (0 to 7 against 10 to 60).]
Tabulating Numerical Data: Frequency Distributions
• Sort Raw Data in Ascending Order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find Range: 58 - 12 = 46
• Select Number of Classes: 5 (usually between 5 and 15)
• Compute Class Interval (Width): 10 (46/5 = 9.2, then round up)
• Determine Class Boundaries (Limits): 10, 20, 30, 40, 50, 60
• Compute Class Midpoints: 15, 25, 35, 45, 55
• Count Observations & Assign to Classes
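The steps above can be sketched in Python. This is an illustration only: rounding the width up to the next multiple of 10 and starting the first class at a multiple of the width are assumptions chosen to match the slide's boundaries, not a general rule.

```python
import math

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

num_classes = 5
data_range = max(data) - min(data)                  # 58 - 12 = 46

# Raw width is 46 / 5 = 9.2; round up to a convenient value (assumed: multiple of 10).
width = math.ceil(data_range / num_classes / 10) * 10

# Assumed convention: start the first class at a multiple of the width.
start = (min(data) // width) * width
boundaries = [start + i * width for i in range(num_classes + 1)]
classes = list(zip(boundaries, boundaries[1:]))
midpoints = [(lo + hi) // 2 for lo, hi in classes]

# "lo but under hi": the lower boundary is included, the upper is not.
frequencies = [sum(lo <= x < hi for x in data) for lo, hi in classes]
percentages = [100 * f / len(data) for f in frequencies]

for (lo, hi), f, p in zip(classes, frequencies, percentages):
    print(f"{lo} but under {hi}: frequency {f}, percentage {p:.0f}")
```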
Frequency Distributions, Relative Frequency Distributions and Percentage Distributions
Class             Frequency   Relative Frequency   Percentage
10 but under 20   3           .15                  15
20 but under 30   6           .30                  30
30 but under 40   5           .25                  25
40 but under 50   4           .20                  20
50 but under 60   2           .10                  10
Total             20          1                    100
Data in Ordered Array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Graphing Numerical Data: The Histogram
[Histogram: Frequency (0 to 7) against bins labelled 5, 15, 25, 35, 45, 55, More; bar heights 0, 3, 6, 5, 4, 2, 0. Annotations: No Gaps Between Bars; Class Midpoints; Class Boundaries.]
Data in Ordered Array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Bar Chart
How tall are the tallest soldiers in this group?
1. How many in this group are between 65 and 67 inches tall?
2. If we selected a soldier at random from this group, would you estimate that he or she is more likely to be taller than 65 inches or shorter than 65 inches? (Hint: No calculation is needed. Judge from the way the display looks.)
Graphing Numerical Data: The Frequency Polygon
[Frequency polygon: Frequency (0 to 7) against Class Midpoints labelled 5, 15, 25, 35, 45, 55, More.]
Data in Ordered Array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Tabulating Numerical Data: Cumulative Frequency
Lower Limit   Cumulative Frequency   Cumulative % Frequency
10            0                      0
20            3                      15
30            9                      45
40            14                     70
50            18                     90
60            20                     100
Data in Ordered Array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
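The cumulative columns are running totals of the class frequencies. A minimal sketch, using the frequencies from the earlier frequency distribution (variable names are illustrative):

```python
from itertools import accumulate

boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]

# Nothing lies below the first boundary, so the running total starts at 0.
cumulative = [0] + list(accumulate(frequencies))
total = cumulative[-1]
cumulative_pct = [100 * c / total for c in cumulative]

for limit, c, pct in zip(boundaries, cumulative, cumulative_pct):
    print(f"below {limit}: {c} observations ({pct:.0f}%)")
```

Each row answers "how many observations fall below this boundary?", which is why the ogive is plotted against class boundaries rather than midpoints.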
Graphing Numerical Data: The Ogive (Cumulative % Polygon)
[Ogive: cumulative percentage (0 to 100) against Class Boundaries (Not Midpoints) 10, 20, 30, 40, 50, 60.]
Data in Ordered Array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Graphing Bivariate Numerical Data (Scatter Plot)
[Scatter plot "Mutual Funds Scatter Plot": Total Year to Date Return (%) on the vertical axis (0 to 40) against Net Asset Values on the horizontal axis (0 to 40).]
Tabulating and Graphing Univariate Categorical Data
[Diagram: Categorical Data; Tabulating Data: The Summary Table; Graphing Data: Pie Charts, Pareto Diagram, Bar Charts]
Univariate and Bivariate Analysis by Tables and Charts of Car Data
Variables:
• Miles per Gallon
• Type of Drive
• Weight
Variables
We will be looking at 3 variables relating to cars.
We will use:
• Histograms
• Frequency and Percentage Polygons
• Ogives
• Scatter Plots (Bivariate Data)
The following Data Relates to Front and Rear Wheel Drive Cars
Drive Type and Miles per Gallon (Rear)
Class            Midpoint   Frequency   Percentage   Lower Limit   Cumulative Frequency   Cumulative %
12 to under 14   13         2           8.0%         12            0                      0.00%
14 to under 16   15         4           16.0%        14            0                      0.00%
16 to under 18   17         3           12.0%        16            4                      17.39%
18 to under 20   19         8           32.0%        18            7                      30.43%
20 to under 22   21         4           16.0%        20            15                     65.22%
22 to under 24   23         2           8.0%         22            19                     82.61%
24 to under 26   25         2           8.0%         24            21                     91.30%
26 to under 28   27         0           0.0%         26            23                     100.00%
28 to under 30   29         0           0.0%         28            23                     100.00%
30 to under 32   31         0           0.0%         30            23                     100.00%
32 to under 34   33         0           0.0%         32            23                     100.00%
Total                       25          100.0%
Drive Type and Miles per Gallon (Front)
Class            Midpoint   Frequency   Percentage   Lower Limit   Cumulative Frequency   Cumulative %
12 to under 14   13         0           0.00%        12            0                      0.00%
14 to under 16   15         3           3.70%        14            0                      0.00%
16 to under 18   17         1           1.23%        16            3                      3.70%
18 to under 20   19         12          14.81%       18            4                      4.94%
20 to under 22   21         19          23.46%       20            16                     19.75%
22 to under 24   23         22          27.16%       22            35                     43.21%
24 to under 26   25         11          13.58%       24            57                     70.37%
26 to under 28   27         6           7.41%        26            68                     83.95%
28 to under 30   29         4           4.94%        28            74                     91.36%
30 to under 32   31         3           3.70%        30            78                     96.30%
32 to under 34   33         0           0.00%        32            81                     100.00%
Ogive: Front Wheel Drive
[Ogive: Percentage (0.00% to 120.00%) on the vertical axis; horizontal axis labelled Midpoints, running from 12 to 30.]
What percentage of the Front Wheel Drive cars do:
• More than 19 miles per gallon
• Less than 27 miles per gallon
Estimate the miles per gallon for:
• 50th percentile of Front Wheel Drive cars
• 25th percentile of Front Wheel Drive cars
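Reading values off an ogive amounts to linear interpolation between its plotted points. The sketch below does this for the Front Wheel Drive cumulative frequencies; it assumes cars are spread evenly within each class, and the helper names `percent_below` and `percentile` are illustrative.

```python
# Ogive points for Front Wheel Drive cars: (lower limit, cumulative frequency below it).
lower_limits = [12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
cum_freq = [0, 0, 3, 4, 16, 35, 57, 68, 74, 78, 81]
total = cum_freq[-1]

points = list(zip(lower_limits, cum_freq))
segments = list(zip(points, points[1:]))

def percent_below(mpg):
    """Estimated percentage of cars doing less than `mpg` (read up from the x-axis)."""
    for (x0, c0), (x1, c1) in segments:
        if x0 <= mpg <= x1:
            count = c0 + (c1 - c0) * (mpg - x0) / (x1 - x0)
            return 100 * count / total
    raise ValueError("mpg outside the tabulated range")

def percentile(p):
    """Estimated miles per gallon at the p-th percentile (read across from the y-axis)."""
    target = p / 100 * total
    for (x0, c0), (x1, c1) in segments:
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (x1 - x0) * (target - c0) / (c1 - c0)
    raise ValueError("percentile must be between 0 and 100")

print(f"More than 19 mpg: {100 - percent_below(19):.1f}%")
print(f"Less than 27 mpg: {percent_below(27):.1f}%")
print(f"50th percentile:  {percentile(50):.1f} mpg")
print(f"25th percentile:  {percentile(25):.1f} mpg")
```

These are estimates from grouped data; values read from the chart by eye will differ slightly.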
Ogive: Rear Wheel Drive
[Ogive: Percentage (0.00% to 120.00%) on the vertical axis; horizontal axis labelled Midpoints, running from 12 to 30.]
What percentage of the Rear Wheel Drive cars do:
• More than 19 miles per gallon
• Less than 27 miles per gallon
Estimate the miles per gallon for:
• 50th percentile of Rear Wheel Drive cars
• 25th percentile of Rear Wheel Drive cars
To Compare Distributions
Compare Frequency and Percentage Polygons for Front and Rear Drive Cars
Compare Ogives for Front and Rear Drive Cars
Frequency Polygon: Drive Type and Miles per Gallon
[Frequency polygon with two series, Front and Rear: frequency (0 to 25) against Miles per Gallon (11 to 33).]
Answer the following True or False. This graph shows:
• There are more Rear Wheel Drive cars than Front Wheel Drive cars
• No Front Wheel Drive cars do more than 25 miles per gallon
• There are more Front Wheel Drive cars that do 21 miles per gallon than Rear Wheel Drive cars
Percentage Polygon: Drive Type and Miles per Gallon
[Percentage polygon with two series, Front and Rear: percentage (0% to 35%) against Miles per Gallon (11 to 33).]
Answer the following True or False. This graph shows:
• Overall, the miles per gallon for Rear Wheel Drive cars is not as good as for Front Wheel Drive cars
• 40% of Front Wheel Drive cars do 21 miles per gallon
• There are more Front Wheel Drive cars that do 21 miles per gallon than Rear Wheel Drive cars
Cumulative Percentage Polygon: Drive Type and Miles per Gallon
[Cumulative percentage polygon with two series, Front and Rear: cumulative percentage (0% to 120%) against Miles per Gallon (12 to 32).]
Answer the following True or False. This graph shows:
• Overall, the miles per gallon for Front Wheel Drive cars is better than for Rear Wheel Drive cars
• 80% of Front Wheel Drive cars do less than 21 miles per gallon
Car Weight (lbs) vs. Miles per Gallon
[Scatter plot: Weight (lbs) on the vertical axis (0 to 7,000) against Miles per Gallon on the horizontal axis (0 to 35).]
Answer the following True or False. This graph shows:
• Overall, the weight of the car is associated with the miles per gallon
• The heavier the car, the greater the miles per gallon
Displaying Categorical Data
Three Rules of Data Analysis:
• Make a picture: it will reveal things you cannot see in a table and will help you think clearly.
• Make a picture: a well-designed display will show the important features and patterns in your data, such as missing or wrong data or unexpected patterns.
• Make a picture: it is the best way to tell others about your data.
Frequency Tables
Hair color:  Fair   Red    Medium   Dark    Black
Percentage:  27%    5.3%   39.7%    25.8%   2.2%
1. What is the most common hair color in this group of children?
2. What is the second most common color?
3. Are the categories given here well-defined?
4. If not, how would you improve them?
5. Does this distribution of hair colors resemble the distribution you see among people in Cambodia?
6. If not, how does it differ?
Bar Chart
[Bar chart of hair color: percentage (0% to 45%) for each hair color category.]
1. What is the most common hair color in this group of children?
2. What is the second most common color?
3. Is there a greater difference between the relative frequencies of Red and Black, or between the relative frequencies of Medium and Dark?
Graphing Univariate Categorical Data
[Diagram: Categorical Data; Tabulating Data: The Summary Table; Graphing Data: Pie Charts, Pareto Diagram, Bar Charts. Thumbnail charts for the categories Stocks, Bonds, Savings and CD.]
Tabulating and Graphing Bivariate Categorical Data
Contingency Tables:
                          Commune
Standard of Living   A      B      C      Total
Rich                 8      5      0      13
Medium               15     10     1      26
Poor                 39     39     25     103
Poorest              37     41     50     128
Total                99     95     76     270

                          Commune
Standard of Living   A      B      C      Total
Rich                 8%     5%     0%     5%
Medium               15%    11%    1%     10%
Poor                 39%    41%    33%    38%
Poorest              37%    43%    66%    47%
Total                100%   100%   100%   100%
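The percentage table is the count table with each column divided by its column total. A minimal sketch (the dictionary layout and variable names are illustrative):

```python
counts = {
    "Rich":    {"A": 8,  "B": 5,  "C": 0},
    "Medium":  {"A": 15, "B": 10, "C": 1},
    "Poor":    {"A": 39, "B": 39, "C": 25},
    "Poorest": {"A": 37, "B": 41, "C": 50},
}
communes = ["A", "B", "C"]

# Column totals: number of households counted in each commune.
col_totals = {c: sum(row[c] for row in counts.values()) for c in communes}

# Column percentages: share of each standard of living within a commune.
col_pct = {
    level: {c: round(100 * row[c] / col_totals[c]) for c in communes}
    for level, row in counts.items()
}

for level, row in col_pct.items():
    print(level, row)
```

Because every column is rescaled to 100%, communes of different sizes can be compared directly.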
Pie Chart (Analyzing Standard of Living in 4 Communes)
[Pie chart of Standard of Living: Rich 5%, Medium 10%, Poor 38%, Poorest 47%.]
Bivariate Categorical Data (for Standard of Living by Commune)
[Side-by-side bar chart: Frequency (0 to 60) for Rich, Medium, Poor and Poorest, with one bar per Commune (A, B, C).]
Stacked Bar Chart Bivariate Categorical Data
[Stacked bar chart: one bar per commune (A, B, C), each divided from 0% to 100% into Rich, Medium, Poor and Poorest.]
Which commune has the highest percentage of Poor?
Pareto Diagram
Axis for line graph shows cumulative % invested
Axis for bar chart shows number in each category
[Pareto diagram "Pareto of Returned Phones": bars show Count (0 to 6) of Returned Phones; line shows Cumulative Percentage (0 to 120%).]
Case Study: Titanic Data
Survived   Age     Sex      Class
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
Dead       Adult   Male     Crew
Dead       Adult   Male     Crew
Alive      Adult   Female   First
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
• Part of a table detailing the Titanic
• The problem with this table, and in fact with all tables like it, is that you cannot really see what is going on
• We need to show patterns, relationships, trends and even exceptions
What are the variables?
Frequency Tables
•The Count provides the Frequency of each category.
•The Variable CLASS has only 4 categories so it is very easy to read
•The Percentage Provides the Relative Frequency
What percentage was not crew?
Example: Create a Frequency Table
Survived   Age     Sex      Class
Dead       Adult   Male     Third
Alive      Adult   Male     Crew
Dead       Adult   Male     Third
Dead       Child   Male     Crew
Dead       Adult   Male     Crew
Alive      Child   Male     Crew
Alive      Adult   Female   First
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
Example: Create 3 Frequency Tables
1. Survived
2. Age
3. Class
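A frequency table is a count of how often each category occurs. One way to build the three tables from the nine rows above is sketched here with `collections.Counter`; the function name `frequency_table` and the tuple layout are illustrative choices.

```python
from collections import Counter

columns = ("Survived", "Age", "Sex", "Class")
rows = [
    ("Dead",  "Adult", "Male",   "Third"),
    ("Alive", "Adult", "Male",   "Crew"),
    ("Dead",  "Adult", "Male",   "Third"),
    ("Dead",  "Child", "Male",   "Crew"),
    ("Dead",  "Adult", "Male",   "Crew"),
    ("Alive", "Child", "Male",   "Crew"),
    ("Alive", "Adult", "Female", "First"),
    ("Dead",  "Adult", "Male",   "Third"),
    ("Dead",  "Adult", "Male",   "Crew"),
]

def frequency_table(variable):
    """Count (frequency) of each category of one variable."""
    index = columns.index(variable)
    return Counter(row[index] for row in rows)

for variable in ("Survived", "Age", "Class"):
    table = frequency_table(variable)
    print(variable)
    for category, count in table.most_common():
        # Relative frequency as a percentage of all rows.
        print(f"  {category}: {count} ({100 * count / len(rows):.1f}%)")
```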
Bar Charts
Displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison.
[Bar chart "People on the Titanic by Ticket Class": Frequency (0 to 800) by class; legend entries First, Second, Third.]
What does this graph tell us?
Example: Create 6 Bar Charts for data on previous slide.
1. Survived: A Frequency B Percentage
2. Age: A Frequency B Percentage
3. Class: A Frequency B Percentage
Pie Charts
Pie charts show the whole group of cases as a circle. They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.
[Pie chart of class: First 15%, Second 13%, Third 32%, Crew 40%.]
Were there more crew or Third class passengers?
Example: Draw 6 Pie Charts for data on previous slide.
1. Survived: A Frequency B Percentage
2. Age: A Frequency B Percentage
3. Class: A Frequency B Percentage
Bivariate Table (Cross Tabulations)
Was there a relationship between the CLASS and the chance of surviving?
We need to look at Class and Survival together on a contingency table.
Marginal Distribution
Example: Cross Tabulations
Create a cross tabulation for:
1. Survived and Age
2. Survived and Sex
3. Age and Sex
Survived   Age     Sex      Class
Dead       Adult   Male     Third
Alive      Adult   Male     Crew
Dead       Adult   Male     Third
Dead       Child   Male     Crew
Dead       Adult   Male     Crew
Alive      Child   Male     Crew
Alive      Adult   Female   First
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
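A cross tabulation counts each combination of two variables. The sketch below shows one possible approach using the nine rows above; `crosstab` is an illustrative helper, not a library function.

```python
from collections import Counter

columns = ("Survived", "Age", "Sex", "Class")
rows = [
    ("Dead",  "Adult", "Male",   "Third"),
    ("Alive", "Adult", "Male",   "Crew"),
    ("Dead",  "Adult", "Male",   "Third"),
    ("Dead",  "Child", "Male",   "Crew"),
    ("Dead",  "Adult", "Male",   "Crew"),
    ("Alive", "Child", "Male",   "Crew"),
    ("Alive", "Adult", "Female", "First"),
    ("Dead",  "Adult", "Male",   "Third"),
    ("Dead",  "Adult", "Male",   "Crew"),
]

def crosstab(row_var, col_var):
    """Count every (row category, column category) pair that occurs."""
    i, j = columns.index(row_var), columns.index(col_var)
    return Counter((row[i], row[j]) for row in rows)

survived_by_age = crosstab("Survived", "Age")
survived_by_sex = crosstab("Survived", "Sex")
age_by_sex = crosstab("Age", "Sex")

for (survived, age), count in sorted(survived_by_age.items()):
    print(f"{survived:5} {age:5} {count}")
```

Combinations that never occur (for example a dead female in this sample) are simply absent from the counter and count as zero.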
Row Percentages indicate Chance
How can the presentation be improved?
Conditional Distributions make it clearer
• We have redefined the WHO of the study into two groups: who was ALIVE and who was DEAD
• A conditional distribution is the distribution of one variable for only those cases that satisfy some condition on another variable
If the row percentages are the same, the distributions are INDEPENDENT of Class.
How can the presentation be improved?
Clustered Bar Chart comparing percentage of Alive and Dead in Titanic
[Bar chart: one bar each for Alive and Dead, scaled 0% to 100% and divided by Class into Crew, Third, Second and First.]
Did the Crew, Third Class, Second Class and First Class have the same chance of survival (being Alive or Dead)?
Each bar is treated as the "whole" and is divided proportionally into segments corresponding to the percentage in each group.
Data changed to show an Equal Chance of Survival (Modified)
                      Class
Survival   First   Second   Third   Crew   Total
Alive      202     118      178     212    710
Dead       101     59       89      106    355
Total      303     177      267     318    1065

           First   Second   Third   Crew   Total
Alive      202     118      178     212    710
% Row      28%     17%      25%     30%    100%

           First   Second   Third   Crew   Total
Dead       101     59       89      106    355
% Row      28%     17%      25%     30%    100%
• The Conditional Distributions are based on the category Survival.
• Note that the % Row values are now equal.
This means the chance of being alive or dead is the same in each Class. Thus, Survival is independent of Class.
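The independence claim for the modified data can be checked numerically: the row percentages should match across Alive and Dead, and the share alive should be the same in every class. A minimal sketch using exact fractions to avoid rounding questions (variable names are illustrative):

```python
from fractions import Fraction

classes = ["First", "Second", "Third", "Crew"]
table = {
    "Alive": {"First": 202, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 101, "Second": 59,  "Third": 89,  "Crew": 106},
}

# Row percentages: distribution of Class among the Alive and among the Dead.
row_pct = {
    status: {c: round(100 * n / sum(row.values())) for c, n in row.items()}
    for status, row in table.items()
}

# Column view: exact share of each class that is alive.
alive_share = {
    c: Fraction(table["Alive"][c], table["Alive"][c] + table["Dead"][c])
    for c in classes
}

independent = (row_pct["Alive"] == row_pct["Dead"]
               and len(set(alive_share.values())) == 1)

print(row_pct)
print(alive_share)
print("Survival independent of Class:", independent)
```

With real data the percentages will rarely match exactly, so an exact equality test like this only suits a constructed example such as this one.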
Clustered Bar Chart for Equal Chance of Survival
[Bar chart: one bar each for Alive and Dead, scaled 0% to 100% and divided by Class into Crew, Third, Second and First.]
Alternatively, create Conditional Distributions split on Class (By Columns)
Modified Data to show equal Chance (count, with % Column in brackets)
         First        Second       Third        Crew
Alive    202 (67%)    118 (67%)    178 (67%)    212 (67%)
Dead     101 (33%)    59 (33%)     89 (33%)     106 (33%)
Total    303 (100%)   177 (100%)   267 (100%)   318 (100%)
The passengers in First Class have an equal chance of survival as they do in Second Class, as they do in Third Class, and as they do in Crew.
Pie Charts are also a good choice when you are primarily interested in percentages
[Two pie charts, ALIVE and DEAD, each divided into First, Second, Third and Crew.]
What does this show? (See Hidden Slide)
The dead (non-survivors) are mostly crew and 3rd class passengers; survivors, on the other hand, are nearly evenly split across the classes.
These pie charts use the original Titanic data.
Examining Contingency Tables STEP-BY-STEP
• Think (Variable): Identify the variables and report the W's. Be certain that the data are counts and that the categories do not overlap.
• Show (Mechanics): Make an appropriate display to see whether there is a difference in the relative proportions. Bar charts may work equally well.
• Tell (Interpretation): Discuss the pattern in the tables and displays and, if you can, flag any issues to managers to inform them of any consequences. This is extremely important when conducting research.
Example: Cross Tabulations and Graphs
You have created a cross tabulation for:
1. Survived and Age
2. Survived and Sex
3. Age and Sex
Work out row percentages and then display them on a stacked bar chart or pie chart.
Survived   Age     Sex      Class
Dead       Adult   Male     Third
Alive      Adult   Male     Crew
Dead       Adult   Male     Third
Dead       Child   Male     Crew
Dead       Adult   Male     Crew
Alive      Child   Male     Crew
Alive      Adult   Female   First
Dead       Adult   Male     Third
Dead       Adult   Male     Crew
Principles of Graphical Excellence
• Well-Designed Presentation of Data that Provides: Substance, Statistics, Design
• Communicates Complex Ideas with Clarity, Precision and Efficiency
• Gives the Largest Number of Ideas in the Most Efficient Manner
• Almost Always Involves Several Dimensions
• Tells the Truth about the Data
Errors in Presenting Data
• Using ‘Chart Junk’
• No Relative Basis in Comparing Data between Groups
• Compressing the Vertical Axis
• No Zero Point on the Vertical Axis
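‘Chart Junk’
[Two charts titled "Minimum Wage", one labelled Bad Presentation and one labelled Good Presentation, show the same data: 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80. One chart has a $ axis from 0 to 4 with the years 1960, 1970, 1980, 1990 on the horizontal axis.]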
No Relative Basis
[Two charts titled "A’s received by students", one labelled Bad Presentation and one labelled Good Presentation. One plots Freq. (0 to 300) and the other plots % (10 to 30), both by class: FR, SO, JR, SR.]
FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
Compressing Vertical Axis
[Two charts titled "Quarterly Sales", one labelled Bad Presentation and one labelled Good Presentation, plot $ by quarter (Q1 to Q4). One uses a vertical axis from 0 to 50, the other from 0 to 200.]
No Zero Point on Vertical Axis
[Two charts titled "Monthly Sales", one labelled Bad Presentation and one labelled Good Presentation, plot $ by month (J F M A M J). One vertical axis is marked 0, 36, 39, 42, 45; the other is marked 36, 39, 42, 45 with no zero point.]
Graphing the first six months of sales
Chapter Summary
• Organized Numerical Data
  • The Ordered Array and Stem-Leaf Display
• Tabulated and Graphed Univariate Numerical Data
  • Frequency Distributions: Tables, Histograms, Polygons
  • Cumulative Distributions: Tables, the Ogive
• Graphed Bivariate Numerical Data
Chapter Summary (continued)
• Tabulated and Graphed Univariate Categorical Data
  • The Summary Table
  • Bar and Pie Charts, the Pareto Diagram
• Tabulated and Graphed Bivariate Categorical Data
  • Contingency Tables
  • Side by Side Charts
• Discussed Graphical Excellence and Common Errors in Presenting Data