Chapter 3: Graphical and Numerical Summaries of Qualitative Data

Chapter 3 Graphical and Numerical Summaries of Qualitative Data UNIT OBJECTIVES At the conclusion of this unit you should be able to: 1) Construct graphs that appropriately describe data 2) Calculate and interpret numerical summaries of a data set. 3) Combine numerical methods with graphical methods to analyze a data set.

Upload: griffin-munoz

Post on 02-Jan-2016


Page 1: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Chapter 3: Graphical and Numerical Summaries of Qualitative Data

UNIT OBJECTIVES

At the conclusion of this unit you should be able to:

1) Construct graphs that appropriately describe data.

2) Calculate and interpret numerical summaries of a data set.

3) Combine numerical methods with graphical methods to analyze a data set.

Page 2: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Displaying Qualitative Data

“Sometimes you can see a lot just by looking.”

Yogi Berra

Hall of Fame Catcher, NY Yankees

Page 3: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

The three rules of data analysis won’t be difficult to remember:

1. Make a picture: it reveals aspects not obvious in the raw data and helps you think clearly about the patterns and relationships that may be hiding in your data.

2. Make a picture: it shows important features of and patterns in the data. You may also see things that you did not expect, such as extraordinary (possibly wrong) data values or unexpected patterns.

3. Make a picture: the best way to tell others about your data is with a well-chosen picture.

Page 4: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Bar Charts: show counts or relative frequencies for each category

Example: Titanic passenger/crew distribution

[Figure: bar chart "Titanic Passengers by Class"; vertical axis from 0 to 1000. Bar heights: Crew 885, First 325, Second 285, Third 706.]

Page 5: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Pie Charts: show the proportion of the whole in each category

Example: Titanic passenger/crew distribution

[Figure: pie chart "Titanic Passengers by Class". Slices: Crew 40%, First 15%, Second 13%, Third 32%.]
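The pie-chart percentages are simply the bar-chart counts divided by the total number of people on board. A minimal Python sketch of that arithmetic, using the Titanic counts from this chapter (the variable names are illustrative only):

```python
# Titanic counts by class, as shown on the bar chart.
counts = {"Crew": 885, "First": 325, "Second": 285, "Third": 706}

total = sum(counts.values())  # everyone on board

# Relative frequency (in percent) = category count / total count * 100.
percents = {group: 100 * n / total for group, n in counts.items()}

for group, pct in percents.items():
    print(f"{group:<7}{counts[group]:>5}{pct:>7.1f}%")
```

Rounded to whole percents, these values match the labels on the pie chart.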

Page 6: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Example: Top 10 causes of death in the United States, 2001

Rank  Cause of death         Count     % of top 10   % of total deaths
1     Heart disease          700,142   37%           28%
2     Cancer                 553,768   29%           22%
3     Cerebrovascular        163,538    9%            6%
4     Chronic respiratory    123,013    6%            5%
5     Accidents              101,537    5%            4%
6     Diabetes mellitus       71,372    4%            3%
7     Flu and pneumonia       62,034    3%            2%
8     Alzheimer’s disease     53,852    3%            2%
9     Kidney disorders        39,480    2%            2%
10    Septicemia              32,238    2%            1%
      All other causes       629,967                 25%

For each individual who died in the United States in 2001, we record the cause of death. The table above is a summary of that information.
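The two percentage columns use different denominators: the top-10 subtotal for one and all deaths for the other. The following Python sketch recomputes both from the counts. It assumes that the top 10 causes plus "All other causes" account for every death in the table, which is consistent with the 25% figure shown.

```python
# Counts from the table of causes of death, United States, 2001.
top10 = [
    ("Heart disease", 700_142),
    ("Cancer", 553_768),
    ("Cerebrovascular", 163_538),
    ("Chronic respiratory", 123_013),
    ("Accidents", 101_537),
    ("Diabetes mellitus", 71_372),
    ("Flu and pneumonia", 62_034),
    ("Alzheimer's disease", 53_852),
    ("Kidney disorders", 39_480),
    ("Septicemia", 32_238),
]
other = 629_967

top10_total = sum(n for _, n in top10)
all_deaths = top10_total + other  # assumption: no deaths outside these rows

# Each row: (cause, count, % of top 10, % of all deaths), rounded to whole percents.
rows = [
    (cause, n, round(100 * n / top10_total), round(100 * n / all_deaths))
    for cause, n in top10
]

for cause, n, pct_top10, pct_all in rows:
    print(f"{cause:<22}{n:>9,}{pct_top10:>5}%{pct_all:>5}%")
```

The same count gives a larger percentage in the "top 10" column because the denominator there is smaller.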

Page 7: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Top 10 causes of death: bar graph

Each category is represented by one bar. The bar’s height shows the count (or sometimes the percentage) for that particular category.

[Figure: bar graph "Top 10 causes of deaths in the United States 2001".]

The number of individuals who died of an accident in 2001 is approximately 100,000.

Page 8: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

[Figure: two bar graphs of "Top 10 causes of deaths in the United States 2001"; vertical axis: Counts (x1000), from 0 to 800.]

Bar graph sorted by rank: easy to analyze.

Bar graph sorted alphabetically: much less useful.
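To see why ordering matters, the sketch below draws the same counts as rough text bars in both orders. The `text_bars` helper and its `scale` parameter are hypothetical names introduced only for this illustration.

```python
counts = {
    "Heart disease": 700_142,
    "Cancer": 553_768,
    "Cerebrovascular": 163_538,
    "Chronic respiratory": 123_013,
    "Accidents": 101_537,
    "Diabetes mellitus": 71_372,
    "Flu and pneumonia": 62_034,
    "Alzheimer's disease": 53_852,
    "Kidney disorders": 39_480,
    "Septicemia": 32_238,
}

def text_bars(items, scale=25_000):
    """Return one text bar per (label, count) pair: one '#' per `scale` deaths."""
    return [f"{label:<20} {'#' * round(n / scale)}" for label, n in items]

# Sorted by count, largest first: the ranking is visible at a glance.
by_rank = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Sorted alphabetically: same bars, but the eye has to hunt for the pattern.
by_name = sorted(counts.items())

print("\n".join(text_bars(by_rank)))
print()
print("\n".join(text_bars(by_name)))
```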

Page 9: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Top 10 causes of death: pie chart

Each slice represents a piece of one whole. The size of a slice depends on what percent of the whole this category represents.

[Figure: pie chart "Percent of people dying from top 10 causes of death in the United States in 2001".]

Page 10: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

[Figure labels: "Percent of deaths from top 10 causes" and "Percent of deaths from all causes".]

Make sure your labels match the data.

Make sure all percents add up to 100.
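A quick way to check the second rule is to add the slices before drawing the chart. In this sketch, `is_valid_pie` and its tolerance are hypothetical; the percentages are the rounded values from the causes-of-death table.

```python
def is_valid_pie(percents, tolerance=1.0):
    """True if the slices add up to 100, within a small allowance for rounding."""
    return abs(sum(percents) - 100) <= tolerance

# Each top-10 cause as a percent of deaths from the top 10 causes only.
pct_of_top10 = [37, 29, 9, 6, 5, 4, 3, 3, 2, 2]

# Each top-10 cause as a percent of deaths from all causes.
pct_of_all = [28, 22, 6, 5, 4, 3, 2, 2, 2, 1]
pct_other = 25  # "All other causes"

print(is_valid_pie(pct_of_top10))             # a whole: the top 10 causes
print(is_valid_pie(pct_of_all))               # not a whole: a slice is missing
print(is_valid_pie(pct_of_all + [pct_other])) # a whole: all causes
```

The middle case is the trap: percents of all deaths only form a complete pie once the "All other causes" slice is included.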

Page 11: Chapter 3 Graphical and Numerical Summaries of Qualitative Data
Page 12: Chapter 3 Graphical and Numerical Summaries of Qualitative Data
Page 13: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Side-by-side bar chart

Page 14: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Student Debt North Carolina Schools

Page 15: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Contingency Tables: Categories for Two Variables

Example: Survival and class on the Titanic

         Crew   First   Second   Third   Total
Alive     212     202      118     178     710
Dead      673     123      167     528    1491
Total     885     325      285     706    2201

Marginal distributions

Marginal distribution of survival:

Alive    710/2201    32.3%
Dead     1491/2201   67.7%

Marginal distribution of class:

Crew     885/2201    40.2%
First    325/2201    14.8%
Second   285/2201    12.9%
Third    706/2201    32.1%
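The marginal distributions come from the row and column totals of the contingency table, each divided by the grand total. A minimal Python sketch, assuming the table is stored as a nested dictionary (a storage choice made only for this example):

```python
# Titanic contingency table: survival status by class.
table = {
    "Alive": {"Crew": 212, "First": 202, "Second": 118, "Third": 178},
    "Dead": {"Crew": 673, "First": 123, "Second": 167, "Third": 528},
}

# Row totals give the counts for survival; column totals give the counts for class.
row_totals = {status: sum(row.values()) for status, row in table.items()}
classes = list(table["Alive"])
col_totals = {c: sum(table[status][c] for status in table) for c in classes}
grand_total = sum(row_totals.values())

# A marginal distribution is each total as a percent of the grand total.
marginal_survival = {s: round(100 * n / grand_total, 1) for s, n in row_totals.items()}
marginal_class = {c: round(100 * n / grand_total, 1) for c, n in col_totals.items()}

print(marginal_survival)
print(marginal_class)
```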

Page 16: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Marginal distribution of class: Bar chart

Page 17: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Marginal distribution of class: Pie chart

Page 18: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Contingency Tables: Categories for Two Variables (cont.)

Conditional distributions. Given the class of a passenger, what is the chance the passenger survived?

                                   Class
Survival                Crew    First   Second   Third   Total
Alive    Count           212      202      118     178     710
         % of col.     24.0%    62.2%    41.4%   25.2%   32.3%
Dead     Count           673      123      167     528    1491
         % of col.     76.0%    37.8%    58.6%   74.8%   67.7%
Total    Count           885      325      285     706    2201
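A conditional distribution divides each count by its own column total rather than by the grand total. The sketch below recomputes the "% of col." rows from the counts; the dictionary layout is an assumption made for illustration.

```python
# Counts by class, split by survival status.
alive = {"Crew": 212, "First": 202, "Second": 118, "Third": 178}
dead = {"Crew": 673, "First": 123, "Second": 167, "Third": 528}

# Condition on class: divide by the number of people in that class only.
pct_alive = {c: round(100 * alive[c] / (alive[c] + dead[c]), 1) for c in alive}
pct_dead = {c: round(100 * dead[c] / (alive[c] + dead[c]), 1) for c in alive}

for c in alive:
    print(f"{c:<7} alive {pct_alive[c]:>5}%   dead {pct_dead[c]:>5}%")
```

Within each class the two percentages add to 100, because each class is treated as its own whole.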

Page 19: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Conditional distributions: segmented bar chart

Page 20: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Contingency Tables: Categories for Two Variables (cont.)

Questions:

What fraction of survivors were in first class? 202/710

What fraction of passengers were in first class and survivors? 202/2201

What fraction of the first class passengers survived? 202/325

                                   Class
Survival                Crew    First   Second   Third   Total
Alive    Count           212      202      118     178     710
         % of col.     24.0%    62.2%    41.4%   25.2%   32.3%
Dead     Count           673      123      167     528    1491
         % of col.     76.0%    37.8%    58.6%   74.8%   67.7%
Total    Count           885      325      285     706    2201
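All three questions share the numerator 202 (first-class passengers who survived); only the denominator changes with the group being described. A short Python sketch of the three calculations, with illustrative variable names:

```python
first_class_survivors = 202
survivors = 710      # everyone who survived
first_class = 325    # everyone in first class
everyone = 2201      # everyone on board

# Of the survivors, what percent were in first class?
pct_of_survivors = round(100 * first_class_survivors / survivors, 1)

# Of everyone on board, what percent were both first class and survivors?
pct_of_everyone = round(100 * first_class_survivors / everyone, 1)

# Of the first-class passengers, what percent survived?
pct_of_first_class = round(100 * first_class_survivors / first_class, 1)

print(pct_of_survivors, pct_of_everyone, pct_of_first_class)
```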

Page 21: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

TV viewers during the Super Bowl in 2007. What is the marginal distribution of those who watched the commercials only?

1. 8.0%

2. 23.5%

3. 58.2%

4. 27.7%

Page 22: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

TV viewers during the Super Bowl in 2007. What percentage watched the Game and were Female?

1. 41.8%

2. 38.8%

3. 51.2%

4. 19.8%

Page 23: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

TV viewers during the Super Bowl in 2007. Given that a viewer did not watch the Super Bowl Game or Commercials, what percentage were male?

1. 45.2%

2. 48.8%

3. 26.8%

4. 27.7%

Page 24: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

3-Way Tables

Example: Georgia death-sentence data

Race of defendant            Black              White
Race of victim           Black   White      Black   White    Totals
Death sentence: Yes         18      50          2      58       128
Death sentence: No        1420     178         62     687      2347
Totals                    1438     228         64     745      2475
% death sentence           1.2    21.9        3.1     7.8
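The "% Death Sentence" row is the "Yes" count in each column divided by that column's total. The sketch below recomputes it. Reading the four columns as (race of defendant, race of victim) pairs is an assumption about the slide layout, so the dictionary keys should be treated as tentative.

```python
# Keys are (race of defendant, race of victim): an assumed reading of the columns.
cells = {
    ("Black", "Black"): {"yes": 18, "no": 1420},
    ("Black", "White"): {"yes": 50, "no": 178},
    ("White", "Black"): {"yes": 2, "no": 62},
    ("White", "White"): {"yes": 58, "no": 687},
}

def death_rate(cell):
    """Percent of cases in one cell that ended in a death sentence."""
    return 100 * cell["yes"] / (cell["yes"] + cell["no"])

rates = {key: death_rate(cell) for key, cell in cells.items()}

total_yes = sum(cell["yes"] for cell in cells.values())
total_cases = sum(cell["yes"] + cell["no"] for cell in cells.values())

for key, rate in rates.items():
    print(key, f"{rate:.2f}%")
print("all cases", f"{100 * total_yes / total_cases:.2f}%")
```

The first cell works out to 18/1438, about 1.25%, which the slide reports as 1.2.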

Page 25: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

UC Berkeley Lawsuit

                      MEN     WOMEN
No. of applicants     2691     1835
Admitted              1199      557
% admitted            44.6     30.4

Page 26: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

LAWSUIT (cont.)

                  MEN                          WOMEN
MAJOR    No. of        No.              No. of        No.
         Applicants    Admitted         Applicants    Admitted
A           825        512 (62%)           108        *89 (82%)
B           560        353 (63%)            25        *17 (68%)
C           325        120 (37%)           593        202 (34%)
D           417        138 (33%)           375        *131 (35%)
E           191         53 (28%)           393         94 (24%)
F           373         23 (6%)            341        *24 (7%)

TOTAL      2691       1199                1835        557
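The per-major and overall admission rates can both be recomputed from the counts in the table. This Python sketch does so and lists the majors in which women were admitted at a higher rate than men; the data layout and the helper name `rate` are illustrative assumptions.

```python
# (applicants, admitted) for each major, taken from the table.
majors = {
    "A": {"men": (825, 512), "women": (108, 89)},
    "B": {"men": (560, 353), "women": (25, 17)},
    "C": {"men": (325, 120), "women": (593, 202)},
    "D": {"men": (417, 138), "women": (375, 131)},
    "E": {"men": (191, 53), "women": (393, 94)},
    "F": {"men": (373, 23), "women": (341, 24)},
}

def rate(applicants, admitted):
    """Admission rate in percent."""
    return 100 * admitted / applicants

# Overall rate: pool applicants and admissions across all majors.
overall = {}
for sex in ("men", "women"):
    applicants = sum(m[sex][0] for m in majors.values())
    admitted = sum(m[sex][1] for m in majors.values())
    overall[sex] = rate(applicants, admitted)

# Majors where the women's rate is higher than the men's.
women_ahead = [name for name, m in majors.items() if rate(*m["women"]) > rate(*m["men"])]

print({sex: round(r, 1) for sex, r in overall.items()})
print(women_ahead)
```

Women have the higher rate in four of the six majors, yet the pooled rate is lower for women, largely because far more women applied to the majors with low admission rates.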

Page 27: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Simpson’s Paradox

The reversal of the direction of a comparison or association when data from several groups are combined to form a single group.

Page 28: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

Fly Alaska Airlines, the on-time airline!

                 Alaska Airlines            American West
Destination    % Arrivals   No. of       % Arrivals   No. of
               On Time      Arrivals     On Time      Arrivals
L. A.            88.9%         559         85.6%         811
Phoenix          94.8%         233         92.1%       5,255
San Diego        91.4%         232         85.5%         448
San Fran.        83.1%         605         71.3%         449
Seattle          85.8%       2,146         76.7%         262
Total                        3,775                     7,225

Page 29: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

American West Wins! You’re a Hero!

                 Alaska Airlines            American West
Destination    % Arrivals   No. of       % Arrivals   No. of
               On Time      Arrivals     On Time      Arrivals
L. A.            88.9%         559         85.6%         811
Phoenix          94.8%         233         92.1%       5,255
San Diego        91.4%         232         85.5%         448
San Fran.        83.1%         605         71.3%         449
Seattle          85.8%       2,146         76.7%         262
Total            86.7%       3,775         89.1%       7,225
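The overall on-time percentage is a weighted average of the per-destination percentages, with the number of arrivals as the weights. The sketch below rebuilds the "Total" row that way. Because it starts from rounded percentages rather than the underlying on-time counts (which the slide does not give), the results are approximate.

```python
# (% on time, number of arrivals) for each airline at each destination.
flights = {
    "L. A.": {"Alaska": (88.9, 559), "American West": (85.6, 811)},
    "Phoenix": {"Alaska": (94.8, 233), "American West": (92.1, 5255)},
    "San Diego": {"Alaska": (91.4, 232), "American West": (85.5, 448)},
    "San Fran.": {"Alaska": (83.1, 605), "American West": (71.3, 449)},
    "Seattle": {"Alaska": (85.8, 2146), "American West": (76.7, 262)},
}

def overall_on_time(airline):
    """Arrival-weighted on-time percentage across all destinations."""
    on_time = sum(row[airline][0] / 100 * row[airline][1] for row in flights.values())
    arrivals = sum(row[airline][1] for row in flights.values())
    return 100 * on_time / arrivals

# Alaska has the better record at every single destination...
alaska_wins_everywhere = all(
    row["Alaska"][0] > row["American West"][0] for row in flights.values()
)

# ...but the combined figures point the other way.
print(alaska_wins_everywhere)
print(round(overall_on_time("Alaska"), 1), round(overall_on_time("American West"), 1))
```

The reversal comes from the weights: most American West arrivals are in Phoenix, where on-time rates are high, while most Alaska arrivals are in Seattle, where they are lower.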

Page 30: Chapter 3 Graphical and Numerical Summaries of Qualitative Data

End of Chapter 3