Graphs, Good and Bad (web.as.uky.edu/statistics/users/dcluek2/STA 200 Summer 2011/Lect…)

CHAPTER 10 Graphs, Good and Bad


Post on 29-Sep-2020



DISPLAYING DATA

The first part of this course dealt with the production of data, through random sampling and randomized comparative experiments.

This particular unit focuses on good ways to summarize and organize data.


DATA TABLES

Who did you vote for in the 2008 presidential election?

One way to organize the responses for all Americans is to create a data table.

Good data tables should contain the following things:

A clear main heading

Clearly labeled variables

Rates (percentages or proportions), used either instead of or to supplement counts


EXAMPLE 10.1: Votes in 2008 Presidential Election

Data tables show what values a variable takes and how often it takes these values. In other words, data tables present the distribution of a variable.

Candidate           Number of votes   Percentage
Barack Obama             69,456,897       52.92%
John McCain              59,934,814       45.66%
Ralph Nader                 738,475        0.56%
Bob Barr                    523,686        0.40%
Chuck Baldwin               199,314        0.15%
Cynthia McKinney            161,603        0.12%
Other                       242,539        0.18%
Total                   131,257,328         100%
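The percentage column can be recomputed from the counts. The following is a minimal Python sketch using the counts from Example 10.1; the variable names are illustrative and not part of the original lecture.

```python
# Vote counts from the table in Example 10.1.
votes = {
    "Barack Obama": 69_456_897,
    "John McCain": 59_934_814,
    "Ralph Nader": 738_475,
    "Bob Barr": 523_686,
    "Chuck Baldwin": 199_314,
    "Cynthia McKinney": 161_603,
    "Other": 242_539,
}

total = sum(votes.values())

# A rate is each count divided by the total, expressed here as a percentage.
percentages = {name: 100 * count / total for name, count in votes.items()}

# Print a simple table: candidate, count, percentage.
for name, pct in percentages.items():
    print(f"{name:<18}{votes[name]:>12,}{pct:>8.2f}%")
print(f"{'Total':<18}{total:>12,}{100:>8.2f}%")
```

Showing rates next to counts makes it easier to compare candidates at a glance than reading eight- and nine-digit numbers.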


TYPES OF VARIABLES

Some variables place individuals into categories (like eye color or gender), while some variables have a meaningful numerical scale (like height, age, or exam score).

There are two types of variables:

A categorical variable places an individual into one of several categories.

A quantitative variable takes numerical values for which arithmetic operations such as averaging make sense.
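To illustrate the distinction, here is a small Python sketch. The five individuals and their values are made up for illustration. Counting is the natural summary for a categorical variable, while averaging only makes sense for a quantitative one.

```python
from collections import Counter
from statistics import mean

# Hypothetical individuals, invented for illustration only.
eye_color = ["brown", "blue", "brown", "green", "brown"]  # categorical
height_in = [64, 70, 68, 66, 72]                          # quantitative (inches)

# Categorical: count how many individuals fall into each category.
eye_color_counts = Counter(eye_color)

# Quantitative: arithmetic such as averaging is meaningful.
average_height = mean(height_in)

print(eye_color_counts)
print(average_height)
```

Asking for the "average eye color" has no sensible answer, which is a quick way to check which type of variable you have.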


CATEGORICAL VARIABLES

Pie charts and bar graphs are good ways to show the distribution of a categorical variable.

So we could summarize our presidential election data with either a pie chart or a bar graph.


PIE CHART

[Pie chart: Voters in 2008 Presidential Election. Slices: Obama 53%, McCain 46%, Nader 1%; Barr, Baldwin, McKinney, and Other each round to 0%.]


BAR GRAPH

[Bar graph: Voters in 2008 Presidential Election. Horizontal axis: Candidate (Obama, McCain, Nader, Barr, Baldwin, McKinney, Other). Vertical axis: Number of Voters, from 0 to 80,000,000.]
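A bar graph of this kind can be roughed out in plain text. The Python sketch below prints one bar per candidate, with bar length proportional to the vote count from Example 10.1. The scale of one `#` per 2,000,000 votes is an arbitrary choice made for this illustration.

```python
# Vote counts from Example 10.1.
votes = {
    "Obama": 69_456_897,
    "McCain": 59_934_814,
    "Nader": 738_475,
    "Barr": 523_686,
    "Baldwin": 199_314,
    "McKinney": 161_603,
    "Other": 242_539,
}

VOTES_PER_MARK = 2_000_000  # arbitrary scale chosen for this sketch


def bar(count: int) -> str:
    """Return a bar whose length is proportional to the count (rounded)."""
    return "#" * round(count / VOTES_PER_MARK)


for candidate, count in votes.items():
    print(f"{candidate:<9}|{bar(count)} {count:,}")
```

Because every bar has the same width, only its length carries information, so the eye compares the values directly.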


PICTOGRAMS

Pictograms are another method of displaying the distribution of a categorical variable.

What is a problem with this graphic?


PICTOGRAMS

Here are two charts which display the same information: ownership among certain types of pets.

Pictograms are often misleading because they misrepresent the differences between values of the categorical variable.

The artists who produce pictograms often sacrifice the accuracy of the data so that they can avoid distorting the pictures being used.
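One common source of this distortion is that scaling a picture's height to match the data also scales its width, so the picture's area grows with the square of the value. The following Python sketch uses made-up values to show the effect; it illustrates the general geometry and is not taken from the charts on the slide.

```python
def pictogram_area_ratio(small_value: float, large_value: float) -> float:
    """Ratio of picture areas when height AND width both scale with the value."""
    scale = large_value / small_value  # how much taller the larger picture is
    return scale ** 2                  # width scales too, so area scales by scale squared


# Hypothetical values: one category is twice as large as the other.
value_ratio = 20 / 10
area_ratio = pictogram_area_ratio(10, 20)

print(f"Values differ by a factor of {value_ratio:g}")
print(f"Picture areas differ by a factor of {area_ratio:g}")
```

A viewer who judges by area sees a difference far larger than the one in the data, which is why a plain bar graph is usually the safer choice.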


LINE GRAPHS

Line graphs are used to display how a quantitative variable changes over time.

A line graph of a variable plots each observation against the time at which it was measured. We always put time on the horizontal axis (x-axis) and the variable on the vertical axis (y-axis). We then connect each data point to display the change over time.


EXAMPLE 10.2

For any line graph, we want to look for an overall pattern and any striking deviations from that pattern.

What is the overall pattern?

Are there any striking deviations from that pattern?

[Line graph: Sales of New Trucks. Horizontal axis: Year (1981 to 2001). Vertical axis: Count, from 0 to 10,000,000.]


SEASONAL VARIATION

Some line graphs display what is known as seasonal variation. This is a pattern that repeats itself at regular time intervals.

Series of regular measurements over time are often seasonally adjusted. This means that the expected seasonal variation is removed before the data are published.
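A simplified sketch of the idea in Python follows. It assumes an additive seasonal effect and estimates it as each quarter's average deviation from the overall mean; the procedures used for published data are more elaborate. The quarterly figures are invented for illustration.

```python
from statistics import mean

# Hypothetical quarterly measurements for three years (Q1..Q4 each year).
# The values are invented; Q3 is consistently higher than the other quarters.
series = [
    100, 102, 120, 101,
    110, 112, 130, 111,
    120, 122, 140, 121,
]
PERIOD = 4  # observations per year

overall_mean = mean(series)

# Expected seasonal effect for each quarter: how far that quarter's
# average sits above or below the overall average.
seasonal_effect = [
    mean(series[q::PERIOD]) - overall_mean for q in range(PERIOD)
]

# Seasonally adjusted series: subtract the expected effect for each quarter.
adjusted = [value - seasonal_effect[i % PERIOD] for i, value in enumerate(series)]

print([round(e, 2) for e in seasonal_effect])
print([round(a, 2) for a in adjusted])
```

In the adjusted series the recurring third-quarter spike is gone, which leaves the year-to-year increase easier to see.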


EXAMPLE 10.3

Notice that the line graph has seasonal variation.

We see that every year there is a spike in airline passengers.

The overall trend here is an increase in airline passengers.


MISREPRESENTING DATA

The most common way to misrepresent data in a line graph is through the choice of scale.

Notice that with this scale, it looks like we have a rather slow increase in the number of unmarried couples over time.

[Line graph: Unmarried Couples. Horizontal axis: Year (1977 to 2003). Vertical axis: Unmarried Couples (thousands), from 0 to 10,000.]


MISREPRESENTING DATA

However, when we switch scales for the same data, we might be inclined to draw a different conclusion.

While this line graph still shows an increasing trend, it looks much more dramatic than the previous line graph.

[Line graph: Unmarried Couples. Horizontal axis: Year. Vertical axis: Unmarried Couples (thousands), from 0 to 5,000.]
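The effect of the scale can be quantified by asking what fraction of the vertical axis the data actually span. The Python sketch below compares the two axis ranges used in these graphs, 0 to 10,000 and 0 to 5,000 (thousands). The start and end values of the series are placeholders chosen for illustration, not the actual figures behind the slides.

```python
def fraction_of_axis_used(low: float, high: float,
                          axis_min: float, axis_max: float) -> float:
    """Fraction of the vertical axis covered by data running from low to high."""
    return (high - low) / (axis_max - axis_min)


# Placeholder series endpoints (thousands of couples); NOT the real data.
first_value, last_value = 1_000, 4_500

wide_axis = fraction_of_axis_used(first_value, last_value, 0, 10_000)
tight_axis = fraction_of_axis_used(first_value, last_value, 0, 5_000)

print(f"Axis 0 to 10,000: data rise over {wide_axis:.0%} of the axis height")
print(f"Axis 0 to  5,000: data rise over {tight_axis:.0%} of the axis height")
```

The data are identical in both cases, yet the line climbs twice as steeply on the tighter axis, so the scale alone changes the impression the graph gives.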


MAKING GOOD GRAPHS

Title, Label, Scale

Make sure labels and legends describe variables and their measurement units.

Be careful with the scales used.

Make the data stand out

We want to ensure that the data itself, rather than any background art or labels, catches the viewer’s attention.

Avoid pictograms and be careful when choosing scales. Avoid 3D effects or other graphics that might confuse people.


REMINDERS

Chapter 10 homework is posted online and due Friday.
