graphs, good and badweb.as.uky.edu/statistics/users/dcluek2/sta 200 summer 2011/lect… · votes in...
CHAPTER 10: Graphs, Good and Bad
DISPLAYING DATA
The first part of this course dealt with the production of data, through random sampling and randomized comparative experiments.
This particular unit focuses on good ways to summarize and organize data.
DATA TABLES
Who did you vote for in the 2008 presidential election?
One way to organize the responses for all Americans is to create a data table.
Good data tables should contain the following things:
- A clear main heading
- Clearly labeled variables
- Rates (percentages or proportions), used either instead of or to supplement counts
EXAMPLE 10.1: Votes in 2008 Presidential Election
Data tables show what values a variable takes and how often it takes these values. In other words, data tables present the distribution of a variable.
| Candidate | Number of votes | Percentage |
|---|---|---|
| Barack Obama | 69,456,897 | 52.92% |
| John McCain | 59,934,814 | 45.66% |
| Ralph Nader | 738,475 | 0.56% |
| Bob Barr | 523,686 | 0.40% |
| Chuck Baldwin | 199,314 | 0.15% |
| Cynthia McKinney | 161,603 | 0.12% |
| Other | 242,539 | 0.18% |
| Total | 131,257,328 | 100% |
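As a rough illustration of how the percentage column follows from the counts, here is a minimal Python sketch. The counts are the ones in the table; the variable names are only illustrative and not part of the lecture.

```python
# Vote counts taken from the Example 10.1 table.
votes = {
    "Barack Obama": 69_456_897,
    "John McCain": 59_934_814,
    "Ralph Nader": 738_475,
    "Bob Barr": 523_686,
    "Chuck Baldwin": 199_314,
    "Cynthia McKinney": 161_603,
    "Other": 242_539,
}

# The total is the sum of all counts.
total = sum(votes.values())

# A rate is each count divided by the total, expressed here as a percentage
# rounded to two decimal places.
percentages = {name: round(100 * count / total, 2) for name, count in votes.items()}

for name, pct in percentages.items():
    print(f"{name:<18} {votes[name]:>12,} {pct:>6.2f}%")
```

Because each percentage is rounded separately, the rounded values may not add up to exactly 100%.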
TYPES OF VARIABLES
Some variables place individuals into categories (like eye color or gender), while some variables have a meaningful numerical scale (like height, age, or exam score).
There are two types of variables:
- A categorical variable places an individual into one of several categories.
- A quantitative variable takes numerical values for which arithmetic operations such as averaging make sense.
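The distinction can be sketched in a few lines of Python. The data below are made up purely for illustration: for a categorical variable we count how often each category occurs, while for a quantitative variable an average is meaningful.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses from five individuals (not real data).
eye_color = ["brown", "blue", "brown", "green", "brown"]  # categorical
exam_score = [82, 91, 77, 68, 95]                         # quantitative

# Categorical: summarize by counting individuals in each category.
color_counts = Counter(eye_color)

# Quantitative: arithmetic such as averaging makes sense.
average_score = mean(exam_score)

print(color_counts)
print(average_score)
```

Averaging the eye colors would not make sense, which is one quick way to tell the two types apart.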
CATEGORICAL VARIABLES
Pie charts and bar graphs are good ways to show the distribution of a categorical variable.
So we could summarize our presidential election data with either a pie chart or a bar graph.
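To show the idea behind a bar graph without any plotting software, here is a small text-only sketch in Python. The counts come from Example 10.1; the function name `text_bar_graph` and the bar width of 50 characters are assumptions made for this illustration.

```python
# Vote counts from Example 10.1, keyed by candidate surname.
votes = {
    "Obama": 69_456_897,
    "McCain": 59_934_814,
    "Nader": 738_475,
    "Barr": 523_686,
    "Baldwin": 199_314,
    "McKinney": 161_603,
    "Other": 242_539,
}

def text_bar_graph(counts, width=50):
    """Return one text line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    lines = []
    for label, count in counts.items():
        bar = "#" * round(count / largest * width)
        lines.append(f"{label:<10} {bar} {count:,}")
    return lines

lines = text_bar_graph(votes)
print("Voters in 2008 Presidential Election")
for line in lines:
    print(line)
```

Every bar starts at zero and has the same thickness, so the only thing that varies between categories is the bar's length.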
PIE CHART
[Pie chart: Voters in 2008 Presidential Election. Slices, rounded to whole percentages: Obama 53%, McCain 46%, Nader 1%, Barr 0%, Baldwin 0%, McKinney 0%, Other 0%.]
BAR GRAPH
[Bar graph: Voters in 2008 Presidential Election. Horizontal axis: Candidate (Obama, McCain, Nader, Barr, Baldwin, McKinney, Other). Vertical axis: Number of Voters, from 0 to 80,000,000 in steps of 10,000,000.]
![Page 9: Graphs, Good and Badweb.as.uky.edu/statistics/users/dcluek2/STA 200 Summer 2011/Lect… · Votes in 2008 Presidential Election Data tables show what values a variable takes and how](https://reader034.vdocument.in/reader034/viewer/2022052005/601848e5d8e33e3e3b1f450f/html5/thumbnails/9.jpg)
PICTOGRAMS
Pictograms are another method of displaying the distribution of a categorical variable.
What is a problem with this graphic?
![Page 10: Graphs, Good and Badweb.as.uky.edu/statistics/users/dcluek2/STA 200 Summer 2011/Lect… · Votes in 2008 Presidential Election Data tables show what values a variable takes and how](https://reader034.vdocument.in/reader034/viewer/2022052005/601848e5d8e33e3e3b1f450f/html5/thumbnails/10.jpg)
PICTOGRAMS
Here are two charts which display the same information: ownership among certain types of pets.
Pictograms are often misleading because they misrepresent the differences between values of the categorical variable.
The artists who produce pictograms often sacrifice the accuracy of the data so that they can avoid distorting the pictures being used.
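One common way this distortion arises can be sketched with a little arithmetic. This assumes a picture is enlarged in both height and width to keep its proportions, which may or may not be what the charts on this slide do. If one value is k times another and the picture is made k times as tall and k times as wide, its area grows by a factor of k squared, so the eye sees a much larger difference than the data support.

```python
def apparent_area_ratio(value_ratio):
    """Area ratio of two pictures when height and width are both scaled by value_ratio."""
    return value_ratio ** 2

# Hypothetical example: one category is 3 times as large as another.
value_ratio = 3
area_ratio = apparent_area_ratio(value_ratio)
print(f"Data ratio: {value_ratio}, apparent (area) ratio: {area_ratio}")
```

A bar graph avoids this problem because only the height of a bar changes, not its width.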
LINE GRAPHS
Line graphs are used to display how a quantitative variable changes over time.
A line graph of a variable plots each observation against the time at which it was measured. We always put time on the horizontal axis (x-axis) and the variable on the vertical axis (y-axis). We then connect each data point to display the change over time.
EXAMPLE 10.2
For any line graph, we want to look for an overall pattern and any striking deviations from that pattern.
What is the overall pattern?
Are there any striking deviations from that pattern?
[Line graph: Sales of New Trucks. Horizontal axis: Year, 1981 to 2001. Vertical axis: Count, from 0 to 10,000,000 in steps of 1,000,000.]
SEASONAL VARIATION
Some line graphs display what is known as seasonal variation. This is a pattern that repeats itself at regular time intervals.
Often, a series of regular measurements over time is seasonally adjusted. This means that the expected seasonal variation is removed before the data are published.
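To give a feel for what "removing the expected seasonal variation" can mean, here is a deliberately simplified Python sketch. It estimates each season's typical deviation from the overall average and subtracts it. The quarterly numbers are invented for illustration, and the methods used for published data are more elaborate than this.

```python
from statistics import mean

def seasonally_adjust(values, period):
    """Subtract each season's average deviation from the overall mean (a naive method)."""
    overall = mean(values)
    # Average of all observations that fall in the same season, minus the overall mean.
    seasonal = [mean(values[i::period]) - overall for i in range(period)]
    return [v - seasonal[i % period] for i, v in enumerate(values)]

# Hypothetical quarterly counts over three years: steady growth plus a spike
# in every fourth quarter.
raw = [10, 11, 12, 23, 14, 15, 16, 27, 18, 19, 20, 31]
adjusted = seasonally_adjust(raw, period=4)

print(adjusted)
```

In this made-up series the fourth-quarter spike disappears after adjustment, while the year-to-year increase remains visible.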
![Page 14: Graphs, Good and Badweb.as.uky.edu/statistics/users/dcluek2/STA 200 Summer 2011/Lect… · Votes in 2008 Presidential Election Data tables show what values a variable takes and how](https://reader034.vdocument.in/reader034/viewer/2022052005/601848e5d8e33e3e3b1f450f/html5/thumbnails/14.jpg)
EXAMPLE 10.3
Notice that the line graph has seasonal variation.
We see that every year there is a spike in airline passengers.
The overall trend here is an increase in airline passengers.
MISREPRESENTING DATA
The most common way to misrepresent data in a line graph is through the choice of scales.
Notice that when I choose this scale, it looks like we have a rather slow increase in the number of unmarried couples over time.
[Line graph: Unmarried Couples. Horizontal axis: Year, 1977 to 2003 in steps of two years. Vertical axis: Unmarried Couples (thousands), from 0 to 10,000.]
MISREPRESENTING DATA
However, when we switch scales for the same data, we might be inclined to draw a different conclusion.
While this line graph still shows an increasing trend, it looks much more dramatic than the previous line graph.
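The effect of the scale can be put into numbers. The sketch below computes what fraction of the vertical axis the rise in the data takes up under each scale. The two axis ranges (0 to 10,000 and 0 to 5,000, in thousands) are the ones used on these slides; the starting and ending values are hypothetical, chosen only for illustration.

```python
def fraction_of_axis(first, last, axis_min, axis_max):
    """Fraction of the vertical axis spanned by the change from first to last."""
    return (last - first) / (axis_max - axis_min)

# Hypothetical first and last observations, in thousands of couples.
first, last = 1000, 4500

wide_scale = fraction_of_axis(first, last, 0, 10000)   # axis from 0 to 10,000
narrow_scale = fraction_of_axis(first, last, 0, 5000)  # axis from 0 to 5,000

print(f"Rise uses {wide_scale:.0%} of the axis on the wide scale")
print(f"Rise uses {narrow_scale:.0%} of the axis on the narrow scale")
```

The data are identical in both cases; halving the range of the axis simply doubles how steep the same increase looks.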
[Line graph: Unmarried Couples. Horizontal axis: Year. Vertical axis: Unmarried Couples (thousands), from 0 to 5,000 in steps of 500.]
MAKING GOOD GRAPHS
Title, Label, Scale
- Make sure labels and legends describe variables and their measurement units.
- Be careful with the scales used.
Make the data stand out
- We want to ensure that the data itself, rather than any background art or labels, catches the viewer’s attention.
- Avoid pictograms and be careful when choosing scales.
- Avoid 3D effects or other graphics that might confuse people.
REMINDERS
Chapter 10 homework is posted online and due Friday.