Section 5.1 Incomes and Other Quantities

Upload: aimon

Post on 06-Jan-2016


TRANSCRIPT

Page 1: Section 5.1

Section 5.1

Incomes and Other Quantities

Page 2: Section 5.1

Examples of Categorical Variables

• What is your gender?

• Did you see Toni Morrison last night?

• How confident are you that you’ll be able to find a job in your major upon graduation?

– not confident at all

– somewhat confident

– confident

– very confident

Page 3: Section 5.1

Numerical Summaries of Categorical Data

• What is your gender?

– There were 10 females and 20 males in the sample.

• How can we numerically summarize this data besides reporting raw counts?

Page 4: Section 5.1

Summarizing Categorical Data

– So the proportion of females is 10/30 ≈ 33.33% and the proportion of males is 20/30 ≈ 66.67%.

– Are there other appropriate numeric measurements? Does it make sense to describe gender by an average? NO.
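As a minimal sketch of the proportion calculation, the counts from the sample (10 females, 20 males) can be divided by the sample size. The variable names here are illustrative only and are not from the slides.

```python
# Counts reported in the sample: 10 females and 20 males.
counts = {"female": 10, "male": 20}

# The sample size is the sum of the category counts.
total = sum(counts.values())

# A proportion is a category count divided by the sample size.
proportions = {category: count / total for category, count in counts.items()}

for category, p in proportions.items():
    print(f"{category}: {p:.2%}")
# female: 33.33%
# male: 66.67%
```

The two proportions add up to 100%, which is a quick check on the arithmetic.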

Page 5: Section 5.1

Job Confidence Results

• 10 say not confident

• 15 say somewhat confident

• 30 say confident

• 20 say very confident

• How can we numerically summarize this data? Using proportions.
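The same idea applies to a variable with more than two categories. The sketch below turns the four job-confidence counts into proportions; the helper name `to_proportions` is hypothetical and chosen only for illustration.

```python
def to_proportions(counts):
    """Convert a dict of category counts into a dict of proportions."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

# Job confidence counts from the slide (75 responses in total).
job_confidence = {
    "not confident": 10,
    "somewhat confident": 15,
    "confident": 30,
    "very confident": 20,
}

job_proportions = to_proportions(job_confidence)
for category, p in job_proportions.items():
    print(f"{category}: {p:.2%}")
# not confident: 13.33%
# somewhat confident: 20.00%
# confident: 40.00%
# very confident: 26.67%
```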

Page 6: Section 5.1

Graphical Displays of Categorical Data

• Bar Graphs

• Pie Charts

Page 7: Section 5.1

Numeric Variables

• Numeric data consists of numbers representing measurements.

• The text refers to numeric data as “number line data”.

• Examples:

– Weights of football players

– Prices of college textbooks

– Age of US Presidents at inauguration

Page 8: Section 5.1

Looking Ahead

• Chapters 5-7 examine many of the same ideas that we studied in Chapters 1-4, except from the point of view of numeric variables.

• Similar to before, we’ll look at numerical and graphical summaries of data, sampling distributions of statistics, confidence intervals, and hypothesis tests.

Page 9: Section 5.1

Overview of Chapter 5 (in part)

• Numerical Summaries of Numeric Variables

– Measures of center: What is the center value?

– Measures of spread: Is the data set close to the center or spread out?

Page 10: Section 5.1

Numerical Summaries for Numeric Data

• Prices of college textbooks

– $82.50, $75.50, $27.50, $88.25, $79.00, $120.50, $90.25, $68.50, $85.50, $90.25

– How should we summarize this numeric data?

– Does computing proportions make sense?

Page 11: Section 5.1

Measurements of Center for Numeric Data

Three common measures of “center” are:

• Mean – arithmetic average

• Median – “middle” value

• Mode – most frequent
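As one possible illustration, the three measures can be computed for the ten textbook prices listed earlier using Python's standard `statistics` module. This is only a sketch; the slides do not work this example.

```python
import statistics

# Textbook prices from the earlier slide, in dollars.
prices = [82.50, 75.50, 27.50, 88.25, 79.00,
          120.50, 90.25, 68.50, 85.50, 90.25]

mean_price = statistics.mean(prices)      # arithmetic average
median_price = statistics.median(prices)  # middle value of the sorted prices
mode_price = statistics.mode(prices)      # most frequent price

print(f"mean   = ${mean_price:.2f}")
print(f"median = ${median_price:.2f}")
print(f"mode   = ${mode_price:.2f}")
```

With ten prices there is no single middle value, so the median is the average of the fifth and sixth sorted prices ($82.50 and $85.50). The mode is $90.25 because it is the only price that appears twice.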

Page 12: Section 5.1

CO2 Pollution of the 8 Largest Nations

• The Pew Center on Global Climate Change reports that possible global warming is due in large part to human activity that produces carbon dioxide emissions and other greenhouse gases. The CO2 emissions from fossil fuel combustion are the result of the generation of electricity, heating, and gas consumption in cars.

Page 13: Section 5.1

Which countries are most populated?

• http://www.aneki.com/populated.html

Page 14: Section 5.1

Per capita CO2 emissions for the 8 largest countries in population size

(metric tons/person)

• China 2.3

• India 1.1

• USA 19.7

• Indonesia 1.2

• Russia 9.8

• Brazil 1.8

• Pakistan 0.7

• Bangladesh 0.2

Page 15: Section 5.1

Dotplot of the CO2 emissions data

[Dotplot: horizontal axis is Per Capita Carbon Dioxide Emissions, with ticks at 0, 10, and 20]

Page 16: Section 5.1

Mean

• Defn: The sum of the data values divided by the number of data values.

• Ex: Find the mean of the per capita CO2 emissions for the 8 countries:

(2.3 + 1.1 + 19.7 + 1.2 + 9.8 + 1.8 + 0.7 + 0.2) / 8 = 36.8 / 8

Ans: 4.6
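The definition can be checked directly with a short sketch that sums the eight per capita emission values and divides by the number of values. The dictionary layout is an illustrative choice, not something from the slides.

```python
# Per capita CO2 emissions (metric tons/person) for the 8 most populous countries.
emissions = {
    "China": 2.3, "India": 1.1, "USA": 19.7, "Indonesia": 1.2,
    "Russia": 9.8, "Brazil": 1.8, "Pakistan": 0.7, "Bangladesh": 0.2,
}

values = list(emissions.values())

# Mean = sum of the data values divided by the number of data values.
mean_emissions = sum(values) / len(values)

print(round(mean_emissions, 1))  # 4.6
```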

Page 17: Section 5.1

Median

• Defn: Center value of ordered data.

• Ex: Find the median of the data set. Begin by ordering the data.

0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7

Since there is an even number of data points, the median is the mean of the middle two values, 1.2 and 1.8. So the median is (1.2 + 1.8) / 2 = 1.5.
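The steps above (order the data, then take the middle value or the mean of the middle two) can be sketched as a small function. The function name `median_of` is hypothetical; Python's built-in `statistics.median` does the same job.

```python
def median_of(values):
    """Return the center value of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the single middle value is the median.
        return ordered[mid]
    # Even number of values: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Per capita CO2 emissions (metric tons/person), in the order given on the slide.
co2 = [2.3, 1.1, 19.7, 1.2, 9.8, 1.8, 0.7, 0.2]

print(sorted(co2))     # [0.2, 0.7, 1.1, 1.2, 1.8, 2.3, 9.8, 19.7]
print(median_of(co2))  # 1.5
```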

Page 18: Section 5.1

Why two measures of center?

• The mean and median are usually different, so journalists have an opportunity to mislead you by choosing which one to report.

• Ex: In 2004 the median household income was $44,389 and the mean household income was $60,528.

Page 19: Section 5.1

Mean vs. Median

• The median is below about half of the observations.

• It’s possible for the mean to be below most of the observations.

• Ex: http://bcs.whfreeman.com/ips5e/default.asp?s=&n=&i=&v=&o=&ns=0&uid=0&rau=0

Page 20: Section 5.1

Describing the Shape of a Histogram

• Mean is the balance point.

• If a histogram is symmetrical, its balance point is the middle observation. In this case, mean = median.

• Distributions that are not symmetrical are skewed, either to the right (tail extends out further to the right than to the left) or to the left (tail extends out further to the left than to the right).

Page 21: Section 5.1

Skewed Right

How much cash do you have on you?

Median = $15, Mean = $35.82

[Histogram: Cash on the horizontal axis (0 to 400), Frequency on the vertical axis (0 to 30)]

Page 22: Section 5.1

Skewed left

Page 23: Section 5.1

Number of States Visited

Median = 15, Mean = 16.43

[Histogram: States on the horizontal axis (0 to 45), Frequency on the vertical axis (0 to 15)]

Page 24: Section 5.1

Mean follows skewness

• If a distribution of data is skewed, the mean will be farther towards the tail than the median.
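The CO2 emissions data from the earlier slides gives a small illustration of this. The values are skewed to the right (the USA and Russia form a long right tail), so the mean should sit to the right of the median. The sketch below is one way to check that, using the standard `statistics` module.

```python
import statistics

# Per capita CO2 emissions (metric tons/person) for the 8 countries.
co2_values = [2.3, 1.1, 19.7, 1.2, 9.8, 1.8, 0.7, 0.2]

co2_mean = statistics.mean(co2_values)
co2_median = statistics.median(co2_values)

# Count how many observations fall below the mean.
below_mean = sum(1 for v in co2_values if v < co2_mean)

print(f"mean = {co2_mean:.1f}, median = {co2_median:.1f}")
print(f"{below_mean} of {len(co2_values)} observations are below the mean")
```

Here the mean (about 4.6) is pulled toward the right tail and ends up above 6 of the 8 observations, while the median (about 1.5) stays in the middle of the ordered data.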

Page 25: Section 5.1

Exercises

• The workers and management of a company are having a labor dispute. Explain why workers might use the median income of all employees to justify a raise but management might use the mean to argue that a raise is not needed.

• The mean age of four people in a room is 30 years. A new person whose age is 55 years enters the room. What is the mean age of the five people in the room?