chapter 1 part 1

Chapter 1: Sampling and Data 1/25/2017

Upload: jason-edington

Post on 07-Feb-2017

TRANSCRIPT

Page 1: Chapter 1 part 1

Chapter 1: Sampling and Data
1/25/2017

Page 2: Chapter 1 part 1

Chapter 1: Sampling and Data

• In this course, we’ll learn how to organize and summarize data.
• This is called ‘descriptive statistics’.
• Two common ways to summarize data are:
• Graphing
• Numbers (finding the average, for example).

Page 3: Chapter 1 part 1

Chapter 1: Sampling and Data

• After we have studied probability (a mathematical tool used to study randomness) and probability distributions, we will use formal methods for drawing conclusions from “good” data.
• The formal methods are called inferential statistics.
• Statistical inference uses probability to determine how confident we can be that our conclusions are correct.

Page 4: Chapter 1 part 1

Variables

• What sex are you (M or F)?
• This is an example of a binomial variable.
• Binomial variables only allow us to choose one of two categories.
• We can’t find the ‘average’ of genders.
• We can only say that __% of the group is male, and __% of the group is female.

• Sex is also a categorical or qualitative variable, as opposed to a quantitative variable.
• That is, male and female are qualities, not numbers.
• We could assign #1 to being female, and #2 to being male, but this would be meaningless.

Page 5: Chapter 1 part 1

Are you a native of Mendocino or Lake County?

• Here, we need either a ‘Yes’ or a ‘No’.
• This is another example of a categorical, binomial variable.
• Please answer yes if you are native to either county.
• What if you were actually born in Santa Rosa, though your parents lived in Ukiah, and you were brought home to Ukiah? Are you still a native?
• That is up to you to decide.

Page 6: Chapter 1 part 1

Are you a graduate of a Mendocino or Lake County high school?

• Another qualitative, binomial variable, with yes or no for an answer.
• What if you are still in high school in Lake or Mendocino County?
• Should you answer ‘yes’ or ‘no’?
• It’s up to you. You are probably going to graduate, so you can answer yes.

Page 7: Chapter 1 part 1

What is your favorite color?

• Clearly, there are more than two choices here.
• This is called a multinomial variable, meaning ‘many names’.
• This is still a categorical variable; it is not numerical or quantitative.
• Please put down only one color, so don’t choose ‘blue or green’.
• You can use as many descriptive adjectives as you like.
• If you do not have a favorite color, that’s ok; you can leave it blank, or you can choose one that you just kind of like.

Page 8: Chapter 1 part 1

What is your ZIP code?

• Your answer should be a five-digit number.
• It will likely be 9 5 4 __ __ .

• Even though the ZIP code is a number, it doesn’t act like a number.
• You can’t give an average ZIP code for a group; that doesn’t make any sense.
• You can’t compare ZIP codes for size.
• People from Willits cannot claim they’re better than people from Ukiah because their ZIP code (95490) is 8 bigger than Ukiah’s (95482).
• The only way to analyze a set of ZIP codes is by saying what percent of a group has each ZIP code.

Page 9: Chapter 1 part 1

What is your ZIP code?

• A ZIP code is a multinomial, categorical variable.
• But there is something numerical about ZIP codes.
• After all, they are numbers.
• Thus, you could say that ZIP codes are a quantitative variable, but at the very lowest rung on a ladder which we call levels of measurement.
• This first level is the nominal level, meaning that the number is really only being used as a name.
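As a rough illustration of analyzing a nominal variable, the sketch below reports what percent of a group has each ZIP code. The list of responses is made up for the example (only the Ukiah and Willits codes come from the slides), and `percent_by_category` is a hypothetical helper name.

```python
from collections import Counter

# Made-up survey responses; stored as strings because the digits act as names.
zip_codes = ["95482", "95490", "95482", "95482", "95490"]

def percent_by_category(values):
    """Return {category: percent of responses} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * count / total for category, count in counts.items()}

shares = percent_by_category(zip_codes)
for code, pct in sorted(shares.items()):
    print(f"{code}: {pct:.0f}%")
# prints:
# 95482: 60%
# 95490: 40%
```

Storing the codes as strings rather than integers is a deliberate choice here: it makes it harder to compute an average by accident, which would be meaningless at the nominal level.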

Page 10: Chapter 1 part 1

The next level is represented by these questions:

• Which statement best characterizes your attitude towards taking math classes?

1. If I never have to take another math class it will be too soon.
2. I would prefer not to take a math class.
3. I can take them or leave them.
4. I enjoy math classes.
5. I adore taking math; it’s my favorite thing to study.

Page 11: Chapter 1 part 1

The next level is represented by these questions:

• Which statement best characterizes your attitude towards social media? (Facebook, Snapchat, Twitter, etc.)

1. I never use social media and can’t stand it.
2. I seldom use social media.
3. I can take it or leave it.
4. I enjoy social media.
5. I love using social media; I can’t get through the day without it.

Page 12: Chapter 1 part 1

The next level is represented by these questions:

• The answers to both of these questions are clearly numbers, and they function as numbers in many ways.
• You could say that the name of your attitude toward math classes is 5 or 1, but that doesn’t really convey your response.
• Answering with 5 means you have a very positive attitude about the subject; 1 a very negative one.
• There is an order to the responses.
• The larger the number, the more positive the response.

• We call this the ordinal level of measurement.

Page 13: Chapter 1 part 1

The next level is represented by these questions:

• When analyzing a set of responses, we might give the percentage for each answer.
• We might also pool the 5’s and the 4’s as those with positive attitudes.
• Or the 1’s and 2’s as those with negative attitudes.
• Or the 3’s, 4’s, and 5’s as those who don’t have negative attitudes, etc.
• We can rank the attitudes in a way that makes sense, hence the name ordinal.
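The pooling described above can be sketched in a few lines. The responses are invented for illustration, and the group labels and the `pool` helper are assumptions, not part of the survey.

```python
from collections import Counter

# Invented answers on the 1-5 attitude scale.
responses = [5, 4, 4, 3, 1, 2, 5, 3, 4, 1]

def pool(answer):
    """Collapse a 1-5 answer into a coarser group that keeps the ordering."""
    if answer >= 4:
        return "positive"
    if answer <= 2:
        return "negative"
    return "neutral"

pooled = Counter(pool(a) for a in responses)
percents = {group: 100 * n / len(responses) for group, n in pooled.items()}

for group in ("negative", "neutral", "positive"):
    print(f"{group}: {percents[group]:.0f}%")
# prints:
# negative: 30%
# neutral: 20%
# positive: 50%

# "Not negative" pools the 3's, 4's, and 5's.
not_negative = sum(1 for a in responses if a >= 3)
print(not_negative)  # prints 7
```

Note that the code only compares answers with `>=` and `<=`; it never subtracts or divides them, since only the rank of an ordinal response carries meaning.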

Page 14: Chapter 1 part 1

The next level is represented by these questions:

• However, there are some characteristics of numbers that are lacking here.
• We can’t say that a person who answers with a 5 is more positive than a person who answers with a 4 to the same extent that a person who answers with a 4 is more positive than a person who answers with a 3.

• In other words, we can’t make inferences about the differences between the responses, only their rank.
• It would be ridiculous to assert that a person answering 4 is twice as positive as a person answering 2.
• These sorts of numerical characteristics belong to the remaining, higher levels of measurement.

Page 15: Chapter 1 part 1
Page 16: Chapter 1 part 1

If you own a car, what year is it?

• Your answer should be a four-digit number starting with either 19 or 20.
• No fractions, please.
• If you don’t own a car, leave it blank.
• If you own more than one car, choose one.

• The year your car was made is clearly a number, and the newer the car, the bigger the number.
• Thus, this variable is also ordinal, but it’s also something more.
• A car made in 2012 is 5 years newer than a car made in 2007.
• Thus, the differences between years have significance.

• We call these differences intervals, and so year of car is at the interval level of measurement.

Page 17: Chapter 1 part 1

Interval Level of Measurement

• We can reasonably say how much newer or older one car is than another.
• We can also say that one car is half as old, or twice as old, as another, but only after subtracting the years the cars were made from the current year; the ratio of the years themselves tells us nothing.
• If cars had been made in the year 1000, we couldn’t say that such a car is twice as old as a car made in the year 2000 just because 2000 is twice 1000.
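A small sketch of the distinction, assuming the current year is 2017 (the year of the lecture). The function names are made up for the example.

```python
CURRENT_YEAR = 2017  # assumption: the year this lecture was given

def years_newer(year_a, year_b):
    """Difference between model years: meaningful at the interval level."""
    return year_a - year_b

def age_ratio(year_a, year_b, current_year=CURRENT_YEAR):
    """How many times as old car A is as car B, using ages rather than years."""
    age_a = current_year - year_a
    age_b = current_year - year_b
    if age_b == 0:
        raise ValueError("car B is brand new, so the ratio is undefined")
    return age_a / age_b

print(years_newer(2012, 2007))  # prints 5
print(age_ratio(2007, 2012))    # prints 2.0 (10 years old vs. 5 years old)

# Dividing the raw years runs without error, but the result has no
# interpretation, because year 0 is not the point where "no age" begins.
meaningless = 2007 / 2012
```

Subtracting from the current year converts the interval-level variable (model year) into a ratio-level one (age), which has a true zero.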

Page 18: Chapter 1 part 1

What size shoes do you wear?

• Please put down just one number: a whole number, or a whole number followed by .5.
• I know that sizes differ from men to women, but you’ve already stated your sex.
• Shoe size is definitely a number, definitely ordinal, and definitely at the interval level of measurement.
• Size 10 is the same amount bigger than 8 as 8 is bigger than 6.
• (Each size bigger is supposed to be ⅓ inch longer.)

• Shoe sizes still lack that last, most sophisticated property of numbers: the ability to say that one size is twice as big as another.
• The size might be, but the shoes aren’t.

Page 19: Chapter 1 part 1

How old did you turn on your last birthday?

• Please use a whole number.
• Age on last birthday is at the ordinal and interval levels of measurement, but there is something more.
• We can say that one person is twice as old as another if one person is 40 and the other 20.
• We can discuss the ratio of their ages as 2:1.
• Thus, this variable is at the ratio level of measurement, which is the highest level you can achieve.

Page 20: Chapter 1 part 1

How old did you turn on your last birthday?

• You can also talk meaningfully about an age of 0.
• Notice that you were asked to give your age as a whole number.
• Specifically, you were asked to round down to the last whole number of years.

• This is because you do not count your age, you measure it.
• This is a fundamental distinction between different uses of numbers for different variables.
• Some variables are measured, like age, or height, and some are counted.
• When you measure a variable, you always have to determine how fine a measurement to use.
• You could give your age to the nearest year, or month, or week, or day, or hour, or minute, or… but you’d have to keep changing your answer as you get more precise.
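To make the rounding-down idea concrete, here is a sketch that computes age on last birthday from a birth date, next to a finer measurement of the same age in days. The birth date is invented, and both function names are hypothetical.

```python
from datetime import date

def age_on_last_birthday(born, today):
    """Whole years completed: the measured age rounded down."""
    years = today.year - born.year
    # If this year's birthday has not happened yet, the last year is incomplete.
    if (today.month, today.day) < (born.month, born.day):
        years -= 1
    return years

def age_in_days(born, today):
    """The same age measured more finely."""
    return (today - born).days

born = date(1997, 3, 15)   # invented birth date
today = date(2017, 1, 25)  # date of the lecture

print(age_on_last_birthday(born, today))  # prints 19
print(age_in_days(born, today))
```

The age in days changes every day, while the age on last birthday changes once a year; both describe the same continuous quantity at different levels of fineness.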

Page 21: Chapter 1 part 1

How old did you turn on your last birthday?

• We say that age is a continuous variable, meaning that it could be measured to any degree of fineness.
• Only variables which are measured are continuous.

Page 22: Chapter 1 part 1

How many pets do you have?

• What is a pet? Is a goat a pet, or a farm animal? It’s up to you!
• How about pets that belong to the whole family? It’s up to you!
• Clearly, this should be a whole number, and 0 is fine.
• Number of pets is at the ratio level of measurement: you can have twice as many pets as someone else, and you can have 0 pets.

• However, you cannot have 4 ½ pets, or anything ending in a fraction.

• This is because you don’t measure your pets; you count them.

Page 23: Chapter 1 part 1

How many pets do you have?

• Number of pets is a variable which is counted, and we call that a discrete variable.
• (Note, that is discrete, not discreet.)

• Think of the word ‘concrete’.
• In Latin, ‘con’ is ‘with’ and ‘crete’ is ‘grown’, so concrete is ‘grown together’.
• As a memory aid, think of discrete as ‘grown apart’ (strictly, its Latin root means ‘separated’).
• Discrete numbers have gaps between them.
• You can have 3 pets or 4 pets, but no number of pets between these two numbers.

Page 24: Chapter 1 part 1

The Survey

• Be sure to complete the survey on Canvas by Sunday night.
• We will use the data collected from the class during class.
• All answers are anonymous.
• We may combine the data with the other statistics classes on campus as well, but again, it will all remain anonymous.