chapter 1 part 1
Chapter 1: Sampling and Data
1/25/2017
• In this course, we’ll learn how to organize and summarize data.
• This is called ‘descriptive statistics’.
• Two common ways to summarize data include:
• Graphing
• Numbers (finding the average, for example).
• After we have studied probability (a mathematical tool used to study randomness) and probability distributions, we will use formal methods for drawing conclusions from “good” data.
• The formal methods are called inferential statistics.
• Statistical inference uses probability to determine how confident we can be that our conclusions are correct.
Variables
• What sex are you (M or F)?
• This is an example of a binomial variable.
• Binomial variables only allow us to choose one of two categories.
• We can’t find the ‘average’ of genders.
• We can only say that __% of the group is male, and __% of the group is female.
• Sex is also a categorical or qualitative variable, as opposed to a quantitative variable.
• That is, male and female are qualities, not numbers.
• We could assign #1 to being female, and #2 to being male, but this would be meaningless.
Are you a native of Mendocino or Lake County?
• Here, we need either a ‘Yes’ or a ‘No’.
• This is another example of a categorical, binomial variable.
• Please answer yes if you are native to either county.
• What if you were actually born in Santa Rosa, though your parents lived in Ukiah, and you were brought home to Ukiah? Are you still a native?
• That is up to you to decide.
Are you a graduate of a Mendocino or Lake County high school?
• Another qualitative, binomial variable, with yes or no for an answer.
• What if you are still in high school in Lake or Mendocino County?
• Should you answer ‘yes’ or ‘no’?
• It’s up to you. You are probably going to graduate, so you can answer yes.
What is your favorite color?
• Clearly, there are more than two choices here.
• This is called a multinomial variable – many names.
• This is still a categorical variable – it is not numerical or quantitative.
• Please put down only one color, so don’t choose ‘blue or green’.
• You can use as many descriptive adjectives as you like.
• If you do not have a favorite color, that’s ok; you can leave it blank, or you can choose one that you just kind of like.
What is your ZIP code?
• Your answer should be a five-digit number.
• It will likely be 9 5 4 __ __ .
• Even though the ZIP code is a number, it doesn’t act like a number.
• You can’t give an average ZIP code for a group – that doesn’t make any sense.
• You can’t compare ZIP codes for size.
• People from Willits cannot claim they’re better than people from Ukiah because their ZIP code (95490) is 8 bigger than Ukiah’s (95482).
• The only way to analyze a set of ZIP codes is by saying what percent of a group has each ZIP code.
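As a rough illustration of "what percent of a group has each ZIP code", here is a minimal Python sketch. The list of responses is made up for the example, and `percent_by_category` is a hypothetical helper name, not something from the course materials.

```python
from collections import Counter

def percent_by_category(values):
    """Return {category: percent of the group}, treating each value as a name."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * count / total for category, count in counts.items()}

# Hypothetical responses. ZIP codes are stored as strings because they act
# as names, not as amounts: we count them, we never average them.
zip_codes = ["95482", "95482", "95490", "95482", "95490"]

percents = percent_by_category(zip_codes)
for zip_code, percent in sorted(percents.items()):
    print(f"{zip_code}: {percent:.0f}% of the group")
```

The same helper would work for any categorical variable in the survey, such as sex or favorite color, because it only counts how often each name appears.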
• A ZIP code is a multinomial, categorical variable.
• But there is something numerical about ZIP codes; after all, they are numbers.
• Thus, you could say that ZIP codes are a quantitative variable, but at the very lowest rung on a ladder which we call levels of measurement.
• This first level is the nominal level, meaning that the number is really only being used as a name.
The next level is represented by these questions:
• Which statement best characterizes your attitude towards taking math classes?
1. If I never have to take another math class it will be too soon.
2. I would prefer not to take a math class.
3. I can take them or leave them.
4. I enjoy math classes.
5. I adore taking math; it’s my favorite thing to study.
• Which statement best characterizes your attitude towards social media? (Facebook, Snapchat, Twitter, etc.)
1. I never use social media and can’t stand it.
2. I seldom use social media.
3. I can take it or leave it.
4. I enjoy social media.
5. I love using social media; I can’t get through the day without it.
• The answers to both of these questions are clearly numbers, and they function as numbers in many ways.
• You could say that the name of your attitude toward math classes is 5 or 1, but that doesn’t really convey your response.
• Answering with 5 means you have a very positive attitude about the subject; 1 a very negative one.
• There is an order to the responses;
• The larger the number the more positive the response.
• We call this the ordinal level of measurement.
• Analyzing a set of responses, we might give the percents for each answer.
• We might also pool the 5’s and the 4’s as those with positive attitudes,
• Or the 1’s and 2’s as those with negative attitudes,
• Or the 3’s, 4’s, and 5’s as those who don’t have negative attitudes, etc.
• We can rank the attitudes in a way that makes sense, hence the name ordinal.
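The pooling described above could be sketched as follows. The responses are invented for illustration, and `pooled_percent` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical answers to the math-attitude question
# (1 = most negative, 5 = most positive).
responses = [5, 4, 4, 3, 2, 1, 4, 5, 3, 4]

counts = Counter(responses)
total = len(responses)

# Percent of the group giving each answer, listed in rank order.
percent_each = {answer: 100 * counts[answer] / total for answer in range(1, 6)}

def pooled_percent(answers):
    """Percent of the group whose response is one of the given answers."""
    return 100 * sum(counts[a] for a in answers) / total

positive = pooled_percent([4, 5])         # the 4's and 5's
negative = pooled_percent([1, 2])         # the 1's and 2's
not_negative = pooled_percent([3, 4, 5])  # the 3's, 4's, and 5's

print(percent_each)
print(positive, negative, not_negative)
```

Note that the sketch only counts and groups the answers by rank. It never averages or subtracts them, since those operations go beyond what this level supports.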
• However, there are some characteristics of numbers that are lacking here.
• We can’t say that a person who answers with a 5 is more positive than a person who answers with a 4 to the same extent that a person who answers with a 4 is more positive than a person who answers with a 3.
• In other words, we can’t make inferences about the differences between the responses, only their rank.
• It would be ridiculous to assert that a person answering 4 is twice as positive as a person answering 2.
• These sorts of numerical characteristics belong to the remaining, higher levels of measurement.
If you own a car, what year is it?
• Your answer should be a four-digit number starting with either 19 or 20.
• No fractions, please.
• If you don’t own a car, leave it blank.
• If you own more than one car, choose one.
• The year your car was made is clearly a number, and the newer the car, the bigger the number.
• Thus, this variable is also ordinal, but it’s also something more.
• A car made in 2012 is 5 years newer than a car made in 2007.
• Thus, the differences between years have significance.
• We call these differences intervals and so year of car is at the interval level of measurement.
Interval Level of Measurement
• We can reasonably say how much newer or older one car is than another.
• However, although we can say that a certain car is half as old as another, or twice as old, we can’t do so using the years the cars were made without subtracting them from the current year.
• If cars had been made in the year 1000, we couldn’t say that such a car is twice as old as a car made in the year 2000.
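A small Python sketch may make the point concrete. It assumes the current year is 2017 (taken from the date on the title slide), and the function names are hypothetical.

```python
CURRENT_YEAR = 2017  # assumption: the year on the title slide (1/25/2017)

def years_newer(year_a, year_b):
    """Difference between two model years; meaningful at the interval level."""
    return year_a - year_b

def age_ratio(year_a, year_b, current_year=CURRENT_YEAR):
    """How many times as old car A is as car B, using ages rather than years.

    Ages are measured from current_year, so car B must not be from
    current_year itself (its age would be 0).
    """
    return (current_year - year_a) / (current_year - year_b)

print(years_newer(2012, 2007))  # a difference of years is meaningful
print(2000 / 1000)              # a ratio of the years themselves is not
print(age_ratio(2007, 2012))    # a ratio of ages is meaningful
print(age_ratio(1000, 2000))    # nowhere near 2
```

Dividing the years directly gives 2, but a car from the year 1000 would be 1017 years old in 2017 and a car from 2000 would be 17 years old, so the first is roughly 60 times as old, not twice as old.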
What size shoes do you wear?
• Please, just put down one number, or a whole number followed by .5, and leave it at that.
• I know that sizes differ between men and women, but you’ve already stated your sex.
• Shoe size is definitely a number, definitely ordinal, and definitely at the interval level of measurement.
• Size 10 is the same amount bigger than 8 as 8 is bigger than 6.
• (Each size bigger is supposed to be ⅓ inch longer.)
• Shoe sizes still lack that last, most sophisticated property of numbers – the ability to say that one size is twice as big as another.
• The size might be, but the shoes aren’t.
How old did you turn on your last birthday?
• Please use a whole number.
• Age on last birthday is at the ordinal and interval levels of measurement, but there is something more.
• We can say that one person is twice as old as another if one person is 40 and the other 20.
• We can discuss the ratio of their ages as 2:1.
• Thus, this variable is at the ratio level of measurement, which is the highest level you can achieve.
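The four levels met so far can be summarized in a short Python sketch. The wording of each operation and the names `LEVELS`, `ADDED_OPERATION`, and `meaningful_operations` are assumptions made for this summary; the assignments of survey questions to levels follow the slides.

```python
# Levels of measurement, from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# What each level adds on top of the levels below it.
ADDED_OPERATION = {
    "nominal": "percent in each category",
    "ordinal": "ranking",
    "interval": "differences",
    "ratio": "ratios",
}

# Survey variables and the level assigned to each in the slides.
SURVEY_VARIABLES = {
    "ZIP code": "nominal",
    "attitude toward math classes": "ordinal",
    "year of car": "interval",
    "shoe size": "interval",
    "age on last birthday": "ratio",
}

def meaningful_operations(level):
    """List the operations that make sense at the given level."""
    position = LEVELS.index(level)
    return [ADDED_OPERATION[name] for name in LEVELS[: position + 1]]

for variable, level in SURVEY_VARIABLES.items():
    print(f"{variable} ({level}): {', '.join(meaningful_operations(level))}")
```

Each level keeps everything the lower levels allow, which is why age on last birthday was described as ordinal and interval as well as ratio.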
• You can also talk meaningfully about an age of 0.
• Notice that you were asked to give your age as a whole number.
• Specifically, you were asked to round down to the nearest whole number.
• This is because you do not count your age, you measure it.
• This is a fundamental distinction between different uses of numbers for different variables.
• Some variables are measured, like age, or height, and some are counted.
• When you measure a variable, you always have to determine how fine a measurement to use.
• You could give your age to the nearest year, or month, or week, or day, or hour, or minute, or…but you’d have to keep changing your answer as you get more precise.
• We say that age is a continuous variable, meaning that it could be measured to any degree of fineness.
• Only variables which are measured are continuous.
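To see how one measured age gives different answers at different degrees of fineness, here is a minimal sketch using Python's standard `datetime` module. The birth date is invented, "today" is assumed to be the date on the title slide, and the function names are hypothetical.

```python
from datetime import date

def age_on_last_birthday(born, today):
    """Whole years completed: the measured age rounded down."""
    had_birthday_this_year = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday_this_year else 1)

def age_in_days(born, today):
    """The same age measured more finely, in whole days."""
    return (today - born).days

born = date(1997, 3, 15)   # hypothetical birth date
today = date(2017, 1, 25)  # assumption: the date on the title slide

print(age_on_last_birthday(born, today))  # age in whole years
print(age_in_days(born, today))           # age in whole days
print(age_in_days(born, today) / 365.25)  # approximate age in fractional years
```

The answer in whole years stays the same until the next birthday, while the answer in days changes every day, which is the sense in which a finer measurement forces you to keep changing your answer.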
How many pets do you have?
• What is a pet? Is a goat a pet, or a farm animal? It’s up to you!
• How about pets that belong to the whole family? It’s up to you!
• Clearly, this should be a whole number, and 0 is fine.
• Number of pets is at the ratio level of measurement;
• you can have twice as many pets as someone else,
• and you can have 0 pets.
• However, you cannot have 4 ½ pets, or anything ending in a fraction.
• This is because you don’t measure your pets, but you count them.
• Number of pets is a variable which is counted, and we call that a discrete variable.
• (Note that this is discrete, not discreet.)
• Think of the word ‘concrete’.
• Latin – ‘crete’ is ‘growing’, ‘con’ is ‘with’, so concrete is growing together.
• By contrast, think of discrete as ‘apart’ (its Latin root means ‘separated’).
• Discrete numbers have gaps between them.
• You can have 3 pets or 4 pets, but no number of pets between these two numbers.
The Survey
• Be sure to complete the survey on Canvas by Sunday night.
• We will use the data collected from the class in our class work.
• All answers are anonymous.
• We may combine the data with the other statistics classes on campus as well, but again, it will all remain anonymous.