Chapter 12 Sample Surveys
Richard Dong & Stanley Chen
Vocabulary
● Population - the entire group of individuals or instances about whom we hope to learn
● Sample - a (representative) subset of a population, examined in the hope of learning about the population
● Sample Survey - a study that asks questions of a sample drawn from some population in the hope of learning about the entire population; polls taken to assess voter preferences are common examples
● Bias - any systematic failure of a sampling method to represent its population; biased sampling methods tend to over- or underestimate parameters. It is almost impossible to recover from bias, so efforts to avoid it are well spent
● Randomization - the best defense against bias, in which each individual is given a fair, random chance of selection
● Sample size - the number of individuals in a sample; the sample size, not the fraction of the population sampled, determines how well the sample represents the population
● Census - a sample that consists of the entire population
● Population parameter - a numerically valued attribute of a model for a population.
Vocabulary
● Statistic - a value calculated from sampled data
● Representative - a sample is representative if the statistics computed from it accurately reflect the corresponding population parameters
● Simple Random Sample (SRS) - a sample of size n in which each set of n elements in the population has an equal chance of selection
● Sampling Frame - a list of individuals from which the sample is drawn
● Sampling variability - the natural tendency of randomly drawn samples to differ, one from another
● Stratified random sampling - the population is divided into subpopulations, or strata, and random samples are then drawn from each stratum; works best if each stratum is homogeneous but the strata differ from each other
● Cluster sampling - entire groups, or clusters, are chosen at random, usually as a matter of convenience, practicality, or cost; each cluster should be representative of the population, so clusters should be heterogeneous within and similar to each other
● Multistage sampling - a sampling scheme that combines several different sampling methods
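The three random designs defined above can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the school, the grade names, the homerooms, and the helper function names are all hypothetical and do not come from the chapter.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n individuals so that every set of n is equally likely."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical school of 60 students (invented for this sketch).
rng = random.Random(12)
students = [f"student{i:02d}" for i in range(60)]

# Strata: grades, assumed homogeneous within and different from each other.
grades = {
    "grade9": students[:20],
    "grade10": students[20:40],
    "grade11": students[40:],
}

# Clusters: homerooms that each mix all grades, like small copies of the school.
homerooms = {f"room{k}": students[k::4] for k in range(4)}

srs = simple_random_sample(students, 6, rng)
strat = stratified_sample(grades, 2, rng)
clus = cluster_sample(homerooms, 1, rng)
```

Note the difference in what gets randomized: the stratified design samples individuals inside every grade, while the cluster design samples whole homerooms and then keeps everyone in the chosen room.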
Vocabulary
● Systematic sample - individuals are selected systematically from a sampling frame (ex: every 10th person); the starting individual must be chosen at random
● Pilot - a small trial run of a survey
● Voluntary Response Bias - bias that arises when individuals can choose on their own whether or not to participate in the sample
● Convenience Sample - a sample taken from individuals who are conveniently available
● Undercoverage - some part of the population is not sampled at all or is represented less than it should be
● Nonresponse bias - arises when a large fraction of those sampled fail to respond; those who do respond are not likely to represent the whole population; ex: telephone survey
● Response bias - anything in the survey, such as the wording of a question, that influences a respondent's answer; ex: "How do you feel about the cost cuts to local zoos that are making animals starve to death?"
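The systematic sample defined on this slide (every 10th person, with a random start) might look like the following in Python. The sampling frame of 100 numbered individuals and the function name are assumptions made for illustration only.

```python
import random

def systematic_sample(frame, k, rng):
    """Take every k-th individual from the frame after a random start.

    The start is chosen at random among the first k positions, which is
    what keeps the method from always picking the same individuals.
    """
    start = rng.randrange(k)
    return frame[start::k]

# Hypothetical sampling frame: individuals numbered 1 through 100.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, random.Random(0))
```

With a frame of 100 and k = 10, the sample always contains 10 individuals spaced 10 apart; only the starting position changes from run to run.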
Formulas
● There are no formulas for this chapter: the unit covers survey design, so no calculations are required.
Concepts
● Representative samples can offer us important insights about a population.
● It is the size of the sample, not the fraction of the population sampled, that matters most.
● Simple Random Sample (SRS) is the standard. Every possible sample of size n has an equal chance of selection.
● Stratified samples reduce variability by identifying homogeneous subgroups and using random sampling within those groups.
● Cluster samples randomly select among heterogeneous subgroups that each resemble the population, only smaller.
● Systematic samples are often the least expensive and can work in some situations, provided the starting point is chosen at random.
● Multistage samples combine several of the aforementioned methods.
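The claim that sample size matters more than the fraction of the population sampled can be checked with a small simulation. The sketch below is not from the chapter: the two population sizes, the 60% "yes" share, and the function name are invented assumptions. It draws many simple random samples and measures how much the sample proportion varies from sample to sample.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, n, reps, rng, yes_share=0.6):
    """Standard deviation of the sample proportion over many SRSs.

    Individuals are numbered 0..pop_size-1; those numbered below
    yes_share * pop_size are the ones who would answer "yes".
    """
    cutoff = int(yes_share * pop_size)
    proportions = []
    for _ in range(reps):
        sample = rng.sample(range(pop_size), n)  # SRS without replacement
        proportions.append(sum(i < cutoff for i in sample) / n)
    return statistics.pstdev(proportions)

rng = random.Random(2024)

# Same sample size (100), very different fractions of the population.
small_town = spread_of_sample_proportions(10_000, 100, 2000, rng)
big_city = spread_of_sample_proportions(1_000_000, 100, 2000, rng)

# Same big population, larger sample size (400).
big_city_n400 = spread_of_sample_proportions(1_000_000, 400, 2000, rng)
```

Under these assumptions, `small_town` and `big_city` should come out close to each other even though one sample covers 1% of its population and the other 0.01%, while `big_city_n400` should be noticeably smaller than both.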
Concepts
● Bias can destroy our insights through poor sampling methods.
● Nonresponse bias arises when sampled individuals will not or cannot respond.
● Response bias arises when sampled individuals might be influenced by question wording or interviewer behavior.
● Voluntary response samples are almost always biased and should be avoided.
● Convenience samples are likely to be flawed for similar reasons.
● Even with a reasonable design, sampling frames may not be representative. Undercoverage occurs when individuals from some subgroup of the population are selected less often than they should be.
● Always report your sampling methods and any possible sources of bias so that others can evaluate the fairness and accuracy of your results.
Problem Example #21
● Question: Examine each of the following questions for possible bias. If you think the question is biased, indicate how and propose a better question.
o Should companies that pollute the environment be compelled to pay the costs of cleanup?
o Given that 18-year-olds are old enough to vote and to serve in the military, is it fair to set the drinking age at 21?
● Answers
o The question is biased toward ‘yes’ because of the loaded word ‘pollute’. A better question is ‘Should companies be responsible for any costs of environmental cleanup?’
o The question is biased toward ‘no’ because of the preamble ‘old enough to vote and to serve in the military’. A better question is ‘Do you think the drinking age should be lowered from 21?’
Problem Example #23
● Question: Anytime we conduct a survey we must take care to avoid undercoverage. Suppose we plan to select 500 names from the city phone book, call their homes between noon and 4 p.m., and interview whoever answers, anticipating contacts with at least 200 people.
o Why is it difficult to use a simple random sample here?
o Describe a more convenient, but still random, sampling strategy.
o What kinds of households are likely to be included in the eventual sample of opinion? Who will be excluded?
o Suppose, instead, that we continue calling each number, perhaps in the morning or evening, until an adult is contacted and interviewed. How does this improve the sampling design?
o Random digit dialing machines can generate the phone calls for us. How would this improve our design? Is anyone still excluded?
● Answers
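o People with unlisted numbers, people without phones, and people who are at work during calling hours cannot be reached. As a result, not everyone has an equal chance of selection.
o We can generate random numbers to choose whom to call and call at random times, which gives everyone a more equal chance of selection.
o Under the original plan, households with someone at home between noon and 4 p.m. are more likely to be included; households where everyone is out during those hours are excluded.
o Under the callback plan, many more people could be included, because households that are empty in the afternoon can still be reached. However, people without phones are still excluded.
o Random digit dialing randomizes the phone numbers and can reach unlisted numbers, but time of day is still an issue, and people without phones are still excluded.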
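The effect of calling only between noon and 4 p.m. can be illustrated with a toy simulation. Everything in it is invented for illustration: the city of 10,000 listed households, the share of households with someone home in the afternoon, and the levels of support for some proposal. It does not model unlisted numbers or people without phones.

```python
import random

# Hypothetical city of 10,000 listed households. Each record is
# (home_between_noon_and_4pm, supports_proposal). All counts are invented.
households = (
    [(True, True)] * 2100 + [(True, False)] * 900 +
    [(False, True)] * 2800 + [(False, False)] * 4200
)
true_share = sum(supports for _, supports in households) / len(households)

rng = random.Random(23)
called = rng.sample(households, 500)  # SRS of 500 listed households

# Original plan: only households with someone home in the afternoon answer.
afternoon = [supports for home, supports in called if home]
afternoon_estimate = sum(afternoon) / len(afternoon)

# Callback plan: keep calling until every sampled household is reached.
callback_estimate = sum(supports for _, supports in called) / len(called)
```

In this made-up city, households that are home in the afternoon support the proposal more often than the rest, so `afternoon_estimate` should overshoot `true_share`, while `callback_estimate` should land much closer to it. The callback plan helps only with the time-of-day problem; anyone missing from the phone book is still left out of both estimates.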
Thank you.