FPP Chapter 19
Surveys
General Idea
Parameter
Statistic
Inference
Sample
Population
Some new vocabulary
Population
Sample
Parameter
Statistic
Inference
Bias
Non-response bias
Response bias
Simple random sample
Convenience sampling
Frame coverage bias
Judgment sampling
Voluntary sampling
Probably others that I’ve missed
Plan of Study
1. Issues in questionnaire design
2. Methods for selecting units to survey
3. Administration of surveys
Challenges to writing good questions
1. Defining objectives and specifying the kind of answers needed to meet the objectives of the question
2. Ensuring all respondents have a shared, common understanding of the question
3. Ensuring people are asked questions to which they know the answers
4. Asking questions respondents are able to answer in the terms required by the question
5. Asking questions respondents are willing to answer accurately
6. Asking questions that don’t lead the respondent to a certain answer
Steps to running a survey
1. Establish the target population
2. Obtain a sampling frame (this can be very difficult)
3. Select a sample
4. Obtain data from the sampled units
Misspecifying target population
1994 Democratic gubernatorial primary in Arizona
All polls predicted Eddie Basha would trail the front-runner by at least 9 points
Result of election: Basha won
Target population used in polls: registered voters who had voted in previous primaries
Surveys that use census as sampling frame
The U.S. census is often used as the frame for many federal and social surveys
Target population here is folks living in the U.S.
The U.S. census misses some people. Can you think of any examples?
Because the frame omits part of the target population, samples taken from it are non-representative even before any sampling is done
Selecting samples
Units sampled should be representative of the target population
How do we ensure this?
Select a subset of units from the frame at random
Most common method is to obtain a “simple random sample”
If the random sample is large enough, it should have characteristics that mirror the characteristics of the population in the frame.
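A simple random sample can be sketched in a few lines of Python. This is only an illustration: the frame below is made-up data (10,000 units with a numeric characteristic), and the names `frame` and `draw_srs` are assumptions, not anything from the text. The frame mean plays the role of the parameter and the sample mean plays the role of the statistic.

```python
import random
import statistics

def draw_srs(frame, n, seed=0):
    """Draw a simple random sample of n units, without replacement.

    Every subset of size n is equally likely to be chosen.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: 10,000 units, values made up for illustration.
_rng = random.Random(19)
frame = [_rng.gauss(50, 10) for _ in range(10_000)]

sample = draw_srs(frame, 1_000)

frame_mean = statistics.mean(frame)    # parameter (for the frame)
sample_mean = statistics.mean(sample)  # statistic (from the sample)
print(f"frame mean: {frame_mean:.2f}, sample mean: {sample_mean:.2f}")
```

With a sample of 1,000 the two means should land close together. Note that the sample can only mirror the frame, not units the frame leaves out.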
Obtaining survey data
Remember the following when designing a survey
Imperative that the purpose of the survey is stated clearly
Confidentiality should be promised and kept
At ISU there is a group that verifies that survey confidentiality requirements are met
Method for asking questions should be the same for all sampled units
Unreliable methods of selecting samples
What follows are examples of how NOT to select a sample
Convenience sampling: picking units that are easy to measure
Judgment sampling: picking units you judge as representative of the population
Voluntary response sampling: picking units who respond voluntarily
What are some examples of each?
Additional potential pitfalls
Nonresponse bias: units that do not respond differ from those that do. These folks will be underrepresented.
Frame coverage bias: frame doesn’t include all of the target population
Can we think of some examples?
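Nonresponse bias, as described above, can be illustrated with a small simulation. All numbers here are hypothetical: half of the units hold a "yes" opinion, "yes" holders respond 20% of the time and "no" holders respond 80% of the time. The function name and parameters are assumptions made for this sketch.

```python
import random

def simulate_nonresponse(n_units=20_000, p_yes=0.5,
                         respond_if_yes=0.2, respond_if_no=0.8, seed=0):
    """Return (true yes share, yes share among respondents).

    Response probability depends on the opinion itself, so the
    respondents differ systematically from the nonrespondents.
    """
    rng = random.Random(seed)
    answers = [rng.random() < p_yes for _ in range(n_units)]
    responded = [
        a for a in answers
        if rng.random() < (respond_if_yes if a else respond_if_no)
    ]
    true_share = sum(answers) / len(answers)
    observed_share = sum(responded) / len(responded)
    return true_share, observed_share

true_share, observed_share = simulate_nonresponse()
print(f"true yes share: {true_share:.3f}")
print(f"yes share among respondents: {observed_share:.3f}")
```

Under these made-up response rates the respondents show roughly a 20% "yes" share even though the true share is about 50%. Collecting more responses does not fix this, since the gap comes from who responds rather than how many.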
Example of voluntary response survey
Nightline call-in poll:
Ted Koppel asked people to call his show to express their opinion on whether the United Nations should continue to have its headquarters in New York
186,000 people called in, with 67% saying no.
Independent random sample: 72% said yes.
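A rough back-of-the-envelope calculation shows how strongly self-selection would have to operate to produce the call-in result. This is a sketch under loud assumptions: it treats the random sample as the truth, treats everyone who did not say "yes" as a "no" (28%), and assumes callers come from the same population. The function names are made up for illustration.

```python
def call_in_share(p_no, relative_propensity):
    """Share of callers saying 'no' when 'no' holders are
    relative_propensity times as likely to call as 'yes' holders."""
    weighted_no = p_no * relative_propensity
    return weighted_no / (weighted_no + (1 - p_no))

def implied_propensity(p_no, observed_no):
    """Relative call-in propensity that turns a true 'no' share p_no
    into an observed 'no' share among callers of observed_no."""
    return (observed_no * (1 - p_no)) / ((1 - observed_no) * p_no)

# Assumed true 'no' share of 28%, observed call-in 'no' share of 67%.
k = implied_propensity(p_no=0.28, observed_no=0.67)
print(round(k, 1))  # 5.2
```

Under these assumptions, people who wanted the U.N. out of New York would only need to be about five times as likely to pick up the phone for the call-in poll to flip the majority.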
Examples of problematic survey designs
Shere Hite’s book, Women and Love: A Cultural Revolution in Progress (1987), claims:
84% of women “not satisfied emotionally with their relationships” (pg. 804)
95% of women “report forms of emotional and psychological harassment from men with whom they are in love relationships” (pg. 810)
70% of women “married five or more years are having sex outside of their marriages” (pg. 856)
Hite’s survey
To whom did she send a survey?
100,000 questionnaires mailed to professional women’s groups, counseling centers, church societies, and senior citizens’ centers.
Her target population was women. What was her actual population?
Hite’s survey
What did the survey look like?
127 essay questions on questionnaire
4.5% of these questionnaires returned
What was not taken into account?
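One way to see why the low return rate matters is to work out how many women never replied and how little the respondents alone pin down. The sketch below assumes 100,000 questionnaires mailed and, purely for illustration, that the 84% figure applies to every returned questionnaire; the variable names are made up.

```python
mailed = 100_000        # questionnaires mailed
response_rate = 0.045   # 4.5% were returned

returned = round(mailed * response_rate)
not_returned = mailed - returned
print(returned, not_returned)  # 4500 95500

# Assumption for illustration: 84% of ALL respondents were
# "not satisfied emotionally with their relationships".
claimed_share = 0.84
dissatisfied = round(returned * claimed_share)

# Worst-case bounds on the share among everyone who was mailed a
# questionnaire: nonrespondents could all be satisfied (lower bound)
# or all be dissatisfied (upper bound).
lower = dissatisfied / mailed
upper = (dissatisfied + not_returned) / mailed
print(f"{lower:.1%} to {upper:.1%}")  # 3.8% to 99.3%
```

With 95,500 of the 100,000 women not replying, the respondents by themselves are compatible with almost any value for the whole group.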
Hite’s survey
How did she ask the questions?
Questions use vague words like “love”. People have different interpretations of such words.
Questions were leading:
“Does your husband/lover treat you as an equal? Or are there times when he seems to treat you as an inferior? Leave you out of decisions? Act superior?” (pg. 795)
Another problematic survey design
The article “Abortion Rights Groups Surveying Voters’ Views”, by Jack Coffman, appeared in the December 26, 1989 issue of the St. Paul Pioneer Press Dispatch.
Problems with Minnesota survey
Random sampling comment 1
Say you collect data on units using a method other than a random sample, and you know these data are not representative of the population of interest. Then, you take a random sample from these collected data. This random sample is representative of the population.
Wrongo!!
Large random samples are representative of the population in the frame.
Effectively, this method uses the unrepresentative, collected data as a frame.
By randomly sampling from an unrepresentative sample, you just get a smaller unrepresentative sample.
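The point can be checked with a small simulation. Everything below is hypothetical: a made-up population of 50,000 values, and an unrepresentative collection that only contains units with values above 55 (standing in for, say, a convenience sample).

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of interest.
population = [rng.gauss(50, 10) for _ in range(50_000)]

# Unrepresentative collected data: only units with large values.
biased_data = [x for x in population if x > 55]

# A random sample FROM the unrepresentative data.
subsample = rng.sample(biased_data, 500)

pop_mean = statistics.mean(population)
biased_mean = statistics.mean(biased_data)
sub_mean = statistics.mean(subsample)
print(f"population: {pop_mean:.1f}")
print(f"biased data: {biased_mean:.1f}")
print(f"random subsample of biased data: {sub_mean:.1f}")
```

The subsample mean tracks the biased data, not the population: random selection makes the subsample representative of whatever it was drawn from, which here is the biased collection acting as the frame.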
Random sampling comment 2
Say you obtain data that are representative of the target population. Should you take a random sample from these collected data?
This question arises when researchers use data collected by others, for example in a Stat 101 project.
No!
If you have a representative sample, use it. This sub-sampling method just reduces the amount of data you work with.
Random sampling comment 3
A census is a measurement of outcomes for all units in the population. For example, the U.S. government does a census of the population every 10 years to apportion seats in the House of Representatives. It also takes censuses of agriculture and business.
Why do a survey instead of a census?
Surveys are cheaper
They require far fewer people to contact
Survey results can be obtained more quickly
Same reason as above
This is important because we want to make policy decisions on current answers, not answers that are months or years old.
Surveys can be more accurate
Fewer people to contact, fewer problems with interviewer effects and non-response bias
Upshot: less data of high quality is better than more data of poor quality
Random sampling comment 4
Most major surveys are not simple random samples
They involve multiple stages of random selection
e.g., randomly pick 100 cities. From these cities randomly pick 500 households, then randomly pick 1 person from each household
Data collected like this are NOT representative of the population. However, because units are selected randomly, statisticians can account for the non-representation.
This is done by assigning a weight to each observation that reflects how many units it represents in the population
A good question to ask here would be: Where do the weights come from?
Generally, when analyzing data from surveys that are not simple random samples, it is wise to contact a professional statistician