Random Samples
12/5/2013
Readings
• Chapter 6 Foundations of Statistical Inference (Pollock) (pp 122-135)
Final Exam
• SEC 1 – December 11th (Wednesday) – 1:30 pm - 3:30 pm
• SEC 2 – December 10th (Tuesday) – 1:30 pm - 3:30 pm
Final Paper
• Due 12/6/2013 by 11:59 AM - Doyle 226B
• Turnitin via Blackboard copy by 11:59 PM on 12/6
Reminders for the Paper
• Dataset information is in Chapter 1 and in the appendix (pp. 2-4). GSS and NES also have information online
– World.sav - http://www.hks.harvard.edu/fs/pnorris/Data/Data.htm
• If running cross-tabs, don’t forget column percentages
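As a reminder of what column percentages do, here is a minimal sketch. The respondents below are made up purely for illustration, and the function name `column_percentages` is hypothetical. Each column (the independent variable) is made to sum to 100, so the categories can be compared even when they have different numbers of respondents.

```python
from collections import Counter

# Made-up (row, column) pairs: (vote choice, party identification).
responses = [
    ("Obama", "Democrat"), ("Obama", "Democrat"), ("Romney", "Democrat"),
    ("Romney", "Republican"), ("Romney", "Republican"),
    ("Romney", "Republican"), ("Obama", "Republican"),
]

def column_percentages(pairs):
    """Return {column: {row: percent}} where each column sums to 100."""
    cell_counts = Counter(pairs)
    column_totals = Counter(col for _, col in pairs)
    table = {}
    for (row, col), count in cell_counts.items():
        table.setdefault(col, {})[row] = 100.0 * count / column_totals[col]
    return table

table = column_percentages(responses)
for col, rows in table.items():
    for row, pct in rows.items():
        print(f"{col:<12}{row:<8}{pct:5.1f}%")
```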
OPPORTUNITIES TO DISCUSS COURSE CONTENT
Office Hours For the Week
• When
– Friday 7-12
– No office hours during final exam week
– And by appointment
Course Learning Objectives
1. Students will learn the basics of polling and be able to analyze and explain polling and survey data.
2. Students will learn the basics of research design and be able to critically analyze the advantages and disadvantages of different types of design.
Sampling
After we write the survey, we have to select people!
Rules on Sampling
• If cost dictates that a sample be drawn, a probability sample is usually preferable to a nonprobability sample.
• The Law of Large Numbers
Collecting a sample
• Population
• Sampling Frame
• The Sample itself
The best that we can hope for is that every unit in the sampling frame has an equal chance of being selected
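A minimal sketch of the equal-chance idea, assuming the sampling frame is just a list of unit labels. The frame contents and the function name `simple_random_sample` are made up for illustration.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct units; every unit in the frame is equally likely."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame of 10,000 units.
frame = [f"unit-{i}" for i in range(1, 10001)]
sample = simple_random_sample(frame, 500, seed=1)
print(len(sample), sample[:3])
```

Note that the draw is only as good as the frame: anyone missing from the list has no chance of being selected.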
The Law of Large Numbers
• Smaller samples cause greater error.
• The larger the sample size, the greater the probability that our sample will represent the population.
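The two points above can be checked with a small simulation. This is a sketch under assumed values: a population where the true share is 50%, and three arbitrary sample sizes. It reports how far the sample share lands from the truth, on average.

```python
import random

def average_error(true_share, sample_size, trials=1000, seed=0):
    """Average absolute gap between the sample share and the true share."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        total += abs(hits / sample_size - true_share)
    return total / trials

# Assumed true share of 0.5; sample sizes chosen only for illustration.
errors = {n: average_error(0.5, n) for n in (50, 200, 800)}
for n, err in errors.items():
    print(n, round(err, 4))
```

The average gap should shrink as the sample grows, though each extra respondent helps a little less than the one before.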
All probability samples yield estimates of the target population
Two Things that Deal With the Stars
• Astronomy
• Astrology
Polling is Science (Astronomy)
• Polls are right more than they are wrong
• We especially love them when they favor our candidates.
Polling is Random (Astrology)
• It is not an exact science; there is error in every poll.
• Polls Don’t Vote, People Vote
• We like them less when they don’t favor our candidate
Same Election, Different Results
Poll | Date | Sample | MoE | Obama (D) | Romney (R) | Spread
Rasmussen Tracking | 10/4 - 10/6 | 1500 LV | 3 | 47 | 49 | Romney +2
Gallup Tracking | 9/30 - 10/6 | 3050 RV | 2 | 49 | 46 | Obama +3
CNN/Opinion Research | 9/28 - 9/30 | 783 LV | 3.5 | 50 | 47 | Obama +3
National Journal | 9/27 - 9/30 | 789 LV | 4.2 | 47 | 47 | Tie
NBC News/WSJ | 9/26 - 9/30 | 832 LV | 3.4 | 49 | 46 | Obama +3
NPR | 9/26 - 9/30 | 800 LV | 4 | 51 | 44 | Obama +7
ABC News/Wash Post | 9/26 - 9/29 | 813 LV | 4 | 49 | 47 | Obama +2
Different Questions Perhaps?
• If the election were held today, would you vote for Barack Obama or Mitt Romney?
• If the election were held today, would you vote for Mitt Romney or Barack Obama?
• If the election were held today, would you vote for Democrat Barack Obama or Republican Mitt Romney?
• If the election were held today, would you vote for Republican Mitt Romney or Democrat Barack Obama?
• If the election were held today, for whom would you vote?
More likely a different sample
SAMPLING ERROR
Polling is 95% Science and 5% Astrology
The accuracy of estimates is expressed in terms of the margin of error and the confidence level
The Confidence Level
• The confidence level: can we trust these results?
• Surveys typically use a 95% confidence level: 95 times out of 100, the results will fall within the margin of error
• There is a 5% (1 out of 20) chance that the results will fall outside this range and produce wacky findings.
• This error often appears when you keep asking the same questions again and again
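The 1-in-20 idea can be sketched with a simulation. The values here are assumptions for illustration: a true share of 50%, polls of 1,000 people, and the standard 1.96 multiplier for 95% confidence. The second function shows why repeated questions matter: with enough questions, at least one wacky result becomes likely.

```python
import math
import random

def miss_rate(true_share=0.5, sample_size=1000, polls=2000, seed=0):
    """Share of simulated polls that land outside the margin of error."""
    rng = random.Random(seed)
    moe = 1.96 * math.sqrt(true_share * (1 - true_share) / sample_size)
    misses = 0
    for _ in range(polls):
        hits = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(hits / sample_size - true_share) > moe:
            misses += 1
    return misses / polls

def chance_of_at_least_one_miss(questions, confidence=0.95):
    """Chance that at least one of several independent results misses."""
    return 1 - confidence ** questions

rate = miss_rate()
print(round(rate, 3))                             # should be near 0.05
print(round(chance_of_at_least_one_miss(20), 2))  # 0.64
```

The second calculation assumes the questions are independent of one another, which real survey questions often are not.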
The Margin of Error
• Margin of Error
• A floating range above and below the estimate.
• Large Samples = Less Error
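A minimal sketch of the usual textbook margin-of-error calculation, assuming a simple random sample, a 95% confidence level (multiplier 1.96), and the most conservative share of 50%. The function name is hypothetical. The sample sizes are taken from the 2012 poll table earlier in these slides.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Margin of error in percentage points for a simple random sample."""
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)

for n in (783, 1500, 3050):
    print(n, round(margin_of_error(n), 1))
```

For 783 respondents this gives 3.5, which matches the figure reported for the CNN/Opinion Research poll. Some reported margins in that table are a little larger than this formula gives, possibly because of rounding or weighting adjustments; the slides do not say.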
Still too early
PPP POLL
Texas Tribune Poll
On these Polls
PPP
• Abbott could be leading by as much as 54.4 to 30.6
• The race could be as close as 45.6 to 39.4
Texas Tribune
• Abbott could be leading by as much as 42.83 to 31.17
• The race could be as close as 37.17 to 36.83 for Abbott
• We call races that fall within the margin of error “too close to call”.
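The best-case and worst-case figures above come from adding and subtracting the margin of error from each candidate's estimate. A sketch that works backward from the four numbers on the slide to the implied estimates and margin, then checks whether the two ranges overlap. The function names are hypothetical, and the overlap test is one simple reading of "within the margin of error".

```python
def unpack(leader_high, trailer_low, leader_low, trailer_high):
    """Recover (leader estimate, trailer estimate, margin) from the ranges."""
    leader = (leader_high + leader_low) / 2
    trailer = (trailer_low + trailer_high) / 2
    moe = (leader_high - leader_low) / 2
    return round(leader, 2), round(trailer, 2), round(moe, 2)

def too_close_to_call(leader, trailer, moe):
    """True when the leader's low end does not clear the trailer's high end."""
    return leader - moe <= trailer + moe

ppp = unpack(54.4, 30.6, 45.6, 39.4)
tribune = unpack(42.83, 31.17, 37.17, 36.83)
print(ppp, too_close_to_call(*ppp))
print(tribune, too_close_to_call(*tribune))
```

By this reading both polls keep Abbott ahead even in the worst case, though the Texas Tribune gap nearly closes.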
What else determines sampling error?
• Non-response rate
• Variability
• Bias
How Can a Survey of 1000 People Represent Millions of Voters?
• Responses Cancel each other out
• No New opinions are added
It’s Logarithmic
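A sketch of why 1,000 people can stand in for millions, using the standard finite population correction. The population sizes are assumptions picked for illustration. Strictly speaking, the margin shrinks with the square root of the sample size rather than a logarithm, but the shape is similar: quick gains at first, then a long flat stretch.

```python
import math

def margin_of_error_fpc(sample_size, population_size, share=0.5, z=1.96):
    """Margin of error (percentage points) with finite population correction."""
    base = z * math.sqrt(share * (1 - share) / sample_size)
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return 100 * base * fpc

# Same sample of 1,000 drawn from populations of very different sizes.
for population in (100_000, 10_000_000, 130_000_000):
    print(population, round(margin_of_error_fpc(1000, population), 2))

# Quadrupling the sample only cuts the margin roughly in half.
big = 130_000_000
print(round(margin_of_error_fpc(4000, big) / margin_of_error_fpc(1000, big), 2))
```

The margin barely moves as the population grows, which is why sample size matters far more than population size.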
PERSONAL INTERVIEWS
Data Collection Method I
Cluster Sampling (How we conduct it)
• People Move, Houses Don’t
• Random Samples of known units
• Each unit in the cluster has a chance
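A minimal two-stage sketch of the cluster idea: first pick known units (here, hypothetical city blocks) at random, then pick houses at random within each chosen block. The block and house labels, the counts, and the function name are all made up for illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly pick clusters, then randomly pick units inside each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical frame: 50 blocks with 20 houses each.
blocks = {
    f"block-{b}": [f"block-{b}/house-{h}" for h in range(1, 21)]
    for b in range(1, 51)
}
cluster_sample = two_stage_cluster_sample(blocks, n_clusters=5, n_per_cluster=4, seed=7)
for block, houses in cluster_sample.items():
    print(block, houses)
```

Interviewers then only travel to five blocks instead of being scattered across all fifty, which is the practical appeal for in-person work.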
BLS
Personal Interviews
• Advantages
• Disadvantages
MAIL SURVEYS
Method II
Collecting a Sample
• Every address is in our frame
• Often Used to target specific Groups
• Less popular for “hot topics”
About Mail
• Advantages
• Disadvantages
TELEPHONE SURVEYS
How Most Surveys are Done Today
Telephone Surveys
• Every Phone Number has an equal Chance of Being Selected
• It is important that you select the right people
Advantages of Phone
• Fast
• CATI
• Closed Ended Questions
Why it is not a true random sample
• Some people do not have phones
• Some people simply will not answer (75% refusal rate)
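A quick sketch of what a 75% refusal rate means for workload. It assumes, for simplicity, that everyone who does not refuse completes the survey; the target of 1,000 completes and the function name are illustrative.

```python
import math

def contacts_needed(target_completes, refusal_rate):
    """People who must be reached to hit the target number of completes."""
    response_rate = 1 - refusal_rate
    return math.ceil(target_completes / response_rate)

print(contacts_needed(1000, 0.75))  # 4000
```

Real fieldwork also loses numbers to no-answers and ineligible respondents, so the true count of dials would be higher still.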
Surveys miss out on
• Poll sampling excludes many: minorities, young people, and new Americans
• Angry White Men
Who we often get
Problems of Cell Phones
• Some polls Exclude them
• You have to pay people to participate
• Some polls contact you and ask you to call back
Exit Polls
• Use a random selection of polling places
• Quick Recall and Fast Data
• Problems (early voting)
WE HAVE TO SURVEY A LOT MORE PEOPLE THAN WE USED TO
This makes it less random
Not All Sampling Frames are Created Equal
Low response rates and trying to get likely voters slow things down and drive up costs.
SO SHOULD WE FOLLOW THE POLLS?
Verify all Polls
• Who Conducted it
• How many they sampled
• How they sampled
• Specific question wording
Always Check
• Who sponsored the poll
• How they got the sample
• How big was the sample
• Specific questions