week 7 sampling
![Page 1: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/1.jpg)
Sampling
Andrew Martin
PS 372
University of Kentucky
![Page 2: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/2.jpg)
What do polls tell us?
![Page 3: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/3.jpg)
Population
❖ If we want to assess American support for competing presidential candidates, we must clarify what we mean by Americans.
❖ In other words, we need to specify a population.
❖ A population is the complete set of relevant units of analysis.
❖ For the purpose of studying elections, the population is generally defined as the U.S. voting-age population (residents 18 and older).
![Page 4: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/4.jpg)
Sample
❖ Interviewing every voting-age American would be impossible; the time and money constraints are too great.
❖ However, it is possible (and more practical) to select a sample from the population for investigation.
❖ A sample is any subset of units collected in some manner from a population.
❖ The sample size and sampling method ultimately determine the quality of inferences that can be made about the population.
![Page 5: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/5.jpg)
Population vs. Sample
| Study | Population | Sample |
| --- | --- | --- |
| U.S. Voting | Voting-Age Pop. | Gallup Poll Respondents |
| Military Coups | All coups | Coups in Latin America in 1980s |
| Supreme Court Decision-making | All decisions on the merits | Merits decisions on Burger Court |
| Democratization | All democratizing countries | Democratizing post-Cold War |
![Page 6: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/6.jpg)
Populations and Samples
❖ Ideally, political scientists would like to measure and gather information about the population.
❖ Examples: Averages, differences between two groups, relationships among variables.
❖ If this information can be found and quantified for the entire population, the number is known as a population parameter.
❖ In practice, sample statistics allow political scientists to approximate the corresponding population values, or parameters.
![Page 7: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/7.jpg)
Population Parameters
❖ Population parameters are typically denoted by lowercase English or Greek letters, usually the Greek letter theta ( θ ).
❖ A proportion, such as the proportion of Americans who support the war in Iraq at a particular time, is typically designated as P or π.
![Page 8: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/8.jpg)
Sample Statistics
❖ Sample statistics are frequently expressed with a hat (^) over a character to denote that it is a statistic, not a parameter. Sometimes lowercase p is used for a sample proportion.
![Page 9: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/9.jpg)
Statistics vs. Parameters
❖ For a population mean, the lowercase Greek letter mu (μ) is used.
❖ For the corresponding sample statistic, μ hat (μ̂) or Y bar (Ȳ) is used.
![Page 10: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/10.jpg)
Population vs. Samples
❖ An element is a single occurrence, realization or instance of the objects or entities being studied.
❖ A population can be subdivided into small groups known as strata.
❖ The elements in each stratum share one or more characteristics.
![Page 11: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/11.jpg)
Sampling
❖ The particular population from which a sample is actually drawn is called a sampling frame.
❖ Sampling frames are supposed to contain all elements that are part of the population of interest, but in practice are often incomplete.
❖ Example: Polling UK students using the annual student phone directory.
![Page 12: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/12.jpg)
1936 Presidential Election
❖ The Literary Digest predicted that Republican Alf Landon would defeat Democrat Franklin D. Roosevelt.
❖ The population: U.S. voters.
❖ The sampling frame: Telephone directories and automobile registration lists.
❖ Telephone and car ownership were not common then. The sample was not representative of the actual population because it overrepresented wealthy voters.
![Page 13: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/13.jpg)
Sampling
❖ Now, virtually everyone has a telephone. But some people have unlisted numbers.
❖ Researchers have developed random digit dialing to deal with this problem. A computer randomly selects telephone numbers, which is how people with unlisted numbers are contacted.
❖ However, not everyone owns a landline telephone. Millions of people are switching to cell phones, which will eventually force pollsters to change their methodology.
![Page 14: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/14.jpg)
Sampling
❖ Political science researchers like to use information collected in the sample to make inferences about the whole population.
❖ If the sampling frame is incomplete or inappropriate, sample bias will occur.
❖ This causes the sample to be unrepresentative of the population and can lead scholars to draw incorrect conclusions.
![Page 15: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/15.jpg)
Sampling
❖ A probability sample is simply a sample for which each element in the total population has a known probability of being sampled.
❖ A nonprobability sample is one in which each element in the population has an unknown probability of being selected.
![Page 16: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/16.jpg)
Simple Random Sample
❖ In a simple random sample each element and combination of elements has an equal chance of being selected.
❖ However, this is often difficult to do in practice.
![Page 17: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/17.jpg)
Simple Random Sample
❖ During the Vietnam War, the Selective Service decided it would randomly draft soldiers by sampling days of the year.
❖ A drum contained 366 capsules with days of the year. Capsules were drawn, and men born on the day written on the capsule would be called to serve in the military unless exempted.
❖ However, the capsules must not have been properly mixed, because the Selective Service tended to oversample days during the last six months of the year.
![Page 18: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/18.jpg)
Simple Random Sample Method 1
❖ Example: We have 1,507 elements in the population and wish to draw a sample of 150.
❖ Every element is numbered, starting at 1 and ending at 1,507.
❖ Using a random number table, an element is selected each time its corresponding number appears.
❖ Any system of combining the numbers is acceptable as long as the numbers are random.
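The numbered-list procedure above can be sketched in Python, with the standard library's pseudo-random generator standing in for a printed random number table (an assumption; the slides do not name a tool). The function name `simple_random_sample` and the seed value are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of element numbers 1..population_size."""
    # The seed is optional and only makes the illustration reproducible.
    rng = random.Random(seed)
    # random.sample draws without replacement, so every element and every
    # combination of sample_size elements is equally likely to be chosen.
    return rng.sample(range(1, population_size + 1), sample_size)

# The slide's example: 1,507 numbered elements, sample of 150.
sample = simple_random_sample(1507, 150, seed=372)
print(len(sample), "elements drawn, all distinct:", len(set(sample)) == 150)
```

Drawing without replacement matches the slide's setup, where each numbered element can enter the sample at most once.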
![Page 19: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/19.jpg)
Simple Random Sample Method 2
❖ Example: We have 1,507 elements in the population and wish to draw a sample of 150.
❖ All elements are represented on corresponding marbles and put in a hat, which is continuously and thoroughly mixed.
❖ Each element has an equal chance of being selected.
![Page 20: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/20.jpg)
Systematic Sample
❖ Elements are selected from a list at pre-determined intervals. In other words, they are chosen systematically rather than randomly.
❖ Every jth element on a list is selected. The number j is known as the sampling interval.
❖ If you have a population of 2,000 on a list and you want a sample of 200, you can select every 10th element on the list for the sample.
❖ Usually the starting number is randomly selected. This is known as a random start.
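A minimal sketch of a systematic sample with a random start, using the 2,000-element list and sample of 200 from the example above. The function name `systematic_sample` is hypothetical, and the sketch assumes the list length is an exact multiple of the sample size.

```python
import random

def systematic_sample(elements, sample_size, seed=None):
    """Select every j-th element after a random start (j = sampling interval)."""
    interval = len(elements) // sample_size   # j = 2000 // 200 = 10
    rng = random.Random(seed)                 # seed only for reproducibility
    start = rng.randrange(interval)           # random start within the first interval
    return elements[start::interval][:sample_size]

population = list(range(1, 2001))             # elements numbered 1..2000
sample = systematic_sample(population, 200, seed=1)
print(len(sample), sample[:3])
```

Only the starting position is random here; once it is fixed, the rest of the sample is fully determined by the interval.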
![Page 21: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/21.jpg)
Systematic Sample
❖ Systematic samples could be biased if:
1. The elements on the list have been ranked according to a characteristic.
2. The list contains a pattern that corresponds to the sampling interval.
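The second source of bias can be illustrated with a made-up list. The sketch below assumes a hypothetical list of 2,000 housing units in which every 10th unit is a corner unit; neither the list nor the labels come from the slides. With a sampling interval of 10, the sample contains either only corner units or none at all, depending on the start.

```python
# Hypothetical list: every 10th housing unit is a corner unit.
units = ["corner" if i % 10 == 0 else "interior" for i in range(2000)]
interval = 10

# Share of corner units in the systematic sample for each possible start.
share_by_start = {}
for start in range(interval):
    sample = units[start::interval]
    share_by_start[start] = sample.count("corner") / len(sample)

true_share = units.count("corner") / len(units)
print("true share of corner units:", true_share)
print("sample share by starting position:", share_by_start)
```

No starting position reproduces the true share, because the list's pattern repeats at exactly the sampling interval.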
![Page 22: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/22.jpg)
Stratified Sample
❖ A stratified sample is a probability sample in which elements sharing one or more characteristics are grouped, and elements are selected from each group in proportion to the group’s representation in the total population.
❖ It is less difficult to draw a sample from a homogeneous population than a heterogeneous population.
![Page 23: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/23.jpg)
Stratified Sample
❖ Can be proportionate or disproportionate.
❖ In a proportionate sample, each stratum is represented in proportion to its size in the population.
❖ To determine the number of elements to sample in each stratum, a sampling fraction must be calculated.
![Page 24: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/24.jpg)
Proportionate Stratified Sample
❖ Ex: We have 8,000 freshmen, 5,000 sophomores, 4,000 juniors and 3,000 seniors we wish to poll about the presidential election. We want to sample 2,000 students.
❖ The sampling fraction would be the desired sample size divided by the population, so 2,000/20,000 = .10.
❖ Therefore, we would sample 10 percent of each stratum, which in this case is defined by school year.
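The allocation in this example can be worked through in a short sketch. The stratum sizes and the desired sample of 2,000 come from the slide; the student ID ranges and the seed used for the draw within each stratum are hypothetical.

```python
import random

strata_sizes = {"freshmen": 8000, "sophomores": 5000, "juniors": 4000, "seniors": 3000}
desired_sample = 2000

population = sum(strata_sizes.values())            # 20,000 students
sampling_fraction = desired_sample / population    # 2,000 / 20,000

# Number of students to sample from each stratum.
allocation = {name: round(size * sampling_fraction)
              for name, size in strata_sizes.items()}
print(sampling_fraction, allocation)

# Draw a simple random sample inside each stratum (IDs 0..size-1 are made up).
rng = random.Random(372)
drawn = {name: rng.sample(range(strata_sizes[name]), n)
         for name, n in allocation.items()}
```

Each stratum contributes 10 percent of its members, so the sample mirrors the class-year composition of the student body.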
![Page 25: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/25.jpg)
Proportionate Stratified Sample
❖ When selecting characteristics on which to stratify a list, you should choose those expected to have a relationship with the dependent variable.
❖ Ex: Household income samples should stratify by education, sex and race.
❖ Ex: Members of Congress (MCs) stratified by party and experience.
❖ Ex: News stories by network.
![Page 26: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/26.jpg)
Disproportionate Stratified Sample
❖ In a disproportionate sample, some strata are overrepresented and others are underrepresented.
❖ Usually, strata making up a smaller percentage of the population are oversampled so we can make useful inferences about that group independent of the other strata.
❖ To prevent having a biased sample, each stratum is weighted by its proportion of the population.
![Page 27: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/27.jpg)
Disproportionate Stratified Sample
❖ JRM 223
❖ .625(Liberal Arts) + .125(Engineering) + .25(Business) = Mean Student Body GPA
❖ .625(2.5) + .125(3.3) + .25(2.7) = 2.65
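The weighting step can be checked with a few lines of code. The population shares and stratum mean GPAs are the ones on the slide; the dictionary names are only illustrative.

```python
# Each stratum's share of the student body (the weights) and its sample mean GPA.
population_share = {"Liberal Arts": 0.625, "Engineering": 0.125, "Business": 0.25}
stratum_mean_gpa = {"Liberal Arts": 2.5, "Engineering": 3.3, "Business": 2.7}

# Weight each stratum mean by its population share, then add the results.
weighted_mean = sum(population_share[s] * stratum_mean_gpa[s]
                    for s in population_share)
print(round(weighted_mean, 2))  # 2.65
```

Because the weights are population shares rather than sample shares, an oversampled stratum does not pull the overall estimate toward its own mean.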
![Page 28: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/28.jpg)
Cluster Samples
❖ A cluster sample is a probability sample in which the sampling frame initially consists of clusters of elements.
![Page 29: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/29.jpg)
❖ NN 174
![Page 30: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/30.jpg)
Cluster Samples
❖ Suppose there are 500 blocks, and from these blocks 25 are chosen at random.
❖ On these 25 blocks, there are 4,000 dwelling units or households.
❖ One quarter of these households will be contacted because we desire a sample of 1,000 individuals.
![Page 31: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/31.jpg)
Cluster Samples
❖ Each household's probability of being selected is the same. How do we know this?
❖ 25/500 (probability that the household's block will be chosen) × 1,000/4,000 (probability of being surveyed if the block is chosen) = 1/80
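The two-stage selection probability can be verified with exact fractions. The counts are those from the slides; the variable names are illustrative.

```python
from fractions import Fraction

# Stage 1: 25 of 500 blocks are chosen at random.
p_block = Fraction(25, 500)
# Stage 2: 1,000 of the 4,000 households on the chosen blocks are contacted.
p_household_given_block = Fraction(1000, 4000)

# A household is selected only if its block is chosen AND it is then drawn.
p_household = p_block * p_household_given_block
print(p_household)  # 1/80
```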
![Page 32: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/32.jpg)
Nonprobability Samples
❖ Sometimes an element's chance of being selected is unknown, and nonprobability samples have to be collected.
❖ Although probability samples are preferable, sometimes they are not feasible.
❖ Researchers may be able to learn more studying carefully selected, even unusual cases.
![Page 33: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/33.jpg)
Nonprobability Samples
❖ Purposive samples allow the researcher to have discretion in selecting elements for observation. (Ex: Fenno's Home Style)
❖ A quota sample is a sample in which elements are sampled in proportion to the population. Similar to a stratified sample but elements are not chosen probabilistically.
❖ In a snowball sample, respondents are asked to identify other persons who might qualify for inclusion in the sample.
![Page 34: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/34.jpg)
Statistical Inference
❖ Statistical inference involves the mathematical theory and techniques for making conjectures about the unknown characteristics (parameters) of populations based on samples.
![Page 35: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/35.jpg)
Statistical Inference
❖ Sample statistics provide us with estimates or approximations of population parameters.
❖ These estimates may differ from the “true” value of the population parameter, but if the sample is collected correctly and is large enough, the estimates are unlikely to be far from the truth.
![Page 36: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/36.jpg)
Statistical Inference
❖ We will focus on three concepts:
❖ Expected values
❖ Standard errors
❖ Sampling distributions
![Page 37: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/37.jpg)
Expected Value
❖ Expected value is the mean or average value of a sample statistic based on repeated samples from a population.
![Page 38: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/38.jpg)
Expected Value
❖ Suppose a candidate for state senate wants to know how many independent voters live in a district, which has grown rapidly during the last 10 years. Therefore, there are no reliable Census data available.
❖ Why might a state senator care about the number of independents in his/her district?
![Page 39: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/39.jpg)
Expected Value
❖ Suppose the true level of registered Independents is 25 percent, or .25.
❖ In formal terms, P = .25, where P = population parameter.
❖ You take the first sample. Two of 10 respondents say they are independents. Your first estimate, or sample statistic, has some sampling error.
❖ Specifically, the sampling error is the discrepancy (.05) between the population parameter (.25) and the sample statistic (.20).
![Page 40: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/40.jpg)
Expected Value
❖ What about four samples? Let's assume you draw 4 samples of 10 and the proportions of independents are .20, .30, .40 and .20. Their mean is (.20 + .30 + .40 + .20)/4 = .275, an observed value not far from the true value of .25.
❖ Four samples of 10 bring us closer to the truth than one sample. What about 1,000 samples of 10? What about 1,000 samples of 50?
![Page 41: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/41.jpg)
Expected Value
❖ If statistics (or observed values) are calculated for each of many, many independently and randomly chosen samples, their average or mean will equal the corresponding population parameter (or true value).
❖ Statisticians refer to this mean as the expected value (E) of the estimator.
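The idea of averaging over many repeated samples can be explored with a small simulation. This is a sketch under stated assumptions: respondents are simulated as independent draws with a true proportion of .25 (as if the district were very large), and the function names and seed are hypothetical.

```python
import random

def sample_proportion(rng, true_p, n):
    """Proportion of 'independents' in one simulated sample of size n."""
    return sum(rng.random() < true_p for _ in range(n)) / n

def mean_of_sample_proportions(true_p, n, num_samples, seed=0):
    """Average the sample proportion over num_samples repeated samples."""
    rng = random.Random(seed)
    proportions = [sample_proportion(rng, true_p, n) for _ in range(num_samples)]
    return sum(proportions) / num_samples

# More repeated samples of 10 pull the average toward the true value of .25.
for num_samples in (1, 4, 1000, 10000):
    print(num_samples, mean_of_sample_proportions(0.25, 10, num_samples, seed=1))
```

The exact numbers depend on the seed, but the averages for large numbers of samples should settle near .25, which is the sense in which the expected value of the sample proportion equals the population proportion.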
![Page 42: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/42.jpg)
Expected Value
❖ If θ represents the population parameter, then θ̂ (theta hat) represents a sample estimator of that characteristic. We can then write:
$$E(\hat{\theta}) = \theta$$
![Page 43: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/43.jpg)
Expected Value
❖ In the case of a sample proportion based on a simple random sample, we have:
❖ E(p) = P, where p is the sample proportion and P is the population proportion.
❖ In the long run, the average of the sample statistics would theoretically equal the true value, the population proportion.
![Page 44: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/44.jpg)
Sampling Distribution
❖ A sampling distribution of a sample statistic is a theoretical expression that describes the mean, variation, and shape of the distribution of an infinite number of occurrences of the statistic when calculated on samples of size N drawn independently and randomly from a population.
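A sampling distribution is theoretical, but a large simulation can approximate it. The sketch below assumes a true proportion of .25, samples of size 10, and independent draws; these values, the seed, and the function name are illustrative choices, not part of the slides. It reports the three features named in the definition: mean, variation, and shape.

```python
import random
import statistics
from collections import Counter

def simulate_sample_proportions(true_p, n, num_samples, seed=0):
    """Sample proportions from num_samples independent samples of size n."""
    rng = random.Random(seed)
    return [sum(rng.random() < true_p for _ in range(n)) / n
            for _ in range(num_samples)]

proportions = simulate_sample_proportions(0.25, 10, 20000, seed=7)

print("mean:", statistics.mean(proportions))          # center of the distribution
print("variation (SD):", statistics.pstdev(proportions))

# Shape: a crude text histogram of how often each proportion occurred.
counts = Counter(proportions)
for value in sorted(counts):
    print(f"{value:.1f} {'#' * (counts[value] // 200)}")
```

With only 10 respondents per sample the statistic can take just eleven values (0.0 to 1.0), so the histogram is lumpy and skewed to the right; larger samples give a smoother, more bell-shaped picture.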
![Page 45: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/45.jpg)
❖ JRM 230
![Page 46: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/46.jpg)
❖ JRM 233
![Page 47: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/47.jpg)
❖ JRM 234
![Page 48: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/48.jpg)
❖ JRM 235
![Page 49: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/49.jpg)
Confidence and Error
❖ Confidence is the degree of belief that an estimated range of values (more specifically, the range between a low and a high value) includes or covers the population parameter. In political science this concept is normally described by a confidence interval.
❖ Standard error is the standard deviation, or measure of variability, of a sampling distribution. In other words, it tells us how much a sample statistic varies from one sample to the next.
![Page 50: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/50.jpg)
Standard Error
❖ The standard error of a reported proportion or percentage p measures its accuracy, and is the estimated standard deviation of that percentage. It can be estimated from just p and the sample size, n, if n is small relative to the population size, using the following formula.
$$SE(p) = \sqrt{\frac{p(1 - p)}{n}}$$
![Page 51: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/51.jpg)
Standard Error
$$\sqrt{\frac{.25 \times .75}{10}} \approx .14$$
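The standard error of a proportion, the square root of p(1 − p)/n, is easy to compute directly. A minimal sketch, with the hypothetical function name `standard_error`, applied to the running example of p = .25 and n = 10:

```python
import math

def standard_error(p, n):
    """Estimated standard error of a sample proportion p from a sample of size n.

    Assumes a simple random sample with n small relative to the population.
    """
    return math.sqrt(p * (1 - p) / n)

print(round(standard_error(0.25, 10), 2))  # 0.14
```

Because n sits in the denominator under the square root, quadrupling the sample size only halves the standard error.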
![Page 52: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/52.jpg)
Confidence Intervals
❖ Get a poll, talk about it.
![Page 53: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/53.jpg)
![Page 54: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/54.jpg)
Gallup Poll Standard Error
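$$\sqrt{\frac{.52 \times .48}{2761}} \approx .01$$
❖ If we want the margin of error for the poll, which is the half-width of the 95 percent confidence interval, we multiply this standard error by 1.96.
❖ .01 × 1.96 = .0196 ≈ .02, or about 2 percentage points.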
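The poll calculation can be reproduced without rounding the standard error first. The values p = .52 and n = 2,761 are taken from the slide; the multiplier 1.96 corresponds to a 95 percent confidence level under a normal approximation, and the function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)

p, n = 0.52, 2761
moe = margin_of_error(p, n)
interval = (p - moe, p + moe)

print(round(moe, 3))                         # 0.019
print(tuple(round(x, 3) for x in interval))  # (0.501, 0.539)
```

Carrying full precision gives a margin of about 1.9 percentage points, which rounds to the 2 points usually reported for a poll of this size.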
![Page 55: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/55.jpg)
❖ Margin of Error Graphic
![Page 56: Week 7 Sampling](https://reader035.vdocument.in/reader035/viewer/2022062703/55500fb0b4c90555618b4941/html5/thumbnails/56.jpg)
Confidence Intervals
❖ Standard distribution photo