Lecture 7 Notes
➢ Chapter 9. Sample Survey
➢ Chapter 10. Experiments and Observational Studies
Why Sample?
• Examining the whole population may not be possible.
• We hear this kind of thing all the time: a survey asking,
“If there were a provincial election tomorrow, which party would you vote for?”
o How is this survey done?
o Why is this survey done?
o What would happen if we tried to survey everybody?
Three Keys of Sampling
1. Examine a part of the whole (sample)
2. Randomize (to obtain the sample)
3. Sample Size
Examine a Part of the Whole
• The entire group of individuals that we want information about is called the population.
• We would like to know about an entire population of individuals, but examining all of them is usually
impractical, if not impossible.
• We settle for examining a smaller group of individuals – a sample – selected from the population.
Representative Sample
• We would like the selected sample to be:
o Random: every member of the population has an equal chance of being selected.
o For example, if we want to understand university students’ experience at U of T, we need to randomly select students from the entire population of U of T students.
• The sample needs to be representative of the population. That means it needs to:
o Match the population
o Avoid bias, that is, systematically over- or underestimating some characteristic of the population
Sample Survey
• Example: Opinion polls are examples of sample surveys, designed to ask questions of a small group of people in the hope of learning something about the entire population.
o Professional pollsters work quite hard to ensure that the sample they take is representative of the population.
o If it is not, the sample can give misleading information about the population.
• Another example: I study students’ attitudes about statistics.
o How: By administering the Survey of Attitudes Toward Statistics (SATS-36©) and linking students’ responses to their repository records from the Office of the Registrar.
Randomize: Why Does Randomization Work?
• It protects us from the influences of all the features of our population by making sure that, on average, the sample looks like the rest of the population.
• It protects us from factors that we know are in the data, and it can also protect us from factors that we do not know are in the data.
• In the short term, it is unpredictable. In the long term, it is predictable.
• We cannot predict which individuals are going to end up in the sample.
• With a large sample, the sample will have approximately the right proportions of different genders, different age groups (e.g., young, old), different living areas (e.g., urban, rural), and of course many more layers (groupings) or things that we didn’t think of.
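The idea that randomization is unpredictable in the short term but predictable in the long term can be illustrated with a small simulation. This is only a sketch: the population of 100,000 people, the 30% rural share, and the seed are made-up values for illustration, not figures from the lecture.

```python
import random

random.seed(7)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population: 100,000 people, 30% labelled "rural".
population = ["rural"] * 30_000 + ["urban"] * 70_000

def rural_share(n):
    """Draw a simple random sample of size n and return its rural proportion."""
    sample = random.sample(population, n)
    return sample.count("rural") / n

# Small samples can land far from 30%; large samples should settle close to it.
for n in (10, 100, 10_000):
    print(n, round(rural_share(n), 3))
```

We cannot say in advance which people are drawn, but with the largest sample the rural share should come out close to the population value of 0.30.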
It’s the Sample Size
• How large a random sample do we need for the sample to be reasonably representative of the population?
• It’s the size of the sample, not the size of the population, that makes the difference in sampling.
• Exception: If the population is small enough and the sample is more than 10% of the whole population,
the population size can matter.
• Otherwise, the fraction of the population that you’ve sampled doesn’t matter. It’s the sample size itself that’s important.
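A short calculation can make the point about sample size concrete. The sketch below uses the standard formula for the standard error of a sample proportion with the finite population correction; the values p = 0.5, n = 1,000, and the three population sizes are assumptions chosen only for illustration.

```python
import math

def standard_error(p, n, N):
    """Standard error of a sample proportion for a simple random sample
    of size n drawn without replacement from a population of size N."""
    fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
    return math.sqrt(p * (1 - p) / n) * fpc

p, n = 0.5, 1_000  # assumed proportion and sample size
for N in (5_000, 100_000, 35_000_000):  # assumed population sizes
    print(f"N = {N:>10,}  n/N = {n / N:.5f}  SE = {standard_error(p, n, N):.4f}")
```

With the same n, the standard error is nearly identical for the two large populations. It only drops noticeably for N = 5,000, where the sample is 20% of the population, which is the exception noted above.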
Does a Census Make Sense?
• A survey that includes all individuals in the population is called a census.
• Wouldn’t it be better to just include everyone and “sample” the entire population?
• It can be difficult to complete a census:
o Sometimes the population changes, for example because individuals move for work; thus, the findings might no longer be relevant to the current population by the time the census is completed.
o Some individuals may be hard to locate or hard to measure, the cost is high, etc.
For example, while the 2016 Canadian census was being conducted, there was a devastating fire in Fort McMurray, Alberta. Please see the link below regarding how the population of Fort McMurray was estimated (using municipal estimates):
http://www.edmontonsun.com/2017/02/08/statscan-confident-in-pre-fire-fort-mcmurray-population-count
Populations and Parameters; Samples and Statistics
• A parameter is a number that describes the population.
o It is a true value in the population, an actual numerical value.
o E.g., the actual number of people who voted for a political party in the entire country (the entire population).
• A parameter is a fixed number, but in practice we do not know its value.
• We often use a statistic to estimate an unknown parameter.
• A statistic is a number that describes a sample.
• The value of a statistic is known once we have taken a sample, but it can change from sample to sample.
o E.g., a survey covers only a subset of the population (NOT the entire population); the number of surveyed people who say they voted for a candidate in an election is a statistic.
Populations and Parameters; Samples and Statistics
• The numerical values that we calculate from a sample, for example, the sample mean, sample standard deviation,
and sample correlation, are statistics.
• The statistics are estimates of population parameters (see table below):
Example: Population and Sample
• The Programme for International Student Assessment (PISA) is a triennial international survey which aims to
evaluate education systems worldwide by testing the skills and knowledge of 15-year-old students.
• Around 510,000 students (sample) in 65 economies took part in the PISA 2012 assessment of reading,
mathematics and science representing about 28 million 15-year-olds globally (population).
https://www.youtube.com/watch?v=q1I9tuScLUA (duration: 12:14)
Example: Population and Parameter; Sample and Statistic
The survey asked, “?”.
• Population: The collection of all adults in
• Sample: subjects.
• Method of sampling:
• % of sampled subjects answered.
• % describes a characteristic of the sample, it is a statistic.
• Parameter: The true percentage of all in the (the population) “context”.
Note:
An inferential statistical method can predict how close the sample value of 86% is likely to be to the unknown percentage of the population believing in heaven.
E.g., an inferential method in a later chapter shows that the POPULATION percentage that believes in heaven falls between 84% and 88%. This means that the sample value of 86% has a margin of error of 2 percentage points (even though the sample size was tiny compared to the population size, we can conclude that a large percentage of the population believed in heaven).
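As a rough preview of where a margin of error like this can come from, the sketch below applies the usual large-sample formula for a proportion. The sample size n = 1,200 is an assumption made only for illustration (the slide does not state it), and the multiplier 1.96 corresponds to roughly 95% confidence.

```python
import math

p_hat = 0.86   # sample proportion from the note
n = 1_200      # ASSUMED sample size; the slide does not state it
z = 1.96       # multiplier for roughly 95% confidence

# Margin of error for a proportion: z * sqrt(p_hat * (1 - p_hat) / n)
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin
print(f"margin of error ≈ {margin:.3f}, interval ≈ ({low:.2f}, {high:.2f})")
```

Under this assumed sample size, the margin comes out near 2 percentage points, in line with the 84% to 88% range quoted in the note.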
Simple Random Sample
• A simple random sample consists of n individuals from the population chosen in such a way that every set of n
individuals has an equal chance to be the sample actually selected.
• It requires a list of the whole population (a sampling frame).
• Drawing a simple random sample:
• Using a random digit table;
• Using a computer (statistical software), which is the more sophisticated way.
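Drawing a simple random sample by computer can be sketched in a few lines. The sampling frame of 500 ID numbers, the sample size of 20, and the seed are made-up values for illustration only.

```python
import random

# Hypothetical sampling frame: ID numbers standing in for a full list of the population.
sampling_frame = list(range(1, 501))   # 500 individuals (made-up size)
n = 20                                 # desired sample size (made-up)

rng = random.Random(2024)              # seeded generator so the draw is reproducible
srs = rng.sample(sampling_frame, n)    # draws without replacement; every set of n is equally likely

print(sorted(srs))
```

Because the draw is without replacement, no individual can appear in the sample twice.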
Stratified Random Sampling
• Divide the population into separate groups, called strata.
• Individuals in each stratum are similar to each other (homogeneous).
• Select a simple random sample (SRS) from each stratum.
• Combine the random selections from each stratum to make the overall sample.
• The strata are groups we want to compare.
Example (figure): a population divided into Stratum 1, Stratum 2, and Stratum 3.
Example of Stratified Random Sampling
• Suppose there are a total of 960 students in grades 1, 2, and 3 in three schools.
• We want to randomly select 100 students (combined).
• The goal is to measure students’ reading abilities and compare them among the grade levels.
• Here is what we know in terms of counts in each grade level (all three schools combined):
Grade    Number of students    Percentage (of total)
1        400                   400/960 × 100 ≈ 41.7% (about 40%)
2        280                   280/960 × 100 ≈ 29.2% (about 30%)
3        280                   280/960 × 100 ≈ 29.2% (about 30%)
Proportional Sampling (using the percentages rounded to the nearest 10%):
Randomly select 40 students from grade 1;
Randomly select 30 students from grade 2;
Randomly select 30 students from grade 3.
Total sample size (n): 40 + 30 + 30 = 100
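The proportional allocation and the within-stratum draws can be sketched as follows. The slide rounds the percentages to 40%, 30%, and 30%; the sketch instead computes the exact proportional split, which gives 42, 29, and 29. The helper name `proportional_allocation`, the student ID format, and the seed are hypothetical, introduced only for this illustration.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to their sizes
    (largest-remainder rounding so the parts add up to n)."""
    total = sum(stratum_sizes.values())
    quotas = {s: n * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    for s in sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)[:leftover]:
        alloc[s] += 1
    return alloc

grade_sizes = {1: 400, 2: 280, 3: 280}   # counts from the table
alloc = proportional_allocation(grade_sizes, 100)

rng = random.Random(1)
# Hypothetical rosters: student IDs of the form (grade, index).
rosters = {g: [(g, i) for i in range(size)] for g, size in grade_sizes.items()}
# Draw a simple random sample within each stratum, then combine.
sample = [student for g in rosters for student in rng.sample(rosters[g], alloc[g])]
print(alloc, len(sample))
```

Either allocation keeps every grade represented roughly in proportion to its size; the rounded 40/30/30 version is simply easier to state.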
Why Is Stratified Random Sampling a Good Method?
• Stratified random sampling can reduce bias.
• Stratifying can also reduce the variability of our results.
• Therefore, the sample statistic should be closer to the population parameter.
Cluster and Multistage Sampling
• Cluster sampling is useful when:
o a complete list of the population is not available.
o simple random sampling is difficult.
o we don’t have access to all strata, or stratifying is not practical.
• Therefore, splitting the population into similar parts, or clusters, can make sampling more practical.
• Divide the population into a number of clusters.
• The goal is not to compare groups, but rather to use them to form a sample.
• Randomly select a number of clusters.
• From each selected cluster, randomly select a number of subjects (what and whom you want to sample).
Example of Cluster Sampling
• Let’s say that I (as a researcher) would like to investigate high school students’ statistical literacy by the end of
the Grade 12 Mathematics of Data Management Course.
• This course is an introduction to statistics at the high school level in the Ontario Mathematics curriculum.
• How can I do this research?
• Where should I start?
• Where is my randomly selected sample of students?
Example of Cluster Sampling
There are 42 public secondary schools in Mississauga.
I will randomly select 3 schools.
From each randomly selected school, I will randomly select 10 students who took the Grade 12 Mathematics of Data Management course; the total sample size is 3 × 10 = 30 students.
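The two steps of this cluster sample can be sketched in code. The school names, the roster of 60 course-takers per school, and the seed are hypothetical stand-ins; only the counts 42, 3, and 10 come from the example.

```python
import random

rng = random.Random(42)

# Hypothetical rosters: 42 schools, each with 60 students who took the course.
schools = {f"school_{k:02d}": [f"s{k:02d}_{i:03d}" for i in range(60)] for k in range(1, 43)}

chosen_schools = rng.sample(sorted(schools), 3)            # step 1: pick 3 of the 42 clusters
sample = {school: rng.sample(schools[school], 10)          # step 2: pick 10 students per school
          for school in chosen_schools}

total = sum(len(students) for students in sample.values())
print(chosen_schools, total)
```

Note that only the three selected schools need to provide a student list, which is what makes cluster sampling practical when no complete list of the population exists.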
Multistage Sampling
• Often there is a hierarchy of clusters.
• For example: chapter – section – sentence – word; we could choose:
• Chapters
• Sections within the chosen chapters
• Sentences within the chosen sections
• Words within the chosen sentences
• This is an example of multistage sampling.
• At each stage, the selection is made by simple random sampling.
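The chapter, section, sentence, word hierarchy can be sketched with a tiny made-up "book". The contents of `book` and the seed are invented for illustration, and for brevity the sketch picks just one unit at each stage.

```python
import random

rng = random.Random(3)

# Hypothetical tiny "book": chapters -> sections -> sentences (strings of words).
book = {
    "ch1": {"1.1": ["we take a sample", "a census is costly"],
            "1.2": ["bias misleads us"]},
    "ch2": {"2.1": ["randomize the treatments", "replicate the experiment"],
            "2.2": ["block similar units"]},
}

chapter = rng.choice(sorted(book))                 # stage 1: a chapter
section = rng.choice(sorted(book[chapter]))        # stage 2: a section within that chapter
sentence = rng.choice(book[chapter][section])      # stage 3: a sentence within that section
word = rng.choice(sentence.split())                # stage 4: a word within that sentence

print(chapter, section, sentence, word, sep=" | ")
```

Each stage only needs a list of the units inside the unit chosen at the previous stage, never a list of every word in the book.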
Example of Multistage Sampling
• I could turn my previous example into a multistage sample.
• There are 51 cities in Ontario.
• Randomly select cities from Ontario (e.g., Mississauga, Toronto).
• From the randomly selected cities, randomly select boards of education (e.g., Peel District School Board, Dufferin-Peel Catholic School Board, Toronto District School Board).
• From the randomly selected boards of education, randomly select secondary schools.
• From the randomly selected secondary schools, randomly select students who take the course Grade 12 Mathematics of Data Management.
In Summary
• Choose cluster/multistage sampling for convenience.
• Choose stratified random sampling for accuracy.
Systematic Random Sampling
• Sometimes we draw a sample by selecting individuals systematically.
• For example, you might survey every 10th person on an alphabetical list of students.
• To make it random, you must still start the systematic selection from a randomly selected individual.
• When there is no reason to believe that the order of the list could be associated in any way with the responses sought, systematic random sampling can give a representative sample.
Example of Systematic Random Sampling
Suppose we want to select a systematic random sample of 100 students from 30,000.
N = 30,000 (population size); n = 100 (sample size).
We could use a student directory (e.g., from the Office of the Registrar) as our sampling frame.
System:
Let K be the population size (N) divided by the sample size (n).
K = N/n = 30,000/100 = 300
Select a subject at random from the first K names in the sampling frame, then select every K-th name after that one.
1. In this case, we randomly select one student from the first 300 students in the directory.
2. The second student is the one listed 300 places after the first, which falls among the next 300 students in the directory.
3. And so on, until we have 100 selected students.
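The system above can be sketched as follows, treating the directory as positions 0 to 29,999. The seed and the 0-based numbering are assumptions made for the illustration.

```python
import random

N, n = 30_000, 100
k = N // n                            # sampling interval K = N / n = 300

rng = random.Random(11)
start = rng.randrange(k)              # random position among the first K entries (0-based)
positions = list(range(start, N, k))  # the start, then every K-th entry after it

print(k, positions[:3], len(positions))
```

Only the starting point is random; once it is chosen, the rest of the sample is determined, which is why the order of the list must be unrelated to the responses.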
Things That Can Go Wrong with a Sample Survey
• Not getting who you want (nonresponse)
o E.g., people miss the mail-in questionnaire
o Solution: follow up with the potential cases (reminder messages)
• Getting the question(s) right
o Solution: avoid favoring a certain answer in the way the question is asked
• Not giving choices for answers
o E.g., getting open-ended responses
o Solution: use a Likert scale: strongly agree to strongly disagree
• Sampling volunteers
o Do not rely on people who choose to respond, e.g., callers to a radio show
• Sampling badly but conveniently
• Undercoverage
o Not being able to sample certain parts of the population
Sampling Bias: Nonprobability Sampling
• Members of the population do not have an equal chance of being selected.
• Inferences using such samples have unknown reliability (sampling bias).
• The most common type of nonprobability sampling is volunteer sampling.
• E.g., mail-in questionnaires
Example:
One night the ABC program asked viewers whether the United Nations should continue to be located in the U.S.
• Of the 186,000 respondents, 76% said yes.
• Is this a random sample?
• A poll using a random sample of 500 respondents estimated the population percentage to be about 28%.
• Even though it is a smaller sample, it is a better representation of the population (more trustworthy).
Response Bias
Wording of a question on a survey can have a large impact on the results.
Example: 2006 New York Times poll.
“Do you favor a gasoline tax?” 12% yes
“Do you favor a gasoline tax
• to reduce U.S. dependence on foreign oil?” 55% yes
• to reduce global warming?” 59% yes
Nonresponse Bias
Nonresponse bias arises when participants cannot be reached or refuse to participate.
Example:
In her book Women and Love, author Shere Hite surveyed women in the U.S.
She mailed her survey to 100,000 women.
4,500 women responded (4.5% of the 100,000 women).
Her conclusion (based on her survey):
70% of women who had been married at least five years had had extramarital affairs.
Concern:
- Are the 4.5% of women who responded representative of the 100,000 women?
- Are the 4.5% of women who responded representative of the entire population of American women?
Nonresponse Bias
Missing Data:
Missing responses for some of the variables measured.
Example:
EQAO test
Concern:
How to treat the missing information.
Idea: Should we omit every subject who has missing information?
Idea: Should we impute the missing information (replace it with some other statistic) using some strategy?
Idea: If most cases (subjects) do not respond to a specific question, what could this mean regarding the questionnaire? In that case, how could we improve our research question(s)?
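The first two ideas can be compared on a toy data set. The scores below are made up (they are not EQAO results), and mean imputation is just one simple imputation strategy among many, used here only as an illustration.

```python
from statistics import mean

# Hypothetical test scores; None marks a missing response.
scores = [72, None, 85, 90, None, 64]

# Idea 1: omit every subject with a missing value.
complete_cases = [s for s in scores if s is not None]

# Idea 2: impute each missing value with the mean of the observed values.
observed_mean = mean(complete_cases)
imputed = [s if s is not None else observed_mean for s in scores]

print(len(complete_cases), observed_mean, imputed)
```

Omitting subjects shrinks the sample from 6 to 4, while mean imputation keeps all 6 but makes the scores look less variable than they really are, so neither choice is free of cost.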
Observational vs Experimental Study
Observational Study:
• An observational study observes individual cases and measures variables of interest but does not attempt to influence
the response.
• Observational studies may help identify variables that have an effect but they do not prove cause and effect.
• Observational studies are commonly used and are valuable for discovering trends and possible relationships.
• However, it is not possible for observational studies to demonstrate a causal relationship.
Experimental Study:
• An experiment imposes a treatment on individuals in order to observe their response.
• When our goal is to understand the cause and effect, experiments are the only source of fully convincing data.
Example: How do you find out if exercise helps insomnia?
• Randomly select individuals. Find out if they exercise and how much. Ask them to rate their insomnia.
• Suppose the people who exercise more suffer less from insomnia. Can you conclude that people who suffer
from insomnia should be recommended to exercise?
• Maybe, but this is only an association, not cause and effect.
• This is an example of an observational study.
• We can assess association (like correlation) but cannot conclude cause and effect.
Retrospective Study
Retrospective study “looking back”:
• As in the previous example, measure exercise and insomnia from historical records.
• Another example: Is there a relationship between students’ attitudes toward statistics and their previous achievement in mathematics?
• Collect information regarding students’ previous achievement in mathematics from students’ records (Office of the Registrar).
• Another example: Is there a relationship between music and academic performance (grades)?
• Identify students who play music and collect data on their past grades.
• If music courses and grades are correlated, we cannot say that music causes higher grades. Why?
• There are other variables, hidden variables, that we did not take into account that could contribute to and explain the variation in academic performance (grades).
• Such variables could also be correlated with music. They would then be correlated with both music and academic performance. These variables are called confounding variables.
Prospective Study
Prospective study “looking forward”:
• Identify subjects in advance, collect data as events happen.
• In our previous (music and grades) example, select students who have not begun music lessons and record their academic performance over the years (a longitudinal study).
• Compare the students who go on to take music courses with those who do not.
• Still, we cannot say music causes higher grades. Why?
• There is still a possibility that other variables not taken into account could explain grades.
Retrospective vs Prospective Study
• Are data from the past even accurate?
• Which is better, retrospective or prospective?
• Prospective studies usually have less confounding and bias. The outcome being measured needs to be a common one. In general, this type of study takes a long time, but it is a better approach.
• Retrospective studies make it easier to obtain enough data on rare events, and they take less time to do. There are, however, concerns regarding bias and confounding in this type of study.
Experimental Study
How do we establish cause and effect?
For example, in the previous exercise and insomnia example:
• We randomly choose some subjects and instruct them to exercise.
• We instruct the other subjects not to exercise.
• We then assess insomnia for all subjects.
Why is this type of study a better approach? How does it level out the effects of other variables?
• It is a better approach because choosing the two groups at random means that the groups should start out relatively equal in terms of anything that might matter.
• If the groups end up unequal in terms of insomnia, then there is evidence that exercise made a difference.
Terminology in an Experimental Study
• The people, animals, or other objects participating in an experiment are called experimental units or subjects.
• An experimental study has:
• at least one explanatory variable, a factor, to manipulate.
• at least one response variable, an outcome measure.
• The specific values chosen for a factor are called levels.
• A combination of specific levels of the factors is called a treatment.
Back to the Experimental Study Example
Suppose we add diet as a second factor in the exercise and insomnia experiment, so that we have:
• Three kinds of exercise: none, moderate, strenuous
• Two different diets: Vegetarian, Non-vegetarian
• Factors are:
• Exercise, with 3 levels
• Diet, with 2 levels
• There are 3 x 2 = 6 treatments (6 combinations of 2 factors)
• We divide subjects into 6 groups at random.
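The six treatments and the random division of subjects can be sketched as follows. The pool of 60 subjects and the seed are assumptions for illustration; the factor levels come from the example.

```python
import random
from itertools import product

exercise_levels = ["none", "moderate", "strenuous"]
diet_levels = ["vegetarian", "non-vegetarian"]

# Every combination of one exercise level and one diet level is a treatment.
treatments = list(product(exercise_levels, diet_levels))

# Hypothetical pool of 60 subjects, shuffled and dealt out evenly to the 6 treatments.
rng = random.Random(5)
subjects = list(range(1, 61))
rng.shuffle(subjects)
groups = {t: subjects[i::len(treatments)] for i, t in enumerate(treatments)}

print(len(treatments), {t: len(g) for t, g in groups.items()})
```

Shuffling first and then dealing subjects out in turn gives groups of equal size while leaving the membership of each group to chance.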
Principles of experimental design
1. Control:
We control sources of variation other than the factors we are testing by making conditions as similar as possible
for all treatment groups.
2. Randomize:
• Allows us to equalize the effects of unknown or uncontrollable sources of variation.
• It does not eliminate the effects of these sources, but it spreads them out across the treatment levels so that
we can see past them.
3. Replicate:
Get many measurements of response for each treatment.
4. Blocking:
Divide experimental units into groups of similar ones and sample appropriately. If we group similar individuals
together and then randomize within each of these blocks, we can remove much of the variability due to the
difference among the blocks.
Placebo
A placebo is a “fake” treatment designed to look like a real one.
• Why is that important?
• Receiving any treatment, even a fake one, often causes subjects to improve (the placebo effect).
• We want to show that the “real” treatment is not just effective, but better than a placebo. Then we have evidence that the treatment is worth knowing about.
• We can also use the current standard treatment as the comparison.
• Subjects getting the placebo or standard treatment are called the control group.
Blinding
• When we know what treatment was assigned, it’s difficult not to let that knowledge influence our
assessment of the response, even when we try to be careful.
• In order to avoid the bias that might result from knowing what treatment was assigned, we use blinding.
• There are two main classes of individuals who can affect the outcome of the experiment:
• Those who could influence the results (subjects, treatment administrators, technicians)
• Those who evaluate the results (judges, treating physicians, etc.)
• When every individual in either one of these classes is blinded, an experiment is said to be single-blind.
• When everyone in both classes is blinded, the experiment is called double-blind.
Blinding
Pepsi vs Coke: study preference of consumers
• Single-blind is when the participants don’t know which cup they are tasting, but the administrator knows which cup is Pepsi and which is Coke.
• If the consumer’s preference is shown to the administrator, the administrator may treat the study differently (e.g., change the order of the cups).
• Double-blind is when neither the participants nor the administrators know which cup is Pepsi and which is Coke.
In Summary
The best experiments are:
• Randomized
• Comparative
• Double-blind
• Placebo-controlled
Completely Randomized Design (CRD)
When all experimental units are allocated at random among all treatments, the experimental design is
completely randomized.
Example:
Suppose we want to compare three teaching methods A, B, and C.
• Select a group of students at random, e.g., 30 students.
• Divide them into three groups at random.
• One group studies with method A, one with method B, and the other with method C.
• Assess performance of all students and compare the results.
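The random allocation in this completely randomized design can be sketched as follows. The student labels and the seed are hypothetical; equal groups of 10 are an assumption, since the example only says the 30 students are divided at random.

```python
import random

rng = random.Random(8)
students = [f"student_{i:02d}" for i in range(1, 31)]   # hypothetical labels for the 30 students
methods = ["A", "B", "C"]

rng.shuffle(students)                                   # the random order decides the allocation
assignment = {m: students[i * 10:(i + 1) * 10] for i, m in enumerate(methods)}

for m in methods:
    print(m, assignment[m])
```

Nothing about a student (strength, age, school) influences which method they receive; chance alone decides.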
Randomized Block Design (RBD)
• A block is a group of experimental units or subjects that are known before the experiment to be similar in
some way that is expected to affect the response to the treatments.
• In a randomized block design (RBD), the random assignment of units to treatments is carried out separately
within each block.
Example
Suppose we want to compare three teaching methods A, B, and C. The strength of the students can affect the results (i.e., strength is a blocking variable).
• Select a group of students at random, e.g., 30 students.
• Divide the students into blocks, say 10 blocks: the best three students, the next best three, and so on.
• From each block, pick one student at random to study with method A, one with method B, and the other with method C; i.e., the subjects in each block are randomized into the three teaching methods.
• Assess performance of all students and compare the results.
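The blocking and the within-block randomization can be sketched as follows. The prior scores used to rank students by strength, the student labels, and the seed are all made up for illustration.

```python
import random

rng = random.Random(9)
methods = ["A", "B", "C"]

# Hypothetical prior scores used to rank the 30 students by strength.
prior_scores = {f"student_{i:02d}": rng.randint(40, 100) for i in range(1, 31)}
ranked = sorted(prior_scores, key=prior_scores.get, reverse=True)

assignment = {}
for b in range(0, len(ranked), 3):
    block = ranked[b:b + 3]                 # three students of similar strength
    shuffled = rng.sample(methods, 3)       # random order of the methods within the block
    for student, method in zip(block, shuffled):
        assignment[student] = method

counts = {m: list(assignment.values()).count(m) for m in methods}
print(counts)
```

Every block contributes exactly one student to each method, so differences in student strength are balanced across the three methods by design rather than left to chance.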
Ethical Experiments
• The idea of imposing treatments on subjects might be questionable:
• What if we want to study the effects of smoking on lung disease?
• We would have to prevent some subjects from smoking, and make other subjects smoke for the duration of the study (!!!)
• There are some known unhealthy or dangerous things you cannot ask subjects to do.
• Also, giving a placebo when a best proven treatment is available is not ethical.
• Subjects who receive a placebo must not be subject to serious harm by so doing.
• See Declaration of Helsinki, which governs experiments on human subjects:
www.wma.net/en/30publications/10policies/b3/index.html