Stat 512 Day 6: Sampling

Post on 21-Dec-2015


Stat 512

Day 6: Sampling

Last Time

Get lots of sleep!

Characteristics of the distribution of a quantitative variable: shape, center, spread, outliers (in context)

“Formal” analysis for comparing two groups: statistical significance

What is the distribution of the “by chance” results?

Statistical Significance

Calculate the difference in means

Could a difference this large happen by chance?

Can use simulation to mimic the randomization process, assuming no difference between the groups

See how often you get a difference at least as large by chance alone (no treatment effect)

p-value, statistical significance

Consider study design to decide whether to draw a causal conclusion
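The simulation idea on this slide can be sketched in Python. This is a minimal illustration, not the software used in class: the function name `randomization_p_value` and the two lists of scores are made up for the example, and the test is one-sided (it counts differences at least as large as the observed one).

```python
import random

def randomization_p_value(group_a, group_b, num_shuffles=10_000, seed=0):
    """Re-randomize the pooled responses and see how often chance alone
    gives a difference in means at least as large as the observed one."""
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(num_shuffles):
        # Assuming no difference between the groups, any split is equally likely.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        # Small tolerance so rounding error does not exclude ties with the observed value.
        if diff >= observed - 1e-12:
            at_least_as_large += 1
    return observed, at_least_as_large / num_shuffles

# Made-up illustrative scores, NOT the data from the study discussed in class.
unrestricted = [21.0, 18.5, 25.2, 30.1, 12.4, 16.8]
deprived = [5.1, 9.7, -3.2, 11.0, 14.3, 2.6]

observed_diff, p_value = randomization_p_value(unrestricted, deprived)
print(f"observed difference = {observed_diff:.2f}, approximate p-value = {p_value:.4f}")
```

A small p-value says a difference this large rarely happens by chance alone. Whether it supports a causal conclusion still depends on the study design.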

Statistical Significance

Example 2 – Day 5

Actual study vs. hypothetical data: in both cases $\bar{x}_{\text{unrestricted}} - \bar{x}_{\text{deprived}} = 15.92$

Example 2 – Day 5

Statistical Process

Compare results

Randomized?

Getting the observational units in the first place!

Explanatory Variable

Statistical Process

Compare results

Randomized?

Example 1: Sampling Words

Circle 10 representative words

Def: A parameter is a numerical characteristic of the population ($\pi$, $\mu$, $\sigma$)

Def: A statistic is a numerical characteristic of the sample ($\bar{x}$, $\hat{p}$, $s$)

Example 1: Sampling Words

Does our sampling method generally lead to good estimates of the parameter?

Sample results vary from sample to sample!

A sampling method is unbiased if the distribution of the sample statistics is centered at the population value.
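One way to check whether a sampling method is unbiased is to repeat it many times and see where the sample means are centered. The sketch below is only an illustration under stated assumptions: the population of 268 word lengths is made up (it is not the sampling frame from class), and the "length-biased" method assumes longer words are proportionally more likely to catch the eye.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of 268 word lengths (NOT the actual sampling frame).
population = [rng.choice([1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 6, 7, 9, 11])
              for _ in range(268)]
mu = statistics.mean(population)  # the parameter

def simple_random_sample():
    # Every set of 10 words is equally likely (sampling without replacement).
    return rng.sample(population, 10)

def length_biased_sample():
    # Assumption for illustration: the chance of picking a word is proportional
    # to its length. random.choices draws with replacement.
    return rng.choices(population, weights=population, k=10)

def center_of_sample_means(draw_sample, num_samples=5_000):
    # Center (mean) of the empirical distribution of sample means.
    return statistics.mean(statistics.mean(draw_sample()) for _ in range(num_samples))

srs_center = center_of_sample_means(simple_random_sample)
biased_center = center_of_sample_means(length_biased_sample)

print(f"population mean (parameter):    {mu:.2f}")
print(f"center of SRS sample means:     {srs_center:.2f}")
print(f"center of length-biased means:  {biased_center:.2f}")
```

Individual sample means vary under both methods. What differs is where the whole distribution of sample means is centered relative to the parameter.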

Bias

Literary Digest (p. 21)

Bad sampling frame

Voluntary response bias: Those who choose to respond are most likely to feel strongly, usually negatively, on the issue.

Nonresponse bias: Those who aren’t home, who don’t have listed numbers, or who refuse to participate

Convenience sample: Those who are easy to get hold of, easily remembered

Example 1: Sampling Words

Def: A simple random sample gives every word in the population an equal probability of being selected.

Every sample of n words is as likely as any other sample of n words.

Example 1: Sampling Words

Selecting a simple random sample

MTB> set c1
DATA> 1:268
DATA> end
MTB> sample 5 c1 c2

Find the corresponding ID numbers in the sampling frame (from webpage)

Determine the average length of the 5 words in your sample
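For readers without Minitab, the same steps can be sketched in Python. This is an analog of the session commands above, not a translation the course provides: the constant names and the `average_length` helper are made up for the example, and the five words are placeholders rather than words from the sampling frame.

```python
import random

FRAME_SIZE = 268   # ID numbers 1..268, as in the "set c1" step
SAMPLE_SIZE = 5    # as in "sample 5 c1 c2"

ids = list(range(1, FRAME_SIZE + 1))

# random.sample draws without replacement, so no ID can be picked twice.
selected_ids = random.sample(ids, SAMPLE_SIZE)
print("selected ID numbers:", sorted(selected_ids))

def average_length(words):
    """Average number of letters per word in the sample."""
    return sum(len(word) for word in words) / len(words)

# Placeholder words standing in for the five words you would look up by ID
# in the sampling frame (hypothetical, for illustration only).
looked_up_words = ["sample", "of", "five", "placeholder", "words"]
print("average word length:", average_length(looked_up_words))
```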

Example 2: Sampling Words (cont.)

What is the long-term pattern of these sample means?

Def: A sampling distribution of a statistic is the distribution of the sample statistic for all possible samples (of the same size) from the population.

An empirical sampling distribution gives you an idea of the pattern from a large number of samples of the same size
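An empirical sampling distribution can be built by taking many random samples of the same size and recording the statistic from each one. The sketch below assumes a made-up population of 268 word lengths (not the class data) and draws 1,000 samples of size 5. Each sample is one observational unit and its mean is the variable.

```python
import random
import statistics
from collections import Counter

rng = random.Random(512)

# Hypothetical population of 268 word lengths (NOT the actual sampling frame).
population = [rng.randint(1, 11) for _ in range(268)]

# One sample mean per random sample of 5 words.
sample_means = [statistics.mean(rng.sample(population, 5)) for _ in range(1000)]

# Rough text histogram: sample means rounded to the nearest whole number,
# one star per 10 samples.
tally = Counter(round(m) for m in sample_means)
for value in sorted(tally):
    print(f"{value:>2} | {'*' * (tally[value] // 10)}")

print(f"population mean:      {statistics.mean(population):.2f}")
print(f"mean of sample means: {statistics.mean(sample_means):.2f}")
print(f"SD of sample means:   {statistics.stdev(sample_means):.2f}")
```

With only 1,000 samples this approximates the pattern. The sampling distribution itself is defined over all possible samples of size 5.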

Summary

Values of sample statistics vary from sample to sample – sampling variability

Random sampling error

Sampling distribution = distribution of sample statistics (from all possible random samples)

Observational units = samples

Variable = sample statistics (e.g., sample means)

Sampling method is unbiased if sampling distribution is centered at parameter of interest

Random samples are unbiased and allow us to estimate the size of the random sampling error

Sampling distribution follows a predictable pattern

Statistical Significance

This consistent pattern helps us to decide when we might have a surprising value for the sample statistic.

Level of surprise depends on sample size

p-value indicates how often a random sample would lead to a value of the sample statistic at least as extreme

Is sample statistic result “significantly” different from population parameter?
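To see why the level of surprise depends on sample size, the sketch below estimates how often random samples give a sample mean at least as far from the population mean as a given observed value, once with n = 5 and once with n = 25. The population, the observed value, and the function name `approx_p_value` are assumptions made up for illustration.

```python
import random
import statistics

def approx_p_value(population, observed_mean, n, num_samples=5_000, seed=0):
    """Proportion of random samples of size n whose mean is at least as far
    from the population mean as the observed sample mean (two-sided)."""
    rng = random.Random(seed)
    mu = statistics.mean(population)
    cutoff = abs(observed_mean - mu)
    extreme = sum(
        abs(statistics.mean(rng.sample(population, n)) - mu) >= cutoff
        for _ in range(num_samples)
    )
    return extreme / num_samples

# Hypothetical population of 268 word lengths (NOT the actual sampling frame).
pop_rng = random.Random(6)
population = [pop_rng.randint(1, 11) for _ in range(268)]

# Made-up observed sample mean, 1.5 letters above the population mean.
observed = statistics.mean(population) + 1.5

p_small_n = approx_p_value(population, observed, n=5)
p_large_n = approx_p_value(population, observed, n=25)
print(f"n = 5:  approximate p-value = {p_small_n:.3f}")
print(f"n = 25: approximate p-value = {p_large_n:.3f}")
```

The same observed value is more surprising with the larger sample, because sample means vary less from sample to sample when n is larger.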

Example 3: Comparison Shopping

Example

Lost ticket, would you buy another?

Lost $20, would you buy another?

Lives saved?

Lives lost?

Prediction: more likely if lost ticket

Prediction: Option A more likely when framed in terms of lives saved

Nonsampling Errors

March 6-8, 2004 Wall Street Journal/NBC poll of 1,018 adults

GAY MARRIAGE opinions depend on how the question is asked.

To one poll question, a 52%-43% majority opposes a constitutional amendment "making it illegal for gay couples to marry." A 54%-42% majority responds favorably to a second query that omits the word "illegal" and more benignly asks about an amendment "that defined marriage as a union only between a man and a woman."

Sources of Nonsampling Errors

Sensitive questions

Social acceptability

Wording of question

Appearance of interviewer

Order of choices

Unsure response, change mind, faulty memory

For Tuesday

Submit your tentative project proposal (see syllabus for additional guidelines)

Submit PP 6 in Blackboard

Read Sec. 4.1 and 4.2

Complete Example 3 from the Day 6 handout

Project Discussion