lecture 3 sampling theory epsy 640 texas a&m university

Post on 31-Dec-2015


LECTURE 3: SAMPLING THEORY

EPSY 640, Texas A&M University

POPULATIONS

• A finite population consists of the actual group of objects or persons, which we know is potentially countable and finite.

• An infinite population is a mathematical abstraction that is useful because the properties of the population are assumed or defined carefully.

POPULATIONS

• Parameter = characteristic of the population.

• If a sample is drawn and the characteristic computed, it will be a statistic for the sample.

POPULATIONS

• Accessible vs. Target Populations.

• Target Population: the population we wish to represent.

• Instead, we might be able to draw from all public school grade 3 students in class during a particular week in the school year. This is our Accessible Population, the population we have access to.

[Figure 4.1: Inferences from sample to populations. Labels: Sample, Accessible population, Target population.]

Sampling Methods

• RANDOM SAMPLING
– SIMPLE
– STRATIFIED
– MULTISTAGE
– CLUSTER

• SYSTEMATIC SAMPLING

• CONVENIENCE (NONRANDOM) SAMPLING

RANDOM SAMPLING

• A sample is random if every member of the population has an equal chance of being selected.

• Random sampling requires being able to define and count the population.

• We can then use a process called randomization to select the sample.

Table of Random Numbers

Location   RN    Location   RN
234        75    308        01   …..
235        13    309        26   …..
236        95    310        31   …..
237        22    311        69   …..
238        46    312        29   …..
239        86    313        98   …..
240        55    314        34   …..
241        59    315        17   …..

In selecting a sample of 20 students from a list of 75, a random start point was selected as shown above. The ad hoc rule was to go down the column to the bottom and up the next. Thus, children with identifiers 75, 1, 13, 26, 69, 22, 46, 29, 55, 34, 59, and 17 have been selected within this section of the random number table. The location value allows checking and replication of a random sample selection process.
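The table-based selection above can also be sketched in software. This is a minimal illustration and not the procedure used in the lecture: it assumes the students are identified by the integers 1 to 75, and the function name `draw_simple_random_sample` and the seed value are hypothetical. Recording the seed plays the same role as the table location: it lets someone else check and replicate the draw.

```python
import random


def draw_simple_random_sample(population_size, sample_size, seed):
    """Draw identifiers 1..population_size without replacement.

    The seed is recorded so the draw can be replicated, much as the
    start location in a random number table can be.
    """
    rng = random.Random(seed)  # private generator; global state is untouched
    identifiers = range(1, population_size + 1)
    return sorted(rng.sample(identifiers, sample_size))


# Hypothetical seed; any recorded integer would serve.
sample = draw_simple_random_sample(population_size=75, sample_size=20, seed=234)
print(sample)
```

Because sampling is without replacement, no identifier can appear twice, which a hand pass through a random number table has to enforce by skipping repeats.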

finite population correction

• fpc = 1 - n/N = 1 - f

where n = number in sample,

N = number in population, and

f = n/N, the sampling fraction.
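As a small sketch of the correction, the function below evaluates 1 - n/N directly. The function name is hypothetical, and the two calls use illustrative sizes: 50 cases from a population of 77 (a large sampling fraction) and 50 cases from a population of 1,000,000 (a negligible one).

```python
def finite_population_correction(n, N):
    """Return fpc = 1 - n/N, where f = n/N is the sampling fraction."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    return 1 - n / N


# Large sampling fraction: the correction shrinks the variance noticeably.
print(finite_population_correction(50, 77))
# Tiny sampling fraction: the correction is close to 1 and barely matters.
print(finite_population_correction(50, 1_000_000))
```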

Finding survey sample size

$$ n = \frac{(z\sigma/d)^2}{1 + (1/N)(z\sigma/d)^2} $$

z = z-score for the confidence level required (usually 1.96 for α = .05 or 2.576 for α = .01)

σ = SD of distribution (can be 1.0 for arbitrary units)

d = desired degree of error in SD units

N = number in population

Finding survey sample size- example

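α = .05, N = 1,000,000, d = .1, σ = 1

$$ n = \frac{(1.96/.1)^2}{1 + (1/1000000)(1.96/.1)^2} = \frac{19.6^2}{1.000384} = \frac{384.16}{1.000384} \approx 384 $$

With a population this large the denominator is almost exactly 1, so n is essentially 19.6² = 384.16.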
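The sample size calculation can be sketched as a short function. This is an illustration under stated assumptions: the function name and keyword defaults are hypothetical, the ratio in the formula is taken to be zσ/d (which reduces to z/d when σ = 1, as in the example), and results are rounded to the nearest whole case.

```python
def survey_sample_size(N, z=1.96, sigma=1.0, d=0.1):
    """Required n for population size N, z-score z, SD sigma, error d."""
    ratio_sq = (z * sigma / d) ** 2
    return ratio_sq / (1 + ratio_sq / N)


# 95% confidence (z = 1.96), d = .1, sigma = 1
for N in (100, 1000, 1_000_000):
    print(N, round(survey_sample_size(N)))
```

For small populations the required sample is a large share of the population; as N grows, n levels off near (z/d)².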

Sample Size Required for d = .1

Population Size   for α = .05   for α = .01
20                19            19
30                28            29
40                36            38
50                44            46
75                62            67
100               79            87
125               94            105
150               108           122
175               120           138
200               132           154
225               142           168
250               151           182
275               160           194
300               168           207
350               183           229
400               196           250
500               217           285
600               234           315
750               254           352
1000              278           399
1500              306           460
2000              322           498
2500              333           525
5000              357           586
7500              365           610
10000             370           623
100000            383           660
1000000           384           663

Table 4.2: Sample sizes required for various population sizes for 95% and 99% confidence intervals

Mean and standard deviation for simple random sampling

• $E(\bar{x}_{.}) = \mu$ (sample mean estimates population mean unbiasedly)

• $V(\bar{x}_{.}) = (1/n)\, s^2 (1 - f)$ (variance must be corrected)

• $s_{\bar{x}_{.}} = \sqrt{V(\bar{x}_{.})}$ = standard error of the mean = $s_m$
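A minimal sketch of the standard error calculation, assuming the variance of the mean is (1/n) s² (1 - f) with f = n/N. The function name is hypothetical; the input values (s = 5, n = 50, N = 77) are chosen for illustration.

```python
import math


def standard_error_of_mean(s, n, N):
    """Standard error of the mean under simple random sampling.

    s: sample standard deviation, n: sample size, N: population size.
    """
    f = n / N                                   # sampling fraction
    variance_of_mean = (1 / n) * s**2 * (1 - f)  # fpc-corrected variance
    return math.sqrt(variance_of_mean)


print(round(standard_error_of_mean(s=5, n=50, N=77), 3))  # 0.419
```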

[Figure: Original data distribution and distribution of means. The axis is marked at -1.96 s_m, -s_m, s_m, and 1.96 s_m, with the mean from a particular sample indicated.]

Confidence interval

• Mean ± z times the standard error: $\bar{x}_{.} \pm z\, s_{\bar{x}_{.}}$

• z = number of SDs of the normal distribution for the chosen error probability α, usually .01 or .05

• for real data: $\bar{x}_{.} \pm 1.96\, s_{\bar{x}_{.}}$ gives a confidence interval around the mean:
– Interpretation: in 95 of 100 times we do the study, the population mean will be in the interval we construct.
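The interval can be sketched in a few lines. This is an illustration only: the function name is hypothetical, the scores are made-up values rather than data from the lecture, the population size of 1000 is assumed, and the z multiplier is used as on the slide.

```python
import math
import statistics


def confidence_interval(data, N, z=1.96):
    """Return (low, high) for mean +/- z * standard error, with fpc."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)                          # sample SD (n - 1 divisor)
    se = math.sqrt((1 / n) * s**2 * (1 - n / N))        # fpc-corrected standard error
    return mean - z * se, mean + z * se


# Made-up scores for illustration; N = 1000 is an assumed population size.
scores = [72, 85, 90, 66, 78, 81, 94, 70, 88, 76]
low, high = confidence_interval(scores, N=1000)
print(round(low, 2), round(high, 2))
```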

[Figure: Distribution of means with the confidence interval marked. The axis is marked at -1.96 s_m, -s_m, s_m, and 1.96 s_m, with the mean from a particular sample indicated.]

Interpretation: in any one study, the population mean is either IN or OUT of the confidence interval; for 100 intervals, it should be IN 95 times on average.
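The long-run interpretation can be checked with a small simulation. This is a sketch under assumptions that are not in the lecture: the population is normal with mean 50 and SD 10, it is treated as infinite (so no fpc is applied), each study draws 50 cases, and the function name and seed are hypothetical.

```python
import math
import random
import statistics


def coverage_rate(pop_mean=50.0, pop_sd=10.0, n=50, studies=1000, z=1.96, seed=640):
    """Fraction of simulated studies whose interval contains pop_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)  # infinite population: no fpc
        if mean - z * se <= pop_mean <= mean + z * se:
            hits += 1                                  # this interval is IN
    return hits / studies


print(coverage_rate())
```

Each simulated interval is either IN or OUT; only the proportion across many studies is expected to sit near .95.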

Stratified random sampling

• The population is divided into subpopulations, called strata.

• We then use simple random sampling for each stratum.

• We can decide to sample proportionately or disproportionately.

Stratified random sampling

• Proportionate sampling: percentage in sample is same as in population

• or disproportionate sampling: percentage in sample is different from that of population

• Example: Males and Females (50% in pop.).
– Proportional: 50 males, 50 females
– Disproportional: 75 males, 25 females

Stratified random sampling

• Example: Ethnicity of students in District: 80% Anglo, 10% Hispanic, 5% African American, 5% Native American

• Proportional for 200 student sample:
– 160 Anglo, 20 Hispanic, 10 African-American, 10 Native American
– May give poor estimates for H, AA, NA samples

• Disproportional:
– 50 Anglo, 50 Hispanic, 50 African-American, 50 Native American
– Will give estimates with similar confidence intervals for all groups
– May need fpc for some groups
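The two allocation schemes in the example can be sketched as follows. The function names are hypothetical, the disproportional scheme shown is the equal-per-stratum split from the slide (other disproportional splits are possible), and rounding in the proportional case may not sum exactly to the sample size for other percentages.

```python
def proportional_allocation(percent_by_stratum, sample_size):
    """Allocate cases to each stratum in proportion to its population share."""
    return {
        name: round(pct * sample_size / 100)
        for name, pct in percent_by_stratum.items()
    }


def equal_allocation(percent_by_stratum, sample_size):
    """Allocate the same number of cases to every stratum."""
    per_stratum = sample_size // len(percent_by_stratum)
    return {name: per_stratum for name in percent_by_stratum}


district = {"Anglo": 80, "Hispanic": 10, "African American": 5, "Native American": 5}
print(proportional_allocation(district, 200))
print(equal_allocation(district, 200))
```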

Mean for stratified random sample.

$$ \bar{x}_{..\,est} = \frac{\sum_{i=1}^{s} N_i\, \bar{x}_{i.}}{N} $$

where $N_i$ = number of cases in population stratum i,

N = total number of cases in the entire population, and

s = number of strata.

Mean for stratified random sample- example

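3 strata, $N_1 = 1000$, $N_2 = 2000$, $N_3 = 3000$

$\bar{x}_{1.} = 70$, $\bar{x}_{2.} = 80$, $\bar{x}_{3.} = 90$

$$ \bar{x}_{..\,est} = \frac{\sum_{i=1}^{3} N_i\, \bar{x}_{i.}}{N} = \frac{(1000 \times 70) + (2000 \times 80) + (3000 \times 90)}{6000} = 83.33 $$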
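The weighted mean in this example can be reproduced with a short sketch. The function name is hypothetical; the inputs are the three stratum sizes and stratum means from the example.

```python
def stratified_mean(stratum_sizes, stratum_means):
    """Population-size-weighted mean of the stratum means."""
    N = sum(stratum_sizes)
    weighted_total = sum(N_i * x_i for N_i, x_i in zip(stratum_sizes, stratum_means))
    return weighted_total / N


print(round(stratified_mean([1000, 2000, 3000], [70, 80, 90]), 2))  # 83.33
```

The unweighted average of the three stratum means would be 80; weighting by stratum size pulls the estimate toward the largest stratum.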

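SD for stratified random sample.

$$ V(\bar{x}_{..\,est}) = \frac{\sum_{i=1}^{s} N_i^2\, s^2_{\bar{x}_{i.}}}{N^2} $$

$$ s_{\bar{x}_{..}} = \sqrt{V(\bar{x}_{..\,est})} $$

where $s^2_{\bar{x}_{i.}} = V(\bar{x}_{i.})$, the variance error of the mean for stratum i using the simple random sample formula.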

SD for stratified random sample.

SUBPOPULATION   N_i   n_i   Mean   s_i   s_m
A               77    50    10     5     .419
B               229   50    11     6     .751
C               738   50    12     7     .956

$\bar{x}_{..\,est} = (77 \times 10 + 229 \times 11 + 738 \times 12)/1044 = 11.63$

$V(\bar{x}_{..\,est}) = (77^2 \times .419^2 + 229^2 \times .751^2 + 738^2 \times .956^2)/1044^2 = .485$

$s(\bar{x}_{..\,est}) = .696$

Table 4.3: Calculation of stratified sample mean and variance error of the mean

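SD for stratified random sample.

The s_m value for each subpopulation in Table 4.3 uses the simple random sample formula within that stratum:

$$ s_m^2 = \frac{1}{n_i}\, s_i^2\, (1 - f_i), \quad f_i = n_i / N_i $$

A: $.419^2 = (1/50)\, 5^2\, (1 - 50/77)$

B: $.751^2 = (1/50)\, 6^2\, (1 - 50/229)$

C: $.956^2 = (1/50)\, 7^2\, (1 - 50/738)$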
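The whole Table 4.3 calculation can be sketched end to end. The function name and the tuple layout are hypothetical. Because the code carries each stratum's variance at full precision instead of the rounded s_m values, its results can differ from the slide in the third or fourth decimal place.

```python
import math


def stratified_standard_error(strata):
    """Variance and standard error of a stratified mean.

    strata: list of (N_i, n_i, s_i) tuples giving population size,
    sample size, and sample SD for each stratum.
    """
    N = sum(N_i for N_i, _, _ in strata)
    weighted_sum = 0.0
    for N_i, n_i, s_i in strata:
        # Variance error of the stratum mean, with finite population correction
        var_mean_i = (1 / n_i) * s_i**2 * (1 - n_i / N_i)
        weighted_sum += N_i**2 * var_mean_i
    variance = weighted_sum / N**2
    return variance, math.sqrt(variance)


table_4_3 = [(77, 50, 5), (229, 50, 6), (738, 50, 7)]
variance, se = stratified_standard_error(table_4_3)
print(round(variance, 3), round(se, 3))
```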
