1/58 statistics sampling. statistics in practice cinergy, formerly cincinnati gas & electric...

1/58 Statistics Sampling

Post on 19-Dec-2015


Page 1: 1/58 Statistics Sampling. STATISTICS in PRACTICE Cinergy, formerly Cincinnati Gas & Electric Company (CG&E), is a public utility that provides gas and

1/58

Statistics

Sampling


STATISTICS in PRACTICE

Cinergy, formerly Cincinnati Gas & Electric Company (CG&E), is a public utility that provides gas and electric power to customers in the Greater Cincinnati area.

To improve service to its customers, Cinergy continually strives to stay up-to-date with its customers’ needs.


STATISTICS in PRACTICE

Cinergy is using the survey results to improve the forecasts of energy demand and to improve service to its commercial customers.

Contents

Terminology Used in Sample Surveys

Types of Surveys and Sampling Methods

Survey Errors

Simple Random Sampling

Stratified Simple Random Sampling

Cluster Sampling

Systematic Sampling


Terminology Used in Sample Surveys

An element is the entity on which data are collected.

A population is the collection of all elements of interest.

A sample is a subset of the population.

The target population is the population we want to make inferences about.


Terminology Used in Sample Surveys

The sampled population is the population from which the sample is actually selected.

These two populations are not always the same.

If inferences from a sample are to be valid, the sampled population must be representative of the target population.


Terminology Used in Sample Surveys

Example: Dunning Microsystems, Inc. (DMI), a manufacturer of personal computers and peripherals, would like to collect data about the characteristics of individuals who purchased a DMI personal computer.

A sample survey of DMI personal computer owners could be conducted.


Terminology Used in Sample Surveys

The elements in this sample survey would be individuals who purchased a DMI personal computer.

The population would be the collection of all people who purchased a DMI personal computer.

The sample would be the subset of DMI personal computer owners who are surveyed.


Terminology Used in Sample Surveys

The target population consists of all people who purchased a DMI personal computer.

The sampled population, however, might be all owners who sent warranty registration cards back to DMI.

Not every person who buys a DMI personal computer sends in the warranty card, so the sampled population would differ from the target population.


Terminology Used in Sample Surveys

The population is divided into sampling units, which are either groups of elements or the elements themselves.

A list of the sampling units for a particular study is called a frame.


Terminology Used in Sample Surveys

The choice of a particular frame is often determined by the availability and reliability of a list.

The development of a frame can be one of the most difficult and important steps in conducting a sample survey.


Terminology Used in Sample Surveys

Example: Suppose we want to survey certified professional engineers who are involved in the design of heating and air conditioning systems for commercial buildings.

If a list of all such engineers were available, the sampling units would be the professional engineers we want to survey.


Terminology Used in Sample Surveys

If such a list is NOT available, a business telephone directory might provide a list of all engineering firms.

We could select a sample of the engineering firms to survey; then, for each firm surveyed, we might interview all the professional engineers.

Types of Surveys

Surveys Involving Questionnaires

Three common types are mail surveys, telephone surveys, and personal interview surveys.

Survey costs are lower for mail and telephone surveys.

With well-trained interviewers, higher response rates and longer questionnaires are possible with personal interviews.

The design of the questionnaire is critical.

Types of Surveys

Surveys Not Involving Questionnaires

Often, someone simply counts or measures the sampled items and records the results.

An example is sampling a company’s inventory of parts to estimate the total inventory value.

Sampling Methods

Nonprobabilistic Sampling

Probabilistic Sampling

Nonprobabilistic Sampling Methods

The probability of obtaining each possible sample cannot be computed.

Statistically valid statements cannot be made about the precision of the estimates.

Sampling cost is lower and implementation is easier.

Methods include convenience and judgment sampling.

Nonprobabilistic Sampling Methods

Convenience Sampling

The units included in the sample are chosen because of accessibility.

In some cases, convenience sampling is the only practical approach.

Nonprobabilistic Sampling Methods

Judgment Sampling

A knowledgeable person selects sampling units that he/she feels are most representative of the population.

The quality of the result is dependent on the person selecting the sample.

Generally, no statistical statement should be made about the precision of the result.

Nonprobabilistic Sampling Methods

Example of convenience sampling: A professor conducting a research study at a university may ask student volunteers to participate in the study simply because they are in the professor’s class.

Probabilistic Sampling Methods

The probability of obtaining each possible sample can be computed.

Methods include simple random, stratified simple random, cluster, and systematic sampling.

Confidence intervals can be developed which provide bounds on the sampling error.

Survey Errors

Two types of errors can occur in conducting a survey:

Sampling error

Nonsampling error

Survey Errors

Sampling Error

It is defined as the magnitude of the difference between the point estimate, developed from the sample, and the population parameter.

It occurs because not every element in the population is surveyed.

It cannot occur in a census.

It cannot be avoided in a sample survey, but it can be controlled.

Survey Errors

Nonsampling Error

It can occur in both a census and a sample survey.

Examples include:

Measurement error

Errors due to nonresponse

Errors due to lack of respondent knowledge

Selection error

Processing error

Survey Errors

Nonsampling Error

• Measurement Error

Measuring instruments are not properly calibrated.

People taking the measurements are not properly trained.

Survey Errors

Nonsampling Error

• Errors Due to Nonresponse

They occur when no data can be obtained, or only partial data are obtained, for some of the units surveyed.

The problem is most serious when a bias is created.

Survey Errors

Nonsampling Error

• Errors Due to Lack of Respondent Knowledge

These errors are common in technical surveys.

Some respondents might be more capable than others of answering technical questions.

Survey Errors

Nonsampling Error

• Selection Error

An inappropriate item is included in the survey.

For example, in a survey of “small truck owners” some interviewers include SUV owners while other interviewers do not.

Survey Errors

Nonsampling Error

• Processing Error

Data are incorrectly recorded.

Data are incorrectly transferred from recording forms to computer files.

Simple Random Sampling

A simple random sample of size n from a finite population of size N is a sample selected such that every possible sample of size n has the same probability of being selected.

We begin by developing a frame or list of all elements in the population.

Then a selection procedure, based on the use of random numbers, is used to ensure that each element in the sampled population has the same probability of being selected.
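As a rough illustration of the selection step, the sketch below draws a simple random sample without replacement from a frame using Python's standard `random` module. The frame of client IDs, the function name `simple_random_sample`, and the seed are all hypothetical choices made for this example only.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the frame, each subset of size n equally likely."""
    rng = random.Random(seed)    # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical frame: ID numbers for a population of N = 200 elements.
frame = list(range(1, 201))
sample = simple_random_sample(frame, n=40, seed=1)
print(sorted(sample))
```

`random.sample` never returns the same element twice, which matches sampling without replacement from a finite population.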

Simple Random Sampling

We will see in the upcoming slides how to:

Estimate the following population parameters:

• Population mean

• Population total

• Population proportion

Determine the appropriate sample size

Simple Random Sampling

In a sample survey it is common practice to provide an approximate 95% confidence interval estimate of the population parameter.

Assuming the sampling distribution of the point estimator can be approximated by a normal probability distribution, we use a value of t = 2 for a 95% confidence interval.

Simple Random Sampling

The interval estimate is:

Point Estimator +/- 2(Estimate of the Standard Error of the Point Estimator)

The bound on the sampling error is:

2(Estimate of the Standard Error of the Point Estimator)

Simple Random Sampling

Population Mean

• Point Estimator

$$\bar{x}$$

• Estimate of the Standard Error of the Mean

$$s_{\bar{x}} = \sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right)$$

Simple Random Sampling

Population Mean

• Interval Estimate

$$\bar{x} \pm t_{\alpha/2}\, s_{\bar{x}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x} \pm 2 s_{\bar{x}}$$

Simple Random Sampling

Population Total

• Point Estimator

$$\hat{X} = N\bar{x}$$

• Estimate of the Standard Error of the Total

$$s_{\hat{X}} = N s_{\bar{x}}$$

Simple Random Sampling

Population Total

• Interval Estimate

$$N\bar{x} \pm t_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$N\bar{x} \pm 2 s_{\hat{X}}$$

Simple Random Sampling

Population Proportion

• Point Estimator

$$\bar{p}$$

• Estimate of the Standard Error of the Proportion

$$s_{\bar{p}} = \sqrt{\left(\frac{N-n}{N}\right)\left(\frac{\bar{p}(1-\bar{p})}{n-1}\right)}$$

Simple Random Sampling

Population Proportion

• Interval Estimate

$$\bar{p} \pm t_{\alpha/2}\, s_{\bar{p}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p} \pm 2 s_{\bar{p}}$$

Determining the Sample Size

An important consideration is choice of sample size.

The best choice usually involves a tradeoff between cost and precision (size of the confidence interval).

Larger samples provide greater precision, but are more costly.

A budget might dictate how large the sample can be.

A specified level of precision might dictate how small a sample can be.

Simple Random Sampling

Smaller confidence intervals provide more precision.

The size of the approximate confidence interval depends on the bound B on the sampling error.

Choosing a level of precision amounts to choosing a value for B.

Given a desired level of precision, we can solve for the value of n.

Simple Random Sampling

Necessary Sample Size for Estimating the Population Mean

$$B = 2\sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right)$$

Hence,

$$n = \frac{N s^2}{N\left(\frac{B^2}{4}\right) + s^2}$$

Simple Random Sampling

Necessary Sample Size for Estimating the Population Total

$$n = \frac{N s^2}{\left(\frac{B^2}{4N}\right) + s^2}$$

Simple Random Sampling

Necessary Sample Size for Estimating the Population Proportion

$$n = \frac{N\bar{p}(1-\bar{p})}{N\left(\frac{B^2}{4}\right) + \bar{p}(1-\bar{p})}$$

Simple Random Sampling

Example: Steddy Investments

Ben Steddy is a financial advisor for 200 clients.

A sample of 40 clients has been taken to obtain various demographic data and information about the clients’ investment objectives.

Statistics of particular interest are the clients’ net worth and the proportion favoring fixed-income investments.

Simple Random Sampling

Example: Steddy Investments

For the sample, the mean net worth was $480,000 (with a standard deviation of $120,000), and the proportion favoring fixed-income investments was .30.

Simple Random Sampling

Point Estimate of Total Net Worth (TNW)

$$\hat{X} = N\bar{x} = 200(480) = 96{,}000 \text{ thousand} = \$96{,}000{,}000$$

Estimate of Standard Error of TNW

$$s_{\hat{X}} = N s_{\bar{x}} = N\sqrt{\frac{N-n}{N}}\left(\frac{s}{\sqrt{n}}\right) = 200\sqrt{\frac{200-40}{200}}\left(\frac{120}{\sqrt{40}}\right) = 3{,}394.113 \text{ thousand} = \$3{,}394{,}113$$

Approximate 95% Confidence Interval for TNW

$$N\bar{x} \pm 2 s_{\hat{X}} = 96{,}000 \pm 2(3{,}394.113) \text{ thousand} = \$89{,}211{,}774 \text{ to } \$102{,}788{,}226$$
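The total net worth calculation can be checked with a short script. This is only a sketch: the function name `srs_total_interval` is made up for this example, dollar amounts are kept in thousands as on the slide, and the multiplier 2 is the approximate 95% value used throughout the deck.

```python
import math


def srs_total_interval(N, n, xbar, s):
    """Point estimate, standard error, and approximate 95% interval for a population total."""
    se_mean = math.sqrt((N - n) / N) * s / math.sqrt(n)  # includes the finite population correction
    total = N * xbar
    se_total = N * se_mean
    return total, se_total, (total - 2 * se_total, total + 2 * se_total)


# Steddy Investments: N = 200 clients, n = 40 sampled, values in $ thousands.
total, se_total, (lo, hi) = srs_total_interval(N=200, n=40, xbar=480, s=120)
print(f"total = {total:,.0f}, se = {se_total:,.3f}, interval = {lo:,.3f} to {hi:,.3f}")
```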

Simple Random Sampling

Point Estimate of Population Proportion Favoring Fixed-Income Investments

$$\bar{p} = .30$$

Estimate of Standard Error of Proportion

$$s_{\bar{p}} = \sqrt{\left(\frac{N-n}{N}\right)\left(\frac{\bar{p}(1-\bar{p})}{n-1}\right)} = \sqrt{\left(\frac{200-40}{200}\right)\left(\frac{.3(1-.3)}{40-1}\right)} = .0656$$

Approximate 95% Confidence Interval

$$\bar{p} \pm 2 s_{\bar{p}} = .30 \pm 2(.0656) = .1688 \text{ to } .4312$$
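A minimal sketch of the proportion calculation, assuming the standard-error formula given earlier for simple random sampling, in which the variance term is divided by n - 1 (here 40 - 1 = 39). The function name `srs_proportion_interval` is hypothetical.

```python
import math


def srs_proportion_interval(N, n, p):
    """Standard error and approximate 95% interval for a population proportion."""
    # Finite population correction times p(1 - p) / (n - 1), with n the SAMPLE size.
    se = math.sqrt(((N - n) / N) * p * (1 - p) / (n - 1))
    return se, (p - 2 * se, p + 2 * se)


# Steddy Investments: N = 200 clients, n = 40 sampled, sample proportion .30.
se, (lo, hi) = srs_proportion_interval(N=200, n=40, p=0.30)
print(f"se = {se:.4f}, interval = {lo:.4f} to {hi:.4f}")
```

Note that the denominator uses the sample size, not the population size; substituting N - 1 = 199 would understate the standard error considerably.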

Simple Random Sampling

Example: Steddy Investments

One year later Steddy wants to again survey his clients. He now has 250 clients and wants to set a bound of $30,000 on the error of the estimate of their mean net worth.

What is the necessary sample size?

Simple Random Sampling

Necessary Sample Size

$$n = \frac{N s^2}{N\left(\frac{B^2}{4}\right) + s^2} = \frac{250(120)^2}{250\left(\frac{(30)^2}{4}\right) + (120)^2} = 50.96$$

Steddy will need a sample size of 51.
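The sample size formula for estimating a mean is easy to evaluate directly. The sketch below uses a hypothetical helper `srs_sample_size_mean`, works in thousands of dollars, and reuses the earlier sample standard deviation of 120 as the planning value for s, as the slide does.

```python
import math


def srs_sample_size_mean(N, s, B):
    """Sample size needed so the bound on the sampling error of the mean is B."""
    return N * s**2 / (N * B**2 / 4 + s**2)


# Steddy Investments, one year later: N = 250, s = 120, B = 30 (in $ thousands).
n_exact = srs_sample_size_mean(N=250, s=120, B=30)
n_needed = math.ceil(n_exact)  # round up so the bound is not exceeded
print(f"n = {n_exact:.2f}, so sample {n_needed} clients")
```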


Stratified Simple Random Sampling

The population is first divided into H groups, called strata.

Then for stratum h, a simple random sample of size nh is selected.

The data from the H simple random samples are combined to develop an estimate of a population parameter.

Stratified Simple Random Sampling

If the variability within each stratum is smaller than the variability across the strata, a stratified simple random sample can lead to greater precision.

The basis for forming the various strata depends on the judgment of the designer of the sample.

Stratified Simple Random Sampling

Example: The College of Business at Lakeland College wants to conduct a survey of this year’s graduating class to learn about their starting salaries.

N = 1500 students graduated this year, across the five majors in the college:

N1 = 500 accounting majors,

N2 = 350 finance majors,

N3 = 200 information systems majors,

N4 = 300 marketing majors, and

N5 = 150 operations management majors.

Stratified Simple Random Sampling

A stratified simple random sample of n = 180 students is selected:

n1 = 45; 45 of the 180 students majored in accounting,

n2 = 40; 40 majored in finance,

n3 = 30; 30 majored in information systems,

n4 = 35; 35 majored in marketing, and

n5 = 30; 30 majored in operations management.

Stratified Simple Random Sampling

Example: ChemTech International

ChemTech International has used stratified simple random sampling to obtain demographic information and preferences regarding health care coverage for its employees and their families.

The population of employees has been divided into 3 strata on the basis of age: under 30, 30-49, and 50 or over.

Some of the sample data is shown on the next slide.

Stratified Simple Random Sampling

Demographic Data

                              Annual Family
                              Dental Expense      Proportion
Stratum        Nh     nh      Mean     St.Dev.    Married
Under 30       100    30      $250     $75        .60
30-49          250    45      $400     $100       .70
50 or Over     125    30      $425     $130       .68
Total          475    105

Stratified Simple Random Sampling

Population Mean

• Point Estimator

$$\bar{x}_{st} = \sum_{h=1}^{H} \left(\frac{N_h}{N}\right)\bar{x}_h$$

where: H = number of strata

N = total number of elements in the population (all strata)

$N_h$ = number of elements in the population in stratum h

$\bar{x}_h$ = sample mean for stratum h

Stratified Simple Random Sampling

Population Mean

• Estimate of the Standard Error of the Mean

$$s_{\bar{x}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h (N_h - n_h)\frac{s_h^2}{n_h}}$$

Stratified Simple Random Sampling

Population Mean

• Interval Estimate

$$\bar{x}_{st} \pm t_{\alpha/2}\, s_{\bar{x}_{st}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x}_{st} \pm 2 s_{\bar{x}_{st}}$$

Stratified Simple Random Sampling

In the survey by the College of Business at Lakeland College, the sample means for each major, or stratum, are

$35,000 for accounting,

$33,500 for finance,

$41,500 for information systems,

$32,000 for marketing, and

$36,000 for operations management.

Stratified Simple Random Sampling

Point estimate of the population mean

$$\bar{x}_{st} = \frac{500}{1500}(35{,}000) + \frac{350}{1500}(33{,}500) + \frac{200}{1500}(41{,}500) + \frac{300}{1500}(32{,}000) + \frac{150}{1500}(36{,}000) = 35{,}017$$

The standard error

$$\sum_{h=1}^{5} N_h (N_h - n_h)\frac{s_h^2}{n_h} = 42{,}909{,}037{,}698$$

$$s_{\bar{x}_{st}} = \sqrt{\frac{1}{(1500)^2}(42{,}909{,}037{,}698)} = \sqrt{19{,}070.68} = 138$$

Stratified Simple Random Sampling

A 95% confidence interval estimate of the population mean is 35,017 ± 2(138) = 35,017 ± 276, or $34,741 to $35,293.

Stratified Simple Random Sampling

Point Estimate of Mean Annual Dental Expense

$$\bar{x}_{st} = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)\bar{x}_h = \frac{100}{475}(250) + \frac{250}{475}(400) + \frac{125}{475}(425) = \$375$$

Estimate of Standard Error of Mean

$$s_{\bar{x}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h (N_h - n_h)\frac{s_h^2}{n_h}} = \sqrt{\frac{1}{(475)^2}(19{,}390{,}972)} = \$9.27$$

Stratified Simple Random Sampling

Approximate 95% Confidence Interval for Mean Annual Dental Expense

$$\bar{x}_{st} \pm 2 s_{\bar{x}_{st}} = 375 \pm 2(9.27) = \$356.46 \text{ to } \$393.54$$
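The ChemTech dental expense figures can be reproduced from the demographic data table. The sketch below is one possible arrangement: the tuple layout and the function name `stratified_mean_interval` are assumptions made for this example, not part of the original slides.

```python
import math

# ChemTech strata as (N_h, n_h, sample mean, sample standard deviation).
strata = [
    (100, 30, 250, 75),   # under 30
    (250, 45, 400, 100),  # 30-49
    (125, 30, 425, 130),  # 50 or over
]


def stratified_mean_interval(strata):
    """Stratified point estimate, standard error, and approximate 95% interval for the mean."""
    N = sum(Nh for Nh, _, _, _ in strata)
    xbar_st = sum(Nh / N * xbar for Nh, _, xbar, _ in strata)
    # Sum of N_h (N_h - n_h) s_h^2 / n_h over the strata.
    var_sum = sum(Nh * (Nh - nh) * sh**2 / nh for Nh, nh, _, sh in strata)
    se = math.sqrt(var_sum) / N
    return xbar_st, se, (xbar_st - 2 * se, xbar_st + 2 * se)


xbar_st, se, (lo, hi) = stratified_mean_interval(strata)
print(f"mean = {xbar_st:.2f}, se = {se:.2f}, interval = {lo:.2f} to {hi:.2f}")
```

Multiplying the mean and its standard error by N = 475 gives the corresponding estimate and standard error for the population total.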

Stratified Simple Random Sampling

Population Total

• Point Estimator

$$\hat{X} = N\bar{x}_{st}$$

• Estimate of the Standard Error of the Total

$$s_{\hat{X}} = N s_{\bar{x}_{st}}$$

Stratified Simple Random Sampling

Population Total

• Interval Estimate

$$N\bar{x}_{st} \pm t_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$N\bar{x}_{st} \pm 2 s_{\hat{X}}$$

Stratified Simple Random Sampling

Point Estimate of Total Family Expense For All Employees

$$\hat{X} = N\bar{x}_{st} = 475(375) = \$178{,}125$$

Approximate 95% Confidence Interval

$$\hat{X} \pm 2 N s_{\bar{x}_{st}} = 178{,}125 \pm 2(475)(9.27) = 178{,}125 \pm 8{,}807 = \$169{,}318 \text{ to } \$186{,}932$$

Stratified Simple Random Sampling

Population Proportion

• Point Estimator

$$\bar{p}_{st} = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)\bar{p}_h$$

where: H = number of strata

N = total number of elements in the population (all strata)

$N_h$ = number of elements in the population in stratum h

$\bar{p}_h$ = sample proportion for stratum h

Stratified Simple Random Sampling

Population Proportion

• Estimate of the Standard Error of the Proportion

$$s_{\bar{p}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h (N_h - n_h)\frac{\bar{p}_h(1-\bar{p}_h)}{n_h - 1}}$$

Stratified Simple Random Sampling

Population Proportion

• Interval Estimate

$$\bar{p}_{st} \pm t_{\alpha/2}\, s_{\bar{p}_{st}}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p}_{st} \pm 2 s_{\bar{p}_{st}}$$

Stratified Simple Random Sampling

Example: In the Lakeland College survey, the college wants to know the proportion of graduates receiving a starting salary of $36,000 or more.

Stratified Simple Random Sampling

The results of the sample survey of 180 graduates show that 63 received starting salaries of $36,000 or more.

16 of the 63 majored in accounting, 3 majored in finance, 29 majored in information systems, 0 majored in marketing, and 15 majored in operations management.

Stratified Simple Random Sampling

The point estimate of the proportion receiving starting salaries of $36,000 or more

$$\bar{p}_{st} = \frac{500}{1500}\left(\frac{16}{45}\right) + \frac{350}{1500}\left(\frac{3}{40}\right) + \frac{200}{1500}\left(\frac{29}{30}\right) + \frac{300}{1500}\left(\frac{0}{35}\right) + \frac{150}{1500}\left(\frac{15}{30}\right) = .3149$$

The standard error

$$\sum_{h=1}^{5} N_h (N_h - n_h)\frac{\bar{p}_h(1-\bar{p}_h)}{n_h - 1} = 1570.6913$$

$$s_{\bar{p}_{st}} = \sqrt{\frac{1}{(1500)^2}(1570.6913)} = .0264$$
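The Lakeland proportion estimate and its standard error can be recomputed from the stratum counts. This is a sketch only: the tuple layout and the function name `stratified_proportion` are hypothetical, and the counts are the ones reported for the 63 graduates with starting salaries of $36,000 or more.

```python
import math

# Lakeland strata as (N_h, n_h, number in sample with salary >= $36,000).
strata = [
    (500, 45, 16),  # accounting
    (350, 40, 3),   # finance
    (200, 30, 29),  # information systems
    (300, 35, 0),   # marketing
    (150, 30, 15),  # operations management
]


def stratified_proportion(strata):
    """Stratified point estimate of a proportion and its estimated standard error."""
    N = sum(Nh for Nh, _, _ in strata)
    p_st = sum(Nh / N * a / nh for Nh, nh, a in strata)
    # Sum of N_h (N_h - n_h) p_h (1 - p_h) / (n_h - 1) over the strata.
    var_sum = sum(
        Nh * (Nh - nh) * (a / nh) * (1 - a / nh) / (nh - 1) for Nh, nh, a in strata
    )
    return p_st, var_sum, math.sqrt(var_sum) / N


p_st, var_sum, se = stratified_proportion(strata)
print(f"p_st = {p_st:.4f}, sum = {var_sum:.4f}, se = {se:.4f}")
```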

Stratified Simple Random Sampling

Point Estimate of Proportion Married

$$\bar{p}_{st} = \sum_{h=1}^{H}\left(\frac{N_h}{N}\right)\bar{p}_h = \frac{100}{475}(.6) + \frac{250}{475}(.7) + \frac{125}{475}(.68) = .6737$$

Estimate of Standard Error of Proportion

$$s_{\bar{p}_{st}} = \sqrt{\frac{1}{N^2}\sum_{h=1}^{H} N_h (N_h - n_h)\frac{\bar{p}_h(1-\bar{p}_h)}{n_h - 1}} = \sqrt{\frac{1}{(475)^2}(391.637)} = .0417$$

Approximate 95% Confidence Interval for Proportion

$$\bar{p}_{st} \pm 2 s_{\bar{p}_{st}} = .6737 \pm 2(.0417) = .5903 \text{ to } .7571$$

Stratified Simple Random Sampling

Sample Size When Estimating Population Mean

$$n = \frac{\left(\sum_{h=1}^{H} N_h s_h\right)^2}{N^2\left(\frac{B^2}{4}\right) + \sum_{h=1}^{H} N_h s_h^2}$$
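To see the stratified sample size formula for a mean in action, the sketch below plugs in the ChemTech stratum sizes and standard deviations. The bound B = $20 is a hypothetical value chosen purely for illustration (the slides do not specify one), and the function name is likewise made up.

```python
import math

# ChemTech strata as (N_h, s_h), with s_h the dental expense standard deviation.
strata = [(100, 75), (250, 100), (125, 130)]


def stratified_sample_size_mean(strata, B):
    """Total sample size so the bound on the error of the stratified mean is B."""
    N = sum(Nh for Nh, _ in strata)
    numerator = sum(Nh * sh for Nh, sh in strata) ** 2
    denominator = N**2 * B**2 / 4 + sum(Nh * sh**2 for Nh, sh in strata)
    return numerator / denominator


n_exact = stratified_sample_size_mean(strata, B=20)  # B = 20 is an assumed bound
n_needed = math.ceil(n_exact)                        # round up to meet the bound
print(f"n = {n_exact:.2f}, so sample {n_needed} employees in total")
```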

Stratified Simple Random Sampling

Sample Size When Estimating Population Total

$$n = \frac{\left(\sum_{h=1}^{H} N_h s_h\right)^2}{\frac{B^2}{4} + \sum_{h=1}^{H} N_h s_h^2}$$

Stratified Simple Random Sampling

Sample Size When Estimating Population Proportion

$$n = \frac{\left(\sum_{h=1}^{H} N_h \sqrt{\bar{p}_h(1-\bar{p}_h)}\right)^2}{N^2\left(\frac{B^2}{4}\right) + \sum_{h=1}^{H} N_h \bar{p}_h(1-\bar{p}_h)}$$

Stratified Simple Random Sampling

Proportional Allocation of Sample n to the Strata

$$n_h = n\left(\frac{N_h}{N}\right)$$
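As an illustration of proportional allocation, the sketch below splits a total sample of n = 180 across the five Lakeland majors in proportion to their stratum sizes. The function name `proportional_allocation` is hypothetical. The result differs from the stratum sample sizes reported in the Lakeland example (45, 40, 30, 35, 30), which evidently were not allocated proportionally.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample of n to strata in proportion to N_h / N."""
    N = sum(stratum_sizes.values())
    return {name: n * Nh / N for name, Nh in stratum_sizes.items()}


# Lakeland College stratum sizes (N = 1500 in total).
sizes = {
    "accounting": 500,
    "finance": 350,
    "information systems": 200,
    "marketing": 300,
    "operations management": 150,
}
alloc = proportional_allocation(sizes, n=180)
for major, n_h in alloc.items():
    print(f"{major}: {n_h:.0f}")
```

In general the allocations are not whole numbers, so some rounding rule is needed; here they happen to come out exact.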


Cluster Sampling

Cluster sampling requires that the population be divided into N groups of elements called clusters.

We would define the frame as the list of N clusters.

We then select a simple random sample of n clusters.

We would then collect data for all elements in each of the n clusters.


Cluster Sampling

A primary application of cluster sampling involves area sampling, where the clusters are counties, city blocks, or other well-defined geographic sections.

Cluster sampling tends to provide better results than stratified sampling when the elements within the clusters are heterogeneous.

Cluster Sampling

Example: We want to survey registered voters in the state of Ohio.

One approach is to develop a frame consisting of all registered voters in the state of Ohio and then select a simple random sample of voters from this frame.

Cluster Sampling

Alternatively, in cluster sampling, we might choose to define the frame as the list of the N = 88 counties in the state. Each county or cluster would consist of a group of registered voters, and each registered voter in the state would belong to one and only one cluster.

Cluster Sampling

[Figure: Counties of the State of Ohio Used as Clusters of Registered Voters]

Cluster Sampling

Notation

N = number of clusters in the population

n = number of clusters selected in the sample

$M_i$ = number of elements in cluster i

M = number of elements in the population

$\bar{M}$ = M/N = average number of elements in a cluster

$x_i$ = total of all observations in cluster i

$a_i$ = number of observations in cluster i with a certain characteristic

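Cluster Sampling

Population Mean

• Point Estimator

$$\bar{x}_c = \frac{\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} M_i}$$

• Estimate of Standard Error of the Mean

$$s_{\bar{x}_c} = \sqrt{\left(\frac{N-n}{Nn\bar{M}^2}\right)\frac{\sum_{i=1}^{n}(x_i - \bar{x}_c M_i)^2}{n-1}}$$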

Cluster Sampling

Population Mean

• Interval Estimate

$$\bar{x}_c \pm t_{\alpha/2}\, s_{\bar{x}_c}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{x}_c \pm 2 s_{\bar{x}_c}$$

Cluster Sampling

Population Total

• Point Estimator

$$\hat{X} = M\bar{x}_c$$

• Estimate of the Standard Error of the Total

$$s_{\hat{X}} = M s_{\bar{x}_c}$$

Cluster Sampling

Population Total

• Interval Estimate

$$M\bar{x}_c \pm t_{\alpha/2}\, s_{\hat{X}}$$

• Approximate 95% Confidence Interval Estimate

$$M\bar{x}_c \pm 2 s_{\hat{X}}$$
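As a rough illustration, the mean and total estimators, their standard errors, and the approximate 95% intervals can be computed together. This is a minimal Python sketch: the function name and the three-cluster data set are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from math import sqrt

def cluster_mean_and_total(x, m, N, M):
    """Cluster-sample estimates of the population mean and total.

    x: cluster totals x_i for the n sampled clusters
    m: cluster sizes M_i for the same clusters
    N: number of clusters in the population
    M: number of elements in the population
    """
    n = len(x)
    M_bar = M / N                              # average cluster size
    x_bar_c = sum(x) / sum(m)                  # point estimator of the mean
    ss = sum((xi - x_bar_c * mi) ** 2 for xi, mi in zip(x, m))
    se_mean = sqrt((N - n) / (N * n * M_bar ** 2) * ss / (n - 1))
    total = M * x_bar_c                        # point estimator of the total
    se_total = M * se_mean
    return {
        "mean": x_bar_c,
        "se_mean": se_mean,
        "ci_mean": (x_bar_c - 2 * se_mean, x_bar_c + 2 * se_mean),
        "total": total,
        "se_total": se_total,
        "ci_total": (total - 2 * se_total, total + 2 * se_total),
    }

# Hypothetical data: N = 10 clusters, M = 100 elements, n = 3 clusters sampled
result = cluster_mean_and_total(x=[80, 110, 110], m=[8, 10, 12], N=10, M=100)
print(result["mean"], round(result["se_mean"], 4))    # 10.0 0.483
print(result["total"], round(result["se_total"], 2))  # 1000.0 48.3
```

Here N and n count clusters, not elements; the element counts enter only through the M_i, M, and the average cluster size.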

Cluster Sampling

Population Proportion

• Point Estimator

$$\bar{p}_c = \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} M_i}$$

Cluster Sampling

Population Proportion

• Estimate of the Standard Error of the Proportion

$$s_{\bar{p}_c} = \sqrt{\left(\frac{N-n}{Nn\bar{M}^2}\right)\frac{\sum_{i=1}^{n}(a_i - \bar{p}_c M_i)^2}{n-1}}$$

Cluster Sampling

Population Proportion

• Interval Estimate

$$\bar{p}_c \pm t_{\alpha/2}\, s_{\bar{p}_c}$$

• Approximate 95% Confidence Interval Estimate

$$\bar{p}_c \pm 2 s_{\bar{p}_c}$$

Cluster Sampling

Example: The CPA (certified public accountant) Society conducted a survey of the 12,000 practicing CPAs in a particular state.

The CPA Society used a cluster sample to minimize the total travel and interviewing cost.

Cluster Sampling

The frame consisted of all CPA firms that were registered to practice accounting in the state. Suppose there are N = 1000 clusters, or CPA firms.

A simple random sample of n = 10 CPA firms is to be selected.

Cluster Sampling

Results of CPA Sample Survey

We have N = 1000, n = 10, M = 12,000, and $\bar{M}$ = 12,000/1000 = 12

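Cluster Sampling

An estimate of the mean salary (in thousands of dollars) is

$$\bar{x}_c = \frac{6560}{128} = 51.250$$

With

$$\sum_{i=1}^{n}(x_i - \bar{x}_c M_i)^2 = 51{,}281.378$$

the standard error is

$$s_{\bar{x}_c} = \sqrt{\left(\frac{1000-10}{(1000)(10)(12)^2}\right)\frac{51{,}281.378}{10-1}} = 1.979$$

A 95% confidence interval estimate for the mean annual salary is 51,250 ± 2(1979) = 51,250 ± 3958, or $47,292 to $55,208.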

Cluster Sampling

Example: Cooper County Schools

There are 40 high schools in Cooper County. School officials are interested in the effect of participation in athletics on academic preparation for college. A cluster sample of 5 schools has been taken and a questionnaire administered to all the seniors on the basketball teams at those schools. There are a total of 1200 high school seniors in the county playing basketball.

Cluster Sampling

Data Obtained From the Questionnaire

School   Number of Players   Average SAT Score   Number Planning to Attend College
1        45                  840                 15
2        20                  980                 16
3        30                  905                 12
4        38                  880                 18
5        40                  970                 23
Total    173                                     84

Cluster Sampling

Point Estimate of Population Mean SAT Score

$$\bar{x}_c = \frac{\sum_{i=1}^{5} x_i}{\sum_{i=1}^{5} M_i} = \frac{45(840) + 20(980) + \cdots + 40(970)}{45 + 20 + 30 + 38 + 40} = \frac{156{,}790}{173} \approx 906$$

Cluster Sampling

Estimate of Standard Error of the Point Estimator of Population Mean

Here N = 40 schools (clusters), n = 5 schools in the sample, and $\bar{M}$ = 1200/40 = 30.

$$s_{\bar{x}_c} = \sqrt{\left(\frac{N-n}{Nn\bar{M}^2}\right)\frac{\sum_{i=1}^{5}(x_i - \bar{x}_c M_i)^2}{n-1}} = \sqrt{\left(\frac{40-5}{(40)(5)(30)^2}\right)\frac{18{,}541{,}944}{5-1}} = 30.0224$$

Cluster Sampling

Approximate 95% Confidence Interval Estimate of the Population Mean SAT Score

$$\bar{x}_c \pm 2 s_{\bar{x}_c} = 906 \pm 2(30.0224) = 906 \pm 60.04$$

= 845.96 to 966.04

Cluster Sampling

Point Estimator of Population Total SAT Score

$$\hat{X} = M\bar{x}_c = 1200(906) = 1{,}087{,}200$$

Estimate of Standard Error of the Point Estimator of Population Total

$$s_{\hat{X}} = M s_{\bar{x}_c} = 1200(30.0224) = 36{,}026.88$$

Cluster Sampling

Approximate 95% Confidence Interval Estimate of the Population Total SAT Score

$$M\bar{x}_c \pm 2 s_{\hat{X}} = 1200(906) \pm 2(36{,}026.88) = 1{,}087{,}200 \pm 72{,}053.76$$

= 1,015,146.24 to 1,159,253.76

Cluster Sampling

Point Estimate of Population Proportion Planning to Attend College

$$\bar{p}_c = \frac{\sum_{i=1}^{5} a_i}{\sum_{i=1}^{5} M_i} = \frac{84}{173} = .49$$

Cluster Sampling

Estimate of Standard Error of the Point Estimator of the Population Proportion

$$s_{\bar{p}_c} = \sqrt{\left(\frac{N-n}{Nn\bar{M}^2}\right)\frac{\sum_{i=1}^{n}(a_i - \bar{p}_c M_i)^2}{n-1}}$$

$$s_{\bar{p}_c} = \sqrt{\left(\frac{40-5}{(40)(5)(30)^2}\right)\frac{(15 - .49(45))^2 + \cdots + (23 - .49(40))^2}{5-1}} = .0722$$

Cluster Sampling

Approximate 95% Confidence Interval Estimate of the Population Proportion Planning College

$$\bar{p}_c \pm 2 s_{\bar{p}_c} = .49 \pm 2(.0722) = .49 \pm .1444$$

= .3456 to .6344
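The Cooper County estimates can be recomputed from the questionnaire table with a short Python sketch. It assumes that N = 40 and n = 5 count schools (clusters), as in the Notation slide, and it carries the unrounded point estimates (about 906.3 and .4855) through the standard-error formulas, so the last digits can differ from hand calculations that round first. The helper name `ratio_estimate` is illustrative; it relies on the mean and proportion formulas sharing the same form.

```python
from math import sqrt

def ratio_estimate(y, m, N, M_bar):
    """Return (sum(y) / sum(m), estimated standard error) for a cluster sample.

    y: per-cluster totals (x_i for a mean, a_i for a proportion)
    m: per-cluster sizes M_i
    N: number of clusters in the population
    M_bar: average number of elements per cluster in the population
    """
    n = len(y)
    r = sum(y) / sum(m)
    ss = sum((yi - r * mi) ** 2 for yi, mi in zip(y, m))
    se = sqrt((N - n) / (N * n * M_bar ** 2) * ss / (n - 1))
    return r, se

# Data from the questionnaire table (schools 1 to 5)
players = [45, 20, 30, 38, 40]          # M_i
avg_sat = [840, 980, 905, 880, 970]     # average SAT score per school
planning = [15, 16, 12, 18, 23]         # a_i

N, M = 40, 1200                         # schools and players in the county
M_bar = M / N                           # 30 players per school on average

# Cluster totals x_i = (number of players) * (average SAT score)
sat_totals = [mi * s for mi, s in zip(players, avg_sat)]

mean_sat, se_sat = ratio_estimate(sat_totals, players, N, M_bar)
p_c, se_p = ratio_estimate(planning, players, N, M_bar)
total_sat, se_total = M * mean_sat, M * se_sat

print(f"mean SAT   {mean_sat:.2f} +/- {2 * se_sat:.2f}")
print(f"total SAT  {total_sat:.0f} +/- {2 * se_total:.0f}")
print(f"proportion {p_c:.4f} +/- {2 * se_p:.4f}")
```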

Systematic Sampling

Systematic sampling is often used as an alternative to simple random sampling, which can be time-consuming when the population is large.

If a sample size of n from a population of size N is desired, we might sample one element for every N/n elements in the population.

We would randomly select one of the first N/n elements and then select every (N/n)th element thereafter.

Since the first element selected is a random choice, a systematic sample is often assumed to have the properties of a simple random sample.
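A minimal Python sketch of this procedure is shown below. It assumes the population is available as an ordered list and uses the integer part of N/n as the sampling interval; the function name is illustrative.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k = N // n elements, then every k-th."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n                       # sampling interval (integer part of N/n)
    rng = random.Random(seed)
    start = rng.randrange(k)         # random position among the first k elements
    return [population[start + i * k] for i in range(n)]

# Hypothetical population of 100 numbered elements, sample of 10
population = list(range(1, 101))
sample = systematic_sample(population, n=10, seed=0)
print(sample)
```

Only the starting point is random; every later selection is determined by the interval, so a periodic ordering of the list can bias the sample.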