collecting statistical data

COLLECTING STATISTICAL DATA When information is packaged in numerical form, it is called data. (page 447) Statistics is the science of dealing with data. This includes gathering data, organizing data, interpreting data, and understanding data. (page 447)

Upload: tacita

Post on 04-Jan-2016



Page 1: COLLECTING STATISTICAL DATA

COLLECTING STATISTICAL DATA

When information is packaged in numerical form, it is called data. (page 447)

Statistics is the science of dealing with data. This includes gathering data, organizing data, interpreting data, and understanding data. (page 447)

Page 2: COLLECTING STATISTICAL DATA

The N-value is an accurate head count of every member of a population. (page 448)

Do not confuse the N-value with the population itself.

The group of individuals or objects to which a statistical statement refers is called the population. (page 448)

Page 3: COLLECTING STATISTICAL DATA

Determining the N-Value of a population

Census: Collecting data from every member of the population (page 450)

When populations are small and accessible, one can actually get an exact N-value by simply counting “heads”.

Page 4: COLLECTING STATISTICAL DATA

The N-value N=430 is only an estimate.

Collecting data from a selected subgroup of a population and then using those data to draw a conclusion and make statistical inferences about the entire population is called conducting a survey. The subgroup of the population from which the data is collected is called a sample. (page 452)

Page 5: COLLECTING STATISTICAL DATA

THE CAPTURE-RECAPTURE METHOD: ESTIMATING THE N-VALUE OF A POPULATION BY SAMPLING (page 460)

STEP 1. The Capture: Capture (choose) a sample of size $n_1$, tag (mark, identify) the animals (objects, people), and release them back into the general population.

STEP 2. The Recapture: After a certain period of time, capture a new sample of size $n_2$, and take an exact head count of the tagged individuals (i.e., those that were also in the first sample). Let’s call this number $k$.

The ratio $n_2/k$ is approximately equal to the ratio $N/n_1$. From this we get: $N \approx \frac{n_1 \cdot n_2}{k}$.

Page 6: COLLECTING STATISTICAL DATA

Example. You want to estimate how many fish there are in a small pond. Suppose you capture 500 fish, tag them, and throw them back in the pond. After a couple of days, you go back to the pond and capture 120 fish, of which 30 are tagged. Give an estimate of the N-value of the fish population in the pond.
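The pond example can be checked with a short calculation. The sketch below is illustrative only: the function name `capture_recapture_estimate` is an assumption (it does not come from the slides), and it does nothing more than apply the capture-recapture relation N ≈ n1 · n2 / k.

```python
def capture_recapture_estimate(n1: int, n2: int, k: int) -> float:
    """Estimate the population size N from a capture-recapture experiment.

    n1: size of the first sample (all tagged and released)
    n2: size of the second sample (the recapture)
    k:  number of tagged individuals found in the second sample
    """
    if k <= 0:
        raise ValueError("k must be positive: no tagged individuals were recaptured")
    # From n2/k ≈ N/n1 we get N ≈ n1 * n2 / k.
    return n1 * n2 / k


# Pond example: 500 fish tagged, 120 recaptured, 30 of those tagged.
estimate = capture_recapture_estimate(500, 120, 30)
print(estimate)  # 2000.0
```

The result is an estimate of the N-value, not an exact head count; a different recapture would most likely give a different number.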

Page 7: COLLECTING STATISTICAL DATA

Statisticians use the term statistic to describe any kind of numerical information drawn from a sample. (page 459)

A statistic is always an estimate for some unknown measure, called a parameter, of the population.

Page 8: COLLECTING STATISTICAL DATA

Sampling error (page 459) is the difference between a parameter and a statistic used to estimate that parameter.

Page 9: COLLECTING STATISTICAL DATA

In surveys, chance error is the result of sampling variability: the fact that two different samples are likely to give two different statistics, even when the samples are chosen using the same sampling method. (page 459)

Sample bias is the result of having a poorly chosen sample.
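Sampling variability is easy to see in a small simulation. The sketch below is a hedged illustration, not part of the slides: the population of 1,000 voters, the 60% parameter, the sample size of 50, and the use of simple random sampling are all assumptions made for the example, and the sampling error is taken here as statistic minus parameter.

```python
import random


def sample_proportion(population, sample_size, rng):
    """Draw a simple random sample and return the proportion of 1s in it."""
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


# Hypothetical population: 1,000 voters, 600 of whom (60%) favor a candidate.
population = [1] * 600 + [0] * 400
parameter = sum(population) / len(population)

rng = random.Random(2016)  # fixed seed so the run is repeatable
statistics = [sample_proportion(population, 50, rng) for _ in range(5)]

for s in statistics:
    # Same population, same sampling method, yet each sample
    # produces its own statistic and its own sampling error.
    print(f"statistic = {s:.2f}, sampling error = {s - parameter:+.2f}")
```

Because every sample here is chosen the same way from the whole population, any differences between the printed statistics come from chance error rather than from sample bias.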

Page 10: COLLECTING STATISTICAL DATA

The critical issues are:
a. Finding a sample that is representative of the population, and
b. Determining how big the sample should be.

If n is the sample size and N is the population size then n/N is called the sampling rate. This is usually expressed as a percentage. A sampling rate of x% indicates that the sample is x% of the population. (page 529)

Choosing a good sample of a reasonable size is more important than the sampling rate.
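As a quick illustration of the sampling rate n/N, here is a minimal sketch. The function name `sampling_rate_percent` and the numbers in the usage line are hypothetical and are not taken from the slides.

```python
def sampling_rate_percent(n: int, N: int) -> float:
    """Return the sampling rate n/N as a percentage.

    n: sample size
    N: population size (the N-value)
    """
    if N <= 0 or not 0 <= n <= N:
        raise ValueError("need a positive N and a sample size between 0 and N")
    return 100 * n / N


# Hypothetical survey: a sample of 150 drawn from a population of 3,000.
print(sampling_rate_percent(150, 3000))  # 5.0
```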

Page 11: COLLECTING STATISTICAL DATA

Consider the M&M example. Assume there are 1,500 M&M’s in the container.
• Describe the population of the survey
• Describe the sample for this survey
• Give the sample statistic for the number of M&M’s in this survey
• Give the parameter of the survey
• Give the sampling proportion for the survey
• Give the sampling error
• Give the sampling error, expressed as a percent
• Is the sampling error found a result of sampling variability or sampling bias? Explain.

Page 12: COLLECTING STATISTICAL DATA

A public opinion poll (page 453) is a special kind of a survey in which the members of the sample provide information by answering specific questions from an “interviewer”.

Page 13: COLLECTING STATISTICAL DATA

Example

• In order to estimate how effective Mr. Evans is in his Core 120 class, he gives 5 students in his section a survey on which they rate his effectiveness on a scale of 1-5. He chooses the students in the following way: one who is getting an “A”, one who is getting a “B”, one who is getting a “C”, etc.

• The scores reported were 4, 4, 5, 4, 4.

• At the end of the semester, he gives the survey to all students in the class and finds that the average rating is 4.45.

Page 14: COLLECTING STATISTICAL DATA

• Describe the population of the survey
• Describe the sample for this survey
• Give the sample statistic (if it is given) for the average rating of Mr. Evans
• Give the parameter (if it is given) for the average rating of Mr. Evans
• Give the sampling proportion (rate) for the survey
• Give the sampling error
• Give the sampling error, expressed as a percent
• Is the sampling error found a result of sampling variability or sampling bias? Explain.
• What could be done with this sample to eliminate (or minimize) the sampling error?
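The numeric parts of this exercise can be worked through in a few lines. This is only a sketch of one reading of the example: it treats the mean of the five reported scores as the sample statistic and the end-of-semester average of 4.45 as the parameter, and it takes the sampling error as statistic minus parameter (the sign convention is an assumption).

```python
from statistics import mean

# Scores reported by the 5 sampled students.
sample_scores = [4, 4, 5, 4, 4]

# Sample statistic: the average rating within the sample.
sample_statistic = mean(sample_scores)

# Parameter: the average rating from the whole class at the end of the semester.
parameter = 4.45

# Sampling error, taken here as statistic minus parameter.
sampling_error = sample_statistic - parameter

print(f"sample statistic = {sample_statistic:.2f}")  # 4.20
print(f"sampling error   = {sampling_error:+.2f}")   # -0.25
```

The sampling rate cannot be computed from the slide alone, because the class size is not given.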

Page 15: COLLECTING STATISTICAL DATA

A public opinion poll (page 453) is a special kind of a survey in which the members of the sample provide information by answering specific questions from an “interviewer”.

The critical issues are:
a. Finding a sample that is representative of the population, and
b. Determining how big the sample should be.

Page 16: COLLECTING STATISTICAL DATA

Bush's lead gets smaller in poll

By Susan Page, USA TODAY

WASHINGTON — President Bush leads Sen. John Kerry by 8 percentage points among likely voters, the latest USA TODAY/CNN/Gallup Poll shows. That is a smaller advantage than the president held in mid-September but shows him maintaining a durable edge in a race that was essentially tied for months.

Results based on likely voters are based on the subsample of 758 survey respondents deemed most likely to vote in the November 2004 General Election. The margin of sampling error is ±4 percentage points.

Page 17: COLLECTING STATISTICAL DATA

1936 Literary Digest poll

• 1936 - Great Depression
• Presidential election between Democrat Franklin D. Roosevelt and Republican Alfred Landon.
• Literary Digest runs a poll before the election
– Telephone lists, professional organizations, magazine subscriptions
– Created a list of 10 million names

• From 2.4 million respondents:
– Landon 57%
– Roosevelt 43%

• Actual results
– Landon 38%
– Roosevelt 62%
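The size of the miss can be made explicit by treating the poll percentages as statistics and the election percentages as parameters. The sketch below uses only the four percentages on this slide; the dictionary names are illustrative, and the sampling error is again taken as statistic minus parameter.

```python
# Poll results (statistics) and election results (parameters), in percent.
poll = {"Landon": 57, "Roosevelt": 43}
actual = {"Landon": 38, "Roosevelt": 62}

# Sampling error for each candidate, in percentage points.
sampling_error = {name: poll[name] - actual[name] for name in poll}

for name, error in sampling_error.items():
    print(f"{name}: {error:+d} percentage points")
# Landon: +19 percentage points
# Roosevelt: -19 percentage points
```

An error this large, despite millions of responses, fits the earlier point that choosing a representative sample matters more than collecting a big one.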

Page 18: COLLECTING STATISTICAL DATA

Homework

• Read pages 452 - 461

• Page 467: 1 – 16, 29 – 34, 37, 38, 59