Copyright © 2017 National Math + Science Initiative, Dallas, Texas. All rights reserved. Visit us online at www.nms.org

AP Statistics

Designing Studies & Experiments

Student Handout

2017-2018 EDITION

Designing Studies and Experiments

A free-response question dealing with sampling or experimental design has appeared on every AP Statistics exam. The question is designed to assess your understanding of fundamental concepts such as identifying a potential source of bias and its consequences, describing an appropriate sampling method, recognizing the difference between random sampling and random assignment, and recognizing when an inference or generalization can be made based on the design of the study. Student responses using clear communication and correct statistical vocabulary earn the highest score.

Reminders about content and communications:

• When identifying potential bias, be sure to link the bias to a consequence with a specific direction: state whether the statistic is likely to overestimate or underestimate the parameter.

• When describing a simple random sampling method based on a situation, describe a valid sampling procedure that allows each group of n individuals an equal opportunity to be randomly selected. Your description should be clear enough to be implemented in the same way by several different first-year statistics students.

• Without random samples from the population, results cannot be generalized to the population. All factors of the situation must be considered before making a generalization.

• Without random assignment to treatment groups, cause-and-effect conclusions cannot be made from experimental results.
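
The first reminder asks for a direction of bias. As a minimal sketch with made-up numbers (the counts and response rates below are hypothetical and are not taken from any exam question), this Python arithmetic shows how unequal response rates can push a sample proportion above the true value.

```python
# Hypothetical population: every count and rate here is invented for illustration.
in_favor = 400            # people in the population who favor some proposal
against = 600             # people in the population who oppose it

# Assumed response rates, in percent: those in favor are more motivated to reply.
in_favor_response_pct = 50
against_response_pct = 10

in_favor_responding = in_favor * in_favor_response_pct // 100
against_responding = against * against_response_pct // 100

# Parameter (true population proportion) versus statistic (proportion among replies).
true_proportion = in_favor / (in_favor + against)
sample_proportion = in_favor_responding / (in_favor_responding + against_responding)

print(f"True proportion in favor:   {true_proportion:.2f}")
print(f"Proportion among responses: {sample_proportion:.2f}")

if sample_proportion > true_proportion:
    print("The responses overestimate the true proportion in favor.")
```

Under these assumed rates the statistic lands above the parameter. If the people against the proposal were the more eager responders, the direction would reverse, which is why a complete answer names the direction and the reason for it.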

2011 #3 – slightly modified

An apartment building has nine floors and each floor has four apartments. The building owner wants to install new carpeting in eight apartments to see how well it wears before she decides whether to replace the carpet in the entire building. The figure below shows the floors of apartments in the building with their apartment numbers. Only the nine apartments indicated with an asterisk (*) have children in the apartment.

(a) Is this an observational study or an experiment? Explain.

(b) For convenience, the apartment building owner wants to use a cluster sampling method, in which the floors are clusters, to select the eight apartments. Describe a process for randomly selecting eight different apartments using this method.

(c) An alternative sampling method would be to select a stratified random sample of eight apartments, where the strata are apartments with children and apartments with no children. A stratified random sample of size eight might include two randomly selected apartments with children and six randomly selected apartments with no children. In the context of this situation, give one statistical advantage of selecting such a stratified sample as opposed to a cluster sample of eight apartments using the floors as clusters.

Multiple Choice Questions:

1. A large company wants to conduct a survey to determine the proportion of its male employees who practice yoga on a daily basis. Two of its regional offices are chosen at random and all of the male employees at each office are surveyed. This plan is an example of which type of sampling?

A) Cluster

B) Convenience

C) Simple random

D) Stratified random

E) Systematic

2. Which of the following is a key distinction between well-designed experiments and observational studies?

A) More subjects are available for experiments than for observational studies.

B) Ethical constraints prevent large-scale observational studies.

C) Experiments are less costly to conduct than observational studies.

D) An experiment can show a direct cause-and-effect relationship, whereas an observational study cannot.

E) Tests of significance cannot be used on data collected from an observational study.

3. A local television station is interested in how citizens in a small town feel about the increased sales tax proposed by the city council. The question “Are you in favor of the proposed sales tax increase that will be used to improve the sidewalks and streets in downtown?” was shown on the screen during the evening news broadcast and viewers were instructed to text their answer to the number given on the screen. This survey method could produce biased results for all of the following reasons except

A) the wording of the question is biased.

B) a person could answer the survey multiple times.

C) it uses a stratified sample rather than a simple random sample.

D) the survey excludes voters who do not watch the evening newscast.

E) people who feel strongly about the issue are more likely to respond.

4. The oil used in gasoline engines for cars can be mineral oil, a synthetic blend (a mixture of mineral oil and synthetic oil), or pure synthetic oil. An experiment is to be conducted to determine whether oil type affects a car’s engine life. In previous studies, it was determined that engine size (4-cylinder, 6-cylinder, 8-cylinder) is associated with engine life, but car type (coupe, sedan, wagon) is not associated with engine life. This experiment would best be done

A) by blocking on car type

B) by blocking on oil type

C) by blocking on engine life

D) by blocking on engine size

E) without blocking

5. The director of an alumni association wants to estimate the mean income of the members of the class of 1999. Each person in the class of 779 graduates was given the survey and 163 of the graduates returned the survey. How could the nonresponse by the 616 graduates who did not return the survey cause the results of the survey to be biased?

A) The graduates who did not respond caused the assumption of independence to be invalid.

B) The graduates who did not respond changed the survey from a census to a simple random sample of graduates.

C) The graduates who did not respond reduced the sample size and smaller samples are more biased than large samples.

D) The graduates who did not respond may represent a group that is homogeneous with respect to income and differs from the graduates who did respond.

E) The graduates who did not respond may represent a group that is heterogeneous with respect to income and is similar to the graduates who did respond.

6. A consumer group for a camping and hiking magazine would like to compare a new mosquito spray made with essential oils to a traditional spray that contains DEET. The group is interested in the amount of time the spray protects the wearer from the bloodthirsty insects. Subjects will be treated and sent into a mosquito-filled meadow for a 12-hour period. Which of the following is the BEST method for assigning the treatments?

A) Have the subjects choose which spray they are willing to use for the 12-hour period.

B) Assign the sprays to the subjects on the basis of their camping and hiking experience without randomization.

C) Give the new spray to all subjects for a 12-hour period, then give the DEET spray to all subjects for a second 12-hour period.

D) Randomly assign the subjects to two groups, giving the new spray to one group and the DEET spray to the second group.

E) Each subject uses a randomization device to select which spray to apply to the right side of their body. The other spray is then applied to the left side of their body.

7. An education researcher would like to give a survey to a stratified random sample of students in a large school district using grade level as the strata. Which of the following would NOT be a characteristic of this stratified random sample?

A) A random sample of students will be chosen.

B) Each student in the population belongs to only one stratum.

C) The population of students will be divided into homogeneous groups by grade level.

D) Proportional numbers of students could be selected from each grade level.

E) Every possible subset of the population of students in the district has the same chance of being selected.

8. Students taking an exam at Westwood High School were randomly assigned to receive either a peppermint candy or a similar-looking candy without peppermint to determine if peppermint really improves thinking. Both groups showed an increase in test scores. This is an example of

A) a successful experiment due to the peppermint treatment.

B) poor design due to the lack of a control group.

C) measurement bias since we do not know the difficulty level of the exam.

D) the placebo effect due to the increase of scores in the non-peppermint group.

E) blocking by peppermint and no peppermint candy.

9. A chemical company designs an experiment to determine whether or not a new pesticide will work better than a commonly used treatment to eliminate ants. The company uses a sample of fire ants to test the new pesticide. The proportion of ants that survived among those randomly selected to receive the new pesticide was significantly lower than the proportion that survived among the ants that received the commonly used pesticide. The company concluded that the new pesticide is indeed better at killing ants. Is this a correct conclusion?

A) No, because there was not a group that received no pesticide.

B) No, because the experiment was only done with fire ants so the results cannot be generalized to all ants.

C) Yes, because the ants were randomly selected for the two treatment groups.

D) Yes, because the difference was statistically significant.

E) Yes, because there was a control group to reduce the effects of confounding variables.

Additional Free Response Questions:

2004A Question 2

Researchers who are studying a new shampoo formula plan to compare the condition of hair for people who use the new formula with the condition of hair for people who use the current formula. Twelve volunteers are available to participate in this study. Information on these volunteers (numbered 1 through 12) is shown in the table below.

Volunteer   Gender   Age
1           Male     21
2           Female   20
3           Male     47
4           Female   60
5           Female   62
6           Male     61
7           Male     58
8           Female   44
9           Male     44
10          Female   24
11          Male     23
12          Female   46

(a) These researchers want to conduct an experiment involving the two formulas (new and current) of shampoo. They believe that the condition of hair changes with age but not gender. Because researchers want the size of the blocks in an experiment to be equal to the number of treatments, they will use blocks of size 2 in their experiment. Identify the volunteers (by number) that would be included in each of the six blocks and give the criteria you used to form the blocks.

(b) Other researchers believe that hair condition differs with both age and gender. These researchers will also use blocks of size 2 in their experiment. Identify the volunteers (by number) that would be included in each of the six blocks and give the criteria you used to form the blocks.

(c) The researchers in part (b) decide to select three of the six blocks to receive the new formula and to give the other three blocks the current formula. Is this an appropriate way to assign treatments? If so, describe a method for selecting the three blocks to receive the new formula. If not, describe an appropriate method for assigning treatments.

2008 Question 2

A local school board plans to conduct a survey of parents’ opinions about year-round schooling in elementary schools. The school board obtains a list of all families in the district with at least one child in an elementary school and sends the survey to a random sample of 500 of the families. The survey question is provided below.

A proposal has been submitted that would require students in elementary schools to attend school on a year-round basis. Do you support this proposal? (Yes or No)

The school board received responses from 98 of the families, with 76 of the responses indicating support for year-round schools. Based on this outcome, the local school board concludes that most of the families with at least one child in elementary school prefer year-round schooling.

(a) What is a possible consequence of nonresponse bias for interpreting the results of this survey?

(b) Someone advised the local school board to take an additional random sample of 500 families and to use the combined results to make their decision. Would this be a suitable solution to the issue raised in part (a)? Explain.

(c) Suggest a different follow-up step from the one suggested in part (b) that the local school board could take to address the issue raised in part (a).

Important vocabulary

• The population is the entire group you would like to study or draw a conclusion about. Any numerical value that comes from the population is a parameter. Parameters are usually unknown. The study of an entire population is called a census.

• The sample is the part of the population from which you take data. Any numerical value that comes from a sample is called a statistic.

• A simple random sample (SRS) of size n gives every individual and every group of size n an equal chance of being chosen. To carry out an SRS of size n, number the list of possible subjects or experimental units. Clearly describe how to complete the randomization using a random digit table or a random number generator. For each chosen number, write down the name of the corresponding subject or experimental unit. Ignore repeats. Continue until you have a list of n different subjects or experimental units.

• Choose a stratified random sample if you want to be sure to have some subjects from each subgroup in your sample. Split the population into subgroups called strata. Then take an SRS from each stratum. Note: All subjects in a stratum must be similar (homogeneous) with respect to a characteristic that might be related to the response variable. For example, when investigating the average amount of time spent on homework each night, the strata could be freshmen, sophomores, juniors, and seniors, since grade level is probably related to average homework time. Your sample would then be sure to have students from each grade level.

• Use a cluster sample if you have many groups that are similar to each other. Randomly choose one or more groups to be the sample. Note: The subjects in the groups should not be alike (heterogeneous), but each group should be similar to every other group. For example, if each fourth grade class in an elementary school has students of all ability levels and all socioeconomic groups, then randomly choosing one class as a sample would give an acceptable representation of the fourth grade as a whole.

• A convenience sample consists of whomever you happen to run into. It is quick and easy but not a good idea, because the people who are easiest to reach may not represent the population.

• Systematic random sample: choosing every nth person through a door or on a list.

• Bias occurs when a study systematically favors one outcome over another. It can occur when a certain group is over- or under-represented or when your measurement device impacts the results.

• Voluntary response bias occurs when subjects voluntarily choose to be in the sample, and people usually volunteer only if they have strong motivation.

• Undercoverage occurs when some groups of people are ignored when the sample is being chosen. A survey sent by email ignores people without access to email.

• Non-response bias occurs when one group of subjects having something in common do not answer. (For example, only people with a lot of time on their hands respond.)

• Response bias occurs when subjects give incorrect answers, either because they have forgotten details, they are intimidated by the interviewer, or they lie about embarrassing or illegal activities.
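
The sampling methods defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 40-student roster, the homeroom groupings, and the function names are all hypothetical, and Python's standard `random` module stands in for the random digit table or random number generator. The random starting point in the systematic sample is a common convention assumed here, not something stated in the definition above.

```python
import random

def simple_random_sample(units, n, rng):
    """SRS: every group of n units has the same chance of being chosen."""
    return rng.sample(units, n)  # sampling without replacement, so no repeats

def stratified_random_sample(strata, n_per_stratum, rng):
    """Take a separate SRS of size n_per_stratum from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

def systematic_sample(units, k, rng):
    """Pick a random start among the first k units, then take every kth unit."""
    start = rng.randrange(k)
    return units[start::k]

# Hypothetical roster: 10 students in each of four grade levels.
grades = ["freshman", "sophomore", "junior", "senior"]
roster = [f"{grade}-{i:02d}" for grade in grades for i in range(1, 11)]

# Strata are homogeneous by grade; each hypothetical homeroom mixes all four grades.
strata = {grade: [s for s in roster if s.startswith(grade)] for grade in grades}
homerooms = {f"homeroom-{h}": roster[h::5] for h in range(5)}

rng = random.Random(2017)  # fixed seed so the run can be repeated

srs = simple_random_sample(roster, 8, rng)
stratified = stratified_random_sample(strata, 2, rng)
clustered = cluster_sample(homerooms, 1, rng)
systematic = systematic_sample(roster, 5, rng)

print("SRS:       ", srs)
print("Stratified:", stratified)
print("Cluster:   ", clustered)
print("Systematic:", systematic)
```

Each method here returns eight students, but only the stratified sample is guaranteed to include exactly two from every grade level, and only the cluster sample takes everyone from a single group.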

• Experiments are studies in which the researcher imposes a treatment on experimental units.

• Sometimes different groups are simply compared with one another. If no treatment is assigned or imposed, the study is called an observational study.

• Some experiments have a control group (a group of experimental units that receive no treatment or receive only a placebo), but this is not necessary for a well-designed experiment.

• Experimental units are the smallest independent “objects” to which treatments are assigned and on which a response is measured. Consider an experiment that is designed to determine which of several types of fish food will result in the greatest weight gain for fish. If tanks contain several fish, and food is added to the water in the tank, then the tank is the experimental unit (not the individual fish), since the fish in a tank are not independent of one another, but tanks are independent of one another.

• Replication refers to having multiple experimental units in each treatment group (repeating the treatment), not to repeating the entire experiment.

• In an experiment, randomization refers to randomly assigning experimental units to the treatments. Often the experimental units are not a random sample of the population of interest. While this is not a problem with the experimental design, it may limit the scope of inference for the experimental results. (Note that random samples are important in surveys.)

• The purpose of random assignment (of experimental units to treatment groups or of treatments to experimental units) is to even out extraneous variables and make treatment groups that are approximately similar in all respects except for the treatment.

• In a double-blind experiment, someone must know which treatment the experimental unit received! The subjects (assuming they are people) are blind to which treatments they are receiving, and anyone who interacts with the subjects should also be blinded to which treatment was given to each subject.

• If the response variable is in any way a subjective evaluation, then the person performing that evaluation should be blind to what treatments were applied. But obviously, some person or people on the research team must have a record of what treatments have been applied to what subjects.

• A confounding variable is a variable that affects the response variable and also is related to group membership. A variable that affects the response variable and is not related to group membership (that is, the variable would be expected to even out across the groups) is not a confounding variable. You may refer to this type of variable as an extraneous variable. It is best to avoid using the term lurking variable.

o For example: It has been observed that people who take long vacations have, on average, significantly longer lifespans than people who don’t. Can we conclude that vacationing is a way to extend your lifespan? Not necessarily: a person’s income could be a confounding variable. People with higher incomes are more likely to be able to take long vacations, and they’re also more likely to afford health care that could lead to longer lifespans. Note that something like exercise would probably be an extraneous variable and not a confounding variable. Exercise may indeed be associated with longer lifespans, but is there an association between getting exercise and taking long vacations?

• Blocks are groups of experimental units that are homogeneous with respect to some inherent characteristic that is expected to affect the response to treatments.

o Blocks are considered a form of control: blocks help control known sources of variability among the experimental units so that the experimenter is better able to detect differences in the response variable that are due to the treatments.

o “Blocking is used to control the factors you can see; randomization helps balance the ones you cannot see.” Richard L. Scheaffer, AP Statistics Chief Faculty Consultant, 1997-1999
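
Random assignment and blocking, as defined above, can be sketched in Python as follows. Everything concrete here is an assumption made for illustration: the twelve tank labels, the three rooms (assumed, hypothetically, to affect weight gain), the two food names, and the function names. The tanks play the role of experimental units, as in the fish food example, and several tanks per food illustrates replication.

```python
import random
from collections import defaultdict

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal them out to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def randomized_block(units, block_of, treatments, rng):
    """Form blocks of similar units, then randomize separately inside each block."""
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    return {name: completely_randomized(members, treatments, rng)
            for name, members in sorted(blocks.items())}

# Hypothetical experimental units: 12 fish tanks, 4 in each of three rooms.
tanks = [(f"tank-{i:02d}", room)
         for i, room in enumerate(["north", "south", "east"] * 4, start=1)]
foods = ["food A", "food B"]

rng = random.Random(1997)  # fixed seed so the run can be repeated

# Completely randomized design: the room is ignored during assignment.
crd = completely_randomized(tanks, foods, rng)

# Randomized block design: each room is a block, randomized on its own.
rbd = randomized_block(tanks, lambda tank: tank[1], foods, rng)

for food, group in crd.items():
    print("CRD", food, [name for name, _ in group])
for room, groups in rbd.items():
    for food, group in groups.items():
        print("RBD", room, food, [name for name, _ in group])
```

In the completely randomized design, chance alone decides how many tanks from each room receive each food. In the block design, every room contributes exactly two tanks to each food, so the room cannot be mixed up with the treatment, while randomization within each block still balances whatever was not used to form the blocks.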

1

California is experiencing a severe drought, which affects the yield of all crops. Desalinated water may be an option for irrigation, but the water still has a higher salt content than ground water. Is there a difference in pecan yield using desalinated water?

2

Stevia is a natural substitute for sugar extracted from a plant of the same name. Does using Stevia in place of sugar reduce the risk of Type II diabetes?

3

Good manners are listed as one of the top ten skills adults believe children need to succeed in the modern world. What percent of high school students would be categorized as having good manners?

4

Resveratrol is a plant-based compound found in places such as the skin of grapes or blueberries. Some believe resveratrol lowers the risk of cancer. How could we determine if taking resveratrol supplements is beneficial?

5

A city is hoping to avoid the need for a new trash landfill by giving every customer a recycling bin as well as a trash bin. They are interested in predicting the reduction in the amount of trash going into the landfill using the new recycling bins. How should they conduct a study to determine this?

6

How long does it really take to complete AP Statistics homework?

7

As water shortages become increasingly common in many cities throughout the United States, efforts to conserve this necessary resource are being explored. What water-saving habits are citizens most likely to actually adopt?

8

Sodium nitrate is often added to lunch meat and hot dogs to fight harmful bacteria. What long-term effects do nitrates have on our health?

9

Sleep: Not now, of course! High school students are increasingly sleep deprived. Is there a way to solve this problem? You may come up with a study or experiment to help work towards a solution.

10

Wind power is renewable, clean, and takes up a relatively small amount of land area. What is the optimum length for the rotor blades of a wind turbine for generating the maximum amount of electricity?

11

What is the difference in lifespan between lemurs living in the wild and lemurs living in captivity?

12

What is the smallest amount of pesticide that can be used on strawberry plants and still remain effective in keeping bugs from eating the crop?