AP Statistics Unit 2 Concepts (Chapter 4)



Post on 18-Dec-2021


Page 1: AP Statistics Unit 2 Concepts (Chapter 4)


AP Statistics – Unit 2 Concepts (Chapter 4)

Baseline Topics: (must show mastery in order to receive a ‘3’ or above)

I can distinguish between a census and a sample.

I can identify a systematic sample. Textbook Problem #26 p. 228

I can identify a cluster or multi-stage sample and distinguish them from simple random or stratified samples. Strive p. 77 Q#4

I can distinguish a simple random sample from a stratified random sample. Strive p. 77-78 Q#4 & Q#6

I can identify the experimental units or subjects, explanatory variables (factors), treatments, and response variables in an experiment. Strive p. 70-71

I can know when a matched pairs experimental design is appropriate and how to implement such a design. Strive p. 77 Q#3

MULTIPLE CHOICE: All of us nonsmokers can rejoice: the mosaic tobacco virus that affects and injures tobacco plants is spreading! Meanwhile, a tobacco company is investigating if a new treatment is effective in reducing the damage caused by the virus. Eleven plants were randomly chosen. On each plant, one leaf was randomly selected, and one half of the leaf (randomly chosen) was coated with the treatment, while the other half was left untouched (control). After two weeks, the amount of damage to each half of the leaf was assessed. For purposes of comparing the damage, which of the following is the appropriate type of procedure?
A. Two-proportion z procedures
B. Two-sample z procedures
C. Matched pairs t procedures
D. Two-proportion t procedures
E. Two-sample t procedures

I can distinguish between a completely randomized design and a randomized block design. Strive p. 78 Q#7

I can identify the population and sample in a sample survey. Strive p. 66

I can identify voluntary response samples and convenience samples.
o In a voluntary response sample (VRS), participants are ‘self-selected’; they are in no way randomly selected.
o A convenience sample may look ‘somewhat random’, but it is not random: the participants are simply the easiest individuals to get to be a part of the sample.

I can explain how these bad sampling methods can lead to bias.
o Voluntary response and convenience samples are both considered ‘bad.’

Textbook p. 209-211


I can describe how to use Table D to select a simple random sample (SRS).
o If you are selecting among NINE individuals, each person receives a 1 DIGIT label.
o If you are selecting among TEN individuals, each person receives a 1 DIGIT label (0-9).
o If you are selecting among ELEVEN individuals, each person receives a 2 DIGIT label.
o If you are selecting among ONE HUNDRED individuals, each person receives a 2 DIGIT label (00-99).
o If you are selecting among ONE HUNDRED ONE (101) individuals, each person receives a 3 DIGIT label.

Strive p. 66
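The labeling rule above can be sketched in a few lines of Python. This is only an illustration: the function names and the line of digits are made up, and the sketch assumes labels start at 0 (which is what lets ten individuals share 1-digit labels). Some textbook problems start labels at 01 instead, in which case the range check would change.

```python
def label_width(population_size: int) -> int:
    """Digits per label when labels run from 0 to population_size - 1 (assumption)."""
    return len(str(population_size - 1))


def select_from_digits(digits: str, population_size: int, n: int) -> list[str]:
    """Read fixed-width labels left to right, skipping unused labels and repeats."""
    width = label_width(population_size)
    digits = digits.replace(" ", "")
    chosen: list[str] = []
    for i in range(0, len(digits) - width + 1, width):
        label = digits[i:i + width]
        # Keep a label only if it belongs to someone and has not been picked yet.
        if int(label) < population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen


print([label_width(n) for n in (9, 10, 11, 100, 101)])  # -> [1, 1, 2, 2, 3]

# Hypothetical line of random digits, not taken from Table D.
print(select_from_digits("12074 51229 03318", population_size=30, n=4))
# -> ['12', '07', '29', '03']  (45 is out of range, the second 12 is a repeat)
```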

I can give advantages and disadvantages of each sampling method.

I can explain how undercoverage, nonresponse, and question wording can lead to bias in a sample survey.

Strive p. 67; Textbook p. 221-224

I can distinguish between an observational study and an experiment.
o In an observational study, NO treatments are imposed. You cannot determine ‘cause and effect’ unless the study is an experiment.

Strive p. 69-70

I can explain how a lurking variable in an observational study can lead to confounding.

Strive p. 70-71

I can describe a completely randomized design for an experiment. Textbook p. 238

I can explain why random assignment is an important experimental design principle.
o It helps to reduce bias by balancing the effects of lurking variables across the treatment groups.

Strive p. 77 Q#1

I can describe how to avoid the placebo effect in an experiment. Strive p. 71; Textbook p. 242-243

I can explain the meaning and the purpose of blinding in an experiment. Strive p. 79 Q#10; Textbook p. 244

I can explain in context what “statistically significant” means.

Strive p. 70; Textbook p. 244

I can determine the scope of inference for a statistical study. Strive p. 73

I can evaluate whether a statistical study has been carried out in an ethical manner.


AP Statistics – Exam Review
Data Collection CLIFF NOTES/Overview of Unit 2 (Chapter 4)

Some Key Vocabulary:

Voluntary response sample -- Know what a voluntary response sample is, be able to give examples of them, and understand why they are unreliable.

Population vs. sample -- Have a clear understanding of what the difference is between a population and a sample.

Sample bias -- Understand what sample bias is.

Table of random digits

Observational studies vs. experiments -- Have a clear understanding of the difference between the two.

Placebo effect

Major Concepts to be mastered: Sampling Designs and Problems with Sampling

#1. Understand the difference between the following sampling designs:

Simple Random Sample (SRS)
o Often uses a table of random digits, but doesn’t HAVE to.
o Each member of the population has an equal chance/probability of being selected.
o It does not involve breaking the population into any kind of ‘group’.

Note about Sampling and SRS: Know what is required for a sample to be a simple random sample (SRS).
o A simple random sample of size ‘n’ consists of ‘n’ individuals from the population chosen in such a way that every set of ‘n’ individuals has an equal chance to be the sample actually selected.
o The easiest way to carry this out is by placing names in a hat (population) and drawing them out (sample).
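As a rough illustration of the "names in a hat" idea, the sketch below lists every possible sample of size 2 from a tiny, made-up population and then draws one with Python's `random.sample`. The names and the seed are assumptions chosen only for the example.

```python
import random
from itertools import combinations

population = ["Ana", "Ben", "Cal", "Dee", "Eli"]  # hypothetical names in the hat

# Every possible sample of size 2; an SRS gives each one the same chance.
possible_samples = list(combinations(population, 2))
chance_of_each = 1 / len(possible_samples)

rng = random.Random(2021)          # seeded only so the draw can be repeated
srs = rng.sample(population, k=2)  # draw 2 names without replacement

print(len(possible_samples), chance_of_each)  # -> 10 0.1
print(srs)
```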

Stratified Sample
o The population of interest is broken up into different groups called ‘strata’.
o ‘Strata’ – Groups made up of similar individuals. (NOT to be confused with BLOCKING)
o An SRS is done within each stratum.
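A minimal sketch of stratified sampling, assuming the strata are already known: take a separate SRS inside each stratum. The function name, the grade-level strata, and the sample size per stratum are all hypothetical.

```python
import random


def stratified_sample(strata: dict[str, list[str]], n_per_stratum: int,
                      seed: int = 0) -> dict[str, list[str]]:
    """Take an SRS of n_per_stratum individuals from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical strata: students grouped by grade level.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],
    "seniors": [f"S{i}" for i in range(1, 31)],
}
picked = stratified_sample(strata, n_per_stratum=5)
print(picked)
```

Every stratum is guaranteed to appear in the sample, which is the point of stratifying.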

Cluster Sample
o These are typically done in a large-scale setting.
o The population is broken into groups/clusters that are not necessarily made up of similar individuals.
o A ‘cluster’ should be a group that mirrors the population at large, rather than being made up of similar individuals as in a stratified sample.
o In cluster sampling, you randomly select a GROUP/CLUSTER; then within each selected CLUSTER, you conduct a CENSUS.

Multistage Sample
o Very similar to a cluster sample, except: you randomly select a GROUP; then within each selected group, you conduct an SRS (NOT a census).

Systematic Sample
o The first individual is randomly selected and from there, you select every ‘kth’ individual after that.

Census
o Not a sampling method, because EVERY person in the population is selected and surveyed.
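The differences among cluster, multistage, and systematic sampling may be easier to see side by side. The sketch below uses hypothetical function names and made-up data (eight boxes of five items each). It is one way to express the ideas, not a prescribed procedure.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then take EVERYONE in them (a census)."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then take an SRS inside each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen
            for unit in rng.sample(clusters[name], n_per_cluster)]


def systematic_sample(units, k, rng):
    """Random starting point among the first k units, then every kth unit."""
    start = rng.randrange(k)
    return units[start::k]


rng = random.Random(1)
# Hypothetical population: 8 boxes with 5 items in each box.
boxes = {f"box{b}": [f"box{b}-item{i}" for i in range(5)] for b in range(8)}

print(len(cluster_sample(boxes, 2, rng)))                 # -> 10
print(len(multistage_sample(boxes, 2, 3, rng)))           # -> 6
print(len(systematic_sample(list(range(100)), 10, rng)))  # -> 10
```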


Example: Consider a class of six boys and six girls. I want to randomly pick a committee of two students from this group. I decide to flip a coin.
o If "heads," I will choose two girls by a random process.
o If "tails," I will choose two boys by a random process.
o Now, each student has an equal probability (1/6) of being chosen for the committee. However, the two students are not an SRS of size two picked from members of the class. Why not? Because this selection process does not allow for a committee consisting of one boy and one girl.
o To have an SRS of size two from the class, each group of two students would have to have an equal probability of being chosen as the committee.
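The committee example can be checked with exact fractions. The sketch below assumes that the "random process" inside each sex makes all same-sex pairs equally likely; the student labels (G1, B1, and so on) are made up.

```python
from fractions import Fraction
from itertools import combinations

girls = [f"G{i}" for i in range(1, 7)]
boys = [f"B{i}" for i in range(1, 7)]

# Coin-flip scheme: 1/2 for each sex, then each same-sex pair equally likely.
pair_prob = {}
for group in (girls, boys):
    same_sex_pairs = list(combinations(group, 2))  # 15 pairs per group
    for pair in same_sex_pairs:
        pair_prob[pair] = Fraction(1, 2) * Fraction(1, len(same_sex_pairs))


def inclusion_probability(student: str) -> Fraction:
    """Chance that one particular student ends up on the committee."""
    return sum(p for pair, p in pair_prob.items() if student in pair)


mixed_pair_probability = pair_prob.get(("G1", "B1"), Fraction(0))
srs_pair_probability = Fraction(1, len(list(combinations(girls + boys, 2))))

print(inclusion_probability("G1"))  # -> 1/6, the same for every student
print(mixed_pair_probability)       # -> 0, a boy-girl committee can never occur
print(srs_pair_probability)         # -> 1/66, what a true SRS gives every pair
```

Equal chances for individuals are therefore not enough; an SRS needs equal chances for every possible group.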

SRS refers to how you obtain your sample; random allocation is what you use in an experiment to assign subjects to treatment groups. They are not synonyms.

Understand why undercoverage, nonresponse, poorly worded questions, and response bias cause problems in collecting a sample.

Know the difference between:
i. Sampling errors – Occur during the actual process of choosing a sample. (Undercoverage)
ii. Nonsampling errors – Occur AFTER the actual process of choosing the sample. (Nonresponse, poorly worded questions, and response bias)

Experimental Design

Understand the following terms regarding experimental design:

Note: Well-designed experiments satisfy the principles of:
o Control
o Randomization
o Replication

#1. Experimental units & subjects
o These are the individuals or objects the experiment is performed upon.
o When these individuals are human, they are called ‘subjects.’

#2. Treatments
o There should be both a control and experimental group involved.
o There can be more than ONE experimental group.
o The control group may often ‘do nothing’.
o There must be a control group in order to do a full and adequate comparison after the experiment is finished.

#3. Levels
o Levels are the specific values of a factor; they can be set in a variety of different ways (temperature, weight, etc.; the possibilities are endless).

#4. Factors
o These are the EXPLANATORY variables (x-variables).


#5. Control Group -- Why is using a control group necessary when conducting an experiment?
o Control for the effects of lurking variables by comparing several treatments in the same environment.
o Note: Control is not synonymous with "control group."

#6. Experimental Group
o There can be more than one experimental group.

#7. Randomization -- What is randomization and why is it important?
o Randomization refers to the random allocation of subjects to treatment groups, and not to the selection of subjects for the experiment.
o Randomization is an attempt to "even out" the effects of lurking variables across the treatment groups.
o Randomization helps avoid bias.
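Random allocation can be sketched as "shuffle the subjects, then deal them out to the treatments." The function name, the subject labels, and the two treatment names below are assumptions for illustration only.

```python
import random


def randomly_assign(subjects, treatments, seed=0):
    """Shuffle the subjects, then deal them out to the treatment groups in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups


subjects = [f"subject{i}" for i in range(1, 21)]  # hypothetical volunteers
groups = randomly_assign(subjects, ["new treatment", "control"])
print({name: len(members) for name, members in groups.items()})
# -> {'new treatment': 10, 'control': 10}
```

Because chance alone decides who lands in which group, lurking variables such as age or health tend to be spread evenly across the groups.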

#8. What does ‘statistically significant’ mean?
o This term is used VERY frequently in the scientific community when the results of an experiment are summarized.
o It means: “an observed effect so large that it would RARELY occur by chance.”
o Think: p-value.
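One way to make "would rarely occur by chance" concrete is a randomization (shuffling) test: reshuffle the responses many times and count how often chance alone produces a difference at least as large as the one observed. The sketch below is only an illustration; the function name and the response values are invented.

```python
import random
from statistics import mean


def randomization_p_value(treatment, control, reps=10_000, seed=0):
    """Estimate how often a random reshuffle gives a difference >= the observed one."""
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    at_least_as_large = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # pretend the treatment labels were assigned purely by chance
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / reps


# Invented responses for two groups of five subjects each.
p = randomization_p_value([12, 14, 15, 16, 18], [5, 6, 7, 8, 9])
print(p)  # a small value here means the observed difference would rarely occur by chance
```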

Important Thing to Understand: Distinguish the language of surveys from the language of experiments.
o Stratifying : sampling :: Blocking : experiment
o It is not enough to memorize the terminology related to surveys, observational studies, and experiments. You must be able to apply the terminology in context.

Types of Designs for Experiments

#1. What is a double-blind study?
o An experiment is double-blind if neither the subjects nor the experimenters know who is receiving what treatment.
o A third party can keep track of this information.

#2. Describe a block design
o A block is a group of units/subjects that are known BEFORE the experiment to be similar in some way that is expected to affect the response to the treatments.
o Blocking refers to a deliberate grouping of subjects in an experiment based on a characteristic (such as gender, cholesterol level, race, or age) that you suspect will affect responses to treatments in a systematic way.
o After blocking, you should randomly assign subjects to treatments within the blocks.
o Blocking reduces unwanted variability.
o NO ONE is ever randomly assigned to a BLOCK!
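A rough sketch of "block first, then randomize within each block" follows. The function name, the age-based blocks, and the treatment names are hypothetical. Note that block membership comes from a known characteristic of each subject, never from chance.

```python
import random


def block_then_randomize(subjects, block_of, treatments, seed=0):
    """Group subjects by a known trait, then randomly assign treatments within each block."""
    rng = random.Random(seed)
    blocks = {}
    for s in subjects:
        blocks.setdefault(block_of(s), []).append(s)  # the block is a known trait, not random
    assignment = {}
    for members in blocks.values():
        shuffled = list(members)
        rng.shuffle(shuffled)  # chance enters only here, inside the block
        for i, s in enumerate(shuffled):
            assignment[s] = treatments[i % len(treatments)]
    return assignment


# Hypothetical subjects: (name, age group), with age group as the blocking variable.
subjects = [(f"S{i}", "under 40" if i <= 6 else "40 and over") for i in range(1, 11)]
plan = block_then_randomize(subjects, block_of=lambda s: s[1],
                            treatments=["new drug", "placebo"])
for subject, treatment in plan.items():
    print(subject, "->", treatment)
```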

#3. Be able to describe a matched pairs design
o It is a block design in which each block has size 2.
o It can be done where two very similar individuals/subjects each get a different treatment and the results are compared afterwards, OR where ONE individual completes both treatments and the results of each treatment are compared at the end.
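For the two-similar-subjects version of matched pairs, the random step is deciding which member of each pair gets which treatment. The sketch below assumes the pairs have already been formed; the function name, the twin labels, and the treatment names "A" and "B" are made up.

```python
import random


def assign_within_pairs(pairs, treatments=("A", "B"), seed=0):
    """For each matched pair, a simulated coin flip decides who gets which treatment."""
    rng = random.Random(seed)
    plan = []
    for first, second in pairs:
        if rng.random() < 0.5:  # "heads": first member gets treatment A
            plan.append({first: treatments[0], second: treatments[1]})
        else:                   # "tails": first member gets treatment B
            plan.append({first: treatments[1], second: treatments[0]})
    return plan


# Hypothetical pairs of similar subjects.
pairs = [("twin1a", "twin1b"), ("twin2a", "twin2b"), ("twin3a", "twin3b")]
for pair_plan in assign_within_pairs(pairs):
    print(pair_plan)
```

When ONE subject receives both treatments, the same coin flip can decide the order of the two treatments instead.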


#4. Describe a completely randomized design:
o The treatments are assigned to all the experimental units/subjects completely by chance.
o It does NOT require that each treatment be assigned to an equal number of experimental units, only that they are assigned at random.
o This type of design can compare any number of treatments.

#5. What is a simulation?

This is an ‘imitation’ of chance/random behavior based on a model that accurately reflects the situation.
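A small simulation sketch, assuming a fair coin as the model: estimate the chance of getting at least 8 heads in 10 flips by imitating the flips many times. The function names, the number of repetitions, and the seed are arbitrary choices for the example.

```python
import random


def simulate_proportion(trial, reps=10_000, seed=0):
    """Run the chance process `reps` times and report how often `trial` succeeds."""
    rng = random.Random(seed)
    return sum(trial(rng) for _ in range(reps)) / reps


def at_least_8_heads(rng):
    """One repetition: flip a fair coin 10 times and check for 8 or more heads."""
    flips = [rng.random() < 0.5 for _ in range(10)]
    return sum(flips) >= 8


estimate = simulate_proportion(at_least_8_heads)
exact = 56 / 1024  # (45 + 10 + 1) of the 1024 equally likely outcomes
print(estimate, exact)  # the estimate should land close to the exact value
```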

#6. What does “confounding” mean?
o Confounding occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other.

EXAMPLE: Suppose that subjects in an observational study who eat an apple a day get significantly fewer cavities than subjects who eat less than one apple each week.
o A possible confounding variable is overall diet. Members of the apple-a-day group may tend to eat fewer sweets, while those in the non-apple-eating group may turn to sweets as a regular alternative.
o Since different diets could contribute to the disparity in cavities between the two groups, we cannot say that eating an apple each day causes a reduction in cavities.

#7. What is replication?
o Replication means using a large enough number of subjects to reduce chance variation in a study.
o Replication helps to reduce variability.
o Note: In science, replication often means, "do the experiment again."

#8. Cause & effect (causation) – can only be concluded through well-designed experiments.


AP Statistics Exam Review – MULTIPLE CHOICE PRACTICE part I
Chapter 4 AP Statistics Practice Multiple Choice Test

Learning Targets:

#T4.1. I can understand how a census differs from a survey.
When we take a census, we attempt to collect data from
a. A stratified random sample.
b. Every individual chosen in a simple random sample.
c. Every individual in the population.
d. A voluntary response sample.
e. A convenience sample.

#T4.2. I can describe how to use Table D to select a simple random sample (SRS).
We wish to draw a sample of 5 without replacement from a population of 50 households. Suppose the households are numbered 01, 02, . . . , 50, and suppose that the relevant line of the random number table is

11362 35692 96237 90842 46843 62719 64049 17823

Then the households selected are
a. households 11 13 36 62 73
b. households 11 36 23 08 42
c. households 11 36 23 23 08
d. households 11 36 23 56 92
e. households 11 35 96 90 46

#T4.3. I can distinguish between an observational study and an experiment.
A study of treatments for angina (pain due to low blood supply to the heart) compared bypass surgery, angioplasty, and use of drugs. The study looked at the medical records of thousands of angina patients whose doctors had chosen one of these treatments. It found that the average survival time of patients given drugs was the highest. What do you conclude?
a. This study proves that drugs prolong life and should be the treatment of choice.
b. We can conclude that drugs prolong life because the study was a comparative experiment.
c. We can’t conclude that drugs prolong life because the patients were volunteers.
d. We can’t conclude that drugs prolong life because this was an observational study.
e. We can’t conclude that drugs prolong life because no placebo was used.

#T4.4. I can distinguish a simple random sample from a stratified random sample.
A simple random sample (SRS) is
a. Any sample selected by using chance.
b. Any sample that gives every individual the same chance to be selected.
c. A sample that gives every possible sample of the same size the same chance to be selected.
d. A sample that doesn’t involve strata or clusters.
e. A sample that is guaranteed to be representative of the population.


#T4.5. I can explain why random assignment is an important experimental design principle. Consider an experiment to investigate the effectiveness of different insecticides in controlling pests and their impact on the productivity of tomato plants. What is the best reason for randomly assigning treatment levels (spraying or not spraying) to the experimental units (farms)?

a. Random assignment makes the experiment easier to conduct since we can apply the insecticide in any pattern rather than in a systematic fashion.

b. Random assignment will tend to average out all other uncontrolled factors such as soil fertility so that they are not confounded with the treatment effects.

c. Random assignment makes the analysis easier since the data can be collected and entered into the computer in any order.

d. Random assignment is required by statistical consultants before they will help you analyze the experiment.

e. Random assignment implies that it is not necessary to be careful during the experiment, during data collection, and during data analysis.

#T4.6. I can distinguish between an observational study and an experiment.
The most important advantage of experiments over observational studies is that
a. Experiments are usually easier to carry out.
b. Experiments can give better evidence of causation.
c. Confounding cannot happen in experiments.
d. An observational study cannot have a response variable.
e. Observational studies cannot use random samples.

#T4.7. I can identify a cluster or multi-stage sample and distinguish them from simple random or stratified samples. I can distinguish a simple random sample from a stratified random sample. I can identify voluntary response samples and convenience samples.

A TV station wishes to obtain information on the TV viewing habits in its market area. The market area contains one city of population 170,000, another city of 70,000, and four towns of about 5000 inhabitants each. The station suspects that the viewing habits may be different in larger and smaller cities and in the rural areas. Which of the following sampling designs would give the type of information that the station requires?
a. A cluster sample using the cities and towns as clusters.
b. A convenience sample from the market area.
c. A simple random sample from the whole market area.
d. A stratified sample from the cities and towns in the market area.
e. An online poll that invites all people from the cities and towns in the market area to participate.


#T4.8. I can explain how these bad sampling methods can lead to bias.
Bias in a sampling method is
a. Any error in the sample result, that is, any deviation of the sample result from the truth about the population.
b. The random error due to using chance to select a sample.
c. Any error due to practical difficulties such as contacting the subjects selected.
d. Any systematic error that tends to occur in the same direction whenever you use this sampling method.
e. Racism or sexism on the part of those who take the sample.

#T4.9. I can describe a completely randomized design for an experiment. I can distinguish between a completely randomized design and a randomized block design. I can know when a matched pairs experimental design is appropriate and how to implement such a design. I can identify the experimental units or subjects, explanatory variables (factors), treatments, and response variables in an experiment.

You wonder if TV ads are more effective when they are longer or repeated more often or both. So, you design an experiment. You prepare 30 second and 60 second ads for a camera. Your subjects all watch the same TV program, but you assign them at random to four groups. One group sees the 30 second ad once during the program; another sees it three times; the third group sees the 60 second ad once; and the last group sees the 60 second ad three times. You ask all subjects how likely they are to buy the camera.
a. This is a randomized block design, but not a matched pairs design.
b. This is a matched pairs design.
c. This is a completely randomized design with one explanatory variable (factor).
d. This is a completely randomized design with two explanatory variables (factors).
e. This is a completely randomized design with four explanatory variables (factors).

#T4.10. I can explain why random assignment is an important experimental design principle. I can describe a completely randomized design for an experiment.

A researcher wishes to compare the effects of 2 fertilizers on the yield of soybeans. She has 20 plots of land available for the experiment, and she decides to use a matched pairs design with 10 pairs of plots. To carry out the random assignment for this design, the researcher should

a. Use a table of random numbers to divide the 20 plots into 10 pairs and then, for each pair, flip a coin to assign the fertilizers to the 2 plots.

b. Subjectively divide the 20 plots into 10 pairs (making the plots within a pair as similar as possible) and then, for each pair, flip a coin to assign the fertilizers to the 2 plots.

c. Use a table of random numbers to divide the 20 plots into 10 pairs and then use the table of random numbers a second time to decide upon the fertilizer to be applied to each member of the pair.

d. Flip a coin to divide the 20 plots into 10 pairs and then, for each pair, use a table of random numbers to assign the fertilizers to the 2 plots.

e. Use a table of random numbers to assign the 2 fertilizers to the 20 plots and then use the table of random numbers a second time to place the plots into 10 pairs.


#T4.11. I can explain how these bad sampling methods can lead to bias. I can explain how undercoverage, nonresponse, and question wording can lead to bias in a sample survey.

You want to know the opinions of American high school teachers on the issue of establishing a national proficiency test as a prerequisite for graduation from high school. You obtain a list of all high school teachers belonging to the National Education Association (the country’s largest teachers’ union) and mail a survey to a random sample of 2500 teachers. In all, 1347 of the teachers return the survey. Of those who responded, 32% say that they favor some kind of national proficiency test. Which of the following statements about this situation is true?

a. Since random sampling was used, we can feel confident that the percent of all American high school teachers who would say they favor a national proficiency test is close to 32%.

b. We cannot trust these results, because the survey was mailed. Only survey results from face to face interviews are considered valid.

c. Because only half of those who were mailed the survey actually responded, we can feel pretty confident that the actual percent of all American high school teachers who would say they favor a national proficiency test is close to 32%.

d. The results of this survey may be affected by nonresponse bias.

e. The results of this survey cannot be trusted due to voluntary response bias.

#T4.12. I can distinguish between a completely randomized design and a randomized block design. I can describe a completely randomized design for an experiment. I can distinguish between an observational study and an experiment.

The hotel manager of a small resort in Florida wants to compare the effectiveness of two laundry detergents, Detergent #1 and Detergent #2, in cleaning the bed linens used in the rooms at the resort. As the dirty linens are brought to the hotel laundry facility, they are placed into the only washing machine the hotel uses. When the washing machine contains 10 bed sheets, the manager tosses a fair coin to determine whether Detergent #1 or Detergent #2 will be used for that load of bed sheets. The cleanliness of the load of bed sheets/linens is rated on a scale of 1 to 5 by an employee who is not aware of which detergent has been used. The hotel manager continues this experiment each day for approximately a month. Which of the following best describes the manager’s study?
a. A stratified experimental design, with the type of detergent being the strata.
b. A randomized block design with Detergent #1 and Detergent #2 as blocks.
c. A randomized block design with the washing machine as the block.
d. A completely randomized design.
e. A matched pairs design with Detergent #1 and Detergent #2 as the pair.


AP Statistics Exam Review – MULTIPLE CHOICE PRACTICE part II

CONCEPT: I can identify a systematic sample. I can identify a cluster or multi-stage sample and distinguish them from simple random or stratified samples. I can distinguish a simple random sample from a stratified random sample.

#1. The purchasing agent for a computer retail store wants to estimate the proportion of defective wireless keyboards in a shipment of 20,000 keyboards from the store’s primary supply chain. The shipment consists of 400 containers each having 50 keyboards inside. The purchasing agent numbers the containers from 1 to 400 and randomly selects fifteen numbers in that range. He then opens the fifteen containers with the corresponding numbers, examines all 50 keyboards in each of these containers, and determines the proportion of the 750 keyboards that are defective. What type of sample is this?
a. Systematic sample
b. Multi-stage sample
c. Simple random sample (SRS)
d. Stratified random sample
e. Cluster random sample

CONCEPT: I can interpret a correlation value, as well as the value of the coefficient of determination (r-squared) in the context of a given problem.

#2. A high school physics teacher was conducting an experiment with his class on the length of time it will take a marble to roll down a sloped chute. The class ran repeated trials in order to determine the relationship between the length, in centimeters, of the sloped chute and the time, in seconds, for the marble to roll down the chute. A linear relationship was observed and the correlation coefficient was 0.964. After discussing their results, the teacher instructed the students to convert all of the length measurements to meters but leave the time in seconds. What effect will this have on the correlation of the two variables?

a. Because the standard deviation of the lengths in meters will be one hundredth of the standard deviation of the lengths in centimeters, the correlation will decrease by one hundredth to 0.954.

b. Because the standard deviation of the lengths in meters will be only one hundredth of the standard deviation of the lengths in centimeters, the correlation will decrease proportionally to 0.00964.

c. Because changing from centimeters to meters does not affect the value of the correlation, the correlation will remain 0.964

d. Because only the length measurements have been changed, the correlation will decrease substantially.

e. Because meters are a much more common measurement for length in determining speed, the linear relationship of the data will be stronger and thus the correlation will increase substantially.


CONCEPT: I can distinguish between a completely randomized design and a randomized block design.

#3. A dog food company wishes to test a new high-protein formula for puppy food to determine whether it promotes faster weight gain than the existing formula for that puppy food. Puppies participating in an experiment will be weighed at weaning (when they begin to eat puppy food) and will be weighed at one-month intervals for one year. In designing this experiment, the investigators wish to reduce the variability due to natural differences in puppy growth rates. Which of the following strategies is most appropriate for accomplishing this?

a. Block on dog breed and randomly assign puppies to existing and new formula groups within each breed.

b. Block on geographic location and randomly assign puppies to existing and new formula groups within each geographic area.

c. Stratify on dog breed and randomly sample puppies within each breed. Then assign puppies by breed to either the existing or the new formula.

d. Stratify on geographic location of the puppies and randomly sample puppies within each geographic area. Then assign puppies by geographic area to either the existing or the new formula.

e. Stratify on gender and randomly sample puppies within gender groups. Then assign puppies by gender to either the existing or the new formula.

CONCEPT: I can explain how a lurking variable in an observational study can lead to confounding.

#4. A researcher observes that, on average, the number of divorces in cities with Major League Baseball teams is larger than in cities without Major League Baseball teams. The most plausible explanation for this observed association is that the

a. presence of a Major League Baseball team causes the number of divorces to rise (perhaps husbands are spending too much time at the ballpark).

b. high number of divorces is responsible for the presence of Major League Baseball teams (more single men means potentially more fans at the ballpark, making it attractive for an owner to relocate to such cities).

c. association is due to the presence of a lurking variable (Major League teams tend to be in large cities with more people, hence a greater number of divorces).

d. association makes no sense, since many married couples go to the ballpark together.

e. observed association is purely coincidental. It is implausible to believe the observed association could be anything other than accidental.

CONCEPT: I can explain why random assignment is an important experimental design principle.

#5. Control groups are used in experiments in order to
a. Control the effects of outside variables on the outcome.
b. Control the subjects of a study to ensure that all participate equally.
c. Guarantee that someone other than the investigators, who have a vested interest in the outcome, controls how the experiment is conducted.
d. Achieve a proper and uniform level of randomization.
e. Reduce the variability in results.


CONCEPT: I can distinguish between an observational study and an experiment.

#6. What electrical changes occur in muscles as they get tired? Student subjects are instructed to hold their arms above their shoulders as long as they can. Meanwhile, the electrical activity in their arm muscles is measured. This is
a. An observational study.
b. An uncontrolled experiment.
c. A randomized comparative experiment.
d. A matched pairs design.
e. Impossible to describe unless more details of the study are provided.

CONCEPT: I can describe how to use Table D to select a simple random sample (SRS).

#7. The following numbers appear in a table of random digits: 38683 50279 38224 09844 13578 28251 12708 24684

A scientist will be measuring the total amount of leaf litter in a random sample (n = 5) of forest sites selected without replacement from a population of 45 sites. The sites are labeled 01, 02, . . . , 45 and she starts at the beginning of the line of random digits and takes consecutive pairs of digits. Which of the following is correct?

a. Her sample is 38, 25, 02, 38, 22

b. Her sample is 38, 68, 35, 02, 22

c. Her sample is 38, 35, 27, 28, 08

d. Her sample is 38, 65, 35, 02, 79

e. Her sample is 38, 35, 02, 22, 40
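The Table D procedure in #7 can be sketched in a few lines of Python. This is only an illustration of the usual rule (read consecutive two-digit labels, skip labels outside 01 to 45, skip repeats); the function name srs_from_digits and its parameters are made up for this sketch and are not part of the textbook.

```python
def srs_from_digits(digits, population_size, n, width=2):
    """Pick n distinct labels by reading consecutive `width`-digit groups.

    Labels outside 1..population_size and repeated labels are skipped,
    which is how sampling without replacement is handled with a digits table.
    """
    # Drop the spaces that separate the five-digit groups in the table.
    stream = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        label = int(stream[i:i + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == n:
                break
    return chosen


# The line of random digits given in question #7.
line = "38683 50279 38224 09844 13578 28251 12708 24684"
sample = srs_from_digits(line, population_size=45, n=5)
print(sample)
```

Running it is one way to check your answer after working the problem by hand.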

CONCEPT: I can distinguish between different sampling methods.

#8. An airline that wants to assess customer satisfaction chooses a random sample of 10 of its flights during a single month and asks all of the passengers on those flights to fill out a survey. This is an example of a

a. multistage sample.

b. stratified sample.

c. cluster sample.

d. simple random sample.

e. convenience sample.

CONCEPT: I can distinguish between different sampling methods.

#9. A maple sugar manufacturer wants to estimate the average trunk diameter of Sugar Maple trees in a large forest. There are too many trees to list them all and take an SRS, so he divides the forest into several hundred 10 meter by 10 meter plots, selects 25 plots at random, and measures the diameter of every Sugar Maple in each one. This is an example of a

a. multistage sample.

b. stratified sample.

c. simple random sample.

d. cluster sample.

e. convenience sample.

CONCEPT: I can distinguish between experimental designs.

#10. To test the effects of a new fertilizer, 100 plots were divided in half. Fertilizer A is randomly applied to one half, and B to the other. This is

a. an observational study.

b. a matched pairs experiment.

c. a completely randomized experiment.

d. a block design, but not a matched pairs experiment.

e. impossible to classify unless more details of the study are provided.

CONCEPT: I can distinguish between experimental designs and sampling methods.

#11. You work for an advertising agency that is preparing a new television commercial to appeal to women. You have been asked to design an experiment to compare the effectiveness of three versions of the commercial. Each subject will be shown one of the three versions and then asked her attitude toward the product. You think there may be large differences between women who are employed outside the home and those who are not. Because of these differences, you should use

a. a completely randomized design.

b. a categorical variable.

c. a block design.

d. a stratified design.

e. a multistage sample.

CONCEPTS: I can distinguish between a completely randomized design and a randomized block design. I can know when a matched pairs experimental design is appropriate and how to implement such a design. I can identify the experimental units or subjects, explanatory variables (factors), treatments, and response variables in an experiment.

#12. A materials engineer wishes to compare the durability of two different types of paving material. She has 40 different one-mile stretches of interstate highway that she’s been authorized to repave for this study. She decides to carry out a matched pairs experiment. Which of the following is the best way for her to carry out the randomization for this study?

a. Use a table of random digits to divide the 40 roadways into 20 pairs and then, for each pair, flip a coin to decide which pavement to use on which member of the pair.

b. Subjectively divide the 40 roadways into 20 pairs (making the roadways within each pair as different as possible) and then, for each pair, flip a coin to decide which pavement to use on which member of the pair.

c. Use a table of random digits to divide the 40 roadways into two groups of twenty, and then use the table of random digits a second time to decide which pavement to use on which group.

d. Let each of the 40 roadways act as its own pair, dividing each roadway into the first half-mile and the second half-mile. Flip a coin for each of the 40 roadways to decide which half-mile gets which pavement.

e. Let each of the 40 roadways act as its own pair, dividing each roadway into the first half-mile and the second half-mile. Flip a coin once to decide which pavement is put on the first half-mile of all the roadways.

AP Statistics Unit 2 – Free Response PRACTICE

#1. Your school will send a delegation of 35 seniors to a student life convention. 200 girls and 150 boys are eligible to be chosen. If a sample of 20 girls and a separate sample of 15 boys are each selected randomly, each senior has the same chance of being chosen to attend the convention.

(a) Is it an SRS? Explain.

(b) Beginning at line 108 in the random digits table, reproduced below, select the first three senior girls to be in the sample. Explain your procedures clearly.

108 60940 72024 17868 24943 61790 90656 87964 18883

109 36009 19365 15412 39638 85453 46816 83485 41979

110 38448 48789 18338 24697 39364 42006 76688 08708
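The selection plan in #1 (20 of the 200 girls and, separately, 15 of the 150 boys) can be sketched in Python as a quick check on the "same chance" claim. This is a minimal sketch: it uses Python's random module in place of the random digits table, and the name stratified_sample, the numbering of the students, and the seed are assumptions made for illustration only.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a separate random sample of the requested size from each stratum."""
    rng = random.Random(seed)
    return {
        name: sorted(rng.sample(members, sizes[name]))
        for name, members in strata.items()
    }


# Hypothetical labels: girls numbered 1-200, boys numbered 1-150.
strata = {"girls": list(range(1, 201)), "boys": list(range(1, 151))}
sizes = {"girls": 20, "boys": 15}

# Chance that any one senior is chosen, computed stratum by stratum.
chance = {name: sizes[name] / len(members) for name, members in strata.items()}
print(chance)

delegation = stratified_sample(strata, sizes, seed=1)
print(len(delegation["girls"]), len(delegation["boys"]))
```

Note that the split between girls and boys is fixed by design on every run, which is worth keeping in mind for part (a).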

CONCEPT: I can distinguish between an observational study and an experiment. I can explain how a lurking variable in an observational study can lead to confounding. I can determine the scope of inference for a statistical study. I can explain in context what “statistically significant” means.

#2. FREE RESPONSE QUESTION: (from 1999 AP Statistics Exam #3) The dentists in a dental clinic would like to determine if there is a difference between the number of new cavities in people who eat an apple a day and in people who eat less than one apple a week. They are going to conduct a study with 50 people in each group. Fifty clinic patients who report that they routinely eat an apple a day and 50 clinic patients who report that they eat less than one apple a week will be identified. The dentists will examine the patients and their records to determine the number of new cavities the patients have had over the past two years. They will then compare the number of new cavities in the two groups.

a. Why is this an observational study and not an experiment?

b. Explain the concept of confounding in the context of this study. Include an example of a possible confounding variable.

c. If the mean number of new cavities for those who ate an apple a day was statistically significantly smaller than the mean number of new cavities for those who ate less than one apple a week, could one conclude that the lower number of new cavities can be attributed to eating an apple a day? Explain.

#3. Scenario 4-9

As a researcher for a pharmaceutical company, you are designing a study to test the effectiveness of a new treatment for migraine headaches. You have been given a list of 126 people willing to participate in the trial. The first 70 people are female; the remaining 56 are male.

Use Scenario 4-9. Describe a design for this experiment. Be sure to include a description of how you assign individuals to the treatment groups.
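One possible way to carry out the assignment step for Scenario 4-9 is sketched below in Python. It assumes a randomized block design with sex as the blocking variable and two groups of equal size within each block; neither choice is stated in the scenario, and the comparison group (for example a placebo) is also an assumption. The name block_randomize is hypothetical.

```python
import random


def block_randomize(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block.

    Each block is shuffled on its own, then treatments are dealt out in
    rotation so the group sizes inside a block are as equal as possible.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = members[:]  # copy so the original list is left untouched
        rng.shuffle(shuffled)
        for i, person in enumerate(shuffled):
            assignment[person] = (block_name, treatments[i % len(treatments)])
    return assignment


# People are labeled by their position on the list: 1-70 female, 71-126 male.
blocks = {"female": list(range(1, 71)), "male": list(range(71, 127))}
treatments = ["new treatment", "comparison"]  # comparison group is an assumption

assignment = block_randomize(blocks, treatments, seed=42)
for block_name in blocks:
    for t in treatments:
        count = sum(1 for b, g in assignment.values() if b == block_name and g == t)
        print(block_name, t, count)
```

A written answer would still need to explain in words why blocking is used and how the response is compared within each block.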

#4. Scenario 4-7

High blood pressure adds to the workload of the heart and arteries and may increase the risk of heart attacks. If not treated, this condition can also lead to heart failure, kidney failure, or stroke. We wish to test the effectiveness of Angiotensin-converting enzyme (ACE) inhibitors as a treatment for high blood pressure.

Use Scenario 4-7. Assume that 600 men and 500 women suffering from high blood pressure are available for the study. Describe a design for this experiment. Be sure to include a description of how you assign individuals to the treatment groups.

I can understand how a census differs from a survey. What inference procedures would be appropriate to use for data that comes from a census?

I can identify the population and sample in a sample survey. A large random sample of people aged 18 to 49 was taken in the city of Nashville regarding which type of music they liked the most. To what population can the results of this sample survey be safely generalized?

I can identify a cluster or multi-stage sample and distinguish them from simple random or stratified samples. Which sample designs are broken into groups? Which type of sample is described below? “Every possible subset of the population of the desired sample size has an equal chance of being selected.”

I can determine the scope of inference for a statistical study.

2006 AP Statistics Exam Free-Response: A biologist is interested in studying the effect of growth-enhancing nutrients and different salinity (salt) levels in water on the growth of shrimps. The biologist has ordered a large shipment of young tiger shrimps from a supply house for use in the study. The experiment is to be conducted in a laboratory where 10 tiger shrimps are placed randomly into each of 12 similar tanks in a controlled environment. The biologist is planning to use 3 different growth-enhancing nutrients (A, B, and C) and two different salinity levels (low and high).

a. List the treatments that the biologist plans to use in this experiment.

b. What principle of experimental design explains why each treatment goes into TWO tanks, rather than just one?

Scenario 4-10

Does ginkgo improve memory? The law allows marketers of herbs and other natural substances to make health claims that are not supported by evidence. Brands of ginkgo extract claim to “improve memory and concentration.” A randomized comparative experiment found no statistically significant evidence for such effects. The subjects were 230 healthy volunteers over 60 years old. They were randomly assigned to ginkgo or a placebo pill (a dummy pill that looks and tastes the same). All the subjects took a battery of tests for learning and memory before treatment started and again after six weeks.

a. Use Scenario 4-10. The study was double-blind. What does this mean?

b. Use Scenario 4-10. Comment briefly on the extent to which results of this study can be generalized to some larger population, and the extent to which cause and effect has been established.

c. Use Scenario 4-10. Explain why it is advantageous to use 230 volunteers in this study, rather than, say, 30.
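The idea behind part (c) can be explored with a small simulation. The sketch below uses made-up data, not the ginkgo results: both groups are drawn from the same standard normal distribution (so there is no true treatment effect), and the groups are assumed to be split evenly (115 per group for 230 subjects, 15 per group for 30). The function name spread_of_difference is hypothetical.

```python
import random
import statistics


def spread_of_difference(n_per_group, reps=2000, seed=0):
    """Standard deviation of (mean of group A - mean of group B) over many
    simulated experiments in which the treatment has no real effect."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diffs.append(statistics.mean(a) - statistics.mean(b))
    return statistics.stdev(diffs)


small = spread_of_difference(15)    # 30 volunteers in total
large = spread_of_difference(115)   # 230 volunteers in total
print(round(small, 3), round(large, 3))
```

Comparing the two printed values shows how much the chance variation in the difference between group means shrinks as the number of subjects grows, which is the point a written answer to part (c) should make in context.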

A cookie manufacturer is trying to determine how long cookies stay fresh on store shelves, and the extent to which the type of packaging and the store’s temperature influence how long the cookies stay fresh. He designs a completely randomized experiment involving low (64 °F) and high (75 °F) temperatures and two types of packaging: plastic and waxed cardboard. List the following:

a. experimental units

b. factors

c. treatments

I can identify voluntary response samples and convenience samples.

SELF SELECTION = VOLUNTARY RESPONSE SAMPLE!!!

I can explain how these bad sampling methods can lead to bias. What is the main issue with voluntary response samples that makes them inappropriate to draw inferences from?

I can explain how undercoverage, nonresponse, and question wording can lead to bias in a sample survey. Bias is present in each of the following sampling designs. In each case, identify the type of bias involved and state whether you think the sample result obtained is lower or higher than the actual value for the population.

(a) A political pollster seeks information about the proportion of American adults who oppose gun controls. He asks an SRS of 1000 American adults: “Do you agree or disagree with the following statement: Americans should preserve their constitutional right to keep and bear arms.” A total of 910, or 91%, said, “Agree” (that is, 910 out of the 1000 oppose gun controls).

(b) A flour company in Minneapolis wants to know what percent of local households bake at least twice a week. A company representative calls 500 randomly selected households during the daytime and finds that 50% of those who responded bake at least twice a week.

I can explain how these bad sampling methods can lead to bias. What is a way to make sure that response bias is minimized in the creation of a sample survey? What does the use of a random digit dialer in a telephone survey help to avoid?