Mathacle
PSet ----- Stats, Concepts in Statistics, 1st Quarterly Exam
Level ---- 3
Number --- 1
Name: ___________________ Date: _____________
1st Quarterly Exam ~ Sampling, Designs, Exploring Data and Regression
Part 1 – Review
I. SAMPLING
MC I-1.) [APSTATSMC2014-6M] Approximately 52 percent of all recent births were boys. In a simple
random sample of 100 recent births, 49 were boys and 51 were girls. The most likely explanation for the
difference between the observed results and the expected results in this case is
(A) bias
(B) variability due to sampling
(C) nonsampling error
(D) a sampling frame that is incomplete
(E) small sample size
MC I-2.) [APSTATSMC2007-2] In which of the following situations would it be most difficult to use a
census?
(A) To determine what proportion of licensed bicycles on a university campus have lights.
(B) To determine what proportion of students in a high school support wearing uniforms.
(C) To determine what proportion of registered students enrolled in a college are employed more than 20
hours each week.
(D) To determine what proportion of single-family dwellings in Tenafly, New Jersey have two-car garages.
(E) To determine what proportion of fish in Lake Michigan are bass.
MC I-3.) [CBAPSTATSPRACTICE-2] Under which of the following conditions is it preferable to use
stratified random sampling rather than simple random sampling?
A.) The population can be divided into a large number of strata so that each stratum contains only a few
individuals.
B.) The population can be divided into a small number of strata so that each stratum contains a large
number of individuals.
C.) The population can be divided into strata so that the individuals in each stratum are as much alike as
possible.
D.) The population can be divided into strata so that the individuals in each stratum are as different as
possible.
E.) The population can be divided into strata of equal sizes so that each individual in the population still has
the same chance of being selected.
MC I-4.) [CBAPSTATSPRACTICE-9] Each person in a simple random sample of 2,000 received a survey,
and 317 people returned their survey. How could nonresponse cause the results of the survey to be biased?
A.) Those who did not respond reduced the sample size, and small samples have more bias than large
samples.
B.) Those who did not respond caused a violation of the assumption of independence.
C.) Those who did not respond were indistinguishable from those who did not receive the survey.
D.) Those who did not respond represent a stratum, changing the simple random sample into a stratified
random sample.
E.) Those who did respond may differ in some important way from those who did not respond.
MC I-5.) [APSTATSMC2014-2] A researcher wants to know the percentage of villages in a certain
African country that have access to a clean drinking water source less than ¼ mile from the center of the
village. The country is divided into 12 districts and each district has many villages in it, as indicated in the
table below. The researcher selects a random sample of 10% of the villages from each district.
Which of the following terms best describes this sampling method?
A.) Simple random sampling
B.) Stratified random sampling
C.) Cluster sampling
D.) Systematic sampling
E.) Voluntary response sampling
MC I-6.) [APSTATSMC2014-21] In a recent poll of 1,500 randomly selected eligible voters, only 525 (35
percent) said that they did not vote in the last election. However, a vote count showed that 80 percent of
eligible voters actually did not vote in the last election. Which of the following types of bias is most likely
to have occurred in the poll?
(A) Nonresponse bias
(B) Sampling bias
(C) Selection bias
(D) Response bias
(E) Undercoverage bias
MC I-7.) [APSTATSMC2012-12] In the design of a survey, which of the following best explains how to
minimize response bias?
(A) Increase the sample size.
(B) Decrease the sample size.
(C) Randomly select the sample.
(D) Increase the number of questions in the survey.
(E) Carefully word and field-test survey questions.
MC I-8.) [APSTATSMC2014-9] A survey was administered to parents of high school students in a certain
state to see if the parents thought the students’ academic needs were being met. To select the sample, the
parents were divided into two groups: one group of parents who live in cities with populations of more
than 100,000 and the other group of parents who live in cities with populations less than or equal to
100,000. A random sample of 100 parents from each group was taken. Which of the following statements
about the sample of 200 parents is true?
(A) It is a convenience sample because the sample of parents was easily obtained.
(B) It is a stratified random sample because parents were randomly selected from each group.
(C) It is a random cluster sample because parents were randomly selected from each group.
(D) It is a random cluster sample because groups of high schools were randomly selected.
(E) It is a systematic sample because the parents were systematically divided into two groups.
MC I-9.) [APSTATSMC2012-4] A bank surveyed all of its 60 employees to determine the proportion who
participate in volunteer activities. Which of the following statements is true?
(A) The bank should not use the data from this survey because this is an observational study.
(B) The bank can use the result of this survey to prove that working for the bank causes employees to
participate in volunteer activities.
(C) The bank did not select a random sample of employees, so the survey will not provide the bank with
useful information.
(D) The bank would have to use the survey data to construct a confidence interval in order to estimate the
proportion of employees who participate in volunteer activities.
(E) The bank does not need to use an inference procedure to determine the proportion of employees who
participate in volunteer activities because the survey was a census of all employees.
MC I-10.) [APSTATSMC2002-4] Suppose that 30 percent of the subscribers to a cable television service
watch the shopping channel at least once a week. You are to design a simulation to estimate the probability
that none of five randomly selected subscribers watches the shopping channel at least once a week. Which
of the following assignments of the digits 0 through 9 would be appropriate for modeling an individual
subscriber's behavior in this simulation?
(A) Assign "0, 1, 2" as watching the shopping channel at least once a week and "3, 4, 5, 6, 7, 8, and 9" as
not watching.
(B) Assign "0, 1, 2, 3" as watching the shopping channel at least once a week and "4, 5, 6, 7, 8, and 9" as
not watching.
(C) Assign "1, 2, 3, 4, 5" as watching the shopping channel at least once a week and "6, 7, 8, 9, and 0" as
not watching.
(D) Assign "0" as watching the shopping channel at least once a week and "1, 2, 3, 4, and 5" as not
watching; ignore digits "6, 7, 8, and 9."
(E) Assign "3" as watching the shopping channel at least once a week and "0, 1, 2, 4, 5, 6, 7, 8, and 9" as
not watching.
MC I-11.) [APSTATSMC1997-27] The student government at a high school wants to conduct a survey of
student opinion. It wants to begin with a simple random sample of 60 students. Which of the following
survey methods will produce a simple random sample?
(A) Survey the first 60 students to arrive at school in the morning.
(B) Survey every 10th student entering the school library until 60 students are surveyed.
(C) Use random numbers to choose 15 each of the first-year, second-year, third-year, and fourth-year
students.
(D) Number the cafeteria seats. Use a table of random numbers to choose seats and interview the students
until 60 have been interviewed.
(E) Number the students in the official school roster. Use a table of random numbers to choose 60 students
from this roster for the survey.
MC I-12.) [APSTATSMC1997-7M] A certain county has 1,000 farms. Corn is grown on 100 of these
farms but on none of the others. In order to estimate the total farm acreage of corn for the county, two plans
are proposed.
Plan I:
a.) Sample 20 farms at random.
b.) Estimate the mean acreage of corn per farm +/- some standard error.
c.) Multiply the mean and standard error by 1000 to get the interval of estimate of the total.
Plan II:
a.) Identify the 100 corn-growing farms.
b.) Sample 20 corn-growing farms at random.
c.) Estimate the mean acreage of these 20 corn-growing farms +/- some standard error.
d.) Multiply the mean and standard error by 100 to get the interval of estimate of the total.
On the basis of the information given, which of the following is the better method for estimating the total farm
acreage of corn for the county?
(A) Choose plan I over plan II
(B) Choose plan II over plan I
(C) Choose either plan, since both are good and will produce equivalent results.
(D) Choose neither plan, since neither estimates the total farm acreage of corn.
(E) The plans cannot be evaluated from the information given.
MC I-13.) [APSTATSFRQ2015-3] Recently, a company acquired the rights to use a forest
– like the one shown in the photograph below – to harvest trees to produce lumber.
The company wants to conduct a study to estimate the mean trunk diameter of the trees
from the forest by taking a random sample of approximately 5 percent of the trees from
the forest. For the study, the company divides the forest into 200 equally sized plots of
approximately one acre each, as shown in the figure below.
Because of previous logging practices and growth patterns, plots with older trees, such as
Plot 6, tend to have fewer trees but with larger trunk diameters, and plots with younger
trees, such as Plot 121, tend to have more trees but with smaller trunk diameters. This is
illustrated in the two figures of Plot 6 and Plot 121 by the varying number and sizes of
the symbol .
a.) Describe the procedure for using cluster sampling to obtain a random sample of
approximately 5 percent of the trees from the forest, using the plots as clusters.
b.) Describe a procedure for using stratified sampling to obtain a random sample of
approximately 5 percent of the trees from the forest, using the plots as strata.
c.) For the study, give one advantage of using cluster sampling as described in part a.)
over stratified sampling as described in part b.).
d.) For the study, give one advantage of using stratified sampling as described in part b.)
over cluster sampling as described in part a.).
II. DESIGN OF STUDIES
MC II-1.) [APSTATSMC1997-9] To check the effect of cold temperature on elasticity of two brands of
rubber bands, one box of Brand A and one box of Brand B rubber bands are tested. Ten bands from the Brand
A box are placed in a freezer for two hours and ten bands from the Brand B box are kept at room
temperature. The amount of stretch before breakage is measured on each rubber band, and the mean for
cold bands is compared to the mean for the others. Is this a good experimental design?
(A) No, because the means are not proper statistics for comparison.
(B) No, because more than two brands should be used.
(C) No, because more temperatures should be used.
(D) No, because temperature is confounded with brand.
(E) Yes
MC II-2.) [APSTATSMC2014-3] A well-designed experiment should have which of the following
characteristics?
I. Subjects assigned randomly to treatments
II. A control group or at least two treatment groups
III. Replication
(A) I only
(B) I and II only
(C) I and III only
(D) II and III only
(E) I, II, and III
MC II-3.) [APSTATSMC2014-38] Which of the following distinguishes an observational study from a
randomized experiment?
(A) In an observational study volunteers are always used, whereas in a randomized experiment a random
sample is always taken from the population.
(B) In an observational study a random sample is always taken from the population, whereas in a
randomized experiment volunteers are always used.
(C) In an observational study treatments are not randomly assigned, whereas in a randomized experiment
treatments are randomly assigned.
(D) In an observational study a control group is never used, whereas in a randomized experiment a control
group is always used.
(E) An observational study can be double-blind, whereas a randomized experiment can only be single-blind
because the experimenter determines who is randomly assigned to each treatment.
MC II-4.) [APSTATSMC2012-10] A compact disc (CD) manufacturer wanted to determine which of two
different cover designs for a newly released CD will generate more sales. The manufacturer chose 70 stores
to sell the CD. Thirty-five of these stores were randomly assigned to sell CDs with one of the cover designs
and the other 35 were assigned to sell the CDs with the other cover design. The manufacturer recorded the
number of CDs sold at each of the stores and found a significant difference between the mean number of
CDs sold for the two cover designs. Which of the following gives the conclusion that should be made based
on the results and provides the best explanation for the conclusion?
(A) It is not reasonable to conclude that the difference in sales was caused by the different cover designs
because this was not an experiment.
(B) It is not reasonable to conclude that the difference in sales was caused by the different cover designs
because there was no control group for comparison.
(C) It is not reasonable to conclude that the difference in sales was caused by the different cover designs
because the 70 stores were not randomly chosen.
(D) It is reasonable to conclude that the difference in sales was caused by the different cover designs
because the cover designs were randomly assigned to stores.
(E) It is reasonable to conclude that the difference in sales was caused by the different cover designs
because the sample size was large.
MC II-5.) [APSTATSMC2012-11] The manager of a public swimming pool wants to compare the
effectiveness of two laundry detergents, Detergent A and Detergent B, in cleaning the towels that are used
daily. As each dirty towel is turned in, it is placed into the only washing machine on the premises. When
the washing machine contains 20 towels, the manager flips a coin to determine whether Detergent A or
Detergent B will be used for that load. The cleanliness of the load of towels is rated on a scale of 1 to 10 by
a person who does not know which detergent was used. The manager continues this experiment for many
days. Which of the following best describes the manager’s study?
(A) A completely randomized design
(B) A randomized block design with Detergent A and Detergent B as blocks
(C) A randomized block design with the washing machine as the block
(D) A matched-pairs design with Detergent A and Detergent B as the pair
(E) An observational study
MC II-6.) [APSTATSMC2002-1] Which of the following is a key distinction between well designed
experiments and observational studies?
(A) More subjects are available for experiments than for observational studies.
(B) Ethical constraints prevent large-scale observational studies.
(C) Experiments are less costly to conduct than observational studies.
(D) An experiment can show a direct cause-and-effect relationship, whereas an observational study
cannot.
(E) Tests of significance cannot be used on data collected from an observational study.
MC II-7.) [APSTATSMC2007-14] A researcher wishes to test a new drug developed to treat hypertension
(high blood pressure). A group of 40 hypertensive men and 60 hypertensive women is to be used. The
experimenter randomly assigns 20 of the men and 30 of the women to placebo and assigns the rest to the
treatment. The major reason for separate assignment for men and women is that
(A) it is a large study with 100 subjects.
(B) the new drug may affect men and women differently.
(C) the new drug may affect hypertensive and nonhypertensive people differently.
(D) this design uses matched pairs to detect the new-drug effect.
(E) there must be an equal number of subjects in both the placebo group and the treatment group.
MC II-8.) [APSTATSFRQ2016-3] Alzheimer’s disease results in a loss of cognitive ability beyond what is
expected with typical aging. A local newspaper published an article with the following headline.
The article reported that a study tracked the medical histories of 21,123 men and women for 23 years. The
article stated that, for those who smoked at least two packs of cigarettes a day, the risk of developing
Alzheimer’s disease was 2.57 times the risk for those who did not smoke.
(a) Identify the explanatory and response variables in the study.
(b) Is the study described in the article an observational study or an experiment? Explain.
(c) Exercise status (regular weekly exercise versus no regular weekly exercise) was mentioned in the article
as a possible confounding variable. Explain how exercise status could be a confounding variable in the
study.
III. EXPLORING DATA
MC III-1.) [APSTATSMC2014-2] Professor James gave the same test to his three sections. On the 34-
question test, the highest score was 32 and the lowest was 15. Based on the information displayed in the
boxplot below, which of the following statements is true?
(A) Section 1 has the smallest interquartile range.
(B) The lowest score in section 2 is higher than the highest score in either of the other sections.
(C) Section 2 has the smallest range of scores.
(D) The top 25% of scores in section 2 are lower than the highest score in section 3.
(E) At least 50% of the scores in section 3 are higher than all of the scores in section 1.
MC III-2.) [APSTATSMC2014-7] Each person in a random sample of adults was asked how many DVDs
he or she owned. Summary statistics are given below.
Which of the following statements is true?
(A) Seventy-five percent of the adults in the sample own more than 95 DVDs.
(B) Fifty percent of the adults in the sample own between 0 and 129.4 DVDs.
(C) The distribution of the number of DVDs owned appears to be approximately symmetric.
(D) The interquartile range of the number of DVDs owned is 65.
(E) The distribution of the number of DVDs owned contains outliers on both the low side and the high side.
MC III-3.) [APSTATSMC2012-2] A random sample of 374 United States pennies was collected, and the
age of each penny was determined. According to the boxplot below, what is the approximate interquartile
range (IQR) of the ages?
MC III-4.) [APSTATSMC2007-12]
The table above shows the sample size, the mean, and the median for two samples of measurements. What
is the median for the combined sample of 47 measurements?
(A) (42.6 + 49.2) / 2
(B) (45.0 + 48.5) / 2
(C) (21(42.6) + 26(49.2)) / 47
(D) (21(45.0) + 26(48.5)) / 47
(E) It cannot be determined from the information given.
MC III-5.) [APSTATSMC2013-12] The number of hurricanes reaching the East Coast of the United States
was recorded for each of the last ten decades by the National Hurricane Center. Summary measures are
shown below.
Min=12 Max=24
Lower quartile =15 Upper quartile =18
Median=16 n=10
Which of the following statements is true?
(A) The smallest observation is 12 and it is an outlier. No other observations in the data set could be outliers.
(B) The largest observation is 24 and it is an outlier. No other observations in the data set could be outliers.
(C) Both 12 and 24 are outliers. It is possible that there are also other outliers.
(D) 12 is an outlier and it is possible that there are other outliers at the low end of the data set. There are no
outliers at the high end of the data set.
(E) 24 is an outlier and it is possible that there are other outliers at the high end of the data set. There are no
outliers at the low end of the data set.
MC III-6.) [APSTATSMC2015-7] Data were collected on the number of text messages sent by each
student in a large school for one day. A boxplot of the data is shown below.
Based on the boxplot, which of the following statements is the most reasonable conclusion?
(A) There are more students with data values below the median than there are students with data values
above the median.
(B) There are more students with data values between the first quartile and the median than there are
students with data values between the median and the third quartile.
(C) There are fewer students with data values between the first quartile and the median than there are
students with data values between the median and the third quartile.
(D) There are approximately the same number of students with data values between the first quartile and
the minimum as there are students with data values between the third quartile and the maximum.
(E) The data are less spread out between the first quartile and the median than between the median and the
third quartile.
MC III-7.) [APSTATSMC2012-13] For a sample of 42 rabbits, the mean weight is 5 pounds and the
standard deviation of weights is 3 pounds. Which of the following is most likely true about the weights for
the rabbits in this sample?
(A) The distribution of weights is approximately normal because the sample size is 42, and therefore the
central limit theorem applies.
(B) The distribution of weights is approximately normal because the standard deviation is less than the
mean.
(C) The distribution of weights is skewed to the right because the least possible weight is within 2 standard
deviations of the mean.
(D) The distribution of weights is skewed to the left because the least possible weight is within 2 standard
deviations of the mean.
(E) The distribution of weights has a median that is greater than the mean.
MC III-8.) [APSTATSMC2012-3] The histogram below shows the number of minutes needed by 45
students to finish playing a computer game. Which of the following statements is correct?
(A) The distribution is skewed to the right.
(B) The distribution is skewed to the left.
(C) The distribution appears to be normal.
(D) The distribution appears to be chi-square.
(E) The distribution appears to be uniform.
MC III-9.) [APSTATSFRQFORMB2010-1] As part of the US Dept of Agriculture’s Super Dump cleanup
efforts in the early 1990s, various sites in the country were targeted for cleanup. Three of the targeted sites
– River X, River Y, and River Z – had become contaminated with pesticides because they were located
near abandoned pesticide dump sites. Measurements of concentration of aldrin (a commonly used
pesticide) were taken at twenty randomly selected locations in each river near the dump sites.
The boxplots shown below display the five-number summaries for concentrations, in parts per million
(ppm) of aldrin, for the twenty locations that were sampled in each of the three rivers.
Compare the distributions of the concentration of aldrin among the three rivers.
MC III-10.) [APSTATSMC2007-24] The histogram below displays the times, in minutes, needed for each
chimpanzee in a sample of 26 to complete a simple navigational task.
The largest observation, 93, is an outlier since Q3 + 1.5(Q3 - Q1) = 87.125. Which of the following
boxplots could represent the information in the histogram?
MC III-11.) [APSTATSMC2014-12] A school is having a contest in which students guess the number of
candies in a jar. The student whose guess is closest to the correct number of candies in the jar wins a prize.
The number of candies guessed by male and female students is shown in the back-to-back stemplot below.
Which of the following statements is true about the distributions of guesses?
(A) The distribution of guesses for male students is skewed to the left, and the distribution of guesses for
female students is skewed to the right.
(B) The distribution of guesses for male students is skewed to the right, and the distribution of guesses for
female students is skewed to the left.
(C) The distribution of guesses for male and female students are both skewed to the right.
(D) The distribution of guesses for male and female students are both skewed to the left.
(E) The distribution of guesses for male and female students are both symmetric.
MC III-12.) [CBAPSTATSPRACTICEPROBLEM] Consider a data set of positive values, at least two of
which are not equal. Which of the following sample statistics will be changed when each value in this data
set is multiplied by a constant whose absolute value is greater than 1?
I. The mean
II. The median
III. The standard deviation
(A) I only
(B) II only
(C) III only
(D) I and II only
(E) I, II, and III
MC III-13.) [APSTATSMC2014-26]
MC III-14.) [APSTATSMC2014-3] Administrators at a state university computed the mean GPA (grade
point average) for juniors and seniors majoring in either physics or chemistry. The results are displayed in
the table below. When juniors and seniors are grouped together, could physics majors have a higher mean
GPA than chemistry majors?
MC III-15.) [APSTATSFRQ2013-3] An environmental group conducted a study to determine whether
crows in a certain region were ingesting food containing unhealthy levels of lead. A biologist classified lead
levels greater than 6.0 parts per million (ppm) as unhealthy. The lead levels of a random sample of 23
crows in the region were measured and recorded. The data are shown in the stemplot below.
What proportion of crows in the sample had lead levels that are classified by the biologist as unhealthy?
MC III-16.) [APSTATSFRQB2011] Records are kept by each state in the US on the number of pupils enrolled
in public schools and the number of teachers employed by public schools for each school year. From these
records, the ratio of the number of pupils to the number of teachers (P-T ratio) can be calculated for each
state. The histograms below show the P-T ratio for every state during the 2001-2002 school year. The
histogram on the left displays the ratios for the 24 states that are west of the Mississippi River, and the
histogram on the right displays the ratios for the 26 states that are east of the Mississippi River.
(a) Describe how you would use the histograms to estimate the median P-T ratio for each group (west and
east) of states. Then use this procedure to estimate the median of the west group and the median of the east
group.
(b) Write a few sentences comparing the distribution of P-T ratios for states in the two groups (west and
east) during the 2001 – 2002 school year.
(c) Using your answers in parts (a) and (b), explain how you think the mean P-T ratio during the 2001 –
2002 school year will compare for the two groups (west and east).
MC III-17.) [APSTATSFRQ2016-1] Robin works as a server in a small restaurant, where she can earn a tip
(extra money) from each customer she serves. The histogram shows the distribution of her 60 tip amounts
for one day of work.
(a) Write a few sentences to describe the distribution of tip amounts for the day shown.
(b) One of the tip amounts was $8. If the $8 tip had been $18, what effect would the increase have had on
the following statistics? Justify your answers.
The mean:
The median:
MC III-18.) [APSTATSMC2007-1] The statistics below provide a summary of the distribution of heights,
in inches, for a simple random sample of 200 young children.
Mean: 46 inches
Median: 45 inches
Standard Deviation: 3 inches
First Quartile: 43 inches
Third Quartile: 48 inches
About 100 children in the sample have heights that are
(A) less than 43 inches
(B) less than 48 inches
(C) between 43 and 48 inches
(D) between 40 and 52 inches
(E) more than 46 inches
MC III-19.) [APSTATSMC2007-18] One hundred people were interviewed and classified according to their
attitude toward small cars and their personality type. The results are shown in the table below.
Which of the following is true?
(A) Of the three attitude groups, the group with negative attitude has the highest proportion of type A
personality types.
(B) Of the three attitude groups, the group with the neutral attitude has the highest proportion of type B
personality types.
(C) For each personality type, more than half of the 100 respondents have a neutral attitude toward small
cars.
(D) The proportion that has a positive attitude toward small cars is higher among people with a type B
personality type than among people with type A personality type.
(E) More than half of the 100 respondents have a type A personality type and a positive attitude toward
small cars.
MC III-20.) [APSTATSFRQ1997-10] The boxplots below summarize two data sets, A and B. Which of the
following must be true?
I. Set A contains more data than Set B.
II. The box of Set A contains more data than the box of Set B.
III. The data in Set A have a larger range than the data in Set B.
(A) I only
(B) III only
(C) I and II only
(D) II and III only
(E) I, II and III
MC III-21.) [APSTATSFRQ2012-5] The histogram below displays the frequencies of waiting times, in
minutes, for 175 patients in a dentist’s office.
Which of the following could be the median of the waiting times, in minutes?
(A) 2.50
(B) 7.25
(C) 12.25
(D) 15.00
(E) 17.50
MC III-22.) [APSTATSFRQ2012-6] Data were collected on the amount, in dollars, that individual
customers spent on dinner in an Italian restaurant. The quartiles for these data are given below.
Which of the following statements must be true for these customers?
(A) At least half of the customers spent less than or equal to $44.27 and at least half spent greater than or
equal to $44.27.
(B) Seventy-five percent of the customers spent between $36.27 and $58.97.
(C) Twenty-five percent of the customers spent less than or equal to $58.97 and the remaining 75 percent
spent greater than or equal to $58.97.
(D) The mean amount spent by customers is $44.27.
(E) A majority of customers spent $44.27.
Answers
Part 1 – Review
I. SAMPLING
MC I-1.) B
MC I-2.) E
MC I-3.) C. Choice A: reducing the number of individuals in each stratum is not the reason to stratify.
Choice B: this divides the population by size, not by homogeneity. Choice D: groups whose members are as
different as possible are what cluster sampling uses. Choice E: forming strata of equal size is not the purpose
of stratification.
MC I-4.) E. Choice A: a reduced sample size may still be useful as long as the conditions are satisfied.
Choice B: the results are independent as long as those people do not influence each other. Choice C:
no information is given on this. Choice D: the people who do not respond might not have responded
regardless of how the sample was drawn.
MC I-5.) B
MC I-6.) D
MC I-7.) E
MC I-8.) B
MC I-9.) E
MC I-10.) A
MC I-11.) E
MC I-12.) B. Plan I is SRS, and plan II is stratified sampling.
MC I-13.) a.)
1.) Label each plot from 1 to 200.
2.) Randomly generate 10 numbers from 1 to 200.
3.) Measure ALL the trees in those 10 sample plots.
b.)
1.) Divide the area into 200 plots.
2.) For each plot, label ALL the trees. Then, randomly select 5% of the labeled trees in that plot.
3.) Measure the selected trees in each plot.
c.) Cluster sampling is relatively easier and cheaper to do, since all the trees in only 10 plots need to be
measured. The drawback is that the sampled plots may not represent the forest well when the distribution
varies from area to area.
d.) Stratified sampling may better represent the entire area, but the survey could be harder or more
expensive to finish, since the trees in every plot need to be labeled and sampling is needed in every plot.
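The two plans in parts a.) and b.) can be sketched in Python as follows. This is only an illustrative sketch: the forest of 200 plots with 100 trees each is made up, and the function names cluster_sample and stratified_sample are hypothetical, not part of the original problem.

```python
import random

def cluster_sample(plots, n_plots, rng):
    """Plan a.): pick n_plots whole plots at random and measure every tree in them."""
    chosen = rng.sample(sorted(plots), n_plots)
    return {plot_id: list(plots[plot_id]) for plot_id in chosen}

def stratified_sample(plots, fraction, rng):
    """Plan b.): treat each plot as a stratum and sample a fraction of its trees."""
    sample = {}
    for plot_id, trees in plots.items():
        k = max(1, round(fraction * len(trees)))  # at least one tree per plot
        sample[plot_id] = rng.sample(trees, k)
    return sample

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical forest: 200 plots with 100 labeled trees each.
plots = {p: [f"plot{p}-tree{t}" for t in range(1, 101)] for p in range(1, 201)}

clusters = cluster_sample(plots, 10, rng)     # 10 of the 200 plots, all trees
strata = stratified_sample(plots, 0.05, rng)  # 5% of the trees in every plot

n_cluster_trees = sum(len(trees) for trees in clusters.values())
n_strata_trees = sum(len(trees) for trees in strata.values())
print(len(clusters), n_cluster_trees)
print(len(strata), n_strata_trees)
```

With these made-up numbers both plans measure the same number of trees, but the cluster plan visits only 10 plots while the stratified plan visits all 200, which is the cost trade-off described in parts c.) and d.).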
II. DESIGN OF STUDIES
MC II-1.) D
MC II-2.) E
MC II-3.) C. In an observational study, treatments are not randomly assigned.
MC II-4.) D. The word “cause” should be avoided.
MC II-5.) A
MC II-6.) D
MC II-7.) B
MC II-8.)
a.) The explanatory variable is the degree of cigarette smoking. The response variable is whether the person
develops Alzheimer’s disease.
b.) Observational: the people were not assigned to a particular degree of cigarette smoking.
c.) Exercise is one possible confounding variable: people who exercise more may smoke less, and people who
exercise more may also be less likely to develop Alzheimer’s disease, so the effect of exercise cannot be
separated from the effect of smoking.
III. EXPLORING DATA MULTIPLE-CHOICE PROBLEMS
MC III-1.) E. Each quarter contains 25% of the data.
MC III-2.) D. The interquartile range is 653096 .
MC III-3.) 15. The IQR is \(20 - 5 = 15\).
MC III-4.) E.
MC III-5.) E.
MC III-6.) D
MC III-7.) C. The curve can only span 5 pounds (less than 2 standard deviations) to the left and possibly
more on the right. So, it may not be symmetric around the mean or median.
MC III-8.) B. The distribution is skewed to the left.
MC III-9.)
Shape (S): River X is skewed to the right; River Y is more symmetric; River Z is skewed to the left.
Outlier (O): None of the rivers has an outlier.
Center (C): River X has the highest median of the three rivers.
Spread (S): River Z has the smallest spread and is clustered around “the center.”
MC III-10.) D. A quartile value is a number from the data, and 70 is the largest number in the data that is
less than 87.125; the median is around the 13th number, which is greater than 20.
MC III-11.) D. The left-side should be the side with smaller numbers and the right-side should have larger
numbers.
MC III-12.) E. The problem can be understood better once the concepts and formulas of expected value
and variance are introduced. Let \(X\) be the random variable for the original data set and \(Y = kX\) for a
constant \(k > 0\). Then \(E[Y] = kE[X]\), \(\mathrm{Var}[Y] = k^2\,\mathrm{Var}[X]\) or \(\sigma_Y = k\sigma_X\), and the median of
\(Y\) is the median of \(X\) multiplied by \(k\). All three measures will change.
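The scaling rule in MC III-12 can be checked numerically. The sketch below uses a made-up data set and a made-up positive constant k = 3; neither comes from the original problem, and the helper name summarize is hypothetical.

```python
import statistics

def summarize(data):
    """Return (mean, median, population standard deviation) of a data set."""
    return (statistics.mean(data), statistics.median(data), statistics.pstdev(data))

x = [2, 4, 4, 6, 9]     # hypothetical original data set
k = 3                   # hypothetical positive constant
y = [k * v for v in x]  # every value multiplied by k

mean_x, median_x, sd_x = summarize(x)
mean_y, median_y, sd_y = summarize(y)

# Each measure of the rescaled data should equal k times the original measure.
print(mean_y, k * mean_x)
print(median_y, k * median_x)
print(sd_y, k * sd_x)
```

Any k other than 1 changes all three measures by the same factor.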
MC III-13.) D. The median is around 55, and the median divides the area into two equal parts.
MC III-14.) Yes. Since no counts are given for the categories, consider the extreme case in which
there is only one junior taking physics and many more seniors taking physics, and only one senior taking
chemistry and many more juniors taking chemistry. The averages in this case could be close to 3.2 for
seniors taking physics and 3.0 for juniors taking chemistry.
MC III-15.) 17.4%. There are 4 crows with ppm higher than 6: 6.3, 6.4, 6.6, 6.8. So, \(4/23 \approx 17.4\%\).
MC III-16.)
a.) The West median is between 15~16 and the East median is between 15~16, so the medians are similar.
b.) West: spread out (range = 22-12 = 10), skewed to the right, unimodal. East: clustered (range = 19-12 = 7),
symmetric, unimodal. So, West has more variability.
c.) Since they have similar medians and West is skewed to the right, West should have the higher mean.
MC III-17.)
a.) SOCS. S: skewed to the right; O: one tip around $20 is an outlier; C: median is around 2.5~5; S: most
tips are less than $5 and the range is 0~22.5.
b.) The mean changes by \(10/60 = 1/6\) dollar, and the median is unchanged, since it is around 2.5~5 and
both $8 and $18 lie to its right.
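The reasoning in part b.) can be illustrated with a short Python sketch. It assumes that the change in question is a single tip moving from $8 to $18 among 60 tips; the list of tips below is made up for illustration and is not the data from the original histogram.

```python
import statistics

# Hypothetical tips in dollars: 60 values, most below $5, one near $20.
tips = [1.0] * 10 + [2.5] * 15 + [4.0] * 20 + [6.0] * 10 + [8.0] * 4 + [20.0]
assert len(tips) == 60

changed = list(tips)
changed[changed.index(8.0)] = 18.0  # replace one $8 tip by $18

# The total rises by 10 dollars, so the mean rises by 10/60 of a dollar.
mean_shift = statistics.mean(changed) - statistics.mean(tips)

# Both 8 and 18 lie above the median, so the middle values do not move.
median_before = statistics.median(tips)
median_after = statistics.median(changed)
print(mean_shift, median_before, median_after)
```

The median only depends on the middle values of the ordered data, so a change confined to the right tail leaves it alone while the mean shifts by the change divided by the number of tips.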
MC III-18.) C. Half of the children are in the middle 50%.
MC III-19.) B. \(9/20 = 45\%\).
MC III-20.) B.
MC III-21.) B.
MC III-22.) A.
Part 2 – Quarterly Exam Questions
MULTIPLE-CHOICE QUESTIONS
I. SAMPLING
MC I-1.) [APSTATSMC2002-9] A volunteer for a mayoral candidate's campaign periodically conducts
polls to estimate the proportion of people in the city who are planning to vote for this candidate in the
upcoming election. Two weeks before the election, the volunteer plans to double the sample size in the
polls. The main purpose of this is to
(A) reduce nonresponse bias
(B) reduce the effects of confounding variables
(C) reduce bias due to the interviewer effect
(D) decrease the variability in the population
(E) decrease the standard deviation of the sampling distribution of the sample proportion
MC I-2.) [APSTATSMC2002-15] A high school statistics class wants to conduct a survey to determine
what percentage of students in the school would be willing to pay a fee for participating in after-school
activities. Twenty students are randomly selected from each of the freshman, sophomore, junior, and
senior classes to complete the survey. This plan is an example of which type of sampling?
(A) Cluster
(B) Convenience
(C) Simple random
(D) Stratified random
(E) Systematic
MC I-3.) [APSTATSMC2002-16] Jason wants to determine how age and gender are related to political
party preference in his town. Voter registration lists are stratified by gender and age-group. Jason selects a
simple random sample of 50 men from the 20 to 29 age-group and records their age, gender, and party
registration (Democratic, Republican, neither). He also selects an independent simple random sample of 60
women from the 40 to 49 age-group and records the same information. Of the following, which is the most
important observation about Jason's plan?
(A) The plan is well conceived and should serve the intended purpose.
(B) His samples are too small.
(C) He should have used equal sample sizes.
(D) He should have randomly selected the two age groups instead of choosing them nonrandomly.
(E) He will be unable to tell whether a difference in party affiliation is related to differences in age or to
the difference in gender.
MC I-4.) [APSTATSMC2015-2] A researcher wanted to estimate the average amount of money spent on
extracurricular activities per school in a certain region. The researcher randomly selected 20 public schools
and 20 private schools in the region to use for a sample. Which of the following best describes the type of
the sample that was taken?
(A) A census
(B) A cluster sample
(C) A convenience sample
(D) A simple random sample
(E) A stratified sample
MC I-5.) [APSTATSMC2007-20] Which of the following is NOT a characteristic of stratified sampling?
(A) Random sampling is part of the sampling procedure.
(B) The population is divided into groups of units that are similar on some characteristic.
(C) The strata are based on facts known before the sample is selected.
(D) Each individual unit in the population belongs to one and only one of the strata.
(E) Every possible subset of the population, of the desired sample size, has an equal chance of being selected.
MC I-6.) [APSTATSMC2012-15] A polling firm is interested in surveying a representative sample of
registered voters in the United States. The firm has automated its sampling so that random phone numbers
within the United States are called. Each time a number is called, the procedure below is followed.
• If there is no response or if an answering machine is reached, another number is automatically called.
• If a person answers, a survey worker verifies that the person is at least 18 years of age.
• If the person is not at least 18 years of age, no response is recorded, and another number is called.
• If the person is at least 18 years of age, that person is surveyed.
Some people claim the procedure being used does not permit the results to be extended to all registered
voters. Which of the following is NOT a legitimate concern about the procedure being used?
(A) Registered voters with children under the age of 18 years may be underrepresented in the sample.
(B) Registered voters with unlisted telephone numbers may be underrepresented in the sample.
(C) Registered voters who have more than one telephone number may be overrepresented in the sample.
(D) Registered voters who live in households consisting of more than one voter may be underrepresented.
(E) People who are not registered to vote may bias the sample results.
MC I-7.) [APSTATSMC2012-18] When using a one-sample t-procedure to construct a confidence interval
for the mean of a finite population, a condition is that the population size be at least 10 times the sample
size. The reason for the condition is to ensure that
(A) the sample size is large enough
(B) the central limit theorem is applicable for the sample mean
(C) the sample standard deviation is a good approximation of the population standard deviation
(D) the degree of dependence among observations is negligible
(E) the sampling method is not biased
MC I-8.) [APSTATSMC2013-2] A school principal wanted to investigate student opinion about the food
served in the school cafeteria. The principal selected at random 50 first-year students, 50 second-year
students, 50 third-year students, and 50 fourth-year students to complete a questionnaire. Which of the
following best describes the principal’s sampling plan?
(A) A stratified random sample
(B) A simple random sample
(C) A cluster sample
(D) A convenience sample
(E) A systematic sample
MC I-9.) [APSTATSMC2013-27] A certain motel is roughly 20 miles from the entrance to Yosemite
National Park. The motel manager wants to get a better estimate of the distance and asks five people to
each measure the distance, to the nearest tenth of a mile, using the odometer in his or her car. The manager
will use the median of the five measurements as the estimate of the distance. Which of the following
statements is NOT a statistical justification for the manager’s plan?
(A) Odometer reading should be considered a variable when used to measure this distance.
(B) The median of the five measurements is more likely to be close to the actual distance than is a single
measurement.
(C) The actual distance should be considered a variable, and taking five measurements allows the manager
to estimate the variability in the actual distance.
(D) If one or two odometers give inaccurate readings, the estimate still should be fairly close to the actual
distance.
(E) The manager can get some indication of how far off the estimate might be.
MC I-10.) [APSTATSMC2013-33] A regional transportation authority is interested in estimating the mean
number of minutes working adults in the region spend commuting to work on a typical day. A random
sample of working adults will be selected from each of three strata: urban, suburban, and rural. Selected
individuals will be asked the number of minutes they spend commuting to work on a typical day. Why is
stratification used in this situation?
(A) To remove bias when estimating the proportion of working adults living in urban, suburban, and rural
areas.
(B) To remove bias when estimating the mean commuting time.
(C) To reduce bias when estimating the mean commuting time.
(D) To decrease the variability in estimates of the proportion of working adults living in urban, suburban,
and rural areas.
(E) To decrease the variability in estimates of the mean commuting time.
II. DESIGN OF STUDIES
MC II-1.) [APSTATSMC2013-15] An experiment will be conducted to determine whether children learn
their multiplication facts better by practicing with flash cards or by practicing on a computer. Children who
volunteer for the experiment will be randomly assigned to one of the two treatments. Because the children’s
gender may affect the outcome, there will be blocking by gender. After practice, the children will be given
a test on their multiplication facts. Why will it be impossible to conduct a double-blind experiment?
(A) The experimenter will know whether the child is a boy or a girl and whether he or she used flash cards
or the computer.
(B) The child will know whether he or she is a boy or a girl.
(C) The child will know whether he or she used flash cards or the computer.
(D) The person who grades the tests will know whether the child was a boy or a girl.
(E) The person who grades the tests will know whether the child used flash cards or the computer.
MC II-2.) [APSTATSMC1997-18]
MC II-3.) [APSTATSMC2002-25] A study of existing records of 27,000 automobile accidents involving
children in Michigan found that about 10 percent of children who were wearing a seatbelt (group SB) were
injured and that about 15 percent of children who were not wearing a seatbelt (group NSB) were injured.
Which of the following statements should NOT be included in a summary report about this study?
(A) Driver behavior may be a potential confounding factor.
(B) The child's location in the car may be a potential confounding factor.
(C) This study was not an experiment, and cause-and-effect inferences are not warranted.
(D) This study demonstrates clearly that seat belts save children from injury.
(E) Concluding that seatbelts save children from injury is risky, at least until the study is independently
replicated.
MC II-4.) [APSTATSMC2007-9] A television news editor would like to know how local registered voters
would respond to the question, "Are you in favor of the school bond measure that will be voted on in an
upcoming special election?" A television survey is conducted during a break in the evening news by listing
two telephone numbers side by side on the screen, one for viewers to call if they approve of the bond
measure, and the other to call if they disapprove. This survey method could produce biased results for a
number of reasons. Which one of the following is the most obvious reason?
(A) It uses a stratified sample rather than a simple random sample.
(B) People who feel strongly about the issue are more likely to respond.
(C) Viewers should be told about the issues before the survey is conducted.
(D) Some registered voters who call might not vote in the election.
(E) The wording of the question is biased.
MC II-5.) [APSTATSMC2007-31] Automobile brake pads are either metallic or nonmetallic. An
experiment is to be conducted to determine whether the stopping distance is the same for both types of brake
pads. In previous studies, it was determined that car size (small, medium, large) is associated with stopping
distance, but car type (sedan, wagon, coupe) is not associated with stopping distance. The experiment would
be best done
(A) by blocking on car size
(B) by blocking on car type
(C) by blocking on stopping distance
(D) by blocking on brake pad type
(E) without blocking
MC II-6.) [APSTATSMC2007-35] A group of students has 60 houseflies in a large container and needs to
assign 20 to each of the three groups labeled A, B, and C for an experiment. They can capture the flies one
at a time when the flies enter a side chamber in the container that is baited with food. Which of the
following methods will be most likely to result in three comparable groups of 20 houseflies each?
(A) Label the first 20 flies caught as Group A, the second 20 caught as group B, and the third 20 caught
as group C.
(B) Write the letters A, B, and C on separate slips of paper. Randomly pick one of the slips of paper
and assign the first 20 flies caught to that group. Pick another slip and assign the next 20 flies caught to
that group. Assign the remaining flies to the remaining group.
(C) When each fly is caught, roll a die. If the die shows an even number, the fly is labeled A. If the
die shows an odd number, the fly is labeled B. When 20 flies have been labeled A and 20 have been
labeled B, the remaining flies are then labeled C.
(D) Place each fly in its own numbered container (numbered from 1 to 60) in the order that it was
caught. Write the numbers from 1 to 60 on slips of paper, put the slips in a jar, and mix them well. Pick 20
numbers out of the jar. Assign the flies in the containers with those numbers to group A. Pick 20 more
numbers and assign the flies in the containers with those numbers to group B. Assign the remaining 20
flies to group C.
(E) When each fly is caught, roll a die. If the die shows a 1 or 2, the fly is labeled A. If the die shows
a 3 or 4, the fly is labeled B. If the die shows a 5 or 6, the fly is labeled C. Repeat this process for all 60
flies.
MC II-7.) [APSTATSMC2013-34] A randomized block design will be used in an experiment to compare
two lotions that protect people from getting sunburned. Which of the following should guide the formation
of the blocks?
(A) Participants in the same block should receive the same lotion.
(B) Participants should be randomly assigned to the blocks.
(C) Participants should be kept blind as to which block they are in.
(D) Participants within each block should be as similar as possible with respect to how easily they get sunburned.
(E) Participants within each block should be as different as possible with respect to how easily they get
sunburned.
MC II-8.) [APSTATSMC2015-14] The dining and nutrition staff at the University of Georgia plans to
survey students to get their opinion on the new nutrition program introduced this semester at each of the
on-campus dining halls. They are interested in getting feedback from students living both on-campus and
off-campus about the new gluten-free and vegetarian options offered at each meal. Which of the following
sampling methods is the most appropriate for accomplishing this?
(A) Hand out a survey to every 10th student that enters each dining hall on a specified day.
(B) Group students by housing status, one group representing those living on campus and the other
representing those living off campus. Email a survey to 100 randomly selected students from each group.
(C) On equally sized slips of paper, write down the names of all the dormitories on campus as well as all
the apartment complexes off campus. Put all the names in a hat, mix them well, and draw out five of them.
Email a survey to all students in the five randomly selected buildings.
(D) Hand out a survey to the first 50 students that enter each dining hall on a specified day.
(E) Create a Facebook page for each dining hall where students can post their comments.
MC II-9.) [APSTATSMC2015-22] A university statistics professor wants to know if including review
problems in each set of homework problems (treatment I) is more effective than including only new
problems (treatment II). He teaches three sections of the course: a morning, an afternoon, and an evening
section, each with 30 students. Within each section the professor randomly assigns 15 students to treatment
I and 15 students to treatment II. Compared to randomly assigning 45 students to each treatment, what is
the advantage of randomly assigning 15 students to each treatment within each section?
(A) Random assignment within section eliminates the placebo effect.
(B) Random assignment within section allows the professor to generalize the results to all sections.
(C) Random assignment within section permits the professor and students to be blinded as to the treatment
group assignment.
(D) Random assignment within section accounts for possible differences in performance due to the time of
day the class meets.
(E) Random assignment within section reduces the effect of nonresponse bias.
MC II-10.) [APSTATSMC2015-30] Nearly 12,000 high school students across 11 different countries
were surveyed about both their sleeping habits and their performance in school. Based on the results,
researchers concluded that a lack of sleep is linked to students earning poor grades in school. Which of the
following statements is true?
(A) This is an observational study. Therefore, researchers cannot conclude that a lack of sleep causes poor
grades.
(B) This is an observational study. Therefore, researchers can conclude that a lack of sleep causes poor
grades.
(C) This study is a well-designed experiment. Therefore, researchers cannot conclude that a lack of sleep
causes poor grades.
(D) This study is a well-designed experiment. Therefore, researchers can conclude that a lack of sleep
causes poor grades.
(E) This is neither an observational study nor a well-designed experiment.
III. EXPLORING DATA
MC III-1.) [APSTATSMC1997-14]
MC III-2.) [APSTATSMC1997-21]
MC III-3.) [APSTATSMC1997-22]
MC III-4.) [APSTATSMC2002-14]
The boxplots shown above summarize two data sets, I and II. Based on the boxplots, which of the
following statements about these two data sets CANNOT be justified?
(A) The range of data set I is equal to the range of data set II.
(B) The interquartile range of data set I is equal to the interquartile range of data set II.
(C) The median of data set I is less than the median of data set II.
(D) Data set I and data set II have the same number of data points.
(E) About 75% of the values in data set II are greater than or equal to about 50% of the values in data set
I.
MC III-5.) [APSTATSMC2002-20] A small town employs 34 salaried, nonunion employees. Each
employee receives an annual salary increase of between $500 and $2000 based on a performance review by
the mayor's staff. Some employees are members of the mayor's political party, and the rest are not.
Students at the local high school form two lists, A and B, one for the raises granted to employees who are
in the mayor's party, and the other for raises granted to employees who are not. They want to display a
graph (or graphs) of the salary increases in the student newspaper that readers can use to judge whether the
two groups of employees have been treated in a reasonably equitable manner.
Which of the following displays is least likely to be useful to readers for this purpose?
(A) Back-to-back stemplots of A and B
(B) Scatterplot of B versus A
(C) Parallel boxplots of A and B
(D) Histograms of A and B that are drawn to the same scale
(E) Dotplots of A and B that are drawn to the same scale
MC III-6.) [APSTATSMC2002-27]
The figure above shows a cumulative relative frequency histogram of 40 scores on a test given in an AP
Statistics class. Which of the following conclusions can be made from the graph?
(A) There is greater variability in the lower 20 test scores than in the higher 20 test scores.
(B) The median test score is less than 50.
(C) Sixty percent of the students had test scores above 80.
(D) If the passing score is 70, most students did not pass the test.
(E) The horizontal nature of the graph for the test scores of 60 and below indicates that those scores
occurred most frequently.
MC III-7.) [APSTATSMC2007-15] The histograms below represent the distribution of five different
data sets, each containing 28 integers, from 1 through 7, inclusive. The horizontal and vertical scales are
the same for all graphs. Which graph represents the data set with the largest standard deviation?
MC III-8.) [APSTATSMC2007-33] Five estimators for a parameter are being evaluated. The true value
of the parameter is 0. Simulations of 100 random samples, each of size n, are drawn from the population.
For each simulated sample, the five estimates are computed. The histograms below display the simulated
sampling distributions for the five estimators. Which simulated sampling distribution is associated with the
best estimator for this parameter?
MC III-9.) [APSTATSMC2013-05] The amount of time required for each of 100 mice to navigate through
a maze was recorded. The histogram below shows the distribution of times, in seconds, for the 100 mice.
Which of the following values is closest to the standard deviation of the times for the 100 mice?
(A) 2.5 seconds
(B) 10 seconds
(C) 20 seconds
(D) 50 seconds
(E) 90 seconds
MC III-10.) [APSTATSMC2013-06] A graph (not shown) of the selling prices of homes in a certain city
for the month of April reveals that the distribution is skewed to the left. Which of the following statements
is the most reasonable conclusion about the selling prices based on the graph?
(A) The mean is greater than the median.
(B) The median is the average of the first quartile and the third quartile.
(C) There are fewer selling prices between the first quartile and the median than there are between the
median and the third quartile.
(D) There are more selling prices that are less than the mean than selling prices that are greater than the
mean.
(E) The value of maximum minus third quartile is less than the value of first quartile minus minimum.
FREE-RESPONSE QUESTIONS
Directions: Show all your work. Indicate clearly the methods you use, because you will be scored on the
correctness of your methods as well as on the accuracy and completeness of your results and explanations.
FRQ 1.1.) [APSTATSFRQ2014-04] As part of its twenty-fifth reunion celebration, the class of 1988
(students who graduated in 1988) at a state university held a reception on campus. In an informal survey,
the director of alumni development asked 50 of the attendees about their incomes. The director computed
the mean income of the 50 attendees to be $189,952. In a news release, the director announced, “The
members of our class of 1988 enjoyed resounding success. Last year’s mean income of its members was
$189,952!”
a.) What would be a statistical advantage of using the median of the reported incomes, rather than the mean,
as the estimate of the typical income?
b.) The director felt the members who attended the reception may be different from the class as a whole. A
more detailed survey of the class was planned to find a better estimate of the income as well as other facts
about the alumni. The staff developed two methods based on the available funds to carry out the survey.
Method 1: Send out an e-mail to all 6,826 members of the class asking them to complete an online form.
The staff estimates that at least 600 members will respond.
Method 2: Select a simple random sample of members of the class and contact the selected members
directly by phone. Follow up to ensure that all responses are obtained. Because method 2 will require more
time than method 1, the staff estimates that only 100 members of the class could be contacted using method
2.
Which of the two methods would you select for estimating the average yearly income of all 6,826 members
of the class of 1988? Explain your reasoning by comparing the two methods and the effect of each method
on the estimate.
FRQ 1.2.) [APSTATSFRQ2011B-02] People with acrophobia (fear of heights) sometimes enroll in
therapy sessions to help them overcome this fear. Typically, seven or eight therapy sessions are needed
before improvement is noticed. A study was conducted to determine whether the drug D-cycloserine, used
in combination with fewer therapy sessions, would help people with acrophobia overcome this fear.
Each of 27 people who participated in the study received a pill before each of two therapy sessions.
Seventeen of the 27 people were randomly assigned to receive a D-cycloserine pill, and the remaining 10
people received a placebo. After the two therapy sessions, none of the 27 people received additional pills or
therapy. Three months after the administration of the pills and the two therapy sessions, each of the 27
people was evaluated to see if he or she had improved.
a.) Was this study an experiment or an observational study? Provide an explanation to support your answer.
b.) When the data were analyzed, the D-cycloserine group showed statistically significantly more
improvement than the placebo group did. Based on this result, would the researchers be justified in
concluding that the D-cycloserine pill and two therapy sessions are as beneficial as eight therapy sessions
without the pill? Justify your answer.
c.) A newspaper article that summarized the results of this study did not explain how it was determined
which people received D-cycloserine and which received the placebo. Suppose the researchers allowed the
therapists to choose which people received D-cycloserine and which received the placebo, and no
randomization was used. Explain why such a method of assignment might lead to an incorrect conclusion.
FRQ 1.3.) [APSTATSFRQ2015-01] Two large corporations, A and B, hire many new college graduates as
accountants at entry-level positions. In 2009 the starting salary for an entry-level accountant position was
$36,000 a year at both corporations. At each corporation, data were collected from 30 employees who were
hired in 2009 as entry-level accountants and were still employed at the corporation five years later. The
yearly salaries of the 60 employees in 2014 are summarized in the boxplots below.
a.) Write a few sentences comparing the distributions of the yearly salaries at the two
corporations.
b.) Suppose both corporations offered you a job for $36,000 a year as an entry-level accountant.
(i) Based on the boxplots, give one reason why you might choose to accept the job at corporation A.
(ii) Based on the boxplots, give one reason why you might choose to accept the job at corporation B.
FRQ 1.4.) [APSTATSFRQ2013-06, Investigative Task] Tropical storms in the Pacific Ocean with
sustained winds that exceed 74 miles per hour are called typhoons. Graph A below displays the number of
recorded typhoons in two regions of the Pacific Ocean—the Eastern Pacific and the Western Pacific—for
the years from 1997 to 2010.
a.) Compare the distributions of yearly frequencies of typhoons for the two regions of the Pacific Ocean for
the years from 1997 to 2010.
b.) For each region, describe how the yearly frequencies changed over the time period from 1997 to 2010.
A moving average for data collected at regular time increments is the average of data values for two or
more consecutive increments. The 4-year moving averages for the typhoon data are provided in the table
below. For example, the Eastern Pacific 4-year moving average for 2000 is the average of 22, 16, 15, and
21, which is equal to 18.50.
c.) Show how to calculate the 4-year moving average for the year 2010 in the Western Pacific. Write your
value in the appropriate place in the table.
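The definition above can be checked with a short Python sketch. It reproduces only the worked example from the text (22, 16, 15, and 21 averaging to 18.50) and assumes those are the counts for the four years ending in 2000. The function name moving_average is illustrative and not part of the exam.

```python
def moving_average(values, window=4):
    """Average each value with the (window - 1) values before it."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Eastern Pacific counts quoted in the text (assumed to be 1997-2000).
eastern_1997_2000 = [22, 16, 15, 21]
avg_2000 = moving_average(eastern_1997_2000)[0]
print(f"{avg_2000:.2f}")  # 18.50
```

The same function applies to any four consecutive yearly counts read from the table.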
d.) Graph B below shows both yearly frequencies (connected by dashed lines) and the respective 4-year
moving averages (connected by solid lines). Use your answer in part (c) to complete the graph.
e.) Consider graph B.
i) What information is more apparent from the plots of the 4-year moving averages
than from the plots of the yearly frequencies of typhoons?
ii) What information is less apparent from the plots of the 4-year moving averages
than from the plots of the yearly frequencies of typhoons?
Answers
Part 2 – Quarterly Exam Questions
I. SAMPLING
MC I-1.) E. You can only reduce the variability of a sample statistic by increasing the sample size, since
the standard deviation of a sample proportion is $\sqrt{p_0 q_0 / n}$. You cannot reduce the variability of
the population. Bias and the effects of confounding factors are “built in” to the way you collect the
sample, so increasing the sample size may not reduce them.
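As a rough illustration of the sample-size point in MC I-1, the sketch below evaluates the standard formula for the standard deviation of a sample proportion, sqrt(p(1 - p)/n), at two sample sizes. The value p = 0.5 and the sample sizes are hypothetical, chosen only to show that quadrupling n halves the variability.

```python
import math

def proportion_sd(p, n):
    """Standard deviation of a sample proportion for a random sample of size n."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical population proportion and sample sizes, for illustration only.
sd_100 = proportion_sd(0.5, 100)
sd_400 = proportion_sd(0.5, 400)
print(sd_100, sd_400)
```

Nothing in this calculation involves how the sample was collected, which is why a larger n does not remove bias.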
MC I-2.) D. This is typical stratified sampling, applied to homogeneous strata to reduce the sampling
variability for a given sample size.
MC I-3.) E. Jason probably should have chosen the same age groups for both men and women.
MC I-4.) E.
MC I-5.) E. The strata, not arbitrary subsets of the population, are usually chosen for homogeneity. The
individuals in each stratum are selected randomly.
MC I-6.) B. Since the polling firm has decided to survey people by phone, it need not be concerned about
the people it cannot reach. It should only worry about problems that arise once the survey proceeds.
MC I-7.) D. The sample size has to be small enough to ignore the “non-replacement” problem, as in the
example of Skittles.
MC I-8.) A.
MC I-9.) C. The actual distance is a parameter of the experiment, so you can measure the variability of the
parameter.
MC I-10.) E. The advantage of using stratification is to reduce the sampling variability for a given sample
size.
II. DESIGN OF STUDIES
MC II-1.) C
MC II-2.) E
MC II-3.) D. This was an observational study, so you cannot claim cause and effect, since there was no
control group for comparison.
MC II-4.) B. Response/non-response bias could be worse than the other types of bias.
MC II-5.) A. Size is associated with distance.
MC II-6.) D. This is actually more of a sampling problem than a design problem. For answer E, you may not
obtain a 20/20/20 division.
MC II-7.) D. Like stratified sampling.
MC II-8.) B. Again, this is actually a sampling problem.
MC II-9.) D. Reduce the effect of confounding/lurking factors.
MC II-10.) A
III. EXPLORING DATA
MC III-1.) D. Look for some statistic among groups to see if it is different/similar. In this case, the statistic
is the ratio of job/population.
MC III-2.) B. Any change in data would affect the mean. For the other “central” measures, it depends on
what is changed.
MC III-3.) A. The association of data with schools was not provided.
MC III-4.) D. Boxplots do not show the total number of data values.
MC III-5.) B. There are no corresponding paired variables, so scatterplots are not appropriate.
MC III-6.) A. This is a cumulative distribution, so the x value at the 50% mark on the y-axis represents
the median. That is, the median is in the 70s. This also means that the scores of the bottom half range
from the 30s to the 70s, and the scores of the upper half range from the 70s to 100.
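Reading a median off a cumulative plot, as in MC III-6, amounts to finding the first x value at which the cumulative percentage reaches 50. A minimal Python sketch follows. The scores and percentages are hypothetical, not the values from the exam's graph, and the function name is illustrative.

```python
def median_from_cumulative(upper_bounds, cumulative_percents):
    """Return the first score at which the cumulative percent reaches 50."""
    for score, percent in zip(upper_bounds, cumulative_percents):
        if percent >= 50:
            return score
    raise ValueError("cumulative percents never reach 50")

# Hypothetical cumulative distribution, not the data from the exam's graph.
scores = [30, 40, 50, 60, 70, 80, 90, 100]
cum_pct = [2, 8, 18, 33, 50, 72, 90, 100]
median_score = median_from_cumulative(scores, cum_pct)
print(median_score)  # 70
```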
MC III-7.) D. The standard deviation “measures” how far the data are from the “center”.
MC III-8.) B. All 5 estimates are closely clustered around zero.
MC III-9.) B. Use the rule of 6 -to-cover-99% of the data: 9601156 .
MC III-10.) E
The following answers/solutions are from College Board. Your answers/solutions could vary.
FRQ 1.1.)
a.) The median is less affected by skewness and outliers than the mean. With a variable such as income, a
small number of very large incomes could dramatically increase the mean but not the median. Therefore,
the median would provide a better estimate of a typical income value.
b.) Method 2 is better than Method 1. A sample obtained from Method 1 could be biased because of the
voluntary nature of the response. It is plausible that class members with larger incomes might be more
likely to return the form than class members with smaller incomes. The mean income for such a sample
would overestimate the mean income of all class members. With Method 2, despite the smaller sample size,
the random selection is likely to result in a sample that is more representative of the entire class and
produce an unbiased estimate of mean yearly income of all class members.
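The point in part (a) about the mean and the median can be seen with a small Python sketch. The incomes are hypothetical, made up only to show the effect of one very large value. They are not data from the class of 1988.

```python
from statistics import mean, median

# Hypothetical yearly incomes, for illustration only.
incomes = [40_000, 45_000, 50_000, 55_000, 60_000]
with_outlier = incomes + [2_000_000]  # add one very large income

print(mean(incomes), median(incomes))
print(mean(with_outlier), median(with_outlier))
```

With the extra income, the mean rises from 50,000 to 375,000 while the median moves only from 50,000 to 52,500.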
FRQ 1.2.)
a.) The study was an experiment because treatments (D-cycloserine or placebo) were imposed by the
researchers on the people with acrophobia.
b.) No, the experiment was designed to compare the D-cycloserine group with a control group that received
the placebo. The researchers can conclude that the D-cycloserine pill and two therapy sessions show
significantly more improvement than a placebo and two therapy sessions. However, there is no basis for
comparison with another group of people with acrophobia who received eight therapy sessions and no pill.
c.) One example is that if the therapists were allowed to choose who received the placebo and who received
D-cycloserine, they might assign the people with more severe acrophobia to one of the groups and the
people with less severe acrophobia to the other group. Thus, the improvement after only two therapy
sessions could be related to the initial severity of the acrophobia rather than to the effects of D-cycloserine.
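For contrast with therapist-chosen groups, random assignment of the 27 participants (17 to D-cycloserine, 10 to placebo) could be sketched in Python as follows. The participant labels 1 to 27, the function name, and the seed are assumptions for illustration. The source does not say how the researchers actually carried out the randomization.

```python
import random

def randomly_assign(participants, n_treatment, seed=None):
    """Split participants at random into a treatment group and a control group."""
    rng = random.Random(seed)
    treatment = set(rng.sample(participants, n_treatment))
    control = [p for p in participants if p not in treatment]
    return sorted(treatment), control

# Hypothetical labels 1..27 standing in for the 27 people in the study.
participants = list(range(1, 28))
drug_group, placebo_group = randomly_assign(participants, 17, seed=0)
print(len(drug_group), len(placebo_group))  # 17 10
```

Because chance alone decides the groups, initial severity tends to be balanced between them rather than tied to the treatment.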
FRQ 1.3.)
FRQ 1.4.)
a.) The Western Pacific Ocean had more typhoons than the Eastern Pacific Ocean in all but one of these
years. The average seems to have been about 31 typhoons per year in the Western Pacific Ocean, which is
higher than the average of about 19 typhoons per year in the Eastern Pacific Ocean. The Western Pacific
Ocean also saw more variability (in number of typhoons per year) than the Eastern Pacific Ocean; for
example, the range of the frequencies for the Western Pacific is about 21 typhoons and only 10 typhoons
for the Eastern Pacific.
b.) The Western Pacific Ocean had a decreasing trend in number of typhoons per year over this time period,
especially from about 2001 through 2010. In contrast, the Eastern Pacific Ocean was fairly consistent in the
number of typhoons per year over this time period, with a slight increasing trend in the later years from
2005 through 2010.
c.)
d.)
e.)
(i) The overall trends across this time period were more apparent with the moving averages than with the
original frequencies. The moving averages reduce variability, making more apparent the overall decreasing
trend in number of typhoons in the Western Pacific Ocean and the slight increasing trend in the number of
typhoons in the Eastern Pacific Ocean.
(ii) The year-to-year variability in number of typhoons is less apparent with the moving averages than with
the original frequencies.
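The smoothing effect described in part (e) can be illustrated with a short Python sketch that compares the range of a series with the range of its 4-year moving averages. The yearly counts are hypothetical, not the typhoon data from graph A.

```python
def moving_average(values, window=4):
    """Average each value with the (window - 1) values before it."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Hypothetical yearly counts, for illustration only.
counts = [30, 22, 35, 25, 33, 21, 31, 24]
smoothed = moving_average(counts)

raw_range = max(counts) - min(counts)
smoothed_range = max(smoothed) - min(smoothed)
print(raw_range, smoothed_range)  # 14 1.5
```

The smoothed series varies far less from year to year, which makes a long-run trend easier to see and individual yearly swings harder to see.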