math117finalreview-fall2014

Upload: sayeesh-kapu

Post on 05-Jul-2018


  • 8/16/2019 Math117FinalReview-Fall2014


    Table for Exercise 7

    Nation          Women in Parliament (%)   Female Economic Activity
    Iceland         33.3                      87
    Australia       28.3                      79
    Canada          24.3                      83
    Japan           10.7                      65
    United States   15.0                      81
    New Zealand     32.2                      81

    MATH 117 - Elements of Statistics

    Fall 2014 Final Exam Review for Chapters 1  –  11, 15

    From the textbook: Statistics: The Art and Science of Learning from Data (3rd ed.), by Alan Agresti and Christine Franklin

    1. UW Student Survey In a University of Wisconsin (UW) 

    study about alcohol abuse among students, 100 of the

    40,858 members of the student body in Madison were 

    sampled and asked to complete a questionnaire. One

    question asked was, “On how many days in the past

    week did you consume at least one alcoholic drink?”

    a. Identify the population and the sample.

    b. For the 40,858 students at UW, one characteristic of

    interest was the percentage who would respond

    “zero” to this question. For the 100 students

    sampled, suppose 29% gave this response. Does this

    mean that 29% of the entire population of UW

    students would make this response? Explain.

    c. Is the numerical summary of 29% a sample statistic, 

    or a population parameter?

    2. Median versus mean sales price of new homes In

    December 2010, the US Census Bureau reported that the median US sales price of new homes was $241,500.

    Would you expect the mean sales price to have been

    higher or lower? Explain.

    3. Female Heights According to a recent report from the

    US National Center for Health Statistics, females

    between 25 and 34 years of age have a bell-shaped

    distribution for height, with a mean of 65 inches and

    standard deviation of 3.5 inches.

    a. Give an interval within which about 95% of the

    heights fall.

    b. What is the height for a female who is 3 standard

    deviations below the mean? Would this be a rather unusual height? Why?

    4. High School Graduation Rates The distribution of high

    school graduation rates in the United States in 2004 had

    a minimum value of 78.3 (Texas), first quartile of 83.6,

    median of 87.2, third quartile of 88.8, and maximum

    value of 92.3 (Minnesota) (Statistical Abstract of the

    United States, 2006).

    a. Report the range and the interquartile range.

    b. Would a box plot show any potential outliers?

    Explain.

    5. Blood Pressure A World Health Organization study (the

    MONICA project) of health in various countries reported

    that in Canada, systolic blood pressure readings have a

    mean 121 and a standard deviation of 16. A reading

    above 140 is considered to be high blood pressure.

    a. What is the z-score for a blood pressure reading of

    140? How is this z-score interpreted? 

    b. The systolic blood pressure values have a bell-

    shaped distribution. Report an interval within which

    about 95% of the systolic blood pressure values fall.

    6. Life After Death for Males and Females In a recent General

    Social Survey, respondents answered the question, “Do you

    believe in a life after death?” The table shows the responses

    cross-tabulated with gender.

    Opinion About Life After Death by Gender 

    Gender  Opinion About Life After Death 

    Yes  No 

    Male  621  187 

    Female  834  145 

    a. Construct a table of conditional proportions.

    b. Summarize results. Is there much difference between

    responses of males and females?

    7. Women in Government and Economic Life The OECD

    (Organization for Economic Cooperation and Development)

    consists of advanced, industrialized countries that accept the principles of representative democracy and a free market

    economy. For the nations outside of Europe that are in the

    OECD, the table shows UN data from 2007 on the percentage

    of seats in parliament held by women and female economic

    activity as a percentage of the male rate.

    a. Treating women in parliament as the response variable,

    prepare a scatterplot and find the correlation. Explain

    how the correlation relates to the trend shown in the

    scatterplot.

    b. Use software or a calculator to find the regression

    equation. Explain why the y-intercept is not meaningful.

    c. Find the predicted value and residual for the United

    States. Interpret the residual.

    d. With UN data for all 23 OECD nations, the correlation 

    between these variables is 0.56. For women in

    parliament, the mean is 26.5% and the standard

    deviation is 9.8%. For female economic activity, the

    mean is 76.8 and the standard deviation is 7.7. Find the

    prediction equation, treating women in parliament as

    the response variable.
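Parts b through d of Exercise 7 ask for software output. One possible way to obtain it, sketched in Python with only the standard library, is shown below. The function name `fit_line` and all variable names are illustrative choices, not part of the original review; the data are copied from the Exercise 7 table and the summary statistics from part d.

```python
import math

# Data from the Exercise 7 table (UN data, 2007).
# x = female economic activity (% of male rate), y = women in parliament (%).
nations = ["Iceland", "Australia", "Canada", "Japan", "United States", "New Zealand"]
x = [87, 79, 83, 65, 81, 81]
y = [33.3, 28.3, 24.3, 10.7, 15.0, 32.2]

def fit_line(xs, ys):
    """Return (intercept, slope, correlation) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((xi - x_bar) ** 2 for xi in xs)
    syy = sum((yi - y_bar) ** 2 for yi in ys)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

intercept, slope, r = fit_line(x, y)

# Part c: predicted value and residual (observed minus predicted) for the U.S.
us = nations.index("United States")
us_predicted = intercept + slope * x[us]
us_residual = y[us] - us_predicted

# Part d: prediction equation from summary statistics alone.
slope_d = 0.56 * 9.8 / 7.7            # b = r * (s_y / s_x)
intercept_d = 26.5 - slope_d * 76.8   # a = y_bar - b * x_bar

print(f"r = {r:.3f}, y-hat = {intercept:.2f} + {slope:.4f} x")
print(f"US: predicted {us_predicted:.1f}, residual {us_residual:.1f}")
print(f"Part d: y-hat = {intercept_d:.2f} + {slope_d:.4f} x")
```

A negative residual means the observed value lies below the regression line, that is, the line predicts more than was observed.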


    8. Predict Crime Using Poverty A recent analysis of data for

    the 50 US states on y = violent crime rate (measured as

    number of violent crimes per 100,000 people in the state)

    and x = poverty rate (percent of people in the state living

    at or below the poverty level) yielded the regression

    equation $\hat{y} = 209.9 + 25.5x$.

    a. Interpret the slope.

    b. The state poverty rates ranged from 8.0 (for Hawaii) to 24.7 (for Mississippi). Over this range, find the

    range of predicted values for the violent crime rate.

    c. Would the correlation between these variables be

    positive or negative? Why?

    9. Football Discipline A large southern university had

    problems with 17 football players being disciplined for

    team rule violations, arrest charges, and possible NCAA

    violations. The online Atlanta Journal Constitution ran a

    poll with the question, “Has the football coach lost control

    over his players?” having possible responses, “Yes, he’s

    been too lenient,” and “No, he can’t control everything

    teenagers do.” 

    a. Was there potential for bias in this study? If so, what

    types of bias?

    b. The poll results after two days were 

    Yes  6012  93% 

    No  487  7% 

    Does this large sample size guarantee that the results are

    unbiased? Explain.

    10. Video games mindless? “Playing video games not so

    mindless.” This was the headline of a CNN news report

    about a study that concluded that young adults who regularly play video games demonstrated better visual

    skills than young adults who do not play regularly. Sixteen

    young men volunteered to take a series of tests that

    measured their visual skills; those who had played video

    games in the previous six months performed better on the

    test than those who hadn’t played.

    a. What are the explanatory and response variables?

    b. Was this an observational study or an experiment? 

    Explain.

    c. Specify a potential lurking variable. Explain.

    11. Peyton Manning Completions As of the end of the 2010 

    NFL season, Indianapolis Colts quarterback Peyton

    Manning, throughout his 13-year career, completed 65%

    of all of his pass attempts. Suppose each pass attempted

    in the next season has probability 0.65 of

    being completed.

    a. Does this mean that if we watch Manning throw 100 

    times in the upcoming season, he would complete

    exactly 65 passes? Explain.

    b. Explain what this probability means in terms of

    observing him over a longer period, say for 1000 passes

    over the course of the next two seasons assuming

    Manning is still at his typical playing level. Would it be

    surprising if his completion percentage over a large

    number of passes differed significantly from 65%?

    12. Driver’s Exam Three 15-year-old friends with no particular

    background in driver’s education decide to take the written

    part of the Georgia Driver’s Exam. Each exam was graded as

    a pass (P) or a failure (F).

    a. How many outcomes are possible for the grades

    received by the three friends together? Using a tree

    diagram, list the sample space.

    b. If the outcomes in the sample space in part a are

    equally likely, find the probability that all three pass the

    exam.

    c. In practice, the outcomes in part a are not equally likely.

    Suppose that statewide 70% of 15-year-olds pass the

    exam. If these three friends were a random sample of

    their age group, find the probability that all three pass.

    d. In practice, explain why probabilities that apply to a

    random sample are not likely to be valid for a sample of

    three friends.

    13. Grandparents Let X = the number of living grandparents

    that a randomly selected adult American has. According to

    recent General Social Surveys, its probability distribution is

    approximately P(0) = 0.71, P(1) = 0.15, P(2) = 0.09, P(3) =

    0.03, P(4) = 0.02.

    a. Does this refer to a discrete or a continuous random 

    variable? Why?

    b. Show that the probabilities satisfy the two conditions for a probability distribution.

    c. Find the mean of this probability distribution. 

    14. Z-score and Tail Probability 

    a. Find the z-score for the number that is less than only 1%

    of the values of a normal distribution. Sketch a graph to

    show where this value is.

    b. Find the z-scores corresponding to the (i) 90th and (ii) 99th

    percentiles of a normal distribution.

    15. Cloning Butterflies The wingspans of recently cloned

    monarch butterflies follow a normal distribution with mean

    9 inches and standard deviation 0.75 inches. What 

    proportion of the butterflies has a wingspan

    a. less than 8 inches?

    b. wider than 10 inches? 

    c. between 8 and 10 inches?

    d. ten percent of the butterflies have a wingspan wider

    than how many inches?
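For readers who want to check Exercises 14 and 15 with software rather than a normal table, the sketch below uses `statistics.NormalDist` from the Python standard library. The variable names are illustrative and not part of the original review.

```python
from statistics import NormalDist

# Exercise 14: z-scores for upper-tail and percentile questions
# on the standard normal distribution.
standard = NormalDist()            # mean 0, standard deviation 1
z_90 = standard.inv_cdf(0.90)      # 90th percentile
z_99 = standard.inv_cdf(0.99)      # 99th percentile (only 1% of values are larger)

# Exercise 15: wingspans are normal with mean 9 and standard deviation 0.75.
wingspan = NormalDist(mu=9, sigma=0.75)
p_less_8 = wingspan.cdf(8)                        # part a
p_more_10 = 1 - wingspan.cdf(10)                  # part b
p_between = wingspan.cdf(10) - wingspan.cdf(8)    # part c
top_10_cutoff = wingspan.inv_cdf(0.90)            # part d: 10% are wider than this

print(f"z(90th) = {z_90:.3f}, z(99th) = {z_99:.3f}")
print(f"P(X < 8) = {p_less_8:.4f}, P(X > 10) = {p_more_10:.4f}")
print(f"P(8 < X < 10) = {p_between:.4f}, cutoff = {top_10_cutoff:.2f} inches")
```

Because 8 and 10 are the same distance from the mean, parts a and b give the same probability by symmetry.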


    16. Exam Performance An exam consists of 50 multiple-

    choice questions. Based on how much you studied, for any

    given question you think you have a probability of p = 0.70

    of getting the correct answer. Consider the sampling

    distribution of the sample proportion of the 50 questions

    on which you get the correct answer.

    a. Find the mean and standard deviation of the sampling

    distribution of this proportion.

    b. What do you expect for the shape of the sampling

    distribution? Why?

    c. If truly p = 0.70, would it be very surprising if you got

    correct answers on only 60% of the questions? Justify

    your answer by using the normal distribution to

    approximate the probability of a sample proportion of

    0.60 or less.
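The calculation in Exercise 16 can be sketched in Python as follows. This is a minimal illustration that assumes the normal approximation requested in part c; the variable names are not from the original review.

```python
import math
from statistics import NormalDist

p = 0.70   # probability of a correct answer on any one question
n = 50     # number of questions

# Part a: mean and standard deviation of the sampling distribution
# of the sample proportion.
mean_phat = p
sd_phat = math.sqrt(p * (1 - p) / n)

# Part c: normal approximation to P(sample proportion <= 0.60).
z = (0.60 - mean_phat) / sd_phat
prob_60_or_less = NormalDist().cdf(z)

print(f"mean = {mean_phat}, sd = {sd_phat:.4f}")
print(f"z = {z:.2f}, P(p-hat <= 0.60) = {prob_60_or_less:.4f}")
```

Whether the resulting probability counts as "very surprising" is a judgment call that the exercise asks the reader to justify.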

    17. Aunt Erma’s Restaurant In Example 5 about Aunt Erma’s

    Restaurant, the daily sales follow a probability distribution

    that has a mean of µ = $900 and a standard deviation of σ 

    = $300. This past week the daily sales for the seven days 

    had a mean of $980 and a standard deviation of $276.

    a. Identify the mean and standard deviation of the

    population distribution.

    b. Identify the mean and standard deviation of the data 

    distribution. What does the standard deviation

    describe?

    c. Identify the mean and the standard deviation of the 

    sampling distribution of the sample mean for samples

    of seven daily sales. What does this standard

    deviation describe?

    18. Approval Rating for President Obama A July 2011 Gallup

    poll based on the responses of 1500 adults indicated that 46% of Americans approve of the job Barack Obama is

    doing as president. One way to summarize the findings of

    the poll is by saying, “It is estimated that 46% of Americans

    approve of the job Barack Obama is doing as president.

    This estimate has a margin of error of plus or minus 3%.”

    How could you explain the meaning of this to someone

    who has not taken a statistics course?

    19. Vegetarianism Time magazine (July 15, 2002) quoted a

    poll of 10,000 Americans in which only 4% said they were

    vegetarian.

    a. What has to be assumed about this sample to

    construct a confidence interval for the population proportion of vegetarians?

    b. Construct a 99% confidence interval for the

    population proportion. Explain why the interval is so

    narrow, even though the confidence level is high.

    c. In interpreting this confidence interval, can you

    conclude that fewer than 10% of Americans are

    vegetarians? Explain your reasoning.
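A large-sample confidence interval for a proportion, as needed in Exercise 19 part b, can be sketched like this. The code assumes the sample can be treated as a random sample (the point of part a) and uses illustrative variable names.

```python
import math
from statistics import NormalDist

p_hat = 0.04    # sample proportion of vegetarians
n = 10_000      # sample size
conf = 0.99     # confidence level

# z-score with (1 - conf)/2 probability in each tail.
z = NormalDist().inv_cdf(1 - (1 - conf) / 2)

# Standard error of the sample proportion and margin of error.
se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * se
lower, upper = p_hat - margin, p_hat + margin

print(f"z = {z:.3f}, se = {se:.5f}, margin = {margin:.4f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```

The standard error has the square root of n in the denominator, so with n = 10,000 the margin of error stays small even at 99% confidence.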

    20. Grandpas Using Email When the GSS asked in 2004,

    “About how many hours per week do you spend sending

    and answering email?” the eight males in the sample of age

    at least 75 responded:

    0, 1, 2, 2, 7, 10, 14, 15 

    a. The TI-83+/84 screen shot shows results of a statistical 

    analysis for finding a 90% confidence interval. Identify

    the results shown and explain how to interpret them.

    TI-83+/84 output: TInterval (2.34, 10.41), $\bar{x} = 6.38$, $S_x = 6.02$, $n = 8$

    b. Find and interpret a 90% confidence interval for the

    population mean.

    c. Explain why the population distribution may be skewed

    right. If this is the case, is the interval you obtained in

    part b useless, or is it still valid? Explain.

    21. US Popularity In 2007, a poll conducted for the BBC of

    28,389 adults in 27 countries found that the United States had fallen sharply in world esteem since 2001

    (www.globescan.com). The United States was rated third

    most negatively (after Israel and Iran), with 30% of those

    polled saying they had a positive image of the United States.

    a. In Canada, for a random sample of 1008 adults, 56%

    said the United States is mainly a negative influence in 

    the world. True or false: The 99% confidence interval of

    (0.52, 0.60) means that we can be 99% confident that

    between 52% and 60% of the population of all Canadian

    adults have a negative image of the United States.

    b. In Australia, for a random sample of 1004 people, 66%

    said the United States is mainly a negative influence in

    the world. True or false: The 95% confidence interval of (0.63, 0.69) means that for a random sample of 100

    people, we can be 95% confident that between 63 and

    69 people in the sample have a negative image of the

    United States. 

    22. Driving After Drinking In December 2004, a report based on

    the National Survey on Drug Use and Health estimated that

    20% of all Americans of ages 16 to 20 drove under the

    influence of drugs or alcohol in the previous year (AP,

    December 30, 2004). A public health unit in Wellington, New

    Zealand, plans a similar survey for young people of that age

    in New Zealand. They want a 95% confidence interval to have a margin of error of 0.04.

    a. Find the necessary sample size if they expect results

    similar to those in the United States.

    b. Suppose that in determining the sample size, they use

    the safe approach that sets $\hat{p} = 0.50$ in the formula for n. Then, how many records need to be sampled?

    Compare this to the answer in part a. Explain why it is

    better to make an educated guess about what to expect

    for $\hat{p}$, when possible.
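The sample-size formula for estimating a proportion, $n = \hat{p}(1 - \hat{p})\,z^2 / m^2$, can be tried out with a short Python sketch. The helper name `sample_size_for_proportion` is hypothetical, and rounding up to the next whole number is the usual convention assumed here.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p_guess, margin, conf=0.95):
    """Smallest n giving the requested margin of error for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(p_guess * (1 - p_guess) * z ** 2 / margin ** 2)

# Part a: educated guess based on the U.S. result of 20%.
n_educated = sample_size_for_proportion(0.20, 0.04)

# Part b: "safe" approach with p-hat = 0.50, which maximizes p(1 - p).
n_safe = sample_size_for_proportion(0.50, 0.04)

print(n_educated, n_safe)
```

Comparing the two results shows how much extra sampling the safe approach costs when a reasonable guess is available.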


    23. Mean property tax. A tax assessor wants to estimate the

    mean property tax bill for all homeowners in Madison,

    Wisconsin. A survey 10 years ago got a sample mean and

    standard deviation of $1400 and $1000.

    a. How many tax records should the tax assessor

    randomly sample for a 95% confidence interval for the

    mean to have a margin of error equal to $100? What assumption does your solution make?

    b. In reality, suppose that they’d now get a standard 

    deviation equal to $1500. Using the sample size you

    derived in part a, without doing any calculation,

    explain whether the margin of error for a 95%

    confidence interval would be less than $100, equal to

    $100 or more than $100.

    c. Refer to part b. Would the probability that the sample

    mean falls within $100 of the population mean be less

    than 0.95, equal to 0.95 or greater than 0.95? Explain.

    24. H0 or Ha? For each of the following hypotheses, explain

    whether it is a null hypothesis or alternative hypothesis:

    a. For females, the population mean on the political

    ideology scale is equal to 4.0.

    b. For males, the population proportion who support the

    death penalty is larger than 0.50.

    c. The diet has an effect; the population mean change

    in weight is less than 0.

    d. For all subway sandwich stores worldwide, the

    difference between sales this month and in the

    corresponding month last year has been a mean of 0.

    25. ESP A person who claims to possess extrasensory

    perception (ESP) says she can guess more often than not the outcome of a flip of a balanced coin. Out of 20 flips,

    she guesses correctly 12 times. Would you conclude that

    she truly has ESP? Answer by reporting all five steps of a

    significance test of the hypothesis that each of her guesses

    has probability 0.50 of being correct against the

    alternative that corresponds to her having ESP. 

    26. Jurors and gender A jury list contains the names of all

    individuals who may be called for jury duty. The

    proportion of the available jurors on the list who are

    women is 0.53. If 40 people are selected to serve as

    candidates for being picked on the jury, show all steps of a

    significance test of the hypothesis that the selections are

    random with respect to gender.

    a. Set up notation and hypotheses, and specify

    assumptions.

    b. 5 of the 40 selected were women. Find the test 

    statistic.

    c. Report the P-value, and interpret.

    d. Explain how to make a decision using a significance 

    level of 0.01.
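The test statistic and P-value in Exercise 26 can be computed with a few lines of Python. This sketch assumes a two-sided alternative (the selections are not random with respect to gender) and the large-sample z test for a proportion; the variable names are illustrative.

```python
import math
from statistics import NormalDist

p0 = 0.53     # proportion of women on the jury list (null hypothesis value)
n = 40        # number of people selected
women = 5     # number of women among those selected

p_hat = women / n

# Standard error computed under the null hypothesis.
se0 = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0

# Two-sided P-value: probability of a z-score at least this extreme.
p_value = 2 * NormalDist().cdf(-abs(z))

# Part d: reject H0 when the P-value is at most the significance level.
reject_at_01 = p_value <= 0.01

print(f"p-hat = {p_hat}, se0 = {se0:.4f}, z = {z:.2f}")
print(f"P-value = {p_value:.2e}, reject H0 at 0.01: {reject_at_01}")
```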

    27. Type I and Type II errors Refer to the previous exercise.

    a. Explain what Type I and Type II errors mean in the

    context of that exercise.

    b. If you made an error with the decisions in part d, is it a

    Type I or Type II error?

    28. Tennis balls in control? When it is operating correctly, a machine for manufacturing tennis balls produces balls with a

    mean weight of 57.6 grams. The last eight balls

    manufactured had weights

    57.3, 57.4, 57.2, 57.5, 57.4, 57.1, 57.3, 57.0 

    a. Using a calculator or software, find the test statistic and

    P-value for a test of whether the process is in control

    against the alternative that the true mean of the

    process now differs from 57.6.

    b. For significance level of 0.05, explain what you would

    conclude. Express your conclusion so it would be

    understood by someone who never studied statistics.

    c. If your decision in part b is in error, what type of error

    have you made?

    29. Wage claim false? Management claims that the mean

    income for all senior-level assembly-line workers in a large

    company equals $500 per week. An employee decides to

    test the claim, believing that it is actually less than $500. For

    a random sample of nine employees, the incomes are:

    430, 450, 450, 440, 460, 420, 430, 450, 440. 

    Conduct a significance test of whether the population mean

    income equals $500 per week against the alternative that it is

    less. Include all assumptions, the hypotheses, test statistic,

    P-value, and interpret the results in context.

    30. Legal trial errors Consider the analogy discussed in Section

    9.4 between making a decision about a null hypothesis in a

    significance test and making a decision about the innocence

    or guilt of a defendant in a criminal trial.

    a. Explain the difference between Type I and Type II errors

    in the trial setting.

    b. In this context, explain intuitively why decreasing the

    chance of Type I error increases the chance of Type II

    error.


    31. Gender and belief in afterlife. This table shows results from

    the 2008 General Social Survey on gender and whether or

    not one believes in an afterlife.

    Belief in Afterlife 

    Gender Yes No Total

    Female 599  111  710 

    Male  425  168  593 

    a. Denote the population proportion who believe in an

    afterlife by p1 for females and by p2 for males.

    Estimate p1, p2 and ( p1 –  p2 ).

    b. Find the standard error for the estimate of ( p1 –  p2 ). 

    Interpret. 

    c. Construct a 95% confidence interval for ( p1 –  p2 ). Can

    you conclude which of p1 and p2 is larger? Explain.

    d. Suppose that, unknown to us, p1 = 0.81 and p2 = 0.72.

    Does the confidence interval in part c contain the 

    parameter it is designed to estimate? Explain. 

    32. Belief depend on gender? Refer to the previous exercise.

    a. Find the standard error of $(\hat{p}_1 - \hat{p}_2)$ for a test of $H_0\colon p_1 = p_2$.

    b. For a two-sided test, find the test statistic and P-value,

    and make a decision using significance level 0.05.

    Interpret.

    c. Suppose that actually p1 = 0.81 and p2 = 0.72. Was the

    decision in part b in error?

    d. State the assumption on which the methods in this

    exercise are based.
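Exercises 31 and 32 use two different standard errors for the same difference of proportions: an unpooled one for the confidence interval and a pooled one for the test of $H_0\colon p_1 = p_2$. The Python sketch below shows both side by side; the variable names are illustrative, and the counts come from the Exercise 31 table.

```python
import math
from statistics import NormalDist

# Counts from the table: believers in an afterlife and group sizes.
yes_f, n_f = 599, 710   # females
yes_m, n_m = 425, 593   # males

p1 = yes_f / n_f
p2 = yes_m / n_m
diff = p1 - p2

# Exercise 31: unpooled standard error and 95% confidence interval.
se = math.sqrt(p1 * (1 - p1) / n_f + p2 * (1 - p2) / n_m)
z95 = NormalDist().inv_cdf(0.975)
ci = (diff - z95 * se, diff + z95 * se)

# Exercise 32: pooled standard error under H0 and two-sided test.
pooled = (yes_f + yes_m) / (n_f + n_m)
se0 = math.sqrt(pooled * (1 - pooled) * (1 / n_f + 1 / n_m))
z = diff / se0
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, difference = {diff:.3f}")
print(f"se = {se:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"pooled = {pooled:.3f}, se0 = {se0:.4f}, z = {z:.2f}, P-value = {p_value:.2e}")
```

An interval that lies entirely above 0 points to a larger population proportion for females, which agrees with a small P-value in the two-sided test.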

    33. Heavier horseshoe crabs more likely to mate? A study of a sample of horseshoe crabs on a Florida island (J. Brockman, Ethology, vol. 102, 1996, pp. 1-21) investigated the factors that were associated with whether or not female crabs had a male crab mate. Basic statistics, including the five-number summary, on weight (kg) for the 111 female crabs who had a male crab nearby and for the 62 female crabs who did not have a male nearby are given in the table. Assume that these horseshoe crabs have the properties of a random sample of all such crabs.

    Summary Statistics for Weights of Horseshoe Crabs

    Group     n     Mean   Std. Dev.   Min   Q1    Med   Q3    Max
    Mate      111   2.6    0.6         1.5   2.2   2.6   3.0   5.2
    No Mate   62    2.1    0.4         1.2   1.8   2.1   2.4   3.2

    a. Sketch box plots for the weight distributions of the two groups. Interpret by comparing the groups with respect to the shape, center, and variability.

    b. Estimate the difference between the mean weights of the female crabs who have mates and those who don't.

    c. Find the standard error for the estimate in part b.

    d. Construct a 90% confidence interval for the difference between the population mean weights, and interpret.

    34. Sex roles A study of the effect of the gender of the tester on the sex-role differentiation scores in Manhattan gave a random sample of preschool children the Occupational Preference Test. Children were asked to give three choices of what they wanted to be when they grew up. Each occupation was rated on a scale from 1 (traditionally feminine) to 5 (traditionally masculine), and a child’s score was the mean of the three selections. When the tester was male, the 50 girls had $\bar{x} = 2.9$ and s = 1.4, whereas when the tester was female, the 90 girls had $\bar{x} = 3.2$ and s = 1.2. Show all steps of a test of the hypothesis that the population mean is the same for the female and male testers, against the alternative that they differ. Report the P-value and interpret.

    35. Internet book prices Anna’s project for her introductory statistics course was to compare the selling prices of textbooks at two Internet bookstores. She first took a random sample of 10 textbooks used that term in courses at her college, based on the list of texts compiled by the college bookstore. The prices of those textbooks at two Internet sites were

    Site A: $115, $79, $43, $140, $99, $30, $80, $99, $119, $6

    Site B: $110, $79, $40, $129, $99, $30, $69, $99, $109, $6

    a. Are these independent samples or dependent samples? Justify your answer.

    b. Find the mean for each sample. Find the mean of the difference scores. Compare, and interpret.

    c. Using software or a calculator, construct a 90% confidence interval comparing the population mean prices of all the textbooks used that term at her college. Interpret.

    36. Comparing book prices 2 For the data in the previous exercise, use software or a calculator to perform a significance test comparing the population mean prices. Show all steps of the test, and indicate whether you would conclude that the mean price is lower at one of the two Internet bookstores.


    37. Down syndrome diagnostic test The table shown, from

    Example 8 in Chapter 5, cross-tabulates whether a fetus

    has Down syndrome by whether or not the triple blood

    diagnostic test for Down syndrome is positive (that is,

    indicates that the fetus has Down syndrome).

    a. Tabulate the conditional distributions for the blood

    test result given the true Down syndrome status.

    b. For the Down cases, what percentage was diagnosed

    as positive by the diagnostic test? For the unaffected

    cases, what percentage got a negative test result?

    Does the diagnostic test appear to be a good one?

    c. Construct the conditional distribution on Down

    syndrome status, for those who have a positive test

    result. (Hint : You condition on the first column total

    and find proportions in that column.) Of those cases,

    what percentage truly have Down syndrome? Is the

    result surprising? Explain why this probability is

    small.

    Blood Test Result

    Down Syndrome Status   Positive   Negative   Total
    D (Down)               48         6          54
    D^c (unaffected)       1307       3921       5228
    Total                  1355       3927       5282

    38. Down and chi-squared For the data in the previous exercise, $X^2 = 114.4$. Show all steps of the chi-squared test of independence.

    39. Gender gap? Exercise 11.1 showed a 2 x 3 table relating gender and political party identification, shown again here. The chi-squared statistic for these data equals 8.294. Conduct all five steps of the chi-squared test.

    Political Party

    Sex   Dem   Indep   Repub
    F     422   381     273
    M     299   365     232

    40. Tanning experiment Suppose the tanning experiment described in Examples 1 and 2 used only four participants, two for each treatment.

    a. Show the six possible ways the four ranks could be allocated, two to each treatment, with no ties.

    b. For each possible sample, find the mean rank for each treatment and the difference between the mean ranks.

    c. Presuming H0 is true of identical treatment effects, construct the sampling distribution of the difference between the sample mean ranks for the two treatments.

    41. Test for tanning experiment Refer to the previous exercise. For the actual experiment, suppose the participants using the tanning studio got ranks 1 and 2 and the participants using the tanning lotion got ranks 3 and 4.

    a. Find and interpret the P-value for the alternative hypothesis that the tanning studio tends to give better tans than the tanning lotion.

    b. Find and interpret the P-value for the alternative hypothesis that the treatments have different effects.

    c. Explain why it is a waste of time to conduct this experiment if you plan to use a 0.05 significance level to make a decision.


    Answers

    1. UW Student Survey

    a.  The population is the entire UW student body of 40,858. The sample is the 100 students who were asked to complete the questionnaire.

    b.  Not necessarily. It is quite possible that the sample of 100 is not exactly representative of the whole student body, so 29% is only an estimate of the percentage of all students who would respond this way. It is unlikely that any single sample of 100 would have a percentage exactly equal to that of the entire population.

    c.  The numerical summary is a sample statistic because it summarizes a sample, not the population.

    2.  Median versus mean sales price of new homes

    We would expect the mean sales price to have been higher due to the distribution being skewed to the right. A few very expensive

    homes will greatly affect the mean, but not the median sales price.

    3. Female Heights

    a.  According to the Empirical Rule, about 95% of the observations in a bell-shaped distribution fall within two standard deviations of the mean.

    $\bar{x} - 2s = 65 - 2(3.5) = 58$

    $\bar{x} + 2s = 65 + 2(3.5) = 72$

    Thus, about 95% of the heights fall between 58 and 72 inches.

    b.  The height for a woman who is three standard deviations below the mean is 54.5 inches.

    $\bar{x} - 3s = 65 - 3(3.5) = 54.5$

    This is on the cusp of what would be considered an outlier according to the z-score criterion. Observations that are more than three standard deviations from the mean are considered to be potential outliers. So, yes, this height borders on unusual.
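The Empirical Rule arithmetic above can be reproduced with a small Python sketch. The helper name `within` is hypothetical, and modeling the bell-shaped distribution as exactly normal (to see how rare a value three standard deviations below the mean would be) is an added assumption, not something the answer key states.

```python
from statistics import NormalDist

mean, sd = 65, 3.5   # female heights in inches

def within(k):
    """Interval of values within k standard deviations of the mean."""
    return (mean - k * sd, mean + k * sd)

low95, high95 = within(2)       # about 95% of heights (Empirical Rule)
three_below = mean - 3 * sd     # height 3 standard deviations below the mean

# Assumption: if heights were exactly normal, this would be the
# proportion of women shorter than three_below.
proportion_below = NormalDist(mean, sd).cdf(three_below)

print(f"About 95% of heights fall between {low95} and {high95} inches")
print(f"3 sd below the mean: {three_below} inches "
      f"(normal-model proportion below: {proportion_below:.4f})")
```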

    4.  High School Graduation Rates

    a.  The range is the difference between the highest and lowest values: $92.3 - 78.3 = 14.0$. The interquartile range (IQR) is the difference between the 75th and 25th percentiles: $\text{IQR} = Q_3 - Q_1 = 88.8 - 83.6 = 5.2$.

    b.  $1.5 \times \text{IQR} = 7.8$. By this criterion, potential outliers are values below $Q_1 - 7.8 = 75.8$ or above $Q_3 + 7.8 = 96.6$. No values lie beyond these limits, so a box plot would not show any potential outliers.
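The 1.5 × IQR check can be written as a short Python sketch starting from the five-number summary given in the exercise. Variable names are illustrative.

```python
# Five-number summary of 2004 high school graduation rates.
minimum, q1, median, q3, maximum = 78.3, 83.6, 87.2, 88.8, 92.3

value_range = maximum - minimum
iqr = q3 - q1

# Fences for flagging potential outliers with the 1.5 x IQR criterion.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# With only the summary available, it is enough to check the extremes.
has_potential_outliers = minimum < lower_fence or maximum > upper_fence

print(f"range = {value_range:.1f}, IQR = {iqr:.1f}")
print(f"fences = ({lower_fence:.1f}, {upper_fence:.1f})")
print(f"potential outliers: {has_potential_outliers}")
```

Checking only the minimum and maximum is sufficient here because every other observation lies between them.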

    5. Blood Pressure

    a.  $z = \frac{x - \bar{x}}{s} = \frac{140 - 121}{16} = 1.19$. A z-score of 1.19 indicates that a blood pressure reading of 140, the cutoff for high blood pressure, falls 1.19 standard deviations above the mean.

    b.  About 95% of all the values in a bell-shaped distribution fall within two standard deviations of the mean, in this case within $2(16) = 32$. Subtracting two standard deviations from the mean and adding two standard deviations to the mean tells us that about 95% of systolic blood pressures fall between $121 - 32 = 89$ and $121 + 32 = 153$.

    6.  Life After Death for Males and Females

    Using 2008 data:

    a.  Opinion about life after death

    Gender Yes No Total

    Male 0.77 0.23 808

    Female 0.85 0.15 979

    b.  Overall, both men and women are more likely to believe in life after death than not, but women are somewhat more likely to do so.
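The conditional proportions in part a come from dividing each count by its row total. A minimal Python sketch of that step, using the counts from the Exercise 6 table and illustrative variable names, follows.

```python
# Counts from the Exercise 6 table: opinion about life after death by gender.
counts = {
    "Male": {"Yes": 621, "No": 187},
    "Female": {"Yes": 834, "No": 145},
}

# Conditional proportions: each count divided by its row (gender) total.
proportions = {}
for gender, row in counts.items():
    total = sum(row.values())
    proportions[gender] = {opinion: count / total for opinion, count in row.items()}

for gender, row in proportions.items():
    print(gender, {opinion: round(value, 2) for opinion, value in row.items()})
```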


    7. Women in Government and Economic Life

    a.  The correlation between women in parliament and female economic activity is 0.745. This is consistent with the positive linear trend evident in the scatterplot, but note that it is largely driven by the point for Japan, which has very low female economic activity (65).

    b.  The regression equation is $\hat{y} = -48.91 + 0.9186x$. Since the y-intercept corresponds to an x value of 0, the y-intercept is not meaningful in this case (female economic activity = 0 is outside the range of the observed data).

    c.  The predicted value for the U.S. is $-48.91 + 0.9186(81) = 25.5$, with $15.0 - 25.5 = -10.5$ as the corresponding residual. The regression equation overestimates the percentage of women in parliament for the U.S. by 10.5 percentage points.

    d.  $b = r\,(s_y / s_x) = 0.56\,(9.8 / 7.7) = 0.7127$ and $a = \bar{y} - b\bar{x} = 26.5 - 0.7127(76.8) = -28.24$. Thus the prediction equation is $\hat{y} = -28.24 + 0.7127x$.

    8. Predict Crime Using Poverty

    a.  The slope of 25.5 indicates that for each one-percentage-point increase in the poverty rate, the predicted violent crime rate increases by 25.5 crimes per 100,000 people statewide.

    b.  The predicted values range from 413.9 to 839.8 crimes per 100,000 people:

    ŷ = 209.9 + 25.5(8.0) = 413.9 and ŷ = 209.9 + 25.5(24.7) = 839.75 (rounds to 839.8).

    c.  The correlation would be positive. We know this because the slope is positive, and the slope and the correlation always have the same sign.
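    The predictions in part b of question 8 follow directly from the fitted line. A small sketch, using the intercept 209.9 and slope 25.5 quoted in the solution (the function name is illustrative):

```python
def predicted_crime_rate(poverty_pct):
    """Predicted violent crimes per 100,000 people for a given poverty rate (%)."""
    return 209.9 + 25.5 * poverty_pct

low = predicted_crime_rate(8.0)    # smallest poverty rate used in part b
high = predicted_crime_rate(24.7)  # largest poverty rate used in part b

print(round(low, 1), round(high, 2))
```

    Because the slope is positive, the prediction always rises as the poverty rate rises, which is the same fact that makes the correlation positive.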

    9.  Football Discipline

    a.  Because this is a volunteer sample, there is the potential for sampling bias, both because the sample is not selected randomly (those who responded might have been those who felt the most strongly) and because of undercoverage (anyone without Internet access would not have been able to participate). There also is potential for response bias because the statements are leading.

    b.  If the sample is biased due to undercoverage and lack of random sampling, it does not matter how big the sample is. It is almost always better to have a small random sample than a large volunteer sample.


    10. 

    Video games mindless? 

    a.  The explanatory variable is history of playing video games, and the response variable is visual skills.

    b.  This was an observational study because the men were not randomly assigned to a treatment (played video games versus hadn’t played); those who already were in these groups were observed.

    c.  One possible lurking variable is reaction time. Excellent reaction times might make it easier, and therefore more fun, to play video games, leading young people to be more likely to play. Excellent reaction times also might lead young men to perform better on tasks measuring visual skills. These young men might have performed better on tasks measuring visual skills regardless of whether they played video games.

    11.  Peyton Manning Completions

    a.  No. It means that in 100 passes we expect to see about 65 completions, but the actual number may vary somewhat.

    b.  If Manning is still at his typical playing level, it would be quite surprising if his completion percentage over a large number of passes differed significantly from 0.65. The more passes he throws, the closer the observed percentage should be to 0.65.

    12. 

    Driver’s Exam

    a.  There are 2 × 2 × 2 = 8 possible outcomes. Construct the tree yourself. An alternative way of presenting the outcomes: letting P = “pass” and F = “fail”, the outcomes are PPP, PPF, PFP, FPP, PFF, FPF, FFP, FFF.

    b.  If the eight outcomes are equally likely, the probability that all three pass the exam is 1/8 = 0.125. This can also be calculated by multiplying the probability that the first would pass (0.5) by the probability that the second would pass (0.5) and by the probability that the third would pass (0.5): 0.5 × 0.5 × 0.5 = 0.125.

    c.  If the three friends were a random sample of their age group, the probability that all would pass is 0.7 × 0.7 × 0.7 = 0.343.

    d.  The probabilities that apply to a random sample are not likely to be valid for a sample of three friends because the three friends are likely to be similar on many characteristics that might affect performance on such a test (e.g., IQ). In addition, it is possible that they studied together.
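    The sample space and the two "all three pass" probabilities in question 12 can be enumerated directly. This is only a sketch; it assumes the three results are independent, which part d explains may not hold for friends, and the helper name is illustrative.

```python
from itertools import product

# All pass/fail sequences for three test takers: PPP, PPF, ..., FFF.
outcomes = ["".join(o) for o in product("PF", repeat=3)]

def prob_all_pass(p_pass, n_people=3):
    """Probability that every person passes, assuming independent results."""
    return p_pass ** n_people

equally_likely = prob_all_pass(0.5)  # part b
random_sample = prob_all_pass(0.7)   # part c

print(outcomes)
print(equally_likely, round(random_sample, 3))
```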

    13.  Grandparents 

    a.  This refers to a discrete random variable because there can only be whole numbers of grandparents. One can’t have 1.78

    grandparents.

    b.  The probabilities satisfy the two conditions for a probability distribution because they each fall between 0 and 1, and the sum of the

    probabilities of all possible values is 1.

    c.  The mean of this probability distribution is 0(0.71) + 1(0.15) + 2(0.09) + 3(0.03) + 4(0.02) = 0.50.

    14.  Z-score and Tail Probability

    a. 

    The z-score that is exceeded by only 1% of the values is greater than 99% of the values. If we look up 0.99 in Table A, we see that the z-score is 2.33.

    b. 

    (i) The z-score that is above 0.90 is 1.28.

    (ii) The z-score that is above 0.99 is 2.33.
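    The Table A lookups in question 14 can be reproduced with the inverse cumulative distribution function of the standard normal. A brief sketch using Python's standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# z-scores with cumulative probability 0.90 and 0.99 below them.
z_90 = std_normal.inv_cdf(0.90)
z_99 = std_normal.inv_cdf(0.99)

print(round(z_90, 2), round(z_99, 2))
```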

    15.  Cloning Butterflies

    a.  z = (8 − 9)/0.75 = −1.33; according to Table A, 0.092 of the butterflies have a wingspan less than 8 inches.

    b.  z = (10 − 9)/0.75 = 1.33; according to Table A, 0.092 of the butterflies have a wingspan wider than 10 inches.

    c.  From parts a and b, 1 − 2(0.092) = 0.816 of the butterflies have wingspans between 8 and 10 inches.

    d.  From Table A, the z-score for the 90th percentile is 1.28, and 1.28 × 0.75 + 9 = 9.96 inches. Thus 10% of the butterflies have a wingspan wider than 9.96 inches.
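    The butterfly calculations can be checked without a table by working with a normal distribution that has mean 9 and standard deviation 0.75, the values used in part d. This is a sketch only; because the table answers round z to two decimals, the software values differ slightly in the third decimal place.

```python
from statistics import NormalDist

wingspan = NormalDist(mu=9, sigma=0.75)

below_8 = wingspan.cdf(8)            # part a
above_10 = 1 - wingspan.cdf(10)      # part b
between = 1 - below_8 - above_10     # part c
pct_90 = wingspan.inv_cdf(0.90)      # part d: 90th percentile in inches

print(round(below_8, 3), round(above_10, 3), round(between, 3), round(pct_90, 2))
```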

    16. 

    Exam Performance

    a.  Mean = p = 0.70, standard error = √(p(1 − p)/n) = √(0.70(1 − 0.70)/50) = 0.0648.

    b.  Since n = 50, by the Central Limit Theorem, we would expect the shape of the sampling distribution to be approximately normal with mean = 0.70 and standard deviation ≈ 0.0648.

    c.  The z-score for 0.60 is (0.60 − 0.70)/0.0648 = −1.54, giving a cumulative probability of 0.06. It would not be surprising to get only 60% of the answers correct.
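    The sampling-distribution calculation in question 16 can be sketched as follows, using p = 0.70 and n = 50 from the solution and a normal approximation for the sample proportion (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

p = 0.70
n = 50

se = sqrt(p * (1 - p) / n)            # standard error of the sample proportion
z = (0.60 - p) / se                   # z-score for a sample proportion of 0.60
prob_at_most_60 = NormalDist().cdf(z) # cumulative probability below 0.60

print(round(se, 4), round(z, 2), round(prob_at_most_60, 2))
```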


    17. 

    Aunt Erma’s Restaurant

    a.  The population distribution has a mean of $900 and a standard deviation of $300.

    b.  The data distribution has a mean of $980 and a standard deviation of $276. The standard deviation of the data distribution describes the spread of the daily sales values for this past week.

    c.  The mean of the sampling distribution of the sample mean is μ = $900; standard error = σ/√n = 300/√7 ≈ 113.4 dollars. The standard error describes the spread of the sample means based on samples of seven days' sales.

    18. 

    Approval Rating for President Obama

    We could tell someone who hadn’t taken a statistics course that we do not know the exact percentage of the population who

    approve of the job Barack Obama is doing as president, but we are quite sure that it is within 3% of 46%, that is, between 43% and

    49%.

    19.  Vegetarianism

    a.  We must assume that the data were obtained randomly.

    b.  se = √(p̂(1 − p̂)/n) = √(0.04(1 − 0.04)/n) ≈ 0.002.

    The confidence interval is p̂ ± z(se). Lower limit: 0.04 − 2.58(0.002) = 0.035. Upper limit: 0.04 + 2.58(0.002) = 0.045. The interval is so narrow, even though the confidence level is high, mainly because of the very large sample size. The very large sample size contributes to a small standard error by providing a very large denominator for the standard error calculation.

    c.  We can conclude that fewer than 10% of Americans are vegetarians because 10% falls above the highest believable value in the

    confidence interval.

    20.  Grandpas Using Email

    a.  The first result is a 90% confidence interval for the mean hours spent per week sending and answering e-mail for males at least age 75. The sample mean, x̄, is listed as 6.38 hours. Thus, the estimated mean time spent per week sending and answering e-mail for males at least age 75 is 6.38 hours. The sample standard deviation is 6.02. This quantity estimates the population standard deviation, which tells us how far we can expect a typical observation to vary from the mean. These estimates are based on a sample of size 8.

    b.  The confidence interval is 2.34 to 10.41. We can be 90% confident that the population mean number of hours spent per week sending and answering e-mail for males at least age 75 is between 2.34 and 10.41 hours.

    c.  Since there are likely to be a lot of men over the age of 75 who do not use e-mail but also some who use e-mail regularly, this distribution is likely skewed right. Since the t-distribution is robust to violations of the normality assumption, the interval is still valid.

    21.  US Popularity

    a.  True

    b.  False

    22.  Driving After Drinking

    a.  n = [p̂(1 − p̂)]z²/m² = [0.2(1 − 0.2)](1.96)²/(0.04)² = 385.

    b.  n = [p̂(1 − p̂)]z²/m² = [0.5(1 − 0.5)](1.96)²/(0.04)² = 601. This is larger than the answer in a. If we can make an educated guess about what to expect for the proportion, we can use a smaller sample size, saving possibly unnecessary time and money.
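    The sample-size formula n = p̂(1 − p̂)z²/m² in question 22 can be sketched in code. The inputs below are assumptions rather than values stated in the solution text: a guessed proportion of 0.20 for part a, the conservative 0.50 for part b, a margin of error of 0.04, and z = 1.96 for 95% confidence. They are chosen because they reproduce the answers 385 and 601. The function name is illustrative.

```python
import math

def sample_size_for_proportion(p_guess, margin, z=1.96):
    """Smallest n giving the requested margin of error for a proportion."""
    return math.ceil(p_guess * (1 - p_guess) * z**2 / margin**2)

n_with_guess = sample_size_for_proportion(0.20, 0.04)    # assumed inputs, part a
n_conservative = sample_size_for_proportion(0.50, 0.04)  # assumed inputs, part b

print(n_with_guess, n_conservative)
```

    Rounding up with `math.ceil` matters here: rounding to the nearest integer could give a sample slightly too small to meet the margin of error.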

    23.  Mean property tax 

    a.  n = σ²z²/m² = (1000)²(1.96)²/(100)² = 385. The solution makes the assumption that the standard deviation will be similar now.

    b.  The margin of error would be more than $100 because the standard error would be larger than predicted.

    c.  With a larger margin of error, the 95% confidence interval is wider; thus, the probability that the sample mean is within $100 (which is less than the margin of error from b) of the population mean is less than 0.95.


    24. 

    H0 or Ha?

    a. 

    null hypothesis

    b.  alternative hypothesis

    c.  alternative hypothesis

    d.  null hypothesis

    25.  ESP 

    1.  Assumptions: The data are categorical (correct vs. incorrect guesses) and are obtained randomly. The expected numbers of successes and failures are less than 15 under the null hypothesis: np0 = (20)(0.5) = 10 < 15 and n(1 − p0) = (20)(0.5) = 10 < 15, so this test is approximate.

    2.  Hypotheses: H0: p = 0.5; Ha: p > 0.5

    3.  Test statistic: z = (0.60 − 0.50)/√(0.5(1 − 0.5)/20) = 0.89

    4.  P-value: 0.19

    5.  Conclusion: If the null hypothesis were true, the probability would be 0.19 of getting a test statistic at least as extreme as the value observed. There is no strong evidence that the population proportion of correct guesses is higher than 0.50.
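    The five steps of the ESP test reduce to a short calculation. The sketch below assumes 12 correct guesses out of 20 (a sample proportion of 0.60), which is not stated in the solution text but reproduces the test statistic 0.89 and the P-value 0.19.

```python
from math import sqrt
from statistics import NormalDist

n = 20
p0 = 0.5          # null-hypothesis proportion
p_hat = 12 / 20   # assumed sample proportion (12 correct guesses)

se0 = sqrt(p0 * (1 - p0) / n)       # standard error under H0
z = (p_hat - p0) / se0              # test statistic
p_value = 1 - NormalDist().cdf(z)   # right-tail probability for Ha: p > 0.5

print(round(z, 2), round(p_value, 2))
```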

    26.  Jurors and gender

    a.  The proportion of jurors who are women.

    Hypotheses: H0: p = 0.53; Ha: p ≠ 0.53

    b.  Test statistic: z = (p̂ − 0.53)/√(0.53(1 − 0.53)/n) = −5.1

    c.  P-value is 0.000. If the null hypothesis were true, the probability would be almost 0 of getting a test statistic at least as extreme as the value observed.

    d.  This P-value is smaller than the significance level of 0.01. We can reject the null hypothesis; we have strong evidence that women are not being selected in numbers proportionate to their representation in the jury pool.

    27.  Type I and Type II errors

    a.  In the previous exercise, a Type I error would have occurred if we had rejected the null hypothesis, concluding that women were passed over for jury duty, when they really were not. A Type II error would have occurred if we had failed to reject the null hypothesis, but women were picked disproportionately to their representation in the jury pool.

    b. 

    If we made an error, it was a Type I error.

    28.  Tennis balls in control?

    a.  Software indicates a test statistic of −5.5 and a P-value of 0.001.

    b.  For a significance level of 0.05, we conclude that the process is not in control. The machine is producing tennis balls that weigh less than they are supposed to.

    c.  If we rejected the null hypothesis when in fact it is true, we have made a Type I error and concluded that the process is not in control when it actually is.

    d. 

    29.  Wage claim false?

    1.  Assumptions: The data are quantitative. The data seem to have been produced using randomization. We also assume an approximately normal population distribution.

    2.  Hypotheses: H0: μ = 500; Ha: μ < 500

    3.  The sample mean is 441.11, and the standard deviation is 12.69.

    Test statistic: t = (441.11 − 500)/(12.69/√9) = −13.9

    4.  P-value: 0.000

    5.  Conclusion: If the null hypothesis were true, the probability would be almost 0 of getting a test statistic at least as extreme as the value observed. There is extremely strong evidence that the population mean is less than 500; we can conclude that the mean income is less than $500 per week.
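    The test statistic for the wage claim can be recomputed from the reported summary statistics. The sketch assumes a sample size of n = 9, which is not stated in the solution text but is the value that reproduces a statistic of about −13.9 from the mean 441.11 and standard deviation 12.69.

```python
from math import sqrt

sample_mean = 441.11
sample_sd = 12.69
mu_0 = 500   # value of the mean under H0
n = 9        # assumed sample size

se = sample_sd / sqrt(n)       # standard error of the sample mean
t = (sample_mean - mu_0) / se  # one-sample t statistic

print(round(se, 2), round(t, 1))
```

    The P-value comes from a t distribution with n − 1 degrees of freedom, which the standard library does not provide; a table or statistical software would be needed for that step.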


    30. 

    Legal trial errors

    a.  A Type I error in a trial setting would occur if we convicted a defendant who was not guilty. A Type II error would occur if we failed to convict a guilty defendant.

    b.  To decrease the chance of a Type I error, we would decrease the significance level. In doing this, it is more difficult to reject the null hypothesis (i.e., find someone guilty). Thus, there will be more guilty people who are not found guilty, a Type II error.

    31.  Gender and belief in afterlife. 

    a.  The sample proportions who report that they believe in an afterlife are p̂1 = 0.8437 for females and p̂2 = 0.7167 for males, and the difference between females and males is 0.8437 − 0.7167 = 0.127.

    b.  The standard error for the estimate of (p1 − p2) is

    se = √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2) = √(0.8437(1 − 0.8437)/n1 + 0.7167(1 − 0.7167)/n2) = 0.0230.

    This is the estimated standard deviation of the difference between the sample proportions of females and males for samples of these sizes.

    c.  The confidence interval is (p̂1 − p̂2) ± z(se); lower endpoint: 0.127 − 1.96(0.0230) = 0.0819; upper endpoint: 0.127 + 1.96(0.0230) = 0.172. The confidence interval is (0.0819, 0.172). Because 0 is less than all plausible values in the confidence interval, we can conclude that the proportion of females who report that they believe in an afterlife is higher than the proportion of males.

    d.  The difference between these population proportions, 0.09, is in the confidence interval. The confidence interval in part c contains the parameter it is designed to estimate.
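    The interval in part c and the check in part d of question 31 can be sketched as follows, taking the two sample proportions and the standard error 0.0230 as given in the solution (variable names are illustrative):

```python
from statistics import NormalDist

p_hat_female = 0.8437
p_hat_male = 0.7167
se = 0.0230  # standard error reported in part b

z = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
diff = p_hat_female - p_hat_male
lower = diff - z * se
upper = diff + z * se

# Part d: does the interval capture the population difference of 0.09?
captures_true_difference = lower <= 0.09 <= upper

print(round(lower, 4), round(upper, 3), captures_true_difference)
```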

    32.  Belief depend on gender 

    a.  se0 = √(p̂(1 − p̂)(1/n1 + 1/n2)) = √(0.7859(1 − 0.7859)(1/n1 + 1/n2)) = 0.0228.

    b.  z = ((p̂1 − p̂2) − 0)/se0 = 0.127/0.0228 = 5.57; P-value ≈ 0. If the null hypothesis were true, the probability would be approximately 0 of getting a test statistic at least as extreme as the value observed. Therefore, we reject the null hypothesis and conclude that the population proportions believing in an afterlife are different for females and males.

    c.  If the population difference were 0.81 − 0.72 = 0.09, our decision would have been correct.

    d.  The assumptions on which the methods in this exercise are based are independent random samples for the two groups and at least 5 successes and 5 failures in each group.

    33. 

    Heavier horseshoe crabs more likely to mate? 

    a.  Construct two boxplots yourself, on the same scale, of weight for female crabs that have mates and that do not have mates. Then conclude that the female crabs have a higher median and a bigger spread if they had a mate than if they did not have a mate. The distribution for female crabs with a mate is right-skewed, whereas the distribution for female crabs without a mate is symmetrical.

    b.  The estimated difference between the mean weights of female crabs that have mates and that do not have mates is 2.6 − 2.1 = 0.5.

    c.  se = √(s1²/n1 + s2²/n2) = 0.076

    d.  (x̄1 − x̄2) ± t.05(se). Because n1 and n2 are large, we will approximate t with the normal distribution using z = 1.645. 0.5 − 1.645(0.076) = 0.375 and 0.5 + 1.645(0.076) = 0.625, giving (0.375, 0.625).

    We can be 90% confident that the difference between the population mean weights of female crabs with and without a mate is between 0.375 and 0.625. Because 0 does not fall in this interval, we can conclude that female crabs with a mate weigh more than do female crabs without a mate.


    34. 

    Sex roles

    1.  Assumptions: The data are quantitative (child’s score); the samples are independent and we will assume that they were obtained randomly; we will assume that the population score distributions are approximately normal for each group.

    2.  Hypotheses: H0: μ1 = μ2; Ha: μ1 ≠ μ2, where group 1 represents the group with the male tester and group 2 represents the group with the female tester.

    3.  se = √(s1²/n1 + s2²/n2) = 0.2349, t = (x̄1 − x̄2)/se = 1.28

    4.  P-value: 0.205

    5.  If the null hypothesis were true, the probability would be 0.205 of getting a test statistic at least as extreme as the value observed. Since the P-value is quite large, there is not much evidence of a difference in the population mean of the children’s scores when the tester is male rather than female.

    35.  Internet book prices

    a.  The samples are dependent because they are prices of the same ten books at two different Internet sites.

    b.  Let group 1 be the prices from Site A and group 2 the prices from Site B. Then x̄1 = $87.30, x̄2 = $83.00, and x̄d = $4.30. The sample mean price for the books from Site A is higher than the sample mean price for the books from Site B. Thus, the sample mean of the differences between the prices from the two sites is positive.

    c.  A 90% confidence interval for μd is given by (1.53, 7.03). Since 0 is less than all the values in the confidence interval, we can conclude that textbooks used at her college are more expensive at Site A than at Site B.

    36. 

    Comparing book prices 2

    1.  Assumptions: the differences in prices are a random sample from a population that is approximately normal.

    2.  H0: μd = 0; Ha: μd ≠ 0

    3.  t = (x̄d − 0)/(sd/√n) = 4.30/(sd/√10) = 2.88

    4.  P-value: 0.02

    5.  If the null hypothesis is true, the probability of obtaining a difference in sample means at least as extreme as the value observed is 0.02. We would reject the null hypothesis and conclude there is a significant difference in the prices of textbooks used at her college between the two sites for α = 0.05 or 0.10, but not for α = 0.01.

    37.  Down syndrome diagnostic test

    a.  BLOOD TEST RESULT

    STATUS             Positive   Negative   Total
    D (Down)           0.89       0.11       54
    Dᶜ (Unaffected)    0.25       0.75       5228

    b.  For the Down cases, 89% were correctly diagnosed. For the unaffected cases, 75% get a negative result. The test seems fairly good, but there are a good number of false positives and false negatives.

    c.  BLOOD TEST RESULT

    STATUS             Positive   Negative
    D (Down)           0.035      0.002
    Dᶜ (Unaffected)    0.965      0.998
    Total              1355       3927

    Of the positive cases, only 0.035 truly have Down syndrome. This result is not surprising because there are so few cases overall. The fairly large number of false positives will overwhelm the much smaller number of actual cases.
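    Both tables in question 37 are conditional proportions computed from the same 2 × 2 table of counts: part a conditions on true status (rows), and part c conditions on the test result (columns). The cell counts below are not printed in the solution; they are reconstructed from the reported proportions and totals (for example 0.89 × 54 ≈ 48) and should be read as an assumption.

```python
# Reconstructed cell counts (assumed): rows are true status, columns are test result.
counts = {
    "Down": {"positive": 48, "negative": 6},
    "Unaffected": {"positive": 1307, "negative": 3921},
}

# Part a: proportions within each row (conditional on true status).
row_props = {
    status: {result: c / sum(row.values()) for result, c in row.items()}
    for status, row in counts.items()
}

# Part c: proportions within each column (conditional on test result).
col_totals = {r: sum(counts[s][r] for s in counts) for r in ("positive", "negative")}
col_props = {
    status: {r: counts[status][r] / col_totals[r] for r in col_totals}
    for status in counts
}

print(col_totals)
print(round(row_props["Down"]["positive"], 2), round(col_props["Down"]["positive"], 3))
```

    The contrast between 0.89 (a positive test given Down syndrome) and 0.035 (Down syndrome given a positive test) is what part c describes: the rare condition is swamped by false positives.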

    38. 

    Down and chi-squared

    1.  The assumptions are that there are two categorical variables (Down syndrome status and blood test result), that randomization was used to obtain the data, and that the expected count was at least five in all cells.

    2.  H0: Down syndrome status and blood test result are independent. Ha: Down syndrome status and blood test result are dependent.

    3.  X² = 114.4, df = 1

    4.  P-value: 0.000


    5. 

    If the null hypothesis were true, the probability would be almost 0 of getting a test statistic at least as extreme as the value

    observed. There is very strong evidence of an association between test result and actual status.
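    The chi-squared statistic in question 38 can be recomputed by hand from observed and expected counts. As a sketch, the observed counts below are reconstructed from the proportions and totals in question 37 (they are an assumption, not values printed in the solution); they give a statistic of about 114.4 with 1 degree of freedom.

```python
# Assumed observed counts: rows = (Down, Unaffected), columns = (positive, negative).
observed = [[48, 6], [1307, 3921]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
min_expected = float("inf")
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count under independence: (row total)(column total)/n.
        expected = row_totals[i] * col_totals[j] / n
        min_expected = min(min_expected, expected)
        chi_sq += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_sq, 1), df, round(min_expected, 1))
```

    The smallest expected count is well above five, which is the condition listed in step 1.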

    39.  Gender gap?

    1.  The assumptions are that there are two categorical variables (party identification and gender), that randomization was used to obtain the data, and that the expected count was at least five in all cells.

    2.  H0: Party identification and gender are independent. Ha: Party identification and gender are dependent.

    3.  X² = 8.294, df = 2

    4.  P-value: 0.016

    5.  If the null hypothesis were true, the probability would be 0.016 of getting a test statistic at least as extreme as the value observed.

    There is very strong evidence that party identification depends on gender.

    40.  Tanning experiment

    a. 

    Treatments Ranks 

    Lotion (1, 2) (1, 3) (1, 4) (2, 3) (2, 4) (3, 4)

    Studio (3, 4) (2, 4) (2, 3) (1, 4) (1, 3) (1, 2)

    b. 

    Lotion mean rank 1.5 2.0 2.5 2.5 3.0 3.5

    Studio mean rank 3.5 3.0 2.5 2.5 2.0 1.5

    Difference of mean ranks -2.0 -1.0 0.0 0.0 1.0 2.0

    c. 

    Difference between mean ranks probability

    -2.0 1/6

    -1.0 1/6

    0.0 2/6

    1.0 1/6

    2.0 1/6
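    The three tables in question 40 can be generated by enumerating every way of assigning two of the four ranks to the lotion group. A minimal sketch, taking the difference as lotion mean rank minus studio mean rank to match the table in part b:

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

ranks = [1, 2, 3, 4]

# Each choice of two ranks for the lotion group leaves the other two for the studio.
diffs = []
for lotion in combinations(ranks, 2):
    studio = [r for r in ranks if r not in lotion]
    diffs.append(sum(lotion) / 2 - sum(studio) / 2)

# Sampling distribution of the difference under identical treatment effects.
tally = Counter(diffs)
distribution = {d: Fraction(c, len(diffs)) for d, c in sorted(tally.items())}

for d, prob in distribution.items():
    print(d, prob)
```

    With only six equally likely assignments, no outcome can have probability below 1/6, which limits how small a P-value from this design can be.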

    41. 

    Test for tanning experiment

    a.  The P-value is 1/6 = 0.17; if the treatments had identical effects, the probability would be 0.17 of getting a sample like the one we observed, or even more extreme, in this direction. It is plausible that the null hypothesis is correct, and that the studio does not lead to better results than the lotion.

    b.  The P-value is 2/6 = 0.33; if the treatments had identical effects, the probability would be 0.33 of getting a sample like the one we observed, or even more extreme, in either direction. It is plausible that the null hypothesis is correct, and that the treatments do not lead to different results.

    c.  It is a waste of time to conduct this experiment if we plan to use a 0.05 significance level because the smallest possible P-value is 0.17.