chap_04

PART 2 Probability and Inference

CHAPTER 4 Probability and Sampling Distributions
CHAPTER 5 Probability Theory
CHAPTER 6 Introduction to Inference
CHAPTER 7 Inference for Distributions
CHAPTER 8 Inference for Proportions


Post on 16-Nov-2014


Prelude

Of audits and probabilities

A firm reimburses its employees for business-related expenses like meals with clients. To obtain a reimbursement an employee must file an expense voucher with receipts for expenses of $25 or more. A look at the vouchers shows suspiciously many expenses in the $24.00 to $24.99 range. Are employees padding their vouchers by claiming amounts just a bit too small to require a receipt?

A bank allows individual officers to decide if credit card balances up to $5000 should be written off as uncollectible. Write-offs above $5000 must be approved by a supervisor. An auditor finds more write-offs than expected in the $4500–$4999 range. An investigation finds that some bank officers have been writing off credit debts of friends who ran up credit card bills to just below the $5000 limit.

In both scenarios, the auditor must have a legitimate way to establish a cutoff for how many is “too many.” How many $24 expense vouchers should the firm expect? How many credit card balance write-offs in the $4500 range should the bank expect? Only if questions like these are answered can a company recognize when something suspicious is taking place. We can use a probability model to specify how often different values occur when we sample from a large collection of data. One particularly striking probability model describes the distribution of first digits in large collections of numbers such as claimed expenses and credit card account write-offs. This is Benford’s Law, the topic of our first case in this chapter.

CHAPTER 4 Probability and Sampling Distributions

4.1 Randomness

4.2 Probability Models

4.3 Random Variables

4.4 The Sampling Distribution of a Sample Mean


Introduction

The reasoning of statistical inference rests on asking, “How often would this method give a correct answer if I used it very many times?” Inference is most secure when we produce data by random sampling or randomized comparative experiments. The reason is that when we use chance to choose respondents or assign subjects, the laws of probability answer the question “What would happen if we did this many times?” The purpose of this chapter is to see what the laws of probability tell us, but without going into the mathematics of probability theory.

What is the mean income of households in the United States? The Bureau of Labor Statistics contacted a random sample of 55,000 households in March 2001 for the Current Population Survey. The mean income of the 55,000 households for the year 2000 was \(\bar{x}\) = $57,045. That $57,045 is a statistic that describes the CPS sample households. We use it to estimate an unknown parameter, the mean income of all 106 million American households. We know that \(\bar{x}\) would take several different values if the Bureau of Labor Statistics had taken several samples in March 2001. Fortunately, we also know that this sampling variability follows a regular pattern that can tell us how accurate the sample result is likely to be. That pattern obeys the laws of probability. Our starting point in understanding probability is the phenomenon we observed in the simulations of Section 3.3: chance behavior is unpredictable in the short run but has a regular and predictable pattern in the long run.

4.1 Randomness

The idea of probability

Toss a coin, or choose an SRS. The result can’t be predicted in advance, because the result will vary when you toss the coin or choose the sample repeatedly. But there is still a regular pattern in the results, a pattern that emerges clearly only after many repetitions. This remarkable fact is the basis for the idea of probability.

EXAMPLE 4.1 Coin tossing

When you toss a coin, there are only two possible outcomes, heads or tails. Figure 4.1 shows the results of tossing a coin 5000 times twice. For each number of tosses from 1 to 5000, we have plotted the proportion of those tosses that gave a head. Trial A (solid line) begins tail, head, tail, tail. You can see that the proportion of heads for Trial A starts at 0 on the first toss, rises to 0.5 when the second toss gives a head, then falls to 0.33 and 0.25 as we get two more tails. Trial B, on the other hand, starts with five straight heads, so the proportion of heads is 1 until the sixth toss.

The proportion of tosses that produce heads is quite variable at first. Trial A starts low and Trial B starts high. As we make more and more tosses, however, the proportion of heads for both trials gets close to 0.5 and stays there. If we made yet a third trial at tossing the coin a great many times, the proportion of heads would again settle down to 0.5 in the long run. We say that 0.5 is the probability of a head. The probability 0.5 appears as a horizontal line on the graph.

[Figure 4.1: proportion of heads (vertical axis, 0.0 to 1.0) plotted against number of tosses (horizontal axis, 1 to 5000) for Trial A and Trial B, with a horizontal line at probability = 0.5.]

FIGURE 4.1 The proportion of tosses of a coin that give a head changes as we make more tosses. Eventually, however, the proportion approaches 0.5, the probability of a head. This figure shows the results of two trials of 5000 tosses each.

The What Is Probability? applet available on the Web site for this book, www.whfreeman.com/pbs, animates Figure 4.1. It allows you to choose the probability of a head and simulate any number of tosses of a coin with that probability. Experience shows that the proportion of heads gradually settles down close to the probability. Equally important, it also shows that the proportion in a small or moderate number of tosses can be far from the probability. Probability describes what happens in the long run.

“Random” in statistics is not a synonym for “haphazard” but a description of a kind of order that emerges only in the long run. We often encounter the unpredictable side of randomness in our everyday experience, but we rarely see enough repetitions of the same random phenomenon to observe the long-term regularity that probability describes. You can see that regularity emerging in Figure 4.1. In the very long run, the proportion of tosses that give a head is 0.5. This is the intuitive idea of probability. Probability 0.5 means “occurs half the time in a very large number of trials.”

We might suspect that a coin has probability 0.5 of coming up heads just because the coin has two sides. As Exercises 4.1 and 4.2 illustrate, such suspicions are not always correct. The idea of probability is empirical. That is, it is based on observation rather than theorizing. Probability describes what happens in very many trials, and we must actually observe many trials to pin down a probability. In the case of tossing a coin, some diligent people have in fact made thousands of tosses.
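The long-run behavior described here can be imitated on a computer. The following is a minimal sketch, not part of the original text: it assumes a coin with a chosen probability of heads and records the proportion of heads after every toss. The function name `running_proportions` and the seeds are illustrative choices only.

```python
import random


def running_proportions(n_tosses, p_heads=0.5, seed=None):
    """Simulate n_tosses coin tosses and return the proportion of heads
    observed after each toss (a list of length n_tosses)."""
    rng = random.Random(seed)  # seeded generator so a trial can be repeated
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        if rng.random() < p_heads:  # counts as a head with probability p_heads
            heads += 1
        proportions.append(heads / toss)
    return proportions


if __name__ == "__main__":
    checkpoints = [1, 5, 10, 50, 100, 500, 1000, 5000]
    # Two simulated trials of 5000 tosses each; the seeds are arbitrary.
    for label, seed in (("Trial A", 1), ("Trial B", 2)):
        props = running_proportions(5000, seed=seed)
        summary = ", ".join(f"{n}: {props[n - 1]:.3f}" for n in checkpoints)
        print(f"{label} -> {summary}")
```

Printing the proportion at a few checkpoints typically shows large swings in the first handful of tosses and values near 0.5 by the end. A simulation like this starts from a given probability, so it illustrates the idea rather than measuring a real coin.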

EXAMPLE 4.2 Some coin tossers

The French naturalist Count Buffon (1707–1788) tossed a coin 4040 times. Result: 2048 heads, or proportion 2048/4040 = 0.5069 for heads.

Around 1900, the English statistician Karl Pearson heroically tossed a coin 24,000 times. Result: 12,012 heads, a proportion of 0.5005.

While imprisoned by the Germans during World War II, the South African mathematician John Kerrich tossed a coin 10,000 times. Result: 5067 heads, a proportion of 0.5067.

RANDOMNESS AND PROBABILITY

We call a phenomenon random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions.

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions.

APPLY YOUR KNOWLEDGE

4.1 Nickels spinning. Hold a nickel upright on its edge under your forefinger on a hard surface, then snap it with your other forefinger so that it spins for some time before falling. Based on 50 spins, estimate the probability of heads.

4.2 Nickels falling over. You may feel that it is obvious that the probability of a head in tossing a coin is about 1/2 because the coin has two faces. Such opinions are not always correct. The previous exercise asked you to spin a nickel rather than toss it, which changes the probability of a head. Now try another variation. Stand a nickel on edge on a hard, flat surface. Pound the surface with your hand so that the nickel falls over. What is the probability that it falls with heads upward? Make at least 50 trials to estimate the probability of a head.

Thinking about randomness

That some things are random is an observed fact about the world. The outcome of a coin toss, the time between customers arriving at an ATM machine, and the price of a share of a company’s stock at the close of the market are all random. So is the outcome of a random sample or a randomized experiment. Probability theory is the branch of mathematics that describes random behavior. Of course, we can never observe a probability exactly. We could always continue tossing the coin, for example.

Mathematical probability is an idealization based on imagining what would happen in an indefinitely long series of trials.

The best way to understand randomness is to observe random behavior: not only the long-run regularity but also the unpredictable results of short runs. You can do this with physical devices, as in Exercises 4.1 to 4.5, but computer simulations (imitations) of random behavior allow faster exploration. Exercises 4.9, 4.10, and 4.11 suggest some simulations of random behavior. As you explore randomness, remember:

You must have a long series of independent trials. That is, the outcome of one trial must not influence the outcome of any other. Imagine a crooked gambling house where the operator of a roulette wheel can stop it where she chooses; she can prevent the proportion of “red” from settling down to a fixed number. These trials are not independent.

The idea of probability is empirical. Computer simulations start with given probabilities and imitate random behavior, but we can estimate a real-world probability only by actually observing many trials.

Nonetheless, computer simulations are very useful because we need long runs of trials. In situations such as coin tossing, the proportion of an outcome often requires several hundred trials to settle down to the probability of that outcome. The kinds of physical techniques suggested in the exercises are too slow for this. Short runs give only rough estimates of a probability.

SECTION 4.1 SUMMARY

A random phenomenon has outcomes that we cannot predict but that nonetheless have a regular distribution in very many repetitions.

The probability of an event is the proportion of times the event occurs in many repeated trials of a random phenomenon.

SECTION 4.1 EXERCISES

4.3 Random digits. The table of random digits (Table B) was produced by a random mechanism that gives each digit probability 0.1 of being a 0. What proportion of the first 200 digits in the table are 0s? This proportion is an estimate, based on 200 repetitions, of the true probability, which in this case is known to be 0.1.

4.4 How many tosses to get a head? When we toss a penny, experience shows that the probability (long-term proportion) of a head is close to 1/2. Suppose now that we toss the penny repeatedly until we get a head. What is the probability that the first head comes up in an odd number of tosses (1, 3, 5, and so on)? To find out, repeat this experiment 50 times, and keep a record of the number of tosses needed to get a head on each of your 50 trials.

(a) From your experiment, estimate the probability of a head on the first toss. What value should we expect this probability to have?

(b) Use your results to estimate the probability that the first head appears on an odd-numbered toss.

4.5 Tossing a thumbtack. Toss a thumbtack on a hard surface 100 times. How many times did it land with the point up? What is the approximate probability of landing point up?

4.6 Three of a kind. You read in a book on poker that the probability of being dealt three of a kind in a five-card poker hand is 1/50. Explain in simple language what this means.

4.7 Probability is a measure of how likely an event is to occur. Match one of the probabilities that follow with each statement of likelihood given. (The probability is usually a more exact measure of likelihood than is the verbal statement.)

0 0.01 0.3 0.6 0.99 1

(a) This event is impossible. It can never occur.

(b) This event is certain. It will occur on every trial.

(c) This event is very unlikely, but it will occur once in a while in a long sequence of trials.

(d) This event will occur more often than not.

4.8 What probability doesn’t say. The idea of probability is that the proportion of heads in many tosses of a balanced coin eventually gets close to 0.5. But does the actual count of heads get close to one-half the number of tosses? Let’s find out. Set the “Probability of heads” in the What Is Probability? applet to 0.5 and the number of tosses to 40. You can extend the number of tosses by clicking “Toss” again to get 40 more. Don’t click “Reset” during this exercise.

(a) After 40 tosses, what is the proportion of heads? What is the count of heads? What is the difference between the count of heads and 20 (one-half the number of tosses)?

(b) Keep going to 120 tosses. Again record the proportion and count of heads and the difference between the count and 60 (half the number of tosses).

(c) Keep going. Stop at 240 tosses and again at 480 tosses to record the same facts. Although it may take a long time, the laws of probability say that the proportion of heads will always get close to 0.5 and also that the difference between the count of heads and half the number of tosses will always grow without limit.

4.9 Simulating consumer behavior. About half of the customers entering a local computer store will purchase a new computer before leaving the store. Use the What Is Probability? applet or your statistical software to simulate 100 customers independently entering the store with each having a probability of 0.5 of purchasing a new computer on this visit to the store. (In most software, the key phrase to look for is “Bernoulli trials.” This is the technical term for independent trials with Yes/No outcomes. Our outcomes here are “buy” and “not buy.”)

(a) What percent of the 100 simulated customers bought a new computer?

(b) Examine the sequence of “buys” and “not buys.” How long was the longest run of “buys”? Of “not buys”? (Sequences of random outcomes often show runs longer than our intuition thinks likely.)
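For readers whose “statistical software” is a general-purpose language, Bernoulli trials of the kind these exercises call for might be sketched as follows. This is an illustrative assumption rather than the book's method: the helper names `bernoulli_trials` and `longest_run` are hypothetical, and `True` stands for a “Yes” outcome such as “buy.”

```python
import random


def bernoulli_trials(n, p, seed=None):
    """Return n independent Yes/No outcomes; each is True with probability p."""
    rng = random.Random(seed)
    return [rng.random() < p for _ in range(n)]


def longest_run(outcomes, value):
    """Length of the longest unbroken run of `value` in the sequence."""
    best = current = 0
    for outcome in outcomes:
        current = current + 1 if outcome == value else 0
        best = max(best, current)
    return best


if __name__ == "__main__":
    customers = bernoulli_trials(100, 0.5, seed=2001)  # the seed is arbitrary
    print("percent Yes:", 100 * sum(customers) / len(customers))
    print("longest run of Yes:", longest_run(customers, True))
    print("longest run of No:", longest_run(customers, False))
```

Changing the seed gives a different sequence each time, which is a quick way to see how much short runs of trials vary.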

4.10 Simulating an opinion poll. A recent opinion poll showed that about 65% of the American public have a favorable opinion of the software company Microsoft. Suppose that this is exactly true. Choosing a person at random then has probability 0.65 of getting one who has a favorable opinion of Microsoft. Use the What Is Probability? applet or your statistical software to simulate choosing many people independently. (In most software, the key phrase to look for is “Bernoulli trials.” This is the technical term for independent trials with Yes/No outcomes. Our outcomes here are “favorable” or not.)

(a) Simulate drawing 20 people, then 80 people, then 320 people. What proportion has a favorable opinion of Microsoft in each case? We expect (but because of chance variation we can’t be sure) that the proportion will be closer to 0.65 in longer runs of trials.

(b) Simulate drawing 20 people 10 times and record the percents in each trial who have a favorable opinion of Microsoft. Then simulate drawing 320 people 10 times and again record the 10 percents. Which set of 10 results is less variable? We expect the results of 320 trials to be more predictable (less variable) than the results of 20 trials. That is “long-run regularity” showing itself.

4.11 More efficient simulation. Continue the exploration begun in Exercise 4.10. Software allows you to simulate many independent “Yes/No” trials more quickly if all you want to save is the count of “Yes” outcomes. The key word “Binomial” simulates n independent Bernoulli trials, each with probability p of a “Yes,” and records just the count of “Yes” outcomes.

(a) Simulate 100 draws of 20 people from the population in Exercise 4.10. Record the number who have a favorable opinion of Microsoft on each draw. What is the approximate probability that out of 20 people drawn at random at least 14 have a favorable opinion of Microsoft?

(b) Convert the “favorable” counts into percents of the 20 people in each trial. Make a histogram of these 100 percents. Describe the shape, center, and spread of this distribution.

(c) Now simulate drawing 320 people. Do this 100 times and record the percent who respond “favorable” on each of the 100 draws. Make a histogram of the percents and describe the shape, center, and spread of the distribution.

(d) In what ways are the distributions in parts (b) and (c) alike? In what ways do they differ? (Because regularity emerges in the long run, we expect the results of drawing 320 individuals to be less variable than the results of drawing 20 individuals.)

4.2 Probability Models

CASE 4.1 UNCOVERING FRAUD BY DIGITAL ANALYSIS

“Digital analysis” is one of the big new tools among auditors and investigators looking for fraud. Faked numbers in tax returns, payment records, invoices, expense account claims, and many other settings often display patterns that aren’t present in legitimate records. Some patterns, like too many round numbers, are obvious and easily avoided by a clever crook. Others are more subtle. It is a striking fact that the first digits of numbers in legitimate records often follow a distribution known as Benford’s Law. Here it is:

First digit 1 2 3 4 5 6 7 8 9

Proportion 0.301 0.176 0.125 0.097 0.079 0.067 0.058 0.051 0.046

These proportions are in fact probabilities. That is, they are the proportions specified by Benford’s Law if we examine a very large number of similar records. Set a computer to work comparing invoices from your company’s vendors with this distribution. Aha: here is a vendor whose invoices show a very different pattern. All the invoices were approved by the same manager. Confronted, he admits that he was raising the amounts of the invoices and buying from his wife’s company at the inflated amounts. Digital analysis has uncovered another case of fraud.

Of course, not all sets of data follow Benford’s Law. Numbers that are assigned, such as Social Security numbers, do not. Nor do data with a fixed maximum, such as deductible contributions to individual retirement accounts (IRAs). Nor of course do random numbers. But a surprising variety of data from natural science, social affairs, and business obeys this distribution.
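“Setting a computer to work” on such a comparison might look roughly like the sketch below. It is only an illustration under stated assumptions: the Benford proportions are copied from the table above, the helper names are hypothetical, and the invoice amounts are invented for the example. A real audit would need many more records and a formal criterion for “too different.”

```python
from collections import Counter

# Proportions copied from the Benford's Law table in Case 4.1.
BENFORD = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}


def first_digit(amount):
    """Return the first nonzero digit of a number such as 24.99 or 4999."""
    for ch in f"{abs(amount):f}":  # fixed-point text, e.g. '24.990000'
        if ch in "123456789":
            return int(ch)
    raise ValueError("amount has no nonzero digit")


def first_digit_proportions(amounts):
    """Observed proportion of each first digit 1 to 9 in a list of amounts."""
    counts = Counter(first_digit(a) for a in amounts)
    return {d: counts[d] / len(amounts) for d in range(1, 10)}


if __name__ == "__main__":
    # Invented invoice amounts, for illustration only.
    invoices = [1250.00, 187.40, 2310.99, 964.10, 1475.25, 312.00]
    observed = first_digit_proportions(invoices)
    for d in range(1, 10):
        print(d, f"observed {observed[d]:.3f}", f"Benford {BENFORD[d]:.3f}")
```

With only a handful of records the observed proportions are far too rough to judge anything; the comparison becomes meaningful only for large collections of numbers.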

Gamblers have known for centuries that the fall of coins, cards, and dice displays clear patterns in the long run. Benford’s Law says that first digits also follow a clear pattern. The idea of probability rests on the observed fact that the average result of many thousands of chance outcomes can be known with near certainty. How can we give a mathematical description of long-run regularity?

To see how to proceed, think first about a very simple random phenomenon, tossing a coin once. When we toss a coin, we cannot know the outcome in advance. What do we know? We are willing to say that the outcome will be either heads or tails. We believe that each of these outcomes has probability 1/2. This description of coin tossing has two parts:

a list of possible outcomes

a probability for each outcome

This description is the basis for all probability models. Here is the vocabulary we use.

PROBABILITY MODELS

The sample space of a random phenomenon is the set of all possible outcomes. S is used to denote sample space.

An event is an outcome or a set of outcomes of a random phenomenon. That is, an event is a subset of the sample space.

A probability model is a mathematical description of a random phenomenon consisting of two parts: a sample space and a way of assigning probabilities to events.

The sample space S can be very simple or very complex. When we toss a coin once, there are only two outcomes, heads and tails. The sample space is S = {H, T}. If we draw a random sample of 55,000 U.S. households, as the Current Population Survey does, the sample space contains all possible choices of 55,000 of the 106 million households in the country. This S is extremely large. Each member of S is a possible sample, which explains the term sample space.

EXAMPLE 4.3 Rolling dice

Rolling two dice is a common way to lose money in casinos. There are 36 possible outcomes when we roll two dice and record the up-faces in order (first die, second die). Figure 4.2 displays these outcomes. They make up the sample space S. “Roll a 5” is an event, call it A, that contains four of these 36 outcomes:

\[ A = \{(1,4),\ (2,3),\ (3,2),\ (4,1)\} \]

FIGURE 4.2 The 36 possible outcomes in rolling two dice.

In craps and other games, all that matters is the sum of the spots on the up-faces. Let’s change the random outcomes we are interested in: roll two dice and count the spots on the up-faces. Now there are only 11 possible outcomes, from a sum of 2 for rolling a double one through 3, 4, 5, and on up to 12 for rolling a double six. The sample space is now

\[ S = \{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\} \]

Comparing this with Figure 4.2 reminds us that we can change S by changing the detailed description of the random phenomenon we are describing.
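The two sample spaces in this example are small enough to list by machine. The sketch below is an illustration only, with variable names chosen for this purpose; it represents each ordered outcome as a pair (first die, second die).

```python
from itertools import product

# Sample space when we record the up-faces in order (first die, second die).
ordered_rolls = list(product(range(1, 7), repeat=2))

# The event A = "roll a 5": the outcomes whose spots add up to 5.
roll_a_5 = [roll for roll in ordered_rolls if sum(roll) == 5]

# Sample space when we record only the sum of the spots.
sums = sorted({first + second for first, second in ordered_rolls})

print(len(ordered_rolls))  # 36
print(roll_a_5)            # [(1, 4), (2, 3), (3, 2), (4, 1)]
print(sums)                # [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

The same physical rolls appear in both descriptions; only what we choose to record has changed.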

APPLY YOUR KNOWLEDGE

4.12 In each of the following situations, describe a sample space S for the random phenomenon. In some cases, you have some freedom in your choice of S.

(a) A new business is started. After two years, it is either still in business or it has closed.

(b) A rust prevention treatment is applied to a new car. The response variable is the length of time before rust begins to develop on the vehicle.

(c) A student enrolls in a statistics course and at the end of the semester receives a letter grade.

(d) A quality inspector examines four portable CD players and rates each as either “acceptable” or “unacceptable.” You record the sequence of ratings.

(e) A quality inspector examines four portable CD players and rates each as either “acceptable” or “unacceptable.” You record the number of units rated “acceptable.”

4.13 In each of the following situations, describe a sample space S for the random phenomenon. In some cases you have some freedom in specifying S, especially in setting the largest and the smallest value in S.

(a) Choose a student in your class at random. Ask how much time that student spent studying during the past 24 hours.

(b) The Physicians’ Health Study asked 11,000 physicians to take an aspirin every other day and observed how many of them had a heart attack in a 5-year period.

(c) In a test of a new package design, you drop a carton of a dozen eggs from a height of 1 foot and count the number of broken eggs.

(d) Choose a Fortune 500 company at random. Look up how much the CEO of the company makes annually.

(e) A nutrition researcher feeds a new diet to a young male white rat. The response variable is the weight (in grams) that the rat gains in 8 weeks.

Probability rules

The true probability of any outcome (say, “roll a 5 when we toss two dice”) can be found only by actually tossing two dice many times, and then only approximately. How then can we describe probability mathematically? Rather than try to give “correct” probabilities, we start by laying down facts that must be true for any assignment of probabilities. These facts follow from the idea of probability as “the long-run proportion of repetitions in which an event occurs.”

1. Any probability is a number between 0 and 1. Any proportion is a number between 0 and 1, so any probability is also a number between 0 and 1. An event with probability 0 never occurs, and an event with probability 1 occurs on every trial. An event with probability 0.5 occurs in half the trials in the long run.

2. All possible outcomes together must have probability 1. Because some outcome must occur on every trial, the sum of the probabilities for all possible outcomes must be exactly 1.

3. The probability that an event does not occur is 1 minus the probability that the event does occur. If an event occurs in (say) 70% of all trials, it fails to occur in the other 30%. The probability that an event occurs and the probability that it does not occur always add to 100%, or 1.

4. If two events have no outcomes in common, the probability that one or the other occurs is the sum of their individual probabilities. If one event occurs in 40% of all trials, a different event occurs in 25% of all trials, and the two can never occur together, then one or the other occurs on 65% of all trials because 40% + 25% = 65%.

We can use mathematical notation to state Facts 1 to 4 more concisely. Capital letters near the beginning of the alphabet denote events. If A is any event, we write its probability as P(A). Here are our probability facts in formal language. As you apply these rules, remember that they are just another form of intuitively true facts about long-run proportions.

PROBABILITY RULES

Rule 1. The probability P(A) of any event A satisfies \(0 \le P(A) \le 1\).

Rule 2. If S is the sample space in a probability model, then \(P(S) = 1\).

Rule 3. The probability that an event A does not occur is

\[ P(A \text{ does not occur}) = 1 - P(A) \]

Rule 4. Two events A and B are disjoint if they have no outcomes in common and so can never occur simultaneously. If A and B are disjoint,

\[ P(A \text{ or } B) = P(A) + P(B) \]

This is the addition rule for disjoint events.

EXAMPLE 4.4 Benford’s Law (CASE 4.1)

The probabilities assigned to first digits by Benford’s Law are all between 0 and 1. They add to 1 because the first digits listed make up the sample space S. (Note that a first digit can’t be 0.)

The probability that a first digit is anything other than 1 is, by Rule 3,

\[ P(\text{not a 1}) = 1 - P(\text{first digit is 1}) = 1 - 0.301 = 0.699 \]

That is, if 30.1% of first digits are 1s, the other 69.9% are not 1s.

“First digit is a 1” and “first digit is a 2” are disjoint events. So the addition rule (Rule 4) says

\[ P(1 \text{ or } 2) = P(1) + P(2) = 0.301 + 0.176 = 0.477 \]

About 48% of first digits in data governed by Benford’s Law are 1s or 2s. Fraudulent records generally have many fewer 1s and 2s.
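The checks in this example can be repeated mechanically. The following is a small sketch, not taken from the text: the function names `is_legitimate` and `prob` are hypothetical, and the Benford proportions are those tabulated in Case 4.1.

```python
import math

# Proportions from the Benford's Law table in Case 4.1.
BENFORD = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}


def is_legitimate(model):
    """Rules 1 and 2: every probability lies in [0, 1] and they sum to 1."""
    values = model.values()
    return all(0 <= p <= 1 for p in values) and math.isclose(sum(values), 1.0)


def prob(model, event):
    """Rule 4 applied to single outcomes: add the probabilities in the event."""
    return sum(model[outcome] for outcome in event)


p_not_1 = 1 - prob(BENFORD, {1})   # Rule 3 (complement)
p_1_or_2 = prob(BENFORD, {1, 2})   # Rule 4 (addition for disjoint events)

print(is_legitimate(BENFORD))      # True
print(round(p_not_1, 3))           # 0.699
print(round(p_1_or_2, 3))          # 0.477
```

Rounding is used only for display, because floating-point sums of decimals are not exact.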

EXAMPLE 4.5 Probabilities for rolling dice

Figure 4.2 displays the 36 possible outcomes of rolling two dice. What probabilities shall we assign to these outcomes?

Casino dice are carefully made. Their spots are not hollowed out, which would give the faces different weights, but are filled with white plastic of the same density as the colored plastic of the body. For casino dice it is reasonable to assign the same probability to each of the 36 outcomes in Figure 4.2. Because all 36 outcomes together must have probability 1 (Rule 2), each outcome must have probability 1/36.

What is the probability of rolling a 5? Because the event “roll a 5” contains the four outcomes displayed in Example 4.3, the addition rule (Rule 4) says that its probability is

\[ P(\text{roll a 5}) = P((1,4)) + P((2,3)) + P((3,2)) + P((4,1)) = \frac{1}{36} + \frac{1}{36} + \frac{1}{36} + \frac{1}{36} = \frac{4}{36} = 0.111 \]

What about the probability of rolling a 7? In Figure 4.2 you will find six outcomes for which the sum of the spots is 7. The probability is 6/36, or about 0.167.
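Under the equally-likely assumption made for casino dice, these probabilities can be obtained by counting outcomes. The sketch below uses exact fractions so that no rounding enters; the helper name `prob_sum` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die), assumed equally likely.
outcomes = list(product(range(1, 7), repeat=2))
p_each = Fraction(1, len(outcomes))  # 1/36 for every outcome


def prob_sum(total):
    """Probability that the spots add up to `total`, by the addition rule."""
    return sum(p_each for first, second in outcomes if first + second == total)


print(prob_sum(5), float(prob_sum(5)))  # 1/9 (that is, 4/36), about 0.111
print(prob_sum(7), float(prob_sum(7)))  # 1/6 (that is, 6/36), about 0.167
```

The counting works only because every outcome has the same probability; for a die that is not balanced, each outcome would need its own probability.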

APPLY YOUR KNOWLEDGE

4.14 Moving up. An economist studying economic class mobility finds that the probability that the son of a lower-class father remains in the lower class is 0.46. What is the probability that the son moves to one of the higher classes?

4.15 Causes of death. Government data on job-related deaths assign a single occupation for each such death that occurs in the United States. The data on occupational deaths in 1999 show that the probability is 0.134 that a randomly chosen death was agriculture-related, and 0.119 that it was manufacturing-related. What is the probability that a death was either agriculture-related or manufacturing-related? What is the probability that the death was related to some other occupation?

4.16 Rating the economy. A Gallup Poll (June 11–17, 2001) interviewed a random sample of 1004 adults (18 years or older). The people in the sample were asked how they would rate economic conditions in the United States today. Here are the results:

Outcome      Probability
Excellent    0.03
Good         0.39
Fair         ?
Poor         0.12
No opinion   0.01

These proportions are probabilities for the random phenomenon of choosing an adult at random and asking the person’s opinion on current economic conditions.

(a) What must be the probability that the person chosen says “fair”? Why?

(b) The official press release focused on the percent of adults giving the economy a “positive” rating where positive is defined as “good” or “excellent.” What is this probability?

Assigning probabilities: finite number of outcomes

Examples 4.4 and 4.5 illustrate one way to assign probabilities to events: assign a probability to every individual outcome, then add these probabilities to find the probability of any event. If such an assignment is to satisfy the rules of probability, the probabilities of all the individual outcomes must sum to exactly 1.

PROBABILITIES IN A FINITE SAMPLE SPACE

Assign a probability to each individual outcome. These probabilities must be numbers between 0 and 1 and must have sum 1.

The probability of any event is the sum of the probabilities of the outcomes making up the event.

EXAMPLE 4.6 Random digits versus Benford’s Law (CASE 4.1)

You might think that first digits in financial records are distributed “at random” among the digits 1 to 9. The 9 possible outcomes would then be equally likely. The sample space for a single digit is

\[ S = \{1, 2, 3, 4, 5, 6, 7, 8, 9\} \]

Call a randomly chosen first digit X for short. The probability model for X is completely described by this table:

First digit   1    2    3    4    5    6    7    8    9
Probability   1/9  1/9  1/9  1/9  1/9  1/9  1/9  1/9  1/9

The probability that a first digit is equal to or greater than 6 is

\[ P(X \ge 6) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) = \frac{1}{9} + \frac{1}{9} + \frac{1}{9} + \frac{1}{9} = \frac{4}{9} = 0.444 \]

Note that this is not the same as the probability that a random digit is strictly greater than 6, \(P(X > 6)\). The outcome X = 6 is included in the event \(\{X \ge 6\}\) and is omitted from \(\{X > 6\}\).

FIGURE 4.3 Probability histograms for (a) random digits 1 to 9 and (b) Benford’s Law. The height of each bar shows the probability assigned to a single outcome. [Both panels plot Probability, from 0.0 to 0.4, against Outcomes 1 to 9.]

We can use histograms to display probability distributions as well as distributions of data. Figure 4.3 displays probability histograms that compare the probability model for random digits with the model given by Benford’s Law. The height of each bar shows the probability of the outcome at its base. Because the heights are probabilities, they add to 1. As usual, all the bars in a histogram have the same width. So the areas also display the assignment

Compare this with the probability of the same event among first digits from financial records that obey Benford’s Law. A first digit $V$ chosen from such records has the distribution

First digit   1      2      3      4      5      6      7      8      9
Probability   0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

The probability that this digit is equal to or greater than 6 is

$$P(V \ge 6) = 0.067 + 0.058 + 0.051 + 0.046 = 0.222$$

Benford’s Law allows easy detection of phony financial records based on randomly generated numbers. These records tend to have too few first digits that are 1s and 2s and too many first digits of 6 or greater.
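The comparison between the two models can be reproduced with a short Python sketch. The Benford probabilities are copied from the table in the text; the helper name `prob_at_least` is hypothetical. The closed form $\log_{10}(1 + 1/d)$ used at the end is the usual statement of Benford's Law and is an addition here, since this passage gives only the table.

```python
import math
from fractions import Fraction

digits = range(1, 10)

# Equally likely first digits: probability 1/9 each.
uniform = {d: Fraction(1, 9) for d in digits}

# Benford probabilities, copied from the table in the text.
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

def prob_at_least(model, cutoff):
    """P(first digit >= cutoff): add the probabilities of the qualifying digits."""
    return sum(p for d, p in model.items() if d >= cutoff)

p_uniform = prob_at_least(uniform, 6)
p_benford = prob_at_least(benford, 6)
print(p_uniform, round(float(p_uniform), 3))
print(round(p_benford, 3))

# Assumption (not stated in this passage): the standard closed form of Benford's Law.
benford_formula = {d: math.log10(1 + 1 / d) for d in digits}
print({d: round(p, 3) for d, p in benford_formula.items()})
```

Under the equally likely model, a first digit of 6 or more turns up about twice as often as under Benford's Law, which is the discrepancy an auditor would look for.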

            Probability
Outcome   Model 1   Model 2   Model 3   Model 4
1         1/7       1/3       1/3       1
2         1/7       1/6       1/6       1
3         1/7       1/6       1/6       2
4         1/7       0         1/6       1
5         1/7       1/6       1/6       1
6         1/7       1/6       1/6       2

FIGURE 4.4 Four assignments of probabilities to the six faces of a die, for Exercise 4.17.

of probability to outcomes. Think of these histograms as idealized pictures of the results of very many trials.

APPLY YOUR KNOWLEDGE

4.17 Rolling a die. Figure 4.4 displays several assignments of probabilities to the six faces of a die. We can learn which assignment is actually accurate for a particular die only by rolling the die many times. However, some of the assignments are not legitimate assignments of probability. That is, they do not obey the rules. Which are legitimate and which are not? In the case of the illegitimate models, explain what is wrong.

4.18 Self-employed workers. Draw a self-employed worker at random and record the industry in which the person works. “At random” means that we give every such person the same chance to be the one we choose. That is, we choose an SRS of size 1. The probability of any industry is just the proportion of all self-employed workers who work in that industry; if we drew many such workers, this is the proportion we would get. Here is the probability model:

Industry                            Probability
Agriculture                         0.130
Construction                        0.147
Finance, insurance, real estate     0.059
Manufacturing                       0.042
Mining                              0.002
Services                            0.419
Trade                               0.159
Transportation, public utilities    0.042

(a) Show that this is a legitimate probability model.

(b) What is the probability that a randomly chosen worker does not work in agriculture?

(c) What is the probability that a randomly chosen worker works in construction, manufacturing, or mining?

4.19 Job satisfaction. We can use the results of a poll on job satisfaction to give a probability model for the job satisfaction rating of a randomly chosen employed (full-time or part-time) American. Here is the model:

Rating        Completely satisfied   Somewhat satisfied   Somewhat dissatisfied   Completely dissatisfied
Probability   ?                      0.47                 0.12                    0.02

(a) What is the probability of a randomly selected employed American being completely satisfied with his or her job? Why?

(b) What is the probability that a randomly selected employed American will be dissatisfied with his or her job?

Assigning probabilities: intervals of outcomes

A software random number generator is designed to produce a number between 0 and 1 chosen at random. It is only a slight idealization to consider any number in this range as a possible outcome. The sample space is then

$$S = \{\text{all numbers between 0 and 1}\}$$

Call the outcome of the random number generator $Y$ for short. How can we assign probabilities to such events as $\{0.3 \le Y \le 0.7\}$? As in the case of selecting a random digit, we would like all possible outcomes to be equally likely. But we cannot assign probabilities to each individual value of $Y$ and then add them, because there are infinitely many possible values.

We use a new way of assigning probabilities directly to events: as areas under a density curve. Any density curve has area exactly 1 underneath it, corresponding to total probability 1. We first met density curves as models for data in Chapter 1 (page 51).

EXAMPLE 4.7 Random numbers

The random number generator will spread its output uniformly across the entire interval from 0 to 1 as we allow it to generate a long sequence of numbers. The results of many trials are represented by the Uniform density curve, as shown in Figure 4.5. This density curve has height 1 over the interval from 0 to 1. The area under the curve is 1, and the probability of any event is the area under the curve and above the event in question.

As Figure 4.5(a) illustrates, the probability that the random number generator produces a number between 0.3 and 0.7 is

$$P(0.3 \le Y \le 0.7) = 0.4$$

because the area under the density curve and above the interval from 0.3 to 0.7 is 0.4. The height of the curve is 1 and the area of a rectangle is the product of height and length, so the probability of any interval of outcomes is just the length of the interval.
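The rule that probability equals interval length for the Uniform density on 0 to 1 can be checked against a simulation. The sketch below is illustrative only: the function name `uniform_interval_probability`, the seed, and the number of trials are arbitrary choices, and Python's `random` module stands in for the idealized generator of Example 4.7.

```python
import random

def uniform_interval_probability(a, b):
    """P(a <= Y <= b) for Y Uniform on [0, 1]: length of the overlap with [0, 1]."""
    low, high = max(a, 0.0), min(b, 1.0)
    return max(high - low, 0.0)

# Area of a rectangle with height 1 over the interval from 0.3 to 0.7.
exact = uniform_interval_probability(0.3, 0.7)

# Long-run proportion of generated numbers that land in the interval.
rng = random.Random(2024)   # fixed seed so the run is repeatable
trials = 100_000
hits = sum(1 for _ in range(trials) if 0.3 <= rng.random() <= 0.7)
estimate = hits / trials

print(round(exact, 10), estimate)
```

The simulated proportion should settle near 0.4 as the number of trials grows, which is the sense in which the density curve is an idealized picture of very many trials.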

Similarly,

$$P(Y \le 0.5) = 0.5$$

$$P(Y > 0.8) = 0.2$$

$$P(Y \le 0.5 \text{ or } Y > 0.8) = 0.7$$

Notice that the last event consists of two nonoverlapping intervals, so the total area above the event is found by adding two areas, as illustrated by Figure 4.5(b). This assignment of probabilities obeys all of our rules for probability.

FIGURE 4.5 Probability as area under a density curve. These Uniform density curves spread probability evenly between 0 and 1. [Panel (a): the area 0.4 above the interval from 0.3 to 0.7 is $P(0.3 \le X \le 0.7)$. Panel (b): the areas 0.5 (below 0.5) and 0.2 (above 0.8) together give $P(X \le 0.5 \text{ or } X > 0.8)$.]

Compare the density curves in Figure 4.5 with the probability histogram in Figure 4.3(a). Both describe Uniform distributions. In Figure 4.3(a), there are only 9 possible outcomes, the whole numbers $1, 2, \ldots, 9$. In Figure 4.5, the outcome can be any number between 0 and 1. In both figures, probability is given by area. Although the mathematics required to deal with density curves is more advanced, the most important ideas of probability apply to both kinds of probability models.

APPLY YOUR KNOWLEDGE

4.20 Random numbers. Let $Y$ be a random number between 0 and 1 produced by the idealized Uniform random number generator described in Example 4.7 and Figure 4.5. Find the following probabilities:

(a) $P(0 \le Y \le 0.4)$

(b) $P(0.4 \le Y \le 1)$

(c) $P(0.3 \le Y \le 0.5)$

(d) $P(0.3 < Y < 0.5)$

4.21 Adding random numbers. Generate two random numbers between 0 and 1 and take $T$ to be their sum. The sum $T$ can take any value between 0 and 2. The density curve of $T$ is the triangle shown in Figure 4.6.

FIGURE 4.6 The density curve for the sum of two random numbers, for Exercise 4.21. This density curve spreads probability between 0 and 2. [A triangle over the interval from 0 to 2 with height 1 at its peak.]

(a) Verify by geometry that the area under this curve is 1.

(b) What is the probability that $T$ is less than 1? (Sketch the density curve, shade the area that represents the probability, then find that area. Do this for (c) also.)

(c) What is $P(T < 0.5)$?

Normal probability models

Any density curve can be used to assign probabilities. The density curves that are most familiar to us are the Normal curves introduced in Section 1.3. So Normal distributions are probability models. There is a close connection between a Normal distribution as an idealized description for data and a Normal probability model. If we look at the weights of all 9-ounce bags of a particular brand of potato chip, we find that they closely follow the Normal distribution with mean $\mu = 9.12$ ounces and standard deviation $\sigma = 0.15$ ounces, $N(9.12, 0.15)$. That is a distribution for a large set of data. Now choose one 9-ounce bag at random. Call its weight $W$. If we repeat the random choice very many times, the distribution of values of $W$ is the same Normal distribution.

EXAMPLE 4.8 The weight of bags of potato chips

What is the probability that a randomly chosen 9-ounce bag has weight between 9.33 and 9.45 ounces?

The weight $W$ of the bag we choose has the $N(9.12, 0.15)$ distribution. Find the probability by standardizing and using Table A, the table of standard Normal probabilities. We will reserve capital $Z$ for a standard Normal variable, $N(0, 1)$.

$$P(9.33 \le W \le 9.45) = P\left(\frac{9.33 - 9.12}{0.15} \le \frac{W - 9.12}{0.15} \le \frac{9.45 - 9.12}{0.15}\right)$$

$$= P(1.4 \le Z \le 2.2)$$

$$= 0.9861 - 0.9192 = 0.0669$$

Figure 4.7 shows the area under the standard Normal curve. The calculation is the same as those we did in Chapter 1. Only the language of probability is new.
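The standardization in Example 4.8 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of Table A; the variable names are illustrative. Software carries more decimal places than the table, so the result agrees with 0.0669 only after rounding.

```python
from statistics import NormalDist

# Normal model for bag weights from Example 4.8: mean 9.12 oz, standard deviation 0.15 oz.
weight = NormalDist(mu=9.12, sigma=0.15)

# Standardize the two endpoints: z = (w - mu) / sigma.
z_low = (9.33 - 9.12) / 0.15
z_high = (9.45 - 9.12) / 0.15

# Probability as a difference of cumulative areas, computed two equivalent ways.
p_direct = weight.cdf(9.45) - weight.cdf(9.33)
standard = NormalDist()  # N(0, 1)
p_standardized = standard.cdf(z_high) - standard.cdf(z_low)

print(round(z_low, 1), round(z_high, 1))
print(round(p_direct, 4), round(p_standardized, 4))
```

Both routes give the same area, which illustrates that standardizing changes only the scale on which the probability is read.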

FIGURE 4.7 The probability in Example 4.8 as an area under the standard Normal curve. [The area between $Z = 1.4$ and $Z = 2.2$ is shaded; probability = 0.0669.]

APPLY YOUR KNOWLEDGE

4.22 MPG for 2001 vehicles. The Normal distribution with mean $\mu = 21.2$ miles per gallon and standard deviation $\sigma = 5.4$ MPG is an approximate model for the city gas mileage of 2001 model year vehicles. Figure 1.14 (page 51) pictures this distribution. Let $X$ be the city MPG of one 2001 vehicle chosen at random.

(a) Write the event “the vehicle chosen has a city MPG of 32 or higher” in terms of $X$.

(b) Find the probability of this event.

4.23 NAEP math scores. Scores on the National Assessment of Educational Progress 12th-grade mathematics test for the year 2000 were approximately Normal with mean $\mu = 300$ points (out of 500 possible) and standard deviation $\sigma = 35$ points. Let $Y$ stand for the score of a randomly chosen student. Express each of the following events in terms of $Y$ and use the 68–95–99.7 rule to give the approximate probability.

(a) The student has a score above 300.

(b) The student’s score is above 370.

SECTION 4.2 SUMMARY

A probability model for a random phenomenon consists of a sample space $S$ and an assignment of probabilities $P$.

The sample space $S$ is the set of all possible outcomes of the random phenomenon. Sets of outcomes are called events. $P$ assigns a number $P(A)$ to an event $A$ as its probability.

Any assignment of probability must obey the rules that state the basic properties of probability:

1. $0 \le P(A) \le 1$ for any event $A$.

2. $P(S) = 1$.

3. For any event $A$, $P(A \text{ does not occur}) = 1 - P(A)$.

4. Addition rule: Events $A$ and $B$ are disjoint if they have no outcomes in common. If $A$ and $B$ are disjoint, then $P(A \text{ or } B) = P(A) + P(B)$.

When a sample space $S$ contains finitely many possible outcomes, a probability model assigns each of these outcomes a probability between 0 and 1 such that the sum of all the probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the outcomes that make up the event.

A sample space can contain all values in some interval of numbers. A probability model assigns probabilities as areas under a density curve. The probability of any event is the area under the curve above the outcomes that make up the event.

SECTION 4.2 EXERCISES

4.24 Stock price movements. You watch the price of Cisco Systems stock for four days. Give a sample space for each of these random phenomena:

(a) You record the sequence of up days and down days.

(b) You record the number of up days.

4.25 Land in Iowa. Choose an acre of land in Iowa at random. The probability is 0.92 that it is farmland and 0.01 that it is forest.

(a) What is the probability that the acre chosen is not farmland?

(b) What is the probability that it is either farmland or forest?

(c) What is the probability that a randomly chosen acre in Iowa is something other than farmland or forest?

4.26 Car colors. Choose a new car or light truck at random and note its color. Here are the probabilities of the most popular colors for cars made in North America in 2000:

Color         Silver   White   Black   Dark green   Dark blue   Medium red
Probability   0.176    0.172   0.113   0.089        0.088       0.067

What is the probability that the car you choose has any color other than the six listed? What is the probability that a randomly chosen car is either silver or white?

4.27 Colors of M&Ms. The colors of candies such as M&Ms are carefully chosen to match consumer preferences. The color of an M&M drawn at random from a bag has a probability distribution determined by the proportions of colors among all M&Ms of that type.

(a) Here is the distribution for plain M&Ms:

Color         Brown   Red   Yellow   Green   Orange   Blue
Probability   0.3     0.2   0.2      0.1     0.1      ?

What must be the probability of drawing a blue candy?

(b) The probabilities for peanut M&Ms are a bit different. Here they are:

Color         Brown   Red   Yellow   Green   Orange   Blue
Probability   0.2     0.2   0.2      0.1     0.1      ?

What is the probability that a peanut M&M chosen at random is blue?

(c) What is the probability that a plain M&M is any of red, yellow, or orange? What is the probability that a peanut M&M has one of these colors?

4.28 Crispy M&Ms. Exercise 4.27 gives the probabilities that an M&M candy is each of brown, red, yellow, green, orange, and blue. “Crispy Chocolate” M&Ms are equally likely to be any of these colors. What is the probability of any one color?

4.29 Legitimate probabilities? In each of the following situations, state whether or not the given assignment of probabilities to individual outcomes is legitimate, that is, satisfies the rules of probability. If not, give specific reasons for your answer.

(a) When a coin is spun, $P(\text{H}) = 0.55$ and $P(\text{T}) = 0.45$.

(b) When two coins are tossed, $P(\text{HH}) = 0.4$, $P(\text{HT}) = 0.4$, $P(\text{TH}) = 0.4$, and $P(\text{TT}) = 0.4$.

(c) Plain M&Ms have not always had the mixture of colors given in Exercise 4.27. In the past there were no red candies and no blue candies. Tan had probability 0.10, and the other four colors had the same probabilities that are given in Exercise 4.27.

4.30 Who goes to Paris? Abby, Deborah, Sam, Tonya, and Roberto work in a firm’s public relations office. Their employer must choose two of them to attend a conference in Paris. To avoid unfairness, the choice will be made by drawing two names from a hat. (This is an SRS of size 2.)

(a) Write down all possible choices of two of the five names. This is the sample space.

(b) The random drawing makes all choices equally likely. What is the probability of each choice?

(c) What is the probability that Tonya is chosen?

(d) What is the probability that neither of the two men (Sam and Roberto) is chosen?

4.31 How big are farms? Choose an American farm at random and measure its size in acres. Here are the probabilities that the farm chosen falls in several acreage categories:

Acres         <10    10–49   50–99   100–179   180–499   500–999   1000–1999   ≥2000
Probability   0.09   0.20    0.15    0.16      0.22      0.09      0.05        0.04

Let $A$ be the event that the farm is less than 50 acres in size, and let $B$ be the event that it is 500 acres or more.

(a) Find $P(A)$ and $P(B)$.

(b) Describe the event “$A$ does not occur” in words and find its probability by Rule 3.

(c) Describe the event “$A$ or $B$” in words and find its probability by the addition rule.

4.32 Roulette. A roulette wheel has 38 slots, numbered 0, 00, and 1 to 36. The slots 0 and 00 are colored green, 18 of the others are red, and 18 are black. The dealer spins the wheel and at the same time rolls a small ball along the wheel in the opposite direction. The wheel is carefully balanced so that the ball is equally likely to land in any slot when the wheel slows. Gamblers can bet on various combinations of numbers and colors.

(a) What is the probability of any one of the 38 possible outcomes? Explain your answer.

(b) If you bet on “red,” you win if the ball lands in a red slot. What is the probability of winning?

(c) The slot numbers are laid out on a board on which gamblers place their bets. One column of numbers on the board contains all multiples of 3, that is, $3, 6, 9, \ldots, 36$. You place a “column bet” that wins if any of these numbers comes up. What is your probability of winning?

4.33 Consumer preference. Suppose that half of all computer users prefer the new version of your company’s best-selling software, and half prefer the old version. If you randomly choose three consumers and record their preferences, there are 8 possible outcomes. For example, NNO means the first two consumers prefer the new (N) version and the third consumer prefers the old version (O). All 8 arrangements are equally likely.

(a) Write down all 8 arrangements of preferences of three consumers. What is the probability of any one of these arrangements?

(b) Let $X$ be the number of consumers who prefer the new version of the product. What is the probability that $X = 2$?

(c) Starting from your work in (a), list the values $X$ can take and the probability for each value.

4.34 Moving up. A study of class mobility in England looked at the economic class reached by the sons of lower-class fathers. Economic classes are numbered from 1 (low) to 5 (high). Take $X$ to be the class of a randomly chosen son of a father in Class 1. The study found that the probability model for $X$ is

Son’s class   1      2      3      4      5
Probability   0.48   0.38   0.08   0.05   0.01

(a) Check that this distribution satisfies the two requirements for a legitimate assignment of probabilities to individual outcomes.

(b) What is $P(X \le 3)$? (Be careful: the event “$X \le 3$” includes the value 3.)

(c) What is $P(X < 3)$?

(d) Write the event “a son of a lower-class father reaches one of the two highest classes” in terms of values of $X$. What is the probability of this event?

4.35 How large are households? Choose an American household at random and let $X$ be the number of persons living in the household. If we ignore the few

households with more than seven inhabitants, the probability model for $X$ is as follows:

Household size   1      2      3      4      5      6      7
Probability      0.25   0.32   0.17   0.15   0.07   0.03   0.01

(a) Verify that this is a legitimate probability distribution.

(b) What is $P(X \ge 5)$?

(c) What is $P(X > 5)$?

(d) What is $P(2 < X \le 4)$?

(e) What is $P(X \ne 1)$?

(f) Write the event that a randomly chosen household contains more than two persons in terms of $X$. What is the probability of this event?

4.36 Random numbers. Many random number generators allow users to specify the range of the random numbers to be produced. Suppose that you specify that the random number $Y$ can take any value between 0 and 2. Then the density curve of the outcomes has constant height between 0 and 2, and height 0 elsewhere.

(a) What is the height of the density curve between 0 and 2? Draw a graph of the density curve.

(b) Use your graph from (a) and the fact that probability is area under the curve to find $P(Y \le 1)$.

(c) Find $P(0.5 < Y < 1.3)$.

(d) Find $P(Y \ge 0.8)$.

4.3 Random Variables

Not all sample spaces are made up of numbers. When we toss a coin four times, we can record the outcome as a string of heads and tails, such as HTTH. In statistics, however, we are most often interested in numerical outcomes such as the count of heads in the four tosses. It is convenient to use a shorthand notation: Let $X$ be the number of heads. If our outcome is HTTH, then $X = 2$. If the next outcome is TTTH, the value of $X$ changes to $X = 1$. The possible values of $X$ are 0, 1, 2, 3, and 4. Tossing a coin four times will give $X$ one of these possible values. Tossing four more times will give $X$ another and probably different value. We call $X$ a random variable because its values vary when the coin tossing is repeated. Examples 4.6 and 4.7 used this shorthand notation. In Example 4.8, we let $W$ stand for the weight of a randomly chosen 9-ounce bag of potato chips. We know that $W$ would take a different value if we took another random sample. Because its value changes from one sample to another, $W$ is also a random variable.

RANDOM VARIABLE

A random variable is a variable whose value is a numerical outcome of a random phenomenon.

We usually denote random variables by capital letters near the end of the alphabet, such as $X$ or $Y$. Of course, the random variables of greatest interest to us are outcomes such as the mean $\bar{x}$ of a random sample, for which we will keep the familiar notation. As we progress from general rules of probability toward statistical inference, we will concentrate on random variables. When a random variable describes a random phenomenon, the sample space $S$ just lists the possible values of the random variable. We usually do not mention $S$ separately.

We will meet two types of random variables, corresponding to two ways of assigning probability: discrete and continuous. Compare the random variable $X$ in Example 4.6 (page 243) and the random variable $Y$ in Example 4.7 (page 246). Both have Uniform distributions that distribute probability evenly across the possible outcomes, but one is discrete and the other is continuous.

The possible values of $X$ in Example 4.6 are the whole numbers $\{1, 2, 3, 4, 5, 6, 7, 8, 9\}$. These nine values are separated on the number line by other numbers that are not possible values of $X$:

[Number line marking the nine separate points 1, 2, 3, 4, 5, 6, 7, 8, 9.]

$X$ is a discrete random variable. “Discrete” describes the nature of the sample space for $X$. In contrast, the sample space for $Y$ in Example 4.7 is $\{\text{all numbers between 0 and 1}\}$. These possible values are not separated on the number line. As you move continuously from the value 0 to the value 1, you do not leave the sample space for $Y$:

[Number line with the whole interval from 0 to 1 marked.]

$Y$ is a continuous random variable. Once again, “continuous” describes the nature of the sample space.

APPLY YOUR KNOWLEDGE

4.37 For each exercise listed below, decide whether the random variable described is discrete or continuous and give a brief explanation for your response.

(a) Exercise 4.12(b), page 239

(b) Exercise 4.12(e), page 240

(c) Exercise 4.22, page 249

(d) Exercise 4.35, page 252

4.38 For each exercise listed below, decide whether the random variable described is discrete or continuous and sketch the sample space on a number line.

(a) Exercise 4.13(a), page 240

(b) Exercise 4.13(c), page 240

(c) Exercise 4.13(e), page 240

Probability Distributions

The starting point for studying any random variable is its probability distribution, which is just the probability model for the outcomes.

PROBABILITY DISTRIBUTION

The probability distribution of a random variable $X$ tells us what values $X$ can take and how to assign probabilities to those values.

Because of the difference in the nature of sample spaces for discrete and continuous random variables, we describe probability distributions for the two types of random variables separately.

Discrete Random Variables

A discrete random variable $X$ like the random digit in Example 4.6 (page 243) has a finite number of possible values. The probability distribution of $X$ therefore simply lists the values and their probabilities.

DISCRETE PROBABILITY DISTRIBUTIONS

The probability distribution of a discrete random variable $X$ lists the possible values of $X$ and their probabilities:

Value of $X$   $x_1$   $x_2$   $x_3$   $\cdots$   $x_k$
Probability    $p_1$   $p_2$   $p_3$   $\cdots$   $p_k$

The probabilities $p_i$ must satisfy two requirements:

1. Every probability $p_i$ is a number between 0 and 1, $0 \le p_i \le 1$.

2. The sum of the probabilities is exactly 1, $p_1 + p_2 + \cdots + p_k = 1$.

To find the probability of any event, add the probabilities $p_i$ of the individual values $x_i$ that make up the event.

We can use a probability histogram to display a discrete distribution. Figure 4.3 (page 244) compares the distributions of random digits and first digits that obey Benford’s Law.

EXAMPLE 4.9 Hard-drive sizes

Buyers of a laptop computer model may choose to purchase either a 10 GB (gigabyte), 20 GB, 30 GB, or 40 GB internal hard drive. Choose customers from the last 60 days at random to ask what influenced their choice of computer. To “choose at random” means to give every customer of the last 60 days the same chance to be chosen. The size (in gigabytes) of the internal hard drive chosen by a randomly selected customer is a random variable $X$.

The value of $X$ changes when we repeatedly choose customers at random, but it is always one of 10, 20, 30, or 40 GB. Here is the distribution of $X$:

Hard-drive size   10     20     30     40
Probability       0.50   0.25   0.15   0.10

The probability histogram in Figure 4.8 pictures this distribution. The probability that a randomly selected customer chose at least a 30 GB hard drive is

$$P(\text{size is 30 or 40}) = P(X = 30) + P(X = 40) = 0.15 + 0.10 = 0.25$$

FIGURE 4.8 Probability histogram for the distribution of hard-drive sizes in Example 4.9. [Probability, from 0.0 to 0.6, plotted against the outcomes 10, 20, 30, and 40.]

EXAMPLE 4.10 Four coin tosses

Toss a balanced coin four times; the discrete random variable $X$ counts the number of heads. How shall we find the probability distribution of $X$?

The outcome of four tosses is a sequence of heads and tails such as HTTH. There are 16 possible outcomes in all. Figure 4.9 lists these outcomes along with the value of $X$ for each outcome. A reasonable probability model says that the 16 outcomes are all equally likely; that is, each has probability 1/16. (This model is justified both by empirical observation and by a theoretical derivation that appears in the optional Chapter 5.)

The number of heads $X$ has possible values 0, 1, 2, 3, and 4. These values are not equally likely. As Figure 4.9 shows, $X = 0$ can occur in only one way, when the outcome is TTTT. So $P(X = 0) = 1/16$. The outcome $X = 2$, on the other hand, can occur in six different ways, so that

$$P(X = 2) = \frac{\text{count of ways } X = 2 \text{ can occur}}{16} = \frac{6}{16}$$
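The counting argument for four coin tosses can be sketched by listing all 16 equally likely sequences and tallying the number of heads in each. The variable names below are illustrative; exact fractions keep the probabilities in the same form as 6/16, which Python reduces to 3/8.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All sequences of H and T of length 4; each has probability 1/16 under the model.
outcomes = ["".join(seq) for seq in product("HT", repeat=4)]

# Count how many sequences give each number of heads.
ways = Counter(seq.count("H") for seq in outcomes)

# P(X = x) = (count of ways X = x can occur) / 16
distribution = {x: Fraction(ways[x], len(outcomes)) for x in sorted(ways)}

for x, p in distribution.items():
    print(x, ways[x], p, float(p))
```

The same enumeration gives the probability of every other value of the count of heads, since each is a count of sequences divided by 16.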

FIGURE 4.9 Possible outcomes in four tosses of a coin, for Example 4.10. The random variable $X$ is the number of heads.

X = 0: TTTT
X = 1: HTTT, THTT, TTHT, TTTH
X = 2: HTTH, HTHT, THTH, HHTT, THHT, TTHH
X = 3: HHHT, HHTH, HTHH, THHH
X = 4: HHHH

We can find the probability of each value of $X$ from Figure 4.9 in the same way. Here is the result:

$$P(X = 0) = \frac{1}{16} = 0.0625$$

$$P(X = 1) = \frac{4}{16} = 0.25$$

$$P(X = 2) = \frac{6}{16} = 0.375$$

$$P(X = 3) = \frac{4}{16} = 0.25$$

$$P(X = 4) = \frac{1}{16} = 0.0625$$

These probabilities have sum 1, so this is a legitimate probability distribution. In table form, the distribution is

Number of heads   0        1      2       3      4
Probability       0.0625   0.25   0.375   0.25   0.0625

Figure 4.10 is a probability histogram for this distribution. The probability distribution is exactly symmetric. It is an idealization of the distribution of the proportions for numbers of heads after many tosses of four coins, which would be nearly symmetric but is unlikely to be exactly symmetric.

FIGURE 4.10 Probability histogram for the number of heads in four tosses of a coin, for Example 4.10. [Probability, from 0.0 to 0.4, plotted against the outcomes 0, 1, 2, 3, and 4.]

Any event involving the number of heads observed can be expressed in terms of $X$, and its probability can be found from the distribution of $X$. For example, the probability of tossing at least two heads is

$$P(X \ge 2) = 0.375 + 0.25 + 0.0625 = 0.6875$$

The probability of at least one head is most simply found by

$$P(X \ge 1) = 1 - P(X = 0) = 1 - 0.0625 = 0.9375$$

Why talk about tossing coins? The probability distribution for repeated coin tossing also fits many real-world problems: respondents answer Yes or No to a poll question; consumers buy or don’t buy a product after viewing an advertisement; an inspector finds good or defective parts. Of course, we must allow the probability of a “head” to be something other than one-half in these settings, but that is a minor change. We will extend the results of Example 4.10 when we return to sampling distributions in the next section.

Continuous Random Variables

A continuous random variable like the Uniform random number $Y$ in Example 4.7 (page 246) or the Normal package weight $W$ in Example 4.8 (page 248) has an infinite number of possible values. Continuous probability distributions therefore assign probabilities directly to events as areas under a density curve.

CONTINUOUS PROBABILITY DISTRIBUTIONS

The probability distribution of a continuous random variable $X$ is described by a density curve. The probability of any event is the area under the density curve and above the values of $X$ that make up the event.

EXAMPLE 4.11 Tread life

The actual tread life $X$ of a 40,000-mile automobile tire has a Normal probability distribution with $\mu = 50{,}000$ miles and $\sigma = 5500$ miles. We say $X$ has a $N(50000, 5500)$ distribution. The probability that a randomly selected tire has a tread life less than 40,000 miles is

$$P(X < 40{,}000) = P\left(\frac{X - 50{,}000}{5500} < \frac{40{,}000 - 50{,}000}{5500}\right)$$

$$= P(Z < -1.82)$$

$$= 0.0344$$
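The tread-life probability in Example 4.11 can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The sketch computes it twice: once after rounding $z$ to two decimal places, as a table lookup requires, and once without rounding. The variable names are illustrative.

```python
from statistics import NormalDist

# Normal model for tread life from Example 4.11.
tread_life = NormalDist(mu=50_000, sigma=5_500)

# Standardize 40,000 miles: z = (x - mu) / sigma.
z = (40_000 - 50_000) / 5_500

# Table-style answer: round z to two decimals before looking up the area.
p_table = NormalDist().cdf(round(z, 2))

# Software answer: no rounding of z.
p_exact = tread_life.cdf(40_000)

print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```

The two answers differ slightly in the fourth decimal place because the table lookup uses $z = -1.82$ while the unrounded value is about $-1.818$; the difference has no practical importance here.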

The $N(50000, 5500)$ distribution is shown in Figure 4.11 with the area $P(X < 40{,}000)$ shaded.

FIGURE 4.11 The Normal distribution with $\mu = 50{,}000$ and $\sigma = 5500$. The shaded area is $P(X < 40{,}000)$, calculated in Example 4.11. [Horizontal axis from 30,000 to 70,000; shaded area = 0.0344.]

The probability distribution for a continuous random variable assigns probabilities to intervals of outcomes rather than to individual outcomes. In fact, all continuous probability distributions assign probability 0 to every individual outcome. Only intervals of values have positive probability. To see that this makes sense, return to the Uniform random number generator of Example 4.7.

EXAMPLE 4.12 Can a random number be exactly 0.8?

What is $P(Y = 0.8)$, the probability that a random number generator produces exactly 0.8? Figure 4.12 shows the Uniform density curve for outcomes between 0 and 1. Probabilities are areas under this curve. The probability of any interval of outcomes is equal to its length. This implies that because the point 0.8 has no length, its probability is 0.

Although this fact may seem odd, it makes intuitive as well as mathematical sense. The random number generator produces a number between 0.79 and 0.81 with probability 0.02. An outcome between 0.799 and 0.801 has probability 0.002. A result between 0.799999 and 0.800001 has probability 0.000002. You see that as we home in on 0.8 the probability gets closer to 0. To be consistent, the probability of an outcome exactly equal to 0.8 must be 0.
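The shrinking intervals around 0.8 can be tabulated directly. The sketch below uses exact fractions so that the interval lengths come out as exactly 0.02, 0.002, and 0.000002; the variable names are illustrative.

```python
from fractions import Fraction

center = Fraction(8, 10)
half_widths = [Fraction(1, 100), Fraction(1, 1000), Fraction(1, 1_000_000)]

probabilities = []
for h in half_widths:
    low, high = center - h, center + h
    # For the Uniform density on 0 to 1, probability equals interval length.
    probabilities.append(high - low)
    print(float(low), float(high), float(high - low))
```

Each probability is twice the half-width, so it can be made as small as we like by narrowing the interval, which is consistent with assigning probability 0 to the single point 0.8.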

Because there is no probability exactly at $Y = 0.8$, the two events $\{Y > 0.8\}$ and $\{Y \ge 0.8\}$ have the same probability. We can ignore the distinction between $>$ and $\ge$ when finding probabilities for continuous (but not discrete) random variables.

FIGURE 4.12 The density curve of the Uniform distribution of numbers chosen at random between 0 and 1. This model assigns probability 0 to the event that the number generated has any one specific value, such as 0.8. [Horizontal axis from 0 to 1 with the point 0.8 marked.]

Example 4.12 reminds us that continuous probability distributions are idealizations. We don’t measure the weight of potato chip bags or the tread life of tires to an unlimited number of decimal places. If we measure tread life to the nearest mile, we could use a discrete distribution with a very small probability on each specific tread life. It is much more convenient to use a continuous Normal distribution. There is an element of “not exactly right but good enough to use in practice” in most choices of a probability model.

We began this chapter with a general discussion of the idea of probability and the properties of probability models. Two specific types of probability models are distributions of discrete and continuous random variables. Our study of statistics will employ only these two types of probability models.

APPLY YOUR KNOWLEDGE

4.39 How many cars? Choose an American household at random and let the random variable $X$ be the number of cars (including SUVs and light trucks) they own. Here is the probability model if we ignore the few households that own more than 5 cars:

Number of cars   0      1      2      3      4      5
Probability      0.09   0.36   0.35   0.13   0.05   0.02

(a) Verify that this is a legitimate discrete distribution. Display the distribution in a probability histogram.

(b) Say in words what the event $\{X \ge 1\}$ is. Find $P(X \ge 1)$.

(c) Your company builds houses with two-car garages. What percent of households have more cars than the garage can hold?

4.40 Let \(Y\) be a random number between 0 and 1 produced by the idealized Uniform random number generator with density curve pictured in Figure 4.12. Find the following probabilities:

(a) \(P(0 \le Y \le 0.4)\)

(b) \(P(0.4 \le Y \le 1)\)

(c) \(P(0.3 \le Y \le 0.5)\)

(d) \(P(0.3 < Y < 0.5)\)

(e) \(P(0.226 \le Y \le 0.713)\)

4.41 Example 4.11 gives the Normal distribution \(N(50000, 5500)\) for the tread life \(X\) of a type of tire (in miles). Calculate the following probabilities:

(a) The probability that a tire lasts more than 50,000 miles.

(b) \(P(X > 60{,}000)\)

(c) \(P(X \ge 60{,}000)\)

The mean of a random variable

Probability is the mathematical language that describes the long-run regular behavior of random phenomena. The probability distribution of a random variable is an idealized distribution of the proportions of outcomes in very many observations. The probability histograms and density curves that picture probability distributions resemble our earlier pictures of distributions of data. In describing data, we moved from graphs to numerical measures such as means and standard deviations. Now we will make the same move to expand our descriptions of the distributions of random variables. We can speak of the mean winnings in a game of chance or the standard deviation of the randomly varying number of calls a travel agency receives in an hour. We will learn more about how to compute these descriptive measures and about the laws they obey.

The mean \(\bar{x}\) of a set of observations is their ordinary average. The mean of a random variable \(X\) is also an average of the possible values of \(X\), but with an essential change to take into account the fact that not all outcomes need be equally likely. An example will show what we must do.

EXAMPLE 4.13 Pick 3

Most states and Canadian provinces have government-sponsored lotteries. Here is a simple lottery wager, from the Tri-State Pick 3 game that New Hampshire shares with Maine and Vermont. You choose a three-digit number; the state chooses a three-digit winning number at random and pays you $500 if your number is chosen. Because there are 1000 three-digit numbers, you have probability 1/1000 of winning. Taking \(X\) to be the amount your ticket pays you, the probability distribution of \(X\) is

Payoff       $0     $500
Probability  0.999  0.001

Page 34: chap_04

What is your average payoff from many tickets? The ordinary average of the two possible outcomes $0 and $500 is $250, but that makes no sense as the average because $500 is much less likely than $0. In the long run you receive $500 once in every 1000 tickets and $0 on the remaining 999 of 1000 tickets. The long-run average payoff is

\[ \$500 \times \frac{1}{1000} + \$0 \times \frac{999}{1000} = \$0.50 \]

or fifty cents. That number is the mean of the random variable \(X\). (Tickets cost $1, so in the long run the state keeps half the money you wager.)

If you play Tri-State Pick 3 several times, we would as usual call the mean of the actual amounts your tickets pay \(\bar{x}\). The mean in Example 4.13 is a different quantity: it is the long-run average payoff you expect if you play a very large number of times. Just as probabilities are an idealized description of long-run proportions, the mean of a probability distribution describes the long-run average outcome. The common symbol for the mean of a probability distribution is \(\mu\), the Greek letter mu. We used \(\mu\) in Chapter 1 for the mean of a Normal distribution, so this is not a new notation. We will often be interested in several random variables, each having a different probability distribution with a different mean. To remind ourselves that we are talking about the mean of \(X\), we can write \(\mu_X\) rather than simply \(\mu\). In Example 4.13, \(\mu_X = \$0.50\). Notice that, as often happens, the mean is not a possible value of \(X\). You will often find the mean of a random variable \(X\) called the expected value of \(X\). This term can be misleading, for we don’t necessarily expect one observation on \(X\) to be close to its mean.

The mean of any discrete random variable is found just as in Example 4.13. It is an average of the possible outcomes, but a weighted average in which each outcome is weighted by its probability. Because the probabilities add to 1, we have total weight 1 to distribute among the outcomes. An outcome that occurs half the time has probability one-half and gets one-half the weight in calculating the mean. Here is the general definition.

MEAN OF A DISCRETE RANDOM VARIABLE

Suppose that \(X\) is a discrete random variable whose distribution is

Value of \(X\)   \(x_1\)  \(x_2\)  \(x_3\)  \(\cdots\)  \(x_k\)
Probability    \(p_1\)  \(p_2\)  \(p_3\)  \(\cdots\)  \(p_k\)

To find the mean of \(X\), multiply each possible value by its probability, then add all the products:

\[ \mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k = \sum x_i p_i \]
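The weighted average that defines the mean of a discrete random variable is short enough to write out directly. The sketch below is an illustration only, not code from the text: the function name `discrete_mean` and the dictionary layout (value mapped to probability) are assumptions. It uses exact fractions so that the check that the probabilities add to 1 is not disturbed by rounding.

```python
from fractions import Fraction


def discrete_mean(distribution):
    """Mean of a discrete random variable given as {value: probability}.

    Hypothetical helper: multiplies each value by its probability and
    adds the products.
    """
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must add to 1")
    return sum(value * prob for value, prob in distribution.items())


# Pick 3 payoff distribution from Example 4.13, in dollars.
pick3_payoff = {0: Fraction(999, 1000), 500: Fraction(1, 1000)}
mean_payoff = discrete_mean(pick3_payoff)

print(float(mean_payoff))  # long-run average payoff per ticket, in dollars
```

The result is one-half dollar, a value that no single ticket ever pays, which is the point made above about the term "expected value."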

Page 35: chap_04

EXAMPLE 4.14 First digits (CASE 4.1)

If first digits in a set of records appear “at random,” the nine possible digits 1 to 9 all have the same probability. The probability distribution of the first digit \(X\) is then

First digit   1    2    3    4    5    6    7    8    9
Probability   1/9  1/9  1/9  1/9  1/9  1/9  1/9  1/9  1/9

The mean of this distribution is

\[ \mu_X = 1 \times \frac{1}{9} + 2 \times \frac{1}{9} + 3 \times \frac{1}{9} + 4 \times \frac{1}{9} + 5 \times \frac{1}{9} + 6 \times \frac{1}{9} + 7 \times \frac{1}{9} + 8 \times \frac{1}{9} + 9 \times \frac{1}{9} = 45 \times \frac{1}{9} = 5 \]

If, on the other hand, the records obey Benford’s Law, the distribution of the first digit \(V\) is

First digit   1      2      3      4      5      6      7      8      9
Probability   0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

The mean of \(V\) is

\[ \mu_V = (1)(0.301) + (2)(0.176) + (3)(0.125) + (4)(0.097) + (5)(0.079) + (6)(0.067) + (7)(0.058) + (8)(0.051) + (9)(0.046) = 3.441 \]

The means reflect the greater probability of smaller first digits under Benford’s Law.

Figure 4.13 locates the means of \(X\) and \(V\) on the two probability histograms. Because the discrete Uniform distribution of Figure 4.13(a) is symmetric, the mean lies at the center of symmetry. We can’t locate the mean of the right-skewed distribution of Figure 4.13(b) by eye; calculation is needed.

What about continuous random variables? The probability distribution of a continuous random variable \(X\) is described by a density curve. Chapter 1 showed how to find the mean of the distribution: it is the point at which the area under the density curve would balance if it were made out of solid material. The mean lies at the center of symmetric density curves such as the Normal curves. Exact calculation of the mean of a distribution with a skewed density curve requires advanced mathematics. The idea that the mean is the balance point of the distribution applies to discrete random variables as well, but in the discrete case we have a formula that gives us the numerical value of \(\mu_X\).
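The two means in Example 4.14, and the balance-point idea, can be checked with a few lines of Python. This is a sketch under stated assumptions rather than anything from the original text: the names `weighted_mean` and `balance` are hypothetical, and the Benford probabilities are the rounded values printed in the example.

```python
import math

digits = list(range(1, 10))
uniform_probs = [1 / 9] * 9
benford_probs = [0.301, 0.176, 0.125, 0.097, 0.079,
                 0.067, 0.058, 0.051, 0.046]


def weighted_mean(values, probs):
    """Sum of value * probability, after checking the probabilities add to 1."""
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must add to 1")
    return sum(v * p for v, p in zip(values, probs))


def balance(values, probs, point):
    """Probability-weighted sum of deviations from `point`.

    This is zero exactly when `point` is the mean (the balance point).
    """
    return sum((v - point) * p for v, p in zip(values, probs))


mean_uniform = weighted_mean(digits, uniform_probs)
mean_benford = weighted_mean(digits, benford_probs)

print(round(mean_uniform, 3), round(mean_benford, 3))
print(abs(balance(digits, benford_probs, mean_benford)) < 1e-12)
```

For the symmetric distribution the balance point is the center, 5. For the right-skewed Benford distribution the deviations balance only at the smaller value found by calculation.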

Page 36: chap_04

FIGURE 4.13 Locating the mean of a discrete random variable on the probability histogram for (a) digits between 1 and 9 chosen at random; (b) digits between 1 and 9 chosen from records that obey Benford’s Law. (Both panels plot Probability against Outcomes.)

APPLY YOUR KNOWLEDGE

4.42 Household size. Choose an American household at random and let \(X\) be the number of persons living in the household. If we ignore the few households with more than seven inhabitants, the probability model for \(X\) is as follows:

Household size   1     2     3     4     5     6     7
Probability      0.25  0.32  0.17  0.15  0.07  0.03  0.01

Find the mean \(\mu_X\). Sketch a probability histogram of the distribution of \(X\) and mark the mean on your sketch. The Census Bureau reports that the mean size of American households is 2.65 people. Why is your result slightly less than this?

4.43 Hard-drive sizes. Example 4.9 (page 255) gives the distribution of customer choices of hard-drive size for a laptop computer model. Find the mean \(\mu\) of this probability distribution. Explain in simple language what \(\mu\) tells us about customer choices. Also explain why knowing \(\mu\) is not very helpful to the computer maker.

4.44 Customer orders. Each week your business receives orders from customers who wish to have air-conditioning units installed in their homes. From past

Page 37: chap_04

data, you estimate the distribution of the number \(X\) of units ordered in a week to be

Units ordered   0     1     2     3     4     5
Probability     0.05  0.15  0.27  0.33  0.13  0.07

(a) Sketch a probability histogram for the distribution of \(X\).

(b) Calculate the mean of \(X\) and mark it on the horizontal axis of your probability histogram.

(c) If you hire workers sufficient to handle the mean demand, what is the probability that you will be unable to handle all of a week’s installation orders?

Rules for means

CASE 4.2 PORTFOLIO ANALYSIS

The rate of return of an investment over a time period is the percent change in the price during the time period, plus any income received. For example, Apple Computer’s stock price was $14.875 at the beginning of January 2001 and $21.625 at the end of that month. Apple pays no dividends, so Apple’s rate of return for that time period was

\[ \frac{\text{change in price}}{\text{starting price}} = \frac{21.625 - 14.875}{14.875} = 0.45, \text{ or } 45\% \]

Investors want high returns, but they also want safety. Apple’s stock price has often halved or doubled within a few months. The variability of returns, called volatility in finance, is a measure of the risk of an investment. A highly volatile stock, which may go either up or down a lot, is more risky than a Treasury bill, whose return is very predictable.

A portfolio is a collection of investments held by an individual or an institution. Portfolio analysis begins by studying how the risk and return of a portfolio are determined by the risk and return of the individual investments it contains. That’s where statistics comes in: the return on an investment over some period of time is a random variable. We are interested in the mean return and we measure volatility by the standard deviation of returns. Let’s say that Sadie’s portfolio places 20% of her funds in Treasury bills and 80% in an “index fund” that represents all U.S. common stocks. If \(X\) is the annual return on T-bills and \(Y\) the annual return on stocks, the portfolio rate of return is

\[ R = 0.2X + 0.8Y \]

How can we find the mean and standard deviation of the portfolio return \(R\) starting from information about \(X\) and \(Y\)? We must now develop the machinery to do this.
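The rate-of-return arithmetic for Apple’s stock in Case 4.2, and the definition of the portfolio return \(R\), can be sketched in Python. This is an illustration only. The function names are hypothetical, and treating income as a dollar amount added to the price change before dividing by the starting price is an assumption about how "plus any income received" enters the calculation.

```python
def rate_of_return(start_price, end_price, income=0.0):
    """Rate of return as a fraction of the starting price.

    Assumption: income (such as dividends) is a dollar amount added to
    the price change before dividing by the starting price.
    """
    return (end_price - start_price + income) / start_price


def portfolio_return(tbill_return, stock_return, tbill_weight=0.2):
    """R = 0.2X + 0.8Y for Sadie's mix; the weight is a parameter for illustration."""
    return tbill_weight * tbill_return + (1.0 - tbill_weight) * stock_return


# Apple Computer, January 2001: no dividends, so income is 0.
apple_jan_2001 = rate_of_return(14.875, 21.625)
print(f"{apple_jan_2001:.2%}")

# Hypothetical single-year returns (in percent), not data from the text.
example_r = portfolio_return(5.0, 10.0)
```

In any one year \(R\) is just this weighted combination of the two returns; the rules that follow describe how its mean and standard deviation behave.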

Page 38: chap_04

Think first not about investments but about making refrigerators. Dimples and paint sags are two kinds of flaws in the painted finish of refrigerators. Not all refrigerators have the same number of dimples: many have none, some have one, some two, and so on. The inspectors report finding an average of 0.7 dimples and 1.4 sags per refrigerator. How many total imperfections of both kinds (on the average) are on a refrigerator? That’s easy: if the average number of dimples is 0.7 and the average number of sags is 1.4, then counting both gives an average of 0.7 + 1.4 = 2.1 flaws.

In more formal language, the number of dimples on a refrigerator is a random variable \(X\) that varies as we inspect one refrigerator after another. We know only that the mean number of dimples is \(\mu_X = 0.7\). The number of paint sags is a second random variable \(Y\) having mean \(\mu_Y = 1.4\). (As usual, the subscripts keep straight which variable we are talking about.) The total number of both dimples and sags is another random variable, the sum \(X + Y\). Its mean \(\mu_{X+Y}\) is the average number of dimples and sags together. It is just the sum of the individual means \(\mu_X\) and \(\mu_Y\). That’s an important rule for how means of random variables behave.

Here’s another rule. A large lot of plastic coffee-can lids has mean diameter 4.2 inches. What is the mean in centimeters? There are 2.54 centimeters in an inch, so the diameter in centimeters of any lid is 2.54 times its diameter in inches. If we multiply every observation by 2.54, we also multiply their average by 2.54. The mean in centimeters must be 2.54 × 4.2, or about 10.7 centimeters. More formally, the diameter in inches of a lid chosen at random from the lot is a random variable \(X\) with mean \(\mu_X\). The diameter in centimeters is \(2.54X\), and this new random variable has mean \(2.54\mu_X\).

The point of these examples is that means behave like averages. Here are the rules we need.

RULES FOR MEANS

Rule 1. If \(X\) is a random variable and \(a\) and \(b\) are fixed numbers, then

\[ \mu_{a+bX} = a + b\mu_X \]

Rule 2. If \(X\) and \(Y\) are random variables, then

\[ \mu_{X+Y} = \mu_X + \mu_Y \]

This is the addition rule for means.

EXAMPLE 4.15 Portfolio analysis (CASE 4.2)

The past behavior of the two securities in Sadie’s portfolio is pictured in Figure 4.14, which plots the annual returns on Treasury bills and common stocks for the years 1950 to 2000. We can calculate mean returns from the data behind the plot:

\(X\) = annual return on T-bills   \(\mu_X = 5.2\%\)
\(Y\) = annual return on stocks    \(\mu_Y = 13.3\%\)

Page 39: chap_04

FIGURE 4.14 Annual returns on U.S. common stocks versus returns on Treasury bills for the years 1950 to 2000. (Horizontal axis: Treasury bill return, in percent; vertical axis: stock return, in percent.)

Sadie invests 20% in T-bills and 80% in stocks. We can find the mean return on her portfolio by combining Rules 1 and 2:

\[ R = 0.2X + 0.8Y \]

\[ \mu_R = 0.2\mu_X + 0.8\mu_Y = (0.2)(5.2) + (0.8)(13.3) = 11.68\% \]

This calculation uses historical data on returns. Next year may of course be very different. It is usual in finance to use the term expected return in place of mean return. Remember our warning that we can’t generally expect a future value to be close to the mean.

EXAMPLE 4.16 Gain Communications

Gain Communications sells aircraft communications units to both the military and the civilian markets. Next year’s sales depend on market conditions that cannot be predicted exactly. Gain follows the modern practice of using probability estimates of sales. The military division estimates its sales as follows:

Units sold    1000  3000  5000  10,000
Probability   0.1   0.3   0.4   0.2

Page 40: chap_04

These are personal probabilities that express the informed opinion of Gain’s executives. The corresponding sales estimates for the civilian division are

Units sold    300  500  750
Probability   0.4  0.5  0.1

Take \(X\) to be the number of military units sold and \(Y\) the number of civilian units. From the probability distributions we compute that

\[ \mu_X = (1000)(0.1) + (3000)(0.3) + (5000)(0.4) + (10{,}000)(0.2) = 100 + 900 + 2000 + 2000 = 5000 \text{ units} \]

\[ \mu_Y = (300)(0.4) + (500)(0.5) + (750)(0.1) = 120 + 250 + 75 = 445 \text{ units} \]

Gain makes a profit of $2000 on each military unit sold and $3500 on each civilian unit. Next year’s profit from military sales will be \(2000X\), $2000 times the number of units sold. By Rule 1 for means, the mean military profit is

\[ \mu_{2000X} = 2000\mu_X = (2000)(5000) = \$10{,}000{,}000 \]

Similarly, the civilian profit is \(3500Y\), and the mean profit from civilian sales is

\[ \mu_{3500Y} = 3500\mu_Y = (3500)(445) = \$1{,}557{,}500 \]

The total profit is the sum of the military and civilian profit:

\[ Z = 2000X + 3500Y \]

Rule 2 for means says that the mean of this sum of two variables is the sum of the two individual means:

\[ \mu_Z = \mu_{2000X} + \mu_{3500Y} = \$10{,}000{,}000 + \$1{,}557{,}500 = \$11{,}557{,}500 \]

This mean is the company’s best estimate of next year’s profit, combining the probability estimates of the two divisions. We can do this calculation more quickly by combining Rules 1 and 2:

\[ \mu_Z = \mu_{2000X+3500Y} = 2000\mu_X + 3500\mu_Y = (2000)(5000) + (3500)(445) = \$11{,}557{,}500 \]
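The calculation in Example 4.16 combines two steps: finding each mean from its distribution, then applying Rules 1 and 2. The Python sketch below mirrors those steps. It is illustrative only; the helper names `mean` and `mean_of_linear_combination` are assumptions, while the distributions and per-unit profits are the ones given in the example.

```python
import math

# Sales distributions from Example 4.16: {units sold: probability}.
military_units = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10_000: 0.2}
civilian_units = {300: 0.4, 500: 0.5, 750: 0.1}


def mean(distribution):
    """Mean of a discrete random variable given as {value: probability}."""
    return sum(x * p for x, p in distribution.items())


def mean_of_linear_combination(coefficients, means, constant=0.0):
    """Rules 1 and 2 together: mean of constant + sum of c_i * X_i."""
    return constant + sum(c * m for c, m in zip(coefficients, means))


mu_x = mean(military_units)   # mean military units sold
mu_y = mean(civilian_units)   # mean civilian units sold

# Profit Z = 2000X + 3500Y, in dollars.
mu_z = mean_of_linear_combination([2000, 3500], [mu_x, mu_y])
print(round(mu_x), round(mu_y), round(mu_z))
```

Only the two means are needed to find the mean profit. Nothing about how military and civilian sales vary together enters the calculation.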

APPLY YOUR KNOWLEDGE

4.45 Selling mobile phones. You own a mobile-phone store with two locations, one in the local mall and one downtown on Main Street. Let \(X\) be the number of phones sold during the next month at the mall location, and let \(Y\) be the

Page 41: chap_04

number of phones sold during the next month at the downtown location. For the mall location, you estimate the following probability distribution:

Phones sold   200  300  400
Probability   0.4  0.4  0.2

For the downtown location, your estimated distribution is

Phones sold   100  200  300   400
Probability   0.3  0.5  0.15  0.05

(a) Calculate \(\mu_X\) and \(\mu_Y\).

(b) Your rental cost is higher at the mall location. You make $25 profit on each phone sold at the mall location and $35 profit on each phone sold at the downtown location. Calculate the mean profit for each location.

(c) Calculate the mean profit for both locations combined.

4.46 Mutual funds. The addition rule for means extends to sums of any number of random variables. Let’s look at a portfolio containing three mutual funds. The monthly returns on Fidelity Magellan Fund, Fidelity Real Estate Fund, and Fidelity Japan Fund for the 36 months ending in December 2000 had approximately these means:

\(W\) = Magellan monthly return      \(\mu_W = 1.14\%\)
\(X\) = Real Estate monthly return   \(\mu_X = 0.16\%\)
\(Y\) = Japan monthly return         \(\mu_Y = 1.59\%\)

What is the mean monthly return for a portfolio consisting of 50% Magellan, 30% Real Estate, and 20% Japan?

The variance of a random variable

The mean portfolio return in Example 4.15 is less than the mean return for stocks because Sadie added 20% Treasury bills. What does she gain by giving up some average return? T-bills are much less volatile than stocks, so she hopes that her portfolio returns will vary less from year to year than would be the case if she invested all her funds in stocks. The mean is a measure of the center of a distribution. Even the most basic numerical description requires in addition a measure of the spread or variability of the distribution.

The variance and the standard deviation are the measures of spread that accompany the choice of the mean to measure center. Just as for the mean, we need a distinct symbol to distinguish the variance of a random variable from the variance of a data set. We write the variance of a random variable \(X\) as \(\sigma_X^2\). The definition of the variance \(\sigma_X^2\) of a random variable is similar to the definition of the sample variance \(s^2\) given in Chapter 1. That is, the variance is an average of the squared deviation \((X - \mu_X)^2\) of the variable \(X\) from its mean \(\mu_X\). As for the mean, the average we use is a weighted average in which

Page 42: chap_04

each outcome is weighted by its probability to take account of outcomes that are not equally likely. Calculating this weighted average is straightforward for discrete random variables but requires advanced mathematics in the continuous case. Here is the definition.

VARIANCE OF A DISCRETE RANDOM VARIABLE

Suppose that \(X\) is a discrete random variable whose distribution is

Value of \(X\)   \(x_1\)  \(x_2\)  \(x_3\)  \(\cdots\)  \(x_k\)
Probability    \(p_1\)  \(p_2\)  \(p_3\)  \(\cdots\)  \(p_k\)

and that \(\mu_X\) is the mean of \(X\). The variance of \(X\) is

\[ \sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k = \sum (x_i - \mu_X)^2 p_i \]

The standard deviation \(\sigma_X\) of \(X\) is the square root of the variance.

EXAMPLE 4.17 Gain Communications

In Example 4.16 we saw that the number \(X\) of communications units that the Gain Communications military division hopes to sell has distribution

Units sold    1000  3000  5000  10,000
Probability   0.1   0.3   0.4   0.2

We can find the mean and variance of \(X\) by arranging the calculation in the form of a table. Both \(\mu_X\) and \(\sigma_X^2\) are sums of columns in this table.

\(x_i\)     \(p_i\)   \(x_i p_i\)   \((x_i - \mu_X)^2 p_i\)
1,000     0.1     100       \((1000 - 5000)^2 (0.1)\) = 1,600,000
3,000     0.3     900       \((3000 - 5000)^2 (0.3)\) = 1,200,000
5,000     0.4     2,000     \((5000 - 5000)^2 (0.4)\) = 0
10,000    0.2     2,000     \((10{,}000 - 5000)^2 (0.2)\) = 5,000,000
                  \(\mu_X\) = 5,000   \(\sigma_X^2\) = 7,800,000

We see that \(\sigma_X^2 = 7{,}800{,}000\). The standard deviation of \(X\) is \(\sigma_X = \sqrt{7{,}800{,}000} = 2792.8\). The standard deviation is a measure of how variable the number of units sold is. As in the case of distributions for data, the standard deviation of a probability distribution is easiest to understand for Normal distributions.
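The table in Example 4.17 translates directly into code: one row per outcome, with the mean and the variance obtained as column sums. The following Python sketch is an illustration under that reading; the function name `mean_and_variance` and the row layout are assumptions, not part of the original text.

```python
import math


def mean_and_variance(distribution):
    """Return (mean, variance, rows) for a {value: probability} distribution.

    Each row is (x, p, x * p, (x - mean)**2 * p), matching the columns
    of the hand calculation.
    """
    mu = sum(x * p for x, p in distribution.items())
    rows = [(x, p, x * p, (x - mu) ** 2 * p) for x, p in distribution.items()]
    variance = sum(row[3] for row in rows)
    return mu, variance, rows


# Military sales distribution from Example 4.16.
military_units = {1000: 0.1, 3000: 0.3, 5000: 0.4, 10_000: 0.2}
mu_x, var_x, table = mean_and_variance(military_units)
sd_x = math.sqrt(var_x)

for x, p, xp, contribution in table:
    print(f"{x:>6} {p:>4} {xp:>8.0f} {contribution:>12.0f}")
print(f"mean {mu_x:.0f}, variance {var_x:.0f}, standard deviation {sd_x:.1f}")
```

The mean must be computed first, because every entry in the last column depends on it.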


Page 43: chap_04

APPLY YOUR KNOWLEDGE

4.47 Civilian sales. Example 4.16 also gives the distribution of Gain’s civilian sales \(Y\). Find the variance \(\sigma_Y^2\) and the standard deviation \(\sigma_Y\).

4.48 Hard-drive sizes. Example 4.9 (page 255) gives the distribution of hard-drive sizes for sales of a particular model computer. You found the mean hard-drive size in Exercise 4.43 (page 264). Find the standard deviation of the distribution of hard-drive sizes.

4.49 Selling mobile phones. Exercise 4.45 gives the distribution of weekly mobile phone sales in two locations.

(a) Calculate the variance and the standard deviation of the number of phones sold at the mall location.

(b) Calculate \(\sigma_Y^2\) and \(\sigma_Y\) for the downtown location.

Rules for variances

What are the facts for variances that parallel Rules 1 and 2 for means? The mean of a sum of random variables is always the sum of their means, but this addition rule is not always true for variances. To understand why, take \(X\) to be the percent of a family’s after-tax income that is spent and \(Y\) the percent that is saved. When \(X\) increases, \(Y\) decreases by the same amount. Though \(X\) and \(Y\) may vary widely from year to year, their sum \(X + Y\) is always 100% and does not vary at all. It is the association between the variables \(X\) and \(Y\) that prevents their variances from adding. If random variables are independent, this kind of association between their values is ruled out and their variances do add. Two random variables \(X\) and \(Y\) are independent if knowing that any event involving \(X\) alone did or did not occur tells us nothing about the occurrence of any event involving \(Y\) alone. Probability models often assume independence when the random variables describe outcomes that appear unrelated to each other. You should ask in each instance whether the assumption of independence seems reasonable.

When random variables are not independent, the variance of their sum depends on the correlation between them as well as on their individual variances. In Chapter 2, we met the correlation \(r\) between two observed variables measured on the same individuals. We defined (page 103) the correlation \(r\) as an average of the products of the standardized \(x\) and \(y\) observations. The correlation between two random variables is defined in the same way, once again using a weighted average with probabilities as weights. We won’t give the details; it is enough to know that the correlation between two random variables has the same basic properties as the correlation \(r\) calculated from data. We use \(\rho\), the Greek letter rho, for the correlation between two random variables. The correlation \(\rho\) is a number between \(-1\) and 1 that measures the direction and strength of the linear relationship between two variables. The correlation between two independent random variables is zero.

Returning to family finances, if \(X\) is the percent of a family’s after-tax income that is spent and \(Y\) the percent that is saved, then \(Y = 100 - X\). This is a perfect linear relationship with a negative slope, so the correlation between \(X\) and \(Y\) is \(\rho = -1\). With the correlation at hand, we can state the rules for manipulating variances.
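The family-finances illustration (percent spent \(X\) and percent saved \(Y = 100 - X\)) shows why variances need not add. The short Python sketch below makes the point with invented numbers: the five spending percents are hypothetical, each year is treated as equally likely, and the `correlation` helper is an assumption written out by hand rather than taken from the text.

```python
import math
from statistics import fmean, pstdev, pvariance

# Hypothetical percents of after-tax income spent in five years.
spent = [88, 92, 95, 90, 85]
saved = [100 - x for x in spent]          # Y = 100 - X
total = [x + y for x, y in zip(spent, saved)]


def correlation(xs, ys):
    """Correlation with equal weights: mean product of deviations over the two SDs."""
    mx, my = fmean(xs), fmean(ys)
    cov = fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))


var_spent = pvariance(spent)
var_saved = pvariance(saved)
var_total = pvariance(total)   # the sum is always 100, so it does not vary
rho = correlation(spent, saved)

print(var_spent, var_saved, var_total, round(rho, 6))
```

Both \(X\) and \(Y\) vary, yet their sum has variance 0 rather than the sum of the two variances, and the correlation is \(-1\).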

Page 44: chap_04

RULES FOR VARIANCES

Rule 1. If \(X\) is a random variable and \(a\) and \(b\) are fixed numbers, then

\[ \sigma_{a+bX}^2 = b^2 \sigma_X^2 \]

Rule 2. If \(X\) and \(Y\) are independent random variables, then

\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 \]

\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 \]

This is the addition rule for variances of independent random variables.

Rule 3. If \(X\) and \(Y\) have correlation \(\rho\), then

\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y \]

\[ \sigma_{X-Y}^2 = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y \]

This is the general addition rule for variances of random variables.

Notice that because a variance is the average of squared deviations from the mean, multiplying \(X\) by a constant \(b\) multiplies \(\sigma_X^2\) by the square of the constant. Adding a constant \(a\) to a random variable changes its mean but does not change its variability. The variance of \(X + a\) is therefore the same as the variance of \(X\). Because the square of \(-1\) is 1, the addition rule says that the variance of a difference of independent random variables is the sum of the variances. For independent random variables, the difference \(X - Y\) is more variable than either \(X\) or \(Y\) alone because variations in both \(X\) and \(Y\) contribute to variation in their difference.

As with data, we prefer the standard deviation to the variance as a measure of the variability of a random variable. Rule 2 for variances implies that standard deviations of independent random variables do not add. To combine standard deviations, use the rules for variances rather than trying to remember separate rules for standard deviations. For example, the standard deviations of \(2X\) and \(-2X\) are both equal to \(2\sigma_X\) because this is the square root of the variance \(4\sigma_X^2\).

EXAMPLE 4.18 Pick 3

The payoff \(X\) of a $1 ticket in the Tri-State Pick 3 game is $500 with probability 1/1000 and 0 the rest of the time. Here is the combined calculation of mean and variance:

\(x_i\)   \(p_i\)    \(x_i p_i\)   \((x_i - \mu_X)^2 p_i\)
0       0.999    0         \((0 - 0.5)^2 (0.999)\) = 0.24975
500     0.001    0.5       \((500 - 0.5)^2 (0.001)\) = 249.50025
                 \(\mu_X\) = 0.5   \(\sigma_X^2\) = 249.75

Page 45: chap_04

The standard deviation is \(\sigma_X = \sqrt{249.75} = \$15.80\). Games of chance usually have large standard deviations because large variability makes gambling exciting.

If you buy a Pick 3 ticket, your net winnings are \(W = X - 1\) because the dollar you pay for the ticket must be subtracted from the payoff. By the rules for means, the mean amount you win is

\[ \mu_W = \mu_X - 1 = -\$0.50 \]

That is, you lose an average of 50 cents on a ticket. The rules for variances remind us that the variance and standard deviation of the winnings \(W = X - 1\) are the same as those of the payoff \(X\). Subtracting a fixed number changes the mean but not the variance.

Suppose now that you buy a $1 ticket on each of two different days. The payoffs \(X\) and \(Y\) on the two tickets are independent because separate drawings are held each day. Your total payoff \(X + Y\) has mean

\[ \mu_{X+Y} = \mu_X + \mu_Y = \$0.50 + \$0.50 = \$1.00 \]

Because \(X\) and \(Y\) are independent, the variance of \(X + Y\) is

\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 = 249.75 + 249.75 = 499.5 \]

The standard deviation of the total payoff is

\[ \sigma_{X+Y} = \sqrt{499.5} = \$22.35 \]

This is not the same as the sum of the individual standard deviations, which is $15.80 + $15.80 = $31.60. Variances of independent random variables add; standard deviations do not.
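The Pick 3 results can be confirmed without using the rules at all, by building the distribution of the total payoff on two independent tickets and computing its variance from the definition. The Python sketch below does this. It is an illustration only: the helper names are assumptions, and `sum_of_independent` multiplies probabilities, which is valid only because the two drawings are independent.

```python
import math
from itertools import product

# Payoff distribution of one Pick 3 ticket, in dollars.
pick3 = {0: 0.999, 500: 0.001}


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def sum_of_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent (joint probabilities multiply)."""
    total = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        total[x + y] = total.get(x + y, 0.0) + px * py
    return total


# Net winnings W = X - 1: the mean shifts, the variance does not.
net_winnings = {x - 1: p for x, p in pick3.items()}

two_tickets = sum_of_independent(pick3, pick3)
var_one, var_two = variance(pick3), variance(two_tickets)
sd_one, sd_two = math.sqrt(var_one), math.sqrt(var_two)

print(round(mean(net_winnings), 2), round(variance(net_winnings), 2))
print(round(var_two, 2), round(sd_two, 2), round(2 * sd_one, 2))
```

The variance of the total is twice the variance of one ticket, while its standard deviation is noticeably smaller than twice the single-ticket standard deviation.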

EXAMPLE 4.19 SAT scores

A college uses SAT scores as one criterion for admission. Experience has shown that the distribution of SAT scores among the college’s entire population of applicants is such that

\(X\) = SAT math score     \(\mu_X = 625\)   \(\sigma_X = 90\)
\(Y\) = SAT verbal score   \(\mu_Y = 590\)   \(\sigma_Y = 100\)

What are the mean and standard deviation of the total score \(X + Y\) among students applying to this college?

The mean overall SAT score is

\[ \mu_{X+Y} = \mu_X + \mu_Y = 625 + 590 = 1215 \]

The variance and standard deviation of the total cannot be computed from the information given. SAT verbal and math scores are not independent, because students who score high on one exam tend to score high on the other also. Therefore, Rule 2 does not apply and we need to know \(\rho\), the correlation between \(X\) and \(Y\), to apply Rule 3.

Page 46: chap_04

Nationally, the correlation between SAT math and verbal scores is about \(\rho = 0.7\). If this is true for these students,

\[ \sigma_{X+Y}^2 = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y = (90)^2 + (100)^2 + (2)(0.7)(90)(100) = 30{,}700 \]

The variance of the sum \(X + Y\) is greater than the sum of the variances \(\sigma_X^2 + \sigma_Y^2\) because of the positive correlation between SAT math scores and SAT verbal scores. We find the standard deviation from the variance,

\[ \sigma_{X+Y} = \sqrt{30{,}700} = 175 \]
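Rule 3 is easy to package as a small function, which also makes the effect of the correlation visible. The Python sketch below is an illustration, not part of the original text: the name `variance_of_sum` and its `difference` flag are assumptions. The inputs are the SAT standard deviations and the correlation of about 0.7 quoted above.

```python
import math


def variance_of_sum(sd_x, sd_y, rho=0.0, difference=False):
    """Rule 3: variance of X + Y (or of X - Y when difference=True).

    With rho = 0 this reduces to the addition rule for independent variables.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    sign = -1.0 if difference else 1.0
    return sd_x ** 2 + sd_y ** 2 + sign * 2.0 * rho * sd_x * sd_y


var_total = variance_of_sum(90, 100, rho=0.7)
sd_total = math.sqrt(var_total)

# For comparison only: what Rule 2 would give if the scores were independent.
var_if_independent = variance_of_sum(90, 100)

print(round(var_total), round(sd_total), round(var_if_independent))
```

Treating the two scores as independent would understate the variance of the total, because the positive correlation adds the term \(2\rho\sigma_X\sigma_Y\).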

EXAMPLE 4.20 Portfolio analysis (CASE 4.2)

Now we can complete our initial analysis of Sadie’s portfolio. Based on annual returns between 1950 and 2000, we have

\(X\) = annual return on T-bills   \(\mu_X = 5.2\%\)    \(\sigma_X = 2.9\%\)
\(Y\) = annual return on stocks    \(\mu_Y = 13.3\%\)   \(\sigma_Y = 17.0\%\)
Correlation between \(X\) and \(Y\): \(\rho = -0.1\)

We see that stocks had higher returns than Treasury bills on the average but were also much more volatile. This is a basic relationship in finance: a more volatile investment must offer a higher return to compensate investors for its greater risk. For the return \(R\) on Sadie’s portfolio of 20% T-bills and 80% stocks,

\[ R = 0.2X + 0.8Y \]

\[ \mu_R = 0.2\mu_X + 0.8\mu_Y = (0.2 \times 5.2) + (0.8 \times 13.3) = 11.68\% \]

To find the variance of the portfolio return, combine Rules 1 and 3:

\[ \sigma_R^2 = \sigma_{0.2X}^2 + \sigma_{0.8Y}^2 + 2\rho\,\sigma_{0.2X}\,\sigma_{0.8Y} \]

\[ = (0.2)^2\sigma_X^2 + (0.8)^2\sigma_Y^2 + 2\rho(0.2\sigma_X)(0.8\sigma_Y) \]

\[ = (0.2)^2(2.9)^2 + (0.8)^2(17.0)^2 + (2)(-0.1)(0.2 \times 2.9)(0.8 \times 17.0) = 183.719 \]

\[ \sigma_R = \sqrt{183.719} = 13.55\% \]

The portfolio has a smaller mean return than an all-stock portfolio, but it is also less volatile. As a proportion of the all-stock values, the reduction in standard deviation is greater than the reduction in mean return. That’s why Sadie put some funds into Treasury bills.
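The closing claim of Example 4.20, that the standard deviation falls proportionally more than the mean return, can be verified by computing both quantities for Sadie’s mix and for an all-stock portfolio. The Python sketch below is a minimal illustration. The function name `portfolio_mean_sd` and the idea of passing the T-bill weight as a parameter are assumptions, while the means, standard deviations, and correlation are the historical values quoted in the example.

```python
import math

# Historical values from Example 4.20, in percent.
MU_TBILLS, SD_TBILLS = 5.2, 2.9
MU_STOCKS, SD_STOCKS = 13.3, 17.0
RHO = -0.1


def portfolio_mean_sd(tbill_weight):
    """Mean and standard deviation of R = w*X + (1 - w)*Y using Rules 1 and 3."""
    w_t, w_s = tbill_weight, 1.0 - tbill_weight
    mean = w_t * MU_TBILLS + w_s * MU_STOCKS
    sd_t, sd_s = w_t * SD_TBILLS, w_s * SD_STOCKS   # Rule 1 applied to each part
    variance = sd_t ** 2 + sd_s ** 2 + 2.0 * RHO * sd_t * sd_s
    return mean, math.sqrt(variance)


mean_r, sd_r = portfolio_mean_sd(0.2)               # Sadie's 20%/80% mix
mean_stock, sd_stock = portfolio_mean_sd(0.0)       # all stocks

mean_reduction = 1 - mean_r / mean_stock            # roughly 12%
sd_reduction = 1 - sd_r / sd_stock                  # roughly 20%

print(round(mean_r, 2), round(sd_r, 2))
print(f"{mean_reduction:.1%} {sd_reduction:.1%}")
```

Calling `portfolio_mean_sd` with other weights gives a rough picture of the trade-off between mean return and volatility, always under the assumption that the historical values continue to hold.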

Page 47: chap_04

Example 4.20 illustrates the first step in modern finance, using mean and standard deviation to describe the behavior of a portfolio. The next step is to seek best-possible portfolios based on a given set of securities by finding the combination with the highest return for a specified level of risk. The investor says how much risk (measured as standard deviation of returns) she is willing to bear; the best portfolio has the highest mean return for this standard deviation. Our examples also suggest the characteristic weakness of portfolio analysis: risk and return in the future may be very different from the \(\mu\) and \(\sigma\) we get from past returns.

APPLY YOUR KNOWLEDGE

4.50 Comparing sales. Tamara and Derek are sales associates in a large electronics and appliance store. Their store tracks each associate’s daily sales in dollars. Tamara’s sales total \(X\) varies from day to day with mean and standard deviation

\[ \mu_X = \$1100 \text{ and } \sigma_X = \$100 \]

Derek’s sales total \(Y\) also varies, with

\[ \mu_Y = \$1000 \text{ and } \sigma_Y = \$80 \]

Because the store is large and Tamara and Derek work in different departments, we might assume that their daily sales totals vary independently of each other. What are the mean and standard deviation of the difference \(X - Y\) between Tamara’s daily sales and Derek’s daily sales? Tamara sells more on the average. Do you think she sells more every day? Why?

4.51 Comparing sales. It is unlikely that the daily sales of Tamara and Derek in the previous problem are independent. They will both sell more during the Christmas season, for example. Suppose that the correlation between their sales is \(\rho = 0.4\). What are now the mean and standard deviation of the difference \(X - Y\)? Can you explain conceptually why positive correlation between two variables reduces the variability of the difference between them?

4.52 Exercise 4.45 (page 268) gives the distributions of \(X\), the number of wireless phones sold at the mall location of a store during the next month, and \(Y\), the number of wireless phones sold at the downtown location during the next month. You did some useful variance calculations in Exercise 4.49 (page 271). Each phone sold at the mall location results in $25 profit, and each phone sold at the downtown location results in $35 profit.

(a) Calculate the standard deviation of the profit for each location using Rule 1 for variances.

(b) Assuming phone sales at the two locations are independent, calculate the standard deviation for total profit of both locations combined.

(c) Assuming \(\rho = 0.8\), calculate the standard deviation for total profit of both locations combined.

(d) Assuming \(\rho = 0\), calculate the standard deviation for total profit of both locations combined. How does this compare with your result in part (b)? In part (c)?

(e) Assuming \(\rho = -0.8\), calculate the standard deviation for total profit of both locations combined. How does this compare with your result in part (b)? In part (c)? In part (d)?

Page 48: chap_04
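The effect of correlation on the variability of a difference can be checked numerically. The sketch below applies the general rule $\sigma^2_{X-Y} = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y$ to the sales figures quoted for Tamara and Derek. The function name `sd_of_difference` is an illustrative choice, not something defined in the text.

```python
import math

def sd_of_difference(sd_x, sd_y, rho=0.0):
    """Standard deviation of X - Y from the general rule
    var(X - Y) = var(X) + var(Y) - 2 * rho * sd_x * sd_y."""
    variance = sd_x**2 + sd_y**2 - 2 * rho * sd_x * sd_y
    return math.sqrt(variance)

# Means subtract whether or not the variables are correlated.
mean_diff = 1100 - 1000

# Standard deviations quoted in the exercises: 100 and 80 dollars.
sd_independent = sd_of_difference(100, 80)          # rho = 0
sd_correlated = sd_of_difference(100, 80, rho=0.4)  # rho = 0.4

print(f"mean of X - Y: {mean_diff}")
print(f"sd when independent: {sd_independent:.2f}")
print(f"sd when rho = 0.4:   {sd_correlated:.2f}")
```

With positive correlation the two totals tend to rise and fall together, so part of their variation cancels in the difference and the standard deviation comes out smaller than in the independent case.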

SECTION 4.3 SUMMARY

A random variable is a variable taking numerical values determined by the outcome of a random phenomenon. The probability distribution of a random variable $X$ tells us what the possible values of $X$ are and how probabilities are assigned to those values.

A random variable $X$ and its distribution can be discrete or continuous.

A discrete random variable has finitely many possible values. The probability distribution assigns each of these values a probability between 0 and 1 such that the sum of all the probabilities is exactly 1. The probability of any event is the sum of the probabilities of all the values that make up the event.

A continuous random variable takes all values in some interval of numbers. A density curve describes the probability distribution of a continuous random variable. The probability of any event is the area under the curve above the values that make up the event. Normal distributions are one type of continuous probability distribution.

You can picture a probability distribution by drawing a probability histogram in the discrete case or by graphing the density curve in the continuous case.

The probability distribution of a random variable $X$, like a distribution of data, has a mean $\mu_X$ and a standard deviation $\sigma_X$.

The mean $\mu_X$ is the balance point of the probability histogram or density curve. If $X$ is discrete with possible values $x_i$ having probabilities $p_i$, the mean is the average of the values of $X$, each weighted by its probability:

$\mu_X = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$

The variance $\sigma_X^2$ is the average squared deviation of the values of the variable from their mean. For a discrete random variable,

$\sigma_X^2 = (x_1 - \mu_X)^2 p_1 + (x_2 - \mu_X)^2 p_2 + \cdots + (x_k - \mu_X)^2 p_k$

The standard deviation $\sigma_X$ is the square root of the variance. The standard deviation measures the variability of the distribution about the mean. It is easiest to interpret for Normal distributions.

The mean and variance of a continuous random variable can be computed from the density curve, but to do so requires more advanced mathematics.

The means and variances of random variables obey the following rules. If $a$ and $b$ are fixed numbers, then

$\mu_{a+bX} = a + b\mu_X$

$\sigma^2_{a+bX} = b^2 \sigma_X^2$

If $X$ and $Y$ are any two random variables having correlation $\rho$, then

$\mu_{X+Y} = \mu_X + \mu_Y$

$\mu_{X-Y} = \mu_X - \mu_Y$

$\sigma^2_{X+Y} = \sigma_X^2 + \sigma_Y^2 + 2\rho\sigma_X\sigma_Y$

$\sigma^2_{X-Y} = \sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y$

If $X$ and $Y$ are independent, then $\rho = 0$. In this case,

$\sigma^2_{X+Y} = \sigma_X^2 + \sigma_Y^2$

$\sigma^2_{X-Y} = \sigma_X^2 + \sigma_Y^2$

SECTION 4.3 EXERCISES

4.53 How many rooms? Furniture makers and others are interested in how many rooms housing units have, because more rooms can generate more sales. Here are the distributions of the number of rooms for owner-occupied units and renter-occupied units in San Jose, California:

Rooms       1      2      3      4      5      6      7      8      9     10
Owned     0.003  0.002  0.023  0.104  0.210  0.224  0.197  0.149  0.053  0.035
Rented    0.008  0.027  0.287  0.363  0.164  0.093  0.039  0.013  0.003  0.003

(a) Make probability histograms of these two distributions, using the same scales. What are the most important differences between the distributions for owner-occupied and rented housing units?

(b) Find the mean number of rooms for both types of housing unit. How do the means reflect the differences you found in (a)?

4.54 Households and families. In government data, a household consists of all occupants of a dwelling unit, while a family consists of two or more persons who live together and are related by blood or marriage. Here are the distributions of household size and of family size in the United States:

Number of persons        1     2     3     4     5     6     7
Household probability  0.25  0.32  0.17  0.15  0.07  0.03  0.01
Family probability     0     0.42  0.23  0.21  0.09  0.03  0.02

You have considered the distribution of household size in Exercises 4.35 and 4.42. Compare the two distributions using probability histograms, means, and standard deviations. Write a brief comparison, using your calculations to back up your statements.

4.55 How many rooms? Which of the two distributions for room counts in Exercise 4.53 appears more spread out in the probability histograms? Why? Find the standard deviation for both distributions. The standard deviation provides a numerical measure of spread.
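The definitions of the mean and variance of a discrete random variable translate directly into a few lines of code. The sketch below applies them to the room-count distributions of Exercise 4.53. The helper name `mean_and_sd` is hypothetical, and the check that the probabilities sum to 1 is an added safeguard rather than part of the text's definition.

```python
import math

def mean_and_sd(values, probs):
    """Mean and standard deviation of a discrete random variable:
    mu = sum(x_i * p_i), sigma^2 = sum((x_i - mu)^2 * p_i)."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mu = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
    return mu, math.sqrt(variance)

rooms = list(range(1, 11))
owned = [0.003, 0.002, 0.023, 0.104, 0.210, 0.224, 0.197, 0.149, 0.053, 0.035]
rented = [0.008, 0.027, 0.287, 0.363, 0.164, 0.093, 0.039, 0.013, 0.003, 0.003]

mean_owned, sd_owned = mean_and_sd(rooms, owned)
mean_rented, sd_rented = mean_and_sd(rooms, rented)

print(f"owned:  mean {mean_owned:.3f}, sd {sd_owned:.3f}")
print(f"rented: mean {mean_rented:.3f}, sd {sd_rented:.3f}")
```

The same helper works for any finite table of values and probabilities, such as the household and family sizes in Exercise 4.54.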

4.56 Tossing four coins. The distribution of the count $X$ of heads in four tosses of a balanced coin was found in Example 4.10 to be

Number of heads $x_i$    0       1     2      3     4
Probability $p_i$      0.0625  0.25  0.375  0.25  0.0625

Find the mean $\mu_X$ from this distribution. Then find the mean number of heads for a single coin toss and show that your two results are related by the addition rule for means.

4.57 Pick 3 once more. The Tri-State Pick 3 lottery game offers a choice of several bets. You choose a three-digit number. The lottery commission announces the winning three-digit number, chosen at random, at the end of each day. The “box” pays $83.33 if the number you choose has the same digits as the winning number, in any order. Find the expected payoff for a $1 bet on the box. (Assume that you chose a number having three different digits.)

4.58 Tossing four coins. The distribution of outcomes for tossing four coins given in Exercise 4.56 assumes that the tosses are independent of each other. Find the variance for four tosses. Then find the variance of the number of heads (0 or 1) on a single toss and show that your two results are related by the addition rule for variances of independent random variables.

4.59 Independent random variables? For each of the following situations, would you expect the random variables $X$ and $Y$ to be independent? Explain your answers.

(a) $X$ is the rainfall (in inches) on November 6 of this year and $Y$ is the rainfall at the same location on November 6 of next year.

(b) $X$ is the amount of rainfall today and $Y$ is the rainfall at the same location tomorrow.

(c) $X$ is today’s rainfall at the Orlando, Florida, airport, and $Y$ is today’s rainfall at Disney World just outside Orlando.

4.60 Independent random variables? In which of the following games of chance would you be willing to assume independence of $X$ and $Y$ in making a probability model? Explain your answer in each case.

(a) In blackjack, you are dealt two cards and examine the total points $X$ on the cards (face cards count 10 points). You can choose to be dealt another card and compete based on the total points $Y$ on all three cards.

(b) In craps, the betting is based on successive rolls of two dice. $X$ is the sum of the faces on the first roll, and $Y$ the sum of the faces on the next roll.

4.61 Time and motion studies. A time and motion study measures the time required for an assembly line worker to perform a repetitive task. The data show that the time required to bring a part from a bin to its position on an automobile chassis varies from car to car with mean 11 seconds and standard deviation 2 seconds. The time required to attach the part to the chassis varies with mean 20 seconds and standard deviation 4 seconds.

(a) What is the mean time required for the entire operation of positioning and attaching the part?

(b) If the variation in the worker’s performance is reduced by better training, the standard deviations will decrease. Will this decrease change the mean you found in (a) if the mean times for the two steps remain as before?

(c) The study finds that the times required for the two steps are independent. A part that takes a long time to position, for example, does not take more or less time to attach than other parts. How would your answer in (a) change if the two variables were dependent, with correlation 0.8? With correlation 0.3?

4.62 A chemical production process. Laboratory data show that the time required to complete two chemical reactions in a production process varies. The first reaction has a mean time of 40 minutes and a standard deviation of 2 minutes; the second has a mean time of 25 minutes and a standard deviation of 1 minute. The two reactions are run in sequence during production. There is a fixed period of 5 minutes between them as the product of the first reaction is pumped into the vessel where the second reaction will take place. What is the mean time required for the entire process?

4.63 Time and motion studies. Find the standard deviation of the time required for the two-step assembly operation studied in Exercise 4.61, assuming that the study shows the two times to be independent. Redo the calculation assuming that the two times are dependent, with correlation 0.3. Can you explain in nontechnical language why positive correlation increases the variability of the total time?

4.64 A chemical production process. The times for the two reactions in the chemical production process described in Exercise 4.62 are independent. Find the standard deviation of the time required to complete the process.

4.65 Get the mean you want. Here is a simple way to create a random variable $X$ that has mean $\mu$ and standard deviation $\sigma$: $X$ takes only the two values $\mu - \sigma$ and $\mu + \sigma$, each with probability 0.5. Use the definition of the mean and variance for discrete random variables to show that $X$ does have mean $\mu$ and standard deviation $\sigma$.

4.66 Combining measurements. You have two scales for measuring weight. Both scales give answers that vary a bit in repeated weighings of the same item. If the true weight of an item is 2 grams (g), the first scale produces readings $X$ that have mean 2.000 g and standard deviation 0.002 g. The second scale’s readings $Y$ have mean 2.001 g and standard deviation 0.001 g.

(a) What are the mean and standard deviation of the difference $Y - X$ between the readings? (The readings $X$ and $Y$ are independent.)

(b) You measure once with each scale and average the readings. Your result is $Z = (X + Y)/2$. What are $\mu_Z$ and $\sigma_Z$? Is the average $Z$ more or less variable than the reading $Y$ of the less variable scale?

4.67 Gain Communications. Examples 4.16 and 4.17 concern a probabilistic projection of sales and profits by an electronics firm, Gain Communications. The mean and variance of military sales appear in Example 4.17 (page 270). You found the mean and variance of civilian sales in Exercise 4.47 (page 271).

(a) Because the military budget and the civilian economy are not closely linked, Gain is willing to assume that its military and civilian sales vary independently. What is the standard deviation of Gain’s total sales $X + Y$?

(b) Find the standard deviation of the estimated profit, $Z = 2000X + 3500Y$.

4.68 Gain Communications, continued. Redo Exercise 4.67 assuming correlation 0.01 between civilian and military sales. Redo the exercise assuming correlation 0.99. Comment on the effect of small and large correlations on the uncertainty of Gain’s sales projections.

4.69 Study habits. The academic motivation and study habits of female students as a group are better than those of males. The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures these factors. The distribution of SSHA scores among the women at a college has mean 120 and standard deviation 28, and the distribution of scores among male students has mean 105 and standard deviation 35. You select a single male student and a single female student at random and give them the SSHA test.

(a) Explain why it is reasonable to assume that the scores of the two students are independent.

(b) What are the mean and standard deviation of the difference (female minus male) between their scores?

(c) From the information given, can you find the probability that the woman chosen scores higher than the man? If so, find this probability. If not, explain why you cannot.

Portfolio analysis. Here are the means, standard deviations, and correlations for the monthly returns from three Fidelity mutual funds for the 36 months ending in December 2000. Because there are three random variables, there are three correlations. We use subscripts to show which pair of random variables a correlation refers to.

$W$ = monthly return on Magellan Fund       $\mu_W = 1.14\%$   $\sigma_W = 4.64\%$
$X$ = monthly return on Real Estate Fund    $\mu_X = 0.16\%$   $\sigma_X = 3.61\%$
$Y$ = monthly return on Japan Fund          $\mu_Y = 1.59\%$   $\sigma_Y = 6.75\%$

Correlations

$\rho_{WX} = 0.19$   $\rho_{WY} = 0.54$   $\rho_{XY} = -0.17$

Exercises 4.70 to 4.72 make use of these historical data.

4.70 Diversification. Many advisors recommend using roughly 20% foreign stocks to diversify portfolios of U.S. stocks. Michael owns Fidelity Magellan Fund, which concentrates on stocks of large American companies. He decides to move to a portfolio of 80% Magellan and 20% Fidelity Japan Fund. Show that (based on historical data) this portfolio has both a higher mean return and less volatility than Magellan alone. This illustrates the beneficial effects of diversifying among investments.

4.71 More on diversification. Diversification works better when the investments in a portfolio have small correlations. To demonstrate this, suppose that returns on Magellan Fund and Japan Fund had the means and standard deviations we have given but were uncorrelated ($\rho_{WY} = 0$). Show that the standard deviation of a portfolio that combines 80% Magellan with 20% Japan is then smaller than your result from the previous exercise. What happens to the mean return if the correlation is 0?

4.72 (Optional) Larger portfolios. Portfolios often contain more than two investments. The rules for means and variances continue to apply, though the arithmetic gets messier. A portfolio containing proportions $a$ of Magellan Fund, $b$ of Real Estate Fund, and $c$ of Japan Fund has return $R = aW + bX + cY$. Because $a$, $b$, and $c$ are the proportions invested in the three funds, $a + b + c = 1$. The mean and variance of the portfolio return $R$ are:

$\mu_R = a\mu_W + b\mu_X + c\mu_Y$

$\sigma_R^2 = a^2\sigma_W^2 + b^2\sigma_X^2 + c^2\sigma_Y^2 + 2ab\rho_{WX}\sigma_W\sigma_X + 2ac\rho_{WY}\sigma_W\sigma_Y + 2bc\rho_{XY}\sigma_X\sigma_Y$

Having seen the advantages of diversification, Michael decides to invest his funds 60% in Magellan, 20% in Real Estate, and 20% in Japan. What are the (historical) mean and standard deviation of the monthly returns for this portfolio?

4.73 (Optional) Perfectly correlated variables. We know that variances add if the random variables involved are uncorrelated ($\rho_{XY} = 0$), but not otherwise. The opposite extreme is perfect positive correlation ($\rho_{XY} = 1$). Show by using the general addition rule for variances that in this case the standard deviations add. That is, $\sigma_{X+Y} = \sigma_X + \sigma_Y$ if $\rho_{XY} = 1$.

4.74 A hot stock. You purchase a hot stock for $1000. The stock either gains 30% or loses 25% each day, each with probability 0.5. Its returns on consecutive days are independent of each other. This implies that all four possible combinations of gains and losses in two days are equally likely, each having probability 0.25. You plan to sell the stock after two days.

(a) What are the possible values of the stock after two days, and what is the probability for each value? What is the probability that the stock is worth more after two days than the $1000 you paid for it?

(b) What is the mean value of the stock after two days? You see that these two criteria give different answers to the question “Should I invest?”

4.75 Making glassware. In a process for manufacturing glassware, glass stems are sealed by heating them in a flame. The temperature $X$ of the flame varies a bit. Here is the distribution of the temperature $X$ measured in degrees Celsius:

Temperature   540   545   550   555   560
Probability   0.1   0.25  0.3   0.25  0.1

(a) Find the mean temperature $\mu_X$ and the standard deviation $\sigma_X$.

(b) The target temperature is 550°C. Use the rules for means and variances to find the mean and standard deviation of the number of degrees off target, $X - 550$.

(c) A manager asks for results in degrees Fahrenheit. The conversion of $X$ into degrees Fahrenheit is given by

$Y = \frac{9}{5}X + 32$

What are the mean $\mu_Y$ and standard deviation $\sigma_Y$ of the temperature of the flame in the Fahrenheit scale?

4.76 Irrational decision making? The psychologist Amos Tversky did many studies of our perception of chance behavior. In its obituary of Tversky (June 6, 1996), the New York Times cited the following example.

(a) Tversky asked subjects to choose between two public health programs that affect 600 people. One has probability 1/2 of saving all 600 and probability 1/2 that all 600 will die. The other is guaranteed to save exactly 400 of the 600 people. Find the mean number of people saved by the first program.

(b) Tversky then offered a different choice. One program has probability 1/2 of saving all 600 and probability 1/2 of losing all 600, while the other will definitely lose exactly 200 lives. What is the difference between this choice and that in (a)?

(c) Given option (a), most people choose the second program. Given option (b), most people choose the first program. Do people appear to use means in making their decisions? Why do you think their choices differed in the two cases?

4.4 The Sampling Distribution of a Sample Mean

Section 3.3 (page 206) motivated our study of probability by looking at sampling variability and sampling distributions. A statistic from a random sample will take different values if we take more samples from the same population. That is, sample statistics are random variables. The values of a statistic do not vary haphazardly from sample to sample but have a regular pattern, the sampling distribution, in many samples. This is the distribution of the random variable. We can now use the language of probability to examine one very important sampling distribution, that of a sample mean $\bar{x}$. Some new and important facts about probability emerge from this examination. Here is the example we will follow to introduce the role of probability in statistical inference.

EXAMPLE 4.21 Does this wine smell bad?

Sulfur compounds such as dimethyl sulfide (DMS) are sometimes present in wine. DMS causes “off-odors” in wine, so winemakers want to know the odor threshold, the lowest concentration of DMS that the human nose can detect. Different people have different thresholds, so we start by asking about the mean threshold $\mu$ in the population of all adults. The number $\mu$ is a parameter that describes this population.

To estimate $\mu$, we present tasters with both natural wine and the same wine spiked with DMS at different concentrations to find the lowest concentration at which they can identify the spiked wine. Here are the odor thresholds (measured in micrograms of DMS per liter of wine) for 10 randomly chosen subjects:

28 40 28 33 20 31 29 27 17 21

The mean threshold for these subjects is $\bar{x} = 27.4$. This sample mean is a statistic that we use to estimate the parameter $\mu$, but it is probably not exactly equal to $\mu$. Moreover, we know that a different 10 subjects would give us a different $\bar{x}$.

Statistical estimation and the law of large numbers

A parameter, such as the mean odor threshold $\mu$ of all adults, is in practice a fixed but unknown number. A statistic, such as the mean threshold $\bar{x}$ of a random sample of 10 adults, is a random variable. It seems reasonable to use $\bar{x}$ to estimate $\mu$. An SRS should fairly represent the population, so the mean $\bar{x}$ of the sample should be somewhere near the mean $\mu$ of the population. Of course, we don’t expect $\bar{x}$ to be exactly equal to $\mu$, and we realize that if we choose another SRS, the luck of the draw will probably produce a different $\bar{x}$.

If $\bar{x}$ is rarely exactly right and varies from sample to sample, why is it nonetheless a reasonable estimate of the population mean $\mu$? Here is one answer: if we keep on taking larger and larger samples, the statistic $\bar{x}$ is guaranteed to get closer and closer to the parameter $\mu$. We have the comfort of knowing that if we can afford to keep on measuring more subjects, eventually we will estimate the mean odor threshold of all adults very accurately. This remarkable fact is called the law of large numbers. It is remarkable because it holds for any population, not just for some special class such as Normal distributions.

LAW OF LARGE NUMBERS

Draw independent observations at random from any population with finite mean $\mu$. As the number of observations drawn increases, the mean $\bar{x}$ of the observed values gets closer and closer to the mean $\mu$ of the population.

The mean $\mu$ of a random variable is the average value of the variable in two senses. By its definition, $\mu$ is the average of the possible values, weighted by their probability of occurring. The law of large numbers says that $\mu$ is also the long-run average of many independent observations on the variable. The law of large numbers can be proved mathematically starting from the basic laws of probability. The behavior of $\bar{x}$ is similar to the idea of probability. In the long run, the proportion of outcomes taking any value gets close to the probability of that value, and the average outcome gets close to the population mean. Figure 4.1 (page 233) shows how proportions approach probability in one example. Here is an example of how sample means approach the population mean.
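The two senses of the mean can be compared in a small simulation. The sketch below is an illustration only: it borrows the four-coin-toss distribution from Exercise 4.56 as the population, computes $\mu$ from the definition, and then draws independent observations to see the sample mean settle near $\mu$. The seed, the sample sizes, and the helper name `sample_mean` are arbitrary choices made for this sketch.

```python
import random

# Count of heads in four tosses of a balanced coin (Exercise 4.56).
values = [0, 1, 2, 3, 4]
probs = [0.0625, 0.25, 0.375, 0.25, 0.0625]

# First sense: the probability-weighted average of the possible values.
mu = sum(x * p for x, p in zip(values, probs))

# Second sense: the long-run average of many independent observations.
rng = random.Random(2024)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n independent draws from the distribution above."""
    draws = rng.choices(values, weights=probs, k=n)
    return sum(draws) / n

print(f"mu from the definition: {mu}")
for n in (10, 100, 10_000, 100_000):
    print(f"n = {n:>7}: sample mean = {sample_mean(n):.4f}")
```

Small samples can land noticeably away from $\mu$; the large ones should sit very close to it, which is the behavior the law of large numbers describes.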


FIGURE 4.15 The law of large numbers in action: as we take more observations, the sample mean $\bar{x}$ always approaches the mean $\mu$ of the population. (Plot of the mean of the first n observations, on a vertical axis running from 22 to 35, against the number of observations, n, on a horizontal axis running from 1 to 10,000.)

EXAMPLE 4.22 The law of large numbers in action

In fact, the distribution of odor thresholds among all adults has mean 25. The mean $\mu = 25$ is the true value of the parameter we seek to estimate. Typically, the value of a parameter like $\mu$ is unknown. We use an example in which $\mu$ is known to illustrate facts about the general behavior of $\bar{x}$ that hold even when $\mu$ is unknown. Figure 4.15 shows how the sample mean $\bar{x}$ of an SRS drawn from this population changes as we add more subjects to our sample.

The first subject in Example 4.21 had threshold 28, so the line in Figure 4.15 starts there. The mean for the first two subjects is

$\bar{x} = \frac{28 + 40}{2} = 34$

This is the second point on the graph. At first, the graph shows that the mean of the sample changes as we take more observations. Eventually, however, the mean of the observations gets close to the population mean $\mu = 25$ and settles down at that value.

If we started over, again choosing people at random from the population, we would get a different path from left to right in Figure 4.15. The law of large numbers says that whatever path we get will always settle down at 25 as we draw more and more people.
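The first few points of a path like the one in Figure 4.15 can be reproduced from the ten thresholds listed in Example 4.21. This is a minimal sketch; the variable names are illustrative.

```python
from itertools import accumulate

# Odor thresholds of the 10 subjects in Example 4.21,
# in the order they were observed.
thresholds = [28, 40, 28, 33, 20, 31, 29, 27, 17, 21]

# accumulate() yields the running totals; dividing each total by the
# number of observations so far gives the mean of the first n values.
running_means = [
    total / n for n, total in enumerate(accumulate(thresholds), start=1)
]

for n, mean in enumerate(running_means, start=1):
    print(f"n = {n:2d}: mean of first n observations = {mean:.2f}")
```

The first two values are 28 and 34, matching the start of the path described in the example, and the last is the sample mean of all ten subjects.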


As in the case of probability, an animated version of Figure 4.15 makes the idea of the mean as long-run average outcome clearer. The Mean of a Random Variable applet at the text Web site, www.whfreeman.com/pbs, gives you control of an animation.

APPLY YOUR KNOWLEDGE

4.77 Comparing computers. Pfeiffer Consulting, a technology consulting group, designed benchmark tests to compare the speed with which computer models complete a variety of tasks. Pfeiffer announced that the mean completion time was 15 minutes for 733-MHz Power Mac G4 models and 23.5 minutes for 400-MHz Power Mac G3 models. Do you think the numbers 15 and 23.5 are parameters or statistics? Explain your reasoning carefully.

4.78 Figure 4.15 shows how the mean of $n$ observations behaves as we keep adding more observations to those already in hand. The first 10 observations are given in Example 4.21. Demonstrate that you grasp the idea of Figure 4.15: find the mean of the first one, then two, three, four, and five of these observations and plot the successive means against $n$. Verify that your plot agrees with the first part of the plot in Figure 4.15.

4.79 Playing the numbers. The numbers racket is a well-entrenched illegal gambling operation in most large cities. One version works as follows. You choose one of the 1000 three-digit numbers 000 to 999 and pay your local numbers runner a dollar to enter your bet. Each day, one three-digit number is chosen at random and pays off $600. The mean payoff for the population of thousands of bets is $\mu = 60$ cents. Joe makes one bet every day for many years. Explain what the law of large numbers says about Joe’s results as he keeps on betting.

Thinking about the law of large numbers

The law of large numbers is the foundation of such business enterprises as gambling casinos and insurance companies. The winnings (or losses) of a gambler on a few plays are uncertain; that’s why gambling is exciting. In Figure 4.15, the mean of even 100 observations is not yet very close to $\mu$. It is only in the long run that the mean outcome is predictable. The house plays tens of thousands of times. So the house, unlike individual gamblers, can count on the long-run regularity described by the law of large numbers. The average winnings of the house on tens of thousands of plays will be very close to the mean of the distribution of winnings. Needless to say, this mean guarantees the house a profit. That’s why gambling can be a business.

The law of large numbers says broadly that the average results of many independent observations are stable and predictable. Casinos are not the only businesses that base forecasts on this fact. A grocery store deciding how many gallons of milk to stock and a fast-food restaurant deciding how many beef patties to prepare can predict demand even though their many customers make independent decisions. The law of large numbers says that these many individual decisions will produce a stable result. It is worth the effort to think a bit more closely about so important a fact.

The “Law of Small Numbers”

Both the rules of probability and the law of large numbers describe the regular behavior of chance phenomena in the long run. Psychologists have discovered that the popular understanding of randomness is quite different from the true laws of chance. Most people believe in an incorrect “law of small numbers.” That is, we expect even short sequences of random events to show the kind of average behavior that in fact appears only in the long run.

Try this experiment: Write down a sequence of heads and tails that you think imitates 10 tosses of a balanced coin. How long was the longest string (called a run) of consecutive heads or consecutive tails in your tosses? Most people will write a sequence with no runs of more than two consecutive heads or tails. Longer runs don’t seem “random” to us. In fact, the probability of a run of three or more consecutive heads or tails in 10 tosses is greater than 0.8, and the probability of both a run of three or more heads and a run of three or more tails is almost 0.2. This and other probability calculations suggest that a short sequence of coin tosses will often not seem random to most people. The runs of consecutive heads or consecutive tails that appear in real coin tossing (and that are predicted by the mathematics of probability) surprise us. Because we don’t expect to see long runs, we may conclude that the coin tosses are not independent or that some influence is disturbing the random behavior of the coin.
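The two run probabilities quoted for 10 tosses can be checked by brute force, since a balanced coin makes all $2^{10} = 1024$ head/tail sequences equally likely. The sketch below simply counts the sequences of each kind; the helper name `longest_run` is an illustrative choice.

```python
from itertools import groupby, product

def longest_run(seq, symbol):
    """Length of the longest block of consecutive `symbol` in seq."""
    return max(
        (len(list(group)) for value, group in groupby(seq) if value == symbol),
        default=0,
    )

# Every possible sequence of 10 tosses, each with probability 1/1024.
sequences = list(product("HT", repeat=10))
total = len(sequences)

# A run of three or more heads OR three or more tails.
any_run = sum(
    1 for s in sequences
    if max(longest_run(s, "H"), longest_run(s, "T")) >= 3
)

# BOTH a run of three or more heads and a run of three or more tails.
both_runs = sum(
    1 for s in sequences
    if longest_run(s, "H") >= 3 and longest_run(s, "T") >= 3
)

print(f"P(some run of 3 or more)    = {any_run}/{total} = {any_run / total:.3f}")
print(f"P(runs of 3+ of both kinds) = {both_runs}/{total} = {both_runs / total:.3f}")
```

The counts come out to 846 and 194 of the 1024 sequences, or about 0.826 and 0.189, in line with "greater than 0.8" and "almost 0.2."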

Belief in the law of small numbers influences behavior. If a basketballplayer makes several consecutive shots, both the fans and his teammatesbelieve that he has a “hot hand” and is more likely to make the next shot.This is doubtful. Careful study suggests that runs of baskets made or missedare no more frequent in basketball than would be expected if each shot wereindependent of the player’s previous shots. Players perform consistently,not in streaks. (Of course, some players make a higher percent of their shotsin the long run than others.) Our perception of hot or cold streaks simplyshows that we don’t perceive random behavior very well.

Gamblers often follow the hot-hand theory, betting that a run will con-tinue. At other times, however, they draw the opposite conclusion whenconfronted with a run of outcomes. If a coin gives 10 straight heads, somegamblers feel that it must now produce some extra tails to get back tothe average of half heads and half tails. Not so. If the next 10,000 tossesgive about 50% tails, those 10 straight heads will be swamped by the laterthousands of heads and tails. No compensation is needed to get back to the av-erage in the long run. Remember that it is in the long run that theregularity described by probability and the law of large numbers takes over.

Our inability to accurately distinguish random behavior from systematicinfluences points out once more the need for statistical inference to supple-ment exploratory analysis of data. Probability calculations can help verifythat what we see in the data is more than a random pattern.

The law of large numbers says that the actual mean outcome of many trials gets close to the distribution mean $\mu$ as more trials are made. It doesn't say how many trials are needed to guarantee a mean outcome close to $\mu$. That depends on the variability of the random outcomes. The more variable the outcomes, the more trials are needed to ensure that the mean outcome $\bar{x}$ is close to the distribution mean $\mu$.

BEYOND THE BASICS: MORE LAWS OF LARGE NUMBERS

The law of large numbers is one of the central facts about probability. It helps us understand the mean $\mu$ of a random variable. It explains why gambling casinos and insurance companies make money. It assures us that statistical estimation will be accurate if we can afford enough observations. The basic law of large numbers applies to independent observations that all have the same distribution. Mathematicians have extended the law to many more general settings. Here are two of these.

Is There a Winning System for Gambling?

Serious gamblers often follow a system of betting in which the amount bet on each play depends on the outcome of previous plays. You might, for example, double your bet on each spin of the roulette wheel until you win, or, of course, until your fortune is exhausted. Such a system tries to take advantage of the fact that you have a memory even though the roulette wheel does not. Can you beat the odds with a system based on the outcomes of past plays? No. Mathematicians have established a stronger version of the law of large numbers that says that if you do not have an infinite fortune to gamble with, your long-run average winnings $\mu$ remain the same as long as successive trials of the game (such as spins of the roulette wheel) are independent.

What If Observations Are Not Independent?

You are in charge of a process that manufactures video screens for computer monitors. Your equipment measures the tension on the metal mesh that lies behind each screen and is critical to its image quality. You want to estimate the mean tension $\mu$ for the process by the average $\bar{x}$ of the measurements. Alas, the tension measurements are not independent. If the tension on one screen is a bit too high, the tension on the next is more likely to also be high. Many real-world processes are like this: the process stays stable in the long run, but observations made close together are likely to be both above or both below the long-run mean. Again the mathematicians come to the rescue: as long as the dependence dies out fast enough as we take measurements farther and farther apart in time, the law of large numbers still holds.

APPLY YOUR KNOWLEDGE

4.80 Help this man.

(a) A gambler knows that red and black are equally likely to occur on each spin of a roulette wheel. He observes five consecutive reds and bets heavily on red at the next spin. Asked why, he says that "red is hot" and that the run of reds is likely to continue. Explain to the gambler what is wrong with this reasoning.

(b) After hearing you explain why red and black remain equally probable after five reds on the roulette wheel, the gambler moves to a poker game. He is dealt five straight red cards. He remembers what you said and assumes that the next card dealt in the same hand is equally likely to be red or black. Is the gambler right or wrong? Why?

4.81 The "law of averages." The baseball player Tony Gwynn got a hit about 34% of the time over his 20-year career. After he failed to hit safely in six straight at-bats, the TV commentator said, "Tony is due for a hit by the law of averages." Is that right? Why?

4.82 The law of large numbers. Figure 4.2 (page 239) shows the 36 possible outcomes of rolling two dice and counting the spots on the up-faces. These 36 outcomes are equally likely. You can calculate that the mean for the sum of the two up-faces is 7. This is the population mean $\mu$ for the idealized population that contains the results of rolling two dice forever. The law of large numbers says that the average $\bar{x}$ from a finite number of rolls gets closer and closer to 7 as we do more and more rolls.

(a) Go to the Mean of a Random Variable applet. Click "More dice" once to get two dice. Click "Show mean" to see the mean 7 on the graph. Leaving the number of rolls at 1, click "Roll dice" three times. Note the count of spots for each roll (what were they?) and the average for the three rolls. You see that the graph displays at each point the average number of spots for all rolls up to the last one. Now you understand the display.

(b) Set the number of rolls to 100 and click "Roll dice." The applet rolls the two dice 100 times. The graph shows how the average count of spots changes as we make more rolls. That is, the graph shows $\bar{x}$ as we continue to roll the dice. Make a rough sketch of the final graph.

(c) Repeat your work from (b). Click "Reset" to start over, then roll two dice 100 times. Make a sketch of the final graph of the mean $\bar{x}$ against the number of rolls. Your two graphs will often look very different. What they have in common is that the average eventually gets close to the population mean 7. The law of large numbers says that this will always happen if you keep on rolling the dice.

Sampling distributions

The law of large numbers assures us that if we measure enough subjects, the statistic $\bar{x}$ will eventually get very close to the unknown parameter $\mu$. But our study in Example 4.21 had just 10 subjects. What can we say about $\bar{x}$ from 10 subjects as an estimate of $\mu$? We ask: "What would happen if we took many samples of 10 subjects from this population?" We learned in Section 3.3 how to answer this question:

Take a large number of samples of size 10 from the same population.

Calculate the sample mean $\bar{x}$ for each sample.

Make a histogram of the values of $\bar{x}$. This histogram shows how $\bar{x}$ varies in many samples.

In Section 3.3 (page 208) we used computer simulation to take many samples in a different setting. The histogram of values of the statistic $\bar{x}$

approximates the sampling distribution that we would see if we kept on sampling forever. The idea of a sampling distribution is the foundation of statistical inference. One reason for studying probability is that the laws of probability can tell us about sampling distributions without the need to actually choose or simulate a large number of samples.

The mean and the standard deviation of $\bar{x}$

The first important fact about the sampling distribution of $\bar{x}$ follows by using the rules for means and variances, though we won't do the algebra. Here is the result.

MEAN AND STANDARD DEVIATION OF A SAMPLE MEAN

Suppose that $\bar{x}$ is the mean of an SRS of size $n$ drawn from a large population with mean $\mu$ and standard deviation $\sigma$. Then the mean of the sampling distribution of $\bar{x}$ is $\mu$ and its standard deviation is $\sigma/\sqrt{n}$.

Both the mean and the standard deviation of the sampling distribution of $\bar{x}$ have important implications for statistical inference.

The mean of the statistic $\bar{x}$ is always the same as the mean $\mu$ of the population. The sampling distribution of $\bar{x}$ is centered at $\mu$. In repeated sampling, $\bar{x}$ will sometimes fall above the true value of the parameter $\mu$ and sometimes below, but there is no systematic tendency to overestimate or underestimate the parameter. This makes the idea of lack of bias in the sense of "no favoritism" more precise. Because the mean of $\bar{x}$ is equal to $\mu$, we say that the statistic $\bar{x}$ is an unbiased estimator of the parameter $\mu$.

An unbiased estimator is "correct on the average" in many samples. How close the estimator falls to the parameter in most samples is determined by the spread of the sampling distribution. If individual observations have standard deviation $\sigma$, then sample means $\bar{x}$ from samples of size $n$ have standard deviation $\sigma/\sqrt{n}$. Averages are less variable than individual observations.

Not only is the standard deviation of the distribution of $\bar{x}$ smaller than the standard deviation of individual observations, but it gets smaller as we take larger samples. The results of large samples are less variable than the results of small samples.

If $n$ is large, the standard deviation of $\bar{x}$ is small and almost all samples will give values of $\bar{x}$ that lie very close to the true parameter $\mu$. That is, the sample mean from a large sample can be trusted to estimate the population mean accurately. Notice, however, that the standard deviation of the sampling distribution gets smaller only at the rate $\sqrt{n}$. To cut the standard deviation of $\bar{x}$ in half, we must take four times as many observations, not just twice as many.
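The recipe described earlier in this section (take many samples, compute $\bar{x}$ for each, look at how the values vary) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is taken to be Normal with mean 25 and standard deviation 7, independent draws stand in for an SRS from a very large population, and the function name `simulate_sample_means`, the seed, and the number of simulated samples are choices made for this sketch, not anything prescribed by the text.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, rng):
    """Return the means of num_samples simulated samples of size n.

    Each observation is an independent draw from a Normal(mu, sigma)
    population (an assumption of this sketch).
    """
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(0)      # fixed seed so the run is repeatable
mu, sigma = 25.0, 7.0       # assumed population mean and standard deviation

results = {}
for n in (10, 40):          # 40 is four times 10
    means = simulate_sample_means(mu, sigma, n, 20_000, rng)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(
        f"n={n:2d}: mean of x-bar = {results[n][0]:.2f}, "
        f"sd of x-bar = {results[n][1]:.2f}, "
        f"sigma/sqrt(n) = {sigma / math.sqrt(n):.2f}"
    )
```

In a run of this sketch, the simulated values of $\bar{x}$ should center near $\mu$ for both sample sizes, and their standard deviation for $n = 40$ should be roughly half of that for $n = 10$, in line with the $\sigma/\sqrt{n}$ rule.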

APPLY YOUR KNOWLEDGE

4.83 Generating a sampling distribution. Let's illustrate the idea of a sampling distribution in the case of a very small sample from a very small population. The population is the sizes of 10 medium-sized businesses where size is measured in terms of the number of employees. For convenience, the 10 companies have been labeled with the integers 0 to 9.

Company   0   1   2   3   4   5   6   7   8   9
Size     82  62  80  58  72  73  65  66  74  62

The parameter of interest is the mean size $\mu$ in this population. The sample is an SRS of size 4 drawn from the population. Because the companies are labeled 0 to 9, a single random digit from Table B chooses one company for the sample.

(a) Find the mean of the 10 sizes in the population. This is the population mean $\mu$.

(b) Use Table B to draw an SRS of size 4 from this population. Write the four sizes in your sample and calculate the mean $\bar{x}$ of the sample sizes. This statistic is an estimate of $\mu$.

(c) Repeat this process 10 times using different parts of Table B. Make a histogram of the 10 values of $\bar{x}$. You are constructing the sampling distribution of $\bar{x}$. Is the center of your histogram close to $\mu$?

4.84 Measurements on the production line. Sodium content (in milligrams) is measured for bags of potato chips sampled from a production line. The standard deviation of the sodium content measurements is $\sigma = 10$ mg. The sodium content is measured 3 times and the mean $\bar{x}$ of the 3 measurements is recorded.

(a) What is the standard deviation of the mean result? (That is, if you kept on making 3 measurements and averaging them, what would be the standard deviation of all the $\bar{x}$'s?)

(b) How many times must we repeat the measurement to reduce the standard deviation of $\bar{x}$ to 5? Explain to someone who knows no statistics the advantage of reporting the average of several measurements rather than the result of a single measurement.

4.85 Measuring blood cholesterol. A study of the health of teenagers plans to measure the blood cholesterol level of an SRS of youth of ages 13 to 16 years. The researchers will report the mean $\bar{x}$ from their sample as an estimate of the mean cholesterol level $\mu$ in this population.

(a) Explain to someone who knows no statistics what it means to say that $\bar{x}$ is an "unbiased" estimator of $\mu$.

(b) The sample result $\bar{x}$ is an unbiased estimator of the population truth $\mu$ no matter what size SRS the study chooses. Explain to someone who knows no statistics why a large sample gives more trustworthy results than a small sample.

The central limit theorem

We have described the center and spread of the sampling distribution of a sample mean $\bar{x}$, but not its shape. The shape of the distribution of $\bar{x}$ depends

on the shape of the population distribution. Here is one important case: if the population distribution is Normal, then so is the distribution of the sample mean.

SAMPLING DISTRIBUTION OF A SAMPLE MEAN

If a population has the $N(\mu, \sigma)$ distribution, then the sample mean $\bar{x}$ of $n$ independent observations has the $N(\mu, \sigma/\sqrt{n})$ distribution.

EXAMPLE 4.23 Estimating odor thresholds

Adults differ in the smallest amount of DMS they can detect in wine. Extensive studies have found that the DMS odor threshold of adults follows roughly a Normal distribution with mean $\mu = 25$ micrograms per liter ($\mu$g/l) and standard deviation $\sigma = 7$ $\mu$g/l. Because the population distribution is Normal, the sampling distribution of $\bar{x}$ is also Normal. Figure 4.16 displays the Normal curve for odor thresholds in the adult population and also the Normal curve for the average threshold in random samples of size 10.

Both distributions have the same mean. But means $\bar{x}$ from samples of 10 adults vary less than do measurements on individual adults. The standard deviation of $\bar{x}$ is

$$\frac{\sigma}{\sqrt{n}} = \frac{7}{\sqrt{10}} = 2.21\ \mu\text{g/l}$$

FIGURE 4.16 The distribution of single observations compared with the distribution of the means $\bar{x}$ of 10 observations. Averages are less variable than individual observations. (Curve labels: observations on 1 subject, $\sigma = 7$; means $\bar{x}$ of 10 subjects, $\sigma/\sqrt{10} = 2.21$. Horizontal axis from 0 to 50.)

What happens when the population distribution is not Normal? As the sample size increases, the distribution of $\bar{x}$ changes shape: it looks less like that of the population and more like a Normal distribution. When the sample is large enough, the distribution of $\bar{x}$ is very close to Normal. This is true no matter what shape the population distribution has, as long as the population has a finite standard deviation $\sigma$. This famous fact of probability theory is called the central limit theorem. It is much more useful than the fact that the distribution of $\bar{x}$ is exactly Normal if the population is exactly Normal.

CENTRAL LIMIT THEOREM

Draw an SRS of size $n$ from any population with mean $\mu$ and finite standard deviation $\sigma$. When $n$ is large, the sampling distribution of the sample mean $\bar{x}$ is approximately Normal:

$\bar{x}$ is approximately $N(\mu, \sigma/\sqrt{n})$

More general versions of the central limit theorem say that the distribution of a sum or average of many small random quantities is close to Normal. This is true even if the quantities are not independent (as long as they are not too highly correlated) and even if they have different distributions (as long as no one random quantity is so large that it dominates the others). The central limit theorem suggests why the Normal distributions are common models for observed data. Any variable that is a sum of many small influences will have approximately a Normal distribution.

How large a sample size $n$ is needed for $\bar{x}$ to be close to Normal depends on the population distribution. More observations are required if the shape of the population distribution is far from Normal.

EXAMPLE 4.24 The central limit theorem in action

Figure 4.17 shows how the central limit theorem works for a very non-Normal population. Figure 4.17(a) displays the density curve of a single observation, that is, of the population. The distribution is strongly right-skewed, and the most probable outcomes are near 0. The mean $\mu$ of this distribution is 1, and its standard deviation $\sigma$ is also 1. This particular distribution is called an Exponential distribution. Exponential distributions are used as models for the lifetime in service of electronic components and for the time required to serve a customer or repair a machine.

Figures 4.17(b), (c), and (d) are the density curves of the sample means of 2, 10, and 25 observations from this population. As $n$ increases, the shape becomes more Normal. The mean remains at $\mu = 1$, and the standard deviation decreases, taking the value $1/\sqrt{n}$. The density curve for the sample mean of 10 observations is still somewhat skewed to the right but already resembles a Normal curve having $\mu = 1$ and $\sigma = 1/\sqrt{10} = 0.32$. The density curve for $n = 25$ is yet more Normal. The contrast between the shapes of the population distribution and of the distribution of the mean of 10 or 25 observations is striking.
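The behavior described in Example 4.24 can also be explored by simulation rather than by exact density curves. The following is a minimal sketch, assuming an Exponential population with mean 1 generated by Python's `random.expovariate`; the helper names `exponential_sample_means` and `skewness`, the seed, and the choice of 20,000 simulated samples are illustrative assumptions. Skewness (the standardized third moment) is used here as a rough numerical stand-in for "how far from Normal the shape looks"; the text itself judges shape from the curves in Figure 4.17.

```python
import math
import random
import statistics


def exponential_sample_means(n, num_samples, rng):
    """Return num_samples means, each of n independent Exponential(mean 1) draws."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


def skewness(values):
    """Standardized third moment: near 0 for a symmetric shape, positive for right skew."""
    center = statistics.fmean(values)
    spread = statistics.pstdev(values, mu=center)
    return statistics.fmean(((v - center) / spread) ** 3 for v in values)


rng = random.Random(1)      # fixed seed so the run is repeatable
summary = {}
for n in (1, 2, 10, 25):    # the sample sizes shown in Figure 4.17
    means = exponential_sample_means(n, 20_000, rng)
    summary[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    mean, sd, skew = summary[n]
    print(
        f"n={n:2d}: mean = {mean:.2f}, sd = {sd:.2f} "
        f"(1/sqrt(n) = {1 / math.sqrt(n):.2f}), skewness = {skew:.2f}"
    )
```

In such a run the mean should stay near 1, the standard deviation should track $1/\sqrt{n}$, and the skewness should shrink toward 0 as $n$ grows, which is the numerical counterpart of the curves becoming more Normal.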


FIGURE 4.17 The central limit theorem in action: the distribution of sample means $\bar{x}$ from a strongly non-Normal population becomes more Normal as the sample size increases. (a) The distribution of 1 observation. (b) The distribution of $\bar{x}$ for 2 observations. (c) The distribution of $\bar{x}$ for 10 observations. (d) The distribution of $\bar{x}$ for 25 observations.

The central limit theorem allows us to use Normal probability calculations to answer questions about sample means from many observations even when the population distribution is not Normal.

EXAMPLE 4.25 Maintaining air conditioners

The time $X$ that a technician requires to perform preventive maintenance on an air-conditioning unit is governed by the Exponential distribution whose density curve appears in Figure 4.17(a). The mean time is $\mu = 1$ hour and the standard deviation is $\sigma = 1$ hour. Your company operates 70 of these units. What is the probability that their average maintenance time exceeds 50 minutes?

The central limit theorem says that the sample mean time $\bar{x}$ (in hours) spent working on 70 units has approximately the Normal distribution with mean equal to the population mean $\mu = 1$ hour and standard deviation

$$\frac{\sigma}{\sqrt{70}} = \frac{1}{\sqrt{70}} = 0.12 \text{ hour}$$

The distribution of $\bar{x}$ is therefore approximately $N(1, 0.12)$. This Normal curve is the solid curve in Figure 4.18.

Because 50 minutes is 50/60 of an hour, or 0.83 hour, the probability we want is $P(\bar{x} > 0.83)$. A Normal distribution calculation gives this probability as 0.9222. This is the area to the right of 0.83 under the solid Normal curve in Figure 4.18.

Using more mathematics, we could start with the Exponential distribution and find the actual density curve of $\bar{x}$ for 70 observations. This is the dashed curve in

Figure 4.18. You can see that the solid Normal curve is a good approximation. The exactly correct probability is the area under the dashed density curve. It is 0.9294. The central limit theorem Normal approximation is off by only about 0.007.

FIGURE 4.18 The exact distribution (dashed) and the Normal approximation from the central limit theorem (solid) for the average time needed to maintain an air conditioner, for Example 4.25.

APPLY YOUR KNOWLEDGE

4.86 ACT scores. The scores of students on the ACT college entrance examination in a recent year had the Normal distribution with mean $\mu = 18.6$ and standard deviation $\sigma = 5.9$.

(a) What is the probability that a single student randomly chosen from all those taking the test scores 21 or higher?

(b) Now take an SRS of 50 students who took the test. What are the mean and standard deviation of the sample mean score $\bar{x}$ of these 50 students?

(c) What is the probability that the mean score $\bar{x}$ of these students is 21 or higher?

4.87 Flaws in carpets. The number of flaws per square yard in a type of carpet material varies with mean 1.6 flaws per square yard and standard deviation 1.2 flaws per square yard. The population distribution cannot be Normal, because a count takes only whole-number values. An inspector samples 200 square yards of the material, records the number of flaws found in each square yard, and calculates $\bar{x}$, the mean number of flaws per square yard inspected. Use the central limit theorem to find the approximate probability that the mean number of flaws exceeds 2 per square yard.

4.88 Returns on stocks. The distribution of annual returns on common stocks is roughly symmetric, but extreme observations are somewhat more frequent than in Normal distributions. Because the distribution is not strongly non-Normal, the mean return over a number of years is close to Normal. We have

seen that annual returns on common stocks (not adjusted for inflation) have varied with mean about 13% and standard deviation about 17%. Andrew plans to retire in 45 years. He assumes (this is dubious) that the past pattern of variation will continue. What is the probability that the mean annual return on common stocks over the next 45 years will exceed 15%? What is the probability that the mean return will be less than 7%?

SECTION 4.4 SUMMARY

When we want information about the population mean $\mu$ for some variable, we often take an SRS and use the sample mean $\bar{x}$ to estimate the unknown parameter $\mu$.

The law of large numbers states that the actually observed mean outcome $\bar{x}$ must approach the mean $\mu$ of the population as the number of observations increases.

The sampling distribution of $\bar{x}$ describes how the statistic $\bar{x}$ varies in all possible samples of the same size from the same population.

The mean of the sampling distribution is $\mu$, so that $\bar{x}$ is an unbiased estimator of $\mu$.

The standard deviation of the sampling distribution of $\bar{x}$ is $\sigma/\sqrt{n}$ for an SRS of size $n$ if the population has standard deviation $\sigma$. That is, averages are less variable than individual observations.

If the population has a Normal distribution, so does $\bar{x}$.

The central limit theorem states that for large $n$ the sampling distribution of $\bar{x}$ is approximately Normal for any population with finite standard deviation $\sigma$. That is, averages are more Normal than individual observations. We can use the $N(\mu, \sigma/\sqrt{n})$ distribution to calculate approximate probabilities for events involving $\bar{x}$.

SECTION 4.4 EXERCISES

4.89 Business employees. There are more than 5 million businesses in the United States. The mean number of employees in these businesses is about 19. A university selects a random sample of 100 businesses in North Dakota and finds that they average about 14 employees. Is each of the bold numbers a parameter or a statistic?

4.90 Personal income. The Annual Demographic Supplement to the Current Population Survey interviewed more than 128,000 people in March 2001. The Census Bureau reports that the mean income of the Hispanic males interviewed was $22,771. The mean income for the non-Hispanic white males interviewed was $41,798. Is each of the bold numbers a parameter or a statistic?

4.91 Roulette. A roulette wheel has 38 slots, of which 18 are black, 18 are red, and 2 are green. When the wheel is spun, the ball is equally likely to come

to rest in any of the slots. One of the simplest wagers chooses red or black. A bet of $1 on red returns $2 if the ball lands in a red slot. Otherwise, the player loses his dollar. When gamblers bet on red or black, the two green slots belong to the house. Because the probability of winning $2 is 18/38, the mean payoff from a $1 bet is twice 18/38, or 94.7 cents. Explain what the law of large numbers tells us about what will happen if a gambler makes a large number of bets on red.

4.92 Generating a sampling distribution. Table 1.13 (page 82) gives the population counts of the 58 counties in California. Consider these counties to be the population of interest.

(a) Make a histogram of the 58 counts. This is the population distribution. It is strongly skewed to the right.

(b) Find the mean of the 58 counts. This is the population mean $\mu$. Mark $\mu$ on the axis of your histogram.

(c) Label the members of the population 01 to 58 and use Table B to choose an SRS of size 10. What is $\bar{x}$ for your sample? Mark the value of $\bar{x}$ with a point on the axis of your histogram from (a).

(d) Choose four more SRSs of size 10, using different parts of Table B. Find $\bar{x}$ for each sample and mark the values on the axis of your histogram from (a). Would you be surprised if all five $\bar{x}$'s fell on the same side of $\mu$? Why?

(e) If you chose a large number of SRSs of size 10 from this population and made a histogram of the $\bar{x}$-values, where would you expect the center of this sampling distribution to lie?

4.93 Dust in coal mines. A laboratory weighs filters from a coal mine to measure the amount of dust in the mine atmosphere. Repeated measurements of the weight of dust on the same filter vary Normally with standard deviation $\sigma = 0.08$ milligram (mg) because the weighing is not perfectly precise. The dust on a particular filter actually weighs 123 mg. Repeated weighings will then have the Normal distribution with mean 123 mg and standard deviation 0.08 mg.

(a) The laboratory reports the mean of 3 weighings. What is the distribution of this mean?

(b) What is the probability that the laboratory reports a weight of 124 mg or higher for this filter?

4.94 Making auto parts. An automatic grinding machine in an auto parts plant prepares axles with a target diameter $\mu = 40.125$ millimeters (mm). The machine has some variability, so the standard deviation of the diameters is $\sigma = 0.002$ mm. A sample of 4 axles is inspected each hour for process control purposes, and records are kept of the sample mean diameter. What will be the mean and standard deviation of the numbers recorded?

4.95 Bottling cola. A bottling company uses a filling machine to fill plastic bottles with cola. The bottles are supposed to contain 300 milliliters (ml). In fact, the contents vary according to a Normal distribution with mean $\mu = 298$ ml and standard deviation $\sigma = 3$ ml.

(a) What is the probability that an individual bottle contains less than 295 ml?

(b) What is the probability that the mean contents of the bottles in a six-pack is less than 295 ml?

4.96 Glucose testing and medical costs. Shelia's doctor is concerned that she may suffer from gestational diabetes (high blood glucose levels during pregnancy). There is variation both in the actual glucose level and in the blood test that measures the level. A patient is classified as having gestational diabetes if the glucose level is above 140 milligrams per deciliter one hour after a sugary drink is ingested. Shelia's measured glucose level one hour after ingesting the sugary drink varies according to the Normal distribution with $\mu = 125$ mg/dl and $\sigma = 10$ mg/dl.

(a) If a single glucose measurement is made, what is the probability that Shelia is diagnosed as having gestational diabetes?

(b) If measurements are made instead on 4 separate days and the mean result is compared with the criterion 140 mg/dl, what is the probability that Shelia is diagnosed as having gestational diabetes?

(c) If Shelia is incorrectly diagnosed with gestational diabetes, then she (and her insurance company) will incur unnecessary additional expenses (insulin, needles, doctor visits) for treating the condition during the remainder of her pregnancy. These additional expenses are greater than the cost of repeating the test on three additional days as suggested in (b). Considering the probabilities you calculated in (a) and (b), comment on which testing method seems more appropriate given that Shelia has an acceptable mean glucose level of 125 mg/dl.

4.97 Manufacturing defects. Newly manufactured automobile radiators may have small leaks. Most have no leaks, but some have 1, 2, or more. The number of leaks in radiators made by one supplier has mean 0.15 and standard deviation 0.4. The distribution of number of leaks cannot be Normal because only whole-number counts are possible. The supplier ships 400 radiators per day to an auto assembly plant. Take $\bar{x}$ to be the mean number of leaks in these 400 radiators. Over several years of daily shipments, what range of values will contain the middle 95% of the many $\bar{x}$'s?

4.98 Auto accidents. The number of accidents per week at a hazardous intersection varies with mean 2.2 and standard deviation 1.4. This distribution takes only whole-number values, so it is certainly not Normal.

(a) Let $\bar{x}$ be the mean number of accidents per week at the intersection during a year (52 weeks). What is the approximate distribution of $\bar{x}$ according to the central limit theorem?

(b) What is the approximate probability that $\bar{x}$ is less than 2?

(c) What is the approximate probability that there are fewer than 100 accidents at the intersection in a year? (Hint: Restate this event in terms of $\bar{x}$.)

4.99 Budgeting for expenses. Weekly postage expenses for your company have a mean of $312 and a standard deviation of $58. Your company has allowed for $400 postage per week in its budget.

(a) What is the approximate probability that the average weekly postage expense for the past 52 weeks will exceed the budgeted amount of $400?

(b) What information would you need to determine the probability that postage for one particular week will exceed $400? Why wasn't this information required for (a)?

4.100 Pollutants in auto exhausts. The level of nitrogen oxides (NOX) in the exhaust of a particular car model varies with mean 0.9 grams per mile (g/mi) and standard deviation 0.15 g/mi. A company has 125 cars of this model in its fleet.

(a) What is the approximate distribution of the mean NOX emission level $\bar{x}$ for these cars?

(b) What is the level $L$ such that the probability that $\bar{x}$ is greater than $L$ is only 0.01? (Hint: This requires a backward Normal calculation. See page 66 of Chapter 1 if you need to review.)

4.101 Glucose testing and medical costs. In Exercise 4.96, Shelia's measured glucose level one hour after ingesting the sugary drink varies according to the Normal distribution with $\mu = 125$ mg/dl and $\sigma = 10$ mg/dl. Find the level $L$ such that there is probability only 0.05 that the mean glucose level of 4 test results falls above $L$ for Shelia's glucose level distribution. What is the value of $L$? (Hint: This requires a backward Normal calculation. See page 66 of Chapter 1 if you need to review.)

STATISTICS IN SUMMARY

This chapter lays the foundations for the study of statistical inference. Statistical inference uses data to draw conclusions about the population or process from which the data come. What is special about inference is that the conclusions include a statement, in the language of probability, about how reliable they are. The statement gives a probability that answers the question "What would happen if I used this method very many times?" The probabilities we need come from the sampling distributions of sample statistics.

Sampling distributions are the key to understanding statistical inference. The Statistics in Summary figure (on the next page) summarizes the facts about the sampling distribution of $\bar{x}$ in a way that reminds us of the big idea of a sampling distribution. Keep taking random samples of size $n$ from a population with mean $\mu$. Find the sample mean $\bar{x}$ for each sample. Collect all the $\bar{x}$'s and display their distribution. That's the sampling distribution of $\bar{x}$. Keep this figure in mind as you go forward.

To think more effectively about sampling distributions, we use the language of probability. Probability, the mathematics that describes randomness, is important in many areas of study. Here, we concentrate on informal probability as the conceptual foundation for statistical inference. Because random samples and randomized comparative experiments use chance, their results vary according to the laws of probability. Here is a review list of the most important things you should be able to do after studying this chapter.

A. PROBABILITY

1. Recognize that some phenomena are random. Probability describes the long-run regularity of random phenomena.

2. Understand that the probability of an event is the proportion of times the event occurs in very many repetitions of a random phenomenon.

Use the idea of probability as long-run proportion to think about probability.

3. Know the rules that are obeyed by any legitimate assignment of probabilities to events. Use these rules to find the probabilities of events that are formed from other events.

4. Know what properties an assignment of probabilities to a finite number of outcomes must satisfy. Find probabilities of events in a finite sample space by adding the probabilities of their outcomes.

5. Know what a density curve is. Find probabilities of events as areas under a density curve.

STATISTICS IN SUMMARY Approximate Sampling Distribution of $\bar{x}$ (Figure: repeated SRSs of size $n$ from a population with mean $\mu$ and standard deviation $\sigma$ each give a value of $\bar{x}$; the values of $\bar{x}$ follow an approximately Normal curve centered at $\mu$ with standard deviation $\sigma/\sqrt{n}$.)

B. RANDOM VARIABLES

1. Use the language of random variables to give compact descriptions of events and their probabilities, such as $P(X \geq 2.5)$.

2. Verify that the probability distribution of a discrete random variable is legitimate. Use the distribution to find probabilities of events involving the random variable.

3. Use the density curve of a continuous random variable to find probabilities of events involving the random variable.

4. Calculate the mean and standard deviation of a discrete random variable from its probability distribution.

5. Know and use the rules for means and variances of sums and differences of random variables, both for independent random variables and correlated random variables.

C. THE SAMPLING DISTRIBUTION OF A SAMPLE MEAN

1. Interpret the sampling distribution of a statistic as describing the probabilities of its possible values.

2. Recognize when a problem involves the mean $\bar{x}$ of a sample. Understand that $\bar{x}$ estimates the mean $\mu$ of the population from which the sample is drawn.

3. Use the law of large numbers to describe the behavior of $\bar{x}$ as the size of the sample increases.

4. Find the mean and standard deviation of a sample mean $\bar{x}$ from an SRS of size $n$ when the mean $\mu$ and standard deviation $\sigma$ of the population are known.

5. Understand that $\bar{x}$ is an unbiased estimator of $\mu$ and that the variability of $\bar{x}$ about its mean $\mu$ gets smaller as the sample size increases.

6. Understand that $\bar{x}$ has approximately a Normal distribution when the sample is large (central limit theorem). Use this Normal distribution to calculate probabilities that concern $\bar{x}$.

CHAPTER 4 REVIEW EXERCISES

4.102 Customer backgrounds. A company that offers courses to prepare would-be M.B.A. students for the GMAT examination has the following information about its customers: 20% are currently undergraduate students in business; 15% are undergraduate students in other fields of study; 60% are college graduates who are currently employed; and 5% are college graduates who are not employed.

(a) This is a legitimate assignment of probabilities to customer backgrounds. Why?

(b) What percent of customers are currently undergraduates?

4.103 Who gets promoted? Exactly one of Brown, Chavez, and Williams will be promoted to partner in the law firm that employs them all. Brown thinks that she has probability 0.25 of winning the promotion and that Williams has probability 0.2. What probability does Brown assign to the outcome that Chavez is the one promoted?

4.104 Predicting the ACC champion. Las Vegas Zeke, when asked to predict the Atlantic Coast Conference basketball champion, follows the modern practice of giving probabilistic predictions. He says, "North Carolina's probability of winning is twice Duke's. North Carolina State and Virginia each have probability 0.1 of winning, but Duke's probability is three times that. Nobody else has a chance." Has Zeke given a legitimate assignment of probabilities to the eight teams in the conference? Explain your answer.

4.105 Classifying occupations. Choose an American worker at random and assign his or her occupation to one of the following classes. These classes are used in government employment data.

A Managerial and professional
B Technical, sales, administrative support
C Service occupations
D Precision production, craft, and repair
E Operators, fabricators, and laborers
F Farming, forestry, and fishing

The table below gives the probabilities that a randomly chosen worker falls into each of 12 sex-by-occupation classes:

Class      A      B      C      D      E      F
Male     0.14   0.11   0.06   0.11   0.12   0.03
Female   0.09   0.20   0.08   0.01   0.04   0.01

(a) Verify that this is a legitimate assignment of probabilities to these outcomes.

(b) What is the probability that the worker is female?

(c) What is the probability that the worker is not engaged in farming, forestry, or fishing?

(d) Classes D and E include most mechanical and factory jobs. What is the probability that the worker holds a job in one of these classes?

(e) What is the probability that the worker does not hold a job in Classes D or E?

4.106 Rolling dice. Figure 4.2 (page 239) shows the possible outcomes for rolling two dice. If the dice are carefully made, each of these 36 outcomes has probability 1/36. The outcome of interest to a gambler is the sum of the spots on the two up-faces. Call this random variable $X$. Example 4.5 shows that $P(X = 5) = 4/36$.

(a) Give the complete probability distribution of $X$ in the form of a table of the possible values and their probabilities. Draw a probability histogram to display the distribution.

(b) One bet available in craps wins if a 7 or an 11 comes up on the next roll of two dice. What is the probability of rolling a 7 or an 11 on the next roll?

(c) Several bets in craps lose if a 7 is rolled. If any outcome other than 7 occurs, these bets either win or continue to the next roll. What is the probability that anything other than a 7 is rolled?

4.107 Nonstandard dice. You have two balanced, six-sided dice. One is a standard die, with faces having 1, 2, 3, 4, 5, and 6 spots. The other die has three faces with 0 spots and three faces with 6 spots. Find the probability distribution for the total number of spots $Y$ on the up-faces when you roll these two dice.

4.108 An IQ test. The Wechsler Adult Intelligence Scale (WAIS) is a common “IQ test” for adults. The distribution of WAIS scores for persons over 16 years of age is approximately Normal with mean 100 and standard deviation 15.

(a) What is the probability that a randomly chosen individual has a WAIS score of 105 or higher?

(b) What are the mean and standard deviation of the average WAIS score $\bar{x}$ for an SRS of 60 people?

(c) What is the probability that the average WAIS score of an SRS of 60 people is 105 or higher?

(d) Would your answers to any of (a), (b), or (c) be affected if the distribution of WAIS scores in the adult population were distinctly non-Normal?
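One way to check the Normal calculations in Exercise 4.108 numerically is sketched below. It is only an illustration under stated assumptions: the helper names `normal_cdf` and `sample_mean_sd` are made up for this sketch, the Normal probabilities are computed from the standard library's error function rather than from a printed table, and part (c) relies on the sample mean of an SRS of size n having standard deviation σ/√n.

```python
import math


def normal_cdf(x, mean, sd):
    """P(X <= x) for a Normal(mean, sd) variable, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def sample_mean_sd(sd, n):
    """Standard deviation of the mean of an SRS of size n: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Values taken from the exercise: WAIS scores are roughly Normal(100, 15).
MU, SIGMA, N = 100.0, 15.0, 60

# (a) One randomly chosen individual scoring 105 or higher.
p_single = 1.0 - normal_cdf(105.0, MU, SIGMA)

# (b) The sample mean keeps mean MU; its standard deviation shrinks.
sd_mean = sample_mean_sd(SIGMA, N)

# (c) The average of 60 people being 105 or higher.
p_mean = 1.0 - normal_cdf(105.0, MU, sd_mean)

print(f"P(X >= 105)       = {p_single:.4f}")
print(f"sd of sample mean = {sd_mean:.4f}")
print(f"P(xbar >= 105)    = {p_mean:.4f}")
```

Comparing the two probabilities shows how much less variable an average of 60 scores is than a single score, which is the point of parts (b) and (c).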

4.109 Weights of eggs. The weight of the eggs produced by a certain breed of hen is Normally distributed with mean 65 grams (g) and standard deviation 5 g. Think of cartons of such eggs as SRSs of size 12 from the population of all eggs. What is the probability that the weight of a carton falls between 750 g and 825 g?

4.110 How many people in a car? A study of rush-hour traffic in San Francisco counts the number of people in each car entering a freeway at a suburban interchange. Suppose that this count has mean 1.5 and standard deviation 0.75 in the population of all cars that enter at this interchange during rush hours.

(a) Could the exact distribution of the count be Normal? Why or why not?

(b) Traffic engineers estimate that the capacity of the interchange is 700 cars per hour. According to the central limit theorem, what is the approximate distribution of the mean number of persons $\bar{x}$ in 700 randomly selected cars at this interchange?

(c) What is the probability that 700 cars will carry more than 1075 people? (Hint: Restate this event in terms of the mean number of people $\bar{x}$ per car.)

4.111 A grade distribution. North Carolina State University posts the grade distributions for its courses online. You can find that the distribution of grades in Accounting 210 in the spring 2001 semester was

Grade          A      B      C      D      F
Probability  0.18   0.32   0.34   0.09   0.07

(a) Verify that this is a legitimate assignment of probabilities to grades.

(b) Using the common scale A = 4, B = 3, C = 2, D = 1, F = 0, what is the mean grade in Accounting 210?

4.112 Simulating a mean. One consequence of the law of large numbers is that once we have a probability distribution for a random variable, we can find its mean by simulating many outcomes and averaging them. The law of large numbers says that if we take enough outcomes, their average value is sure to approach the mean of the distribution.

I have a little bet to offer you. Toss a coin 10 times. If there is no run of three or more straight heads or tails in the 10 outcomes, I’ll pay you $2. If there is a run of three or more, you pay me just $1. Surely you will want to take advantage of me and play this game?

Simulate enough plays of this game (the outcomes are $2 if you win and $1 if you lose) to estimate the mean outcome. Is it to your advantage to play?

Insurance. The business of selling insurance is based on probability and the law of large numbers. Consumers (including businesses) buy insurance because we all face risks that are unlikely but carry high cost: think of a fire destroying your home. So we form a group to share the risk: we all pay a small amount, and the insurance policy pays a large amount to those few of us whose homes burn down. The insurance company sells many policies, so it can rely on the law of large numbers. Exercises 4.113 to 4.118 explore aspects of insurance.
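Exercise 4.112 asks for a simulation of the coin-toss bet. A minimal sketch of one way to set it up follows. The function names, the number of plays, and the fixed seed are all choices made for this illustration, and the sketch assumes that losing counts as an outcome of -1 dollars from the player's point of view.

```python
import random


def has_run(tosses, length=3):
    """Return True if `tosses` contains `length` or more identical outcomes in a row."""
    run = 1
    for prev, cur in zip(tosses, tosses[1:]):
        run = run + 1 if cur == prev else 1
        if run >= length:
            return True
    return False


def play_once(rng, n_tosses=10):
    """Play the bet once: +2 if there is no run of three or more, -1 otherwise.

    Treating a loss as -1 is an assumption of this sketch (the player pays $1).
    """
    tosses = [rng.choice("HT") for _ in range(n_tosses)]
    return -1 if has_run(tosses) else 2


def estimate_mean(n_plays=100_000, seed=0):
    """Average the outcomes of many simulated plays (law of large numbers)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(play_once(rng) for _ in range(n_plays)) / n_plays


if __name__ == "__main__":
    print(f"Estimated mean outcome per play: {estimate_mean():.3f} dollars")
```

Rerunning with a larger `n_plays` or a different `seed` shows the estimate settling down as the number of plays grows, which is the behavior the law of large numbers describes. The sign of the estimate answers the question of whether the bet favors the player.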

4.113 Fire insurance. An insurance company looks at the records for millions of homeowners and sees that the mean loss from fire in a year is $250 per person. (Most of us have no loss, but a few lose their homes. The $250 is the average loss.) The company plans to sell fire insurance for $250 plus enough to cover its costs and profit. Explain clearly why it would be stupid to sell only 12 policies. Then explain why selling thousands of such policies is a safe business.

4.114 More about fire insurance. In fact, the insurance company sees that in the entire population of homeowners, the mean loss from fire is $\mu$ = $250 and the standard deviation of the loss is $\sigma$ = $300. The distribution of losses is strongly right-skewed: many policies have $0 loss, but a few have large losses. If the company sells 10,000 policies, what is the approximate probability that the average loss will be greater than $260?

4.115 Life insurance. A life insurance company sells a term insurance policy to a 21-year-old male that pays $100,000 if the insured dies within the next 5 years. The probability that a randomly chosen male will die each year can be found in mortality tables. The company collects a premium of $250 each year as payment for the insurance. The amount $X$ that the company earns on this policy is $250 per year, less the $100,000 that it must pay if the insured dies. Here is the distribution of $X$. Fill in the missing probability in the table and calculate the mean earnings $\mu_X$.

Age at death       21         22         23         24         25       ≥26
Earnings X     -$99,750   -$99,500   -$99,250   -$99,000   -$98,750    $1250
Probability     0.00183    0.00186    0.00189    0.00191    0.00193      ?

4.116 More about life insurance. It would be quite risky for you to insure the life of a 21-year-old friend under the terms of Exercise 4.115. There is a high probability that your friend would live and you would gain $1250 in premiums. But if he were to die, you would lose almost $100,000. Explain carefully why selling insurance is not risky for an insurance company that insures many thousands of 21-year-old men.

4.117 The risk of selling insurance. We have seen that the risk of an investment is often measured by the standard deviation of the return on the investment. The more variable the return is (the larger $\sigma$ is), the riskier the investment. We can measure the great risk of insuring one person’s life in Exercise 4.115 by computing the standard deviation of the income $X$ that the insurer will receive. Find $\sigma_X$, using the distribution and mean you found in Exercise 4.115.

4.118 The risk of selling insurance, continued. The risk of insuring one person’s life is reduced if we insure many people. Use the result of the previous exercise and the rules for means and variances to answer the following questions.

(a) Suppose that we insure two 21-year-old males, and that their ages at death are independent. If $X$ and $Y$ are the insurer’s income from the two insurance policies, the insurer’s average income on the two policies is

$$Z = \frac{X + Y}{2} = 0.5X + 0.5Y$$

Find the mean and standard deviation of $Z$. You see that the mean income is the same as for a single policy but the standard deviation is less.

(b) If four 21-year-old men are insured, the insurer’s average income is

$$Z = \frac{1}{4}(X_1 + X_2 + X_3 + X_4)$$

where $X_i$ is the income from insuring one man. The $X_i$ are independent and each has the same distribution as before. Find the mean and standard deviation of $Z$. Compare your results with the results of (a). We see that averaging over many insured individuals reduces risk.

4.119 Finite sample spaces. Choose one employee of your company at random and let $X$ be the number of years that person has been employed by the company (rounded to the nearest year).

(a) Give a reasonable sample space $S$ for the possible values of $X$.

(b) Is $X$ a discrete random variable or a continuous random variable? Why?

(c) How many possible values of $X$ did you include in your definition of $S$?

4.120 (Optional) Infinite sample spaces, Part I. Buy one share of Apple Computer (AAPL) stock and let $Y$ be the number of days until the closing market value of your share is double what you initially paid for the share.

(a) Give the sample space $S$ for the possible values of $Y$.

(b) Is $Y$ a discrete random variable or a continuous random variable? Why?

(c) How many possible values of $Y$ did you include in your definition of $S$? (This exercise illustrates that “discrete” and “finite” are not the same thing, though we have made use only of finite discrete sample spaces.)

4.121 (Optional) Infinite sample spaces, Part II. Randomly choose one 30-milliliter (ml) ink cartridge from the lot of cartridges produced at your plant in the last hour. Let $W$ be the actual amount of ink in the cartridge. The cartridges are supposed to contain 30 ml of ink but are designed to hold up to a maximum of 35 ml of ink.

(a) Give a reasonable sample space $S$ for the possible values of $W$.

(b) Is $W$ a discrete random variable or a continuous random variable? Why?

(c) How many possible values of $W$ did you include in your definition of $S$?

CHAPTER 4 CASE STUDY EXERCISES

CASE STUDY 4.1: Benford’s Law. Case 4.1 (page 243) concerns Benford’s Law, which gives the distribution of first digits in many sets of data from business and science. Locate at least two long tables whose entries could plausibly begin with any digit 1 through 9. You may choose data tables, such as populations of many cities or the number of shares traded on the New York Stock Exchange on many days, or mathematical tables such as logarithms or square roots. We hope it’s clear that you can’t use the table of random digits. Let’s require that your examples each contain at least 300 numbers. Tally the first digits of all entries in each table. Report the distributions (in percents) and compare them with each other, with Benford’s Law, and with the “equally likely” distribution.
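The tallying step in Case Study 4.1 can be sketched in a few lines. This is an illustration only: the function names are invented here, the Benford percentages come from the standard formula log10(1 + 1/d) rather than being quoted from Case 4.1, and the first 300 powers of 2 stand in for a real table purely as placeholder data.

```python
import math
from collections import Counter


def first_digit(value):
    """Leading nonzero digit of a nonzero number, e.g. 0.0042 -> 4, 731 -> 7."""
    # Scientific notation always puts the leading digit first: 'd.ddd...e+xx'.
    return int(f"{abs(value):.15e}"[0])


def first_digit_percents(values):
    """Percent of entries that begin with each digit 1 through 9 (zeros skipped)."""
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: 100.0 * counts[d] / total for d in range(1, 10)}


def benford_percents():
    """Benford's Law in percents, from the standard formula log10(1 + 1/d)."""
    return {d: 100.0 * math.log10(1.0 + 1.0 / d) for d in range(1, 10)}


if __name__ == "__main__":
    # Placeholder data, not one of the tables the case study asks you to find.
    table = [2 ** n for n in range(1, 301)]
    observed = first_digit_percents(table)
    benford = benford_percents()
    equal = 100.0 / 9.0  # the "equally likely" distribution
    print("digit  observed  Benford  equal")
    for d in range(1, 10):
        print(f"{d:>5}  {observed[d]:8.1f}  {benford[d]:7.1f}  {equal:5.1f}")
```

To use it on real data, replace `table` with the numbers read from each of the tables you located and compare the three columns digit by digit.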

CASE STUDY 4.2: State lotteries. Most American states and Canadian provinces operate lotteries. State lotteries combine probability with economics and politics. States sometimes add opportunities for gambling in hard times to increase their take from operating the games or from taxing games run by others. Investigate the presence of legal gambling in your state. Include probability aspects, such as the probability of winning and the expected payoff on different games. Also look at economic aspects of gambling: how much does the state earn? What percent of the state budget is this? Has the state’s take from gambling changed in recent years? Probability and economics are related, because a state lottery keeps only what it doesn’t pay out in winnings to gamblers. What percent of the money bet in the lottery does the state pay out in prizes? How does this compare with casino games?