• HW solutions are on the web.
• See website for how to calculate probabilities with Minitab, Excel, and TI calculators.
Determining normal probabilities:
• Suppose X has a normal distribution with mean 5 and std dev 2.
• Notation X~N(5,4) [notation uses N(mean,variance)]
• What’s the probability that X is less than 4?
[Figure: Normal density of X~N(5,4), with x from 0 to 10. Pr(X<4) = area under the curve to the left of x=4.]
• What’s Pr(X < 4)?
• Draw (previous page)
• Center and scale:
  Pr(X<4) = Pr( (X-5)/2 < (4-5)/2 )
          = Pr( Z < -1/2 )
• Look up (appendix 1): Pr(Z < -1/2) = 0.3085
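The draw, center-and-scale, look-up steps can be checked numerically. The sketch below assumes Python 3.8+ and uses the standard library's `statistics.NormalDist` in place of the table; the variable names are illustrative only.

```python
from statistics import NormalDist

# X ~ N(mean=5, variance=4). NormalDist takes the standard deviation (2),
# not the variance (4).
X = NormalDist(mu=5, sigma=2)
Z = NormalDist()  # standard normal, N(0, 1)

# Direct calculation of Pr(X < 4)
p_direct = X.cdf(4)

# Center and scale first, then use the standard normal CDF,
# which is the quantity a Z table reports
z = (4 - 5) / 2
p_table = Z.cdf(z)

print(round(p_direct, 4), round(p_table, 4))
```

Both numbers should agree with the table value 0.3085.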
“Centering and Scaling?”
• Suppose X~N(mu,sigma^2).
• Why does (X-mu)/sigma have a N(0,1) distribution? The (X-mu) part is “centering” and the /sigma part is “scaling”.
• Idea: All normal distributions have the same shape. Centering and scaling just relabel the x and y axes. The area under the curve (and so the probability) remains the same.
[Figure, two panels. Top: density of X, with x from 0 to 10; Pr(X<4) = area under the curve to the left of x=4. Bottom: density of centered and scaled X, from -4 to 4; Pr(Z<-0.5) = area under the curve to the left of -0.5. Same area as above.]
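One way to see that centering and scaling only relabel the axes: for any normal distribution, the probability of falling below "mean minus half a standard deviation" is the same number. A small sketch, assuming Python 3.8+; the (mean, std dev) pairs are chosen only for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

# (mean, standard deviation) pairs, picked only for illustration
params = [(5, 2), (2, 3), (10, 4)]

z = -0.5
probs = []
for mu, sigma in params:
    x = mu + z * sigma  # the x value that centers and scales to z
    p = NormalDist(mu, sigma).cdf(x)
    probs.append(p)
    print(f"N({mu},{sigma}^2): Pr(X < {x}) = {p:.4f}")

print(f"N(0,1): Pr(Z < {z}) = {Z.cdf(z):.4f}")
```

Every line should report the same probability, because each x is the same number of standard deviations below its mean.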
Example 2
X ~ N(2,9)
What’s Pr(1<X<4)?
[Figure: density of X, with x from -5 to 10. Want the area in between the bars at x=1 and x=4.]
First let’s do this with the tables.
Pr(1<X<4) = Pr[(1-2)/3 < Z < (4-2)/3] (where Z~N(0,1))
= Pr[Z < (4-2)/3] - Pr[Z < (1-2)/3]
= Pr(Z < 2/3) - Pr(Z < -1/3)
= 0.7475 - 0.3694
= 0.3781
EVEN IF YOU’LL ALWAYS USE CALCULATORS, MINITAB, EXCEL, OR MATLAB TO DO THESE PROBLEMS, YOU’LL NEED TO KNOW HOW TO DO CALCULATIONS LIKE THE ONES ABOVE…
The purpose of all this is to get to an expression that only uses Pr(Z<a) where Z~N(0,1), because that is the only quantity the tables give.
Using Excel or Minitab, the only step that is necessary is to get the probability in terms of CDFs (i.e. Pr[X <= k]).
Pr(1<X<4) = Pr(X<4) - Pr(X<1), where X~N(2,3^2)
= 0.748 - 0.369 = 0.378
(Do demo in class)
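The same "write it as a difference of CDFs" step carries over to any software with a normal CDF. Here is a minimal sketch using Python's standard-library `statistics.NormalDist` (Python 3.8+) as a stand-in for the Excel or Minitab demo; it is not the in-class demo itself.

```python
from statistics import NormalDist

# X ~ N(2, 3^2): mean 2, standard deviation 3
X = NormalDist(mu=2, sigma=3)

# Pr(1 < X < 4) = Pr(X < 4) - Pr(X < 1)
p_upper = X.cdf(4)
p_lower = X.cdf(1)
p_between = p_upper - p_lower

print(f"{p_upper:.3f} - {p_lower:.3f} = {p_between:.3f}")
```

No centering or scaling is needed here, because the CDF function takes the mean and standard deviation directly.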
Three probabilities to memorize:
Pr(Z < 2.33) = 99%
Pr(Z < 1.96) = 97.5%
Pr(Z < 1.28) = 90%
Remember: Z~N(0,1) “the standard normal”
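These three values are rounded. A quick sketch, again assuming Python 3.8+ and `statistics.NormalDist`, shows how close the rounding is and what the unrounded cutoffs look like.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

# Forward direction: probability below each memorized cutoff
cutoffs = (2.33, 1.96, 1.28)
forward = {z: Z.cdf(z) for z in cutoffs}
for z, p in forward.items():
    print(f"Pr(Z < {z}) = {p:.4f}")

# Backward direction: cutoff that gives each round probability
targets = (0.99, 0.975, 0.90)
backward = {p: Z.inv_cdf(p) for p in targets}
for p, z in backward.items():
    print(f"Pr(Z < {z:.4f}) = {p}")
```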
Later in the course, we will need to be able to do things like the following:
Let X~N(10,16). Find an a such that
Pr(X < a) = 0.80.
[Figure: CDF plot of x versus Pr(X<x) when X~N(10,4^2), with x from 0 to 20. Probabilities are on the vertical axis; a is the x value at which the curve reaches 0.80.]
Let X~N(10,16). Find an a such that
Pr(X < a) = 0.80.
Pr[(X-10)/4 < (a-10)/4]
= Pr[Z < (a-10)/4] = 0.80
Using the table “backwards” we find that
Pr(Z < 0.84) = 0.80
As a result, (a-10)/4 = 0.84
So, a = 13.36
This is called an inverse probability problem.
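Software usually exposes the "table backwards" step as an inverse CDF (quantile) function. The sketch below assumes Python 3.8+ and uses `NormalDist.inv_cdf`; it solves the problem both directly and through the standard normal.

```python
from statistics import NormalDist

# X ~ N(10, 16): mean 10, standard deviation 4
X = NormalDist(mu=10, sigma=4)
Z = NormalDist()

# Direct: the a with Pr(X < a) = 0.80
a_direct = X.inv_cdf(0.80)

# Via the standard normal: find z with Pr(Z < z) = 0.80,
# then undo the centering and scaling: a = 10 + 4*z
z = Z.inv_cdf(0.80)
a_via_z = 10 + 4 * z

print(f"z = {z:.4f}, a = {a_direct:.3f}")
```

The result differs slightly from 13.36 in the third significant digit onward only because the table value 0.84 is rounded.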
The Normal Distribution is Pervasive
• Examples of things that are normally distributed:
– Heights, weights, abilities, many, many other measurements
– In general, when a quantity is the result of a combination of many factors and influences, samples of that quantity are very likely to be approximately normally distributed.
– Why?
• A store keeps track of the average amount spent by people each day.
• Let Xi = average amount spent on day i
• It turns out that there is a good reason to believe that Xi has a normal distribution!!!
• This reason is the CENTRAL LIMIT THEOREM
Central Limit Theorem
• Let X1,…,Xn be n independent random variables, each with the same mean mu and the same variance sigma^2.
• Then, as n gets large, approximately
(X1+…+Xn)/n ~ N(mu, sigma^2/n)
and
(X1+…+Xn) ~ N(n*mu, n*sigma^2)
What does “large n” mean?
What “large” is depends on the distribution of Xi.
• If Xi’s are already normal, then the result is true for any n
• If Xi’s have a symmetric distribution, then n at least 3 is probably large enough
• If Xi’s have a skewed distribution, then n of 20 or 30 is probably large enough
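A small simulation can make the skewed case concrete. This is only a sketch under stated assumptions: the Xi are taken to be exponential with mean 1 (a skewed distribution chosen for illustration, not one from the slides), n = 30, and the seed and repetition count are arbitrary.

```python
import random
from statistics import NormalDist, fmean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 30                   # sample size for each average
reps = 20_000            # number of simulated averages
mu, sigma = 1.0, 1.0     # mean and std dev of an exponential with rate 1

# Each entry is the average of n independent skewed draws
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

# What the central limit theorem predicts for the average
clt = NormalDist(mu, sigma / n ** 0.5)

# Compare an empirical probability with the normal approximation
cutoff = mu + sigma / n ** 0.5
empirical = sum(m < cutoff for m in sample_means) / reps
predicted = clt.cdf(cutoff)

print(f"mean of averages: {fmean(sample_means):.3f} (theory {clt.mean:.3f})")
print(f"sd of averages:   {stdev(sample_means):.3f} (theory {clt.stdev:.3f})")
print(f"Pr(average < {cutoff:.3f}): simulated {empirical:.3f}, CLT {predicted:.3f}")
```

Rerunning with a smaller n (say 3) should show a noticeably worse match, which is the point of the rule of thumb above.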
Example
• Suppose the amount of potassium in a banana is normally distributed with mean 630mg and standard deviation 40mg.
• You eat 3 bananas a day. Let T = amount of potassium you eat. What is the probability that T < 1800mg?
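• By central limit theorem, T~N[3*630, 3*(40^2)], treating the three bananas as independent. (The bananas are already normal, so this holds even for n = 3.)
Want Pr(T < 1800) = Pr[ (T-1890)/(sqrt(3)*40) < (1800-1890)/(sqrt(3)*40) ]
= Pr[ Z < -1.30 ] = 0.0968 (see p672-3 in book for table or use Calculator or Excel or Minitab)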
[Figure: density of T, with T from 1600 to 2100. Area under the curve to the left of the line at T=1800 is Pr(T < 1800).]
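The banana calculation can be reproduced in a few lines. The sketch assumes Python 3.8+ and that the three bananas are independent; adding `NormalDist` objects with `+` treats them as independent normal variables. Evaluating the z-score directly gives about -1.30.

```python
from math import sqrt
from statistics import NormalDist

# Potassium in one banana: mean 630 mg, standard deviation 40 mg
banana = NormalDist(mu=630, sigma=40)

# Total for three bananas, assumed independent:
# means add (3*630) and variances add (3*40^2)
T = banana + banana + banana

# Center and scale the cutoff, as in the table method
z = (1800 - T.mean) / T.stdev

# Pr(T < 1800), computed directly from the CDF of T
p = T.cdf(1800)

print(f"T ~ N({T.mean:.0f}, {T.variance:.0f}), z = {z:.2f}, Pr(T < 1800) = {p:.4f}")
```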