hudm4122 probability and statistical inference · binomial distribution, will be covered on the ......

Post on 25-Mar-2020


HUDM4122 Probability and Statistical Inference

March 4, 2015

First things first

The Exam

• Due to Monday’s class cancellation

• Today’s lecture on the Normal Distribution will not be covered on the Midterm

• However, the previous lecture, on the Binomial Distribution, will be covered on the midterm

Apologies…

• For the fact that this HW still included normal distribution problems

• We will go over those in the first class after the exam

• I was impressed with how well you all did despite the extra challenge

In the last class

• We studied the Binomial Probability Distribution

• The distribution of probability coming from a set of trials that have two possible outcomes

HW (Binomial)

P1

Which of these is not a binomial experiment?

Your friend throws a ball at a target on Coney Island 100 times; how many times does it hit?

You and your 6 friends buy a pizza with 10 slices, 4 pepperoni. If each friend takes one slice, at random, how many slices of pepperoni should you expect will be taken?

You use an automated detector to figure out how many of 50 randomly sampled students gamed the system.

You select 6 climate science professors out of all universities in the country. How many of them will say global warming is real?

P1: Common other answer (I can see arguments for it)

P2

For the following probability distribution, what is the mean?

X    P(X)
0    0.1
1    0.3
2    0
3    0.3
4    0.3

0(0.1)+1(0.3)+2(0)+3(0.3)+4(0.3)

0+0.3+0+0.9+1.2

2.4

Common Wrong Answer: (0.1+0.3+0+0.3+0.3)/5=0.2
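The P2 arithmetic can be checked with a short Python sketch. The variable names (`xs`, `ps`, `mean`, `wrong`) are illustrative and not from the slides; the values are copied from the table above.

```python
# Values and probabilities copied from the P2 table.
xs = [0, 1, 2, 3, 4]
ps = [0.1, 0.3, 0.0, 0.3, 0.3]

# Sanity check: the probabilities of a distribution must sum to 1.
assert abs(sum(ps) - 1.0) < 1e-9

# Mean = sum of each value times its probability.
mean = sum(x * p for x, p in zip(xs, ps))

# The common wrong answer averages the probabilities themselves.
# That always gives 1 / (number of rows) and ignores the x values.
wrong = sum(ps) / len(ps)

print(round(mean, 4))   # 2.4
print(round(wrong, 4))  # 0.2
```

Because the probabilities always sum to 1, the wrong approach would return 0.2 for any five-row table, whatever the values of X are.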

P9

For the following binomial probability distribution, what is the P(X=5)?

N = 10
P = 0.5
Q = 0.5
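The worked answer for P9 did not survive in this transcript. As a rough check, the sketch below assumes the standard binomial formula, P(X = k) = C(n, k) p^k q^(n-k); the helper name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

# N = 10, P = 0.5, Q = 0.5, as in P9.
prob = binomial_pmf(5, 10, 0.5)
print(round(prob, 4))  # 0.2461
```

Here C(10, 5) = 252 and 0.5^10 = 1/1024, so the result is 252/1024.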

In the last class

• We studied the Binomial Probability Distribution

Before we move on…

• Any questions about binomial probability distributions?

Today

• Chapter 6 in Mendenhall, Beaver, & Beaver

• Normal Probability Distribution

• We’ll start today
• We’ll finish Wednesday

So far…

• We’ve talked about discrete probabilities, that can be written out in a table

x    P(x)
0    1/4
1    1/2
2    1/4

So far…

• Though, of course, for 4000 trials, writing out that table would take an awfully long time…

But what about

• Distributions of continuous (numerical) variables?

• These variables could have an infinite number of potential values

• So you can’t write them in a table

So, whereas…

• The histograms of discrete variables look blocky

[Figure: histogram of p(x) against x for x = 0, 1, 2; vertical axis from 0 to 0.6]

The histograms of continuous variables look smooth

Why?

• For a discrete variable, with overall mean “7”, individual cases might be measured as “6” or “8” or even “5”

• Whereas for a continuous variable, with overall mean 6.421, individual cases might be measured as any number, with decreasing probability as you get further away from 6.421

For discrete distributions

• The probability of any value x is found to be P(x), a known probability for that value

For continuous distributions

• The probability of any value x is found by a formula f(x), which represents the density of cases likely to be found at value x

[Figure: density curve with regions labeled “More dense”, “Less dense”, and “Even less dense”]

Because of this

• Probability distributions for continuous variables

• Are also called probability density functions

Commonality between Discrete and Continuous

• For Discrete
– Probabilities for all cases add to 1

• For Continuous
– Probability under curve adds to 1

Similarity between Discrete and Continuous

• For Discrete
– Probability of value between a and b is sum of probability for all values between a and b

• For Continuous
– Probability of value between a and b is area under curve between a and b

How do we find the area under that curve?

• For some functions, it’s not that bad

Flat Distribution

• Also called the uniform distribution

The Flat Distribution

• f(x) = 0 if x < i OR x > j
• f(x) = 1/(j - i) if x > i AND x < j (the height that makes the total area equal 1)
• Where i < j

Example (From Book)

• i = -0.5
• j = +0.5

• f(x) = 0 if x < -0.5 OR x > +0.5
• f(x) = 1 if x > -0.5 AND x < +0.5

• What is the probability that -0.2 < x < +0.2?

• So for -0.2 < x < 0.2, f(x) = 1
– Does everyone see why?

• Rectangle width = -0.2 to 0.2 = 0.4
• Rectangle height = 1

• 0.4*1 = 0.4, so P(-0.2 < x < +0.2) = 0.4
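The rectangle calculation can be sketched in Python. The helper `uniform_prob` is hypothetical, and it assumes the height of a flat distribution on (i, j) is 1/(j - i), which equals 1 in the book example because j - i = 1.

```python
def uniform_prob(a, b, i, j):
    """P(a < X < b) for a flat distribution on (i, j), as a rectangle area."""
    height = 1.0 / (j - i)          # height chosen so the total area is 1
    lo, hi = max(a, i), min(b, j)   # clip to the range where the density is nonzero
    width = max(0.0, hi - lo)       # no overlap means zero width
    return width * height

# Book example: i = -0.5, j = +0.5, interval -0.2 < x < +0.2.
print(round(uniform_prob(-0.2, 0.2, -0.5, 0.5), 4))  # 0.4
```

Clipping the interval to (i, j) means a question such as P(x > some value) can be asked by passing any upper bound at or beyond j.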

You Try It

• i = 3
• j = 5

• f(x) = 0 if x < i OR x > j
• f(x) = 0.5 if x > i AND x < j
• Where i < j

• What is P(4.5 < x < 5)?

• What is P(x > 4.8)?

Questions? Comments?

BTW

• Since the probability of a value between a and b is the area under the curve between a and b

• The probability for any specific value is defined as 0
– I.e. P(a) = 0
– P(b) = 0

Which implies

• P(x>=a) = P(x>a)
• Not true for discrete random variables

Other functions are a bit trickier than the Flat Distribution

The one we’ll focus on today is the Normal Distribution

• Also called the Gaussian distribution

• The version of the Normal Distribution that is almost always used is also called the Z distribution

• We’ll discuss why in a bit

Why’s it called the normal distribution?

• It’s a good approximation for a lot of real-world data

Normal Distribution

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$

µ = mean
σ = standard deviation
π = 3.14159…
e = 2.7183…

How do we find the area under that curve?

• Option 1: Break out your integral calculus

• Option 2: Break out someone else’s integral calculus

• (Aka a table, or Microsoft Excel)

Normal Distribution (µ = 0, σ = 1)

“Standardized Normal Distribution” (µ = 0, σ = 1)

Standardized Normal Distribution

P(X>0)=0.5
P(X<0)=0.5

P(X<-1.96)=0.025

P(X>1.96)=0.025

P(-1.96<X<1.96)=0.95
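These areas can be reproduced without a table. The sketch below assumes the standard identity relating the normal cumulative probability to the error function, available in Python as `math.erf`; the function name `std_normal_cdf` is illustrative.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """P(Z < z) for the standardized normal (mean 0, SD 1)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(round(std_normal_cdf(0), 3))                              # 0.5
print(round(std_normal_cdf(-1.96), 3))                          # 0.025
print(round(1 - std_normal_cdf(1.96), 3))                       # 0.025
print(round(std_normal_cdf(1.96) - std_normal_cdf(-1.96), 3))   # 0.95
```

This is "someone else's integral calculus" in code form: `erf` evaluates the integral of the bell curve so the caller does not have to.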

Questions? Comments?

Remember z-scores?

Z-score formula

• $z = \frac{x - \bar{x}}{s}$
• $z = \frac{x - \mu}{\sigma}$

• The deviation, divided by the standard deviation

The Standardized Normal Distribution shows the probabilities of the values of Z

• That’s why it’s also called the Z distribution!

[Figure: “Standardized Normal Distribution” (µ = 0, σ = 1), with Z = -2, Z = 0, and Z = 2 marked]

Recall from earlier in the semester

• If your data is normally distributed

• 68% of your data will be between -1 SD of the mean, and +1 SD of the mean
– z between -1 and +1

• 95% of your data will be between -2 SD of the mean, and +2 SD of the mean
– z between -2 and +2

• 99.7% of your data will be between -3 SD of the mean, and +3 SD of the mean
– z between -3 and +3
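The 68 / 95 / 99.7 figures are rounded. A quick check, assuming Python 3.8 or later for `statistics.NormalDist` (the names `std` and `shares` are illustrative):

```python
from statistics import NormalDist

std = NormalDist()  # standardized normal: mean 0, SD 1

# Share of the distribution within k standard deviations of the mean.
shares = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(k, round(share, 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The share within ±2 SD is slightly more than 95%, which is why the exact 95% cutoff used elsewhere in this lecture is 1.96 rather than 2.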

Ryan’s Bad Advice For the Lovelorn

• If you tell someone they’re 3 SD better than the mean, you’re telling them they’re better than 99.7%+0.15%=99.85% of other people

Comments? Questions?

Getting cumulative probabilities

• So, let’s say you want to know the probability of Z<0

• Once again, you can
– Do the integral calculus
– Look in a table
– Use Excel

Looking in a table

• Appendix I, Table 3

• Page 664 in MBB

What’s the probability Z<-1.96?

What’s the probability Z<-1.64?

What’s the probability Z>-1.64?

• 1-0.0505=0.9495

What’s the probability -1.96<X<-1.64?

• Between 0.0505 and 0.0250

• 0.0505-0.0250= 0.0255

What’s the probability -1.96<X<+1.96?

• Between 0.0250 and 0.9750

• 0.9750-0.0250= 0.95

• In other words, 95% of the probability distribution is between -1.96 and 1.96
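The table lookups in this section can be reproduced in Python as a cross-check. This sketch assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # the Z distribution

p_lt_196 = Z.cdf(-1.96)  # P(Z < -1.96)
p_lt_164 = Z.cdf(-1.64)  # P(Z < -1.64)

print(round(p_lt_196, 4))                # 0.025
print(round(p_lt_164, 4))                # 0.0505
print(round(1 - p_lt_164, 4))            # 0.9495  -> P(Z > -1.64)
print(round(p_lt_164 - p_lt_196, 4))     # 0.0255  -> P(-1.96 < Z < -1.64)
print(round(Z.cdf(1.96) - p_lt_196, 4))  # 0.95    -> P(-1.96 < Z < 1.96)
```

Each line follows the same pattern as the table method: look up one or two cumulative probabilities, then subtract.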

Questions? Comments?

Doing it in Excel

• =NORMDIST(X,0,1,TRUE)

• For example
• =NORMDIST(-1.96,0,1,TRUE)
• Equals 0.025
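For readers without Excel, a rough Python equivalent can be sketched. The helper `normdist` is hypothetical: it only mirrors the argument order of the Excel formula above and is built on `statistics.NormalDist` (Python 3.8 or later).

```python
from statistics import NormalDist

def normdist(x, mean, sd, cumulative):
    """Hypothetical stand-in for =NORMDIST(x, mean, sd, cumulative)."""
    dist = NormalDist(mean, sd)
    # TRUE asks for the cumulative probability P(X < x);
    # FALSE asks for the height of the density curve at x.
    return dist.cdf(x) if cumulative else dist.pdf(x)

print(round(normdist(-1.96, 0, 1, True), 3))  # 0.025
```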

Standardizing a Distribution

• Assuming data is normally distributed

• You can standardize that data, regardless of what its original mean and standard deviation were

Example

• Undergraduates rate their professor’s quality

• The average rating is 3.8, SD is 0.4

• What is the probability that a professor gets 4.5 or better?

• $Z = \frac{4.5 - 3.8}{0.4} = \frac{0.7}{0.4} = +1.75$

• P(Z < 1.75) = 0.96
• thus P(Z>1.75) = 0.04

• So 4% of professors can be expected to get a rating of 4.5 or better
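The same steps, standardize and then take the upper-tail area, can be sketched in Python. The variable names are illustrative, and `statistics.NormalDist` (Python 3.8 or later) stands in for the table.

```python
from statistics import NormalDist

mean, sd, cutoff = 3.8, 0.4, 4.5

# Step 1: convert the rating to a z-score.
z = (cutoff - mean) / sd

# Step 2: area to the right of z under the standardized normal curve.
p_above = 1 - NormalDist().cdf(z)

print(round(z, 2))        # 1.75
print(round(p_above, 2))  # 0.04
```

Standardizing first is optional in code, since `NormalDist(3.8, 0.4).cdf(4.5)` gives the same cumulative probability, but it matches the table-based method used in class.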

Try It In Solver-Explainer Pairs

• According to Stanford-Binet’s totally discredited definition of a “genius”, a genius has an IQ of 140 or higher

• Now we know it simply requires working at an Apple Store

• The average IQ is 100, SD is 15

• What is the probability that a person is a genius?

Questions? Comments?

Final questions or comments for the day?

Review sessions

• Thursday 10am and 1pm

• Location still TBD

Upcoming Classes

• 3/9 Exam 1
