Random Variables & Probability Distributions
Chapter 6 of the textbook
Pages 167-208
Lecture Overview
Schedule
Clarification from Friday
Discrete Random Variables
Schedule
Today:
– Discrete random variables
– Homework #4 will be posted this afternoon
Wednesday:
– Continuous random variables & bivariate random variables
– Homework #3 due
Friday:
– Homework #4 help & Excel show and tell
Next Monday:
– Any remaining chapter 6 slides
– Exam #1 review
– Homework #4 due
Next Wednesday:
– Exam #1 (you're allowed 1 sheet of paper (front & back) for notes & equations)
– Test questions will be very reminiscent of homework problems
Next Friday:
– Go over exam #1 questions
– Intro to S-Plus
Clarification From Friday
On HW3, question #15
P(A) = .3, P(B) = .5, P(B|A) = .4
What is P(A|B)? What is P(A∩B)?
Using the multiplication theorem
P(B∩A) = P(B|A)*P(A) = 0.4 * 0.3 = .12
Using the definition of conditional probability
P(A|B) = P(A∩B) / P(B) = .12 / .5 = .24
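These two steps can be sketched in a few lines of Python (an illustration only, not part of the course materials) to confirm the arithmetic:

```python
# HW3 #15 check: multiplication theorem, then the definition of
# conditional probability.
p_a = 0.3
p_b = 0.5
p_b_given_a = 0.4

p_a_and_b = p_b_given_a * p_a   # P(A ∩ B) = P(B|A) * P(A)
p_a_given_b = p_a_and_b / p_b   # P(A|B) = P(A ∩ B) / P(B)

print(round(p_a_and_b, 2))    # 0.12
print(round(p_a_given_b, 2))  # 0.24
```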
Clarification From Friday
Why doesn’t P(A ∩ B) = P(A) * P(B)?
Answer: because (A) and (B) aren’t statistically independent
Recall that statistical independence is defined as:
– P(A|B) = P(A)
– OR
– P(B|A) = P(B)
This is not true for this problem
If A and B are statistically independent, the multiplication theorem becomes: P(A ∩ B) = P(A) * P(B) since we can just replace P(A|B) with P(A)
Definitions
Random Sample (from Ch. 1)
Variable (from Ch. 1)
Random Variable
– "any numerically valued function that is defined over a sample space"
– For the household example in the book: "the variable is random not because the household makes a random decision to include a certain number of people, but because our sample experiment selects a household randomly"
Example
Imagine randomly sampling students in the union and asking them how many books they are carrying
Example Data (Elementary Outcomes)
– Student 1: 3 books
– Student 2: 2 books
– Student 3: 0 books
– Student 4: 1 book
– Student 5: 2 books
– Student 6: 1 book
– Student 7: 0 books
– Student 8: 1 book
– Student 9: 4 books
– Student 10: 1 book
Sample Space : {0,1,2,3,4}
Random Variable (X) (Function: X = # of books)
– X(Student 1) = 3
– X(Student 2) = 2
– X(Student 3) = 0
– Etc.
Probability of Random Variables
– So X can be any value from the full set of possible #s of books
– x can = any number in the sample space (0,1,2,3,4)
– P(x) is the probability of getting an x in a random sample
– Example: P(0 books) = 2/10 = .2
– Example: P(3) = 1/10 = .1
Clarification: X and x
X is the random variable
– Can be any of the possible values and their associated probabilities
– In other words, this can equal any element in a sample space, each with a probability of occurring
x is one possible outcome of the random variable (i.e., an event)
– For example, x can be 0 books, 1 book, 2 books, etc.
Why does this matter?
– When we figure out probabilities we are usually concerned with P(xi) since P(X) = 1
– When we figure out expected values (E) or variances (V) we are concerned with X because we want to know the expected values with respect to all possibilities
Definition
Probability Distribution or Function
– "a table, graph, or mathematical function that describes the potential values of a random variable X and their corresponding probabilities"
Example Continued
Probability Distribution or Function

Table Form:
xi   P(xi)
0    2/10 = 0.2
1    4/10 = 0.4
2    2/10 = 0.2
3    1/10 = 0.1
4    1/10 = 0.1

Graph Form: [bar chart of Probability P(x) (y-axis, 0 to 0.45) vs. Number of Books (x-axis, 0 to 4)]
Question: What do these remind you of from past chapters?
Key Concept
Discrete Random Variables
– "The set of possible values (i.e., the sample space) is finite or countably infinite."
Continuous Random Variables
– The set of possible values can be any real number in the range of possible values (i.e., infinite possible values)
Questions:
– What type of random variable is the student / book example?
– Can you come up with examples of each?
Probability Mass Function
Specifies the probability distribution for discrete variables
The tables and graphs are examples of the probability mass function (i.e., the probability is “massed” at the discrete possible values)
The probability mass function preserves the provision:

P(X) = Σ P(xi) = 1  (summed over i = 1 to k)

k = the different values of the discrete variable
i = 1, 2, …, k
This is identical to the rule from last chapter
Example Continued

Σ P(xi) = P(0) + P(1) + P(2) + P(3) + P(4) = 1
0.2 + 0.4 + 0.2 + 0.1 + 0.1 = 1
Expected Values for Discrete Random Variables
“E” is the term for expected values
E(X) = the expected value of a discrete random variable
To calculate E we need the probability distribution (e.g., the probability distribution table)
E(X) = Σ P(xi) * xi  (summed over i = 1 to k)
Example Continued
For our students & books example:

E(X) = P(x0)*x0 + P(x1)*x1 + … + P(xk)*xk
E(X) = 0.2*0 + 0.4*1 + 0.2*2 + 0.1*3 + 0.1*4
E(X) = 0 + 0.4 + 0.4 + 0.3 + 0.4 = 1.5

So if we randomly selected a student we would expect them to have 1.5 books with them.
Question: does this remind you of any other statistic?
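The expected-value sum above can be sketched in Python (illustrative only; the `pmf` dictionary is just the probability distribution table from the slides):

```python
# E(X) = sum of P(xi) * xi for the students & books example.
pmf = {0: 0.2, 1: 0.4, 2: 0.2, 3: 0.1, 4: 0.1}  # xi -> P(xi)

e_x = sum(p * x for x, p in pmf.items())
print(round(e_x, 2))  # 1.5
```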
Variance Values for Discrete Random Variables
“V” is the term for the variance of a probability distribution
V(X) = the variance value of a discrete random variable
To calculate V we need E and the probability distribution (e.g., the probability distribution table)
Note: I wrote the equation a little differently than the book to make it clear that you do the sum first and then subtract the E(X)²
V(X) = [Σ P(xi) * xi²] − [E(X)]²  (summed over i = 1 to k)
Example Continued
For our students & books example:

V(X) = [P(x0)*x0² + P(x1)*x1² + … + P(xk)*xk²] − [E(X)]²
V(X) = [0.2*0² + 0.4*1² + 0.2*2² + 0.1*3² + 0.1*4²] − [1.5]²
V(X) = [3.7] − [2.25] = 1.45

As with univariate statistics, the standard deviation is the square root of the variance.
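The same variance calculation can be sketched in Python (again, an illustration rather than the course's Excel approach):

```python
# V(X) = [sum of P(xi) * xi**2] - E(X)**2 for the students & books example.
pmf = {0: 0.2, 1: 0.4, 2: 0.2, 3: 0.1, 4: 0.1}  # xi -> P(xi)

e_x = sum(p * x for x, p in pmf.items())              # expected value, 1.5
v_x = sum(p * x**2 for x, p in pmf.items()) - e_x**2  # 3.7 - 2.25
sd = v_x**0.5                                          # standard deviation

print(round(v_x, 2))  # 1.45
print(round(sd, 2))   # 1.2
```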
Discrete Probability Models
If we have a census, the probability distribution (table, graph, etc.) is complete/finished/appropriate/accurate
If we have a sample, we try to match our sample probability distribution to a known probability distribution
Discrete Probability Models
Common probability models for discrete random variables include– Uniform distribution
– Binomial distribution
– Poisson distribution
The benefit of using these models is that they have known properties and corresponding equations already made
Discrete Uniform Distribution
The probability of each possible value of the random variable is equal
This equates to a rectangular graph (i.e., flat on top)
Discrete Uniform Distribution Equations
Probability: P(x) = 1/k
Expected: E(X) = Σ x * (1/k)  (summed over the k possible values of x)
Variance: V(X) = (k² − 1) / 12
Discrete Binomial Distribution
There are 2 and only 2 possible outcomes of a statistical experiment (e.g., flipping a coin)
Discrete Binomial Distribution Equations
Probability: P(x) = nCx * π^x * (1 − π)^(n−x)
Expected: E(X) = nπ
Variance: V(X) = nπ(1 − π)
π = the probability of one of the two outcomes (e.g., a success)
Example (from book)
Quiz with 10 multiple choice questions
Each question has 5 possible answers
P(guessing correctly) = 1/5 = 0.2
P(guessing incorrectly) = 4/5 = 0.8
What is the probability of guessing 5 correct answers?
Answering this question with what we learned last chapter
Imagine we only have 2 questions. Since the questions are independent, P(A∩B) = P(A) * P(B). This result is in the upper left box.
Answering this question with what we learned last chapter
Now imagine we add a third question. Since the questions are independent, P(A∩B∩C) = P(A) * P(B) * P(C) is in the upper left box.
Answering this question with what we learned last chapter
So 3 out of 3 right answers = P(RRR) = 0.2 * 0.2 * 0.2 = .008
Follow this out to 5 correct & 5 incorrect:
– = 0.2*0.2*0.2*0.2*0.2*0.8*0.8*0.8*0.8*0.8 (i.e., 0.2^5 * 0.8^5)
– = 0.000104858 for one option of 5 R, 5 W
– Another option would be 4 R, 5 W, 1 R (i.e., questions 1-4 & 10 correct)
How many combinations of 5 Right & 5 Wrong are there?
Use combinations rule: C(10,5) = 252
Answer = 252 * 0.000104858 = 0.026
Answer Question Using the New Equation

P(x) = nCx * π^x * (1 − π)^(n−x)
P(5) = 10C5 * 0.2^5 * (1 − 0.2)^(10−5) = 0.026
This answer can also be found in a table in the back of the book on P. 605
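As a quick sketch (in Python, not the book's tables), the binomial formula above is a one-liner; `math.comb` plays the role of the nCx combinations rule:

```python
from math import comb

def binom_pmf(x, n, pi):
    """P(x) = nCx * pi**x * (1 - pi)**(n - x)."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

# Quiz example: n = 10 questions, P(correct guess) = 0.2, exactly 5 right.
print(round(binom_pmf(5, 10, 0.2), 3))  # 0.026
```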
Poisson Discrete Distribution
Poisson distributions are often used to determine the probability of a number of events (x) occurring in a fixed space or over a fixed period of time
But they can be used to determine probabilities for other variables as well (the book used the example of lengths of rope)
Poisson distributions are also called the “distribution of rare events”
Poisson Discrete Distribution
Requirements for using a Poisson distribution:
– Mutually exclusive events are independent
– The probability of an event occurring is small and proportional to the size of the area (or to the length of the interval)
– The probability of 2 or more events occurring in a small area or interval is near zero
Rules 2 and 3 are where the phrase "distribution of rare events" comes from
Poisson Discrete Distribution
The parts of a Poisson distribution equation:
– λ – The average occurrence of an event in time or space
  • 8 houses per block
  • 1 hiccup per minute
  • 1 lightning strike per square mile per decade
  • The "answers" to questions will be in the same units as λ (e.g., "per minute")
– e – base of the natural logarithm (e = 2.71828...)
– X – the Poisson random variable
  • Just like the random variable for the other distributions
  • This is the value for which we determine E and V
– x – the values from X for which we find probabilities etc.
  • E.g., what is the probability of x if x = 2 hiccups per minute?
  • x can be any non-negative integer (0, 1, 2, …)
Poisson Discrete Distribution
Probability: P(x) = (λ^x * e^−λ) / x!
Expected: E(X) = Σ x * (λ^x * e^−λ) / x! = λ  (summed over x = 0, 1, 2, …)
Variance: V(X) = Σ [x − E(X)]² * (λ^x * e^−λ) / x! = λ  (summed over x = 0, 1, 2, …)
Poisson Discrete Distribution
The Poisson discrete distribution is actually a family of distributions– The members of the family relate to one λ each
For example:– 1 hiccup per minute uses one family– 2 hiccups per minute uses another family
Poisson Discrete Distribution
1 Hiccup Per Minute (i.e., λ = 1)
What is the probability of hiccupping 4 times per minute (i.e., x = 4)?
P(x) = (λ^x * e^−λ) / x!
P(4) = (1^4 * 2.71828^−1) / 4!
P(4) = 0.36788 / 24
P(4) = 0.01533
This answer can also be found in a table in the back of the book on P. 606
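The hiccup calculation can be sketched in Python as well (illustrative, not the book's table lookup):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam**x * exp(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

# Hiccup example: lambda = 1 per minute, x = 4 hiccups per minute.
print(round(poisson_pmf(4, 1.0), 5))  # 0.01533
```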
Continuous Random Variables
Review:
– Continuous Random Variables: The set of possible values can be any real number in the range of possible values (i.e., infinite possible values)
For continuous random variables we use probability density functions rather than probability mass functions
Probability Density Functions
Specify the probability distribution for continuous variables
Unlike the probability mass functions we used with discrete variables, where all the P(xi) added up to 1, with probability density functions the area under the curve = 1
Also unlike probability mass functions we aren’t concerned with P(xi) because each xi is a vertical line with an area equal to zero
Instead we are concerned with probabilities such as P(x > some amount A) or P(x between values B and C)
Probability Density Functions
The probability density function of a random continuous variable X is denoted as f(X)
The “function” part (i.e., the “f”) relates to the equation that produces a curve (i.e., it is used to graph the line)
Conditions satisfied by probability density functions (assume the min and max values are a and b respectively):
– f(x) ≥ 0 for a ≤ x ≤ b
– The area under f(x) from x = a to x = b = 1
Because we are ultimately concerned with areas under portions of the curve, what type of math do we need?
Continuous Probability Distribution Models
Probability Distribution Models
– As with discrete random variables we usually have a sample rather than a census
– To calculate probabilities from a sample we assume the data conform to some known distribution for which we have handy tables
– This is how we avoid having to do calculus
Continuous Probability Models
Common probability models for continuous random variables include:
– Uniform distribution
  • Rectangular distribution
– Normal distribution (a.k.a. Gaussian)
  • The "bell shaped curve"
Uniform Continuous Distribution
Probability: P(c to d) = (d-c) / (b-a) given a ≤ c ≤ d ≤ b
Distribution Function: f(x) = 1 / (b − a) for a ≤ x ≤ b
Expected: E(X) = (a + b) / 2
Variance: V(X) = (b − a)² / 12
Normal Probability Distribution
Probability: convert to z-scores first (explained in a few slides)
Distribution Function: f(x) = (1 / (σ * √(2π))) * e^(−(1/2) * ((x − μ)/σ)²)
Expected: E(X) = μ
Variance: V(X) = σ²
Note: π = 3.14…
Features of the Normal Probability Distribution
The mean, median, and mode values are all equal to the peak of the distribution
The distribution is symmetrical
½ of the area under the curve lies above the mean and ½ below
Z scores
To avoid having to calculate probabilities for curves with varying μ, σ, and shapes we can convert any normally distributed random variable to a standard form for which we have tables
To do this we use the z transformation:

z = (x − μ) / σ

This conversion changes our variable measured in units of x (e.g., meters, miles, pounds) to units of z (i.e., standard deviation units)
Example
If we have a normally distributed dataset of bowling scores with μ = 150, σ = 10, what is the z-score of 175?
– What does a z-score of 2.5 mean?
– Answer: the value of 175 is 2.5 standard deviations above the mean for this particular normal probability distribution
z = (175 − μ) / σ = (175 − 150) / 10 = 25 / 10 = 2.5
Probability z-scores
Remember that one of the reasons we calculate z-scores is to ask questions about probability
For example what is the probability of bowling over 175 given our previous example?
To answer this question it is easiest to use a z-table like the one on page 207 of your book
Using Standard Normal Probabilities (i.e., a z-table)
The table in our book is atypical of what you usually see, but more user friendly thanks to the pictures
For our bowling example, find the z-score (2.5) in the column on the left
Now choose the column of interest, in this case column #3: P(Z > z)
The probability value we get from the table is 0.006
– This means that the probability of bowling over 175, for our fictional dataset, is 0.006 (i.e., 0.6%)
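The z-score and the table lookup can both be sketched in Python. This is an illustration, not the book's method: instead of a z-table, the standard-normal upper tail is computed with the complementary error function, P(Z > z) = 0.5 * erfc(z / √2).

```python
from math import erfc, sqrt

mu, sigma = 150.0, 10.0   # bowling example parameters
z = (175 - mu) / sigma    # z = (x - mu) / sigma

# Standard-normal upper tail without a z-table:
p_over = 0.5 * erfc(z / sqrt(2))  # P(Z > z)

print(z)                 # 2.5
print(round(p_over, 3))  # 0.006
```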
More Conventional Z tables
Normally we see z-tables with the following characteristics:
– 2 digits of precision in the far left column
– 1 additional digit of precision in each of the 10 other columns to the right
– A value indicating a one-directional probability (i.e., the total probability of values less than a z-score)
  • This is equivalent to the 5th column in the z-table in our book
Bivariate Random Variables
Now we turn our attention to the relationship between two variables (hence the name “bivariate”)
The random variables can be discrete or continuous
Most of the following slides & equations should look very familiar from chapter 5
Bivariate Probability Functions
Conditions:
– 0 ≤ P(x,y) ≤ 1
– Σx Σy P(x,y) = 1
For 2 discrete random variables (x & y) it is useful to set up a contingency table
These contingency tables are just like those for 2 events and they may contain actual counts or probabilities
Example (from book)
100 households sampled and asked how many people are in the household (x) and how many cars are owned by members of the household (y)
Our book did us the disservice of switching the x and y axes
The data can be summarized in the following table:

Household Size (x) \ Cars (y):   0    1    2    3
2                                10   8    3    2
3                                7    10   6    3
4                                4    5    12   6
5                                1    2    6    15
Marginal Totals
Marginal probabilities are the sums of the rows and columns
The marginal totals for household size (x) are in red
The marginal totals for cars (y) are in blue
The total number of households sampled is in green

Household Size (x) \ Cars (y):   0    1    2    3    Total
2                                10   8    3    2    23
3                                7    10   6    3    26
4                                4    5    12   6    27
5                                1    2    6    15   24
Total                            22   25   27   26   100
Marginal Probabilities
All probabilities are just the totals from each box (see last slide) divided by the total number of households (100)
The marginal probabilities for household size (x) are in red
The marginal probabilities for cars (y) are in blue
The sum of each set of marginal probabilities is in green

Household Size (x) \ Cars (y):   0     1     2     3     Total
2                                .10   .08   .03   .02   .23
3                                .07   .10   .06   .03   .26
4                                .04   .05   .12   .06   .27
5                                .01   .02   .06   .15   .24
Total                            .22   .25   .27   .26   1.0
Conditional Probabilities
Equation: P(x|y) = P(x,y) / P(y)
For example: what is the probability of having a household size of 4 (x=4) given the household has 3 cars (y=3)?

P(4|3) = 0.06 / 0.26 = 0.231
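The same lookup can be sketched in Python: store the joint table as a dictionary, sum one column to get the marginal, then divide (illustrative code only; the probabilities are the household/cars table from the book).

```python
# P(x|y) = P(x,y) / P(y), using the household/cars joint table.
joint = {  # (household size x, cars y): P(x, y)
    (2, 0): .10, (2, 1): .08, (2, 2): .03, (2, 3): .02,
    (3, 0): .07, (3, 1): .10, (3, 2): .06, (3, 3): .03,
    (4, 0): .04, (4, 1): .05, (4, 2): .12, (4, 3): .06,
    (5, 0): .01, (5, 1): .02, (5, 2): .06, (5, 3): .15,
}

p_y3 = sum(p for (x, y), p in joint.items() if y == 3)  # marginal P(y=3) = 0.26
p_4_given_3 = joint[(4, 3)] / p_y3                      # P(x=4 | y=3)
print(round(p_4_given_3, 3))  # 0.231
```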
Covariance
“Covariance is a direct statistical measure of the degree to which two random variables X and Y tend to vary together”
Covariance is positive when X and Y increase together (and likewise decrease together)
– Ex. the amount of ice cream you eat and the temperature outside
Covariance is negative when X and Y are inversely related
– Ex. the number of layers of clothes you tend to wear and the temperature outside
When there is no pattern the covariance is close to zero
Covariance
Covariance Equation:
Note 1: There is another option for calculating covariance in your book
Note 2: there are also nice tables showing how you would go about calculating these values on page 202 & 203
C(X,Y) = Σx Σy P(x,y) * [x − E(X)] * [y − E(Y)]
Covariance
Covariance Equation: C(X,Y) = Σx Σy P(x,y) * [x − E(X)] * [y − E(Y)]
Parts of this equation:
– The P(x,y) values come from the covariance table
– The E(X) and E(Y) values are calculated using this formula (from about 40 slides ago): E(X) = Σ P(xi) * xi
Covariance Example (from book)
First we need to calculate the expected (E) values:
We can use the marginal probabilities for this
For E(X) multiply the xi by the row totals (i.e., orange # * red #)
For E(Y) multiply the yi by the column totals (i.e., purple # * blue #)

E(X) = Σ P(xi) * xi
E(X) = 2*0.23 + 3*0.26 + 4*0.27 + 5*0.24
E(X) = 0.46 + 0.78 + 1.08 + 1.20 = 3.52

E(Y) = Σ P(yi) * yi
E(Y) = 0*0.22 + 1*0.25 + 2*0.27 + 3*0.26
E(Y) = 0 + 0.25 + 0.54 + 0.78 = 1.57

Household Size (x) \ Cars (y):   0     1     2     3     Total
2                                .10   .08   .03   .02   .23
3                                .07   .10   .06   .03   .26
4                                .04   .05   .12   .06   .27
5                                .01   .02   .06   .15   .24
Total                            .22   .25   .27   .26   1.0
Covariance Example (book is wrong again!)

(x,y)  P(x,y)  x−E(X)  y−E(Y)  [x−E(X)][y−E(Y)]  P(x,y)*[x−E(X)][y−E(Y)]
2,0 .1 -1.52 -1.57 2.3864 0.23864
2,1 .08 -1.52 -0.57 0.8664 0.069312
2,2 .03 -1.52 0.43 -0.6536 -0.019608
2,3 .02 -1.52 1.43 -2.1736 -0.043472
3,0 .07 -0.52 -1.57 0.8164 0.057148
3,1 .1 -0.52 -0.57 0.2964 0.02964
3,2 .06 -0.52 0.43 -0.2236 -0.013416
3,3 .03 -0.52 1.43 -0.7436 -0.022308
4,0 .04 0.48 -1.57 -0.7536 -0.030144
4,1 .05 0.48 -0.57 -0.2736 -0.01368
4,2 .12 0.48 0.43 0.2064 0.024768
4,3 .06 0.48 1.43 0.6864 0.041184
5,0 .01 1.48 -1.57 -2.3236 -0.023236
5,1 .02 1.48 -0.57 -0.8436 -0.016872
5,2 .06 1.48 0.43 0.6364 0.038184
5,3 .15 1.48 1.43 2.1164 0.31746
Sum 0.6336
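The whole table above collapses to one sum in code. A minimal Python sketch (illustrative; the joint probabilities are the household/cars table from the book):

```python
# C(X,Y) = sum over all (x,y) of P(x,y) * [x - E(X)] * [y - E(Y)].
joint = {  # (household size x, cars y): P(x, y)
    (2, 0): .10, (2, 1): .08, (2, 2): .03, (2, 3): .02,
    (3, 0): .07, (3, 1): .10, (3, 2): .06, (3, 3): .03,
    (4, 0): .04, (4, 1): .05, (4, 2): .12, (4, 3): .06,
    (5, 0): .01, (5, 1): .02, (5, 2): .06, (5, 3): .15,
}

e_x = sum(p * x for (x, y), p in joint.items())  # 3.52
e_y = sum(p * y for (x, y), p in joint.items())  # 1.57
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())

print(round(cov, 4))  # 0.6336
```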
Independence
As with events (e.g., A and B) from last chapter, x and y are independent if P(x,y) = P(x)P(y) for all values of x and y
Independence and covariance are closely related, but not the same
– Independent variables will have a covariance of 0
– But random variables with a covariance of 0 may not be independent
Problems With Covariance
The sign (+ or -) of the calculated covariance is meaningful, but not the magnitude
This is because the covariance is dependent on the scale of the input data
Therefore, if we multiplied x or y by 10 and recalculated the covariance, it would change even though the relationship between x and y, strictly speaking, is the same
Correlation Coefficient
The correlation coefficient is a standardized statistic that measures the relationship between random variables
Correlation coefficients range from -1 to 1– 1 is a positive relationship (both ↑ or ↓ together)– -1 is an inverse relationship (one ↑ while the other ↓)– 0 suggests, but doesn’t guarantee independence
Unlike covariance the scale of the data does not matter
Correlation Coefficient
In chapter 6 the book introduces this statistic with the assumption that the population covariance (C) & standard deviation (σ) are known or can be calculated
The correlation coefficient for a sample is discussed in chapter 12
Correlation Coefficient
Equation:
yxxy
YXC
),(
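Continuing the covariance example in Python (illustrative; the standard deviations come from the same discrete-variance formula used earlier in the chapter):

```python
# rho = C(X,Y) / (sigma_x * sigma_y) for the household/cars table.
joint = {  # (household size x, cars y): P(x, y)
    (2, 0): .10, (2, 1): .08, (2, 2): .03, (2, 3): .02,
    (3, 0): .07, (3, 1): .10, (3, 2): .06, (3, 3): .03,
    (4, 0): .04, (4, 1): .05, (4, 2): .12, (4, 3): .06,
    (5, 0): .01, (5, 1): .02, (5, 2): .06, (5, 3): .15,
}

e_x = sum(p * x for (x, y), p in joint.items())
e_y = sum(p * y for (x, y), p in joint.items())
cov = sum(p * (x - e_x) * (y - e_y) for (x, y), p in joint.items())

# sigma = sqrt(V) where V = E(X**2) - E(X)**2, as in the variance slides
sd_x = (sum(p * x**2 for (x, y), p in joint.items()) - e_x**2) ** 0.5
sd_y = (sum(p * y**2 for (x, y), p in joint.items()) - e_y**2) ** 0.5

rho = cov / (sd_x * sd_y)
print(round(rho, 3))  # approximately 0.529
```

Note that rescaling x or y (say, multiplying by 10) changes the covariance but leaves rho unchanged, which is exactly the standardization point made above.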