chapter 4

CHAPTER 4

• 4.1 - Discrete Models: general distributions; classical distributions (Binomial, Poisson, etc.)

• 4.2 - Continuous Models: general distributions; classical distributions (Normal, etc.)

What is the connection between probability and random variables? Events (and their corresponding probabilities) that involve experimental measurements can be described by random variables (e.g., “X = # Males” in the previous gender equity example).

POPULATION: random variable X

Pop values xi     Probabilities f(xi)
x1                f(x1)
x2                f(x2)
x3                f(x3)
⋮                 ⋮
Total             1

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …, xn

Data values xi    Relative frequencies f(xi) = fi/n
x1                f(x1)
x2                f(x2)
x3                f(x3)
⋮                 ⋮
xk                f(xk)
Total             1

Example: X = Cholesterol level (mg/dL)

POPULATION: discrete random variable X

Pop values x      Probabilities f(x)
x1                f(x1)
x2                f(x2)
x3                f(x3)
⋮                 ⋮
Total             1

[Figure: Probability Histogram of X; Total Area = 1]

“Probability mass function” (pmf): f(x) = probability that the random variable X is equal to a specific value x, i.e.,

f(x) = P(X = x)


“Cumulative distribution function” (cdf): F(x) = probability that the random variable X is less than or equal to a specific value x, i.e.,

F(x) = P(X ≤ x)
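A minimal Python sketch of the two definitions, assuming an illustrative probability table (the values and the names `pmf`, `f`, and `F` are chosen here for illustration only). The pmf is a lookup in the table, and the cdf is a running total of the pmf.

```python
from fractions import Fraction

# Illustrative probability table (assumed values): x -> f(x)
pmf = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

def f(x):
    """pmf: f(x) = P(X = x); zero for values X cannot take."""
    return pmf.get(x, Fraction(0))

def F(x):
    """cdf: F(x) = P(X <= x), the total of f over all values up to and including x."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

total = sum(pmf.values())  # the probabilities in the table must add up to 1

print(f(240), F(240), F(255), total)  # 1/3 1/2 1/2 1
```

Note that F stays flat between table values: F(255) equals F(240) because X cannot fall strictly between 240 and 270 in this table.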



Hey!!! What about the population mean μ and the population variance σ² ???

Calculating probabilities…

P(a ≤ X ≤ b) = Σ f(x), summed over all values x from a to b

= F(b) – F(a⁻), where F(a⁻) = P(X < a) is the cdf just below a
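A short Python sketch of this calculation, again on an assumed illustrative table (the names `prob_between`, `half_open`, and `closed` are hypothetical). It shows one detail worth watching for a discrete X: the plain difference F(b) – F(a) gives P(a < X ≤ b), so f(a) has to be added back when the endpoint a is meant to be included.

```python
from fractions import Fraction

# Illustrative probability table (assumed values): x -> f(x)
pmf = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

def F(x):
    """cdf: F(x) = P(X <= x)."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

def prob_between(a, b):
    """P(a <= X <= b): add f(x) over the table values with a <= x <= b."""
    return sum((p for v, p in pmf.items() if a <= v <= b), Fraction(0))

a, b = 240, 270
direct = prob_between(a, b)                      # f(240) + f(270)
half_open = F(b) - F(a)                          # P(a < X <= b), excludes x = a
closed = F(b) - F(a) + pmf.get(a, Fraction(0))   # add f(a) back to include x = a

print(direct, half_open, closed)  # 5/6 1/2 5/6
```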


Just as the sample mean x̄ and sample variance s² were used to characterize “measure of center” and “measure of spread” of a dataset, we can now define the “true” population mean μ and population variance σ², using probabilities.

• Population mean: μ = Σ x f(x)

Also denoted by E[X], the “expected value” of the variable X.

• Population variance: σ² = Σ (x – μ)² f(x)


Example 1:

POPULATION

Pop values x      Probabilities f(x)
210               1/6
240               1/3
270               1/2
Total             1

μ = Σ x f(x) = (210)(1/6) + (240)(1/3) + (270)(1/2) = 250

σ² = Σ (x – μ)² f(x) = (-40)²(1/6) + (-10)²(1/3) + (20)²(1/2) = 500
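The Example 1 arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name `mean_and_variance` is hypothetical, and exact fractions are used so that no rounding enters the check.

```python
from fractions import Fraction

def mean_and_variance(pmf):
    """Return (mu, sigma^2) for a discrete pmf given as {x: f(x)}."""
    mu = sum(x * p for x, p in pmf.items())                # mu = sum of x f(x)
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # sigma^2 = sum of (x - mu)^2 f(x)
    return mu, var

# Probability table of Example 1
example1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

mu, var = mean_and_variance(example1)
print(mu, var)  # 250 500
```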


Example 2:

POPULATION

Pop values x      Probabilities f(x)
180               1/3
210               1/3
240               1/3
Total             1

Equally likely outcomes result in a “uniform distribution.”

μ = Σ x f(x) = (180)(1/3) + (210)(1/3) + (240)(1/3) = 210 (clear from symmetry)

σ² = Σ (x – μ)² f(x) = (-30)²(1/3) + (0)²(1/3) + (30)²(1/3) = 600
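As a small sketch of the uniform case, the table for Example 2 can be generated from the list of values alone, since each of the k equally likely values gets probability 1/k. The helper name `uniform_pmf` is hypothetical.

```python
from fractions import Fraction

def uniform_pmf(values):
    """Equally likely outcomes: each of the k values gets probability 1/k."""
    k = len(values)
    return {x: Fraction(1, k) for x in values}

# Probability table of Example 2
example2 = uniform_pmf([180, 210, 240])

mu = sum(x * p for x, p in example2.items())
var = sum((x - mu) ** 2 * p for x, p in example2.items())
midpoint = Fraction(180 + 240, 2)   # center of the symmetric table

print(mu, var, mu == midpoint)  # 210 600 True
```

The last value printed is the symmetry remark in code form: for a table that is symmetric about its center, the mean is that center.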


To summarize…


POPULATION: discrete random variable X

Probability Table

Pop xi            f(xi)
x1                f(x1)
x2                f(x2)
x3                f(x3)
⋮                 ⋮
Total             1

μ = Σ x f(x)

σ² = Σ (x – μ)² f(x)

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …, xn

Frequency Table

Data xi           f(xi)
x1                f(x1)
x2                f(x2)
x3                f(x3)
⋮                 ⋮
xk                f(xk)
Total             1

x̄ = Σ x f(x)

s² = [n/(n – 1)] Σ (x – x̄)² f(x)

[Figures: histogram of X for the population and Density Histogram of X for the sample; Total Area = 1 in each]
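On the sample side, the same sums are taken over relative frequencies f(x) = (count of x)/n, with the extra factor n/(n – 1) in the variance. The sketch below uses a made-up sample (the list `data` is hypothetical, not from the notes) and compares the results with Python's `statistics` module.

```python
from collections import Counter
from fractions import Fraction
from statistics import mean, variance

# Hypothetical sample of size n = 6 (made up for illustration)
data = [210, 240, 240, 270, 270, 270]
n = len(data)

# Frequency table: relative frequency f(x) = (count of x) / n
rel_freq = {x: Fraction(c, n) for x, c in Counter(data).items()}

# Sample mean and sample variance computed from the frequency table
xbar = sum(x * f for x, f in rel_freq.items())
s2 = Fraction(n, n - 1) * sum((x - xbar) ** 2 * f for x, f in rel_freq.items())

# Same quantities computed directly from the raw data
print(xbar == mean(data), s2 == variance(data))  # True True
```

The factor n/(n – 1) is what turns the "divide by n" sum over relative frequencies into the usual "divide by n – 1" sample variance.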

What about a continuous random variable X?

One final example…


Example 3: TWO INDEPENDENT POPULATIONS

X1 = Cholesterol level (mg/dL)

x                 f1(x)
210               1/6
240               1/3
270               1/2
Total             1

μ1 = 250, σ1² = 500

X2 = Cholesterol level (mg/dL)

x                 f2(x)
180               1/3
210               1/3
240               1/3
Total             1

μ2 = 210, σ2² = 600

D = X1 – X2 ~ ???

d                 Outcomes (x1, x2)
-30               (210, 240)
0                 (210, 210), (240, 240)
+30               (210, 180), (240, 210), (270, 240)
+60               (240, 180), (270, 210)
+90               (270, 180)

NOTE: By definition, this is the sample space of the experiment!

What are the probabilities of the corresponding events “D = d” for d = -30, 0, 30, 60, 90?

d                 Probabilities f(d)
-30               1/9 ?
0                 2/9 ?
+30               3/9 ?
+60               2/9 ?
+90               1/9 ?




NO!!! The outcomes of D are NOT EQUALLY LIKELY!!!

d                 Probabilities f(d)
-30               (1/6)(1/3) = 1/18 via independence




0                 (1/6)(1/3) + (1/3)(1/3) = 3/18
+30               (1/6)(1/3) + (1/3)(1/3) + (1/2)(1/3) = 6/18
+60               (1/3)(1/3) + (1/2)(1/3) = 5/18
+90               (1/2)(1/3) = 3/18

[Figure: probability histogram of D with bar heights 1/18, 3/18, 6/18, 5/18, 3/18]
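The table for D can be rebuilt mechanically: list every pair (x1, x2), multiply the two probabilities (this step is where independence is used), and add up the products that share the same difference d. A minimal Python sketch, with `difference_pmf` as a hypothetical helper name:

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Probability tables of X1 and X2 from Example 3
f1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}
f2 = {180: Fraction(1, 3), 210: Fraction(1, 3), 240: Fraction(1, 3)}

def difference_pmf(f1, f2):
    """pmf of D = X1 - X2, assuming X1 and X2 are independent."""
    fd = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for (x1, p1), (x2, p2) in product(f1.items(), f2.items()):
        # Independence: P(X1 = x1 and X2 = x2) = f1(x1) * f2(x2)
        fd[x1 - x2] += p1 * p2
    return dict(fd)

fD = difference_pmf(f1, f2)
for d in sorted(fD):
    print(d, fD[d])  # fractions print in lowest terms, e.g. 3/18 appears as 1/6
```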


What happens if the two populations are dependent? SEE LECTURE NOTES!





μD = (-30)(1/18) + (0)(3/18) + (30)(6/18) + (60)(5/18) + (90)(3/18) = 40

σD² = (-70)²(1/18) + (-40)²(3/18) + (-10)²(6/18) + (20)²(5/18) + (50)²(3/18) = 1100

μD = μ1 – μ2

σD² = σ1² + σ2²
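A short Python check of these two results on the Example 3 tables (a sketch only; the helper names `mean` and `var` are chosen here for illustration). It builds the pmf of D under independence and compares its mean and variance with μ1 – μ2 and σ1² + σ2².

```python
from fractions import Fraction
from itertools import product

# Probability tables of X1 and X2 from Example 3
f1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}
f2 = {180: Fraction(1, 3), 210: Fraction(1, 3), 240: Fraction(1, 3)}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# pmf of D = X1 - X2, multiplying probabilities because X1 and X2 are independent
fD = {}
for (x1, p1), (x2, p2) in product(f1.items(), f2.items()):
    fD[x1 - x2] = fD.get(x1 - x2, Fraction(0)) + p1 * p2

print(mean(fD), mean(f1) - mean(f2))  # 40 40
print(var(fD), var(f1) + var(f2))     # 1100 1100
```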

General: TWO INDEPENDENT POPULATIONS




μD = μ1 – μ2, i.e., Mean (X1 – X2) = Mean (X1) – Mean (X2)

σD² = σ1² + σ2², i.e., Var (X1 – X2) = Var (X1) + Var (X2)

These two formulas are valid for continuous as well as discrete distributions.

IF the two populations are dependent, then the formula for the mean still holds, BUT the formula for the variance needs a covariance term:

Var (X1 – X2) = Var (X1) + Var (X2) – 2 Cov (X1, X2)
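To see the covariance term at work, here is a sketch with a hypothetical joint distribution. The joint table below is made up for illustration and is not from the notes; it was chosen so that X1 and X2 keep the same individual probability tables as in Example 3 but tend to be large together. The helper name `expect` is also an assumption.

```python
from fractions import Fraction

# Hypothetical joint pmf (assumed, not from the notes): P(X1 = x1 and X2 = x2).
# Pairs not listed have probability 0. The marginals match f1 and f2 of Example 3.
joint = {
    (210, 180): Fraction(3, 18),
    (240, 180): Fraction(3, 18), (240, 210): Fraction(3, 18),
    (270, 210): Fraction(3, 18), (270, 240): Fraction(6, 18),
}

def expect(g):
    """E[g(X1, X2)] under the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())

mu1 = expect(lambda x1, x2: x1)
mu2 = expect(lambda x1, x2: x2)
var1 = expect(lambda x1, x2: (x1 - mu1) ** 2)
var2 = expect(lambda x1, x2: (x2 - mu2) ** 2)
cov = expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2))

mu_d = expect(lambda x1, x2: x1 - x2)
var_d = expect(lambda x1, x2: (x1 - x2 - mu_d) ** 2)

print(mu_d, mu1 - mu2)                   # 40 40
print(var_d, var1 + var2 - 2 * cov)      # 200 200
```

With this joint table the mean of D is still 40, but the variance drops from 1100 to 200 because the covariance (450 here) is positive.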
