chapter 4

CHAPTER 4 4.1 - Discrete Models General distributions Classical: Binomial, Poisson, etc. 4.2 - Continuous Models General distributions Classical: Normal, etc.

Upload: jeanne

Post on 05-Jan-2016


Page 1: CHAPTER 4

CHAPTER 4

• 4.1 - Discrete Models General distributions Classical: Binomial, Poisson, etc.

• 4.2 - Continuous Models General distributions Classical: Normal, etc.

Page 2: CHAPTER 4

What is the connection between probability and random variables? Events (and their corresponding probabilities) that involve experimental measurements can be described by random variables (e.g., “X = # Males” in previous gender equity example).


Page 3: CHAPTER 4

Example: X = Cholesterol level (mg/dL), a discrete random variable

POPULATION: random variable X

Pop values xi    Probabilities f(xi)
x1               f(x1)
x2               f(x2)
x3               f(x3)
⋮                ⋮
Total            1

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …, xn

Data values xi   Relative Frequencies f(xi) = fi/n
x1               f(x1)
x2               f(x2)
x3               f(x3)
⋮                ⋮
xk               f(xk)
Total            1

Page 4: CHAPTER 4

Example: X = Cholesterol level (mg/dL), discrete random variable X

[Figure: Probability Histogram of X; Total Area = 1]

f(x) = Probability that the random variable X is equal to a specific value x, i.e.,

f(x) = P(X = x)

“probability mass function” (pmf)
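As a minimal sketch (not part of the original slides), a pmf can be stored as a mapping from each value x to f(x). The helper name `is_valid_pmf` and the use of exact fractions are illustrative assumptions; the three cholesterol values and their probabilities are the ones used in Example 1 later in this chapter.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability is non-negative and they sum to 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

# pmf of X = cholesterol level (mg/dL); values taken from Example 1
f = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

print(is_valid_pmf(f))   # checks "Total = 1" from the probability table
print(f[240])            # f(240) = P(X = 240)
```

Exact fractions are used here only so that the "Total = 1" check is not disturbed by floating-point rounding.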

Page 5: CHAPTER 4

Example: X = Cholesterol level (mg/dL), discrete random variable X

[Figure: Probability Histogram of X; Total Area = 1]

F(x) = Probability that the random variable X is less than or equal to a specific value x, i.e.,

$F(x) = P(X \le x)$

“cumulative distribution function” (cdf)
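The cdf can be sketched directly from its definition by adding up the pmf over all values at or below x. The function name `cdf` is an illustrative assumption, and the pmf is the one from Example 1 later in this chapter.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F(x) = P(X <= x): add up f(t) over all values t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

# pmf of X = cholesterol level (mg/dL); values taken from Example 1
f = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

for x in (200, 210, 240, 255, 270):
    print(x, cdf(f, x))
```

For a discrete variable the cdf is a step function: it stays flat between population values (here F(255) equals F(240)) and jumps by f(x) at each value x.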

Page 6: CHAPTER 4

Example: X = Cholesterol level (mg/dL), discrete random variable X

[Figure: Probability Histogram of X, with the values a, x, b marked on the axis]

Calculating probabilities…

$P(a \le X \le b) = \sum_{x=a}^{b} f(x) = F(b) - F(a^-)$, where $F(a^-) = P(X < a)$, so that the value x = a itself is included in the sum.

Hey!!! What about the population mean $\mu$ and the population variance $\sigma^2$ ???
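A short sketch of the interval calculation, using the Example 1 pmf from later in this chapter. The helper names are illustrative assumptions. Note that for a discrete variable, subtracting F(a) itself would drop the term P(X = a), so the cdf-based version below subtracts P(X < a) instead.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F(x) = P(X <= x)."""
    return sum(p for t, p in pmf.items() if t <= x)

def cdf_left(pmf, a):
    """P(X < a), the cdf just to the left of a."""
    return sum(p for t, p in pmf.items() if t < a)

def prob_between(pmf, a, b):
    """P(a <= X <= b): add up f(x) over the values in [a, b]."""
    return sum(p for t, p in pmf.items() if a <= t <= b)

f = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}
a, b = 240, 270

direct = prob_between(f, a, b)          # 1/3 + 1/2 = 5/6
via_cdf = cdf(f, b) - cdf_left(f, a)    # 1 - 1/6 = 5/6
naive = cdf(f, b) - cdf(f, a)           # 1 - 1/2 = 1/2, misses P(X = 240)
print(direct, via_cdf, naive)
```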

Page 7: CHAPTER 4

Example: X = Cholesterol level (mg/dL), discrete random variable X

Just as the sample mean $\bar{x}$ and sample variance $s^2$ were used to characterize “measure of center” and “measure of spread” of a dataset, we can now define the “true” population mean $\mu$ and population variance $\sigma^2$, using probabilities.

• Population mean

$\mu = \sum x \, f(x)$

Also denoted by E[X], the “expected value” of the variable X.

• Population variance

$\sigma^2 = \sum (x - \mu)^2 \, f(x)$


Page 9: CHAPTER 4

Example 1: X = Cholesterol level (mg/dL), discrete random variable X

POPULATION

Pop values xi    Probabilities f(xi)
210              1/6
240              1/3
270              1/2
Total            1

[Figure: Probability Histogram with bar heights 1/6, 1/3, 1/2]

$\mu = \sum x \, f(x) = (210)(1/6) + (240)(1/3) + (270)(1/2) = 250$

$\sigma^2 = \sum (x - \mu)^2 \, f(x) = (-40)^2 (1/6) + (-10)^2 (1/3) + (20)^2 (1/2) = 500$
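The arithmetic of Example 1 can be checked with a few lines of code. This is a sketch only; the function names `mean` and `variance` are illustrative, and exact fractions are used so the results come out as whole numbers.

```python
from fractions import Fraction

def mean(pmf):
    """Population mean: sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Population variance: sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Probability table from Example 1
f = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}

mu = mean(f)
sigma2 = variance(f)
print(mu, sigma2)   # 250 500
```

The same two functions applied to any other probability table (for instance one with equally likely values) give its mean and variance in the same way.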

Page 10: CHAPTER 4

Example 2: X = Cholesterol level (mg/dL), discrete random variable X

POPULATION

Pop values xi    Probabilities f(xi)
180              1/3
210              1/3
240              1/3
Total            1

[Figure: Probability Histogram with bar heights 1/3, 1/3, 1/3]

Equally likely outcomes result in a “uniform distribution.”

$\mu = \sum x \, f(x) = (180)(1/3) + (210)(1/3) + (240)(1/3) = 210$ (clear from symmetry)

$\sigma^2 = \sum (x - \mu)^2 \, f(x) = (-30)^2 (1/3) + (0)^2 (1/3) + (30)^2 (1/3) = 600$

Page 11: CHAPTER 4

To summarize…


Page 12: CHAPTER 4

Discrete random variable X

POPULATION

Probability Table

Pop xi           Probabilities f(xi)
x1               f(x1)
x2               f(x2)
x3               f(x3)
⋮                ⋮
                 1

[Figure: Probability Histogram of X; Total Area = 1]

$\mu = \sum x \, f(x)$

$\sigma^2 = \sum (x - \mu)^2 \, f(x)$

SAMPLE of size n: x1, x2, x3, x4, x5, x6, …, xn

Frequency Table

Data xi          Relative Frequencies f(xi) = fi/n
x1               f(x1)
x2               f(x2)
x3               f(x3)
⋮                ⋮
xk               f(xk)
                 1

[Figure: Density Histogram of X; Total Area = 1]

$\bar{x} = \sum x \, f(x)$

$s^2 = \frac{n}{n-1} \sum (x - \bar{x})^2 \, f(x)$
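The sample-side formulas can be illustrated on a small dataset. The six data values below are hypothetical (chosen only so that the relative frequencies are 1/6, 1/3, 1/2); the sketch builds the frequency table, applies the relative-frequency formulas, and cross-checks them against the standard library's raw-data mean and sample variance.

```python
from collections import Counter
from fractions import Fraction
import statistics

sample = [210, 240, 240, 270, 270, 270]   # hypothetical data, n = 6
n = len(sample)

# Frequency table: relative frequencies f(x) = f_i / n
rel_freq = {x: Fraction(count, n) for x, count in Counter(sample).items()}

# Sample mean and sample variance from the relative frequencies
x_bar = sum(x * f for x, f in rel_freq.items())
s2 = Fraction(n, n - 1) * sum((x - x_bar) ** 2 * f for x, f in rel_freq.items())

# Cross-check against the usual raw-data formulas
data = [Fraction(x) for x in sample]
assert x_bar == statistics.mean(data)
assert s2 == statistics.variance(data)   # uses the n - 1 denominator

print(x_bar, s2)   # 250 600
```

The factor n/(n - 1) is what separates the sample variance from the population formula: with the same relative frequencies, the population-style sum gives 500 while the sample variance is 600.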

Page 13: CHAPTER 4

The summary on the previous page applies to a discrete random variable X. What about a continuous random variable X?

Page 14: CHAPTER 4


One final example…

Page 15: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

X1 = Cholesterol level (mg/dL)

x        f1(x)
210      1/6
240      1/3
270      1/2
Total    1

$\mu_1 = 250$, $\sigma_1^2 = 500$

X2 = Cholesterol level (mg/dL)

x        f2(x)
180      1/3
210      1/3
240      1/3
Total    1

$\mu_2 = 210$, $\sigma_2^2 = 600$

D = X1 – X2 ~ ???

d        Outcomes
-30      (210, 240)
0        (210, 210), (240, 240)
+30      (210, 180), (240, 210), (270, 240)
+60      (240, 180), (270, 210)
+90      (270, 180)

NOTE: By definition, this is the sample space of the experiment!

What are the probabilities of the corresponding events “D = d” for d = -30, 0, 30, 60, 90?

Page 16: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

D = X1 – X2 ~ ???

d        Probabilities f(d)
-30      1/9 ?
0        2/9 ?
+30      3/9 ?
+60      2/9 ?
+90      1/9 ?

NO!!! The outcomes of D are NOT EQUALLY LIKELY!!!
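The sample space of D can be listed mechanically by pairing every X1 value with every X2 value. The sketch below (variable names are illustrative) only counts outcomes; the counts 1, 2, 3, 2, 1 out of 9 would be probabilities only if all nine pairs were equally likely, which they are not, because f1 is not uniform.

```python
from collections import defaultdict
from itertools import product

x1_values = [210, 240, 270]
x2_values = [180, 210, 240]

# Group the nine (x1, x2) outcomes by the difference d = x1 - x2
outcomes_by_d = defaultdict(list)
for x1, x2 in product(x1_values, x2_values):
    outcomes_by_d[x1 - x2].append((x1, x2))

for d in sorted(outcomes_by_d):
    print(d, outcomes_by_d[d])
```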

Page 17: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

D = X1 – X2 ~ ???

d        Probabilities f(d)
-30      (1/6)(1/3) = 1/18 via independence
0        (210, 210), (240, 240)
+30      (210, 180), (240, 210), (270, 240)
+60      (240, 180), (270, 210)
+90      (270, 180)

Page 18: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

D = X1 – X2 ~ ???

d        Probabilities f(d)
-30      (1/6)(1/3) = 1/18 via independence
0        (1/6)(1/3) + (1/3)(1/3) = 3/18
+30      (210, 180), (240, 210), (270, 240)
+60      (240, 180), (270, 210)
+90      (270, 180)

Page 19: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

D = X1 – X2 ~ ???

d        Probabilities f(d)
-30      (1/6)(1/3) = 1/18 via independence
0        (1/6)(1/3) + (1/3)(1/3) = 3/18
+30      (1/6)(1/3) + (1/3)(1/3) + (1/2)(1/3) = 6/18
+60      (1/3)(1/3) + (1/2)(1/3) = 5/18
+90      (1/2)(1/3) = 3/18

[Figure: Probability Histogram of D with bar heights 1/18, 3/18, 6/18, 5/18, 3/18]

What happens if the two populations are dependent? SEE LECTURE NOTES!
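The probability table for D can be reproduced with a short sketch that multiplies marginal probabilities (valid only under the independence assumption of this example) and adds the products for outcomes sharing the same difference. The function name `difference_pmf` is an illustrative assumption.

```python
from collections import defaultdict
from fractions import Fraction

f1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}
f2 = {180: Fraction(1, 3), 210: Fraction(1, 3), 240: Fraction(1, 3)}

def difference_pmf(pmf1, pmf2):
    """pmf of D = X1 - X2, assuming X1 and X2 are independent."""
    pmf_d = defaultdict(Fraction)          # missing keys start at 0
    for x1, p1 in pmf1.items():
        for x2, p2 in pmf2.items():
            # Independence: P(X1 = x1 and X2 = x2) = f1(x1) * f2(x2)
            pmf_d[x1 - x2] += p1 * p2
    return dict(pmf_d)

f_d = difference_pmf(f1, f2)
for d in sorted(f_d):
    print(d, f_d[d])   # fractions print in lowest terms (3/18 shows as 1/6)
```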

Page 20: CHAPTER 4

Example 3: TWO INDEPENDENT POPULATIONS

$\mu_1 = 250$, $\sigma_1^2 = 500$

$\mu_2 = 210$, $\sigma_2^2 = 600$

D = X1 – X2

d        f(d)
-30      1/18
0        3/18
+30      6/18
+60      5/18
+90      3/18

[Figure: Probability Histogram of D]

$\mu_D = (-30)(1/18) + (0)(3/18) + (30)(6/18) + (60)(5/18) + (90)(3/18) = 40$

$\sigma_D^2 = (-70)^2 (1/18) + (-40)^2 (3/18) + (-10)^2 (6/18) + (20)^2 (5/18) + (50)^2 (3/18) = 1100$

$\mu_D = \mu_1 - \mu_2$

$\sigma_D^2 = \sigma_1^2 + \sigma_2^2$
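As a quick numerical check of these two relations for Example 3, the sketch below computes the mean and variance of D from its probability table and compares them with the values obtained from the two separate populations. Function names are illustrative.

```python
from fractions import Fraction

def mean(pmf):
    """Sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Sum of (x - mu)^2 * f(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

f1 = {210: Fraction(1, 6), 240: Fraction(1, 3), 270: Fraction(1, 2)}
f2 = {180: Fraction(1, 3), 210: Fraction(1, 3), 240: Fraction(1, 3)}
f_d = {-30: Fraction(1, 18), 0: Fraction(3, 18), 30: Fraction(6, 18),
       60: Fraction(5, 18), 90: Fraction(3, 18)}

mu_d, var_d = mean(f_d), variance(f_d)
print(mu_d, var_d)                    # 40 1100
print(mean(f1) - mean(f2))            # 40
print(variance(f1) + variance(f2))    # 1100
```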

Page 21: CHAPTER 4

General: TWO INDEPENDENT POPULATIONS

X1 = Cholesterol level (mg/dL), X2 = Cholesterol level (mg/dL), D = X1 – X2

$\mu_D = \mu_1 - \mu_2$, i.e.,

Mean (X1 – X2) = Mean (X1) – Mean (X2)

$\sigma_D^2 = \sigma_1^2 + \sigma_2^2$, i.e.,

Var (X1 – X2) = Var (X1) + Var (X2)

These two formulas are valid for continuous as well as discrete distributions.

IF the two populations are dependent, then the formula for the mean still holds, BUT the formula for the variance becomes

Var (X1 – X2) = Var (X1) + Var (X2) – 2 Cov (X1, X2)
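To see the effect of dependence, here is a sketch with a hypothetical joint distribution. The joint probabilities below are not from the slides; they were chosen so that the marginals match f1 and f2 from Example 3 while X1 and X2 are no longer independent. The helper `expect` is likewise an illustrative name.

```python
from fractions import Fraction

# Hypothetical joint pmf of (X1, X2): same marginals as Example 3,
# but high X1 values tend to occur together with high X2 values.
joint = {
    (210, 180): Fraction(1, 6),
    (240, 180): Fraction(1, 6),
    (240, 210): Fraction(1, 6),
    (270, 210): Fraction(1, 6),
    (270, 240): Fraction(1, 3),
}

def expect(g):
    """E[g(X1, X2)] under the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())

mu1 = expect(lambda x1, x2: x1)
mu2 = expect(lambda x1, x2: x2)
var1 = expect(lambda x1, x2: (x1 - mu1) ** 2)
var2 = expect(lambda x1, x2: (x2 - mu2) ** 2)
cov = expect(lambda x1, x2: (x1 - mu1) * (x2 - mu2))

mu_d = expect(lambda x1, x2: x1 - x2)
var_d = expect(lambda x1, x2: (x1 - x2 - mu_d) ** 2)

print(mu_d, mu1 - mu2)                   # 40 40
print(var_d, var1 + var2 - 2 * cov)      # 200 200
print(var1 + var2)                       # 1100
```

Under this assumed joint distribution the mean of D is unchanged at 40, but the positive covariance (450) pulls the variance of D down from 1100 to 200.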