Statistics: Unlocking the Power of Data STAT 250 Dr. Kari Lock Morgan Probability SECTIONS 11.1 Probability (11.1) Odds, odds ratio (not in book) Conditional probability (11.1)

Upload: elfrieda-daniels

Post on 24-Dec-2015


Statistics: Unlocking the Power of Data Lock5

STAT 250
Dr. Kari Lock Morgan

Probability

SECTIONS 11.1
• Probability (11.1)
• Odds, odds ratio (not in book)
• Conditional probability (11.1)


Exam 1
Friday, 2/13, in class

You can bring
• One single-sided page (8 ½ x 11) of notes
• A non-cell-phone calculator

Exam will cover
• Everything covered through today’s class
• Chapters 1 and 2, plus today’s material
• Lecture and lab (but not Minitab commands)

Mix of multiple choice and free response


Event
• An event is something that either happens or doesn’t happen, or something that either is true or is not true

• Examples:
  • You get cancer
  • A randomly selected person is obese
  • A particular mutation occurs
  • It snows tonight


Probability
• The probability of event A, P(A), is the probability that A will happen

• Probability always refers to an event

• Probability is always between 0 and 1

• P(A) = 1 means A will definitely happen

• P(A) = 0 means A will definitely not happen


Ways of Expressing Probability

1 in 8 women will get breast cancer

1/8 of women will get breast cancer

The proportion of women who will get breast cancer is 1/8, or 0.125.

12.5% of women will get breast cancer

The probability of breast cancer for a female is 0.125

These statements are all saying the same thing


Probability statements can be directly translated into a relative frequency table:

Relative Frequency Table

Get Breast Cancer | Do Not Get Breast Cancer | TOTAL
0.125 | 0.875 | 1


Probability statements can also be translated into a frequency table, although, unlike in data description, the total is arbitrary:

Any of these are equally valid for probability calculations!

Frequency Table

Get Breast Cancer Do not Get Breast Cancer TOTAL

Get Breast Cancer Do not Get Breast Cancer TOTAL
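The claim that the total is arbitrary can be checked numerically. The sketch below is an illustration rather than part of the lecture: the helper name `frequency_table` and the totals 8 and 1,000 are assumptions chosen for the example.

```python
from fractions import Fraction


def frequency_table(p, total):
    """Split a hypothetical group of `total` people by whether the event occurs."""
    with_event = p * total
    return {"event": with_event, "no event": total - with_event, "total": total}


p = Fraction(1, 8)  # probability of breast cancer for a woman, from the slides

# The totals below are arbitrary choices for illustration.
small = frequency_table(p, 8)
large = frequency_table(p, 1000)

for table in (small, large):
    # Dividing the event count by the total recovers the same probability.
    recovered = table["event"] / table["total"]
    print(f"{table['event']} of {table['total']} -> probability {recovered}")
```

Whichever total is used, the ratio of the event count to the total is unchanged, which is why any such table supports the same probability calculations.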


Odds

If p denotes the probability of an event, the odds are defined as

$$\text{odds} = \frac{p}{1 - p}$$

Interpreting odds
• Odds of 1 indicate 50/50
• p < 0.5 yields odds < 1
• p > 0.5 yields odds > 1

Odds of 3, or 3:1, mean that out of 4 times, we would expect the variable to be in that category 3 times and out of that category 1 time


Breast Cancer Odds

$$\text{odds} = \frac{1/8}{1 - 1/8} = \frac{1/8}{7/8} = \frac{1}{7}$$

Commonly expressed as 1:7 or 1/7

The odds of a woman getting breast cancer are 1:7

For every one woman who will get breast cancer, 7 women will not.


Probability to Odds
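A minimal sketch of the conversion, assuming the standard definition odds = p / (1 − p), which agrees with the 1/8 → 1:7 example in the slides. The function name `prob_to_odds` is hypothetical, and `Fraction` is used only so the results stay exact.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Convert a probability p (0 <= p < 1) to odds, p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


breast_cancer_odds = prob_to_odds(Fraction(1, 8))  # 1/7, i.e. 1:7
even_odds = prob_to_odds(Fraction(1, 2))           # 1, i.e. 50/50
three_to_one = prob_to_odds(Fraction(3, 4))        # 3, i.e. 3:1

print(breast_cancer_odds, even_odds, three_to_one)
```

The probability p = 1 is rejected because the odds are undefined there (division by zero).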


Odds

If p = 2/3, what are the odds?

a) 1
b) ½
c) 2
d) 3


Odds to Probability

We can go from odds to probability with

$$p = \frac{\text{odds}}{1 + \text{odds}}$$

One may make more sense than the other to you, so be comfortable going back and forth
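Going back and forth can be sketched as a round trip. This assumes the usual pair of conversions, odds = p / (1 − p) and p = odds / (1 + odds); the function names are hypothetical.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Probability to odds: p / (1 - p)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Odds to probability: odds / (1 + odds)."""
    return odds / (1 + odds)


p = Fraction(1, 8)             # breast cancer probability from the slides
odds = prob_to_odds(p)         # 1/7
recovered = odds_to_prob(odds) # back to 1/8

# Odds of 3 (3:1) mean the event is expected 3 times out of 4.
three_to_one_prob = odds_to_prob(Fraction(3))

print(odds, recovered, three_to_one_prob)
```

The two functions are inverses of each other, so converting in one direction and then the other returns the starting value.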


Odds

If the odds are 1:2, what is the probability?

a) 1
b) 1/3
c) 1/2
d) 2
e) 3


Conditional Probability
• P(A if B) is the probability of A, if we know B has happened or is true

• This is read in multiple ways:
  • “probability of A if B”
  • “probability of A given B”
  • “probability of A conditional on B”

• You may also see this written as P(A | B)


Conditional Probability

The probability and odds that we calculated are restricted only to females, so we are implicitly conditioning on the fact that gender is female.

For females, what’s the chance of getting breast cancer?

What proportion of women will get breast cancer?

Conditional probability is the probability of an event, conditional on (or given that) another variable takes a specific value (here, gender = female)


P(survival if advanced stage) = 0.27
P(survival if early detection) = 0.98


What does this tell us?

a) P(breast cancer if 30 – 39 years old)

b) P(30 – 39 years old if breast cancer)


What does this tell us?

a) P(breast cancer if first-degree relative)

b) P(first-degree relative if breast cancer)


What does this tell us?

a) P(breast cancer if first-degree relative) = 1/16
b) P(breast cancer if first-degree relative) = 1/8
c) P(breast cancer if first-degree relative) = 1/4
d) P(breast cancer if first-degree relative) = 3/8


What does this tell us?

a) P(breast cancer if no family history)

b) P(no family history if breast cancer)


What does this tell us?

a) P(family history if breast cancer) = 0.15
b) P(breast cancer if family history) = 0.15
c) P(breast cancer if no family history) = 0.15


1 in 8 women (12.5%) get breast cancer, so P(breast cancer if female) = 0.125

1 in 800 men (0.125%) get breast cancer, so P(breast cancer if male) = 0.00125


Two-Way Table

What’s the overall (unconditional) probability of breast cancer?

Create a two-way table, with 1000 each of males and females.
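One way to carry out this exercise is sketched below. It takes the two conditional probabilities from the slides and the slide's suggested 1,000 people per group; the variable names are hypothetical, and the equal split between males and females is the exercise's simplifying assumption, not a claim about any real population.

```python
from fractions import Fraction

p_cancer_if_female = Fraction(1, 8)    # 0.125, from the slides
p_cancer_if_male = Fraction(1, 800)    # 0.00125, from the slides
group_size = 1000                      # 1000 each of males and females

# Build the two-way table of expected counts (rows: sex, columns: cancer status).
table = {}
for sex, p in (("female", p_cancer_if_female), ("male", p_cancer_if_male)):
    cancer = p * group_size
    table[sex] = {"cancer": cancer, "no cancer": group_size - cancer}

# Overall probability = total cancer count / total number of people.
total_cancer = sum(row["cancer"] for row in table.values())
total_people = group_size * len(table)
p_cancer_overall = total_cancer / total_people

for sex, row in table.items():
    print(sex, float(row["cancer"]), float(row["no cancer"]))
print("overall:", float(p_cancer_overall))
```

The expected count for males is fractional (1.25 out of 1,000), which is fine for a probability calculation; a larger total per group would give whole numbers without changing the answer.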


Odds Ratio

The odds ratio (OR) is the ratio of the odds of an event in one group to the odds of an event in another group

Odds ratio for breast cancer comparing females to males:


Odds Ratio

Odds ratio for breast cancer comparing females to males:
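The calculation can be sketched from the two conditional probabilities given earlier, P(breast cancer if female) = 1/8 and P(breast cancer if male) = 1/800. This assumes the standard definition odds = p / (1 − p); the function and variable names are hypothetical.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Probability to odds: p / (1 - p)."""
    return p / (1 - p)


odds_female = prob_to_odds(Fraction(1, 8))    # 1/7
odds_male = prob_to_odds(Fraction(1, 800))    # 1/799

# Odds ratio comparing females to males: odds in one group over odds in the other.
odds_ratio = odds_female / odds_male

print(odds_ratio, round(float(odds_ratio), 1))
```

Under these figures the odds of breast cancer for females are roughly 114 times the odds for males.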


Breast Cancer Screening

A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?

a) 0-10%

b) 10-25%

c) 25-50%

d) 50-75%

e) 75-100%


Breast Cancer Screening

1% of women at age 40 who participate in routine screening have breast cancer.

80% of women with breast cancer get positive mammographies.

9.6% of women without breast cancer get positive mammographies.

A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?


Breast Cancer Screening

A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?

a) 0-10%

b) 10-25%

c) 25-50%

d) 50-75%

e) 75-100%


Breast Cancer Screening

A 40-year-old woman participates in routine screening and has a positive mammography. What’s the probability she has cancer?

What is this asking for?

a) P(cancer if positive mammography)
b) P(positive mammography if cancer)
c) P(positive mammography if no cancer)
d) P(positive mammography)
e) P(cancer)


[Figure: bins of balls labeled Cancer, Cancer-free, and Everyone. Red balls are positive results and green balls are negative results. In the Everyone bin, a few balls are marked C (cancer) and many are marked F (cancer-free).]

If we randomly pick a ball from the Cancer bin, it’s more likely to be red/positive.

If we randomly pick a ball from the Cancer-free bin, it’s more likely to be green/negative.

We randomly pick a ball from the Everyone bin. If the ball is red/positive, is it more likely to be from the Cancer or Cancer-free bin?


100,000 women in the population
  1%: 1,000 have cancer
    80%: 800 test positive
    20%: 200 test negative
  99%: 99,000 cancer-free
    9.6%: 9,504 test positive
    90.4%: 89,496 test negative

Thus, 800/(800 + 9,504) ≈ 7.8% of women with positive results have cancer
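The same result can be written without picking a population size. The derivation below applies Bayes' rule, a name the slides do not use, to the three percentages given for the screening problem; it is offered as a cross-check on the tree rather than as the lecture's own notation.

```latex
\begin{aligned}
P(\text{cancer} \mid \text{positive})
  &= \frac{P(\text{positive} \mid \text{cancer})\,P(\text{cancer})}
          {P(\text{positive} \mid \text{cancer})\,P(\text{cancer})
           + P(\text{positive} \mid \text{no cancer})\,P(\text{no cancer})} \\
  &= \frac{0.80 \times 0.01}{0.80 \times 0.01 + 0.096 \times 0.99}
   = \frac{0.008}{0.10304} \approx 0.078
\end{aligned}
```

The numerator corresponds to the 800 true positives in the tree and the denominator to all 10,304 positive results, each divided by 100,000.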


Two-Way Table

Create a two-way table for mammogram result and breast cancer status, using 10,000 women total.

Using the table, what’s P(cancer if positive)?
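A sketch of one way to build this table from the three percentages given for the screening problem (1% have cancer, 80% of those test positive, 9.6% of the cancer-free test positive). The variable names are hypothetical, and the counts are expected values, so some are not whole numbers.

```python
from fractions import Fraction

total_women = 10_000
prevalence = Fraction(1, 100)             # P(cancer)
sensitivity = Fraction(80, 100)           # P(positive if cancer)
false_positive_rate = Fraction(96, 1000)  # P(positive if no cancer)

cancer = prevalence * total_women
cancer_free = total_women - cancer

# Rows: cancer status. Columns: mammogram result.
table = {
    "cancer": {
        "positive": sensitivity * cancer,
        "negative": (1 - sensitivity) * cancer,
    },
    "cancer-free": {
        "positive": false_positive_rate * cancer_free,
        "negative": (1 - false_positive_rate) * cancer_free,
    },
}

# Condition on a positive result: look only at the "positive" column.
all_positive = table["cancer"]["positive"] + table["cancer-free"]["positive"]
p_cancer_if_positive = table["cancer"]["positive"] / all_positive

for status, row in table.items():
    print(f"{status:12s} {float(row['positive']):10.1f} {float(row['negative']):10.1f}")
print("P(cancer if positive) =", round(float(p_cancer_if_positive), 3))
```

Conditioning on a positive result amounts to restricting attention to one column of the table and asking what share of that column has cancer.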


Mammograms

1. Probability
   1. What’s the probability of breast cancer if a positive mammogram?
   2. What’s the probability of breast cancer if a negative mammogram?

2. Odds
   1. What are the odds of breast cancer after a positive mammogram?
   2. What are the odds of breast cancer after a negative mammogram?

3. Compute the odds ratio comparing breast cancer risk for positive and negative mammograms.
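These three steps can be sketched using the counts from the 100,000-woman breakdown in the slides (800 and 200 women with cancer testing positive and negative; 9,504 and 89,496 cancer-free women testing positive and negative). The variable names are hypothetical, and this is one possible way to organize the arithmetic.

```python
from fractions import Fraction

# Counts from the 100,000-woman breakdown in the slides.
cancer_pos, free_pos = 800, 9_504    # positive mammogram
cancer_neg, free_neg = 200, 89_496   # negative mammogram

# 1. Probability of cancer given each result: cancer count / all with that result.
p_pos = Fraction(cancer_pos, cancer_pos + free_pos)
p_neg = Fraction(cancer_neg, cancer_neg + free_neg)

# 2. Odds of cancer given each result: cancer count / cancer-free count.
odds_pos = Fraction(cancer_pos, free_pos)
odds_neg = Fraction(cancer_neg, free_neg)

# 3. Odds ratio comparing positive to negative mammograms.
odds_ratio = odds_pos / odds_neg

for label, value in [
    ("P(cancer if positive)", p_pos),
    ("P(cancer if negative)", p_neg),
    ("odds if positive", odds_pos),
    ("odds if negative", odds_neg),
    ("odds ratio", odds_ratio),
]:
    print(f"{label}: {float(value):.4f}")
```

Because cancer is rare in both groups, the odds and the probabilities are close to each other, but the odds ratio still shows how strongly a positive result shifts the picture relative to a negative one.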


To Do

Read Section 2.1

Do HW 2.1, 11 (due Friday, 2/13)

Study for Exam 1 (Friday, 2/13)