lecture 3: probability - oxford statistics · lecture 3: probability 28th of october 2015 15 / 36...
TRANSCRIPT
Lecture 3: Probability
28th of October 2015
Lecture 3: Probability 28th of October 2015 1 / 36
Summary of previous lecture
Define chance experiment, sample space and event
Introduce the concept of the probability of an event
Show how to manipulate intersection, union and complement ofevents
Show how to compute analytically probability for simple chanceexperiments
Define independence between event
Lecture 3: Probability 28th of October 2015 2 / 36
Example: Snails
In a population of a particular species of snail, individuals exhibit differentforms. It is known that
45% have a pink background colouring
30% of individuals are striped,
and 20% of the population are pink and striped
Is the presence or absence of striping independent of background colour?
A = a snail has a pink background colouringS = a snail has stripes.
We know thatP (A) = 0.45, P (S) = 0.3, and P (A ∩ S) = 0.2.
Note that0.2 = P (A ∩ S) 6= 0.135 = P (A)P (S),
so the events A and S are not independent.Lecture 3: Probability 28th of October 2015 3 / 36
Conditional Probability Laws
A = a snail has a pink background colouringS = a snail has stripes.
What is the probability that a snail has stripes GIVEN that it is pink?
This is a conditional probability.
We write this probability as P(S|A) (pronounced ‘probability of S given A’)
Lecture 3: Probability 28th of October 2015 4 / 36
Think of P(B|A) as ‘how much of A is taken up by B’.
A BA∩B
S
A∩Bc Ac∩B
Ac∩Bc
Then we see that
P (B|A) =P (A ∩B)
P (A)
Lecture 3: Probability 28th of October 2015 5 / 36
Example (continued)
A = a snail has a pink background colouringS = a snail has stripes.
What is the probability that a snail has stripes GIVEN that it is pink?
P (S|A) = P (S ∩A)
P (A)=
0.2
0.45= 0.44 > P (S) = 0.3
Thus, knowledge that a snail has a pink background colouring increasesthe probability that it is striped.
Lecture 3: Probability 28th of October 2015 6 / 36
Independence of Events
Definition: Two events A and B are said to be independent ifP (A ∩B) = P (A)P (B).
Note that in this case (provided P (B) > 0), if A and B are independent
P (A|B) =P (A ∩B)
P (B)=
P (A)P (B)
P (B)= P (A),
and similarly P (B|A) = P (B) (provided P (A) > 0).
So for independent events, knowledge that one of the events has occurreddoes not change our assessment of the probability that the other event hasoccur.
If the probability that one event A occurs doesn’t affect the probabilitythat the event B also occurs, then we say that A and B are independent.
Lecture 3: Probability 28th of October 2015 7 / 36
General multiplication Law
We can rearrange the conditional probability law to obtain a generalMultiplication Law.
P(B|A) = P (A ∩B)
P (A)⇒ P(B|A)P(A) = P(A ∩ B)
Similarly
P(A|B)P(B) = P(A ∩ B)
ThenP(A ∩ B) = P(B|A)P(A) = P(A|B)P(B)
Lecture 3: Probability 28th of October 2015 8 / 36
The Partition law
Suppose we know that P(A∩B) = 0.52 and P(A∩Bc) = 0.14 what is p(A)?
A B
A∩B
S
So we have the rule
P (A) = P (A ∩B) + P (A ∩Bc) = 0.52 + 0.14 = 0.66
Lecture 3: Probability 28th of October 2015 9 / 36
More generally, events are mutually exclusive if at most one of the eventscan occur in a given experiment.
Suppose E1, . . . , En are mutually exclusive events, which together formthe whole sample space: E1 ∪ E2 ∪ · · · ∪ En = S. (In other words, everypossible outcome is in exactly one of the E’s.) Then
P (A) =
n∑i=1
P (A ∩ Ei) =n∑
i=1
P (A|Ei)P (Ei)
Lecture 3: Probability 28th of October 2015 10 / 36
Example: Mendelian segregation
At a particular locus in humans, there are two alleles A and B, and it isknown that the population frequencies of the genotypes AA, AB, andBB, are 0.49, 0.42, and 0.09, respectively. An AA man has a child with awoman whose genotype is unknown.
What is the probability that the child will have genotype AB?
We assume that as far as her genotype at this locus is concerned thewoman is chosen randomly from the population
Lecture 3: Probability 28th of October 2015 11 / 36
Use the partition rule, where the partition corresponds to the threepossible genotypes for the woman. Then
P (child AB) = P (child AB and mother AA)
+P (child AB and mother AB)
+P (child AB and mother BB)
= P (mother AA)P (child AB|mother AA)
+P (mother AB)P (child AB|mother AB)
+P (mother BB)P (child AB|mother BB)
= 0.49× 0 + 0.42× 0.5 + 0.09× 1
= 0.3.
Lecture 3: Probability 28th of October 2015 12 / 36
Bayes’ Rule
Bayes’ Rule is a very powerful probability law.
An example from medicine: We want to calculate the probability that aperson has a disease D given they have a specific symptom S, i.e. we wantto calculate P (D|S). This is a difficult task as we would need to take arandom sample of people from the population with the symptom.
A probability that is much easier to calculate is P (S|D), i.e. theprobability that a person with the disease has the symptom. Thisprobability can be assigned much more easily as medical records for peoplewith serious diseases are kept.
The power of Bayes Rule is its ability to take P (S|D) and calculateP (D|S).
Lecture 3: Probability 28th of October 2015 13 / 36
We have already seen a version of Bayes’ Rule
P (B|A) =P (A ∩B)
P (A)
Using the Multiplication Law we can rewrite this as
P (B|A) =P (A|B)P (B)
P (A)
This is the Bayes’ Rule.
Suppose P (S|D) = 0.12, P (D) = 0.01 and P (S) = 0.03. Then
P (D|S) = 0.12× 0.01
0.03= 0.04
Lecture 3: Probability 28th of October 2015 14 / 36
Example: Genetic testing
A gene has two possible types A1 and A2.
75% of the population have A1.
B is a disease that has 3 forms B1 (mild), B2 (severe) and B3
(lethal).
A1 is a protective gene, with the probabilities of having the threeforms given A1 as 0.9, 0.1 and 0 respectively.
People with A2 are unprotected and have the three forms withprobabilities 0, 0.5 and 0.5 respectively.
What is the probability that a person has gene A1 given they have thesevere disease?
Lecture 3: Probability 28th of October 2015 15 / 36
The first thing to do with such a question is ‘decode’ the information, i.e.write it down in a compact form we can work with.
P(A1) = 0.75 P(A2) = 0.25
P(B1|A1) = 0.9 P(B2|A1) = 0.1 P(B3|A1) = 0
P(B1|A2) = 0 P(B2|A2) = 0.5 P(B3|A2) = 0.5
We want P(A1|B2)?
Lecture 3: Probability 28th of October 2015 16 / 36
From Bayes Rule we know that
P(A1|B2) = P(B2|A1)P(A1)
P(B2)
We know P(B2|A1) and P(A1) but what is P(B2)?
We can use the Partition Law since A1 and A2 are mutually exclusive.
P(B2) = P(B2 ∩A1) + P(B2 ∩A2) (Partition Law)
= P(B2|A1)P(A1) + P(B2|A2)P(A2) (Multiplication Law)
= 0.1× 0.75 + 0.5× 0.25
= 0.2
We can use Bayes Rule to calculate P(A1|B2).
P(A1|B2) =0.1× 0.75
0.2= 0.375
Lecture 3: Probability 28th of October 2015 17 / 36
Permutations and Combinations (Probabilitiesof patterns)
In some situations we observe a specific pattern from a large numberof possible patterns.
To calculate the probability of the pattern we need to count the totalnumber of patterns.
This is why we need to learn about permutations and combinations.
Lecture 3: Probability 28th of October 2015 18 / 36
Permutations of n objects
Consider 2 objects A BQ. How many ways can they be arranged? i.e. how many permutationsare there?A. 2 ways AB BA
Consider 3 objects A B CQ. How many ways can they be arranged (permuted)?A. 6 ways ABC ACB BCA BAC CAB CBA
Consider 4 objects A B C DQ. How many ways can they be arranged (permuted)? A. 24 ways
There is a pattern emerging here.
No. of objects 2 3 4 5 6 . . .
No. of permutations 2 6 24 120 720 . . .
Can we find a formula for the number of permutations of n objects?Lecture 3: Probability 28th of October 2015 19 / 36
Suppose we have 5 objects. How many different ways can we place theminto 5 boxes?
There are 5 choices of object for the first box.
5
There are now only 4 objects to choose from for the second box.
5 4
There are 3 choices for the 3rd box, 2 for the 4th and 1 for the 5th box.
5 4 3 2 1
Thus, the number of permutations of 5 objects is 5× 4× 3× 2× 1.
Lecture 3: Probability 28th of October 2015 20 / 36
In general, the number of permutations of n objects is
n(n− 1)(n− 2) . . . (3)(2)(1)
We write this as n! (pronounced ‘n factorial’). There should be a buttonon your calculator that calculates factorials.
Lecture 3: Probability 28th of October 2015 21 / 36
Permutations of r objects from n
Now suppose we have 4 objects and only 2 boxes. How manypermutations of 2 objects when we have 4 to choose from?
There are 4 choices for the first box and 3 choices for the second box
4 3
So there are 12 permutations of 2 objects from 4. We write this as
4P2 = 12
Lecture 3: Probability 28th of October 2015 22 / 36
We say there are nPr permutations of r objects chosen from n.
The formula for nPr is given by
nPr =n!
(n− r)!
To see why this works consider the example above 4P2.
Using the formula we get
4P1 =4!
2!=
4× 3× 2× 1
2× 1= 4× 3
Lecture 3: Probability 28th of October 2015 23 / 36
Combinations of r objects from n
Now consider the number of ways of choosing 2 objects from 4 when theorder doesn’t matter. We just want to count the number of possiblecombinations.
We know that there are 12 permutations when choosing 2 objects from 4.These are
AB AC AD BC BD CDBA CA DA CB DB DC
Notice how the permutations are grouped in 2’s which are the samecombination of letters. Thus there are 12/2 = 6 possible combinations.
AB AC AD BC BD CD
Lecture 3: Probability 28th of October 2015 24 / 36
We write this as4C2 = 6
We say there are nCr combinations of r objects chosen from n.
The formula for nCr is given by
nCr =n!
(n− r)!r!
Lecture 3: Probability 28th of October 2015 25 / 36
Example: the National Lottery
In the National Lottery you need to choose 6 balls from 49.
What is the probability that I choose all 6 balls correctly?
There are 2 ways of answering this question(i) using permutations and combinations(ii) using a tree diagram
Lecture 3: Probability 28th of October 2015 26 / 36
Method 1 - using permutations andcombinations
P (6 correct) =No. of ways of choosing the 6 correct balls
No. of ways of choosing 6 balls
=6P6
49P6
=6!0!49!43!
=6× 5× 4× 3× 2× 1
49× 48× 47× 46× 45× 44= 0.0000000715112 (1 in 14 million)
Lecture 3: Probability 28th of October 2015 27 / 36
Method 2 - using a tree diagram
Consider the first ball I choose, the probability it is correct is
6
49
The second ball I choose is correct with probability
5
48
The third ball I choose is correct with probability
4
47
and so on. Thus the probability that I get all 6 balls correct is
6
49
5
48
4
47
3
46
2
45
1
44= 0.0000000715112 (1 in 14 million)
Lecture 3: Probability 28th of October 2015 28 / 36
Puzzle 2
You have reached the final of a game show. The host shows you threedoors and tell you that there is a prize behind one of the door. You pick adoor. The host then opens one of the door you did not pick: there is noprice behind this door. He then asks you if you want to change from thedoor you chose to the other remaining door.
Is it to your advantage to switch your choice?
Lecture 3: Probability 28th of October 2015 29 / 36
Assume that you picked the first door.
The price has equal probability to be behind each of the three doors.
Let’s study the three cases (depending on where the price is) individuallyand investigate whether it is better to keep your door choice or to switch.
Lecture 3: Probability 28th of October 2015 30 / 36
Case 1: the price is behind the first door.
Lecture 3: Probability 28th of October 2015 31 / 36
The host has two choices: he can tell you that the price is not behind thesecond door or tell you that it is not behind the third door.
In this case,
a player who stays with his initial choice wins.
a player who changes his initial choice loses.
Lecture 3: Probability 28th of October 2015 32 / 36
Case 2: the price is behind the second door.
Lecture 3: Probability 28th of October 2015 33 / 36
The host has no choice: he can only tell you that the price is not behindthe third door.
In this case,
a player who stays with his initial choice loses.
a player who changes his initial choice wins.
Lecture 3: Probability 28th of October 2015 34 / 36
Case 3: the price is behind the third door.
Again, the host has no choice: he can only tell you that the price is notbehind the second door.
In this case,
a player who stays with his initial choice loses.
a player who changes his initial choice wins.
Lecture 3: Probability 28th of October 2015 35 / 36
To summarize:
Case 1: winning strategy is to stay with initial choice
Case 2: winning strategy is to change door choice
Case 3: winning strategy is to change door choice
A player who stays with the initial choice wins in only one out ofthree of these equally likely possibilities, while a player who switcheswins in two out of three.
Possible to obtain the same solution more formally using conditionalprobabilities and Bayes’ formula. See for example,https://en.wikipedia.org/wiki/Monty_Hall_problem.
Lecture 3: Probability 28th of October 2015 36 / 36