Lecture 1: Bayesian inference and maximum likelihood (Idź na całość)
Post on 19-Dec-2015
[Figure: three gates hide one car and two zonks (Car, Zonk, Zonk), labelled P = 1/3 and P = 1/2; markers indicate the first choice and the gate shown.]

First choice   Remain   Change
Car            win      lose
Zonk           lose     win
Zonk           lose     win
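The law of dependent probability

$$p(A \wedge B) = p(A \mid B)\,p(B) \quad\Rightarrow\quad p(A \mid B) = \frac{p(A \wedge B)}{p(B)}$$

$$p(B \wedge A) = p(B \mid A)\,p(A) \quad\Rightarrow\quad p(B \mid A) = \frac{p(B \wedge A)}{p(A)}$$

$$p(A \wedge B) = p(A \mid B)\,p(B) = p(B \mid A)\,p(A)$$

Theorem of Bayes

$$p(A \mid B) = \frac{p(B \mid A)\,p(A)}{p(B)}$$

posterior = conditional × priori(A) / priori(B)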
Total probability

$$p(A) = \sum_{i=1}^{n} p(A \mid B_i)\,p(B_i)$$

[Figure: probability tree with branches P(B1), P(B2), P(B3), followed by P(A|B1), P(A|B2).]
$$p(A \mid B) = \frac{p(B \mid A)\,p(A)}{p(B)}$$

$$p(B_i \mid A) = \frac{p(A \mid B_i)\,p(B_i)}{\sum_{i=1}^{n} p(A \mid B_i)\,p(B_i)}$$
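The total probability and the Bayes formula for p(B_i | A) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `total_probability` and `posterior` and the example numbers are assumptions made here, not part of the lecture.

```python
from typing import List, Sequence


def total_probability(priors: Sequence[float], likelihoods: Sequence[float]) -> float:
    """Return p(A) = sum_i p(A|B_i) * p(B_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    return sum(l * p for l, p in zip(likelihoods, priors))


def posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Return p(B_i|A) = p(A|B_i) * p(B_i) / p(A) for every i."""
    evidence = total_probability(priors, likelihoods)
    if evidence == 0:
        raise ValueError("p(A) is zero, so the posterior is undefined")
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Made-up numbers for illustration only (not taken from the lecture).
example_priors = [0.5, 0.3, 0.2]        # p(B_1), p(B_2), p(B_3)
example_likelihoods = [0.1, 0.4, 0.8]   # p(A|B_1), p(A|B_2), p(A|B_3)
example_posterior = posterior(example_priors, example_likelihoods)
print(example_posterior)
```

The posterior values always sum to 1 because every term is divided by the same total probability p(A).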
(Idź na całość)

Assume we choose gate 1 (G1) at the first choice. We are looking for the probability p(G1|M3) that the car is behind gate 1 if we know that the moderator opened gate 3 (M3).

$$p(G1 \mid M3) = \frac{p(M3 \wedge G1)}{p(M3)} = \frac{p(M3 \mid G1)\,p(G1)}{p(M3 \mid G1)\,p(G1) + p(M3 \mid G2)\,p(G2) + p(M3 \mid G3)\,p(G3)}$$

$$p(G1 \mid M3) = \frac{1/2 \cdot 1/3}{1/2 \cdot 1/3 + 1 \cdot 1/3 + 0 \cdot 1/3} = \frac{1}{3}$$
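The gate calculation can be checked with exact fractions. The sketch below assumes, as on the slide, that gate 1 was chosen first and that the moderator opens gate 3; the variable names `prior`, `likelihood_m3` and `posterior` are chosen here for illustration.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each gate.
prior = {gate: Fraction(1, 3) for gate in ("G1", "G2", "G3")}

# p(M3 | car behind gate), given that we chose gate 1:
# car behind G1 -> moderator picks gate 2 or 3 at random (1/2),
# car behind G2 -> moderator must open gate 3 (1),
# car behind G3 -> moderator never opens gate 3 (0).
likelihood_m3 = {"G1": Fraction(1, 2), "G2": Fraction(1), "G3": Fraction(0)}

# Total probability p(M3) and posterior p(G | M3) for each gate.
p_m3 = sum(likelihood_m3[g] * prior[g] for g in prior)
posterior = {g: likelihood_m3[g] * prior[g] / p_m3 for g in prior}

for gate, value in posterior.items():
    print(gate, value)
```

Remaining with gate 1 wins with probability 1/3, so changing to gate 2 wins with probability 2/3.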
Calopteryx splendens
We study the occurrence of the damselfly Calopteryx splendens at small rivers. We know from the literature that C. splendens occurs at about 10% of all rivers. Occurrence depends on water quality. Suppose we have five quality classes that occur in 10% (class I), 15% (class II), 27% (class III), 43% (class IV), and 5% (class V) of all rivers. The probability of finding Calopteryx in these five classes is 1% (class I), 7% (class II), 14% (class III), 31% (class IV), and 47% (class V).
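To which class does a river most probably belong if we find Calopteryx?

$$p(\text{class I} \mid A) = \frac{p(A \mid \text{class I})\,p(\text{class I})}{p(A \mid \text{class I})\,p(\text{class I}) + p(A \mid \text{class II})\,p(\text{class II}) + p(A \mid \text{class III})\,p(\text{class III}) + p(A \mid \text{class IV})\,p(\text{class IV}) + p(A \mid \text{class V})\,p(\text{class V})}$$

$$p(\text{class I} \mid A) = \frac{0.1 \cdot 0.01}{0.1 \cdot 0.01 + 0.15 \cdot 0.07 + 0.27 \cdot 0.14 + 0.43 \cdot 0.31 + 0.05 \cdot 0.47} = 0.0048$$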
p(class II|A) = 0.051, p(class III|A) = 0.183, p(class IV|A) = 0.647, p(class V|A) = 0.114
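The five posterior probabilities can be reproduced with a short script. This is a sketch using only the percentages given in the text; the dictionary names `class_share` and `p_found` are assumptions introduced here.

```python
# Share of rivers in each water quality class, p(class).
class_share = {"I": 0.10, "II": 0.15, "III": 0.27, "IV": 0.43, "V": 0.05}

# Probability of finding Calopteryx in a river of each class, p(A | class).
p_found = {"I": 0.01, "II": 0.07, "III": 0.14, "IV": 0.31, "V": 0.47}

# Total probability p(A) of finding Calopteryx at a randomly chosen river.
p_a = sum(p_found[c] * class_share[c] for c in class_share)

# Bayes: p(class | A) = p(A | class) * p(class) / p(A).
posterior_by_class = {c: p_found[c] * class_share[c] / p_a for c in class_share}
most_probable = max(posterior_by_class, key=posterior_by_class.get)

for c, value in posterior_by_class.items():
    print(f"class {c}: {value:.4f}")
print("most probable class:", most_probable)
```

With these inputs the total probability p(A) comes out at about 0.206, and class IV is the most probable class.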
Indicator values
Bayes and forensic
False positive fallacy
Error of the prosecutor

500 suspects
  DNA identical: 1 person
    DNA test positive: 1 person
  DNA not identical: 499 persons
    DNA negative: 495 persons
    DNA test positive: 4 persons
Let’s take a standard DNA test for identifying persons. The test has a precision of more than 99%. What is the probability that we identify the wrong person?
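The forensic version of Bayes' theorem

Here c denotes the person whose DNA is identical and + denotes a positive test.

$$p(c \mid +) = \frac{p(+ \mid c)\,p(c)}{p(+)} = \frac{1 \cdot 1/500}{5/500} = \frac{1}{5}$$

$$p(c \mid +) = \frac{p(+ \mid c)\,p(c)}{p(+ \mid c)\,p(c) + p(+ \mid \neg c)\,p(\neg c)} = \frac{1 \cdot \frac{1}{500}}{1 \cdot \frac{1}{500} + \frac{4}{499} \cdot \frac{499}{500}} = \frac{1}{5}$$

A positive test therefore points to the wrong person with probability 1 - 1/5 = 4/5.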
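The DNA example can be recomputed with exact fractions. The sketch below takes the counts from the slide (500 suspects, 1 with identical DNA, 4 false positives among the other 499); the variable names are assumptions made here.

```python
from fractions import Fraction

n_suspects = 500
n_false_positive = 4  # positive tests among the 499 non-identical suspects

p_c = Fraction(1, n_suspects)   # prior: a given suspect has identical DNA
p_not_c = 1 - p_c               # prior: DNA not identical
p_pos_given_c = Fraction(1)     # the identical person tests positive
p_pos_given_not_c = Fraction(n_false_positive, n_suspects - 1)

# Total probability of a positive test.
p_pos = p_pos_given_c * p_c + p_pos_given_not_c * p_not_c

# Bayes: probability that a positive test belongs to the identical person.
p_c_given_pos = p_pos_given_c * p_c / p_pos
p_wrong_person = 1 - p_c_given_pos

print(p_pos, p_c_given_pos, p_wrong_person)
```

Even with a test that is right in more than 99% of cases, four of the five positive results in this pool come from the wrong persons.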
The error of the advocate
In the trial of the football star O. J. Simpson, one of his advocates (a Harvard professor) argued that Simpson had sometimes beaten his wife. However, only very few men who beat their wives later murder them (about 0.1%).
[Figure: probability tree with the labels: whole population 250 000 000; beaten wives 250 000 000 - N; not beaten wives N; murdered by husband P = 1/10000; murdered otherwise P = 1/10000 (twice); 10000 beaten wives; murdered by husband P = 1/2.]
$$p(m \mid b) = \frac{p(m \mid h_b)}{p(m \mid h_b) + p(m \mid \neg h_b)} = \frac{1}{2}$$
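The value 1/2 can be checked directly. The sketch below assumes the reading of the slide in which a beaten wife is murdered by her husband with probability 1/10000 and murdered otherwise with probability 1/10000; the variable names are introduced here for illustration.

```python
from fractions import Fraction

# Probabilities read from the slide (assumed to refer to beaten wives).
p_murdered_by_husband = Fraction(1, 10000)
p_murdered_otherwise = Fraction(1, 10000)

# Among murdered beaten wives, the share murdered by the husband.
p_husband_given_murdered = p_murdered_by_husband / (
    p_murdered_by_husband + p_murdered_otherwise
)
print(p_husband_given_murdered)
```

The relevant question is not how many wife beaters become murderers, but how often the husband is the murderer once a beaten wife has been murdered.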
Assume a hawk searches a terrain for prey. Let p(suc) be the probability of finding prey. Assume he spots a number of pixels at a time that can be modelled by squares. Hence he surveys a grid. Of course he knows where to search. That means he knows a priori probabilities p(prey) for each square of the virtual grid.
However, there is another probability for each square, the probability p(suc|prey) of finding the prey.
Our hawk will now systematically modify his a priori probabilities p(prey) depending on his successive failures and search where he assumes the highest probability of success.
Photo: Peter Schild
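$$p(\text{suc} \mid \text{prey}) = \frac{p(\text{prey} \mid \text{suc})\,p(\text{suc})}{p(\text{prey} \mid \text{suc})\,p(\text{suc}) + p(\text{prey} \mid \neg\text{suc})\,p(\neg\text{suc})} = \frac{p(\text{suc})}{p(\text{suc}) + p(\text{prey} \mid \neg\text{suc})\,(1 - p(\text{suc}))}$$

$$p(\text{prey} \mid \neg\text{suc}) = \frac{p(\text{suc})\,(1 - p(\text{suc} \mid \text{prey}))}{p(\text{suc} \mid \text{prey})\,(1 - p(\text{suc}))}$$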
Now let AT be the total area of search and AE the empty part without prey. k denotes the number of successful hunts within the part of the area with prey. Hence p(suc|prey) = k / (AT-AE)
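Substituting p(suc) = k / AT and p(suc|prey) = k / (AT - AE):

$$p(\text{prey} \mid \neg\text{suc}) = \frac{\frac{k}{A_T}\left(1 - \frac{k}{A_T - A_E}\right)}{\frac{k}{A_T - A_E}\left(1 - \frac{k}{A_T}\right)} = \frac{\frac{k}{A_T} \cdot \frac{A_T - A_E - k}{A_T - A_E}}{\frac{k}{A_T - A_E} \cdot \frac{A_T - k}{A_T}} = \frac{A_T - A_E - k}{A_T - k}$$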
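The probability of prey after a failed search can be computed in two equivalent ways: from the probabilities p(suc) and p(suc|prey), or directly from the areas. The sketch below assumes p(suc) = k / AT; the function names and the example areas are made up for illustration.

```python
def prey_given_failure(p_suc: float, p_suc_given_prey: float) -> float:
    """p(prey | no success) from p(suc) and p(suc | prey)."""
    if not 0.0 < p_suc_given_prey <= 1.0 or not 0.0 <= p_suc < 1.0:
        raise ValueError("probabilities out of range")
    return p_suc * (1.0 - p_suc_given_prey) / (p_suc_given_prey * (1.0 - p_suc))


def prey_given_failure_from_areas(a_total: float, a_empty: float, k: float) -> float:
    """Same quantity written with areas: (AT - AE - k) / (AT - k)."""
    return (a_total - a_empty - k) / (a_total - k)


# Illustrative values only: 100 squares, 40 of them empty, 6 successful hunts.
a_total, a_empty, k = 100.0, 40.0, 6.0

via_probabilities = prey_given_failure(k / a_total, k / (a_total - a_empty))
via_areas = prey_given_failure_from_areas(a_total, a_empty, k)
print(via_probabilities, via_areas)
```

Both routes give the same number, which is a quick check that the area formula follows from the Bayes formula.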
Gains and costs
Assume a parasitic wasp that attacks clutches of aphids. These clutches are of different quality (size, exposure). The wasp visits one clutch after another. However, because it has competitors, it has to choose after a certain time. How long should the wasp search to make the best choice, that is, to attack the best clutch in the given situation?
Photo: R. Long
We define gain and cost functions and apply the odds strategy
$$\text{Odds: } o = \frac{p}{q} = \frac{p}{1 - p}$$

$$O = \sum_{k=n}^{1} o_k$$
Stopping rule:
Stop when the sum of the odds > 1
Clutch  p(best)  q=1-p(best)  odds=p/q  Sum odds  Prod 1-p  qo
20      0.050    0.950        0.053     0.053     0.950     0.05
19      0.053    0.947        0.056     0.108     0.9       0.097368
18      0.056    0.944        0.059     0.167     0.85      0.141959
17      0.059    0.941        0.063     0.230     0.8       0.183609
16      0.063    0.938        0.067     0.296     0.75      0.222133
15      0.067    0.933        0.071     0.368     0.7       0.257324
14      0.071    0.929        0.077     0.445     0.65      0.288944
13      0.077    0.923        0.083     0.528     0.6       0.316717
12      0.083    0.917        0.091     0.619     0.55      0.340324
11      0.091    0.909        0.100     0.719     0.5       0.359386
10      0.100    0.900        0.111     0.830     0.45      0.373447
9       0.111    0.889        0.125     0.955     0.4       0.381953
8       0.125    0.875        0.143     1.098     0.35      0.384209
7       0.143    0.857        0.167     1.264     0.3       0.379322
6       0.167    0.833        0.200     1.464     0.25      0.366102
5       0.200    0.800        0.250     1.714     0.2       0.342881
4       0.250    0.750        0.333     2.048     0.15      0.307161
3       0.333    0.667        0.500     2.548     0.1       0.254774
2       0.500    0.500        1.000     3.548     0.05      0.177387
1       1.000    0.000
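The columns of the table can be recomputed with a short script. The sketch assumes, as the table values suggest, that p(best) = 1/k for clutch k, that the sixth column is the running product of the q values, and that the last column is that product times the running sum of odds. The function names are introduced here.

```python
def odds_table(n: int):
    """Rows (clutch, p, q, odds, sum of odds, product of q, product * sum)."""
    rows = []
    sum_odds = 0.0
    prod_q = 1.0
    # Count down from clutch n to clutch 2; clutch 1 has p = 1 and no finite odds.
    for k in range(n, 1, -1):
        p = 1.0 / k
        q = 1.0 - p
        odds = p / q
        sum_odds += odds
        prod_q *= q
        rows.append((k, p, q, odds, sum_odds, prod_q, prod_q * sum_odds))
    return rows


def first_row_with_odds_above_one(rows):
    """Stopping rule: first row in which the sum of the odds exceeds 1."""
    for row in rows:
        if row[4] > 1.0:
            return row
    return None


rows = odds_table(20)
stop_row = first_row_with_odds_above_one(rows)
best_row = max(rows, key=lambda row: row[6])

for row in rows:
    print("%2d  %.3f  %.3f  %.3f  %.3f  %.2f  %.6f" % row)
print("stop at clutch", stop_row[0], "with success probability", round(stop_row[6], 6))
```

For 20 clutches the sum of the odds first exceeds 1 at clutch 8, which is also the row with the highest value in the last column.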
[Figure: success probability (y-axis, 0 to 0.45) plotted against the stopping number n (x-axis, 0 to 20).]
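$$p(S) = \prod_{k=n}^{1} (1 - p_k) \sum_{k=n}^{1} \frac{p_k}{1 - p_k} = \prod_{k=n}^{1} q_k \sum_{k=n}^{1} o_k$$

$$p(S) = \sum_{a=r+1}^{n} \frac{1}{n} \frac{r}{a - 1} \approx \int_{r}^{n} \frac{1}{n} \frac{r}{a}\,da = -\frac{r}{n} \ln\left(\frac{r}{n}\right)$$

$$\frac{dS}{dr} = -\frac{1}{n} \ln\left(\frac{r}{n}\right) - \frac{r}{n} \frac{n}{r} \frac{1}{n} = 0 \quad\Rightarrow\quad \ln\left(\frac{r}{n}\right) = -1 \quad\Rightarrow\quad r = \frac{n}{e}$$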
For our clutch example r = 7.358. This is called the 1/e stopping rule.
The wasp should attack the clutch (at position r) that is better than the best of the previous r-1. The probability that the second-best clutch is within the first r trials is p2 = r/(a-1), where a is the position of the best clutch.
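The continuous approximation p(S) = -(r/n) ln(r/n) and its maximum at r = n/e can be checked numerically. This is a small sketch for n = 20 clutches; the function name `success_probability` is an assumption made here.

```python
import math


def success_probability(r: float, n: int) -> float:
    """Continuous approximation p(S) = -(r/n) * ln(r/n) for 0 < r <= n."""
    x = r / n
    return -x * math.log(x)


n = 20
r_opt = n / math.e                       # stopping position from dS/dr = 0
p_opt = success_probability(r_opt, n)    # equals 1/e at the optimum

print(round(r_opt, 3), round(p_opt, 4))
print(success_probability(7, n), success_probability(8, n))
```

At r = n/e the approximate success probability is itself 1/e, and the neighbouring integer positions 7 and 8 give slightly lower values.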
Maximum likelihoods
Suppose you studied 50 patients in a clinical trial and detected the presence of a certain bacterial disease in 30 of them.
What is the most probable frequency of this disease in the population?
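$$p_{0.5}(30 \mid 50) = \binom{50}{30} \left(\frac{1}{2}\right)^{50} = 0.042$$

$$p_{0.6}(30 \mid 50) = \binom{50}{30} \left(\frac{3}{5}\right)^{30} \left(\frac{2}{5}\right)^{20} = 0.115$$

$$p_{0.8}(30 \mid 50) = \binom{50}{30} \left(\frac{4}{5}\right)^{30} \left(\frac{1}{5}\right)^{20} = 0.001$$

$$L_p = f_p(x_1 \ldots x_i)$$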
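The three likelihood values can be reproduced with the binomial formula. The sketch below uses `math.comb` from the Python standard library; the function name `binomial_likelihood` is chosen here for illustration.

```python
from math import comb


def binomial_likelihood(p: float, successes: int = 30, trials: int = 50) -> float:
    """Probability of `successes` cases among `trials` patients for frequency p."""
    return comb(trials, successes) * p**successes * (1.0 - p) ** (trials - successes)


likelihoods = {p: binomial_likelihood(p) for p in (0.5, 0.6, 0.8)}
for p, value in likelihoods.items():
    print(f"p = {p}: {value:.3f}")
```

Of the three candidate frequencies, p = 0.6 makes the observed 30 out of 50 cases most probable.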
Likelihood function
[Figure: L(p) (y-axis, 0 to 0.14) plotted against p (x-axis, 0 to 1).]
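$$L_p = \binom{50}{30} p^{30} (1 - p)^{20}$$

$$\frac{dL_p}{dp} = \binom{50}{30}\, 30 p^{29} (1 - p)^{20} - \binom{50}{30}\, p^{30}\, 20 (1 - p)^{19} = 0$$

$$3(1 - p) = 2p \quad\Rightarrow\quad p = \frac{3}{5}$$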
Log-likelihood estimator ln(L_p)

$$\ln(L_p) = \ln\binom{50}{30} + 30 \ln(p) + 20 \ln(1 - p)$$

$$\frac{d \ln L_p}{dp} = \frac{30}{p} - \frac{20}{1 - p} = 0 \quad\Rightarrow\quad p = \frac{3}{5}$$
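The maximum of the log-likelihood can also be located numerically and compared with the closed-form result 30/50. The sketch below uses a simple grid search, which is a choice made here for illustration and not the method of the lecture; the function names `log_likelihood` and `score` are assumptions.

```python
import math


def log_likelihood(p: float, successes: int = 30, trials: int = 50) -> float:
    """ln(L_p) = ln C(trials, successes) + successes ln p + (trials - successes) ln(1 - p)."""
    return (
        math.log(math.comb(trials, successes))
        + successes * math.log(p)
        + (trials - successes) * math.log(1.0 - p)
    )


def score(p: float, successes: int = 30, trials: int = 50) -> float:
    """Derivative d ln(L_p) / dp = successes / p - (trials - successes) / (1 - p)."""
    return successes / p - (trials - successes) / (1.0 - p)


# Grid search over p = 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]
p_hat_grid = max(grid, key=log_likelihood)
p_hat_closed_form = 30 / 50

print(p_hat_grid, p_hat_closed_form, score(p_hat_closed_form))
```

The grid maximum coincides with the analytical estimate p = 3/5, where the derivative of the log-likelihood is zero.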