
Maximum Likelihood Estimates and the EM Algorithms II

Henry Horng-Shing Lu
Institute of Statistics
National Chiao Tung University
hslu@stat.nctu.edu.tw
http://tigpbp.iis.sinica.edu.tw/courses.htm

Part 1: Computation Tools

Include Functions in R
source("file path")
Example

In MME.R:

MME = function(y1, y2, y3, y4){
  n = y1+y2+y3+y4;
  phi1 = 4.0*(y1/n-0.5);
  phi2 = 1-4*y2/n;
  phi3 = 1-4*y3/n;
  phi4 = 4.0*y4/n;
  phi = (phi1+phi2+phi3+phi4)/4.0;
  print("By MME method")
  return(phi); # print(phi);
}

In R:

> source("C:/MME.R")
> MME(125, 18, 20, 24)
[1] "By MME method"
[1] 0.5935829

Part 2: Motivation Examples

Example 1 in Genetics (1)
Two linked loci with alleles A and a, and B and b.
  A, B: dominant
  a, b: recessive
A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.

[Figure: the two chromosome arrangements of the double heterozygote, each occurring with probability 1/2, yielding the gamete pairs AB/ab and Ab/aB]

Example 1 in Genetics (2)
Probabilities for genotypes in gametes:

            No Recombination   Recombination
  Male            1-r               r
  Female          1-r'              r'

              AB          ab          aB         Ab
  Male      (1-r)/2     (1-r)/2      r/2        r/2
  Female    (1-r')/2    (1-r')/2     r'/2       r'/2

[Figure: gamete formation diagram as in Example 1 in Genetics (1)]

Example 1 in Genetics (3)
Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92.
More:
http://en.wikipedia.org/wiki/Genetics
http://www2.isye.gatech.edu/~brani/isyebayes/bank/handout12.pdf

Example 1 in Genetics (4)
Offspring genotypes and probabilities (female gametes in rows, male gametes in columns):

                              MALE
                 AB (1-r)/2            ab (1-r)/2            aB r/2             Ab r/2
FEMALE
  AB (1-r')/2    AABB (1-r)(1-r')/4    aABb (1-r)(1-r')/4    aABB r(1-r')/4     AABb r(1-r')/4
  ab (1-r')/2    AaBb (1-r)(1-r')/4    aabb (1-r)(1-r')/4    aaBb r(1-r')/4     Aabb r(1-r')/4
  aB r'/2        AaBB (1-r)r'/4        aabB (1-r)r'/4        aaBB r r'/4        AabB r r'/4
  Ab r'/2        AABb (1-r)r'/4        aAbb (1-r)r'/4        aABb r r'/4        AAbb r r'/4

Example 1 in Genetics (5)
Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
  A*: the dominant phenotype from (Aa, AA, aA).
  a*: the recessive phenotype from aa.
  B*: the dominant phenotype from (Bb, BB, bB).
  b*: the recessive phenotype from bb.
A*B*: 9 gametic combinations.
A*b*: 3 gametic combinations.
a*B*: 3 gametic combinations.
a*b*: 1 gametic combination.
Total: 16 combinations.

Example 1 in Genetics (6)
Let φ = (1-r)(1-r'), then
  P(A*B*) = (2+φ)/4
  P(A*b*) = P(a*B*) = (1-φ)/4
  P(a*b*) = φ/4

Example 1 in Genetics (7)
Hence, the random sample of n from the offspring of selfed heterozygotes will follow a multinomial distribution:
  Multinomial(n; (2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4).
We know that φ = (1-r)(1-r'), 0 ≤ r ≤ 1/2 and 0 ≤ r' ≤ 1/2.
So 1/4 ≤ φ ≤ 1.

Example 1 in Genetics (8)
Suppose that we observe the data (y1, y2, y3, y4) = (125, 18, 20, 24), which is a random sample from
  Multinomial(n; (2+φ)/4, (1-φ)/4, (1-φ)/4, φ/4).
Then the probability mass function is
  g(y|φ) = n!/(y1! y2! y3! y4!) × ((2+φ)/4)^y1 × ((1-φ)/4)^(y2+y3) × (φ/4)^y4.
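As a quick sanity check, this pmf can be evaluated in R at the observed counts (a minimal sketch; the function name loglik and the trial value of φ are illustrative):

loglik = function(phi, y1 = 125, y2 = 18, y3 = 20, y4 = 24){
  # log of the multinomial pmf g(y|phi), including the constant term
  n = y1 + y2 + y3 + y4
  lgamma(n+1) - lgamma(y1+1) - lgamma(y2+1) - lgamma(y3+1) - lgamma(y4+1) +
    y1*log((2+phi)/4) + (y2+y3)*log((1-phi)/4) + y4*log(phi/4)
}
loglik(0.5)   # log-likelihood at the trial value phi = 0.5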

Maximum Likelihood Estimate (MLE)
Likelihood: L(φ) = g(y|φ).
Maximize the likelihood: solve the score equations, i.e., set the first derivatives of the log-likelihood to zero.
Under regularity conditions, the MLE is consistent, asymptotically efficient and asymptotically normal!
More: http://en.wikipedia.org/wiki/Maximum_likelihood

MLE for Example 1 (1)
Likelihood:
  L(φ) = n!/(y1! y2! y3! y4!) × ((2+φ)/4)^y1 × ((1-φ)/4)^(y2+y3) × (φ/4)^y4
Log-likelihood:
  log L(φ) = log(n!/(y1! y2! y3! y4!)) + y1 log((2+φ)/4) + (y2+y3) log((1-φ)/4) + y4 log(φ/4)
MLE:
  L(φ̂_MLE) = max_φ L(φ), equivalently log L(φ̂_MLE) = max_φ log L(φ)

MLE for Example 1 (2)
Setting the score to zero:
  l'(φ) = d log L(φ)/dφ = y1/(2+φ) - (y2+y3)/(1-φ) + y4/φ = 0.
Multiplying through by (2+φ)(1-φ)φ gives the quadratic
  (y1+y2+y3+y4) φ² - (y1-2y2-2y3-y4) φ - 2y4 = 0,
that is, A φ² + B φ + C = 0 with A = y1+y2+y3+y4, B = -(y1-2y2-2y3-y4) and C = -2y4.
The admissible root is
  φ̂_MLE = (-B + √(B² - 4AC)) / (2A).
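The closed-form root can be computed directly in R (a small sketch; the function name phi.mle is illustrative):

phi.mle = function(y1, y2, y3, y4){
  A = y1 + y2 + y3 + y4
  B = -(y1 - 2*y2 - 2*y3 - y4)
  C = -2*y4
  (-B + sqrt(B^2 - 4*A*C)) / (2*A)   # the root lying in [1/4, 1]
}
phi.mle(125, 18, 20, 24)   # approximately 0.5778760, matching the iterative methods below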

MLE for Example 1 (3)
Checking:
1. Is d² log L(φ)/dφ² < 0 at φ = φ̂_MLE?
2. Is 1/4 ≤ φ̂_MLE ≤ 1?
3. Compare log L(φ̂_MLE) with the log-likelihood at the other candidate values (the other root of the quadratic and the boundary points).
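These three checks can be carried out numerically (a minimal sketch; d2logL, logL and phi.hat are illustrative names, and the constant term of the log-likelihood is dropped since it does not affect the comparison):

y1 = 125; y2 = 18; y3 = 20; y4 = 24
phi.hat = 0.577876                      # MLE from the closed-form root above
# second derivative of the log-likelihood
d2logL = function(phi) -y1/(2+phi)^2 - (y2+y3)/(1-phi)^2 - y4/phi^2
d2logL(phi.hat) < 0                     # check 1: negative, so phi.hat is a local maximum
(phi.hat >= 1/4) & (phi.hat <= 1)       # check 2: the estimate respects the constraint
# check 3: compare the log-likelihood (without the constant) at the MLE and near the boundaries
logL = function(phi) y1*log((2+phi)/4) + (y2+y3)*log((1-phi)/4) + y4*log(phi/4)
c(logL(phi.hat), logL(0.25), logL(1 - 1e-8))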

Part 3: Numerical Solutions for the Score Equations of MLEs

A Banach Space
A Banach space B is a vector space over the field K with a norm ||·|| such that
1. ||x|| ≥ 0 for all x ∈ B.
2. ||x|| = 0 if and only if x = 0.
3. ||rx|| = |r| ||x|| for all r ∈ K and x ∈ B.
4. ||x + y|| ≤ ||x|| + ||y|| for all x, y ∈ B.
5. Every Cauchy sequence (x^(k)) of B converges in B (i.e., B is complete).
More: http://en.wikipedia.org/wiki/Banach_space

Lipschitz Continuous
Let A ⊂ B be a closed subset and F: A → A a mapping.
1. F is Lipschitz continuous on A with constant L ≥ 0 if ||F(x) - F(y)|| ≤ L ||x - y|| for all x, y ∈ A.
2. F is a contraction mapping on A if F is Lipschitz continuous with 0 ≤ L < 1.
More: http://en.wikipedia.org/wiki/Lipschitz_continuous

Fixed Point Theorem (1)
If F is a contraction mapping on A, i.e., F is Lipschitz continuous with 0 ≤ L < 1, then
1. F has a unique fixed point s ∈ A such that F(s) = s.
2. For any initial x^(0) ∈ A, the iterates x^(k) = F(x^(k-1)) ∈ A, k = 1, 2, ....
3. x^(k) → s as k → ∞, with
   ||s - x^(k)|| ≤ (L^(k-t) / (1 - L)) ||x^(t+1) - x^(t)||, 0 ≤ t ≤ k.
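As a toy illustration of the theorem (a sketch not taken from the original slides): F(x) = cos(x) is a contraction on [0, 1], since |F'(x)| = |sin(x)| ≤ sin(1) < 1 there, so the iteration converges to its unique fixed point from any starting value in the interval:

F = function(x) cos(x)      # a contraction mapping on [0, 1]
x = 0.5                     # initial value x^(0)
for(k in 1:50) x = F(x)     # x^(k) = F(x^(k-1))
x                           # approximately 0.739085, the unique fixed point of cos(x) = x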

Fixed Point Theorem (2)
More:
http://en.wikipedia.org/wiki/Fixed-point_theorem
http://www.math-linux.com/spip.php?article60

Applications for MLE (1)
Numerical solution of the score equation:
  d log L(φ)/dφ = l'(φ) = 0, i.e., 0 = α l'(φ) for any α ≠ 0,
so a solution satisfies the fixed-point equation
  φ = φ + α l'(φ) ≡ F(φ),
where F is (assumed to be) a contraction mapping.
φ^(0) ∈ [1/4, 1]: an initial value.
Iterate
  φ^(1) = φ^(0) + α l'(φ^(0)) = F(φ^(0)),
  φ^(k) = φ^(k-1) + α l'(φ^(k-1)) = F(φ^(k-1)), k = 1, 2, ....
Then φ^(k) → s as k → ∞, where s satisfies F(s) = s (and hence l'(s) = 0).

Applications for MLE (2)
How to choose α such that F(φ) = φ + α l'(φ) is a contraction mapping?
Note that
  lim_{y→x} |F(x) - F(y)| / |x - y| = |F'(x)| = |1 + α l''(x)| ≤ L < 1
is required.
Optimal α?
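One practical check of a candidate α is to verify numerically that |F'(φ)| = |1 + α l''(φ)| stays below 1 over the parameter range of interest. The sketch below uses α = -1/l''(φ^(0)), the choice made by the SimpleIteration code later in these slides (the grid of φ values is illustrative):

y1 = 125; y2 = 18; y3 = 20; y4 = 24
d2 = function(phi) -y1/(2+phi)^2 - (y2+y3)/(1-phi)^2 - y4/phi^2   # l''(phi)
alpha = -1/d2(0.9)                       # alpha fixed at the initial value phi^(0) = 0.9
phi.grid = seq(0.3, 0.9, by = 0.01)
max(abs(1 + alpha*d2(phi.grid)))         # below 1 on this grid, so F is contractive there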

Parallel Chord Method (1)
The parallel chord method is also called simple iteration.
Fix a constant slope, for example α = -1/l''(φ^(0)), and iterate
  φ^(k) = φ^(k-1) + α l'(φ^(k-1)),
which is the fixed-point iteration with F(φ) = φ + α l'(φ); it converges when 0 ≤ |1 + α l''(φ)| < 1 near the solution.

Parallel Chord Method (2)
[Figure: successive iterates φ^(0), φ^(1), φ^(2) on the graph of l'(φ); every step uses a chord with the same slope]

Plot Parallel Chord Method by R (1)

### Simple iteration ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24

# First and second derivatives of log likelihood #
f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}

x = c(10:80)*0.01
y = f1(x)
plot(x, y, type = 'l', main = "Parallel chord method", xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)

Plot Parallel Chord Method by R (2)

phi0 = 0.25 # Given the initial value 0.25 #
segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)

# Use the tangent line to find the intercept b0 #
b0 = f1(phi0)-f2(phi0)*phi0
curve(f2(phi0)*x+b0, add = T, col = "red")

phi1 = -b0/f2(phi0) # Find the closer phi #
segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)

# Use the parallel line to find the intercept b1 #
b1 = f1(phi1)-f2(phi0)*phi1
curve(f2(phi0)*x+b1, add = T, col = "red")

Define Functions for Example 1 in R
We will define some functions and variables for finding the MLE in Example 1 by R.

# First, second and third derivatives of log likelihood #
f1 = function(y1, y2, y3, y4, phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 = function(y1, y2, y3, y4, phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
f3 = function(y1, y2, y3, y4, phi) {2*y1*(2+phi)^(-3)-2*(y2+y3)*(1-phi)^(-3)+2*y4*(phi)^(-3)}

# Fisher Information #
I = function(y1, y2, y3, y4, phi)
  {(-1)*(y1+y2+y3+y4)*(1/(4*(2+phi))+1/(2*(1-phi))+1/(4*phi))}

y1 = 125; y2 = 18; y3 = 20; y4 = 24; initial = 0.9

Parallel Chord Method by R (1)

> fix(SimpleIteration)
function(y1, y2, y3, y4, initial){
  phi = NULL;
  i = 0;
  alpha = -1.0/f2(y1, y2, y3, y4, initial);
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-5){
    i = i+1;
    phi1 = phi2;
    phi2 = alpha*f1(y1, y2, y3, y4, phi1)+phi1;
    phi[i] = phi2;
  }
  print("By parallel chord method(simple iteration)");
  return(list(phi = phi2, iteration = phi));
}

Parallel Chord Method by R (2)

> SimpleIteration(y1, y2, y3, y4, initial)

Parallel Chord Method by C/C++
[C/C++ code not reproduced in this transcript]

Newton-Raphson Method (1)
Expand the score around the current iterate and set it to zero:
  0 = l'(s) ≈ l'(φ^(k-1)) + (s - φ^(k-1)) l''(φ^(k-1)),
so
  s ≈ φ^(k-1) - l'(φ^(k-1)) / l''(φ^(k-1)),
and the Newton-Raphson iteration is
  φ^(k) = φ^(k-1) - l'(φ^(k-1)) / l''(φ^(k-1)) ≡ G(φ^(k-1)).

http://math.fullerton.edu/mathews/n2003/Newton'sMethodMod.html
http://en.wikipedia.org/wiki/Newton%27s_method

Newton-Raphson Method (2)
[Figure: successive iterates φ^(0), φ^(1), φ^(2) on the graph of l'(φ); each step follows the tangent line at the current iterate]

Plot Newton-Raphson Method by R (1)

### Newton-Raphson Method ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24

# First and second derivatives of log likelihood #
f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}

x = c(10:80)*0.01
y = f1(x)
plot(x, y, type = 'l', main = "Newton-Raphson method", xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)

Plot Newton-Raphson Method by R (2)

# Given the initial value 0.25 #
phi0 = 0.25
segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)

# Use the tangent line to find the intercept b0 #
b0 = f1(phi0)-f2(phi0)*phi0
curve(f2(phi0)*x+b0, add = T, col = "purple", lwd = 2)

# Find the closer phi #
phi1 = -b0/f2(phi0)
segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)

Plot Newton-Raphson Method by R (3)

# Use the parallel line to find the intercept b1 #
b1 = f1(phi1)-f2(phi0)*phi1
curve(f2(phi0)*x+b1, add = T, col = "red")
curve(f2(phi1)*x-f2(phi1)*phi1+f1(phi1), add = T, col = "blue", lwd = 2)

legend(0.45, 250, c("Newton-Raphson", "Parallel chord method"),
       col = c("blue", "red"), lty = c(1, 1))

Newton-Raphson Method by R (1)

> fix(Newton)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1);
    phi[i] = phi2;
  }
  print("By Newton-Raphson method");
  return (list(phi = phi2, iteration = phi));
}

Newton-Raphson Method by R (2)

> Newton(125, 18, 20, 24, 0.9)
[1] "By Newton-Raphson method"
$phi
[1] 0.577876

$iteration
[1] 0.8193054 0.7068499 0.6092827 0.5792747 0.5778784 0.5778760 0.5778760

Newton-Raphson Method by C/C++
[C/C++ code not reproduced in this transcript]

Halley's Method
The Newton-Raphson iteration function is
  g(x) = x - f(x) / f'(x).
It is possible to speed up convergence by using more expansion terms than the Newton-Raphson method does when the objective function is very smooth, as in the method by Edmond Halley (1656-1742):
  g(x) = x - [f(x) / f'(x)] × [1 - f(x) f''(x) / (2 (f'(x))²)]^(-1).
(http://math.fullerton.edu/mathews/n2003/Halley'sMethodMod.html)

Halley's Method by R (1)

> fix(Halley)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1)*1.0/
      (1.0-f1(y1, y2, y3, y4, phi1)*f3(y1, y2, y3, y4, phi1)/
      (f2(y1, y2, y3, y4, phi1)*f2(y1, y2, y3, y4, phi1)*2.0));
    phi[i] = phi2;
  }
  print("By Halley method");
  return (list(phi = phi2, iteration = phi));
}

Halley's Method by R (2)

> Halley(125, 18, 20, 24, 0.9)
[1] "By Halley method"
$phi
[1] 0.577876

$iteration
[1] 0.5028639 0.5793692 0.5778760 0.5778760

Halley's Method by C/C++
[C/C++ code not reproduced in this transcript]

Bisection Method (1)
Assume that f ∈ C[a, b] and that there exists a number r ∈ [a, b] such that f(r) = 0. If f(a) and f(b) have opposite signs, and (c_n) represents the sequence of midpoints generated by the bisection process, then
  |r - c_n| ≤ (b - a) / 2^(n+1), n = 1, 2, ...,
and the sequence (c_n) converges to r; that is, lim_{n→∞} c_n = r.
http://en.wikipedia.org/wiki/Bisection_method

Bisection Method (2)
[Figure: the first derivative of the log-likelihood with the bracketing interval repeatedly halved around its root]

Plot the Bisection Method by R

### Bisection method ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24

f <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
x = c(1:100)*0.01
y = f(x)
plot(x, y, type = 'l', main = "Bisection method", xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)
abline(v = 0.5, col = "red")
abline(v = 0.75, col = "red")
text(0.49, 2200, labels = "1")
text(0.74, 2200, labels = "2")

Bisection Method by R (1)

> fix(Bisection)
function(y1, y2, y3, y4, A, B) # A, B is the boundary of parameter #
{
  Delta = 1.0E-6; # Tolerance for width of interval #
  Satisfied = 0; # Condition for loop termination #
  phi = NULL;

  YA = f1(y1, y2, y3, y4, A); # Compute function values #
  YB = f1(y1, y2, y3, y4, B);
  # Calculation of the maximum number of iterations #
  Max = as.integer(1+floor((log(B-A)-log(Delta))/log(2)));

  # Check to see if the bisection method applies #
  if(((YA >= 0) & (YB >= 0)) || ((YA < 0) & (YB < 0))){
    print("The values of function in boundary point do not differ in sign.");

Bisection Method by R (2)

    print("Therefore, this method is not appropriate here.");
    quit(); # Exit program #
  }
  for(K in 1:Max){
    if(Satisfied == 1)
      break;
    C = (A+B)/2; # Midpoint of interval #
    YC = f1(y1, y2, y3, y4, C); # Function value at midpoint #
    if((K-1) < 100)
      phi[K-1] = C;
    if(YC == 0){
      A = C; # Exact root is found #
      B = C;
    }
    else{
      if((YB*YC) >= 0 ){

Bisection Method by R (3)

        B = C; # Squeeze from the right #
        YB = YC;
      }
      else{
        A = C; # Squeeze from the left #
        YA = YC;
      }
    }
    if((B-A) < Delta)
      Satisfied = 1; # Check for early convergence #
  } # End of 'for'-loop #

  print("By Bisection Method");
  return(list(phi = C, iteration = phi));
}

Bisection Method by R (4)

> Bisection(125, 18, 20, 24, 0.25, 1)
[1] "By Bisection Method"
$phi
[1] 0.5778754

$iteration
 [1] 0.4375000 0.5312500 0.5781250 0.5546875 0.5664062 0.5722656 0.5751953
 [8] 0.5766602 0.5773926 0.5777588 0.5779419 0.5778503 0.5778961 0.5778732
[15] 0.5778847 0.5778790 0.5778761 0.5778747 0.5778754

Bisection Method by C/C++ (1)
[C/C++ code not reproduced in this transcript]

Bisection Method by C/C++ (2)
[C/C++ code not reproduced in this transcript]

Secant Method
Replace l''(φ^(k-1)) in the Newton-Raphson step by the slope of the secant line through the two most recent iterates:
  l''(φ^(k-1)) ≈ slope = [l'(φ^(k-1)) - l'(φ^(k-2))] / (φ^(k-1) - φ^(k-2)),
so that
  φ^(k) = φ^(k-1) - l'(φ^(k-1)) / slope.
More:
http://en.wikipedia.org/wiki/Secant_method
http://math.fullerton.edu/mathews/n2003/SecantMethodMod.html

Secant Method by R (1)

> fix(Secant)
function(y1, y2, y3, y4, initial1, initial2){
  phi = NULL;
  phi2 = initial1;
  phi1 = initial2;
  alpha = (phi2-phi1)/(f1(y1, y2, y3, y4, phi2)-f1(y1, y2, y3, y4, phi1));
  phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1);
  i = 0;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;

Secant Method by R (2)

    alpha = (phi2-phi1)/(f1(y1, y2, y3, y4, phi2)-f1(y1, y2, y3, y4, phi1));
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1);
    phi[i] = phi2;
  }

  print("By Secant method");
  return (list(phi = phi2, iteration = phi));
}

Secant Method by R (3)

> Secant(125, 18, 20, 24, 0.9, 0.05)
[1] "By Secant method"
$phi
[1] 0.577876

$iteration
[1] 0.2075634 0.4008018 0.5760396 0.5778801 0.5778760 0.5778760

Secant Method by C/C++
[C/C++ code not reproduced in this transcript]

Secant-Bracket Method
The secant-bracket method is also called the regula falsi (false position) method. Starting from a bracket [A, B] with l'(A) and l'(B) of opposite signs, the next point is where the secant line through (A, l'(A)) and (B, l'(B)) crosses zero:
  C = (B l'(A) - A l'(B)) / (l'(A) - l'(B)),
and the bracket is then updated to [A, C] or [C, B] so that the sign change, and hence the root, stays inside it.

Secant-Bracket Method by R (1)

> fix(RegularFalsi)
function(y1, y2, y3, y4, A, B){
  phi = NULL;
  i = -1;
  Delta = 1.0E-6; # Tolerance for width of interval #
  Satisfied = 1; # Condition for loop termination #

  # Endpoints of the interval [A,B] #
  YA = f1(y1, y2, y3, y4, A); # compute function values #
  YB = f1(y1, y2, y3, y4, B);

  # Check to see if the bisection method applies #
  if(((YA >= 0) & (YB >= 0)) || ((YA < 0) & (YB < 0))){
    print("The values of function in boundary point do not differ in sign");
    print("Therefore, this method is not appropriate here");
    q(); # Exit program #
  }

Secant-Bracket Method by R (2)

  while(Satisfied){
    i = i+1;
    C = (B*f1(y1, y2, y3, y4, A)-A*f1(y1, y2, y3, y4, B))/
      (f1(y1, y2, y3, y4, A)-f1(y1, y2, y3, y4, B)); # Midpoint of interval #
    YC = f1(y1, y2, y3, y4, C); # Function value at midpoint #
    phi[i] = C;
    if(YC == 0){ # First 'if' #
      A = C; # Exact root is found #
      B = C;
    }
    else{
      if((YB*YC) >= 0 ){
        B = C; # Squeeze from the right #
        YB = YC;

Secant-Bracket Method by R (3)

      }
      else{
        A = C; # Squeeze from the left #
        YA = YC;
      }
    }
    if(f1(y1, y2, y3, y4, C) < Delta)
      Satisfied = 0; # Check for early convergence #
  }

  print("By Regular Falsi Method")
  return(list(phi = C, iteration = phi));
}

Secant-Bracket Method by R (4)

> RegularFalsi(y1, y2, y3, y4, 0.9, 0.05)
[1] "By Regular Falsi Method"
$phi
[1] 0.577876
$iteration
 [1] 0.5758648 0.5765007 0.5769352 0.5772324 0.5774356 0.5775746 0.5776698
 [8] 0.5777348 0.5777794 0.5778099 0.5778307 0.5778450 0.5778548 0.5778615
[15] 0.5778660 0.5778692 0.5778713 0.5778728 0.5778738 0.5778745 0.5778750
[22] 0.5778753 0.5778755 0.5778756 0.5778757 0.5778758 0.5778759 0.5778759
[29] 0.5778759 0.5778759 0.5778759 0.5778760 0.5778760 0.5778760 0.5778760
[36] 0.5778760 0.5778760

Secant-Bracket Method by C/C++ (1)
[C/C++ code not reproduced in this transcript]

Secant-Bracket Method by C/C++ (2)
[C/C++ code not reproduced in this transcript]

Fisher Scoring Method
The Fisher scoring method replaces l''(φ^(k-1)) in the Newton-Raphson step by its expectation
  E[l''(φ^(k-1))] = -I(φ^(k-1)),
where I(φ) is the Fisher information matrix when the parameter may be multivariate.

Fisher Scoring Method by R (1)

> fix(Fisher)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/I(y1, y2, y3, y4, phi1);
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/I(y1, y2, y3, y4, phi1);
    phi[i] = phi2;
  }

  print("By Fisher method");
  return(list(phi = phi2, iteration = phi));
}

Fisher Scoring Method by R (2)

> Fisher(125, 18, 20, 24, 0.9)
[1] "By Fisher method"
$phi
[1] 0.577876

$iteration
[1] 0.5907181 0.5785331 0.5779100 0.5778777 0.5778761 0.5778760

Fisher Scoring Method by C/C++
[C/C++ code not reproduced in this transcript]

Order of Convergence
The order of convergence is p if
  lim sup_{k→∞} |x^(k+1) - s| / |x^(k) - s|^p = c with 0 < c < ∞,
and c < 1 when p = 1.
More: http://en.wikipedia.org/wiki/Order_of_convergence
Note: ln|x^(k+1) - s| ≈ ln(c) + p ln|x^(k) - s| as k → ∞.
Hence, we can use regression to estimate p.

Theorem for Newton-Raphson Method
If F ∈ C¹(I), I = [a, b], F: I → I is a contraction mapping with F'(x) ≠ 0 for all x ∈ I, then p = 1 and
  lim sup_{k→∞} |x^(k+1) - s| / |x^(k) - s| = |F'(s)| < 1.
If l''' exists on I = [a, b] and l' has a simple zero at s (l'(s) = 0, l''(s) ≠ 0), then there exists δ > 0 such that the iteration function G of the Newton-Raphson method is a contraction mapping on I₁ = [s - δ, s + δ] and p = 2.

Find Convergence Order by R (1)

> # Convergence order #
> # The Newton method below can be replaced by any of the other methods #
> R = Newton(y1, y2, y3, y4, initial)
[1] "By Newton-Raphson method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     0.6727       2.0441

Find Convergence Order by R (2)

> R = Fisher(y1, y2, y3, y4, initial)
[1] "By Fisher method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -2.942        1.004

> R = Bisection(y1, y2, y3, y4, 0.25, 1)
[1] "By Bisection Method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
    -1.9803       0.8448

> R = SimpleIteration(y1, y2, y3, y4, initial)
[1] "By parallel chord method(simple iteration)"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
   -0.01651      1.01809

> R = Secant(y1, y2, y3, y4, initial, 0.05)
[1] "By Secant method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -1.245        1.869

Find Convergence Order by R (3)

> R = RegularFalsi(y1, y2, y3, y4, initial, 0.05)
[1] "By Regular Falsi Method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
    -0.2415       1.0134

Find Convergence Order by C/C++
[C/C++ code not reproduced in this transcript]

Exercises
Write your own programs for the examples presented in this talk.
Write programs for the examples mentioned at the following web page:
http://en.wikipedia.org/wiki/Maximum_likelihood
Write programs for the other examples that you know.

More Exercises (1)
Example 3 in genetics: The observed data are
  (n_O, n_A, n_B, n_AB) = (176, 182, 60, 17) ~ Multinomial(n; r², p² + 2pr, q² + 2qr, 2pq),
where p, q, and r fall in [0, 1] such that p + q + r = 1.
Find the MLEs for p, q, and r.

More Exercises (2)
Example 4 in positron emission tomography (PET): The observed data are
  n*(d) ~ Poisson(λ*(d)), d = 1, 2, ..., D,
where
  λ*(d) = Σ_{b=1}^{B} p(b, d) λ(b).
The values of p(b, d) are known and the unknown parameters are λ(b), b = 1, 2, ..., B.
Find the MLEs for λ(b), b = 1, 2, ..., B.

More Exercises (3)
Example 5 in the normal mixture: The observed data X_i, i = 1, 2, ..., n, are random samples from the following probability density function:
  f(x_i) ~ Σ_{k=1}^{K} ω_k Normal(μ_k, σ_k²), with Σ_{k=1}^{K} ω_k = 1 and 0 ≤ ω_k ≤ 1 for all k.
Find the MLEs for the following parameters:
  (ω_1, ..., ω_K, μ_1, ..., μ_K, σ_1, ..., σ_K).