An Introduction to the EM Algorithm
By Naala Brewer and Kehinde Salau

Project Advisor – Prof. Randy Eubank
Advisor – Prof. Carlos Castillo-Chavez
MTBI, Arizona State University

Outline

•History of the EM Algorithm

•Theory behind the EM Algorithm

•Biological Examples including derivations, coding in R, Matlab, C++

•Graphs of iterations and convergence

Brief History of the EM Algorithm

•Method frequently referenced throughout field of statistics

•Term coined in 1977 paper by Arthur Dempster, Nan Laird, and Donald Rubin

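Breakdown of the EM Task

•To compute MLEs of unknown parameters in probabilistic models with latent (unobserved) variables

•E-step: computes the expectation of the complete-data log-likelihood, given the observed data and the current parameter estimate

•M-step: computes updated MLEs of the unknown parameters by maximizing that expectation

•Repeat until convergence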

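Generalization of the EM Algorithm

•X: full sample (complete data, including the latent variables) ~ $f(x; \theta)$

•Y: observed sample (incomplete data) ~ $f(y; \theta)$, such that $y(x) = y$

•We define $Q(\theta; \theta_p) = E[\ln f(x; \theta) \mid y, \theta_p]$

•$\theta_{p+1}$ is obtained by solving $\partial Q(\theta; \theta_p) / \partial \theta = 0$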

Generalization (cont.)

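•Iterations continue until $|\theta_{p+1} - \theta_p|$ or $|Q(\theta_{p+1}; \theta_p) - Q(\theta_p; \theta_p)|$ is sufficiently small

•Under regularity conditions, the limiting value of $\theta$ is a stationary point of the likelihood (typically a local maximum, not necessarily the global one)

•Likelihood is nondecreasing with each iteration: since $Q(\theta_{p+1}; \theta_p) \ge Q(\theta_p; \theta_p)$, it follows that $L(\theta_{p+1}) \ge L(\theta_p)$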
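A brief sketch of why the inequality on Q carries over to the likelihood, in the notation of the slides. The conditional density $k(x \mid y; \theta)$ of the complete data given the observed data and the function $H$ are introduced here for illustration only; they do not appear in the original slides.

```latex
\begin{align*}
\ln L(\theta) &= \ln f(y;\theta) = Q(\theta;\theta_p) - H(\theta;\theta_p),
  \qquad H(\theta;\theta_p) = E[\ln k(x \mid y;\theta) \mid y, \theta_p] \\
H(\theta;\theta_p) &\le H(\theta_p;\theta_p) \quad \text{for all } \theta
  \quad \text{(Jensen's inequality)} \\
\ln L(\theta_{p+1}) - \ln L(\theta_p)
  &= \big[Q(\theta_{p+1};\theta_p) - Q(\theta_p;\theta_p)\big]
   - \big[H(\theta_{p+1};\theta_p) - H(\theta_p;\theta_p)\big] \ge 0
\end{align*}
```

The first bracket is nonnegative because the M-step maximizes Q, and the second is nonpositive by the Jensen bound, so the log-likelihood cannot decrease.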

Binomial Distribution – Bin(n,p)

Example 1 – Household Model

•n: number of people; p: probability of getting the disease

•Derivation

•Graphs

Binomial Distribution - Derivation

Binomial Derivation (cont.)

Example 2 – Population of Animals

Rao (1965, pp. 368-369), Genetic Linkage Model

•Suppose 197 animals are distributed multinomially into four categories, y = (125, 18, 20, 34) = (y1, y2, y3, y4)

•A genetic model for the population specifies cell probabilities (½ + ¼π, ¼ - ¼π, ¼ - ¼π, ¼π)

•Represent y as incomplete data, y1 = x1 + x2, y2 = x3, y3 = x4, y4 = x5, where the complete data x = (x1, ..., x5) have cell probabilities (½, ¼π, ¼ - ¼π, ¼ - ¼π, ¼π)

Multinomial Distribution-Derivation

Multinomial Derivation (cont.)
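The equations on the derivation slides did not survive extraction. The following is a sketch of the E-step and M-step that matches the update formulas in the code listings, assuming complete-data cell probabilities $(\tfrac12, \tfrac14\pi, \tfrac14 - \tfrac14\pi, \tfrac14 - \tfrac14\pi, \tfrac14\pi)$. The symbols $x_2^{(k)}$ and $\pi^{(k)}$ correspond to `x2k` and `pik` in the code.

```latex
\begin{align*}
\ln f(x;\pi) &= \text{const} + (x_2 + x_5)\ln\pi + (x_3 + x_4)\ln(1-\pi) \\
\text{E-step:}\quad x_2^{(k)} &= E[x_2 \mid y, \pi^{(k)}]
  = y_1\,\frac{\tfrac14\pi^{(k)}}{\tfrac12 + \tfrac14\pi^{(k)}} \\
\text{M-step:}\quad \pi^{(k+1)} &= \frac{x_2^{(k)} + y_4}{x_2^{(k)} + y_2 + y_3 + y_4}
\end{align*}
```

Only $x_2$ needs to be imputed: $x_3$, $x_4$ and $x_5$ are observed directly as $y_2$, $y_3$ and $y_4$, and $x_1$ does not enter the terms that depend on $\pi$.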

Multinomial Coding

Example 2 – Population of Animals

•R Coding

•Matlab Coding

•C++ Coding

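R Coding

#initial vector of data
y <- c(125, 18, 20, 34)
#Initial value for unknown parameter
pik <- .5
for (k in 1:10) {
  # E-step: expected count in the latent cell x2, given the current pik
  x2k <- y[1] * (.25 * pik) / (.5 + .25 * pik)
  # M-step: update pik from the completed data
  pik <- (x2k + y[4]) / (x2k + sum(y[2:4]))
}
print(c(x2k, pik)) #Convergent values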

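Matlab Coding

%initial vector of data
y = [125, 18, 20, 34];
%Initial value for unknown parameter
pik = .5;
for k = 1:10
    % E-step: expected count in the latent cell x2, given the current pik
    x2k = y(1)*(.25*pik)/(.5 + .25*pik);
    % M-step: update pik from the completed data
    pik = (x2k + y(4))/(x2k + sum(y(2:4)));
end
%Convergent values
[x2k,pik]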

C++ Coding

#include <iostream>

int main() {
    // Observed counts y1..y4, e.g. 125 18 20 34
    int y1, y2, y3, y4;
    double pik, x2k;
    std::cout << "enter vector of values, there should be four inputs\n";
    std::cin >> y1 >> y2 >> y3 >> y4;
    std::cout << "enter value for pik\n";
    std::cin >> pik;
    for (int counter = 0; counter < 10; counter++) {
        // E-step: expected count in the latent cell x2, given the current pik
        x2k = y1 * (0.25 * pik) / (0.5 + 0.25 * pik);
        // M-step: update pik from the completed data
        pik = (x2k + y4) / (x2k + y2 + y3 + y4);
        std::cout << "x2k is " << x2k << " and pik is " << pik << std::endl;
    }
    return 0;
}
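The listings above run a fixed 10 iterations. As a minimal sketch, the same update can instead use the stopping rule from the generalization slides, iterating until successive estimates differ by less than a tolerance. The function names `em_step_linkage` and `em_linkage`, the default tolerance and the iteration cap are assumptions made for illustration; they are not part of the original slides.

```cpp
#include <cassert>
#include <cmath>

// One EM update for the genetic linkage example.
// y1..y4 are the observed counts; pik is the current estimate of pi.
double em_step_linkage(double y1, double y2, double y3, double y4, double pik) {
    // E-step: expected count in the latent cell x2.
    double x2k = y1 * (0.25 * pik) / (0.5 + 0.25 * pik);
    // M-step: closed-form maximizer of the expected complete-data log-likelihood.
    return (x2k + y4) / (x2k + y2 + y3 + y4);
}

// Iterate until successive estimates differ by less than tol, or until
// max_iter updates have been made (both defaults are illustrative choices).
double em_linkage(double y1, double y2, double y3, double y4,
                  double pik, double tol = 1e-10, int max_iter = 1000) {
    assert(pik >= 0.0 && pik <= 1.0);  // pi is a probability
    for (int k = 0; k < max_iter; ++k) {
        double next = em_step_linkage(y1, y2, y3, y4, pik);
        bool converged = std::fabs(next - pik) < tol;
        pik = next;
        if (converged) break;
    }
    return pik;
}
```

For the data y = (125, 18, 20, 34), the observed-data score equation reduces to 197π² - 15π - 68 = 0, so a call such as `em_linkage(125, 18, 20, 34, 0.5)` can be checked against the positive root of that quadratic, approximately 0.6268.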

Graph of Convergence of Unknowns, πk and x2 (Multinomial Distribution)

Example 3 – Failure Times

Flury and Zoppè (2000)

▫Suppose the lifetime of light bulbs follows an exponential distribution with mean θ

▫The failure times (u1,...,un) are known for n light bulbs

▫In another experiment, m light bulbs (v1,...,vm) are tested, but no individual failure times are recorded; only the number of bulbs, r, that have failed by time t0 is recorded

Exponential Distribution - Derivation
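The equations on this slide did not survive extraction. The following is a sketch of the derivation under the stated model, assuming the u_i and v_j are independent exponential lifetimes with mean θ and that r counts the v_j with v_j ≤ t0. The symbol $\bar{u}$ (sample mean of the observed failure times) is introduced here for illustration.

```latex
\begin{align*}
\ln f(u, v; \theta) &= -(n+m)\ln\theta
  - \frac{1}{\theta}\Big(\sum_{i=1}^{n} u_i + \sum_{j=1}^{m} v_j\Big) \\
E[v_j \mid v_j \le t_0, \theta_p] &= \theta_p - \frac{t_0\, e^{-t_0/\theta_p}}{1 - e^{-t_0/\theta_p}},
  \qquad E[v_j \mid v_j > t_0, \theta_p] = t_0 + \theta_p \\
\theta_{p+1} &= \frac{1}{n+m}\Big[\, n\bar{u}
  + r\Big(\theta_p - \frac{t_0\, e^{-t_0/\theta_p}}{1 - e^{-t_0/\theta_p}}\Big)
  + (m-r)(t_0 + \theta_p) \Big]
\end{align*}
```

The complete-data log-likelihood is linear in the unobserved v_j, so the E-step only needs their conditional means, and the M-step is the usual exponential MLE (total lifetime divided by the number of bulbs) applied to the completed data.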

Exponential Derivation (cont.)
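A minimal C++ sketch of the EM iteration for the failure-time example, assuming r is the number of bulbs in the second experiment that have failed by time t0. Each unobserved lifetime is replaced by its conditional mean under the current θ, and θ is then re-estimated as the completed total lifetime divided by n + m. The function names, the tolerance and the iteration cap are hypothetical, and the slides give no data for this example, so any inputs passed in are the caller's own.

```cpp
#include <cassert>
#include <cmath>

// One EM update for the failure-time example.
// n, ubar : number and sample mean of the fully observed lifetimes u_i
// m, r    : bulbs in the second experiment, and how many had failed by t0
// theta   : current estimate of the mean lifetime
double em_step_exp(int n, double ubar, int m, int r, double t0, double theta) {
    assert(theta > 0.0 && t0 > 0.0 && n + m > 0);
    double q = std::exp(-t0 / theta);            // P(lifetime > t0)
    // E-step: conditional means of the unobserved lifetimes.
    double failed = theta - t0 * q / (1.0 - q);  // E[v | v <= t0]
    double surviving = t0 + theta;               // E[v | v >  t0]
    // M-step: exponential MLE on the completed data.
    return (n * ubar + r * failed + (m - r) * surviving) / (n + m);
}

// Iterate until successive estimates differ by less than tol, or until
// max_iter updates have been made (both defaults are illustrative choices).
double em_exponential(int n, double ubar, int m, int r, double t0,
                      double theta, double tol = 1e-10, int max_iter = 10000) {
    for (int p = 0; p < max_iter; ++p) {
        double next = em_step_exp(n, ubar, m, r, t0, theta);
        bool converged = std::fabs(next - theta) < tol;
        theta = next;
        if (converged) break;
    }
    return theta;
}
```

As a sanity check on the design, when the second experiment is empty (m = 0, r = 0) a single update returns the sample mean of the observed lifetimes, which is the ordinary exponential MLE.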

Failure Times Example: Graphs

Future Work

•More Biological Examples

References

[1] Dempster, A.P., Laird, N.M., Rubin, D.B. (1977). Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 39, No. 1, pp. 1-38.

[2] Redner, R.A., Walker, H.F. (1984). Mixture Densities, Maximum Likelihood and the EM Algorithm. SIAM Review, Vol. 26, No. 2, pp. 195-239.

[3] Tanner, M.A. (1996). Tools for Statistical Inference. Third Edition. Springer-Verlag New York, Inc.
