
  • Slide 1/41

    Parameter Estimation

    ML vs. MAP

    Peter N Robinson

    December 14, 2012

  • Slide 2/41

    Estimating Parameters from Data

    In many situations in bioinformatics, we want to estimate optimal parameters from data. In the examples we have seen in the lectures on variant calling, these parameters might be the error rate for reads, the proportion of a certain genotype, the proportion of non-reference bases, etc. However, the "hello world" example for this sort of thing is the coin toss, so we will start with that.

  • Slide 3/41

    Coin toss

    Let's say we have two coins that are each tossed 10 times:

    Coin 1: H,T,T,H,H,H,T,H,T,T
    Coin 2: T,T,T,H,T,T,T,H,T,T

    Intuitively, we might guess that coin 1 is a fair coin, i.e., P(X = H) = 0.5, and that coin 2 is biased, i.e., P(X = H) ≠ 0.5.

  • Slide 4/41

    Discrete Random Variable

    Let us begin to formalize this. We model the coin-toss process as follows.

    The outcome of a single coin toss is a random variable X that can take on values in a set \mathcal{X} = \{x_1, x_2, \ldots, x_n\}.

    In our example, of course, n = 2, and the values are x_1 = 0 (tails) and x_2 = 1 (heads).

    We then have a probability mass function p: \mathcal{X} \to [0, 1]; the law of total probability states that \sum_{x_i \in \mathcal{X}} p(x_i) = 1. This is a Bernoulli distribution with parameter θ:

        p(X = 1; \theta) = \theta    (1)
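    As a quick illustration, the Bernoulli pmf can be evaluated in R with dbinom and size = 1 (the value of theta below is an assumed example, not from the slides):

    > theta <- 0.7                        # an assumed value of the parameter
    > dbinom(1, size = 1, prob = theta)   # p(X = 1; theta) = theta
    [1] 0.7
    > dbinom(0, size = 1, prob = theta)   # p(X = 0; theta) = 1 - theta
    [1] 0.3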

  • Slide 5/41

    Probability of sequence of events

    In general, for a sequence of two events X_1 and X_2, the joint probability is

        P(X_1, X_2) = p(X_2 | X_1) \, p(X_1)    (2)

    Since we assume that the sequence is i.i.d. (independent and identically distributed), by definition p(X_2 | X_1) = p(X_2). Thus, for a sequence of n events (coin tosses), we have

        p(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} p(x_i; \theta)    (3)

    If the probability of heads is 30%, the probability of the sequence observed for coin 2 can be calculated as

        p(T,T,T,H,T,T,T,H,T,T; \theta) = \theta^2 (1-\theta)^8 = \left(\frac{3}{10}\right)^2 \left(\frac{7}{10}\right)^8    (4)
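    As a quick numerical sanity check, equation (4) can be evaluated in R (theta is an illustrative variable name, not from the slides):

    > theta <- 0.3
    > theta^2 * (1 - theta)^8
    [1] 0.005188321
    > (3/10)^2 * (7/10)^8
    [1] 0.005188321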

  • Slide 6/41

    Probability of sequence of events

    Thus far, we have considered p(x; θ) as a function of x, parametrized by θ. If we view p(x; θ) as a function of θ, then it is called the likelihood function.

    Maximum likelihood estimation chooses the value of θ that maximizes the likelihood function given the observed data.

  • Slide 7/41

    Maximum likelihood for Bernoulli

    The likelihood for a sequence of i.i.d. Bernoulli random variables X = [x_1, x_2, \ldots, x_n] with x_i \in \{0, 1\} is then

        p(X; \theta) = \prod_{i=1}^{n} p(x_i; \theta) = \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}    (5)

    We usually maximize the log likelihood function rather than the original function:

    It is often easier to take the derivative

    The log function is monotonically increasing; thus the maximum (argmax) is the same

    It avoids the numerical problems involved with multiplying lots of small numbers
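    To make this concrete, here is a small R sketch (function and variable names are illustrative) that evaluates the Bernoulli log likelihood for the coin 2 data and checks it against the product form:

    > loglik <- function(theta, x) sum(x * log(theta) + (1 - x) * log(1 - theta))
    > x <- c(0,0,0,1,0,0,0,1,0,0)                      # coin 2, with H coded as 1
    > loglik(0.2, x)
    [1] -5.004024
    > log(prod(dbinom(x, size = 1, prob = 0.2)))       # same value via the product form
    [1] -5.004024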

  • Slide 8/41

    Log likelihood

    Thus, instead of maximizing this

        p(X; \theta) = \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}    (6)

    we maximize this

        \log p(X; \theta) = \log \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i}
                          = \sum_{i=1}^{n} \log \theta^{x_i} (1-\theta)^{1-x_i}
                          = \sum_{i=1}^{n} \left[ \log \theta^{x_i} + \log (1-\theta)^{1-x_i} \right]
                          = \sum_{i=1}^{n} \left[ x_i \log \theta + (1-x_i) \log(1-\theta) \right]

  • Slide 9/41

    Log likelihood

    Note that one often denotes the log likelihood function with the symbol L = \log p(X; \theta).

    A function f defined on a subset of the real numbers with real values is called monotonic (also monotonically increasing, increasing, or non-decreasing) if for all x and y such that x \le y one has f(x) \le f(y).

    Thus, the monotonicity of the log function guarantees that

        \operatorname{argmax}_{\theta} \, p(X; \theta) = \operatorname{argmax}_{\theta} \, \log p(X; \theta)    (7)
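    A quick numerical illustration of this invariance in R (the grid of θ values is an assumed example): the likelihood and the log likelihood peak at the same θ for the coin 1 data:

    > x <- c(1,0,0,1,1,1,0,1,0,0)                         # coin 1, H = 1, T = 0
    > theta <- seq(0.01, 0.99, by = 0.01)
    > lik <- sapply(theta, function(t) prod(dbinom(x, 1, t)))
    > theta[which.max(lik)]
    [1] 0.5
    > theta[which.max(log(lik))]
    [1] 0.5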

  • Slide 10/41

    ML estimate

    The ML estimate of the parameter θ is then

        \hat{\theta}_{ML} = \operatorname{argmax}_{\theta} \sum_{i=1}^{n} \left[ x_i \log \theta + (1-x_i) \log(1-\theta) \right]    (8)

    We can calculate the argmax by setting the first derivative equal to zero and solving for θ.

  • Slide 11/41

    ML estimate

    Thus

        \frac{\partial}{\partial\theta} \log p(X; \theta)
            = \frac{\partial}{\partial\theta} \sum_{i=1}^{n} \left[ x_i \log \theta + (1-x_i) \log(1-\theta) \right]
            = \left( \frac{\partial}{\partial\theta} \log \theta \right) \sum_{i=1}^{n} x_i + \left( \frac{\partial}{\partial\theta} \log(1-\theta) \right) \sum_{i=1}^{n} (1-x_i)
            = \frac{1}{\theta} \sum_{i=1}^{n} x_i - \frac{1}{1-\theta} \sum_{i=1}^{n} (1-x_i)

  • Slide 12/41

    ML estimate

    and finally, to find the maximum we set \frac{\partial}{\partial\theta} \log p(X; \theta) = 0:

        0 = \frac{1}{\theta} \sum_{i=1}^{n} x_i - \frac{1}{1-\theta} \sum_{i=1}^{n} (1-x_i)

        \frac{1-\theta}{\theta} = \frac{\sum_{i=1}^{n} (1-x_i)}{\sum_{i=1}^{n} x_i}

        \frac{1}{\theta} - 1 = \frac{\sum_{i=1}^{n} 1}{\sum_{i=1}^{n} x_i} - 1

        \frac{1}{\theta} = \frac{n}{\sum_{i=1}^{n} x_i}

        \hat{\theta}_{ML} = \frac{1}{n} \sum_{i=1}^{n} x_i

    Reassuringly, the maximum likelihood estimate is just the proportion of flips that came out heads.
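    For the two coins from the earlier slide this is simply the sample mean of the 0/1-coded tosses; a one-line R check (variable names are illustrative):

    > coin1 <- c(1,0,0,1,1,1,0,1,0,0)    # coin 1, H = 1, T = 0
    > coin2 <- c(0,0,0,1,0,0,0,1,0,0)    # coin 2
    > mean(coin1)
    [1] 0.5
    > mean(coin2)
    [1] 0.2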

  • Slide 13/41

    Problems with ML estimation

    Does it really make sense that

        H,T,H,T  ⇒  \hat{\theta}_{ML} = 0.5
        H,T,T,T  ⇒  \hat{\theta}_{ML} = 0.25
        T,T,T,T  ⇒  \hat{\theta}_{ML} = 0.0  ?

    ML estimation does not incorporate any prior knowledge and does not generate an estimate of the certainty of its results.

  • Slide 14/41

    Maximum a posteriori Estimation

    Bayesian approaches try to reflect our belief about θ. In this case, we will consider θ to be a random variable.

        p(\theta | X) = \frac{p(X | \theta) \, p(\theta)}{p(X)}    (9)

    Thus, Bayes' law converts our prior belief about the parameter θ (before seeing data) into a posterior probability, p(θ | X), by using the likelihood function p(X | θ). The maximum a posteriori (MAP) estimate is defined as

        \hat{\theta}_{MAP} = \operatorname{argmax}_{\theta} \, p(\theta | X)    (10)

  • Slide 15/41

    Maximum a posteriori Estimation

    Note that because p(X) does not depend on θ, we have

        \hat{\theta}_{MAP} = \operatorname{argmax}_{\theta} \, p(\theta | X)
                           = \operatorname{argmax}_{\theta} \, \frac{p(X | \theta) \, p(\theta)}{p(X)}
                           = \operatorname{argmax}_{\theta} \, p(X | \theta) \, p(\theta)

    This is essentially the basic idea of the MAP equation used by SNVMix for variant calling.

  • Slide 16/41

    MAP Estimation; What does it buy us?

    To take a simple example of a situation in which MAP estimation might produce better results than ML estimation, let us consider a statistician who wants to predict the outcome of the next election in the USA.

    The statistician is able to gather data on party preferences by asking people he meets at the Wall Street Golf Club¹ which party they plan on voting for in the next election.

    The statistician asks 100 people, seven of whom answer "Democrats". This can be modeled as a series of Bernoullis, just like the coin tosses.

    In this case, the maximum likelihood estimate of the proportion of voters in the USA who will vote Democratic is \hat{\theta}_{ML} = 0.07.

    ¹ i.e., a notorious haven of ultraconservative Republicans

  • Slide 17/41

    MAP Estimation; What does it buy us?

    Somehow, the estimate \hat{\theta}_{ML} = 0.07 doesn't seem quite right, given our previous experience that about half of the electorate votes Democratic and half votes Republican. But how should the statistician incorporate this prior knowledge into his prediction for the next election?

    The MAP estimation procedure allows us to inject our prior beliefs about parameter values into the new estimate.

  • Slide 18/41

    Beta distribution: Background

    The Beta distribution is appropriate for expressing prior belief about a Bernoulli distribution. The Beta distribution is a family of continuous distributions defined on [0, 1] and parametrized by two positive shape parameters, α and β:

        p(\theta) = \frac{1}{B(\alpha, \beta)} \, \theta^{\alpha-1} (1-\theta)^{\beta-1}

    Here, \theta \in [0, 1], and

        B(\alpha, \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}

    where Γ is the Gamma function (an extension of the factorial).

    [Figure: Beta distribution pdfs on [0, 1] for several shape parameters: α = β = 0.5; α = 5, β = 1; α = 1, β = 3; α = β = 2; α = 2, β = 5]

  • Slide 19/41

    Beta distribution: Background

    Random variables are either discrete (i.e., they can assume one of a list of values, like the Bernoulli with heads/tails) or continuous (i.e., they can take on any numerical value in a certain interval, like the Beta distribution with θ ∈ [0, 1]).

    A probability density function (pdf) of a continuous random variable is a function that describes the relative likelihood of this random variable taking on a given value, i.e., p(\theta): \mathbb{R} \to \mathbb{R}^{+} such that

        \Pr(\theta \in (a, b)) = \int_{a}^{b} p(\theta) \, d\theta    (11)

    The probability that the value of θ lies between a and b is given by integrating the pdf over this region.
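    In R, this integral can be checked numerically with integrate() applied to dbeta (the shape parameters 7 and 3 and the interval are illustrative choices):

    > integrate(dbeta, lower = 0.5, upper = 0.8, shape1 = 7, shape2 = 3)$value
    [1] 0.6483538
    > pbeta(0.8, 7, 3) - pbeta(0.5, 7, 3)    # the same probability via the CDF
    [1] 0.6483538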

  • Slide 20/41

    Beta distribution: Background

    Recall the difference between a PDF (for a continuous random variable) and a probability mass function (PMF) for a discrete random variable.

    A PMF is defined as \Pr(X = x_i) = p_i, with 0 \le p_i \le 1 and \sum_i p_i = 1

    e.g., for a fair coin toss (Bernoulli), \Pr(X = \text{heads}) = \Pr(X = \text{tails}) = 0.5

    In contrast, for a PDF, there is no requirement that p(θ) ≤ 1, but we do have

        \int_{-\infty}^{+\infty} p(\theta) \, d\theta = 1    (12)

  • Slide 21/41

    Beta distribution: Background

    We calculate the PDF of the Beta distribution for a sequence of θ values 0, 0.01, 0.02, \ldots, 1.00 in R as follows.
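    The R code itself did not survive the transcript; a minimal sketch consistent with the α = 7, β = 3 plot shown two slides later (the variable names are assumptions) is:

    > x <- seq(0, 1, by = 0.01)                 # grid of theta values from 0 to 1
    > y <- dbeta(x, shape1 = 7, shape2 = 3)     # Beta(7, 3) density at each grid point
    > plot(x, y, type = "l", xlab = "x", ylab = "pdf")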

  • Slide 22/41

    Beta distribution: Background

    The mode of a continuous probability distribution is the value x at which its probability density function attains its maximum value.

    The Beta(α, β) distribution (for α, β > 1) has its mode at

        \frac{\alpha - 1}{\alpha + \beta - 2}    (13)
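    A quick R check of equation (13) for the α = 7, β = 3 example used on the next slide:

    > alpha <- 7; beta <- 3
    > (alpha - 1) / (alpha + beta - 2)
    [1] 0.75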

  • Slide 23/41

    Beta distribution: Background

    The code from the previous slide leads to

    [Figure: pdf of the Beta(α = 7, β = 3) distribution on [0, 1]; the mode is marked with a dotted red line]

    The mode, shown as the dotted red line, is calculated as

        \frac{\alpha - 1}{\alpha + \beta - 2} = \frac{7 - 1}{7 + 3 - 2} = 0.75

  • Slide 24/41

    Maximum a posteriori (MAP) Estimation

    With all of this information in hand, let's get back to MAP estimation!

    Going back to Bayes' rule, we again seek the value of θ that maximizes the posterior Pr(θ | X):

        \Pr(\theta | X) = \frac{\Pr(X | \theta) \, \Pr(\theta)}{\Pr(X)}    (14)

  • Slide 25/41

    Maximum a posteriori (MAP) Estimation

    We then have

        \hat{\theta}_{MAP} = \operatorname{argmax}_{\theta} \, \Pr(\theta | X)
                           = \operatorname{argmax}_{\theta} \, \frac{\Pr(X | \theta) \, \Pr(\theta)}{\Pr(X)}
                           = \operatorname{argmax}_{\theta} \, \Pr(X | \theta) \, \Pr(\theta)
                           = \operatorname{argmax}_{\theta} \, \prod_{x_i \in X} \Pr(x_i | \theta) \, \Pr(\theta)

  • Slide 26/41

    Maximum a posteriori (MAP) Estimation

    As we saw above for maximum likelihood estimation, it is easier to calculate the argmax for the logarithm:

        \operatorname{argmax}_{\theta} \, \Pr(\theta | X) = \operatorname{argmax}_{\theta} \, \log \Pr(\theta | X)
            = \operatorname{argmax}_{\theta} \, \log \left( \prod_{x_i \in X} \Pr(x_i | \theta) \right) \Pr(\theta)
            = \operatorname{argmax}_{\theta} \left[ \sum_{x_i \in X} \log \Pr(x_i | \theta) + \log \Pr(\theta) \right]

  • Slide 27/41

    Maximum a posteriori (MAP) Estimation

    Let's go back now to our problem of predicting the results of the next election. Essentially, we plug in the equations for the distributions of the likelihood (a Bernoulli distribution) and the prior (a Beta distribution):

        \Pr(\theta | X)  \propto  \prod_i \Pr(x_i | \theta)  \cdot  \Pr(\theta)
        posterior            likelihood (Bernoulli)       prior (Beta)

  • Slide 28/41

    Maximum a posteriori (MAP) Estimation

    We thus have that

        \Pr(x_i | \theta) = \text{Bernoulli}(x_i | \theta) = \theta^{x_i} (1-\theta)^{1-x_i}

        \Pr(\theta) = \text{Beta}(\theta | \alpha, \beta) = \frac{1}{B(\alpha, \beta)} \, \theta^{\alpha-1} (1-\theta)^{\beta-1}

    thus

        \Pr(\theta | X) \propto \Pr(X | \theta) \, \Pr(\theta)

    is equivalent to

        \Pr(\theta | X) \propto \left[ \prod_i \text{Bernoulli}(x_i | \theta) \right] \text{Beta}(\theta | \alpha, \beta)    (15)

  • Slide 29/41

    Maximum a posteriori (MAP) Estimation

    Furthermore,

        L = \log \Pr(\theta | X)
          = \log \left[ \prod_i \text{Bernoulli}(x_i | \theta) \cdot \text{Beta}(\theta | \alpha, \beta) \right]
          = \sum_i \log \text{Bernoulli}(x_i | \theta) + \log \text{Beta}(\theta | \alpha, \beta)

    We solve for \hat{\theta}_{MAP} = \operatorname{argmax}_{\theta} L as follows:

        \hat{\theta}_{MAP} = \operatorname{argmax}_{\theta} \left[ \sum_i \log \text{Bernoulli}(x_i | \theta) + \log \text{Beta}(\theta | \alpha, \beta) \right]

    Note that this is almost the same as the ML estimate, except that we now have an additional term resulting from the prior.

  • Slide 30/41

    Maximum a posteriori (MAP) Estimation

    Again, we find the maximum value of θ by setting the first derivative of L equal to zero and solving for θ:

        \frac{\partial L}{\partial\theta} = \frac{\partial}{\partial\theta} \sum_i \log \text{Bernoulli}(x_i | \theta) + \frac{\partial}{\partial\theta} \log \text{Beta}(\theta | \alpha, \beta)

    The first term is the same as for ML², i.e.

        \frac{\partial}{\partial\theta} \sum_i \log \text{Bernoulli}(x_i | \theta) = \frac{1}{\theta} \sum_{i=1}^{n} x_i - \frac{1}{1-\theta} \sum_{i=1}^{n} (1-x_i)    (16)

    ² see slide 11

  • Slide 31/41

    Maximum a posteriori (MAP) Estimation

    To find the second term, we note

        \frac{\partial}{\partial\theta} \log \text{Beta}(\theta | \alpha, \beta)
            = \frac{\partial}{\partial\theta} \log \left[ \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \, \theta^{\alpha-1} (1-\theta)^{\beta-1} \right]
            = \frac{\partial}{\partial\theta} \log \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} + \frac{\partial}{\partial\theta} \log \theta^{\alpha-1} (1-\theta)^{\beta-1}
            = 0 + \frac{\partial}{\partial\theta} \log \theta^{\alpha-1} (1-\theta)^{\beta-1}
            = \frac{\partial}{\partial\theta} \left[ (\alpha-1) \log \theta + (\beta-1) \log(1-\theta) \right]
            = \frac{\alpha-1}{\theta} - \frac{\beta-1}{1-\theta}

  • Slide 32/41

    Maximum a posteriori (MAP) Estimation

    To find \hat{\theta}_{MAP}, we now set \frac{\partial L}{\partial\theta} = 0 and solve for θ:

        0 = \frac{\partial L}{\partial\theta}
          = \frac{1}{\theta} \sum_{i=1}^{n} x_i - \frac{1}{1-\theta} \sum_{i=1}^{n} (1-x_i) + \frac{\alpha-1}{\theta} - \frac{\beta-1}{1-\theta}

    and thus

        \theta \left[ \sum_{i=1}^{n} (1-x_i) + \beta - 1 \right] = (1-\theta) \left[ \sum_i x_i + \alpha - 1 \right]

        \theta \left[ \sum_{i=1}^{n} (1-x_i) + \sum_i x_i + \beta - 1 + \alpha - 1 \right] = \sum_i x_i + \alpha - 1

        \theta \left[ \sum_{i=1}^{n} 1 + \alpha + \beta - 2 \right] = \sum_i x_i + \alpha - 1

  • Slide 33/41

    Maximum a posteriori (MAP) Estimation

    Finally, if we code our Bernoulli distribution as Republican = 1 and Democrat = 0, we have that

        \sum_i x_i = n_R, where n_R denotes the number of Republican voters    (17)

    Then,

        \theta \left[ \sum_{i=1}^{n} 1 + \alpha + \beta - 2 \right] = \sum_i x_i + \alpha - 1

        \theta \left[ n + \alpha + \beta - 2 \right] = n_R + \alpha - 1

    and finally

        \hat{\theta}_{MAP} = \frac{n_R + \alpha - 1}{n + \alpha + \beta - 2}    (18)
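    As a numerical cross-check (illustrative only), the closed form in equation (18) can be compared in R with a direct maximization of the log posterior using optimize(); here n_R = 93, n = 100, and α = β = 5 are the values used on the following slides:

    > nr <- 93; n <- 100; alpha <- 5; beta <- 5
    > logpost <- function(theta) {
    +   loglik   <- nr * log(theta) + (n - nr) * log(1 - theta)               # Bernoulli term
    +   logprior <- (alpha - 1) * log(theta) + (beta - 1) * log(1 - theta)    # Beta term (up to a constant)
    +   loglik + logprior
    + }
    > optimize(logpost, interval = c(0, 1), maximum = TRUE)$maximum   # numerically about 0.898
    > (nr + alpha - 1) / (n + alpha + beta - 2)                       # closed form, equation (18)
    [1] 0.8981481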

  • Slide 34/41

    And now our prediction

    It is useful to compare the ML and the MAP predictions. Note again that α and β act essentially as pseudocounts; the higher their value, the more the prior affects the final prediction (i.e., the posterior).

    Recall that in our poll of 100 members of the Wall Street Golf Club, only seven said they would vote Democratic. Thus

        n = 100
        n_R = 93

    We will assume that the mode of our prior belief is that 50% of the voters will vote Democratic and 50% Republican; thus α = β. However, different values for α and β express different strengths of prior belief.

  • Slide 35/41

    And now our prediction

    n      n_R    alpha    beta     theta_ML    theta_MAP
    100    93     1        1        0.93        0.93
    100    93     5        5        0.93        0.90
    100    93     100      100      0.93        0.64
    100    93     1000     1000     0.93        0.52
    100    93     10000    10000    0.93        0.502

    Thus, MAP pulls the estimate towards the prior to an extent that depends on the strength of the prior
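    The MAP column of this table can be reproduced with a one-line application of equation (18) in R (values are rounded to three decimal places here):

    > nr <- 93; n <- 100
    > alpha <- c(1, 5, 100, 1000, 10000)
    > round((nr + alpha - 1) / (n + 2 * alpha - 2), 3)    # beta = alpha, equation (18)
    [1] 0.930 0.898 0.644 0.520 0.502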

  • Slide 36/41

    The use of MAP in MAQ

    Recall from the lecture that we calculate the posterior probabilities of the three genotypes given the data D, that is, a column with n aligned nucleotides and quality scores, of which k correspond to the reference a and n − k to a variant nucleotide b:

        p(G = \langle a,a \rangle | D) \propto p(D | G = \langle a,a \rangle) \, p(G = \langle a,a \rangle) = \alpha_{n,k} \, (1-r)/2

        p(G = \langle b,b \rangle | D) \propto p(D | G = \langle b,b \rangle) \, p(G = \langle b,b \rangle) = \alpha_{n,n-k} \, (1-r)/2

        p(G = \langle a,b \rangle | D) \propto p(D | G = \langle a,b \rangle) \, p(G = \langle a,b \rangle) = \binom{n}{k} \frac{1}{2^n} \, r

  • Slide 37/41

    MAQ: Consensus Genotype Calling

    Note that MAQ does not attempt to learn the parameters; rather, it uses a user-supplied parameter r, which plays roughly the role that the prior played in the election example.

    MAQ calls the genotype with the highest posterior probability:

        \hat{g} = \operatorname{argmax}_{g \in \{\langle a,a \rangle, \langle a,b \rangle, \langle b,b \rangle\}} \, p(g | D)

    The probability of this genotype is used as a measure of confidence in the call.

  • Slide 38/41

    MAQ: Consensus Genotype Calling

    A major problem in SNV calling is false-positive heterozygous variants. A heterozygous call at a position with a common SNP in the population is less likely to be a false positive. For this reason, MAQ uses a different prior r for previously known SNPs (r = 0.2) than for new SNPs (r = 0.001).

    Let us examine the effect of these two priors on variant calling. In R, we can write

        p(G = \langle a,b \rangle | D) \propto \binom{n}{k} \frac{1}{2^n} \, r

    as

    > dbinom(k,n,0.5) * r

    where k is the number of non-reference bases, n is the total number of bases, and 0.5 corresponds to the probability of seeing a non-reference base given that the true genotype is heterozygous.
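    To see how the prior r changes the full posterior, we also need likelihoods for the two homozygous genotypes, which the slides leave in the form \alpha_{n,k}. The sketch below is not the MAQ error model; it substitutes a simple binomial error model with an assumed per-base error rate eps (the values n = 20 and eps = 0.05 are assumptions for illustration only):

    > het_posterior <- function(k, n, r, eps = 0.05) {
    +   hom_ref <- dbinom(k, n, eps)     * (1 - r) / 2   # k non-ref bases would all have to be errors
    +   hom_alt <- dbinom(n - k, n, eps) * (1 - r) / 2   # the n - k ref bases would all have to be errors
    +   het     <- dbinom(k, n, 0.5)     * r             # heterozygous term from the slide
    +   het / (hom_ref + hom_alt + het)                  # normalize over the three genotypes
    + }
    > k <- 0:20
    > round(het_posterior(k, n = 20, r = 0.001), 2)
    > round(het_posterior(k, n = 20, r = 0.2), 2)

    Under this illustrative error model, 5 ALT bases out of 20 give a heterozygous posterior well below 0.5 with r = 0.001 but above 0.5 with r = 0.2, mirroring the behavior described for the figure on the next slide.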

  • Slide 40/41

    MAQ: Consensus Genotype Calling

    [Figure: posterior probability of the heterozygous genotype as a function of the number of ALT bases k (0 to 20); panel A uses the prior r = 0.001, panel B uses r = 0.2]

    The figures show the posterior for the heterozygous genotype according to the simplified MAQ algorithm discussed in the previous lecture.

    The prior r = 0.001 means that positions with 5 or fewer ALT bases do not get called as heterozygous, whereas the prior r = 0.2 means that positions with 5 ALT bases do get a het call.

  • Slide 41/41

    What have we learned?

    Get used to Bayesian techniques; they are important in many areas of bioinformatics.

    Understand the difference between ML and MAP estimation.

    Understand the Beta distribution, priors, and pseudocounts.

    Note that MAQ is no longer a state-of-the-art algorithm, and its use of the MAP framework is relatively simplistic. Nonetheless, it is a good introduction to this topic, and we will see how these concepts are used in EM today.
