Latent Variable Models

Christopher M. Bishop

1. Density Modeling

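A standard approach: parametric models, governed by a number of adaptive parameters. The Gaussian distribution is widely used:

$$p(\mathbf{t} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-d/2}\, |\boldsymbol{\Sigma}|^{-1/2} \exp\left\{ -\frac{1}{2} (\mathbf{t} - \boldsymbol{\mu})^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{t} - \boldsymbol{\mu}) \right\}$$

Log-likelihood method: for a data set $D = \{\mathbf{t}_1, \dots, \mathbf{t}_N\}$, choose the parameters that maximize

$$L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \ln p(D \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{n=1}^{N} \ln p(\mathbf{t}_n \mid \boldsymbol{\mu}, \boldsymbol{\Sigma})$$

Limitations of the Gaussian:

too flexible: the number of parameters is excessive

not flexible enough: the density can only be unimodal

These limitations motivate mixture models and latent variable models.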
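As a rough illustration of the log-likelihood method for a single Gaussian, the sketch below evaluates $L(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with NumPy and compares the maximum-likelihood estimates against a shifted mean. The function names, the synthetic data and the random seed are assumptions made for this example only; they are not part of the slides.

```python
import numpy as np

def gaussian_log_likelihood(T, mu, Sigma):
    """Return sum_n ln p(t_n | mu, Sigma), where the rows of T are the t_n."""
    N, d = T.shape
    diff = T - mu                                     # shape (N, d)
    # Solve Sigma z = diff^T instead of forming the inverse explicitly.
    solved = np.linalg.solve(Sigma, diff.T).T         # shape (N, d)
    mahalanobis = np.sum(diff * solved, axis=1)       # (t-mu)^T Sigma^{-1} (t-mu)
    _, logdet = np.linalg.slogdet(Sigma)              # ln |Sigma|
    return float(-0.5 * np.sum(mahalanobis)
                 - 0.5 * N * (d * np.log(2.0 * np.pi) + logdet))

def fit_gaussian_ml(T):
    """Maximum-likelihood estimates: sample mean and (1/N) sample covariance."""
    mu = T.mean(axis=0)
    diff = T - mu
    Sigma = diff.T @ diff / T.shape[0]
    return mu, Sigma

# Synthetic correlated data (illustrative assumption).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.7]])
T = rng.normal(size=(500, 3)) @ A + np.array([1.0, -2.0, 0.5])

mu_ml, Sigma_ml = fit_gaussian_ml(T)
L_ml = gaussian_log_likelihood(T, mu_ml, Sigma_ml)
L_shifted = gaussian_log_likelihood(T, mu_ml + 0.5, Sigma_ml)
print(L_ml, L_shifted)  # the ML parameters give the larger value
```

Using `solve` and `slogdet` rather than an explicit inverse and determinant is a numerical convenience, not something the slides prescribe.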

1.1. Latent Variables

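Number of parameters in a normal distribution: $\boldsymbol{\Sigma}$ has $d(d+1)/2$ and $\boldsymbol{\mu}$ has $d$, so the total grows as $d^2$. Assuming a diagonal covariance matrix reduces $\boldsymbol{\Sigma}$ to $d$ parameters, but this means that the components of $\mathbf{t}$ are statistically independent.

Latent variables: the number of degrees of freedom can be controlled while correlations can still be captured.

Goal: express the distribution $p(\mathbf{t})$ of the variables $t_1, \dots, t_d$ in terms of a smaller number of latent variables $\mathbf{x} = (x_1, \dots, x_q)$, where $q < d$.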
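To make the parameter counts for the Gaussian concrete, here is a small sketch that tabulates them for a few dimensions. The function names and the chosen values of d are illustrative assumptions.

```python
def full_gaussian_params(d):
    """Full covariance: d(d+1)/2 for Sigma plus d for the mean."""
    return d * (d + 1) // 2 + d

def diagonal_gaussian_params(d):
    """Diagonal covariance: d variances plus d for the mean."""
    return 2 * d

for d in (2, 10, 100):
    print(d, full_gaussian_params(d), diagonal_gaussian_params(d))
# prints:
# 2 5 4
# 10 65 20
# 100 5150 200
```

The quadratic growth of the full model against the linear growth of the diagonal one is the gap that a latent variable model with q < d is meant to fill.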

Cont’d

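Joint distribution $p(\mathbf{t}, \mathbf{x})$:

$$p(\mathbf{t}, \mathbf{x}) = p(\mathbf{x})\, p(\mathbf{t} \mid \mathbf{x}) = p(\mathbf{x}) \prod_{i=1}^{d} p(t_i \mid \mathbf{x})$$

A Bayesian network expresses this factorization.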

Cont’d

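Express $p(\mathbf{t} \mid \mathbf{x})$ in terms of a mapping from latent variables to data variables:

$$\mathbf{t} = \mathbf{y}(\mathbf{x}; \mathbf{w}) + \mathbf{u}$$

The definition of the latent variable model is completed by specifying the distribution $p(\mathbf{u})$, the mapping $\mathbf{y}(\mathbf{x}; \mathbf{w})$, and the marginal distribution $p(\mathbf{x})$.

The desired model for the distribution $p(\mathbf{t})$ is obtained by marginalizing over the latent variables, but the integral is intractable in almost all cases:

$$p(\mathbf{t}) = \int p(\mathbf{t} \mid \mathbf{x})\, p(\mathbf{x})\, d\mathbf{x}$$

Factor analysis: one of the simplest latent variable models, with a linear mapping:

$$\mathbf{t} = \mathbf{W}\mathbf{x} + \boldsymbol{\mu} + \mathbf{u}$$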

Cont’d

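$\mathbf{W}$, $\boldsymbol{\mu}$: adaptive parameters

$p(\mathbf{x})$: chosen to be $N(\mathbf{0}, \mathbf{I})$

$\mathbf{u}$: chosen to be zero-mean Gaussian with a diagonal covariance matrix $\boldsymbol{\Psi}$

Then $p(\mathbf{t})$ is Gaussian, with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Psi} + \mathbf{W}\mathbf{W}^{\mathrm{T}}$.

Degrees of freedom: $(d+1)(q+1) - q(q+1)/2$

The model can capture the dominant correlations between the data variables.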
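The claim that the factor analysis model yields a Gaussian $p(\mathbf{t})$ with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Psi} + \mathbf{W}\mathbf{W}^{\mathrm{T}}$ can be checked empirically. The sketch below samples from the generative model and compares sample moments with the model moments. All numerical values (dimensions, seed, parameter ranges) and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, q, N = 5, 2, 200_000            # data dim, latent dim, sample size (assumed)

W = rng.normal(size=(d, q))        # factor loading matrix
mu = rng.normal(size=d)            # mean vector
psi = rng.uniform(0.1, 0.5, d)     # diagonal entries of Psi

# Generative process: x ~ N(0, I), u ~ N(0, Psi), t = W x + mu + u.
x = rng.normal(size=(N, q))
u = rng.normal(size=(N, d)) * np.sqrt(psi)
t = x @ W.T + mu + u

C_model = np.diag(psi) + W @ W.T   # covariance predicted by the model
C_sample = np.cov(t, rowvar=False) # covariance estimated from the samples

mean_err = np.max(np.abs(t.mean(axis=0) - mu))
rel_cov_err = np.max(np.abs(C_sample - C_model)) / np.max(np.abs(C_model))
print(mean_err, rel_cov_err)       # both should be small for large N
```

The off-diagonal entries of the covariance come entirely from $\mathbf{W}\mathbf{W}^{\mathrm{T}}$, which is how the q latent variables account for correlations among the d data variables.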

1.2. Mixture Distributions

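The unimodal limitation is addressed by a mixture of $M$ simpler parametric distributions:

$$p(\mathbf{t}) = \sum_{i=1}^{M} \pi_i\, p(\mathbf{t} \mid i)$$

$p(\mathbf{t} \mid i)$: usually a normal distribution with its own $\boldsymbol{\mu}_i$, $\boldsymbol{\Sigma}_i$.

$\pi_i$: mixing coefficients, satisfying

$$0 \le \pi_i \le 1, \qquad \sum_{i} \pi_i = 1$$

Mixing coefficients: prior probabilities for the values of the label $i$.

Consider an indicator variable $z_{ni}$.

Posterior probabilities: $R_{ni}$ is the expectation of $z_{ni}$:

$$R_{ni} \equiv p(i \mid \mathbf{t}_n) = \frac{\pi_i\, p(\mathbf{t}_n \mid i)}{\sum_{j} \pi_j\, p(\mathbf{t}_n \mid j)}$$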
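The posterior probabilities $R_{ni}$ follow directly from Bayes' theorem. A minimal sketch for Gaussian components is given below; the helper names, the two-component parameters and the three test points are assumptions made for illustration.

```python
import numpy as np

def gaussian_pdf(T, mu, Sigma):
    """Evaluate the Gaussian density p(t_n | mu, Sigma) for each row of T."""
    d = T.shape[1]
    diff = T - mu
    solved = np.linalg.solve(Sigma, diff.T).T
    mahalanobis = np.sum(diff * solved, axis=1)
    _, logdet = np.linalg.slogdet(Sigma)
    return np.exp(-0.5 * (mahalanobis + d * np.log(2.0 * np.pi) + logdet))

def responsibilities(T, pis, mus, Sigmas):
    """R[n, i] = pi_i p(t_n | i) / sum_j pi_j p(t_n | j)."""
    weighted = np.column_stack(
        [pi * gaussian_pdf(T, mu, Sigma)
         for pi, mu, Sigma in zip(pis, mus, Sigmas)]
    )
    return weighted / weighted.sum(axis=1, keepdims=True)

# Two-component example (all values are illustrative).
pis = np.array([0.3, 0.7])
mus = [np.array([-2.0, 0.0]), np.array([2.0, 0.0])]
Sigmas = [np.eye(2), np.eye(2)]
T = np.array([[-2.0, 0.0], [0.0, 0.0], [2.0, 0.0]])

R = responsibilities(T, pis, mus, Sigmas)
print(R)  # each row sums to 1
```

At the midpoint (0, 0) both component densities are equal, so the responsibilities there reduce to the priors (0.3, 0.7). For points far from every component the densities can underflow; working with log densities is a common remedy, not shown here.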

Cont’d

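EM algorithm: maximize the expected complete-data log likelihood

$$\langle L_{\mathrm{comp}}(\{\pi_i, \boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i\}) \rangle = \sum_{n=1}^{N} \sum_{i=1}^{M} R_{ni} \ln\{\pi_i\, p(\mathbf{t}_n \mid i)\}$$

Mixture of latent-variable models

Figure: Bayesian network representation of a mixture of latent variable models. Given the values of $i$ and $\mathbf{x}$, the variables $t_1, \dots, t_d$ are conditionally independent.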
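For the plain Gaussian mixture, the EM algorithm mentioned on this slide can be sketched as follows. The M-step formulas used here are the standard closed-form updates for Gaussian components, which the slide itself does not spell out. The function names, the synthetic data, the initial values and the iteration count are assumptions, and the code has no safeguard against collapsing components.

```python
import numpy as np

def log_gaussian(T, mu, Sigma):
    """ln p(t_n | mu, Sigma) for each row of T."""
    d = T.shape[1]
    diff = T - mu
    mahalanobis = np.sum(diff * np.linalg.solve(Sigma, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (mahalanobis + d * np.log(2.0 * np.pi) + logdet)

def em_step(T, pis, mus, Sigmas):
    """One EM iteration; also returns the log likelihood at the input parameters."""
    N, _ = T.shape
    M = len(pis)
    # E-step: responsibilities R[n, i], computed in log space for stability.
    log_w = np.column_stack(
        [np.log(pis[i]) + log_gaussian(T, mus[i], Sigmas[i]) for i in range(M)]
    )
    log_norm = np.logaddexp.reduce(log_w, axis=1)     # ln p(t_n)
    R = np.exp(log_w - log_norm[:, None])
    # M-step: closed-form maximizers of the expected complete-data log likelihood.
    Ni = R.sum(axis=0)
    new_pis = Ni / N
    new_mus = (R.T @ T) / Ni[:, None]
    new_Sigmas = []
    for i in range(M):
        diff = T - new_mus[i]
        new_Sigmas.append((R[:, i, None] * diff).T @ diff / Ni[i])
    return new_pis, new_mus, np.array(new_Sigmas), float(log_norm.sum())

# Synthetic two-cluster data (illustrative).
rng = np.random.default_rng(0)
T = np.vstack([rng.normal(-3.0, 1.0, size=(200, 2)),
               rng.normal(3.0, 1.0, size=(300, 2))])

pis = np.array([0.5, 0.5])
mus = np.array([[-1.0, 0.0], [1.0, 0.0]])
Sigmas = np.array([np.eye(2), np.eye(2)])

history = []
for _ in range(30):
    pis, mus, Sigmas, log_lik = em_step(T, pis, mus, Sigmas)
    history.append(log_lik)
print(pis, history[0], history[-1])
```

The recorded log likelihood should never decrease from one iteration to the next, which is a useful sanity check on any EM implementation.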

2. Probabilistic Principal Component Analysis

Summary

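$q$ principal axes $\mathbf{v}_j$, $j \in \{1, \dots, q\}$

The $\mathbf{v}_j$ are the $q$ dominant eigenvectors of the sample covariance matrix, where $\hat{\boldsymbol{\mu}}$ is the sample mean:

$$\mathbf{S} = \frac{1}{N} \sum_{n=1}^{N} (\mathbf{t}_n - \hat{\boldsymbol{\mu}})(\mathbf{t}_n - \hat{\boldsymbol{\mu}})^{\mathrm{T}}$$

$q$ principal components, with $\mathbf{V} = (\mathbf{v}_1, \dots, \mathbf{v}_q)$:

$$\mathbf{u}_n = \mathbf{V}^{\mathrm{T}} (\mathbf{t}_n - \hat{\boldsymbol{\mu}})$$

Reconstruction vector:

$$\hat{\mathbf{t}}_n = \mathbf{V}\mathbf{u}_n + \hat{\boldsymbol{\mu}}$$

Disadvantage: absence of a probability density model and an associated likelihood measure.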
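The PCA summary above maps onto a few lines of NumPy: form the sample covariance, take its dominant eigenvectors, project, and reconstruct. The sketch below is one possible implementation; the function name `pca`, the synthetic data and the noise level are assumptions for illustration.

```python
import numpy as np

def pca(T, q):
    """Return principal axes V, components U, reconstructions T_hat, eigenvalues."""
    mu_hat = T.mean(axis=0)                   # sample mean
    diff = T - mu_hat
    S = diff.T @ diff / T.shape[0]            # sample covariance (1/N convention)
    eigvals, eigvecs = np.linalg.eigh(S)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]     # indices of the q largest
    V = eigvecs[:, order]                     # columns are v_1, ..., v_q
    U = diff @ V                              # row n is u_n = V^T (t_n - mu_hat)
    T_hat = U @ V.T + mu_hat                  # row n is V u_n + mu_hat
    return V, U, T_hat, eigvals[order]

# Data lying close to a 2-D subspace of a 5-D space (illustrative).
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
A = rng.normal(size=(2, 5))
T = latent @ A + 0.05 * rng.normal(size=(1000, 5))

V, U, T_hat, lam = pca(T, q=2)
mse = np.mean(np.sum((T - T_hat) ** 2, axis=1))
print(lam, mse)  # mse is small because the data are nearly 2-dimensional
```

`np.linalg.eigh` is used because the sample covariance is symmetric; it returns orthonormal eigenvectors, so the columns of V are orthonormal. Nothing in this procedure assigns a likelihood to a data point, which is the disadvantage the slide points out.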

2.1. Relationship to Latent Variables