
Coherence-Aware Neural Topic Modeling

Ran Ding, Ramesh Nallapati, Bing Xiang
Amazon Web Services

{rding, rnallapa, bxiang}@amazon.com

Abstract

Topic models are evaluated based on their ability to describe documents well (i.e. low perplexity) and to produce topics that carry coherent semantic meaning. In topic modeling so far, perplexity is a direct optimization target. However, topic coherence, owing to its challenging computation, is not optimized for and is only evaluated after training. In this work, under a neural variational inference framework, we propose methods to incorporate a topic coherence objective into the training process. We demonstrate that such a coherence-aware topic model exhibits a similar level of perplexity as baseline models but achieves substantially higher topic coherence.

1 Introduction

In the setting of a topic model (Blei, 2012), perplexity measures the model's capability to describe documents according to a generative process based on the learned set of topics. In addition to describing documents well (i.e. achieving low perplexity), it is desirable to have topics (represented by the top-N most probable words) that are human-interpretable. Topic interpretability, or coherence, can be measured by normalized point-wise mutual information (NPMI) (Aletras and Stevenson, 2013; Lau et al., 2014). The calculation of NPMI, however, is based on look-up operations in a large reference corpus and therefore is non-differentiable and computationally intensive. Likely due to these reasons, topic models so far have been solely optimizing for perplexity, and topic coherence is only evaluated after training. As has been noted in several publications (Chang et al., 2009), optimization for perplexity alone tends to negatively impact topic coherence. Thus, without introducing topic coherence as a training objective, topic modeling likely produces sub-optimal results.

Compared to classical methods, such as mean-field approximation (Hoffman et al., 2010) and collapsed Gibbs sampling (Griffiths and Steyvers, 2004) for the latent Dirichlet allocation (LDA) (Blei et al., 2003) model, neural variational inference (Kingma and Welling, 2013; Rezende et al., 2014) offers a flexible framework to accommodate more expressive topic models. We build upon the line of work on topic modeling using neural variational inference (Miao et al., 2016, 2017; Srivastava and Sutton, 2017) and incorporate topic coherence awareness into topic modeling.

Our approaches to constructing a topic coherence training objective leverage pre-trained word embeddings (Mikolov et al., 2013; Pennington et al., 2014; Salle et al., 2016; Joulin et al., 2016). The main motivation is that word embeddings carry contextual similarity information that is highly related to the mutual information terms involved in the calculation of NPMI. In this paper, we explore two methods: (1) we explicitly construct a differentiable surrogate topic coherence regularization term; (2) we use the word embedding matrix as a factorization constraint on the topical word distribution matrix, which implicitly encourages topic coherence.

2 Models

2.1 Baseline: Neural Topic Model (NTM)

The model architecture shown in Figure 1 is a variant of the Neural Variational Document Model (NVDM) (Miao et al., 2016). Let x ∈ R^{|V|×1} be the bag-of-words (BOW) representation of a document, where |V| is the size of the vocabulary, and let z ∈ R^{K×1} be the latent topic variable, where K is the number of topics. In the encoder q_φ(z|x), we have π = f_MLP(x), μ(x) = l_1(π), log σ(x) = l_2(π), h(x, ε) = μ + σ ⊙ ε, where ε ∼ N(0, I), and finally z = f(h) = ReLU(h). The functions l_1 and l_2 are linear transformations with bias.


Figure 1: Model architecture. [The figure shows the encoder q_φ(z|x), in which x passes through f_MLP to produce μ(x) and log σ(x), then h = μ + σ ⊙ ε with ε ∼ N(0, I) and z = f(h); the decoder p_θ(x|z) reconstructs x from z through the matrix W.]

We choose the multi-layer perceptron (MLP) in the encoder to have two hidden layers with 3×K and 2×K hidden units respectively, and we use the sigmoid activation function. The decoder network p_θ(x|z) first maps z to the predicted probability of each word in the vocabulary, y ∈ R^{|V|×1}, through y = softmax(Wz + b), where W ∈ R^{|V|×K}. The log-likelihood of the document can be written as

log p_θ(x|z) = Σ_{i=1}^{|V|} {x ⊙ log y}_i.
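To make the architecture concrete, the following is a minimal PyTorch sketch of the encoder and decoder described above (our own illustration, not the authors' code); the class name `NTM` and the layer sizes simply follow the 3×K / 2×K hidden-unit choice stated in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NTM(nn.Module):
    """Minimal sketch of the NTM encoder/decoder of Section 2.1."""
    def __init__(self, vocab_size, num_topics):
        super().__init__()
        # Encoder MLP: two hidden layers of 3K and 2K units with sigmoid activations.
        self.mlp = nn.Sequential(
            nn.Linear(vocab_size, 3 * num_topics), nn.Sigmoid(),
            nn.Linear(3 * num_topics, 2 * num_topics), nn.Sigmoid(),
        )
        # l1 and l2: linear transformations (with bias) to the Gaussian parameters.
        self.l1 = nn.Linear(2 * num_topics, num_topics)   # mu(x)
        self.l2 = nn.Linear(2 * num_topics, num_topics)   # log sigma(x)
        # Decoder: topical word distribution matrix W plus bias b.
        self.decoder = nn.Linear(num_topics, vocab_size)  # y = softmax(W z + b)

    def forward(self, x):
        pi = self.mlp(x)
        mu, log_sigma = self.l1(pi), self.l2(pi)
        eps = torch.randn_like(mu)                # eps ~ N(0, I)
        h = mu + log_sigma.exp() * eps            # h(x, eps) = mu + sigma (.) eps
        z = F.relu(h)                             # z = f(h) = ReLU(h)
        y = F.softmax(self.decoder(z), dim=-1)    # predicted word probabilities
        return y, mu, log_sigma

# Example: a batch of 8 BOW vectors over a 2,000-word vocabulary with 50 topics.
model = NTM(vocab_size=2000, num_topics=50)
y, mu, log_sigma = model(torch.rand(8, 2000))
```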

We name this model Neural Topic Model (NTM) and use it as our baseline. We use the same encoder MLP configuration for our NVDM implementation and all variants of NTM models used in Section 3. In NTM, the objective function to maximize is the usual evidence lower bound (ELBO), which can be expressed as

L_ELBO(x_i) ≈ (1/L) Σ_{l=1}^{L} log p_θ(x_i | z_{i,l}) − D_KL(q_φ(h|x_i) ‖ p_θ(h)),

where z_{i,l} = ReLU(h(x_i, ε_l)) and ε_l ∼ N(0, I). We approximate E_{z∼q(z|x)}[log p_θ(x|z)] with Monte Carlo integration and calculate the Kullback-Leibler (KL) divergence analytically, using the fact that D_KL(q_φ(z|x) ‖ p_θ(z)) = D_KL(q_φ(h|x) ‖ p_θ(h)) due to the invariance of KL divergence under the deterministic mapping between h and z.
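A short sketch of the corresponding training loss, assuming the diagonal-Gaussian encoder above and a standard normal prior p_θ(h) = N(0, I), for which the KL term has the usual closed form; a single Monte Carlo sample (L = 1) is used, and the function name is our own.

```python
import torch

def ntm_negative_elbo(x, y, mu, log_sigma):
    """Negative ELBO for one Monte Carlo sample (L = 1), to be minimized.

    x:              bag-of-words counts, shape (batch, |V|)
    y:              decoder probabilities softmax(Wz + b), shape (batch, |V|)
    mu, log_sigma:  encoder outputs, shape (batch, K)
    """
    # Reconstruction term: log p_theta(x|z) = sum_i {x (.) log y}_i
    log_likelihood = (x * torch.log(y + 1e-10)).sum(dim=-1)
    # Analytic KL between q_phi(h|x) = N(mu, diag(sigma^2)) and the prior N(0, I).
    kl = -0.5 * (1 + 2 * log_sigma - mu.pow(2) - (2 * log_sigma).exp()).sum(dim=-1)
    return (kl - log_likelihood).mean()
```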

Compared to NTM, NVDM uses different activation functions and has z = h. Miao et al. (2017) proposed a modification to NVDM called the Gaussian Softmax Model (GSM), corresponding to having z = softmax(h). Srivastava and Sutton (2017) proposed a model called ProdLDA, which uses a Dirichlet prior instead of a Gaussian prior for the latent variable h. Given a learned W, the practice to extract the top-N most probable words for each topic is to take the most positive entries in each column of W (Miao et al., 2016, 2017; Srivastava and Sutton, 2017). This is an intuitive choice, provided that z is non-negative, which is indeed the case for NTM, GSM and ProdLDA. NVDM, GSM, and ProdLDA are state-of-the-art neural topic models which we will use for comparison in Section 3.

2.2 Topic Coherence Regularization: NTM-R

The topic coherence metric NPMI (Aletras and Stevenson, 2013; Lau et al., 2014) is defined as

NPMI(w) = (1 / (N(N−1))) Σ_{j=2}^{N} Σ_{i=1}^{j−1} [ log (P(w_i, w_j) / (P(w_i) P(w_j))) ] / [ −log P(w_i, w_j) ],

where w is the list of top-N words for a topic. N is usually set to 10. For a model generating K topics, the overall NPMI score is an average over all topics. The computational overhead and non-differentiability originate from extracting the co-occurrence frequency from a large corpus¹.
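For reference, a minimal sketch of how such a score could be computed once the document frequencies and co-document frequencies have already been extracted from a reference corpus; gathering those counts is the expensive, non-differentiable step. The averaging here is over unordered word pairs, and the value assigned to never co-occurring pairs is an assumption of this sketch.

```python
from itertools import combinations
import math

def npmi_topic(top_words, doc_freq, co_doc_freq, num_docs):
    """Average NPMI over all word pairs in one topic's top-N list.

    doc_freq:    dict word -> number of reference documents containing the word
    co_doc_freq: dict frozenset({wi, wj}) -> number of documents containing both
    """
    total, n_pairs = 0.0, 0
    for wi, wj in combinations(top_words, 2):
        p_i = doc_freq[wi] / num_docs
        p_j = doc_freq[wj] / num_docs
        p_ij = co_doc_freq.get(frozenset({wi, wj}), 0) / num_docs
        if p_ij == 0:
            total += -1.0                      # assumed convention for pairs that never co-occur
        else:
            p_ij = min(p_ij, 1.0 - 1e-12)      # guard against log(1) = 0 in the denominator
            total += math.log(p_ij / (p_i * p_j)) / (-math.log(p_ij))
        n_pairs += 1
    return total / n_pairs
```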

From the NPMI formula, it is clear that word-pairs that co-occur often would score high, unless they are rare word-pairs, which would be normalized out by the denominator. The NPMI scoring bears remarkable resemblance to the contextual similarity produced by the inner product of word embedding vectors. Along this line of reasoning, we construct a differentiable, computation-efficient word embedding based topic coherence (WETC).

Let E be the row-normalized word embedding matrix for a list of N words, such that E ∈ R^{N×D} and ‖E_{i,:}‖ = 1, where D is the dimension of the embedding space. We can define pair-wise word embedding topic coherence in a similar spirit as NPMI:

WETC_PW(E) = (1 / (N(N−1))) Σ_{j=2}^{N} Σ_{i=1}^{j−1} ⟨E_{i,:}, E_{j,:}⟩ = (Σ{EE^T} − N) / (2N(N−1)),

where ⟨·, ·⟩ denotes inner product. Alternatively, we can define centroid word embedding topic coherence

WETC_C(E) = (1/N) Σ{E t^T},

where the vector t ∈ R^{1×D} is the centroid of E, normalized to have ‖t‖ = 1.
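Both scores reduce to a couple of matrix operations. A NumPy sketch under the row-normalization assumption above (our own illustration):

```python
import numpy as np

def wetc_pw(E):
    """Pair-wise WETC for a row-normalized embedding matrix E of shape (N, D)."""
    N = E.shape[0]
    # Sum of all pairwise inner products is sum(E @ E.T); remove the N diagonal ones.
    return (np.sum(E @ E.T) - N) / (2 * N * (N - 1))

def wetc_c(E):
    """Centroid WETC: mean cosine similarity between each word vector and the topic centroid."""
    t = E.mean(axis=0)
    t = t / np.linalg.norm(t)          # normalize the centroid so that ||t|| = 1
    return float(np.mean(E @ t))

# Example with random unit-norm rows standing in for a topic's top-10 word vectors.
E = np.random.randn(10, 50)
E = E / np.linalg.norm(E, axis=1, keepdims=True)
print(wetc_pw(E), wetc_c(E))
```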

¹ A typical calculation of NPMI over 50 topics based on the Wikipedia corpus takes ∼20 minutes, using code provided by (Lau et al., 2014) at https://github.com/jhlau/topic_interpretability.


Empirically, we found that the two WETC formulations behave very similarly. In addition, both WETC_PW and WETC_C correlate with human judgement almost equally well as NPMI when using GloVe (Pennington et al., 2014) vectors².

With the above observations, we propose the following procedure to construct a WETC-based surrogate topic coherence regularization term: (1) let E ∈ R^{|V|×D} be the pre-trained word embedding matrix for the vocabulary, rows aligned with W; (2) form the W-weighted centroid (topic) vectors T ∈ R^{D×K} by T = E^T W; (3) calculate the cosine similarity matrix S ∈ R^{|V|×K} between word vectors and topic vectors by S = ET; (4) calculate the W-weighted sum of word-to-topic cosine similarities for each topic, C ∈ R^{1×K}, as C = Σ_i (S ⊙ W)_{i,:}. Compared to WETC_C, in calculating C we do not perform a top-N operation on W, but directly use W for the weighted sum. Specifically, we use W-weighted topic vector construction in Step 2 and a W-weighted sum of the cosine similarities between word vectors and topic vectors in Step 4. To avoid unbounded optimization, we normalize the rows of E and the columns of W before Step 2, and normalize the columns of T after Step 2. The overall maximization objective function becomes

L_R(x; θ, φ) = L_ELBO + λ Σ_i C_i,

where λ is a hyper-parameter with positive values controlling the strength of the topic coherence regularization. We name this model NTM-R.
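A sketch of how steps (1)–(4) might be implemented as a differentiable term added to the training objective (our own reading of the procedure; `embedding` stands for the fixed pre-trained matrix E and `W` for the decoder weight):

```python
import torch
import torch.nn.functional as F

def coherence_regularizer(W, embedding):
    """Differentiable surrogate topic coherence, summed over topics.

    W:         topical word distribution matrix, shape (|V|, K)
    embedding: pre-trained word embeddings E, shape (|V|, D), rows aligned with W
    """
    E = F.normalize(embedding, dim=1)       # normalize the rows of E
    Wn = F.normalize(W, dim=0)              # normalize the columns of W
    T = F.normalize(E.t() @ Wn, dim=0)      # (2) W-weighted topic vectors, columns normalized, (D, K)
    S = E @ T                               # (3) word-to-topic cosine similarities, (|V|, K)
    C = (S * Wn).sum(dim=0)                 # (4) W-weighted similarity per topic, (K,)
    return C.sum()

# NTM-R maximizes ELBO + lambda * sum_i C_i, i.e. the minimized loss becomes
# loss = negative_elbo - lam * coherence_regularizer(W, E)
```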

2.3 Word Embedding as a Factorization Constraint: NTM-F and NTM-FR

Instead of allowing all the elements in W to be freely optimized, we can impose a factorization constraint W = E T̂, where E is the pre-trained word embedding matrix that is fixed, and only T̂ is allowed to be learned through training. Under this configuration, T̂ lives in the embedding space, and each entry in W is the dot-product similarity between a topic vector T̂_i and a word vector E_j. As one can imagine, similar words would have similar vector representations in E and would have similar weights in each column of W. Therefore the factorization constraint encourages words with similar meaning to be selected or de-selected together, thus potentially improving topic coherence.
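In implementation terms, the constraint simply replaces the free decoder weight with a product of the frozen embedding matrix and a learned topic matrix; a brief PyTorch sketch (our own, with hypothetical names):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedDecoder(nn.Module):
    """Decoder with W = E @ T_hat: E (|V| x D) is fixed, only T_hat (D x K) is trained."""
    def __init__(self, embedding, num_topics):
        super().__init__()
        self.register_buffer("E", embedding)        # frozen pre-trained embeddings
        self.T_hat = nn.Parameter(0.01 * torch.randn(embedding.shape[1], num_topics))
        self.bias = nn.Parameter(torch.zeros(embedding.shape[0]))

    def forward(self, z):
        W = self.E @ self.T_hat                           # (|V|, K)
        return F.softmax(z @ W.t() + self.bias, dim=-1)   # y = softmax(W z + b)
```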

We name the NTM model with the factorization constraint enabled as NTM-F.

² See Appendix A for details on an empirical study of human judgement of topic coherence, NPMI and WETC with various types of word embeddings.

Metric                 Perplexity        NPMI
Number of topics       50      200       50      200

LDA
LDA, mean-field        1046    1195      0.11    0.06
LDA, collapsed Gibbs   728     688       0.17    0.14

Neural Models
NVDM                   750     743       0.14    0.13
GSM                    787     829       0.22    0.19
ProdLDA                1172    1168      0.28    0.24
NTM                    780     768       0.18    0.18
NTM-R                  775     763       0.28    0.23
NTM-F                  898     1086      0.29    0.24
NTM-FR                 924     1225      0.27    0.26

Table 1: Comparison to LDA and neural variational models on the 20NewsGroup dataset. LDA with collapsed Gibbs sampling attains the best perplexity, NTM-F and NTM-FR attain the best NPMI, and NTM-R offers the best NPMI and perplexity tradeoff, as discussed in the text.

In addition, we can apply the regularization discussed in the previous section on the resulting matrix W, and we name the resulting model NTM-FR.

3 Experiments and Discussions

3.1 Results on 20NewsGroup

First, we compare the proposed models to state-of-the-art neural variational inference based topic models in the literature (NVDM, GSM, and ProdLDA) as well as LDA benchmarks, on the 20NewsGroup dataset³. In training NVDM and all NTM models, we used the Adadelta optimizer (Zeiler, 2012). We set the learning rate to 0.01 and train with a batch size of 256. For NTM-R, NTM-F and NTM-FR, the word embedding we used is GloVe (Pennington et al., 2014) vectors pre-trained on Wikipedia and Gigaword with 400,000 vocabulary size and 50 embedding dimensions⁴. The topic coherence regularization coefficient λ is set to 50. The results are presented in Table 1.
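Summarized as a configuration, the training setup described above is roughly the following (a sketch of hyperparameters, not the authors' released configuration; `model` would be an NTM/NTM-R instance as sketched earlier):

```python
config = {
    "optimizer": "Adadelta",
    "learning_rate": 0.01,
    "batch_size": 256,
    "num_topics": 50,            # Table 1 also reports 200-topic runs
    "embedding": "GloVe (Wikipedia + Gigaword), 400k vocabulary, 50 dimensions",
    "coherence_lambda": 50.0,    # regularization coefficient for NTM-R / NTM-FR
}

# With PyTorch: torch.optim.Adadelta(model.parameters(), lr=config["learning_rate"])
```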

Overall, we can see that LDA trained with collapsed Gibbs sampling achieves the best perplexity, while the NTM-F and NTM-FR models achieve the best topic coherence (in NPMI). Clearly, there is a trade-off between perplexity and NPMI, as identified by other papers. So we constructed Figure 2, which shows the two metrics from various models.

³ We use the exact dataset from (Srivastava and Sutton, 2017) to avoid subtle differences in pre-processing.

⁴ Obtained from https://nlp.stanford.edu/projects/glove/.


[Figure 2 plots topic coherence (NPMI, higher is better) against perplexity on the validation set (lower is better) for 20NewsGroup with 50 topics, with one trace or point per model: LDA (mean-field), LDA (collapsed Gibbs), GSM, ProdLDA, NVDM, NTM, NTM-R, NTM-F, and NTM-FR.]

Figure 2: NPMI vs. perplexity for various models at 50 topics. For NVDM and NTM models the traces correspond to the evolution over training epochs. High transparency is the beginning of the training.

For the models we implemented, we additionally show the full evolution of these two metrics over training epochs.

From Figure 2, it becomes clear that although ProdLDA exhibits good performance on NPMI, it is achieved at a steep cost in perplexity, while NTM-R achieves similar or better NPMI at much lower perplexity levels. At the other end of the spectrum, if we look for low perplexity, the best numbers among neural variational models are between 750 and 800. In this neighborhood, NTM-R substantially outperforms the GSM, NVDM and NTM baseline models. Therefore, we consider NTM-R the best model overall. Different downstream applications may require different tradeoff points between NPMI and perplexity. However, the proposed NTM-R model does appear to provide tradeoff points on a Pareto front compared to other models across most of the range of perplexity.

3.2 Comments on NTM-F and NTM-FR

It is worth noting that although NTM-F and NTM-FR exhibit high NPMI early on, they fail to maintain it during the training process. In addition, both models converged to fairly high perplexity levels. Our hypothesis is that this is caused by NTM-F and NTM-FR's substantially reduced parameter space: from |V|×K to D×K, where |V| ranges from 1,000 to 150,000 in a typical dataset, while D is on the order of 100.

Some form of relaxation could alleviate this problem. For example, we can let W = E T̂ + A, where A is of size |V|×K but is heavily regularized, or let W = E Q T̂, where Q is allowed as additional free parameters. We leave fully addressing this to future work.
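As an illustration only (these relaxations are proposed here but not evaluated), the residual form could be parameterized as follows, with the regularization strength `l1_weight` a hypothetical choice:

```python
import torch
import torch.nn as nn

class RelaxedFactorizedW(nn.Module):
    """W = E @ T_hat + A, with the free matrix A kept small by an L1 penalty."""
    def __init__(self, embedding, num_topics, l1_weight=1e-4):
        super().__init__()
        self.register_buffer("E", embedding)      # fixed (|V|, D)
        self.T_hat = nn.Parameter(0.01 * torch.randn(embedding.shape[1], num_topics))
        self.A = nn.Parameter(torch.zeros(embedding.shape[0], num_topics))
        self.l1_weight = l1_weight

    def weight(self):
        return self.E @ self.T_hat + self.A

    def penalty(self):
        return self.l1_weight * self.A.abs().sum()   # heavy regularization on A
```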

3.3 Validation on other Datasets

To further validate the performance improvement from using WETC-based regularization in NTM-R, we compare NTM-R with the NTM baseline model on a few more datasets: DailyKOS, NIPS, and NYTimes⁵ (Asuncion and Newman, 2007). These datasets offer a wide range of document length (ranging from ∼100 to ∼1000 words), vocabulary size (ranging from ∼7,000 to ∼140,000), and type of documents (from news articles to long-form scientific writing). In this set of experiments, we used the same settings and hyperparameter λ as before and did not fine-tune for each dataset. The results are presented in Figure 3. In a similar style as Figure 2, we show the evolution of NPMI and WETC versus perplexity over epochs until convergence.

Among all datasets, we observed improved NPMI at the same perplexity level, validating the effectiveness of the topic coherence regularization. However, on the NYTimes dataset, the improvement is quite marginal even though the WETC improvements are very noticeable. One particularity about the NYTimes dataset is that approximately 58,000 words in the 140,000-word vocabulary are named entities. It appears that the large number of named entities resulted in a divergence between NPMI and WETC scoring, which is an issue to address in the future.

⁵ https://archive.ics.uci.edu/ml/datasets/Bag+of+Words



Figure 3: Performance comparison between NTM-R and NTM on multiple datasets, with 50 topics. Top row is NPMI versus perplexity, bottom row is WETC_C versus perplexity. From left to right: DailyKOS, NIPS, and NYTimes. See text for details about the datasets.

4 Conclusions

In this work, we proposed regularization and factorization-constraint based approaches to incorporate awareness of topic coherence into the formulation of topic models: NTM-R and NTM-F, respectively. We observed that NTM-R substantially improves topic coherence with minimal sacrifice in perplexity. To the best of our knowledge, NTM-R is the first topic model that is trained with an objective towards topic coherence, a feature directly contributing to its superior performance. We further showed that the proposed WETC-based regularization method is applicable to a wide range of text datasets.

References

Nikolaos Aletras and Mark Stevenson. 2013. Evaluating topic coherence using distributional semantics. In Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers, pages 13–22.

Arthur Asuncion and David Newman. 2007. UCI Machine Learning Repository.

David M. Blei. 2012. Probabilistic topic models. Communications of the ACM, 55(4):77–84.

David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993–1022.

Jonathan Chang, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. 2009. Reading tea leaves: How humans interpret topic models. In Advances in Neural Information Processing Systems, pages 288–296.

Thomas L. Griffiths and Mark Steyvers. 2004. Finding scientific topics. Proceedings of the National Academy of Sciences, 101(suppl 1):5228–5235.

Matthew Hoffman, Francis R. Bach, and David M. Blei. 2010. Online learning for latent Dirichlet allocation. In Advances in Neural Information Processing Systems, pages 856–864.

Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov. 2016. Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.

Diederik P. Kingma and Max Welling. 2013. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.

Jey Han Lau, David Newman, and Timothy Baldwin. 2014. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 530–539.

Yishu Miao, Edward Grefenstette, and Phil Blunsom. 2017. Discovering discrete latent topics with neural variational inference. In International Conference on Machine Learning, pages 2410–2419.

Yishu Miao, Lei Yu, and Phil Blunsom. 2016. Neural variational inference for text processing. In International Conference on Machine Learning, pages 1727–1736.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.

Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1532–1543.

Danilo Jimenez Rezende, Shakir Mohamed, and Daan Wierstra. 2014. Stochastic backpropagation and approximate inference in deep generative models. In International Conference on Machine Learning, pages 1278–1286.

Alexandre Salle, Marco Idiart, and Aline Villavicencio. 2016. Matrix factorization using window sampling and negative sampling for improved word representations. arXiv preprint arXiv:1606.00819.

Akash Srivastava and Charles Sutton. 2017. Autoencoding variational inference for topic models. arXiv preprint arXiv:1703.01488.

Matthew D. Zeiler. 2012. ADADELTA: An adaptive learning rate method. arXiv preprint arXiv:1212.5701.

A Word Embedding Topic Coherence

As studied in (Aletras and Stevenson, 2013) and (Lau et al., 2014), the NPMI metric for assessing topic coherence over a list of words w is defined in Eq. 1.

NPMI(w) = (1 / (N(N−1))) Σ_{j=2}^{N} Σ_{i=1}^{j−1} [ log (P(w_i, w_j) / (P(w_i) P(w_j))) ] / [ −log P(w_i, w_j) ]    (1)

where P(w_i) and P(w_i, w_j) are the probabilities of words and word pairs, calculated based on a reference corpus. N is usually set to 10, by convention, so that NPMI is evaluated over the top-10 words for each topic. For a model generating K topics, the overall NPMI score is an average over all the topics. The computational overhead comes from extracting the relevant co-occurrence frequency from a large corpus. This problem is exacerbated when the look-up also requires a small sliding window, as the authors of (Lau et al., 2014) suggested. A typical calculation of 50 topics based on a few million documents from the Wikipedia corpus takes ∼20 minutes⁶.

⁶ Using code provided by (Lau et al., 2014) at https://github.com/jhlau/topic_interpretability, running parallel processes on 8 Intel Xeon E5-2686 CPUs.

For a list of words w of length N, we can assemble a corresponding word embedding matrix E ∈ R^{N×D} with each row corresponding to a word in the list. D is the dimension of the embedding space. Averaging across the rows, we can obtain a vector t ∈ R^{1×D} as the centroid of all the word vectors. It may be regarded as a "topic" vector. In addition, we assume that each row of E and t is normalized, i.e. ‖t‖ = 1 and ‖E_{i,:}‖ = 1. With these, we define pair-wise and centroid word embedding topic coherence, WETC_PW and WETC_C, as follows:

WETC_PW(E) = (1 / (N(N−1))) Σ_{j=2}^{N} Σ_{i=1}^{j−1} ⟨E_{i,:}, E_{j,:}⟩ = (Σ{EE^T} − N) / (2N(N−1))    (2)

WETC_C(E) = (1/N) Σ{E t^T}    (3)

where ⟨·, ·⟩ denotes inner product. The simplification in Eq. 2 is due to the row normalization of E.

In this setting, we have the flexibility to use any pre-trained word embeddings to construct E. To experiment, we compared several recently developed variants⁷. The dataset from (Aletras and Stevenson, 2013) provides human ratings for 300 topics coming from 3 corpora: 20NewsGroup (20NG), New York Times (NYT) and genomics scientific articles (Genomics), which we use as the human gold standard. We use Pearson and Spearman correlations to compare NPMI and WETC scores against human ratings. The results are shown in Table 2.
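A sketch of that comparison, assuming the per-topic human ratings and the corresponding automatic scores are available as parallel lists (the numbers below are made up purely for illustration):

```python
from scipy.stats import pearsonr, spearmanr

def correlate_with_humans(human_ratings, metric_scores):
    """Pearson and Spearman correlation between human topic ratings and an automatic metric."""
    p, _ = pearsonr(human_ratings, metric_scores)
    s, _ = spearmanr(human_ratings, metric_scores)
    return p, s

# Example with made-up scores for five topics.
humans = [2.8, 1.5, 3.0, 2.1, 1.0]
wetc_scores = [0.61, 0.40, 0.66, 0.52, 0.31]
print(correlate_with_humans(humans, wetc_scores))
```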

⁷ Details of pre-trained word embeddings used in Table 2:

• Word2Vec (Mikolov et al., 2013): pre-trained on GoogleNews, with 3 million vocabulary size and 300 embedding dimensions. Obtained from https://code.google.com/archive/p/word2vec/.

• GloVe (Pennington et al., 2014): pre-trained on Wikipedia and Gigaword, with 400,000 vocabulary size and 50 and 300 embedding dimensions. Obtained from https://nlp.stanford.edu/projects/glove/.

• FastText (Joulin et al., 2016): pre-trained on Wikipedia with 2.5 million vocabulary size and 300 embedding dimensions. Obtained from https://github.com/facebookresearch/fastText.

• LexVec (Salle et al., 2016): pre-trained on Wikipedia with 370,000 vocabulary size and 300 embedding dimensions. Obtained from https://github.com/alexandres/lexvec.


Dataset                      20NG            NYT             Genomics
Correlation                  P      S        P      S        P      S

NPMI                         0.74   0.74     0.72   0.71     0.62   0.65

GloVe-50d    WETC_PW         0.82   0.77     0.73   0.71     0.65   0.65
             WETC_C          0.81   0.77     0.73   0.71     0.65   0.65
GloVe-300d   WETC_PW         0.77   0.75     0.78   0.76     0.68   0.70
             WETC_C          0.80   0.75     0.78   0.76     0.68   0.70
Word2Vec     WETC_PW         0.29   0.23     0.53   0.59     0.56   0.55
             WETC_C          0.31   0.23     0.55   0.59     0.56   0.55
FastText     WETC_PW         0.40   0.61     0.63   0.67     0.62   0.62
             WETC_C          0.48   0.61     0.64   0.67     0.63   0.62
LexVec       WETC_PW         0.37   0.57     0.79   0.80     0.65   0.64
             WETC_C          0.47   0.57     0.81   0.80     0.65   0.64

Table 2: NPMI and WETC correlation with human gold standard (P: Pearson, S: Spearman).

From Table 2, we observed a minimal difference between pair-wise and centroid-based WETC in general. Overall, GloVe appears to perform the best across different types of corpora, and its correlation with human ratings is very comparable to NPMI-based scores. Our NPMI calculation is based on the Wikipedia corpus and should serve as a fair comparison. In addition to the good correlation exhibited by WETC, the evaluation of WETC only involves matrix multiplications and summations and thus is fully differentiable and several orders of magnitude faster than NPMI calculations. WETC opens the door to incorporating topic coherence as a training objective, which is the key idea investigated in this work. It is worth mentioning that, for GloVe, the low-dimensional embedding (50d) appears to perform almost equally well as the high-dimensional embedding (300d). Therefore, we use GloVe-400k-50d in all our experiments.

While the WETC metric on its own might be of interest to the topic modeling research community, we leave the task of formally establishing it as a standard metric in place of NPMI to future work. In this work, we still use the widely accepted NPMI as the objective topic coherence metric for model comparisons.