Forgetting Counts: Constant Memory Inference for a Dependent Hierarchical Pitman-Yor Process

Nicholas Bartlett, David Pfau, Frank Wood

Presented by Yingjian Wang

Nov. 17, 2010



Outline

• Background
• The Sequence Memoizer
• Forgetting
• The dependent HPY
• Experiment results

Background

• 2006, Teh, ‘A hierarchical Bayesian language model based on Pitman-Yor processes’: an n-gram Markov chain language model with the HPY prior.

• 2009, Wood, ‘A Stochastic Memoizer for Sequence Data’: the Sequence Memoizer (SM), with a linear space/time inference scheme (lossless).

• 2010, Gasthaus, ‘Lossless compression based on the Sequence Memoizer’: combines the SM with an arithmetic coder to build a compressor (PLUMP/dePLUMP); see www.deplump.com.

• 2010, Bartlett, ‘Forgetting Counts: Constant Memory Inference for a Dependent HPY’: develops constant-memory inference for the SM by using a dependent HPY (with loss).

SM-Two concepts

• Memoizer (Donald Michie, 1968): a device that returns a previously computed result when given the same input, instead of recomputing it, in order to save time.

• Stochastic Memoizer (Wood, 2009): the returned result can differ between calls with the same input, because the predictive probability is based on a stochastic process.
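The contrast between the two concepts can be sketched in Python. This is a minimal illustration, not the construction from Wood (2009): the stochastic version assumes a simple Chinese-restaurant-style reuse rule with a single `concentration` parameter (no discount), and the names `memoize`, `stochastic_memoize`, and `sample` are hypothetical.

```python
import random


def memoize(fn):
    """Classic memoizer: compute once per input, then always return the cached result."""
    cache = {}

    def wrapped(x):
        if x not in cache:
            cache[x] = fn(x)
        return cache[x]

    return wrapped


def stochastic_memoize(sample, concentration, rng):
    """Stochastic memoizer (assumed reuse rule): for an input seen n times before,
    reuse one of the earlier results with probability n / (n + concentration),
    otherwise draw a fresh result from `sample`."""
    history = {}

    def wrapped(x):
        past = history.setdefault(x, [])
        if rng.random() * (len(past) + concentration) < len(past):
            result = rng.choice(past)   # return a former result
        else:
            result = sample(x)          # recompute, so the answer may change
        past.append(result)
        return result

    return wrapped
```

With `concentration = 0` the stochastic version behaves like the classic memoizer after the first call; larger values make fresh draws more likely.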


SM-model and trie

• model:

• The prefix trie: restaurants.
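For reference, the Sequence Memoizer model as given in Wood et al. (2009) can be written as below. The notation is an assumption of this sketch and may differ from the slide: $\varepsilon$ is the empty context, $\sigma(u)$ is the context $u$ with its earliest symbol dropped, and $H$ is the base distribution over symbols.

```latex
\begin{align*}
G_{\varepsilon} &\sim \mathrm{PY}(d_0, 0, H) \\
G_{u} \mid G_{\sigma(u)} &\sim \mathrm{PY}(d_{|u|}, 0, G_{\sigma(u)})
  \quad \text{for every context } u \neq \varepsilon \\
x_i \mid x_{1:i-1} = u &\sim G_{u}
\end{align*}
```

Each $G_u$ corresponds to one restaurant in the trie, so the number of restaurants equals the number of distinct contexts represented.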


SM-the NSP (1)

• The Normalized Stable Process (Perman, 1990):

Pitman-Yor Process: $G \sim \mathrm{PY}(d, c, H)$

Concentration parameter $c = 0$: a Normalized Stable Process, $G \sim \mathrm{PY}(d, 0, H)$

Discount parameter $d = 0$: a Dirichlet Process, $G \sim \mathrm{PY}(0, c, H)$
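The two special cases can be checked numerically with the standard Chinese restaurant predictive rule for $\mathrm{PY}(d, c, H)$. The sketch below is illustrative only; the function name `py_predictive` and the list-of-table-sizes representation are assumptions, not part of the slides.

```python
def py_predictive(table_sizes, d, c):
    """Seating probabilities for the next customer under PY(d, c, H).

    table_sizes: number of customers at each occupied table.
    Returns (probabilities of joining each existing table, probability of a new table).
    """
    n = sum(table_sizes)
    k = len(table_sizes)
    if n == 0:
        return [], 1.0  # the first customer always opens a new table
    existing = [(size - d) / (n + c) for size in table_sizes]
    new_table = (c + d * k) / (n + c)
    return existing, new_table


# c = 0: normalized stable process, new-table probability is d * k / n
print(py_predictive([3, 1], d=0.5, c=0.0))
# d = 0: Dirichlet process, new-table probability is c / (n + c)
print(py_predictive([3, 1], d=0.0, c=1.0))
```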

SM-the NSP (2)

• Collapse the middle restaurants:

Theorem: If:

Then:

• Prefix tree: restaurants (Weiner, 1973; Ukkonen, 1995)
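A sketch of why collapsing helps. The marginalization result used by the Sequence Memoizer (Wood et al., 2009) states that if $G_2 \mid G_1 \sim \mathrm{PY}(d_1, 0, G_1)$ and $G_3 \mid G_2 \sim \mathrm{PY}(d_2, 0, G_2)$, then with $G_2$ marginalized out, $G_3 \mid G_1 \sim \mathrm{PY}(d_1 d_2, 0, G_1)$, so non-branching interior restaurants need not be stored. The Python below counts restaurants in the full context trie and in the collapsed tree for a short sequence. The nested-dict representation, the `is_context` flag, and the function names are assumptions of this sketch.

```python
def build_context_trie(seq):
    """Insert every prefix of seq, read backwards (most recent symbol first).
    Each trie node stands for one restaurant."""
    root = {"children": {}, "is_context": True}
    for i in range(1, len(seq) + 1):
        node = root
        for sym in reversed(seq[:i]):
            node = node["children"].setdefault(
                sym, {"children": {}, "is_context": False}
            )
        node["is_context"] = True  # this node is an observed context
    return root


def count_restaurants(root):
    """Return (nodes in the full trie, nodes kept after collapsing).
    A node is kept if it is an observed context or has more than one child."""
    trie_nodes = tree_nodes = 0
    stack = [root]
    while stack:
        node = stack.pop()
        trie_nodes += 1
        if node["is_context"] or len(node["children"]) > 1:
            tree_nodes += 1
        stack.extend(node["children"].values())
    return trie_nodes, tree_nodes


print(count_restaurants(build_context_trie("abc")))  # (7, 4)
```

For a sequence of distinct symbols the trie grows quadratically with the length while the collapsed tree grows linearly, which is the motivation for the prefix-tree representation.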


SM-linear space inference

Forgetting

• Motivation: achieve constant-memory inference on top of the SM. How?

• Method: forget, i.e. delete, restaurants.

• Restaurants are the basic memory units in the context tree: $\mathrm{size}(c_u, t_u) \le 2V$

• How to delete? Two deletion schemes: random deletion and greedy deletion.
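A minimal sketch of one restaurant as a memory unit. It assumes the state is stored as per-symbol customer counts and per-symbol table counts, a common representation for HPY models that the slide does not spell out; the class and method names are hypothetical. With a vocabulary of $V$ symbols, such a restaurant holds at most $2V$ counts.

```python
from collections import Counter


class Restaurant:
    """State of one context: customer counts and table counts per symbol."""

    def __init__(self):
        self.customers = Counter()  # customers per symbol
        self.tables = Counter()     # tables per symbol

    def seat(self, symbol, new_table):
        # The first customer for a symbol always opens a table.
        if new_table or self.tables[symbol] == 0:
            self.tables[symbol] += 1
        self.customers[symbol] += 1

    def size(self):
        """Number of stored counts; never more than twice the vocabulary size."""
        return len(self.customers) + len(self.tables)
```

Because each restaurant has bounded size, bounding the number of restaurants bounds the total memory.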

Deletion schemes

• Random deletion: delete one leaf restaurant chosen uniformly at random.

• Greedy deletion: delete the leaf restaurant whose removal least negatively impacts the estimated likelihood of the observed sequence.

Leaf restaurants
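The two schemes can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Node` class is a hypothetical stand-in for a restaurant in the context tree, and `loss` is a hypothetical caller-supplied function that estimates how much the likelihood of the observed sequence drops if a given leaf is removed.

```python
import random


class Node:
    """A restaurant in the context tree (only the tree structure is modelled)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def leaves(root):
    """All leaf restaurants; the root itself is never a deletion candidate."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        elif node.parent is not None:
            found.append(node)
    return found


def delete_random(root, rng):
    """Random deletion: remove one leaf chosen uniformly at random."""
    victim = rng.choice(leaves(root))
    victim.parent.children.remove(victim)
    return victim


def delete_greedy(root, loss):
    """Greedy deletion: remove the leaf with the smallest estimated likelihood loss."""
    victim = min(leaves(root), key=loss)
    victim.parent.children.remove(victim)
    return victim
```

Calling either function whenever the number of restaurants exceeds a fixed budget keeps memory use constant; only leaves are removed, so the remaining nodes still form a tree.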


The SMC algorithm

The dependent HPY

• But what do we get after a deletion followed by an addition? Will the processes be independent? No, since the seating arrangement in the parent restaurant has been changed.


The experiment results