
Non-negative Matrix Factorization via Alternating Updates

Yingyu Liang, Princeton University

Simons workshop, Berkeley, Nov 2016

Joint work with Yuanzhi Li and Andrej Risteski

Non-convex Problems in ML

• Dictionary learning

• Matrix completion

• Deep learning

NP-hard to solve in the worst case

• In practice, often solved by “local improvement”

• Gradient descent and variants

• Alternating update

Goal: provable guarantees of simple algorithms under natural assumptions

When and why do such simple algorithms work for the hard problems?

This work: alternating update for Non-negative Matrix Factorization

Non-negative Matrix Factorization (NMF)

• Given: a matrix M with non-negative entries

• Find: non-negative matrices A and W with M ≈ AW
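To make the problem concrete, here is a minimal NumPy sketch (the dimensions and random data are invented for illustration, not from the talk): we build M from known non-negative factors and evaluate the reconstruction objective that NMF minimizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth non-negative factors (dimensions are illustrative).
n, m, r = 20, 30, 4
A_true = rng.random((n, r))        # n x r, entries >= 0
W_true = rng.random((r, m))        # r x m, entries >= 0
M = A_true @ W_true                # the observed n x m matrix

# NMF asks for non-negative A, W minimizing the reconstruction error.
def nmf_objective(M, A, W):
    return np.linalg.norm(M - A @ W, "fro") ** 2

print(nmf_objective(M, A_true, W_true))  # 0 for an exact factorization
```

Here the planted factors achieve objective zero; in general NMF must search for such factors, which is what makes the problem hard.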

NMF in ML Applications

Basic tool in machine learning

• Topic modeling [Blei-Ng-Jordan03, Arora-Ge-Kannan-Moitra12, Arora-Ge-Moitra12,…]

[Figure: a word-by-document matrix (rows: the, physics, soccer, rain, …; columns: Doc1, Doc2, Doc3, …) factored into a word-by-topic matrix (topics: Weather, Sport, Science, …) times a topic-by-document matrix of topic proportions.]

• Computer vision [Lee-Seung97, Lee-Seung99, Buchsbaum-Bloch02, …]

• Many others: network analysis, information retrieval, …

Worst Case vs. Practical Heuristic

Worst case analysis [Arora-Ge-Kannan-Moitra12]

• Upper bound: exact NMF can be solved in time (nm)^{O(r^2 2^r)}

• Lower bound: no (nm)^{o(r)}-time algorithm, assuming ETH

Alternating updates: typical heuristic, suggested by Lee-Seung
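A minimal sketch of the Lee-Seung multiplicative updates, the classic alternating heuristic (dimensions, initialization, and iteration count here are illustrative choices, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 20, 30, 4
M = rng.random((n, r)) @ rng.random((r, m))  # exactly rank-r, non-negative

# Random non-negative initialization.
A = rng.random((n, r))
W = rng.random((r, m))

eps = 1e-12  # guard against division by zero
for _ in range(500):
    # Multiplicative updates keep A and W entrywise non-negative,
    # and each step does not increase ||M - AW||_F^2 (Lee-Seung 2001).
    W *= (A.T @ M) / (A.T @ A @ W + eps)
    A *= (M @ W.T) / (A @ W @ W.T + eps)

rel_err = np.linalg.norm(M - A @ W) / np.linalg.norm(M)
print(rel_err)
```

Because the updates are purely multiplicative, non-negativity is preserved automatically; what these heuristics lack in general is a convergence guarantee to the planted factors, which is the gap this talk addresses.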

Analyzing Non-convex Problems

• Generative model: the input data is generated from (a distribution defined by) a ground-truth solution

• Warm start: good initialization not far away from the ground-truth

[Figure: starting from a warm start, repeated updates move the solution toward the ground truth.]

Beyond Worst Case for NMF

• Separability-based assumptions [Arora-Ge-Kannan-Moitra12]

• Motivated by topic modeling: each column of (topic) has an anchor word

• Lots of subsequent work [Arora-Ge-Moitra12, Arora-Ge-Halpern-Mimno-Moitra-Sontag-Wu-Zhu12, Gillis-Vavasis14, Ge-Zhou15, Bhattacharyya-Goyal-Kannan-Pani16, …]

[Figure: the word-by-topic matrix, highlighting an anchor word — a row with a nonzero weight under exactly one topic (e.g., 0.2 under one topic and 0 under the others).]
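To make the anchor-word idea concrete, here is a toy construction (the 0.2 weight and all dimensions are invented for the example): a word that occurs under exactly one topic makes its row of the document matrix a scaled copy of that topic's row of W, which is what anchor-word algorithms exploit.

```python
import numpy as np

rng = np.random.default_rng(4)
r, n, m = 3, 12, 50
# Separable topic matrix A: the first r words are anchor words,
# each occurring under exactly one topic (with weight 0.2).
A = np.vstack([0.2 * np.eye(r), rng.random((n - r, r))])
W = rng.random((r, m))  # topic proportions per document
M = A @ W               # word-by-document matrix

# Each anchor word's row of M is proportional to one row of W.
for k in range(r):
    assert np.allclose(M[k], 0.2 * W[k])
```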


• Variational inference [Awasthi-Risteski15]

• Alternating update method on objective

• Requires relatively strong assumptions, and/or a warm start depending on the dynamic range (not realistic)

Outline

• Introduction

• Our model, algorithm and main results

• Analysis of the algorithm

Generative Model

Each column of M is an i.i.d. example y = Ax, with x drawn from an underlying distribution

• (A1): the columns of A are linearly independent

• (A2): the coordinates of x are independent random variables, whose distribution is governed by a parameter

Our Algorithm

Parameters: a decoding threshold and an update step size

• Decode: apply the thresholding function max(z, 0) entrywise to the decoded coefficients

Known as Rectified Linear Units (ReLU) in Deep Learning

• Update: y, y′ are two independent examples, and x, x′ are their decodings; the update step moves A using these pairs
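The transcript drops the exact decoding formula, so the following is only a hypothetical sketch of the decoding step under the stated assumptions: starting from a warm start, compute least-squares coefficients and zero out everything below a threshold (the ReLU-style step). All dimensions, the threshold, and the perturbation level are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, N = 30, 5, 200
alpha = 0.25  # decoding threshold (invented for this toy example)

A_star = rng.random((n, r))                   # ground-truth features
mask = rng.random((r, N)) < 0.3               # sparse supports
X = (0.5 + 0.5 * rng.random((r, N))) * mask   # nonzeros in [0.5, 1]
Y = A_star @ X                                # observed examples

# (A3) warm start: a small perturbation of the ground truth.
A = A_star + 0.03 * rng.standard_normal((n, r))

# Decode: least-squares coefficients, then zero out everything
# below the threshold (the ReLU-style denoising step).
Z = np.linalg.pinv(A) @ Y
X_hat = np.where(Z > alpha, Z, 0.0)

# Thresholding shrinks the decoding error: the small spurious
# values on the zero coordinates of X are wiped out.
err_raw = np.linalg.norm(Z - X)
err_thr = np.linalg.norm(X_hat - X)
print(err_thr < err_raw)
```

The warm start keeps the raw decoding errors small enough that the threshold separates true nonzeros from noise, which is exactly the invariant the analysis maintains.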

Warm Start

• (A3): warm start: the initialization is aligned with the ground truth, and its error is not too large

Main Result

Analysis: Overview

Show the algorithm will

1. maintain the warm-start invariants, and

2. decrease a potential function measuring the distance to the ground truth

Analysis: Effect of ReLU

Much less noise after applying the ReLU thresholding
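The denoising effect can be seen directly on a toy example (all numbers invented): for a sparse non-negative signal corrupted by small zero-mean noise, thresholding wipes out the noise on the many zero coordinates.

```python
import numpy as np

rng = np.random.default_rng(3)
# A sparse non-negative signal: ~10% of coordinates are nonzero,
# with nonzero values bounded away from zero.
x = (0.3 + 0.7 * rng.random(1000)) * (rng.random(1000) < 0.1)
noise = 0.05 * rng.standard_normal(1000)  # small zero-mean noise
z = x + noise

# Thresholded ReLU: zero out everything below alpha.
alpha = 0.15
x_hat = np.where(z > alpha, z, 0.0)

# Residual error after thresholding vs. the raw noise level.
print(np.linalg.norm(x_hat - x), np.linalg.norm(noise))
```

The threshold zeroes the ~900 coordinates that carry only noise while leaving the true nonzeros (all above the threshold) essentially untouched, so most of the noise energy disappears.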

Analysis: Change of Error Matrix

• Therefore, the error matrix shrinks with each update

More General Results

1. Each column of M is generated i.i.d. with an additional noise term

• The decoding is the same thresholding step

• So the algorithm can tolerate large adversarial noise,

• and can tolerate zero-mean noise much larger than the signal

2. The distribution of x only needs to satisfy some moment conditions

Conclusion

• Beyond worst-case analysis of NMF

• Generative model with mild conditions on the feature matrix

• Provable guarantees for the alternating update algorithm

• Strong denoising effect from ReLU + non-negativity


Thanks! Q&A
