CONSTRAINED LOW-RANK MATRIX (AND TENSOR) ESTIMATION
Lenka Zdeborová (IPhT, CEA Saclay, France)
with T. Lesieur, F. Krzakala; Proofs with J. Xu, J. Barbier, N. Macris, M. Dia, M. Lelarge, L. Miolane.
LET’S PLAY A GAME
[Figure: N = 15 people, each holding a hidden card with value +1 or −1.]
LET’S PLAY A GAME
• Generate a random Gaussian variable Z (zero mean and variance Δ).
• Report:
‣ $Y = Z + 1/\sqrt{N}$ if the cards were the same.
‣ $Y = Z - 1/\sqrt{N}$ if the cards were different.
LET’S PLAY A GAME
• Each pair (i, j) reports:
‣ $Y_{ij} = Z_{ij} + 1/\sqrt{N}$ if the cards are the same,
‣ $Y_{ij} = Z_{ij} - 1/\sqrt{N}$ if the cards are different,
with $Z_{ij} \sim \mathcal{N}(0, \Delta)$.
Collect $Y_{ij}$ for every pair (i, j).
Goal: Recover the cards (up to symmetry) purely from the knowledge of $Y = \{Y_{ij}\}_{i<j}$.
HOW TO SOLVE THIS?
True values of the cards: $Y_{ij} = \frac{1}{\sqrt{N}}\, x^*_i x^*_j + Z_{ij}$, with $Z_{ij} \sim \mathcal{N}(0,\Delta)$, $x^*_i \in \{-1,+1\}$, $x^* \in \{-1,+1\}^N$.
Eigen-decomposition of $Y$ (aka PCA) minimises $\sum_{i<j} (Y_{ij} - \hat{Y}_{ij})^2$ subject to $\mathrm{rank}(\hat{Y}) = 1$.
$x^{\rm PCA}$ (the leading eigenvector of $Y$) estimates $x^*$ (up to a sign).
BBP phase transition: $|x^{\rm PCA} \cdot x^*| > 0$ for $\Delta < 1$, while $x^{\rm PCA} \cdot x^* \approx 0$ for $\Delta > 1$.
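A minimal numerical sketch of this transition, assuming the spiked Wigner form above; the function names and the values of N and Δ are illustrative choices, not part of the talk:

```python
# A minimal sketch of the game and of PCA, assuming the spiked Wigner form
# Y_ij = x*_i x*_j / sqrt(N) + Z_ij above. Names and parameters are
# illustrative choices, not part of the talk.
import numpy as np

def play_game(N, Delta, rng):
    x_star = rng.choice([-1.0, 1.0], size=N)           # hidden cards
    Z = rng.normal(0.0, np.sqrt(Delta), size=(N, N))
    Z = (Z + Z.T) / np.sqrt(2.0)                       # symmetric noise, variance Delta
    return x_star, np.outer(x_star, x_star) / np.sqrt(N) + Z

def pca_overlap(N, Delta, rng):
    x_star, Y = play_game(N, Delta, rng)
    x_pca = np.linalg.eigh(Y)[1][:, -1] * np.sqrt(N)   # leading eigenvector
    return abs(x_pca @ x_star) / N                     # overlap in [0, 1]

rng = np.random.default_rng(0)
for Delta in [0.5, 1.5]:                               # below / above the BBP point
    print(f"Delta={Delta}: overlap ~ {pca_overlap(2000, Delta, rng):.2f}")
# Expected: overlap bounded away from 0 for Delta < 1, close to 0 for Delta > 1.
```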
MAIN QUESTIONS
What is the minimal achievable estimation error on x*?
(Is it possible to do better than PCA?)
What is the minimal efficiently achievable estimation error on x*?
BAYESIAN INFERENCE
Values of cards: $x_i \in \{-1,+1\}$, $x \in \{-1,+1\}^N$.
Posterior distribution:
$$P(x|Y) = \frac{P(x)\,P(Y|x)}{P(Y)} = \frac{1}{Z(Y,\Delta)} \prod_{i=1}^{N} \left[\delta(x_i+1) + \delta(x_i-1)\right] \prod_{i<j} e^{-\frac{(Y_{ij} - x_i x_j/\sqrt{N})^2}{2\Delta}}$$
Bayes-optimal inference = computation of marginals (the argmax of the marginals maximizes the number of correctly assigned values; the mean of the marginals minimises the mean-squared error).
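To make “computation of marginals” concrete, here is a brute-force sketch that enumerates all $2^N$ configurations; it is only feasible for tiny N (say N ≤ 12) and is not one of the talk’s algorithms:

```python
# Brute-force Bayes-optimal marginals for the card game, directly from the
# posterior above. Enumerates all 2^N configurations: a conceptual sketch
# only, nothing like a practical algorithm.
import itertools
import numpy as np

def posterior_means(Y, Delta):
    N = Y.shape[0]
    iu = np.triu_indices(N, k=1)
    configs = np.array(list(itertools.product([-1.0, 1.0], repeat=N)))
    log_w = np.array([-np.sum((Y[iu] - np.outer(x, x)[iu] / np.sqrt(N)) ** 2)
                      / (2 * Delta) for x in configs])
    p = np.exp(log_w - log_w.max())
    p /= p.sum()                    # posterior probability of each configuration
    return configs.T @ p            # marginal means: minimise the MSE
```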
IN THIS TALK
Bayes-optimal inference for generic prior and output.
Generate ground truth $x^*_i$ from $P_X$. Generate $Y_{ij}$ from $P_{\rm out}$. Goal: infer $x^*$ from $Y$.
$$P(x|Y) = \frac{1}{Z(Y)} \prod_{i=1}^{N} P_X(x_i) \prod_{i<j} P_{\rm out}\!\left(Y_{ij} \,\middle|\, x_i^\top x_j/\sqrt{N}\right)$$
or
$$P(u,v|Y) = \frac{1}{Z(Y)} \prod_{i=1}^{N} P_U(u_i) \prod_{j=1}^{M} P_V(v_j) \prod_{i,j} P_{\rm out}\!\left(Y_{ij} \,\middle|\, u_i^\top v_j/\sqrt{N}\right)$$
or
$$P(x|Y) = \frac{1}{Z(Y)} \prod_{i=1}^{N} P_X(x_i) \prod_{i_1<\dots<i_p} P_{\rm out}\!\left(Y_{i_1\dots i_p} \,\middle|\, \frac{\sqrt{(p-1)!}}{N^{(p-1)/2}}\, x_{i_1}\cdots x_{i_p}\right)$$
Another example — stochastic block model (dense):
r-valued cards: $P_X(x) = \frac{1}{r}\sum_{k=1}^{r} \delta(x - e_k)$, with $x \in \mathbb{R}^r$ and $e_k^\top = (0,\dots,0,1,0,\dots,0)$.
$Y_{ij}$ is the adjacency matrix of a graph:
$$P_{\rm out}\!\left(Y_{ij}=1 \,\middle|\, \tfrac{x_i^\top x_j}{\sqrt{N}}\right) = p_{\rm out} + \frac{\mu}{\sqrt{N}}\, x_i^\top x_j, \qquad P_{\rm out}\!\left(Y_{ij}=0 \,\middle|\, \tfrac{x_i^\top x_j}{\sqrt{N}}\right) = 1 - p_{\rm out} - \frac{\mu}{\sqrt{N}}\, x_i^\top x_j.$$
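A short sketch of how one would sample Y from this dense SBM; the function name and parameter choices are illustrative assumptions:

```python
# Sampling Y from the dense stochastic block model above. A sketch: no care
# is taken to keep the edge probabilities inside [0, 1] for extreme mu.
import numpy as np

def sample_sbm(N, r, p_out, mu, rng):
    groups = rng.integers(0, r, size=N)
    X = np.eye(r)[groups]                          # one-hot cards x_i = e_{k(i)}
    P = p_out + (mu / np.sqrt(N)) * (X @ X.T)      # P(Y_ij = 1 | x_i, x_j)
    U = rng.random((N, N))
    Y = (np.triu(U, 1) < np.triu(P, 1)).astype(int)
    return groups, Y + Y.T                         # symmetric adjacency matrix
```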
More examples (same symmetric posterior):
• Submatrix localization.
• Z2 synchronization.
• Planted spin glass (Ising / spherical / vectorial).
• Spiked Wigner models.
ASYMMETRIC CASE
$$P(u,v|Y) = \frac{1}{Z(Y)} \prod_{i=1}^{N} P_U(u_i) \prod_{j=1}^{M} P_V(v_j) \prod_{i,j} P_{\rm out}\!\left(Y_{ij} \,\middle|\, u_i^\top v_j/\sqrt{N}\right)$$
Examples:
• Gaussian mixture clustering.
• Biclustering.
• Dawid-Skene model for crowdsourcing.
• Johnstone’s spiked covariance model.
• Restricted Boltzmann machine with random weights.
TENSOR ESTIMATION
$$P(x|Y) = \frac{1}{Z(Y)} \prod_{i=1}^{N} P_X(x_i) \prod_{i_1<\dots<i_p} P_{\rm out}\!\left(Y_{i_1\dots i_p} \,\middle|\, \frac{\sqrt{(p-1)!}}{N^{(p-1)/2}}\, x_{i_1}\cdots x_{i_p}\right)$$
Examples:
• Spiked tensor model (Richard, Montanari, NIPS’14).
• Hyper-graph clustering.
• Tensor completion.
• Sub-tensor localisation.
OUR RESULTS
In the limit $N \to \infty$, $M/N = \alpha = O(1)$, we compute rigorously the minimum mean-squared error
$$\mathrm{MMSE} = \frac{1}{N} \sum_{i=1}^{N} (x^*_i - \hat{x}_i)^2, \qquad \hat{x}_i = \sum_{x_i} x_i\, P(x_i \mid Y),$$
for the symmetric posterior above, and likewise for the asymmetric and tensor cases.
Message-passing algorithm that is asymptotically optimal outside of a sharply delimited “hard” region of parameters.
COMMENTS
• Limit $N \to \infty$, $M/N = \alpha = O(1)$: high-dimensional statistics. Rank $r = O(1)$.
• Regime of MSE: when is the MSE better than a random pick from the prior, and by how much? A statistician would perhaps rather ask how fast the MSE goes to zero.
• Sparsity here means a finite fraction of non-zeros; most statistics works take the number of non-zeros to be $o(N)$.
• The noise and the spikes are iid. This does not describe most real data, but it allows a precise analysis of optimality and of many algorithms, with intriguing behaviour (phase transitions).
How do we compute the Bayes-optimal performance? Map to a spin glass: $Y \to J$, $x_i \to S_i$.
BACK TO THE CARD GAME
With $S_i \in \{-1,+1\}$:
$$P(S|J) = \frac{1}{Z(Y,\Delta)} \prod_{i<j} e^{-\frac{(J_{ij} - S_i S_j/\sqrt{N})^2}{2\Delta}} = \frac{1}{\tilde{Z}(Y,\Delta)}\, e^{\frac{1}{\Delta\sqrt{N}} \sum_{i<j} J_{ij} S_i S_j}$$
Boltzmann measure of a mean-field Ising spin glass (Sherrington-Kirkpatrick ’75 model).
$J_{ij}$ conditioned on $S^*_i$: planted disorder.
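To see the second equality above, a one-line expansion of the square (all $S$-independent terms are absorbed into the normalisation):
$$-\frac{(J_{ij}-S_iS_j/\sqrt{N})^2}{2\Delta} = -\frac{J_{ij}^2}{2\Delta} + \frac{J_{ij}S_iS_j}{\Delta\sqrt{N}} - \frac{(S_iS_j)^2}{2\Delta N} = \frac{J_{ij}S_iS_j}{\Delta\sqrt{N}} + \text{const},$$
since $(S_iS_j)^2 = 1$ for Ising spins and $J_{ij}^2$ does not depend on $S$.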
MEAN-FIELD SPIN GLASS
‣ Mean-field spin glass models are solvable using the non-rigorous replica / cavity method (Mézard, Parisi, Nishimori, Watkin, Nadal, Sompolinsky, and many many others, 70s-80s).
‣ For Ising spins $S_i \in \{-1,+1\}$ (De Almeida; Thouless ’78):
[Figure: $1 - \mathrm{error}$ as a function of $\sqrt{\Delta}$, with the transition at the critical noise $\sqrt{\Delta^*}$.]
LET’S JUMP ~40 YEARS FORWARD: MAIN RESULTS
DEFINITIONS:
Fisher-score matrix: $S_{ij} \equiv \left.\frac{\partial \log P_{\rm out}(y_{ij}|w)}{\partial w}\right|_{y_{ij},\,w=0}$
Fisher information: $\frac{1}{\Delta} \equiv \mathbb{E}_{P_{\rm out}(y|w=0)}\!\left[\left(\left.\frac{\partial \log P_{\rm out}(y|w)}{\partial w}\right|_{y,\,w=0}\right)^2\right]$
Scalar denoising problem: for $A \in \mathbb{R}^{r\times r}$, $B \in \mathbb{R}^{r}$, $x \in \mathbb{R}^{r}$,
$$\mathcal{P}(x;A,B) = \frac{1}{\mathcal{Z}(A,B)}\, P_X(x)\, \exp\!\left(B^\top x - \frac{x^\top A x}{2}\right), \qquad f(A,B) \equiv \mathbb{E}_{\mathcal{P}(\cdot;A,B)}[x].$$
THEOREMS:
Theorem 1: $\frac{1}{N}\log Z(Y)$ concentrates around the maximum over $M \in \mathbb{R}^{r\times r}$ of the replica-symmetric free energy
$$\Phi(M) = \mathbb{E}_{x,w}\!\left[\log \mathcal{Z}\!\left(\frac{M}{\Delta},\; \frac{M}{\Delta}x + \sqrt{\frac{M}{\Delta}}\, w\right)\right] - \frac{\mathrm{Tr}(MM^\top)}{4\Delta},$$
where $x \sim P_X(x)$ and $w \sim \mathcal{N}(0, \mathbb{1}_r)$.
Why is this useful? When $N \gg 1$, the rN-dimensional problem reduces to an r-dimensional one.
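For r = 1 and the ±1 prior, $\mathcal{Z}(A,B) = e^{-A/2}\cosh(B)$, so $\Phi$ reduces to a one-dimensional function that can be evaluated by quadrature. A sketch; the grid, Δ, and quadrature order are arbitrary choices:

```python
# Replica-symmetric free energy Phi(m) for r = 1 and the +-1 prior, where
# Z(A, B) = exp(-A/2) cosh(B). Gauss-Hermite quadrature for the w-average.
import numpy as np

def phi(m, Delta, n_quad=61):
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_quad)  # w ~ N(0, 1)
    A = m / Delta
    B = A + np.sqrt(A) * nodes              # take x* = 1 (prior is symmetric)
    log_Z = -A / 2 + np.logaddexp(B, -B) - np.log(2)             # log cosh(B)
    return (weights / np.sqrt(2 * np.pi)) @ log_Z - m ** 2 / (4 * Delta)

Delta = 0.7
grid = np.linspace(0.0, 1.0, 201)
m_star = grid[np.argmax([phi(m, Delta) for m in grid])]
print(f"argmax Phi: m* = {m_star:.2f}, MMSE = 1 - m* = {1 - m_star:.2f}")
```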
Theorem 2: $\mathrm{MMSE} = \mathrm{Tr}\!\left[\mathbb{E}_x(xx^\top) - \mathrm{argmax}\,\Phi(M)\right]$.
Proofs: Korada, Macris ’10; Krzakala, Xu, LZ, ITW’16; Barbier, Dia, Macris, Krzakala, Lesieur, LZ, NIPS’16; more elegant: Lelarge, Miolane ’16; El Alaoui, Krzakala ’17.
FREE ENERGY FOR THE ASYMMETRIC CASE
$$\Phi(M_u, M_v) = \mathbb{E}_{u,w}\!\left[\log \mathcal{Z}_u\!\left(\frac{\alpha M_v}{\Delta},\; \frac{\alpha M_v}{\Delta}u + \sqrt{\frac{\alpha M_v}{\Delta}}\, w\right)\right] + \alpha\, \mathbb{E}_{v,w}\!\left[\log \mathcal{Z}_v\!\left(\frac{M_u}{\Delta},\; \frac{M_u}{\Delta}v + \sqrt{\frac{M_u}{\Delta}}\, w\right)\right] - \frac{\alpha\, \mathrm{Tr}(M_v M_u^\top)}{2\Delta}$$
Conjectured: Lesieur, Krzakala, LZ ’15. Proof: Miolane ’17.
FREE ENERGY FOR THE TENSOR CASE (rank = 1)
$$\Phi(M) = \mathbb{E}_{x,w}\!\left[\log \mathcal{Z}\!\left(\frac{M^{p-1}}{\Delta},\; \frac{M^{p-1}}{\Delta}x + \sqrt{\frac{M^{p-1}}{\Delta}}\, w\right)\right] - \frac{(p-1)\,M^{p}}{2p\,\Delta}$$
Proof (r = 1): Lesieur, Miolane, Lelarge, Krzakala, LZ ’17. General rank: Barbier, Macris, Miolane ’17.
KEY PROOF INGREDIENTS
Guerra’s interpolation (from N independent scalar denoising problems) + …
MAIN QUESTIONS
What is the minimal achievable estimation error on x*? ✅
(Is it possible to do better than PCA?)
What is the minimal efficiently achievable estimation error on x*?
APPROXIMATE MESSAGE PASSING
The AMP algorithm estimates the means and variances of the marginals:
Thouless, Anderson, Palmer ’77; Rangan, Fletcher ’12; Matsushita, Tanaka ’14; Deshpande, Montanari ’14; Lesieur, Krzakala, LZ ’15 and ’16.
$$B_i^t = \frac{1}{\sqrt{N}} \sum_{l=1}^{N} S_{il}\, a_l^t - \frac{1}{\Delta}\left(\frac{1}{N}\sum_{l=1}^{N} v_l^t\right) a_i^{t-1}, \qquad A^t = \frac{1}{N\Delta} \sum_{l=1}^{N} a_l^t\, (a_l^t)^\top,$$
$$a_i^{t+1} = f(A^t, B_i^t), \qquad v_i^{t+1} = \partial_B f(A^t, B_i^t).$$
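A sketch of this iteration for the ±1 card game with Gaussian noise (r = 1), where the denoiser is $f(A,B)=\tanh(B)$ and the A-dependence drops since $x^2 = 1$; iteration counts and initialisation are arbitrary choices, not production code:

```python
# AMP for the +-1 card game with Gaussian noise, following the iteration
# above with r = 1 and the denoiser f(A, B) = tanh(B).
import numpy as np

def amp(Y, Delta, n_iter=50, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    N = Y.shape[0]
    S = Y / Delta                       # Fisher score of the Gaussian channel
    a = 0.01 * rng.normal(size=N)       # small random initialisation
    v = np.ones(N)
    a_old = np.zeros(N)
    for _ in range(n_iter):
        B = S @ a / np.sqrt(N) - (v.mean() / Delta) * a_old   # Onsager term
        a, a_old = np.tanh(B), a        # a^{t+1}: estimated marginal means
        v = 1.0 - a ** 2                # v^{t+1}: estimated marginal variances
    return a

# Usage: with Y from play_game(N, Delta, rng) in the PCA sketch above, the
# overlap |a . x*| / N should match the state-evolution prediction.
```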
Characterisation of AMP via the matrix order parameter $M$:
STATE EVOLUTION
$$M^t \equiv \frac{1}{N} \sum_{i=1}^{N} a_i^t\, (x_i^*)^\top \in \mathbb{R}^{r\times r}$$
$$M^{t+1} = \mathbb{E}_{x,w}\!\left[f\!\left(\frac{M^t}{\Delta},\; \frac{M^t}{\Delta}x + \sqrt{\frac{M^t}{\Delta}}\, w\right) x^\top\right], \qquad x \sim P_X(x),\; w \sim \mathcal{N}(0, \mathbb{1}_r).$$
Proof: Rangan, Fletcher ’12; Javanmard, Montanari ’12; Deshpande, Montanari ’14.
Observation: Stationary points of $\Phi(M)$ are fixed points of the state evolution.
$$\mathrm{MSE}_{\rm AMP} = \mathrm{Tr}\!\left[\mathbb{E}_x(xx^\top) - M^{\rm AMP}\right]$$
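The corresponding scalar state evolution (r = 1, ±1 prior), iterated from a small overlap to mimic AMP’s uninformative initialisation; a Monte Carlo sketch with arbitrary sample sizes and Δ values:

```python
# Scalar state evolution: m^{t+1} = E_w[ tanh(m/Delta + sqrt(m/Delta) w) ].
import numpy as np

def state_evolution(Delta, m0=1e-3, n_iter=200, n_samples=200_000, seed=0):
    w = np.random.default_rng(seed).normal(size=n_samples)
    m = m0
    for _ in range(n_iter):
        m = np.tanh(m / Delta + np.sqrt(m / Delta) * w).mean()
        m = max(m, 0.0)   # guard: Monte Carlo noise must not push m below 0
    return m              # fixed point reached from small m = AMP's fixed point

for Delta in [0.5, 0.9, 1.1]:
    print(f"Delta={Delta}: MSE_AMP = {1 - state_evolution(Delta):.3f}")
# For Delta < 1 AMP reaches a positive overlap; for Delta > 1 the trivial
# fixed point m = 0 is stable and the MSE stays at that of a random guess.
```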
BOTTOM LINE
The AMP MSE is given by the local maximum of the free energy reached by iterating from small $M$ / large MSE: $\mathrm{MSE}_{\rm AMP} = \mathrm{Tr}[\mathbb{E}_x(xx^\top) - M^{\rm AMP}]$.
The MMSE is given by the global maximum of the free energy: $\mathrm{MMSE} = \mathrm{Tr}[\mathbb{E}_x(xx^\top) - \mathrm{argmax}\,\Phi(M)]$, with
$$\Phi(M) = \mathbb{E}_{x,w}\!\left[\log \mathcal{Z}\!\left(\frac{M}{\Delta},\; \frac{M}{\Delta}x + \sqrt{\frac{M}{\Delta}}\, w\right)\right] - \frac{\mathrm{Tr}(MM^\top)}{4\Delta}.$$
[Figure: free energy $\Phi(M)$ vs $M$, with the local maximum $M^{\rm AMP}$ and the global maximum $\mathrm{argmax}\,\Phi(M)$ marked.]
ZOOLOGY OF FIXED POINTS (FOR MATRIX ESTIMATION)
Zero-mean prior, $\mathbb{E}_X(x) = 0$:
SE always has a “trivial” fixed point $M = 0$.
Stability of the trivial fixed point: $M^{t+1} = \Sigma M^t \Sigma$; for $r = 1$, linearising the state evolution around $M = 0$ gives
$$\delta M^{t+1} = \frac{[\mathbb{E}_X(x^2)]^2}{\Delta}\, \delta M^t.$$
This is the same as the spectral phase transition of the Fisher score matrix (Edwards ’68; known as the BBP’05 transition).
Non-zero-mean priors, $\mathbb{E}_X(x) \neq 0$:
MMSE always better than random guessing (spectral methods still have a phase transition).
Multiple fixed points may still exist.
Example (sparse Rademacher prior): $P_X(x_i) = \frac{\rho}{2}\left[\delta(x_i - 1) + \delta(x_i + 1)\right] + (1-\rho)\,\delta(x_i)$.
From fixed points to phase transitions:
[Figure: accuracy vs noise for the sparse Rademacher prior, following the fixed points of the state evolution.]
ALGORITHMIC INTERPRETATION
• Easy: solved by approximate message passing.
• Impossible: information-theoretically.
• Hard phase: in the presence of a first-order phase transition.
Conjecture: in the hard phase, no polynomial algorithm works.
- Physically sensible.
- Mathematically wide open.
Phase diagram for the sparse Rademacher prior $P_X(x_i) = \frac{\rho}{2}[\delta(x_i-1)+\delta(x_i+1)] + (1-\rho)\,\delta(x_i)$:
[Figure: phase diagram in the (sparsity $\rho$, noise $\Delta$) plane with easy, hard, and impossible regions.]
HARD PHASE IN NATURE
Metastable diamond = high error; equilibrium graphite = low error. Algorithms get stuck at high error for exponential time.
MAIN QUESTIONS
What is the minimal achievable estimation error on x*? ✅
(Is it possible to do better than PCA?)
What is the minimal efficiently achievable estimation error on x*? ✅
OPTIMAL SPECTRAL ALGORITHMS
For zero-mean priors, there is a spectral method with the same phase transition as AMP; AMP achieves a better error.
For noise that is not additive Gaussian, to obtain the optimal phase transition the spectral algorithm needs to be run on the Fisher score matrix
$$S_{ij} \equiv \left.\frac{\partial \log P_{\rm out}(y_{ij}|w)}{\partial w}\right|_{y_{ij},\,w=0}.$$
OPTIMAL PRE-PROCESSING
Exponential additive noise: $P_{\rm out}(y|w) = e^{-|y-w|}/2$, Fisher score $S_{ij} = \mathrm{sign}(Y_{ij})$.
Cauchy additive noise: $P_{\rm out}(y|w) = \frac{1}{\pi}\left[1+(y-w)^2\right]^{-1}$, Fisher score $S_{ij} = \frac{Y_{ij}}{1+Y_{ij}^2}$.
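A sketch comparing PCA on the raw data with PCA on the Fisher score matrix under Cauchy noise; the spike strength 3.0 is an arbitrary choice above the spectral transition of the score matrix:

```python
# Fisher-score pre-processing for the Cauchy channel: S_ij = Y_ij/(1+Y_ij^2).
# Plain PCA on Y fails because Cauchy noise has huge outliers.
import numpy as np

def top_eigvec_overlap(M, x_star):
    v = np.linalg.eigh(M)[1][:, -1]                 # leading eigenvector
    return abs(v @ x_star) / np.linalg.norm(x_star)

rng = np.random.default_rng(1)
N = 2000
x_star = rng.choice([-1.0, 1.0], size=N)
signal = 3.0 * np.outer(x_star, x_star) / np.sqrt(N)
Y = np.triu(signal + rng.standard_cauchy((N, N)), 1)
Y = Y + Y.T                                         # symmetric Cauchy-noise data

S = Y / (1 + Y ** 2)                                # Fisher score pre-processing
print("raw Y :", top_eigvec_overlap(Y, x_star))     # ~ 0: killed by outliers
print("score :", top_eigvec_overlap(S, x_star))     # clearly positive overlap
```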
OTHER EXAMPLES OF PHASE DIAGRAMS
Non-zero-mean prior: $P_X(x_i) = (1-\rho)\,\delta(x_i) + \rho\,\delta(x_i - 1)$.
[Figure: accuracy vs noise for $\rho = 0.2$ and $\rho = 0.01$, with the algorithmic threshold $\Delta_{\rm Alg}$ and the hard phase marked.]
Stochastic block model, r groups:
$$P_{\rm out}\!\left(Y_{ij}=1 \,\middle|\, \tfrac{x_i^\top x_j}{\sqrt{N}}\right) = p_{\rm out} + \frac{\mu}{\sqrt{N}}\, x_i^\top x_j, \qquad \Delta = \frac{p_{\rm out}(1-p_{\rm out})}{\mu^2}.$$
[Figure: MSE vs $\Delta r^2$ for $r = 15$: AMP from the solution, SE stable branch, SE unstable branch; thresholds $\Delta_c = \Delta_{\rm Alg}$, $\Delta_{\rm IT}$, $\Delta_{\rm Dyn}$, with the hard phase between them.]
For $r > 4$ a hard phase exists; for $r < 4$ it does not.
2 groups of different sizes, same average degree, with $\Delta = \frac{p_{\rm out}(1-p_{\rm out})}{\mu^2}$:
$$\begin{pmatrix} p_{\rm out} & p_{\rm out} \\ p_{\rm out} & p_{\rm out} \end{pmatrix} + \frac{\mu}{\sqrt{N}} \begin{pmatrix} \frac{1-\rho}{\rho} & -1 \\ -1 & \frac{\rho}{1-\rho} \end{pmatrix}, \qquad P_X(x) = \rho\,\delta\!\left(x - \sqrt{\tfrac{1-\rho}{\rho}}\right) + (1-\rho)\,\delta\!\left(x + \sqrt{\tfrac{\rho}{1-\rho}}\right).$$
$$\rho_c = \frac{1}{2} - \frac{1}{\sqrt{12}}$$
IT vs algorithmic thresholds at small $\rho$:
$$k_{\rm Alg} = \sqrt{N}\, \sqrt{\frac{p_{\rm out}}{1-p_{\rm out}}}, \qquad k_{\rm IT} = \log(N)\, \frac{4\, p_{\rm out}}{1-p_{\rm out}},$$
as in the balanced planted clique problem.
[Figure: phase diagram with easy, hard, and impossible regions.]
TENSORS
ZOOLOGY OF FIXED POINTS (FOR TENSOR ESTIMATION)
Zero-mean prior, $\mathbb{E}_X(x) = 0$:
SE has a “trivial” fixed point $M = 0$, stable for any $\Delta = \Omega(1)$.
Information-theoretic phase transition at $\Delta_{\rm IT} = \Omega(1)$.
Huge hard phase, until $\Delta = \Omega(N^{(2-p)/4})$ (e.g. Richard, Montanari ’14).
Non-zero-mean priors, $\mathbb{E}_X(x) \neq 0$:
Hard phase shrinks back to the $\Delta = \Omega(1)$ regime.
Take home: in tensor estimation, use your prior!
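A sketch of the rank-one tensor state evolution for p = 3 and the (zero-mean) ±1 prior, illustrating the stability of the trivial fixed point; all parameter values are arbitrary choices:

```python
# Tensor state evolution (rank 1, p = 3, +-1 prior):
# m^{t+1} = E_w[ tanh(m^{p-1}/Delta + sqrt(m^{p-1}/Delta) w) ].
import numpy as np

def tensor_se(Delta, m0, p=3, n_iter=500, n_samples=200_000, seed=0):
    w = np.random.default_rng(seed).normal(size=n_samples)
    m = m0
    for _ in range(n_iter):
        snr = m ** (p - 1) / Delta
        m = np.tanh(snr + np.sqrt(snr) * w).mean()
    return m

Delta = 0.2
print("from m ~ 0:", tensor_se(Delta, 1e-3))   # stuck near 0: the hard phase
print("from m ~ 1:", tensor_se(Delta, 0.99))   # informative fixed point
```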
SPIKED TENSOR (ZERO-MEAN SPIKE)
$P_X(x) = \mathcal{N}(x; 0, 1)$, p = 3.
[Figure: above $\Delta_{\rm IT}$, no information is contained in $Y$; below $\Delta_{\rm IT}$, estimation is GOOD statistically but HARD algorithmically.]
SPIKED TENSOR (NON-ZERO-MEAN SPIKE)
$P_X(x) = \mathcal{N}(x; 0.2, 1)$, p = 3.
[Figure: above $\Delta_{\rm IT}$, almost no information in $Y$; below it, a HARD and then an EASY phase.]
PHASE DIAGRAMS, SPIKED TENSORS (p = 3)
[Figure: phase diagrams for $P_X(x) = \mathcal{N}(x; \mu, 1)$ and $P_X(x_i) = (1-\rho)\,\delta(x_i) + \rho\,\delta(x_i-1)$.]
CONCLUSION
• Analysis of Bayes-optimal inference in low-rank matrix and tensor estimation.
• Approximate message passing and its performance.
• Channel universality. Optimal pre-processing for spectral methods.
• Existence of the hard phase (metastability next to a first-order phase transition) for a range of priors.
WORK IN PROGRESS
• Beyond iid priors: priors coming from another graphical model are also tractable, e.g. the optimal generalisation error in neural networks with one small hidden layer.
• Applications of optimal pre-processing for spectral methods: degree-corrected stochastic block model; inference of patterns learned by a real biological neural network.
• Nature of the hard phase. Deep connection with the algorithmic barrier of sum-of-squares proofs.
TALK BASED ON
• Lesieur, Krzakala, LZ, “Phase transitions in sparse PCA”, ISIT’15.
• Lesieur, Krzakala, LZ, “MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel”, Allerton’15.
• Lesieur, De Bacco, Banks, Krzakala, Moore, LZ, “Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering”, Allerton’16.
• Krzakala, Xu, LZ, “Mutual information in rank-one matrix estimation”, ITW’16.
• Barbier, Dia, Macris, Krzakala, Lesieur, LZ, “Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula”, NIPS’16.
• Lesieur, Krzakala, LZ, “Constrained low-rank matrix estimation: Phase transitions, approximate message passing and applications”, J. Stat. Mech.’17.
• Lesieur, Miolane, Lelarge, Krzakala, LZ, “Statistical and computational phase transitions in spiked tensor estimation”, ISIT’17.