Markov Chains for Tensor Network States
Sofyan Iblisdir, University of Barcelona
September 2013 · arXiv:1309.4880




Outline

Main issue discussed here: how to find good tensor network states?

- The optimal state problem

- Markov chains

- Multiplicative approximations

- Examples with MPS and PEPS

- Discussion and possible extensions (low- and high-energy eigenstates, 2D time evolution, finite temperature).


Introduction

- Tensor network states seem to accurately describe a wide variety of strongly correlated systems.

- Profound reason: Nature doesn't saturate entanglement; it makes sense to try and represent quantum states with states where entanglement follows well-defined patterns.
  - Gapped 1D systems: matrix product states (DMRG)
  - Scale-invariant 1D systems: entanglement renormalisation
  - Two-dimensional systems: projected entangled pair states

- Systematic investigation of condensed matter systems with such states has proven hugely successful over the past 20 years. For instance:
  - Progress on specific models (e.g. Kagome-Heisenberg, J1−J2)
  - Better understanding of why some systems are so hard to study.


The optimal state problem I

We are interested in an issue that can be informally stated as follows:

Given a Hamiltonian H (in one or two dimensions), and a family T of tensor network states, find the best approximation φ⋆ ∈ T to some ground state ψ⋆ of H.

Existing methods: DMRG (the most popular), Euclidean evolution, site-by-site tensor improvement.

In practice, these methods have proven very powerful; what has been achieved cannot be over-rated. It would however be wrong to consider the optimal state problem "closed", let alone "solved".


The optimal state problem II

- 1D. Despite its success, it is generally impossible to actually prove that DMRG indeed outputs a good solution; we can never be quite sure we haven't been led to a local minimum. Worse, one can construct situations where DMRG is known to get stuck in sub-optimal solutions. The situation is the same with other methods.

- The optimal state problem is NP-hard [1].

- The case of uniformly gapped quantum Hamiltonians seems to be more tractable [2]; there exists a poly-time certifiable algorithm. But the method requires prior knowledge about the gap and is tedious to implement.

- The situation seems even worse in two dimensions. What do we know exactly?

- The relevance of the problem is not only complexity-theoretic. See for example the MERA-DMRG comparisons for the Heisenberg model on a Kagome lattice [3]:

E_DMRG^old (spin liquid) > E_MERA (not spin liquid) > E_DMRG^new (spin liquid).

One possible strategy to make progress is to enlarge the set of methods available to search for optimal states. Certainly, this will lead to new difficulties. But hopefully, some insight can be gained from them.

[1] N. Schuch, I. Cirac, F. Verstraete, PRL 100, 250501 (2008).
[2] Z. Landau, U. Vazirani, T. Vidick, preprint arXiv:1307.5143 and refs. therein.
[3] S. Depenbrock et al., PRL 109, 067201 (2012).


The optimal state problem III

We have considered the use of Markov chains [4] concatenated into a simulated annealing algorithm.

- Connects the optimal state problem to sampling. Perhaps it will allow us to benefit from decades of effort and an abundant literature.

- Gain new perspectives on our problem. Presumably, the issue of certifying that a solution is a global optimum can be related to that of bounding mixing times (convergence) for Markov chains.

- When optimality cannot be guaranteed, it is reasonable to try and get answers through independent ways and compare; such weak forms of verification are commonplace in hard combinatorial problems such as 3-SAT.

- Seems flexible, might well allow for extensions to other problems: high-energy eigenstates, finite temperature, time evolution.

- Relatively easy to implement: no need to compute environments in 2D, for instance.

N.B. There exists earlier work on Monte-Carlo for tensor networks, but for a wholly different purpose: an approximate contraction scheme, where optimisation was still "greedy". Here standard contraction will do, but with a different search method, where moves to worse configurations are allowed.

[4] The Metropolis algorithm has turned 60 this year!


The Haar-Markov chain I

Consider a chain made of n particles interacting via H (we assume open boundary conditions). Assume we want the best approximation of the ground state in terms of an MPS with fixed bond dimension χ:

|ψ⟩ = ∑_{s1…sn} tr[A1(s1) ⋯ An(sn)] |s1 … sn⟩.   (1)
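As a concrete illustration (not part of the slides), the coefficient in Eq. (1) is just a product of matrices followed by a trace; the sketch below uses the illustrative helper name `mps_amplitude`:

```python
import numpy as np

def mps_amplitude(tensors, config):
    """Coefficient tr[A1(s1) ... An(sn)] of the MPS in Eq. (1).

    tensors: list of n arrays of shape (d, chi, chi); tensors[k][s] is Ak(s).
    config:  tuple (s1, ..., sn) of local basis labels.
    """
    chi = tensors[0].shape[1]
    m = np.eye(chi, dtype=complex)
    for a, s in zip(tensors, config):
        m = m @ a[s]  # multiply in the matrix for this site and spin value
    return np.trace(m)

# Sanity check: for chi = 1 the MPS is a product state, so the coefficient
# factorises into a product of scalars.
rng = np.random.default_rng(0)
tensors = [rng.normal(size=(2, 1, 1)) + 0j for _ in range(4)]
config = (0, 1, 1, 0)
expected = np.prod([a[s, 0, 0] for a, s in zip(tensors, config)])
assert np.isclose(mps_amplitude(tensors, config), expected)
```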

Our first task will be to define a convenient configuration space. Consider the map ϕ : U(dχ)^⊗n → Hilb : ω ↦ |ω⟩:

[Figure 1: Possible preparation of a matrix product state (open boundary conditions) of n d-dimensional quantum particles. A χ-dimensional ancilla is prepared in a reference state |1χ⟩ and interacts sequentially with each of the considered particles, also prepared in a reference state |1d⟩. The ancilla is eventually projected onto |1χ⟩ and discarded.]

[Figure 2: Schematic illustration of a Haar-Markov move. In this example, where d = 2 and χ = 1, the configuration space is U(2)^⊗n and can be represented as a set of n points on a 2-sphere. The plain (resp. dotted) arrow represents the current (resp. candidate) configuration.]

We choose the configuration space to be Ω ≡ U(dχ)⊗n.


The Haar-Markov chain II

Let h(ω) = ⟨ω|H|ω⟩/⟨ω|ω⟩ be the energy associated with ω. Sampling according to

π^(β)(ω) ≡ e^{−β h(ω)} / ∫ dω′ e^{−β h(ω′)},   ω ∈ Ω,   (2)

would be an invaluable resource. However, except for a few trivial cases, direct sampling of π^(β) seems to be impossible. But a Markov chain can be constructed:

1. Draw n i.i.d. unitary matrices v_k, k = 1…n, from U(dχ) according to the Haar measure [5].

2. From the current configuration ω = {u_k, k = 1…n}, construct a candidate ω′ = {u_k v_k^α : k = 1…n}, with 0 < α < 1.

3. Accept the proposed move ω → ω′ with (Metropolis) probability

P^(β)_acc(ω → ω′) = min{1, e^{−β(h(ω′)−h(ω))}}.   (3)

[5] F. Mezzadri, Not. Amer. Math. Soc. 54, 592 (2007).
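The three steps above can be sketched as follows (illustrative, not the talk's code; the stand-in energy is a toy, and Haar unitaries are drawn with the QR recipe described by Mezzadri):

```python
import numpy as np

def haar_unitary(m, rng):
    """Haar-random m x m unitary via QR of a complex Gaussian matrix
    (the recipe described by Mezzadri)."""
    z = (rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))  # fix the column phases

def frac_power(u, alpha):
    """u**alpha for a unitary u, through its eigendecomposition."""
    w, p = np.linalg.eig(u)
    return p @ np.diag(w**alpha) @ np.linalg.inv(p)

def haar_markov_step(omega, h, beta, alpha, rng):
    """One move of the chain: perturb every u_k by v_k**alpha (step 2) and
    accept with the Metropolis probability of Eq. (3)."""
    m = omega[0].shape[0]
    candidate = [u @ frac_power(haar_unitary(m, rng), alpha) for u in omega]
    if rng.random() < min(1.0, np.exp(-beta * (h(candidate) - h(omega)))):
        return candidate
    return omega

# Toy usage with a stand-in energy favouring unitaries close to the identity.
rng = np.random.default_rng(42)
h = lambda omega: -sum(np.real(np.trace(u)) for u in omega)
omega = [haar_unitary(4, rng) for _ in range(5)]
for _ in range(100):
    omega = haar_markov_step(omega, h, beta=5.0, alpha=0.3, rng=rng)
```

In a real run h(ω) would be the MPS energy of the configuration; small α means modest moves, α close to 1 ambitious ones.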


The Haar-Markov chain III

1. Draw v_k, k = 1…n, from U(dχ) ∼ Haar measure.

2. ω′ = {u_k v_k^α : k = 1…n}, 0 < α < 1.

3. P^(β)_acc(ω → ω′) = min{1, e^{−β(h(ω′)−h(ω))}}.

A picture: [Figures 1 and 2 from the previous slide.]

- Implementation: draw random numbers and unitaries, evaluate h(ω).

- α should be tuned to achieve a good trade-off between modest and ambitious moves.

- Strongly ergodic: P_trans(ω → ω′) > 0 for all ω, ω′.

- Reversible: π^(β)(ω) P_trans(ω → ω′) = π^(β)(ω′) P_trans(ω′ → ω).


The Haar-Markov chain IV

That the Markov chain is defined on a (connected) compact space, is strongly ergodic and is reversible implies that P_trans is a contraction:

||P^τ_trans π0 − π^(β)||_TV ≤ 2 η(β)^τ,   η(β) < 1.

As β → ∞, the distribution

π^(β)(ω) ≡ e^{−β h(ω)} / ∫ dω′ e^{−β h(ω′)},   ω ∈ Ω,   (4)

exponentially concentrates on minimal-energy states. So if β is increased sufficiently slowly with time, we get a simulated annealing scheme, converging towards a global optimum. We don't know how slow is slow enough; finding a useful bound on η(β) is a difficult problem.
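The annealing loop itself is generic; a minimal sketch, assuming a user-supplied single-move function, with the linear schedule β(t) = β0 + κt that the 1D Ising example below uses (names are illustrative):

```python
import numpy as np

def simulated_annealing(step, omega0, h, beta0, kappa, n_steps, rng):
    """Metropolis chain annealed with the linear schedule beta(t) = beta0 + kappa*t.

    step(omega, h, beta, rng) makes one Metropolis move and returns the
    (possibly unchanged) configuration; the best configuration seen is kept.
    """
    omega, best, e_best = omega0, omega0, h(omega0)
    for t in range(n_steps):
        omega = step(omega, h, beta0 + kappa * t, rng)
        e = h(omega)
        if e < e_best:
            best, e_best = omega, e
    return best, e_best

# Toy check on a 1D landscape: the chain should settle near the minimum at 3.
def gaussian_step(x, h, beta, rng):
    y = x + rng.normal(scale=0.5)
    dh = h(y) - h(x)
    if dh <= 0 or rng.random() < np.exp(-beta * dh):
        return y
    return x

rng = np.random.default_rng(1)
best, e_best = simulated_annealing(gaussian_step, 10.0, lambda x: (x - 3.0) ** 2,
                                   beta0=0.1, kappa=0.05, n_steps=2000, rng=rng)
assert abs(best - 3.0) < 0.5
```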


Multiplicative approximation

What should we reasonably expect? Asking for, say, 14-digit precision is probably too much. Following requirements common in the Monte-Carlo literature, a simulation will be considered efficient if it yields a multiplicative approximation in polynomial time. Let e0 denote the true ground-state energy. We will be content if the Metropolis algorithm yields an estimate e′0 satisfying

Prob[ |e′0 − e0| < δ |e0| ] > 2/3,   (5)

in a time that grows polynomially with n, χ, δ^{−1}.
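The constant 2/3 in (5) is conventional: repeating an estimator that satisfies (5) and taking the median boosts the success probability exponentially in the number of repetitions (a standard Chernoff-bound argument; the sketch below is illustrative, not from the talk):

```python
import numpy as np

def boosted_estimate(run_once, k, rng):
    """Median of k independent runs of an estimator.

    If a single run lands within the desired multiplicative error with
    probability > 2/3, the median fails only when more than half of the
    runs fail, which happens with probability exponentially small in k.
    """
    return np.median([run_once(rng) for _ in range(k)])

# Toy estimator: correct value 1.0 with probability 0.7, wildly off otherwise.
rng = np.random.default_rng(0)
noisy = lambda rng: 1.0 if rng.random() < 0.7 else 100.0
assert boosted_estimate(noisy, 101, rng) == 1.0
```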

Remark: Computing the partition function of the ferromagnetic Ising model (with a magnetic field) is known to be #P-complete. Jerrum and Sinclair [6] have however been able to construct an efficient multiplicative approximation for the problem, through a clever design of Markov chains. It would be interesting to exhibit quantum analogues of this situation.

[6] M. Jerrum, A. Sinclair, SIAM J. Comput. 22, 1087 (1993).


A case study: the 1D Ising model

H = −∑_{k=1}^{n} σ^z_k − ∑_{k=1}^{n−1} σ^x_k σ^x_{k+1}.   (6)
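For small n, Eq. (6) can be diagonalised exactly, which gives an independent check on any Markov-chain estimate; a dense-matrix sketch (illustrative names; only feasible for roughly n ≤ 14):

```python
import numpy as np
from functools import reduce

sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])
id2 = np.eye(2)

def op_at(ops, n):
    """Tensor product over n sites, with single-site operators placed
    according to the dict ops (site index -> 2x2 matrix)."""
    return reduce(np.kron, [ops.get(k, id2) for k in range(n)])

def ising_ground_energy(n):
    """Exact ground energy of Eq. (6) with open boundary conditions,
    by dense diagonalisation of the 2^n x 2^n Hamiltonian."""
    H = -sum(op_at({k: sz}, n) for k in range(n))
    H = H - sum(op_at({k: sx, k + 1: sx}, n) for k in range(n - 1))
    return np.linalg.eigvalsh(H)[0]

# n = 2 by hand: the ground energy is -sqrt(5).
assert np.isclose(ising_ground_energy(2), -np.sqrt(5))
```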

A linear schedule has been assumed: β(t) = β0 + κt.

χ  | HM (β < ∞) | HM (β = ∞) | EE
2  | −50.354476 | −50.503292 | −50.503303
4  | −50.489549 | −50.511711 | −50.567694
8  | −50.338708 | −50.514589 | −50.569413
16 | −50.354480 | −50.502324 | −50.569425

All computations were made on a single-core desktop computer, and stopped at ∼10^−3 relative accuracy. The data are not "converged"; better figures are obtained if longer times are used. The longest computation took ∼48 hours, and could be reduced down to ∼3 hours (double-precision complex → single-precision real). At this stage, we shouldn't aim at a comparison with prior methods (it would be too unfair!):

- Many improvements are possible (some of them very mundane). We make no claim of optimality; there is a lot of room for improvement.

- Usual methods are particularly efficient for this Hamiltonian.

- These first results are encouraging.


Extensions to two dimensions I

The scheme can be extended to PEPS (nx × ny square lattice). Again a unitary configuration space can be associated to the problem:

Ω ≡ U(dχ³)^⊗(nx ny)

For any search scheme, there is an important additional difficulty when moving from MPS to PEPS: accuracy when estimating mean values. Let h(ω), δh(ω) denote an estimate for ⟨ω|H|ω⟩ together with a bound on the error. Similarly, let n(ω), δn(ω) denote an estimate for ⟨ω|ω⟩ together with a bound on the error.

Definition: a PEPS is pointless if |h(ω)| ≤ δh(ω), n(ω) ≤ δn(ω), or both. A PEPS is contractible if it is not pointless.

The Monte-Carlo scheme can be designed so as to avoid pointless PEPS; this stabilises computations. We consider the distribution:

π^(β)(ω) ≡ e^{−β g(ω)} / ∫ dω′ e^{−β g(ω′)},   ω ∈ Ω,

g(ω) = +∞ if ω is pointless,
g(ω) = max_{a,b ∈ {0,1}} [h(ω) + (−1)^a δh(ω)] / [n(ω) + (−1)^b δn(ω)] otherwise.
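The case distinction defining g(ω) translates directly into code; a sketch, where `g_energy` and its argument names are illustrative stand-ins for the estimates h(ω), δh(ω), n(ω), δn(ω):

```python
import math

def g_energy(h, dh, nrm, dnrm):
    """Pessimistic energy g(omega) built from the estimates and their error
    bars. Pointless configurations (estimate drowned by its own error bar)
    get g = +inf, so the Metropolis rule never moves onto them."""
    if abs(h) <= dh or nrm <= dnrm:
        return math.inf
    # Worst (largest) energy consistent with the error bars: maximise the
    # ratio over the four sign choices (-1)^a, (-1)^b.
    return max((h + s1 * dh) / (nrm + s2 * dnrm)
               for s1 in (+1, -1) for s2 in (+1, -1))

# Example: h = -10 +/- 1, n = 2 +/- 0.5 gives (-10 + 1)/(2 + 0.5) = -3.6.
assert abs(g_energy(-10.0, 1.0, 2.0, 0.5) - (-3.6)) < 1e-12
assert g_energy(0.5, 1.0, 2.0, 0.5) == math.inf  # pointless: |h| <= dh
```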


Extensions to two dimensions II

We have tested the method with the Ising model, defined on a 7 × 7 square lattice, with parameters χ = 2, χc = 4.

H = −Jx ∑_{⟨k,l⟩} σ^x_k σ^x_l − hz ∑_k σ^z_k.   (7)

Jx  | hz  | HM      | EE [7]
1   | 1/2 | −85.825 | −85.869(3)
1   | 1   | −91.492 | −91.520(8)
1/2 | 1   | −57.192 | −57.377(3)

Each computation was again made on a single-core desktop. Time: 7 days with double-precision complex; ∼ half a day with single-precision reals. No claim of optimality; this can be significantly improved.

[7] Thanks to M. Lubasch for the EE data.


Discussion and possible extensions

- Markov chains for tensor network states introduced. No environments; very easy implementation.

- The method is certainly worth further exploration.

- It would be interesting to explore connections between the analysis of Markov chains and complexity issues regarding quantum Hamiltonians.

- First results encouraging; linear schedules seem to be possible. How well does it compare to existing methods? We don't know, but there are many possibilities for improvement:

  - Non-monotonic schedules? (seems very powerful)
  - Metropolis scaling? (trade-off between modest and ambitious moves)
  - Beyond the simple Haar-Markov chain? We are currently working on biased priors; these can dramatically boost computations; compare, say, single-site updates and the Wolff algorithm for classical Ising models.
  - Other configuration spaces

I The method is very flexible.

  - Trees and MERAs?
  - Fermions? (Plain Jordan-Wigner, mapping to Hamiltonians of spins, fermionic PEPS)
  - Low- and high-energy excited states? Change the figure of merit:
    h′(ω) = θ(h(ω) − e1) θ(e2 − h(ω)) + λ |⟨ω|H²|ω⟩ − ⟨ω|H|ω⟩²|.
  - Spectral decomposition and time evolution for PEPS?
  - Finite temperature?
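Transcribed literally, the modified figure of merit reads as below (a sketch with illustrative names; θ is the Heaviside step, and the λ term is the energy variance, which vanishes exactly on eigenstates):

```python
def excited_state_merit(mean_h, mean_h2, e1, e2, lam):
    """Literal transcription of h'(omega): a window term built from Heaviside
    steps on the energy window [e1, e2], plus lam times the energy variance.
    mean_h and mean_h2 stand for <H> and <H^2> in the (normalised) state."""
    theta = lambda x: 1.0 if x >= 0 else 0.0
    window = theta(mean_h - e1) * theta(e2 - mean_h)
    return window + lam * abs(mean_h2 - mean_h ** 2)

# On an exact eigenstate (<H^2> = <H>^2) inside the window, only the window
# term survives.
assert excited_state_merit(1.0, 1.0, 0.0, 2.0, 0.5) == 1.0
```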
