
Encoding Electronic Spectra in Quantum Circuits with Linear T Complexity

Ryan Babbush,1,* Craig Gidney,2 Dominic W. Berry,3 Nathan Wiebe,4 Jarrod McClean,1 Alexandru Paler,5 Austin Fowler,2 and Hartmut Neven1

1 Google Inc., Venice, California 90291, USA
2 Google Inc., Santa Barbara, California 93117, USA
3 Department of Physics and Astronomy, Macquarie University, Sydney, NSW 2109, Australia
4 Microsoft Research, Redmond, Washington 98052, USA
5 Institute for Integrated Circuits, Linz Institute of Technology, 4040 Linz, Austria

(Received 9 May 2018; revised manuscript received 1 August 2018; published 23 October 2018)

We construct quantum circuits that exactly encode the spectra of correlated electron models up to errors from rotation synthesis. By invoking these circuits as oracles within the recently introduced “qubitization” framework, one can use quantum phase estimation to sample states in the Hamiltonian eigenbasis with optimal query complexity $O(\lambda/\epsilon)$, where $\lambda$ is an absolute sum of Hamiltonian coefficients and $\epsilon$ is the target precision. For both the Hubbard model and electronic structure Hamiltonian in a second quantized basis diagonalizing the Coulomb operator, our circuits have T-gate complexity $O(N + \log(1/\epsilon))$, where $N$ is the number of orbitals in the basis. This scenario enables sampling in the eigenbasis of electronic structure Hamiltonians with T complexity $O(N^3/\epsilon + N^2\log(1/\epsilon)/\epsilon)$. Compared to prior approaches, our algorithms are asymptotically more efficient in gate complexity and require fewer T gates near the classically intractable regime. Compiling to surface code fault-tolerant gates and assuming per-gate error rates of one part in a thousand reveals that one can error correct phase estimation on interesting instances of these problems beyond the current capabilities of classical methods using only about a million superconducting qubits in a matter of hours.

DOI: 10.1103/PhysRevX.8.041015
Subject Areas: Chemical Physics, Quantum Information, Strongly Correlated Materials

I. INTRODUCTION

The ubiquitous problem of predicting material properties and chemical reactions from ab initio quantum mechanics is among the most anticipated applications of quantum computing. The limitation of most classical algorithms for modeling the physics of superconductivity and molecular electronic structure arises from the seemingly exponential growth of entanglement required to accurately capture strong correlation in systems of interacting electrons. This apparent classical intractability was referenced by Feynman in his seminal work as a key motivation for why we need quantum computers [1,2]. Fourteen years later, Lloyd formalized the concept of a universal quantum simulator [3] and demonstrated an extension for treating systems of interacting electrons in second quantization [4].

Since then, most work developing fermionic quantum simulation methods has focused on time evolution as a means of estimating Hamiltonian spectra and preparing eigenstates [5] via the quantum phase estimation algorithm [6]. Beginning with the proposal of Ref. [7], the idea that one should use phase estimation and adiabatic state preparation [8–10] to extract quantum chemistry ground-state energies became especially popular. More recently, experimental demonstrations [11–15] have focused on the development of variational algorithms [16,17], which are often [18,19], but not always [20,21], inspired by time-evolution primitives.

Performing quantum phase estimation to sample Hamiltonian spectra requires a quantum circuit to implement an operation $W(H)$, which has eigenvalues that are a known (and efficient-to-compute) function of the eigenvalues of $H$. Most past work has analyzed phase estimation of circuits corresponding to dynamical Hamiltonian simulation, i.e., $W(H) \approx e^{-iH\tau}$ for some duration $\tau$ [6]. We denote by $f$ the cost of implementing a primitive circuit that is repeated to realize $W(H)$, e.g., a Trotter step [22] or Taylor series segment [23]. We further define $g(\epsilon)$ as the number of times that one must repeat that primitive to ensure that the error in the spectrum of $H$ encoded in the eigenphases of $W(H)$ is at most $O(\epsilon)$. Then, the cost of phase estimation is bounded by

*Corresponding [email protected]

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

PHYSICAL REVIEW X 8, 041015 (2018)


$$O\!\left(\frac{f \cdot g(\epsilon)}{\epsilon}\,\|W'(H)\|^{-1}\right), \tag{1}$$

where $\|\cdot\|$ denotes the spectral norm and we have taken the derivative of the function of the eigenvalues in the operation $W'(H)$. In other words, $W'(H)$ has eigenvalues that are a function of the eigenvalues of $H$, and that function is the derivative of the function that gives the eigenvalues of $W(H)$. For the case of dynamical time evolution,

$$W(H) \approx e^{-iH\tau}, \qquad \|W'(H)\|^{-1} = \|-i\tau e^{-iH\tau}\|^{-1} = \frac{1}{\tau}, \tag{2}$$

implying that the cost of phase estimation is $O(f \cdot g(\epsilon)/(\epsilon\tau))$ in this context.

Modern Hamiltonian simulation methods such as the signal processing algorithm [24] and qubitization [25] have achieved the provably optimal scaling that is possible for $g(\epsilon)$ within a query model that aims to synthesize $e^{-iH\tau}$:

$$O\!\left(\lambda\tau + \frac{\log(1/\epsilon)}{\log\log(1/\epsilon)}\right). \tag{3}$$

The definition of $\lambda$ depends on the query model; e.g., in models for which the Hamiltonian is given as a weighted sum of unitaries, $\lambda$ is the sum of the absolute values of the weightings [25]. However, $W(H) \approx e^{-iH\tau}$ is not the only encoding from which one may sample the spectrum of $H$ via phase estimation. Recent papers [26,27] have advocated performing phase estimation on a quantum walk operator corresponding to $W(H) = e^{i\arccos(H/\lambda)}$, which can be realized exactly as a quantum circuit without approximations beyond those required for rotation synthesis [25]. (This quantum walk operator also produces eigenvalues corresponding to $e^{-i\arccos(H/\lambda)}$, but we ignore those for simplicity of the exposition here.) Even within a black-box query model, one can achieve $g(\epsilon) = O(1)$ if the goal is to implement $e^{i\arccos(H/\lambda)}$ rather than $e^{-iH/\lambda}$. Performing phase estimation on either circuit would provide the same information since the spectra of these operators are isomorphic. In this case, the cost of phase estimation is $O(f \cdot \lambda/\epsilon)$, which follows from Eq. (1), $g(\epsilon) = O(1)$, and

$$W(H) = e^{i\arccos(H/\lambda)}, \qquad \|W'(H)\|^{-1} = \left\|\frac{-i\,e^{i\arccos(H/\lambda)}}{\sqrt{\lambda^2 - H^2}}\right\|^{-1} \le \lambda. \tag{4}$$
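The bound in Eq. (4) can be checked by differentiating the scalar function that maps an eigenvalue $x$ of $H$ to the corresponding eigenvalue of $W(H)$. The short derivation below is a sketch that assumes every eigenvalue $E_k$ of $H$ satisfies $|E_k| < \lambda$; the symbols $x$ and $E_k$ are introduced here only for illustration.

```latex
\begin{aligned}
W(x) &= e^{i\arccos(x/\lambda)}, \qquad
\frac{d}{dx}\arccos(x/\lambda) = -\frac{1}{\sqrt{\lambda^{2}-x^{2}}},\\
W'(x) &= \frac{-i\,e^{i\arccos(x/\lambda)}}{\sqrt{\lambda^{2}-x^{2}}}, \qquad
|W'(x)| = \frac{1}{\sqrt{\lambda^{2}-x^{2}}} \ge \frac{1}{\lambda},\\
\|W'(H)\|^{-1} &= \min_{k}\sqrt{\lambda^{2}-E_{k}^{2}} \le \lambda .
\end{aligned}
```

Substituting this bound together with $g(\epsilon) = O(1)$ into Eq. (1) gives the $O(f \cdot \lambda/\epsilon)$ cost quoted above.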

This work develops methods with such scaling for modeling systems of correlated electrons so that $f = O(N + \log(1/\epsilon))$.

Our focus on synthesizing unitaries for phase estimation, rather than time-evolution operators that could be used as variational primitives, will result in quantum circuits with millions of gates. Hence, we need quantum error correction. We focus on planar nearest-neighbor coupled arrays of qubits, which are being developed experimentally by multiple groups [28–30]. We use the surface code [31–35], as it has the highest gate threshold error rate for this geometry. Within this model of fault-tolerant quantum computation, the physical resources required for error correcting a quantum circuit are mostly determined by (i) the number of logical qubits and (ii) the number of T gates.

The focus on T gates arises because applying a single T gate consumes many logical qubits and takes significantly longer than applying any other operation [36]. Preparing a $|T\rangle$ state to enable a T gate requires hundreds of thousands of physical qubits. If the goal is to minimize the number of qubits required to execute an algorithm, it makes sense to prepare $|T\rangle$ states serially. Typically, it also takes over 100 rounds of error detection to prepare a $|T\rangle$ state, leaving plenty of time to perform Clifford gates in parallel with this preparation, meaning the execution time of the complete algorithm can be approximated as the total number of T gates multiplied by the time to prepare each $|T\rangle$ state. Thus, throughout this work, we focus on T complexity as the primary cost model. We note, however, that for all algorithms presented or discussed in this work, the T complexity is within logarithmic factors of the gate complexity.

We focus on the two most-studied models of correlated electrons: the Fermi-Hubbard model and the molecular electronic structure Hamiltonian. The Hubbard Hamiltonian is an approximate model of electrons interacting on a planar lattice which some believe may qualitatively capture the behavior of high-temperature superconductivity in cuprates [37]. The molecular electronic structure Hamiltonian is a realistic model of electrons interacting via the Coulomb potential with real kinetic energy in the presence of an external potential (which usually arises from atomic nuclei), in a finite-sized basis [38]. We focus on simulating the electronic structure Hamiltonian in a basis diagonalizing the Coulomb potential [39–41]. For both the Hubbard model and molecular electronic structure Hamiltonian, we are able to provide circuits that simulate $e^{i\arccos(H/\lambda)}$ with T complexity $O(N + \log(1/\epsilon))$, where $N$ is the number of orbitals in a second-quantized representation of the system. In Tables I and II, we compare the T complexity of past quantum simulation methods for these problems.

In Theorems 1 and 2, we concisely state the T complexity and ancilla requirements of our approach to phase estimation for both the electronic structure Hamiltonian and Hubbard model Hamiltonian, respectively. Both of these theorems are established throughout the paper, but especially in Eqs. (54), (55), (61), and (62). In addition to bounding the T complexity of our algorithms, we provide explicit circuits for their construction and compile all bottleneck primitives down to surface code fault-tolerant gates (topological braiding diagrams). Therefore, the


TABLE I. Progression of lowest T complexity algorithms for implementing a phase estimation unitary encoding eigenvalues of the electronic structure Hamiltonian in second quantization. Note that $N$ is the number of spin orbitals, and $\epsilon$ is the target precision. Here and throughout the paper, $O(\cdot)$ indicates an upper bound, $\widetilde{O}(\cdot)$ indicates an upper bound ignoring polylogarithmic factors, and $O(\sim\cdot)$ indicates empirical scaling extrapolated from numerics. “Oracle T gates” refers to $f$ from Eq. (1), and “PEA queries” refers to the rest of the expression in Eq. (1). The scalings attributed to the work on general methods of Hamiltonian simulation assume that one uses the best oracles for electronic structure available at that point in time; e.g., the scaling attributed to Ref. [25] assumes the use of oracles from Ref. [42], and the scaling attributed to Ref. [26] assumes the use of oracles from Ref. [39]. While absent from this table since it did not asymptotically reduce T complexity, Ref. [43] was the first work to explicitly compile a quantum chemistry simulation to Clifford + T gates.

| Year | Reference | Basis | Algorithm | Oracle T gates | PEA queries | Total T gates |
|---|---|---|---|---|---|---|
| 2005 | Aspuru-Guzik et al. [7] | Gaussians | Trotterization | $O(\mathrm{poly}(N/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ |
| 2010 | Whitfield et al. [44] | Gaussians | Trotterization | $O(N^4\log(1/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ |
| 2013 | Wecker et al. [45] | Gaussians | Trotterization | $O(N^4\log(1/\epsilon))$ | $O(N^6/\epsilon^{3/2})$ | $O([N^{10}\log(1/\epsilon)]/\epsilon^{3/2})$ |
| 2014 | McClean et al. [46] | Gaussians | Trotterization | $O(\sim N^2\log(1/\epsilon))$ | $O(N^6/\epsilon^{3/2})$ | $O([\sim N^8\log(1/\epsilon)]/\epsilon^{3/2})$ |
| 2014 | Poulin et al. [47] | Gaussians | Trotterization | $O(N^4\log(1/\epsilon))$ | $O(\sim N^2/\epsilon^{3/2})$ | $O([\sim N^6\log(1/\epsilon)]/\epsilon^{3/2})$ |
| 2014 | Babbush et al. [48] | Gaussians | Trotterization | $O(N^4\log(1/\epsilon))$ | $O(\sim N/\epsilon^{3/2})$ | $O([\sim N^5\log(1/\epsilon)]/\epsilon^{3/2})$ |
| 2015 | Babbush et al. [42] | Gaussians | Taylorization | $\widetilde{O}(N)$ | $O([N^4\log(N/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ | $\widetilde{O}(N^5/\epsilon)$ |
| 2016 | Low et al. [25] | Gaussians | Qubitization | $\widetilde{O}(N)$ | $O((N^4/\epsilon) + [\log(N/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ | $\widetilde{O}(N^5/\epsilon)$ |
| 2017 | Babbush et al. [39] | Plane waves | Taylorization | $\widetilde{O}(N)$ | $O([N^{8/3}\log(N/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ | $\widetilde{O}(N^{11/3}/\epsilon)$ |
| 2017 | Berry et al. [26] | Plane waves | Qubitization | $\widetilde{O}(N)$ | $O(N^{8/3}/\epsilon)$ | $\widetilde{O}(N^{11/3}/\epsilon)$ |
| 2018 | Kivlichan et al. [40] | Plane waves | Trotterization | $O(N^2 + N\log N\log(1/\epsilon))$ | $O(\sim N^{3/2}/\epsilon^{3/2})$ | $O(\sim N^{7/2}/\epsilon^{3/2})$ |
| 2018 | This paper | Plane waves | Qubitization | $O(N + \log(1/\epsilon))$ | $O(N^2/\epsilon)$ | $O([N^3 + N^2\log(1/\epsilon)]/\epsilon)$ |
TABLE II. Progression of lowest T complexity algorithms for implementing a unitary encoding eigenvalues of the Hubbard model for phase estimation. Here, $N$ is the number of sites, and $\epsilon$ is the target precision. As before, $O(\cdot)$ indicates an upper bound, $\widetilde{O}(\cdot)$ indicates an upper bound ignoring polylogarithmic factors, and $O(\sim\cdot)$ indicates empirical scaling extrapolated from numerics. “Oracle T gates” refers to $f$ from Eq. (1), and “PEA queries” refers to the rest of the expression in Eq. (1). The scalings attributed to the work on general methods of Hamiltonian simulation assume that one uses the best oracles for the Hubbard model available at that point in time. For instance, the scaling attributed to Refs. [25] and [26] assumes that one uses the SELECT oracles from Ref. [42], which also work for the Hubbard model. We assume that the work of Ref. [49] would use oracles from Ref. [27]. The scalings attributed to this work assume that our techniques are combined with those from Ref. [49], even though we do not focus on that strategy in our fault-tolerant analysis since those methods seem less effective for finite problem sizes near the classically intractable regime. Nonetheless, we discuss how our methods can be combined to provide the stated complexity in Sec. V D.

| Year | Reference | Algorithm | Ancillae | Oracle T gates | PEA queries | Total T gates |
|---|---|---|---|---|---|---|
| 1997 | Abrams et al. [4] | Trotterization | $O(1)$ | $O(N\log(1/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ | $O(\mathrm{poly}(N/\epsilon))$ |
| 2015 | Wecker et al. [50] | Trotterization | $O(1)$ | $O(N\log(1/\epsilon))$ | $O(N^2/\epsilon^{3/2})$ | $O([N^3\log(1/\epsilon)]/\epsilon^{3/2})$ |
| 2015 | Babbush et al. [42] | Taylorization | $O([\log(N/\epsilon)]/[\log\log(N/\epsilon)])$ | $O(N\log(1/\epsilon))$ | $O([N\log(N/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ | $O([N^2\log(N/\epsilon)\log(1/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ |
| 2016 | Low et al. [25] | Qubitization | $O(\log N)$ | $O(N\log(N/\epsilon))$ | $O((N/\epsilon) + [\log(N/\epsilon)]/[\epsilon\log\log(N/\epsilon)])$ | $O([N^2\log(N/\epsilon)]/\epsilon)$ |
| 2017 | Berry et al. [26] | Qubitization | $O(\log N)$ | $O(N\log(N/\epsilon))$ | $O(N/\epsilon)$ | $O([N^2\log(N/\epsilon)]/\epsilon)$ |
| 2017 | Poulin et al. [27] | Qubitization | $O(N)$ | $O(N + \log(1/\epsilon))$ | $O(N/\epsilon)$ | $O([N^2 + N\log(1/\epsilon)]/\epsilon)$ |
| 2018 | Haah et al. [49] | Qubitization | $O(\log N)$ | $O(N\log(N/\epsilon))$ | $\widetilde{O}(1/\epsilon)$ | $\widetilde{O}(N/\epsilon)$ |
| 2018 | Kivlichan et al. [40] | Trotterization | $O(\log N)$ | $O(N + \log N\log(1/\epsilon))$ | $O(\sim 1/\epsilon^{3/2})$ | $O(\sim N/\epsilon^{3/2})$ |
| 2018 | This paper | Qubitization | $O(\log N)$ | $O(N + \log(1/\epsilon))$ | $\widetilde{O}(1/\epsilon)$ | $\widetilde{O}(N/\epsilon)$ |


fault-tolerant aspect of our analysis goes further than prior estimates in the simulation literature [51], the most rigorous of which stopped at estimates of T complexity for Trotter-based electronic structure phase estimation [43] and for a variety of techniques used to effect time-evolution of the one-dimensional Heisenberg model [52]. We show that one can perform fault-tolerant phase estimation on interesting instances of both Fermi-Hubbard and molecular electronic structure beyond the capabilities of known classical algorithms using roughly $1 \times 10^6$ physical qubits in the surface code, assuming an architecture with two-qubit error rates of about one part in a thousand.

Theorem 1. Consider the electronic structure Hamiltonian in a basis of $N$ spin orbitals, which diagonalizes the Coulomb operator, $H = \sum_{p,q} T(p-q)\,a_p^\dagger a_q + \sum_p U(p)\,n_p + \sum_{p\neq q} V(p-q)\,n_p n_q$, where $\{a_p^\dagger, a_q\} = \delta_{pq}$ are fermionic raising and lowering operators. Furthermore, define $\lambda = \sum_{pq}|T(p-q)| + \sum_p |U(p)| + \sum_{p\neq q}|V(p-q)|$. Then, one can perform phase estimation to sample in the eigenbasis of $H$ with an additive error of at most $\epsilon$ in the eigenvalue using circuits with a number of T gates scaling as $24\sqrt{2}\,\pi N\lambda/\epsilon + O((\lambda/\epsilon)\log(N/\epsilon))$ and a number of ancilla qubits scaling as $\log(\lambda^3 N^5/\epsilon^3) + O(1)$.

Theorem 2. Consider the square planar Hubbard model with periodic boundary conditions in a basis of $N$ spin orbitals, $H = -t\sum_{\langle p,q\rangle,\sigma} a_{p,\sigma}^\dagger a_{q,\sigma} + (u/2)\sum_{p,\alpha\neq\beta} n_{p,\alpha} n_{p,\beta}$, where $\{a_{p,\alpha}^\dagger, a_{q,\beta}\} = \delta_{pq}\delta_{\alpha\beta}$ are fermionic raising and lowering operators and the $\langle p,q\rangle$ notation implies a summation over nearest-neighboring orbitals on the periodic planar lattice. Furthermore, define $\lambda = 2Nt + Nu/2$. Then, one can perform phase estimation to sample in the eigenbasis of $H$ with an additive error of at most $\epsilon$ in the eigenvalue using circuits with a number of T gates scaling as $10\sqrt{2}\,\pi N\lambda/\epsilon + O(\lambda\log(N/\epsilon)/\epsilon)$ and a number of ancilla qubits scaling as $\log(\lambda N^3/\epsilon) + O(1)$.

In Sec. II, we give an overview of the simulation strategy that we use to encode and sample eigenspectra via phase estimation. Section II A discusses how one can synthesize $e^{i\arccos(H/\lambda)}$ within the linear combinations of unitaries query model requiring two oracle circuits: SELECT and PREPARE. Section II B introduces a particularly precise variant of phase estimation which queries SELECT and PREPARE oracles to estimate spectra with a precision exceeding the typical Holevo variance. Section II C analyzes the various sources of errors that we need to consider in this algorithm and then bounds the number of times we must query SELECT and PREPARE in order to perform phase estimation.

Sections III–V focus on explicit constructions of SELECT and PREPARE. Section III introduces important primitives for both SELECT and PREPARE. In Sec. III A, we describe circuits applying controlled unitaries such as the mapping $|l\rangle|\psi\rangle \mapsto |l\rangle X_l|\psi\rangle$ with T complexity $O(L)$, where $L$ is the number of possible values of $l$. In Sec. III B, we show how to selectively apply a Majorana fermion operator, a primitive necessary for our implementation of SELECT in later sections. In Sec. III C, we use the result of Sec. III A to show a particularly efficient variety of quantum read-only memory (QROM), which we use for our PREPARE circuit. In Sec. III D, we describe a general technique for implementing PREPARE in a fashion that keeps $\lambda$ as small as possible.

Sections IV A and IV B discuss explicit constructions of SELECT and PREPARE circuits for the electronic structure Hamiltonian. Sections V A and V B discuss explicit constructions of SELECT and PREPARE circuits for the Hubbard model Hamiltonian. Sections IV C and V C focus on quantifying the number of T gates and ancillae required by the algorithms described in Secs. IV and V. These sections include investigations of the finite-size magnitude of $\lambda$ and the target precisions required to implement our algorithms for interesting problems. In Sec. V D, we discuss how our Hubbard model simulation techniques can be combined with recent results to achieve even lower scaling based on the locality of the Hubbard Hamiltonian.

Finally, Sec. VI discusses the compilation of these routines to surface code fault-tolerant gates and estimates the physical resources required for error-correcting these algorithms under realistic assumptions about hardware. We conclude in Sec. VII with an outlook on future directions for quantum simulating correlated electron models.
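To give a feel for the leading-order terms in Theorems 1 and 2, the following Python sketch evaluates only the explicit $c\sqrt{2}\,\pi N\lambda/\epsilon$ term (with $c = 24$ or $c = 10$) and the logarithm in the Hubbard ancilla count, dropping all $O(\cdot)$ corrections. The function names, the base-2 logarithm, and the sample parameters (lattice size, $t$, $u$, $\epsilon$) are illustrative assumptions rather than values taken from the paper.

```python
import math

def hubbard_lambda(n_orbitals, t, u):
    """lambda = 2*N*t + N*u/2, as defined in Theorem 2."""
    return 2 * n_orbitals * t + n_orbitals * u / 2

def leading_t_count(n_orbitals, lam, eps, prefactor):
    """Leading term prefactor*sqrt(2)*pi*N*lambda/eps; O(...) terms are dropped."""
    return prefactor * math.sqrt(2) * math.pi * n_orbitals * lam / eps

def hubbard_ancilla_log(n_orbitals, lam, eps):
    """log(lambda*N^3/eps) from Theorem 2, assuming base 2; the O(1) term is dropped."""
    return math.log2(lam * n_orbitals ** 3 / eps)

# Hypothetical instance: an 8x8 lattice with two spin orbitals per site, so N = 128.
n = 128
lam = hubbard_lambda(n, t=1.0, u=4.0)
t_hubbard = leading_t_count(n, lam, eps=0.01, prefactor=10)    # Theorem 2
t_chemistry = leading_t_count(n, lam, eps=0.01, prefactor=24)  # Theorem 1, same N and lambda
ancilla_log = hubbard_ancilla_log(n, lam, eps=0.01)
```

Because the leading term is linear in $N\lambda/\epsilon$, halving the target error or doubling $\lambda$ doubles the estimate; the two theorems differ only in the constant prefactor at this order.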

II. PHASE ESTIMATING SPECTRA OF HERMITIAN LINEAR COMBINATIONS OF UNITARIES

The primary contribution of this paper is to demonstrate a particularly efficient method of using quantum computation to sample the spectra of correlated electron Hamiltonians. Though details of our implementation are specialized to electronic systems, our high-level simulation strategy represents a general framework for spectral estimation. While aspects of this approach were introduced recently in Refs. [26,27], the techniques involved emerged from a history of advances in Hamiltonian simulation prominently involving Szegedy quantum walks [53], the “linear combination of unitaries” (LCU) query model [54], and the method of Hamiltonian simulation known as “qubitization” [25].

Oracular methods of Hamiltonian simulation assume that information about a Hamiltonian is provided by querying “oracle” circuits [55]. These techniques aim to reduce the number of times one must query these oracles in order to effect the intended simulation to target accuracy. The techniques in this paper implement oracles from the LCU query model introduced in Ref. [54]. As the name suggests, this approach begins from the observation that any Hamiltonian can be decomposed as a linear combination of unitaries,

$$H = \sum_{l=0}^{L-1} w_l H_l, \quad \text{s.t.} \quad (w_l \in \mathbb{R}) \wedge (w_l \ge 0), \qquad H_l^2 = 1, \tag{5}$$

where $w_l$ are scalars and $H_l$ are self-inverse operators that act on qubits; e.g., $H_l$ could be strings of Pauli operators. The convention in this paper is that the $w_l$ are real and non-negative, with any phases included in the $H_l$.

LCU simulation techniques are formulated in terms of queries to two oracle circuits. The first oracle circuit, the “preparation oracle,” acts on an empty ancilla register of $O(\log L)$ qubits and prepares a particular superposition state related to the notation of Eq. (5),

$$\mathrm{PREPARE} \equiv \sum_{l=0}^{L-1}\sqrt{\frac{w_l}{\lambda}}\,|l\rangle\langle 0|, \qquad \mathrm{PREPARE}\,|0\rangle^{\otimes\log L} \mapsto \sum_{l=0}^{L-1}\sqrt{\frac{w_l}{\lambda}}\,|l\rangle \equiv |\mathcal{L}\rangle, \qquad \lambda \equiv \sum_{l=0}^{L-1} w_l. \tag{6}$$

The quantity $\lambda$ is the same as that in Eq. (3) and turns out to have significant ramifications for the overall algorithm complexity. The second oracle circuit we require acts on the ancilla register $|l\rangle$ as well as the system register $|\psi\rangle$ and directly applies one of the $H_l$ to the system, controlled on the ancilla register. For this reason, we refer to the ancilla register $|l\rangle$ as the “selection register” and name the second oracle the “Hamiltonian selection oracle,”

$$\mathrm{SELECT} \equiv \sum_{l=0}^{L-1} |l\rangle\langle l| \otimes H_l, \qquad \mathrm{SELECT}\,|l\rangle|\psi\rangle \mapsto |l\rangle H_l|\psi\rangle. \tag{7}$$

Note that the self-inverse nature of the $H_l$ operators implies that they are both Hermitian and unitary, which means they can be applied directly to a quantum state.
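As a concrete illustration of Eqs. (5)–(7), the sketch below builds the PREPARE state and the SELECT matrix with NumPy for a toy two-qubit Hamiltonian. The Hamiltonian, the helper name `apply_select`, and the dense-matrix representation are assumptions chosen for illustration; they are not the circuits constructed later in the paper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy LCU (hypothetical): H = 0.5 ZZ - 0.3 XI + 0.2 IX + 0.1 XX.
# Following the convention of Eq. (5), weights are non-negative and the
# minus sign of the second term is folded into the unitary H_l.
weights = np.array([0.5, 0.3, 0.2, 0.1])
unitaries = [np.kron(Z, Z), -np.kron(X, I2), np.kron(I2, X), np.kron(X, X)]

lam = weights.sum()                                   # lambda in Eq. (6)
H = sum(w * U for w, U in zip(weights, unitaries))    # Eq. (5)

# PREPARE|0> = sum_l sqrt(w_l / lambda) |l>, Eq. (6).
prep_state = np.sqrt(weights / lam)

# SELECT = sum_l |l><l| (x) H_l is block diagonal, Eq. (7).
L = len(weights)
dim = unitaries[0].shape[0]
select = np.zeros((L * dim, L * dim), dtype=complex)
for l, U in enumerate(unitaries):
    select[l * dim:(l + 1) * dim, l * dim:(l + 1) * dim] = U

def apply_select(l, psi):
    """Return the system part of SELECT |l>|psi>, which should equal H_l |psi>."""
    basis = np.zeros(L)
    basis[l] = 1.0
    out = select @ np.kron(basis, psi)
    return out.reshape(L, dim)[l]
```

With $L = 4$ terms the selection register needs $\log L = 2$ qubits, and the amplitudes in `prep_state` square to the normalized weights $w_l/\lambda$.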

A. Encoding spectra in Szegedy quantum walks using qubitization oracles

The essential simulation primitive deployed here (a quantum walk operator based on SELECT and PREPARE) was first introduced as a subroutine to the qubitization approach for Hamiltonian time evolution [25]. However, the direct use of this primitive for phase estimation was first suggested more recently in Ref. [26]. In Sec. II B and the Appendix, we go beyond existing work and prove that, as long as the eigenphase of the walk operator is bounded away from zero (so that the Hamiltonian is not frustration-free), then this algorithm can, in principle, learn as quickly as traditional phase estimation using the Cramer-Rao bound.

We begin our discussion with the observation that the state $|\mathcal{L}\rangle$ from Eq. (6) encodes $H$ as a projection of SELECT onto $|\mathcal{L}\rangle$,

$$(\langle\mathcal{L}| \otimes 1)\,\mathrm{SELECT}\,(|\mathcal{L}\rangle \otimes 1) = \frac{1}{\lambda}\sum_l w_l H_l = \frac{H}{\lambda}. \tag{8}$$

This encoding is a general condition for qubitization [25], but the LCU oracles SELECT and PREPARE, as defined in Eqs. (7) and (6), are not necessarily the only constructions that meet this criterion; we refer to the broader family of circuits satisfying Eq. (8) as “qubitization oracles.” With this in mind, we discuss a walk operator $W$ that encodes the spectrum of $H$ as a function of the eigenphases of $W$, although the spectrum of $W$ differs from that of the propagator $e^{-iHt}$. One such walk operator $W$ may be constructed as

$$W \equiv \mathcal{R}_{\mathcal{L}} \cdot \mathrm{SELECT}, \qquad \mathcal{R}_{\mathcal{L}} \equiv (2|\mathcal{L}\rangle\langle\mathcal{L}| \otimes 1 - 1). \tag{9}$$

This construction takes the form of a Szegedy walk [53] since it is composed of a product of two reflection operations. The operation $\mathcal{R}_{\mathcal{L}}$ is manifestly a reflection operation, and it can be seen that SELECT is a reflection operation because

$$\mathrm{SELECT}^2 = \left(\sum_l |l\rangle\langle l| \otimes H_l\right)^2 = \sum_l |l\rangle\langle l| \otimes H_l^2 = 1. \tag{10}$$

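Fig. 1 shows a circuit that implements $W$ controlled on an ancilla.

The action of $W$ partitions Hilbert space into a direct sum of two-dimensional irreducible vector spaces. Through reasoning about these eigenspaces, we can deduce the spectrum of $W$ as well as the eigenvectors. In particular, we claim that the state $|\mathcal{L}\rangle|k\rangle$ and an orthogonal state $|\phi_k\rangle$ span the irreducible two-dimensional space that $|\mathcal{L}\rangle|k\rangle$ is in under the action of $W$ for arbitrary eigenstate $|k\rangle$ of $H$ with eigenvalue $E_k$. This state $|\phi_k\rangle$ is formally defined to be the component of $W|\mathcal{L}\rangle|k\rangle$ that is orthogonal to $|\mathcal{L}\rangle|k\rangle$, which can be simplified, using Eq. (8), to

$$|\phi_k\rangle \equiv \frac{(1 - |\mathcal{L}\rangle\langle\mathcal{L}| \otimes |k\rangle\langle k|)\cdot\mathrm{SELECT}\,|\mathcal{L}\rangle|k\rangle}{\left\|(1 - |\mathcal{L}\rangle\langle\mathcal{L}| \otimes |k\rangle\langle k|)\cdot\mathrm{SELECT}\,|\mathcal{L}\rangle|k\rangle\right\|} = \frac{\left(\mathrm{SELECT} - \frac{E_k}{\lambda}1\right)|\mathcal{L}\rangle|k\rangle}{\sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2}}. \tag{11}$$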

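The matrix elements of $W$ can be computed for this state. The upper diagonal matrix element follows from Eq. (8),

$$\langle k|\langle\mathcal{L}|\,W\,|\mathcal{L}\rangle|k\rangle = \langle k|\langle\mathcal{L}|\,\mathrm{SELECT}\,|\mathcal{L}\rangle|k\rangle = \frac{E_k}{\lambda}. \tag{12}$$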

The upper transition matrix element between $\langle k|\langle\mathcal{L}|$ and $|\phi_k\rangle$ is given from Eqs. (10) and (11) as

$$\langle k|\langle\mathcal{L}|\,W\,|\phi_k\rangle = \langle k|\langle\mathcal{L}|\,W\,\frac{\left(\mathrm{SELECT} - \frac{E_k}{\lambda}1\right)}{\sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2}}\,|\mathcal{L}\rangle|k\rangle = \frac{1 - \left(\frac{E_k}{\lambda}\right)^2}{\sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2}} = \sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2}. \tag{13}$$

Note that because phase estimation on $W$ projects the system to an eigenstate of $W$ and because $W$ and $H$ share an eigenbasis, we are only concerned with the action of this operator for eigenstates.

Equations (12) and (13) give the first row of the action of $W$. The remaining entries can be calculated in a similar way, giving the action of $W$ on this irreducible two-dimensional subspace as

$$W \equiv \begin{pmatrix} \frac{E_k}{\lambda} & \sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2} \\ -\sqrt{1 - \left(\frac{E_k}{\lambda}\right)^2} & \frac{E_k}{\lambda} \end{pmatrix} = e^{i\arccos(E_k/\lambda)\,Y}, \tag{14}$$

where $Y$ is the Pauli-Y operator constrained to this two-dimensional space spanned by $|\mathcal{L}\rangle|k\rangle$ and $|\phi_k\rangle$. Finally, we can see that the phases of the eigenvalues of $W$ in this subspace are $\pm\arccos(E_k/\lambda)$. Whereas the work of Ref. [25] focused on transforming the evolution under $\arccos(H)$ into evolution under $H$, the more recent work of Refs. [26,27] made the simple observation that by performing phase estimation directly on $W$, one can obtain the spectrum of $H$ as

$$\mathrm{spectrum}(H) = \lambda\cos\bigl(\arg[\mathrm{spectrum}(W)]\bigr), \tag{15}$$

where $\arg$ is the argument function $\arg(e^{i\phi}) = \phi$.
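The chain of reasoning from Eq. (8) to Eq. (15) can be checked numerically on a small instance. The NumPy sketch below assumes a toy single-qubit Hamiltonian with $L = 2$ terms and uses dense matrices; the Hamiltonian and all variable names are illustrative assumptions. With $L = 2$, the two-dimensional subspaces of Eq. (14) fill the whole four-dimensional space, so every eigenvalue of $W$ maps back to an eigenvalue of $H$.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy LCU (hypothetical): H = 0.6 Z + 0.4 X, so lambda = 1.
weights = np.array([0.6, 0.4])
unitaries = [Z, X]
lam = weights.sum()
H = sum(w * U for w, U in zip(weights, unitaries))
L, dim = len(weights), H.shape[0]

# SELECT of Eq. (7) as a block-diagonal matrix.
select = np.zeros((L * dim, L * dim), dtype=complex)
for l, U in enumerate(unitaries):
    select[l * dim:(l + 1) * dim, l * dim:(l + 1) * dim] = U

# |L> of Eq. (6) and the reflection R_L = 2|L><L| (x) 1 - 1 of Eq. (9).
prep = np.sqrt(weights / lam)
reflect = 2 * np.kron(np.outer(prep, prep), np.eye(dim)) - np.eye(L * dim)
W = reflect @ select

# Eq. (8): projecting SELECT onto |L> gives H / lambda.
proj = np.kron(prep.reshape(1, L), np.eye(dim))
block = proj @ select @ proj.T

# Eq. (15): lambda * cos(arg(eigenvalues of W)) recovers the spectrum of H.
# Both signs +/- arccos(E_k / lambda) occur, so each E_k appears twice.
recovered = np.sort(lam * np.cos(np.angle(np.linalg.eigvals(W))))
energies = np.linalg.eigvalsh(H)
```

For larger $L$ the walk operator also has eigenvalues outside these subspaces; phase estimation does not see them when the selection register starts in $|\mathcal{L}\rangle$, but a direct diagonalization as above would need to filter them out.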

B. Heisenberg-limited phase estimation of the qubitized quantum walk

Since the original work of Ref. [6], many approaches have been proposed for estimating eigenphases of a unitary operator. Whereas, in the past, iterative phase estimation approaches have been more popular in quantum simulation, here we propose using an entanglement-based approach. This approach has the virtue of requiring a number of applications of the unitary that saturates the Heisenberg limit. The ultimate precision that can be reached when one applies phase estimation by controlling a unitary when an ancilla is in $|1\rangle$ and applying the identity gate when the ancilla is in $|0\rangle$ is a Holevo variance of $\tan^2(\pi/(2^{m+1}+1))$, where the total number of applications of the unitary is $2^{m+1}-1$ and $m$ is the number of control qubits used. The Holevo variance is $\langle\cos(\hat{\phi}-\phi)\rangle^{-2} - 1$, where $\phi$ is the phase and $\hat{\phi}$ is the estimate of the phase given by the measurement. It is a convenient measure of variance for phase because it enables simple analytic results and is close to the mean-square error for narrowly peaked distributions. The states for these optimal phase measurements were given in Ref. [56]. To apply them to phase estimation of a unitary, one can take the control qubits to be in this superposition state, rather than in a uniform superposition of computational basis states.

We perform a slight optimization of that approach by applying the inverse unitary instead of the identity for the ancilla in the $|0\rangle$ state. Taking $|\phi\rangle$ to be an eigenstate of the unitary with eigenvalue $e^{i\phi}$, this means that instead of applying $|0\rangle|\phi\rangle \to |0\rangle|\phi\rangle$ and $|1\rangle|\phi\rangle \to e^{i\phi}|1\rangle|\phi\rangle$, we apply $|0\rangle|\phi\rangle \to e^{-i\phi}|0\rangle|\phi\rangle$ and $|1\rangle|\phi\rangle \to e^{i\phi}|1\rangle|\phi\rangle$. This doubles the effective phase difference and turns out to have the same complexity. As shown in Fig. 2, we accomplish the controlled inverse by removing controls from $W^n$ and inserting controlled reflection operators $\mathcal{R}_{\mathcal{L}}$ into the circuit, which will cause us to apply either $(W^\dagger)^n$ or $W^n$ depending on the state of the ancilla. We can see why this works by examining the relation

$$\mathcal{R}_{\mathcal{L}} \cdot W^n \cdot \mathcal{R}_{\mathcal{L}} = \mathcal{R}_{\mathcal{L}}^2 \cdot (\mathrm{SELECT}\cdot\mathcal{R}_{\mathcal{L}})^n = (\mathrm{SELECT}\cdot\mathcal{R}_{\mathcal{L}})^n = (W^\dagger)^n, \tag{16}$$

FIG. 1. A circuit realizing the Szegedy quantum walk operator $W$ controlled on an ancilla qubit. The last three gates in the circuit on the right constitute the reflection $\mathcal{R}_{\mathcal{L}}$ controlled on an ancilla. Note that the Z gate with the 0-control is actually controlled on the zero state of the entire $|l\rangle$ register and not just a single qubit. Accordingly, implementation of that controlled Z has T complexity $O(\log L)$, where $\log L$ is the size of the $|l\rangle$ register. However, that overhead is always negligible compared to the cost of the PREPARE and SELECT operators in the constructions of this paper.



which holds for any integer $n$ as a consequence of the self-inverse nature of $\mathcal{R}_{\mathcal{L}}$ and SELECT. Moreover, because $W^n$ always ends with an $\mathcal{R}_{\mathcal{L}}$ operation, each controlled $\mathcal{R}_{\mathcal{L}}$ can be combined in the circuit with the $W^n$ to yield a complexity no greater than the complexity of just performing the $W^n$ operations.

This trick will result in measuring the phase modulo $\pi$. To eliminate the $\pi$ ambiguity, an additional controlled $W$ can be performed without this trick. This case is shown as the first controlled operation in Fig. 2. For $m$ control qubits, the Holevo variance is still $\tan^2(\pi/(2^{m+1}+1))$, but the complexity is reduced by approximately half to $2^m$ applications of the unitary $W$.

As seen in Fig. 2, our modified phase estimation algorithm begins with a unitary $\chi_m$, which prepares the state

$$\chi_m|0\rangle^{\otimes m} \mapsto \sqrt{\frac{2}{2^m+1}}\sum_{n=0}^{2^m-1}\sin\!\left(\frac{\pi(n+1)}{2^m+1}\right)|n\rangle. \tag{17}$$

To prepare this state with cost $O(m)$, we first perform Hadamards on $m+1$ qubits (initially in the $|0\rangle$ state) to give

$$\sqrt{\frac{1}{2^{(m+1)}}}\sum_{n=0}^{2^m-1}|n\rangle \otimes (|0\rangle + |1\rangle). \tag{18}$$

Next, we perform a series of $m$ controlled rotations, with each of the first $m$ qubits as control and qubit $m+1$ as target. For control qubit $k$, the rotation on the target qubit $m+1$ is $e^{i\pi 2^k Z/(2^m+1)}$. If we perform a further rotation of $e^{i\pi Z/(2^m+1)}$ on qubit $m+1$, the resulting state is

$$\sqrt{\frac{1}{2^{(m+1)}}}\sum_{n=0}^{2^m-1}\left(e^{i\pi(n+1)/(2^m+1)}|n\rangle \otimes |0\rangle + e^{-i\pi(n+1)/(2^m+1)}|n\rangle \otimes |1\rangle\right). \tag{19}$$

We perform a Hadamard on qubit $m+1$ and measure in the computational basis. Measuring $|1\rangle$ gives the state

$$i\sqrt{\frac{1}{2^m}}\sum_{n=0}^{2^m-1}\sin\!\left(\frac{\pi(n+1)}{2^m+1}\right)|n\rangle. \tag{20}$$

The probability of success is given by the normalization $(1 + 2^{-m})/2$. The scheme can be made unitary and deterministic via a single step of amplitude amplification. Clearly, this preparation scheme scaling as $O(m)$ will not dominate the cost of our overall phase estimation, which scales as $O(2^m)$, as we discuss in the next section.
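The preparation steps in Eqs. (18)–(20) can be followed amplitude by amplitude with a few lines of plain Python. The sketch below is a classical bookkeeping exercise rather than a circuit simulator; the function name `chi_m_by_postselection` and the choice $m = 3$ are assumptions made for illustration. It applies the uncontrolled rotation and one controlled rotation per set bit of $n$, takes the $|1\rangle$ branch after the final Hadamard, and returns the renormalized state together with the success probability.

```python
import cmath
import math

def chi_m_by_postselection(m):
    """Track amplitudes through Eqs. (18)-(20) for an m-qubit control register.

    Returns the normalized post-measurement state and the success probability.
    """
    dim = 2 ** m
    denom = dim + 1
    uniform = 1.0 / math.sqrt(2 ** (m + 1))  # amplitude after the Hadamards, Eq. (18)
    branch_one = []
    for n in range(dim):
        # Uncontrolled rotation plus one controlled rotation per set bit k of n.
        angle = math.pi / denom
        for k in range(m):
            if (n >> k) & 1:
                angle += math.pi * 2 ** k / denom
        # exp(i*angle*Z) puts phase +angle on flag |0> and -angle on flag |1>, Eq. (19).
        a0 = uniform * cmath.exp(1j * angle)
        a1 = uniform * cmath.exp(-1j * angle)
        # Hadamard on the flag qubit, keeping the |1> outcome, Eq. (20).
        branch_one.append((a0 - a1) / math.sqrt(2))
    p_success = sum(abs(a) ** 2 for a in branch_one)
    state = [a / math.sqrt(p_success) for a in branch_one]
    return state, p_success

state, p_success = chi_m_by_postselection(3)
```

Up to the global phase $i$ in Eq. (20), the returned amplitudes match the target state of Eq. (17), and the success probability approaches $1/2$ from above as $m$ grows, which is why a single round of amplitude amplification suffices.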

C. Error scaling and query complexity

Three sources of error enter our simulation: error due to performing PEA to finite precision, $\epsilon_{\mathrm{PEA}}$; error due to approximate preparation of the Hamiltonian terms within the implementation of the PREPARE oracle, $\epsilon_{\mathrm{PREP}}$; and the error in synthesizing the inverse QFT, $\epsilon_{\mathrm{QFT}}$. We choose to measure error through the root-mean-square error of the estimator used within phase estimation, i.e.,

$$\Delta\phi \equiv \sqrt{\mathbb{E}\bigl[\mathrm{dist}(\phi_{\mathrm{est}}, \phi_{\mathrm{true}})^2\bigr]}, \tag{21}$$

where the distance considered above is the angular distance between the estimated phase and the actual phase.

Provided phase estimation is performed on a unitary operation, the error in the estimate of the energy is at most

FIG. 2. Heisenberg-limited phase estimation circuit for learning the eigenphase of $W$ with $m$ bits of accuracy with Holevo variance $\pi^2/2^{2(m+1)}$, where $\mathcal{R}_{\mathcal{L}}$ is $(2|\mathcal{L}\rangle\langle\mathcal{L}| \otimes 1 - 1)$ and $\chi_m$ prepares the resource state from Eq. (17), which was shown to be optimal in Ref. [56]. Both $\chi_m$ and the inverse quantum Fourier transform (QFT†) have gate complexity $O(m)$, which is completely negligible compared to the overall gate complexity of phase estimation, which scales as $O(2^m)$. The controlled $\mathcal{R}_{\mathcal{L}}$ and $W = \mathcal{R}_{\mathcal{L}} \cdot \mathrm{SELECT}$ gates are implemented as shown in Fig. 1. As a consequence of Eq. (16), this circuit involves only $2^m - 1$ applications of $\mathcal{R}_{\mathcal{L}}$ and as many applications of SELECT. Note that the first unit of $\mathcal{R}_{\mathcal{L}} \cdot W \cdot \mathcal{R}_{\mathcal{L}}$ is replaced by $W$ controlled on the zero state of an ancilla in order to help disambiguate the outcomes of $\arccos(E_k/\lambda)$ and $\arccos(E_k/\lambda) + \pi$.


041015-7

the error in implementing the unitary [43]. We break upthe estimated phase as the sum of two contributions,ϕest ¼ ϕþ ϵPREP þ ϕtrue. Here, ϕ is a random variable withzero mean EðϕÞ and Holevo variance VHðϕÞ describingthe output of phase estimation, and ϵPREP represents thesystematic errors in the phase that arise because of gatesynthesis. In the limit of small variance, we can express thiswith high probability over the true phase as

$$\Delta\phi \approx \sqrt{\mathbb{E}\left[(\phi_{\rm est} - \phi_{\rm true})^2\right]} \approx \sqrt{V_H(\phi) + (\epsilon_{\rm PREP} + \pi\epsilon_{\rm QFT})^2} \approx \sqrt{\left(\frac{\pi}{2^{m+1}}\right)^2 + (\epsilon_{\rm PREP} + \pi\epsilon_{\rm QFT})^2}, \tag{22}$$

where $m$ ancillary qubits are used within the phase estimation algorithm. Note that such a division of the error is suboptimal since the cost involved in reducing the error for phase estimation is exponentially larger than that involved in increasing the accuracy of the circuit synthesis [43]; however, we take the two errors to be equal for simplicity.

Eq. (15) implies that the error in the energy is at most

$$\Delta E = \lambda\,\Delta\cos(\phi) \le \lambda\,\Delta\phi \approx \lambda\sqrt{\left(\frac{\pi}{2^{m+1}}\right)^2 + (\epsilon_{\rm PREP} + \pi\epsilon_{\rm QFT})^2}. \tag{23}$$

This result suggests that we can choose to estimate the phase to a number of bits given by

$$m = \left\lceil \log\left(\frac{\sqrt{2}\pi\lambda}{2\Delta E}\right) \right\rceil < \log\left(\frac{\sqrt{2}\pi\lambda}{\Delta E}\right), \tag{24}$$

and the target errors can be chosen as

$$\epsilon_{\rm PREP} \le \frac{\sqrt{2}\,\Delta E}{4\lambda}, \qquad \epsilon_{\rm QFT} \le \frac{\sqrt{2}\,\Delta E}{4\pi\lambda}. \tag{25}$$

Thus, using the phase estimation procedure from Sec. II B, we need at most

$$2^m < \frac{\sqrt{2}\pi\lambda}{\Delta E} \tag{26}$$

queries to the SELECT oracle and at most twice as many queries to the PREPARE oracle in order to estimate spectra to within error $\Delta E$. Supposing that the circuit PREPARE can be applied at gate complexity $P$ and the circuit SELECT can be applied at gate complexity $S$, the gate complexity of our simulation [ignoring, for now, the cost of $\chi_m$ and the cost of the QFT$^\dagger$ since they scale as $O(m)$] is then approximately bounded from above by

$$\frac{\sqrt{2}\pi\lambda(S + 2P)}{\Delta E}. \tag{27}$$

This paper discusses implementations of SELECT and PREPARE that minimize $S$ and $P$ without increasing $\lambda$.

To implement the inverse QFT that appears in Fig. 2, we use the semiclassical algorithm described in Ref. [57]. This version of the QFT requires just $m - 1$ rotation gates and $m$ Hadamards when implemented on $m$ qubits. Thus, the error in each rotation must be at most $\epsilon_{\rm QFT}/(\pi m)$, which implies that the inverse QFT will have T complexity scaling as $O(m \log(m/\epsilon_{\rm QFT}))$. As this is an additive cost to other parts of our phase estimation algorithm with T complexity scaling as $O(2^m)$, the cost of performing the QFT within the required error budget can be safely neglected.

How errors in the coefficients of the implemented Hamiltonian propagate into $\epsilon_{\rm PREP}$ is slightly harder to bound owing to the fact that the error in the eigenphase is a nonlinear function of the error in the Hamiltonian implementation. In particular, the error can diverge for frustration-free Hamiltonians owing to the singularity of arccos. The main result, shown in the Appendix, is that PREPARE should be implemented so that if $\tilde{w}_\ell$ is the effective coefficient of $H_\ell$ in the approximately implemented Hamiltonian, then

$$|\tilde{w}_\ell - w_\ell| \le \delta = \frac{\sqrt{2}\,\Delta E}{4L\left(1 + \frac{\Delta E^2}{8\lambda^2}\right)}\left(1 - \frac{\|H\|^2}{\lambda^2}\right). \tag{28}$$
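The error budget of Eqs. (23)-(27) can be tabulated with a few lines of Python. The sketch below is illustrative only: the function names and the numerical inputs are our own assumptions, the logarithm in Eq. (24) is taken to be base 2, and the synthesis targets of Eq. (25) are set at equality.

```python
import math


def error_budget(lam: float, delta_e: float) -> dict:
    """Evaluate Eqs. (23)-(26) for a given lambda and target error."""
    # Eq. (24): number of phase-estimation bits (base-2 log assumed).
    m = math.ceil(math.log2(math.sqrt(2) * math.pi * lam / (2 * delta_e)))
    # Eq. (25): synthesis error targets, taken at equality.
    eps_prep = math.sqrt(2) * delta_e / (4 * lam)
    eps_qft = math.sqrt(2) * delta_e / (4 * math.pi * lam)
    # Eq. (23): resulting bound on the energy error.
    bound = lam * math.sqrt(
        (math.pi / 2 ** (m + 1)) ** 2 + (eps_prep + math.pi * eps_qft) ** 2
    )
    return {
        "m": m,
        "eps_prep": eps_prep,
        "eps_qft": eps_qft,
        "select_queries": 2 ** m,  # bounded as in Eq. (26)
        "energy_error_bound": bound,
    }


def gate_count_bound(lam: float, delta_e: float, s: float, p: float) -> float:
    """Eq. (27): approximate upper bound on gate complexity."""
    return math.sqrt(2) * math.pi * lam * (s + 2 * p) / delta_e


# Hypothetical inputs, not taken from the paper.
budget = error_budget(lam=100.0, delta_e=0.0016)
assert budget["energy_error_bound"] <= 0.0016 * (1 + 1e-9)
```

With these choices the two terms under the square root in Eq. (23) are each at most $\Delta E^2/(2\lambda^2)$, which is why the bound closes at $\Delta E$.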

III. LOW T COMPLEXITY PRIMITIVES FOR LCU ORACLES

In this section, we introduce three circuit primitives that are helpful for implementing SELECT and PREPARE oracles with low T-gate complexity. We use these primitives for electronic structure simulation but expect them to be useful more generally. These primitives enable black-box implementations of SELECT and PREPARE for any problem with lower asymptotic complexity than prior constructions in the literature. They also have low T counts at finite size. We use these constructions extensively in Secs. IV and V of this paper.

In Sec. III A, we introduce a technique for “streaming” bits of an iterator running over a unary register. One application is that this technique can be used to coherently apply operations controlled on a register with $\log L$ qubits in superposition [e.g., the selection register in Eq. (6) and Eq. (7)] using a number of T gates scaling as $O(L)$, as opposed to $O(L \log L)$ as one might normally expect. However, what is even more important is the versatile way that these constructions can be applied.

In Sec. III B, we show how one can use the results of Sec. III A to implement a primitive corresponding to controlled application of a Majorana fermion operator. This primitive is used directly in our implementation of SELECT in Secs. IV A and V A.

In Sec. III C, we show a straightforward application of the techniques in Sec. III A that allows us to develop a particularly efficient quantum data lookup, which we refer to as “quantum read-only memory” (QROM). In particular, for coherently querying a database with $L$ words, our implementation of QROM has T complexity of $4L - 4$ with no dependence on the word length, which is an asymptotic and constant-factor improvement over all prior literature. We will discuss QROM in more detail in a forthcoming work [58].

In Sec. III D, we discuss a technique for initializing a state with $L$ unique coefficients (provided by a classical database), with the number of T gates scaling as $4L + O(\log(1/\epsilon))$, where $\epsilon$ is the largest absolute error that one can tolerate in the prepared amplitudes. This routine improves asymptotically over the gate complexity of prior constructions for a black-box PREPARE. It also has the advantage of implementing PREPARE without increasing the value of $\lambda$ [from Eq. (6)], which has been a frequent problem with other implementations of PREPARE [39,42,59] that attempted to obtain scaling sublinear in the number of terms in the linear combinations of unitaries decomposition.

A. Unary iteration and indexed operations

Many of the circuits in this paper rely heavily on a technique we refer to as “unary iteration.” The unary iteration process gradually produces, and then uncomputes, qubits that indicate whether an index register is storing specific values (with respect to the computational basis). We call the process unary iteration because the indicator qubits are made available one by one (iteration), and they correspond to the one-hot (unary) encoding of the index register value. While these techniques were developed independently, we note that a scheme similar to unary iteration is also used for implementing SELECT operations in Ref. [52]. Compared to Ref. [52], we lower the T count from $6L - 4$ to $4L - 4$ and explain how to apply the scheme to sizes $L$ that are not powers of 2.

For an index register storing an index in the interval $[0, L)$, the space overhead of converting the index register into a unary register (as in Ref. [27]) would normally be $L$ qubits. By comparison, our unary iteration technique is exponentially more efficient in space without any increased T complexity, requiring only $\log L$ ancillae. Our unary iteration has a T count of $4L - 4$ and can be parallelized if needed without increasing the T count. Despite its efficiency, unary iteration is still the dominant source of T complexity in our algorithms. We use it for indexed operations, Majorana operators, reversible preparation of states, and database lookup, all of which have T counts that scale like $O(L)$.

To explain unary iteration, we first focus on how it is used to implement a controlled indexed NOT operation:

$$|c\rangle|\ell\rangle|\psi\rangle \mapsto |c\rangle|\ell\rangle (X_\ell)^c|\psi\rangle, \tag{29}$$

where $|c\rangle$ is the control register, $|\ell\rangle$ is the selection register, $|\psi\rangle$ is the system register, and the subscript $\ell$ on $X_\ell$ indicates that the NOT operation acts on qubit $\ell$ of the system register. From Eq. (7), it should not be surprising that this primitive is helpful for our constructions of SELECT.

A simple (but suboptimal) way to implement Eq. (29) would be to totally control the application of $X_\ell$ on all possible values that could occur in the register $|\ell\rangle$, as shown in Fig. 3. For instance, in order to apply $X_{158}$ when $|\ell\rangle = |158\rangle$, the total-control approach would place a NOT gate targeting the qubit 158 in the system register $|\psi\rangle$ but with a control on each index bit. The control’s type (ON or OFF) would be determined by the binary representation of 158 ($158_{10} = 10011110_2$), so there would be a must-be-OFF control on the low bit of the index register (because the low bit of 158 in binary is 0), a must-be-ON control on the next bit (because the next bit of 158 in binary is 1), and so forth. In order to cover every case, a separate NOT gate with corresponding controls would be generated for every integer from 0 up to $L - 1$. This would produce $L$ different NOT operations, each targeting a different qubit in the target register and each having a number of controls equal to the size of the index register (i.e., $\log L$). Thus, it takes $O(L \log L)$ T gates to apply Eq. (29) using this approach. Unary iteration will improve this T count to $4L - 4$.

FIG. 3. Example total control circuit for performing a controlled indexed $X_\ell$ operation, with $0 \le \ell < L = 11$. This is the (naive) starting point for producing a unary iteration circuit, before optimizations that asymptotically improve the T complexity. When indices outside the specified range do not occur, the highlighted runs of OFF-type controls reaching the right-hand side of the circuit can be removed without affecting the circuit. (There are also other controls that could be removed, but for our purposes, this would be counterproductive because it would interfere with later optimizations.)



Consider that the controls for the operation targeting the qubit at offset $\ell = 158$ are almost identical to the controls for the operation targeting the qubit at offset $\ell = 159$. They differ only on the low bit of the index register, where 158 requires the bit to be OFF, whereas 159 requires the bit to be ON. If we combine the $\log L - 1$ other qubits of the index register into a single representative qubit that is set if and only if those controls are met, we could use this representative qubit once for the $\ell = 158$ case and again for the $\ell = 159$ case. Using the representative qubit twice, instead of computing it twice, decreases the total amount of work done. Unary iteration is the result of taking this kind of representative-reuse idea to its natural limit.

We define our unary iteration construction by starting with a total-control circuit and then applying a fixed set of simple transformations and optimizations. Fig. 3 shows an example starting point, a total-control circuit for $L = 11$. For unary iteration, we require that the index register never store an out-of-range value $\ell \ge L$. For example, consider what occurs when the $X_{10}$ operation from Fig. 3 is not conditioned on $\ell_0$ (the least significant bit of the index register). This would cause an $X_{10}$ to be applied to the target when $\ell = 11$, but this is fine since we know $\ell \ne 11$. We use the $\ell < L$ condition to omit several controls from the circuit. For each possible $\ell$, we look at the $X_\ell$ operation and remove the control on the $b$th index bit when the following two conditions are true: (i) the $b$th bit of $L - 1$ is not set, and (ii) setting the $b$th bit of $\ell$ would change $\ell$ into a value larger than $L - 1$. Visually, this removes “runs” of must-be-OFF controls that manage to reach the right side of the circuit as highlighted in Fig. 3.

After removing the specified controls, we expand the remaining controls into nested AND operations (the AND operation is defined in Fig. 4), always nesting so that lower controls are inside higher controls. For clarity, we consistently place the ancillae associated with an AND operation just below its lowest input qubit. The result is the “sawtooth” circuit shown in Fig. 5. By iteratively optimizing adjacent AND operations as shown in Fig. 6, the sawtooth circuit from Fig. 5 is optimized into the circuit shown in Fig. 7. This is our unary iteration circuit for $L = 11$. The optimized circuit always ends up with $L - 1$ AND computations (even when $L$ is not a power of 2), each AND takes 4 T gates to compute, and we have no other T-consuming operations in the circuit. Thus, the T count of this construction is $4L - 4$.
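The counting argument above can be checked with a classical emulation of the control structure. The sketch below is not a quantum circuit and is not the authors' code; it walks the binary tree over index bits, drops a control whenever the upper half of a node's range lies entirely outside $[0, L)$ (our reading of the control-removal rule), and counts one AND computation per node that keeps both halves. The function name and return format are illustrative assumptions.

```python
def unary_iterate(index: int, L: int, control: bool = True):
    """Classically emulate unary iteration over [0, L).

    Returns (flags, and_count), where flags[l] is the l-th indicator bit
    and and_count is the number of AND computations that were needed.
    """
    if not 0 <= index < L:
        raise ValueError("index register must hold a value in [0, L)")
    n_bits = max(1, (L - 1).bit_length())
    flags = []
    and_count = 0

    def visit(lo: int, size: int, active: bool) -> None:
        nonlocal and_count
        if size == 1:
            flags.append(active)  # indicator qubit for value lo
            return
        half = size // 2
        if lo + half >= L:
            # Upper half is out of range: the control on this bit is
            # removed, so no AND is computed at this node.
            visit(lo, half, active)
            return
        and_count += 1  # one AND, reused for both halves (cf. Fig. 6)
        bit = (index >> (half.bit_length() - 1)) & 1
        visit(lo, half, active and bit == 0)
        visit(lo + half, half, active and bit == 1)

    visit(0, 1 << n_bits, control)
    return flags, and_count


flags, ands = unary_iterate(index=7, L=11)
assert ands == 10 and 4 * ands == 4 * 11 - 4
assert flags == [l == 7 for l in range(11)]
```

Every node that keeps both halves contributes one AND, and a binary tree with $L$ leaves has $L - 1$ such nodes, which reproduces the $4L - 4$ count for any $L$.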

FIG. 4. Computing and uncomputing AND operations, defined in terms of Toffoli gates and in terms of Clifford + T gates [60]. Computing an AND consumes 4 $|T\rangle$ states and is equivalent to applying a Toffoli gate to a target qubit known to be $|0\rangle$. Uncomputing an AND consumes no $|T\rangle$ states and is equivalent to applying a Toffoli gate to a target qubit guaranteed to end up in the $|0\rangle$ state. Drawing AND operations as “corners” instead of as $\oplus$ symbols is a visual cue that the target qubit will be (or was) OFF after (before) the operation. This is worth highlighting because it affects the T count of synthesizing the operation and means that the target is available for reuse as an ancilla in later operations.

FIG. 5. The “sawtooth” circuit resulting from removing unnecessary bits from Fig. 3 and then adding AND operations from Fig. 4 to combine the controls for performing a controlled indexed $X_\ell$ operation with $L = 11$ possible targets.

B. Selective application of Majorana fermion operators

Now that we have described unary iteration, we can begin using it to construct primitives relevant for the SELECT oracle. As discussed in detail in Secs. IV and V below, our approach for implementing SELECT will require that we have a circuit capable of selectively applying the Majorana fermion operator

$$|\ell\rangle|\psi\rangle \mapsto |\ell\rangle\left(\frac{a_\ell^\dagger - a_\ell}{i}\right)|\psi\rangle = |\ell\rangle\, Y_\ell \cdot Z_{\ell-1} \cdots Z_0 |\psi\rangle, \tag{30}$$

where the last equality holds under the Jordan-Wigner transformation [61]. In this section, we describe explicit circuits that accomplish the mapping of Eq. (30).

In Sec. III A, we discuss selectively applying $X_\ell$ operations as a representative example of how one might use unary iteration. However, nothing intrinsic to the unary iteration construction requires that the indexed operation be so simple. For example, we could switch from applying $X_\ell$ to applying $Z_\ell$ halfway through the circuit. Or each $X_\ell$ could be replaced by multiple Pauli operations targeting multiple qubits. In general, each index could be associated with its own unique set of Pauli operators to be applied to various target qubits.

We can also apply transformations to our quantum unary iterators (analogous to transformations of classical iterators). Iterators can be mapped, filtered, zipped, aggregated, batched, flattened, grouped, etc. For instance, given a classical stream of bits, one can aggregate over it with the $\oplus$ operation. This produces a new iterator, which iterates over bits equal to the parity of the values so far from the original iterator. It is possible to apply this XOR-aggregation idea to the quantum unary iteration process. We can introduce an “accumulator” qubit and, as each iterated unary qubit is produced, CNOT it into the accumulator. In effect, if the index register is storing $\ell$, then the accumulator will stay OFF until the $\ell$th qubit toggles it ON. The accumulator will then stay ON until the end of the iteration process, where it is uncomputed by a CNOT from the control qubit. By conditioning $X_\ell$ on the accumulator qubit, instead of on the original unary qubits, efficient ranged operations such as $|\ell\rangle|\psi\rangle \to |\ell\rangle\, G_\ell \cdot G_{\ell+1} \cdots G_{L-1}|\psi\rangle$ are produced. We show an example of an accumulator-based ranged operation in Fig. 8.

By using both the accumulator qubit and the original unary qubits, we can apply a ranged indexed operation and an indexed operation in a single unary iteration, which gradually sweeps over the possible target qubits. The resulting combined operation, shown in Fig. 9, is a crucial part of our SELECT circuit, effecting the transformation of Eq. (30).
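To make the accumulator idea concrete, here is a small classical emulation of which Pauli operator each system qubit receives during a single sweep. It is a sketch under our own assumptions: the accumulator starts as a copy of the control and is toggled OFF by the unary bit for index $\ell$, which yields the $Z_{\ell-1}\cdots Z_0$ range of Eq. (30). The function names and the string encoding (qubit 0 leftmost) are illustrative.

```python
def majorana_pauli_string(index: int, L: int, control: bool = True) -> str:
    """Pauli string applied for selection value `index` (qubit 0 first)."""
    if not 0 <= index < L:
        raise ValueError("index must lie in [0, L)")
    accumulator = control  # ON means "still below the selected index"
    paulis = []
    for k in range(L):
        unary_bit = control and (k == index)
        accumulator ^= unary_bit  # CNOT from the unary qubit
        if unary_bit:
            paulis.append("Y")  # indexed operation
        elif accumulator:
            paulis.append("Z")  # ranged operation
        else:
            paulis.append("I")
    assert not accumulator  # accumulator is cleared at the end
    return "".join(paulis)


def anticommute(a: str, b: str) -> bool:
    """Two Pauli strings anticommute iff they differ on an odd number
    of sites where both act non-trivially."""
    clashes = sum(1 for x, y in zip(a, b) if "I" not in (x, y) and x != y)
    return clashes % 2 == 1


assert majorana_pauli_string(2, 5) == "ZZYII"
assert anticommute(majorana_pauli_string(1, 5), majorana_pauli_string(3, 5))
```

The anticommutation check reflects the defining property of Majorana operators on distinct modes.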

FIG. 6. When two AND operations are adjacent, the uncomputation-and-recomputation can be replaced by CNOT and NOT operations. Each such merger saves 4 T gates.

FIG. 7. An $L = 11$ unary iteration circuit that applies $X_\ell$ to the qubit $\ell$ in the system register $|\psi\rangle$, where $\ell$ is the value stored in the selection register. The circuit is obtained by merging AND operations from Fig. 5 using the method shown in Fig. 6. It computes 10 AND operations and so has a T count of $10 \times 4 = 40 = 4L - 4$.

FIG. 8. Ranged operation construction implementing $|\ell\rangle|\psi\rangle \to |\ell\rangle \prod_{k=0}^{\ell-1} G_k |\psi\rangle$. It applies the $G$ operation to a range of values, instead of to a single value, by using an accumulator. The accumulator is guaranteed to be cleared after the final CNOT targeting it (drawn as a line merging into an ancilla qubit). This occurs because (unless CONTROL is not set and the accumulator simply stays unset) exactly one of the unary bits must have been set, and we targeted the accumulator with CNOTs controlled by each of those bits in turn. Note that $G_{[p,q)}$ refers to $G$ being applied to every qubit index $k$ satisfying $p \le k < q$.

FIG. 9. Application of a selected Majorana fermion operator, $|\ell\rangle|\psi\rangle \mapsto |\ell\rangle\, Y_\ell \cdot Z_{\ell-1} \cdots Z_0 |\psi\rangle$, as described in Eq. (30). This application is accomplished by performing a ranged operation (as shown in Fig. 8) and an indexed operation (similar to what is shown in Fig. 7) with a single pass through the selection register $|\ell\rangle$. It has a T count of $4L - 4$, where $L$ is the number of integer values that can be held by the selection register $|\ell\rangle$.

C. QROM for low T complexity data lookup

In this section, we explain how one can use the techniques of Sec. III A in order to implement a particularly efficient form of what we call QROM [62], which is useful in the context of the SUBPREPARE routine, a subroutine of the PREPARE circuit described in Sec. IV B (in Fig. 16). Many quantum algorithms assume the existence of a hypothetical peripheral called “quantum random-access memory” (QRAM) [63], which allows classical or quantum data to be accessed via an index under superposition. The purpose of QROM is to read classical data indexed by a quantum register, i.e., to perform the following transformation:

$$\mathrm{QROM}_d \cdot \sum_{\ell=0}^{L-1} \alpha_\ell |\ell\rangle|0\rangle = \sum_{\ell=0}^{L-1} \alpha_\ell |\ell\rangle|d_\ell\rangle, \tag{31}$$

where $\ell$ is the index to read, $\alpha_\ell$ is the amplitude of $|\ell\rangle$, and $d_\ell$ is the word associated with index $\ell$ in a classical list $d$ containing $L$ words. Our implementation of QROM is shown in Fig. 10. Note that our notion of QROM is unrelated to the discussion of ROM on a quantum computer in Ref. [64].

The read-only aspect of QROM makes it distinctly different from QRAM in that one can read from QROM but cannot write to it during the course of a computation. A few algorithms, such as the procedure introduced in Ref. [65], actually do require that one write to QRAM; thus, for such cases, QROM would not be appropriate. A notable difference between this paper and most previous work on QRAM [63,66–68] is that we describe the cost of QROM in terms of a fault-tolerant cost model: the number of T gates performed and the number of ancilla qubits required. Under such cost models, the “bucket brigade” QRAM design of Giovannetti et al. [63,66] has T complexity (and, thus, also time complexity under reasonable error-correction models) of $O(L)$ regardless of the fact that it has depth $O(\log L)$ because implementing it as an error-corrected circuit consumes $O(L)$ T gates and $O(L)$ ancilla qubits. Our implementation of QROM consumes only $4L$ T gates and $\log L$ ancillae, which is a constant-factor improvement in T count and an exponential improvement in space usage over the construction of Giovannetti et al.
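The action of Eq. (31) on a superposition can be illustrated with a sparse classical simulation. This is a sketch, not a circuit: the state is a dictionary from (index, output word) to amplitude, and the lookup XORs $d_\ell$ into the output word, which is our assumption based on the CNOT-based data loading described for Fig. 10. All names are illustrative.

```python
def qrom(state: dict, data: list) -> dict:
    """Apply |l>|x> -> |l>|x XOR d_l> to a sparse state {(l, x): amp}."""
    out = {}
    for (index, word), amplitude in state.items():
        key = (index, word ^ data[index])
        out[key] = out.get(key, 0) + amplitude
    return out


def qrom_t_count(num_words: int) -> int:
    """T count of the unary iteration, independent of the word length."""
    return 4 * num_words - 4


# Hypothetical three-word database and input superposition.
data = [5, 0, 7]
state = {(0, 0): 0.6, (2, 0): 0.8}
loaded = qrom(state, data)
assert loaded == {(0, 5): 0.6, (2, 7): 0.8}
assert qrom(loaded, data) == state  # the lookup is its own inverse
```

Because the data only determine where CNOT gates are placed, widening the words in `data` changes the Clifford count but leaves `qrom_t_count` untouched.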

D. Subsampling the coefficient oracle

In this section, we introduce a technique for initializing a state with $L$ unique coefficients (provided by a classical database) with a number of T gates scaling as $4L + O(\log(1/\epsilon))$, where $\epsilon$ is the largest absolute error that one can tolerate in the prepared amplitudes. This result constitutes a general procedure for implementing PREPARE such that the cost of circuit synthesis is additive, rather than multiplicative (as in most prior schemes). In particular, it improves on the database scheme from Ref. [42] (based on the procedure of Ref. [69]), which requires a number of T gates scaling as $O(L \log(L/\epsilon))$. Importantly, our scheme does not increase the value of $L$ or $\lambda$, which would usually be the case for most “on-the-fly” strategies for implementing PREPARE [23,42,59].

Generalizing the requirements of Eq. (6), we begin with the observation that it would be acceptable to have a PREPARE circuit that initializes the state

$$|\mathcal{L}\rangle \equiv \sum_{\ell=0}^{L-1} \sqrt{\frac{w_\ell}{\lambda}}\, |\ell\rangle|\mathrm{temp}_\ell\rangle, \tag{32}$$

where $|\mathrm{temp}_\ell\rangle$ is an unspecified junk register entangled with $|\ell\rangle$. Equivalently, any pure state $|\mathcal{L}\rangle$ would suffice if

$$\langle\mathcal{L}|\left(|\ell\rangle\langle\ell| \otimes 1\right)|\mathcal{L}\rangle = \frac{w_\ell}{\lambda} \quad \forall\, \ell \in [0, L). \tag{33}$$

FIG. 10. Finite-sized example of the QROM database loading scheme used in our implementation of SUBPREPARE. If the index register contains $\ell$, the output register ends up storing $d_\ell$, where $d$ is some precomputed data vector used when constructing the circuit. The top part of the circuit performs unary iteration, as described in Sec. III A. The bottom part of the circuit loads classical data associated with each possible index. The classical data are encoded into the presence or absence of CNOTs on the data lines. The “?” marks in the diagram indicate that one should decide whether or not to have a CNOT gate on each line, depending on the value of the data to load. This circuit has a T count of $4L - 4$, which is due entirely to the unary iteration. The T-gate cost of this circuit is independent of the number of bits used to store each element of the database.

Because SELECT only uses $|\ell\rangle$ to control operations, phase error (including entanglement with the junk register) in the state produced by PREPARE will commute across SELECT and be corrected by PREPARE$^\dagger$. However, SELECT itself necessarily introduces entanglement between its target register $|\psi\rangle$ and the index register $|\ell\rangle$ (plus associated junk register). So PREPARE$^\dagger$ will not exactly restore $|\ell\rangle$ or the junk register. Although we only specify the action of PREPARE on the $|0\rangle$ state, PREPARE will be applied to other states due to this imperfect uncomputation effect. This is accounted for by requiring that (i) qubits coming out of PREPARE$^\dagger$ are kept and fed back into the next PREPARE operation and (ii) the reflection step between PREPARE$^\dagger$ and PREPARE only affects the $|0\rangle$ state.

Given the observation that the existence of an entangled junk register is acceptable, we seek to implement a circuit that effects the transformation,

$$|0\rangle^{\otimes(1 + 2\mu + 2\log L)} \mapsto \sum_{\ell=0}^{L-1} \sqrt{\tilde{\rho}_\ell}\, |\ell\rangle|\mathrm{temp}_\ell\rangle, \tag{34}$$

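where $\tilde{\rho}_\ell \equiv \tilde{w}_\ell/\lambda$ are probabilities characterizing the approximate Hamiltonian we are encoding. Whereas the exact Hamiltonian would be associated with probabilities $\rho_\ell \equiv w_\ell/\lambda$, the value $\tilde{\rho}_\ell$ is a $\mu$-bit binary approximation to $\rho_\ell$ such that

$$|\tilde{\rho}_\ell - \rho_\ell| = \frac{|\tilde{w}_\ell - w_\ell|}{\lambda} \le \frac{1}{2^\mu L} \le \frac{\delta}{\lambda} = \frac{\sqrt{2}\,\Delta E}{4L\lambda\left(1 + \frac{\Delta E^2}{8\lambda^2}\right)}\left(1 - \|H\|^2/\lambda^2\right), \tag{35}$$

$$\mu = \left\lceil \log\left(\frac{2\sqrt{2}\lambda}{\Delta E}\right) + \log\left(1 + \frac{\Delta E^2}{8\lambda^2}\right) - \log\left(1 - \frac{\|H\|^2}{\lambda^2}\right) \right\rceil, \tag{36}$$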

where the expression for $\delta$ comes from Eq. (A10) of the Appendix, and it bounds the largest acceptable deviation in the coefficients of the terms in a Hamiltonian approximating the one we mean to implement. The second log in Eq. (36) is $O(1)$ because we do not take $\Delta E$ larger than $\lambda$. The Hamiltonians we consider are frustrated; thus, $\|H\|/\lambda$ is no larger than a constant (less than 1), and the third log in Eq. (36) is $O(1)$ as well.

The idea behind our scheme is to create the superposition in an indirect fashion, which involves starting in a uniform superposition over an initial index $\ell$ and then using a precomputed binary representation of a probability (loaded from QROM), $\mathrm{keep}_\ell$, to decide whether we should keep $\ell$ or swap it with a classically precomputed alternate index $\mathrm{alt}_\ell$, which is also loaded from QROM (see Fig. 11). Specifically, our procedure creates a uniform superposition in $|\ell\rangle$ over $L$ values and then uses QROM (see Fig. 10 and Sec. III C) to load $|\mathrm{alt}_\ell\rangle$ and $|\mathrm{keep}_\ell\rangle$. Note that if $L$ is not a binary power, one can prepare the initial superposition using the amplitude amplification circuit discussed later in Fig. 12. The procedure described thus far prepares the state

$$\sum_{\ell=0}^{L-1} \sqrt{\frac{1}{L}}\, |\ell\rangle|\mathrm{alt}_\ell\rangle|\mathrm{keep}_\ell\rangle. \tag{37}$$

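We then construct a circuit that coherently swaps the registers $|\ell\rangle$ and $|\mathrm{alt}_\ell\rangle$ with probability $\mathrm{keep}_\ell$ to create the state in Eq. (34). In order to create the state in Eq. (34) from Eq. (37), we need to introduce one additional register of size $\mu$, which we refer to as $|\sigma\rangle$. We put this entire register into a uniform superposition and then compare it to the probability represented by $\mathrm{keep}_\ell$. If $\mathrm{keep}_\ell \le \sigma$, we swap registers $|\ell\rangle$ and $|\mathrm{alt}_\ell\rangle$. Thus, after the procedure is finished, i.e., in Eq. (34), the garbage register will be in the state

$$|\mathrm{temp}_\ell\rangle = \frac{1}{\sqrt{2^\mu L \tilde{\rho}_\ell}}\left(|\mathrm{alt}_\ell\rangle|\mathrm{keep}_\ell\rangle \sum_{\sigma=0}^{\mathrm{keep}_\ell - 1} |\sigma\rangle|0\rangle + \sum_{k|\mathrm{alt}_k = \ell} |k\rangle|\mathrm{keep}_k\rangle \sum_{\sigma=\mathrm{keep}_k}^{2^\mu - 1} |\sigma\rangle|1\rangle\right), \tag{38}$$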

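where the rightmost qubit is the result of a comparison between $\mathrm{keep}_\ell$ and $\sigma$. For Eq. (34) to give the correct state, we need $|\mathrm{temp}_\ell\rangle$ to be normalized, which means that we require

$$\frac{\mathrm{keep}_\ell + \sum_{k|\mathrm{alt}_k = \ell}(2^\mu - \mathrm{keep}_k)}{2^\mu L} = \tilde{\rho}_\ell = \frac{\tilde{w}_\ell}{\lambda}, \quad \forall\, \ell \in [0, L). \tag{39}$$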
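The normalization condition of Eq. (39) can be read as a counting statement: enumerating every pair $(\ell, \sigma)$ of the uniform superposition and applying the compare-and-swap rule must land on index $\ell$ exactly $2^\mu L \tilde{\rho}_\ell$ times. The sketch below performs that enumeration classically. The tables used in the example are a hypothetical hand-built instance with $L = 3$ and $\mu = 2$, not data from the paper.

```python
from collections import Counter


def subprepare_outcomes(keep: list, alt: list, mu: int) -> list:
    """Count how many (l, sigma) pairs end on each index.

    The rule follows the text: keep l when sigma < keep[l], and swap to
    alt[l] when keep[l] <= sigma.
    """
    outcomes = Counter()
    for l in range(len(keep)):
        for sigma in range(1 << mu):
            final = l if sigma < keep[l] else alt[l]
            outcomes[final] += 1
    return [outcomes[l] for l in range(len(keep))]


# Hypothetical tables targeting the weights 1/12, 5/12, 6/12.
keep = [1, 0, 3]
alt = [2, 1, 1]
assert subprepare_outcomes(keep, alt, mu=2) == [1, 5, 6]
```

Dividing the counts by $2^\mu L = 12$ gives the prepared probabilities $\tilde{\rho}_\ell$.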

FIG. 11. Generic SUBPREPARE circuit for initializing an arbitrary state with $L$ unique amplitudes. It spans $2\mu + 2\log L + O(1)$ qubits and has a T count of $4(L + \mu) + O(\log L)$, where $\mu$ is calculated as in Eq. (36). It produces a state $\sum_{\ell=0}^{L-1} \sqrt{w_\ell/\lambda}\, |\ell\rangle|\mathrm{temp}_\ell\rangle$, where “temp” is temporary garbage data that will be uncomputed when uncomputing the preparation of this state after application of SELECT. The data-loading parts of this circuit use the QROM implementation described in Fig. 10 in Sec. III C. The UNIFORM$_L$ circuit is used to prepare the initial superposition over $\ell$.

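We can find the values of $\mathrm{keep}_\ell$ and $\mathrm{alt}_\ell$ from Eq. (39) in a sequential way. At any step, let $\mathcal{L}$ denote the set of $\ell$ for which we have already found these values. Then, we can rewrite Eq. (39) as

$$\frac{\mathrm{keep}_\ell + \sum_{k \notin \mathcal{L}|\mathrm{alt}_k = \ell}(2^\mu - \mathrm{keep}_k)}{2^\mu L} = \tilde{\rho}_\ell - \frac{\sum_{k \in \mathcal{L}|\mathrm{alt}_k = \ell}(2^\mu - \mathrm{keep}_k)}{2^\mu L} = \tilde{\rho}'_\ell. \tag{40}$$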

This expression involves only known quantities on the right-hand side, which we call $\tilde{\rho}'_\ell$ for short. We show by induction that the average of $\tilde{\rho}'_\ell$ for $\ell \notin \mathcal{L}$ is $1/L$, and the $\tilde{\rho}'_\ell$ are non-negative. These are clearly true initially because then $\tilde{\rho}'_\ell = \tilde{\rho}_\ell$. Now, assume that these conditions are true at some step. If the values $\tilde{\rho}'_\ell$ are all equal for $\ell \notin \mathcal{L}$, then we can just take $\mathrm{alt}_\ell = \ell$ and any value of $\mathrm{keep}_\ell$, and satisfy Eq. (40) for all remaining $\ell$. Otherwise, there will be one value, $\ell_0$, where $\tilde{\rho}'_{\ell_0}$ is below the average $1/L$ and another, $\ell_1$, where $\tilde{\rho}'_{\ell_1}$ is above $1/L$. For $\ell_0$, we choose $\mathrm{keep}_{\ell_0} = 2^\mu L \tilde{\rho}'_{\ell_0}$ and $\mathrm{alt}_{\ell_0} = \ell_1$. Then, $\ell_0$ is added to the set $\mathcal{L}$, and the values of $\tilde{\rho}'_\ell$ are updated. According to Eq. (40), the only value of $\tilde{\rho}'_\ell$ that is updated is that for $\ell = \ell_1$, where we replace it with $\tilde{\rho}'_{\ell_1} + \tilde{\rho}'_{\ell_0} - 1/L$. This ensures that the average value of $\tilde{\rho}'_\ell$ for $\ell \notin \mathcal{L}$ is still $1/L$, and since we had $\tilde{\rho}'_{\ell_1} > 1/L$, the new value is non-negative.

A more intuitive way to understand our approach to preparation is that it is equivalent to classical alias sampling [72], which samples $\ell$ with probability $\tilde{\rho}_\ell$ by the following procedure:

(1) Select $\ell$ uniformly at random from $[0, L)$.
(2) Look up $\mathrm{alt}_\ell$ and $\mathrm{keep}_\ell$.
(3) Return $\ell$ with probability $\mathrm{keep}_\ell/2^\mu$; otherwise return $\mathrm{alt}_\ell$.

The procedure for determining the $\mathrm{alt}_\ell$ and $\mathrm{keep}_\ell$ is then to work backwards starting from the distribution $\tilde{\rho}_\ell$ and update this distribution by shifting probabilities from $\ell_1$ to $\ell_0$ until we obtain a uniform distribution [73].

This procedure is illustrated in Fig. 13. One starts with a histogram of the desired distribution and looks for a bar that is too small, fixes this by transferring probability from a bar that is too high, and so on until all bars have the correct height. Each probability transfer permanently solves the bar that was too low, and the remaining bars form a smaller instance of the same problem. Thus, it is not possible to get stuck in a loop or a dead end. See also the module utils/_lcu_util.py in version 0.6 of OPENFERMION [74,75] for open-source Python code that performs this iterative matching process (and also handles discretizing the distribution) in $O(L)$ time.
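The sequential construction of $\mathrm{keep}_\ell$ and $\mathrm{alt}_\ell$ can be sketched in a few lines. The code below is our own minimal illustration, not the OpenFermion routine mentioned above: it assumes the distribution has already been discretized into integer weights `counts[l]` equal to $2^\mu L \tilde{\rho}_\ell$ (so they sum to $2^\mu L$), and all function names are hypothetical.

```python
def alias_tables(counts: list, mu: int):
    """Build (keep, alt) from integer weights counts[l] = 2**mu * L * rho_l."""
    L = len(counts)
    target = 1 << mu  # the scaled average 1/L
    if min(counts) < 0 or sum(counts) != target * L:
        raise ValueError("counts must be non-negative and sum to 2**mu * L")
    residual = list(counts)  # scaled rho'_l of Eq. (40)
    keep = [0] * L           # any value works when alt[l] == l
    alt = list(range(L))
    small = [l for l in range(L) if residual[l] < target]
    large = [l for l in range(L) if residual[l] > target]
    while small:
        l0 = small.pop()     # a bar that is too low
        l1 = large[-1]       # a bar that is too high
        keep[l0], alt[l0] = residual[l0], l1
        residual[l1] += residual[l0] - target  # rho'_l1 + rho'_l0 - 1/L
        if residual[l1] <= target:
            large.pop()
            if residual[l1] < target:
                small.append(l1)
    return keep, alt


def implied_counts(keep: list, alt: list, mu: int) -> list:
    """Left-hand side of Eq. (39), scaled by 2**mu * L."""
    totals = list(keep)
    for k, l in enumerate(alt):
        totals[l] += (1 << mu) - keep[k]
    return totals


# Hypothetical discretized distribution with L = 4 and mu = 3.
counts = [1, 9, 20, 2]
keep, alt = alias_tables(counts, mu=3)
assert implied_counts(keep, alt, mu=3) == counts
```

Each pass through the loop finalizes one index, so the construction runs in $O(L)$ time and cannot revisit an index once it has been placed in $\mathcal{L}$.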

FIG. 12. A circuit that uses amplitude amplification [70,71] to conditionally and reversibly prepare the uniform superposition $(1/\sqrt{2^k L}) \sum_{\ell=0}^{2^k L - 1} |\ell\rangle$, where $L$ is odd, starting from the $|0\rangle$ state. The circuit spans $k + 2\log L + O(1)$ qubits and has a T count of $2k + 10\log L + O(\log(1/\epsilon))$. If the control is omitted, the T count drops to $8\log L + O(\log(1/\epsilon))$. If $L = 1$, the $R_z$ rotations are not needed and the T count drops to $2k$. If $L = 1$ and the control is omitted, the T count is zero.

FIG. 13. Depiction of choosing $\mathrm{alt}_\ell$ and $\mathrm{keep}_\ell$ values for a discretized probability distribution. The left histogram is the input probability distribution, after discretization into steps of $1/(2^3 L)$. The squares must be redistributed so that every column has at most two colors and height of $1/L$ (i.e., eight squares as indicated by the dashed line) without moving the bottom square of each column. The histogram on the right satisfies these constraints, and the histogram in the middle is an intermediate distribution. Arrows indicate which boxes were moved where. The color of the top square of each column in the histogram on the right determines the value of $\mathrm{alt}_\ell$, whereas $\mathrm{keep}_\ell$ is determined by where the color transition is within each column.

IV. CONSTRUCTIONS FOR THE ELECTRONIC STRUCTURE HAMILTONIAN

Using the appropriate discretization into a basis of $N$ spin orbitals, the electronic structure Hamiltonian can be written as

$$H = \sum_{p,q,\sigma} T(p - q)\, a^\dagger_{p,\sigma} a_{q,\sigma} + \sum_{p,\sigma} U(p)\, n_{p,\sigma} + \sum_{(p,\alpha) \ne (q,\beta)} V(p - q)\, n_{p,\alpha} n_{q,\beta}, \tag{41}$$

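where $a^\dagger_{p,\sigma}$ and $a_{p,\sigma}$ are fermionic creation and annihilation operators on spatial orbital $p \in \{0, \ldots, N/2 - 1\}$ with spin $\sigma \in \{\uparrow, \downarrow\}$, and $n_{p,\sigma} = a^\dagger_{p,\sigma} a_{p,\sigma}$ is the number operator. These operators satisfy the canonical fermionic anticommutation relations $\{a^\dagger_{p,\alpha}, a^\dagger_{q,\beta}\} = \{a_{p,\alpha}, a_{q,\beta}\} = 0$ and $\{a^\dagger_{p,\alpha}, a_{q,\beta}\} = \delta_{p,q}\delta_{\alpha,\beta}$.

Mapping to qubits under the Jordan-Wigner transformation [61,76], Eq. (41) becomes

$$H = \sum_{p \ne q,\sigma} \frac{T(p - q)}{2}\left(X_{p,\sigma}\vec{Z}X_{q,\sigma} + Y_{p,\sigma}\vec{Z}Y_{q,\sigma}\right) + \sum_{(p,\alpha) \ne (q,\beta)} \frac{V(p - q)}{4} Z_{p,\alpha}Z_{q,\beta} - \sum_{p,\sigma}\left(\frac{T(0) + U(p) + \sum_q V(p - q)}{2}\right) Z_{p,\sigma} + \sum_p \left(T(0) + U(p) + \sum_q \frac{V(p - q)}{2}\right) 1, \tag{42}$$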

where we have introduced the notation $\vec{Z}$ that will be used throughout the paper, which we now explain. The tensor factors on which Pauli operators act can always be interpreted as some integer. For instance, $(p, \sigma)$ is mappable to an integer under a particular choice of canonical ordering in the Jordan-Wigner transformation. When placed between two Pauli operators, the notation $A_j\vec{Z}A_k$ denotes the operator $A_j Z_{j+1} \cdots Z_{k-1} A_k$. The exact mapping between a spin orbital indexed by $(p, \sigma)$ and a qubit indexed by an integer is discussed later.

The forms of Eq. (41) and Eq. (42) encompass a wide range of fermionic Hamiltonians, including the molecular electronic structure (aka “quantum chemistry”) Hamiltonian in any basis that diagonalizes the Coulomb potential [39]. The particular coefficients will depend on the discretization scheme and basis functions chosen to represent the system. One such representation, derived for use in quantum simulations in Ref. [39], prescribes the coefficients

$$T(p) = \sum_\nu \frac{k_\nu^2 \cos(k_\nu \cdot r_p)}{2N}, \qquad U(p) = -\sum_{j,\nu \ne 0} \frac{4\pi\zeta_j \cos(k_\nu \cdot R_j - k_\nu \cdot r_p)}{\Omega k_\nu^2}, \qquad V(p) = \sum_{\nu \ne 0} \frac{2\pi \cos(k_\nu \cdot r_p)}{\Omega k_\nu^2}, \tag{43}$$

where each spatial orbital $p$ is associated with an orbital centroid $r_p = p(2\Omega/N)^{1/3}$, and $\Omega$ is the computational cell volume. The momentum modes are defined as $k_\nu = 2\pi\nu/\Omega^{1/3}$, with $\nu \in [-(N/2)^{1/3}, (N/2)^{1/3}]^{\otimes 3}$. When dealing with molecular potentials, $R_j$ and $\zeta_j$ are the position and charge of the $j$th nucleus, respectively.

As discussed in Ref. [39], the Hamiltonian of Eq. (43) corresponds to discretization in a basis composed of rotated plane waves known as the “plane wave dual” basis. The basis set discretization error associated with the dual basis is asymptotically equivalent to a Galerkin discretization using any other single-particle basis functions, including Gaussian orbitals [39]. Thus, Eq. (43) is a general expression of the electronic structure problem that is asymptotically equivalent to any other representation. While well suited for simulating periodic materials, despite asymptotic equivalence, this basis set is not particularly compact for the simulation of molecules. Another basis set compatible with Eq. (41) and Eq. (42), while being much more appropriate for molecules, is the so-called “Gausslet” basis [41]. Gausslets are derived from a ternary wavelet transformation [77] of Gaussian orbitals and have similar intrinsic basis set discretization errors to standard Gaussian orbitals [41].

The simulation procedures here will make use of the structure in Eq. (42). Specifically, our algorithm will make use of the fact that the Hamiltonian consists of only four types of terms ($Z_p$, $Z_pZ_q$, $X_p\vec{Z}X_q$, and $Y_p\vec{Z}Y_q$) and that there are only $3N/2$ unique values of the coefficients. Our algorithms do not utilize any particular structure in the dual basis Hamiltonian in Eq. (43) beyond the fact that it satisfies the form of Eq. (41). This is important since it implies that the techniques of this paper are compatible with other representations of the electronic structure Hamiltonian, such as the finite difference discretization [39], finite element methods, and Gausslet basis sets [41], which produce Hamiltonians consistent with Eq. (41) but not Eq. (43).

A. Electronic structure Hamiltonian selection oracle

In order to implement the SELECT and PREPARE oracles for the electronic structure Hamiltonian of Eq. (41), one must first define a scheme for indexing all of the terms. For the case of the general electronic structure Hamiltonian in Eq. (41), we index terms with the registers $|\theta\rangle$, $|U\rangle$, $|V\rangle$, $|p\rangle$, $|\alpha\rangle$, $|q\rangle$, and $|\beta\rangle$. The $|p\rangle$ and $|q\rangle$ registers are little-endian binary encodings of integers going from $0$ to $N/2 - 1$, thus using $\log N - 1$ qubits each; the other registers are each a single bit, which we use to specify the unitary that SELECT should apply to the system register $|\psi\rangle$.

The $|\alpha\rangle$ and $|\beta\rangle$ bits are used to specify the spins $\{\uparrow, \downarrow\}$, which, together with the spatial orbital specifications $p$ and $q$, index a spin orbital. Thus, a register set as $|p\rangle|\alpha\rangle|q\rangle|\beta\rangle$ will index a Hamiltonian term that involves action on the spin orbitals indexed by $(p, \alpha)$ and $(q, \beta)$. Next, whenever $|U\rangle = |1\rangle$, it will be the case (by construction of our circuits) that $(p, \alpha) = (q, \beta)$, and we will apply the $Z_{p,\alpha}$ terms. If $|V\rangle = |1\rangle$, we will apply the $Z_{p,\alpha} Z_{q,\beta}$ terms. If $|U\rangle|V\rangle = |0\rangle|0\rangle$ and $p < q$, it will also be the case that $\alpha = \beta$, and we will apply the $X_{p,\alpha} \vec{Z} X_{q,\alpha}$ terms; if $|U\rangle|V\rangle = |0\rangle|0\rangle$ and $p > q$, it will again be the case that $\alpha = \beta$, and we will apply the $Y_{q,\alpha} \vec{Z} Y_{p,\alpha}$ terms. Finally, the $|\theta\rangle$ register encodes whether the unitary should have a negative phase (if $|\theta\rangle = |1\rangle$). Thus, our SELECT circuit meets the following specification (where UNDEFINED means this case should not occur):

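$$\mathrm{SELECT}_{\mathrm{CHEM}}\, |\theta, U, V, p, \alpha, q, \beta\rangle |\psi\rangle = (-1)^{\theta}\, |\theta, U, V, p, \alpha, q, \beta\rangle \otimes \begin{cases} Z_{p,\alpha} |\psi\rangle & U \wedge \neg V \wedge ((p,\alpha) = (q,\beta)) \\ Z_{p,\alpha} Z_{q,\beta} |\psi\rangle & \neg U \wedge V \wedge ((p,\alpha) \neq (q,\beta)) \\ X_{p,\alpha} \vec{Z} X_{q,\alpha} |\psi\rangle & \neg U \wedge \neg V \wedge (p < q) \wedge (\alpha = \beta) \\ Y_{q,\alpha} \vec{Z} Y_{p,\alpha} |\psi\rangle & \neg U \wedge \neg V \wedge (p > q) \wedge (\alpha = \beta) \\ \text{UNDEFINED} & \text{otherwise.} \end{cases} \tag{44}$$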
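The case analysis of Eq. (44) can be read as a classical lookup from index registers to a signed Pauli string. The following is a minimal sketch of that lookup only, not of the circuit in Fig. 14; the function name, the flat integer encoding of `p` and `q`, the 0/1 spin bits, and the string labels are illustrative assumptions.

```python
from typing import Tuple


def select_chem_term(theta: int, u: int, v: int,
                     p: int, alpha: int, q: int, beta: int) -> Tuple[int, str]:
    """Return (sign, Pauli label) prescribed by Eq. (44) for one index state.

    p and q are flat spatial-orbital integers and alpha, beta are spin bits
    (hypothetical encoding). Raises ValueError for UNDEFINED combinations.
    """
    sign = -1 if theta else 1  # the (-1)^theta phase
    if u and not v and (p, alpha) == (q, beta):
        return sign, f"Z[{p},{alpha}]"
    if not u and v and (p, alpha) != (q, beta):
        return sign, f"Z[{p},{alpha}] Z[{q},{beta}]"
    if not u and not v and alpha == beta and p < q:
        return sign, f"X[{p},{alpha}] Zstring X[{q},{alpha}]"
    if not u and not v and alpha == beta and p > q:
        return sign, f"Y[{q},{alpha}] Zstring Y[{p},{alpha}]"
    raise ValueError("index combination is UNDEFINED in Eq. (44)")
```

Raising an error for the UNDEFINED branch mirrors the statement that those index states should never be produced by PREPARE.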

We present our implementation of $\mathrm{SELECT}_{\mathrm{CHEM}}$ in Fig. 14. Our circuit relies on the subroutines that we describe in Sec. III A, which provide a method for selectively applying strings of Pauli operators to a system register of size $N$, with controls on $\log N$ qubits. Important notation for these subroutines is also defined in Sec. III A, and thus, that section is necessary for understanding the details of Fig. 14.

Since $p$ and $q$ are actually three-dimensional vectors with elements taking integer values $p \in [0, (N/2)^{1/3} - 1]$ and $\sigma \in \{\uparrow, \downarrow\}$, we should clarify how the spin orbitals $(p, \sigma)$ are mapped to an integer representing qubits. For ease of exposition, we define the following mapping function for a $D$-dimensional system,

$$M \equiv (N/2)^{1/D}, \qquad f(p, \sigma) = \delta_{\sigma,\downarrow} M^D + \sum_{j=0}^{D-1} p_j M^j, \tag{45}$$

where $D = 3$ for chemistry and $D = 2$ for the Hubbard model. The $\delta$ function behaves as one might expect: $\delta_{\uparrow,\downarrow} = 0$ and $\delta_{\downarrow,\downarrow} = 1$. Thus, it should be understood that $X_{p,\sigma}$ implies the $X$ operator acting on qubit $f(p, \sigma)$.
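As a concrete illustration of Eq. (45), the sketch below evaluates the mapping for a small grid. The function name and the boolean spin flag are assumptions made for this example; only the formula itself comes from the text.

```python
from itertools import product
from typing import Sequence


def spin_orbital_to_qubit(p: Sequence[int], spin_down: bool, M: int) -> int:
    """Eq. (45): f(p, sigma) = delta_{sigma,down} * M**D + sum_j p_j * M**j."""
    D = len(p)
    if any(not 0 <= pj < M for pj in p):
        raise ValueError("each component of p must lie in [0, M - 1]")
    offset = M ** D if spin_down else 0  # spin-down orbitals fill the upper half
    return offset + sum(pj * M ** j for j, pj in enumerate(p))


# Usage: M = 2 points per axis in D = 3 dimensions gives N = 2 * M**D = 16 qubits.
M, D = 2, 3
qubits = [spin_orbital_to_qubit(p, s, M)
          for s in (False, True) for p in product(range(M), repeat=D)]
```

In this small case every spin orbital lands on a distinct qubit index between 0 and N - 1, with all spin-up orbitals preceding all spin-down orbitals.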

B. Electronic structure coefficient preparation oracle

We see from Eq. (41) that there are only $\mathcal{O}(N)$ unique coefficients in the Hamiltonian, despite the Hamiltonian having $\mathcal{O}(N^2)$ different terms. Based on the indexing in Eq. (44) and definition in Eq. (6), our PREPARE initializes

FIG. 14. $\mathrm{SELECT}_{\mathrm{CHEM}}$ circuit, with a T count of $12N + 8\log N + \mathcal{O}(1)$, which implements the functionality specified by Eq. (44), conditioned on a "control" qubit. The unitaries performing Majorana and indexed operations (each requiring $4N$ T gates) are explicitly constructed in Sec. III A. As described in Fig. 9, the unitaries labeled as $\vec{Z} A_j$ apply the operation $Z_0 \cdots Z_{j-1} A_j$ to the target register, depending on the value from the input $|p\rangle$ register. These operations require an extra $\log N$ ancillae, so the overall circuit spans $N + 3\log N + \mathcal{O}(1)$ qubits. The operation that targets the system register with $Z_q$ is a variant of Fig. 7, with the $X_l$ gate replaced by $Z_l$. All indexed operations reuse the same ancillae.

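$$\begin{aligned} \mathrm{PREPARE}_{\mathrm{CHEM}}\, |0\rangle^{\otimes(3 + 2\log N)} \mapsto{} & \sum_{p,\sigma} \tilde{U}(p)\, |\theta_p\rangle |1\rangle_U |0\rangle_V |p, \sigma, p, \sigma\rangle \\ & + \sum_{p \neq q,\,\sigma} \tilde{T}(p - q)\, |\theta^{(0)}_{p-q}\rangle |0\rangle_U |0\rangle_V |p, \sigma, q, \sigma\rangle \\ & + \sum_{(p,\alpha) \neq (q,\beta)} \tilde{V}(p - q)\, |\theta^{(1)}_{p-q}\rangle |0\rangle_U |1\rangle_V |p, \alpha, q, \beta\rangle, \end{aligned} \tag{46}$$

where the values of the coefficients $\tilde{U}$, $\tilde{T}$, and $\tilde{V}$ and the state of $|\theta\rangle$, related to the coefficients in Eq. (42), are defined as

$$\begin{aligned} \tilde{U}(p) &= \sqrt{\frac{|T(0) + U(p) + \sum_q V(p - q)|}{2\lambda}}, & \tilde{T}(p) &= \sqrt{\frac{|T(p)|}{\lambda}}, & \tilde{V}(p) &= \sqrt{\frac{|V(p)|}{4\lambda}}, \\ \theta_p &= \frac{1 - \mathrm{sign}\big(-T(0) - U(p) - \sum_q V(p - q)\big)}{2}, & \theta^{(0)}_p &= \frac{1 - \mathrm{sign}(T(p))}{2}, & \theta^{(1)}_p &= \frac{1 - \mathrm{sign}(V(p))}{2}. \end{aligned} \tag{47}$$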

The $T(p)$ coefficient inside the square root in Eq. (47) differs from the coefficient in Eq. (42) by a factor of 2 since it occurs for each type of term only once depending on whether $p < q$ or $p > q$.
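To make the definitions in Eq. (47) concrete, the sketch below tabulates the three amplitudes and three sign bits for each index on a one-dimensional toy ring. Several things here are assumptions for illustration: the function names, the 1D wraparound of `p - q` (the paper uses modular vector indices), the choice sign(0) = +1, and the fact that `lam` is passed in rather than derived.

```python
import math
from typing import List, Tuple


def sign_bit(x: float) -> int:
    """(1 - sign(x)) / 2, taking sign(0) = +1 (an assumption)."""
    return 1 if x < 0 else 0


def subprepare_data(T: List[float], U: List[float], V: List[float],
                    lam: float) -> List[Tuple[float, int, float, int, float, int]]:
    """Amplitudes and sign bits of Eq. (47) for indices 0..n-1 on a 1D ring.

    Each row is (U amplitude, theta_p, T amplitude, theta0_p, V amplitude, theta1_p).
    """
    n = len(T)
    rows = []
    for p in range(n):
        v_sum = sum(V[(p - q) % n] for q in range(n))  # sum_q V(p - q), wrapped
        diag = T[0] + U[p] + v_sum                     # coefficient of the Z_p term
        rows.append((
            math.sqrt(abs(diag) / (2 * lam)), sign_bit(-diag),
            math.sqrt(abs(T[p]) / lam), sign_bit(T[p]),
            math.sqrt(abs(V[p]) / (4 * lam)), sign_bit(V[p]),
        ))
    return rows
```

These six values per index are the kind of data that a table-lookup subroutine would need to load; whether the resulting state is normalized depends on how `lam` is chosen, which this sketch does not check.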

To implement PREPARE, we first synthesize a unitary referred to as SUBPREPARE, which acts as follows:

$$\mathrm{SUBPREPARE}\, |0\rangle^{\otimes(2 + \log N)} \mapsto \sum_{d=0}^{N-1} \Big( \tilde{U}(d)\, |\theta_d\rangle |1\rangle_U |0\rangle_V + \tilde{T}(d)\, |\theta^{(0)}_d\rangle |0\rangle_U |0\rangle_V + \tilde{V}(d)\, |\theta^{(1)}_d\rangle |0\rangle_U |1\rangle_V \Big) |d\rangle. \tag{48}$$

Since in this step we initialize a state on $\mathcal{O}(\log N)$ qubits, the techniques of Ref. [69] would allow one to implement SUBPREPARE with a T count of $\mathcal{O}(N \log(1/\epsilon))$. However, in Fig. 15, we show an even more efficient method for synthesizing SUBPREPARE with T-gate complexity $\mathcal{O}(N + \log(1/\epsilon))$, based on the techniques introduced in Sec. III D. Using SUBPREPARE, we can implement the entire PREPARE circuit with the same asymptotic T complexity. In our SUBPREPARE circuit, $l$ is really a vector of integers; thus, we use "modular vector indices" such that if $v$ is a three-dimensional vector within a rectangular space with each dimension having $M$ values, then the function application $F(v)$ should be expanded to $F(v) = F(v \bmod M) = F(v_x \bmod M, v_y \bmod M, v_z \bmod M)$, consistent with the mapping introduced in Eq. (45).

While applying SUBPREPARE to create the state in Eq. (48), we also initialize the $|\alpha\rangle$ qubit in the $|+\rangle$ state with a Hadamard. We then use the $\mathrm{UNIFORM}_L$ circuit from Fig. 12 to initialize the $|q\rangle$ register in an equal superposition in a way that is controlled on the $|U\rangle$ ancilla qubit being in the state $|0\rangle_U$. Subsequent to this step, the state becomes

FIG. 15. SUBPREPARE circuit for the electronic structure Hamiltonian, as in Fig. 16, with a T count of $6N + \mathcal{O}(\mu + \log N)$ and a qubit count of $2\mu + 3\log N + \mathcal{O}(1)$, where $\mu$ is defined in Eq. (36). The data-loading subroutine is implemented as in Fig. 10 and has a T count of $3 \times 4M^3 - 4 = 6N - 4$. The UNIFORM subroutine is implemented as in Fig. 12 and has a T count of $\mathcal{O}(\mu)$. The compare-and-swap operations have a negligible $\mathcal{O}(\log N)$ T count. As in Eq. (45), $D$ denotes the system dimension (usually $D = 3$), and $M$ refers to the number of values along each dimension such that $N = 2M^D$. Although we only specify the behavior of the circuit when the $U$, $V$, and $p$ qubits start in the $|0\rangle$ state, the circuit is also invoked in contexts where this is not the case.

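$$\begin{aligned} \text{Eq. (48)} \mapsto{} & \sum_{d=0}^{N/2-1} \tilde{U}(d)\, |\theta_d\rangle |1\rangle_U |0\rangle_V |d\rangle |+\rangle_\alpha |0\rangle^{\otimes \log N} \\ & + \sum_{d=0}^{N/2-1} \sum_{q=0}^{N/2-1} \Big( \tilde{T}(d)\, |\theta^{(0)}_d\rangle |0\rangle_U |0\rangle_V + \tilde{V}(d)\, |\theta^{(1)}_d\rangle |0\rangle_U |1\rangle_V \Big) |d\rangle |+\rangle_\alpha |q\rangle |0\rangle. \end{aligned} \tag{49}$$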

The register labeled as $|d\rangle$ in Eq. (48) will ultimately become our $|p\rangle$ register, but immediately after SUBPREPARE, it is more appropriate to think of it as encoding a value $|p - q\rangle$. As we can see in Eq. (46), when $|V\rangle = |1\rangle$ and $p = q$, it is necessarily the case that $\alpha \neq \beta$. The middle part of our PREPARE circuit is dedicated to correctly initializing this tricky part of the superposition. To do this, we use an ancilla to apply a Hadamard gate to $|0\rangle_\beta$ only when $|V\rangle = |1\rangle$ and $|d\rangle \neq |0\rangle^{\otimes(\log N - 1)}$. In the event that $|V\rangle = |1\rangle$ and $|d\rangle = |0\rangle^{\otimes(\log N - 1)}$, we apply a CNOT gate with an open control on $|\alpha\rangle$ which targets $|0\rangle_\beta$, thus ensuring that $|\beta\rangle \neq |\alpha\rangle$ when $p - q = 0$. Then, we set $|\beta\rangle = |\alpha\rangle$ for the $U$ and $T$ part of the superposition by applying a Toffoli gate with regular control on $|\alpha\rangle$ and open control on $|V\rangle$, targeting $|\beta\rangle$. After these operations, the state can be expressed as

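$$\begin{aligned} \text{Eq. (49)} \mapsto{} & \sum_{d=0}^{N/2-1} \sum_{\sigma} \Big( \tilde{U}(d)\, |\theta_d\rangle |1\rangle_U |0\rangle_V |d, \sigma, 0, \sigma\rangle + \sum_{q=0}^{N/2-1} \tilde{T}(d)\, |\theta^{(0)}_d\rangle |0\rangle_U |0\rangle_V |d, \sigma, q, \sigma\rangle \Big) \\ & + \sum_{\alpha} \Big( \tilde{V}(0)\, |\theta^{(1)}_0\rangle |0\rangle_U |1\rangle_V |0, \alpha, q, \neg\alpha\rangle + \sum_{\beta} \sum_{d=1}^{N/2-1} \sum_{q=0}^{N/2-1} \tilde{V}(d)\, |\theta^{(1)}_d\rangle |0\rangle_U |1\rangle_V |d, \alpha, q, \beta\rangle \Big). \end{aligned} \tag{50}$$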

The final step consists of converting the $|d\rangle$ register to values representing $|p\rangle$. To do this, we must add the $|q\rangle$ register into the $|d\rangle$ register when $|U\rangle = |0\rangle$ so that $|d + q\rangle = |p - q + q\rangle = |p\rangle$. However, we also want to copy the $|d\rangle$ register into the $|q\rangle$ register when $|U\rangle = |1\rangle$; thus, prior to this operation, we also implement a Fredkin gate, which swaps $|d\rangle$ and $|q\rangle$, conditioned on $|U\rangle = |1\rangle$. After the Fredkin gate and the addition of $|q\rangle$ into $|d\rangle$,

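$$\begin{aligned} \text{Eq. (50)} \mapsto{} & \sum_{d=0}^{N/2-1} \sum_{\sigma} \Big( \tilde{U}(d)\, |\theta_d\rangle |1\rangle_U |0\rangle_V |d, \sigma, d, \sigma\rangle + \sum_{q=0}^{N/2-1} \tilde{T}(d)\, |\theta^{(0)}_d\rangle |0\rangle_U |0\rangle_V |d + q, \sigma, q, \sigma\rangle \Big) \\ & + \sum_{\alpha} \Big( \tilde{V}(0)\, |\theta^{(1)}_0\rangle |0\rangle_U |1\rangle_V |q, \alpha, q, \neg\alpha\rangle + \sum_{\beta} \sum_{d=1}^{N/2-1} \sum_{q=0}^{N/2-1} \tilde{V}(d)\, |\theta^{(1)}_d\rangle |0\rangle_U |1\rangle_V |d + q, \alpha, q, \beta\rangle \Big). \end{aligned} \tag{51}$$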

Then, simply by relabeling $d = p - q$ whenever $|U\rangle = |0\rangle$ and $d = p$ whenever $|U\rangle = |1\rangle$, we see that our state is identical to the desired one [from Eq. (46)]. We show how to use SUBPREPARE to implement $\mathrm{PREPARE}_{\mathrm{CHEM}}$ in Fig. 16. The gate complexity of SUBPREPARE is $\mathcal{O}(N + \log(1/\epsilon))$, and the gate complexity of all other components of this circuit is $\mathcal{O}(\log N)$. Thus, the overall gate complexity of PREPARE is $\mathcal{O}(N + \log(1/\epsilon))$.
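The last two steps of PREPARE act as a permutation on computational basis states, so their index bookkeeping can be traced classically. The sketch below follows a single basis state through the Fredkin gate and the addition; the function name, the tuple layout, and the use of a flat integer index modulo the number of spatial orbitals (in place of the modular vector indices of the paper) are simplifying assumptions.

```python
from typing import Tuple

Register = Tuple[int, int, int, int, int, int]  # (U, V, d, alpha, q, beta)


def finish_prepare_indices(state: Register, n_spatial: int) -> Register:
    """Track one basis state from the form of Eq. (50) to the form of Eq. (51)."""
    u, v, d, alpha, q, beta = state
    if u == 1:
        d, q = q, d              # Fredkin gate: swap d and q when U = 1
    p = (d + q) % n_spatial      # add the q register into the d register
    return (u, v, p, alpha, q, beta)
```

For a U-type state the swap moves the index into q and the addition copies it back, giving p = q; for T- and V-type states the first register becomes d + q, which is p under the relabeling d = p - q.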

C. Resources required for electronic structure simulation

The parameter $\lambda$ from Eq. (6) has significant implications for the complexity of our algorithm; as seen in Eq. (27), our circuit size will scale linearly in $\lambda$. For the case of general electronic structure, we can see from Eq. (41) that $\lambda$ is

$$\lambda = \sum_{pq} |T(p - q)| + \sum_{p} |U(p)| + \sum_{p \neq q} |V(p - q)|. \tag{52}$$

This expression and the extremely naive assumption that all coefficients are $\mathcal{O}(1)$ would imply that $\lambda \in \mathcal{O}(N^2)$. For the case of quantum chemistry in the dual basis, i.e., Eq. (43), the work of Ref. [39] obtains the same bound:

$$\lambda \in \mathcal{O}\!\left( \frac{N^{7/3}}{\Omega^{1/3}} + \frac{N^{5/3}}{\Omega^{2/3}} \right) \in \mathcal{O}(N^2), \tag{53}$$

where the last relation holds when studying electronic structure systems that grow with fixed density $N \propto \Omega$, which is the usual situation. For encoding the electronic structure Hamiltonian, we also determine that $P = 6N + \mathcal{O}(\log(N/\epsilon))$ and $S = 12N + \mathcal{O}(\log N)$ in terms of T complexity. Thus, from Eq. (27), we can conclude that the overall T complexity of our procedure is roughly

$$\frac{\sqrt{2}\,\pi\lambda (S + 2P)}{\Delta E} \approx \frac{24\sqrt{2}\,\pi\lambda}{\Delta E} N, \tag{54}$$

which, for the electronic structure Hamiltonian, is rigorously bounded by $\mathcal{O}(N^3/\Delta E)$.

Ancillae required for our electronic structure simulation come from three sources: qubits required for our entanglement-based phase estimation [given by Eq. (24)], qubits required to store coefficient values in QROM [given by Eq. (36)], and ancillae actually required for our implementation of PREPARE and SELECT, which, for the electronic structure Hamiltonian simulation, is $5\log N + \mathcal{O}(1)$. Putting these sources together, the total ancillae required are

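$$\log\!\left( \frac{\sqrt{2}\,\pi\lambda}{2\Delta E} \right) + 2\log\!\left( \frac{2\sqrt{2}\,\lambda}{\Delta E} \right) + 5\log N + \mathcal{O}(1) = \log\!\left( \frac{4\sqrt{2}\,\pi\lambda^3 N^5}{\Delta E^3} \right) + \mathcal{O}(1), \tag{55}$$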

where the additive constant is small and can usually be neglected for problem sizes of interest. This expression gives the ancilla count in Theorem 1.

We now estimate resources required for specific problem instances. In practice, we find that $\lambda$ scales better than $\mathcal{O}(N^2)$, but exactly how much better is system dependent. For a particular material, the value of $\lambda$ can be influenced by a number of factors. These factors include the particulars of the bases used, the geometry and atomic composition of the material, and whether one scales toward continuum or thermodynamic limits.

Perhaps the simplest chemistry system that is classically intractable is a molecule without nuclei: the uniform electron gas, also known as jellium. Jellium is a system of $\eta$ electrons with real kinetic energy and Coulomb interactions confined to a box of finite volume $\Omega$ with periodic boundary conditions. Plane waves are a near-ideal basis for the simulation of jellium; the system is naturally expressed using the discretization of Eq. (43) with a constant external potential, i.e., $\zeta_j = 0$. Jellium is an interesting system to simulate on early quantum computers due to its simplicity, classical intractability [39], historical significance tied to breakthroughs in density functional theory [78] as well as the fractional quantum Hall effect [79], and tradition as a benchmark for classical electronic structure calculations.

The phase diagram of jellium is typically parametrized in terms of the Wigner-Seitz radius, which characterizes the electron density in three dimensions as $r_s = (3\Omega/(4\pi\eta))^{1/3}$, where $\eta$ is the number of electrons. Although the ground state of jellium at high densities (metallic, $r_s \sim 1$ Bohr radii per particle) and at very low densities (insulating, $r_s \sim 100$ Bohr radii per particle) is well known, the phase diagram in the intermediate density regime is less certain [80–85]. Whereas perturbation theory performs well in the high-density regime [86,87], quantum Monte Carlo has been the most competitive simulation tool in the low- to intermediate-density regimes [88–91]. For systems with more than 50 electrons, quantum Monte Carlo simulations of jellium typically introduce a bias to control the sign problem, such as the fixed-node approximation, full-configuration quantum Monte Carlo with initiators, or auxiliary-field quantum Monte Carlo with a constrained phase bias. The systematic error from these biases is thought to be as large as half a percent in the energy

FIG. 16. PREPARE circuit for the electronic structure Hamiltonian. It implements the unitary in Eq. (46) with a T count of $6N + \mathcal{O}(\mu + \log N)$, where $\mu$ is defined in Eq. (36). The SUBPREPARE subroutine is the dominant cost, and it is implemented as in Fig. 15. When $U$ is set by SUBPREPARE, the Fredkin gates and the addition copy $p$'s value over $q$. When $V$ is set, uniform superpositions over $q$ and $\beta$ are prepared except, if $p - q = 0$, then $\beta$ is instead set to be opposite to $\alpha$ in order to guarantee $(p, \alpha) \neq (q, \beta)$. Beware that this conditional preparation of $\beta$ doubles the weight of the $p = q$ cases (relative to the $p \neq q$ cases) and must be accounted for in the LCU coefficients given to SUBPREPARE. When neither $U$ nor $V$ is set, $\alpha$ is copied into $\beta$, and a uniform superposition over $q$ is prepared. The remaining operations contribute negligible $\mathcal{O}(\mu + \log N)$ T gates. The $\mathrm{UNIFORM}_M^{\otimes D}$ operation prepares $D$ registers in a uniform superposition of basis states going up to $M$, as in Fig. 12. Controls on multiqubit lines are conditioned on every qubit within the line; e.g., what appears to be a Toffoli gate is actually a NOT gate with $1 + D\log M$ controls. The action of our $\mathrm{PREPARE}_{\mathrm{CHEM}}$ circuit is only specified in the case where the inputs are all $|0\rangle$. During actual execution, this is not the case because the effects of the SELECT operation will prevent $\mathrm{PREPARE}^{\dagger}_{\mathrm{CHEM}}$ from exactly uncomputing the $U$, $V$, $p$, $q$, $\alpha$, and $\beta$ qubits. This is expected behavior, and it is accounted for by requiring that the potentially not-uncomputed qubits be kept and used as inputs for the next $\mathrm{PREPARE}_{\mathrm{CHEM}}$ circuit.

[81,88], which is on a scale similar to the energy difference between competing phases in the intermediate-density regime. Even for modest system sizes such as 50 electrons and twice as many spin orbitals, quantum simulations can offer bias-free results that cannot be obtained by quantum Monte Carlo.

We include numerics in Fig. 17 that empirically estimate a tighter bound on $\lambda$ for jellium in the classically challenging regime corresponding to $r_s = 10$ Bohr radii at half-filling, $N = \lceil 2\eta \rceil$. Those numerics indicate an empirical scaling of $\lambda = \mathcal{O}(\sim N^{5/3})$. If we target a chemical accuracy of $\Delta E = 0.0016$ Hartree, then from Eq. (54), we see that roughly $2 \times 10^7$ T gates would be required for jellium with 54 orbitals, $2 \times 10^8$ T gates would be required for jellium with 128 orbitals, and about a billion T gates would be required for jellium with 250 orbitals. While these numbers are promising, for small sizes, these simulations require a number of ancillae comparable to $N$. T counts and ancilla resources are tabulated for several jellium problem instances in Table III.

The dual basis of Eq. (43) is also a natural choice for periodic condensed phase systems (e.g., solids) besides jellium. Considering only this basis, there are two parameters that determine the accuracy of the simulation with respect to the true material. The first one is the number of plane waves used to discretize the cell, which determines the spacing of the quasipoints in the dual basis [39]. More plane waves equate to a finer grid and more accurate discretization. The second parameter is the size of the supercell, which determines the error one incurs by representing an infinite system with a finite, periodic one, also known as the finite-size error. There are different ways of reducing the finite-size error for a physical system. One common method used in density functional theory utilizes Bloch's theorem to divide the sampling problem into so-called "k-points" within the first Brillouin zone [92]. The smoothness of the energy with respect to the k-points and additional symmetry provided can offer advantages in certain approaches at the cost of increased complexity, often resulting in a complex Hamiltonian representation at nonzero k-points. The origin k-point, also called the gamma point, maintains a real Hamiltonian for the appropriate basis functions. An alternative to k-point sampling is increasing the size of the supercell, which increases the relevance of the gamma point. For simplicity, here we only consider the gamma point, so the natural parameter to change is the number of repetitions of the unit cell that fixes the size of the supercell being simulated. A larger supercell tends to incur less finite-size error as the system is scaled to the thermodynamic limit.

It is clear that these two parameters are not entirely independent with respect to the accuracy of representation of the true system. For example, a much larger supercell with the same number of grid points clearly offers a coarser and less accurate representation of the true system. Moreover, in classical methods, it is common practice to extrapolate along both parameters to increase the accuracy for a given computational cost [92]. We do not introduce such complexities here but rather show empirically how the choice of these parameters influences the parameter $\lambda$, which determines the cost of our algorithms, leaving optimizations such as extrapolation schemes to future work.

Figure 18 shows the value of $\lambda$ as a function of the number of qubits being used to discretize the material cell at a fixed supercell size for several real materials.

FIG. 17. The value $\lambda$ as a function of the number of qubits $N$ (horizontal axis: $N$ qubits at fixed filling, $r_s = 10$) for the 3D spinful dual-basis jellium Hamiltonian at a Wigner-Seitz radius of 10 Bohr radii (assuming the system is initialized at half-filling so $N = \lceil 2\eta \rceil$), which corresponds to an increasing cell volume as the number of basis functions increases. This density was chosen for study since it is in the classically challenging regime for jellium [39]. The bound is in atomic units of energy (Hartree). The number in parentheses corresponds to the best-fit exponent for the trend line (the dotted line), which suggests that $\lambda = \mathcal{O}(\sim N^{5/3})$.

TABLE III. Resources required for quantum simulation of 3D spinful jellium in the dual basis at a Wigner-Seitz radius of 10 Bohr radii, where the cell volume is calculated assuming the system is at half-filling. The units for $\lambda$ are Hartree. The number of logical ancillae is computed using Eq. (55), and the number of T gates is computed using Eq. (54). These estimates assume (rather conservatively in comparison to classical limitations) that we should target an additive chemical accuracy of $\Delta E = 0.0016$ Hartree. These problem sizes are large enough that contemporary classical methods cannot reliably provide unbiased estimates with low enough systematic error to resolve competing phases within the fixed-size basis.

| Spin orbitals | $\lambda$ value (Hartree) | Logical ancilla | Total logical | T count |
|---|---|---|---|---|
| 54 | 5 | 69 | 123 | $1.8 \times 10^{7}$ |
| 128 | 23 | 82 | 210 | $1.9 \times 10^{8}$ |
| 250 | 64 | 91 | 341 | $1.1 \times 10^{9}$ |
| 1024 | 640 | 112 | 1136 | $4.3 \times 10^{10}$ |
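The T-count column of Table III follows from the leading-order form of Eq. (54), and the closed-form logarithm of Eq. (55) can be evaluated the same way. The sketch below is a rough calculator under stated assumptions: the function names are invented here, logarithms are taken base 2, the $\mathcal{O}(1)$ constant in Eq. (55) is omitted, and the $\lambda$ inputs are the rounded values printed in the table, so the results should only be expected to land close to the tabulated entries.

```python
import math


def chem_t_count(lam: float, n_orbitals: int, delta_e: float) -> float:
    """Leading-order T count of Eq. (54): 24 * sqrt(2) * pi * lambda * N / dE."""
    return 24 * math.sqrt(2) * math.pi * lam * n_orbitals / delta_e


def chem_ancilla_log_term(lam: float, n_orbitals: int, delta_e: float) -> float:
    """Closed-form logarithm of Eq. (55), base 2 (assumed), without the O(1) term."""
    return math.log2(4 * math.sqrt(2) * math.pi * lam**3 * n_orbitals**5 / delta_e**3)


# (spin orbitals, lambda in Hartree) pairs as printed in Table III.
for n, lam in [(54, 5), (128, 23), (250, 64), (1024, 640)]:
    print(n, f"{chem_t_count(lam, n, 0.0016):.1e}",
          f"{chem_ancilla_log_term(lam, n, 0.0016):.1f}")
```

Small differences from the table in the second significant digit are expected because the printed $\lambda$ values are rounded.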



The number of qubits here is equal to twice the number of plane waves since spin is being treated explicitly. Equal numbers of plane waves along each of the reciprocal lattice vectors of the supercell are used, as opposed to the more common spherical energy cutoff schemes [93], as this enables the use of the plane-wave dual basis [39]. The exponent (slope on the log-log plot) of a least-squares linear regression to the data is listed alongside the material, and we see empirically that the value of $\lambda$ scales just under $\lambda = \mathcal{O}(N^2)$ in the number of basis functions while keeping the size of the supercell fixed, which matches the analytical bound rather closely. From the form of the Hamiltonian in Eq. (43), one can see that at a fixed number of plane waves, increasing the volume of the supercell $\Omega$ tends to decrease $\lambda$ such that $\lambda = \mathcal{O}(\sim \Omega^{-1/2})$, due to representing lower-frequency modes with respect to the plane-wave representation of the kinetic energy, despite increasing the total nuclei present. We show this effect empirically in the center of Fig. 18.

The first two panels of Fig. 18 leave open the question of the impact on $\lambda$ of increasing the supercell size while maintaining a constant density of dual quasipoints or a constant density of plane waves, as we expect the impact of the last two aspects to compete in some way. Empirically, this is shown in the right portion of Fig. 18, which plots the values of $\lambda$ for a fixed density of points in increasing supercell sizes. Note that this scaling is most comparable to past studies on single molecules since molecular volume tends to grow as one adds electrons. We observe that the scaling in this case is more favorable as a function of the number of qubits than simply refining the grid alone, and in all cases, it is better than $\lambda = \mathcal{O}(\sim N^{3/2})$. From Eq. (54), this numerical data would suggest that the T complexity of our overall algorithm is empirically bounded by $\mathcal{O}(\sim N^{5/2}/\Delta E)$ when the goal is simulation of real materials.

To treat molecules properly, one should further consider pseudopotentials [94], methods of extrapolation to continuum and thermodynamic limits, and embedding methods [95,96]. We leave such a thorough comparison of fault-tolerant resources required for specific instances of real materials other than jellium to future work. However, by comparing Fig. 17 and the right panel of Fig. 18, it is apparent that jellium is a reasonable proxy for other materials in that $\lambda$ values are comparable. As the rest of the simulation circuit is identical up to the particular angles of certain single-qubit rotations, one can estimate the cost of simulating these materials from our analysis of the fault-tolerant overheads required to simulate jellium in Sec. VI.

V. CONSTRUCTIONS FOR THE HUBBARD MODEL

In this section, we describe specialized implementations of the SELECT and PREPARE oracles for simulation of the planar repulsive-interaction Fermi-Hubbard model; we then estimate the overall T complexity of simulating such models. The Hubbard model is a canonical model of a many-electron system often used to model superconductivity in cuprate superconductors. Despite its simplicity, the Hubbard model exhibits a wide range of correlated electron behavior including superconductivity, magnetism, and interaction-driven metal-insulator transitions [97].

The Hubbard model is essentially a special case of Eq. (41) when the model is restricted to a planar grid. The Hamiltonian can be expressed as

$$H = -t \sum_{\langle p,q \rangle, \sigma} a^{\dagger}_{p,\sigma} a_{q,\sigma} + \frac{u}{2} \sum_{p,\, \alpha \neq \beta} n_{p,\alpha} n_{p,\beta}, \tag{56}$$

FIG. 18. Left panel: A plot of the $\lambda$ value for one unit cell of the listed material near equilibrium bond length as a function of the number of qubits used to discretize the cell. A plane-wave dual basis is used with an equal number of points along each of the axes. The value given after the material name corresponds to the best-fit scaling for that particular material. The materials display roughly similar scaling in $\lambda$ as a function of the number of qubits. Center panel: A plot of the $\lambda$ value as a function of the cell volume for a fixed number of qubits, 1024 in this case. We observe that fixing the number of qubits while increasing the cell size tends to decrease $\lambda$ as expected. Right panel: A plot of the $\lambda$ value as a function of the number of qubits, where the supercell size is scaled proportionally to the number of points along each axis. In this case, two basis functions are used along each axis in the unit cell, which is scaled proportionally as the cell volume grows.

where the notation $\langle p, q \rangle$ implies that terms exist only between sites that are adjacent on a planar lattice with periodic boundary conditions. This Hamiltonian can be expressed under the Jordan-Wigner transformation as

$$H = -\frac{t}{2} \sum_{\langle p,q \rangle, \sigma} \big( X_{p,\sigma} \vec{Z} X_{q,\sigma} + Y_{p,\sigma} \vec{Z} Y_{q,\sigma} \big) + \frac{u}{8} \sum_{p,\, \alpha \neq \beta} Z_{p,\alpha} Z_{p,\beta} - \frac{u}{4} \sum_{p,\sigma} Z_{p,\sigma} + \frac{uN}{4} \mathbb{1}. \tag{57}$$

We focus on the Hubbard model with periodic boundary conditions (which is a more typical system to study than the Hubbard model with open boundary conditions).

A. Hubbard model Hamiltonian selection oracle

We see from Eq. (57) that there are only three unique coefficients in the Hubbard Hamiltonian: The coefficient of $-X\vec{Z}X$ and $-Y\vec{Z}Y$ terms is $t/2$, the coefficient of $ZZ$ terms is $u/8$, and the coefficient of local $-Z$ terms is $u/4$. This makes the implementation of the PREPARE circuit exceptionally simple. Ultimately, we show that the PREPARE circuit for the Hubbard model can be implemented at a cost of $\mathcal{O}(\log(1/\epsilon))$. This scaling virtually guarantees that for all problem sizes of interest, the scaling of the overall algorithms will be dominated by the cost of the SELECT circuit.

We index terms in the Hubbard Hamiltonian using the registers $|U\rangle|V\rangle|p_x\rangle|p_y\rangle|\alpha\rangle|q_x\rangle|q_y\rangle|\beta\rangle$. Note that it is important for us to explicitly separate $p_x$ and $p_y$ in our construction of the Hubbard model circuits since this structure is fundamental to the efficiency of our scheme. Our indexing scheme is nearly identical to the scheme used for the arbitrary chemistry Hamiltonian in Eq. (7), but here we do not need the $\theta$ parameter since we know the sign of the parameters in advance. Thus, our SELECT circuit for the Hubbard model will meet the following specifications:

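$$\mathrm{SELECT}_{\mathrm{HUB}}\, |U, V, p, \alpha, q, \beta\rangle |\psi\rangle = |U, V, p, \alpha, q, \beta\rangle \otimes \begin{cases} -Z_{p,\alpha} |\psi\rangle & U \wedge \neg V \wedge ((p,\alpha) = (q,\beta)) \\ Z_{p,\alpha} Z_{q,\beta} |\psi\rangle & \neg U \wedge V \wedge (p = q) \wedge (\alpha = 0) \wedge (\beta = 1) \\ -X_{p,\alpha} \vec{Z} X_{q,\alpha} |\psi\rangle & \neg U \wedge \neg V \wedge (p < q) \wedge (\alpha = \beta) \\ -Y_{q,\alpha} \vec{Z} Y_{p,\alpha} |\psi\rangle & \neg U \wedge \neg V \wedge (p > q) \wedge (\alpha = \beta) \\ \text{UNDEFINED} & \text{otherwise,} \end{cases} \tag{58}$$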

where, for ease of exposition, $p = p_x + p_y M$ and $q = q_x + q_y M$, consistent with the convention of Eq. (45) for $D = 2$. By exploiting translational invariance in the Hubbard model, we are able to implement $\mathrm{SELECT}_{\mathrm{HUB}}$ in a slightly more efficient fashion, achieving a T count of only $10N + \mathcal{O}(\log N)$. We show this more efficient implementation in Fig. 19.

B. Hubbard model coefficient preparation oracle

Our PREPARE circuit for the Hubbard model has the following specification:

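$$\begin{aligned} \mathrm{PREPARE}_{\mathrm{HUB}}\, |0\rangle^{\otimes(2 + 2\log N)} \mapsto \sum_{p_x=0}^{M-1} \sum_{p_y=0}^{M-1} \Bigg( & \sqrt{\frac{u}{8\lambda}}\, |0\rangle_U |1\rangle_V |p_x\rangle |p_y\rangle |0\rangle_\alpha |p_x\rangle |p_y\rangle |1\rangle_\beta + \sqrt{\frac{u}{4\lambda}} \sum_{\sigma \in \{\downarrow, \uparrow\}} |1\rangle_U |0\rangle_V |p_x\rangle |p_y\rangle |\sigma\rangle |p_x\rangle |p_y\rangle |\sigma\rangle \\ & + \sqrt{\frac{t}{2\lambda}} \sum_{\sigma \in \{\downarrow, \uparrow\}} \big( |0\rangle_U |0\rangle_V |p_x\rangle |p_y\rangle |\sigma\rangle |p_x + 1\rangle |p_y\rangle |\sigma\rangle + |0\rangle_U |0\rangle_V |p_x\rangle |p_y\rangle |\sigma\rangle |p_x\rangle |p_y + 1\rangle |\sigma\rangle \big) \\ & + \sqrt{\frac{t}{2\lambda}} \sum_{\sigma \in \{\downarrow, \uparrow\}} \big( |0\rangle_U |0\rangle_V |p_x\rangle |p_y\rangle |\sigma\rangle |p_x - 1\rangle |p_y\rangle |\sigma\rangle + |0\rangle_U |0\rangle_V |p_x\rangle |p_y\rangle |\sigma\rangle |p_x\rangle |p_y - 1\rangle |\sigma\rangle \big) \Bigg), \end{aligned} \tag{59}$$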

where the first line above corresponds to $-Z$ and $ZZ$ terms, the second line corresponds to $-X\vec{Z}X$ terms, and the final line corresponds to the $-Y\vec{Z}Y$ terms. Note that we are looking at a Hubbard model with periodic boundary conditions, so wherever something like $|p_x + 1\rangle$ appears, we really mean $|(p_x + 1) \bmod M\rangle$, which we omitted from the above equation for clarity. Our implementation of PREPARE for the Hubbard model begins by initializing a two-qubit state containing the three distinct coefficients for the $U$, $V$, and $T$ terms. This is done with standard circuit synthesis techniques with a T count of $\mathcal{O}(\log(1/\epsilon))$ [69]. We then spread these coefficients over the various cases. We depict our implementation in Fig. 20.

C. Hubbard model resources

For the case of the planar Hubbard model in Eq. (56), it is readily apparent that

$$\lambda = 2Nt + \frac{Nu}{2} \in \mathcal{O}(N), \tag{60}$$

FIG. 19. A SELECT circuit for the Hubbard model, with function determined by how $p$ relates to $q$. Recall from Eq. (45) that $M = \sqrt{N/2}$ for the Hubbard model. This circuit has a T count of $10N + \mathcal{O}(\log N)$ and spans $N + 3\log N + \mathcal{O}(1)$ qubits. If control is OFF, the circuit has no effect. Otherwise, if $|U\rangle|V\rangle = |1\rangle|0\rangle$, it will be the case that $(p, \alpha) = (q, \beta)$, and our circuit applies $-Z_{p,\alpha}$. If $|U\rangle|V\rangle = |0\rangle|1\rangle$, we again have that $p = q$, but this time, we also have that $\alpha = 0$ and $\beta = 1$, so the circuit applies $Z_{p,0} Z_{q,1}$. If $|U\rangle|V\rangle = |0\rangle|0\rangle$ and $p < q$, the circuit performs $-X_{p,\alpha} \vec{Z} X_{q,\alpha}$, but if $p > q$, the circuit performs $-Y_{p,\alpha} \vec{Z} Y_{q,\alpha}$. The larger gates in this circuit are Majorana operators described in Fig. 9 and an indexed operation explained in Fig. 7 (except that the $X_l$ gate is replaced by a $Z_l$ gate). The Majorana operators each have a T count of $4N$, but the indexed operation has no dependence on $\alpha$ and so has a T count of $2N$. All other circuit components have T counts in $\mathcal{O}(\log N)$.

FIG. 20. $\mathrm{PREPARE}_{\mathrm{HUB}}$ circuit with T count $\mathcal{O}(\log(N/\epsilon))$. The $R_y$ operations are used to prepare the three distinct LCU coefficients, which, including multiplicity, are $\sqrt{tN/\lambda}$ (for the $2N$ $T$-type terms), $\sqrt{Nu/(4\lambda)}$ (for the $N = 2M^2$ $U$-type terms), and $\sqrt{Nu/(8\lambda)}$ (for the $N/2 = M^2$ $V$-type terms). We only specify the action of this circuit in the case where the inputs are all $|0\rangle$. During the actual execution, the effects of the SELECT operation will prevent $\mathrm{PREPARE}^{\dagger}$ from exactly uncomputing the $U$, $V$, $p_x$, $p_y$, and $\alpha$ qubits as well as the bottom two ancilla qubits. This is the expected behavior, and it is required that the potentially not-uncomputed qubits be kept and used as inputs for the next $\mathrm{PREPARE}_{\mathrm{HUB}}$ circuit.

assuming that we are dealing with the spinful model with periodic boundary conditions. We also determine that $P \in \mathcal{O}(\log(N/\epsilon))$ and that $S = 10N + \mathcal{O}(\log N)$. Thus, the total T cost of the Hubbard algorithm is

$$\frac{\sqrt{2}\,\pi\lambda (S + 2P)}{\Delta E} = \frac{\sqrt{2}\,\pi (2t + u/2) N (S + 2P)}{\Delta E} \approx \frac{20\sqrt{2}\,\pi t + 5\sqrt{2}\,\pi u}{\Delta E} N^2. \tag{61}$$

Ancillae required for our Hubbard model simulation come from two sources: qubits required for our entanglement-based phase estimation [see Eq. (24)] and ancillae actually required for our implementation of PREPARE and SELECT, which for the Hubbard model is $3\log N + \mathcal{O}(1)$. Putting these sources together, the total ancillae required are

$$\log\!\left( \frac{\sqrt{2}\,\pi\lambda}{2\Delta E} \right) + 3\log N + \mathcal{O}(1) = \log\!\left( \frac{\sqrt{2}\,\pi\lambda N^3}{2\Delta E} \right) + \mathcal{O}(1), \tag{62}$$

where the additive constant is small and can usually beneglected for problem sizes of interest. This expressiongives the ancillae count in Theorem 2.While numerically exact solutions to the Hubbard

model are available for one-dimensional [98] and infinite-dimensional systems [99], no known polynomial-timescaling classical methods can provide reliable solutions tothe planar model in all parts of its phase diagram [97]. Forstate-of-the-art approximate methods, the most challenginglow-temperature phase of the model appears to be theintermediate-interaction regime due to the presence of manycompeting phases, around u=t ¼ 4 [97]. Accordingly, wefocus our analysis on this regime. If u ¼ 4t, then λ ¼ 4Nt.An interesting and classically challenging-to-obtain accu-racy (beyond the agreement of state-of-the-art numericalmethods [97]) for this regime would be in the vicinity ofΔE ≈ t=100 [100]; these choices would suggest a T com-plexity of approximately

$$\frac{40\sqrt{2}\,\pi t}{t/100}\,N^2 < (1.8 \times 10^4)\,N^2 \tag{63}$$

and an ancilla count of approximately

$$\log\left(\frac{8\sqrt{2}\,\pi t N^3}{t/100}\right) < 12 + 3\log N. \tag{64}$$

We summarize these resources for various interesting sizes of Hubbard model simulation in Table IV.
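As a rough numerical check of Eqs. (61), (63), and (64), the following Python sketch evaluates the leading-order T count and the ancilla estimate at u/t = 4 and ΔE = t/100. The function names are illustrative, logarithms are taken base 2, and rounding log N up to an integer is an assumption made here rather than something stated in the text.

```python
import math


def hubbard_t_count(n_spin_orbitals: int, u_over_t: float = 4.0,
                    t_over_delta_e: float = 100.0) -> float:
    """Leading-order T count of Eq. (61): (20*sqrt(2)*pi*t + 5*sqrt(2)*pi*u) * N^2 / dE."""
    coefficient = math.sqrt(2) * math.pi * (20.0 + 5.0 * u_over_t) * t_over_delta_e
    return coefficient * n_spin_orbitals ** 2


def hubbard_ancilla_count(n_spin_orbitals: int) -> int:
    """Ancilla estimate 12 + 3 log N of Eq. (64); base-2 log rounded up (assumption)."""
    return 12 + 3 * math.ceil(math.log2(n_spin_orbitals))


if __name__ == "__main__":
    for n in (72, 128, 200, 800):
        ancillae = hubbard_ancilla_count(n)
        print(f"N={n}: T count ~ {hubbard_t_count(n):.2e}, "
              f"ancillae ~ {ancillae}, total logical qubits ~ {n + ancillae}")
```

Under the rounding assumption, the ancilla estimates for N = 72, 128, 200, and 800 are 33, 33, 36, and 42, which agree with the logical ancilla column of Table IV. The T counts are leading-order values and need not coincide digit for digit with the table.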

D. Exploiting locality in simulations of lattice Hamiltonians

Looking forward, another way that our circuits can be applied is to accelerate the recent Lieb-Robinson simulation method of Ref. [49]. Lieb-Robinson bounds reveal an intriguing fact about local Hamiltonians: Interactions spread out in a light cone similar in form to the causal diamonds used in relativity to indicate the regions of space-time that can have an impact on an event at a point in space-time [101]. More specifically, Lieb-Robinson bounds show that information propagates at finite speeds (up to exponentially small errors) in systems with nearest-neighbor interactions. The idea behind Ref. [49] is to exploit this structure to break up the evolution into subpieces that can be independently simulated, thus reducing the cost of simulation.

We formalize this by envisioning that we have a lattice of N sites, Λ, and a Hamiltonian that consists of terms that act upon these sites, H = Σ_{X⊆Λ} h_X. Here, each h_X is local in that if h_X and h_Y act on different sites in the lattice, then [h_X, h_Y] = 0, and h_X only has support on sites that are a constant Euclidean distance away from each other. Note that this definition of locality also incorporates the terms within the Hubbard model. The final concept that we need in order to explain the method is that of distance between sites. We assume that for all X, Y ⊆ Λ, dist(X, Y) yields the minimum Euclidean distance between any two points within the lattice vectors contained within sets X and Y. For example, given a 1D lattice on 10 sites, dist({3, 4, 5}, {8, 9, 10}) = 3. The following lemma (a restatement of Lemma 6 in Ref. [49]) explains the impact that the locality imposed by the Lieb-Robinson bound has on simulation.

Lemma 3 (patching lemma). Let Λ be a lattice on N sites with a Hamiltonian H = Σ_{X⊆Λ} h_X, where each h_X is a local bounded Hamiltonian for every X ⊆ Λ. Let A, B, C be subsets of Λ, and let H_{P_1…P_q} for any sequence

TABLE IV. Resources required for quantum simulation of a planar Hubbard model with periodic boundary conditions and spin, as in Eq. (56). The dimension of the system indicates how many sites (spatial orbitals) are on each side of the square model. The number of system qubits is thus twice the number of spatial orbitals. The number of logical ancillae is computed as Eq. (64). Finally, the number of T gates is computed using Eq. (63), which assumes that u/t = 4 and ΔE = t/100. The first three problem sizes in the table are near the classically intractable regime.

| Dimension | Spin orbitals | Logical ancilla | Total logical | T count |
|---|---|---|---|---|
| 6 × 6 | 72 | 33 | 105 | 9.3 × 10⁷ |
| 8 × 8 | 128 | 33 | 161 | 2.9 × 10⁸ |
| 10 × 10 | 200 | 36 | 236 | 7.1 × 10⁸ |
| 20 × 20 | 800 | 42 | 842 | 1.2 × 10¹⁰ |



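P: {1, …, q} ↦ {A, B, C}^q be, for integer q ≥ 1, H_{P_1…P_q} = Σ_{X ∈ P_1 ∪ … ∪ P_q} h_X (for example, H_AB = Σ_{X ⊆ A∪B} h_X). There are constants v ≥ 0, called the Lieb-Robinson velocity, and μ > 0 such that

$$\left\| e^{-iH_{ABC}t} - e^{-iH_{AB}t}\, e^{iH_{B}t}\, e^{-iH_{BC}t} \right\| \in O\left( \sum_{X \subseteq (A \cup B \cup C) \setminus (A \cup B) \setminus C} \|h_X\|\, e^{vt - \mu\, \mathrm{dist}(A,C)} \right). \tag{65}$$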

Note that, in the above terminology, (A ∪ B ∪ C)∖(A ∪ B)∖C is the boundary of the sets AB and C, meaning the set of all terms within the Hamiltonian that act on sites contained in both A or B and C. Lemma 3 is the core of the simulation method. The central idea behind the proof is to use the patching lemma recursively to break up the evolution into a product of evolution operators, each of which contains terms that act on one or two of the constituent subsets of sites in the problem. This is conceptually similar to a Trotter decomposition; however, as the error in this approximation can be made exponentially small by choosing the patches in Lemma 3 to be linearly far apart, the error can be controlled in a tighter fashion without requiring short time steps (unlike Trotter decompositions [22,102]). For example, consider regions A, B, C, D. Then, we can write

$$e^{-iH_{ABCD}t} \approx e^{-iH_{AB}t}\, e^{iH_{B}t}\, e^{-iH_{BCD}t} \approx e^{-iH_{AB}t}\, e^{iH_{B}t}\, e^{-iH_{BC}t}\, e^{iH_{C}t}\, e^{-iH_{CD}t}. \tag{66}$$
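The recursion behind Eq. (66), and the distance function used in Lemma 3, can be sketched in a few lines of Python. Both `patch_sequence` and `dist_1d` are hypothetical helpers introduced here for illustration: the first only tracks which block Hamiltonians appear and with which sign in the exponent, and the second handles only a 1D lattice with integer site labels. Neither simulates any dynamics.

```python
from typing import Iterable, List, Tuple


def dist_1d(x: Iterable[int], y: Iterable[int]) -> int:
    """Minimum distance between any site in x and any site in y on a 1D lattice."""
    ys = list(y)
    return min(abs(a - b) for a in x for b in ys)


def patch_sequence(regions: List[str]) -> List[Tuple[str, int]]:
    """Factors of the patched evolution, left to right, as (label, sign) pairs.

    sign = -1 stands for exp(-i H_label t) and sign = +1 for exp(+i H_label t).
    """
    if len(regions) < 2:
        raise ValueError("need at least two regions")
    if len(regions) == 2:
        return [("".join(regions), -1)]
    first, second = regions[0], regions[1]
    # Lemma 3 with A = first, B = second, C = all remaining regions:
    # exp(-iHt) ~ exp(-i H_AB t) exp(+i H_B t) exp(-i H_BC t); recurse on the last factor.
    return [(first + second, -1), (second, +1)] + patch_sequence(regions[1:])


if __name__ == "__main__":
    print(dist_1d({3, 4, 5}, {8, 9, 10}))
    print(patch_sequence(["A", "B", "C", "D"]))
```

For four regions the helper returns the factors AB, B, BC, C, CD with alternating signs, which is the ordering on the right-hand side of Eq. (66).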

In order to achieve scaling that is polylogarithmic in 1/ϵ, the evolution of each patch needs to be simulated using a method with polylogarithmic scaling in 1/ϵ, such as the truncated Taylor series simulation result [23] or qubitization [49]. Our circuits can be used to optimize this result since qubitization remains the best way to simulate the evolution, and our SELECT and PREPARE circuits meet the requirements of qubitization oracles. This result is formally given as Theorem 1 of Ref. [49], a special case of which is restated below for convenience.

Theorem 4. Assume the preconditions of Lemma 3 and that, for every unit ball in Λ within the Euclidean metric space R^D, at most O(1) sites are contained within the ball and h_X = 0 if the diameter of the set X is greater than 1. Additionally, let each h_X be efficiently computable and have norm at most 1. Then, there exists a quantum algorithm that simulates the evolution of H for time τ with accuracy ϵ that uses O(τN polylog(τN/ϵ)) 2-qubit gates and further has gate depth O(τ polylog(τN/ϵ)).

We claim that our approach can be used to achieve O(N/ϵ) scaling for simulating the Hubbard model with nearest-neighbor interactions. The Hubbard model satisfies the preconditions because each term in the Hamiltonian is local on the fermion lattice [49]. By using our constructions for the PREPARE and SELECT circuits, we can reduce constant factors (and some log factors in T complexity) involved in the qubitized simulation while saturating the O(τN) scaling of Theorem 4. We then choose τ ∈ O(1/λ) and apply phase estimation on the result. In order to estimate the eigenvalue to within error ϵ with high probability, we need O(λ/ϵ) repetitions of the circuit. Thus, by multiplying the two results, we find that the overall scaling for simulating such a Hubbard model is O(N/ϵ), as claimed.

This approach requires some follow-up work in order to determine exact T counts. Specifically, we need to implement a full qubitized simulation (rather than e^{−i arccos(H/λ)}). This transformation is known to be achievable with a polylogarithmic-sized circuit [25]. While our work provides a highly optimized method for implementing the oracles needed in this process, more work remains to estimate constant factors associated with this simulation.
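The multiplication of the two costs in the scaling claim above can be written out explicitly. This is only a restatement of that argument, with the tilde indicating that polylogarithmic factors are suppressed (an added notational convention); it says nothing about constant factors:

```latex
\underbrace{\tilde{O}(\tau N)}_{\text{one evolution (Theorem 4)}}
\;\xrightarrow{\;\tau \in O(1/\lambda)\;}\;
\tilde{O}\!\left(\frac{N}{\lambda}\right),
\qquad
\underbrace{O\!\left(\frac{\lambda}{\epsilon}\right)}_{\text{phase estimation repetitions}}
\times\,
\tilde{O}\!\left(\frac{N}{\lambda}\right)
= \tilde{O}\!\left(\frac{N}{\epsilon}\right).
```

The factor of λ cancels, which is why the final scaling depends only on the lattice size N and the target precision ϵ.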

VI. RESOURCE ANALYSIS FOR FAULT-TOLERANT IMPLEMENTATION

Throughout this work, we have focused on the number of T gates as the primary cost model of interest. The reasons for this are our focus on hardware consisting of a 2D nearest-neighbor coupled array of qubits, the intention to use the surface code [31–35], and the high relative overhead of T gates compared to all others in that context. In this section, we discuss the overhead of the complete algorithm in detail.

When using the surface code, each T gate is implemented by first preparing a magic state

$$|T\rangle \equiv T|+\rangle = \frac{|0\rangle + e^{i\pi/4}|1\rangle}{\sqrt{2}} \tag{67}$$

that is consumed during the gate. The gate is probabilistic, and 50% of the time, T† is actually applied instead of T. When the gate implemented is not as desired, an S gate must be inserted to correct it. Preparing T states requires a substantial amount of time and hardware, which we describe below. In an effort to minimize the number of physical qubits required, we therefore only prepare a

TABLE V. Breakdown of the various elements that make up the Majorana operator circuit from Fig. 9 and the data lookup circuit from Fig. 10. Here, N is the number of spin orbitals in the system that the circuits are being applied to.

| Circuit | Compute ANDs | Uncompute ANDs | Naked CNOTs | Subcircuits |
|---|---|---|---|---|
| Fig. 9 | N − 1 | N − 1 | 0.5N | 0.5N |
| Fig. 10 | 1.5N − 1 | 1.5N − 1 | 0.75N | 0.75N |



single T state at a time. We assume the availability of a correlated-error minimum-weight perfect matching decoder [103] capable of keeping pace with 1 μs rounds of surface code error detection, and capable of delivering feedforward in 10–20 μs. We calculate the qubit and time overhead for physical gate error rates p = 10⁻³ and p = 10⁻⁴.

The overhead is approximated by considering only the overhead of the Majorana operator circuit from Fig. 9 and the data lookup circuit from Fig. 10. It is expected that these circuits will account for over 90% of the total algorithm overhead. These circuits break down into a number of common pieces: compute ANDs, uncompute ANDs, naked CNOTs, and active subcircuits. The number of these pieces, in terms of the number of algorithm target qubits, N, is shown in Table V.
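The T counts implied by Table V can be tallied with a short Python sketch. It relies on the statement made later in this section that each compute AND contains 4 T gates and that no other part of these two circuits contains T gates; the function names are illustrative.

```python
T_GATES_PER_COMPUTE_AND = 4  # per the compute AND circuit of Fig. 4


def majorana_t_count(n: int) -> int:
    """T gates in the Majorana operator circuit (Fig. 9) on n qubits: n - 1 compute ANDs."""
    return T_GATES_PER_COMPUTE_AND * (n - 1)


def qrom_t_count(n: int) -> int:
    """T gates in the data lookup circuit (Fig. 10): 1.5 n - 1 compute ANDs (n must be even)."""
    if n % 2:
        raise ValueError("n must be even so that 1.5 * n is an integer")
    return T_GATES_PER_COMPUTE_AND * (3 * n // 2 - 1)


if __name__ == "__main__":
    for n in (54, 128, 250, 1024):
        print(n, majorana_t_count(n), qrom_t_count(n))
```

For N = 72 this gives 284 T gates for the Majorana operator circuit, and for N = 54 it gives 320 T gates for the data lookup, in agreement with the T count column of Table VIII.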

A surface code implementation of compute AND in Fig. 4 is shown in Fig. 21. The regular geometric structure can be decomposed into plumbing pieces, namely, cubic volumes each containing a single, small, light-colored cube. The compressed [36] version has a depth of 15 plumbing pieces. The circumference of each stringlike structure (defect) is the surface code distance d, and the minimum separation of defects of the same color is also d. In the temporal direction (left-right), each unit of d is a round of error detection. In the spatial directions (plane perpendicular to temporal), each unit of d corresponds to two qubits. Note that a single CNOT, after compression, takes a depth of 1 plumbing piece as drawn. The overhead of any algorithm ultimately needs to be expressed as some number of qubits (space) and seconds (time). A plumbing piece is a convenient device-physics and code-distance independent measure of space-time volume. As described above, the (5d/4)³ cubic volume of a plumbing piece can easily be converted to qubits and seconds, given a code distance d and single-round error-detection time.

A plumbing piece depth 8 surface code implementation of uncompute AND in Fig. 4 is shown in Fig. 22. This could be compressed by performing the measurement differently so that no initial Hadamard would be necessary; however, the current surface code form is more easily identified with the original abstract form, and further compression is not necessary, as the execution time of the algorithm, as we shall see, is limited by our serial preparation of T states.

An effective plumbing piece depth 5 surface code implementation of the Majorana operator active subcircuit is shown in Fig. 23. The unusual pair-of-horns structure is

FIG. 21. (a) Canonical surface code AND gate computation (Fig. 4) circuit. The bottom pair of white lines (coming from the back instead of from the left) represents the injection of a |T⟩ state. Each dark ring is a CNOT. Each dark U-shape labeled T* is a random T or T† gate (as determined by a measurement in a location not shown in the diagram connecting to the dark U-shape). The boxes labeled S* immediately following each T* are S or S† gates that are included if the random T or T† gate results in the incorrect gate. The final two boxes are Hadamard and S† operations, respectively. (b) Compressed version. The final three boxes can be compressed to a single box, as an arbitrary single-qubit Clifford can be performed inside using twists [104] and other techniques [105]. Distinct dark structures can be made to touch, provided this occurs in a single place [36].

FIG. 22. Uncompute AND (Fig. 4) implemented directly as a Hadamard operation followed by a measurement on the bottom qubit. The outcome of this measurement determines whether the sequence of operations on the top two qubits (a CZ implemented as a CNOT framed by Hadamard operations) is included or omitted.



to permit an uncompute AND circuit to fit in before the final CNOT. The inner loop of the data circuit (Fig. 10) is just a pair of single-control multiple-target CNOTs, and a single additional CNOT. This case can be implemented in plumbing piece depth 4 and is not shown.

Preparing a T state is an involved process [106,107], which is shown for discussion purposes in Fig. 24. The important features for our purposes are the fact that this can be tiled vertically (meaning in time) every six plumbing pieces, and the whole structure occupies an area of 160 plumbing pieces. A significant amount of fast classical feedforward is required, as many T gates are potentially followed by S gates, and the paths of the connections from the first (small) level of distillation to the second (large) level must be determined based on which succeed. Our assumption of a 10–20-μs latency decoder is sufficient to make this work. We are interested in the overhead of solving instances of the electronic structure and Hubbard Hamiltonians discussed in prior sections. We must choose a target inaccuracy ϵ to fix the number of data logical qubits and gates required. To first order, the dependence of the gate count on ϵ can be ignored. We choose ϵ = 10⁻³. Table VI summarizes the circuit input parameters we will study.

Given Tables V and VI, and the plumbing piece depths of the various circuit elements, we can calculate the total number of data plumbing pieces N_PP^data and hence the code distance required to ensure no more than a 1% chance of logical error in any data plumbing piece using p_L(d, p) ≃ 2d(50p)^((d+1)/2) < 1/(100 N_PP^data). Similarly, knowing that the compute AND circuit contains 4 T gates and that no other part of the data or Majorana operator circuits contains T gates, we can calculate the total number of T gates, N_T, and hence the target T-state error rate from distillation of 1/(100 N_T). We also calculate the total number of T

FIG. 24. Preparing a T state in the surface code. Physical |T⟩_0 states are injected into the 16 lower factories in the lower rear right, which distill them into less noisy |T⟩_1 states. Fifteen of the successful distillations are forwarded to the larger factory towards the front and left, which distills them into a |T⟩_2 state with low enough error. The output is shown as the top-left dark U-shape.

FIG. 23. Surface code implementation of the inner loop of the Majorana operator circuit. Contrast with Fig. 9, noting that the circuit has been somewhat modified to reduce its surface code spacetime volume. In particular, the controlled-Y operations from Fig. 9 have been propagated through the controlled-Z operations, producing CNOT operations that are cheaper to perform. This creates a phase error, which must be corrected by an S gate on the control of the entire Majorana operator. Here, we do not show initial Hadamard gates on every target qubit.



distillation plumbing pieces, N_PP^T, and a code distance to ensure that the chance of T plumbing piece error is below 1/(100 N_PP^T). We have elected to keep algorithm error rates low to ensure that, on average, only a few repetitions are required. For both p = 10⁻³ and p = 10⁻⁴ and all algorithm instances considered, T-state distillation of the form in Fig. 24 is sufficient to achieve the target logical error rate. This information is collectively sufficient to calculate the qubit and time overheads, shown for all cases in Table VII.

The previous paragraphs described a manual overhead estimation method. There are a number of approximations that go into such an estimate, in particular, assuming that it will always be possible to route gates in 3D spacetime without overhead beyond that of where the data qubits are stored. In order to strengthen the relevance of the presented results, we have also used a software-automated approximation method. The software is an improved version of the tool from Ref. [108]. Automated overhead approximation starts from a Clifford + T representation of the circuit to be analyzed (e.g., Fig. 8) and ends with a full surface code layout, having each gate translated into a corresponding configuration of plumbing pieces. Thus, the automated estimation work flow is similar to the manual one. However, certain circuit particularities are analyzed differently, so similarities and differences between the two methods are discussed.

The Clifford + T circuit is prepared according to a worst-case scenario, based on the available hardware restrictions, plumbing-piece layout problems, and T-gate correction mechanisms. The preparation of a single distilled T state at a time is a restriction that influences the resulting surface code layout: The Clifford + T gates have to be scheduled (laid out) in such a way that the T gates will

TABLE VI. Cases for which we will calculate the fault-tolerant overhead, along with numbers relevant to estimating the non-negligible components of this overhead. Column 3 contains the number of times that the W oracle is queried. For the Hubbard model, we consider the system at intermediate coupling (u/t = 4), implying that λ = 4Nt, and we consider an accuracy of ΔE = t/100. For jellium, values of λ are provided in Table III, and we target chemical accuracy, which is defined as ΔE = 0.0016 Hartree. Each call to W includes a call to SELECT, PREPARE, and PREPARE†, which in turn apply QROM and Majorana operations (which are the dominant costs of the algorithm). Columns 4 and 5 estimate the number of times the Majorana operator circuit (see Fig. 9) must be applied to the entire system or else to half of the system. For electronic structure, there are three Majorana operators of size N per query to W (see Fig. 14). For Hubbard, there are two Majorana operators of size N and one of size N/2 per query to W (see Fig. 19). Column 6 estimates the number of times QROM lookups (see Fig. 10) of size L = 3N/2 are performed. This does not occur in our Hubbard model circuits, but it happens twice per query to W in our electronic structure circuits (once in PREPARE and once in PREPARE†; see Fig. 15). The final column contains the maximum number of data qubits required at any point in the algorithm, which occurs while applying the Majorana operations in SELECT. This number does not include space to prepare T states, which will be discussed separately.

| System | Spin orbitals (N) | W queries | Majorana_N | Majorana_{N/2} | QROM_{3N/2} | Max qubits |
|---|---|---|---|---|---|---|
| Hubbard model | 72 | 1.3 × 10⁵ | 2.5 × 10⁵ | 1.3 × 10⁵ | 0 | 105 |
| Hubbard model | 128 | 2.3 × 10⁵ | 4.6 × 10⁵ | 2.3 × 10⁵ | 0 | 161 |
| Hubbard model | 200 | 3.6 × 10⁵ | 7.2 × 10⁵ | 3.6 × 10⁵ | 0 | 236 |
| Hubbard model | 800 | 1.4 × 10⁶ | 2.8 × 10⁶ | 1.4 × 10⁶ | 0 | 842 |
| Electronic structure | 54 | 1.4 × 10⁴ | 4.2 × 10⁴ | 0 | 2.8 × 10⁴ | 123 |
| Electronic structure | 128 | 6.3 × 10⁴ | 1.9 × 10⁵ | 0 | 1.3 × 10⁵ | 210 |
| Electronic structure | 250 | 1.7 × 10⁵ | 5.3 × 10⁵ | 0 | 3.5 × 10⁵ | 341 |
| Electronic structure | 1024 | 1.8 × 10⁶ | 5.3 × 10⁶ | 0 | 3.5 × 10⁶ | 1136 |

TABLE VII. Manual calculation of qubit and time overheads of general chemistry and Hubbard circuits, assuming gate error rates of p = 10⁻³ and p = 10⁻⁴, a 2D array of nearest-neighbor coupled qubits, and a surface code error-detection cycle time of 1 μs. The execution time being estimated is the duration of one complete run of the phase estimation process.

| System | Spin orbitals (N) | Physical qubits (p = 10⁻³) | Physical qubits (p = 10⁻⁴) | Execution time in h (p = 10⁻³) | Execution time in h (p = 10⁻⁴) |
|---|---|---|---|---|---|
| Hubbard model | 72 | 1.4 × 10⁶ | 4.4 × 10⁵ | 4.6 | 2.6 |
| Hubbard model | 128 | 2.1 × 10⁶ | 6.6 × 10⁵ | 15 | 8.4 |
| Hubbard model | 200 | 3.2 × 10⁶ | 8.9 × 10⁵ | 40 | 21 |
| Hubbard model | 800 | 1.4 × 10⁷ | 3.6 × 10⁶ | 6.7 × 10² | 3.7 × 10² |
| Electronic structure | 54 | 1.4 × 10⁶ | 3.9 × 10⁵ | 0.82 | 0.43 |
| Electronic structure | 128 | 2.4 × 10⁶ | 8.1 × 10⁵ | 9.9 | 5.6 |
| Electronic structure | 250 | 4.4 × 10⁶ | 1.2 × 10⁶ | 58 | 30 |
| Electronic structure | 1024 | 2.0 × 10⁷ | 4.8 × 10⁶ | 2.7 × 10³ | 1.4 × 10³ |


be executed as soon as possible, but not earlier than the availability of distilled T states. The state distillation form in Fig. 24 implies that a T gate can be executed, on average, every six plumbing pieces along the time axis. Additionally, T-gate implementations are probabilistic, and S-gate corrections may be necessary. Thus, our scenario considers that all T gates are followed by the corrective S gate, resulting in a synthetic increase of the Clifford + T circuit depth. Circuit preparation is followed by an optimization procedure, where as many Clifford gates as possible are scheduled between two subsequent T gates. The software simulates the availability of the distilled T states and places T gates whenever their execution is possible. If no T states are available, the T gates are delayed, which will later increase the approximated time overhead.

Finally, the Clifford + T circuit is translated into the surface code layout. From a resource estimation perspective, the complexity of this task is increased because the software currently only partially includes the optimization strategy presented in Fig. 21(b): Final boxes can be compressed to a single one, but distinct dark structures are not allowed to touch (for verification or debugging purposes). Due to this fact, the automated approximation uses a slightly different Clifford + T realization of the compute AND gate (cf. Fig. 4), which has the advantage of being more suitable for automatic placement in stairway-structured circuits (i.e., the arrangement of AND gates in Fig. 9). Automatic placement of those Clifford + T subcircuits results in a shorter depth of the generated surface code layouts. Overhead of the two basic circuits in units of plumbing pieces can be found in Table VIII. These data converted into qubits and time can be found in Table IX. The automatically generated estimations are comparable to the manual ones, though generally slightly higher, exceeding the manual estimates by 10%–20%. While the automated method is penalized by missing optimization strategies that are possible when analyzing circuits manually, and the need to provide explicit communication paths for long-range gates, some of this penalty is canceled by using algorithmic methods too complex to perform manually. The fact that both approximation methods lead to

FIG. 25. Illustration of the circuit layout used for resource estimation. The figure shows a portion of a circuit towards the outputs (the time axis indicating circuit execution runs from the back to the front). The figure sketches the arrangement of the logical qubits (light gray lines), the CNOTs (dark gray lines), with the boxes (green) abstracting the multilevel distillation circuits from Fig. 24 and the bounding box (black wire frame) expressing the resources estimated to lay out the circuit.

TABLE VIII. Automatically generated resource estimates of the Majorana and QROM circuits (Figs. 9 and 10). The area width, height, and time columns give the dimensions of the bounding box (e.g., Fig. 25) in units of plumbing pieces. The last column is the number of plumbing pieces estimated to be actively used within the bounding box. The volume numbers do not include the volume of the T factory, but they do include idle qubits that are present in the algorithm as a whole but not the individual circuits. "Braided volume" refers to the amount of actively used volume, i.e., nonempty space with defects used to encode qubits and operations. QROM circuits are indexed like QROM_{(3/2)N} because, in context, the QROM index size L is always 50% larger than the number of orbitals N.

| System | Circuit (N) | T count | Area (PP²) | Time (PP) | Volume (PP³) | Braided volume (PP³) |
|---|---|---|---|---|---|---|
| Hubbard model | Majorana_72 | 284 | 17 × 16 | 1840 | 500 480 | 429 624 |
| Hubbard model | Majorana_128 | 508 | 25 × 16 | 3252 | 1 300 800 | 1 155 817 |
| Hubbard model | Majorana_200 | 796 | 36 × 16 | 5080 | 2 926 080 | 2 637 504 |
| Hubbard model | Majorana_800 | 3196 | 123 × 16 | 20 262 | 39 875 616 | 37 451 032 |
| Electronic structure | Majorana_54 | 212 | 20 × 16 | 1382 | 442 240 | 365 717 |
| Electronic structure | Majorana_128 | 508 | 32 × 16 | 3252 | 1 665 024 | 1 473 563 |
| Electronic structure | Majorana_250 | 996 | 51 × 16 | 6342 | 5 175 072 | 4 685 164 |
| Electronic structure | Majorana_1024 | 4092 | 165 × 16 | 25 932 | 68 460 480 | 64 114 531 |
| Electronic structure | QROM_{(3/2)·54} | 320 | 20 × 16 | 2068 | 661 760 | 558 098 |
| Electronic structure | QROM_{(3/2)·128} | 764 | 32 × 16 | 4872 | 2 494 464 | 2 273 711 |
| Electronic structure | QROM_{(3/2)·250} | 1496 | 51 × 16 | 9508 | 7 758 528 | 7 272 549 |
| Electronic structure | QROM_{(3/2)·1024} | 6140 | 165 × 16 | 38 892 | 102 674 880 | 100 399 903 |


such comparable qubit and time overheads strengthens our confidence in these estimates. The time estimates, in particular, are practically identical.

The data are highly encouraging, with physical qubit counts of order a million and times in hours for all but the largest cases considered. Significant further reduction is expected to be possible. For example, the N qubits in the |ψ⟩ register are only operated on by the Majorana operator circuit, and this circuit targets just one of these qubits at a time. This implies that the remainder can be stored more compactly in square surface code patches while not being interacted with. This method could easily reduce the overhead of these N qubits, which account for 70%–80% of the physical qubits, by a factor of 6. This result would conservatively lower overall physical qubit requirements by a factor of two.
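The code-distance selection used in the manual estimate of this section, p_L(d, p) ≃ 2d(50p)^((d+1)/2) < 1/(100 N_PP), can be sketched in Python as follows. The function names, the restriction to odd d, and the plumbing-piece count in the usage line are assumptions made for illustration; they are not values taken from the tables.

```python
def logical_error_rate(d: int, p: float) -> float:
    """Approximate logical error rate per plumbing piece: 2 d (50 p)^((d + 1) / 2)."""
    return 2 * d * (50 * p) ** ((d + 1) / 2)


def minimum_code_distance(n_plumbing_pieces: float, p: float, max_d: int = 1001) -> int:
    """Smallest odd distance d with p_L(d, p) < 1 / (100 * n_plumbing_pieces)."""
    if not 0 < 50 * p < 1:
        raise ValueError("the approximation is only useful for 0 < p < 1/50")
    target = 1.0 / (100.0 * n_plumbing_pieces)
    for d in range(3, max_d + 1, 2):  # odd distances only (assumption)
        if logical_error_rate(d, p) < target:
            return d
    raise ValueError("no distance up to max_d meets the target")


if __name__ == "__main__":
    # Hypothetical workload of 1e9 plumbing pieces at both physical error rates.
    for p in (1e-3, 1e-4):
        print(p, minimum_code_distance(1e9, p))
```

The same search applies to the T-factory plumbing pieces, with N_PP replaced by the number of distillation plumbing pieces. Lowering p from 10⁻³ to 10⁻⁴ reduces the required distance, which is the mechanism behind the smaller qubit counts in the p = 10⁻⁴ columns of Tables VII and IX.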

VII. CONCLUSION

In this work, we introduced especially efficient fault-tolerant quantum circuits for using phase estimation to estimate the spectra of electronic Hamiltonians. Unlike past work, which has focused on realizing phase estimation unitaries encoding e^{−iHτ}, corresponding to time evolution under H for duration τ, we focused on a recent idea that one might more cheaply realize phase estimation unitaries encoding the quantum walk e^{i arccos(H/λ)}, where λ is a parameter closely related to the induced 1-norm of the system Hamiltonian [26,27]. We constructed explicit quantum circuits for realizing this quantum walk with T complexity linear in basis size for both the planar Hubbard model and electronic structure Hamiltonians in second quantization. We showed that phase estimation projects these systems to an eigenstate and estimates the associated eigenvalue to within additive error ϵ by querying the quantum walk operator an optimal number of times, scaling as O(λ/ϵ). To accomplish this result, we introduced general techniques that we conjecture are near optimal for streaming bits of a unary register and for implementing a quantum read-only memory. We introduced a new form of Heisenberg-limited phase estimation specialized to linear combinations of unitaries based simulations and provided bounds on T complexity and ancilla count, which remain tight even at small finite sizes.

In addition to providing explicit Clifford + T circuits, we compiled the bottleneck components of these simulations to fault-tolerant surface code gates in order to rigorously determine the resources that would be required for error correcting interesting problems. We performed this compilation both by hand and by using automatic tools and found similar overheads in both cases. We found that classically intractable instances of jellium and the Fermi-Hubbard model could be simulated with under 1 × 10⁶ T gates and would require about 1 × 10⁶ physical qubits in the surface code, with two-qubit error rates on the order of 10⁻³. At error rates of 10⁻⁴, about an order of magnitude fewer physical qubits would be required. We also priced out simulations of realistic solid-state materials such as diamond, graphite, silicon, metallic lithium, and crystalline lithium hydride and found that only slightly more than 1 × 10⁹ T gates and a few million physical qubits would be required.

Despite focusing on different systems, our results are most readily comparable to the previous state-of-the-art results from Ref. [43]. Even though Ref. [43] sought empirical estimates of the T complexity rather than rigorous upper bounds as we did, they estimated that approximately 10¹⁵–10¹⁶ T gates would be required for a 108-qubit simulation of the FeMoco molecule active space. By comparison, our upper bounds on the T complexity required to solve the classically intractable electronic structure problems studied here were roughly a million times less. The low T complexity is the result of designing a lean algorithm from the ground up, with insights matched to the Hamiltonian and with innovative algorithmic subroutines. The improvements are distributed across several parts of our approach, each of which provides 1 or 2 orders of magnitude improvement. Because our simulations require only a few times more physical qubits than is

TABLE IX. Automatically generated qubit and time overheads of general chemistry and Hubbard circuits assuming gate error rates of p = 10⁻³ and p = 10⁻⁴, a 2D array of nearest-neighbor coupled qubits, and a surface code error-detection cycle time of 1 μs. The execution time being estimated is the duration of one complete run of the phase estimation process.

| System | N spin orbitals | Physical qubits (p = 10⁻³) | Physical qubits (p = 10⁻⁴) | Execution time in h (p = 10⁻³) | Execution time in h (p = 10⁻⁴) |
|---|---|---|---|---|---|
| Hubbard model | 72 | 1.7 × 10⁶ | 5.3 × 10⁵ | 4.6 | 2.6 |
| Hubbard model | 128 | 2.4 × 10⁶ | 7.8 × 10⁵ | 15 | 8.4 |
| Hubbard model | 200 | 3.8 × 10⁶ | 1.0 × 10⁶ | 40 | 21 |
| Hubbard model | 800 | 1.5 × 10⁷ | 4.2 × 10⁶ | 6.7 × 10² | 3.7 × 10² |
| Electronic structure | 54 | 1.7 × 10⁶ | 4.7 × 10⁵ | 0.85 | 0.44 |
| Electronic structure | 128 | 2.9 × 10⁶ | 9.5 × 10⁵ | 10 | 5.7 |
| Electronic structure | 250 | 5.1 × 10⁶ | 1.4 × 10⁶ | 58 | 30 |
| Electronic structure | 1024 | 2.3 × 10⁷ | 5.6 × 10⁶ | 2.8 × 10³ | 1.4 × 10³ |



required by a single T factory, it is reasonable to expect that the simulations we outline here will become practical on the first universal fault-tolerant quantum devices, many years before the simulations discussed in Ref. [43] would be viable.

Several important directions for future research pertain to the extension of these simulation techniques to representations that would be more effective for single molecules. While the dual basis described in Ref. [39] is well suited to treating solid-state materials such as the ones explored here (e.g., jellium, solid-state silicon, graphite, diamond, lithium, and lithium hydride), by combining our techniques with the "Gausslet" basis sets of Ref. [41], we should also be able to simulate single molecules with similar resolution to Gaussian orbitals, thus extending our results to systems such as FeMoco with similar overheads to those observed in this work. However, deploying the Gausslet basis functions to systems with large atomic nuclei such as iron (as in FeMoco) will require further research. If basis errors are a concern, then future work should combine results from this paper with Refs. [109] and [26] in order to determine the cost of encoding first-quantized electronic spectra in quantum circuits; in first quantization, basis errors are suppressed exponentially in the number of qubits used to represent the system.

Another remaining challenge is to compute a tighter upper bound on the number of physical qubits required by the algorithm. At the first moment that it becomes technologically possible to distill magic states, the number of physical qubits available on one machine will still be extremely limited. Getting a meaningful computation to fit at all will be difficult. Fortunately, the qubit count estimates of this paper were fairly conservative: We only explored surface code constructions that we were able to validate and implement in software; these constructions are not necessarily optimal. For example, the logical qubit representation used in lattice surgery [110] requires fewer physical qubits than the double-defect representation used in the estimates of this paper. Furthermore, there are several places in our circuits where we preferred small multiplicative improvements in T count over small additive improvements in logical qubit count. For example, we delay uncomputing QROM lookups in order to avoid recomputation, and when performing phase estimation, we minimize the number of oracle queries by using a full-size phase register instead of a single phase qubit. Since we have managed to show that with error rates of 10⁻³ one can solve interesting problems in chemistry using on the order of a million physical qubits within the surface code, a next natural goal would be to try to further reduce the resources required to be on the order of 100 000 physical qubits.

ACKNOWLEDGMENTS

The authors thank Yuval Sanders, Artur Scherer, Mária Kieferová, and Guang Hao Low for helpful discussions about linear combinations of unitaries based simulation methods. We thank Garnet Kin-Lic Chan and Kostyantyn Kechedzhi for discussions pertaining to the regimes in which the Hubbard model would be interesting to simulate. We thank Ian Kivlichan, Zhang Jiang, and Dave Bacon for helpful comments on an early version of this manuscript. D. W. B. is funded by an Australian Research Council Discovery Project (Grant No. DP160102426).

APPENDIX: PROPAGATING ERRORS FROM HAMILTONIAN COEFFICIENTS INTO PHASE ESTIMATE

In this appendix, we address the question of how accurately coefficients of the Hamiltonian must be prepared in the PREPARE oracle in order to estimate the Hamiltonian eigenvalues to precision $\epsilon$. As discussed in Sec. II A, our phase estimation scheme involves estimating the phases induced by the operator $e^{i \arccos(H/\lambda)}$. If one is near the singularity of arccos, then a small error in the Hamiltonian can have a significant impact on the phase. Let us define

$$\tilde{H} \equiv \sum_{l=0}^{L-1} \tilde{w}_l H_l \tag{A1}$$

for our approximate encoding of $H$. Using the state preparation technique in Sec. III D, we obtain

$$\lambda = \sum_{l=0}^{L-1} \tilde{w}_l. \tag{A2}$$

We denote by $\delta$ an upper bound on the approximation error in any of the $w_l$, so

$$\delta \geq |\tilde{w}_l - w_l|. \tag{A3}$$

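Next, note that the error in the eigenphase obeys

$$\begin{aligned}
\epsilon_{\mathrm{PREP}} &\leq \left\| e^{i \arccos(\tilde{H}/\lambda)} - e^{i \arccos(H/\lambda)} \right\| \leq \left\| \arccos(\tilde{H}/\lambda) - \arccos(H/\lambda) \right\| \\
&\leq \sum_{p=0}^{\infty} \frac{(2p-1)!!}{\lambda^{2p+1} (2p+1) (2p)!!} \left\| \tilde{H}^{2p+1} - H^{2p+1} \right\|,
\end{aligned} \tag{A4}$$

where !! is the double factorial $z!! = z \cdot (z-2) \cdot (z-4) \cdots 1$, assuming $z$ is a natural number. It is straightforward to show inductively that for any $p > 0$,

$$\left\| \tilde{H}^{2p+1} - H^{2p+1} \right\| \leq (2p+1) \left( \max\{\|\tilde{H}\|, \|H\|\} \right)^{2p} \|\tilde{H} - H\|. \tag{A5}$$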
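The inductive argument behind Eq. (A5) is not spelled out in the text. One standard way to see it, sketched here for generic bounded operators $A$ and $B$ (stand-ins for $\tilde{H}$ and $H$, introduced only for this illustration) and any integer $n \geq 1$, is a telescoping identity followed by the triangle inequality and submultiplicativity of the operator norm:

```latex
\begin{align*}
A^{n} - B^{n} &= \sum_{k=0}^{n-1} A^{k}\,(A - B)\,B^{\,n-1-k}, \\
\|A^{n} - B^{n}\| &\leq \sum_{k=0}^{n-1} \|A\|^{k}\,\|A - B\|\,\|B\|^{\,n-1-k}
  \leq n\,\bigl(\max\{\|A\|, \|B\|\}\bigr)^{n-1}\,\|A - B\|.
\end{align*}
```

Setting $n = 2p+1$ reproduces the right-hand side of Eq. (A5).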

We then have from Eq. (A1) that

$$\|\tilde{H} - H\| \leq \sum_{l=0}^{L-1} |\tilde{w}_l - w_l| \leq L\delta. \tag{A6}$$

We further have that

$$\max\{\|\tilde{H}\|, \|H\|\} \leq \|H\| + L\delta. \tag{A7}$$
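As a small numerical illustration of Eqs. (A6) and (A7), the sketch below uses a toy single-qubit Hamiltonian built from the four Pauli operators. This is an assumption made purely for illustration: it relies on each $H_l$ having unit norm (as Pauli operators do), and the function names `pauli_combination_norm` and `perturbation_report` are hypothetical helpers, not part of the paper or of any library.

```python
import math
import random


def pauli_combination_norm(coeffs):
    """Spectral norm of c0*I + c1*X + c2*Y + c3*Z for real coefficients.

    The eigenvalues are c0 +/- sqrt(c1^2 + c2^2 + c3^2), so the largest
    absolute eigenvalue is |c0| + sqrt(c1^2 + c2^2 + c3^2).
    """
    c0, c1, c2, c3 = coeffs
    return abs(c0) + math.sqrt(c1 * c1 + c2 * c2 + c3 * c3)


def perturbation_report(weights, delta, seed=0):
    """Perturb each weight by at most delta; return the quantities in (A6)-(A7)."""
    rng = random.Random(seed)
    perturbed = [w + rng.uniform(-delta, delta) for w in weights]
    diffs = [wt - w for wt, w in zip(perturbed, weights)]
    num_terms = len(weights)  # plays the role of L
    norm_h = pauli_combination_norm(weights)
    return {
        "norm_diff": pauli_combination_norm(diffs),       # ||H~ - H||
        "sum_abs_diff": sum(abs(d) for d in diffs),       # sum_l |w~_l - w_l|
        "L_delta": num_terms * delta,                     # L * delta
        "max_norm": max(pauli_combination_norm(perturbed), norm_h),
        "norm_H": norm_h,                                 # ||H||
    }


# Toy weights (w_0, w_x, w_y, w_z); the values are arbitrary.
report = perturbation_report([0.5, 1.0, 0.25, 0.75], delta=1e-3)
```

In the returned dictionary, `norm_diff <= sum_abs_diff <= L_delta` corresponds to Eq. (A6), and `max_norm <= norm_H + L_delta` corresponds to Eq. (A7).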

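Substituting these equations into Eq. (A4) then gives

$$\begin{aligned}
\epsilon_{\mathrm{PREP}} &\leq \sum_{p=0}^{\infty} \frac{(2p-1)!!}{\lambda^{2p+1} (2p)!!} \left( \|H\| + L\delta \right)^{2p} L\delta \\
&= \frac{L\delta}{\lambda} \sum_{p=0}^{\infty} \frac{(2p-1)!!}{(2p)!!} \left( \frac{\|H\| + L\delta}{\lambda} \right)^{2p} \\
&= \frac{L\delta}{\lambda} \left[ 1 - \left( \frac{\|H\| + L\delta}{\lambda} \right)^{2} \right]^{-1/2}.
\end{aligned} \tag{A8}$$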
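The last equality in Eq. (A8) uses the binomial series $\sum_{p \geq 0} \frac{(2p-1)!!}{(2p)!!} x^{2p} = (1 - x^2)^{-1/2}$ for $|x| < 1$. The following minimal sketch checks this numerically; the function names and the sample parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def prep_error_bound_series(num_terms, delta, lam, norm_h, max_order=2000):
    """Right-hand side of Eq. (A8), summed term by term as a truncated series."""
    x = (norm_h + num_terms * delta) / lam
    if not 0.0 <= x < 1.0:
        raise ValueError("series requires (||H|| + L*delta) / lambda in [0, 1)")
    coeff = 1.0   # (2p-1)!! / (2p)!! at p = 0, using the convention (-1)!! = 1
    power = 1.0   # x^(2p)
    total = 0.0
    for p in range(max_order):
        total += coeff * power
        coeff *= (2 * p + 1) / (2 * p + 2)  # ratio of successive coefficients
        power *= x * x
    return (num_terms * delta / lam) * total


def prep_error_bound_closed(num_terms, delta, lam, norm_h):
    """Closed form of Eq. (A8): (L*delta/lambda) * (1 - x^2)^(-1/2)."""
    x = (norm_h + num_terms * delta) / lam
    return (num_terms * delta / lam) / math.sqrt(1.0 - x * x)


# Arbitrary sample values with (||H|| + L*delta) / lambda close to 0.5.
series_value = prep_error_bound_series(100, 1e-4, 10.0, 4.99)
closed_value = prep_error_bound_closed(100, 1e-4, 10.0, 4.99)
```

The truncated series converges quickly when $(\|H\| + L\delta)/\lambda$ is well below 1 and slowly near the arccos singularity, which is the regime the appendix warns about.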

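This inequality can be solved for $\delta$ to give

$$\begin{aligned}
\delta &\geq \frac{\epsilon_{\mathrm{PREP}}}{(1 + \epsilon_{\mathrm{PREP}}^{2}) L} \left( \sqrt{\lambda^{2} (1 + \epsilon_{\mathrm{PREP}}^{2}) - \|H\|^{2}} - \epsilon_{\mathrm{PREP}} \|H\| \right) \\
&\geq \frac{\epsilon_{\mathrm{PREP}} \lambda}{(1 + \epsilon_{\mathrm{PREP}}^{2}) L} \left( 1 - \|H\|^{2}/\lambda^{2} \right).
\end{aligned} \tag{A9}$$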

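If we require that $\epsilon_{\mathrm{PREP}} \leq \sqrt{2}\,\Delta E/(4\lambda)$ as in Eq. (25), then this can be obtained by choosing

$$\delta = \frac{\sqrt{2}\,\Delta E}{4L \left( 1 + \frac{\Delta E^{2}}{8\lambda^{2}} \right)} \left( 1 - \|H\|^{2}/\lambda^{2} \right). \tag{A10}$$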
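To make Eq. (A10) concrete, the sketch below evaluates $\delta$ for given $\Delta E$, $\lambda$, $\|H\|$, and $L$, and then substitutes it back into the closed form of Eq. (A8) to confirm that the resulting bound on $\epsilon_{\mathrm{PREP}}$ does not exceed $\sqrt{2}\,\Delta E/(4\lambda)$. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math


def coefficient_precision(delta_e, lam, norm_h, num_terms):
    """delta from Eq. (A10) for target energy error delta_e and L = num_terms."""
    prefactor = math.sqrt(2.0) * delta_e / (
        4.0 * num_terms * (1.0 + delta_e ** 2 / (8.0 * lam ** 2))
    )
    return prefactor * (1.0 - norm_h ** 2 / lam ** 2)


def prep_error_bound(delta, lam, norm_h, num_terms):
    """Closed-form upper bound on eps_PREP from Eq. (A8)."""
    x = (norm_h + num_terms * delta) / lam
    return (num_terms * delta / lam) / math.sqrt(1.0 - x * x)


def target_prep_error(delta_e, lam):
    """Target sqrt(2) * delta_e / (4 * lambda) quoted from Eq. (25)."""
    return math.sqrt(2.0) * delta_e / (4.0 * lam)


# Illustrative inputs only: delta_e = 1e-3, lambda = 100, ||H|| = 60, L = 1000.
delta = coefficient_precision(1e-3, 100.0, 60.0, 1000)
bound = prep_error_bound(delta, 100.0, 60.0, 1000)
target = target_prep_error(1e-3, 100.0)
```

Because Eq. (A10) uses the weaker second line of Eq. (A9), the bound obtained this way typically sits somewhat below the target, and the factor $1 - \|H\|^2/\lambda^2$ shrinks $\delta$ sharply as $\|H\|$ approaches $\lambda$.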

[1] R. P. Feynman, Simulating Physics with Computers, Int. J. Theor. Phys. 21, 467 (1982).

[2] R. P. Feynman, Quantum Mechanical Computers, Found. Phys. 16, 507 (1986).

[3] S. Lloyd, Universal Quantum Simulators, Science 273, 1073 (1996).

[4] D. S. Abrams and S. Lloyd, Simulation of Many-Body Fermi Systems on a Universal Quantum Computer, Phys. Rev. Lett. 79, 2586 (1997).

[5] D. S. Abrams and S. Lloyd, Quantum Algorithm Providing Exponential Speed Increase for Finding Eigenvalues and Eigenvectors, Phys. Rev. Lett. 83, 5162 (1999).

[6] A. Y. Kitaev, Quantum Measurements and the Abelian Stabilizer Problem, arXiv:quant-ph/9511026.

[7] A. Aspuru-Guzik, A. D. Dutoi, P. J. Love, and M. Head-Gordon, Simulated Quantum Computation of Molecular Energies, Science 309, 1704 (2005).

[8] E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lundgren, and D. Preda, A Quantum Adiabatic Evolution Algorithm Applied to Random Instances of an NP-Complete Problem, Science 292, 472 (2001).

[9] L.-A. Wu, M. S. Byrd, and D. A. Lidar, Polynomial-Time Simulation of Pairing Models on a Quantum Computer, Phys. Rev. Lett. 89, 057904 (2002).

[10] R. Babbush, P. J. Love, and A. Aspuru-Guzik, Adiabatic Quantum Simulation of Quantum Chemistry, Sci. Rep. 4, 6603 (2014).

[11] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A Variational Eigenvalue Solver on a Photonic Quantum Processor, Nat. Commun. 5, 4213 (2014).

[12] P. J. J. O’Malley, R. Babbush, I. D. Kivlichan, J. Romero, J. R. McClean, R. Barends, J. Kelly, P. Roushan, A. Tranter, N. Ding, B. Campbell, Y. Chen, Z. Chen, B. Chiaro, A. Dunsworth, A. G. Fowler, E. Jeffrey, A. Megrant, J. Y. Mutus, C. Neill et al., Scalable Quantum Simulation of Molecular Energies, Phys. Rev. X 6, 031007 (2016).

[13] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, J. M. Chow, and J. M. Gambetta, Hardware-efficient Quantum Optimizer for Small Molecules and Quantum Magnets, Nature (London) 549, 242 (2017).

[14] J. I. Colless, V. V. Ramasesh, D. Dahlen, M. S. Blok, J. R. McClean, J. Carter, W. A. de Jong, and I. Siddiqi, Robust Determination of Molecular Spectra on a Quantum Processor, Phys. Rev. X 8, 011021 (2018).

[15] C. Hempel, C. Maier, J. Romero, J. McClean, T. Monz, H. Shen, P. Jurcevic, B. Lanyon, P. Love, R. Babbush, A. Aspuru-Guzik, R. Blatt, and C. Roos, Quantum Chemistry Calculations on a Trapped-Ion Quantum Simulator, Phys. Rev. X 8, 031022 (2018).

[16] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The Theory of Variational Hybrid Quantum-Classical Algorithms, New J. Phys. 18, 023023 (2016).

[17] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren Plateaus in Quantum Neural Network Training Landscapes, arXiv:1803.11173.

[18] D. Wecker, M. B. Hastings, and M. Troyer, Progress Towards Practical Quantum Variational Algorithms, Phys. Rev. A 92, 042303 (2015).

[19] I. D. Kivlichan, J. McClean, N. Wiebe, C. Gidney, A. Aspuru-Guzik, G. K.-L. Chan, and R. Babbush, Quantum Simulation of Electronic Structure with Linear Depth and Connectivity, Phys. Rev. Lett. 120, 110501 (2018).

[20] J. Romero, R. Babbush, J. McClean, C. Hempel, P. Love, and A. Aspuru-Guzik, Strategies for Quantum Computing Molecular Energies Using the Unitary Coupled Cluster Ansatz, arXiv:1701.02691.

[21] P.-L. Dallaire-Demers, J. Romero, L. Veis, S. Sim, and A. Aspuru-Guzik, Low-Depth Circuit Ansatz for Preparing Correlated Fermionic States on a Quantum Computer, arXiv:1801.01053.

[22] M. Suzuki, Improved Trotter-Like Formula, Phys. Lett. A 180, 232 (1993).

[23] D. W. Berry, A. M. Childs, R. Cleve, R. Kothari, and R. D. Somma, Simulating Hamiltonian Dynamics with a Truncated Taylor Series, Phys. Rev. Lett. 114, 090502 (2015).

[24] G. H. Low and I. L. Chuang, Optimal Hamiltonian Simulation by Quantum Signal Processing, Phys. Rev. Lett. 118, 010501 (2017).







[25] G. H. Low and I. L. Chuang, Hamiltonian Simulation by Qubitization, arXiv:1610.06546.

[26] D. W. Berry, M. Kieferová, A. Scherer, Y. R. Sanders, G. H. Low, N. Wiebe, C. Gidney, and R. Babbush, Improved Techniques for Preparing Eigenstates of Fermionic Hamiltonians, npj Quantum Inf. 4, 22 (2018).

[27] D. Poulin, A. Y. Kitaev, D. Steiger, M. Hastings, and M. Troyer, Fast Quantum Algorithm for Spectral Properties, Phys. Rev. Lett. 121, 010501 (2018).

[28] A. Dunsworth, A. Megrant, R. Barends, Y. Chen, Z. Chen, B. Chiaro, A. Fowler, B. Foxen, E. Jeffrey, J. Kelly, P. V. Klimov, E. Lucero, J. Y. Mutus, M. Neeley, C. Neill, C. Quintana, P. Roushan, D. Sank, A. Vainsencher, J. Wenner et al., Low Loss Multi-layer Wiring for Superconducting Microwave Devices, Appl. Phys. Lett. 112, 063502 (2018).

[29] X. Fu, M. A. Rol, C. C. Bultink, J. van Someren, N. Khammassi, I. Ashraf, R. F. L. Vermeulen, J. C. de Sterke, W. J. Vlothuizen, R. N. Schouten, C. G. Almudever, L. DiCarlo, and K. Bertels, An Experimental Microarchitecture for a Superconducting Quantum Processor, Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, Cambridge, Massachusetts, 2017 (ACM, New York, 2017), pp. 813–825.

[30] N. T. Bronn, V. P. Adiga, S. B. Olivadese, X. Wu, J. M. Chow, and D. P. Pappas, High Coherence Plane Breaking Packaging for Superconducting Qubits, Quantum Sci. Technol. 3, 024007 (2018).

[31] S. B. Bravyi and A. Y. Kitaev, Quantum Codes on a Lattice with Boundary, arXiv:quant-ph/9811052.

[32] E. Dennis, A. Y. Kitaev, A. Landahl, and J. Preskill, Topological Quantum Memory, J. Math. Phys. (N.Y.) 43, 4452 (2002).

[33] R. Raussendorf and J. Harrington, Fault-Tolerant Quantum Computation with High Threshold in Two Dimensions, Phys. Rev. Lett. 98, 190504 (2007).

[34] R. Raussendorf, J. Harrington, and K. Goyal, Topological Fault-Tolerance in Cluster State Quantum Computation, New J. Phys. 9, 199 (2007).

[35] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland, Surface Codes: Towards Practical Large-Scale Quantum Computation, Phys. Rev. A 86, 032324 (2012).

[36] A. G. Fowler and S. J. Devitt, A Bridge to Lower Overhead Quantum Computation, arXiv:1209.0510.

[37] J. Hubbard, Electron Correlations in Narrow Energy Bands, Proc. R. Soc. A 276, 238 (1963).

[38] T. Helgaker, P. Jorgensen, and J. Olsen, Molecular Electronic Structure Theory (Wiley, New York, 2002).

[39] R. Babbush, N. Wiebe, J. McClean, J. McClain, H. Neven, and G. K.-L. Chan, Low-Depth Quantum Simulation of Materials, Phys. Rev. X 8, 011044 (2018).

[40] I. Kivlichan, N. Wiebe, C. Gidney, J. McClean, W. Sun, V. Denchev, A. Fowler, A. Aspuru-Guzik, and R. Babbush, Low T Gate Trotter-Based Quantum Simulation of Correlated Electrons (unpublished).

[41] S. R. White, Hybrid Grid/Basis Set Discretizations of the Schrödinger Equation, J. Chem. Phys. 147, 244102 (2017).

[42] R. Babbush, D. W. Berry, I. D. Kivlichan, A. Y. Wei, P. J. Love, and A. Aspuru-Guzik, Exponentially More Precise Quantum Simulation of Fermions in Second Quantization, New J. Phys. 18, 033032 (2016).

[43] M. Reiher, N. Wiebe, K. M. Svore, D. Wecker, and M. Troyer, Elucidating Reaction Mechanisms on Quantum Computers, Proc. Natl. Acad. Sci. U.S.A. 114, 7555 (2017).

[44] J. D. Whitfield, J. Biamonte, and A. Aspuru-Guzik, Simulation of Electronic Structure Hamiltonians Using Quantum Computers, Mol. Phys. 109, 735 (2011).

[45] D. Wecker, B. Bauer, B. K. Clark, M. B. Hastings, and M. Troyer, Gate-Count Estimates for Performing Quantum Chemistry on Small Quantum Computers, Phys. Rev. A 90, 022305 (2014).

[46] J. R. McClean, R. Babbush, P. J. Love, and A. Aspuru-Guzik, Exploiting Locality in Quantum Computation for Quantum Chemistry, J. Phys. Chem. Lett. 5, 4368 (2014).

[47] D. Poulin, M. B. Hastings, D. Wecker, N. Wiebe, A. C. Doherty, and M. Troyer, The Trotter Step Size Required for Accurate Quantum Simulation of Quantum Chemistry, Quantum Inf. Comput. 15, 361 (2015).

[48] R. Babbush, J. McClean, D. Wecker, A. Aspuru-Guzik, and N. Wiebe, Chemical Basis of Trotter-Suzuki Errors in Chemistry Simulation, Phys. Rev. A 91, 022311 (2015).

[49] J. Haah, M. B. Hastings, R. Kothari, and G. H. Low, Quantum Algorithm for Simulating Real Time Evolution of Lattice Hamiltonians, arXiv:1801.03922.

[50] D. Wecker, M. B. Hastings, N. Wiebe, B. K. Clark, C. Nayak, and M. Troyer, Solving Strongly Correlated Electron Models on a Quantum Computer, Phys. Rev. A 92, 062318 (2015).

[51] N. Cody Jones, J. D. Whitfield, P. L. McMahon, M.-H. Yung, R. V. Meter, A. Aspuru-Guzik, and Y. Yamamoto, Faster Quantum Chemistry Simulation on Fault-Tolerant Quantum Computers, New J. Phys. 14, 115023 (2012).

[52] A. M. Childs, D. Maslov, Y. Nam, N. J. Ross, and Y. Su, Toward the First Quantum Simulation with Quantum Speedup, arXiv:1711.10980.

[53] M. Szegedy, Quantum Speed-up of Markov Chain Based Algorithms, in 45th Annual IEEE Symposium on Foundations of Computer Science (IEEE, New York, 2004), pp. 32–41.

[54] A. M. Childs and N. Wiebe, Hamiltonian Simulation Using Linear Combinations of Unitary Operations, Quantum Inf. Comput. 12, 901 (2012).

[55] D. Aharonov and A. Ta-Shma, Adiabatic Quantum State Generation and Statistical Zero Knowledge, in Proceedings of the 35th ACM Symposium on Theory of Computing, STOC ’03 (ACM Press, New York, 2003), p. 20.

[56] A. Luis and J. Perina, Optimum Phase-Shift Estimation and the Quantum Description of the Phase Difference, Phys. Rev. A 54, 4564 (1996).

[57] R. B. Griffiths and C.-S. Niu, Semiclassical Fourier Transform for Quantum Computation, Phys. Rev. Lett. 76, 3228 (1996).

[58] C. Gidney and R. Babbush, Quantum Read-Only Memory for Implement Efficient Fault-Tolerant Quantum Oracles (in press).

[59] R. Babbush, D. W. Berry, Y. R. Sanders, I. D. Kivlichan, A. Scherer, A. Y. Wei, P. J. Love, and A. Aspuru-Guzik, Exponentially More Precise Quantum Simulation of Fermions in the Configuration Interaction Representation, Quantum Sci. Technol. 3, 015006 (2018).

[60] C. Gidney, Halving the Cost of Quantum Addition, arXiv:1709.06648.

[61] R. D. Somma, G. Ortiz, J. E. Gubernatis, E. Knill, and R. Laflamme, Simulating Physical Phenomena by Quantum Networks, Phys. Rev. A 65, 042323 (2002).

[62] C. Gidney, R. Babbush, M. Mohseni, and H. Neven, Quantum Read-Only Memory for Implementing Efficient Fault-Tolerant Quantum Oracles (in press).

[63] V. Giovannetti, S. Lloyd, and L. Maccone, Quantum Random Access Memory, Phys. Rev. Lett. 100, 160501 (2008).

[64] B. C. Travaglione, M. A. Nielsen, H. M. Wiseman, and A. Ambainis, ROM-based Computation: Quantum versus Classical, Quantum Inf. Comput. 2, 324 (2002).

[65] I. Kerenidis and A. Prakash, Quantum Recommendation Systems, arXiv:1603.08675.

[66] V. Giovannetti, S. Lloyd, and L. Maccone, Architectures for a Quantum Random Access Memory, Phys. Rev. A 78, 052310 (2008).

[67] F.-Y. Hong, Y. Xiang, Z.-Y. Zhu, L.-Z. Jiang, and L.-N. Wu, Robust Quantum Random Access Memory, Phys. Rev. A 86, 010306 (2012).

[68] S. Arunachalam, V. Gheorghiu, T. Jochym-O’Connor, M. Mosca, and P. V. Srinivasan, On the Robustness of Bucket Brigade Quantum RAM, New J. Phys. 17, 123010 (2015).

[69] V. V. Shende, S. S. Bullock, and I. L. Markov, Synthesis of Quantum-Logic Circuits, IEEE Trans. CAD Integrated Circuits Syst. 25, 1000 (2006).

[70] L. K. Grover, Synthesis of Quantum Superpositions by Quantum Computation, Phys. Rev. Lett. 85, 1334 (2000).

[71] P. Høyer, Arbitrary Phases in Quantum Amplitude Amplification, Phys. Rev. A 62, 052304 (2000).

[72] A. Walker, New Fast Method for Generating Discrete Random Numbers with Arbitrary Frequency Distributions, Electron. Lett. 10, 127 (1974).

[73] M. Vose, A Linear Algorithm for Generating Random Numbers with a Given Distribution, IEEE Transactions on Software Engineering 17, 972 (1991).

[74] Code provided at www.openfermion.org.

[75] J. R. McClean, I. D. Kivlichan, K. J. Sung, D. S. Steiger, Y. Cao, C. Dai, E. S. Fried, C. Gidney, B. Gimby, T. Häner, T. Hardikar, V. Havlíček, C. Huang, Z. Jiang, M. Neeley, T. O’Brien, I. Ozfidan, M. D. Radin, J. Romero, N. Rubin et al., OpenFermion: The Electronic Structure Package for Quantum Computers, arXiv:1710.07629.

[76] P. Jordan and E. Wigner, Über das Paulische Äquivalenzverbot, Z. Phys. 47, 631 (1928).

[77] G. Evenbly and S. R. White, Representation and Design of Wavelets Using Unitary Circuits, Phys. Rev. A 97, 052314 (2018).

[78] P. Hohenberg and W. Kohn, Inhomogeneous Electron Gas, Phys. Rev. 136, B864 (1964).

[79] M. Stone, Quantum Hall Effect (World Scientific, Singapore, 1992).

[80] D. M. Ceperley and B. J. Alder, Ground State of the Electron Gas by a Stochastic Method, Phys. Rev. Lett. 45, 566 (1980).

[81] B. Tanatar and D. M. Ceperley, Ground State of the Two-Dimensional Electron Gas, Phys. Rev. B 39, 5005 (1989).

[82] F. H. Zong, C. Lin, and D. M. Ceperley, Spin Polarization of the Low-Density Three-Dimensional Electron Gas, Phys. Rev. E 66, 036703 (2002).

[83] C. Attaccalite, S. Moroni, P. Gori-Giorgi, and G. B. Bachelet, Correlation Energy and Spin Polarization in the 2D Electron Gas, Phys. Rev. Lett. 88, 256601 (2002).

[84] N. D. Drummond and R. J. Needs, Phase Diagram of the Low-Density Two-Dimensional Homogeneous Electron Gas, Phys. Rev. Lett. 102, 126402 (2009).

[85] G. G. Spink, R. J. Needs, and N. D. Drummond, Quantum Monte Carlo Study of the Three-Dimensional Spin-Polarized Homogeneous Electron Gas, Phys. Rev. B 88, 085121 (2013).

[86] M. Gell-Mann and K. A. Brueckner, Correlation Energy of an Electron Gas at High Density, Phys. Rev. 106, 364 (1957).

[87] D. L. Freeman, Coupled-Cluster Expansion Applied to the Electron Gas: Inclusion of Ring and Exchange Effects, Phys. Rev. B 15, 5512 (1977).

[88] J. J. Shepherd, G. Booth, A. Grüneis, and A. Alavi, Full Configuration Interaction Perspective on the Homogeneous Electron Gas, Phys. Rev. B 85, 081103 (2012).

[89] J. J. Shepherd, G. H. Booth, and A. Alavi, Investigation of the Full Configuration Interaction Quantum Monte Carlo Method Using Homogeneous Electron Gas Models, J. Chem. Phys. 136, 244101 (2012).

[90] M. T. Wilson and B. L. Gyorffy, A Constrained Path Auxiliary-Field Quantum Monte Carlo Method for the Homogeneous Electron Gas, J. Phys. Condens. Matter 7, L371 (1995).

[91] M. Motta, D. E. Galli, S. Moroni, and E. Vitali, Imaginary Time Density-Density Correlations for Two-Dimensional Electron Gases at High Density, J. Chem. Phys. 143, 164108 (2015).

[92] R. Martin, Electronic Structure (Cambridge University Press, Cambridge, England, 2004).

[93] R. Martin, L. Reining, and D. Ceperley, Interacting Electrons (Cambridge University Press, Cambridge, England, 2016).

[94] S. Tosoni, C. Tuma, J. Sauer, B. Civalleri, and P. Ugliengo, A Comparison between Plane Wave and Gaussian-type Orbital Basis Sets for Hydrogen Bonded Systems: Formic Acid as a Test Case, J. Chem. Phys. 127, 154102 (2007).

[95] G. Knizia and G. K.-L. Chan, Density Matrix Embedding: A Simple Alternative to Dynamical Mean-Field Theory, Phys. Rev. Lett. 109, 186404 (2012).

[96] B. Bauer, D. Wecker, A. J. Millis, M. B. Hastings, and M. Troyer, Hybrid Quantum-Classical Approach to Correlated Materials, Phys. Rev. X 6, 031045 (2016).

[97] J. P. F. LeBlanc, A. E. Antipov, F. Becca, I. W. Bulik, G. K.-L. Chan, C.-M. Chung, Y. Deng, M. Ferrero, T. M. Henderson, C. A. Jimenez-Hoyos, E. Kozik, X.-W. Liu, A. J. Millis, N. V. Prokofev, M. Qin, G. E. Scuseria, H. Shi, B. V. Svistunov, L. F. Tocchio, I. S. Tupitsyn, S. R. White et al., Solutions of the Two-Dimensional Hubbard Model: Benchmarks and Results from a Wide Range of Numerical Algorithms, Phys. Rev. X 5, 041041 (2015).






[98] E. H. Lieb and F. Y. Wu, Absence of Mott Transition in an Exact Solution of the Short-Range, One-Band Model in One Dimension, Phys. Rev. Lett. 20, 1445 (1968).

[99] W. Metzner and D. Vollhardt, Correlated Lattice Fermions in Infinity Dimensions, Phys. Rev. Lett. 62, 324 (1989).

[100] Z. Jiang, K. J. Sung, K. Kechedzhi, V. N. Smelyanskiy, and S. Boixo, Quantum Algorithms to Simulate Many-Body Physics of Correlated Fermions, Phys. Rev. Applied 9, 044036 (2018).

[101] E. H. Lieb and D. W. Robinson, The Finite Group Velocity of Quantum Spin Systems, in Statistical Mechanics (Springer, New York, 1972), pp. 425–431.

[102] N. Wiebe, D. W. Berry, P. Hoyer, and B. C. Sanders, Higher Order Decompositions of Ordered Operator Exponentials, J. Phys. A 43, 065203 (2010).

[103] A. G. Fowler, Optimal Complexity Correction of Correlated Errors in the Surface Code, arXiv:1310.0863.

[104] B. J. Brown, K. Laubscher, M. S. Kesselring, and J. R. Wootton, Poking Holes and Cutting Corners to Achieve Clifford Gates with the Surface Code, Phys. Rev. X 7, 021029 (2017).

[105] A. G. Fowler, Low-Overhead Surface Code Logical Hadamard, Quantum Inf. Comput. 12, 970 (2012).

[106] S. Bravyi and A. Y. Kitaev, Universal Quantum Computation with Ideal Clifford Gates and Noisy Ancillas, Phys. Rev. A 71, 022316 (2005).

[107] B. W. Reichardt, Quantum Universality from Magic States Distillation Applied to CSS Codes, Quantum Inf. Process. 4, 251 (2005).

[108] A. Paler, A. G. Fowler, and R. Wille, Synthesis of Arbitrary Quantum Circuits to Topological Assembly: Systematic, Online and Compact, Sci. Rep. 7, 10414 (2017).

[109] I. D. Kivlichan, N. Wiebe, R. Babbush, and A. Aspuru-Guzik, Bounding the Costs of Quantum Simulation of Many-Body Physics in Real Space, J. Phys. A 50, 305301 (2017).

[110] C. Horsman, A. G. Fowler, S. Devitt, and R. V. Meter, Surface Code Quantum Computing by Lattice Surgery, New J. Phys. 14, 123011 (2012).

