Reversible and Quantum Circuit

Synthesis

Chia-Chun Lin

A Dissertation

Presented to the Faculty

of Princeton University

in Candidacy for the Degree

of Doctor of Philosophy

Recommended for Acceptance

by the Department of

Electrical Engineering

Adviser: Niraj K. Jha

June 2014

© Copyright by Chia-Chun Lin, 2014.

All rights reserved.

Abstract

This thesis presents five major tools for the synthesis of reversible and quantum cir-

cuits. Quantum computation has the ability to solve several important problems sig-

nificantly faster than its classical counterpart. Because of this promise, much research

effort has been dedicated to discovering new quantum algorithms and technologies.

Quantum mechanics postulates that the time-evolution of quantum states is re-

versible. Thus, reversibility is a necessary condition for quantum computing. Hence,

we propose an effective method and tool, called RMDDS, for synthesizing reversible

circuits. Since the evolution of quantum states is determined by some primitive

physical operations, quantum computers implemented in different physical systems

have different costs. Therefore, we propose an optimized quantum gate library, called

QGLVP, for various physical machine descriptions.

To enhance synthesis efficiency, we introduce QLib, a quantum module library,

which contains scripts to generate quantum modules for many well-known quantum

algorithms.

A quantum system inevitably interacts with its environment, which leads to errors and

consequent failure of the computation. To address this problem, we propose

FTQLS, a tool that synthesizes and optimizes fault-tolerant quantum circuits by using

logic identity rules for various physical machine descriptions.

Finally, we present a tool, called PAQCS, for physical design-aware fault-tolerant

quantum circuit synthesis. It effectively synthesizes quantum logic circuits into quan-

tum physical circuits, targeting different physical machine descriptions and quantum

error correction codes.

Acknowledgments

I would like to express my sincere appreciation to all those who made this thesis

possible.

I am deeply grateful for the guidance of my adviser, Prof. Niraj K. Jha. His

enthusiasm for innovative research and meticulous attention to detail greatly inspired

and influenced me. He showed me the right direction at every critical moment during

the development of this thesis and directed me all the way.

I also received much advice and guidance from Prof. Sun-Yuan Kung, Prof. Sus-

mita Sur-Kolay, and Prof. Amlan Chakrabarti. Discussions with them were always a

great source for new ideas. Without their encouragement and inspiration, this thesis

would not have been possible.

Thesis review is demanding work, and doing it under time pressure only makes

it harder. I would like to thank my dissertation readers, Prof. Niraj K. Jha, Prof.

Susmita Sur-Kolay, and Prof. Kaushik Sengupta, for their extensive efforts in polish-

ing this thesis. I am also thankful to Prof. Niraj K. Jha, Prof. Sun-Yuan Kung, and

Prof. Stephen A. Lyon for agreeing to be on my final public oral committee.

Profs. Bede Liu and Sharad Malik kindly offered me teaching opportunities in

their ELE482 and ELE206 courses. Those were valuable and enjoyable experiences.

I would like to thank my English tutor Sandra Richter and host family Anne

Remillard for their help and encouragement during my PhD life, and for sharing American

culture and traditions with me.

I appreciate the help I received from Chun-Yi Lee, Chunxiao Li, Meng Zhang,

Jun-Wei Chuah, Ting-Jung Lin, Sourindra Chaudhuri, Aoxiang Tang, Xianmin Chen,

Yang Yang, and Debajit Bhattacharya over the past few years.

Last, but not least, I am grateful to my father, mother, brother, and wife An

Dai and all my friends for their support during my PhD study. They were unbelievably

encouraging and forbearing. They are the reason I keep going.

List of Abbreviations

BDD binary decision diagram

CTL Clifford plus T library

DDS decision diagram synthesis

FT fault-tolerant

FTQLS fault-tolerant quantum logic synthesis

FTS fault-tolerant set

IT ion trap

KFDD Kronecker functional decision diagram

LP linear photonics

NA neutral atom

NP nonlinear photonics

PAQCS physical design-aware fault-tolerant quantum circuit synthesis

PMD physical machine description

QC quantum cost

QD quantum dot

QECC quantum error correction code

QGLVP quantum gate library for various physical machine descriptions

QLib quantum module library

RMDDS Reed-Muller decision diagram synthesis

RMRLS Reed-Muller reversible logic synthesis

SC superconducting

SKA Solovay-Kitaev algorithm

STA skipping table algorithm

ULB universal logic block

Contents

Abstract

Acknowledgments

List of Abbreviations

List of Tables

List of Figures

1 Introduction

1.1 Quantum System

1.2 Quantum Circuit

1.3 Quantum Algorithm

1.4 Reversible Computing

1.5 FT Quantum Computation

1.6 Physical Machine Description

1.7 Physical Design

1.8 Synthesis Flow

1.9 Contributions of the Thesis

1.10 Thesis Outline

2 Related Work

2.1 Reversible Logic Synthesis

2.1.1 Exact Synthesis

2.1.2 Heuristic Synthesis

2.2 Quantum Logic Synthesis

2.2.1 Logic Identities

2.2.2 Quantum Shannon Decomposition

2.2.3 Quantum Compilation

2.3 Physical Design

2.3.1 Tiled Quantum Architecture

2.3.2 Physical Synthesis

3 RMDDS: Reed-Muller Decision Diagram Synthesis for Reversible Logic

3.1 Introduction

3.2 Background

3.2.1 Reversible Functions

3.2.2 Reversible Gates

3.2.3 Reed-Muller Expansion

3.2.4 Decision Diagrams

3.3 Motivational Examples

3.3.1 Example 1: #qubits vs. QC

3.3.2 Example 2: Circuit Size vs. Synthesis Time

3.4 RMDDS Algorithm

3.4.1 RMRLS

3.4.2 DDS

3.4.3 RMDDS

3.4.4 Synthesis Example

3.5 Experimental Results

3.6 Chapter Summary

4 QLib: Quantum Module Library

4.1 Introduction

4.2 Quantum Gates

4.2.1 One-qubit Gates

4.2.2 Two-qubit Gates

4.2.3 Reversible Gates

4.3 File Format

4.4 Quantum Module Library

4.4.1 QFT/IQFT

4.4.2 Bernstein-Vazirani Search (BVS)

4.4.3 Grover’s Search

4.4.4 Arithmetic Circuits

4.5 Impact

4.6 Chapter Summary

5 QGLVP: Optimized Quantum Gate Library for Various PMDs

5.1 Introduction

5.2 Primitive Quantum Gates

5.2.1 One-qubit Gates

5.2.2 Two-qubit Gates

5.3 Identity Rules

5.3.1 One-qubit Identity Rules

5.3.2 Two-qubit Identity Rules

5.3.3 Examples

5.4 Quantum Gate Library

5.4.1 RY(θ) Gate

5.4.2 RZ(θ) Gate

5.4.3 H Gate

5.4.4 CZ Gate

5.4.5 CNOT Gate

5.4.6 CP Gate

5.4.7 G Gate

5.4.8 iSW Gate

5.4.9 SWAP Gate

5.4.10 ZENO Gate

5.4.11 Toffoli Gate

5.4.12 Peres Gate

5.4.13 Fredkin Gate

5.4.14 Cost Tables

5.4.15 Example

5.5 Chapter Summary

6 FTQLS: Fault-tolerant Quantum Logic Synthesis

6.1 Introduction

6.2 Background

6.2.1 FT Quantum Computation

6.2.2 FTS

6.3 Synthesis Flow

6.3.1 Technology Mapping

6.3.2 Quantum Compilation

6.4 Circuit Optimization

6.4.1 Data Structure

6.4.2 Simplify

6.4.3 Interchange

6.4.4 Commute

6.4.5 Optimization Process

6.4.6 Example

6.4.7 Verification

6.6 Chapter Summary

7 PAQCS: Physical Design-aware Fault-tolerant Quantum Circuit Synthesis

7.1 Introduction

7.2 Swap Based Quantum Tile Architecture

7.3 Motivational Example

7.4 Physical Synthesis

7.4.1 Qubit Placement

7.4.2 Channel Routing

7.4.3 PAQCS: Synthesis Flow

7.5.1 Placement and Routing

7.5.2 PAQCS

7.6 Chapter Summary

8 Conclusions and Future Research

Bibliography

List of Tables

3.1 Experimental results for reversible functions under #qubits minimization

3.2 Experimental results for reversible functions under QC minimization

3.3 Percentage increase in #qubits and decrease in QC from the #qubits to QC optimization objective for reversible functions

3.4 Comparison with the best results

3.5 Experimental results for irreversible functions under #qubits minimization

3.6 Experimental results for irreversible functions under QC minimization

3.7 Percentage increase in #qubits and decrease in QC from the #qubits to QC optimization objective for irreversible functions

4.1 Syntax for quantum gates

4.2 Impact of synthesis scripts

5.1 Supported operations in different PMDs

5.2 One-qubit identity rules (length=2)

5.3 One-qubit identity rules (length=3)

5.4 H gate identity rules

5.5 RXY gate identity rules

5.6 Asqu gate identity rules

5.7 Two-qubit identity rules

5.8 Execution cycles for each operation in QD

5.9 Execution cycles for each operation in SC

5.10 Number of operations of each gate on every PMD

5.11 Number of execution cycles of each gate on every PMD

5.12 Cost of Grover’s algorithm for search value |01⟩

6.1 Synthesis result for RZ(π/64) using SKA and STA

6.2 Conversion between one-qubit FTS and CTL

6.3 Interchange rules

6.4 Experimental results for plain adder circuits

6.5 Experimental results for QFT circuits with only OPT-3 optimization

6.6 Experimental results for QFT circuits with three-stage optimization and the improvement

6.7 Using CTL and FTS for QFT circuits in the LP system

6.8 Synthesis results part I

6.9 Synthesis results part II

6.10 Average percentage reductions for different PMDs

6.11 Optimization results for a 4-bit modular adder based on SKA and STA

7.1 Physical #ops for three QECCs and two PMDs

7.2 Physical #cycles for three QECCs and two PMDs

7.3 Synthesis results for QLib benchmarks part I

7.4 Synthesis results for QLib benchmarks part II

7.5 Synthesis results for RevLib benchmarks part I

7.6 Synthesis results for RevLib benchmarks part II

7.7 Comparisons with synthesis results presented in other work

7.8 Average improvements due to PAQCS

List of Figures

1.1 Bloch sphere

1.2 An example quantum circuit

1.3 High-level synthesis flow

2.1 Three-level tile hierarchy, each kth-level tile is composed of nine (k − 1)th-level tiles

2.2 Two types of tiled architecture. (a) movement based and (b) swap based

3.1 (a) A 2-to-1 irreversible OR gate/function, and (b) a 3-to-3 reversible OR gate obtained by adding a constant input (c1) and two garbage outputs (g1, g2)

3.2 (a) n-bit MCT, (b) NOT, (c) CNOT, and (d) Toffoli gate

3.3 Two implementations of a five-bit MCT gate: (a) with [#qubits,QC] = [5,29] and (b) with [#qubits,QC] = [8,16]

3.4 Truth table for Example 2

3.5 (a) Quick synthesis by direct placement of PPRM terms on ancillary lines, with [#qubits,QC] = [6,19] and (b) RMRLS requiring more synthesis time, with [#qubits,QC] = [3,11]

3.6 Synthesis flow of RMRLS

3.7 MCT cascades for various decompositions of a non-shared vertex in a DD

3.8 MCT cascades for various decompositions of a shared vertex in a DD through the addition of an ancillary bit

3.9 Synthesis flow of RMDDS

3.10 Pseudocode of RMDDS

3.11 A synthesis example

3.12 Distribution of solutions for 0410184 when using the NCTFP library

3.13 Solutions for hwb6 58 under relaxed synthesis times when using the NCTFP library

4.1 Two-qubit quantum gates: (a) CNOT, (b) CP, (c) CZ, and (d) SWAP

4.2 An example circuit

4.3 An example output file

4.4 A four-qubit QFT circuit

4.5 An EWS module

4.6 The BVS circuit for n = 3 and a = 5

4.7 The circuit structure for Grover’s search

4.8 The circuit for diffusion operator D

4.9 The circuit structure of Cuccaro’s adder (four-qubit)

4.10 (a) MAJ, and (b) UMA modules for Cuccaro’s adder

4.11 The circuit structure of Draper’s adder (four-qubit)

4.12 (a) CMAJ, and (b) CUMA modules for controlled Cuccaro’s adder

4.13 A four-qubit multiplier

4.14 Rotate left by 1 (≪ 1)

4.15 A modular adder (a+b)%N

4.16 A modular adder for constant a and N

4.17 A modular subtracter for constant a and N

4.18 Constant modular multiplier

4.19 Restored modular multiplier

4.20 Modular exponentiation

5.1 Symbols of (a) iSW and (b) G gates

5.2 CZ implementation: (a) and (b) in the NP system, (c) in the IT system

5.3 CNOT implementation: (a) from H gate, (b) SC, NA, (c) QD, and (d) IT systems

5.4 CP gate construction: (a) from RZ and CRZ gates, and for the (b) LP, NP, (c) QD, (d) NA, and (e) IT systems

5.5 G gate construction: (a) from RZ and CRZ gates, and for the (b) LP, NP, (c) QD, (d) NA, and (e) SC systems

5.6 iSW gate construction: (a) from CRX, (b) from CZ, applicable to QD, NA, and LP, (c) NP, and (d) IT systems

5.7 SWAP gate construction: (a) NP, (b) NA, (c) QD, (d) IT, and (e) SC systems

5.8 ZENO gate construction: (a) from the SWAP gate, and (b) for the QD systems

5.9 Several ways to construct a CCZ gate using five two-qubit gates

5.10 Two ways to construct a CS gate

5.11 Construction of a CA gate

5.12 Toffoli gate construction based on (a) CNOT and (b) CZ gates

5.13 Toffoli gate construction: (a) basic implementation, (b) suitable for the SC system (the dashed box is a CV gate), (c) suitable for the IT system

5.14 (a) Peres and (b) Fredkin gates

5.15 Peres gate construction: (a) basic implementation, (b) suitable for the SC system, (c) suitable for the IT system

5.16 (a) Implementation of Grover’s algorithm for search value |01⟩, and (b) optimized circuit for the NP system (the gates in the dashed box can be merged into an Asqu operation; the dashed arrow indicates the critical path)

6.1 FT implementations: (a) RZ(π/4), (b) RX(π/4), and (c) RY(π/4). “↗” indicates a measurement gate

6.2 Synthesis flow

6.3 Conversion of non-FT two-qubit gates to those in FTS(2): (a) CP, (b) G, and (c) iSW

6.4 Synthesis flow of non-FT one-qubit gates based on the FT table

6.5 Application flow for the simplify rule

6.6 Commute rules for (a) CNOT, (b) CP, and (c) G gates

6.7 Pseudocode of the optimization process

6.8 An example of window selection

6.9 An optimization example

6.10 Verification of the optimization process

7.1 The architecture creates communication channels through the use of swap chains, which exchange the states of adjacent qubits. Application of a swap chain, consisting of S(q8,q5) and S(q5,q2), to the three qubits is equivalent to the application of S(s8,s5) and S(s8,s2) to the corresponding qubit states. Note that though their qubit states are swapped, the qubit positions remain fixed

7.2 A motivational example. (a) A logic circuit, where q0-q5 and s0-s5 refer to the name and state of qubits, respectively. (b) When qubits are placed trivially, #sw = (costB,costU,costR) = (8,6,8). (c) When channel routing is carefully taken into account, #sw = (8,3,4). (d) When both qubit placement and routing are carefully taken into account, #sw reduces to (4,1,2)

7.3 A placement example. (a) Initial state. Placement of the (b) v0, (c) v2, (d) v1, (e) v3, (f) v4, and (g) v5 qubits

7.4 A routing example. (a) Input circuit and its qubit and state layouts. (b)-(e) Routing for C(s0,s8). (f)-(h) Routing for C(s0,s5), C(s3,s6), and C(s5,s7), respectively. (i)-(j) Routing for C(s2,s0). (k) Routing for C(s5,s8). (l)-(m) Routing for C(s7,s6). (n) Recovery of the qubit states to their original position: pop ga stack with S(s0,s5), S(s4,s0), and S(s1,s0). (o) Recovered qubits states

7.5 Synthesis flow of PAQCS

7.6 #ops for the Steane code

7.7 #cycles for the Steane code

7.8 #ops for the Bacon-Shor code

7.9 #cycles for the Bacon-Shor code

7.10 #ops for the Knill code

7.11 #cycles for the Knill code

Chapter 1

Introduction

The lure of quantum computing comes from the promise that it can significantly out-

perform its classical counterpart when solving some important problems. Large-scale

quantum computers will be able to solve certain problems much more quickly than

any classical computer using the best currently known algorithms, like integer factor-

ization using Shor’s algorithm [1], which is capable of breaking RSA encryption [2] in

polynomial time. There exist quantum algorithms, such as Grover’s algorithm, which

run faster than any possible classical algorithm. Given sufficient computational

resources, a classical computer could be made to simulate any quantum algorithm [3].

However, a state over the computational basis of 100 quantum bits (qubits), for example,

would already be too large to be represented on a classical computer because it would

require $2^{100}$ complex values to be stored. Owing to this computational power, quantum

computation has ignited a lot of interest in the field.
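As a rough illustration of this growth, the following sketch counts the bytes needed to hold every amplitude of an n-qubit state. It assumes each complex amplitude is stored as two 64-bit floating-point numbers (16 bytes); that storage format and the function name are illustrative assumptions, not part of the thesis.

```python
def state_vector_bytes(num_qubits, bytes_per_amplitude=16):
    """Bytes needed to store all 2**n complex amplitudes of an n-qubit state.

    bytes_per_amplitude=16 assumes two 64-bit floats per complex value.
    """
    return (2 ** num_qubits) * bytes_per_amplitude


# Storage doubles with every added qubit.
for n in (10, 30, 50, 100):
    print(n, state_vector_bytes(n))
```

Under this assumption, 30 qubits already need 16 GiB, and 100 qubits need 2^104 bytes.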

Quantum computation is the study of the information processing that can be

accomplished using quantum mechanical systems [3, 4]. It is tempting to say that the

operations of a quantum computer are governed by the laws of quantum mechanics.

However, although all physical operations are governed by quantum mechanics, we would

not say that our desktop computers are quantum computers. In general, a quantum computer

is one whose operations are governed by certain very special

transformations of its internal states. The laws of quantum mechanics allow these

peculiar transformations to take place under very carefully controlled conditions.

In a quantum computer, the logical operations must involve no physical interactions

whatsoever that are not under the complete control of the system. All other interactions

may introduce potentially catastrophic disruptions into the operation of a quantum

computer. Such interactions between what matters for the computation and what

does not result in decoherence, which is fatal to quantum computation.

To avoid decoherence, in general, quantum operations cannot be carried out in

macroscopic physical systems, because most such systems cannot be isolated from the

external environment. Such isolation can be achieved if the qubits are operated

upon in microscopic physical systems. Such microscopic systems must be decoupled

from their surroundings except for the completely controlled interactions that are

associated with the computation process itself.

Quantum mechanics postulates that all quantum operations (except measurement,

as discussed later) are invertible, and thus all valid quantum gates are reversible. In

fact, quantum computers display an important part of their magic through reversible

operations, which transform the initial state of qubits into its final form, using only

processes whose action can be inverted. There is only a single irreversible operation

in quantum computation, the measurement, which is the only way to extract useful

information from qubits after their state has acquired its final form. The extracted

information is then processed by a classical computer.

In general, to implement quantum algorithms, we need both quantum and clas-

sical computers. The quantum computer executes the desired sequence of quantum

operations and the classical computer provides the control for these operations and

also performs post-processing of the computed results.

A quantum algorithm is executed by a quantum circuit, which comprises a se-

quence of quantum gates. These quantum gates may be decomposed into several

primitive quantum operations, supported by corresponding quantum physical ma-

chine descriptions (PMDs). In addition, because quantum systems are delicate and

difficult to control, fault-tolerant (FT) quantum circuits are needed for practical

implementation. Therefore, several quantum error correction codes (QECCs) have been

proposed to facilitate FT computation. For physical quantum circuit realization, the

physical addresses of qubits need to be considered in order to honor the physical

distance constraint.

The synthesis of quantum circuits is difficult because several metrics and constraints

must be considered, such as the number of primitive operations (#ops), the number of

critical execution cycles (#cycles), the QECC, and the target PMD. In this thesis, we

provide several methodologies for quantum circuit synthesis and optimization targeting

different metrics. These techniques are scalable to any quantum algorithm. Their

integration yields optimized physical quantum circuits for various PMDs at the end of

the synthesis process.

In this chapter, some background knowledge about quantum computing is provided,

followed by a high-level view of quantum circuit synthesis.

1.1 Quantum System

In quantum computation, a qubit refers to a unit of quantum information [3, 4]. It

has very different characteristics from a classical bit. For example, a classical bit may

only take on two distinct values: 0 or 1. However, a qubit does not suffer from this

limitation.

In a two-state quantum system, a qubit |ψ⟩ can be described by [3]:

$$|\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle \tag{1.1}$$

where |·⟩ is the ket vector in Dirac notation, which indicates that |0⟩ and |1⟩ are

column vectors corresponding to:

$$|0\rangle \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad |1\rangle \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{1.2}$$

A qubit is a superposition of |0⟩ and |1⟩, which means that the qubit |ψ⟩ exists

in the two states simultaneously. $\alpha_0$ and $\alpha_1$ are complex coefficients. They represent

the amplitudes of |0⟩ and |1⟩, respectively, with the normalization constraint

$|\alpha_0|^2 + |\alpha_1|^2 = 1$.

The qubit in Eq. (1.1) can also be written in terms of azimuth and elevation angles

as:

$$|\psi\rangle = e^{i\gamma}\left(\cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\right) \tag{1.3}$$

where γ, θ, and ϕ are all real numbers. The global phase $e^{i\gamma}$ is physically

indistinguishable and thus can be ignored. Therefore, we can further simplify it to the

form:

$$|\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle \tag{1.4}$$

We can visualize Eq. (1.4) on the surface of a three-dimensional sphere, known as the

Bloch sphere, as shown in Fig. 1.1. The north and south poles represent states |0⟩

and |1⟩, respectively. A point on the Bloch sphere represents a superposition of the

two states. A valid single-qubit operation represents a rotation on the Bloch sphere.

The elevation and azimuth angles (θ, ϕ) lie in the ranges 0 ≤ θ ≤ π and

0 ≤ ϕ < 2π.
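The mapping from Bloch-sphere angles to amplitudes in Eq. (1.4) can be sketched as follows. This is a minimal illustration; the function names are hypothetical and chosen only for this example.

```python
import cmath
import math


def bloch_to_amplitudes(theta, phi):
    """Return (alpha0, alpha1) of Eq. (1.4) for elevation theta and azimuth phi."""
    alpha0 = complex(math.cos(theta / 2))
    alpha1 = cmath.exp(1j * phi) * math.sin(theta / 2)
    return alpha0, alpha1


def norm_squared(alpha0, alpha1):
    """|alpha0|^2 + |alpha1|^2, which should be 1 for a valid qubit state."""
    return abs(alpha0) ** 2 + abs(alpha1) ** 2


north = bloch_to_amplitudes(0.0, 0.0)          # north pole: |0>
south = bloch_to_amplitudes(math.pi, 0.0)      # south pole: |1> (up to rounding)
plus = bloch_to_amplitudes(math.pi / 2, 0.0)   # equator: equal superposition

for name, state in (("north", north), ("south", south), ("plus", plus)):
    print(name, state, norm_squared(*state))
```

Every pair of angles yields a normalized state because cos²(θ/2) + sin²(θ/2) = 1, whatever the value of ϕ.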

Figure 1.1: Bloch sphere (the state |ψ⟩ is drawn with angles θ and ϕ relative to the x, y, and z axes, with |0⟩ and |1⟩ at the poles)

Since a physical system changes over time, a quantum state |ψ⟩ is actually a

function of time: |ψ(t)⟩. Quantum mechanics postulates that the evolution of a quantum

state of a closed quantum system can be described by Schrödinger’s equation [3]:

$$i\hbar\frac{\partial|\psi(t)\rangle}{\partial t} = \hat{H}|\psi(t)\rangle \tag{1.5}$$

where $\hbar$ is Planck’s constant divided by 2π and $\hat{H}$ is a Hermitian operator, called

the Hamiltonian, which represents the total observable energy of the system. For

simplicity, if we consider $\hat{H}$ to be independent of time, the equation can be solved as:

$$|\psi(t)\rangle = e^{-i\hat{H}(t-t_0)/\hbar}|\psi(t_0)\rangle = \hat{U}|\psi(t_0)\rangle \tag{1.6}$$

where $\hat{U}$ is a unitary operator, i.e., $\hat{U}^\dagger\hat{U} = I$, and I is the identity operator. Hence, all

valid quantum operations are unitary and thus reversible. Reversibility is a necessary

condition for quantum computing.

The evolution of an isolated quantum system with a finite number of states can be described by a unitary matrix. An n-qubit operation is represented by a $2^n \times 2^n$ unitary matrix.
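To make Eq. (1.6) concrete, the sketch below builds the evolution operator for one illustrative single-qubit Hamiltonian, H = (ħω/2)σx, which is an assumption chosen only because its exponential has the closed form cos(ωt/2)I − i sin(ωt/2)σx. It then checks numerically that the result satisfies U†U = I. The helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def evolution_operator(omega, t):
    """U = exp(-i H t / hbar) for the assumed H = (hbar * omega / 2) * sigma_x."""
    c = math.cos(omega * t / 2)
    s = math.sin(omega * t / 2)
    return [[complex(c, 0), complex(0, -s)],
            [complex(0, -s), complex(c, 0)]]


def is_unitary(u, tol=1e-12):
    """Check U^dagger U = I entry by entry."""
    prod = matmul(dagger(u), u)
    n = len(u)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    print(is_unitary(evolution_operator(omega=1.0, t=0.7)))
```

For ωt = π the operator reduces to −iσx, i.e., a NOT operation up to a global phase.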

Figure 1.2: An example quantum circuit (time flows from left to right; gates A and B are followed by gate C, mapping |ψ⟩ to C(B⊗A)|ψ⟩)

1.2 Quantum Circuit

A quantum circuit comprises a sequence of quantum gates. An n-qubit circuit is

depicted with n horizontal lines, with time flowing from left to right. A quantum

computation typically requires the application of several quantum gates, sequentially

or in parallel, to various subsets of qubits. The net unitary transformation they

perform can be expressed in a matrix product form, which is a cascade of the unitary

matrices of the corresponding quantum gates, using two rules: dot product and tensor

product [3, 4].

Dot Product

The dot product is the same as matrix multiplication. If several gates act on the

same subset of qubits, then those gates must be applied in series and their overall

effect computed by the dot product. In a quantum circuit, if operation A acts before

B, their overall effect is computed by their dot product in reverse order, i.e., B · A.

Tensor Product

If adjacent gates within a quantum circuit act on independent subsets of qubits, then

they can be applied simultaneously in parallel. The net effect of the parallel gates is

evaluated with the tensor product, denoted by “⊗”.

An example is shown in Fig. 1.2. The system computes C · [B ⊗ A], where A and B are 2 × 2 matrices and C is a 4 × 4 matrix.


As indicated earlier, all quantum operations are unitary and thus invertible. The inverse of a unitary matrix U is $U^{-1} = U^{\dagger}$, its conjugate transpose. Therefore, obtaining the inverse of a circuit is straightforward: reverse the order of the gates and replace each gate with its conjugate transpose. For example, the inverse of the circuit in Fig. 1.2 is [B† ⊗ A†] · C†.
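The dot-product and tensor-product rules, together with the inversion rule, can be checked numerically. The sketch below assumes, purely for illustration, that A is a Hadamard gate, B is a NOT gate, and C is a CNOT gate; the thesis does not fix these choices for Fig. 1.2.

```python
import math


def matmul(a, b):
    """Dot product: plain matrix multiplication."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Tensor (Kronecker) product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def is_identity(m, tol=1e-12):
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(len(m)) for j in range(len(m)))


s = 1 / math.sqrt(2)
A = [[s, s], [s, -s]]                 # assumed: Hadamard
B = [[0, 1], [1, 0]]                  # assumed: NOT
C = [[1, 0, 0, 0], [0, 1, 0, 0],
     [0, 0, 0, 1], [0, 0, 1, 0]]      # assumed: CNOT

U = matmul(C, kron(B, A))                                # C . [B (x) A]
U_inv = matmul(kron(dagger(B), dagger(A)), dagger(C))    # [B^† (x) A^†] . C^†

if __name__ == "__main__":
    print(is_identity(matmul(U_inv, U)))
```

Note that the gate applied first (here B ⊗ A) appears rightmost in the product, which is why the inverse circuit lists the daggered gates in the opposite order.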

When two physical systems are treated as one combined system, the state space of the combined physical system is the tensor product H = H1 ⊗ H2, where H1 and H2 are the state spaces of the component subsystems. It is important to note that the state of a many-qubit composite system cannot always be decomposed as a product of the states of its component subsystems. In this case, we say that the qubits are entangled. Entanglement is a physical phenomenon that occurs when groups of particles are generated or interact in such a way that the quantum state of each member cannot be described independently of the others.
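For two qubits, the decomposability condition has a simple algebraic test: a state a00|00⟩ + a01|01⟩ + a10|10⟩ + a11|11⟩ is a product of two single-qubit states exactly when a00·a11 − a01·a10 = 0. The following sketch applies this test; the function name `is_product_state` is hypothetical.

```python
import math


def is_product_state(amplitudes, tol=1e-12):
    """True if the two-qubit state (a00, a01, a10, a11) factors into two qubits."""
    a00, a01, a10, a11 = amplitudes
    # The 2x2 amplitude matrix has rank 1 exactly when its determinant vanishes.
    return abs(a00 * a11 - a01 * a10) < tol


s = 1 / math.sqrt(2)
bell_state = (s, 0, 0, s)        # (|00> + |11>) / sqrt(2), entangled
plus_zero = (s, 0, s, 0)         # (|0> + |1>) / sqrt(2) tensor |0>, separable

if __name__ == "__main__":
    print(is_product_state(bell_state), is_product_state(plus_zero))
```

This determinant test is specific to two qubits; larger systems require more general criteria.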

1.3 Quantum Algorithm

A quantum algorithm is one that runs on a realistic model of a quantum computer [4].

Just as a classical algorithm is a finite sequence of instructions, where each step

or instruction can be performed on a classical computer, a quantum algorithm is a

sequence of instructions, where each instruction is performed on a quantum computer.

Although all classical algorithms can also be performed on a quantum computer, the

term quantum algorithm is typically used for those algorithms that are inherently

quantum or use some essential feature of quantum computation, such as quantum

superposition or entanglement [5].

Quantum algorithms are interesting because they might be able to solve some

problems much faster than classical algorithms. The best-known such algorithms are

Shor’s algorithm for factoring and Grover’s algorithm for searching an unstructured


database or an unordered list. Shor’s algorithm runs exponentially faster than the

best known classical algorithm for factoring. Grover’s algorithm runs quadratically

faster than the best possible classical algorithm for the same task.

In general, a quantum algorithm consists of both classical and quantum func-

tions [3, 6]. Classical functions are described in the classical domain, i.e., the mapping

of input to output states consists of binary values. Since quantum mechanics requires

the time-evolution of quantum states to be reversible, the classical functions need to

be reversible. Their circuit implementations consist of a sequence of reversible gates

that implement reversible operations. Quantum functions are described in the quan-

tum domain, i.e., the mapping of input to output states is described by a unitary

transformation matrix with complex values. Their circuit realizations are composed

of quantum gates.

1.4 Reversible Computing

A function is reversible if its input and output mapping is bijective and thus is in-

formation lossless. Therefore, all reversible computations can be computed forward

or backward (inversely) without loss of information. The study of reversible logic

has historically been motivated by power consumption. Landauer's principle (or the von Neumann-Landauer limit) states that regardless of the technology chosen for implementing a circuit, when a bit of information is erased, at least KT ln 2 joules of energy is dissipated, where K is the Boltzmann constant and T is the operating temperature in kelvins [7]. In 1973, Bennett [8] showed that zero-energy computation is possible only if the computation is reversible, and a recent experiment has validated this perspective [9].
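To give a sense of scale for the Landauer bound KT ln 2, the short sketch below evaluates it numerically. The temperature of 300 K is an illustrative assumption, and the function name is hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value), in J/K


def landauer_limit_joules(temperature_kelvin, bits_erased=1):
    """Minimum energy dissipated when erasing the given number of bits."""
    return bits_erased * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # Roughly 2.87e-21 J per erased bit at an assumed room temperature of 300 K.
    print(landauer_limit_joules(300.0))
```

The bound scales linearly with both the temperature and the number of erased bits.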

Reversible computing not only attracts interest from the point of view of power

consumption, but also finds many applications in other areas. One of the most impor-


tant applications of reversible computing is quantum computation [3]. In quantum

computation, reversible functions are mostly used for generating oracles. An ora-

cle can be seen as the description of the problem to solve. The reversible functions

provide a high-level description of quantum algorithms. These functions are synthe-

sized to reversible circuits. Then, these reversible gates are further decomposed into

low-level quantum circuits for physical realization on a target PMD.

1.5 FT Quantum Computation

Quantum systems are fragile because when they inevitably interact with the environ-

ment, the information stored in the system decoheres, thus resulting in the failure of

computation. Hence, quantum circuits need to be FT in a practical implementation.

FT quantum computation implies that quantum circuits perform correctly even when

errors occur.

Many QECCs have been proposed to facilitate FT quantum computation, such as the Steane code [3, 10], Bacon-Shor code [11], Knill code [12], and surface code [13, 14]. A QECC is employed to detect and correct errors. However, the QECC circuit may itself incur an error. Thus, the error probability of each qubit must be smaller than an error threshold, so that the overall circuit error probability is reduced after the QECC is applied [15].

Assuming that the errors are independent, if the probability of a single error in a single physical gate is p, the error probability of an encoded logic gate is given by $Cp^2$, where C is a constant that depends on the code and circuitry used [3]. This can be improved if a concatenated code is used, which entails another level of QECC applied on top of the previous level. The error rate then becomes $C(Cp^2)^2 = C^3p^4$ for two-level concatenation. In general, a k-level concatenated code has a logical error rate of $p_k = \frac{1}{C}(Cp)^{2^k}$. Therefore, when p < 1/C, the error rate can be made arbitrarily small by increasing k.
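The concatenation formula p_k = (1/C)(Cp)^(2^k) can be tabulated directly. In the sketch below, the values p = 10⁻³ and C = 100 are hypothetical and serve only to show the doubly exponential decay when p < 1/C.

```python
def logical_error_rate(p, c, k):
    """Logical error rate p_k = (1/C) * (C*p)**(2**k) after k concatenation levels."""
    return (c * p) ** (2 ** k) / c


if __name__ == "__main__":
    p, c = 1e-3, 100.0          # hypothetical physical error rate and code constant
    for k in range(4):
        # k = 0 returns p itself; each further level squares the factor C*p.
        print(k, logical_error_rate(p, c, k))
```

Each level satisfies the recursion p_k = C·(p_{k−1})², so k = 1 gives Cp² and k = 2 gives C³p⁴, matching the expressions in the text.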

Implementations of these QECCs are based on some FT gates [16], as will be

defined and explained in Chapter 6. Therefore, if the circuits contain non-FT gates,

they should be converted to FT gates first. This conversion process is called quantum

compilation. For physical implementation of FT quantum computation, tiled quan-

tum architectures are used where each tile is able to do a set of FT computations.

The configuration and size of the tiles depend on the type of QECC and PMD.

1.6 Physical Machine Description

Physical realization of quantum computers is very challenging [3]. It not only requires

a robust physical representation of qubits, but also the enabling of their time-evolution

as desired. In quantum mechanics, the time-evolution of a closed quantum system

is described by a unitary operator determined by its Hamiltonian [3]. Since different

quantum systems have different Hamiltonians, they also have different PMDs. To

perform a quantum operation, one must be able to control the Hamiltonian of the

system. However, an operation may be easily performed in one system but with

difficulty in another. Hence, one PMD may be more suitable for implementing a

quantum logic gate than another.

Several quantum systems have been proposed that support different PMDs:

• Quantum dot (QD): In this system, a qubit is represented by the spin states

of two electrons in a double electrostatically defined quantum dot, which has

two potential wells with a tunneling barrier between them [17].

• Superconducting (SC): In this system, charge carriers are used to represent

qubits [18, 19, 20, 21, 22, 23]. At low temperatures in certain metals, two


electrons can bind together to form a Cooper pair. Such a pair can be confined

within an electrostatic box and used to represent quantum information.

• Ion trap (IT): This quantum system is based on a 2D lattice of confined ions,

each of which represents a physical qubit that can be moved within the lattice

to accommodate local interactions [3, 24, 25, 26].

• Neutral atom (NA): This is a system of trapped neutral atoms that can be

isolated from the environment and whose simple quantum-level structure can

be exploited [27, 28, 29].

• Linear photonics (LP): In this quantum system, a probabilistic two-photon

gate is teleported into a quantum circuit with high probability [30, 31, 32, 33].

• Nonlinear photonics (NP): This quantum system is based on weak cross-

Kerr nonlinearities [30, 34, 35].

A broad survey of quantum systems is available from the ARDA quantum com-

puting roadmap [36].

1.7 Physical Design

A quantum logic circuit does not consider the physical addresses of qubits: its logic gates are applied to qubits without regard to the physical distance between them. In a physical circuit realization, the physical addresses of the qubits need to be considered. In a practical implementation of quantum circuits, physical qubits have to be placed on a grid. The grid implements the architecture of the quantum computer. A physical distance constraint often imposed is that quantum gates can only operate on adjacent qubits on the grid.
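The adjacency constraint can be illustrated with a small sketch that flags two-qubit gates whose operands are not neighbors on a 2D grid. The placement, the gate list, and the function names are hypothetical, and adjacency is assumed to mean a Manhattan distance of exactly one.

```python
def manhattan(a, b):
    """Manhattan distance between two (row, column) grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def nonadjacent_gates(placement, two_qubit_gates):
    """Return the two-qubit gates whose operands are not grid neighbors."""
    return [(q1, q2) for q1, q2 in two_qubit_gates
            if manhattan(placement[q1], placement[q2]) != 1]


if __name__ == "__main__":
    placement = {"q0": (0, 0), "q1": (0, 1), "q2": (1, 1), "q3": (2, 2)}
    gates = [("q0", "q1"), ("q0", "q2"), ("q1", "q2"), ("q2", "q3")]
    # Gates returned here would need a communication channel before they can run.
    print(nonadjacent_gates(placement, gates))
```

A placement that minimizes the number (or total distance) of such gates reduces the communication overhead.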

Current quantum technologies support a set of one- and two-qubit quantum gates.

A two-qubit gate enables interaction between the two qubits. In a physical circuit


implementation, when the two qubits are far apart, a communication channel must

be created between them. This communication overhead not only increases the com-

putation latency but also the probability of a computation error.

A complex multi-qubit reversible gate must be decomposed into a sequence of one-

and two-qubit quantum gates. However, even a one- or two-qubit gate may not be

directly implementable in a physical quantum machine. Therefore, it must be further

decomposed using the set of supported primitive quantum operations in the PMD

of the quantum machine. A quantum gate may require a very different number of

primitive quantum operations for realization on different machines.

1.8 Synthesis Flow

There are many difficulties and challenges in implementing a quantum computer. A

hierarchical architecture for constructing a quantum computer can be found in [37].

From top to bottom, it includes quantum computing theory, quantum programming,

quantum computer architecture, quantum computer micro-architecture, and technol-

ogy building blocks. In this thesis, we concentrate on the synthesis of FT quantum

circuits, i.e., we convert a given quantum algorithm into an FT circuit that is opti-

mized for the given PMD.

Quantum circuit synthesis is concerned with the ability to automatically generate

an optimized quantum circuit from a given quantum algorithm. The synthesis of

quantum circuits is generally difficult, but can be performed effectively by hierarchi-

cally decomposing it into many stages. Fig. 1.3 shows a high-level view of a three-stage

synthesis flow. It includes synthesis from the quantum algorithm to non-FT quantum

logic, non-FT quantum logic circuit to FT quantum logic, and FT quantum logic to

FT quantum physical circuit. Each synthesis stage is described next.

Figure 1.3: High-level synthesis flow (blocks: quantum algorithm; quantum functions and classical functions; quantum module library (QLib); reversible logic synthesis (RMDDS); non-FT quantum logic; optimized quantum gate library (QGLVP); PMD; FT quantum logic synthesis (FTQLS); FT quantum logic; QECC; physical design-aware quantum circuit synthesis (PAQCS); FT physical quantum circuit)

The input of the flow is a quantum algorithm, which generally contains two parts:

quantum functions and classical functions. Synthesizing an arbitrary quantum func-

tion is very difficult and inefficient. Fortunately, quantum functions usually have

regular structures and hence can be effectively synthesized with some quantum mod-

ules. Therefore, a quantum module library (QLib) [38] is introduced to facilitate the

synthesis of commonly used quantum modules. On the other hand, classical functions

are reversible functions and can be synthesized by the Reed-Muller Decision Diagram

Synthesis (RMDDS) tool [39], which is a flexible and efficient tool for reversible logic

synthesis. RMDDS is able to optimize the number of qubits or the quantum cost of

the circuit implementation, and the objective is influenced by the implementation of

the underlying technology. After the synthesis of quantum and reversible functions, a

non-FT quantum logic circuit, which implements the quantum algorithm, is derived.

It is composed of high-level quantum logic gates.

In order to convert the high-level logic gates into low-level quantum gates, an

optimized quantum gate library for various PMDs (QGLVP) [40] is used. The gate

library decomposes high-level gates into primitive quantum gates supported by various

PMDs. With the decomposition library, the circuit is then optimized by the FT

quantum logic synthesizer, FTQLS [41], which synthesizes and optimizes the non-FT

logic to FT logic circuits. The focus of FTQLS is on optimizing circuits with FT

primitive gates. A logical FT circuit is composed of a set of FT gates, such that the

QECCs can be easily applied to the circuit.

Next, the physical circuit is synthesized by using the physical design-aware quan-

tum circuit synthesizer, PAQCS [42]. Based on the QECC and PMD chosen, the

qubits are placed on a 2D grid, with some communication channels constructed. In

this stage, the physical cost of QECC is considered. Thus, the optimization can be

targeted to different QECCs and PMDs, which reflect true circuit cost.


1.9 Contributions of the Thesis

As indicated in the previous section, the synthesis of quantum circuits is divided into three stages. The thesis proposes several effective synthesis methodologies to facilitate the synthesis flow.

The thesis first proposes RMDDS, which is a flexible and efficient reversible logic

synthesizer. It is flexible in the sense that users can either optimize the number of

qubits or the quantum cost in the circuit implementation. It is also efficient because

the circuits can be synthesized within user-defined CPU times. This combination of

flexibility and efficiency has been missing from synthesizers presented earlier.

The thesis then discusses QLib, which contains scripts to generate quantum mod-

ules of different sizes and specifications for well-known quantum algorithms. Because

many quantum algorithms use similar subroutines that can be implemented with

similar circuit modules, QLib is very helpful in quantum logic synthesis. In addition,

QLib can also serve as a suite of benchmarks for quantum logic and physical synthesis.

The thesis thereafter presents QGLVP. Quantum gates are themselves realized

using primitive quantum operations that are supported by the PMDs. Thus, the

quantum cost for implementing a quantum operation may differ from one PMD to

another. Hence, the optimized quantum gate decompositions are different in different

PMDs. To make our quantum gate library efficient in terms of the number of primitive

quantum operations involved and the associated delay, we explore one-qubit and two-

qubit quantum identity rules that can help remove redundancies in the quantum

gate implementation. QGLVP thus provides technology mapping for quantum logic

synthesis.

Next, FTQLS is presented. The input to FTQLS is an unoptimized quantum

logic realized using a set of commonly used gates and its output is an optimized FT

quantum logic that only comprises primitive quantum operations supported by the

given PMD. FTQLS does technology mapping for different PMDs and then converts


non-FT circuits to FT circuits. For technology mapping, it utilizes QGLVP. Efficient

conversion to FT circuits is done by integrating two quantum compilers and an FT

cache table into FTQLS. For improving synthesis results, an FT set of gates that is

directly supported by each PMD is proposed. Quantum circuit optimization is done

by utilizing quantum identity rules. A methodology such as FTQLS has not been

attempted before, to the best of our knowledge.

The last part of the thesis discusses PAQCS. It does physical design-aware quan-

tum circuit synthesis considering different PMDs and QECCs. It contains two efficient

and effective algorithms: one for physical qubit placement and another for routing of

communications. Physical design-aware quantum circuit design that can be targeted

at multiple PMDs and QECCs has not been attempted before, to the best of our

knowledge.

1.10 Thesis Outline

The rest of this thesis is organized as follows. Chapter 2 describes related work for

the thesis. Chapter 3 describes RMDDS for reversible logic synthesis. Chapter 4

discusses QLib, which contains scripts to generate quantum modules. Chapter 5

introduces QGLVP, which is an optimized quantum gate library for various PMDs.

Chapter 6 presents FTQLS, which effectively synthesizes FT quantum logic circuits

for various PMDs. Chapter 7 presents PAQCS. It synthesizes quantum logic circuits

to quantum physical circuits. Chapter 8 concludes the thesis and presents ideas for

future research.


Chapter 2

Related Work

This chapter surveys previous work in reversible and quantum logic synthesis. Re-

versible logic synthesis generates a high-level description of quantum oracle functions.

Hence, it is a very important part of the synthesis flow. Many different methodolo-

gies for reversible logic synthesis have been proposed. Each has some advantages

and disadvantages. Therefore, it is possible that a hybrid method may exploit the

complementary advantages of such methodologies.

For quantum circuit synthesis, some important issues must be considered that do

not exist in traditional circuit synthesis. These issues include the use of fault-tolerant

(FT) gates, the type of quantum error correction code (QECC) applied, physical

machine description (PMD), and physical qubit placement and routing.

In this chapter, we summarize several important techniques that are used in re-

versible and quantum circuit synthesis and discuss their characteristics, performance,

and applications.

2.1 Reversible Logic Synthesis

Reversible computation has been historically motivated by theoretical research in low-

power electronics. However, reversible logic not only attracts interest from low-power


research, but also finds many applications in other areas. Given that reversible trans-

formations are the bottleneck in many widely used algorithms, reversible instructions

have been added to various microprocessor instruction sets [43]. In addition, the bit-

permutation operation, which is reversible, has been proposed in [44] to increase the

efficiency of cryptographic algorithms. Reversible debugging and program inversion

allow us to undo a command in debugging environments and allow the reconstruction

of decisions that lead to some particular outcomes. Therefore, program inversion and

debugging [45, 46] and reversible programming languages [47, 48] have been proposed.

Reversible adiabatic circuits have been proposed to recycle signal energy [49] and for

supercomputing [50].

Quantum computation is another important motivation for reversible computation

because unitary transformations in quantum mechanics are reversible. Many quantum

algorithms [51, 52, 53] have been introduced to solve several difficult problems in

polynomial time. Because of these important applications, reversible circuit synthesis

has become an important research topic.

Reversible logic synthesis is concerned with the ability to automatically gener-

ate a reversible circuit from a given reversible function. It synthesizes reversible

functions into reversible circuits composed of multiple-controlled Toffoli (MCT) [54],

Fredkin [55], and Peres [56] gates. Several reversible logic synthesis algorithms have

been proposed. In general, they can be classified into two categories: exact and

heuristic.
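As a small illustration of what a reversible circuit computes, the sketch below applies a cascade of MCT gates to every input pattern and checks that the resulting truth table is a permutation, i.e., that the function is bijective. The integer bit encoding and the function names are assumptions made for this example.

```python
def mct(state, controls, target):
    """Apply a multiple-controlled Toffoli gate to an integer-encoded bit pattern.

    The target bit is inverted only if every control bit is 1. An empty control
    list gives a NOT gate and a single control gives a CNOT gate.
    """
    if all((state >> c) & 1 for c in controls):
        state ^= 1 << target
    return state


def truth_table(n, gates):
    """Output pattern for each of the 2**n input patterns of an n-line circuit."""
    table = []
    for x in range(2 ** n):
        y = x
        for controls, target in gates:
            y = mct(y, controls, target)
        table.append(y)
    return table


def is_reversible(table):
    """A function is reversible if its truth table is a permutation of the inputs."""
    return sorted(table) == list(range(len(table)))


if __name__ == "__main__":
    toffoli = [((0, 1), 2)]             # controls on lines 0 and 1, target on line 2
    print(truth_table(3, toffoli), is_reversible(truth_table(3, toffoli)))
```

Since each MCT gate is its own inverse, applying the same cascade in reverse order restores the inputs.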

2.1.1 Exact Synthesis

Exact algorithms perform an exhaustive search for globally optimal solutions. An

exact algorithm based on depth-first search with iterative deepening was used in [57]

to find optimal solutions for all three-bit reversible functions. Another exact algorithm

based on Boolean satisfiability was proposed in [58]. This algorithm guarantees a


reversible circuit with a minimum number of gates only for circuits with up to 10 gates.

Although exact algorithms guarantee optimality, they consume excessive synthesis times since there are $(2^n)!$ permutations for an n-input reversible function. Thus, the synthesis time explodes with an increase in the number of input variables or the number of gates, limiting the usage of exact algorithms in practice.
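The growth of the search space can be seen by evaluating (2^n)! for small n, as in the following sketch (the function name is hypothetical):

```python
import math


def num_reversible_functions(n):
    """Number of n-input reversible functions: permutations of 2**n patterns."""
    return math.factorial(2 ** n)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, num_reversible_functions(n))
```

Already for n = 3 there are 40320 functions, and for n = 4 there are 16! = 20922789888000.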

2.1.2 Heuristic Synthesis

In order to synthesize reversible circuits more efficiently, several heuristic methods

have been proposed that lead to reasonably good solutions. According to a sur-

vey [59], these methods can be categorized into four types: (1) search-based, (2)

binary-decision-diagram-based, (3) transformation-based, and (4) cycle-based. Dif-

ferent heuristic algorithms excel at different aspects of synthesis: number of qubits

(#qubits), quantum cost (QC) or synthesis time.

Search-based Method

Gupta et al. [60] proposed a search-based method, called Reed-Muller reversible logic

synthesis (RMRLS), which uses the positive-polarity Reed-Muller (PPRM) expansion

to synthesize reversible circuits. The methodology in [61] extends the one in [60] for

synthesis with more reversible gates, such as Fredkin and Peres gates. Given enough

memory and time, these methods can find a minimal circuit. RMRLS is typically effective in synthesizing circuits with up to six variables. However, many reversible functions/circuits exceed this limit.
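The PPRM expansion on which such methods operate can be computed from a truth table with the standard Reed-Muller (Möbius) transform over GF(2). The sketch below shows only this expansion step for a single-output function, not the search procedure of RMRLS itself; the function names and the bit encoding (bit i of the index is the value of variable x_i) are assumptions.

```python
def pprm_coefficients(truth_table):
    """PPRM coefficients of a Boolean function given as a list of 2**n output bits.

    Coefficient m is 1 if the monomial formed by the variables whose bits are
    set in m appears in the XOR-of-products expansion (m = 0 is the constant 1).
    """
    coeffs = list(truth_table)
    n = len(coeffs).bit_length() - 1
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


def pprm_terms(truth_table):
    """Monomials of the expansion as tuples of variable indices."""
    coeffs = pprm_coefficients(truth_table)
    n = len(coeffs).bit_length() - 1
    return [tuple(i for i in range(n) if (m >> i) & 1)
            for m, c in enumerate(coeffs) if c]


if __name__ == "__main__":
    # OR of two variables: x0 XOR x1 XOR x0*x1
    print(pprm_terms([0, 1, 1, 1]))
```

An empty tuple in the result stands for the constant term 1.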

Transformation-based Method

Transformation-based synthesis approaches [59, 62, 63, 64] rely on a truth table de-

scription. This approach compares the identity function I with a given permutation

function F , and uses a sequence of reversible gates to transform F into I. To conduct


the transformation, the optimization metric used is the sum of Hamming distances

between the binary patterns of F and I in each row of the truth table. The method

iterates through the rows of the truth table. Since the size of the truth table increases

exponentially with the number of inputs, the method becomes impractical for

functions with many inputs.
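The optimization metric described above can be sketched as follows, with the permutation F given as a list in which entry x holds F(x). This shows only the cost function, not the gate-selection procedure of any particular transformation-based method, and the function name is hypothetical.

```python
def hamming_cost(permutation):
    """Sum over all truth-table rows of the Hamming distance between F(x) and x."""
    return sum(bin(x ^ fx).count("1") for x, fx in enumerate(permutation))


if __name__ == "__main__":
    identity = list(range(8))
    toffoli = [0, 1, 2, 7, 4, 5, 6, 3]    # rows 3 and 7 differ from identity in one bit
    print(hamming_cost(identity), hamming_cost(toffoli))
```

A cost of zero means that F has been fully transformed into the identity function I.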

Cycle-based Method

Instead of synthesizing an entire permutation, one can factor it into a set of cycles and synthesize the resulting cycles separately. A cycle $(a_1, a_2, \ldots, a_n)$ is a permutation f such that $f(a_1) = a_2$, $f(a_2) = a_3$, ..., and $f(a_n) = a_1$. Cycle-based methods [57, 65, 66] use group theory to factor reversible functions into a set of cycles.
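Factoring a permutation into disjoint cycles is straightforward, as the following sketch shows. It covers only the factoring step, not the synthesis of each cycle, and the function name is hypothetical.

```python
def cycles(permutation):
    """Disjoint cycles of a permutation given as a list, omitting fixed points."""
    seen = set()
    result = []
    for start in range(len(permutation)):
        if start in seen or permutation[start] == start:
            continue
        cycle = []
        x = start
        while x not in seen:          # follow start -> f(start) -> f(f(start)) ...
            seen.add(x)
            cycle.append(x)
            x = permutation[x]
        result.append(tuple(cycle))
    return result


if __name__ == "__main__":
    print(cycles([0, 1, 2, 7, 4, 5, 6, 3]))   # Toffoli gate: a single 2-cycle
    print(cycles([1, 2, 0, 3]))
```

The Toffoli permutation consists of the single cycle (3, 7), which reflects the fact that the gate swaps only two input patterns.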

Decision Diagram Method

Since most of the heuristic algorithms are not scalable to a large number of inputs,

Wille et al. [67] introduced a scalable heuristic algorithm based on Shannon decompo-

sition and binary decision diagram (BDD) to efficiently synthesize large circuits. The

average synthesis time is just a few seconds. Later, Soeken et al. [68] and Pang et al.

[69, 70] used the Kronecker functional decision diagram (KFDD) to further improve

circuit cost through Davio and Shannon expansions. Since both BDD and KFDD

synthesis methods are based on the decision diagram data structure, we categorize

them under the label: decision diagram synthesis (DDS). Although DDS is scalable

and very efficient, it generates many ancillary input and output bits, which weakens

its practical impact.
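The decompositions that underlie BDD and KFDD nodes can be stated compactly in terms of the cofactors f0 = f(x = 0) and f1 = f(x = 1). The sketch below writes out the Shannon, positive Davio, and negative Davio expansions and checks them exhaustively on a hypothetical example function; it does not reproduce any of the cited synthesis tools.

```python
from itertools import product


def shannon(f0, f1, x):
    """Shannon expansion: f = (NOT x AND f0) XOR (x AND f1)."""
    return ((1 - x) & f0) ^ (x & f1)


def positive_davio(f0, f1, x):
    """Positive Davio expansion: f = f0 XOR (x AND (f0 XOR f1))."""
    return f0 ^ (x & (f0 ^ f1))


def negative_davio(f0, f1, x):
    """Negative Davio expansion: f = f1 XOR (NOT x AND (f0 XOR f1))."""
    return f1 ^ ((1 - x) & (f0 ^ f1))


def example_function(x, y, z):
    """Hypothetical three-input function used only for the check below."""
    return (x & y) ^ z


if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        f0 = example_function(0, y, z)
        f1 = example_function(1, y, z)
        expected = example_function(x, y, z)
        print(x, y, z, shannon(f0, f1, x) == expected,
              positive_davio(f0, f1, x) == expected,
              negative_davio(f0, f1, x) == expected)
```

A KFDD may choose any of the three expansions per variable, whereas a BDD uses the Shannon expansion throughout.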

Due to the characteristics of different methods, a well-designed hybrid synthesis

method that leverages complementary advantages of various heuristics may provide an

opportunity to perform effective trade-offs. Reed-Muller Decision Diagram Synthesis (RMDDS), presented in this thesis, is such a hybrid method that exploits the advantages of search-based and decision-diagram-based methods. It is efficient in terms of synthesis time. In addition, it enables trade-offs between #qubits and quantum cost, as shown in Chapter 3.

2.2 Quantum Logic Synthesis

Quantum logic circuits are built using a cascade of quantum gates, which operate on

qubits. Synthesis of quantum logic has many aspects, based on the input and output

it generates. Some quantum logic synthesis problems have been tackled before, as

discussed next.

2.2.1 Logic Identities

In circuit design, often gates in adjacent circuit blocks can be merged, since a cir-

cuit may admit many different but equivalent decompositions. These decompositions

constitute circuit identities or templates. They can be used to optimize the quantum

cost or critical path of the circuit. Some quantum circuit identities and templates

have been proposed in [3, 71, 72, 73, 74]. The methods in [3, 71] consider identities

involving Controlled-NOT (CNOT) gates. The method in [72] only considers one-

qubit gate identities. The methods in [73, 74] target the NCV library consisting of

CNOT and Controlled-V (CV) gates, where V is the square root of NOT. Since dif-

ferent quantum systems support different primitive quantum operations, i.e., physical

machine descriptions (PMDs), the methods discussed above are only partially applicable and, in many cases, completely inapplicable to the targeted PMDs. We need identity rules that are generally applicable to every PMD.
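Two elementary identities of the kind such templates rely on are V · V = NOT and V · V† = I, where V is the square root of NOT. The sketch below verifies both numerically; the helper names are hypothetical.

```python
def matmul(a, b):
    """Plain 2x2 (or larger) matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))


V = [[0.5 + 0.5j, 0.5 - 0.5j],
     [0.5 - 0.5j, 0.5 + 0.5j]]          # square root of NOT
NOT = [[0, 1], [1, 0]]
IDENTITY = [[1, 0], [0, 1]]

if __name__ == "__main__":
    print(close(matmul(V, V), NOT))               # two V gates merge into one NOT
    print(close(matmul(V, dagger(V)), IDENTITY))  # V followed by V^† cancels
```

Applying such identities to adjacent gates on the same qubit reduces the gate count without changing the function of the circuit.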


2.2.2 Quantum Shannon Decomposition

Quantum Shannon decomposition is used to synthesize generic quantum circuits.

The input of the synthesizer is an arbitrary quantum matrix and its output is a

cascade of controlled or uncontrolled rotation gates. Typically, it utilizes cosine-

sine decomposition [75, 76] or optimized brute-force search [77] to synthesize optimal

circuits. However, due to the size of the matrix representation (an n-qubit circuit is described by a $2^n \times 2^n$ matrix) and the high computational complexity, $O(4^n)$, these methods cannot process large quantum circuits. Thus, they have limited use.

2.2.3 Quantum Compilation

Since an FT quantum circuit is only composed of FT gates, the conversion of non-FT

gates to FT gates is an important procedure. The purpose of quantum compilation

is to find a cascade of gates from a discrete universal gate set to approximate any

arbitrary unitary gate within an error threshold. Currently, quantum compilation

mainly uses two methodologies to convert non-FT circuits into FT circuits. One is

based on the Solovay-Kitaev algorithm (SKA) [16, 78] and the other is based on the

skipping table algorithm (STA) [79, 80]. SKA and STA are introduced in detail in

Section 6.2.1.
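The goal of quantum compilation can be illustrated with a naive exhaustive search over short sequences of H and T gates. This is neither SKA nor STA; it is a brute-force stand-in, feasible only for very short sequences, that shows what "approximating a unitary within an error threshold" means. The gate set, the phase-invariant distance measure, and all names are assumptions made for this sketch.

```python
import cmath
import math
from itertools import product

S2 = 1 / math.sqrt(2)
H = [[S2, S2], [S2, -S2]]
T = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
GATES = {"H": H, "T": T}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def distance(u, v):
    """Global-phase-invariant distance sqrt(1 - |tr(u^† v)| / 2)."""
    trace = sum(complex(u[k][i]).conjugate() * v[k][i]
                for i in range(2) for k in range(2))
    return math.sqrt(max(0.0, 1 - abs(trace) / 2))


def sequence_matrix(names):
    """Matrix of a gate sequence; the first name is the first gate applied."""
    m = [[1, 0], [0, 1]]
    for name in names:
        m = matmul(GATES[name], m)    # later gates multiply from the left
    return m


def best_sequence(target, max_len):
    """Closest sequence of at most max_len gates, found by exhaustive search."""
    best = (float("inf"), ())
    for length in range(1, max_len + 1):
        for names in product(GATES, repeat=length):
            d = distance(target, sequence_matrix(names))
            if d < best[0]:
                best = (d, names)
    return best


if __name__ == "__main__":
    theta = 0.3                        # hypothetical target: a z-rotation
    rz = [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]
    print(best_sequence(rz, 8))
```

The number of candidate sequences doubles with every additional gate, which is why practical compilers rely on recursive or table-based methods instead of exhaustive search.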

2.3 Physical Design

Quantum logic circuits cannot be directly implemented on a physical quantum system

because they contain temporal, but no spatial, information of the circuit. A quantum

physical circuit is derived by placing qubits on a grid, which denotes physical locations

of the qubits. Adjacent qubits can interact with each other on the grid. If the two

qubits that a two-qubit gate is applied to are not adjacent, a communication channel

must be created.


In general, a physical circuit is composed of many quantum tiles. Each tile contains

a qubit and is able to do a set of FT operations. The implementation of a tile is based

on the type of PMD and the QECC employed.

2.3.1 Tiled Quantum Architecture

Many quantum technologies and architectures have been proposed to enable various

degrees of qubit interactions. The ion trap (IT) technology [24, 25, 26] uses one-

dimensional (1D) interaction, in which only two adjacent qubits can interact with

each other. This is, of course, highly restrictive. The quantum dot (QD) [17], su-

perconducting (SC) [18, 19, 20, 21, 22, 23], neutral atom (NA) [27, 28, 29], linear

photonics (LP) [30, 31, 32, 33], and non-linear photonics (NP) [30, 34, 35] technolo-

gies use two-dimensional (2D) interaction, in which a qubit (except for those at the

boundaries) can interact with four neighboring qubits. The technology presented in

[81] uses three-dimensional (3D) interaction, in which a qubit can interact with six

neighboring qubits. However, this makes quantum control very difficult. Currently,

the 2D interaction technology is the most popular and most current research is based

on the 2D structure.

Various tiled quantum architectures have been proposed in order to implement FT

quantum computation [82, 83, 84]. In such architectures, each tile contains a qubit

register and implements a set of FT quantum gates. The architecture is typically

hierarchical, which means that a kth-level tile comprises multiple (k − 1)th-level tiles.

An example is shown in Fig. 2.1. It shows each level to be composed of nine sub-level

tiles.

Figure 2.1: Three-level tile hierarchy, each kth-level tile is composed of nine (k−1)th-level tiles

Figure 2.2: Two types of tiled architecture. (a) movement based and (b) swap based

Two types of quantum tile architectures have been proposed based on the mobility of qubits. The first type is movement based [82, 84, 85], as shown in Fig. 2.2(a). The universal logic block (ULB) tile shown in the figure is analogous to a configurable logic block employed in conventional FPGAs. ULBs are separated by routing channels that are used to move qubits. NA, LP, and NP are three quantum technologies that support the movement of qubits and are thus compatible with this architecture. The second type is swap based, as shown in Fig. 2.2(b). Each tile Q contains a qubit and a set of FT gates. In QD and SC quantum technologies, the qubits are fixed, but the states of qubits move through the use of swap chains. A swap chain exchanges the states of adjacent qubits through a sequence of swap gates.
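A swap chain can be sketched on a one-dimensional row of fixed qubits as follows. The list-based model and the function name are hypothetical simplifications; states are represented by labels rather than by quantum amplitudes.

```python
def swap_chain(states, src, dst):
    """Move the state held at index src to index dst using adjacent swaps only.

    Returns the new arrangement of states and the list of swap gates applied.
    """
    states = list(states)
    swaps = []
    step = 1 if dst > src else -1
    pos = src
    while pos != dst:
        states[pos], states[pos + step] = states[pos + step], states[pos]
        swaps.append((pos, pos + step))
        pos += step
    return states, swaps


if __name__ == "__main__":
    # Move state "a" from qubit 0 to qubit 2; the states in between shift by one.
    print(swap_chain(["a", "b", "c", "d"], 0, 2))
```

The number of swap gates equals the distance between the source and destination qubits, which is why placing interacting qubits close together reduces latency.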

For physical implementation of the tiles, several designs have been proposed. The

designs are based on different QECCs, each with a different number of operations (#ops) and cycles (#cycles). Important tile implementations have been proposed for three QECCs: the Steane code [86], the Bacon-Shor code [87], and the Knill code [88].


2.3.2 Physical Synthesis

Many techniques have been proposed to perform physical synthesis of quantum logic

circuits to 1D quantum architectures [89, 90, 91, 92]. However, they are not suitable

for 2D quantum architectures. A few hand-optimized 2D quantum circuits have been

proposed in [93, 94, 95]. However, developing methodologies for physical circuit syn-

thesis on 2D architectures is a nascent area of research. Recently, a methodology for

qubit placement that minimizes communication overhead in a 2D quantum architec-

ture has been proposed [96], which is the first attempt to address this problem. It

uses mixed integer linear programming for qubit placement optimization.


Chapter 3

RMDDS: Reed-Muller Decision Diagram Synthesis for Reversible Logic

In this chapter, we propose a flexible and efficient reversible logic synthesizer. It ex-

ploits the complementary advantages of two methods: Reed-Muller Reversible Logic

Synthesis (RMRLS) and Decision Diagram Synthesis (DDS), and is thus called Reed-

Muller Decision Diagram Synthesis (RMDDS) [39]. RMRLS does not scale to a large

number of qubits. DDS tools, even though efficient, add a large number of ancillary

qubits and typically incur much higher quantum cost than necessary. RMDDS over-

comes these obstacles. It is flexible in the sense that users can either optimize the

number of qubits (#qubits) or the quantum cost (QC) in the circuit implementation.

It is also efficient because the circuits can be synthesized within user-defined CPU

times. This combination of flexibility and efficiency has been missing from synthesiz-

ers presented earlier. When used to synthesize reversible functions, RMDDS reduces

#qubits by up to 79.2% (average of 54.6%) when the synthesis objective is to min-

imize #qubits and the QC by up to 71.5% (average of 35.7%) when the synthesis


objective is to minimize QC, relative to DDS methods. For irreversible functions (which are automatically embedded in reversible functions), the corresponding best (average) reductions are 42.1% (22.5%) in #qubits when minimizing #qubits and 63.0% (25.9%) in QC when minimizing QC.

3.1 Introduction

As indicated in Section 2.1, many reversible logic synthesis algorithms have been proposed in prior work. Since exact algorithms are complex and require long synthesis times, heuristic methods that generate reasonably good solutions are usually used to synthesize reversible circuits more efficiently. RMRLS uses the positive-polarity Reed-Muller (PPRM) expansion to synthesize reversible circuits with optimized QC. RMRLS is typically effective in synthesizing circuits with a small number of qubits and is not scalable to large circuits. On the other hand, DDS efficiently synthesizes large circuits with very low synthesis times, but with a large number of ancillary qubits.

Both QC and #qubits are important metrics for evaluating the performance of

reversible circuits. The QC of a reversible gate is equal to the number of elementary

quantum operations required to implement its functionality. Hence, QC is related

to the computation time and defines the computational complexity of the circuit.

#qubits is an important metric for evaluating the quantum hardware cost of the

circuit. Since allowing higher #qubits in the implementation may make its realization

impractical, reducing #qubits is very important.

The aim of this chapter is to present a heuristic algorithm that is scalable (to a large number of inputs) and efficient (i.e., synthesis time is at most a few minutes) like the DDS methods, yet generally yields lower #qubits and smaller QC than these methods. Thus, a hybrid synthesis method that leverages complementary advantages


of various heuristics may provide an opportunity to perform tradeoffs among these

three aspects. However, the integration of these heuristics needs to be performed

carefully since a bad integration interface can decrease its efficiency and effectiveness.

The integration of a transformation-based method with DDS is not easy because the

truth table specification does not scale with the number of inputs. In order to form

a hybrid of cycle-based and DDS methods, decomposition into cycles and conversion

between the canonical cycle form, which is the input format used by cycle-based

methods, and a decision diagram (DD) pose major hurdles.

Due to the above drawbacks, we choose to integrate RMRLS with DDS. Because

the conversion between the PPRM expansion and a DD is straightforward, it is much

easier to form a hybrid of RMRLS and DDS. This combination can leverage the

complementary advantages of RMRLS (in reducing #qubits and QC) and DDS (in

terms of scalability and efficiency). We refer to this hybrid method as RMDDS.

RMDDS is very flexible. It allows users to specify an objective (or impose constraints) on the circuit implementation, such as #qubits, QC, or synthesis time. Previous reversible synthesizers do not provide such flexibility.

The remainder of the chapter is organized as follows. Section 3.2 provides back-

ground material on reversible circuits. Section 3.3 provides some simple examples to

motivate our work. Section 3.4 discusses the synthesis procedure in detail. Section 3.5

presents experimental results. Finally, Section 3.6 concludes.

3.2 Background

In this section, we present background material to facilitate understanding of re-

versible logic synthesis.


3.2.1 Reversible Functions

A function is reversible if it performs a bijective (one-to-one and onto) mapping between its input and output vectors; otherwise, it is irreversible. Since reversible functions are bijective:

(1) the operation of reversible functions is bidirectional, which means they can be

operated in both the forward and backward directions, (2) reversible operations are

information lossless, (3) the number of inputs and outputs of reversible functions are

equal, (4) the outputs of reversible functions are permutations of the inputs, and (5)

because information needs to be conserved, fan-out in reversible circuits is prohibited.

The truth table of an irreversible function can be expanded to convert it to a

reversible function. This can be done by adding garbage outputs and corresponding

constant inputs to balance the number of inputs and outputs and then bijectively

filling the augmented truth table. An example is shown in Fig. 3.1. The 2-to-1 OR

gate/function is irreversible because its truth table is not bijective: the output value 1 occurs three times. Hence, its operation is information-lossy since we cannot always

deduce the input vector from the corresponding output bit (i.e., information is erased

during computation). This can be remedied by adding a constant input (c1) and two

garbage outputs (g1 and g2) and filling the truth table bijectively.

An irreversible function can be transformed into a reversible one with the addition of ⌈log2 p⌉ garbage outputs, where p is the maximum number of times any single output pattern is repeated in the truth table [97]. For the OR gate, p = 3, which gives the two garbage outputs used above. However, there are many ways to perform the transformation since there are (2^n)! permutations for an n-input reversible function.
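The embedding step can be checked mechanically. The sketch below, written under the assumption that p is the largest multiplicity of any output pattern, computes the number of garbage outputs for the OR function and confirms that the augmented truth table of Fig. 3.1(b) is bijective. The helper names `garbage_outputs_needed` and `is_bijective` are hypothetical.

```python
from collections import Counter
from itertools import product


def garbage_outputs_needed(output_patterns):
    """Return ceil(log2(p)), where p is the largest multiplicity of any output pattern."""
    p = max(Counter(output_patterns).values())
    return (p - 1).bit_length()  # exact integer form of ceil(log2(p)) for p >= 1


# Outputs of the 2-to-1 OR function over inputs (a, b) = 00, 01, 10, 11.
or_outputs = [a | b for a, b in product((0, 1), repeat=2)]

# Reversible embedding as read from Fig. 3.1(b): (c1, a, b) -> (f, g1, g2).
embedding = {
    (0, 0, 0): (0, 0, 0), (0, 0, 1): (1, 0, 0),
    (0, 1, 0): (1, 0, 1), (0, 1, 1): (1, 1, 0),
    (1, 0, 0): (0, 0, 1), (1, 0, 1): (0, 1, 0),
    (1, 1, 0): (0, 1, 1), (1, 1, 1): (1, 1, 1),
}


def is_bijective(table):
    """A table defined on all input vectors is bijective iff no output vector repeats."""
    return len(set(table.values())) == len(table)


print(garbage_outputs_needed(or_outputs))  # 2
print(is_bijective(embedding))             # True
```

With the constant input c1 set to 0, the output f of the embedding reproduces a OR b, so the original function is preserved on the rows that matter.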

3.2.2 Reversible Gates

Reversible circuits are constructed by cascading reversible gates. A commonly used reversible gate is the Multiple Controlled Toffoli (MCT) gate: an n-bit MCT gate [54] has n inputs and outputs. It passes the first n − 1 inputs (referred to as control bits) to the output unaltered and inverts the nth input (referred to as the target bit) if the first n − 1 inputs are all 1's. That is:

\[
\begin{aligned}
\text{control bits:} \quad & y_i = x_i, \quad 1 \le i \le n-1, \\
\text{target bit:} \quad & y_n = x_n \oplus x_1 x_2 \cdots x_{n-1}
\end{aligned}
\tag{3.1}
\]

    (a)   a  b    f          (b)   c1  a  b    f  g1  g2
          0  0    0                0   0  0    0  0   0
          0  1    1                0   0  1    1  0   0
          1  0    1                0   1  0    1  0   1
          1  1    1                0   1  1    1  1   0
                                   1   0  0    0  0   1
                                   1   0  1    0  1   0
                                   1   1  0    0  1   1
                                   1   1  1    1  1   1

Figure 3.1: (a) A 2-to-1 irreversible OR gate/function, and (b) a 3-to-3 reversible OR gate obtained by adding a constant input (c1) and two garbage outputs (g1, g2)

Figure 3.2: (a) n-bit MCT, (b) NOT, (c) CNOT, and (d) Toffoli gate. Gate outputs shown in the figure: NOT maps x1 to x1 ⊕ 1 = x1'; CNOT maps (x1, x2) to (x1, x2 ⊕ x1); Toffoli maps (x1, x2, x3) to (x1, x2, x3 ⊕ x1x2)

An MCT gate is shown in Fig. 3.2(a). A 1-bit MCT gate inverts the input uncon-

ditionally and is, hence, also called a NOT gate [Fig. 3.2(b)]. A 2-bit MCT gate is

also called the Feynman or Controlled-NOT (CNOT) gate. It inverts the target bit


if the control bit is 1 [Fig. 3.2(c)]. Fig. 3.2(d) depicts a 3-bit MCT gate. Convention-

ally, a 3-bit MCT gate is simply referred to as the Toffoli gate. It inverts the target

bit if both the control bits are 1.
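Eq. (3.1) translates almost directly into code. The following sketch represents a circuit state as a tuple of bits and treats NOT, CNOT, and Toffoli as MCT gates with zero, one, and two control bits. The function names and the index-based interface are assumptions made for illustration.

```python
from itertools import product


def mct(bits, controls, target):
    """Apply an MCT gate: flip bits[target] iff every control bit is 1.

    With no controls the condition is vacuously true, which gives a NOT gate.
    """
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def not_gate(bits, target):
    return mct(bits, [], target)


def cnot(bits, control, target):
    return mct(bits, [control], target)


def toffoli(bits, c1, c2, target):
    return mct(bits, [c1, c2], target)


# A Toffoli gate is a permutation of the 8 input vectors and is its own inverse.
images = {toffoli(x, 0, 1, 2) for x in product((0, 1), repeat=3)}
print(len(images))
```

Because the control bits pass through unchanged, applying the same MCT gate twice restores the input, which is one simple way to see that the gate is reversible.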

Another popular reversible gate is the n-bit Multiple Controlled Fredkin (MCF) gate, defined as follows [55]:

\[
\begin{aligned}
\text{control bits:} \quad & y_i = x_i, \quad 1 \le i \le n-2, \\
\text{target bit 0:} \quad & y_{n-1} = x_{n-1}\,\overline{x_1 x_2 \cdots x_{n-2}} \oplus x_n\, x_1 x_2 \cdots x_{n-2}, \\
\text{target bit 1:} \quad & y_n = x_n\,\overline{x_1 x_2 \cdots x_{n-2}} \oplus x_{n-1}\, x_1 x_2 \cdots x_{n-2}
\end{aligned}
\tag{3.2}
\]

The MCF gate swaps the two target bits if all the control bits are 1's. Conventionally, a 3-bit MCF gate (n = 3) is simply referred to as the Fredkin gate.
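The MCF gate can be sketched in two equivalent ways: as a conditional swap, and as the XOR expressions of Eq. (3.2), in which each target keeps its own value when the product of the control bits is 0 and takes the other target's value when that product is 1. The function names below are hypothetical, and the comparison is only a consistency check on small gate sizes.

```python
from itertools import product


def mcf_swap(bits):
    """n-bit MCF gate: swap the last two bits iff all preceding (control) bits are 1."""
    *controls, t0, t1 = bits
    if all(controls):
        t0, t1 = t1, t0
    return (*controls, t0, t1)


def mcf_xor_form(bits):
    """Same gate written as XOR expressions with the control product c and its complement."""
    *controls, t0, t1 = bits
    c = int(all(controls))
    y0 = (t0 & (1 - c)) ^ (t1 & c)
    y1 = (t1 & (1 - c)) ^ (t0 & c)
    return (*controls, y0, y1)


# The two descriptions agree for the 3-bit (Fredkin) and 4-bit MCF gates.
agree = all(
    mcf_swap(x) == mcf_xor_form(x)
    for n in (3, 4)
    for x in product((0, 1), repeat=n)
)
print(agree)
```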

A third type of reversible gate is the three-bit Peres gate [56]. It accomplishes

the operation of the cascade of a CNOT gate and a Toffoli gate, with the operation

defined as follows:

\[
\begin{aligned}
y_1 &= x_1 \oplus x_2 \\
y_2 &= x_2 \\
y_3 &= x_3 \oplus x_1 x_2
\end{aligned}
\tag{3.3}
\]
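As a quick check of Eq. (3.3), the sketch below builds the Peres operation from a Toffoli gate followed by a CNOT gate. The gate order and the choice of x2 as the CNOT control are assumptions chosen so that the cascade reproduces the three output expressions; the text only states that the gate is a cascade of the two.

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff every control bit is 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def peres_cascade(x):
    """Toffoli (controls x1, x2; target x3) followed by CNOT (control x2; target x1)."""
    s = mct(x, [0, 1], 2)
    return mct(s, [1], 0)


def peres_equations(x):
    """Direct evaluation of Eq. (3.3)."""
    x1, x2, x3 = x
    return (x1 ^ x2, x2, x3 ^ (x1 & x2))


inputs = list(product((0, 1), repeat=3))
print(all(peres_cascade(x) == peres_equations(x) for x in inputs))
```

Under the cost model used in this chapter, the two-gate cascade would cost 1 + 5 = 6, which is the usual motivation for treating the Peres operation as a single cheaper gate.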

The QC of a reversible gate is equal to the number of elementary quantum op-

erations required to implement its functionality. The method for deriving the QC of

arbitrary quantum gates was first introduced by Barenco et al. [98]. An n-bit quan-

tum gate is hierarchically decomposed into basic elementary reversible gates. NOT,

CNOT, and Toffoli gates are defined to have a QC of 1, 1, and 5, respectively. The
QC of an n-bit MCT gate (n ≥ 2) is 2^n − 3. However, if the gate contains some ancillary

qubits, its QC can be reduced because the ancillary qubits can be used to hold some

temporary values, thus reducing the computational complexity [73, 74].


The QC of a reversible circuit is simply the sum of the QC of its constituent gates. A high QC implies a long computation time, which raises the probability that noise disturbs the quantum system and destroys its state. Thus, minimizing the circuit QC is also a very important design objective.
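The cost model above is easy to tabulate. This sketch assumes the ancilla-free MCT cost of 2^n − 3 for n ≥ 2 (with NOT costing 1), consistent with the values 1, 1, and 5 for NOT, CNOT, and Toffoli, and sums gate costs over a circuit given as a list of gate sizes. The function names are hypothetical.

```python
def mct_qc(n):
    """QC of an n-bit MCT gate without ancillary qubits (NOT = 1, CNOT = 1, Toffoli = 5)."""
    if n < 1:
        raise ValueError("an MCT gate needs at least one bit")
    return 1 if n == 1 else 2 ** n - 3


def circuit_qc(gate_sizes):
    """QC of a circuit: the sum of the QC of its constituent MCT gates."""
    return sum(mct_qc(n) for n in gate_sizes)


print(mct_qc(5))                 # 29
print(circuit_qc([3, 3, 3, 2]))  # 16: three Toffoli gates and one CNOT gate
```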

3.2.3 Reed-Muller Expansion

Any Boolean function can be described by an exclusive-OR sum-of-products (ESOP)

expression [99]. The ESOP expression can be easily converted to the PPRM expan-

sion in which all the variables are uncomplemented (by replacing each complemented

variable x′ by x⊕ 1). The PPRM expansion is canonical and has the form:

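\[
\begin{aligned}
f(x_1, x_2, \ldots, x_n) = {} & a_0 \oplus a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus {} \\
& a_{13} x_1 x_3 \oplus \cdots \oplus a_{n-1,n} x_{n-1} x_n \oplus \cdots \oplus {} \\
& a_{12 \ldots n} x_1 x_2 \cdots x_n
\end{aligned}
\tag{3.4}
\]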

Each cube in the expansion is referred to as a Reed-Muller term. Fixed-polarity

Reed-Muller (FPRM) expansion is a variant of the PPRM expansion and uses only

the complemented or uncomplemented form (but not both) of each variable. An n-input function has 2^n FPRM expansions, one for each polarity assignment. The PPRM expansion is just a special case

of the FPRM expansion because it only uses uncomplemented variables. The choice

of polarity for each variable influences the number of terms in the resulting FPRM

expansion. The following example shows four different expansions for an example

Boolean function: y = a + b′c. Generally, the PPRM expansion has the largest


number of terms since only positive polarity is used for each variable.

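\[
\begin{aligned}
\text{SOP:} \quad & y = a + b'c \\
\text{ESOP:} \quad & y = a \oplus a'b'c \\
\text{PPRM}(a, b, c): \quad & y = abc \oplus ac \oplus bc \oplus a \oplus c \\
\text{FPRM}(a', b', c): \quad & y = a'b'c \oplus a' \oplus 1
\end{aligned}
\tag{3.5}
\]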
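The expansions in Eq. (3.5) can be reproduced with the standard binary Möbius (Reed-Muller) transform of the truth table. This is a generic textbook technique used here only as a check; it is not claimed to be how RMRLS or RMDDS computes PPRM expansions. An FPRM expansion is obtained by applying the same transform after substituting complemented inputs. The function name `pprm_terms` is hypothetical.

```python
def pprm_terms(f, names):
    """Return the set of Reed-Muller terms of f as strings, using variable labels in names."""
    n = len(names)
    # Truth vector: bit i of index m is the value assigned to variable i.
    coeff = [f(*[(m >> i) & 1 for i in range(n)]) for m in range(2 ** n)]
    # In-place Moebius transform over GF(2).
    for i in range(n):
        for m in range(2 ** n):
            if m & (1 << i):
                coeff[m] ^= coeff[m ^ (1 << i)]
    terms = set()
    for m, bit in enumerate(coeff):
        if bit:
            term = "".join(names[i] for i in range(n) if (m >> i) & 1)
            terms.add(term or "1")
    return terms


def y(a, b, c):
    """Example function y = a + b'c."""
    return a | ((1 - b) & c)


def y_neg_ab(na, nb, c):
    """Same function expressed in terms of a' and b'."""
    return y(1 - na, 1 - nb, c)


print(sorted(pprm_terms(y, ["a", "b", "c"])))
print(sorted(pprm_terms(y_neg_ab, ["a'", "b'", "c"])))
```

For this function the positive-polarity expansion has five terms while the (a', b', c) polarity has three, which illustrates how the polarity choice changes the term count.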

3.2.4 Decision Diagrams

DDs are data structures that can efficiently represent Boolean functions. A DD is a

directed acyclic graph defined as follows [100, 101]:

Definition 1: A DD over variables X = {x1, x2, ..., xn} is a rooted directed acyclic graph G = (V, E) with vertex set V containing two types of vertices: non-terminal and terminal. A terminal vertex is labeled 0 or 1 and has no successors. A non-terminal vertex v is labeled with a variable x ∈ X, which is called the decision variable, and has exactly two successors denoted by low(v) and high(v) ∈ V.

Definition 2: A DD is free if each variable is encountered at most once on each path

in the DD from the root to a terminal vertex.

Definition 3: A DD is ordered if it is free and the variables are encountered in the

same order on every path from the root to a terminal vertex.

A binary decision diagram (BDD) decomposes a Boolean function into smaller sub-functions by using only the Shannon decomposition on every variable, whereas a Kronecker functional decision diagram (KFDD) may use either a Shannon or a Davio decomposition for each variable.

Shannon and Davio decompositions are defined as follows:

\[
\begin{aligned}
f &= x_i' f_i^0 \oplus x_i f_i^1 && \text{Shannon (S)} \\
f &= f_i^0 \oplus x_i f_i^2 && \text{positive Davio (pD)} \\
f &= f_i^1 \oplus x_i' f_i^2 && \text{negative Davio (nD)}
\end{aligned}
\tag{3.6}
\]

where $f_i^0$ is the 0-cofactor of f with respect to $x_i$ [i.e., $f_i^0 = f(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)$], $f_i^1$ similarly is the 1-cofactor, and $f_i^2 = f_i^0 \oplus f_i^1$.

Figure 3.3: Two implementations of a five-bit MCT gate: (a) with [#qubits, QC] = [5, 29] and (b) with [#qubits, QC] = [8, 16]. In (a), lines a, b, c, d control the target line e, which outputs abcd ⊕ e (QC = 29). In (b), three ancillary lines initialized to 0 output cd, bcd, and abcd ⊕ e (QC = 5 + 5 + 5 + 1)
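The three decompositions in Eq. (3.6) can be verified exhaustively for a small function. The sketch below builds the cofactors of a Python function and checks that the Shannon, positive Davio, and negative Davio forms all reproduce the original function. The helper names and the example function y = a + b'c (borrowed from Section 3.2.3) are illustrative choices.

```python
from itertools import product


def cofactors(f, i):
    """Return (f0, f1, f2): the 0-cofactor, the 1-cofactor, and their XOR, w.r.t. variable i."""
    def f0(*x):
        return f(*x[:i], 0, *x[i + 1:])

    def f1(*x):
        return f(*x[:i], 1, *x[i + 1:])

    def f2(*x):
        return f0(*x) ^ f1(*x)

    return f0, f1, f2


def decompositions_hold(f, n, i):
    """Check Shannon, positive Davio, and negative Davio on every input vector."""
    f0, f1, f2 = cofactors(f, i)
    for x in product((0, 1), repeat=n):
        xi, not_xi = x[i], 1 - x[i]
        shannon = (not_xi & f0(*x)) ^ (xi & f1(*x))
        pos_davio = f0(*x) ^ (xi & f2(*x))
        neg_davio = f1(*x) ^ (not_xi & f2(*x))
        if not (f(*x) == shannon == pos_davio == neg_davio):
            return False
    return True


def y(a, b, c):
    return a | ((1 - b) & c)


print(all(decompositions_hold(y, 3, i) for i in range(3)))
```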

3.3 Motivational Examples

Reversible circuit synthesis involves tradeoffs among synthesis time, #qubits, and QC. In most cases, no circuit simultaneously minimizes all three, as the following two examples illustrate.

3.3.1 Example 1: #qubits vs. QC

Fig. 3.3(a) shows an example of a five-bit MCT gate. It contains five qubits and no ancillary qubits. Its QC is 2^5 − 3 = 29. However, we can also implement the five-bit MCT gate as shown in Fig. 3.3(b). This requires eight qubits, including three ancillary qubits. Lines 5, 6 (garbage outputs e, cd) and 7 (garbage output bcd) realize intermediate values required for line 8 (the valid final output). The QC now reduces to only 16 (three Toffoli gates and one CNOT gate, i.e., 5 + 5 + 5 + 1). This shows how #qubits can be traded off for QC.
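The ancilla-based implementation can be simulated to confirm that line 8 carries abcd ⊕ e. The gate sequence below is an assumed reading of Fig. 3.3(b), chosen to match the stated line outputs (cd, bcd, abcd ⊕ e) and the cost breakdown 5 + 5 + 5 + 1; the figure itself may order or place the gates differently.

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff every control bit is 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def five_bit_mct_with_ancillas(a, b, c, d, e):
    """Lines 1-5 hold a, b, c, d, e; lines 6-8 are ancillary qubits initialized to 0."""
    s = (a, b, c, d, e, 0, 0, 0)
    s = mct(s, [2, 3], 5)  # Toffoli: line 6 = cd
    s = mct(s, [1, 5], 6)  # Toffoli: line 7 = bcd
    s = mct(s, [0, 6], 7)  # Toffoli: line 8 = abcd
    s = mct(s, [4], 7)     # CNOT:    line 8 = abcd XOR e
    return s


ok = all(
    five_bit_mct_with_ancillas(*x)[7] == (x[0] & x[1] & x[2] & x[3]) ^ x[4]
    for x in product((0, 1), repeat=5)
)
print(ok)
```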


    (a)   c  b  a    co  bo  ao        (b)   ao = a ⊕ 1
          0  0  0    0   0   1               bo = ac ⊕ b ⊕ c
          0  0  1    0   0   0               co = ab ⊕ ac ⊕ b
          0  1  0    1   1   1
          0  1  1    0   1   0
          1  0  0    0   1   1
          1  0  1    1   0   0
          1  1  0    1   0   1
          1  1  1    1   1   0

Figure 3.4: Truth table for Example 2

Figure 3.5: (a) Quick synthesis by direct placement of PPRM terms on ancillary lines, with [#qubits, QC] = [6, 19] and (b) RMRLS requiring more synthesis time, with [#qubits, QC] = [3, 11]. In (a), lines a, b, c are followed by three ancillary lines initialized to 1, 0, 0 that produce ao, bo, co. In (b), the intermediate expressions are (a ⊕ 1)c ⊕ b = ac ⊕ b ⊕ c and (a ⊕ 1)(ac ⊕ b ⊕ c) ⊕ c = ac ⊕ ab ⊕ b

3.3.2 Example 2: Circuit Size vs. Synthesis Time

The circuit size is determined by #qubits and QC. Fig. 3.4(a) shows the truth table

of a three-variable reversible function and Fig. 3.4(b) its PPRM expansion. We can

trivially synthesize the circuit (in almost no synthesis time) by placing the PPRM

terms on three ancillary lines, one for each output, yielding #qubits = 6 and QC =

19, as shown in Fig. 3.5(a). However, RMRLS yields the much better circuit shown

in Fig. 3.5(b), with #qubits = 3 and QC = 11. This requires more synthesis time.
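Both circuits of Example 2 can be checked against the PPRM expansions ao = a ⊕ 1, bo = ac ⊕ b ⊕ c, and co = ab ⊕ ac ⊕ b. The gate sequences below are assumed readings of Fig. 3.5 that are consistent with the stated costs (19 and 11) and with the intermediate expressions in the figure; they are a sketch, not a transcription of the drawn circuits.

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff every control bit is 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def spec(a, b, c):
    """(ao, bo, co) from the PPRM expansions."""
    return (a ^ 1, (a & c) ^ b ^ c, (a & b) ^ (a & c) ^ b)


def direct_placement(a, b, c):
    """One gate per PPRM term on ancillary lines initialized to 1, 0, 0 (QC = 1 + 7 + 11 = 19)."""
    s = (a, b, c, 1, 0, 0)
    s = mct(s, [0], 3)                                              # ao
    s = mct(mct(mct(s, [0, 2], 4), [1], 4), [2], 4)                 # bo
    s = mct(mct(mct(s, [0, 1], 5), [0, 2], 5), [1], 5)              # co
    return s[3:]


def compact_circuit(a, b, c):
    """Three gates on three lines (QC = 1 + 5 + 5 = 11)."""
    s = (a, b, c)
    s = mct(s, [], 0)      # NOT:     a  -> a XOR 1 = ao
    s = mct(s, [0, 2], 1)  # Toffoli: b  -> b XOR ao*c = bo
    s = mct(s, [0, 1], 2)  # Toffoli: c  -> c XOR ao*bo = co
    return s


inputs = list(product((0, 1), repeat=3))
print(all(direct_placement(*x) == spec(*x) == compact_circuit(*x) for x in inputs))
```

The compact circuit reuses the input lines as outputs, which removes the ancillary lines and the garbage outputs at the price of a search for a good gate sequence.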

3.4 RMDDS Algorithm

In order to synthesize reversible circuits under a user-defined optimization objective or constraints on #qubits, QC, or synthesis time, we propose a hybrid method, called RMDDS, that is inspired by RMRLS and DDS and integrates their best features. It can effectively and efficiently explore the design space for circuit synthesis. In the following, we first briefly introduce the algorithms and concepts involved in RMRLS and DDS and then describe RMDDS.

Figure 3.6: Synthesis flow of RMRLS (flowchart boxes: Initialization; L1: Pop search node from PQ; L2: Select factor, substitute vi ← vi ⊕ factor; Higher score?; Insert new search nodes; More factors available?; Synthesis complete or time out?; PQ empty?; End)

3.4.1 RMRLS

RMRLS [60] exploits the similarity in the functional expression of MCT gates and the

PPRM form. This similarity is evident from Eqs. (3.1) and (3.4). The synthesis flow

of RMRLS is shown in Fig. 3.6. The input to the algorithm is the PPRM expansion

of each output of the reversible function. In the initialization stage, a search node

that contains the PPRM expansions of all the outputs is pushed into a priority queue

(PQ). The node is initialized as the root node of an N-ary search tree. PQ maintains

a list of nodes sorted by their priorities and the search tree records the search space

of the function.


The algorithm then enters two hierarchical loops to synthesize MCT gates. The

first loop L1 iterates over search nodes and the second loop L2 iterates over valid

MCT gates within a search node. In each iteration of L1, the highest-priority node

is popped from PQ (if PQ is not empty). In each iteration of L2, factors in the

PPRM expansion are investigated. The rule for finding a suitable factor is as follows.

For each output function vout, we search for the PPRM terms that have the format

vout = v⊕ factor where the factor is a PPRM term that does not contain variable v.

For example, for the PPRM expansion aout = a⊕ ab⊕ ac⊕ bc⊕ 1, bc and 1 are valid

factors but ab and ac are not, because the output function is aout and both ab and

ac contain variable a. Wh
