Reversible and Quantum Circuit
Synthesis
Chia-Chun Lin
A Dissertation
Presented to the Faculty
of Princeton University
in Candidacy for the Degree
of Doctor of Philosophy
Recommended for Acceptance
by the Department of
Electrical Engineering
Adviser: Niraj K. Jha
June 2014
-
© Copyright by Chia-Chun Lin, 2014.
All rights reserved.
-
Abstract
This thesis presents five major tools for the synthesis of reversible and quantum
circuits. Quantum computation has the ability to solve several important problems
significantly faster than its classical counterpart. Because of this promise, much research
effort has been dedicated to discovering new quantum algorithms and technologies.
Quantum mechanics postulates that the time evolution of quantum states is
reversible. Thus, reversibility is a necessary condition for quantum computing. Hence,
we propose an effective method and tool, called RMDDS, for synthesizing reversible
circuits. Since the evolution of quantum states is determined by some primitive
physical operations, quantum computers implemented in different physical systems
have different costs. Therefore, we propose an optimized quantum gate library, called
QGLVP, for various physical machine descriptions.
To enhance synthesis efficiency, we introduce QLib, a quantum module library,
which contains scripts to generate quantum modules for many well-known quantum
algorithms.
A quantum system inevitably interacts with its environment, which leads to
errors and consequent failure of computation. To address this problem, we propose
FTQLS, a tool that synthesizes and optimizes fault-tolerant quantum circuits by using
logic identity rules for various physical machine descriptions.
Finally, we present a tool, called PAQCS, for physical design-aware fault-tolerant
quantum circuit synthesis. It effectively synthesizes quantum logic circuits into
quantum physical circuits, targeting different physical machine descriptions and quantum
error correction codes.
-
Acknowledgments
I would like to express my sincere appreciation to all those who made this thesis
possible.
I am deeply grateful for the guidance of my adviser, Prof. Niraj K. Jha. His
enthusiasm for innovative research and meticulous attention to detail greatly inspired
and influenced me. He showed me the right direction at every critical moment during
the development of this thesis and directed me all the way.
I also received much advice and guidance from Prof. Sun-Yuan Kung, Prof. Susmita
Sur-Kolay, and Prof. Amlan Chakrabarti. Discussions with them were always a
great source of new ideas. Without their encouragement and inspiration, this thesis
would not have been possible.
Thesis review is demanding work, and doing it under time pressure only makes
it harder. I would like to thank my dissertation readers, Prof. Niraj K. Jha, Prof.
Susmita Sur-Kolay, and Prof. Kaushik Sengupta, for their extensive efforts in polishing
this thesis. I am also thankful to Prof. Niraj K. Jha, Prof. Sun-Yuan Kung, and
Prof. Stephen A. Lyon for agreeing to be on my final public oral committee.
Profs. Bede Liu and Sharad Malik kindly offered me teaching opportunities in
their ELE482 and ELE206 courses. Those were valuable and enjoyable experiences.
I would like to thank my English tutor, Sandra Richter, and my host family, Anne
Remillard, for their help and encouragement during my PhD life, and for sharing
American culture and traditions with me.
I appreciate the help I received from Chun-Yi Lee, Chunxiao Li, Meng Zhang,
Jun-Wei Chuah, Ting-Jung Lin, Sourindra Chaudhuri, Aoxiang Tang, Xianmin Chen,
Yang Yang, and Debajit Bhattacharya over the past few years.
Last but not least, I am grateful to my father, mother, brother, and wife, An
Dai, and to all my friends for their support during my PhD study. They were unbelievably
encouraging and forbearing. They are the reason I keep going.
-
List of Abbreviations
BDD binary decision diagram
CTL Clifford plus T library
DDS decision diagram synthesis
FT fault-tolerant
FTQLS fault-tolerant quantum logic synthesis
FTS fault-tolerant set
IT ion trap
KFDD Kronecker functional decision diagram
LP linear photonics
NA neutral atom
NP nonlinear photonics
PAQCS physical design-aware fault-tolerant quantum circuit synthesis
PMD physical machine description
QC quantum cost
QD quantum dot
QECC quantum error correction code
QGLVP quantum gate library for various physical machine descriptions
QLib quantum module library
RMDDS Reed-Muller decision diagram synthesis
RMRLS Reed-Muller reversible logic synthesis
SC superconducting
SKA Solovay-Kitaev algorithm
STA skipping table algorithm
ULB universal logic block
-
Contents
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
List of Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction 1
1.1 Quantum System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Quantum Circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Quantum Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Reversible Computing . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5 FT Quantum Computation . . . . . . . . . . . . . . . . . . . . . . . . 9
1.6 Physical Machine Description . . . . . . . . . . . . . . . . . . . . . . 10
1.7 Physical Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.8 Synthesis Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.9 Contributions of the Thesis . . . . . . . . . . . . . . . . . . . . . . . 15
1.10 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2 Related Work 17
2.1 Reversible Logic Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.1 Exact Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.2 Heuristic Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2 Quantum Logic Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.1 Logic Identities . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.2 Quantum Shannon Decomposition . . . . . . . . . . . . . . . . 22
2.2.3 Quantum Compilation . . . . . . . . . . . . . . . . . . . . . . 22
2.3 Physical Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.3.1 Tiled Quantum Architecture . . . . . . . . . . . . . . . . . . . 23
2.3.2 Physical Synthesis . . . . . . . . . . . . . . . . . . . . . . . . 25
3 RMDDS: Reed-Muller Decision Diagram Synthesis for Reversible Logic 26
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2.1 Reversible Functions . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.2 Reversible Gates . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2.3 Reed-Muller Expansion . . . . . . . . . . . . . . . . . . . . . . 32
3.2.4 Decision Diagrams . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Motivational Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3.1 Example 1: #qubits vs. QC . . . . . . . . . . . . . . . . . . . 34
3.3.2 Example 2: Circuit Size vs. Synthesis Time . . . . . . . . . . 35
3.4 RMDDS Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.4.1 RMRLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4.2 DDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.3 RMDDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4.4 Synthesis Example . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4 QLib: Quantum Module Library 59
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.2 Quantum Gates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2.1 One-qubit Gates . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.2.2 Two-qubit Gates . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2.3 Reversible Gates . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3 File Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.4 Quantum Module Library . . . . . . . . . . . . . . . . . . . . . . . . 68
4.4.1 QFT/IQFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.4.2 Bernstein-Vazirani Search (BVS) . . . . . . . . . . . . . . . . 69
4.4.3 Grover’s Search . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4.4 Arithmetic Circuits . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 Impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5 QGLVP: Optimized Quantum Gate Library for Various PMDs 84
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.2 Primitive Quantum Gates . . . . . . . . . . . . . . . . . . . . . . . . 87
5.2.1 One-qubit Gates . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.2.2 Two-qubit Gates . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.3 Identity Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.3.1 One-qubit Identity Rules . . . . . . . . . . . . . . . . . . . . . 93
5.3.2 Two-qubit Identity Rules . . . . . . . . . . . . . . . . . . . . . 96
5.3.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.4 Quantum Gate Library . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.4.1 RY(θ) Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.4.2 RZ(θ) Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4.3 H Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4.4 CZ Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.4.5 CNOT Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.4.6 CP Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.4.7 G Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.4.8 iSW Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.4.9 SWAP Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.4.10 ZENO Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4.11 Toffoli Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
5.4.12 Peres Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.4.13 Fredkin Gate . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.4.14 Cost Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.4.15 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.5 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
6 FTQLS: Fault-tolerant Quantum Logic Synthesis 118
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
6.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.2.1 FT Quantum Computation . . . . . . . . . . . . . . . . . . . 121
6.2.2 FTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.3 Synthesis Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.3.1 Technology Mapping . . . . . . . . . . . . . . . . . . . . . . . 126
6.3.2 Quantum Compilation . . . . . . . . . . . . . . . . . . . . . . 126
6.4 Circuit Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.4.1 Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.4.2 Simplify . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.4.3 Interchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.4.4 Commute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.4.5 Optimization Process . . . . . . . . . . . . . . . . . . . . . . . 133
6.4.6 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.4.7 Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7 PAQCS: Physical Design-aware Fault-tolerant Quantum Circuit Synthesis 147
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
7.2 Swap Based Quantum Tile Architecture . . . . . . . . . . . . . . . . 150
7.3 Motivational Example . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.4 Physical Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.4.1 Qubit Placement . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.4.2 Channel Routing . . . . . . . . . . . . . . . . . . . . . . . . . 158
7.4.3 PAQCS: Synthesis Flow . . . . . . . . . . . . . . . . . . . . . 163
7.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
7.5.1 Placement and Routing . . . . . . . . . . . . . . . . . . . . . . 165
7.5.2 PAQCS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
7.6 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
8 Conclusions and Future Research 180
Bibliography 184
-
List of Tables
3.1 Experimental results for reversible functions under #qubits minimization 48
3.2 Experimental results for reversible functions under QC minimization . 49
3.3 Percentage increase in #qubits and decrease in QC from the #qubits to QC optimization objective for reversible functions 51
3.4 Comparison with the best results . . . . . . . . . . . . . . . . . . . . 52
3.5 Experimental results for irreversible functions under #qubits minimization 56
3.6 Experimental results for irreversible functions under QC minimization 57
3.7 Percentage increase in #qubits and decrease in QC from the #qubits to QC optimization objective for irreversible functions 58
4.1 Syntax for quantum gates . . . . . . . . . . . . . . . . . . . . . . . . 68
4.2 Impact of synthesis scripts . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1 Supported operations in different PMDs . . . . . . . . . . . . . . . . 86
5.2 One-qubit identity rules (length=2) . . . . . . . . . . . . . . . . . . . 95
5.3 One-qubit identity rules (length=3) . . . . . . . . . . . . . . . . . . . 96
5.4 H gate identity rules . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.5 RXY gate identity rules . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6 Asqu gate identity rules . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.7 Two-qubit identity rules . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.8 Execution cycles for each operation in QD . . . . . . . . . . . . . . . 102
5.9 Execution cycles for each operation in SC . . . . . . . . . . . . . . . . 102
5.10 Number of operations of each gate on every PMD . . . . . . . . . . . 114
5.11 Number of execution cycles of each gate on every PMD . . . . . . . . 115
5.12 Cost of Grover’s algorithm for search value |01⟩ . . . . . . . . . . . . 116
6.1 Synthesis result for RZ(π/64) using SKA and STA . . . . . . . . . . . . 123
6.2 Conversion between one-qubit FTS and CTL . . . . . . . . . . . . . . 124
6.3 Interchange rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.4 Experimental results for plain adder circuits . . . . . . . . . . . . . . 139
6.5 Experimental results for QFT circuits with only OPT-3 optimization 140
6.6 Experimental results for QFT circuits with three-stage optimization and the improvement 142
6.7 Using CTL and FTS for QFT circuits in the LP system . . . . . . . 143
6.8 Synthesis results part I . . . . . . . . . . . . . . . . . . . . . . . . . . 144
6.9 Synthesis results part II . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.10 Average percentage reductions for different PMDs . . . . . . . . . . . 145
6.11 Optimization results for a 4-bit modular adder based on SKA and STA 146
7.1 Physical #ops for three QECCs and two PMDs . . . . . . . . . . . . 166
7.2 Physical #cycles for three QECCs and two PMDs . . . . . . . . . . . 167
7.3 Synthesis results for QLib benchmarks part I . . . . . . . . . . . . . . 169
7.4 Synthesis results for QLib benchmarks part II . . . . . . . . . . . . . 170
7.5 Synthesis results for RevLib benchmarks part I . . . . . . . . . . . . 171
7.6 Synthesis results for RevLib benchmarks part II . . . . . . . . . . . . 172
7.7 Comparisons with synthesis results presented in other work . . . . . . 175
7.8 Average improvements due to PAQCS . . . . . . . . . . . . . . . . . 176
-
List of Figures
1.1 Bloch sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2 An example quantum circuit . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 High-level synthesis flow . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.1 Three-level tile hierarchy, each kth-level tile is composed of nine (k − 1)th-level tiles 24
2.2 Two types of tiled architecture. (a) movement based and (b) swap based 24
3.1 (a) A 2-to-1 irreversible OR gate/function, and (b) a 3-to-3 reversible OR gate obtained by adding a constant input (c1) and two garbage outputs (g1, g2) 30
3.2 (a) n-bit MCT, (b) NOT, (c) CNOT, and (d) Toffoli gate . . . . . . . 30
3.3 Two implementations of a five-bit MCT gate: (a) with [#qubits,QC] = [5,29] and (b) with [#qubits,QC] = [8,16] 34
3.4 Truth table for Example 2 . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 (a) Quick synthesis by direct placement of PPRM terms on ancillary lines, with [#qubits,QC] = [6,19] and (b) RMRLS requiring more synthesis time, with [#qubits,QC] = [3,11] 35
3.6 Synthesis flow of RMRLS . . . . . . . . . . . . . . . . . . . . . . . . 36
3.7 MCT cascades for various decompositions of a non-shared vertex in a DD 38
3.8 MCT cascades for various decompositions of a shared vertex in a DD through the addition of an ancillary bit 39
3.9 Synthesis flow of RMDDS . . . . . . . . . . . . . . . . . . . . . . . . 40
3.10 Pseudocode of RMDDS . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.11 A synthesis example . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.12 Distribution of solutions for 0410184 when using the NCTFP library. 50
3.13 Solutions for hwb6 58 under relaxed synthesis times when using the NCTFP library 53
4.1 Two-qubit quantum gates: (a) CNOT, (b) CP, (c) CZ, and (d) SWAP 65
4.2 An example circuit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3 An example output file . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.4 A four-qubit QFT circuit . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.5 An EWS module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6 The BVS circuit for n = 3 and a = 5 . . . . . . . . . . . . . . . . . . 71
4.7 The circuit structure for Grover’s search . . . . . . . . . . . . . . . . 72
4.8 The circuit for diffusion operator D . . . . . . . . . . . . . . . . . . . 72
4.9 The circuit structure of Cuccaro’s adder (four-qubit) . . . . . . . . . 74
4.10 (a) MAJ, and (b) UMA modules for Cuccaro’s adder . . . . . . . . . 74
4.11 The circuit structure of Draper’s adder (four-qubit) . . . . . . . . . . 75
4.12 (a) CMAJ, and (b) CUMA modules for controlled Cuccaro’s adder . . 75
4.13 A four-qubit multiplier . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.14 Rotate left by 1 (≪ 1) . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.15 A modular adder (a+b)%N . . . . . . . . . . . . . . . . . . . . . . . 77
4.16 A modular adder for constant a and N . . . . . . . . . . . . . . . . . 78
4.17 A modular subtracter for constant a and N . . . . . . . . . . . . . . 78
4.18 Constant modular multiplier . . . . . . . . . . . . . . . . . . . . . . 79
4.19 Restored modular multiplier . . . . . . . . . . . . . . . . . . . . . . . 80
4.20 Modular exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.1 Symbols of (a) iSW and (b) G gates . . . . . . . . . . . . . . . . . . . 91
5.2 CZ implementation: (a) and (b) in the NP system, (c) in the IT system 104
5.3 CNOT implementation: (a) from H gate, (b) SC, NA, (c) QD, and (d) IT systems 104
5.4 CP gate construction: (a) from RZ and CRZ gates, and for the (b) LP, NP, (c) QD, (d) NA, and (e) IT systems 106
5.5 G gate construction: (a) from RZ and CRZ gates, and for the (b) LP, NP, (c) QD, (d) NA, and (e) SC systems 106
5.6 iSW gate construction: (a) from CRX, (b) from CZ, applicable to QD, NA, and LP, (c) NP, and (d) IT systems 107
5.7 SWAP gate construction: (a) NP, (b) NA, (c) QD, (d) IT, and (e) SC systems 108
5.8 ZENO gate construction: (a) from the SWAP gate, and (b) for the QD systems 109
5.9 Several ways to construct a CCZ gate using five two-qubit gates . . . 110
5.10 Two ways to construct a CS gate . . . . . . . . . . . . . . . . . . . . 111
5.11 Construction of a CA gate . . . . . . . . . . . . . . . . . . . . . . . . 111
5.12 Toffoli gate construction based on (a) CNOT and (b) CZ gates . . . . 111
5.13 Toffoli gate construction: (a) basic implementation, (b) suitable for the SC system (the dashed box is a CV gate), (c) suitable for the IT system 112
5.14 (a) Peres and (b) Fredkin gates . . . . . . . . . . . . . . . . . . . . . 112
5.15 Peres gate construction: (a) basic implementation, (b) suitable for the SC system, (c) suitable for the IT system 113
5.16 (a) Implementation of Grover’s algorithm for search value |01⟩, and (b) optimized circuit for the NP system (the gates in the dashed box can be merged into an Asqu operation; the dashed arrow indicates the critical path) 116
6.1 FT implementations: (a) RZ(π/4), (b) RX(π/4), and (c) RY(π/4). “↗” indicates a measurement gate 125
6.2 Synthesis flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.3 Conversion of non-FT two-qubit gates to those in FTS(2): (a) CP, (b) G, and (c) iSW 127
6.4 Synthesis flow of non-FT one-qubit gates based on the FT table . . . 128
6.5 Application flow for the simplify rule . . . . . . . . . . . . . . . . . . 131
6.6 Commute rules for (a) CNOT, (b) CP, and (c) G gates . . . . . . . . 132
6.7 Pseudocode of the optimization process . . . . . . . . . . . . . . . . . 134
6.8 An example of window selection . . . . . . . . . . . . . . . . . . . . . 135
6.9 An optimization example . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.10 Verification of the optimization process . . . . . . . . . . . . . . . . 137
7.1 The architecture creates communication channels through the use of swap chains, which exchange the states of adjacent qubits. Application of a swap chain, consisting of S(q8,q5) and S(q5,q2), to the three qubits is equivalent to the application of S(s8,s5) and S(s8,s2) to the corresponding qubit states. Note that though their qubit states are swapped, the qubit positions remain fixed 151
7.2 A motivational example. (a) A logic circuit, where q0-q5 and s0-s5 refer to the name and state of qubits, respectively. (b) When qubits are placed trivially, #sw = (costB,costU,costR) = (8,6,8). (c) When channel routing is carefully taken into account, #sw = (8,3,4). (d) When both qubit placement and routing are carefully taken into account, #sw reduces to (4,1,2) 152
7.3 A placement example. (a) Initial state. Placement of the (b) v0, (c) v2, (d) v1, (e) v3, (f) v4, and (g) v5 qubits 157
7.4 A routing example. (a) Input circuit and its qubit and state layouts. (b)-(e) Routing for C(s0,s8). (f)-(h) Routing for C(s0,s5), C(s3,s6), and C(s5,s7), respectively. (i)-(j) Routing for C(s2,s0). (k) Routing for C(s5,s8). (l)-(m) Routing for C(s7,s6). (n) Recovery of the qubit states to their original position: pop ga stack with S(s0,s5), S(s4,s0), and S(s1,s0). (o) Recovered qubits states 161
7.5 Synthesis flow of PAQCS . . . . . . . . . . . . . . . . . . . . . . . . . 164
7.6 #ops for the Steane code . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.7 #cycles for the Steane code . . . . . . . . . . . . . . . . . . . . . . . 177
7.8 #ops for the Bacon-Shor code . . . . . . . . . . . . . . . . . . . . . . 177
7.9 #cycles for the Bacon-Shor code . . . . . . . . . . . . . . . . . . . . . 178
7.10 #ops for the Knill code . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.11 #cycles for the Knill code . . . . . . . . . . . . . . . . . . . . . . . . 179
-
Chapter 1
Introduction
The lure of quantum computing comes from the promise that it can significantly
outperform its classical counterpart when solving some important problems. Large-scale
quantum computers will be able to solve certain problems much more quickly than
any classical computer using the best currently known algorithms. An example is integer
factorization using Shor’s algorithm [1], which is capable of breaking RSA encryption [2] in
polynomial time. There exist quantum algorithms, such as Grover’s algorithm, which
run faster than any possible classical algorithm. Given sufficient computational
resources, a classical computer could be made to simulate any quantum algorithm [3].
However, the state of 100 quantum bits (qubits), for example, would
already be too large to be represented on a classical computer because it would
require $2^{100}$ complex values to be stored. Because of this computational power, quantum
computation has attracted considerable interest.
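As a rough illustration of this scaling, the following Python sketch estimates the memory needed to store every amplitude of an n-qubit state. It assumes 16 bytes per complex value (two 64-bit floating-point numbers); this figure and the function name are assumptions made for the example and do not come from the thesis.

```python
# Rough memory estimate for storing the full state of n qubits classically.
# Assumption: each complex amplitude takes 16 bytes (two 64-bit floats).
BYTES_PER_AMPLITUDE = 16


def state_vector_bytes(num_qubits: int) -> int:
    """Return the bytes needed to hold all 2**num_qubits complex amplitudes."""
    return (2 ** num_qubits) * BYTES_PER_AMPLITUDE


if __name__ == "__main__":
    for n in (10, 30, 50, 100):
        print(f"{n:3d} qubits: {state_vector_bytes(n):.3e} bytes")
```

Under this assumption, 30 qubits already require 16 GiB, and every additional qubit doubles the requirement.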
Quantum computation is the study of the information processing that can be
accomplished using quantum mechanical systems [3, 4]. It is tempting to say that the
operations of a quantum computer are governed by the laws of quantum mechanics.
However, although all physical operations are governed by quantum
mechanics, we would not say that our desktop computers are quantum computers. In general, a
quantum computer is one whose operations are governed by certain very special
transformations of its internal states. The laws of quantum mechanics allow these
peculiar transformations to take place under very carefully controlled conditions.
In a quantum computer, the logical operations must have no physical interactions
whatsoever that are not under the complete control of the system. All other interactions
may introduce potentially catastrophic disruptions into the operation of a quantum
computer. Such unwanted interactions between what matters for the computation and what
does not result in decoherence, which is fatal to quantum computation.
To avoid decoherence, quantum operations generally cannot be carried out in
macroscopic physical systems, because most such systems cannot be isolated from the
external environment. Such isolation can be achieved if the qubits are operated
upon in microscopic physical systems. These microscopic systems must be decoupled
from their surroundings except for the completely controlled interactions that are
associated with the computation process itself.
Quantum mechanics postulates that all quantum operations (except measurement,
as discussed later) are invertible, and thus all valid quantum gates are reversible. In
fact, quantum computers display an important part of their magic through reversible
operations, which transform the initial state of qubits into its final form, using only
processes whose action can be inverted. There is only a single irreversible operation
in quantum computation, the measurement, which is the only way to extract useful
information from qubits after their state has acquired its final form. The extracted
information is then processed by a classical computer.
In general, to implement quantum algorithms, we need both quantum and classical
computers. The quantum computer executes the desired sequence of quantum
operations and the classical computer provides the control for these operations and
also performs post-processing of the computed results.
A quantum algorithm is executed by a quantum circuit, which comprises a
sequence of quantum gates. These quantum gates may be decomposed into several
primitive quantum operations, supported by the corresponding quantum physical
machine descriptions (PMDs). In addition, because quantum systems are delicate and
difficult to control, fault-tolerant (FT) quantum circuits are needed for practical
implementation. Therefore, several quantum error correction codes (QECCs) have been
proposed to facilitate FT computation. For physical quantum circuit realization, the
physical addresses of qubits need to be considered in order to honor the physical
distance constraint.
The synthesis of quantum circuits is difficult, and several factors must be
considered, such as the number of primitive operations (#ops), the number of critical
execution cycles (#cycles), the QECC, and the target PMD. In this thesis, we provide
several methodologies for quantum circuit synthesis and optimization targeting different
metrics. These techniques are scalable to any quantum algorithm. Their integration
yields optimized physical quantum circuits for various PMDs at the end of the synthesis
process.
In the remainder of this chapter, some background knowledge about quantum computing is provided,
followed by a high-level view of quantum circuit synthesis.
1.1 Quantum System
In quantum computation, a qubit refers to a unit of quantum information [3, 4]. It
has very different characteristics from a classical bit. For example, a classical bit may
only take on two distinct values: 0 or 1. However, a qubit does not suffer from this
limitation.
In a two-state quantum system, a qubit |ψ⟩ can be described by [3]:
\[ |\psi\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle \tag{1.1} \]
where |·⟩ is the ket vector in Dirac notation, which indicates that |0⟩ and |1⟩ are
column vectors corresponding to:
\[ |0\rangle \equiv \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad |1\rangle \equiv \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{1.2} \]
A qubit is a superposition of |0⟩ and |1⟩, which means that the qubit |ψ⟩ exists
in the two states simultaneously. $\alpha_0$ and $\alpha_1$ are complex coefficients. They represent
the amplitudes of |0⟩ and |1⟩, respectively, with the normalization constraint
$|\alpha_0|^2 + |\alpha_1|^2 = 1$.
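The normalization constraint can be illustrated with a small Python sketch. The helper names `normalize` and `is_normalized` are hypothetical and serve only this example; the sketch rescales an arbitrary pair of complex amplitudes so that the squared magnitudes sum to one.

```python
import math


def normalize(alpha0: complex, alpha1: complex) -> tuple[complex, complex]:
    """Scale (alpha0, alpha1) so that |alpha0|^2 + |alpha1|^2 = 1."""
    norm = math.sqrt(abs(alpha0) ** 2 + abs(alpha1) ** 2)
    if norm == 0:
        raise ValueError("the zero vector is not a valid qubit state")
    return alpha0 / norm, alpha1 / norm


def is_normalized(alpha0: complex, alpha1: complex, tol: float = 1e-12) -> bool:
    """Check the normalization constraint on the two amplitudes."""
    return abs(abs(alpha0) ** 2 + abs(alpha1) ** 2 - 1.0) < tol


# Example: unnormalized amplitudes 3 and 4i for |0> and |1>, respectively.
a0, a1 = normalize(3, 4j)
print(a0, a1, is_normalized(a0, a1))
```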
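The qubit in Eq. (1.1) can also be written in terms of azimuth and elevation angles
as:
\[ |\psi\rangle = e^{i\gamma}\left(\cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle\right) \tag{1.3} \]
where $\gamma$, $\theta$, and $\phi$ are all real numbers. The global phase $e^{i\gamma}$ is physically
indistinguishable and thus can be ignored. Therefore, we can further simplify it to the
form:
\[ |\psi\rangle = \cos\frac{\theta}{2}|0\rangle + e^{i\phi}\sin\frac{\theta}{2}|1\rangle \tag{1.4} \]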
We can visualize Eq. (1.4) on the surface of a three-dimensional sphere, known as the
Bloch sphere, as shown in Fig. 1.1. The north and south poles represent states |0⟩
and |1⟩, respectively. A point on the Bloch sphere represents a superposition of the
two states. A valid single-qubit operation represents a rotation on the Bloch sphere.
The elevation and azimuth angles (θ, ϕ) lie in the ranges 0 ≤ θ ≤ π and
0 ≤ ϕ < 2π, so that θ = π corresponds to the south pole |1⟩.
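To make the Bloch sphere picture concrete, the sketch below maps a pair of angles to the amplitudes of Eq. (1.4) and to a point on the unit sphere. The Cartesian mapping (sin θ cos ϕ, sin θ sin ϕ, cos θ) is the standard spherical-coordinate convention and is assumed here, since the text does not write it out; the function names are hypothetical.

```python
import cmath
import math


def state_from_angles(theta: float, phi: float) -> tuple[complex, complex]:
    """Amplitudes (alpha0, alpha1) of Eq. (1.4) for angles theta and phi."""
    return complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)


def bloch_vector(theta: float, phi: float) -> tuple[float, float, float]:
    """Cartesian (x, y, z) point on the unit sphere for the same angles."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


# theta = 0 is the north pole |0>, theta = pi is the south pole |1>.
north = state_from_angles(0.0, 0.0)
south = state_from_angles(math.pi, 0.0)
equator = state_from_angles(math.pi / 2, 0.0)  # equal-weight superposition
print(north, south, equator)
```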
Figure 1.1: Bloch sphere (labels: axes x, y, z; angles θ and φ; states |0⟩, |1⟩, and |ψ⟩)
Since a physical system changes over time, a quantum state |ψ⟩ is actually a
function of time: |ψ(t)⟩. Quantum mechanics postulates that the evolution of a quantum
state of a closed quantum system can be described by Schrödinger’s equation [3]:
\[ i\hbar\frac{\partial|\psi(t)\rangle}{\partial t} = \hat{H}|\psi(t)\rangle \tag{1.5} \]
where $\hbar$ is Planck’s constant divided by $2\pi$ and $\hat{H}$ is a Hermitian operator, called
the Hamiltonian, which represents the total observable energy of the system. For
simplicity, if we consider $\hat{H}$ to be independent of time, the equation can be solved as:
\[ |\psi(t)\rangle = e^{-i\hat{H}(t-t_0)/\hbar}|\psi(t_0)\rangle = \hat{U}|\psi(t_0)\rangle \tag{1.6} \]
where $\hat{U}$ is a unitary operator, $\hat{U}^{\dagger}\hat{U} = I$, and $I$ is the identity operator. Hence, all
valid quantum operations are unitary and thus reversible. Reversibility is a necessary
condition for quantum computing.
The evolution of an isolated quantum system with a finite number of states can
be described by a unitary matrix. An n-qubit operation is represented by a $2^n \times 2^n$
unitary matrix.
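A minimal numerical check of Eq. (1.6) is sketched below for one hypothetical single-qubit Hamiltonian, $\hat{H} = (\hbar\omega/2)X$, where X is the NOT (Pauli-X) matrix. This Hamiltonian, the parameter values, and the helper names are assumptions chosen for illustration. Because X·X = I, the exponential has a closed form, and the sketch verifies that the resulting operator is unitary.

```python
import math

Matrix = list[list[complex]]


def evolution_operator(omega: float, t: float) -> Matrix:
    """U = exp(-i*H*t/hbar) for the example Hamiltonian H = (hbar*omega/2)*X.

    Because X*X = I, the exponential has the closed form
    cos(omega*t/2)*I - i*sin(omega*t/2)*X.
    """
    c = math.cos(omega * t / 2)
    s = math.sin(omega * t / 2)
    return [[complex(c), -1j * s], [-1j * s, complex(c)]]


def dagger(m: Matrix) -> Matrix:
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_identity(m: Matrix, tol: float = 1e-12) -> bool:
    """True if m equals the identity matrix up to the tolerance."""
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for i in range(len(m)) for j in range(len(m)))


U = evolution_operator(omega=1.0, t=0.7)
print(is_identity(matmul(dagger(U), U)))  # unitarity check: U^dagger U = I
```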
Figure 1.2: An example quantum circuit (gates A and B followed by gate C along the time flow; input |ψ⟩, output C(B⊗A)|ψ⟩)
1.2 Quantum Circuit
A quantum circuit comprises a sequence of quantum gates. An n-qubit circuit is
depicted with n horizontal lines, with time flowing from left to right. A quantum
computation typically requires the application of several quantum gates, sequentially
or in parallel, to various subsets of qubits. The net unitary transformation they
perform can be expressed in a matrix product form, which is a cascade of the unitary
matrices of the corresponding quantum gates, using two rules: dot product and tensor
product [3, 4].
Dot Product
The dot product is the same as matrix multiplication. If several gates act on the
same subset of qubits, then those gates must be applied in series and their overall
effect computed by the dot product. In a quantum circuit, if operation A acts before
B, their overall effect is computed by their dot product in reverse order, i.e., B · A.
Tensor Product
If adjacent gates within a quantum circuit act on independent subsets of qubits, then
they can be applied simultaneously in parallel. The net effect of the parallel gates is
evaluated with the tensor product, denoted by “⊗”.
An example is shown in Fig. 1.2. The system computes C · [B ⊗A], where A and
B are 2× 2 matrices and C is a 4× 4 matrix.
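The two composition rules can be illustrated with a small NumPy sketch. The concrete
gates are assumptions chosen only for illustration: B is taken to be a Hadamard gate,
A the identity, and C a CNOT gate whose control is the first qubit. Fig. 1.2 itself
does not fix these choices.

```python
import numpy as np

B = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # assumed gate on the first qubit
A = np.eye(2)                                   # assumed gate on the second qubit
C = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # assumed two-qubit gate (CNOT)

parallel = np.kron(B, A)   # tensor product: B and A act on disjoint qubits
net = C @ parallel         # matrix product in reverse order: C is applied last

psi_in = np.array([1.0, 0.0, 0.0, 0.0])   # |00>
psi_out = net @ psi_in
```

With these particular gates, the output is the equal superposition of |00⟩ and |11⟩.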
As indicated earlier, all quantum operations are unitary and thus invertible. The
inverse of a unitary matrix U is U−1 = U †, which is its conjugate transpose. Therefore,
obtaining the inverted circuits is straightforward, i.e., reverse the order of the gates
and use the conjugate transpose of each gate. For example, the inverse circuit of the
one in Fig. 1.2 is [B† ⊗ A†] · C†.
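The inversion rule can be verified numerically in the same spirit. In this sketch the
gates are again assumptions (A is the phase gate S, B a Hadamard gate, and C a CNOT
gate); S is chosen because it is not its own inverse, so the conjugate transpose
matters.

```python
import numpy as np

A = np.array([[1, 0], [0, 1j]])                    # assumed gate A (phase gate S)
B = np.array([[1, 1], [1, -1]]) / np.sqrt(2)       # assumed gate B (Hadamard)
C = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=complex)        # assumed gate C (CNOT)

def dagger(m):
    """Conjugate transpose of a matrix."""
    return m.conj().T

forward = C @ np.kron(B, A)                           # circuit of Fig. 1.2
inverse = np.kron(dagger(B), dagger(A)) @ dagger(C)   # reversed order, daggered gates
round_trip = inverse @ forward

# Reversing the order without taking conjugate transposes does not undo the circuit.
wrong = np.kron(B, A) @ C @ forward
```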
When two physical systems are treated as one combined system, the state space
of the combined physical system is the tensor product H = H1 ⊗ H2, where H1 and
H2 are the state spaces of the component subsystems. It is important to note that the
state of a many-qubit composite system cannot always be decomposed as a product of
the states of its component subsystems. In this case, we say that the qubits are
entangled. Entanglement is a physical phenomenon that occurs when groups of particles
are generated or interact in such a way that the quantum state of each particle cannot
be described independently of the others.
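For two qubits, whether a state decomposes into a product can be tested by arranging
its four amplitudes in a 2 × 2 matrix: the state is a product state exactly when that
matrix has rank one. The following sketch uses this standard criterion; the function
name is_product_state is hypothetical.

```python
import numpy as np

def is_product_state(state, tol=1e-12):
    """True if a two-qubit state factors into a tensor product of one-qubit states."""
    coeffs = np.asarray(state).reshape(2, 2)   # row: first qubit, column: second qubit
    return np.linalg.matrix_rank(coeffs, tol=tol) == 1

zero = np.array([1.0, 0.0])
plus = np.array([1.0, 1.0]) / np.sqrt(2)

product = np.kron(zero, plus)                        # |0> tensor |+>
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>) / sqrt(2)
```

Here product factors into one-qubit states, whereas bell does not and is therefore
entangled.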
1.3 Quantum Algorithm
A quantum algorithm is one that runs on a realistic model of a quantum computer [4].
Just as a classical algorithm is a finite sequence of instructions, where each step
or instruction can be performed on a classical computer, a quantum algorithm is a
sequence of instructions, where each instruction is performed on a quantum computer.
Although all classical algorithms can also be performed on a quantum computer, the
term quantum algorithm is typically used for those algorithms that are inherently
quantum or use some essential feature of quantum computation, such as quantum
superposition or entanglement [5].
Quantum algorithms are interesting because they might be able to solve some
problems much faster than classical algorithms. The best-known algorithms are
Shor’s algorithm for factoring and Grover’s algorithm for searching an unstructured
database or an unordered list. Shor’s algorithm runs exponentially faster than the
best known classical algorithm for factoring. Grover’s algorithm runs quadratically
faster than the best possible classical algorithm for the same task.
In general, a quantum algorithm consists of both classical and quantum func-
tions [3, 6]. Classical functions are described in the classical domain, i.e., the mapping
of input to output states consists of binary values. Since quantum mechanics requires
the time-evolution of quantum states to be reversible, the classical functions need to
be reversible. Their circuit implementations consist of a sequence of reversible gates
that implement reversible operations. Quantum functions are described in the quan-
tum domain, i.e., the mapping of input to output states is described by a unitary
transformation matrix with complex values. Their circuit realizations are composed
of quantum gates.
1.4 Reversible Computing
A function is reversible if its mapping from inputs to outputs is bijective and thus
information lossless. Therefore, all reversible computations can be computed forward
or backward (inversely) without loss of information. The study of reversible logic
has historically been motivated by power consumption. Landauer’s principle (or the von
Neumann-Landauer limit) states that regardless of the technology chosen for implementing
a circuit, when a bit of information is erased, at least KT ln 2 joules of energy
is dissipated, where K is the Boltzmann constant and T is the operating temperature
in kelvin [7]. In 1973, Bennett [8] showed that zero-energy computation is
possible only if the computation is reversible, and a recent experiment has validated
this perspective [9].
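To give a sense of scale, the bound KT ln 2 can be evaluated directly. The sketch below
assumes an operating temperature of 300 K purely as an example; the function name
landauer_limit is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K (exact SI value)

def landauer_limit(temperature_kelvin):
    """Minimum energy in joules dissipated when one bit is erased at the given T."""
    return BOLTZMANN * temperature_kelvin * math.log(2)

e_room = landauer_limit(300.0)   # assumed room-temperature operating point
```

At this assumed temperature, the bound is on the order of 10^-21 joules per erased bit.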
Reversible computing not only attracts interest from the point of view of power
consumption, but also finds many applications in other areas. One of the most important
applications of reversible computing is quantum computation [3]. In quantum
computation, reversible functions are mostly used for generating oracles. An oracle
can be seen as the description of the problem to solve. The reversible functions
provide a high-level description of quantum algorithms. These functions are synthesized
into reversible circuits. The reversible gates in these circuits are then further
decomposed into low-level quantum circuits for physical realization on a target PMD.
1.5 FT Quantum Computation
Quantum systems are fragile because when they inevitably interact with the environ-
ment, the information stored in the system decoheres, thus resulting in the failure of
computation. Hence, quantum circuits need to be FT in a practical implementation.
FT quantum computation implies that quantum circuits perform correctly even when
errors occur.
Many QECCs have been proposed to facilitate FT quantum computation, such as the
Steane code [3, 10], Bacon-Shor code [11], Knill code [12], and surface code [13, 14].
A QECC is employed to detect and correct errors. However, the QECC circuit may
itself incur an error. Thus, the probability of an error for each qubit must be smaller
than an error threshold, such that the overall circuit error probability is reduced after
the QECC is applied [15].
Assuming that the errors are independent, if the probability of a single error in a
single physical gate is p, the error probability of an encoded logic gate is given by
$Cp^2$, where C is a constant that depends on the code and circuitry used [3]. This can be
improved if a concatenation code is used, which entails another level of QECC applied
on top of the previous level. Therefore, the error rate becomes $C(Cp^2)^2 = C^3p^4$ for
two-level concatenation. In general, a k-level concatenation code has a logical error
rate of $p_k = \frac{1}{C}(Cp)^{2^k}$. Therefore, when p < 1/C, the error rate can be made
arbitrarily small.
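The recurrence behind this expression (each level maps an error rate p to Cp^2) and
its closed form (1/C)(Cp)^(2^k) can be compared numerically. The values C = 100 and
p = 10^-3 below are arbitrary illustrative assumptions, not figures from any
particular code, and the function names are hypothetical.

```python
import math

def concatenated_error(p, C, levels):
    """Apply the per-level map p -> C * p**2 the given number of times."""
    rate = p
    for _ in range(levels):
        rate = C * rate ** 2
    return rate

def closed_form(p, C, levels):
    """Closed-form logical error rate (1/C) * (C*p)**(2**levels)."""
    return (C * p) ** (2 ** levels) / C

C, p = 100.0, 1e-3                      # assumed values; threshold is 1/C = 0.01
below = [concatenated_error(p, C, k) for k in range(4)]
above = [concatenated_error(0.02, C, k) for k in range(4)]
agree = all(math.isclose(concatenated_error(p, C, k), closed_form(p, C, k))
            for k in range(4))
```

Below the threshold, the list below decreases with every added level, whereas above
it the list above grows, so concatenation only helps when p < 1/C.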
Implementations of these QECCs are based on some FT gates [16], as will be
defined and explained in Chapter 6. Therefore, if the circuits contain non-FT gates,
they should be converted to FT gates first. This conversion process is called quantum
compilation. For physical implementation of FT quantum computation, tiled quan-
tum architectures are used where each tile is able to do a set of FT computations.
The configuration and size of the tiles depend on the type of QECC and PMD.
1.6 Physical Machine Description
Physical realization of quantum computers is very challenging [3]. It not only requires
a robust physical representation of qubits, but also the enabling of their time-evolution
as desired. In quantum mechanics, the time-evolution of a closed quantum system
is described by a unitary operator determined by its Hamiltonian [3]. Since different
quantum systems have different Hamiltonians, they also have different PMDs. To
perform a quantum operation, one must be able to control the Hamiltonians of the
system. However, an operation may be easily performed in one system but with
difficulty in another. Hence, one PMD may be more suitable for implementing a
quantum logic gate than another.
Several quantum systems have been proposed that support different PMDs:
• Quantum dot (QD): In this system, a qubit is represented by the spin states
of two electrons in a double electrostatically defined quantum dot, which has
two potential wells with a tunneling barrier between them [17].
• Superconducting (SC): In this system, charge carriers are used to represent
qubits [18, 19, 20, 21, 22, 23]. At low temperatures in certain metals, two
electrons can bind together to form a Cooper pair. Such a pair can be confined
within an electrostatic box and used to represent quantum information.
• Ion trap (IT): This quantum system is based on a 2D lattice of confined ions,
each of which represents a physical qubit that can be moved within the lattice
to accommodate local interactions [3, 24, 25, 26].
• Neutral atom (NA): This is a system of trapped neutral atoms that can be
isolated from the environment and whose simple quantum-level structure can
be exploited [27, 28, 29].
• Linear photonics (LP): In this quantum system, a probabilistic two-photon
gate is teleported into a quantum circuit with high probability [30, 31, 32, 33].
• Nonlinear photonics (NP): This quantum system is based on weak cross-
Kerr nonlinearities [30, 34, 35].
A broad survey of quantum systems is available from the ARDA quantum com-
puting roadmap [36].
1.7 Physical Design
A quantum logic circuit does not consider the physical addresses of qubits: its logic
gates are applied to qubits without regard to the physical distance between
them. In a physical circuit realization, the physical addresses of qubits need
to be considered. In a practical implementation of quantum circuits, physical qubits
have to be placed on a grid. The grid implements the architecture of the quantum
computer. A physical distance constraint often imposed is that quantum gates can
only operate on adjacent qubits on the grid.
Current quantum technologies support a set of one- and two-qubit quantum gates.
A two-qubit gate enables interaction between the two qubits. In a physical circuit
implementation, when the two qubits are far apart, a communication channel must
be created between them. This communication overhead not only increases the com-
putation latency but also the probability of a computation error.
A complex multi-qubit reversible gate must be decomposed into a sequence of one-
and two-qubit quantum gates. However, even a one- or two-qubit gate may not be
directly implementable in a physical quantum machine. Therefore, it must be further
decomposed using the set of supported primitive quantum operations in the PMD
of the quantum machine. A quantum gate may require a very different number of
primitive quantum operations for realization on different machines.
1.8 Synthesis Flow
There are many difficulties and challenges in implementing a quantum computer. A
hierarchical architecture for constructing a quantum computer can be found in [37].
From top to bottom, it includes quantum computing theory, quantum programming,
quantum computer architecture, quantum computer micro-architecture, and technol-
ogy building blocks. In this thesis, we concentrate on the synthesis of FT quantum
circuits, i.e., we convert a given quantum algorithm into an FT circuit that is opti-
mized for the given PMD.
Quantum circuit synthesis is concerned with the ability to automatically generate
an optimized quantum circuit from a given quantum algorithm. The synthesis of
quantum circuits is generally difficult, but can be performed effectively by
hierarchically decomposing it into several stages. Fig. 1.3 shows a high-level view of
a three-stage synthesis flow. It includes synthesis from the quantum algorithm to non-FT
quantum logic, from non-FT quantum logic to FT quantum logic, and from FT quantum logic
to an FT quantum physical circuit. Each synthesis stage is described next.
Figure 1.3: High-level synthesis flow (quantum algorithm; classical functions through RMDDS and quantum functions through QLib; non-FT quantum logic; FTQLS with QGLVP and the PMD; FT quantum logic; PAQCS with the QECC; FT physical quantum circuit)
The input of the flow is a quantum algorithm, which generally contains two parts:
quantum functions and classical functions. Synthesizing an arbitrary quantum func-
tion is very difficult and inefficient. Fortunately, quantum functions usually have
regular structures and hence can be effectively synthesized with some quantum mod-
ules. Therefore, a quantum module library (QLib) [38] is introduced to facilitate the
synthesis of commonly used quantum modules. On the other hand, classical functions
are reversible functions and can be synthesized by the Reed-Muller Decision Diagram
Synthesis (RMDDS) tool [39], which is a flexible and efficient tool for reversible logic
synthesis. RMDDS is able to optimize the number of qubits or the quantum cost of
the circuit implementation, and the objective is influenced by the implementation of
the underlying technology. After the synthesis of quantum and reversible functions, a
non-FT quantum logic circuit, which implements the quantum algorithm, is derived.
It is composed of high-level quantum logic gates.
In order to convert the high-level logic gates into low-level quantum gates, an
optimized quantum gate library for various PMDs (QGLVP) [40] is used. The gate
library decomposes high-level gates into primitive quantum gates supported by various
PMDs. With the decomposition library, the circuit is then optimized by the FT
quantum logic synthesizer, FTQLS [41], which synthesizes and optimizes the non-FT
logic to FT logic circuits. The focus of FTQLS is on optimizing circuits with FT
primitive gates. A logical FT circuit is composed of a set of FT gates, such that the
QECCs can be easily applied to the circuit.
Next, the physical circuit is synthesized by using the physical design-aware quan-
tum circuit synthesizer, PAQCS [42]. Based on the QECC and PMD chosen, the
qubits are placed on a 2D grid, with some communication channels constructed. In
this stage, the physical cost of QECC is considered. Thus, the optimization can be
targeted to different QECCs and PMDs, which reflect true circuit cost.
1.9 Contributions of the Thesis
As indicated in the previous section, the synthesis of quantum circuits is divided into
three stages. The thesis proposes several effective synthesis methodologies to facilitate
the synthesis flow.
The thesis first proposes RMDDS, which is a flexible and efficient reversible logic
synthesizer. It is flexible in the sense that users can either optimize the number of
qubits or the quantum cost in the circuit implementation. It is also efficient because
the circuits can be synthesized within user-defined CPU times. This combination of
flexibility and efficiency has been missing from synthesizers presented earlier.
The thesis then discusses QLib, which contains scripts to generate quantum mod-
ules of different sizes and specifications for well-known quantum algorithms. Because
many quantum algorithms use similar subroutines that can be implemented with
similar circuit modules, QLib is very helpful in quantum logic synthesis. In addition,
QLib can also serve as a suite of benchmarks for quantum logic and physical synthesis.
The thesis thereafter presents QGLVP. Quantum gates are themselves realized
using primitive quantum operations that are supported by the PMDs. Thus, the
quantum cost for implementing a quantum operation may differ from one PMD to
another. Hence, the optimized quantum gate decompositions are different in different
PMDs. To make our quantum gate library efficient in terms of the number of primitive
quantum operations involved and the associated delay, we explore one-qubit and two-
qubit quantum identity rules that can help remove redundancies in the quantum
gate implementation. QGLVP thus provides technology mapping for quantum logic
synthesis.
Next, FTQLS is presented. The input to FTQLS is an unoptimized quantum
logic realized using a set of commonly used gates and its output is an optimized FT
quantum logic that only comprises primitive quantum operations supported by the
given PMD. FTQLS does technology mapping for different PMDs and then converts
non-FT circuits to FT circuits. For technology mapping, it utilizes QGLVP. Efficient
conversion to FT circuits is done by integrating two quantum compilers and an FT
cache table into FTQLS. For improving synthesis results, an FT set of gates that is
directly supported by each PMD is proposed. Quantum circuit optimization is done
by utilizing quantum identity rules. A methodology such as FTQLS has not been
attempted before, to the best of our knowledge.
The last part of the thesis discusses PAQCS. It does physical design-aware quan-
tum circuit synthesis considering different PMDs and QECCs. It contains two efficient
and effective algorithms: one for physical qubit placement and another for routing of
communications. Physical design-aware quantum circuit design that can be targeted
at multiple PMDs and QECCs has not been attempted before, to the best of our
knowledge.
1.10 Thesis Outline
The rest of this thesis is organized as follows. Chapter 2 describes related work for
the thesis. Chapter 3 describes RMDDS for reversible logic synthesis. Chapter 4
discusses QLib, which contains scripts to generate quantum modules. Chapter 5
introduces QGLVP, which is an optimized quantum gate library for various PMDs.
Chapter 6 presents FTQLS, which effectively synthesizes FT quantum logic circuits
for various PMDs. Chapter 7 presents PAQCS, which synthesizes quantum logic circuits
into quantum physical circuits. Chapter 8 concludes the thesis and presents ideas for
future research.
Chapter 2
Related Work
This chapter surveys previous work in reversible and quantum logic synthesis. Re-
versible logic synthesis generates a high-level description of quantum oracle functions.
Hence, it is a very important part of the synthesis flow. Many different methodolo-
gies for reversible logic synthesis have been proposed. Each has some advantages
and disadvantages. Therefore, it is possible that a hybrid method may exploit the
complementary advantages of such methodologies.
For quantum circuit synthesis, some important issues must be considered that do
not exist in traditional circuit synthesis. These issues include the use of fault-tolerant
(FT) gates, the type of quantum error correction code (QECC) applied, physical
machine description (PMD), and physical qubit placement and routing.
In this chapter, we summarize several important techniques that are used in re-
versible and quantum circuit synthesis and discuss their characteristics, performance,
and applications.
2.1 Reversible Logic Synthesis
Reversible computation has been historically motivated by theoretical research in low-
power electronics. However, reversible logic not only attracts interest from low-power
research, but also finds many applications in other areas. Given that reversible trans-
formations are the bottleneck in many widely used algorithms, reversible instructions
have been added to various microprocessor instruction sets [43]. In addition, the bit-
permutation operation, which is reversible, has been proposed in [44] to increase the
efficiency of cryptographic algorithms. Reversible debugging and program inversion
allow us to undo a command in debugging environments and allow the reconstruction
of decisions that lead to some particular outcomes. Therefore, program inversion and
debugging [45, 46] and reversible programming languages [47, 48] have been proposed.
Reversible adiabatic circuits have been proposed to recycle signal energy [49] and for
supercomputing [50].
Quantum computation is another important motivation for reversible computation
because unitary transformations in quantum mechanics are reversible. Many quantum
algorithms [51, 52, 53] have been introduced to solve several difficult problems in
polynomial time. Because of these important applications, reversible circuit synthesis
has become an important research topic.
Reversible logic synthesis is concerned with the ability to automatically gener-
ate a reversible circuit from a given reversible function. It synthesizes reversible
functions into reversible circuits composed of multiple-controlled Toffoli (MCT) [54],
Fredkin [55], and Peres [56] gates. Several reversible logic synthesis algorithms have
been proposed. In general, they can be classified into two categories: exact and
heuristic.
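As a small illustration of what makes a gate reversible, the three-bit Toffoli gate
can be written as a function on bit tuples and checked for bijectivity by enumerating
its truth table. This is only a sketch; the function names are hypothetical and do not
correspond to any synthesis tool cited above.

```python
from itertools import product

def toffoli(bits):
    """Three-bit Toffoli gate: flip the target c when both controls a and b are 1."""
    a, b, c = bits
    return (a, b, c ^ (a & b))

inputs = list(product((0, 1), repeat=3))       # all 8 input patterns, in sorted order
outputs = [toffoli(x) for x in inputs]

is_bijective = sorted(outputs) == inputs       # every pattern appears exactly once
is_self_inverse = all(toffoli(toffoli(x)) == x for x in inputs)

# By contrast, a plain two-input AND maps 4 input patterns onto only 2 outputs.
and_outputs = {a & b for a, b in product((0, 1), repeat=2)}
```

The AND function loses information, which is why irreversible functions must be
embedded in larger reversible ones before synthesis.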
2.1.1 Exact Synthesis
Exact algorithms perform an exhaustive search for globally optimal solutions. An
exact algorithm based on depth-first search with iterative deepening was used in [57]
to find optimal solutions for all three-bit reversible functions. Another exact algorithm
based on Boolean satisfiability was proposed in [58]. This algorithm guarantees a
reversible circuit with a minimum number of gates only for circuits with up to 10 gates.
Although exact algorithms guarantee optimality, they consume excessive synthesis
times since there are $(2^n)!$ permutations for an n-input reversible function. Thus, the
synthesis time explodes with an increase in the number of input variables or the
number of gates, limiting the usage of exact algorithms in practice.
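The growth of the search space can be made concrete by counting the permutations of
the 2^n input patterns. The function name below is hypothetical.

```python
import math

def num_reversible_functions(n):
    """Number of n-input reversible functions: permutations of 2**n bit patterns."""
    return math.factorial(2 ** n)

counts = {n: num_reversible_functions(n) for n in range(1, 5)}
# counts == {1: 2, 2: 24, 3: 40320, 4: 20922789888000}
```

Even at four inputs there are already more than 10^13 candidate functions, which
illustrates why exhaustive approaches are limited to very small circuits.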
2.1.2 Heuristic Synthesis
In order to synthesize reversible circuits more efficiently, several heuristic methods
have been proposed that lead to reasonably good solutions. According to a sur-
vey [59], these methods can be categorized into four types: (1) search-based, (2)
binary-decision-diagram-based, (3) transformation-based, and (4) cycle-based. Dif-
ferent heuristic algorithms excel at different aspects of synthesis: number of qubits
(#qubits), quantum cost (QC) or synthesis time.
Search-based Method
Gupta et al. [60] proposed a search-based method, called Reed-Muller reversible logic
synthesis (RMRLS), which uses the positive-polarity Reed-Muller (PPRM) expansion
to synthesize reversible circuits. The methodology in [61] extends the one in [60] for
synthesis with more reversible gates, such as Fredkin and Peres gates. Given enough
memory and time, these methods can find a minimal circuit. RMRLS is typically
effective in synthesizing circuits with up to six variables. However, many reversible
functions/circuits exceed this limit.
Transformation-based Method
Transformation-based synthesis approaches [59, 62, 63, 64] rely on a truth table de-
scription. This approach compares the identity function I with a given permutation
function F, and uses a sequence of reversible gates to transform F into I. To conduct
the transformation, the optimization metric used is the sum of Hamming distances
between the binary patterns of F and I in each row of the truth table. The method
iterates through the rows of the truth table. Since the size of the truth table increases
exponentially with the number of inputs, the method becomes impractical for functions
with a large number of inputs.
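The optimization metric described above can be sketched as follows. The encoding of
F as a list that maps each input pattern (read as an integer) to its output pattern,
and the function name, are illustrative assumptions rather than the data structures
used in [59, 62, 63, 64].

```python
def hamming_distance_to_identity(perm, n_bits):
    """Sum over all truth-table rows of the Hamming distance between F(x) and x."""
    return sum(bin(perm[x] ^ x).count("1") for x in range(2 ** n_bits))

identity = [0, 1, 2, 3]
cnot = [0, 1, 3, 2]          # 2-bit CNOT: the high bit controls a flip of the low bit

d_identity = hamming_distance_to_identity(identity, 2)
d_cnot = hamming_distance_to_identity(cnot, 2)

# Applying one more CNOT gate to the outputs of F = CNOT transforms F into I.
after_gate = [cnot[y] for y in cnot]
d_after = hamming_distance_to_identity(after_gate, 2)
```

In this toy case a single gate reduces the metric from 2 to 0, at which point F has
been transformed into the identity.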
Cycle-based Method
Instead of synthesizing an entire permutation, one can factor it into a set of cycles and
synthesize the resulting cycles separately. A cycle (a1, a2, . . . , an) is a permutation
such that f(a1) = a2, f(a2) = a3, . . . , and f(an) = a1. Cycle-based methods [57, 65,
66] use group theory to factor reversible functions into a set of cycles.
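Factoring a permutation into disjoint cycles, the first step of such methods, can be
sketched as follows. The list encoding of the permutation and the function name are
assumptions for illustration; the sketch does not reproduce the synthesis procedures
of [57, 65, 66].

```python
def disjoint_cycles(perm):
    """Factor a permutation (list mapping index -> image) into non-trivial cycles."""
    seen = set()
    cycles = []
    for start in range(len(perm)):
        if start in seen or perm[start] == start:
            continue                  # already covered, or a fixed point
        cycle = []
        x = start
        while x not in seen:          # follow start -> f(start) -> ... back to start
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append(tuple(cycle))
    return cycles

# Example 3-bit reversible function, given as a permutation of the patterns 0..7.
perm = [1, 2, 0, 3, 5, 4, 6, 7]
cycles = disjoint_cycles(perm)
```

For this example the factorization consists of one cycle of length three and one of
length two; each cycle can then be synthesized separately.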
Decision Diagram Method
Since most of the heuristic algorithms are not scalable to a large number of inputs,
Wille et al. [67] introduced a scalable heuristic algorithm based on Shannon decompo-
sition and binary decision diagram (BDD) to efficiently synthesize large circuits. The
average synthesis time is just a few seconds. Later, Soeken et al. [68] and Pang et al.
[69, 70] used the Kronecker functional decision diagram (KFDD) to further improve
circuit cost through Davio and Shannon expansions. Since both BDD and KFDD
synthesis methods are based on the decision diagram data structure, we categorize
them under the label: decision diagram synthesis (DDS). Although DDS is scalable
and very efficient, it generates many ancillary input and output bits, which weakens
its practical impact.
Due to the characteristics of different methods, a well-designed hybrid synthesis
method that leverages complementary advantages of various heuristics may provide an
opportunity to perform effective tradeoffs. Reed-Muller Decision Diagram Synthesis
(RMDDS), presented in this thesis, is such a hybrid method that exploits the advantages
of search-based and decision-diagram-based methods. It is efficient in terms of
synthesis time. In addition, it enables tradeoffs between #qubits and quantum cost,
as shown in Chapter 3.
2.2 Quantum Logic Synthesis
Quantum logic circuits are built using a cascade of quantum gates, which operate on
qubits. Synthesis of quantum logic has many aspects, based on the input and output
it generates. Some quantum logic synthesis problems have been tackled before, as
discussed next.
2.2.1 Logic Identities
In circuit design, often gates in adjacent circuit blocks can be merged, since a cir-
cuit may admit many different but equivalent decompositions. These decompositions
constitute circuit identities or templates. They can be used to optimize the quantum
cost or critical path of the circuit. Some quantum circuit identities and templates
have been proposed in [3, 71, 72, 73, 74]. The methods in [3, 71] consider identities
involving Controlled-NOT (CNOT) gates. The method in [72] only considers one-
qubit gate identities. The methods in [73, 74] target the NCV library consisting of
CNOT and Controlled-V (CV) gates, where V is the square root of NOT. Since dif-
ferent quantum systems support different primitive quantum operations, i.e., physical
machine descriptions (PMDs), the methods discussed above are only partially applicable
and, in many cases, completely inapplicable to the targeted PMDs. We need
identity rules that are generally applicable to every PMD.
2.2.2 Quantum Shannon Decomposition
Quantum Shannon decomposition is used to synthesize generic quantum circuits.
The input of the synthesizer is an arbitrary quantum matrix and its output is a
cascade of controlled or uncontrolled rotation gates. Typically, it utilizes cosine-
sine decomposition [75, 76] or optimized brute-force search [77] to synthesize optimal
circuits. However, due to the size of the matrix representation (an n-qubit circuit
is described by a $2^n \times 2^n$ matrix) and high computational complexity, $O(4^n)$, these
methods cannot process large quantum circuits. Thus, they have limited use.
2.2.3 Quantum Compilation
Since an FT quantum circuit is only composed of FT gates, the conversion of non-FT
gates to FT gates is an important procedure. The purpose of quantum compilation
is to find a cascade of gates from a discrete universal gate set to approximate any
arbitrary unitary gate within an error threshold. Currently, quantum compilation
uses two main methodologies to convert non-FT circuits into FT circuits. One is
based on the Solovay-Kitaev algorithm (SKA) [16, 78] and the other is based on the
skipping table algorithm (STA) [79, 80]. SKA and STA are introduced in detail in
Section 6.2.1.
2.3 Physical Design
Quantum logic circuits cannot be directly implemented on a physical quantum system
because they contain temporal, but no spatial, information of the circuit. A quantum
physical circuit is derived by placing qubits on a grid, which denotes physical locations
of the qubits. Adjacent qubits can interact with each other on the grid. If the two
qubits that a two-qubit gate is applied to are not adjacent, a communication channel
must be created.
In general, a physical circuit is composed of many quantum tiles. Each tile contains
a qubit and is able to do a set of FT operations. The implementation of a tile is based
on the type of PMD and the QECC employed.
2.3.1 Tiled Quantum Architecture
Many quantum technologies and architectures have been proposed to enable various
degrees of qubit interactions. The ion trap (IT) technology [24, 25, 26] uses one-
dimensional (1D) interaction, in which only two adjacent qubits can interact with
each other. This is, of course, highly restrictive. The quantum dot (QD) [17], su-
perconducting (SC) [18, 19, 20, 21, 22, 23], neutral atom (NA) [27, 28, 29], linear
photonics (LP) [30, 31, 32, 33], and non-linear photonics (NP) [30, 34, 35] technolo-
gies use two-dimensional (2D) interaction, in which a qubit (except for those at the
boundaries) can interact with four neighboring qubits. The technology presented in
[81] uses three-dimensional (3D) interaction, in which a qubit can interact with six
neighboring qubits. However, this makes quantum control very difficult. Currently,
2D interaction technology is the most popular, and most ongoing research is based
on the 2D structure.
Various tiled quantum architectures have been proposed in order to implement FT
quantum computation [82, 83, 84]. In such architectures, each tile contains a qubit
register and implements a set of FT quantum gates. The architecture is typically
hierarchical, which means that a kth-level tile comprises multiple (k− 1)th-level tiles.
An example is shown in Fig. 2.1. It shows each level to be composed of nine sub-level
tiles.
Figure 2.1: Three-level tile hierarchy (tiles T2, T1, and T0); each kth-level tile is composed of nine (k−1)th-level tiles
Figure 2.2: Two types of tiled architecture: (a) movement based (ULB tiles) and (b) swap based (Q tiles)
Two types of quantum tile architectures have been proposed based on the mobility
of qubits. The first type is movement based [82, 84, 85], as shown in Fig. 2.2(a). The
universal logic block (ULB) tile shown in the figure is analogous to a configurable logic
block employed in conventional FPGAs. ULBs are separated by routing channels
that are used to move qubits. NA, LP, and NP are three quantum technologies that
support the movement of qubits and are thus compatible with this architecture. The
second type is swap based, as shown in Fig. 2.2(b). Each tile Q contains a qubit and
a set of FT gates. In QD and SC quantum technologies, the qubits are fixed, but the
states of qubits move through the use of swap chains. A swap chain exchanges the
states of adjacent qubits through a sequence of swap gates.
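The effect of a swap chain can be sketched on a simplified model. The sketch below
assumes a one-dimensional row of fixed tiles, each holding a labeled state, and moves
one state to a destination tile using only swaps between adjacent tiles. The function
name swap_chain and the list representation are hypothetical and serve only as an
illustration.

```python
def swap_chain(states, src, dst):
    """Move the state held at tile src to tile dst using adjacent swaps only."""
    states = list(states)             # work on a copy of the tile contents
    swaps = []
    step = 1 if dst > src else -1
    pos = src
    while pos != dst:
        nxt = pos + step
        states[pos], states[nxt] = states[nxt], states[pos]   # one swap gate
        swaps.append((pos, nxt))
        pos = nxt
    return states, swaps

tiles = ["a", "b", "c", "d"]          # states held by four fixed qubits
moved, swaps = swap_chain(tiles, src=0, dst=3)
```

The number of swap gates equals the distance between the two tiles, which is one
reason why the placement of qubits affects latency and error probability.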
For physical implementation of the tiles, several designs have been proposed. The
designs are based on different QECCs, each with different #ops and #cycles. Some
important tile implementations for the three QECCs are: Steane code [86], Bacon-
Shor code [87], and Knill code [88].
2.3.2 Physical Synthesis
Many techniques have been proposed to perform physical synthesis of quantum logic
circuits to 1D quantum architectures [89, 90, 91, 92]. However, they are not suitable
for 2D quantum architectures. A few hand-optimized 2D quantum circuits have been
proposed in [93, 94, 95]. However, developing methodologies for physical circuit syn-
thesis on 2D architectures is a nascent area of research. Recently, a methodology for
qubit placement that minimizes communication overhead in a 2D quantum architec-
ture has been proposed [96], which is the first attempt to address this problem. It
uses mixed integer linear programming for qubit placement optimization.
-
Chapter 3
RMDDS: Reed-Muller Decision
Diagram Synthesis for Reversible
Logic
In this chapter, we propose a flexible and efficient reversible logic synthesizer. It ex-
ploits the complementary advantages of two methods: Reed-Muller Reversible Logic
Synthesis (RMRLS) and Decision Diagram Synthesis (DDS), and is thus called Reed-
Muller Decision Diagram Synthesis (RMDDS) [39]. RMRLS does not scale to a large
number of qubits. DDS tools, even though efficient, add a large number of ancillary
qubits and typically incur much higher quantum cost than necessary. RMDDS over-
comes these obstacles. It is flexible in the sense that users can either optimize the
number of qubits (#qubits) or the quantum cost (QC) in the circuit implementation.
It is also efficient because the circuits can be synthesized within user-defined CPU
times. This combination of flexibility and efficiency has been missing from synthesiz-
ers presented earlier. When used to synthesize reversible functions, RMDDS reduces
#qubits by up to 79.2% (average of 54.6%) when the synthesis objective is to min-
imize #qubits and the QC by up to 71.5% (average of 35.7%) when the synthesis
-
objective is to minimize QC, relative to DDS methods. For irreversible functions
(which are automatically embedded in reversible functions), the corresponding best
(average) reductions are 42.1% (22.5%) in #qubits when minimizing #qubits and
63.0% (25.9%) in QC when minimizing QC.
3.1 Introduction
As indicated in Chapter 2.1, many reversible logic synthesis algorithms have been
proposed in prior work. Because exact algorithms are complex and require long
synthesis times, heuristic methods that generate reasonably good solutions are
usually used to synthesize reversible circuits more efficiently. RMRLS uses
the positive-polarity Reed-Muller (PPRM) expansion to synthesize reversible circuits
with optimized QC. RMRLS is typically effective in synthesizing circuits with a small
number of qubits, but does not scale to large circuits. DDS, on the other hand,
synthesizes large circuits with very low synthesis times, but adds a large
number of ancillary qubits.
Both QC and #qubits are important metrics for evaluating the performance of
reversible circuits. The QC of a reversible gate is equal to the number of elementary
quantum operations required to implement its functionality. Hence, QC is related
to the computation time and defines the computational complexity of the circuit.
#qubits is an important metric for evaluating the quantum hardware cost of the
circuit. Since allowing higher #qubits in the implementation may make its realization
impractical, reducing #qubits is very important.
The aim of this chapter is to present a heuristic algorithm that is scalable (to a
large number of inputs) and efficient (i.e., synthesis time is at most a few minutes) like
the DDS methods, yet results in fewer #qubits and smaller QC than these methods, in
general. Thus, a hybrid synthesis method that leverages complementary advantages
-
of various heuristics may provide an opportunity to trade off #qubits, QC, and
synthesis time. However, these heuristics must be integrated carefully, since a
poor integration interface can reduce both efficiency and effectiveness.
The integration of a transformation-based method with DDS is not easy because the
truth table specification does not scale with the number of inputs. In order to form
a hybrid of cycle-based and DDS methods, decomposition into cycles and conversion
between the canonical cycle form, which is the input format used by cycle-based
methods, and a decision diagram (DD) pose major hurdles.
Due to the above drawbacks, we choose to integrate RMRLS with DDS. Because
the conversion between the PPRM expansion and a DD is straightforward, it is much
easier to form a hybrid of RMRLS and DDS. This combination can leverage the
complementary advantages of RMRLS (in reducing #qubits and QC) and DDS (in
terms of scalability and efficiency). We refer to this hybrid method as RMDDS.
RMDDS is very flexible. It allows users to specify an objective (or impose constraints)
on the circuit implementation, such as #qubits, QC, or synthesis time. Previous
reversible synthesizers do not provide such flexibility.
The remainder of the chapter is organized as follows. Section 3.2 provides back-
ground material on reversible circuits. Section 3.3 provides some simple examples to
motivate our work. Section 3.4 discusses the synthesis procedure in detail. Section 3.5
presents experimental results. Finally, Section 3.6 concludes.
3.2 Background
In this section, we present background material to facilitate understanding of re-
versible logic synthesis.
-
3.2.1 Reversible Functions
A function is reversible if it performs a bijective (one-to-one and onto) mapping
between its inputs and outputs; otherwise, it is irreversible. Since reversible functions are bijective:
(1) the operation of reversible functions is bidirectional, which means they can be
operated in both the forward and backward directions, (2) reversible operations are
information lossless, (3) the number of inputs and outputs of reversible functions are
equal, (4) the outputs of reversible functions are permutations of the inputs, and (5)
because information needs to be conserved, fan-out in reversible circuits is prohibited.
The truth table of an irreversible function can be expanded to convert it to a
reversible function. This can be done by adding garbage outputs and corresponding
constant inputs to balance the number of inputs and outputs and then bijectively
filling the augmented truth table. An example is shown in Fig. 3.1. The 2-to-1 OR
gate/function is irreversible because its truth table is not bijective: the output
value 1 is repeated three times. Hence, its operation is information-lossy since we cannot always
deduce the input vector from the corresponding output bit (i.e., information is erased
during computation). This can be remedied by adding a constant input (c1) and two
garbage outputs (g1 and g2) and filling the truth table bijectively.
An irreversible function can be transformed into a reversible one with the addition
of $\lceil \log_2 p \rceil$ outputs, where p is the maximum number of times any output
pattern is repeated [97]. However, there are many ways to perform the transformation
since there are $(2^n)!$ permutations for an n-input reversible function.
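The embedding of the OR function can be checked mechanically. The following is a minimal sketch, assuming a truth table is stored as a Python dictionary from input tuples to output tuples; the helper names `garbage_outputs_needed` and `is_reversible` are hypothetical, and the embedded table is the one given in Fig. 3.1(b).

```python
from collections import Counter
from itertools import product
from math import ceil, log2


def garbage_outputs_needed(outputs):
    """Return ceil(log2(p)), with p the largest multiplicity of any output pattern."""
    p = max(Counter(outputs).values())
    return ceil(log2(p))


def is_reversible(table):
    """A truth table is treated as reversible when all output patterns are distinct."""
    return len(set(table.values())) == len(table)


# Irreversible 2-to-1 OR function: the output value 1 occurs three times.
or_table = {(a, b): (a | b,) for a, b in product((0, 1), repeat=2)}
extra_outputs = garbage_outputs_needed(or_table.values())

# Embedding of Fig. 3.1(b): inputs (c1, a, b) -> outputs (f, g1, g2).
embedded_or = {
    (0, 0, 0): (0, 0, 0), (0, 0, 1): (1, 0, 0),
    (0, 1, 0): (1, 0, 1), (0, 1, 1): (1, 1, 0),
    (1, 0, 0): (0, 0, 1), (1, 0, 1): (0, 1, 0),
    (1, 1, 0): (0, 1, 1), (1, 1, 1): (1, 1, 1),
}

print(extra_outputs)               # garbage outputs needed for OR
print(is_reversible(or_table))     # the plain OR table is not a bijection
print(is_reversible(embedded_or))  # the embedded table is a bijection
```

With the constant input c1 set to 0, the f column of the embedded table reproduces the original OR function, which is the property the embedding has to preserve.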
3.2.2 Reversible Gates
Reversible circuits are constructed by cascading reversible gates. A commonly used
reversible gate is the Multiple Controlled Toffoli (MCT) gate: an n-bit MCT gate [54]
has n inputs and outputs. It passes the first n− 1 inputs (referred to as control bits)
-
(a)
a b | f
0 0 | 0
0 1 | 1
1 0 | 1
1 1 | 1
(b)
c1 a b | f g1 g2
0  0 0 | 0 0  0
0  0 1 | 1 0  0
0  1 0 | 1 0  1
0  1 1 | 1 1  0
1  0 0 | 0 0  1
1  0 1 | 0 1  0
1  1 0 | 0 1  1
1  1 1 | 1 1  1
Figure 3.1: (a) A 2-to-1 irreversible OR gate/function, and (b) a 3-to-3 reversible OR gate obtained by adding a constant input (c1) and two garbage outputs (g1, g2)
[Gate diagrams: (a) inputs x1, x2, ..., xn and outputs y1, y2, ..., yn; (b) x1 → x1⊕1 = x1′; (c) (x1, x2) → (x1, x2⊕x1); (d) (x1, x2, x3) → (x1, x2, x3⊕x1x2)]
Figure 3.2: (a) n-bit MCT, (b) NOT, (c) CNOT, and (d) Toffoli gate
to the output unaltered and inverts the nth input (referred to as the target bit) if the
first n− 1 inputs are all 1’s. That is:
control bits: $y_i = x_i, \quad 1 \le i \le n-1,$
target bit: $y_n = x_n \oplus x_1 x_2 \cdots x_{n-1}$
(3.1)
An MCT gate is shown in Fig. 3.2(a). A 1-bit MCT gate inverts the input uncon-
ditionally and is, hence, also called a NOT gate [Fig. 3.2(b)]. A 2-bit MCT gate is
also called the Feynman or Controlled-NOT (CNOT) gate. It inverts the target bit
-
if the control bit is 1 [Fig. 3.2(c)]. Fig. 3.2(d) depicts a 3-bit MCT gate. Convention-
ally, a 3-bit MCT gate is simply referred to as the Toffoli gate. It inverts the target
bit if both the control bits are 1.
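The behavior defined by Eq. (3.1) can be sketched in a few lines of Python. This is an illustrative model only: the function name `mct` and the representation of a circuit state as a tuple of 0/1 values with 0-based line indices are assumptions made here.

```python
from itertools import product


def mct(bits, controls, target):
    """Apply an MCT gate: flip bits[target] iff every control bit is 1."""
    out = list(bits)
    # all(...) over an empty control set is True, so no controls gives a NOT gate.
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


not_result = mct((0,), (), 0)                # 1-bit MCT (NOT)
cnot_result = mct((1, 0), (0,), 1)           # 2-bit MCT (CNOT)
toffoli_result = mct((1, 1, 0), (0, 1), 2)   # 3-bit MCT (Toffoli)

# Applying the same MCT gate twice restores the input, so the gate is its own inverse.
self_inverse = all(
    mct(mct(x, (0, 1), 2), (0, 1), 2) == x
    for x in product((0, 1), repeat=3)
)
print(not_result, cnot_result, toffoli_result, self_inverse)
```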
Another popular reversible gate is the n-bit Multiple Controlled Fredkin (MCF)
gate, defined as follows [55]:
control bits: $y_i = x_i, \quad 1 \le i \le n-2,$
target bit 0: $y_{n-1} = x_{n-1}\,\overline{x_1 x_2 \cdots x_{n-2}} \oplus x_n\, x_1 x_2 \cdots x_{n-2}$
target bit 1: $y_n = x_n\,\overline{x_1 x_2 \cdots x_{n-2}} \oplus x_{n-1}\, x_1 x_2 \cdots x_{n-2}$
(3.2)
The MCF gate swaps the two target bits if all the control bits are 1’s. Conventionally,
a 3-bit MCF gate (n = 3) is simply referred to as the Fredkin gate.
A third type of reversible gate is the three-bit Peres gate [56]. It accomplishes
the operation of the cascade of a CNOT gate and a Toffoli gate, with the operation
defined as follows:
y1 = x1 ⊕ x2
y2 = x2
y3 = x3 ⊕ x1x2
(3.3)
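The MCF and Peres gates can be modeled in the same style as the MCT gate. The sketch below is illustrative only; the function names and the tuple-based state are assumptions. It also checks by exhaustive enumeration that Eq. (3.3), as written, equals a Toffoli gate (controls x1, x2, target x3) followed by a CNOT gate (control x2, target x1).

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff all control bits are 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def mcf(bits, controls, t0, t1):
    """Swap bits[t0] and bits[t1] iff all control bits are 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[t0], out[t1] = out[t1], out[t0]
    return tuple(out)


def peres(bits):
    """Peres gate exactly as given in Eq. (3.3)."""
    x1, x2, x3 = bits
    return (x1 ^ x2, x2, x3 ^ (x1 & x2))


def toffoli_then_cnot(bits):
    """Toffoli on lines (0, 1) -> 2, then CNOT with control line 1 and target line 0."""
    return mct(mct(bits, (0, 1), 2), (1,), 0)


inputs = list(product((0, 1), repeat=3))
peres_matches_cascade = all(peres(x) == toffoli_then_cnot(x) for x in inputs)
peres_is_bijective = len({peres(x) for x in inputs}) == len(inputs)
fredkin_swapped = mcf((1, 0, 1), (0,), 1, 2)    # control is 1: targets swap
fredkin_unchanged = mcf((0, 0, 1), (0,), 1, 2)  # control is 0: no change
print(peres_matches_cascade, peres_is_bijective, fredkin_swapped, fredkin_unchanged)
```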
The QC of a reversible gate is equal to the number of elementary quantum op-
erations required to implement its functionality. The method for deriving the QC of
arbitrary quantum gates was first introduced by Barenco et al. [98]. An n-bit quan-
tum gate is hierarchically decomposed into basic elementary reversible gates. NOT,
CNOT, and Toffoli gates are defined to have a QC of 1, 1, and 5, respectively. The
QC of an n-bit MCT gate is $2^n - 3$. However, if the gate contains some ancillary
qubits, its QC can be reduced because the ancillary qubits can be used to hold some
temporary values, thus reducing the computational complexity [73, 74].
-
The QC of a reversible circuit is simply the sum of the QC of its constituent gates.
A high QC implies a long computation time, which increases the probability that
noise will corrupt the quantum state before the computation completes. Thus,
minimizing the circuit QC is also a very important design objective.
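The cost model above can be written down directly. This is a minimal sketch under the stated conventions (QC of 1 for NOT and CNOT, and $2^n - 3$ for an n-bit MCT gate without ancillary qubits); the function names `mct_qc` and `circuit_qc` are hypothetical, and a circuit is represented simply as a list of gate sizes.

```python
def mct_qc(n):
    """Quantum cost of an n-bit MCT gate with no ancillary qubits."""
    if n < 1:
        raise ValueError("an MCT gate acts on at least one bit")
    if n <= 2:
        return 1           # NOT and CNOT are elementary operations
    return 2 ** n - 3      # e.g. 5 for the Toffoli gate (n = 3)


def circuit_qc(gate_sizes):
    """QC of a circuit given the sizes (in bits) of its MCT gates."""
    return sum(mct_qc(n) for n in gate_sizes)


print(mct_qc(3), mct_qc(5))
print(circuit_qc([3, 3, 3, 2]))  # three Toffoli gates and one CNOT gate
```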
3.2.3 Reed-Muller Expansion
Any Boolean function can be described by an exclusive-OR sum-of-products (ESOP)
expression [99]. The ESOP expression can be easily converted to the PPRM expan-
sion in which all the variables are uncomplemented (by replacing each complemented
variable x′ by x⊕ 1). The PPRM expansion is canonical and has the form:
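$f(x_1, x_2, \ldots, x_n) = a_0 \oplus a_1 x_1 \oplus \cdots \oplus a_n x_n \oplus a_{12} x_1 x_2 \oplus a_{13} x_1 x_3 \oplus \cdots \oplus a_{n-1,n} x_{n-1} x_n \oplus \cdots \oplus a_{12 \ldots n} x_1 x_2 \cdots x_n$
(3.4)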
Each cube in the expansion is referred to as a Reed-Muller term. Fixed-polarity
Reed-Muller (FPRM) expansion is a variant of the PPRM expansion and uses only
the complemented or uncomplemented form (but not both) of each variable. An n-input
function has $2^n$ FPRM expansions. The PPRM expansion is just a special case
of the FPRM expansion because it only uses uncomplemented variables. The choice
of polarity for each variable influences the number of terms in the resulting FPRM
expansion. The following example shows four different expansions for an example
Boolean function: y = a + b′c. Generally, the PPRM expansion has the largest
-
number of terms since only positive polarity is used for each variable.
SOP: $y = a + b'c$
ESOP: $y = a \oplus a'b'c$
PPRM(a, b, c): $y = abc \oplus ac \oplus bc \oplus a \oplus c$
FPRM(a′, b′, c): $y = a'b'c \oplus a' \oplus 1$
(3.5)
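The PPRM expansion in Eq. (3.5) can be reproduced from the truth table. The sketch below uses the standard XOR butterfly (binary Möbius transform) over the truth vector; this is a generic technique chosen here for illustration and is not claimed to be how any tool in this thesis computes the expansion. The function name `pprm_terms` is hypothetical.

```python
from itertools import product


def pprm_terms(f, names):
    """Return the set of PPRM terms of f as strings ('1' is the constant term)."""
    n = len(names)
    # Truth vector in lexicographic order; names[0] is the most significant bit.
    coeffs = [f(*bits) & 1 for bits in product((0, 1), repeat=n)]
    step = 1
    while step < len(coeffs):
        for i in range(len(coeffs)):
            if i & step:
                # XOR in the entry whose corresponding variable is 0.
                coeffs[i] ^= coeffs[i ^ step]
        step <<= 1
    terms = set()
    for idx, coeff in enumerate(coeffs):
        if coeff:
            term = "".join(names[k] for k in range(n) if (idx >> (n - 1 - k)) & 1)
            terms.add(term or "1")
    return terms


def example(a, b, c):
    """y = a + b'c from Eq. (3.5)."""
    return a | ((b ^ 1) & c)


print(sorted(pprm_terms(example, "abc")))
```

The five terms returned match the PPRM line of Eq. (3.5).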
3.2.4 Decision Diagrams
DDs are data structures that can efficiently represent Boolean functions. A DD is a
directed acyclic graph defined as follows [100, 101]:
Definition 1: A DD over variables X = {x1, x2, ..., xn} is a rooted directed acyclic
graph G = (V,E) with vertex set V containing two types of vertices: non-terminal and
terminal. A terminal vertex is labeled 0 or 1 and has no successors. A non-terminal
vertex v is labeled with a variable x ∈ X, which is called the decision variable, and
has exactly two successors denoted by low(v) and high(v) ∈ V .
Definition 2: A DD is free if each variable is encountered at most once on each path
in the DD from the root to a terminal vertex.
Definition 3: A DD is ordered if it is free and the variables are encountered in the
same order on every path from the root to a terminal vertex.
A BDD decomposes Boolean functions into smaller sub-functions by only using
Shannon decomposition on every variable, whereas a KFDD decomposes Boolean
functions into sub-functions by using Davio or Shannon decompositions.
Shannon and Davio decompositions are defined as follows:
$f = x_i' f_i^0 \oplus x_i f_i^1$  Shannon (S)
$f = f_i^0 \oplus x_i f_i^2$  positive Davio (pD)
$f = f_i^1 \oplus x_i' f_i^2$  negative Davio (nD)
(3.6)
-
[Circuit diagrams: (a) inputs a, b, c, d, e with target output abcd⊕e, QC = 29; (b) inputs a, b, c, d, e plus three ancillary lines initialized to 0, with outputs cd, bcd, and abcd⊕e, QC = 5 + 5 + 5 + 1]
Figure 3.3: Two implementations of a five-bit MCT gate: (a) with [#qubits,QC] = [5,29] and (b) with [#qubits,QC] = [8,16]
where $f_i^0$ is the 0-cofactor of f with respect to $x_i$ [i.e., $f_i^0 = f(x_1, \ldots, x_{i-1}, 0, x_{i+1}, \ldots, x_n)$],
$f_i^1$ similarly is the 1-cofactor, and $f_i^2 = f_i^0 \oplus f_i^1$.
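The three decompositions can be verified numerically for any small function. The following sketch, with hypothetical helper names `cofactor` and `decompositions_hold`, checks by exhaustive enumeration that the Shannon, positive Davio, and negative Davio forms all reproduce f for a chosen variable index (0-based here).

```python
from itertools import product


def cofactor(f, i, value):
    """Return f with its i-th argument (0-based) fixed to value."""
    def restricted(*xs):
        xs = list(xs)
        xs[i] = value
        return f(*xs)
    return restricted


def decompositions_hold(f, n, i):
    """Check the Shannon and both Davio decompositions of f with respect to x_i."""
    f0, f1 = cofactor(f, i, 0), cofactor(f, i, 1)
    for xs in product((0, 1), repeat=n):
        xi, not_xi = xs[i], xs[i] ^ 1
        c0, c1 = f0(*xs), f1(*xs)
        c2 = c0 ^ c1  # the cofactor called f_i^2 in the text
        shannon = (not_xi & c0) ^ (xi & c1)
        pos_davio = c0 ^ (xi & c2)
        neg_davio = c1 ^ (not_xi & c2)
        if not (f(*xs) == shannon == pos_davio == neg_davio):
            return False
    return True


def example(a, b, c):
    """y = a + b'c, the running example of Section 3.2.3."""
    return a | ((b ^ 1) & c)


all_hold = all(decompositions_hold(example, 3, i) for i in range(3))
print(all_hold)
```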
3.3 Motivational Examples
In reversible circuit synthesis, we must trade off among synthesis time, #qubits,
and QC. In most cases, no single circuit simultaneously minimizes #qubits, QC, and
synthesis time. We illustrate these design tradeoffs with two examples.
3.3.1 Example 1: #qubits vs. QC
Fig. 3.3(a) shows an example of a five-bit MCT gate. It contains five qubits and
no ancillary qubits. Its QC is $2^5 - 3 = 29$. However, we can also implement the
five-bit MCT gate as shown in Fig. 3.3(b). This requires eight qubits, including three
ancillary qubits. Lines 5, 6 (garbage outputs e, cd) and 7 (garbage output bcd) realize
intermediate values required for line 8 (the valid final output). The QC now reduces
to only 16. This shows how #qubits can be traded off for QC.
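The ancilla-based implementation can be simulated to confirm both its function and its cost. The gate sequence below (three Toffoli gates followed by one CNOT gate) is an assumption inferred from the intermediate values cd, bcd, and abcd⊕e and from the cost breakdown 5 + 5 + 5 + 1; line indices are 0-based, so line 8 in the text is index 7.

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff all control bits are 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def mct_qc(n):
    """QC of an n-bit MCT gate without ancillary qubits."""
    return 1 if n <= 2 else 2 ** n - 3


# Lines 0-3 hold a, b, c, d; line 4 holds e; lines 5-7 are ancillae set to 0.
CIRCUIT_B = [
    ((2, 3), 5),  # line 5 <- cd
    ((1, 5), 6),  # line 6 <- bcd
    ((0, 6), 7),  # line 7 <- abcd
    ((4,), 7),    # line 7 <- abcd XOR e
]


def run(circuit, bits):
    for controls, target in circuit:
        bits = mct(bits, controls, target)
    return bits


qc_b = sum(mct_qc(len(controls) + 1) for controls, _ in CIRCUIT_B)
correct = all(
    run(CIRCUIT_B, (a, b, c, d, e, 0, 0, 0))[7] == (a & b & c & d) ^ e
    for a, b, c, d, e in product((0, 1), repeat=5)
)
print(qc_b, correct)
```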
-
(a)
c b a | co bo ao
0 0 0 | 0  0  1
0 0 1 | 0  0  0
0 1 0 | 1  1  1
0 1 1 | 0  1  0
1 0 0 | 0  1  1
1 0 1 | 1  0  0
1 1 0 | 1  0  1
1 1 1 | 1  1  0
(b)
ao = a⊕1
bo = ac⊕b⊕c
co = ab⊕ac⊕b
Figure 3.4: Truth table for Example 2
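[Circuit diagrams: (a) inputs a, b, c and three ancillary lines initialized to 1, 0, 0, producing ao, bo, co; (b) three lines a, b, c producing ao, bo, co, with intermediate values (a⊕1)c⊕b = ac⊕b⊕c and (a⊕1)(ac⊕b⊕c)⊕c = ac⊕ab⊕b]
Figure 3.5: (a) Quick synthesis by direct placement of PPRM terms on ancillary lines, with [#qubits,QC] = [6,19] and (b) RMRLS requiring more synthesis time, with [#qubits,QC] = [3,11]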
3.3.2 Example 2: Circuit Size vs. Synthesis Time
The circuit size is determined by #qubits and QC. Fig. 3.4(a) shows the truth table
of a three-variable reversible function and Fig. 3.4(b) its PPRM expansion. We can
trivially synthesize the circuit (in almost no synthesis time) by placing the PPRM
terms on three ancillary lines, one for each output, yielding #qubits = 6 and QC =
19, as shown in Fig. 3.5(a). However, RMRLS yields the much better circuit shown
in Fig. 3.5(b), with #qubits = 3 and QC = 11. This requires more synthesis time.
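The three-qubit circuit can be checked against the PPRM expansion of Fig. 3.4(b) by simulation. The gate sequence used below (a NOT gate on a, then two Toffoli gates) is an assumption inferred from the intermediate expressions shown in Fig. 3.5(b); the helper names are hypothetical.

```python
from itertools import product


def mct(bits, controls, target):
    """Flip bits[target] iff all control bits are 1."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def mct_qc(n):
    """QC of an n-bit MCT gate without ancillary qubits."""
    return 1 if n <= 2 else 2 ** n - 3


# Lines: 0 = a, 1 = b, 2 = c.
CIRCUIT = [
    ((), 0),      # a <- a XOR 1                    (ao)
    ((0, 2), 1),  # b <- b XOR (a XOR 1)c           (bo)
    ((0, 1), 2),  # c <- c XOR (a XOR 1)(bo)        (co)
]


def run(circuit, bits):
    for controls, target in circuit:
        bits = mct(bits, controls, target)
    return bits


def pprm_spec(a, b, c):
    """Outputs (ao, bo, co) from the PPRM expansion in Fig. 3.4(b)."""
    return (a ^ 1, (a & c) ^ b ^ c, (a & b) ^ (a & c) ^ b)


matches = all(run(CIRCUIT, x) == pprm_spec(*x) for x in product((0, 1), repeat=3))
qc = sum(mct_qc(len(controls) + 1) for controls, _ in CIRCUIT)
print(matches, qc)
```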
3.4 RMDDS Algorithm
In order to synthesize reversible circuits under a user-defined optimization objective
or constraints on #qubits, QC or synthesis time, we propose a hybrid method, called
-
[Flowchart boxes: Initialization; PQ empty?; L1: Pop search node from PQ; L2: Select factor, substitute vi ← vi ⊕ factor; Higher score?; Insert new search nodes; Synthesis complete or time out?; More factors available?; End]
Figure 3.6: Synthesis flow of RMRLS
RMDDS, that is inspired by RMRLS and DDS and integrates their best features. It
can effectively and efficiently explore the design space for circuit synthesis. In the
following, we first briefly introduce the algorithms and concepts involved in RMRLS
and DDS and then describe RMDDS.
3.4.1 RMRLS
RMRLS [60] exploits the similarity in the functional expression of MCT gates and the
PPRM form. This similarity is evident from Eqs. (3.1) and (3.4). The synthesis flow
of RMRLS is shown in Fig. 3.6. The input to the algorithm is the PPRM expansion
of each output of the reversible function. In the initialization stage, a search node
that contains the PPRM expansions of all the outputs is pushed into a priority queue
(PQ). The node is initialized as the root node of an N-ary search tree. PQ maintains
a list of nodes sorted by their priorities and the search tree records the search space
of the function.
-
The algorithm then enters two nested loops to synthesize MCT gates. The
outer loop L1 iterates over search nodes and the inner loop L2 iterates over valid
MCT gates within a search node. In each iteration of L1, the highest-priority node
is popped from PQ (if PQ is not empty). In each iteration of L2, factors in the
PPRM expansion are investigated. The rule for finding a suitable factor is as follows.
For each output function vout, we search for the PPRM terms that have the format
vout = v⊕ factor where the factor is a PPRM term that does not contain variable v.
For example, for the PPRM expansion aout = a⊕ ab⊕ ac⊕ bc⊕ 1, bc and 1 are valid
factors but ab and ac are not, because the output function is aout and both ab and
ac contain variable a. Wh