Reinforcement Learning in Phases of Quantum Control (cnls.lanl.gov/external/piml/marin...)
TRANSCRIPT
![Page 1: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/1.jpg)
Reinforcement Learning in Phases of Quantum Control
Marin Bukov
BU
A. Polkovnikov, D. Sels, P. Mehta, A. G. R. Day, P. Weinberg
arXiv: 1705.00565 (2017)
arXiv: 1711.09109 (2017)
![Page 3: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/3.jpg)
GOAL: teach reinforcement learning agent to prepare states in non-integrable quantum Ising model
$$H(t) = -\sum_j \left( J S^z_{j+1} S^z_j + h_z S^z_j + h_x(t) S^x_j \right)$$
1. Single-qubit System, $J = 0$ (THIS TALK)
   • problem setup, RL agent solution
   • control phase transitions: overconstrained phase, correlated (glassy) phase, controllable phase
2. Two-qubit System, $L = 2$ (POSTER)
   • spontaneous symmetry breaking in optimal control landscape
3. Many-Body System (POSTER, Alex)
   • control phase diagram: overconstrained phase, glassy/correlated phase
![Page 7: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/7.jpg)
Optimal Qubit State Preparation
[Figure: Bloch sphere with axes $x$, $y$, basis states $|\uparrow\rangle$, $|\downarrow\rangle$, and the states $|\psi_i\rangle$ and $|\psi_*\rangle$]
Hamiltonian: $H(t) = -S^z - h_x(t) S^x$
initial state: $|\psi_i\rangle$: GS of $H_i = -S^z - 2S^x$
target state: $|\psi_*\rangle$: GS of $H_* = -S^z + 2S^x$
GOAL: find protocol $h(t) \in [-4, 4]$ such that $|\psi(t=0)\rangle = |\psi_i\rangle$, $|\psi(t=T)\rangle = |\psi_*\rangle$
measure: fidelity $F_h(T) = |\langle\psi(T)|\psi_*\rangle|^2$
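The fidelity of a given protocol can be evaluated numerically. The following is a minimal sketch, not the speaker's code: it assumes a piecewise-constant protocol on equal time steps and uses the closed-form spin-1/2 propagator for $H = -S^z - h S^x$ with $S = \sigma/2$. The function names (`ground_state`, `evolve`, `fidelity`) are hypothetical. With the sign convention of the slide, $H_i = -S^z - 2S^x$ corresponds to $h = +2$ and $H_* = -S^z + 2S^x$ to $h = -2$.

```python
import math


def ground_state(h):
    """Ground state of H = -S^z - h S^x (S = sigma/2) as a 2-component spinor.

    H = -(1/2)(sigma_z + h sigma_x), so the ground state has its Bloch
    vector along (h, 0, 1).
    """
    theta = math.atan2(h, 1.0)  # polar angle of the Bloch vector in the x-z plane
    return [complex(math.cos(theta / 2)), complex(math.sin(theta / 2))]


def evolve(psi, h, dt):
    """Apply exp(-i H dt) for constant field h, using the exact 2x2 propagator."""
    omega = math.sqrt(h * h + 1.0)  # level splitting of H
    c = math.cos(omega * dt / 2)
    s = math.sin(omega * dt / 2)
    u00 = complex(c, s / omega)
    u01 = complex(0.0, s * h / omega)
    u11 = complex(c, -s / omega)
    return [u00 * psi[0] + u01 * psi[1], u01 * psi[0] + u11 * psi[1]]


def fidelity(protocol, T, h_i=2.0, h_target=-2.0):
    """F_h(T) = |<psi_*|psi(T)>|^2 for a piecewise-constant protocol (assumption)."""
    dt = T / len(protocol)
    psi = ground_state(h_i)
    for h in protocol:
        psi = evolve(psi, h, dt)
    target = ground_state(h_target)
    overlap = sum(a.conjugate() * b for a, b in zip(target, psi))
    return abs(overlap) ** 2


if __name__ == "__main__":
    # Example: a 4-step bang-bang protocol of total duration T = 1.
    print(fidelity([-4.0, 4.0, -4.0, 4.0], T=1.0))
```

Holding the field at its initial value leaves the state unchanged up to a phase, so `fidelity([2.0] * n, T)` reduces to the overlap of the initial and target states.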
![Page 9: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/9.jpg)
What is Reinforcement Learning?
[Diagram: agent and ENVIRONMENT, linked by action and reward]
Silver et al., Nature 529 (2016) [Google DeepMind]
![Page 11: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/11.jpg)
Reinforcement Learning in a Nutshell
• Machine Learning
  • Supervised Learning
  • Reinforcement Learning (RL)
  • Unsupervised Learning
[Diagram: agent and ENVIRONMENT]
• RL states $S = \{s\}$; here $S = \{(t, h(t))\}$
• available actions $A = \{a\}$; here $A = \{a = \delta h\} = \{0, \pm 0.1, \pm 0.2, \pm 0.5, \pm 1.0, \pm 2.0, \pm 4.0, \pm 8.0\}$
• rewards $R = \{r\}$; here $R = \{r(t) \in [0, 1]\}$
• cumulative expected reward or Q-function $Q(s, a)$: encodes experience/knowledge
GOAL: maximise cumulative expected reward / Q-function $Q(s, a)$, starting from state $s$ and taking action $a$
Sutton and Barto, Reinforcement Learning: an Introduction, MIT press (2015)
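To make the role of $Q(s, a)$ concrete, here is a minimal sketch of the standard tabular Q-learning update, $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$. The slides do not specify which update rule was used, so this is an illustrative assumption; the names `make_q_table` and `q_update`, the learning rate `alpha`, the discount `gamma`, and the choice of storing the state as `(step_index, h)` are all hypothetical.

```python
from collections import defaultdict

# Action set from the slide: changes delta_h of the control field.
ACTIONS = [0.0, 0.1, -0.1, 0.2, -0.2, 0.5, -0.5,
           1.0, -1.0, 2.0, -2.0, 4.0, -4.0, 8.0, -8.0]


def make_q_table():
    """Q-table mapping (state, action) -> value, initialised to zero."""
    return defaultdict(float)


def q_update(Q, s, a, r, s_next, terminal, alpha=0.1, gamma=1.0):
    """One tabular Q-learning step (assumed update rule, not from the slides).

    s and s_next are hashable states, e.g. (step_index, h).
    At a terminal state there is no future reward to bootstrap from.
    """
    best_next = 0.0 if terminal else max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


if __name__ == "__main__":
    Q = make_q_table()
    # Final step of an episode: the reward is the fidelity (here a made-up 0.9).
    q_update(Q, (1, 0.0), 0.5, 0.9, (2, 0.5), terminal=True, alpha=0.5)
    # Earlier step: zero reward, but value propagates back from the next state.
    q_update(Q, (0, -1.0), 1.0, 0.0, (1, 0.0), terminal=False, alpha=0.5)
    print(dict(Q)[((0, -1.0), 1.0)])
```

Because the reward is nonzero only at the end of an episode, information about good protocols reaches early time steps only through the bootstrapped `best_next` term.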
![Page 18: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/18.jpg)
RL Applied to Quantum State Preparation
[Diagram: agent and ENVIRONMENT with numbered steps 1, 2, 3; the reward feeds a feedback loop that updates $Q(s, a)$]
$i\partial_t |\psi(t)\rangle = H(t) |\psi(t)\rangle$, $t \in [0, T]$
$H(t) = S^z + h(t) S^x$
$|\psi_i\rangle$: GS of $H_i = S^z - S^x$
$|\psi_*\rangle$: GS of $H_* = S^z + S^x$
reward: $r = \begin{cases} 0, & 0 \le t < T \\ |\langle\psi_*|\psi(t=T)\rangle|^2, & t = T \end{cases}$
1. start from state $s_0 = (t = 0, h = -1)$; take action $a_0$: $\delta h = 0.8$; go to state $s_1 = (t = 0.05, h = -0.2)$
2. solve Schrödinger Eq. and obtain the QM state $|\psi(\delta t)\rangle$
3. calculate reward $r$ and use it to update $Q(s, a)$, which in turn is used to choose subsequent actions
[Plot: protocol $h(t)$ versus $t$ up to $T$, built up step by step; $r = 0$ at intermediate steps, $r = |\langle\psi_*|\psi(t=T)\rangle|^2$ at $t = T$: episode completed]
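The three numbered steps above can be sketched as a single episode rollout. This is a hedged illustration rather than the speaker's implementation: `run_episode`, `choose_action` and `final_reward` are hypothetical names, `final_reward` stands in for solving the Schrödinger equation and computing the fidelity, and clipping the field to $[-4, 4]$ is an assumed way of enforcing the bound on $h(t)$. The defaults $h_0 = -1$ and $\delta t = 0.05$ follow the example on the slide.

```python
def run_episode(choose_action, final_reward, n_steps=20, dt=0.05,
                h0=-1.0, h_max=4.0):
    """Roll out one episode of the (t, h) control problem.

    choose_action(state) -> delta_h            (hypothetical policy callable)
    final_reward(protocol) -> float in [0, 1]  (hypothetical; e.g. the fidelity)

    Returns the protocol (field values after each action) and the list of
    transitions (state, action, reward, next_state), with state = (step, h).
    """
    h = h0
    protocol = []
    trajectory = []
    for step in range(n_steps):
        state = (step, h)
        dh = choose_action(state)
        # Assumption: fields outside [-h_max, h_max] are clipped to the bound.
        h_next = max(-h_max, min(h_max, h + dh))
        protocol.append(h_next)
        done = step == n_steps - 1
        # The reward is zero until the final time t = T = n_steps * dt.
        reward = final_reward(protocol) if done else 0.0
        trajectory.append((state, dh, reward, (step + 1, h_next)))
        h = h_next
    return protocol, trajectory


if __name__ == "__main__":
    # Toy usage: always raise the field by 0.8 and use a constant dummy reward.
    protocol, trajectory = run_episode(lambda s: 0.8, lambda p: 0.5, n_steps=3)
    print(protocol)
    print([r for (_, _, r, _) in trajectory])
```

Each transition in `trajectory` has the shape needed for a Q-function update, and only the last one carries a nonzero reward.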
problems:
• state space exponentially big
• how do we choose actions? biased MC sampling
• exploration-exploitation dilemma
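One common way to trade off exploration against exploitation is epsilon-greedy action selection: with probability `epsilon` pick a random action, otherwise pick the action with the largest current $Q(s, a)$. The slides only mention "biased MC sampling", so whether this particular rule was used is an assumption; the function name `epsilon_greedy` and the dictionary layout of `Q` are hypothetical.

```python
import random


def epsilon_greedy(Q, state, actions, epsilon, rng=random):
    """Pick an action: explore with probability epsilon, else exploit Q.

    Q is a dict mapping (state, action) -> value; missing entries count as 0.
    rng is any object with random() and choice(), e.g. random.Random(seed).
    """
    if rng.random() < epsilon:
        return rng.choice(actions)  # explore: sample uniformly
    # exploit: the sampling is biased towards the best-known action
    return max(actions, key=lambda a: Q.get((state, a), 0.0))


if __name__ == "__main__":
    Q = {((0, -4.0), 4.0): 0.7, ((0, -4.0), -4.0): 0.2}
    rng = random.Random(0)
    picks = [epsilon_greedy(Q, (0, -4.0), [-4.0, 4.0], 0.1, rng) for _ in range(10)]
    print(picks)
```

A typical design choice is to start with a large `epsilon` and decrease it over episodes, so that early episodes explore and later ones refine the best protocol found so far.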
![Page 32: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/32.jpg)
Reinforcement Learning Quantum State Preparation
arXiv: 1705.00565 (2017)
$H(t) = -S^z - h_x(t) S^x$
bang-bang protocols: $h \in \{\pm 4\}$
the learning process
[Plot: protocol $h_x(t)$ versus $t$ up to $T$; episode completed with $r = |\langle\psi_*|\psi(t=T)\rangle|^2$]
Chen et al, IEEE 25, 90 9920 (2014)
How hard is this optimisation problem?
![Page 35: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/35.jpg)
RL-inspired Discovery: Phase Diagram of Quantum Control
arXiv: 1705.00565 (2017)
$H(t) = -S^z - h_x(t) S^x$
bang-bang protocols
infidelity landscape (schematic): $h \mapsto 1 - F_h(T)$
infidelity landscape minima: $\{h^\alpha\}$
$\bar{h}(t) = \frac{1}{\#\mathrm{real}} \sum_\alpha h^\alpha(t)$
Edwards-Anderson-like order parameter: $q(T) \sim \sum_t \{h(t) - \bar{h}(t)\}^2$
[Plot: $F_h(T)$ and $q(T)$ versus $T \in [0, 4]$, with $T_c$ and $T_\mathrm{min}$ marked, regions I, II, III, and points (i), (ii), (iii)]
[Plots: bang-bang protocols $h(t) \in [-4, 4]$ versus $t$: (i) $t \in [0, 0.5]$, $F_h(T) = 0.633$; (ii) $t \in [0, 1]$, $F_h(T) = 0.845$; (iii) $t \in [0, 2]$, $F_h(T) = 1.000$]
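The order parameter can be computed directly from a set of locally optimal protocols $\{h^\alpha\}$. The sketch below is an illustration under stated assumptions: the slide gives $q(T)$ only up to proportionality, so the normalisation by $h_\mathrm{max}^2$, the number of time steps and the number of realisations (chosen so that $q \in [0, 1]$ for $h = \pm 4$) is an assumption, as are the function names `mean_protocol` and `order_parameter`.

```python
def mean_protocol(protocols):
    """h_bar(t): average of the protocols h^alpha(t) over all realisations."""
    n_real = len(protocols)
    n_steps = len(protocols[0])
    return [sum(p[j] for p in protocols) / n_real for j in range(n_steps)]


def order_parameter(protocols, h_max=4.0):
    """Edwards-Anderson-like q: mean squared deviation from h_bar.

    Assumed normalisation: divide by h_max^2, the number of time steps and
    the number of realisations, so that 0 <= q <= 1 for bang-bang protocols.
    """
    h_bar = mean_protocol(protocols)
    n_real, n_steps = len(protocols), len(h_bar)
    total = sum((p[j] - h_bar[j]) ** 2
                for p in protocols for j in range(n_steps))
    return total / (n_real * n_steps * h_max ** 2)


if __name__ == "__main__":
    # All minima identical: q = 0 (a single well-defined optimal protocol).
    print(order_parameter([[4.0, -4.0, 4.0], [4.0, -4.0, 4.0]]))
    # Minima maximally different: q = 1.
    print(order_parameter([[4.0, -4.0], [-4.0, 4.0]]))
```

With this normalisation, $q = 0$ means every realisation found the same protocol, and a nonzero $q$ signals many distinct minima of the infidelity landscape.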
![Page 42: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/42.jpg)
Nature of Control Phase Transitions
$H(t) = -S^z - h_x(t) S^x$
[Plot: bang-bang protocol $h_x(t) \in [-4, 4]$ versus $t \in [0, 1]$, redrawn as Ising spins on lattice sites; panel (ii), $F_h(T) = 0.845$]
one-to-one correspondence:
• bang-bang protocol ↔ classical spin state
• infidelity ↔ energy
effective classical energy function governs control phase transitions
$H_\mathrm{eff}(T) = I(T) + \sum_j G_j(T) h_j + \sum_{ij} J_{ij}(T) h_i h_j + \sum_{ijk} K_{ijk}(T) h_i h_j h_k + \dots$
$j$: sites on time lattice
control phase transitions: classical (?), non-equilibrium
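The correspondence between protocols and spin states suggests treating the search for the best bang-bang protocol as a ground-state search. The sketch below evaluates $H_\mathrm{eff}$ truncated at second order and finds its minimum by brute force over all $2^N$ protocols. Everything specific here is an assumption for illustration: the names `effective_energy` and `ground_state_search`, the restriction of the pair sum to $i \neq j$, and the toy couplings in the usage example, which are made up and not derived from the quantum problem.

```python
from itertools import product


def effective_energy(h, offset=0.0, G=None, J=None):
    """H_eff = I + sum_j G_j h_j + sum_{i != j} J_ij h_i h_j (higher orders dropped).

    h is a sequence of field values, one per time-lattice site.
    offset plays the role of I(T); G and J are hypothetical couplings.
    """
    n = len(h)
    energy = offset
    if G is not None:
        energy += sum(G[j] * h[j] for j in range(n))
    if J is not None:
        energy += sum(J[i][j] * h[i] * h[j]
                      for i in range(n) for j in range(n) if i != j)
    return energy


def ground_state_search(n_steps, h_max=4.0, **couplings):
    """Exhaustive search over all 2**n_steps bang-bang protocols."""
    candidates = product((-h_max, h_max), repeat=n_steps)
    return min(candidates, key=lambda h: effective_energy(h, **couplings))


if __name__ == "__main__":
    # Toy couplings (made up): a local field on each time-lattice site.
    G = [1.0, -1.0, 0.5]
    best = ground_state_search(3, G=G)
    print(best, effective_energy(best, G=G))
```

The number of candidates doubles with every additional time step, so exhaustive search is only feasible for short time lattices.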
![Page 48: Reinforcement Learning in BU Phases of Quantum Controlcnls.lanl.gov/external/piml/Marin Bukov.pdf · teach reinforcement learning agent to prepare states in non-integrable quantum](https://reader034.vdocument.in/reader034/viewer/2022051915/6006ee7270b9350003691aa5/html5/thumbnails/48.jpg)
Outlook
web: mgbukov.github.io
arXiv: 1705.00565 (2017)
arXiv: 1711.09109 (2017)
• one can teach a reinforcement learning agent to prepare quantum states at short times with high fidelity
• finding optimal driving protocol as hard as searching for absolute GS of a spin glass (even if system is disorder-free)
• quantum control problems have extremely rich phase diagrams
  • with overconstrained, controllable, correlated and glassy phases -> POSTER (Alex Day)
  • exhibit symmetry breaking -> POSTER (MB)
• control phase transitions: classical & nonequilibrium, generic?
QuSpin: weinbe58.github.io/QuSpin/
open-source Python package for ED and quantum dynamics of arbitrary boson, fermion and spin many-body systems, supporting various (user-defined) symmetries and time evolution.
SciPost Phys. 2, 003 (2017)