Tabular Methods & Value Approximation Review · avereshc/rl_fall19/lecture_18_tabular...
TRANSCRIPT
![Page 1: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/1.jpg)
TABULAR METHODS & VALUE APPROXIMATION REVIEW
Lecture 18
CSE4/510: Reinforcement Learning
October 24, 2019
![Page 2: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/2.jpg)
MDP
![Page 3: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/3.jpg)
MDP
![Page 4: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/4.jpg)
MDP
![Page 5: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/5.jpg)
MDP
![Page 6: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/6.jpg)
MDP
![Page 7: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/7.jpg)
MDP
![Page 8: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/8.jpg)
MDP
![Page 9: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/9.jpg)
MDP
[Slide figure: items 1-5 and labels A-F; their content is not recoverable from the transcript]
![Page 10: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/10.jpg)
MDP
[Slide figure: items 1-5 and labels A-F; their content is not recoverable from the transcript]
![Page 11: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/11.jpg)
Dynamic Programming
![Page 12: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/12.jpg)
Dynamic Programming
![Page 13: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/13.jpg)
Dynamic Programming
![Page 14: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/14.jpg)
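Dynamic Programming
1. Policy Evaluation
2. Policy Improvement
[Slide figure: options A and B shown next to the two steps; their content is not recoverable from the transcript]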
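Policy evaluation and policy improvement alternate in policy iteration. The following is a minimal sketch on a hypothetical two-state MDP; the transition table `P`, the discount `GAMMA`, and all function names are assumptions made for illustration and are not taken from the lecture.

```python
# Hypothetical 2-state, 2-action MDP: P[s][a] = list of (prob, next_state, reward).
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9


def q_value(V, s, a):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def policy_evaluation(policy, tol=1e-8):
    """Step 1: sweep the Bellman expectation backup until V stops changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def policy_improvement(V):
    """Step 2: act greedily with respect to the current value function."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}


def policy_iteration():
    """Alternate the two steps until the policy no longer changes."""
    policy = {s: 0 for s in P}
    while True:
        V = policy_evaluation(policy)
        new_policy = policy_improvement(V)
        if new_policy == policy:
            return policy, V
        policy = new_policy


policy, V = policy_iteration()
```

Both steps read the transition table `P` directly, which is why this family of methods needs a known model of the environment.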
![Page 15: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/15.jpg)
![Page 16: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/16.jpg)
Dynamic Programming
| Policy | Model | Pros | Cons | Applications |
| --- | --- | --- | --- | --- |
| On-policy | Model-based | Finds optimal policies in polynomial time for most cases; guaranteed to find the optimal policy | Requires knowledge of the transition probabilities, which is an unrealistic requirement for many problems | Environments for which the state transition probabilities are known |
![Page 17: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/17.jpg)
Distribution vs Sample Model
1. Distribution model
2. Sample model
A. Produce a single outcome taken according to its probability of occurring
B. List all possible outcomes and their probabilities
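In the standard usage of these terms, a distribution model is the one that lists every outcome with its probability, and a sample model is the one that produces a single draw. A minimal sketch of the difference, where the outcome table `OUTCOMES` and the function names are made up for illustration:

```python
import random

# Hypothetical model of one (state, action) pair: next state -> probability.
OUTCOMES = {"s1": 0.7, "s2": 0.2, "s3": 0.1}


def distribution_model():
    """Lists every possible outcome together with its probability."""
    return dict(OUTCOMES)


def sample_model(rng):
    """Produces a single outcome, drawn according to its probability."""
    states = list(OUTCOMES)
    weights = [OUTCOMES[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


rng = random.Random(0)
samples = [sample_model(rng) for _ in range(10_000)]
freq_s1 = samples.count("s1") / len(samples)  # should be close to 0.7
```

A distribution model can always be used to build a sample model, as `sample_model` does here, but not the other way around without many samples.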
![Page 18: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/18.jpg)
Optimal Functions
[Slide figure: labels A-C and items 1-3; their content is not recoverable from the transcript]
![Page 19: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/19.jpg)
Optimal Functions
[Slide figure: labels A-C and items 1-3; their content is not recoverable from the transcript]
![Page 20: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/20.jpg)
Update Functions
1. Dynamic Programming
2. Monte Carlo
3. Q-learning
4. Temporal Difference
[Slide figure: update equations labeled A-D; the equations are not recoverable from the transcript]
![Page 21: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/21.jpg)
Update Functions
1. Dynamic Programming
2. Monte Carlo
3. Q-learning
4. Temporal Difference
[Slide figure: update equations labeled A-D; the equations are not recoverable from the transcript]
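For reference, the four methods listed here are usually written with the following updates. These are the common textbook forms (Sutton and Barto notation), given as an assumption about what the slide shows rather than a copy of its equations; which of the labels A-D each one corresponds to is not known.

```latex
\begin{align*}
\text{Dynamic Programming:}\quad
  & V(s) \leftarrow \sum_{a}\pi(a\mid s)\sum_{s',r} p(s',r\mid s,a)\,\bigl[r+\gamma V(s')\bigr] \\
\text{Monte Carlo:}\quad
  & V(S_t) \leftarrow V(S_t)+\alpha\,\bigl[G_t - V(S_t)\bigr] \\
\text{Temporal Difference, TD(0):}\quad
  & V(S_t) \leftarrow V(S_t)+\alpha\,\bigl[R_{t+1}+\gamma V(S_{t+1}) - V(S_t)\bigr] \\
\text{Q-learning:}\quad
  & Q(S_t,A_t) \leftarrow Q(S_t,A_t)+\alpha\,\bigl[R_{t+1}+\gamma\max_{a}Q(S_{t+1},a) - Q(S_t,A_t)\bigr]
\end{align*}
```

The dynamic programming backup takes an expectation over the model $p$, the Monte Carlo update needs the full return $G_t$, and the two temporal-difference forms bootstrap from the current estimate of the next state.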
![Page 22: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/22.jpg)
Overview
MC / DP / TD ?
![Page 23: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/23.jpg)
Monte Carlo
![Page 24: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/24.jpg)
Monte Carlo
![Page 25: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/25.jpg)
Monte Carlo
![Page 26: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/26.jpg)
Policy Model Pros Cons Applications
Monte Carlo
![Page 27: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/27.jpg)
Monte Carlo
| Policy | Model | Pros | Cons | Applications |
| --- | --- | --- | --- | --- |
| On-policy | Model-free | Learns optimal behavior directly from interaction with the environment; can focus on a region of special interest and evaluate it accurately | Requires a terminal state; must wait until the end of an episode before the return is known, which becomes too slow for problems with very long episodes | Episodic tasks only; cannot be used on continuing tasks |
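The requirement to wait for the end of an episode is visible in a first-visit Monte Carlo prediction sketch. The episode format, the discount `GAMMA`, and the two hand-written episodes below are assumptions chosen for illustration.

```python
from collections import defaultdict

GAMMA = 1.0


def first_visit_mc(episodes):
    """Estimate V(s) as the average first-visit return over complete episodes.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received after leaving that state.
    """
    returns = defaultdict(list)
    for episode in episodes:
        # Record the first time step at which each state appears.
        first_visit = {}
        for t, (state, _) in enumerate(episode):
            first_visit.setdefault(state, t)
        G = 0.0
        # Walk backwards: the return is only known once the episode has ended.
        for t in range(len(episode) - 1, -1, -1):
            state, reward = episode[t]
            G = reward + GAMMA * G
            if first_visit[state] == t:
                returns[state].append(G)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


# Two hand-written episodes from a hypothetical chain A -> B -> terminal.
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 0.0), ("B", 3.0)],
]
V = first_visit_mc(episodes)
```

No transition probabilities appear anywhere in the function, only sampled experience, which is what makes the method model-free.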
![Page 28: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/28.jpg)
SARSA vs Q-learning
![Page 29: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/29.jpg)
SARSA
![Page 30: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/30.jpg)
Policy Model Pros Cons Applications
SARSA
![Page 31: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/31.jpg)
Q-learning
![Page 32: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/32.jpg)
Q-Learning
Policy Model Pros Cons Applications
![Page 33: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/33.jpg)
Q-Learning
| Policy | Model | Pros | Cons | Applications |
| --- | --- | --- | --- | --- |
| Off-policy | Model-free | Easy to implement | Memory requirement increases with the number of states; does not perform well in stochastic environments | Environments with a limited number of states and discrete action spaces |
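The on-policy versus off-policy distinction between SARSA and Q-learning comes down to one term in the update target. A minimal sketch with a hypothetical two-state Q-table; the step size `ALPHA`, the discount `GAMMA`, and the state and action names are assumptions.

```python
ALPHA, GAMMA = 0.5, 0.9


def sarsa_update(Q, s, a, r, s2, a2):
    """On-policy: bootstrap from the action a2 the agent actually takes next."""
    target = r + GAMMA * Q[s2][a2]
    Q[s][a] += ALPHA * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s2):
    """Off-policy: bootstrap from the greedy action in s2, whatever is taken."""
    target = r + GAMMA * max(Q[s2].values())
    Q[s][a] += ALPHA * (target - Q[s][a])


def make_table():
    # Hypothetical Q-table: one row per state, one entry per discrete action.
    return {
        "s0": {"left": 0.0, "right": 0.0},
        "s1": {"left": 1.0, "right": 2.0},
    }


Q_sarsa, Q_qlearn = make_table(), make_table()
# Same transition for both; SARSA additionally sees the next action "left".
sarsa_update(Q_sarsa, "s0", "right", 1.0, "s1", "left")
q_learning_update(Q_qlearn, "s0", "right", 1.0, "s1")
```

The table holds one entry per state-action pair, which is the source of the memory cost noted above and the reason the method suits small, discrete problems.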
![Page 34: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/34.jpg)
Function Approximation
![Page 35: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/35.jpg)
Function Approximation
![Page 36: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/36.jpg)
Function Approximation
![Page 37: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/37.jpg)
Function Approximation
![Page 38: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/38.jpg)
Function Approximation
![Page 39: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/39.jpg)
Function Approximation
![Page 40: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/40.jpg)
Deep Q-network
![Page 41: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/41.jpg)
Deep Q-network
![Page 42: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/42.jpg)
Deep Q-network
![Page 43: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/43.jpg)
Deep Q-network (DQN)
Policy Model Pros Cons Applications
![Page 44: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/44.jpg)
Deep Q-network (DQN)
| Policy | Model | Pros | Cons | Applications |
| --- | --- | --- | --- | --- |
| Off-policy | Model-free | Can generalize to unseen states; input is just a state | May overestimate values; not applicable to continuous action spaces | Environments with a limited number of states and discrete action spaces |
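"Input is just a state" means the network maps a state to one Q-value per action, and the agent then takes an argmax over that finite list. The sketch below uses a single random linear layer as a stand-in for the deep network, purely to stay dependency-free; the sizes, weights, and names are all assumptions.

```python
import random

N_FEATURES, N_ACTIONS = 4, 3

# Stand-in for a deep network: one random linear layer (an assumption made
# for this sketch; a real DQN uses a trained multi-layer network).
rng = random.Random(0)
WEIGHTS = [[rng.uniform(-1, 1) for _ in range(N_FEATURES)]
           for _ in range(N_ACTIONS)]


def q_values(state):
    """Map a state vector to one Q-value per discrete action."""
    return [sum(w * x for w, x in zip(row, state)) for row in WEIGHTS]


def greedy_action(state):
    """Pick the action with the largest Q-value; needs a finite action set."""
    qs = q_values(state)
    return max(range(N_ACTIONS), key=lambda a: qs[a])


state = [0.5, -0.2, 0.1, 0.9]  # any state vector works, seen before or not
qs = q_values(state)
action = greedy_action(state)
```

Because the function is parametric rather than a table, it returns values for states it has never visited. The argmax in `greedy_action` enumerates the actions, which is why this form does not carry over directly to continuous action spaces.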
![Page 45: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/45.jpg)
Double Deep Q-network
![Page 46: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/46.jpg)
Double Deep Q-network
![Page 47: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/47.jpg)
Double Deep Q-network
![Page 48: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/48.jpg)
Double Deep Q-network
![Page 49: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/49.jpg)
Double Deep Q-network (DDQN)
Policy Model Pros Cons Applications
![Page 50: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/50.jpg)
Double Deep Q-network (DDQN)
| Policy | Model | Pros | Cons | Applications |
| --- | --- | --- | --- | --- |
| Off-policy | Model-free | Value estimation is more accurate compared to DQN; input is just a state | May take longer to train | Environments with a limited number of states and discrete action spaces |
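The accuracy gain over DQN comes from how the bootstrap target is formed. The sketch below follows the commonly cited Double DQN formulation, in which the online network selects the next action and the target network evaluates it; the Q-value lists and `GAMMA` are hypothetical numbers chosen for illustration, not values from the lecture.

```python
GAMMA = 0.99


def dqn_target(reward, q_target_next, done):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + GAMMA * max(q_target_next)


def double_dqn_target(reward, q_online_next, q_target_next, done):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + GAMMA * q_target_next[best]


# Hypothetical Q-value estimates for the next state, one entry per action.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [5.0, 2.0, 4.0]
y_dqn = dqn_target(1.0, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, q_online_next, q_target_next, done=False)
```

The Double DQN target can never exceed the DQN target computed from the same target-network values, since it evaluates one particular action instead of taking the maximum. That is the mechanism that reduces overestimation.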
![Page 51: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/51.jpg)
Dueling Deep Q-network (Dueling DQN)
![Page 52: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/52.jpg)
Dueling Deep Q-network (Dueling DQN)
![Page 53: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/53.jpg)
Dueling Deep Q-network (Dueling DQN)
![Page 54: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/54.jpg)
Dueling Deep Q-network (Dueling DQN)
![Page 55: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/55.jpg)
Dueling Deep Q-network (Dueling DQN)
![Page 56: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/56.jpg)
Dueling Deep Q-network (Dueling DQN)
Policy Model Pros Cons Applications
![Page 57: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/57.jpg)
Prioritized Experience Replay (PER)
![Page 58: TABULAR METHODS & VALUE APPROXIMATION REVIEWavereshc/rl_fall19/lecture_18_Tabular... · 1 TABULAR METHODS & VALUE APPROXIMATION REVIEW Lecture 18 CSE4/510: Reinforcement Learning](https://reader034.vdocument.in/reader034/viewer/2022042521/5f73ee702b40cf51b9725761/html5/thumbnails/58.jpg)
PER