Human level control through deep reinforcement learning Naiyan Wang

Upload: emory-dalton

Post on 21-Dec-2015


TRANSCRIPT

Page 1: Human level control through deep reinforcement learning Naiyan Wang

Human level control through deep reinforcement learning

Naiyan Wang

Page 2:

Part 1

Q Learning

Page 3:

Q Learning

State Action Reward

Page 4:

Q Learning

New State Old State Reward

Learning Rate Discount Factor
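
The labels on this slide (new state, old state, reward, learning rate, discount factor) match the terms of the textbook tabular Q-learning update; the equation itself did not survive in the transcript. A minimal sketch of one update step, assuming the standard form Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)). The function name, the dictionary-based table and the toy states are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def q_learning_update(q, old_state, action, reward, new_state, actions,
                      learning_rate=0.1, discount_factor=0.9):
    """Apply one textbook Q-learning update in place and return the new value.

    `q` maps (state, action) pairs to estimated values. This helper is an
    illustrative assumption, not code from the slides.
    """
    # Best value reachable from the new state under the current estimates.
    best_next = max(q[(new_state, a)] for a in actions)
    # Target: immediate reward plus discounted estimate of future value.
    target = reward + discount_factor * best_next
    # Move the old estimate a fraction (learning_rate) of the way to the target.
    q[(old_state, action)] += learning_rate * (target - q[(old_state, action)])
    return q[(old_state, action)]


# Toy example with made-up states, actions and values.
q_table = defaultdict(float)
q_table[("s1", "left")] = 2.0
new_value = q_learning_update(q_table, "s0", "right", reward=1.0,
                              new_state="s1", actions=["left", "right"])
```

In this toy run the target is 1.0 + 0.9 × 2.0 = 2.8, so the estimate for ("s0", "right") moves from 0 to roughly 0.28, one tenth of the way toward the target.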

Page 5:

Part 2

Deep Q Learning

Page 6:

Traditional Cooking

Page 7:

Traditional Cooking

Page 8:

Traditional Cooking

Page 9:

Traditional Cooking

Page 10:

Traditional Cooking

Page 11:

End to End Cooking

Page 12:

End to End Learning

Page 13:

Formulation

Target Variable

1

2

3
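
The formulas for the target variable did not survive in the transcript. A minimal sketch of how such a target is usually computed, assuming the standard deep Q-learning form: the reward alone for a terminal transition, otherwise the reward plus the discounted maximum of the next-state Q-values, with the network trained on the squared difference between its prediction and that target. The function names and the numbers are hypothetical, and a plain list stands in for the network's output.

```python
def dqn_target(reward, next_q_values, terminal, discount_factor=0.99):
    """Regression target for one transition, in the standard deep Q-learning form.

    `next_q_values` stands in for the Q-values a network would predict for
    every action in the next state (an assumption made for illustration).
    """
    if terminal:
        # No future value once the episode has ended.
        return reward
    # Bootstrap from the best predicted action value in the next state.
    return reward + discount_factor * max(next_q_values)


def squared_td_error(predicted_q, target):
    """Squared difference between the predicted Q-value and its target."""
    return (target - predicted_q) ** 2


# Made-up numbers for a single non-terminal transition.
target = dqn_target(reward=1.0, next_q_values=[0.5, 2.0, 1.0],
                    terminal=False, discount_factor=0.5)
loss = squared_td_error(predicted_q=1.5, target=target)
```

Here the target is 1.0 + 0.5 × 2.0 = 2.0 and the squared error against a prediction of 1.5 is 0.25.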

Page 14:

Results Analysis

DQN is good at …

DQN is bad at …

Page 15:

Part 3

Discussion

Page 16:

Discussion

Q: What is the key contributing factor?

A: Almost unlimited training data

Q: How to account for long-term dependency?

A: Long short-term memory may be the solution

Page 17:

Thank You