Human level control through deep reinforcement learning Naiyan Wang

Upload: emory-dalton

Post on 21-Dec-2015

TRANSCRIPT

Human level control through deep reinforcement learning

Naiyan Wang

Part 1: Q Learning

Q Learning

State Action Reward

Q Learning

New State Old State Reward

Learning Rate Discount Factor
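
The equation these labels annotate did not survive in the transcript. The labels match the standard tabular Q-learning update, Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)], where α is the learning rate and γ is the discount factor. A minimal sketch of that update follows, assuming this is the rule the slide showed; the function name `q_update`, the state and action names, and the values of `alpha` and `gamma` are illustrative and do not come from the slides.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q          : mapping from (state, action) to the current value estimate
    state      : the old state, where the action was taken
    next_state : the new state reached after the action
    alpha      : learning rate (illustrative default)
    gamma      : discount factor (illustrative default)
    """
    old_value = q[(state, action)]
    # Best value currently estimated for any action in the new state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Move the old estimate a fraction alpha toward the bootstrapped target.
    target = reward + gamma * best_next
    q[(state, action)] = old_value + alpha * (target - old_value)
    return q[(state, action)]


# Hypothetical two-action problem with all values initialised to zero.
q = defaultdict(float)
actions = ["left", "right"]
new_value = q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
```

With every estimate starting at zero, the first update simply moves the value of the chosen state-action pair a fraction `alpha` of the way toward the observed reward.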

Part 2: Deep Q Learning

Traditional Cooking

End to End Cooking

End to End Learning

Formulation

Target Variable
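
The formula on this slide is missing from the transcript. In the deep Q-learning setup the talk discusses, the regression target for a transition (s, a, r, s') is usually written y = r + γ max_a' Q(s', a'; θ⁻), with y = r when the episode ends at s', and the network is trained to reduce (y − Q(s, a; θ))². The sketch below computes such targets for a small batch, assuming that formulation; the helper names, the batch values and `gamma` are hypothetical, and plain lists stand in for network outputs.

```python
def dqn_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute one regression target per transition.

    rewards       : reward received for each transition
    next_q_values : per-transition list of Q-values for every action in the
                    next state (assumed to come from a separate target network)
    dones         : True where the next state is terminal
    """
    targets = []
    for reward, q_next, done in zip(rewards, next_q_values, dones):
        # Terminal transitions have no future value to bootstrap from.
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(reward + bootstrap)
    return targets


def mean_squared_td_error(q_values, actions, targets):
    """Average squared gap between each target and the Q-value of the action taken."""
    errors = [(y - q_row[a]) ** 2 for q_row, a, y in zip(q_values, actions, targets)]
    return sum(errors) / len(errors)


# Hypothetical batch of two transitions with two actions each.
rewards = [1.0, 0.0]
next_q = [[0.5, 2.0], [1.0, 3.0]]
dones = [False, True]
targets = dqn_targets(rewards, next_q, dones, gamma=0.5)

q_values = [[1.0, 0.0], [0.0, 1.0]]  # current network outputs (made up)
actions = [0, 1]                     # actions actually taken
loss = mean_squared_td_error(q_values, actions, targets)
```

In a full training loop the targets are treated as constants and only the Q-values of the actions actually taken receive a gradient.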

Results Analysis

DQN is good at … DQN is bad at …

Part 3: Discussion

Discussion

Q: What is the key contributing factor?

A: Almost unlimited training data

Q: How to account for long-term dependency?

A: Long short-term memory may be the solution

Thank You
