Artificial Intelligence - Unite Seoul · uniteseoul.com/2019/pdf/d2t5s4.pdf · 2020. 2. 6. · obstacle...

Artificial Intelligence & Deep Learning

Artificial Intelligence

3

— Classification

— Regression

— Natural Language Processing

— Object Detection

— Generative Model

— Reinforcement Learning

4

Deep Learning

Processing unit of Brain

Deep Learning

5

Processing unit of Neural Network

Input Output

Deep Learning

6

~ 2010

Deep Learning

7

Algorithms GPU Big Data

Deep Learning

8

2010~

Classification

9

Regression

10

Natural Language Processing

11

— OpenAI’s GPT-2

Object Detection

12

Generative Model

13

Generative Model

14

Generative Model

15

Reinforcement Learning

16

Deep Learning in Games

Generative Adversarial Networks (GAN)

Deep Learning in Games

18

Reinforcement Learning (RL)

VS

GAN

19

Real?

Fake?

Real

Fake

GAN

20

Real

Fake

Real?

Fake?

Generator

Discriminator

GAN

21

Training
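A minimal sketch of the two objectives that pull against each other during GAN training, assuming the discriminator outputs a probability in (0, 1) that its input is real. The function names are hypothetical, and the generator loss shown is the commonly used non-saturating variant, which is not necessarily the one behind these slides.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator.

    d_real: discriminator outputs on real samples, each strictly in (0, 1)
    d_fake: discriminator outputs on generated samples, each strictly in (0, 1)
    The loss is low when real samples score near 1 and fakes score near 0.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

Training alternates between lowering `discriminator_loss` with the generator held fixed and lowering `generator_loss` with the discriminator held fixed.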

GAN

22

Generator

Game Level Generation Using GAN

23



Design Using GAN

26

Generator

Condition

Design Using GAN

27

Generator

Sword


28

Reward

+ Reward - Reward


29

Reward

+ Reward - Reward


30

Agent Environment

State, Reward

Action
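The agent-environment loop in the diagram can be sketched as follows. This is an illustrative toy, not code from the talk: the `GridWorld` class, its reward values, and the policy functions are all assumptions made for the example.

```python
import random

class GridWorld:
    """Hypothetical 1-D corridor: start at cell 0, goal at the last cell."""

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos  # the state is just the agent's cell index

    def step(self, action):
        # action: 1 = move right, anything else = move left (clamped at the walls)
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.size - 1)
        done = self.pos == self.size - 1
        reward = 1.0 if done else -0.1  # + reward at the goal, - reward per step
        return self.pos, reward, done

def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward collected."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # agent sends an action
        state, reward, done = env.step(action)  # environment returns state, reward
        total += reward
        if done:
            break
    return total

def random_policy(state):
    return random.choice([0, 1])

def always_right(state):
    return 1
```

Calling `run_episode(GridWorld(5), always_right)` reaches the goal in four steps; `random_policy` usually collects less reward because every extra step costs 0.1.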


31

GridWorld Starcraft2




37

https://github.com/reinforcement-learning-kr/alpha_omok




41

— Multi-agent RL

— Meta RL

— Exploration

— Curiosity

— Parameter noise

— Model-based RL

— Sim2Real


42

https://www.facebook.com/groups/ReinforcementLearningKR/

AI in Unity

AI in Unity

44

Challenges Machine Learning Agents

ML-agents Challenge

45

ML-agents Challenge

46

ML-agents Challenge

47

Obstacle Tower Challenge

48


49

— Montezuma’s Revenge

— Challenges

— Sparse reward

— Hard exploration

— Requires planning

— Multi Task

Hard to Solve!!


50

— Montezuma’s Revenge is Solved!

— Demonstration

– Aytar et al. 2018

— Curiosity

– Pathak et al. 2017

– Burda et al. 2018

— Go-Explore

– Ecoffet et al. 2018

Go-Explore: a New Approach for Hard-Exploration Problems (Ecoffet et al.)




53

— 3D Visual Observation

— Complex floor layout

— Generalization

– Floor, Room, Wall

– Every 10 floors

— Multi Task

– Key

– Sokoban

– Pit


54

— ModuLabs CTRL Team

— Kyushik Min

— Jay Jung

— Suhyuk Park

— Hyojeong Jeon

— Round 1 is finished! ☺

— Curiosity based algorithm

— Average floor reached: 6th to 7th
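As a rough illustration of what "curiosity" means here, the sketch below computes an intrinsic reward from the prediction error of a forward model, the general idea behind Pathak et al. 2017. It is not the team's algorithm: the function names, the `scale` and `beta` parameters, and the use of plain feature lists are assumptions for the example.

```python
def intrinsic_reward(predicted_next, actual_next, scale=0.5):
    """Curiosity bonus: scaled squared error between the forward model's
    predicted next-state features and the features actually observed."""
    error = sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))
    return scale * error

def total_reward(extrinsic, predicted_next, actual_next, beta=0.01):
    """Environment reward plus a weighted curiosity bonus.

    beta is a hypothetical weight that trades off exploration
    against the (possibly sparse) extrinsic reward.
    """
    return extrinsic + beta * intrinsic_reward(predicted_next, actual_next)
```

States the model predicts poorly yield a larger bonus, so the agent is drawn toward unfamiliar parts of the environment even when the extrinsic reward is zero.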


55

Unity ML-agents

56

Unity ML-agents

57

— Reinforcement Learning

Agent Environment

State, Reward

Action

Unity ML-agents

58

— Deep Reinforcement Learning

Agent Environment

?

State, Reward

Action

Unity ML-agents

59


Agent Environment

?

State, Reward

Action

Unity ML-agents

60


Agent Environment

State, Reward

Action

Unity ML-agents

61

Academy

- Managing brains

- Configuration setting

Brain

- Observation setting

- Action setting

Agent

- Script for Agent

- Control Setting

- Reward, done setting

Unity ML-agents

62

Unity ML-agents

63

VS: Single Agent

: Multi-Agent

: Adversarial Agents

: Imitation Learning

Unity ML-agents

64

: Training

: Heuristic

: PlayerBrain

: External

: Internal

Unity ML-agents

65

Agent1

Brain1 (Heuristic)

Agent2

Brain2 (Internal)

Agent4

Brain3 (External)

Agent3 Agent5

Academy

Deep Learning

Unity ML-agents

66

Reinforcement Learning Unity

Unity ML-agents Tutorial

67

RL Korea Unity ML-agents Tutorial Team

https://github.com/reinforcement-learning-kr/Unity_ML_Agents

Sokoban

- Discrete Action

- Deep Q-Network (DQN)
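For discrete actions, DQN trains a network toward a one-step temporal-difference target. The sketch below shows only that target and a tabular-style update; the function names and default values are assumptions, and a real DQN would minimize the squared difference between the target and the network output, typically with a replay buffer and a target network.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step TD target: r if the episode ended,
    otherwise r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def q_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction lr of the way toward the target."""
    return q_value + lr * (target - q_value)
```

With one Q-value per discrete action (for example the four move directions in Sokoban), the agent acts by picking the action with the largest value.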


68

Drone

- Continuous Action

- Deep Deterministic Policy Gradient (DDPG)

Pong

- Adversarial Environment

- Discrete Action

- DQN

Sokoban (Curriculum)

- Curriculum Learning

- Discrete Action

- DQN
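Curriculum learning starts the agent on easy levels and raises the difficulty as it improves. A minimal sketch of one possible promotion rule, assuming lessons are numbered from 0 and the agent is promoted once its mean recent reward reaches a threshold; the function and parameter names are hypothetical and are not the tutorial's configuration.

```python
def next_lesson(lesson, recent_rewards, threshold, num_lessons):
    """Return the lesson index to train on next.

    Advance by one when the mean of recent episode rewards reaches
    the threshold; never go past the final lesson.
    """
    if not recent_rewards:
        return lesson
    mean_reward = sum(recent_rewards) / len(recent_rewards)
    if mean_reward >= threshold and lesson < num_lessons - 1:
        return lesson + 1
    return lesson
```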


69

Dodge

- Imitation Learning

- Discrete Action

- Behavioral Cloning
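Behavioral cloning treats imitation as supervised learning on recorded (state, action) pairs from a demonstrator. The sketch below is a deliberately tiny tabular version in which the cloned policy repeats the demonstrator's most frequent action for each state. The state and action names are made up for illustration, and a practical version would fit a neural network instead of a lookup table.

```python
from collections import Counter, defaultdict

def fit_behavioral_cloning(demonstrations):
    """Build a policy from (state, action) demonstration pairs.

    The returned policy maps each state seen in the demonstrations to
    the action the demonstrator chose most often in that state.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in counts.items()}

    def policy(state, default=None):
        # Unseen states fall back to the caller-supplied default action.
        return table.get(state, default)

    return policy
```

For example, after fitting on demonstrations where the player mostly moves right when a bullet comes from the left, the cloned policy does the same in that state.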

TwoLeg Walker

- Multi-agents

- Continuous Action

- MADDPG




76

Thank you

77

https://www.facebook.com/groups/ReinforcementLearningKR/
