
Chapter 7. Learning through Imitation and Exploration: Towards Humanoid Robots that Learn from Humans

in Creating Brain-like Intelligence

Course: Robots Learning from Humans

Min-Joon Kim

Intelligent Data Systems Lab.

School of Computer Science and Engineering

Seoul National University

September 18th, 2015

2

Contents

Introduction

Physics-Based Model

Dynamic Bayesian Network Model

Imitation Process

BABIL Imitation Learning Algorithm

Planning via Inference

Experiments: Learning Stable Full-Body Humanoid Motion via Imitation

Conclusion

Discussion

3

Introduction: Brain-Like Intelligence

Brain-like intelligence, our “goal”

From previous chapters…what is brain-like intelligence?

Two major obstacles:

  Lack of mechanisms for rapid learning
  https://youtu.be/l0N6mIpoN3M?t=37s

  Lack of the ability to handle uncertainty
  https://youtu.be/l0N6mIpoN3M?t=37s



4


What about people?

Growing evidence that the brain may rely on Bayesian principles for perception and action

Humans can learn new skills by simply watching other humans

But what about robots?

5


Obvious differences in structure, etc.

Example: Honda ASIMO

  The question: how much time and code does it take for the robot to kick a ball?

  We must keep in mind how “short” the action itself is

For a robot to “watch and learn”, it needs:

  Functional units for segmentation

  Recognition of human actions

  An algorithm for constructing an imitative motor plan

6


If a robot can learn from watching a “teacher”:

  Intuitive

  Easier due to kinematic similarities

  Can enable robots to perform novel behaviors, a.k.a. learning

But we must be wary…

  Similar but different: not exactly A = B

  Must be careful in handling uncertainty

7

Proposed Method

Bayesian framework for imitation-based learning in humanoid robots

Learning a predictive model of the robot's dynamics

Taking into account uncertainty and noise, plus mapping

8

Physics-Based Modeling

One can approximate a humanoid robot as a set of articulated rigid bodies

  A robot with N joints between N+1 rigid bodies

  Each joint possibly with multiple degrees of freedom

  Expressed in vector form as a six-dimensional motion vector

9

Physics-Based Modeling

Spatial acceleration of rigid body i:

Vector of all joint angles:
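The slide's own equations are not available in this transcript. As a rough guide only, the quantities named above are commonly written as follows in Featherstone-style spatial-vector notation. The hats, the symbols ω_i and v_i (angular and linear velocity of body i), and q_k for the k-th joint's angle(s) are assumptions, not symbols taken from the slide.

```latex
% Assumed notation, not copied from the slide.
\hat{v}_i = \begin{bmatrix} \omega_i \\ v_i \end{bmatrix} \in \mathbb{R}^{6},
\qquad
\hat{a}_i = \frac{d\hat{v}_i}{dt} = \begin{bmatrix} \dot{\omega}_i \\ \dot{v}_i \end{bmatrix},
\qquad
q = \begin{bmatrix} q_1 & q_2 & \cdots & q_N \end{bmatrix}^{\top}
```

For a joint with several degrees of freedom, q_k is itself a small vector, so q stacks all of the joint coordinates.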

10


Forward Kinematics

Computing the velocities and accelerations of all rigid bodies:
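The root-to-tip propagation behind forward kinematics can be sketched on a much simpler system than a humanoid. The example below assumes a planar serial chain of revolute joints, which is a simplification of the six-dimensional spatial formulation on the slides. The function name and arguments are hypothetical.

```python
import numpy as np

def planar_forward_kinematics(q, qd, qdd, lengths):
    """Propagate motion from the root to the tip of a planar serial chain.

    q, qd, qdd: relative joint angles (rad), rates (rad/s), accelerations (rad/s^2).
    lengths: link lengths (m).
    Returns the position, velocity, and acceleration of each link's far end.
    """
    theta = omega = alpha = 0.0  # absolute link angle, angular rate, angular acceleration
    pos = np.zeros(2)
    vel = np.zeros(2)
    acc = np.zeros(2)
    positions, velocities, accelerations = [], [], []
    for qi, qdi, qddi, li in zip(q, qd, qdd, lengths):
        # Each link inherits the motion of its parent and adds its own joint motion.
        theta += qi
        omega += qdi
        alpha += qddi
        r = li * np.array([np.cos(theta), np.sin(theta)])  # joint to link end
        r_perp = np.array([-r[1], r[0]])                   # r rotated by +90 degrees
        pos = pos + r
        vel = vel + omega * r_perp                         # v_end = v_joint + omega x r
        acc = acc + alpha * r_perp - omega**2 * r          # tangential + centripetal terms
        positions.append(pos)
        velocities.append(vel)
        accelerations.append(acc)
    return np.array(positions), np.array(velocities), np.array(accelerations)

# Two unit links, elbow bent 90 degrees, shoulder rotating at 1 rad/s.
positions, velocities, accelerations = planar_forward_kinematics(
    q=[0.0, np.pi / 2], qd=[1.0, 0.0], qdd=[0.0, 0.0], lengths=[1.0, 1.0])
```

In this configuration the tip sits at (1, 1) and the whole arm rotates rigidly about the origin, so the tip velocity and acceleration follow directly from circular motion, which gives an easy sanity check.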

11


Next, consider inertia and forces to model and constrain the dynamics

  The spatial inertia (I*) must be known or estimated

Forces denoted in spatial notation:

12


Combined Newton-Euler equation of motion for rigid body i:

Net external force must be known or estimated
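The equation itself did not survive in this transcript. In the standard spatial-vector form, the combined Newton-Euler equation for one rigid body is usually written as below. The hat notation, the superscript "net", and the use of \hat{I}_i for the spatial inertia (written I* on the earlier slide) are assumptions.

```latex
% Assumed standard form, not copied from the slide.
\hat{f}_i^{\,net} = \hat{I}_i \, \hat{a}_i + \hat{v}_i \times^{*} \hat{I}_i \, \hat{v}_i
```

Here \times^{*} is the spatial cross product for force vectors. The first term is the inertial force from the body's acceleration, and the second collects the velocity-dependent (gyroscopic) terms.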

13


Compute the force transmitted from parent:

Apply the above to compute the joint forces, working from the leaf nodes to the root:

  Extract the force components along each joint’s DOFs
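A common way to write this leaf-to-root pass is sketched below. It assumes all spatial forces are expressed in one common coordinate frame, so frame transforms are omitted. The superscripts J, net, and ext, the joint axis matrix S_i, and the torque symbol τ_i are assumed notation.

```latex
% Assumed standard recursion, not copied from the slide.
\hat{f}_i^{\,J} = \hat{f}_i^{\,net} - \hat{f}_i^{\,ext}
  + \sum_{j \in \mathrm{children}(i)} \hat{f}_j^{\,J},
\qquad
\tau_i = S_i^{\top} \hat{f}_i^{\,J}
```

The force through joint i must supply the net force on body i, minus whatever the external forces already provide, plus everything body i passes on to its children. Projecting onto S_i keeps only the components along the joint's degrees of freedom.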

14


We have formed the basis for solving the “inverse dynamics” problem:

  Given desired kinematics, compute the necessary joint torques

But! Problems!

  The formulation's relative simplicity makes real-world problems difficult to solve
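To make the inverse dynamics problem concrete, here is a minimal sketch for the simplest possible case: a frictionless point-mass pendulum with one joint. This is an illustrative stand-in, not the humanoid model from the slides. The function name, parameter values, and the 10% mass error are all assumptions chosen for the example.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def pendulum_inverse_dynamics(q, qdd, mass, length):
    """Joint torque needed to realize acceleration qdd at angle q.

    q is measured from the downward vertical (rad).
    Model: tau = m * l^2 * qdd + m * g * l * sin(q), with no friction.
    """
    inertia = mass * length**2
    gravity_torque = mass * G * length * math.sin(q)
    return inertia * qdd + gravity_torque

# Desired kinematics: arm horizontal (q = pi/2), accelerating at 2 rad/s^2.
tau_true = pendulum_inverse_dynamics(math.pi / 2, 2.0, mass=1.0, length=0.5)

# The same request computed with a model whose mass estimate is 10% too high.
tau_model = pendulum_inverse_dynamics(math.pi / 2, 2.0, mass=1.1, length=0.5)

torque_error = tau_model - tau_true
```

Even in this one-joint case, the commanded torque is only as good as the mass and length estimates. A full humanoid has many such parameters per link, plus contact forces that this sketch ignores entirely.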

15


The large number of quantities that MUST be known or accurately estimated is difficult to obtain

The formulation assumes that all external forces are known.

Can we know, exactly, the …

  Ground reaction force?

  Frictional forces?

  Gravity?

Are all the bodies in a robot completely rigid?

16

Bayesian Approaches to Uncertainty

Bayesian networks provide a theoretically sound approach to incorporating prior, yet uncertain, information

  That is, what we just “calculated” before!
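As a minimal sketch of combining prior but uncertain information with data, consider a single uncertain physical quantity, such as a link mass taken from a physics model, refined by noisy measurements. The conjugate Gaussian update below is a textbook technique used here for illustration. It is not the network from the chapter, and all numbers are made up.

```python
def gaussian_posterior(prior_mean, prior_var, observations, noise_var):
    """Posterior over an unknown scalar with a Gaussian prior and Gaussian noise.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the observations.
    """
    n = len(observations)
    post_var = 1.0 / (1.0 / prior_var + n / noise_var)
    post_mean = post_var * (prior_mean / prior_var + sum(observations) / noise_var)
    return post_mean, post_var

# Hypothetical numbers: the physics model says 2.0 kg but we only half trust it.
post_mean, post_var = gaussian_posterior(
    prior_mean=2.0, prior_var=0.25,
    observations=[2.4, 2.6, 2.5], noise_var=0.04)
```

The posterior mean lands between the prior guess and the measured average, and the posterior variance is smaller than the prior variance. The physics-based estimate is neither discarded nor trusted blindly.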

17

Dynamic Bayesian Network Model of the Imitation Learning Process

Two sources of information:

  Demonstrative

  Explorative

Selecting a set of actions based on probabilistic constraints:

  Matching

  Egocentric

18

Dynamic Bayesian Network Model of the Imitation Learning Process

Sources of uncertainty

Observing and imitating tasks is inherently difficult

Inter-trial variance of a human performing a skill

The need to predict future states of the agent (robot) given potential control values
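Predicting future states from candidate controls amounts to learning a forward model with an attached uncertainty. The sketch below assumes a one-dimensional linear-Gaussian system fitted by least squares on exploration data. This is a deliberately simple substitute for whatever model the chapter uses, and the system coefficients and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system, used only to generate exploration data:
# s_next = 0.9 * s + 0.5 * u + Gaussian noise
TRUE_A, TRUE_B, NOISE_STD = 0.9, 0.5, 0.05
s = rng.uniform(-1.0, 1.0, size=500)   # visited states
u = rng.uniform(-1.0, 1.0, size=500)   # random exploratory controls
s_next = TRUE_A * s + TRUE_B * u + rng.normal(0.0, NOISE_STD, size=500)

# Fit s_next ~ w[0] * s + w[1] * u by least squares.
X = np.column_stack([s, u])
w, *_ = np.linalg.lstsq(X, s_next, rcond=None)

# Residual spread gives a rough estimate of the prediction noise.
residual_std = np.std(s_next - X @ w, ddof=2)

def predict(state, control):
    """Mean and standard deviation of a Gaussian prediction of the next state.

    Only observation noise is modeled; uncertainty in w itself is ignored.
    """
    return w[0] * state + w[1] * control, residual_std

mean, std = predict(state=0.2, control=-0.4)
```

Returning a standard deviation along with the mean is the key point: the planner can then weigh candidate controls by how confidently their outcomes are predicted.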

19

The Generative Imitation Approach

The goal is to infer the posterior distributions over the random variables A_t (the actions)

Posterior distribution = the conditional probability that is assigned after the relevant evidence is taken into account
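Written out with Bayes' rule, the definition above looks as follows for a discrete action variable. The symbol E for the collected evidence is an assumption, since the slide does not name it.

```latex
% E denotes the evidence; this symbol is an assumption.
P(A_t = a \mid E)
  = \frac{P(E \mid A_t = a)\, P(A_t = a)}
         {\sum_{a'} P(E \mid A_t = a')\, P(A_t = a')}
```

For continuous actions the sum in the denominator becomes an integral.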

20

The Generative Imitation Approach

21

BABIL Imitation Learning Algorithm

Behavior Acquisition via Bayesian Inference and Learning

22

Planning via Inference

Given a set of evidence, pick the actions with the highest posterior probability = maximum a posteriori (MAP)

But MAP inference is NP-hard!

Instead, pick each action by maximizing its own marginal posterior = maximum marginal posterior (MMP)
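The difference between MAP and MMP shows up even with two binary actions. The joint posterior table below is made up for illustration. MAP picks the single most probable joint assignment, while MMP maximizes each action's marginal separately.

```python
import numpy as np

# Hypothetical joint posterior P(a1, a2 | evidence), indexed as joint[a1, a2].
joint = np.array([[0.4, 0.0],
                  [0.3, 0.3]])

# MAP: search over all joint assignments for the most probable one.
flat_index = np.argmax(joint)
map_actions = tuple(int(i) for i in np.unravel_index(flat_index, joint.shape))

# MMP: marginalize out the other action, then maximize each one on its own.
marginal_a1 = joint.sum(axis=1)  # P(a1 | evidence)
marginal_a2 = joint.sum(axis=0)  # P(a2 | evidence)
mmp_actions = (int(np.argmax(marginal_a1)), int(np.argmax(marginal_a2)))
```

Here MAP selects (0, 0) while MMP selects (1, 0), so the two criteria can disagree. The trade-off is cost: a joint search over T time steps with K actions each covers K^T assignments, whereas per-step marginals can often be obtained by message passing in the network.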

23

Learning Stable Full-Body Humanoid Motion via Imitation

25

Log Likelihood of Dynamics Config.

26

Dynamic Balance Duration over Imitation Trials

27

Learning Stable Full-Body Humanoid Motion via Imitation

https://www.youtube.com/watch?v=vZYaW9_haiU

28

Conclusion

A probabilistic framework that allows a humanoid robot to learn from a human teacher through imitation

A general approach to “programming” a complex robot without error-prone physics models

Can handle uncertainty via Bayesian models

  A more “brain-like” intelligence


30

Discussion

Do humans act/learn by probabilistic models?

Are we that “mechanical”?

Can self-consciousness be represented in probabilistic models?
