Embodied Learning of Qualitative Models

Jure Žabkar

Exploration and Curiosity in Robot Learning and Inference, DAGSTUHL, March 2011

joint work with xpero partners

problem

“How should a robot choose its actions and experiences so as to maximize the effectiveness of its learning?”

goals

• to learn comprehensible models

• no extrinsic reward

• intrinsic reward: improved prediction model about the environment

our way

• learning from scratch (no explicit background knowledge, but given a learning algorithm)

• real robots, real-time learning

learning loop

1. observe the environment (collect data)
2. learn a model
3. use the model to predict the effect of each action
4. choose the best action (w.r.t. the active learning strategy)
5. observe the environment and check whether the predictions match the new observations
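
The five steps above can be sketched as a loop. This is only an illustration under stated assumptions, not the XPERO implementation: `ToyEnvironment`, `SignModel`, and the "try the least-tried action" strategy are hypothetical stand-ins for the real robot, the learning algorithm, and the active learning strategy, none of which the slides specify.

```python
from collections import defaultdict


def sign(diff):
    """Qualitative value of a numeric change."""
    return "+" if diff > 0 else "-" if diff < 0 else "0"


class ToyEnvironment:
    """Hypothetical world: one observed number and two actions."""

    def __init__(self):
        self.state = 0

    def observe(self):
        return self.state

    def act(self, action):
        # 'forward' raises the observed value, 'back' lowers it.
        self.state += 1 if action == "forward" else -1


class SignModel:
    """Toy model: remembers the sign of the change each action produced."""

    def __init__(self):
        self.effects = {}               # action -> '+', '-' or '0'
        self.tries = defaultdict(int)   # action -> times executed

    def learn(self, action, before, after):
        self.effects[action] = sign(after - before)
        self.tries[action] += 1

    def predict(self, action):
        return self.effects.get(action)  # None means "no prediction yet"


def choose_action(model, actions):
    # Assumed strategy: prefer the action executed least often so far.
    return min(actions, key=lambda a: model.tries[a])


def learning_loop(env, model, actions, steps):
    """Run the loop and return how many predictions were wrong."""
    mismatches = 0
    for _ in range(steps):
        before = env.observe()                    # step 1: observe
        action = choose_action(model, actions)    # step 4: choose an action
        predicted = model.predict(action)         # step 3: predict its effect
        env.act(action)
        after = env.observe()                     # step 5: observe and check
        if predicted is not None and predicted != sign(after - before):
            mismatches += 1
        model.learn(action, before, after)        # step 2: update the model
    return mismatches


env = ToyEnvironment()
model = SignModel()
mismatches = learning_loop(env, model, ["forward", "back"], steps=6)
```

In this toy world every action has a fixed effect, so the model is never wrong once an action has been tried; on a real robot the mismatch count is what signals that the model needs revising.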

starting scenario

Q: how does the area of the ball (as observed by the robot) change w.r.t. the robot's actions?

area := #pixels of the red blob in the image from the robot's camera

actions: sL, sR (the distance of the L/R wheel)
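
The definition of area as a pixel count can be illustrated with a small sketch. The slides do not say how red pixels are detected, so the threshold rule in `is_red`, the RGB tuple layout, and both function names are assumptions made for illustration.

```python
def is_red(pixel, min_red=150, max_other=100):
    # Assumed threshold rule on an (r, g, b) tuple; not from the slides.
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def blob_area(image, is_blob_pixel):
    """Count the pixels of a row-major image that pass the given test."""
    return sum(1 for row in image for pixel in row if is_blob_pixel(pixel))


RED, GREY = (200, 30, 30), (120, 120, 120)
image = [
    [GREY, RED, RED, GREY],
    [GREY, RED, RED, GREY],
    [GREY, GREY, GREY, GREY],
]
area = blob_area(image, is_red)  # 4 red pixels in this tiny image
```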

area = area(sL,sR)

task: find the appropriate model

equation discovery? we tried several algorithms, no success

motivation

people most often reason qualitatively

AI: robots should mimic human intelligence

why learn qualitative relations?

the area problem, qualitatively

if action=forward then the area increases until it becomes constant (blob occupies the whole image)

if orientation<0 and action=left (increasing the absolute value of the angle) then the area decreases until it becomes constant (zero)

...
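
The two rules shown above can be written as a small lookup. This is a sketch only: the function name, the `'+'`/`'-'`/`'0'` encoding, and the `max_area` parameter are assumptions, and every case the slide elides returns `None` rather than a guess.

```python
def predict_area_change(action, orientation, area, max_area):
    """Return '+', '-', '0', or None when neither rule shown applies."""
    if action == "forward":
        # The area grows until the blob occupies the whole image.
        return "0" if area >= max_area else "+"
    if action == "left" and orientation < 0:
        # Turning further away shrinks the blob until the area is zero.
        return "0" if area <= 0 else "-"
    return None  # case not given on the slide


examples = [
    predict_area_change("forward", 0.0, 10, 100),
    predict_area_change("forward", 0.0, 100, 100),
    predict_area_change("left", -0.2, 10, 100),
    predict_area_change("left", -0.2, 0, 100),
    predict_area_change("left", 0.3, 10, 100),
]
```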

qualitative rules

the prediction model gets much more accurate, but the predictions are less precise: they give the direction of change, not a number.

methods

• active learning + planning
• learning methods:
  – Padé: Žabkar, Možina, Bratko, Demšar, Learning Qualitative Models from Numerical Data, AIJ, 2011
  – STRUDEL: Košmerlj, Bratko, Žabkar, Embodied Concept Discovery through Qualitative Action Models, IJUFKS, 2011
  – Qube: Žabkar et al., Preference Learning from Qualitative Partial Derivatives, ECML Preference Learning Workshop, 2010
  – Hyper (with predicate invention mechanism): Leban, Žabkar, Bratko, An experiment in robot discovery with ILP, Proc. ILP 2008
• tested on simulated (billiards) and real data (medical application, robotics)

ceteris paribus

• e.g. partial differentiation
• observe a qualitative relation between two selected features, other features held constant
• qualitative relations of 3 types:
  – x increases ⇒ f(x) increases (Padé)
  – preference relation: x ≻ y ⇒ f(x) ≻ f(y)
  – structural: on(A,B,t1), on(A,C,t2)
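
The idea of varying one feature while holding the others constant can be sketched with a finite difference. This is not the Padé algorithm, which estimates qualitative partial derivatives from sampled data; it is a simplified stand-in that assumes the function can be evaluated directly. The name `qualitative_partial`, the step size, and the tolerance are all hypothetical.

```python
def qualitative_partial(f, point, index, step=1e-3, tol=1e-9):
    """Sign of the change in f when only point[index] is increased."""
    moved = list(point)
    moved[index] += step          # vary one feature, hold the rest constant
    diff = f(*moved) - f(*point)
    if diff > tol:
        return "+"
    if diff < -tol:
        return "-"
    return "0"


def f(x, y):
    return x * x - 3 * y


# At (2, 1) f increases with x and decreases with y.
signs_right = [qualitative_partial(f, (2.0, 1.0), i) for i in range(2)]
# At (-2, 1) the relation to x flips, the relation to y does not.
signs_left = [qualitative_partial(f, (-2.0, 1.0), i) for i in range(2)]
```

The sign with respect to `x` depends on where it is measured, which is why the qualitative relation is computed per data point before a model is learned over the whole domain.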

"all other things being equal"

qualitative models

[diagram: data → (Padé, Qube, STRUDEL) → qualitative changes → (machine learning, statistics) → qualitative models]

learning with structured data

• ILP with predicate invention is too complex for real-time learning
• we use ILP to learn smaller subtasks
  – structural qualitative changes

www.ailab.si/xpero

the concept "movable"

the discovered condition which distinguishes different effects of actions:

    p1(Obj) :-
        at(T1, Obj, Pos1),
        at(T2, Obj, Pos2),
        neq_pos(Pos1, Pos2).

p1 is true if the object was observed at two different positions

the discovered effects of actions:

    move(T, Obj) :-
        p1(Obj),
        f1(T, Obj).

    move(T, Obj) :-
        not p1(Obj),
        f2(T, Obj).

    f1(T1, Obj) :-
        at(T1, Obj, Pos1),
        at(T2, Obj, Pos2),
        Pos1 \== Pos2,
        {T2 = T1+1}.

    f2(T, Obj) :-
        not f1(T, Obj).
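
For readers less used to Prolog, the condition p1 can be mirrored in a few lines of Python. This is an illustrative paraphrase, not part of the learned theory: the `(time, object, position)` tuple layout is an assumed reading of `at(T, Obj, Pos)`, and the function name is hypothetical.

```python
def observed_at_two_positions(observations, obj):
    """Python analogue of p1: obj was seen at two different positions."""
    positions = {pos for (_time, o, pos) in observations if o == obj}
    return len(positions) >= 2


observations = [
    (1, "ball", (0, 0)),
    (2, "ball", (1, 0)),   # the ball changed position
    (1, "box", (5, 5)),
    (2, "box", (5, 5)),    # the box stayed where it was
]
ball_movable = observed_at_two_positions(observations, "ball")
box_movable = observed_at_two_positions(observations, "box")
```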