Chapter 17. The Progress Drive Hypothesis

Course: Robots Learning from Humans

Yoon, Bo Sung
Department of Economics
Seoul National University

http://bi.snu.ac.kr

Upload: alfred-richard

Post on 17-Jan-2018





Contents

Introduction

Progress-driven learning

Developmental mechanisms for early imitation

Conclusion


© 2015, SNU CSE Biointelligence Lab., http://bi.snu.ac.kr

Introduction

Progress-driven learning could help explain:

why children focus on specific imitative activities at a certain age;

how they progressively organize preferential interactions with particular entities present in their environment.


Progress Drive Learning System

The progress drive hypothesis

Early imitation phenomena, including self-imitation and simple interpersonal coordination, are the result of (1) an intrinsic motivation system that drives the infant into situations expected to result in (2) maximal learning progress.

Progress-driven learning system

A critic produces internal rewards in order to guide the agent towards learning new skills.

The agent acts in order to be in situations in which its prediction error decreases maximally fast.


Prediction and Meta-prediction

s : state
a : action
y : actual outcome
y’ : expected outcome
e : actual error (e = y - y’)
e’ : expected error
P(s, a) : prediction system
MetaP(s, a) : meta-prediction system
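The relationship between the two systems can be sketched in a few lines of Python. This is only an illustration under assumptions that are not on the slide: outcomes are single numbers, both systems are simple lookup tables updated by a running average, and the meta-predictor tracks the magnitude |e| of the error. The class and function names (Predictor, MetaPredictor, observe) are hypothetical.

```python
from collections import defaultdict


class Predictor:
    """P(s, a) -> y': running-average guess of the outcome for each (s, a)."""

    def __init__(self, rate=0.5):
        self.rate = rate
        self.table = defaultdict(float)  # unseen (s, a) pairs predict 0.0

    def predict(self, s, a):
        return self.table[(s, a)]

    def update(self, s, a, y):
        y_expected = self.table[(s, a)]
        e = y - y_expected  # actual error e = y - y'
        self.table[(s, a)] = y_expected + self.rate * e
        return e


class MetaPredictor:
    """MetaP(s, a) -> e': running average of |e| for each (s, a) (assumption)."""

    def __init__(self, rate=0.5, initial=1.0):
        self.rate = rate
        self.table = defaultdict(lambda: initial)

    def predict(self, s, a):
        return self.table[(s, a)]

    def update(self, s, a, e):
        e_expected = self.table[(s, a)]
        self.table[(s, a)] = e_expected + self.rate * (abs(e) - e_expected)


def observe(P, meta, s, a, y):
    """One learning step: P is trained on y, MetaP on the error P just made."""
    e = P.update(s, a, y)
    meta.update(s, a, e)
    return e


P, meta = Predictor(), MetaPredictor()
for _ in range(3):
    observe(P, meta, s=0, a=0, y=1.0)
# The expected error e' shrinks as the prediction y' approaches y.
```

The point of the second table is that the agent can ask how wrong it expects to be before acting, which is what the drives discussed in this chapter select on.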


Examples of Learning Systems

Progress-driven systems

In this system, the agent acts in order to be in situations in which its prediction error decreases maximally fast.

→ take the action that maximizes the expected progress p’(t)

Mastery-driven systems

In this system, the agent acts in order to be in situations in which its prediction error is minimal.

→ take the action that makes the expected error e’ minimal (where e = y - y’)

Novelty-driven systems

In this system, the agent chooses actions leading to situations in which its prediction error is maximal.

→ take the action that makes the expected error e’ maximal (where e = y - y’)
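The three selection rules differ only in which quantity they optimize, which a short Python sketch can make concrete. Everything specific here is an assumption for illustration: the action names, the numbers, and the idea that estimates of e’ and p’ are already available per action. The slides do not say how p’(t) is computed.

```python
def choose_progress_driven(expected_progress):
    """Pick the action with the largest expected progress p'."""
    return max(expected_progress, key=expected_progress.get)


def choose_mastery_driven(expected_error):
    """Pick the action with the smallest expected error e'."""
    return min(expected_error, key=expected_error.get)


def choose_novelty_driven(expected_error):
    """Pick the action with the largest expected error e'."""
    return max(expected_error, key=expected_error.get)


# Hypothetical estimates for three made-up actions.
expected_error = {"stare_at_wall": 0.05, "shake_rattle": 0.40, "watch_static": 0.90}
expected_progress = {"stare_at_wall": 0.00, "shake_rattle": 0.30, "watch_static": 0.00}

progress_choice = choose_progress_driven(expected_progress)
mastery_choice = choose_mastery_driven(expected_error)
novelty_choice = choose_novelty_driven(expected_error)
```

With these made-up numbers the mastery-driven rule settles on what is already predictable, the novelty-driven rule is drawn to what cannot be predicted, and only the progress-driven rule picks the action where the error is still falling.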


Screen Problem

s (state) = y (outcome): a 9-D vector of 9 binary values (0 or 1) corresponding to the nine pixels

a (action): a 4-D vector of 4 binary values (0 or 1) corresponding to the 4 buttons

Notation

s : state
a : action
y : actual outcome
y’ : expected outcome
e : actual error (e = y - y’)
e’ : expected error
P(s, a) : prediction system → y’
MetaP(s, a) : meta-prediction system → e’
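To get a feel for the size of this problem, the state and action vectors can be enumerated directly. The sketch below is an illustration only: the slide does not say how buttons change pixels, so no dynamics are modelled, and the scalar error measure (the number of mispredicted pixels) is an assumption.

```python
from itertools import product

N_PIXELS = 9   # the state s and the outcome y are 9 binary pixel values
N_BUTTONS = 4  # the action a is 4 binary button values

ALL_STATES = list(product((0, 1), repeat=N_PIXELS))
ALL_ACTIONS = list(product((0, 1), repeat=N_BUTTONS))


def error_vector(y, y_expected):
    """Per-pixel error e = y - y'."""
    return tuple(yi - ye for yi, ye in zip(y, y_expected))


def error_size(y, y_expected):
    """Scalar summary of e: number of mispredicted pixels (assumption)."""
    return sum(abs(d) for d in error_vector(y, y_expected))


y = (1, 0, 0, 0, 1, 0, 0, 0, 1)           # hypothetical actual screen
y_expected = (1, 0, 0, 0, 0, 0, 0, 0, 0)  # hypothetical predicted screen
mispredicted = error_size(y, y_expected)
```

There are 2^9 = 512 possible screens and 2^4 = 16 possible button combinations, so a table indexed by (s, a) has 8192 entries.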


Progress Niches

The progress drive pushes the agent to discover and focus on situations which lead to maximal learning progress.

These situations, neither too predictable nor too difficult to predict, are ‘progress niches’.
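A small numerical illustration of why a niche is neither too predictable nor too difficult: the sketch below measures learning progress as the drop in mean prediction error between two consecutive windows. That definition, the window size, and the three hand-written error histories are all assumptions made for the example, not something stated on the slide.

```python
def mean(values):
    return sum(values) / len(values)


def learning_progress(errors, window=5):
    """Drop in mean error from the previous window to the latest one."""
    previous = errors[-2 * window:-window]
    latest = errors[-window:]
    return mean(previous) - mean(latest)


# Hypothetical error histories for three kinds of situation.
error_history = {
    "too_predictable": [0.05] * 10,                      # already mastered
    "learnable": [1.0 - 0.1 * i for i in range(10)],     # error keeps falling
    "too_difficult": [0.9, 1.0] * 5,                     # stays high, no trend
}

progress = {name: learning_progress(errs) for name, errs in error_history.items()}
niche = max(progress, key=progress.get)
```

Only the situation whose error is still falling shows substantial progress. The mastered situation has nothing left to learn, and the unlearnable one never improves, so a progress-driven agent would concentrate on the middle case.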


Screen Problem


Developmental Mechanisms for Early Imitation

Neonate imitation (0m): Like-me stance

Infants will discriminate between predictable couplings between modalities (what is seen and what is felt) and unpredictable situations (all the other cases).

They will focus on the first zone of their sensorimotor space, which constitutes a ‘progress niche’.


Self-imitation (1-2m): Circular Reaction

During the first two months of their life, infants perform repeated body motions (sucking their fingers, shaking their legs).

Children are structuring their own behaviour in order to make it more predictable, and in this way form ‘circular reactions’.

These self-centred types of behaviour are ‘progress niches’.


Pseudo-imitation (2-4m)

Infants become more attentive to the external world, and particularly to people.

Parents adapt their own responses so that interactions with the child follow the normal social rules that characterize communicative exchanges, e.g. taking turns (parental scaffolding).

If an adult imitates an infant’s own actions, it can trigger continued activity in the infant.


Interaction with Objects (5-7m)

Attention shifts again, from people to objects.

Children gain increased control over the manipulation of some objects, for which they discover ‘affordances’.

A progress-driven process can account for this discrimination between affordant objects and unmastered aspects of the environment.


Conclusion

The development of imitative capabilities could be interpreted as the result of a progress-driven process.

Early imitation can be seen as the result of a process by which an agent looks for ‘progress niches’:

by picking up easy-to-predict aspects of its environment,
by performing self-centred circular reactions,
by engaging in scaffolded social interactions,
by discovering particular affordances of certain objects.

The existence of a progress drive could explain:

why certain types of imitative behaviour are produced by children at certain ages and stop being produced later on;

how discrimination between actions oriented towards the self, towards others and towards the environment may occur.