Chapter 8. Situated Dialogue Processing for Human-Robot Interaction in Cognitive Systems, Christensen et al. Course: Robots Learning from Humans Sabaleuski Matsvei Interdisciplinary Program in Cognitive Science Seoul National University http://cogsci.snu.ac.kr

Post on 29-Jan-2016

Chapter 8. Situated Dialogue Processing for Human-Robot Interaction in Cognitive Systems, Christensen et al.

Course: Robots Learning from Humans

Sabaleuski Matsvei

Interdisciplinary Program in Cognitive Science, Seoul National University

http://cogsci.snu.ac.kr

Contents

Introduction

Background

Multi-level Integration in Language Processing

Language Processing and Situational Experience

Talking

Talking about What You Can See

Talking about Places You Can Visit

Talking about Things You Can Do

Conclusions

Introduction

DIALOGUE

«THE WORLD»

Playmate scenario: local visuo-spatial scenes

Explorer scenario: spatial organization of an indoor environment

Requirements for the solution

Gradual construction

Referentiality

Persistence

Efficiency & Effectiveness

LANGUAGE PERCEPTION

Winograd's SHRDLU

Incremental "left-to-right" linguistic analyses connected to visuo-spatial representations of local scenes.

Could understand and execute human commands

Had a basic memory to supply context

Small virtual world

Language consisting of around 50 words

Steels's Semiotic Networks

Open-ended, adaptive communication system

Ability to learn

Communicative success above 80%

Lexicon of around 50 words

Impossible to connect alternative meanings at the same time

Sony AIBO robots

Bi-directionality hypothesis

• Gradual construction

Use of Combinatory Categorial Grammar (CCG)

• Referentiality

Use of structured discourse representation models with the ability to resolve linguistic reference to situated context

• Persistence

Different referent resolutions can be combined, which is used in visual learning

• Efficiency & Effectiveness

The incremental comprehension model can filter out unlikely word and meaning hypotheses.

Performance of speech recognition and parsing is close to 90%

Multi-level Integration in Language Processing

Modular model

A context-independent representation is constructed first, and only then is it interpreted against the preceding dialogue context

Incremental model

Every new word is related to representations of the preceding input

Principle of parsimony: preference for the least 'presuppositionally' heavy interpretations

e.g. The postman delivered the baby. Mary gave the child the dog bit a bandaid.

The incremental model is supported by the results of psycholinguistic research (studies of saccadic eye movements)

Language Processing and Situational Experience

Focus of psycholinguistic research:

How information from situation awareness affects utterance comprehension

The research revealed:

Anticipatory effect

Disambiguation by scene understanding

Temporal projection

Interaction between LANGUAGE and VISION is mediated by CATEGORIES

Talking

Listening

Comprehending

Representing an utterance

Representing the Interpretation of an Utterance in Context

Comprehending an Utterance in Context

Picking Up the Right Interpretation

Speaking

Producing an Utterance in Context

Producing Speech

Representing an utterance

An utterance is represented as an ontologically richly sorted, relational structure: a logical form in a decidable fragment of modal logic

I want you to put the red mug to the right of the ball
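As a rough illustration of what a sorted, relational structure might look like in code, the sketch below builds a small graph of nodes for the fragment "put the red mug to the right of the ball" and prints it in a modal-logic-like notation. The class name `Nominal`, the sort names (`action`, `thing`, `region`) and the relation labels (`Patient`, `Destination`, `Anchor`, `Color`) are illustrative assumptions, not the labels used in the chapter.

```python
from dataclasses import dataclass, field


@dataclass
class Nominal:
    """One node of the relational structure: an identifier with an ontological sort."""
    ident: str   # node identifier, e.g. "m1"
    sort: str    # ontological sort (hypothetical names)
    prop: str    # proposition carried by the node, e.g. "mug"
    features: dict = field(default_factory=dict)   # e.g. {"Color": "red"}
    relations: list = field(default_factory=list)  # (label, Nominal) pairs

    def render(self) -> str:
        """Serialize the node and everything reachable from it."""
        parts = [self.prop]
        parts += [f"<{name}>{value}" for name, value in self.features.items()]
        parts += [f"<{label}>({child.render()})" for label, child in self.relations]
        return f"{self.ident}:{self.sort} ^ " + " ^ ".join(parts)


# "put the red mug to the right of the ball" (labels are assumptions)
ball = Nominal("b1", "thing", "ball")
region = Nominal("r1", "region", "right", relations=[("Anchor", ball)])
mug = Nominal("m1", "thing", "mug", features={"Color": "red"})
put = Nominal("p1", "action", "put",
              relations=[("Patient", mug), ("Destination", region)])

logical_form = "@" + put.render()
print(logical_form)
```

Each node carries its own sort, so a consumer of the structure can check ontological constraints (for example, that a destination is a region) without re-parsing the utterance.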

Packing

Take the ball to the left of the box

Packing node

Internal relation

Packing nominal

Packing edge

Packing node target

Example of incremental parsing and packing of logical forms

Here is the ball

Representing the Interpretation of an Utterance in Context

Co-reference relations: relations between mentions referring to the same objects or events, e.g. pronouns ('it') or anaphoric expressions ('the red mug').

New referent identifier: [NEW : {ant_n}]

Antecedent referent: [OLD : {ant_i}]

A reference structure can specify preference orders over sets of old and new referents:

[OLD : ant_i < {ant_j, ..., ant_k} < NEW : {ant_n}]
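One possible way to read such a preference order is as a ranked list of tiers that is searched from most to least preferred. The sketch below is only an illustration under that assumption: `ReferenceStructure`, `resolve` and the `is_compatible` predicate are hypothetical names, and the tie-break inside a tier (alphabetical) is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class ReferenceStructure:
    """Preference-ordered candidate referents for one mention."""
    mention: str
    tiers: list  # most preferred first; each tier is (status, set of referent ids)


def resolve(ref, is_compatible):
    """Return (status, referent id) from the first tier with a compatible candidate."""
    for status, candidates in ref.tiers:
        matches = sorted(c for c in candidates if is_compatible(c))
        if matches:
            return status, matches[0]
    return None  # no candidate fits, not even a new referent


# [OLD : ant1 < {ant2, ant3} < NEW : {ant4}]
ref = ReferenceStructure(
    mention="the red mug",
    tiers=[("OLD", {"ant1"}), ("OLD", {"ant2", "ant3"}), ("NEW", {"ant4"})],
)

# Suppose the situated context rules out ant1 (e.g. it is not a mug).
choice = resolve(ref, lambda candidate: candidate != "ant1")
print(choice)
```

If every old antecedent is ruled out, the search falls through to the NEW tier and a fresh referent is introduced.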

Decision tree for dialogue moves

A dialogue move ('speech act') specifies how an utterance contributes to furthering the dialogue

Dialogue context model

Put the red ball next to the cube

Comprehending an Utterance in Context

Cross-modal salience model

Visual salience

Linguistic salience

Word recognition lattice

Example of an incremental analysis

Utterance interpretation at grammatical level

Picking up the right interpretation

A parse selection system based on a statistical linear model explores a set of relevant acoustic, syntactic, semantic and contextual features of the parses, and computes a likelihood score for each of them.

Parse selection is a function F : X → Y, where X is the set of possible input utterances and Y is the set of parses. We also assume:

1. A function GEN(x) which enumerates all possible parses for an input x.

2. A d-dimensional feature vector f(x, y) ∈ R^d, representing specific features of the pair (x, y).

3. A parameter vector w ∈ R^d.

The function F is then defined as:

F(x) = argmax_{y ∈ GEN(x)} w^T · f(x, y)

where w^T · f(x, y) is the inner product, which can be seen as a measure of the 'quality' of the parse.
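A minimal sketch of linear parse selection, assuming the selected parse is the candidate from GEN(x) with the highest score w^T · f(x, y). The two candidate readings, the two-dimensional feature vectors and the weights below are made-up toy values for illustration, not figures from the chapter.

```python
def dot(w, f):
    """Inner product w^T . f of two equal-length vectors."""
    return sum(wi * fi for wi, fi in zip(w, f))


def select_parse(x, gen, features, w):
    """Return the parse y in gen(x) that maximizes w^T . f(x, y)."""
    return max(gen(x), key=lambda y: dot(w, features(x, y)))


# Toy setup (all values are assumptions): two readings of an ambiguous command.
def gen(x):
    return ["pp_modifies_ball", "pp_is_destination"]


# Hypothetical features: [syntactic score, contextual support]
FEATURE_TABLE = {
    "pp_modifies_ball": [0.6, 0.0],
    "pp_is_destination": [0.4, 1.0],
}


def features(x, y):
    return FEATURE_TABLE[y]


w = [1.0, 0.5]  # parameter vector
utterance = "take the ball to the left of the box"
best = select_parse(utterance, gen, features, w)
print(best)
```

With these toy numbers the contextual feature outweighs the slightly better syntactic score of the other reading, which is the kind of effect a contextual feature is meant to have.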

Producing an utterance in context

Producing an utterance is triggered by a communicative goal.

The communicative goal specifies a dialogue move and the content which is to be communicated.

The utterance realizer uses the same grammar as the parser.

The MARY speech synthesis engine (http://mary.dfki.de:59125/) then produces audio output.

References are generated using the incremental algorithm of Dale and Reiter.

The algorithm is initialized with the intended referent, a contrast set and a list of preferred attributes. It incrementally tries to rule out members of the contrast set for which a given property of the intended referent does not hold.
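The procedure described above can be sketched roughly as follows. This is a simplified reading of the incremental algorithm, assuming objects are plain attribute dictionaries; it omits details of the published algorithm such as always including the object type and choosing among more or less specific values. The function name and the example objects are illustrative.

```python
def incremental_algorithm(referent, contrast_set, preferred_attributes):
    """Pick attributes of `referent` until every distractor is ruled out.

    Returns a dict of attribute -> value, or None if the preferred
    attributes cannot distinguish the referent from the contrast set.
    """
    description = {}
    distractors = list(contrast_set)
    for attr in preferred_attributes:
        value = referent.get(attr)
        if value is None:
            continue  # the referent has no value for this attribute
        # Keep the attribute only if it rules out at least one distractor.
        if any(d.get(attr) != value for d in distractors):
            description[attr] = value
            distractors = [d for d in distractors if d.get(attr) == value]
        if not distractors:
            return description
    return None


target = {"type": "mug", "color": "red", "size": "small"}
others = [
    {"type": "mug", "color": "blue", "size": "small"},
    {"type": "ball", "color": "red", "size": "small"},
]
description = incremental_algorithm(target, others, ["type", "color", "size"])
print(description)
```

Here "type" removes the ball and "color" removes the blue mug, so the size attribute is never considered and the description corresponds to "the red mug".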

Thank you for your attention