
Guiding Interaction Behaviors for Multi-modal Grounded Language Learning

Jesse Thomason, Jivko Sinapov, Raymond J. Mooney
University of Texas at Austin


Multi-Modal Grounded Linguistic Semantics

Robots need to be able to connect language to their environment in order to discuss real-world objects with humans. Grounded language learning connects a robot’s sensory perception to natural language predicates. We consider a corpus of objects explored by a robot in previous work (Sinapov, IJCAI 2016).

[Figure: the robot’s exploratory behaviors (grasp, lift, lower, drop, press, push).]

Hold and look behaviors were also performed for each object.

- Objects were sparsely annotated with language predicates from an interactive “I Spy” game with humans (Thomason, IJCAI 2016).

- Many object examples are necessary to ground language in multi-modal space.

- We explore additional sources of information from users and corpora:
  - Behavior annotations: “What behavior would you use to understand ‘red’?”
  - Modality annotations: “How do you experience ‘heaviness’?”
  - Word embeddings: exploiting that “thin” is close to “narrow.”

Methodology

The decision d(p, o) ∈ [−1, 1] for predicate p and object o is

d(p, o) = \sum_{c \in C} \kappa_{p,c} \, G_{p,c}(o),   (1)

for G_{p,c} a linear SVM trained on labeled objects for p in the feature space of context c, with Cohen’s κ_{p,c} agreement with human labels during cross-validation.
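As a rough illustration, the sum in Eq. (1) can be computed by looping over contexts, querying each context-specific SVM, and weighting its output by the corresponding κ. The sketch below assumes scikit-learn-style classifiers and dictionary inputs; the variable names and the normalization by total κ mass (used here only to keep the result in [−1, 1]) are illustrative assumptions, not details from the poster.

```python
import numpy as np

def predicate_decision(obj_features, svms, kappas):
    """Sketch of Eq. (1): d(p, o) = sum_c kappa_{p,c} * G_{p,c}(o).

    obj_features: dict mapping context c -> feature vector of object o in c
    svms:         dict mapping context c -> linear SVM trained for predicate p
    kappas:       dict mapping context c -> Cohen's kappa from cross-validation

    Normalizing by total kappa mass keeps the result in [-1, 1]; the poster
    does not specify this step, so treat it as an assumption.
    """
    total, mass = 0.0, 0.0
    for c, svm in svms.items():
        # G_{p,c}(o): signed prediction of the context-specific SVM
        g = np.sign(svm.decision_function([obj_features[c]])[0])
        total += kappas[c] * g
        mass += abs(kappas[c])
    return total / mass if mass > 0.0 else 0.0
```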

[Figure: example decision for the predicate “round”: per-context classifiers over behaviors (grasp, lift, look, lower) and modalities (audio, haptics, color, shape) are combined into a yes/no decision.]

Data from Human Interactions

[Figure: human behavior annotations and estimated context reliability (κ) are combined into behavior-augmented (+Behavior Annotations) confidences.]

Behavior Annotations. We gather relevant behaviors from human annotators for each predicate, allowing us to augment classifier confidences with human suggestions. The predicate “heavy” favors interaction behaviors like lifting and holding.
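The poster does not give the exact rule for folding behavior annotations into the decision, so the snippet below is only one plausible, hypothetical reading: contexts whose behavior annotators marked as relevant for a predicate (e.g. lift and hold for “heavy”) have their κ reliability boosted before Eq. (1) is applied. The function name, the boost parameter, and the additive rule are all assumptions.

```python
def behavior_augmented_kappas(kappas, relevant_behaviors, boost=1.0):
    """Hypothetical B+kappa weighting.

    kappas: dict mapping (behavior, modality) context -> Cohen's kappa
    relevant_behaviors: behaviors annotators associated with the predicate,
                        e.g. {"lift", "hold"} for "heavy"
    """
    return {
        (behavior, modality): kappa + (boost if behavior in relevant_behaviors else 0.0)
        for (behavior, modality), kappa in kappas.items()
    }
```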

Modality Annotations. We also derive modality annotations from past work (Lynott, 2009), allowing us to augment classifier confidences with human intuitions. The predicate “white” favors vision-related modalities.

Word Embeddings. We calculate positive cosine similarity between predicates’ word embeddings and augment each predicate’s classifier confidences with similarity-weighted averages of its predicate neighbors. A predicate like “narrow” with few object examples can borrow information from a predicate like “thin” that is close in word-embedding space and has many examples.
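A minimal sketch of that sharing step, assuming pre-computed embedding vectors and per-predicate decisions for a given object; the positive cosine weighting follows the poster’s description, but the averaging rule and all identifiers are illustrative.

```python
import numpy as np

def embedding_shared_decision(target, decisions, embeddings):
    """Hypothetical W+kappa sharing: average neighbors' decisions,
    weighted by positive cosine similarity in word-embedding space
    (e.g. "narrow" borrows from "thin").

    decisions:  dict predicate -> its own kappa-weighted decision d(p, o)
    embeddings: dict predicate -> word-embedding vector (numpy array)
    """
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    total, mass = 0.0, 0.0
    for p, d in decisions.items():
        if p == target:
            continue
        sim = max(0.0, cosine(embeddings[target], embeddings[p]))  # positive similarity only
        total += sim * d
        mass += sim
    return total / mass if mass > 0.0 else 0.0
```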

Experiment and Results

Precision (p), recall (r), and f1 for the mc baseline, κ-weighted decisions alone, and κ combined with behavior annotations (B+κ), modality annotations (M+κ), and word embeddings (W+κ):

        p     r     f1
mc    .282  .355  .311
κ     .406  .460  .422
B+κ   .489  .489  .465
M+κ   .414  .466  .430
W+κ   .373  .474  .412

- Adding behavior annotations or modality annotations improves performance over using kappa confidence alone.

- Behavior annotations helped the f-measure of predicates like “pink”, “green”, and “half-full”.

- Modality annotations helped with predicates like “round”, “white”, and “empty”.

- Sharing kappa confidences across similar predicates based on their embedding cosine similarity improves recall at the cost of precision.
  - For example, “round” improved, but at the expense of domain-specific meanings of predicates like “water”.

jesse@cs.utexas.edu    https://www.cs.utexas.edu/~jesse/
